The F&O Twist on Process Mining – Event log data requirements


In this post about process mining, I share what I learned about the data requirements for event logs used in process mining. Besides the file format, I provide several examples that show the differences and the options for variation.

Data requirements and file formats

When I explored process mining, I came across a good page explaining the basic concepts of event logs: Event Data – Process Mining

This page teaches you some basics, but there is much more to be aware of. The minimum data needed for process mining consists of three attribute types: Case ID, Activity, and Timestamp.

Case ID: A unique ID for each instance of the same process. Each instance is one execution of that process. Related activities are identified by this key field.
Activity: The description of the process step. To perform process mining, you need multiple rows for a single case ID. With only one activity, no process diagram is possible. The same activity can occur multiple times for the same case. In that scenario, rework will be identified in your process. The description should clearly distinguish all activities in a process. Besides a process step, an activity can also be a status change.
Timestamp: The timestamp is used to put the activities in the correct order. It also determines the time between activities. By looking at delays, you can find bottlenecks in your process. It is possible to have a start and an end timestamp for activities. In that scenario, you can analyze the activity duration (active time) as well as the (idle) time between activities. One timestamp is mandatory, whereas a second is optional.
Besides the three mandatory attributes, more attributes can be added to the file. More attributes give you more analysis options: you can look for correlations, make comparisons, and create additional reports with visualizations.
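
To make the minimum requirements concrete, here is a small Python sketch that checks a CSV event log before you upload it. It is only an illustration: the column names (CaseId, ActivityName, StartTimestamp) are taken from the barbershop examples later in this post and are not names that Power Automate Process Mining requires, and the function check_event_log is a made-up helper.

```python
import csv
import io
from collections import Counter

# Column names follow the barbershop examples in this post; they are
# assumptions, not names that Power Automate Process Mining requires.
REQUIRED_COLUMNS = ("CaseId", "ActivityName", "StartTimestamp")


def check_event_log(csv_text):
    """Return a list of problems found in a CSV event log (empty list = OK)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        return ["missing column(s): " + ", ".join(missing)]

    problems = []
    events_per_case = Counter()
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for col in REQUIRED_COLUMNS:
            if not (row[col] or "").strip():
                problems.append(f"line {line_no}: empty {col}")
        events_per_case[row["CaseId"]] += 1

    # A case with a single event cannot produce a flow between activities.
    for case_id, count in events_per_case.items():
        if count < 2:
            problems.append(f"case {case_id}: only one event, no flow to mine")
    return problems


sample = """CaseId,ActivityName,StartTimestamp
1,Lobby,2023-10-10 10:05:00
1,Washing,2023-10-10 10:06:00
2,Lobby,2023-10-10 10:09:00
"""
print(check_event_log(sample))
```

For the sample above, the check reports that case 2 has only one event. Whether you treat that as an error or just a warning depends on your process.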

Some commonly used attributes are a Resource, Location, and Department. It is also possible to add financial details per event, such as the cost or revenue. What additional attributes to add, depends on the process and available data in your source system.

For Power Automate Process Mining, the file format is not important. Although there are industry standards such as eXtensible Event Stream (XES) and Character Separated Values (CSV), Microsoft added dozens of connectors that can read files from a share or connect to databases and other sources. As long as there is a supported connector and you have the minimum required attributes, you can create a new process for process mining.

The column headers in the file itself are not important. As explained in my previous post, you need to map the event log column names to the attributes in step 4 of creating the process. You can read more here: The F&O Twist on Process Mining – Create new process with event log – Dynamicspedia

I will use several examples to explain what exactly happens when you use more or fewer attributes. You can download the examples at the bottom of this blog. I kept the examples simple, with only a few cases per file, and they are based on something we all know from life: a day at the barbershop. I hope these examples will give you some more insight into event logs.

Barbershop demo 1

The first demo has only the three minimum required fields. There are three case instances, each with four activities. A person arrives in the lobby. When a seat is available, the hair is washed and cut. At the end, the customer pays for the barber’s services.

| CaseId | ActivityName | StartTimestamp |
| --- | --- | --- |
| 1 | Lobby | 2023-10-10 10:05:00.000000 |
| 1 | Washing | 2023-10-10 10:06:00.000000 |
| 1 | Cutting | 2023-10-10 10:10:00.000000 |
| 1 | Payment | 2023-10-10 10:28:00.000000 |
| 2 | Lobby | 2023-10-10 10:09:00.000000 |
| 2 | Washing | 2023-10-10 10:30:00.000000 |
| 2 | Cutting | 2023-10-10 10:37:00.000000 |
| 2 | Payment | 2023-10-10 10:58:00.000000 |
| 3 | Lobby | 2023-10-10 10:33:00.000000 |
| 3 | Washing | 2023-10-10 11:00:00.000000 |
| 3 | Cutting | 2023-10-10 11:05:00.000000 |
| 3 | Payment | 2023-10-10 11:21:00.000000 |

If the file has a lot of instances, analyzing the data by hand becomes unwieldy. Process mining helps in that case. A new process is created into which the file is imported.

The file contents will be loaded correctly. In this example, there is no need for any additional change in Power Query. In the next step, the three columns will be linked with the Power Automate Process Mining attributes.

Once the processing is ready, you can view and analyze the report. The first view doesn’t tell us more than that there are three cases started and three cases ended with a list of the activities.

When you change the view to duration, you can see long durations for both the Lobby and Cutting activities. Both have a long time between the start of the event and the next activity. Cutting simply takes time, but with the view set to maximum duration, you can see that one client had to wait 27 minutes in the lobby. In this case, the lobby seems to be the bottleneck. Let’s hope they offer beverages and have some magazines to pass the time.
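
To see where these numbers come from, the following Python sketch recomputes them from the demo 1 log. With only a start timestamp, the "duration" of an activity is the time from its start until the start of the next activity in the same case. This is my own illustration of that arithmetic, not the way Power Automate Process Mining is implemented; the function name is made up, and the timestamps are shortened by dropping the microseconds.

```python
from collections import defaultdict
from datetime import datetime

# Demo 1 rows: (case id, activity, start timestamp).
EVENTS = [
    (1, "Lobby", "2023-10-10 10:05:00"),
    (1, "Washing", "2023-10-10 10:06:00"),
    (1, "Cutting", "2023-10-10 10:10:00"),
    (1, "Payment", "2023-10-10 10:28:00"),
    (2, "Lobby", "2023-10-10 10:09:00"),
    (2, "Washing", "2023-10-10 10:30:00"),
    (2, "Cutting", "2023-10-10 10:37:00"),
    (2, "Payment", "2023-10-10 10:58:00"),
    (3, "Lobby", "2023-10-10 10:33:00"),
    (3, "Washing", "2023-10-10 11:00:00"),
    (3, "Cutting", "2023-10-10 11:05:00"),
    (3, "Payment", "2023-10-10 11:21:00"),
]


def minutes_until_next_activity(events):
    """Per activity, list the minutes from its start to the next start in the same case."""
    by_case = defaultdict(list)
    for case_id, activity, start in events:
        by_case[case_id].append((datetime.fromisoformat(start), activity))

    gaps = defaultdict(list)
    for rows in by_case.values():
        rows.sort()  # order the events of one case by timestamp
        for (start, activity), (next_start, _) in zip(rows, rows[1:]):
            gaps[activity].append((next_start - start).total_seconds() / 60)
    return gaps


gaps = minutes_until_next_activity(EVENTS)
for activity, values in gaps.items():
    print(f"{activity:8} max {max(values):.0f} min, mean {sum(values) / len(values):.1f} min")
```

Lobby comes out with a maximum of 27 minutes (case 3) and Cutting with a maximum of 21 minutes. Payment has no value because it is the last activity of every case.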

Barbershop demo 2

Before talking about a mitigation for the bottleneck in the process as we had seen above, let’s have a look at the difference when we specify a start and end timestamp for the events. The file is almost the same, except for having another column with the end timestamp per event.

| CaseId | ActivityName | StartTimestamp | EndTimestamp |
| --- | --- | --- | --- |
| 1 | Lobby | 2023-11-11 10:05:00.000000 | 2023-11-11 10:05:00.000000 |
| 1 | Washing | 2023-11-11 10:06:00.000000 | 2023-11-11 10:09:00.000000 |
| 1 | Cutting | 2023-11-11 10:10:00.000000 | 2023-11-11 10:27:00.000000 |
| 1 | Payment | 2023-11-11 10:28:00.000000 | 2023-11-11 10:29:00.000000 |
| 2 | Lobby | 2023-11-11 10:09:00.000000 | 2023-11-11 10:28:00.000000 |
| 2 | Washing | 2023-11-11 10:30:00.000000 | 2023-11-11 10:37:00.000000 |
| 2 | Cutting | 2023-11-11 10:37:00.000000 | 2023-11-11 10:56:00.000000 |
| 2 | Payment | 2023-11-11 10:58:00.000000 | 2023-11-11 10:59:00.000000 |
| 3 | Lobby | 2023-11-11 10:33:00.000000 | 2023-11-11 10:59:00.000000 |
| 3 | Washing | 2023-11-11 11:00:00.000000 | 2023-11-11 11:04:00.000000 |
| 3 | Cutting | 2023-11-11 11:05:00.000000 | 2023-11-11 11:20:00.000000 |
| 3 | Payment | 2023-11-11 11:21:00.000000 | 2023-11-11 11:23:00.000000 |

After creating a new process, there aren’t that many differences in the reports. The only difference is now that you can analyze the process having activity duration times and idle times between the events.

In this example, it is clearer that clients must wait a long time in the lobby. The owner of the barbershop acknowledged this as an issue and decided to hire an additional employee, so the barbershop could help multiple clients at the same time.
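
The split between active time and idle time can be sketched in Python as follows. Active time is the end minus the start of an event; idle time is the start of the next event in the same case minus the end of the current one. Again, this is only my illustration of the arithmetic on the demo 2 log, with a made-up function name and with the timestamps shortened to hours and minutes because all events fall on the same day.

```python
from collections import defaultdict
from datetime import datetime

# Demo 2 rows: (case id, activity, start, end); all on 2023-11-11.
EVENTS = [
    (1, "Lobby", "10:05", "10:05"), (1, "Washing", "10:06", "10:09"),
    (1, "Cutting", "10:10", "10:27"), (1, "Payment", "10:28", "10:29"),
    (2, "Lobby", "10:09", "10:28"), (2, "Washing", "10:30", "10:37"),
    (2, "Cutting", "10:37", "10:56"), (2, "Payment", "10:58", "10:59"),
    (3, "Lobby", "10:33", "10:59"), (3, "Washing", "11:00", "11:04"),
    (3, "Cutting", "11:05", "11:20"), (3, "Payment", "11:21", "11:23"),
]


def parse(hhmm):
    return datetime.strptime(hhmm, "%H:%M")


def active_and_idle_minutes(events):
    """Return (active, idle): minutes per activity, and minutes waited after it."""
    by_case = defaultdict(list)
    for case_id, activity, start, end in events:
        by_case[case_id].append((parse(start), parse(end), activity))

    active, idle = defaultdict(list), defaultdict(list)
    for rows in by_case.values():
        rows.sort()  # order the events of one case by start time
        for i, (start, end, activity) in enumerate(rows):
            active[activity].append((end - start).total_seconds() / 60)
            if i + 1 < len(rows):  # the last event of a case has no idle time after it
                idle[activity].append((rows[i + 1][0] - end).total_seconds() / 60)
    return active, idle


active, idle = active_and_idle_minutes(EVENTS)
for activity in active:
    print(activity, "active:", active[activity], "idle after:", idle.get(activity, []))
```

The Lobby events last 0, 19, and 26 minutes, while the idle time after the lobby is only 1 to 2 minutes. So the waiting is now visible as the duration of the Lobby activity itself.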

Barbershop demo 3

To be able to check more details and to find correlations and root causes of possible bottlenecks, the event log is now extended with more attributes: the barber, the gender of the client, the shampoo that was used, and the finish that was used for hair styling at the end of the cutting activity. When creating a new process, we would first like to check whether hiring the second barber did indeed solve the bottleneck we had before.

| CaseId | ActivityName | StartTimestamp | EndTimestamp | Barber | Gender | Shampoo | Finish |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Lobby | 2023-11-11 10:05:00.000000 | 2023-11-11 10:05:00.000000 | Barry | Male | | |
| 1 | Washing | 2023-11-11 10:06:00.000000 | 2023-11-11 10:09:00.000000 | Barry | Male | Dandruff | |
| 1 | Cutting | 2023-11-11 10:10:00.000000 | 2023-11-11 10:27:00.000000 | Barry | Male | | Wax |
| 1 | Payment | 2023-11-11 10:28:00.000000 | 2023-11-11 10:29:00.000000 | Barry | Male | | |
| 2 | Lobby | 2023-11-11 10:09:00.000000 | 2023-11-11 10:10:00.000000 | Emma | Female | | |
| 2 | Washing | 2023-11-11 10:11:00.000000 | 2023-11-11 10:18:00.000000 | Emma | Female | Regular | |
| 2 | Cutting | 2023-11-11 10:19:00.000000 | 2023-11-11 10:49:00.000000 | Emma | Female | | Gel |
| 2 | Payment | 2023-11-11 10:51:00.000000 | 2023-11-11 10:52:00.000000 | Emma | Female | | |
| 3 | Lobby | 2023-11-11 10:33:00.000000 | 2023-11-11 10:35:00.000000 | Barry | Female | | |
| 3 | Washing | 2023-11-11 10:36:00.000000 | 2023-11-11 10:40:00.000000 | Barry | Female | Regular | |
| 3 | Cutting | 2023-11-11 10:41:00.000000 | 2023-11-11 11:00:00.000000 | Barry | Female | | Wax |
| 3 | Payment | 2023-11-11 11:01:00.000000 | 2023-11-11 11:03:00.000000 | Barry | Female | | |
| 4 | Lobby | 2023-11-11 10:40:00.000000 | 2023-11-11 10:52:00.000000 | Emma | Male | | |
| 4 | Cutting | 2023-11-11 10:53:00.000000 | 2023-11-11 11:10:00.000000 | Emma | Male | | Gel |
| 4 | Payment | 2023-11-11 11:11:00.000000 | 2023-11-11 11:12:00.000000 | Emma | Male | | |

Again, when the file is provided for a new process, the columns should be mapped to Process Mining attributes.

The attribute types Case ID, Activity, Event Start, Event End, and Resource are single-use attributes. You can assign each of them only once when mapping your source columns. The other attribute types can be used multiple times, as is done here for Shampoo and Finish.
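
The single-use rule can be expressed as a small check on a mapping from source columns to attribute types. The Python sketch below is only an illustration with several assumptions: check_mapping is a made-up helper, "Other" is a placeholder label for the reusable attribute types rather than a name from the product, treating Event Start as the mandatory timestamp is my reading of the minimum requirements, and mapping Barber to Resource is one possible choice.

```python
from collections import Counter

# Attribute types that can be assigned only once, as listed in this post.
SINGLE_USE = {"Case ID", "Activity", "Event Start", "Event End", "Resource"}
# Assumption: the one mandatory timestamp is mapped to Event Start.
MANDATORY = {"Case ID", "Activity", "Event Start"}


def check_mapping(mapping):
    """mapping: source column name -> attribute type. Returns a list of problems."""
    counts = Counter(mapping.values())
    problems = [
        f"{attr_type} is assigned {n} times"
        for attr_type, n in sorted(counts.items())
        if attr_type in SINGLE_USE and n > 1
    ]
    problems += [f"{attr_type} is not mapped" for attr_type in sorted(MANDATORY - set(counts))]
    return problems


# "Other" is a placeholder label, not a name taken from the product.
demo3_mapping = {
    "CaseId": "Case ID",
    "ActivityName": "Activity",
    "StartTimestamp": "Event Start",
    "EndTimestamp": "Event End",
    "Barber": "Resource",
    "Gender": "Other",
    "Shampoo": "Other",
    "Finish": "Other",
}
print(check_mapping(demo3_mapping))  # prints []
```

If you would, for example, also map Gender to Resource, the check reports that Resource is assigned two times.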

When the report is ready, we can check the change in duration times. As expected, with more employees, the waiting time in the lobby has decreased significantly.
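
You can verify the decrease directly from the two event logs. The Python sketch below compares the duration of the Lobby events in demo 2 and demo 3. The start and end times are copied from the tables above (hours and minutes only), and the function name is made up for this illustration.

```python
from datetime import datetime
from statistics import mean

# Lobby (start, end) times per case, copied from the demo 2 and demo 3 logs.
LOBBY_DEMO2 = [("10:05", "10:05"), ("10:09", "10:28"), ("10:33", "10:59")]
LOBBY_DEMO3 = [("10:05", "10:05"), ("10:09", "10:10"),
               ("10:33", "10:35"), ("10:40", "10:52")]


def lobby_minutes(rows):
    """Minutes between the start and end of each Lobby event."""
    fmt = "%H:%M"
    return [
        (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds() / 60
        for start, end in rows
    ]


for label, rows in (("demo 2", LOBBY_DEMO2), ("demo 3", LOBBY_DEMO3)):
    minutes = lobby_minutes(rows)
    print(f"{label}: mean {mean(minutes):.2f} min, max {max(minutes):.0f} min")
```

The mean time in the lobby drops from 15 minutes to 3.75 minutes, and the maximum from 26 to 12 minutes. With so few cases this is of course only an indication.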

It is a given that cutting hair is labor intensive and is unlikely to be sped up. However, now that we have more attributes, it is possible to look for correlations and compare, for example, the barbers or another attribute. At the moment, the primary dashboard in the cloud version of Power Automate Process Mining does not give you the option to add filters. You can use the desktop app to filter data. It is also possible to add a custom Power BI workspace for process mining, but I will elaborate on that in another blog.

In the bottom-left corner of the screen, there is an option to add one or more filters based on the attributes you have in the event log. After altering the filters, you can apply the changes.

In case there are huge differences in e.g. the activity time for cutting, the Process Mining desktop application provides features to compare views on the process and start a root cause analysis. In a future blog, I will elaborate on that option.
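
As a rough idea of what such a comparison looks like on the demo 3 data, the Python sketch below calculates the mean cutting time per value of an attribute, optionally after filtering on another attribute. It is not how the desktop application works internally; it is only a sketch, and mean_minutes_by is a made-up helper. The rows are the four Cutting events from the table above, with the times shortened to hours and minutes.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Cutting events from demo 3 (all on 2023-11-11).
CUTTING = [
    {"CaseId": 1, "Barber": "Barry", "Gender": "Male", "Finish": "Wax",
     "Start": "10:10", "End": "10:27"},
    {"CaseId": 2, "Barber": "Emma", "Gender": "Female", "Finish": "Gel",
     "Start": "10:19", "End": "10:49"},
    {"CaseId": 3, "Barber": "Barry", "Gender": "Female", "Finish": "Wax",
     "Start": "10:41", "End": "11:00"},
    {"CaseId": 4, "Barber": "Emma", "Gender": "Male", "Finish": "Gel",
     "Start": "10:53", "End": "11:10"},
]


def mean_minutes_by(rows, attribute, **filters):
    """Mean event duration per value of `attribute`, after applying equality filters."""
    groups = defaultdict(list)
    for row in rows:
        if any(row[key] != value for key, value in filters.items()):
            continue  # row does not match the filter
        start = datetime.strptime(row["Start"], "%H:%M")
        end = datetime.strptime(row["End"], "%H:%M")
        groups[row[attribute]].append((end - start).total_seconds() / 60)
    return {value: mean(minutes) for value, minutes in groups.items()}


print(mean_minutes_by(CUTTING, "Barber"))
print(mean_minutes_by(CUTTING, "Gender"))
print(mean_minutes_by(CUTTING, "Barber", Gender="Female"))
```

With only four cases, none of these differences means much. In a real event log with many cases, this kind of grouping is where a root cause analysis would start.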

Download

If you want to try out these demos yourself, feel free to download the zip file containing the three demos.

There is more…

Even after several blogs, there are more topics to discuss around process mining. In my next posts, I will write about, for example:

  • Process mining integration, available in Dynamics 365 Supply Chain Management
  • How to create event logs from Dynamics 365 F&O transactional data
  • What features are

If you want to learn more about Process Mining, you can explore the documentation on Microsoft Learn: Overview of process mining and task mining in Power Automate – Power Automate | Microsoft Learn



I hope you liked this post and that it adds value to your daily work as a professional. If you have related questions or feedback, don’t hesitate to use the Comment feature below.


That’s all for now. Till next time!
