{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"# Conformance Checking\n",
"*by: Sebastiaan J. van Zelst*"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"In this tutorial, we'll be focusing on *conformance checking*.\n",
"The conceptual idea of conformance checking is rather simple: we compute to what degree a given process model conforms to the execution of a process, as recorded in the event data.\n",
"We are going to use the same process model as we have seen before, i.e., based on our [running example event log](data/running_example.csv):"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"However, to check conformance w.r.t. the model, we're going to use a slightly [different event log](data/running_example_broken.csv).\n",
"In this tutorial, we'll consider two types of techniques: *token-based-replay* and *alignments*.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"## Token-Based-Replay"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"In order to understand token-based-replay, we first need to cover a bit of Petri net theory.\n",
"Let's use the Petri net based on the clean [running example event log](data/running_example.csv), as an example."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"jupyter": {
"outputs_hidden": false
},
"pycharm": {
"name": "#%%\n"
},
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<IPython.core.display.Image object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import pandas as pd\n",
"import pm4py\n",
"df = pm4py.format_dataframe(pd.read_csv('data/running_example.csv', sep=';'), case_id='case_id',activity_key='activity',\n",
" timestamp_key='timestamp')\n",
"pn, im, fm = pm4py.discover_petri_net_inductive(df)\n",
"pm4py.view_petri_net(pn, im, fm)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"### Places and Transitions\n",
"Observe that the Petri net consists of two different types of nodes, i.e., circles and rectangles.\n",
"We refer to the circles as *places* and we refer to the rectangles as *transitions*.\n",
"Furthermore, notice that a place can only be connected (by means of an arc) to a transition.\n",
"Similarly, a transition can only be connected (by means of an arc) to a place.\n",
"Hence, *places never connect directly to places* and *transitions never connect directly to transitions*."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"### Tokens, Enabledness and Transition Firing\n",
"There is one place in the model containing a black 'dot'.\n",
"This dot is referred to as a *token*.\n",
"For convenience, let's call the place containing the token 'source'.\n",
"A transition can consume and produce tokens, referred to as *firing a transition*.\n",
"A transition is allowed to fire, if all of its 'incoming places' contain at least one token.\n",
"Any transition for which this property holds is referred to as an *enabled transition*.\n",
"In the example net, only the 'source' place contains a token.\n",
"Consequently, the only transition that has a token in all of its 'incoming places' is the transition *register request*, i.e., it is enabled.\n",
"If we decide to fire an enabled transition, it consumes one token from each of its 'incoming places' and it produces a token in each of its 'outgoing places'.\n",
"For example, if we *fire* the *register request* transition, it consumes the token in the source place and it produces a fresh token in its outgoing place (i.e., the place connected to it by means of an outgoing arc).\n",
"\n",
"*It is extremely important to note that there is no relationship between token production and consumption, i.e., tokens that are consumed cease to exist, tokens that are produced are always \"fresh tokens\"*.\n"
]
},
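{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"The firing rule above can also be written down compactly. As a sketch, assume a *marking* $M$ that assigns to each place its number of tokens, and write $\\bullet t$ and $t \\bullet$ for the sets of incoming and outgoing places of a transition $t$ (this notation is introduced here for illustration and assumes that every arc moves exactly one token, as in the example net):\n",
"\n",
"$$t \\text{ is enabled in } M \\iff \\forall p \\in \\bullet t : M(p) \\geq 1$$\n",
"\n",
"$$M'(p) = \\begin{cases} M(p) - 1 & \\text{if } p \\in \\bullet t \\setminus t \\bullet \\\\ M(p) + 1 & \\text{if } p \\in t \\bullet \\setminus \\bullet t \\\\ M(p) & \\text{otherwise} \\end{cases}$$\n",
"\n",
"In the example net, $M(\\text{source}) = 1$ and all other places are empty, so only *register request* is enabled. Firing it yields a marking $M'$ with no token in 'source' and one token in the outgoing place of *register request*."
]
},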
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"### Token-Based-Replay - The Basics\n",
"When we use token-based-replay, we are effectively mimicking behavior observed in the event log in the context of a given process model.\n",
"\n",
"Let's assume that in the event log, we observe the trace: \n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"source": [
"$\\langle \\text{register request, examine casually, check ticket, decide, reject request} \\rangle$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"The token-based-replay algorithm simply mimics the trace in the model and keeps track of the number of tokens we need to produce and consume to *replay* the trace in the model.\n",
"For example, the first activity in the trace, i.e., *register request*, can be directly mimicked by consuming the token in the source place.\n",
"To subsequently fire the *examine casually* transition, we need the token produced by firing the *register request* transition.\n",
"The token needs to be consumed by the *black* transition (this is referred to as an invisible transition) that connects to the output place of the *register request* transition.\n",
"Said transition will produce two fresh tokens (observe that it has two outgoing places), one of which can subsequently be consumed by the *examine casually* transition.\n",
"Essentially, the token-based-replay algorithm keeps repeating this rationale, until it has mimicked the complete trace.\n",
"\n",
"In the previous example, the trace can be completely mimicked (or *replayed*) by the model.\n",
"However, consider the trace:"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"source": [
"$\\langle \\text{register request, examine casually, check ticket, reject request} \\rangle$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"When analyzing the trace, we observe that a decision is missing.\n",
"The token-based-replay algorithm can detect this, i.e., it can detect that due to the lack of the *decide* transition in the trace, when mimicking it, tokens would remain in the input places of the *decide* transition, and, similarly, tokens would be missing in the input place of the *reject request* transition.\n",
"\n",
"For a given event log, the token-based-replay algorithm simply mimics every trace in the event log and keeps track of the number of detected problems (missing and remaining tokens when mimicking the behavior).\n",
"It subsequently compares the detected number of problems with the total amount of 'correct behavior' and produces a 'conformity score' (often referred to as a 'fitness' score) between $0$ and $1$.\n",
"If the score is $1$, no problems were detected.\n",
"If the score is $0$, no normal behavior was detected."
]
},
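{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"A common way to turn these counts into a score is sketched below. The symbols are introduced here for illustration: $p$ and $c$ are the numbers of produced and consumed tokens, $m$ and $r$ the numbers of missing and remaining tokens. The exact aggregation used by ``pm4py`` may differ in detail.\n",
"\n",
"$$\\text{fitness} = \\frac{1}{2}\\left(1 - \\frac{m}{c}\\right) + \\frac{1}{2}\\left(1 - \\frac{r}{p}\\right)$$\n",
"\n",
"If no tokens are missing or remaining ($m = r = 0$), the score is $1$. If every consumed token was missing and every produced token remains ($m = c$ and $r = p$), the score is $0$."
]
},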
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"source": [
"### Token-Based-Replay in pm4py"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"jupyter": {
"outputs_hidden": false
},
"pycharm": {
"name": "#%%\n"
},
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"outputs": [
{
"data": {
"application/json": {
"ascii": false,
"bar_format": null,
"colour": null,
"elapsed": 0.00988912582397461,
"initial": 0,
"n": 0,
"ncols": null,
"nrows": 15,
"postfix": null,
"prefix": "replaying log with TBR, completed variants :: ",
"rate": null,
"total": 6,
"unit": "it",
"unit_divisor": 1000,
"unit_scale": false
},
"application/vnd.jupyter.widget-view+json": {
"model_id": "fa4996a01e5c444896cb5920ac3f460b",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"replaying log with TBR, completed variants :: 0%| | 0/6 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"{'perc_fit_traces': 100.0,\n",
" 'average_trace_fitness': 1.0,\n",
" 'log_fitness': 1.0,\n",
" 'percentage_of_fitting_traces': 100.0}"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pm4py.fitness_token_based_replay(df, pn, im, fm)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"In the previous example, we observe that all traces in the event log perfectly comply with the model.\n",
"This is the case because the discovery algorithm we used (the inductive miner) guarantees it, i.e., the model it discovers always describes all behavior in the event log it was given.\n",
"\n",
"Let's consider computing fitness when using an event log that has some problems, i.e., w.r.t. the model that we learned on the clean event log."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [
{
"data": {
"application/json": {
"ascii": false,
"bar_format": null,
"colour": null,
"elapsed": 0.012327432632446289,
"initial": 0,
"n": 0,
"ncols": null,
"nrows": 15,
"postfix": null,
"prefix": "replaying log with TBR, completed variants :: ",
"rate": null,
"total": 6,
"unit": "it",
"unit_divisor": 1000,
"unit_scale": false
},
"application/vnd.jupyter.widget-view+json": {
"model_id": "f7e9f4aa0c4d497ebd9867b3ca8b7e19",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"replaying log with TBR, completed variants :: 0%| | 0/6 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"{'perc_fit_traces': 16.666666666666668,\n",
" 'average_trace_fitness': 0.8077731092436974,\n",
" 'log_fitness': 0.8156108597285068,\n",
" 'percentage_of_fitting_traces': 16.666666666666668}"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_problems = pm4py.format_dataframe(pd.read_csv('data/running_example_broken.csv', sep=';'), case_id='case:concept:name',activity_key='concept:name',\n",
" timestamp_key='time:timestamp')\n",
"pm4py.fitness_token_based_replay(df_problems, pn, im, fm)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"In this case, only $16\\frac{2}{3}$ percent of the cases fit the model.\n",
"Yet, the total fitness score is still roughly $0.8$.\n",
"We can roughly interpret this as about $80\\%$ of the behavior in the event log fitting the model."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"source": [
"### Using Other Process Model Formalisms"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"In ``pm4py``, the token-based-replay algorithm is only defined for Petri nets.\n",
"Hence, if we have other model types, e.g., a process tree or a BPMN model, we need to convert these to Petri nets."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"outputs": [
{
"data": {
"application/json": {
"ascii": false,
"bar_format": null,
"colour": null,
"elapsed": 0.012128591537475586,
"initial": 0,
"n": 0,
"ncols": null,
"nrows": 15,
"postfix": null,
"prefix": "replaying log with TBR, completed variants :: ",
"rate": null,
"total": 6,
"unit": "it",
"unit_divisor": 1000,
"unit_scale": false
},
"application/vnd.jupyter.widget-view+json": {
"model_id": "2617d52d622d4c4589455dfc514ef24b",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"replaying log with TBR, completed variants :: 0%| | 0/6 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"{'perc_fit_traces': 16.666666666666668,\n",
" 'average_trace_fitness': 0.8077731092436974,\n",
" 'log_fitness': 0.8156108597285068,\n",
" 'percentage_of_fitting_traces': 16.666666666666668}"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tree = pm4py.discover_process_tree_inductive(df)\n",
"pn, im, fm = pm4py.convert_to_petri_net(tree)\n",
"pm4py.fitness_token_based_replay(df_problems, pn, im, fm)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [
{
"data": {
"application/json": {
"ascii": false,
"bar_format": null,
"colour": null,
"elapsed": 0.012835025787353516,
"initial": 0,
"n": 0,
"ncols": null,
"nrows": 15,
"postfix": null,
"prefix": "replaying log with TBR, completed variants :: ",
"rate": null,
"total": 6,
"unit": "it",
"unit_divisor": 1000,
"unit_scale": false
},
"application/vnd.jupyter.widget-view+json": {
"model_id": "c6af2c45c34846cb886acf6f06c78645",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"replaying log with TBR, completed variants :: 0%| | 0/6 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"{'perc_fit_traces': 16.666666666666668,\n",
" 'average_trace_fitness': 0.8077731092436974,\n",
" 'log_fitness': 0.8156108597285068,\n",
" 'percentage_of_fitting_traces': 16.666666666666668}"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bpmn = pm4py.discover_bpmn_inductive(df)\n",
"pn, im, fm = pm4py.convert_to_petri_net(bpmn)\n",
"pm4py.fitness_token_based_replay(df_problems, pn, im, fm)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"In general, a process tree, no matter its source (whether it is designed by hand or discovered by a discovery algorithm), can always be translated into a Petri net without any problems.\n",
"For BPMN models this is not the case.\n",
"In the example, we use the inductive miner, which always yields a process tree.\n",
"As such, the BPMN model we discovered is actually a process tree transformed into a BPMN model.\n",
"As a consequence, the subsequent transformation into a Petri net will not yield any problems.\n",
"However, in general, various quality issues can emerge when converting an arbitrary BPMN model into a Petri net, i.e., from a conformance checking perspective."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"source": [
"## Diagnostics"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"Thus far, we have shown how to compute a number that quantifies how well the given event log and model conform to one another.\n",
"However, token-based-replay does not provide detailed diagnostics on the problems detected (albeit we did implement rudimentary diagnostics based on token-based-replay).\n",
"To compute *conformance diagnostics* we advocate the use of *alignments*.\n",
"For convenience, let's again consider our process model:"
]
},
{
"attachments": {
"aee68a5e-ef71-4cf5-9304-32d6856f15e2.png": {
"image/png": ""
}
},
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"source": [
"### Alignments\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"Simply put, an *alignment* maps the observed trace onto the *closest* firing sequence described by the model.\n",
"Let's revisit our earlier example, i.e., "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"source": [
"$\\langle \\text{register request, examine casually, check ticket, decide, reject request} \\rangle$."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"An alignment of this trace looks as follows: "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"source": [
"$\\langle \\text{(register request,register request), (examine casually, examine casually), (check ticket, check ticket), (decide, decide), (reject request, reject request)} \\rangle$.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"The alignment is a sequence of pairs, e.g., $(\\text{register request}, \\text{register request})$ and $(\\text{examine casually}, \\text{examine casually})$.\n",
"In each pair, the first element corresponds to observed behavior in the log, whereas the second element corresponds to an action in the model.\n",
"Hence, the closest 'behavior' that the observed trace can be mapped to is $\\langle \\text{register request, examine casually, check ticket, decide, reject request} \\rangle$, i.e., exactly the same sequence of actions as the observed trace.\n",
"This makes sense as the model actually describes the observed trace."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"source": [
"### Non-Fitting Behavior"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"In certain cases, observed behavior cannot be mimicked by the model, or, behavior may be missing.\n",
"In both cases, we use the $\\gg$ symbol to represent this.\n",
"\n",
"For example, consider the trace: "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"source": [
"$\\langle \\text{register request, register request, examine casually, check ticket, reject request} \\rangle$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"It is quite easy to see that, given the reference model presented earlier, the *register request* activity is duplicated and the *decide* activity is missing.\n",
"An alignment of the trace w.r.t. the model shown before quantifies this exactly:"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"source": [
"$\\langle \\text{(register request, register request), (register request, }\\gg\\text{), (examine casually, examine casually), (check ticket, check ticket), (}\\gg\\text{, decide), (reject request, reject request)} \\rangle$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"The elements of the alignment (which we refer to as moves) reflect our earlier observations."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"source": [
"1. The first *register request* activity is also described by the model\n",
"2. The second *register request* activity is not described by the model (i.e., the first element refers to the trace; the second element, $\\gg$, indicates that the model has no matching action).\n",
"3. The *examine casually* activity is described by the log and the model\n",
"4. The *check ticket* activity is described by the log and the model\n",
"5. The *decide* activity was not observed in the data, yet, the model describes that it is supposed to be observed\n",
"6. The *reject request* activity is described by the log and the model"
]
},
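{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"The three kinds of moves in the list above can be summarized as follows. The names *synchronous move*, *log move* and *model move* are common terminology for alignments and are introduced here for illustration; $a$ denotes an observed activity and $t$ an action of the model.\n",
"\n",
"$$(a, t) \\text{ with } a = t: \\text{ synchronous move}, \\qquad (a, \\gg): \\text{ log move}, \\qquad (\\gg, t): \\text{ model move}$$\n",
"\n",
"Under this reading, the example alignment consists of four synchronous moves, one log move (the second *register request*) and one model move (*decide*)."
]
},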
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"Observe that, according to the alignment, the given trace should be mapped on the following model behavior:"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"source": [
"$\\langle \\text{register request, examine casually, check ticket, decide, reject request} \\rangle$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"source": [
"### Alignments in pm4py"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"Like token-based-replay, computing alignments in ``pm4py`` is rather straightforward:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"outputs": [
{
"data": {
"application/json": {
"ascii": false,
"bar_format": null,
"colour": null,
"elapsed": 0.010935306549072266,
"initial": 0,
"n": 0,
"ncols": null,
"nrows": 15,
"postfix": null,
"prefix": "aligning log, completed variants :: ",
"rate": null,
"total": 6,
"unit": "it",
"unit_divisor": 1000,
"unit_scale": false
},
"application/vnd.jupyter.widget-view+json": {
"model_id": "e22c86b3da32455bb31e28ca5db6e7aa",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"aligning log, completed variants :: 0%| | 0/6 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"[{'alignment': [('>>', 'register request'),\n",
" ('>>', None),\n",
" ('examine thoroughly', 'examine thoroughly'),\n",
" ('check ticket', 'check ticket'),\n",
" ('decide', 'decide'),\n",
" ('>>', None),\n",
" ('reject request', 'reject request')],\n",
" 'cost': 10002,\n",
" 'visited_states': 7,\n",
" 'queued_states': 22,\n",
" 'traversed_arcs': 22,\n",
" 'lp_solved': 1,\n",
" 'fitness': 0.8888888888888888,\n",
" 'bwc': 90002},\n",
" {'alignment': [('register request', 'register request'),\n",
" ('>>', None),\n",
" ('check ticket', 'check ticket'),\n",
" ('examine casually', 'examine casually'),\n",
" ('>>', 'decide'),\n",
" ('>>', None),\n",
" ('pay compensation', 'pay compensation')],\n",
" 'cost': 10002,\n",
" 'visited_states': 7,\n",
" 'queued_states': 23,\n",
" 'traversed_arcs': 23,\n",
" 'lp_solved': 2,\n",
" 'fitness': 0.8888888888888888,\n",
" 'bwc': 90002},\n",
" {'alignment': [('register request', 'register request'),\n",
" ('>>', None),\n",
" ('>>', 'examine casually'),\n",
" ('check ticket', 'check ticket'),\n",
" ('decide', 'decide'),\n",
" ('reinitiate request', '>>'),\n",
" ('>>', None),\n",
" ('pay compensation', 'pay compensation')],\n",
" 'cost': 20002,\n",
" 'visited_states': 8,\n",
" 'queued_states': 27,\n",
" 'traversed_arcs': 27,\n",
" 'lp_solved': 5,\n",
" 'fitness': 0.8,\n",
" 'bwc': 100002},\n",
" {'alignment': [('register request', 'register request'),\n",
" ('>>', None),\n",
" ('check ticket', 'check ticket'),\n",
" ('examine thoroughly', 'examine thoroughly'),\n",
" ('decide', 'decide'),\n",
" ('>>', None),\n",
" ('reject request', 'reject request')],\n",
" 'cost': 2,\n",
" 'visited_states': 7,\n",
" 'queued_states': 24,\n",
" 'traversed_arcs': 24,\n",
" 'lp_solved': 1,\n",
" 'fitness': 1.0,\n",
" 'bwc': 100002},\n",
" {'alignment': [('register request', 'register request'),\n",
" ('>>', None),\n",
" ('examine casually', 'examine casually'),\n",
" ('check ticket', 'check ticket'),\n",
" ('decide', 'decide'),\n",
" ('reinitiate the request for real', '>>'),\n",
" ('>>', 'reinitiate request'),\n",
" ('>>', None),\n",
" ('check ticket', 'check ticket'),\n",
" ('examine casually', 'examine casually'),\n",
" ('decide', 'decide'),\n",
" ('>>', 'reinitiate request'),\n",
" ('>>', None),\n",
" ('examine casually', 'examine casually'),\n",
" ('check ticket', 'check ticket'),\n",
" ('decide', 'decide'),\n",
" ('>>', None),\n",
" ('reject request', 'reject request')],\n",
" 'cost': 30004,\n",
" 'visited_states': 18,\n",
" 'queued_states': 59,\n",
" 'traversed_arcs': 59,\n",
" 'lp_solved': 19,\n",
" 'fitness': 0.8235294117647058,\n",
" 'bwc': 170002},\n",
" {'alignment': [('register request', 'register request'),\n",
" ('>>', None),\n",
" ('>>', 'examine casually'),\n",
" ('check ticket', 'check ticket'),\n",
" ('>>', 'decide'),\n",
" ('decide something', '>>'),\n",
" ('>>', None),\n",
" ('pay compensation', 'pay compensation')],\n",
" 'cost': 30002,\n",
" 'visited_states': 8,\n",
" 'queued_states': 25,\n",
" 'traversed_arcs': 25,\n",
" 'lp_solved': 2,\n",
" 'fitness': 0.6666666666666667,\n",
" 'bwc': 90002}]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pn, im, fm = pm4py.discover_petri_net_inductive(df)\n",
"pm4py.conformance_diagnostics_alignments(df_problems, pn, im, fm)"
]
},
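{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"The `cost`, `bwc` and `fitness` fields in the output above can be related by a small calculation. What follows is a reading of the printed numbers, not a statement of how ``pm4py`` is implemented: the costs are consistent with $10000$ per move that pairs a visible activity with $\\gg$ and $1$ per invisible move (printed as `None`), and `bwc` appears to play the role of a worst-case cost. The symbol $k$ is introduced here for illustration only.\n",
"\n",
"$$\\text{fitness} = 1 - \\frac{\\lfloor \\text{cost} / k \\rfloor}{\\lfloor \\text{bwc} / k \\rfloor}, \\qquad k = 10000$$\n",
"\n",
"For the first alignment this gives $1 - \\frac{1}{9} \\approx 0.889$, for the third $1 - \\frac{2}{10} = 0.8$, and for the fourth (cost $2$, only invisible moves) $1 - \\frac{0}{10} = 1$, matching the printed `fitness` values."
]
},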
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"Like token-based-replay, alignments can also be used to quantify 'fitness':"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [
{
"data": {
"application/json": {
"ascii": false,
"bar_format": null,
"colour": null,
"elapsed": 0.011507749557495117,
"initial": 0,
"n": 0,
"ncols": null,
"nrows": 15,
"postfix": null,
"prefix": "aligning log, completed variants :: ",
"rate": null,
"total": 6,
"unit": "it",
"unit_divisor": 1000,
"unit_scale": false
},
"application/vnd.jupyter.widget-view+json": {
"model_id": "85b88bb2093547bcb6586352566c96a0",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"aligning log, completed variants :: 0%| | 0/6 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"{'percFitTraces': 16.666666666666668,\n",
" 'averageFitness': 0.8446623093681916,\n",
" 'percentage_of_fitting_traces': 16.666666666666668,\n",
" 'average_trace_fitness': 0.8446623093681916,\n",
" 'log_fitness': 0.843731055042718}"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pm4py.fitness_alignments(df_problems, pn, im, fm)"
]
}
],
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
},
"rise": {
"enable_chalkboard": true,
"footer": "",
"overlay": ""
},
"vscode": {
"interpreter": {
"hash": "9c7007b39dafc711984a86ea23c9ef98efa1ef34fe1cef62a06d5e45f97dbcb0"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}