File size: 5,426 Bytes
d6ea71e
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
Quickstart
===========

Eager to get started valuing some soccer actions? This page gives a quick
introduction on how to get started.

Installation
------------

First, make sure that socceraction is installed:

.. code-block:: console

   $ pip install socceraction[statsbomb]

For detailed instructions and other installation options, check out our
detailed :doc:`installation instructions <install>`.

Loading event stream data
-------------------------

First of all, you will need some data. Luckily, both `StatsBomb <https://github.com/statsbomb/open-data>`_ and
`Wyscout <https://www.nature.com/articles/s41597-019-0247-7>`_ provide a small freely available dataset.
The :ref:`data module<api-data>` of socceraction makes it trivial to load these datasets as
`Pandas DataFrames <https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html>`__.
In this short introduction, we will work with Statsbomb's dataset of the 2018 World Cup.

.. code-block:: python

   import pandas as pd
   from socceraction.data.statsbomb import StatsBombLoader

   # Set up the StatsBomb data loader
   SBL = StatsBombLoader()

   # View all available competitions
   df_competitions = SBL.competitions()

   # Create a dataframe with all games from the 2018 World Cup
   df_games = SBL.games(competition_id=43, season_id=3).set_index("game_id")


.. note::
  Keep in mind that by using the public StatsBomb data you are agreeing to their `user agreement <https://github.com/statsbomb/open-data/blob/master/LICENSE.pdf>`__.

For each game, you can then retrieve a dataframe containing the teams, all
players that participated, and all events that were recorded in that game.
Specifically, we'll load the data from the third place play-off game between
England and Belgium.

.. code-block:: python

  game_id = 8657
  df_teams = SBL.teams(game_id)
  df_players = SBL.players(game_id)
  df_events = SBL.events(game_id)


Converting to SPADL actions
---------------------------

The event stream format is not well-suited for data analysis: some of the
recorded information is irrelevant for valuing actions, each vendor uses their
own custom format and definitions, and the events are stored as unstructured
JSON objects. Therefore, socceraction uses the :doc:`SPADL format
<spadl/index>` for describing actions on the pitch. With the code below, you
can convert the events to SPADL actions.

.. code-block:: python

   import socceraction.spadl as spadl

   home_team_id = df_games.at[game_id, "home_team_id"]
   df_actions = spadl.statsbomb.convert_to_actions(df_events, home_team_id)

With the `matplotsoccer package <https://github.com/TomDecroos/matplotsoccer>`_, you can try plotting some of these
actions:

.. code-block:: python

    import matplotsoccer as mps

    # Select relevant actions
    df_actions_goal = df_actions.loc[2196:2200]
    # Replace result, actiontype and bodypart IDs by their corresponding name
    df_actions_goal = spadl.add_names(df_actions_goal)
    # Add team and player names
    df_actions_goal = df_actions_goal.merge(df_teams).merge(df_players)
    # Create the plot
    mps.actions(
        location=df_actions_goal[["start_x", "start_y", "end_x", "end_y"]],
        action_type=df_actions_goal.type_name,
        team=df_actions_goal.team_name,
        result=df_actions_goal.result_name == "success",
        label=df_actions_goal[["time_seconds", "type_name", "player_name", "team_name"]],
        labeltitle=["time", "actiontype", "player", "team"],
        zoom=False
    )

.. figure:: spadl/eden_hazard_goal_spadl.png
   :align: center


Valuing actions
---------------

We can now assign a numeric value to each of these individual actions that
quantifies how much the action contributed towards winning the game.
Socceraction implements three frameworks for doing this: xT, VAEP and
Atomic-Vaep. In this quickstart guide, we will focus on the xT framework.

The expected threat or xT model overlays a :math:`M \times N` grid on the
pitch in order to divide it into zones. Each zone :math:`z` is
then assigned a value :math:`xT(z)` that reflects how threatening teams are at
that location, in terms of scoring. An example grid is visualized below.

.. image:: valuing_actions/default_xt_grid.png
   :width: 600
   :align: center

The code below allows you to load
league-wide xT values from the 2017-18 Premier League season (the 12x8 grid
shown above). Instructions on how to train your own model can be found in the
:doc:`detailed documentation about xT <valuing_actions/xT>`.

.. code-block:: python

    import socceraction.xthreat as xthreat

    url_grid = "https://karun.in/blog/data/open_xt_12x8_v1.json"
    xT_model = xthreat.load_model(url_grid)



Subsequently, the model can be used to value actions that successfully move
the ball between two zones by computing the difference between the threat
value on the start and end location of each action. The xT framework does not
assign a value to failed actions, shots and defensive actions such as tackles.

.. code-block:: python

    df_actions_ltr = spadl.play_left_to_right(df_actions, home_team_id)
    df_actions["xT_value"] = xT_model.rate(df_actions_ltr)


.. image:: valuing_actions/eden_hazard_goal_xt.png
   :align: center


-----------------------

Ready for more? Check out the detailed documentation about the
:doc:`data representation <spadl/index>` and
:doc:`action value frameworks <valuing_actions/index>`.