File size: 7,177 Bytes
d6ea71e
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
.. currentmodule:: socceraction.data.statsbomb

=========================
Loading StatsBomb data
=========================

The :class:`StatsBombLoader` class provides an API client enabling you to
fetch `StatsBomb event stream data`_ as Pandas DataFrames. This document provides
an overview of the available data sources and how to access them.

------
Setup
------

To be able to load StatsBomb data, you'll first need to install a few
additional dependencies which are not included in the default installation of
socceraction. You can install these additional dependencies by running:

.. code-block:: console

  $ pip install "socceraction[statsbomb]"


--------------------------
Connecting to a data store
--------------------------

First, you have to create a :class:`StatsBombLoader` object and configure it
for the data store you want to use. The :class:`StatsBombLoader` supports
loading data from the StatsBomb Open Data repository, from the official
StatsBomb API, and from local files.


Open Data repository
====================

StatsBomb has made event stream data of certain leagues freely available for
public non-commercial use at https://github.com/statsbomb/open-data. This open
data can be accessed without the need of authentication, but its use is
subject to a `user agreement`_. The code below shows how to setup an API client
that can fetch data from the repository.

.. code-block:: python

  # optional: suppress warning about missing authentication
  import warnings
  from statsbombpy.api_client import NoAuthWarning
  warnings.simplefilter('ignore', NoAuthWarning)

  from socceraction.data.statsbomb import StatsBombLoader

  api = StatsBombLoader(getter="remote", creds=None)


.. note::
   If you publish, share or distribute any research, analysis or insights based
   on this data, StatsBomb requires you to state the data source as StatsBomb
   and use their logo.


StatsBomb API
=============

API access is for paying customers only. Authentication can be done by setting
environment variables named ``SB_USERNAME`` and ``SB_PASSWORD`` to your login
credentials. Alternatively, the constructor accepts an argument ``creds`` to
pass your login credentials in the format ``{"user": "", "passwd": ""}``.

.. code-block:: python

  from socceraction.data.statsbomb import StatsBombLoader

  # set authentication credentials as environment variables
  import os
  os.environ["SB_USERNAME"] = "your_username"
  os.environ["SB_PASSWORD"] = "your_password"
  api = StatsBombLoader(getter="remote")

  # or provide authentication credentials as a dictionary
  api = StatsBombLoader(getter="remote", creds={"user": "", "passwd": ""})


Local directory
===============

A final option is to load data from a local directory. This local directory
can be specified by passing the ``root`` argument to the constructor,
specifying the path to the local data directory.

.. code-block:: python

  from socceraction.data.statsbomb import StatsBombLoader

  api = StatsBombLoader(getter="local", root="data/statsbomb")

Note that the data should be organized in the same way as the StatsBomb Open
Data repository, which corresponds to the following file hierarchy:

.. code-block::

  root
  β”œβ”€β”€ competitions.json
  β”œβ”€β”€ events
  β”‚   β”œβ”€β”€ <match_id>.json
  β”‚   β”œβ”€β”€ ...
  β”‚   └── ...
  β”œβ”€β”€ lineups
  β”‚   β”œβ”€β”€ <match_id>.json
  β”‚   └── ...
  β”œβ”€β”€ matches
  β”‚   β”œβ”€β”€ <competition_id>
  β”‚   β”‚   └── <season_id>.json
  β”‚   β”‚   └── ...
  β”‚   └── ...
  └── three-sixty
      β”œβ”€β”€ <match_id>.json
      └── ...



------------
Loading data
------------

Next, you can load the match event stream data and metadata by calling the
corresponding methods on the :class:`StatsBombLoader` object.


:func:`StatsBombLoader.competitions()`
======================================

.. code-block:: python

   df_competitions = api.competitions()

.. csv-table::
   :class: dataframe
   :header: season_id,competition_id,competition_name,country_name,competition_gender,season_name

    106,43,FIFA World Cup,International,male,2022
    30,72,Women's World Cup,International,female,2019
    3,43,FIFA World Cup,International,male,2018


:func:`StatsBombLoader.games()`
===============================

.. code-block:: python

   df_games = api.games(competition_id=43, season_id=3)


.. csv-table::
   :class: dataframe
   :header: game_id,season_id,competition_id,competition_stage,game_day,game_date,home_team_id,away_team_id,home_score,away_score,venue,referee_id

    8658,3,43,Final,7,2018-07-15 17:00:00,771,785,4,2,Stadion Luzhniki,730
    8657,3,43,3rd Place Final,7,2018-07-14 16:00:00,782,768,2,0,Saint-Petersburg Stadium,741

:func:`StatsBombLoader.teams()`
===============================

.. code-block:: python

   df_teams = api.teams(game_id=8658)

.. csv-table::
   :class: dataframe
   :header: team_id,team_name
   :align: left

    771,France
    785,Croatia



:func:`StatsBombLoader.players()`
=================================

.. code-block:: python

   df_players = api.players(game_id=8658)


.. csv-table::
   :class: dataframe
   :header: game_id,team_id,player_id,player_name,nickname,jersey_number,is_starter,starting_position_id,starting_position_name,minutes_played

    8658,771,3009,Kylian MbappΓ© Lottin,Kylian MbappΓ©,10,True,12,Right Midfield,95
    8658,785,5463,Luka Modrić,,10,True,13,Right Center Midfield,95


:func:`StatsBombLoader.events()`
================================

.. code-block:: python

   df_events = api.events(game_id=8658)

.. csv-table::
   :class: dataframe
   :header: event_id,index,period_id,timestamp,minute,second,type_id,type_name,possession,possession_team_id,possession_team_name,play_pattern_id,play_pattern_name,team_id,team_name,duration,extra,related_events,player_id,player_name,position_id,position_name,location,under_pressure,counterpress,game_id

    47638847-fd43-4656-b49c-cff64e5cfc0a,1,1,1900-01-01,0,0,35,Starting XI,1,771,France,1,Regular Play,771,France,0.0,"{...}",[],,,,,,False,False,8658
    0c04305d-5615-4520-9be5-7c232829954b,2,1,1900-01-01,0,0,35,Starting XI,1,771,France,1,Regular Play,785,Croatia,1.412,"{...}",[],,,,,,False,False,8658
    c5e17439-efe2-480b-9cff-1600998674d7,3,1,1900-01-01,0,0,18,Half Start,1,771,France,1,Regular Play,771,France,0.0,{},['7e1460eb-c572-4059-8cd4-cec4857f818d'],,,,,,False,False,8658


If `360 data snapshots`_ are available for the game, they can be loaded by
passing ``load_360=True`` to the ``events()`` method. This will add two columns
to the events dataframe: ``visible_area_360`` and ``freeze_frame_360``. The
former contains the visible area of the pitch in the 360 snapshot, while the
latter contains the player locations in the 360 snapshot.

.. code-block:: python

   df_events = api.events(game_id=3788741, load_360=True)


.. _StatsBomb event stream data: https://statsbomb.com/what-we-do/soccer-data/
.. _statsbombpy: https://pypi.org/project/statsbombpy/
.. _user agreement: https://github.com/statsbomb/open-data/blob/master/LICENSE.pdf
.. _360 data snapshots: https://statsbomb.com/what-we-do/soccer-data/360-2/