	Quickstart
	===========

	Eager to start valuing soccer actions? This page gives a quick
	introduction to the main steps.

	Installation
	------------

	First, make sure that socceraction is installed:

	.. code-block:: console

	    $ pip install socceraction[statsbomb]

	For other installation options, check out our detailed
	:doc:`installation instructions <install>`.

	Loading event stream data
	-------------------------

	First of all, you will need some data. Luckily, both `StatsBomb <https://github.com/statsbomb/open-data>`_ and
	`Wyscout <https://www.nature.com/articles/s41597-019-0247-7>`_ provide a small freely available dataset.
	The :ref:`data module<api-data>` of socceraction makes it trivial to load these datasets as
	`Pandas DataFrames <https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html>`__.
	In this short introduction, we will work with StatsBomb's dataset of the 2018 World Cup.

	.. code-block:: python

	    import pandas as pd
	    from socceraction.data.statsbomb import StatsBombLoader

	    # Set up the StatsBomb data loader
	    SBL = StatsBombLoader()

	    # View all available competitions
	    df_competitions = SBL.competitions()

	    # Create a dataframe with all games from the 2018 World Cup
	    df_games = SBL.games(competition_id=43, season_id=3).set_index("game_id")


	.. note::
	    Keep in mind that by using the public StatsBomb data you are agreeing to
	    their `user agreement <https://github.com/statsbomb/open-data/blob/master/LICENSE.pdf>`__.

	For each game, you can then retrieve a dataframe containing the teams, all
	players that participated, and all events that were recorded in that game.
	Specifically, we'll load the data from the third place play-off game between
	England and Belgium.

	.. code-block:: python

	    game_id = 8657
	    df_teams = SBL.teams(game_id)
	    df_players = SBL.players(game_id)
	    df_events = SBL.events(game_id)


	Converting to SPADL actions
	---------------------------

	The event stream format is not well-suited for data analysis: some of the
	recorded information is irrelevant for valuing actions, each vendor uses their
	own custom format and definitions, and the events are stored as unstructured
	JSON objects. Therefore, socceraction uses the :doc:`SPADL format
	<spadl/index>` for describing actions on the pitch. With the code below, you
	can convert the events to SPADL actions.

	.. code-block:: python

	    import socceraction.spadl as spadl

	    home_team_id = df_games.at[game_id, "home_team_id"]
	    df_actions = spadl.statsbomb.convert_to_actions(df_events, home_team_id)

	With the `matplotsoccer package <https://github.com/TomDecroos/matplotsoccer>`_, you can try plotting some of these
	actions:

	.. code-block:: python

	    import matplotsoccer as mps

	    # Select relevant actions
	    df_actions_goal = df_actions.loc[2196:2200]
	    # Replace result, actiontype and bodypart IDs by their corresponding name
	    df_actions_goal = spadl.add_names(df_actions_goal)
	    # Add team and player names
	    df_actions_goal = df_actions_goal.merge(df_teams).merge(df_players)
	    # Create the plot
	    mps.actions(
	        location=df_actions_goal[["start_x", "start_y", "end_x", "end_y"]],
	        action_type=df_actions_goal.type_name,
	        team=df_actions_goal.team_name,
	        result=df_actions_goal.result_name == "success",
	        label=df_actions_goal[["time_seconds", "type_name", "player_name", "team_name"]],
	        labeltitle=["time", "actiontype", "player", "team"],
	        zoom=False
	    )

	.. figure:: spadl/eden_hazard_goal_spadl.png
	    :align: center


	Valuing actions
	---------------

	We can now assign a numeric value to each of these individual actions that
	quantifies how much the action contributed towards winning the game.
	Socceraction implements three frameworks for doing this: xT, VAEP and
	Atomic-VAEP. In this quickstart guide, we will focus on the xT framework.

	The expected threat (xT) model overlays an :math:`M \times N` grid on the
	pitch in order to divide it into zones. Each zone :math:`z` is
	then assigned a value :math:`xT(z)` that reflects how threatening teams are at
	that location, in terms of scoring. An example grid is visualized below.

	.. image:: valuing_actions/default_xt_grid.png
	    :width: 600
	    :align: center
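
	To make the zone lookup concrete, the sketch below maps a pitch location to
	a cell of a 12x8 grid. It is not socceraction's implementation: the helper
	``zone_of``, the 105 by 68 metre pitch dimensions, and the choice to count
	rows upwards from ``y = 0`` are assumptions made for illustration, and the
	library may orient its rows differently.

	.. code-block:: python

	    # Assumed pitch dimensions in metres (an assumption for this sketch).
	    PITCH_LENGTH = 105.0
	    PITCH_WIDTH = 68.0


	    def zone_of(x, y, n_cols=12, n_rows=8):
	        """Return the (row, col) grid cell containing the point (x, y).

	        Columns run along the pitch length and rows along its width.
	        Points on or beyond the boundary are clamped into the nearest cell.
	        """
	        col = min(int(x / PITCH_LENGTH * n_cols), n_cols - 1)
	        row = min(int(y / PITCH_WIDTH * n_rows), n_rows - 1)
	        return max(row, 0), max(col, 0)


	    # The centre spot sits on a cell boundary and is assigned to the cell
	    # immediately past the halfway line.
	    centre = zone_of(52.5, 34.0)
	    # A central location close to the right-hand goal lands in the last column.
	    near_goal = zone_of(100.0, 34.0)

	With such a lookup in place, every start and end location of an action can
	be translated into a zone, and each zone into its :math:`xT(z)` value.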

	The code below loads league-wide xT values from the 2017-18 Premier League
	season (the 12x8 grid shown above). Instructions on how to train your own
	model can be found in the
	:doc:`detailed documentation about xT <valuing_actions/xT>`.

	.. code-block:: python

	    import socceraction.xthreat as xthreat

	    url_grid = "https://karun.in/blog/data/open_xt_12x8_v1.json"
	    xT_model = xthreat.load_model(url_grid)



	Subsequently, the model can be used to value actions that successfully move
	the ball between two zones by computing the difference between the threat
	values at the end and start locations of each action. The xT framework does
	not assign a value to failed actions, shots, or defensive actions such as
	tackles.

	.. code-block:: python

	    df_actions_ltr = spadl.play_left_to_right(df_actions, home_team_id)
	    df_actions["xT_value"] = xT_model.rate(df_actions_ltr)
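
	The valuation rule itself is simple enough to sketch in plain Python. The
	example below is an illustration rather than the library's code:
	``TOY_GRID`` holds made-up threat values, ``move_value`` is a hypothetical
	helper, and zones are assumed to be oriented so that the team attacks from
	left to right.

	.. code-block:: python

	    # Toy 2 x 3 threat grid (rows x columns); the values are made up.
	    # Columns run from the team's own goal (left) to the opponent's goal (right).
	    TOY_GRID = [
	        [0.01, 0.03, 0.10],
	        [0.01, 0.04, 0.12],
	    ]


	    def move_value(grid, start_zone, end_zone, successful=True):
	        """Value a ball-moving action as xT(end zone) - xT(start zone).

	        Zones are (row, col) indices into ``grid``. Failed actions receive
	        no value, which this sketch represents as None.
	        """
	        if not successful:
	            return None
	        start_row, start_col = start_zone
	        end_row, end_col = end_zone
	        return grid[end_row][end_col] - grid[start_row][start_col]


	    # A successful pass from the middle column into the final third gains threat.
	    forward_pass = move_value(TOY_GRID, (0, 1), (1, 2))
	    # The same pass played in the opposite direction loses the same amount.
	    back_pass = move_value(TOY_GRID, (1, 2), (0, 1))
	    # A failed pass is left unvalued.
	    failed_pass = move_value(TOY_GRID, (0, 1), (1, 2), successful=False)

	In this toy setting the forward pass is worth about 0.09 and the backward
	pass about -0.09, so moving the ball away from goal is penalised
	symmetrically.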


	.. image:: valuing_actions/eden_hazard_goal_xt.png
	    :align: center


	-----------------------

	Ready for more? Check out the detailed documentation about the
	:doc:`data representation <spadl/index>` and
	:doc:`action value frameworks <valuing_actions/index>`.