Initialize a GameDataset and some sample operations

A GameDataset instance can contain tracking and events data, held in a TrackingFrame and an EventsFrame respectively.

In [2]:
import sys
sys.path.insert(1, '../../')

from codeball import GameDataset, Zones

metadata_file = (r"../../codeball/tests/files/metadata.xml")
tracking_file = (r"../../codeball/tests/files/tracking.txt")
events_file = (r"../../codeball/tests/files/events.json")

game_dataset = GameDataset(
    tracking_metadata_file=metadata_file,
    tracking_data_file=tracking_file,
    events_metadata_file=metadata_file,
    events_data_file=events_file,
)

print(type(game_dataset.tracking))
print(type(game_dataset.events))
<class 'codeball.codeball_frames.TrackingFrame'>
<class 'codeball.codeball_frames.EventsFrame'>

Tracking

GameDataset.tracking holds a TrackingFrame with all the tracking data of the game.

In [7]:
game_dataset.tracking.head()
Out[7]:
period_id timestamp ball_state ball_owning_team_id ball_x ball_y P3578_x P3578_y P3568_x P3568_y ... P3590_x P3590_y P3591_x P3591_y P3592_x P3592_y P3593_x P3593_y P3594_x P3594_y
0 1 0.04 None None NaN NaN 0.84722 0.52855 0.65268 0.24792 ... 0.41381 0.52790 0.41787 0.48086 0.41215 0.36689 0.47050 0.73219 0.48864 0.36357
1 1 0.08 None None NaN NaN 0.84722 0.52855 0.65231 0.24513 ... 0.41375 0.52780 0.41719 0.47864 0.41132 0.36169 0.47040 0.73204 0.48834 0.36362
2 1 0.12 None None NaN NaN 0.84722 0.52855 0.65197 0.24387 ... 0.41371 0.52906 0.41697 0.47824 0.41131 0.36072 0.47075 0.73229 0.48814 0.36372
3 1 0.16 None None NaN NaN 0.84722 0.52855 0.65166 0.24288 ... 0.41370 0.53056 0.41685 0.47815 0.41117 0.35930 0.47118 0.73266 0.48793 0.36278
4 1 0.20 None None NaN NaN 0.84722 0.52855 0.65141 0.24251 ... 0.41369 0.53151 0.41669 0.47749 0.41120 0.35910 0.47163 0.73287 0.48784 0.36240

5 rows × 50 columns

If you want to filter the TrackingFrame, you can use its methods (on top of all standard DataFrame methods). For example, to get a TrackingFrame with only the data for the team with team_id FIFATMA, you can do:

In [8]:
game_dataset.tracking.team('FIFATMA').head()
Out[8]:
P3578_x P3578_y P3568_x P3568_y P3569_x P3569_y P3570_x P3570_y P3571_x P3571_y ... P3573_x P3573_y P3574_x P3574_y P3575_x P3575_y P3576_x P3576_y P3577_x P3577_y
0 0.84722 0.52855 0.65268 0.24792 0.66525 0.46562 0.68103 0.59083 0.62405 0.80669 ... 0.60798 0.45155 0.50212 0.45314 0.62012 0.60667 0.51839 0.77140 0.50555 0.50863
1 0.84722 0.52855 0.65231 0.24513 0.66482 0.46548 0.68095 0.59054 0.62371 0.80594 ... 0.60783 0.44918 0.50158 0.45544 0.61987 0.60474 0.51801 0.77130 0.50545 0.50532
2 0.84722 0.52855 0.65197 0.24387 0.66467 0.46537 0.68078 0.59035 0.62354 0.80601 ... 0.60779 0.44866 0.50126 0.45662 0.61980 0.60422 0.51787 0.77080 0.50552 0.50524
3 0.84722 0.52855 0.65166 0.24288 0.66460 0.46488 0.68063 0.58987 0.62318 0.80604 ... 0.60762 0.44898 0.50119 0.45815 0.61976 0.60397 0.51773 0.77031 0.50563 0.50524
4 0.84722 0.52855 0.65141 0.24251 0.66452 0.46469 0.68052 0.58934 0.62286 0.80626 ... 0.60748 0.44888 0.50114 0.45986 0.61967 0.60417 0.51759 0.77008 0.50576 0.50531

5 rows × 22 columns

As a final example, say you want the x coordinate data only for the field players (excluding the goalkeeper) of team_id FIFATMA. You can get that by doing:

In [6]:
game_dataset.tracking.team('FIFATMA').players('field').dimension('x').head()
Out[6]:
P3568_x P3569_x P3570_x P3571_x P3572_x P3573_x P3574_x P3575_x P3576_x P3577_x
0 0.65268 0.66525 0.68103 0.62405 0.50533 0.60798 0.50212 0.62012 0.51839 0.50555
1 0.65231 0.66482 0.68095 0.62371 0.50461 0.60783 0.50158 0.61987 0.51801 0.50545
2 0.65197 0.66467 0.68078 0.62354 0.50430 0.60779 0.50126 0.61980 0.51787 0.50552
3 0.65166 0.66460 0.68063 0.62318 0.50394 0.60762 0.50119 0.61976 0.51773 0.50563
4 0.65141 0.66452 0.68052 0.62286 0.50371 0.60748 0.50114 0.61967 0.51759 0.50576
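
For intuition, the same kind of selection can be approximated with plain pandas, relying only on the `<player_id>_<dimension>` column naming visible in the output above. This is a minimal sketch on a toy DataFrame, not how codeball implements `team`, `players` or `dimension`; the values are copied from the first two rows shown earlier, and treating P3578 as the goalkeeper is inferred from it being the only FIFATMA player missing from the field-player output.

```python
import pandas as pd

# Toy frame that mimics the tracking column naming (values copied from the output above).
frame = pd.DataFrame(
    {
        "P3578_x": [0.84722, 0.84722],
        "P3578_y": [0.52855, 0.52855],
        "P3568_x": [0.65268, 0.65231],
        "P3568_y": [0.24792, 0.24513],
    }
)

# Keep only the columns whose name ends in "_x".
x_only = frame.filter(regex=r"_x$")

# Drop the goalkeeper column by id (assumption: P3578 is the goalkeeper).
field_x = x_only.drop(columns=["P3578_x"])

print(list(field_x.columns))
```

The TrackingFrame methods are the more convenient route, since they resolve teams and positions from the metadata instead of hard-coded player ids.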

Events

Similarly, GameDataset.events holds an EventsFrame with all the events data of the game, and you can filter it using its methods. For example, to get all events that go into the opponent box you can do game_dataset.events.into(Zones.OPPONENT_BOX), or to get all the passes you can do:

In [15]:
game_dataset.events.type('PASS').head()
Out[15]:
event_id event_type result success period_id timestamp end_timestamp ball_state ball_owning_team team_id player_id coordinates_x coordinates_y end_coordinates_x end_coordinates_y receiver_player_id inverted
1 None PASS COMPLETE True 1 14.44 15.08 alive FIFATMA FIFATMA P3577 0.49875 0.51275 0.50136 0.51295 P3574 True
3 None PASS COMPLETE True 1 15.36 17.04 alive FIFATMA FIFATMA P3574 0.50300 0.51500 0.36627 0.36551 P3575 True
5 None PASS COMPLETE True 1 18.60 20.28 alive FIFATMA FIFATMA P3575 0.33014 0.40293 0.19398 0.60179 P3569 True
7 None PASS COMPLETE True 1 21.20 23.20 alive FIFATMA FIFATMA P3569 0.19071 0.57078 0.20094 0.18478 P3570 True
9 None PASS COMPLETE True 1 23.92 25.12 alive FIFATMA FIFATMA P3570 0.20244 0.18002 0.31899 0.01941 P3571 True
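
Because the result behaves like a DataFrame, derived columns can be computed with ordinary pandas operations. The sketch below works on a small stand-in DataFrame built from three of the pass rows shown above (only a subset of the columns), so it runs without the game files; the `duration` column name is an illustrative choice, not part of codeball.

```python
import pandas as pd

# Stand-in for the first three pass rows above (subset of columns, values copied).
passes = pd.DataFrame(
    {
        "player_id": ["P3577", "P3574", "P3575"],
        "receiver_player_id": ["P3574", "P3575", "P3569"],
        "timestamp": [14.44, 15.36, 18.60],
        "end_timestamp": [15.08, 17.04, 20.28],
    },
    index=[1, 3, 5],
)

# Time between the start and the end of each pass, in the same unit as the timestamps.
passes["duration"] = passes["end_timestamp"] - passes["timestamp"]

# Number of passes made by each player in this small sample.
passes_per_player = passes.groupby("player_id").size()

print(passes[["player_id", "receiver_player_id", "duration"]])
```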

Since in this game the tracking and event data come from the same provider, GameDataset.metadata is the same as GameDataset.tracking.metadata and GameDataset.events.metadata. There you can access metadata about the data, such as frame rate, field dimensions, and team and player details. For example:

In [23]:
game_dataset.metadata.teams[0].players[5].name
Out[23]:
'Player 5'
In [24]:
game_dataset.metadata.frame_rate
Out[24]:
25
In [25]:
game_dataset.metadata.score
Out[25]:
Score(home=0, away=2)
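
The frame rate ties back to the tracking timestamps shown earlier: at 25 frames per second, consecutive frames are 1/25 = 0.04 seconds apart, which matches the 0.04, 0.08, 0.12, ... sequence in the tracking output. A small sketch of that arithmetic, assuming (as that output suggests) that the first frame is stamped one frame period after zero:

```python
frame_rate = 25  # value reported by game_dataset.metadata.frame_rate above

# Time between two consecutive frames, in seconds.
frame_period = 1 / frame_rate

# Timestamps of the first five frames, assuming the first frame sits at one frame period.
timestamps = [round((i + 1) * frame_period, 2) for i in range(5)]

print(timestamps)
```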

For more details about metadata attributes and methods see kloppy's documentation.