Run the commands below to set up the virtual environment (using Poetry):
make activate
make setup
Run the command below to view the documentation with pdoc3:
make docs_view
In some cases we may want to tell Python where our code is. To do so, append the full path to the code (found by running pwd in the code folder) to the current PYTHONPATH variable:
export PYTHONPATH=$PYTHONPATH:/users/tulio/project_folder/code/
In this example, the path points to the folder that contains the source code.
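When exporting the variable is inconvenient (for example, inside a notebook), a similar effect can be obtained from Python itself by extending sys.path. The snippet below is a minimal sketch and not part of the project: the helper name add_to_path is hypothetical, and the path is the example path from the export command above.

```python
import sys
from pathlib import Path


def add_to_path(code_dir: str) -> str:
    """Append code_dir to sys.path if it is not already listed.

    Hypothetical helper for illustration; returns the normalized path string.
    """
    path = str(Path(code_dir).expanduser())
    if path not in sys.path:
        sys.path.append(path)
    return path


# Example path from the export command; replace it with the output of `pwd`.
add_to_path("/users/tulio/project_folder/code/")
```

Unlike the export command, this only affects the running interpreter session, so it has to be executed again in every new session.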
The code uses DVC for data versioning. To update the version of your dataset, run:
dvc repro
git add dvc.lock
git commit -m "DVC vX.X"
To store the data remotely, run:
dvc remote add --default <remote-name> <your-remote-bucket-url>
dvc push
To pull the data, run:
make pull_data
In this notebook, we build, train, validate, and test a neural network with PyTorch to predict the target_label field (win, draw, lose) of upcoming matches. The notebook covers:
- Read the dataset
- Data Processing
- Neural Network Training and Validation
- Test the Neural Network
- Improvement ideas
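As a rough illustration of the prediction target, the sketch below derives a three-class label from a match score and maps it to an integer class index, which is the form a classifier typically trains on. It assumes the label is defined from the home team's perspective; the function name, the class order, and the sample scores are assumptions for illustration and are not taken from the notebook.

```python
# Assumed class order; the notebook may encode the classes differently.
CLASSES = ["win", "draw", "lose"]
CLASS_TO_INDEX = {name: index for index, name in enumerate(CLASSES)}


def target_label(home_score: int, away_score: int) -> str:
    """Return the match outcome from the home team's perspective (assumption)."""
    if home_score > away_score:
        return "win"
    if home_score == away_score:
        return "draw"
    return "lose"


# Made-up scores: a home win, a draw, and a home loss.
labels = [target_label(h, a) for h, a in [(2, 1), (0, 0), (1, 3)]]
indices = [CLASS_TO_INDEX[label] for label in labels]
print(labels)   # ['win', 'draw', 'lose']
print(indices)  # [0, 1, 2]
```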
Rosters schema:
- id: (INT) Identifier of roster, which is unique for a player and match
- goals: (INT) Number of goals scored by player in that match
- shots: (INT) Number of shots to the goal for that player
- own_goals: (INT) Number of own goals scored by player in that match
- xG: (FLOAT) Expected goals for that player
- time: (INT) Field time in minutes of player
- player_id: (INT) Unique identifier for that player.
- team_id: (INT) Unique identifier of that team
- position: (INT) Position played by that player
- player: (STR) Name of the player
- h_a: (STR, ['h','a']) Home or away
- yellow_card: (INT, [0,1,2]) Number of yellow cards
- red_card: (INT, [0,1]) Number of red cards
- roster_in: (INT) ID of roster that substituted this player
- roster_out: (INT) ID of roster that left to give place to this player
- key_passes: (INT) Number of key passes
- assists: (INT) Number of assists to goals
- xA: (FLOAT) Expected assists to goals
- xGChain: (FLOAT) Expected goals chain
- xGBuildup: (FLOAT) Expected goals buildup
- positionOrder: (INT) Order in lineup position
- date: (DATE) Date of match
- homeScore: (INT) Score of home team
- awayScore: (INT) Score of away team
- matchId: (INT) Unique identifier of the match
Teams schema:
- matchId: (INT) Unique identifier of the match
- teamId: (INT) Unique identifier of the team
- h_a: (STR, ['h','a']) Home or away
- xG: (FLOAT) Expected goals for the team
- xGA: (FLOAT) Expected goals against
- npxG: (FLOAT) Expected goals for the team (excluding penalties and own goals)
- npxGA: (FLOAT) Expected goals against (excluding penalties and own goals)
- deep: (FLOAT) Passes completed within an estimated 20 yards of goal (crosses excluded)
- deep_allowed: (FLOAT) Allowed deep passes for the opposite team
- scored: (INT) Goals scored
- missed: (INT) Goals scored against
- xpts: (FLOAT) Expected points
- result: (STR, ['l','w','d']) Match result: loss, win, or draw
- wins: (BOOLEAN) True if team wins
- draws: (BOOLEAN) True if team draws
- loses: (BOOLEAN) True if team loses
- pts: (INT) Points gained for that team
- npxGD: (FLOAT) Difference between expected goals for and against, excluding penalties and own goals.
- ppda.att: (FLOAT) Passes per defensive action in the attack part of the field. (The PPDA metric is calculated by dividing the number of passes allowed by the defending team by the total number of defensive actions.)
- ppda.def: (FLOAT) Passes per defensive action in the defensive part of the field.
- ppda_allowed.att: (FLOAT) Opponent passes per defensive action in the attack part of the field.
- ppda_allowed.def: (FLOAT) Opponent passes per defensive action in the defensive part of the field.
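The PPDA definition above, the npxGD difference, and the link between result and pts can be sketched in a few lines. The function names and sample numbers are made up for illustration, and the 3/1/0 points rule is the usual league convention, assumed here to apply to this dataset.

```python
def ppda(passes_allowed: float, defensive_actions: float) -> float:
    """Passes allowed by the defending team per defensive action."""
    if defensive_actions <= 0:
        raise ValueError("defensive_actions must be positive")
    return passes_allowed / defensive_actions


def npxgd(npxg: float, npxga: float) -> float:
    """Expected goals for minus expected goals against (npxG - npxGA)."""
    return npxg - npxga


# Assumed points rule: 3 for a win, 1 for a draw, 0 for a loss.
POINTS = {"w": 3, "d": 1, "l": 0}


def points_for(result: str) -> int:
    """Map a result code ('w', 'd', 'l') to league points."""
    return POINTS[result]


print(ppda(240, 20))     # 12.0
print(npxgd(1.5, 0.5))   # 1.0
print(points_for("d"))   # 1
```

A lower PPDA value means the team allows fewer passes per defensive action, which is usually read as more intense pressing.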
Additional fields:
- home_points: (INT) Points in season before match for home team
- away_points: (INT) Points in season before match for away team
- scored_goals_season_h: (INT) Goals scored in season for home team
- missed_goals_season_h: (INT) Goals missed in season for home team
- scored_goals_season_a: (INT) Goals scored in season for away team
- missed_goals_season_a: (INT) Goals missed in season for away team
- league: (STR) League of the match
- season: (INT) Season of the match
- (TO BE INSERTED) n_points_h: (INT) Points earned in the last N encounters for home team
- (TO BE INSERTED) n_points_a: (INT) Points earned in the last N encounters for away team
- (TO BE INSERTED) top_assists_h: (FLOAT) Highest individual season assists.
- (TO BE INSERTED) top_score_a: (FLOAT) Highest individual season score in squad
- avg_ppda.att_h: (FLOAT) Season average of ppda.att for host team
- avg_ppda.def_h: (FLOAT) Season average of ppda.def for host team
- avg_ppda.att_a: (FLOAT) Season average of ppda.att for visiting team
- avg_ppda.def_a: (FLOAT) Season average of ppda.def for visiting team
- avg_ppda_allowed.att_h: (FLOAT) Season average of ppda_allowed.att for host team
- avg_ppda_allowed.def_h: (FLOAT) Season average of ppda_allowed.def for host team
- avg_ppda_allowed.att_a: (FLOAT) Season average of ppda_allowed.att for visiting team
- avg_ppda_allowed.def_a: (FLOAT) Season average of ppda_allowed.def for visiting team
- avg_deep_h: (FLOAT) Season average of deep passes for host team
- avg_deep_a: (FLOAT) Season average of deep passes for visitor team
- avg_deep_allowed_h: (FLOAT) Season average of allowed deep passes for host team
- avg_deep_allowed_a: (FLOAT) Season average of allowed deep passes for visitor team ... to be continued
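Fields such as home_points and away_points describe the state of the season before the match is played, so they have to be computed from earlier matches only. The sketch below shows one way to do this with running totals. It is an illustration under assumptions, not the project's implementation: the function name and the toy matches are made up, all matches are assumed to belong to one season and to be sorted by date, and the 3/1/0 points rule is assumed.

```python
from collections import defaultdict

# Toy matches in chronological order: (home, away, home_goals, away_goals).
matches = [
    ("A", "B", 2, 0),
    ("B", "A", 1, 1),
    ("A", "B", 0, 3),
]


def add_points_before_match(matches):
    """Attach each team's season points as they stood before kickoff."""
    points = defaultdict(int)  # running season points per team
    rows = []
    for home, away, home_goals, away_goals in matches:
        # Record the features first, so the current match does not leak in.
        rows.append({
            "home": home,
            "away": away,
            "home_points": points[home],
            "away_points": points[away],
        })
        # Only then update the running totals with this match's outcome.
        if home_goals > away_goals:
            points[home] += 3
        elif home_goals < away_goals:
            points[away] += 3
        else:
            points[home] += 1
            points[away] += 1
    return rows


rows = add_points_before_match(matches)
for row in rows:
    print(row["home"], row["away"], row["home_points"], row["away_points"])
# A B 0 0
# B A 0 3
# A B 4 1
```

The same record-then-update pattern can be reused for the season averages, by keeping a running sum and a match count per team instead of a points total.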