This project will identify the jersey numbers of American football players in broadcast footage using a two-stage approach. The first stage will be a pre-trained Mask R-CNN that detects players, and the second stage will be a fine-tuned Faster R-CNN that extracts digits from the player bounding boxes.
# From repository root directory
python main/main.py <INPUT_VIDEO_PATH>
- Capture frames from two Alabama football games (1280 x 720 resolution), one with home uniforms and the other with away uniforms.
- Do a pass over the frames, removing any that do not contain at least one Alabama player with a completely visible jersey number.
- Run a pre-trained Mask R-CNN to extract person bounding boxes from the frames.
- Pad the bounding boxes with zeros (black pixels) to create square images, then rescale them to 256 x 256. Discard non-Alabama players.
- Label the digits with the VGG Image Annotator (VIA) tool.
For more details, see the data folder, which has its own README.md file.
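
The padding and rescaling step can be illustrated with a minimal pure-Python sketch. It assumes a crop is a 2-D list of pixel values, that the crop is centered inside the square, and that nearest-neighbour sampling is used for the resize; none of these details are stated above, and the function names `pad_to_square` and `resize_nearest` are hypothetical. A real pipeline would more likely use an image library.

```python
def pad_to_square(image, fill=0):
    """Pad a 2-D image (list of rows) with `fill` so that height == width.

    Centering the crop is an assumption; the step above only says the
    boxes are padded with zeros.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    side = max(height, width)
    top = (side - height) // 2
    left = (side - width) // 2
    padded = [[fill] * side for _ in range(side)]
    for y, row in enumerate(image):
        # Copy the original row into the centered window of the padded row.
        padded[top + y][left:left + width] = row
    return padded


def resize_nearest(image, size=256):
    """Resize a square image to size x size using nearest-neighbour sampling."""
    side = len(image)
    return [
        [image[y * side // size][x * side // size] for x in range(size)]
        for y in range(size)
    ]


# Stand-in for a player crop that is 40 pixels wide and 100 pixels tall.
crop = [[255] * 40 for _ in range(100)]
square = resize_nearest(pad_to_square(crop), size=256)
```

Padding before resizing keeps the player's aspect ratio intact, so digits are not stretched horizontally or vertically in the 256 x 256 image.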
- Start with the same pre-trained Mask R-CNN as above. As shown in this tutorial, it is possible to train the Mask R-CNN for object detection using a dataset that only contains bounding boxes.
- Evaluate performance on `football_player_test` for the following three scenarios:
  - Fine-tuning with the Street View House Numbers (SVHN) dataset
  - Fine-tuning with `football_player_train`
  - Fine-tuning with SVHN + `football_player_train`
- The best-performing model is hereafter referred to as `jersey_number_detector`.
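
The evaluation metric is not specified above. As one common building block for comparing predicted and labeled digit boxes, the sketch below computes intersection over union (IoU). It assumes boxes are `(x1, y1, x2, y2)` tuples in pixel coordinates, and the function name `box_iou` is hypothetical.

```python
def box_iou(box_a, box_b):
    """Return the intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

A prediction is often counted as correct when its class matches the label and its IoU with the labeled box exceeds a chosen threshold such as 0.5; whether this project uses that rule is an assumption.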
- Improve speed by pipelining the two stages
- Synthesize the discrete digits into actual jersey numbers (e.g., '2' and '9' --> '29')
- Filter out away team jersey numbers based on color, which will vary per game
- Filter out sideline player noise
- Supplement the jersey numbers with roster information (player names)
- Add game context using OCR on the scoreboard
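
As an illustration of the digit-synthesis item, here is one possible approach, not an implemented feature: keep the most confident digit detections within a player crop and read them left to right. The detection format `(digit, score, (x1, y1, x2, y2))`, the function name `assemble_jersey_number`, and the `min_score` threshold are all assumptions made for this sketch.

```python
def assemble_jersey_number(detections, min_score=0.5, max_digits=2):
    """Join per-digit detections into one number string, left to right.

    Each detection is a hypothetical (digit, score, (x1, y1, x2, y2)) tuple.
    Jersey numbers have at most two digits, hence max_digits=2.
    """
    confident = [d for d in detections if d[1] >= min_score]
    # Keep only the highest-scoring digits if there are too many.
    top = sorted(confident, key=lambda d: d[1], reverse=True)[:max_digits]
    # Order the remaining digits by the horizontal center of their boxes.
    left_to_right = sorted(top, key=lambda d: (d[2][0] + d[2][2]) / 2)
    return "".join(str(d[0]) for d in left_to_right)


detections = [
    (9, 0.91, (140, 90, 170, 150)),
    (2, 0.88, (105, 90, 135, 150)),
    (7, 0.20, (10, 10, 20, 30)),  # low-confidence noise, filtered out
]
print(assemble_jersey_number(detections))  # prints 29
```

Returning a string rather than an integer preserves a leading zero if one is detected, and an empty string signals that no confident digits were found.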