Using Deep Learning to Identify Cyclists Risk Factors in London

Jupyter Notebook | Report | Presentation

Aim
GSV Dataset
Object Detection | YOLOv5
Image Segmentation | PSPNet101

Aim

The aim of this project was to use imagery to estimate road safety in London from a cyclist’s perspective. After a brief introduction to the most important road safety indicators, a ranked list of risk factors was compiled. The risk factors were extracted from a Google StreetView (GSV) imagery dataset using the object detection model YOLOv5 (released in June 2020 by Glenn Jocher) and the image segmentation model PSPNet101 (Pyramid Scene Parsing Network, released in July 2017 by Hengshuang Zhao et al.).

The imagery dataset contains 518 350 images of Greater London, distributed across 4833 LSOAs (Lower Layer Super Output Areas). Each image is labeled with the LSOA it belongs to. Images are organized in sets of 4, one per 90º viewing angle, taken at a total of 129 588 points.

Both YOLOv5 and PSPNet101 were benchmarked and validated using a set of 1 image per LSOA from the dataset.

Data was stored and processed on the secure High Performance Cluster at Imperial College London.

GSV Dataset

Description

Throughout this project, a Google StreetView imagery dataset of Greater London was used. It includes approximately half a million images distributed across all LSOAs. For each data point there are 4 images which together cover the full 360º view. The images had already been pre-processed (not as part of this project) to guarantee uniformity across them. More details are provided below.

Number of Images per LSOA in Greater London

Knowing the number of available images per LSOA allows us to normalize the object counts in each area.
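A minimal sketch of this normalization, assuming two dictionaries keyed by LSOA code: one with raw object counts and one with the number of images. The LSOA codes, the counts and the function name are made up for illustration and are not taken from the project's files.

```python
def average_per_image(object_counts, image_counts):
    """Divide each LSOA's object count by its number of images."""
    averages = {}
    for lsoa, count in object_counts.items():
        n_images = image_counts.get(lsoa, 0)
        if n_images == 0:
            continue  # an LSOA without images cannot be normalized
        averages[lsoa] = count / n_images
    return averages


# Hypothetical values, for illustration only
car_counts = {"E01000001": 120, "E01000002": 30, "E01000003": 5}
image_counts = {"E01000001": 40, "E01000002": 10, "E01000003": 0}

car_average = average_per_image(car_counts, image_counts)
print(car_average)
```

Dividing by the image count makes densely and sparsely photographed LSOAs comparable, which a raw count would not.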

Distribution by Latitude and Longitude of All Image Locations

There is a higher density of GSV images in Central London.

Example of Data Point with 4 Images Covering 360º Angle

Each of the 4 images at a data point covers a 90º angle.

img_id = 23052

Number of Available Images per LSOA in the Dataset

Distribution stats on the availability of GSV images across Greater London LSOAs.

Minimum	Maximum	Mean	Standard Deviation	Mode	Median
1	211	27	24	25	11

Total Number of Available Images in the Complete GSV Dataset

Not all images present in the GSV imagery dataset are LSOA labeled. For this reason, only 478 724 of the 518 350 were used when performing object detection or image segmentation.

Number of Images in GSV Dataset	Number of LSOA-Identified Images (image_labels.csv)	Number of Non-Repeated LSOA-Identified Images (image_labels.csv)	Number of Image-Identified LSOAs (image_labels.csv)
518 350	512 812	478 724	4832
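The gap between LSOA-identified and non-repeated images comes from repeated entries. A rough sketch of such a deduplication, assuming each row of image_labels.csv pairs an image ID with an LSOA code; the real column layout is not documented here, so the rows and the function name below are hypothetical.

```python
def summarize_labels(rows):
    """rows: list of (image_id, lsoa_code) pairs, possibly with repeats."""
    first_label = {}
    for image_id, lsoa in rows:
        # Keep only the first label seen for each image ID
        first_label.setdefault(image_id, lsoa)
    return {
        "labelled_rows": len(rows),
        "unique_images": len(first_label),
        "lsoas": len(set(first_label.values())),
    }


# Hypothetical rows, for illustration only
rows = [
    (1001, "E01000001"),
    (1002, "E01000001"),
    (1002, "E01000001"),  # repeated entry
    (1003, "E01000002"),
]

summary = summarize_labels(rows)
print(summary)
```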

Generated Files

GSV generated files are available in this project's repository.

File	Description
imgId_lsoa.json	Mapping from GSV image IDs to the London LSOAs they belong to.
lsoa_number_images.json	Number of GSV images for each London LSOA.
london_shapefiles	Collection of shapefiles of London OAs, MSOAs and LSOAs.

Object Detection | YOLOv5

Description

YOLOv5 was, at the time of this project, the most recent version of YOLO, which was originally developed by Joseph Redmon. The first version runs on Darknet, a framework purpose-built to run YOLO.

Version 5 is the 2nd version not developed by Joseph Redmon (after version 4) and the first to run on a state-of-the-art machine learning framework, in this case PyTorch.

This model was pre-trained on the MS COCO dataset. It is therefore able to identify 80 object categories, grouped into the 12 supercategories shown below.

Full list of MS Coco categories

Person	Vehicle	Outdoor	Animal	Accessory	Sports	Kitchen	Food	Furniture	Electronic	Appliance	Indoor
Person	Bicycle	Traffic Light	Bird	Backpack	Frisbee	Bottle	Banana	Chair	TV	Microwave	Book
	Car	Fire Hydrant	Cat	Umbrella	Skis	Wine Glass	Apple	Couch	Laptop	Oven	Clock
	Motorcycle	Stop Sign	Dog	Handbag	Snowboard	Cup	Sandwich	Potted Plant	Mouse	Toaster	Vase
	Airplane	Parking Meter	Horse	Tie	Sports Ball	Fork	Orange	Bed	Remote	Sink	Scissors
	Bus	Bench	Sheep	Suitcase	Kite	Knife	Broccoli	Dining Table	Keyboard	Refrigerator	Teddy Bear
	Train		Cow		Baseball Bat	Spoon	Carrot	Toilet	Cell Phone		Hair Drier
	Truck		Elephant		Baseball Glove	Bowl	Hot Dog				Toothbrush
	Boat		Bear		Skateboard		Pizza
			Zebra		Surfboard		Donut
			Giraffe		Tennis Racket		Cake

YOLOv5 Executed in a Static Image from the Dataset

This example illustrates the power of this tool well. The algorithm even detected the reflection of a car in a nearby window as the right object.

YOLOv5 Executed in Real-Time in a Video from London

Video uploaded to YouTube showing how YOLOv5 detects objects in real time and with high accuracy, across a wide range of sizes and even when they are partly occluded by other objects.

Number of Detections to the Top 15 Most Common Objects

The 15 most commonly detected objects in the GSV dataset include the ones identified as highly relevant to assessing cyclists' road safety.

Object	Number of Detections*	Object	Number of Detections*	Object	Number of Detections*
Car	1 509 344	Bicycle	10 894	Chair	2191
Person	107 266	Motorcycle	8970	Handbag	2090
Truck	70 083	Traffic Light	6310	Backpack	1939
Potted Plant	37 917	Bench	5013	Stop Sign	1282
Bus	11 512	Clock	2750	Fire Hydrant	1168

* Detections with a YOLOv5 confidence score >= 0.5
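The thresholded counting can be sketched as follows. This assumes detections are already available as (class name, score) pairs; that representation and the sample values are illustrative and are not YOLOv5's actual output format.

```python
from collections import Counter

SCORE_THRESHOLD = 0.5  # threshold stated in the footnote above


def count_detections(detections, threshold=SCORE_THRESHOLD):
    """Count detections per class, keeping only scores >= threshold."""
    return Counter(name for name, score in detections if score >= threshold)


# Hypothetical detections for one image
detections = [
    ("car", 0.91),
    ("car", 0.48),      # dropped: below the threshold
    ("person", 0.50),   # kept: the threshold is inclusive
    ("bicycle", 0.77),
    ("clock", 0.12),    # dropped
]

counts = count_detections(detections)
print(counts)
```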

LSOA Objects Distribution in Greater London

Distribution by LSOA of the most relevant objects, with the corresponding histograms on the right.

Bicycle LSOA (↑)	Bicycle Distribution Histogram (↑)

Bus LSOA (↓)	Bus Distribution Histogram (↓)

Car LSOA (↓)	Car Distribution Histogram (↓)

Parking Meter LSOA (↓)	Parking Meter Distribution Histogram (↓)

Person LSOA (↑)	Person Distribution Histogram (↑)

Stop Sign LSOA (↑)	Stop Sign Distribution Histogram (↑)

Traffic Light LSOA (↑)	Traffic Light Distribution Histogram (↑)

Truck LSOA (↓)	Truck Distribution Histogram (↓)

* ↑ and ↓ mark objects that were positively and negatively associated with road safety, respectively.

Combining Some of the Previous Risk Factors

Five of the previous LSOA maps were combined into two. The first measures the total number of pedestrians and cyclists in London (in the context of this project, their presence was perceived as a safety-enhancing factor for other cyclists). The second plots the total number of motorized vehicles in London.

Pedestrians and Cyclists in Greater London (average number per image) (↑)	Traffic (buses, cars and trucks) in Greater London (average number per image) (↓)

Combination of the 2 Previous LSOAs

During this project, we did not define a precise metric for assessing cyclist road safety. However, one strong possibility would be a weighted combination of positive and negative risk factors, such as the ones shown in the LSOA distributions above.
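A weighted combination of this kind could look like the sketch below. The grouping into positive and negative factors follows the ↑ and ↓ marks above, but the equal weights, the function name and the sample averages are assumptions made for illustration; the project did not define such a metric.

```python
# Assumed weights (all equal); the project did not fix any weighting
POSITIVE_WEIGHTS = {"person": 1.0, "bicycle": 1.0, "stop sign": 1.0, "traffic light": 1.0}
NEGATIVE_WEIGHTS = {"bus": 1.0, "car": 1.0, "parking meter": 1.0, "truck": 1.0}


def safety_score(avg_per_image):
    """Weighted positive factors minus weighted negative factors."""
    positive = sum(w * avg_per_image.get(name, 0.0) for name, w in POSITIVE_WEIGHTS.items())
    negative = sum(w * avg_per_image.get(name, 0.0) for name, w in NEGATIVE_WEIGHTS.items())
    return positive - negative


# Hypothetical average number of objects per image in one LSOA
lsoa_averages = {"person": 2.0, "bicycle": 0.5, "car": 3.0, "bus": 0.25}

score = safety_score(lsoa_averages)
print(score)
```

With equal weights, cars dominate the score because they are by far the most frequent object, so any real metric would likely need weights that account for the very different scales of each factor.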

Top 15 Detected Objects Correlation Matrix

Includes the Pearson correlation coefficient for each pair of objects, plus the respective p-values.
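For reference, the Pearson coefficient between two per-LSOA series can be computed as in this standard-library sketch. The sample series are invented, and p-values are left out: they would normally come from a statistics library (for example `scipy.stats.pearsonr`), which may or may not be what the project used.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined for a constant series
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(series):
    """Coefficient for every unordered pair of object classes."""
    return {(a, b): pearson(series[a], series[b]) for a, b in combinations(series, 2)}


# Hypothetical per-LSOA averages for three object classes
per_lsoa = {
    "car": [10, 20, 30, 40],
    "truck": [1, 2, 3, 4],
    "bicycle": [4, 3, 2, 1],
}

matrix = correlation_matrix(per_lsoa)
for pair, r in matrix.items():
    print(pair, round(r, 3))
```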

GIF Representation of the 2 Most Correlated Objects

This GIF highlights the similar distributions of the 2 most correlated objects in the correlation matrix above.

Top 15 Detected Objects Distribution

The top 15 detections contain all the objects that were defined as relevant for assessing road safety from a cyclist's perspective. One immediate observation is that the majority of the detected objects were cars. This is not surprising, since GSV images are taken from the road.

Detailed Object Detection Information for All Categories in MS Coco, Present in the GSV Imagery

The table below provides, for each category, the total number of occurrences and the minimum, maximum and mean number of objects per London LSOA.

COCO Objects Stats for all LSOAs

Category	Total Number Occurrences	Minimum	Maximum	Mean
Person	107 266	0	695	22
Bicycle	10 894	0	144	2
Car	1 509 344	13	1891	312
Motorcycle	8970	0	74	1
Airplane	234	0	4	0
Bus	11 512	0	36	2
Train	657	0	5	0
Truck	70 083	0	192	14
Boat	971	0	22	0
Traffic Light	6310	0	54	1
Fire Hydrant	1168	0	11	0
Stop Sign	1282	0	8	0
Parking Meter	968	0	7	0
Bench	5013	0	23	1
Bird	509	0	9	0
Cat	27	0	2	0
Dog	419	0	3	0
Horse	35	0	2	0
Sheep	13	0	5	0
Cow	79	0	2	0
Elephant	2	0	1	0
Bear	3	0	1	0
Zebra	5	0	1	0
Giraffe	22	0	1	0
Backpack	1939	0	20	0
Umbrella	378	0	9	0
Handbag	2090	0	28	0
Tie	39	0	5	0
Suitcase	467	0	8	0
Frisbee	384	0	4	0
Skis	2	0	1	0
Snowboard	0	0	0	0
Sports Ball	102	0	4	0
Kite	465	0	16	0
Baseball Bat	7	0	3	0
Baseball Glove	1	0	1	0
Skateboard	245	0	3	0
Surfboard	80	0	2	0
Tennis Racket	13	0	1	0
Bottle	71	0	9	0
Wine Glass	1	0	1	0
Cup	9	0	2	0
Fork	0	0	0	0
Knife	0	0	0	0
Spoon	1	0	1	0
Bowl	6	0	2	0
Banana	6	0	3	0
Apple	6	0	2	0
Sandwich	8	0	3	0
Orange	2	0	1	0
Broccoli	1	0	1	0
Carrot	0	0	0	0
Hot Dog	1	0	1	0
Pizza	4	0	2	0
Donut	3	0	1	0
Cake	1	0	1	0
Chair	2191	0	56	0
Couch	16	0	2	0
Potted Plant	37 917	0	406	7
Bed	30	0	2	0
Dining Table	133	0	9	0
Toilet	30	0	3	0
TV	68	0	2	0
Laptop	1	0	1	0
Mouse	0	0	0	0
Remote	0	0	0	0
Keyboard	0	0	0	0
Cell Phone	21	0	2	0
Microwave	4	0	1	0
Oven	6	0	1	0
Toaster	0	0	0	0
Sink	4	0	1	0
Refrigerator	320	0	7	0
Book	11	0	7	0
Clock	2750	0	31	0
Vase	17	0	4	0
Scissors	1	0	1	0
Teddy Bear	4	0	1	0
Hair Dryer	0	0	0	0
Toothbrush	0	0	0	0
Total	1 785 642	0	1891	370

YOLOv5 Limitations

For all the road objects we intended to identify, accuracy was very high, with very few misclassifications thanks to the high detection threshold that was set (0.5). For the same reason, the true number of objects in the images is likely to be higher than the number detected. Among other objects, satellite dishes were often misclassified as clocks: in frontal images there is a strong resemblance between clock hands and dish arms. Construction containers were wrongly classified as boats due to their shape, and fences as benches, presumably due to their texture. Streetlights were classified as kites and frisbees, possibly because they share a similar background: the sky.

Generated Files

All the generated files are available on the project's repository or, in the case of the object detected images (1 per LSOA), in a linked Google Drive folder.

File	Description
total_stats.json	Number of objects detected by YOLOv5 in GSV imagery by class.
lsoa_objects_number.json	Number of objects detected by YOLOv5 in GSV imagery by class and LSOA.
lsoa_objects_number_average_per_image.json	Average number of objects detected by YOLOv5 in GSV imagery per image (includes all classes and LSOAs). JSON format.
lsoa_objects_number_average_per_image.csv	Average number of objects detected by YOLOv5 in GSV imagery per image (includes all classes and LSOAs). CSV format.
yolov5_lsoa	Folder with 1 processed image per LSOA.
img_ids_clock.txt	List of all image IDs in GSV imagery dataset where clocks were detected.

Future Directions

The analysis of a significant set of GSV images in London unveiled meaningful LSOA-level patterns. First, airplanes were concentrated in the areas closest to the 2 airports in Greater London. Second, potted plants were more prevalent around the biggest parks. This shows that the potential of GSV imagery analysis is not limited to assessing road safety.

Airplane	Potted Plant

Image Segmentation | PSPNet101

Description

Image segmentation models reached a precision plateau (in terms of average IoU) over the previous 2 years. Given their long execution times, the model chosen was the one that combined the fastest execution with the highest precision and the best documentation.

PSPNet101 was pre-trained on the Cityscapes dataset. As a result, it labels every pixel of an image with one of the Cityscapes categories listed below.

Full list of Cityscapes categories

Void	Flat	Construction	Object	Nature	Sky	Human	Vehicle
Unlabeled	Road	Building	Pole	Vegetation	Sky	Person	Car
Ego Vehicle	Sidewalk	Wall	Polegroup	Terrain		Rider	Truck
Rectification Border	Parking	Fence	Traffic Light				Bus
Out of ROI	Rail Track	Guard Rail	Traffic Sign				Caravan
Static		Bridge					Trailer
Dynamic		Tunnel					Train
Ground							Motorcycle
							Bicycle
							License Plate

Example of a Segmented Image with Identified Labels Included

After running PSPNet101 on an image from the dataset, we obtain a segmented image in which every pixel is colored according to the category it belongs to. A dictionary was created to link each of these colors to its object category.
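The color-to-label lookup can be sketched as below. The RGB values are those of the standard Cityscapes palette; the project's own rgb_label.json may use a different structure or values, and the tiny 3x3 "image" is invented for illustration.

```python
from collections import Counter

# Subset of the standard Cityscapes palette (assumed to match the segmentation output)
RGB_TO_LABEL = {
    (128, 64, 128): "road",
    (244, 35, 232): "sidewalk",
    (70, 70, 70): "building",
    (107, 142, 35): "vegetation",
    (70, 130, 180): "sky",
    (0, 0, 142): "car",
}


def count_labels(image):
    """Count pixels per label in an image given as rows of RGB tuples."""
    counts = Counter()
    for row in image:
        for pixel in row:
            counts[RGB_TO_LABEL.get(tuple(pixel), "unknown")] += 1
    return counts


SKY, BUILDING, ROAD = (70, 130, 180), (70, 70, 70), (128, 64, 128)

# Hypothetical 3x3 segmented image
segmented = [
    [SKY, SKY, SKY],
    [BUILDING, BUILDING, SKY],
    [ROAD, ROAD, ROAD],
]

pixel_counts = count_labels(segmented)
print(pixel_counts)
```

Summing such per-image counters over the whole dataset would give per-label pixel totals.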

Segmented Images Distribution by Number of Pixels

Road-safety-related objects are among the most detected. Consequently, PSPNet101 pre-trained on Cityscapes is an appropriate tool to extract relevant information on this topic.

Number of Labeled Pixels for the Top 20 Most Common Categories

Due to time constraints, and unlike in the object detection part, it was only possible to analyse the overall presence of pixel labels at the dataset level (not per LSOA).

Pixel Label	Number Pixels	Pixel Label	Number Pixels	Pixel Label	Number Pixels	Pixel Label	Number Pixels
Building	47 394 852 284	Sidewalk	2 772 560 820	Motorcycle	299 507 380	Traffic Sign	58 135 598
Sky	38 423 367 965	Fence	2 177 733 764	Person	232 309 236	Rider	13 948 361
Road	38 235 843 337	Terrain	1 787 689 493	Bicycle	95 469 333	Traffic Light	12 472 659
Vegetation	30 977 112 560	Wall	765 524 909	Truck	91 256 316	Train	6 842 318
Car	9 830 297 990	Pole	303 407 190	Bus	81 476 810	Total	173 559 808 323
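To read these totals as proportions, each count can be divided by the dataset total. The sketch below uses the five largest values copied from the table above; the function and variable names are illustrative.

```python
# Pixel counts for the five most common labels, copied from the table above
PIXELS = {
    "Building": 47_394_852_284,
    "Sky": 38_423_367_965,
    "Road": 38_235_843_337,
    "Vegetation": 30_977_112_560,
    "Car": 9_830_297_990,
}
TOTAL_PIXELS = 173_559_808_323  # "Total" entry of the table


def pixel_shares(pixels, total):
    """Fraction of all labeled pixels taken by each label."""
    return {label: count / total for label, count in pixels.items()}


shares = pixel_shares(PIXELS, TOTAL_PIXELS)
for label, share in shares.items():
    print(f"{label}: {share:.1%}")
```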

PSPNet101 Limitations

The main difficulties of image segmentation are:

Account for image angles when trying to capture the shape of an object;
Object occlusion;
Sometimes roads and sidewalks appear unexpectedly disrupted;
Image resolution. In the case of structures with a small area (streetlights), it might not be possible to segment them due to low resolution. This happens because the imagery dataset, which was extracted from GSV, did not keep the original quality.

Generated Files

All the generated files are available on the project's repository or, in the case of the segmented images (1 per LSOA), in a linked Google Drive folder.

File	Description
total_stats.json	Total number of pixels for each Cityscapes label in the GSV dataset.
rgb_label.json	Conversion from RGB values to the respective Cityscapes label.
pspnet101_lsoa	Folder with 1 segmented image per LSOA.

Future Directions

Analysing segmented images road by road;
Having a higher-resolution London imagery dataset with better coverage of the whole Greater London territory;
Linking the image segmentation analysis with the objects detected using YOLOv5;
Although this would not represent a significant improvement, using a more precise pre-trained model like Xception71 available in TensorFlow DeepLab Model Zoo would increase the quality of the segmented images.

Supervisors

Majid Ezzati (Imperial College London) | Ricky Nathvani (Imperial College London)

Featured in Towards Data Science (Medium) -> Article

Roadmap -> Wiki

Draft -> Google Doc

luisdrita / roadsafety
