WEATHER-RELATED TRAFFIC ACCIDENT PREDICTION USING RANDOM FOREST WITH CATBOOST AND LIGHTGBM

Page 1 (0s)

WEATHER-RELATED TRAFFIC ACCIDENT PREDICTION USING RANDOM FOREST WITH CATBOOST AND LIGHTGBM.

Page 2 (38s)

[Audio] I will start by introducing the use of Random Forest with CatBoost and LightGBM for predicting weather-related traffic accidents. I will then explain the research aims and objectives, review the relevant literature, and describe the methodology. After that, I will present the results and conclusions, and close with recommendations for future work.

Page 3 (1m 8s)

[Audio] I will be discussing how advanced machine learning algorithms can be used to accurately predict weather-related traffic accidents. The research was conducted using the Large-Scale Traffic and Weather dataset, which contains real-world traffic and weather data from across the United States. Studying this dataset made it possible to identify patterns and build a working model that improves the forecasting of traffic conditions during extreme weather. CatBoost and LightGBM were used to predict the likelihood of a traffic accident in adverse weather conditions. The results of this study are discussed later in the presentation.

Page 4 (1m 48s)

[Audio] The purpose of this presentation is to demonstrate that random forest algorithms can forecast weather-related traffic accidents. To do so, two machine learning algorithms, LightGBM and CatBoost, will be compared on their ability to predict such accidents. Various techniques for evaluating the accuracy and performance of the model will also be shown. Finally, the presentation will recommend the best model to use with the random forest algorithm for forecasting traffic accidents.

Page 5 (2m 23s)

[Audio] Our research uses random forest algorithms to analyze the correlation between poor weather conditions and road accidents. We will examine which features are most effective in predicting these weather-dependent accidents. We also plan to assess the precision and overall performance of the algorithms used.

Page 6 (2m 45s)

Literature Review.

Page 7 (2m 51s)

[Audio] Over the years, a number of researchers have studied the relationship between weather and traffic accidents and how accidents can be predicted. The latest studies have shown some interesting findings. In 2018, An et al. used weather and traffic data to predict the frequency of accidents on a highway route. In 2021, Ovi et al. developed a system for real-time traffic accident prediction and Hossain et al. implemented a machine learning method to predict traffic accident severity in South Africa. Zhao and Deng employed heterogeneous ensemble learning and millions of traffic accident records from the United States to create an accident duration prediction model. Additionally, La Torre et al. proposed a European Average Prediction Model, which was used to analyze traffic accident-related data.

Page 8 (3m 59s)

[Audio] I will be discussing how Random Forest with CatBoost and LightGBM can be used to accurately predict weather-related traffic accidents. To achieve this, I looked at major past contributions and outcomes related to traffic accident prediction. Examples include a spatio-temporal action graph that models the relation between objects associated with an accident, and a multitask learning approach that improves accident anticipation accuracy. Other work estimated the cost of fatal injuries from traffic accidents in Malaysia using conjoint analysis, and a large-scale dataset of near-miss incidents was introduced together with an adaptive loss function. My presentation aims to explain how these techniques can be used together to help predict weather-related traffic accidents.

Page 9 (4m 47s)

Methodology.

Page 10 (4m 53s)

[Audio] We have employed Random Forest with CatBoost and LightGBM to predict traffic accidents related to weather conditions. Temperature, humidity, pressure and precipitation from the weather event dataset were used to characterize the weather conditions. Our findings show an accuracy of approximately 91% for predicting traffic accidents. Feature importance analysis was conducted to find the weather attributes that contribute most to traffic accidents.

Page 11 (5m 24s)

[Audio] After identifying and cleaning the relevant data points, we preprocessed and transformed the data with a MinMaxScaler and balanced the classes with SMOTE and a Random Under Sampler. We then trained regressor and classifier versions of three supervised learning algorithms, Random Forest, LightGBM and CatBoost, to build a robust and effective predictive model for weather-related traffic accidents, and calculated the relevant metrics.
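
To make the preprocessing step concrete, here is a minimal sketch of min-max scaling and random under-sampling written with the Python standard library only. It is an illustration under stated assumptions, not the code used in the project: the talk names scikit-learn's `MinMaxScaler` and imbalanced-learn's `SMOTE` and `RandomUnderSampler`, whereas the helper names `min_max_scale` and `random_under_sample` and the toy values below are hypothetical. SMOTE over-sampling is left out of the sketch.

```python
import random
from collections import Counter, defaultdict


def min_max_scale(column):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def random_under_sample(rows, labels, seed=0):
    """Randomly drop rows so each class keeps as many rows as the rarest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: one temperature reading and one severity label per record.
temperature = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
severity = [1, 1, 1, 1, 2, 2]

scaled = min_max_scale(temperature)
rows, labels = random_under_sample([[t] for t in scaled], severity)
class_counts = Counter(labels)
```

After this step every feature lies between 0 and 1 and both severity classes have the same number of records, which is the state the models would be trained on in this sketch.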

Page 12 (5m 53s)

Results & discussions.

Page 13 (5m 59s)

[Audio] Our study revealed that the most serious traffic accidents happen on foggy and rainy days, presumably because reduced visibility makes it harder for drivers to detect hazards and heavier congestion raises the likelihood of collisions. Surprisingly, severe weather conditions such as storms, hail and cold temperatures have less influence on the rate of traffic accidents and incidents. Our research team is using Random Forest with CatBoost and LightGBM to anticipate the severity of traffic accidents from traffic data and weather conditions. We are optimistic that our findings can help improve both safety and mobility.

Page 14 (6m 40s)

[Audio] Our research into the correlation between weather conditions and traffic accidents has revealed a clear increase in accident severity in recent years, particularly in 2019. Examining this further, it was found that the monsoon season was the most affected by the weather, with a higher probability of accidents occurring in the 'heavy' weather category. These findings underscore the need to factor in the weather and its severity when predicting traffic accidents. The graph illustrates this correlation..

Page 15 (7m 11s)

[Audio] Weather-related traffic accidents are a major issue in the U.S., with numerous states reporting a considerable number of incidents each year. The accident distribution map shows that accidents are more likely to occur in urban areas and along major highways. According to the data, California, Texas, Florida, and New York reported the highest number of incidents in 2020. For instance, California had more than 2.5 million accident reports, most of which occurred in Los Angeles, San Francisco, and San Diego. Similarly, Texas had more than 2 million incidents, concentrated in Dallas, Houston, and San Antonio. Florida and New York likewise reported over 1.7 and 1.6 million incidents, respectively. With the help of Random Forest with CatBoost and LightGBM, an effective model can be built to predict weather-related traffic accidents accurately.

Page 16 (8m 13s)

[Audio] When it comes to weather-induced traffic accidents, Florida is particularly prone to danger. Data from 2016-2020 shows that the area around latitude 28.00 and longitude -82.33 experiences an especially high number of fatal accidents, with 515 reported in that same timespan. This is likely due to the fact that Florida is a coastal state, and therefore consistently experiences higher than average levels of rainfall. In the winter, this is caused by air masses that generate turbulence and usher in mild, moist air from the Gulf. During the summer, heavy rains are the result of heat-induced thunderstorms..
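
One plausible way to arrive at a per-location count like the one above is to round each accident's coordinates to a grid cell and tally the records per cell. The sketch below assumes rounding to two decimal places. The talk does not say how locations were grouped, and the function name `hotspot_counts` and the coordinates are hypothetical, not taken from the dataset.

```python
from collections import Counter


def hotspot_counts(points, decimals=2):
    """Count records per (latitude, longitude) cell after rounding coordinates."""
    return Counter(
        (round(lat, decimals), round(lon, decimals)) for lat, lon in points
    )


# Hypothetical coordinates for illustration only.
points = [
    (28.001, -82.331),
    (28.004, -82.329),
    (27.950, -82.460),
    (28.002, -82.333),
]

counts = hotspot_counts(points)
# The cell with the most records and how many records fall in it.
top_cell, top_count = counts.most_common(1)[0]
```

A coarser or finer grid can be obtained by changing `decimals`; the right resolution depends on how precisely the source data records location.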

Page 17 (8m 57s)

[Audio] This slide focuses on the key features used to predict the severity of a road accident. Using the Random Forest, CatBoost and LightGBM techniques, we can analyse accident severity and obtain a better prediction. The important variables taken into account are the state, month, year, day of the week, weather severity and type, along with the timezone. This gives us a much clearer view of the risk and severity of an accident.

Page 18 (9m 27s)

REGRESSOR METRICS.

Page 19 (10m 12s)

[Audio] My model has produced strong results in predicting weather-related traffic accidents. Using Random Forest with CatBoost and LightGBM, the metrics table indicates that the model performs very well in multi-class classification. Weighted recall, precision, and accuracy scores are higher than 0.92, as are weighted F0.5, F1, and F2 scores. Furthermore, the confusion matrix shows that most data points were correctly identified by the model, with almost 8% of records misclassified.
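
For reference, the weighted scores mentioned here can be derived from a confusion matrix as sketched below. Each class's precision, recall and F-beta score is weighted by that class's share of the true records, with F-beta defined as (1 + β²)·P·R / (β²·P + R). The function name `weighted_scores` and the 3-class matrix are hypothetical; the numbers are not the project's results.

```python
def weighted_scores(confusion, beta=1.0):
    """Return (accuracy, weighted precision, weighted recall, weighted F-beta).

    confusion[i][j] is the number of records of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n_classes))
    precision_w = recall_w = fbeta_w = 0.0
    for k in range(n_classes):
        tp = confusion[k][k]
        support = sum(confusion[k])  # records whose true class is k
        predicted = sum(confusion[i][k] for i in range(n_classes))
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        denom = beta ** 2 * p + r
        f = (1 + beta ** 2) * p * r / denom if denom else 0.0
        weight = support / total  # class share of the true labels
        precision_w += weight * p
        recall_w += weight * r
        fbeta_w += weight * f
    return correct / total, precision_w, recall_w, fbeta_w


# Hypothetical confusion matrix for three severity classes.
confusion = [
    [50, 2, 0],
    [3, 40, 1],
    [0, 2, 2],
]

# Scores for the three beta values named in the talk: F0.5, F1 and F2.
scores = {beta: weighted_scores(confusion, beta) for beta in (0.5, 1.0, 2.0)}
accuracy, precision, recall, f1 = scores[1.0]
```

With support-based weights, weighted recall always equals accuracy, which is why the two figures move together in a metrics table of this kind.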

Page 20 (10m 51s)

[Audio] Based on the results of the three models, the CatBoost Regressor and LightGBM Regressor achieved better R², precision and recall scores than the SageMaker Autopilot model. Specifically, the CatBoost Regressor achieved the highest R² of 0.92554, and both the CatBoost Regressor and the LightGBM Regressor achieved a precision of 0.999 and a recall of 1.
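
As a reminder of what the R² figure measures, the sketch below computes the coefficient of determination, R² = 1 − SS_res / SS_tot, for a small set of values. The function name `r_squared` and the numbers are hypothetical. The talk does not state how precision and recall were obtained from regressor outputs, so that step is not shown.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical severity values and model predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 3.0, 3.5]

score = r_squared(y_true, y_pred)
perfect = r_squared(y_true, y_true)             # predictions match exactly
baseline = r_squared(y_true, [2.5] * 4)         # always predicting the mean
```

A score of 1 means the predictions match the observed values exactly, while a model that always predicts the mean scores 0, so values near 0.93 indicate that most of the variance in the target is explained.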

Page 21 (11m 32s)

[Audio] This study showed that combining the Random Forest, CatBoost, and LightGBM algorithms yielded better results in predicting weather-related traffic accidents than using any single model. With further refinement, this approach could be applied to real-world scenarios. To improve the model further, a larger dataset incorporating variables such as drivers’ gender, age, and level of experience should be considered to gain a better understanding of accident severity.

Page 22 (12m 4s)

[Audio] This project looks into the effect of weather on car crashes and shows that it is possible to forecast weather-related accidents. The Random Forest Regressor, CatBoost, and LightGBM results suggest that both the CatBoost and LightGBM Regressors are highly accurate in terms of R², precision, and recall, with the LightGBM Regressor being faster to train. The analysis also found that more than 70% of collisions occur during foggy and rainy conditions. Data analysis shows that Weather Type, Weekday, Time Zone, Month, Location Latitude, and Location Longitude are important features in this research. The project offers convincing evidence of the strong relationship between weather and road accidents, and the approach is well suited to large-scale data processing and real-time prediction.

Page 23 (12m 57s)

[Audio] I have developed a model to predict weather-related traffic accidents, combining Random Forest, CatBoost, and LightGBM to improve accuracy. To further enhance the model's capabilities, I suggest using the identified influential features to determine the latitudes and longitudes most likely to experience accidents. This data could be used to target safety interventions in areas vulnerable to inclement weather. Additionally, I encourage joint research efforts between transportation agencies, weather forecasting organizations, and educational institutions to create early warning systems that alert traffic authorities about impending dangers. By working together, we can create comprehensive solutions to weather-driven road accidents and ensure safe roads for all..

Page 24 (13m 46s)

[Audio] I presented a model to predict weather-related traffic accidents using Random Forest with CatBoost and LightGBM techniques. To do this, I used historical accident data for a city and merged it with weather forecast data from different sources. The results from this model were compared with existing models and proved more effective in predicting weather-related traffic accidents, and the model could be used in various cities. I would like to thank my Supervisor Channabasva Chola and Upgrad & Liverpool John Moores University for their guidance and support during the project.