nikita9604 / analytics-of-covid-19-w.r.t-environmental-conditions


Descriptive and Predictive Analytics of COVID-19 with respect to different Weather parameters

License: GNU General Public License v3.0

Jupyter Notebook 99.16% Python 0.84%
covid-19 weather-parameters descriptive-analytics predictive-analytics machine-learning-algorithms confirmed-cases spread dataset covid19-data correlation

Analytics of COVID-19 w.r.t Environmental Conditions

This project performs descriptive and predictive analytics of COVID-19 with respect to different weather parameters: temperature, humidity, dew point, wind speed, pressure and precipitation intensity.

Datasets Used

  1. Covid-19 dataset
  2. Weather dataset

Steps Involved as per the Data Analytics Life Cycle

  1. Objective
  2. Understanding the Data
  3. Data Cleaning and Data Transformation
  4. Data Enhancement
  5. Data Analytics
  6. Data Visualization

Objective

To analyse the spread of COVID-19 with respect to the environmental conditions of a particular region, and to check whether the weather data for a particular place can be used to predict the total number of Confirmed Cases there on a given day.

Descriptive Analytics

The datasets containing the COVID-19 data and the weather data were first cleaned and imputed, then merged. The merged dataset was used to find the trend of the spread over time. After observing the plot of total confirmed cases per day against days, it was decided to split the dataset into 2 sets - one before March 15 and the other after March 15. March 15 was an elbow point in the plotted graph.
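The clean, impute, merge and split steps described above could be sketched roughly as follows. This is only an illustration: the column names (`date`, `region`, `confirmed`, `temperature`, `humidity`), the toy values, mean imputation, the year 2020 and the choice to put March 15 itself in the "after" set are all assumptions, since the README does not state them.

```python
import pandas as pd

# Hypothetical toy data; the real datasets and their column names are not shown here.
dates = pd.to_datetime(["2020-03-13", "2020-03-14", "2020-03-15", "2020-03-16"])
covid = pd.DataFrame({
    "date": dates,
    "region": ["A"] * 4,
    "confirmed": [10, 12, 20, 45],
})
weather = pd.DataFrame({
    "date": dates,
    "region": ["A"] * 4,
    "temperature": [11.0, None, 13.0, 14.0],
    "humidity": [0.60, 0.62, None, 0.70],
})


def clean_and_merge(covid_df, weather_df):
    """Fill missing numeric weather values with the column mean, then merge."""
    weather_df = weather_df.copy()
    numeric_cols = weather_df.select_dtypes("number").columns
    # Mean imputation is an assumption; the README only says the data were "imputed".
    weather_df[numeric_cols] = weather_df[numeric_cols].fillna(
        weather_df[numeric_cols].mean()
    )
    return covid_df.merge(weather_df, on=["date", "region"], how="inner")


def split_at(merged_df, cutoff):
    """Split rows into those strictly before the cutoff date and the rest."""
    cutoff = pd.Timestamp(cutoff)
    before = merged_df[merged_df["date"] < cutoff]
    after = merged_df[merged_df["date"] >= cutoff]
    return before, after


merged = clean_and_merge(covid, weather)
before, after = split_at(merged, "2020-03-15")
```

Each half can then be analysed and modelled separately, which is how the split is used in the rest of this README.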

After splitting at this point the results improved, as the split reduced many hidden factors that might have skewed the model. Plots of Confirmed Cases grouped by pressure alone or by precipitation intensity alone were not fruitful, as the graphs showed no trend. Using the correlation values obtained, pairs of variables that correlated well with each other and with the Confirmed Cases were selected and plotted with plotly.express. Each graph had one weather parameter on the x-axis and another on the y-axis, and both the colour intensity and the size of each data circle corresponded to the Confirmed Cases count. None of these plots followed a specific trend. Only very slight trends were observed when the graphs were restricted to a single month, and only for months up to March, since until then the virus was concentrated in China alone.
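One way to pick such variable pairs from a correlation matrix and plot them is sketched below. The scoring rule (sum of the three absolute correlations), the column names and the synthetic data are assumptions made for illustration, not the notebook's actual selection logic. The plotting helper imports plotly lazily so the selection step runs even where plotly is not installed.

```python
import numpy as np
import pandas as pd


def top_correlated_pairs(df, target, features, k=3):
    """Rank feature pairs by their correlation with each other and with the target."""
    corr = df[features + [target]].corr()
    scored = []
    for i, a in enumerate(features):
        for b in features[i + 1:]:
            # Assumed scoring rule: add up the three absolute correlations.
            score = (
                abs(corr.loc[a, b])
                + abs(corr.loc[a, target])
                + abs(corr.loc[b, target])
            )
            scored.append((a, b, score))
    scored.sort(key=lambda item: item[2], reverse=True)
    return [(a, b) for a, b, _ in scored[:k]]


def plot_pair(df, x, y, target):
    """Bubble plot: two weather parameters on the axes, target as colour and size."""
    import plotly.express as px  # lazy import, only needed for plotting

    return px.scatter(df, x=x, y=y, color=target, size=target)


# Synthetic demo data (hypothetical): dew point tracks temperature, pressure is noise.
rng = np.random.default_rng(0)
temperature = rng.uniform(0, 30, size=200)
demo = pd.DataFrame({
    "temperature": temperature,
    "dew_point": temperature - 2 + rng.normal(0, 0.5, size=200),
    "pressure": rng.normal(1013, 5, size=200),
    "confirmed": 50 + 5 * temperature + rng.normal(0, 5, size=200),
})
features = ["temperature", "dew_point", "pressure"]
best_pairs = top_correlated_pairs(demo, "confirmed", features, k=1)
```

Calling `plot_pair(demo, *best_pairs[0], "confirmed").show()` would then render the kind of bubble plot described above.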

Predictive Analytics

The correlation matrix was computed and examined for each of the 2 split datasets and for the whole dataset. Different Machine Learning models were then fitted to the data. The following models were tried:

  1. Linear regression:
    • With combination of max correlated features.
    • Simple Linear regression with a weather parameter.
    • With all the weather parameters.
  2. XGBoost:
    • Simple and Multiple
  3. SVM
    • Radial basis function and Linear Kernel
  4. Decision Tree based
  5. Considering only China (higher number of cases):
    • Linear regression
    • XGBoost
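A comparison of several of the model families listed above might look like the following minimal sketch, which uses scikit-learn on synthetic data. The regression variants (`SVR`, `DecisionTreeRegressor`), the R² metric, the train/test split and the data are assumptions; the README does not say how the models were configured or evaluated. XGBoost is left out to keep dependencies small, but `xgboost.XGBRegressor` could be added to the dictionary in the same way.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor


def compare_models(X, y, seed=0):
    """Fit each model on a training split and return its R^2 on the held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    models = {
        "linear": LinearRegression(),
        "svr_rbf": SVR(kernel="rbf"),
        "svr_linear": SVR(kernel="linear"),
        "tree": DecisionTreeRegressor(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = r2_score(y_test, model.predict(X_test))
    return scores


# Synthetic stand-in for three weather features and a case count (hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)
scores = compare_models(X, y)
```

Passing a single column as `X` corresponds to the "simple" variants in the list, and passing all weather columns corresponds to the "multiple" ones.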

Conclusion

In this work, we set out to study and analyze the impact of different weather parameters on the number of COVID-19 infections. We have presented descriptive and predictive analytics for the spread of COVID-19 using features taken from climatic conditions: temperature, humidity, dew point, wind speed, pressure and precipitation intensity. To validate the results, we used publicly available datasets and trained the models on the specified climatic conditions.

In our fight against coronavirus, this project can serve as an input for creating general awareness and busting the myth, which emerged during the past couple of months, that weather stimulates the spread of coronavirus. Moreover, as we are limping through this period, it is advisable to continue with the lockdown and ensure social distancing until a vaccine is created, irrespective of changes in climate.

Reference

The datasets are taken from the following links:
