Arun Ramachandran's Projects
Extracting essential data from a dataset and displaying it is a necessary part of data science, because it lets individuals make correct decisions based on the data. In this assignment, you will extract some essential economic indicators from some data and then display them in a dashboard. Gross domestic product (GDP) is a measure of the market value of all the final goods and services produced in a period. GDP is an indicator of how well the economy is doing. A drop in GDP indicates the economy is producing less; similarly, an increase in GDP suggests the economy is performing better. In this assignment, you will examine how changes in GDP impact the unemployment rate.
In this lab, you will learn in detail how to make calls to the Foursquare API for different purposes. You will learn how to construct a URL to send a request to the API to search for a specific type of venue, to explore a particular venue, to explore a Foursquare user, to explore a geographical location, and to get trending venues around a location. You will also learn how to use the visualization library Folium to visualize the results.
There are many models for clustering out there. In this lab, we will be presenting the model that is considered one of the simplest among them. Despite its simplicity, *k*-means is widely used for clustering in many data science applications, and is especially useful if you need to quickly discover insights from unlabelled data.
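As a rough illustration of the idea (not the lab's own code), k-means can be sketched in plain Python. The function name `kmeans`, its parameters, and the sample points are assumptions made up for this example; a real project would more likely rely on a library implementation.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Hypothetical minimal k-means: returns (centroids, clusters)."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters

# Made-up sample data: two well-separated groups of unlabelled points.
sample_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(sample_points, k=2)
```

The two loops inside each iteration are the usual assignment and update steps; the result can depend on the random starting centroids, which is why a seed is exposed here.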
In this lab, you will learn how to convert addresses into their equivalent latitude and longitude values. Also, you will use the Foursquare API to explore neighbourhoods in New York City. You will use the explore function to get the most common venue categories in each neighbourhood, and then use this feature to group the neighbourhoods into clusters. You will use the k-means clustering algorithm to complete this task. Finally, you will use the Folium library to visualise the neighbourhoods in New York City and their emerging clusters.
My Personal Repository
👩‍💻👨‍💻 Awesome cheatsheets for popular programming languages, frameworks and development tools. They include everything you should know in one single file.
A curated list of tech internship resources.
The code sample is from the Boston Housing Data Analysis, which was performed using Python. The code produces various data visualizations of the columns, such as scatter plots and boxplots, and extracts meaningful information from them. Those graphs are then backed by hypothesis tests, including the t-test, ANOVA, and correlation, to extract information that supports the visualizations. It also contains a linear regression model to support the analysis. The code sample includes a Pearson test for continuous variables and a Chi-Square test for categorical variables.
HTTP checks & tests (private & public) monitoring - check the status of your URL
Peer-graded Assignment - Capstone Project Notebook. This capstone project course will give you a taste of what data scientists go through in real life when working with data.
In this assignment, you are a Data Analyst working at a Real Estate Investment Trust. The Trust would like to start investing in residential real estate. You are tasked with determining the market price of a house given a set of features. You will analyze and predict housing prices using attributes or features such as square footage, number of bedrooms, number of floors, and so on. This dataset contains house sale prices for King County, which includes Seattle. It includes homes sold between May 2014 and May 2015.
A helpful 4-page data science cheatsheet to assist with exam reviews, interview prep, and anything in-between.
In this assignment, you will demonstrate the data visualization skills you learned by completing this course. You will be required to generate two visualization plots. The first one will be a plot to summarize the results of a survey that was conducted to gauge an audience's interest in different data science topics. The second plot is a choropleth map of the crime rate in San Francisco.
A Generalized Metadata Search & Discovery Tool
Fork of GitHub Desktop to support various Linux distributions
The open-source repo for docs.github.com
In this lab, you will use a Python library to obtain financial data. You will extract historical stock data using yfinance.
Feature Extraction And Statistics for Time Series
:books: Freely available programming books
One second to read GitHub code with VS Code.
In this lab, we will be looking at agglomerative clustering, which is more popular than divisive clustering. We will also be using complete linkage as the linkage criterion.
Recommendation systems are a collection of algorithms used to recommend items to users based on information taken from the user. These systems have become ubiquitous and can be commonly seen in online stores, movie databases and job finders. In this notebook, we will explore recommendation systems based on Collaborative Filtering and implement a simple version of one using Python and the Pandas library.
Lab: Content-based Recommendation Systems - Recommendation systems are a collection of algorithms used to recommend items to users based on information taken from the user. These systems have become ubiquitous and can be commonly seen in online stores, movie databases and job finders. In this notebook, we will explore Content-based recommendation systems and implement a simple version of one using Python and the Pandas library.
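To make the idea concrete, here is a small plain-Python sketch of agglomerative clustering with complete linkage. It is an illustration only, not the lab's code; the names `complete_linkage` and `agglomerative` and the sample points are hypothetical.

```python
import math

def complete_linkage(cluster_a, cluster_b):
    """Complete linkage: distance between the two farthest members."""
    return max(math.dist(p, q) for p in cluster_a for q in cluster_b)

def agglomerative(points, n_clusters):
    """Hypothetical bottom-up clustering: merge until n_clusters remain."""
    clusters = [[p] for p in points]  # every point starts as its own cluster
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest complete-linkage distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = complete_linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters

# Made-up sample data.
sample_points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
merged = agglomerative(sample_points, n_clusters=3)
```

This brute-force version recomputes every pairwise distance on each merge, which is fine for a handful of points but far slower than a library implementation on real data.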
Density-based Clustering locates regions of high density that are separated from one another by regions of low density. Density, in this context, is defined as the number of points within a specified radius. In this section, the main focus will be manipulating the data and properties of DBSCAN and observing the resulting clustering.
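The definition of density above (the number of points within a specified radius) can be sketched as a compact DBSCAN in plain Python. This is a minimal illustration under assumed names (`region_query`, `dbscan`, `eps`, `min_samples`) and made-up data, not the notebook's implementation.

```python
import math

NOISE = -1  # label used for points that belong to no dense region

def region_query(points, i, eps):
    """Indices of all points within radius eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_samples):
    """Hypothetical minimal DBSCAN: returns one cluster label per point."""
    labels = [None] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_samples:
            labels[i] = NOISE  # not dense enough; may become a border point later
            continue
        labels[i] = cluster_id
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_samples:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
        cluster_id += 1
    return labels

# Made-up sample data: two dense squares and one isolated point.
sample_points = [(0, 0), (0, 1), (1, 0), (1, 1),
                 (10, 10), (10, 11), (11, 10), (11, 11), (50, 50)]
labels = dbscan(sample_points, eps=1.5, min_samples=3)
```

Changing `eps` or `min_samples` changes which points count as dense, which is the kind of experiment the section describes.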
In this lab exercise, you will learn a popular machine learning algorithm, Decision Tree. You will use this classification algorithm to build a model from historical data of patients and their response to different medications. Then you will use the trained decision tree to predict the class of an unknown patient, or to find a proper drug for a new patient.
Despite its simplicity, k-means is widely used for clustering in many data science applications, and is especially useful if you need to quickly discover insights from unlabelled data. In this notebook, you will learn how to use k-means for customer segmentation.
In this lab, you will load a customer dataset related to a telecommunication company, clean it, use KNN (K-Nearest Neighbours) to predict the category of customers, and evaluate the accuracy of your model. Let's learn about KNN and see how we can apply it to real-world problems.
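As a hedged sketch of the technique (not the lab's code or data), a K-Nearest Neighbours classifier and an accuracy check can be written in a few lines of plain Python. The function names, the feature tuples, and the category labels "basic" and "plus" are invented for illustration.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(knn_predict(train_X, train_y, x, k) == y
                  for x, y in zip(test_X, test_y))
    return correct / len(test_X)

# Made-up training data: two numeric features per customer.
train_X = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
train_y = ["basic", "basic", "basic", "plus", "plus", "plus"]

prediction = knn_predict(train_X, train_y, (2, 2), k=3)
score = accuracy(train_X, train_y, [(0, 0), (9, 9)], ["basic", "plus"], k=3)
```

In practice the features would usually be scaled first, since KNN relies on raw distances between feature values.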
In this notebook, you will learn Logistic Regression, and then, you'll create a model with telecommunications data to predict when its customers will leave for a competitor, so that you can take some action to retain the customer.
In this lab, we learn how to use the scikit-learn library to implement multiple linear regression. We again use the carbon dioxide emission dataset to build a model, evaluate the model, and finally use the model to predict an unknown value.