
jennyjohnson78 / cryptocurrencies

Analysis using PCA (Principal Component Analysis) algorithm, K-means testing, and data visualizations using Python.

Jupyter Notebook 100.00%
pca k-means-clustering data-visualization

Cryptocurrencies

Overview

Accountability Accounting, a prominent investment bank, is interested in offering a new cryptocurrency investment portfolio for its customers. In this analysis, a report will be created that includes what cryptocurrencies are on the trading market and how they could be grouped to create a classification system for this new investment. The data will need to be processed to fit the machine learning models. Since there is no known output for what to look for, unsupervised learning will be used. A clustering algorithm will be used to group the cryptocurrencies, and data visualizations will be used to share the findings.

Results

Steps for analysis:

  • Preprocessing the Data for PCA
  • Reducing Data Dimensions Using PCA
  • Clustering Cryptocurrencies Using K-means
  • Visualizing Cryptocurrencies Results

Preprocessing the data

  • Read data into a DataFrame
  • Drop the "IsTrading" column
  • Remove the rows that have at least one null value
  • Create a new DataFrame that only holds the names of the cryptocurrencies
  • Use the get_dummies() method to create variables for the two text features, "Algorithm" and "ProofType" and store the results in a new DataFrame
# Use get_dummies() to create variables for text features.
X = pd.get_dummies(crypto_df, columns=['Algorithm', 'ProofType'])
X.head()
  • Then, use the StandardScaler fit_transform() function to standardize the features from the new DataFrame
# Standardize the data with StandardScaler().
from sklearn.preprocessing import StandardScaler
crypto_scaled = StandardScaler().fit_transform(X)
print(crypto_scaled[0:5])
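The preprocessing steps above can be sketched end to end on a tiny table. This is a minimal illustration, not the notebook's actual code: the `raw` DataFrame and its values are made up, `names_df` is a hypothetical name, removing "CoinName" from the feature table before encoding is an assumption (it keeps every remaining column numeric for the scaler), and `dtype=float` is passed to `get_dummies()` only to make the numeric output explicit.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the real dataset; all values are made up.
raw = pd.DataFrame(
    {
        "CoinName": ["CoinA", "CoinB", "CoinC", "CoinD"],
        "Algorithm": ["Scrypt", "SHA-256", "Scrypt", "X11"],
        "IsTrading": [True, True, True, True],
        "ProofType": ["PoW", "PoW", "PoS", None],
        "TotalCoinsMined": [100.0, 250.0, 50.0, 75.0],
        "TotalCoinSupply": [1000.0, 2100.0, 500.0, 900.0],
    },
    index=["A", "B", "C", "D"],
)

# Drop the "IsTrading" column, then every row with at least one null value.
crypto_df = raw.drop(columns=["IsTrading"]).dropna()

# Keep the coin names in their own DataFrame (assumption: they are also
# removed from the feature table so that only numeric and encodable
# columns remain).
names_df = crypto_df[["CoinName"]]
crypto_df = crypto_df.drop(columns=["CoinName"])

# One-hot encode the two text features, then standardize every column.
X = pd.get_dummies(crypto_df, columns=["Algorithm", "ProofType"], dtype=float)
crypto_scaled = StandardScaler().fit_transform(X)

print(X.columns.tolist())
print(crypto_scaled.shape)
```

Row "D" is dropped because its "ProofType" is null, so the "X11" algorithm never reaches the encoded columns.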

Reducing Data Dimensions Using PCA (Principal Component Analysis)

  • Apply PCA to reduce the dimensions to three principal components
# Use PCA to reduce the data to three principal components.
from sklearn.decomposition import PCA
pca = PCA(n_components=3)
crypto_pca = pca.fit_transform(crypto_scaled)
crypto_pca
  • Create a new DataFrame that uses the same index as the previous DataFrame and has columns named "pc1", "pc2", and "pc3"
# Create a DataFrame with the three principal components.
pcs_df = pd.DataFrame(data=crypto_pca, columns=['pc1', 'pc2', 'pc3'], index=crypto_df.index)
pcs_df.head(10)

image
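One way to check how much information three components keep is to inspect the fitted model's `explained_variance_ratio_` attribute. The sketch below is not from the notebook: `demo_scaled` is a hypothetical random array standing in for `crypto_scaled`, so the printed ratios say nothing about the real data.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in for crypto_scaled: 50 rows, 8 standardized features.
rng = np.random.default_rng(0)
demo_scaled = rng.normal(size=(50, 8))

# Reduce to three principal components, as in the analysis.
pca = PCA(n_components=3)
demo_pca = pca.fit_transform(demo_scaled)

# Fraction of the total variance captured by each component, and their sum.
ratios = pca.explained_variance_ratio_
print(ratios)
print(ratios.sum())
```

A low total would suggest that three components discard much of the structure in the features, which is worth knowing before clustering on them.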

Clustering Cryptocurrencies Using K-means

  • Using the previous DataFrame, create an elbow curve using hvPlot and a for loop to find the best value for K

image
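The elbow-curve step can be sketched as a loop that fits K-means for several values of K and records each model's inertia. This is an assumed reconstruction, not the notebook's code: `demo_pcs` is random data standing in for `pcs_df`, and the range 1 to 10 and `n_init=10` are illustrative choices.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical stand-in for pcs_df: 60 rows of three principal components.
rng = np.random.default_rng(0)
demo_pcs = pd.DataFrame(rng.normal(size=(60, 3)), columns=["pc1", "pc2", "pc3"])

# Fit K-means for each candidate K and record the inertia
# (sum of squared distances from each point to its cluster center).
k_values = list(range(1, 11))
inertia = []
for k in k_values:
    km = KMeans(n_clusters=k, random_state=0, n_init=10)
    km.fit(demo_pcs)
    inertia.append(km.inertia_)

elbow_df = pd.DataFrame({"k": k_values, "inertia": inertia})
print(elbow_df)

# With hvPlot installed, the curve could be drawn in a notebook with:
#   import hvplot.pandas
#   elbow_df.hvplot.line(x="k", y="inertia", xticks=k_values, title="Elbow Curve")
```

The "elbow" is the value of K after which inertia stops dropping sharply; the analysis above settles on four clusters.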

  • Run the K-means algorithm to make predictions of the K clusters for the cryptocurrencies’ data
# Initialize the K-Means model with four clusters.
from sklearn.cluster import KMeans
model = KMeans(n_clusters=4, random_state=0)
# Fit the model
model.fit(pcs_df)
# Predict clusters
predictions = model.predict(pcs_df)
predictions
  • Create a new DataFrame by concatenating the crypto_df and pcs_df DataFrames side by side, aligned on their shared index
# Create a new DataFrame including predicted clusters and cryptocurrencies features.
# Concatenate the crypto_df and pcs_df DataFrames column-wise on their shared index.
clustered_df = pd.concat([crypto_df, pcs_df], axis=1)
  • Add another column named "Class" that will hold the predictions
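The last two steps, the concatenation and the "Class" column, might look like the sketch below. Every `*_demo` name and value is hypothetical; in the notebook the labels would come from the fitted model's predictions rather than a hand-written list.

```python
import pandas as pd

# Hypothetical stand-ins for crypto_df, pcs_df and the K-means predictions.
crypto_demo = pd.DataFrame(
    {"Algorithm": ["Scrypt", "SHA-256", "X11"], "TotalCoinsMined": [100.0, 250.0, 50.0]},
    index=["A", "B", "C"],
)
pcs_demo = pd.DataFrame(
    {"pc1": [0.1, -0.4, 0.3], "pc2": [1.2, 0.0, -1.1], "pc3": [-0.5, 0.7, 0.2]},
    index=crypto_demo.index,
)
predictions_demo = [0, 1, 0]  # made-up cluster labels

# axis=1 places the two DataFrames side by side, matching rows by index.
clustered_demo = pd.concat([crypto_demo, pcs_demo], axis=1)

# Store the predicted cluster for each row in a new "Class" column.
clustered_demo["Class"] = predictions_demo

print(clustered_demo)
```

Because both inputs share the same index, the concatenation introduces no missing values.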

image

Visualizing Cryptocurrencies Results

  • Create a 3D scatter plot using the Plotly Express scatter_3d() function to plot the clusters from the clustered_df DataFrame along the three principal components. Add the CoinName and Algorithm columns to the hover_name and hover_data parameters, respectively, so each data point shows the CoinName and Algorithm on hover.

image

  • Create an hvPlot scatter plot with x="TotalCoinsMined", y="TotalCoinSupply", and by="Class", and have it show the CoinName when you hover over a data point.

image

Summary

Cryptocurrencies are increasing in popularity and complexity, and the ability to understand them and market them to clients will be key to any financial institution's growth. As more people look to invest in crypto, knowing which currencies are on the market and which ones would benefit a specific client will put an institution in a strong position to become an industry leader.
