GithubHelp home page GithubHelp logo

odsc_west_2021's Introduction

Machine Learning With Graphs: Going Beyond Tabular Data

Talk presented at the Open Data Science West 2021 conference

Written by Dr. Clair J. Sullivan, Data Science Advocate, Neo4j

Twitter: @CJLovesData1

Last updated: 2021-11-17

Abstract

Machine learning has traditionally relied on creating models around data that can be represented in tabular format such as SQL tables, Pandas dataframes, and the like. Inherent in this data is the assumption that there is no relationship between each entry (row) of the data. In certain cases this is an accurate assumption. However, there are many common use cases for machine learning where this assumption is not entirely accurate. In these cases, by considering the relationships among those individual data points, models can be significantly enhanced and measurable improvements can be made to the appropriate metrics of that model. Such use cases can include common data science and machine learning tasks such as churn prediction and automated recommendation engines.

In this talk we will compare and contrast models created with individual data points to those made entirely with graphs and hybrids of the two. We will explore a variety of techniques that are used for creating graph embeddings, the vectors for representing graphs that are created in a similar fashion to the feature engineering and vector embeddings associated with traditional machine learning. We will focus on the optimization of the graph embeddings and explore some real-world examples of their use individually and in conjunction with the traditional types of machine learning embeddings. Special emphasis will be placed on the benefits of using graph embeddings with significant class imbalance. We will also discuss the use of these embeddings with traditional machine learning packages and workflows, such as through the use of scikit-learn and TensorFlow.

Code and slides

The slides for this talk are available in this repository.

The code used for this talk can be found in this Google Colab Notebook.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.