ereynrs / sciarticlesrecommender

README

Overview

The data is processed, transformed, and loaded into a Neo4j graph database. Using the cleaned and modelled data, authors are disambiguated, candidate reviewers are recommended for the incoming publications, and the most influential authors are identified.

The sample data is provided as CSV files:

  • publications.csv,
  • authors.csv,
  • topics.csv,
  • publications_incoming.csv.
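
The reviewer recommendation mentioned in the overview could be approached in many ways; the notebooks may well use a graph algorithm inside Neo4j instead. As a minimal sketch, assuming each author can be associated with a set of topics, candidates can be ranked by the Jaccard overlap between their topics and those of an incoming publication. The function names, the `author_topics` mapping, and the sample values are hypothetical and are not taken from the CSV files.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_reviewers(pub_topics, pub_authors, author_topics, k=3):
    """Rank authors by topic overlap with a publication.

    Authors of the publication itself are excluded, and authors with no
    topic in common are dropped. All inputs are hypothetical structures.
    """
    scores = []
    for author, topics in author_topics.items():
        if author in pub_authors:
            continue  # an author should not review their own publication
        score = jaccard(set(pub_topics), set(topics))
        if score > 0:
            scores.append((author, score))
    # Highest score first; ties are broken alphabetically for determinism.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return [author for author, _ in scores[:k]]


# Hypothetical sample data, for illustration only.
author_topics = {
    "author_a": {"graphs", "databases"},
    "author_b": {"graphs", "machine learning"},
    "author_c": {"biology"},
    "author_d": {"graphs", "databases", "etl"},
}
reviewers = recommend_reviewers(
    {"graphs", "databases"}, {"author_d"}, author_topics, k=2
)
```

Here `author_d` is skipped as an author of the publication and `author_c` is dropped for having no shared topic.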

Objectives

  1. Draft the initial data model (nodes, relationships, and labels) and an ETL strategy to load the relevant data from the publications, authors, and topics CSV files into the Neo4j graph.
  2. Clean up the datasets, considering, for example, possible duplicate authors.
  3. Recommend a group of people to review the incoming publications.
  4. Identify the most influential authors.
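
For the duplicate-author cleanup in objective 2, one simple starting point is to group author records whose names coincide after normalization. The sketch below is an assumption about how this might be done, not the method used in the notebooks; the `(author_id, name)` record shape and the sample names are hypothetical. Name-only matching is crude, so a real disambiguation step would likely also consider signals such as shared co-authors or topics.

```python
import unicodedata
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in without_accents
    )
    return " ".join(cleaned.lower().split())


def group_duplicate_authors(authors):
    """Group (author_id, name) pairs whose normalized names are identical.

    Returns only the groups that contain more than one author id.
    """
    groups = defaultdict(list)
    for author_id, name in authors:
        groups[normalize_name(name)].append(author_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical records, for illustration only.
sample = [
    (1, "José García"),
    (2, "Jose  Garcia"),
    (3, "J. Garcia"),
    (4, "Ana Lopez"),
]
duplicates = group_duplicate_authors(sample)
```

Records 1 and 2 end up in the same group, while the abbreviated `J. Garcia` does not, which shows where exact matching on normalized names stops being sufficient.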

Folders and file structure

  • Assignment for Knowledge Graph Engineer.pdf file. Describes the assessment.
  • assignment_slideck.pdf file. The slide deck describing the solution process.
  • graphDB_model.svg depicts the graph data model.
  • data/ folder. Contains the data CSV files.
  • notebooks/ folder. Contains the Jupyter notebooks:
    • to perform initial data exploration (exploration.ipynb),
    • to run the graph data science algorithms (analysis.ipynb).
  • src/ folder. Contains the Python script to load the data into the graph database (etl_pandas.py).
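
The contents of etl_pandas.py are not shown here, so the following is only a rough sketch of what a CSV-to-Neo4j load step can look like. It uses the standard-library `csv` module rather than pandas to stay self-contained, and it only builds the Cypher statement and its row parameters without connecting to a database. The `Author` label, the `author_id` and `name` columns, and both helper functions are hypothetical.

```python
import csv
import io


def read_rows(csv_text: str) -> list:
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def merge_nodes_query(label: str, key: str, properties: list) -> str:
    """Build a batched Cypher MERGE statement for one node label.

    Labels and property names cannot be passed as Cypher parameters, so
    they are interpolated here and must come from trusted code, never
    from user input. The row values travel separately as $rows.
    """
    query = f"UNWIND $rows AS row MERGE (n:{label} {{{key}: row.{key}}})"
    set_clause = ", ".join(f"n.{prop} = row.{prop}" for prop in properties)
    if set_clause:
        query += f" SET {set_clause}"
    return query


# Hypothetical CSV content; the real column names may differ.
authors_csv = "author_id,name\n1,Ana Lopez\n2,Jose Garcia\n"
rows = read_rows(authors_csv)
query = merge_nodes_query("Author", "author_id", ["name"])
```

With the official Neo4j Python driver, a statement like this could then be executed inside a session as `session.run(query, rows=rows)`. Using `MERGE` rather than `CREATE` keeps the load idempotent, so re-running it does not create duplicate nodes.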
