Django API for researchhub.com

Home Page: https://researchhub.com

Python 96.01% Shell 0.07% CSS 1.51% HTML 2.35% Dockerfile 0.06%
researchhub science science-research open-source

researchhub-backend's Introduction

The ResearchHub Django API

Our Mission

Our mission is to accelerate the pace of scientific research 🚀

We believe that by empowering scientists to independently fund, create, and publish academic content we can revolutionize the speed at which new knowledge is created and transformed into life-changing products.

Important Links 👀

💡 Got an idea or request? Open issue on Github.
🐛 Found a bug? Report it here.
➕ Want to contribute to this project? Introduce yourself in our Discord community
🔨 See what we are working on
📰 Read the ResearchCoin White Paper

Installation

There are three different methods for running this project: Dev Containers with VSCode, Docker Compose and a native installation.

Dev Containers and VSCode

Prerequisites

Install Docker, Visual Studio Code and the Dev Containers extension. Please review the Installation section in the Visual Studio Code Dev Container documentation.

On macOS with Homebrew, the prerequisites can be installed with the following commands:

brew install docker
brew install visual-studio-code
code --install-extension ms-vscode-remote.vscode-remote-extensionpack

Configuration

Clone the repository and create an initial configuration by copying the sample configuration files to config_local:

cp db_config.sample.py src/config_local/db.py
cp keys.sample.py src/config_local/keys.py

Make adjustments to the new configuration files as needed.

Start Developing

When opening the code in VSCode, it will recognize the Dev Containers configuration and will prompt to Rebuild and Reopen in Container. Alternatively, select Rebuild and Reopen in Container manually from the command palette. This will pull and run all necessary auxiliary services including ElasticSearch, PostgreSQL, and Redis.

During the creation of the dev container, all Python dependencies are downloaded and installed and an initial database migration is also performed. After dev container creation, proceed with seeding the database as needed.

Running and Debugging

Run the application by typing the following into the integrated terminal:

cd src
python manage.py runserver

Alternatively, debugging of the application is possible with the following launch configuration (in .vscode/launch.json):

{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Python: Django",
      "type": "debugpy",
      "request": "launch",
      "program": "${workspaceFolder}/src/manage.py",
      "args": ["runserver", "[::]:8000"],
      "django": true,
      "autoStartBrowser": false
    }
  ]
}

Quick install using Docker (Not recommended for development)

  1. Download or clone this repository.
  2. Copy local config files. From the repository root, run
cp db_config.sample.py src/config_local/db.py
cp keys.sample.py src/config_local/keys.py
  3. Run:
docker build --tag researchhub-backend .
docker-compose up

The backend will now run at localhost:8000
  4. Set up and run the web app at localhost:3000

Native install (Slower, recommended for development)

Prerequisites

  1. Docker
  2. pyenv
  3. redis
  4. The flake8 linter, installed in your IDE

General setup

  • Create a fork of the repository in your GitHub account, and clone it.

  • Prepare the database:

    Create a db file in config

    touch src/config/db.py

    Add the following:

    NAME = 'researchhub'
    HOST = 'localhost'
    PORT = 5432
    USER = 'rh_developer'  # replace as needed
    PASS = 'not_secure'  # replace as needed
  • Use Postgres.app to install PostgreSQL. The latest available version should be fine.

A good UI tool for interacting with PostgreSQL: Postico

  • The project virtual environment is managed using Poetry.

    pip3 install poetry
  • Go to the src directory and run the following commands in order to activate the virtual environment:

    cd src
    
    # activates a Python virtual environment and enters shell
    poetry shell
    
    # installs the project virtual environment and packages
    poetry install
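
As a rough illustration of how the values placed in db.py earlier in this section might be consumed, the sketch below maps them onto Django's DATABASES setting. The build_databases helper is hypothetical, and how the project's settings module actually reads config/db.py is an assumption; only the ENGINE string and the dictionary keys are standard Django.

```python
def build_databases(name, host, port, user, password):
    """Return a Django-style DATABASES dict for one PostgreSQL database.

    Hypothetical helper: the real settings module may wire this differently.
    """
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": name,
            "HOST": host,
            "PORT": port,
            "USER": user,
            "PASSWORD": password,
        }
    }


# Same sample values as the db.py shown above; replace as needed.
DATABASES = build_databases(
    "researchhub", "localhost", 5432, "rh_developer", "not_secure"
)
```

If the connection fails at migrate time, comparing this dictionary against the role and database created in Postgres.app is usually a quick first check.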

The following commands should all be run in the virtual environment (poetry shell), in the src folder:

  • Install python dependencies stored in requirements.txt:

    pip3 install -r requirements.txt --no-deps
  • Create the database schema:

    python manage.py makemigrations
    python manage.py migrate
  • The backend worker queue is managed using redis. Before you start the backend, in a separate terminal, run redis-server:

    brew install redis
    redis-server
  • Start celery, the tool that runs the worker via redis. In a separate terminal:

    # celery: in poetry shell, run:
    cd src
    ./start-celery.sh

Seed the database

  • In order for the UI to work properly, some data needs to be seeded into the database. Seed category data:

    python manage.py create-categories
  • Seed hub data. There's a CSV file in /misc/hub_hub.csv with hub data that you can use to seed hubs data. This can be done in two ways:

    • in Postico: right-click on the hub_hub table, and select Import CSV.... You will encounter problems importing the CSV due to the tool thinking that empty fields are nulls for acronym and description columns. Temporarily update hub_hub table to allow null values for those columns:
    ALTER TABLE hub_hub ALTER COLUMN description DROP NOT NULL;
    ALTER TABLE hub_hub ALTER COLUMN acronym DROP NOT NULL;
    

    Import the CSV, then change all nulls to empty strings in the two columns, and revert the columns to NOT NULL:

    UPDATE hub_hub SET acronym = '' WHERE acronym IS NULL;
    UPDATE hub_hub SET description = '' WHERE description IS NULL;
    ALTER TABLE hub_hub ALTER COLUMN description SET NOT NULL;
    ALTER TABLE hub_hub ALTER COLUMN acronym SET NOT NULL;
    

    OR

    • in Python: run python manage.py shell_plus to open a Python terminal in the virtual environment. Then, paste the following code:
    import pandas as pd
    from hub.models import Hub
    
    hub_df = pd.read_csv("../misc/hub_hub.csv")
    hub_df = hub_df.drop("slug_index", axis=1)
    hub_df = hub_df.drop("acronym", axis=1)
    hub_df = hub_df.drop("hub_image", axis=1)
    hubs = [Hub(**row.to_dict()) for _, row in hub_df.iterrows()]
    Hub.objects.bulk_create(hubs)
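
For a dependency-free look at the same cleanup step, here is a minimal sketch using only the standard library. It reads CSV rows and skips the three columns dropped in the snippet above. The load_hub_rows helper and the name and description columns in the sample are assumptions for illustration; the real file is misc/hub_hub.csv. Note that the csv module yields empty strings for empty fields rather than nulls.

```python
import csv
import io

# Columns dropped in the shell_plus snippet above.
DROPPED_COLUMNS = ("slug_index", "acronym", "hub_image")


def load_hub_rows(csv_file, dropped=DROPPED_COLUMNS):
    """Read hub rows from a CSV file object, skipping the dropped columns."""
    reader = csv.DictReader(csv_file)
    return [
        {key: value for key, value in row.items() if key not in dropped}
        for row in reader
    ]


# Hypothetical two-row sample standing in for misc/hub_hub.csv.
sample = io.StringIO(
    "name,description,acronym,slug_index,hub_image\n"
    "biology,,BIO,1,\n"
    "physics,Study of matter,,2,\n"
)
rows = load_hub_rows(sample)
```

Each resulting dict could then be passed to Hub(**row) inside shell_plus, as in the pandas version.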

Run the development server:

python manage.py runserver

Ensure pre-commit hooks are set up

pre-commit install

Useful stuff

Create a superuser in order to get data from the API

# create a superuser and retrieve an authentication token
python manage.py createsuperuser --username=florin [email protected]
# p: not_secure
python manage.py drf_create_token [email protected]

Query the API using the Auth token

Note that for paths under /api, e.g. /api/hub/, you don't need a token.

curl --silent \
--header 'Authorization: Token <token>' \
http://localhost:8000/api/
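
The same request can be sketched in Python with the standard library. The build_api_request helper below is hypothetical; it only constructs the request, so it can be inspected without a running server. The Token prefix matches the curl header above.

```python
import urllib.request


def build_api_request(token, base_url="http://localhost:8000", path="/api/"):
    """Build (but do not send) a GET request carrying the auth token header."""
    return urllib.request.Request(
        base_url + path,
        headers={"Authorization": f"Token {token}"},
        method="GET",
    )


request = build_api_request("<token>")

# To send it against a running development server:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```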

Sending API requests via vscode

  • Install the REST Client extension.

  • Create a file called api.rest with the following contents (insert token):

    GET http://localhost:8000/api/ HTTP/1.1
    content-type: application/json
    Authorization: Token <token>
    

    Then press Send Request in vscode, above the text.

Seed paper data.

For this to work, the celery worker needs to be running (see above). Seeding calls two functions in src/paper/tasks.py that are temporarily disabled: pull_crossref_papers() and pull_papers(). First, comment out the early return at the top of each function, which is what disables it. Then, change the while loops to stop after pulling a small number of papers (enough to populate a local environment):

def pull_papers(start=0, force=False):
    # Temporarily disabling autopull
    return  # <-- this line needs to be commented out
    ...
    while True:  # <-- change this to while i < 100:

...

def pull_crossref_papers(start=0, force=False):
    # Temporarily disabling autopull
    return  # <-- this line needs to be commented out
    ...
    while True:  # <-- change this to while offset < 100:

Then, run:

python manage.py shell_plus # enters Python shell within poetry shell
from paper.tasks import pull_crossref_papers, pull_papers
pull_crossref_papers(force=True)
pull_papers(force=True)

Make sure to revert that file once you're done seeding the local environment.
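
The loop change described above follows a general pattern: replace an unbounded loop with one that stops at a fixed limit. The sketch below illustrates that pattern only. It is not the code in src/paper/tasks.py, and pull_bounded and fake_fetch are hypothetical names.

```python
def pull_bounded(fetch_page, limit=100, page_size=25):
    """Call fetch_page(offset, size) until `limit` items are collected
    or the source returns an empty page."""
    items = []
    offset = 0
    while offset < limit:  # bounded, unlike `while True`
        page = fetch_page(offset, min(page_size, limit - offset))
        if not page:
            break
        items.extend(page)
        offset += len(page)
    return items


# Stand-in data source for illustration only.
def fake_fetch(offset, size):
    return list(range(offset, min(offset + size, 1000)))


papers = pull_bounded(fake_fetch, limit=100)
```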

Adding new packages

# add a package to the project environment
poetry add package_name

# update requirements.txt which is used by elastic beanstalk
poetry export -f requirements.txt --output requirements.txt

ELASTICSEARCH (Optional)

In a new shell, run this Docker image script (make sure Redis is running in the background via redis-server)

 # Let this run for ~30 minutes in the background before terminating, be patient :)
./start-es.sh

Back in the python virtual environment, build the indices

python manage.py search_index --rebuild

Optionally, start Kibana for Elastic dev tools

./start-kibana.sh

To view elastic queries via the API, add DEBUG_TOOLBAR = True to keys.py. Then, visit an API url such as http://localhost:8000/api/search/paper/?publish_date__gte=2022-01-01
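
Search URLs like the one above can be assembled programmatically. This is a minimal sketch assuming the /api/search/<index>/ path shape seen in the example URL; the search_url helper is hypothetical, and filter names other than publish_date__gte are not confirmed by this document.

```python
from urllib.parse import urlencode


def search_url(index, base_url="http://localhost:8000", **filters):
    """Build a search URL such as /api/search/paper/?publish_date__gte=..."""
    query = urlencode(filters)
    url = f"{base_url}/api/search/{index}/"
    return f"{url}?{query}" if query else url


url = search_url("paper", publish_date__gte="2022-01-01")
```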

ETHEREUM (Optional)

Create a wallet file in config

touch src/config/wallet.py

Add the following to wallet.py (fill in the blanks)

KEYSTORE_FILE = ''
KEYSTORE_PASSWORD = ''

Add the keystore file to the config directory

Ask a team member for the file or create one from MyEtherWallet https://www.myetherwallet.com/create-wallet

Testing

Run the test suite:

# run all tests
# Note: Add --keepdb flag to speed up the process of running tests locally
python manage.py test

# run tests for the paper app, excluding ones that require AWS secrets
python manage.py test paper --exclude-tag=aws

# run a specific test example:
python manage.py test note.tests.test_note_api.NoteTests.test_create_workspace_note --keepdb

Run in the background for async tasks:

celery -A researchhub worker -l info

Run in the background for periodic tasks (needs celery running)

celery -A researchhub beat -l info

Both celery commands in one (for development only)

celery -A researchhub worker -l info -B

Google Auth

Ask somebody to provide you with CLIENT_ID and SECRET config, and run this SQL query (with updated configs) to seed the right data for Google login to work:

insert into socialaccount_socialapp (provider, name, client_id, secret, key)
values ('google','Google','<CLIENT_ID>', '<SECRET>', '');

insert into django_site (domain, name) values ('http://google.com', 'google.com');

insert into socialaccount_socialapp_sites (socialapp_id, site_id) values (1, 1);

(Make sure the IDs in the last query match the rows created by the first two queries.)
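
One way to avoid hard-coding the IDs is to capture them as the rows are inserted. The sketch below demonstrates the idea with an in-memory SQLite database so it is self-contained. The project itself uses PostgreSQL, and the table definitions here are simplified stand-ins with assumed columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-ins for the real tables (assumed columns only).
conn.executescript("""
    CREATE TABLE socialaccount_socialapp (
        id INTEGER PRIMARY KEY, provider TEXT, name TEXT,
        client_id TEXT, secret TEXT, key TEXT);
    CREATE TABLE django_site (id INTEGER PRIMARY KEY, domain TEXT, name TEXT);
    CREATE TABLE socialaccount_socialapp_sites (
        id INTEGER PRIMARY KEY, socialapp_id INTEGER, site_id INTEGER);
    -- Pre-existing row, so the new site will not get id 1.
    INSERT INTO django_site (domain, name) VALUES ('example.com', 'example.com');
""")

app_id = conn.execute(
    "INSERT INTO socialaccount_socialapp (provider, name, client_id, secret, key)"
    " VALUES (?, ?, ?, ?, ?)",
    ("google", "Google", "<CLIENT_ID>", "<SECRET>", ""),
).lastrowid
site_id = conn.execute(
    "INSERT INTO django_site (domain, name) VALUES (?, ?)",
    ("http://google.com", "google.com"),
).lastrowid

# Link the two rows using the ids that were actually generated.
conn.execute(
    "INSERT INTO socialaccount_socialapp_sites (socialapp_id, site_id) VALUES (?, ?)",
    (app_id, site_id),
)
conn.commit()
```

Here the site row receives id 2 because a site already exists, which is exactly the case where a literal (1, 1) in the last query would link the wrong rows.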

researchhub-backend's People

Contributors

aaronwatson2975, alexcui508, briansantoso, calvinhlee23, craiglu, dhimmel, ebiederstadt, florin-chelaru, gzurowski, joshslee, kerkelae, koutst, le-sun, lightninglu10, manveerxyz, piccoloman, sm-hwang, thomas-vu, v-stickykeys, vlazzle, yattias

researchhub-backend's Issues

Add user contributions serializer

On the frontend we want to display a list of a user's contributions.

On the backend we want to gather and send those contributions in a single api call.

TODO: Determine what the "contributions" that we need to show are

Create summary edit history view

Testing

  1. Send get request to paper edit history endpoint
/paper/<id>/edits/<id>?sort_by=<date,category>
  2. Notice list of edits
history_id: {
  page,
  paper_id,
  items,
  count,
}
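
To make the expected response shape concrete, here is a minimal sketch that sorts a list of edits and returns one page with the page, paper_id, items and count keys listed above. The edit_history_page helper, the page size, and the fields on each edit are assumptions, not the implemented view.

```python
def edit_history_page(paper_id, edits, sort_by="date", page=1, page_size=10):
    """Sort edits by `sort_by` and return one page in the response shape above."""
    if sort_by not in ("date", "category"):
        raise ValueError(f"unsupported sort_by: {sort_by}")
    ordered = sorted(edits, key=lambda edit: edit[sort_by])
    start = (page - 1) * page_size
    return {
        "page": page,
        "paper_id": paper_id,
        "items": ordered[start:start + page_size],
        "count": len(ordered),
    }


# Hypothetical edits for illustration.
edits = [
    {"id": 2, "date": "2019-10-02", "category": "summary"},
    {"id": 1, "date": "2019-10-01", "category": "typo"},
]
response = edit_history_page(7, edits, sort_by="date")
```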

Create edit history item model

Testing

  1. View EditHistory table in Postico and see these columns
id,
paper,
submit_date,
updated_date,
accepted,
score,
upvotes,
downvotes,
additions,
deletions,
created_by,
thread,
is_public,
is_removed,
text,
ip_address

Create models for votes and flags

Testing

  1. Notice tables in postico

Vote

id,
created_by,
item,
created_date,
vote_type

Flag

id,
created_by,
item,
created_date,
reason

Create Model for Subscribed (locked) hub

How does this affect users?

We need a model for subscribed hubs. We need to allow users to add themselves to hubs, and then see themselves added. We should also get a total count of all users subscribed to the hub.

I think this should be a many-to-many model, but it's late and I may be wrong.
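
A many-to-many relation of this kind is typically stored as a join table with one row per (hub, user) pair. The sketch below shows that layout with an in-memory SQLite database, including the subscriber count. Table and function names are hypothetical and this is not the project's Django model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE hub (id INTEGER PRIMARY KEY, name TEXT);
    -- Join table: one row per (hub, user) subscription.
    CREATE TABLE hub_subscribers (
        hub_id INTEGER REFERENCES hub(id),
        user_id INTEGER REFERENCES app_user(id),
        UNIQUE (hub_id, user_id));
    INSERT INTO app_user (id, name) VALUES (1, 'ada');
    INSERT INTO app_user (id, name) VALUES (2, 'grace');
    INSERT INTO hub (id, name) VALUES (1, 'biology');
""")


def subscribe(hub_id, user_id):
    # INSERT OR IGNORE keeps a repeated subscription from duplicating the row.
    conn.execute(
        "INSERT OR IGNORE INTO hub_subscribers (hub_id, user_id) VALUES (?, ?)",
        (hub_id, user_id),
    )


def subscriber_count(hub_id):
    row = conn.execute(
        "SELECT COUNT(*) FROM hub_subscribers WHERE hub_id = ?", (hub_id,)
    ).fetchone()
    return row[0]


subscribe(1, 1)
subscribe(1, 2)
subscribe(1, 1)  # repeated subscription; ignored
```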

Search authors by name

Testing

  1. Go to /api/authors?

Notes

  • Uses django rest framework default query params

TODO: Fill in details

Create summary urls

How does this affect users?

Make changes to a summary

Testing

  1. Create user with low rep
  2. Propose summary change
/paper/<id>/summary/

POST
{
text
}

This will create a new summary proposal.

  3. Increase user rep
  4. Propose summary change
/paper/<id>/summary/

PUT
{
text
}

This will alter the summary text.

  5. Repeat 3 and 4 as paper author and paper moderator

TODO: Include required rep amount

Create view for author contributions

How does this affect users?

Return list of all summary edits by the author

Testing

  1. Get request to /user/<id>/contribution/
  2. Notice response
{
summaries
}

Setup paper urls

How does this affect users?

Can retrieve and add papers

Testing

  1. Send get request to /api/paper
  2. Notice response
    {
    papers
    }
  3. Send post request to /api/paper for paper upload flow

Notes

  • This is dependent on #3

Create model for citation

Testing

  1. See Citations table in Postico with these columns
id,
authors,
date,
title,
paper_id,
publisher,
url

TODO: Pick a citation format

Fix voting

django.core.exceptions.ImproperlyConfigured: Field name authors__id is not valid for model Vote.
[23/Oct/2019 22:24:55] "GET /api/paper/?page=1 HTTP/1.1" 500 21270

Create view for author discussions

How does this affect users?

Retrieve list of all discussion posts by the user

Testing

  1. Get request to /user/<id>/discussion/
  2. Notice response
{
discussion_threads,
discussion_posts,
discussion_replies
}

Setup paper form

How does this affect users?

Can create a new paper

Testing

  1. Create a paper with a post request
/paper/

POST
{
title,
doi,
author_ids,
hub_ids,
publish_date,
pdf,
url
}

Notes

  • Requires auth token in request

Create and retrieve comment replies

How does this affect users?

Testing

  1. Go to paper/<id>/discussion/<id>/comment/<id>/reply
  2. Notice the list of replies to the comment
  3. Send a post request to create a new reply
POST

{
text
}

Model paper summaries

Testing

  1. $ python manage.py migrate
  2. Notice a Summary table with the following
id,
paper,
text

Create view for edit history item

Testing

  1. Send get request to edit history with item id in query
/paper/<id>/edits/<id>
  2. See response items with the following format:
edit_id: {
paper: id,
date_created: timestamp,
date_updated: timestamp,
accepted: true/false,
score: number,
upvotes: number,
downvotes: number,
additions: number,
deletions: number,
created_by: user_id
}

Notes

  • Dependent on #10

Create view for bookmarked papers

How does this affect users?

List of papers bookmarked by this user

Testing

  1. Get request to /user/<id>
  2. Notice response
user_id: {
authored_papers,
bookmarks,
summaries,
discussions
}
  3. Get request to /user/<id>/bookmark
  4. Notice response
{
bookmarks,
page,
count
}

Create form for paper summary edit

How does this affect users?

Can submit summary edit proposal

Testing

  1. Post request to /paper/<id>/summary/ with
{
text
}

Notes

  • Users without enough rep should have their proposal set to is_approved=False
  • Users with sufficient rep will have is_approved=True on the proposal
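
The two notes above amount to a single threshold check. A minimal sketch follows, assuming a placeholder threshold, since the required reputation amount has not been decided; new_proposal and REQUIRED_REP are hypothetical names.

```python
# Hypothetical threshold; the required rep amount is still undecided.
REQUIRED_REP = 50


def new_proposal(text, user_rep, required_rep=REQUIRED_REP):
    """Return a summary proposal with is_approved derived from the user's rep."""
    return {"text": text, "is_approved": user_rep >= required_rep}


low = new_proposal("Draft summary", user_rep=10)
high = new_proposal("Draft summary", user_rep=75)
```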

Create model for paper summary

Testing

  1. In Postico the paper table has a summary column
  2. In Postico the paper_summary table has the following columns
id,
paper,
text,
date_created,
date_updated

Create author invite form

How does this affect users?

Invite user by email to claim an author profile

Testing

  1. Send Post to /user/<id>/invite/
{
email
}

Create upvote/downvote urls for papers

How does this affect users?

Can upvote or downvote a paper if they have enough reputation points.

Testing

  1. POST, PUT, or PATCH to /api/paper/<id>/upvote/ and /api/paper/<id>/downvote/

Add linter

Testing

  1. Push a commit
  2. Notice the flake8 linter runs in a pre-push hook
  3. Add the flake8 plugin to vscode (or whatever IDE you are using)

Remove flagged items from being public

TODO: Determine how exactly we want to do this

  • Change is_public field?
  • Allow admins to change this field?
  • List flagged items by number of flags in admin panel?

Create models for summary revisions

Testing

  1. In Postico the paper_summary table has an edits column
  2. In Postico the paper_summary_edits table has the following columns
paper_summary,
user,
text,
updated_date,
created_date,
upvotes,
downvotes,
score,
approved_by,
is_approved

Notes

  • We can reuse the django_comments models

Setup discussion urls

How does this affect users?

Start thread, upvote, flag, or edit comments on discussion

Testing

  1. Go to the endpoints with an auth token
/paper/<id>/discussion/

GET
{
thread_ids
}

POST
{
text
}

/paper/<id>/discussion/<id>/

GET
{
thread_id,
paper_id,
comment_count
}

/paper/<id>/discussion/<id>/upvote/
/paper/<id>/discussion/<id>/downvote/
/paper/<id>/discussion/<id>/flag/

POST
{
}

/paper/<id>/discussion/<id>/comments/

GET
{
thread_id,
paper_id,
page,
comments,
comment_count
}

Notes

  • Must get an auth token from /signup or /login

TODO: Determine name of endpoints
