covid-19-net / covid-19-community
Community effort to build a Neo4j Knowledge Graph (KG) that links heterogeneous data about COVID-19
License: MIT License
The binder.pangeo.io service has been shut down due to crypto mining.
The URLs to download the Mexican confirmed cases and deaths return a 404 error (see: https://github.com/covid-19-net/covid-19-community/blob/master/notebooks/dataprep/02d-GOBMXCases.ipynb).
We need to find out if there is a different way to download these data, e.g., from here
Hello,
Your dataset was added to CoronaWhy (https://www.coronawhy.org/) Data Lake on Dataverse as a piece of common COVID-19 data https://datasets.coronawhy.org/dataset.xhtml?persistentId=doi:10.5072/FK2/D1Q9MF
Would you be willing to help with the maintenance of your dataset in Dataverse, e.g. adding the relevant metadata and keeping the dataset up-to-date? That will help to make the dataset findable and accessible for the medical science community.
We start the Neo4j database in Jupyter Lab. How can we check that the database is up and running before we start executing Cypher queries with py2neo?
See: https://github.com/covid-19-net/covid-19-community/blob/master/notebooks/3-ExampleQueries.ipynb
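One possible check, sketched under assumptions: poll the Bolt port until it accepts TCP connections before running any queries. The helper name wait_for_port is hypothetical, and localhost with port 7687 (Neo4j's default Bolt port) is an assumption about the local setup, not something taken from the notebook. An open port only shows that the server is listening, so a trivial query such as RETURN 1 is still a sensible final check.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once host:port accepts a TCP connection, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, wait_for_port("localhost", 7687) could be called right after the cell that starts the database, and the queries skipped if it returns False.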
Explore options to represent time series data about COVID-19 outbreak in the Neo4j graph.
The data are here:
https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv
See for example:
https://community.neo4j.com/t/how-to-model-timeseries-data-in-property-graph/8713
https://github.com/graphaware/neo4j-timetree
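One option, as a rough sketch: reshape the wide CSSE table (one column per date) into one row per location and date, which could then be loaded as one node or relationship per observation. The function name wide_to_long and the output field names are illustrative assumptions. The column layout (Province/State, Country/Region, Lat, Long, followed by m/d/yy date columns) is assumed to match the linked file, and the sample values are made up.

```python
import csv
import io
from datetime import datetime

FIXED_COLUMNS = ("Province/State", "Country/Region", "Lat", "Long")


def wide_to_long(csv_text: str):
    """Yield one dict per (location, date) from a wide time-series CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    date_columns = [c for c in reader.fieldnames if c not in FIXED_COLUMNS]
    for row in reader:
        for col in date_columns:
            yield {
                "province": row["Province/State"],
                "country": row["Country/Region"],
                # Convert m/d/yy to ISO 8601 so dates sort and compare correctly
                "date": datetime.strptime(col, "%m/%d/%y").date().isoformat(),
                # Assumes every cell holds an integer count
                "confirmed": int(row[col]),
            }


# Made-up sample in the assumed layout, not real case counts
sample = (
    "Province/State,Country/Region,Lat,Long,1/22/20,1/23/20\n"
    "Example Province,Example Country,0.0,0.0,10,15\n"
)
rows = list(wide_to_long(sample))
```

Each resulting row carries an ISO date, so it can be attached to a location node directly or linked into a time tree such as the one referenced above.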
*** Downlading CNCB Strain Data ***
Downloading: ftp://download.big.ac.cn/GVM/Coronavirus/gff3/a
--2021-04-06 01:30:07-- ftp://download.big.ac.cn/GVM/Coronavirus/gff3/a
=> ‘/var/lib/neo4j/import/cache/raw/cncb/a/.listing’
Resolving download.big.ac.cn (download.big.ac.cn)... 124.16.164.229
Connecting to download.big.ac.cn (download.big.ac.cn)|124.16.164.229|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD (1) /GVM/Coronavirus/gff3 ...
No such directory ‘GVM/Coronavirus/gff3’.
The notebook 01f-PDBStructure.ipynb fails because the PDBj web service is broken.
The web service URL https://pdbj.org/rest/mine2_sql does not work anymore. We need to find a replacement. I emailed PDBj to see if there is a new service we can use.
The new version is: 3.0.3
It throws 'error_perm: 550 Failed to change directory' while downloading and caching data files with variant information. Could you please upload the data itself to the repository, if possible?
I ran the Cypher queries with Neo4j; however, I am confused about how to use the notebooks. Is my assumption correct that we first need to build the graph by running all Cypher queries, and then connect the notebooks to the Neo4j database to run them?
ValueError Traceback (most recent call last)
in
1 df[['geoName1','geoName2']] = df[['geoName1','geoName2']].apply(lambda x: x if x[0] != '' else [x[1],x[0]], axis=1)
----> 2 df[['geoName2','geoName3']] = df[['geoName2','geoName3']].apply(lambda x: x if x[0] != '' else [x[1],x[0]], axis=1)
In the README file, for the data preparation step, do the files inside dataprep have to be run in numerical or alphanumerical order? When I tried to run them numerically, i.e., 00b-NCBITaxonomy.ipynb, an error occurred. Also, when I ran them alphanumerically, 5 files reported errors because they could not find the required files in the cache folder.
It looks like these have been marked as obsolete. Is 10-GeoLink still needed?
df4 = pd.read_csv(NEO4J_IMPORT / '02b-CDSCases.csv', dtype='str', usecols=['origLocation'])
df5 = pd.read_csv(NEO4J_IMPORT / '02d-GOBMXCasesAdmin1.csv', dtype='str', usecols=['origLocation'])
df6 = pd.read_csv(NEO4J_IMPORT / '02d-GOBMXCasesAdmin2.csv', dtype='str', usecols=['origLocation'])
USCensus zip level API returns an additional column: state
Use a log scale to display the number of strains per country in:
https://github.com/covid-19-net/covid-19-community/blob/master/notebooks/analyses/StrainB.1.1.7.ipynb
A Jupyter Lab plugin is urgently needed to visualize and interactively explore the Neo4j KG.
Here is a list of JS libraries:
https://neo4j.com/developer/tools-graph-visualization/
https://ipython-cypher.readthedocs.io/en/latest/
https://nicolewhite.github.io/neo4j-jupyter/hello-world.html
Dear All,
I am writing to briefly introduce the COVID-19 Disease Map community project: https://covid.pages.uni.lu/, which includes manually curated, domain-expert-approved information on COVID-19 disease mechanisms. We also have a Neo4j-dedicated component, and we are interested in establishing communication between COVID-19-Net and the COVID-19 Disease Map project. I look forward to hearing your opinion on this.
Thank you for your time.
Best regards,
Irina Balaur
Failed to build ipycytoscape
Pip subprocess error:
Running command git clone -q https://github.com/pwrose/ipycytoscape.git /tmp/pip-req-build-_8vpzekr
ERROR: Command errored out with exit status 1:
command: /srv/conda/envs/notebook/bin/python /srv/conda/envs/notebook/lib/python3.7/site-packages/pip/_vendor/pep517/_in_process.py build_wheel /tmp/tmpdkkmg4qe
cwd: /tmp/pip-req-build-_8vpzekr
Complete output (122 lines):
running bdist_wheel
running jsdeps
Installing build dependencies with npm. This may take a while...
npm install
added 1246 packages, and audited 1313 packages in 44s
21 vulnerabilities (13 low, 5 moderate, 3 high)
To address issues that do not require attention, run:
npm audit fix
To address all issues (including breaking changes), run:
npm audit fix --force
Run npm audit for details.
npm notice
npm notice New minor version of npm available! 7.0.8 -> 7.11.0
npm notice Changelog: https://github.com/npm/cli/releases/tag/v7.11.0
npm notice Run npm install -g npm@7.11.0 to update!
npm notice
npm run build
[email protected] build
npm run build:lib && npm run build:all
[email protected] build:lib
tsc
[email protected] build:all
npm run build:labextension && npm run build:nbextension
[email protected] build:labextension
npm run clean:labextension && jupyter labextension build .
[email protected] clean:labextension
rimraf ipycytoscape/labextension
Traceback (most recent call last):
File "/tmp/pip-build-env-lgjks042/overlay/bin/jupyter-labextension", line 5, in <module>
from jupyterlab.labextensions import main
File "/tmp/pip-build-env-lgjks042/overlay/lib/python3.7/site-packages/jupyterlab/__init__.py", line 7, in <module>
from .labapp import LabApp
File "/tmp/pip-build-env-lgjks042/overlay/lib/python3.7/site-packages/jupyterlab/labapp.py", line 14, in <module>
from jupyter_server._version import version_info as jpserver_version_info
File "/tmp/pip-build-env-lgjks042/overlay/lib/python3.7/site-packages/jupyter_server/__init__.py", line 15, in <module>
from ._version import version_info, version
File "/tmp/pip-build-env-lgjks042/overlay/lib/python3.7/site-packages/jupyter_server/_version.py", line 5, in <module>
from jupyter_packaging import get_version_info
ImportError: cannot import name 'get_version_info' from 'jupyter_packaging' (/tmp/pip-build-env-lgjks042/overlay/lib/python3.7/site-packages/jupyter_packaging/__init__.py)
npm ERR! code 1
npm ERR! path /tmp/pip-req-build-_8vpzekr
npm ERR! command failed
npm ERR! command sh -c npm run clean:labextension && jupyter labextension build .
npm ERR! A complete log of this run can be found in:
npm ERR! /home/jovyan/.npm/_logs/2021-04-23T07_57_47_862Z-debug.log
npm ERR! code 1
npm ERR! path /tmp/pip-req-build-_8vpzekr
npm ERR! command failed
npm ERR! command sh -c npm run build:labextension && npm run build:nbextension
npm ERR! A complete log of this run can be found in:
npm ERR! /home/jovyan/.npm/_logs/2021-04-23T07_57_47_889Z-debug.log
npm notice
npm notice New minor version of npm available! 7.0.8 -> 7.11.0
npm notice Changelog: https://github.com/npm/cli/releases/tag/v7.11.0
npm notice Run npm install -g npm@7.11.0 to update!
npm notice
npm ERR! code 1
npm ERR! path /tmp/pip-req-build-_8vpzekr
npm ERR! command failed
npm ERR! command sh -c npm run build:lib && npm run build:all
ERROR: Failed building wheel for ipycytoscape
ERROR: Could not build wheels for ipycytoscape which use PEP 517 and cannot be installed directly
CondaEnvException: Pip failed
4.0 pyhd8ed1ab_0 conda-forge/noarch 36 KB
libgcc-ng 9.2.0 h24d8f2e_2 installed
libgcc-ng 9.3.0 h2828fa1_19 conda-forge/linux-64 8 MB
libgomp 9.2.0 h24d8f2e_2 installed
libgomp 9.3.0 h2828fa1_19 conda-forge/linux-64 376 KB
libstdcxx-ng 9.2.0 hdf63c60_2 installed
libstdcxx-ng 9.3.0 h6de172a_19 conda-forge/linux-64 4 MB
openssl 1.1.1g h516909a_0 installed
openssl 1.1.1k h7f98852_0 conda-forge/linux-64 2 MB
sqlite 3.32.3 hcee41ef_1 installed
sqlite 3.35.5 h74cdb3f_0 conda-forge/linux-64 1 MB
tornado 6.0.4 py37h8f50634_1 installed
tornado 6.1 py37h5e8e339_1 conda-forge/linux-64 646 KB
Summary:
Install: 177 packages
Upgrade: 9 packages
Total download: 544 MB
──────────────────────────────────────────────────────────────────────────────────────
time: 293.282
Removing intermediate container d26330eb3b1f
The command '/bin/sh -c TIMEFORMAT='time: %3R' bash -c 'time mamba env update -p ${NB_PYTHON_PREFIX} -f "binder/environment.yml" && time mamba clean --all -f -y && mamba list -p ${NB_PYTHON_PREFIX} '' returned a non-zero code: 1
Corona Data Scraper stopped updates around Oct. 2020.
The plot "Cumulative {pathogen} structures by release date" should use a unique color for each type of protein.
notebooks/analyses/Coronavirus3DStructures.ipynb
Map "feline" to taxonomy:9685
Batch upload all node and relationship files in the /data directory:
See: https://github.com/covid-19-net/covid-19-community/blob/master/notebooks/2-CreateGraph.ipynb
The URLs that were used to get COVID-19 case counts are not available anymore.
As a result, the CNCBVariant notebook is broken: every file is reported as a parsing failure, but in fact there are no CSV files.
In 3-ExampleQueries, the code for starting Neo4j from the Jupyter Notebook does not appear to work on my machine (Windows 10 PC). However, it seems this can be resolved by adjusting some of the code.
What worked for me:
Instead of !"$NEO4J_HOME"/bin/neo4j start, switch the direction of the slashes to !"$NEO4J_HOME"\bin\neo4j start.
The command !"$NEO4J_HOME"\bin\neo4j start may also result in the error "Service neo4j not found". This can be resolved by running !"$NEO4J_HOME"\bin\neo4j uninstall-service, and then !"$NEO4J_HOME"\bin\neo4j install-service.
I am unsure whether the issues I encountered are specific to Windows or only to my own machine, but the changes above allow me to use the notebook as intended.
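A possible way to avoid hard-coding the slash direction, as a sketch: build the launcher path with pathlib and pick the launcher name by platform. The helper neo4j_command is hypothetical. It assumes NEO4J_HOME is set and that the Windows launcher is bin\neo4j.bat, as in the standard Neo4j distributions I am aware of. The install-service workaround described above is not covered.

```python
import os
import platform
import subprocess
from pathlib import PurePosixPath, PureWindowsPath


def neo4j_command(neo4j_home: str, action: str, system: str = "") -> list:
    """Build the argument list for `neo4j <action>` on the given platform."""
    system = system or platform.system()
    if system == "Windows":
        # Assumption: Windows installs ship the launcher as neo4j.bat
        launcher = PureWindowsPath(neo4j_home) / "bin" / "neo4j.bat"
    else:
        launcher = PurePosixPath(neo4j_home) / "bin" / "neo4j"
    return [str(launcher), action]


def start_neo4j() -> None:
    """Start Neo4j using NEO4J_HOME from the environment."""
    subprocess.run(neo4j_command(os.environ["NEO4J_HOME"], "start"), check=True)
```

Calling start_neo4j() from a notebook cell would then replace the shell-escape line on both platforms, at the cost of requiring NEO4J_HOME to be visible to the Python process.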
I can contribute to modeling Events (Covid Test, Information, Death) related to a Person.
Related to that, I would like to collaborate on modeling relationships between people, as well as cell phone tracking and location.
Furthermore, I could contribute to modeling the tracking of people's locations based on the use of credit cards in shops, ATMs, etc.
Thanks in advance for your consideration.
Mario Íñiguez
It seems that 01c-CNCBStrain.csv is gone and 01c-CNCBStrainPre.csv is the only file available. The Id column also doesn't exist anymore. This is breaking a few notebooks and Cypher queries.
In the following notebook https://github.com/covid-19-net/covid-19-community/blob/master/notebooks/3-ExampleQueries.ipynb we stop the Neo4j database at the end.
Often, the shutdown fails.
Does anyone know what causes this issue and how to shut down cleanly?
MemoryError Traceback (most recent call last)
in
1 unique_var = variations[['id', 'variantType', 'start', 'end', 'ref', 'alt', 'variantConsequence',
2 'proteinVariant', 'geneVariant', 'distance', 'proteinPosition', 'proteinAccession',
----> 3 'taxonomyId', 'referenceGenome']].copy()
CNCB seems to have fixed the issue with the `Host column name. Need to update the Host column name.
KeyError Traceback (most recent call last)
in
1 # assign taxonomy id to host
----> 2 df['host'] = df['`Host'].str.strip()
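To make the lookup tolerant of either spelling, one could resolve the column by a normalized name instead of hard-coding it. This is only a sketch: find_column is a hypothetical helper, and it assumes the only variation between versions of the file is a stray backtick, surrounding whitespace, or letter case.

```python
def find_column(columns, wanted: str) -> str:
    """Return the actual column name matching `wanted`, ignoring backticks, spaces and case."""
    def normalize(name: str) -> str:
        return name.replace("`", "").strip().lower()

    target = normalize(wanted)
    for name in columns:
        if normalize(name) == target:
            return name
    raise KeyError(f"no column matching {wanted!r} in {list(columns)}")
```

With a DataFrame, the failing line might then become df['host'] = df[find_column(df.columns, 'Host')].str.strip(), which works whether or not the upstream file still carries the backtick.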
The data format of the ftp://ftp.ebi.ac.uk/pub/databases/Pfam/mappings/pdb_pfam_mapping.txt file has changed since they updated it.
GOBMX data are not available anymore.
The Spike glycoprotein is included twice in the KG. This is due to inconsistencies in the IntAct database.
Several notebooks that load Excel files in the xlsx format don't work anymore:
~/miniconda3/envs/covid-19-community/lib/python3.7/site-packages/pandas/io/excel/_base.py in __init__(self, path_or_buffer, engine, storage_options)
1079 if xlrd_version >= "2":
1080 raise ValueError(
-> 1081 f"Your version of xlrd is {xlrd_version}. In xlrd >= 2.0, "
1082 f"only the xls format is supported. Install openpyxl instead."
1083 )
ValueError: Your version of xlrd is 2.0.1. In xlrd >= 2.0, only the xls format is supported. Install openpyxl instead.
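The error message itself points to the fix: read xlsx files with openpyxl. As a sketch (the helper names are hypothetical, and openpyxl must be installed in the environment), the engine can be chosen by file extension and passed to pandas.read_excel through its engine parameter:

```python
from pathlib import Path


def excel_engine_for(path) -> str:
    """Pick a pandas Excel engine from the file extension."""
    suffix = Path(str(path)).suffix.lower()
    if suffix == ".xlsx":
        return "openpyxl"  # xlrd >= 2.0 no longer reads xlsx
    if suffix == ".xls":
        return "xlrd"
    raise ValueError(f"unsupported Excel extension: {suffix!r}")


def read_excel_any(path, **kwargs):
    """Read an xls or xlsx file with the matching engine."""
    import pandas as pd  # imported lazily so excel_engine_for has no dependencies

    return pd.read_excel(path, engine=excel_engine_for(path), **kwargs)
```

Replacing the affected pd.read_excel calls with read_excel_any, or simply adding engine='openpyxl' where only xlsx files are read, should avoid the ValueError without pinning xlrd to an older version.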
We need to refactor the code that processes variants to re-instate the update process.
Fri Apr 9 14:00:02 UTC 2021
The transaction has been terminated. Retry your operation in a new transaction, and you should see a successful result. The transaction has been terminated, so no more locks can be acquired. This can occur because the transaction ran longer than the configured transaction timeout, or because a human operator manually terminated the transaction, or because the database is shutting down. ForsetiClient[7]
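Since the message suggests retrying in a new transaction, a generic retry wrapper is one way to approach it. This sketch is not tied to any Neo4j driver: run_with_retry and its parameters are hypothetical, and which exception types are actually transient depends on the client library in use. If the cause is the configured transaction timeout, retrying alone will not help and the update would need to be split into smaller transactions.

```python
import time


def run_with_retry(operation, attempts: int = 3, delay: float = 1.0, retry_on=(Exception,)):
    """Call operation() up to `attempts` times, doubling the delay after each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)
            delay *= 2
```

The operation passed in should open its own new transaction on every call, so that a retry never reuses the terminated one.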