Comments (8)
Currently ReCiter output (article assignment results) and scores are written to a CSV file. The goal of this project is to write the scores to the database, while retaining the CSV file output.
from reciter.
Created a table named rc_analysis in the reciter database with the following SQL:
create table rc_analysis (
cwid VARCHAR(50),
analysis_status VARCHAR(50),
target_name VARCHAR(255),
pubmed_search_query VARCHAR(255),
pmid VARCHAR(50),
article_title TEXT,
full_journal_title TEXT,
publication_year VARCHAR(50),
scopus_target_author_affiliation TEXT,
scopus_coauthor_affiliation BLOB,
pubmed_target_author_affiliation TEXT,
pubmed_coauthor_affiliation BLOB,
article_keywords BLOB,
name_matching_score DECIMAL,
cluster_originator BOOL,
journal_similarity_phase_one DECIMAL,
coauthor_affiliation_score DECIMAL,
target_author_affiliation_score DECIMAL,
known_coinvestigator_score DECIMAL,
funding_statement_score DECIMAL,
terminal_degree_score DECIMAL,
default_department_journal_similarity_score DECIMAL,
department_of_affiliation_score DECIMAL,
keyword_matching_score DECIMAL,
phase_two_similarity_threshold DECIMAL,
cluster_article_assigned_to INT,
count_articles_in_assigned_cluster INT,
cluster_selected_in_phase_two_matching INT,
phase_two_affiliation_similarity DECIMAL,
phase_two_keyword_similarity DECIMAL,
phase_two_journal_similarity DECIMAL
);
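For anyone following along, here is a minimal sketch of the stated goal: write each result row to the database while still producing the CSV file. It is not the ReCiter implementation. It uses Python's standard `sqlite3` and `csv` modules as stand-ins, only a small subset of the columns above, and a made-up sample row; the `write_results` helper and the `COLUMNS` list are hypothetical names chosen for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical subset of the rc_analysis columns, for illustration only.
COLUMNS = ["cwid", "pmid", "article_title", "name_matching_score"]


def write_results(rows, csv_file, conn):
    """Write every result row to the CSV file and to the rc_analysis table."""
    # CSV output is retained exactly as before.
    writer = csv.DictWriter(csv_file, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)

    # The same rows are also inserted into the database.
    placeholders = ", ".join("?" for _ in COLUMNS)
    sql = f"INSERT INTO rc_analysis ({', '.join(COLUMNS)}) VALUES ({placeholders})"
    with conn:  # commits on success, rolls back on error
        conn.executemany(sql, [tuple(row[c] for c in COLUMNS) for row in rows])
    return len(rows)


# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rc_analysis ("
    "cwid VARCHAR(50), pmid VARCHAR(50), "
    "article_title TEXT, name_matching_score DECIMAL(10, 4))"
)

# Made-up sample row, not real ReCiter output.
rows = [
    {
        "cwid": "abc1234",
        "pmid": "12345678",
        "article_title": "Example title",
        "name_matching_score": 0.75,
    },
]
csv_out = io.StringIO()
inserted = write_results(rows, csv_out, conn)
```

One thing to check if the target database is MySQL: a bare DECIMAL defaults to DECIMAL(10,0), so fractional scores would be rounded to whole numbers unless a precision and scale are given. The DECIMAL(10, 4) in the sketch is an arbitrary choice, not a value taken from the script above.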
Looks great!
Finished initial beta commit to write to db. Still need to work on converters.
Updated sql script.
create table rc_analysis (
cwid VARCHAR(50),
analysis_status VARCHAR(50),
target_name VARCHAR(255),
pubmed_search_query VARCHAR(255),
pmid VARCHAR(50),
article_title TEXT,
full_journal_title TEXT,
publication_year VARCHAR(50),
scopus_target_author_affiliation TEXT,
scopus_coauthor_affiliation BLOB,
pubmed_target_author_affiliation TEXT,
pubmed_coauthor_affiliation BLOB,
article_keywords BLOB,
cluster_originator BOOL,
cluster_article_assigned_to INT,
count_articles_in_assigned_cluster INT,
cluster_selected_in_phase_two_matching INT,
email_score DECIMAL,
department_score DECIMAL,
known_coinvestigator_score DECIMAL,
affiliation_score DECIMAL,
scopus_score DECIMAL,
coauthor_score DECIMAL,
journal_score DECIMAL,
citizenship_score DECIMAL
);
Added gold_standard column.
create table rc_analysis (
cwid VARCHAR(50),
analysis_status VARCHAR(50),
gold_standard INT,
target_name VARCHAR(255),
pubmed_search_query VARCHAR(255),
pmid VARCHAR(50),
article_title TEXT,
full_journal_title TEXT,
publication_year VARCHAR(50),
scopus_target_author_affiliation TEXT,
scopus_coauthor_affiliation TEXT,
pubmed_target_author_affiliation TEXT,
pubmed_coauthor_affiliation TEXT,
article_keywords TEXT,
cluster_originator BOOL,
cluster_article_assigned_to INT,
count_articles_in_assigned_cluster INT,
cluster_selected_in_phase_two_matching INT,
email_score DECIMAL,
department_score DECIMAL,
known_coinvestigator_score DECIMAL,
affiliation_score DECIMAL,
scopus_score DECIMAL,
coauthor_score DECIMAL,
journal_score DECIMAL,
citizenship_score DECIMAL
);
Added columns for bachelors and doctoral year discrepancy.
create table rc_analysis (
cwid VARCHAR(50),
analysis_status VARCHAR(50),
gold_standard INT,
target_name VARCHAR(255),
pubmed_search_query VARCHAR(255),
pmid VARCHAR(50),
article_title TEXT,
full_journal_title TEXT,
publication_year VARCHAR(50),
scopus_target_author_affiliation TEXT,
scopus_coauthor_affiliation TEXT,
pubmed_target_author_affiliation TEXT,
pubmed_coauthor_affiliation TEXT,
article_keywords TEXT,
cluster_originator BOOL,
cluster_article_assigned_to INT,
count_articles_in_assigned_cluster INT,
cluster_selected_in_phase_two_matching INT,
email_score DECIMAL,
department_score DECIMAL,
known_coinvestigator_score DECIMAL,
affiliation_score DECIMAL,
scopus_score DECIMAL,
coauthor_score DECIMAL,
journal_score DECIMAL,
citizenship_score DECIMAL,
bachelors_year_discrepancy_score DECIMAL,
doctoral_year_dicrepancy_score DECIMAL
);
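As an aside, once the table holds data, new score columns could be added in place instead of re-running the full create script. The sketch below shows one way to do that, again using Python's `sqlite3` as a stand-in for the real database. The `add_missing_columns` helper is a hypothetical name, the starting table is a cut-down version of the one above, and the sample row is made up. `PRAGMA table_info` is SQLite-specific, so another database would need its own way of listing existing columns.

```python
import sqlite3


def add_missing_columns(conn, table, columns):
    """Add each (name, type) column that the table does not already have.

    Identifiers are interpolated into the SQL text, so only pass trusted,
    hard-coded names.
    """
    # Column 1 of each PRAGMA table_info row is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, sql_type in columns:
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}")
            added.append(name)
    return added


# Cut-down stand-in for the existing table, with one made-up row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rc_analysis (cwid VARCHAR(50), citizenship_score DECIMAL)")
conn.execute("INSERT INTO rc_analysis VALUES ('abc1234', 1)")

new_columns = [
    ("bachelors_year_discrepancy_score", "DECIMAL"),
    ("doctoral_year_dicrepancy_score", "DECIMAL"),  # spelled as in the script above
]
added = add_missing_columns(conn, "rc_analysis", new_columns)
second_run = add_missing_columns(conn, "rc_analysis", new_columns)  # nothing left to add
```

Existing rows are kept and get NULL in the new columns, and running the helper a second time is a no-op.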
Some scores remain to be added, but the infrastructure is in place.