Store output and scores in the database (reciter issue, closed)

wcmc-its commented on May 30, 2024

Store output and scores in the database

Comments (8)

michaelbales1 commented on May 30, 2024

Currently ReCiter output (article assignment results) and scores are written to a CSV file. The goal of this project is to write the scores to the database, while retaining the CSV file output.

jl987-Jie commented on May 30, 2024

Created a table named rc_analysis in the reciter database with the following SQL:

create table rc_analysis (
    cwid VARCHAR(50),
    analysis_status VARCHAR(50),
    target_name VARCHAR(255),
    pubmed_search_query VARCHAR(255),
    pmid VARCHAR(50),
    article_title TEXT,
    full_journal_title TEXT,
    publication_year VARCHAR(50),
    scopus_target_author_affiliation TEXT,
    scopus_coauthor_affiliation BLOB,
    pubmed_target_author_affiliation TEXT,
    pubmed_coauthor_affiliation BLOB,
    article_keywords BLOB,
    name_matching_score DECIMAL,
    cluster_originator BOOL,
    journal_similarity_phase_one DECIMAL,
    coauthor_affiliation_score DECIMAL,
    target_author_affiliation_score DECIMAL,
    known_coinvestigator_score DECIMAL,
    funding_statement_score DECIMAL,
    terminal_degree_score DECIMAL,
    default_department_journal_similarity_score DECIMAL,
    department_of_affiliation_score DECIMAL,
    keyword_matching_score DECIMAL,
    phase_two_similarity_threshold DECIMAL,
    cluster_article_assigned_to INT,
    count_articles_in_assigned_cluster INT,
    cluster_selected_in_phase_two_matching INT,
    phase_two_affiliation_similarity DECIMAL,
    phase_two_keyword_similarity DECIMAL,
    phase_two_journal_similarity DECIMAL
);
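The goal stated above (write the scores to the database while keeping the CSV output) could be sketched roughly as follows. This is only an illustration: it uses Python's standard-library sqlite3 module as a stand-in for the real database, a four-column subset of rc_analysis, and a hypothetical helper named write_results. None of it is ReCiter's actual code.

```python
import csv
import io
import sqlite3

# Illustrative subset of the rc_analysis columns (the real table has many more).
COLUMNS = ["cwid", "pmid", "article_title", "name_matching_score"]


def write_results(rows, csv_file, conn):
    """Write every result row to the CSV stream and to the rc_analysis table."""
    # CSV output is retained unchanged: header line followed by one line per row.
    writer = csv.writer(csv_file)
    writer.writerow(COLUMNS)
    writer.writerows(rows)

    # The same rows go to the database through a parameterized INSERT.
    placeholders = ", ".join("?" for _ in COLUMNS)
    sql = f"INSERT INTO rc_analysis ({', '.join(COLUMNS)}) VALUES ({placeholders})"
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(sql, rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rc_analysis ("
        "cwid VARCHAR(50), pmid VARCHAR(50), "
        "article_title TEXT, name_matching_score DECIMAL)"
    )
    buffer = io.StringIO()  # stands in for the per-author .csv file
    sample = [("abc1234", "12345678", "An example title", 0.75)]  # made-up row
    write_results(sample, buffer, conn)
    print(buffer.getvalue())
    print(conn.execute("SELECT COUNT(*) FROM rc_analysis").fetchone()[0])
```

Writing both outputs from one function keeps the CSV and the table in step, since both are produced from the same list of rows.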

michaelbales1 commented on May 30, 2024

Looks great!

jl987-Jie commented on May 30, 2024

Finished initial beta commit to write to db. Still need to work on converters.

jl987-Jie commented on May 30, 2024

Updated SQL script.

create table rc_analysis (
    cwid VARCHAR(50),
    analysis_status VARCHAR(50),
    target_name VARCHAR(255),
    pubmed_search_query VARCHAR(255),
    pmid VARCHAR(50),
    article_title TEXT,
    full_journal_title TEXT,
    publication_year VARCHAR(50),
    scopus_target_author_affiliation TEXT,
    scopus_coauthor_affiliation BLOB,
    pubmed_target_author_affiliation TEXT,
    pubmed_coauthor_affiliation BLOB,
    article_keywords BLOB,
    cluster_originator BOOL,
    cluster_article_assigned_to INT,
    count_articles_in_assigned_cluster INT,
    cluster_selected_in_phase_two_matching INT,
    email_score DECIMAL,
    department_score DECIMAL,
    known_coinvestigator_score DECIMAL,
    affiliation_score DECIMAL,
    scopus_score DECIMAL,
    coauthor_score DECIMAL,
    journal_score DECIMAL,
    citizenship_score DECIMAL
);
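One thing that may be worth checking in the script above: if the database is MySQL (an assumption, since the thread does not name the engine), a bare DECIMAL column is treated as DECIMAL(10,0), which keeps no fractional digits, so a score such as 0.85 would be rounded to a whole number on insert. The sketch below only simulates that effect with Python's decimal module. store_as_decimal is a hypothetical helper, and the half-up rounding mode is an assumption about how the engine rounds.

```python
from decimal import Decimal, ROUND_HALF_UP


def store_as_decimal(value, scale):
    """Round value to `scale` fractional digits, roughly as DECIMAL(p, scale) would."""
    # scale=0 gives a quantum of 1, scale=4 gives a quantum of 0.0001.
    quantum = Decimal(1).scaleb(-scale)
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print(store_as_decimal(0.85, 0))  # scale 0: the fractional part is lost
    print(store_as_decimal(0.85, 4))  # scale 4: the fractional part survives
```

If that applies here, declaring the score columns with an explicit precision and scale, for example DECIMAL(6,4), would preserve fractional scores.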

jl987-Jie commented on May 30, 2024

Added gold_standard column.

create table rc_analysis (
    cwid VARCHAR(50),
    analysis_status VARCHAR(50),
    gold_standard INT,
    target_name VARCHAR(255),
    pubmed_search_query VARCHAR(255),
    pmid VARCHAR(50),
    article_title TEXT,
    full_journal_title TEXT,
    publication_year VARCHAR(50),
    scopus_target_author_affiliation TEXT,
    scopus_coauthor_affiliation TEXT,
    pubmed_target_author_affiliation TEXT,
    pubmed_coauthor_affiliation TEXT,
    article_keywords TEXT,
    cluster_originator BOOL,
    cluster_article_assigned_to INT,
    count_articles_in_assigned_cluster INT,
    cluster_selected_in_phase_two_matching INT,
    email_score DECIMAL,
    department_score DECIMAL,
    known_coinvestigator_score DECIMAL,
    affiliation_score DECIMAL,
    scopus_score DECIMAL,
    coauthor_score DECIMAL,
    journal_score DECIMAL,
    citizenship_score DECIMAL
);

jl987-Jie commented on May 30, 2024

Added columns for bachelors and doctoral year discrepancy.

create table rc_analysis (
    cwid VARCHAR(50),
    analysis_status VARCHAR(50),
    gold_standard INT,
    target_name VARCHAR(255),
    pubmed_search_query VARCHAR(255),
    pmid VARCHAR(50),
    article_title TEXT,
    full_journal_title TEXT,
    publication_year VARCHAR(50),
    scopus_target_author_affiliation TEXT,
    scopus_coauthor_affiliation TEXT,
    pubmed_target_author_affiliation TEXT,
    pubmed_coauthor_affiliation TEXT,
    article_keywords TEXT,
    cluster_originator BOOL,
    cluster_article_assigned_to INT,
    count_articles_in_assigned_cluster INT,
    cluster_selected_in_phase_two_matching INT,
    email_score DECIMAL,
    department_score DECIMAL,
    known_coinvestigator_score DECIMAL,
    affiliation_score DECIMAL,
    scopus_score DECIMAL,
    coauthor_score DECIMAL,
    journal_score DECIMAL,
    citizenship_score DECIMAL,
    bachelors_year_discrepancy_score DECIMAL,
    doctoral_year_discrepancy_score DECIMAL
);
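Each revision in this thread re-posts the full CREATE TABLE statement. If the table already holds rows, new columns could instead be added in place with ALTER TABLE ... ADD COLUMN. A minimal sketch, using sqlite3 as a stand-in for the real database: add_missing_columns is a hypothetical helper, and PRAGMA table_info is SQLite-specific (on MySQL the existing columns would have to be read from information_schema.columns instead).

```python
import sqlite3


def add_missing_columns(conn, table, columns):
    """Add each (name, sql_type) column that `table` does not already have.

    Table and column names are interpolated into the SQL text, so they must
    come from trusted code, never from user input.
    """
    # Each row of PRAGMA table_info is (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, sql_type in columns:
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}")
            added.append(name)
    return added


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rc_analysis (cwid VARCHAR(50), pmid VARCHAR(50))")
    new_columns = [
        ("gold_standard", "INT"),
        ("bachelors_year_discrepancy_score", "DECIMAL"),
    ]
    print(add_missing_columns(conn, "rc_analysis", new_columns))  # both added
    print(add_missing_columns(conn, "rc_analysis", new_columns))  # nothing left to add
```

Because the helper skips columns that already exist, it can be run repeatedly as more scores are introduced.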

michaelbales1 commented on May 30, 2024

Some scores remain to be added, but the infrastructure is in place.
