ranking-agent / aragorn

A Translator ARA combining asynchronous database querying, answer coalescence, and answer ranking.

License: MIT License

Languages: Python 98.70%, Dockerfile 0.71%, Shell 0.59%
Topics: ara, ncats-translator, trapi

aragorn's Introduction

ARAGORN

Autonomous Relay Agent for Generation Of Ranked Networks (ARAGORN)

A tool to query Knowledge Providers (KPs) and synthesize highly ranked answers relevant to user-specified questions.

  • Operates in a federated knowledge environment.
  • Bridges the precision mismatch between data specificity in KPs and more abstract levels of user queries.
  • Generalizes answer ranking.
  • Normalizes data to use preferred and equivalent identifiers.

The ARAGORN tool relies on a number of external services to perform a standardized ranking of a user-specified question.

  • Strider - Accepts a query and provides knowledge-provider querying, answer generation and ranking.
  • Answer Coalesce - Accepts a query containing Strider answers and returns answers that have been coalesced by property, graph and/or ontology analysis.
  • Node normalization - A Translator SRI service that provides the preferred CURIE and equivalent identifiers for data in the query.
  • ARAGORN Ranker - Accepts a query and provides Omnicorp overlays, score and weight-correctness rankings of coalesced answers.

Demonstration

A live version of the API can be found here.

Source Code

Below you will find references that detail the standards, web services and supporting tools that are part of ARAGORN.

Installation

This version of ARAGORN has all links to subordinate services hard coded. In the future, these links will be defined in the Kubernetes configuration files.

In the meantime some manual edits will be needed in the src/service_aggregator.py file to support your installation.

Subordinate services

The ARAGORN subordinate services will have to be deployed prior to the stand-up of ARAGORN. Please reference the following READMEs for more information on standing those up:

Command line installation

cd <aragorn codebase root>

python<version> -m venv venv
source venv/bin/activate

Install dependencies

pip install -r requirements.txt

Run Script

cd <aragorn root>

./main.sh

Docker installation

Or build an image and run it.

cd <aragorn root>

docker build --tag <image_tag> .

Then start the container

docker run --name aragorn -p 8080:4868 <image_tag>

Kubernetes configurations

Kubernetes configurations and helm charts for this project can be found at:

https://github.com/helxplatform/translator-devops/helm/aragorn

aragorn's People

Contributors

cbizon, dnlrkorn, jdr0887, maximusunc, patrickkwang, phillipsowen, uhbrar, yaphetkg

aragorn's Issues

self-subclasses as edges?

See NCATSTranslator/testing#130

For queries like (A_fixed)-(B*)-(C_fixed), sometimes you can return C for B, since C is a subclass of itself. So you get results like
(A)-(C)-[subclass_of]-(C).

Now, it's not wrong necessarily, but it's probably not what's intended.

This will occur any time A has a direct link to C in this kind of query.
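
One way to screen for such results is to check whether two different query nodes end up bound to the same knowledge-graph node. The sketch below assumes TRAPI-style results in which `node_bindings` maps each query-node key to a list of `{"id": ...}` objects; the helper name `binds_same_node_twice` and the `EX:` identifiers are made up for illustration, and whether these results should actually be dropped is still an open question.

```python
def binds_same_node_twice(result: dict) -> bool:
    """Return True if two different query nodes are bound to the same node ID."""
    first_qnode_for = {}
    for qnode, bindings in result.get("node_bindings", {}).items():
        for binding in bindings:
            # setdefault records the first query node seen for this ID.
            if first_qnode_for.setdefault(binding["id"], qnode) != qnode:
                return True
    return False


# Toy results for (A_fixed)-(B*)-(C_fixed); the IDs are placeholders.
results = [
    {"node_bindings": {"a": [{"id": "EX:A"}], "b": [{"id": "EX:C"}], "c": [{"id": "EX:C"}]}},
    {"node_bindings": {"a": [{"id": "EX:A"}], "b": [{"id": "EX:B"}], "c": [{"id": "EX:C"}]}},
]
kept = [r for r in results if not binds_same_node_twice(r)]
```

Here `kept` contains only the second result, in which B is bound to something other than C.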

Should we combine in results? Where should it happen?

Consider this one hop query:

query = {
    "message": {
        "query_graph": {
            "edges": {
                "e01": {
                    "object": "n0",
                    "predicates": [
                        "biolink:treats"
                    ],
                    "subject": "n1"
                }
            },
            "nodes": {
                "n0": {
                    "ids": [
                        "MONDO:0021187"
                    ]
                },
                "n1": {
                    "categories": [
                        "biolink:SmallMolecule"
                    ]
                }
            }
        }
    }
}

For some values of n1, two edges come back: one with the predicate "treats" and one with "approved_to_treat", which is a subclass of "treats".

Edge 1:

{
    "subject": "PUBCHEM.COMPOUND:24875259",
    "object": "MONDO:0021187",
    "predicate": "biolink:treats",
    "attributes": [
        {
            "attribute_type_id": "biolink:aggregator_knowledge_source",
            "value": "infores:molepro",
            "value_type_id": "biolink:InformationResource",
            "original_attribute_name": "biolink:aggregator_knowledge_source",
            "value_url": null,
            "attribute_source": "infores:molepro",
            "description": "Molecular Data Provider",
            "attributes": null
        },
        {
            "attribute_type_id": "biolink:aggregator_knowledge_source",
            "value": "infores:molepro",
            "value_type_id": "biolink:InformationResource",
            "original_attribute_name": "biolink:aggregator_knowledge_source",
            "value_url": null,
            "attribute_source": "infores:chembl",
            "description": "Molecular Data Provider",
            "attributes": []
        },
        {
            "attribute_type_id": "biolink:aggregator_knowledge_source",
            "value": "infores:aragorn",
            "value_type_id": null,
            "original_attribute_name": null,
            "value_url": null,
            "attribute_source": null,
            "description": null,
            "attributes": null
        },
        {
            "attribute_type_id": "biolink:primary_knowledge_source",
            "value": "infores:chembl",
            "value_type_id": "biolink:InformationResource",
            "original_attribute_name": "biolink:primary_knowledge_source",
            "value_url": null,
            "attribute_source": "infores:chembl",
            "description": "MolePro's ChEMBL indication transformer",
            "attributes": []
        },
        {
            "attribute_type_id": "biolink:FDA_approval_status",
            "value": "FDA Clinical Research Phase 2",
            "value_type_id": "biolink:FDA_approval_status_enum",
            "original_attribute_name": "max phase for indication",
            "value_url": null,
            "attribute_source": "infores:chembl",
            "description": null,
            "attributes": []
        },
        {
            "attribute_type_id": "biolink:Publication",
            "value": "NCT02719028",
            "value_type_id": "string",
            "original_attribute_name": "ClinicalTrials",
            "value_url": "https://clinicaltrials.gov/search?id=%22NCT02719028%22",
            "attribute_source": "infores:chembl",
            "description": null,
            "attributes": []
        }
    ]
}

Edge 2:

{
    "subject": "PUBCHEM.COMPOUND:24875259",
    "object": "MONDO:0021187",
    "predicate": "biolink:approved_to_treat",
    "attributes": [
        {
            "attribute_type_id": "biolink:aggregator_knowledge_source",
            "value": "infores:aragorn",
            "value_type_id": null,
            "original_attribute_name": null,
            "value_url": null,
            "attribute_source": null,
            "description": null,
            "attributes": null
        },
        {
            "attribute_type_id": "biolink:primary_knowledge_source",
            "value": [
                "infores:chembl"
            ],
            "value_type_id": "biolink:InformationResource",
            "original_attribute_name": null,
            "value_url": null,
            "attribute_source": null,
            "description": null,
            "attributes": null
        },
        {
            "attribute_type_id": "biolink:aggregator_knowledge_source",
            "value": [
                "infores:biothings-explorer"
            ],
            "value_type_id": "biolink:InformationResource",
            "original_attribute_name": null,
            "value_url": null,
            "attribute_source": null,
            "description": null,
            "attributes": null
        },
        {
            "attribute_type_id": "biolink:aggregator_knowledge_source",
            "value": [
                "infores:mychem-info"
            ],
            "value_type_id": "biolink:InformationResource",
            "original_attribute_name": null,
            "value_url": null,
            "attribute_source": null,
            "description": null,
            "attributes": null
        }
    ]
}

Currently, this creates 2 results in strider and hence in aragorn. Each result binds to one of the two edges. These edges are not going to get merged in the KG because they are different predicates. But maybe they should get rolled into a single result for scoring.

If so, where would this occur? Strider? AC? Elsewhere?

Note that there will be consortium-level discussions coming about what ARAs are supposed to do in cases like this. The last time it was discussed, the answer was that ARAs were free to do as they chose, but now that we want to merge ARA results, we might need to revisit.
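
If the answer turns out to be "roll them into one result", the merge itself could look roughly like the sketch below, wherever it ends up living. It assumes TRAPI-style results with `node_bindings` and `edge_bindings`, groups results that bind the same nodes, and pools their edge bindings. The function names and the edge IDs `edge-treats` and `edge-approved` are hypothetical.

```python
def node_key(result: dict) -> tuple:
    """Hashable summary of which knowledge-graph nodes a result binds to."""
    return tuple(
        (qnode, tuple(sorted(b["id"] for b in bindings)))
        for qnode, bindings in sorted(result["node_bindings"].items())
    )


def merge_results_by_nodes(results: list) -> list:
    """Combine results with identical node bindings, pooling their edge bindings."""
    merged = {}
    for result in results:
        target = merged.setdefault(
            node_key(result),
            {"node_bindings": result["node_bindings"], "edge_bindings": {}},
        )
        for qedge, bindings in result["edge_bindings"].items():
            pooled = target["edge_bindings"].setdefault(qedge, [])
            for binding in bindings:
                if binding not in pooled:
                    pooled.append(binding)
    return list(merged.values())


nodes = {
    "n0": [{"id": "MONDO:0021187"}],
    "n1": [{"id": "PUBCHEM.COMPOUND:24875259"}],
}
results = [
    {"node_bindings": nodes, "edge_bindings": {"e01": [{"id": "edge-treats"}]}},
    {"node_bindings": nodes, "edge_bindings": {"e01": [{"id": "edge-approved"}]}},
]
merged = merge_results_by_nodes(results)
```

The two input results collapse into a single result whose `e01` binding lists both edges, so the scorer would see both pieces of evidence together.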

Failing Query

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "category": [
                        "biolink:ChemicalSubstance"
                    ],
                    "is_set": false,
                    "name": "Chemical Substance"
                },
                "n1": {
                    "id": "MONDO:0018150",
                    "is_set": false,
                    "name": "Gaucher disease"
                }
            },
            "edges": {
                "e0": {
                    "subject": "n0",
                    "object": "n1",
                    "predicate": [
                        "biolink:treats"
                    ]
                }
            }
        }
    }
}

This query returns ~100 results in strider, but fails with an error in aragorn.

Workflow Implemented

The TRAPI input has a workflow section with operations that must be completed in the order specified by the workflow.

Due: July 29, 2021

Details in architecture repo Git issue here.

Passing logs kills ranker

If the (input) logs field is not empty, then omnicorp (and probably other services) throws a 500.

This appears to be because the pydantic model is turning some of the inputs into objects (datetimes) that cause an error when they get JSON-serialized on output.
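
The general failure mode is easy to reproduce with the standard library alone. The sketch below is not taken from the ranker code; it only shows that `json.dumps` rejects `datetime` objects unless it is given a `default` hook, which would be one possible fix if the diagnosis above is right. The helper name `to_jsonable` is made up.

```python
import json
from datetime import datetime


def to_jsonable(value):
    """Fallback for json.dumps: render datetimes as ISO 8601 strings."""
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


log_entry = {
    "timestamp": datetime(2021, 8, 24, 22, 13, 34),
    "level": "WARNING",
    "message": "warning: empty returned",
}

try:
    json.dumps(log_entry)
    plain_dumps_failed = False
except TypeError:
    plain_dumps_failed = True  # datetime is not JSON serializable by default

serialized = json.dumps(log_entry, default=to_jsonable)
```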

500 in /score

This query gives 37 results in strider, but returns no results from aragorn:

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "id": "UniProtKB:P52788",
                    "category": "biolink:Gene"
                },
                "n1": {
                    "category": "biolink:ChemicalSubstance"
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1"
                }
            }
        }
    }
}

Update tests

The aragorn test suite is not up to date. The test JSONs are not conformant.

I think that what we should do is write tests here that mock the underlying services, so that at the aragorn level we're just testing things like whether we make the right calls for workflows and whether the callback functionality works.

Then, if we want to test that, e.g., the ranker is returning the right property, we test that on the ranker.

Chunk RMQ

Within aragorn we use RabbitMQ to communicate between worker threads. That's done by pushing a strider response through the queue.

But there is a max size for RabbitMQ messages, so we need some chunking on the messages, which is probably going to complicate things.

Without it, we will always fail on strider results greater than X.
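
A rough idea of what the chunking could look like, independent of any particular RabbitMQ client: split the serialized response into pieces below the limit, tag each piece with its position, and reassemble on the consumer side. The envelope fields (`id`, `index`, `total`, `data`) and the function names are assumptions for this sketch, and the 64-byte limit is only there to force several chunks in the example. A real version would also have to budget for the envelope overhead and encode the bytes for the wire.

```python
import json


def make_chunks(message_id: str, payload: bytes, max_size: int) -> list:
    """Split payload into envelopes whose data field is at most max_size bytes."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    pieces = [payload[i:i + max_size] for i in range(0, len(payload), max_size)] or [b""]
    return [
        {"id": message_id, "index": i, "total": len(pieces), "data": piece}
        for i, piece in enumerate(pieces)
    ]


def reassemble(chunks: list) -> bytes:
    """Rebuild the payload from chunks received in any order."""
    ordered = sorted(chunks, key=lambda c: c["index"])
    if not ordered or [c["index"] for c in ordered] != list(range(ordered[0]["total"])):
        raise ValueError("missing or duplicate chunks")
    return b"".join(c["data"] for c in ordered)


response = {"message": {"results": [{"score": i} for i in range(50)]}}
payload = json.dumps(response).encode("utf-8")
chunks = make_chunks("job-1", payload, max_size=64)
# Simulate out-of-order delivery by reversing the chunks before reassembly.
restored = json.loads(reassemble(list(reversed(chunks))).decode("utf-8"))
```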

Cannot run direct three-hop ARAGORN queries for Workflow B

This issue is to report that both @xu-hao and I cannot run direct three-hop ARAGORN queries for Workflow B. I initially thought the error was on my end, but if Hao is encountering issues, then I think there's something not quite right on the ARAGORN side.

The TRAPI query can be found here. Note that Hao tested both e01 biolink:has_real_world_evidence_of_association_with and e01 biolink:correlated_with. I only tested the latter predicate, as that's the one I used when testing direct three-hop ARAX queries.

Here's the command:

curl -XPOST https://aragorn.renci.org/1.1/query -d '{
                               "message": {
                                   "query_graph": {
                                       "nodes": {
                                           "n0": {
                                                "ids": ["MESH:D056487"],
                                                "categories": ["biolink:DiseaseOrPhenotypicFeature"]
                                           },
                                           "n1": {
                                               "categories": ["biolink:DiseaseOrPhenotypicFeature"]
                                           },
                                           "n2": {
                                               "categories": ["biolink:Gene"]
                                           },
                                           "n3": {
                                               "categories": ["biolink:ChemicalEntity"]
                                           }
                                       },
                                       "edges": {
                                           "e01": {
                                               "subject": "n0",
                                               "object": "n1",
                                               "predicates": ["biolink:correlated_with"]
                                           },
                                           "e02": {
                                               "subject": "n2",
                                               "object": "n1",
                                               "predicates": ["biolink:gene_associated_with_condition"]
                                           },
                                           "e03": {
                                               "subject": "n2",
                                               "object": "n3",
                                               "predicates": ["biolink:related_to"]
                                           }
                                       }
                                   }
                               }
                           }' -H "Content-Type: application/json"

Here's the error message that Hao received from e01 biolink:has_real_world_evidence_of_association_with:

{"message":{"query_graph":{"nodes":{"n0":{"ids":["MESH:D056487"],"categories":["biolink:DiseaseOrPhenotypicFeature"],"is_set":false,"constraints":null},"n1":{"ids":null,"categories":["biolink:DiseaseOrPhenotypicFeature"],"is_set":false,"constraints":null},"n2":{"ids":null,"categories":["biolink:Gene"],"is_set":false,"constraints":null},"n3":{"ids":null,"categories":["biolink:ChemicalEntity"],"is_set":false,"constraints":null}},"edges":{"e01":{"subject":"n0","object":"n1","predicates":["biolink:has_real_world_evidence_of_association_with"],"relation":null,"constraints":null},"e02":{"subject":"n2","object":"n1","predicates":["biolink:gene_associated_with_condition"],"relation":null,"constraints":null},"e03":{"subject":"n2","object":"n3","predicates":["biolink:related_to"],"relation":null,"constraints":null}}},"knowledge_graph":{"nodes":{},"edges":{}},"results":[]},"logs":[{"timestamp":"2021-08-24T22:13:34.599883","level":"WARNING","code":null,"message":"warning: empty returned"},{"timestamp":"2021-08-24T22:13:34.648997","level":"ERROR","code":null,"message":"No results to coalesce"},{"timestamp":"2021-08-24T22:13:34.653174","level":"ERROR","code":null,"message":"answer_coalesce error: HTML error status code 422 returned."},{"timestamp":"2021-08-24T22:13:34.785124","level":"WARNING","code":null,"message":"warning: empty returned"},{"timestamp":"2021-08-24T22:13:34.836039","level":"WARNING","code":null,"message":"warning: empty returned"},{"timestamp":"2021-08-24T22:13:34.882219","level":"WARNING","code":null,"message":"warning: empty returned"}],"status":null,"workflow":["lookup","enrich_results","connect_knodes","score"]}

And here's the error message that was returned with e01 biolink:correlated_with:

{"message":{"query_graph":{"nodes":{"n0":{"ids":["MESH:D056487"],"categories":["biolink:DiseaseOrPhenotypicFeature"],"is_set":false,"constraints":null},"n1":{"ids":null,"categories":["biolink:DiseaseOrPhenotypicFeature"],"is_set":false,"constraints":null},"n2":{"ids":null,"categories":["biolink:Gene"],"is_set":false,"constraints":null},"n3":{"ids":null,"categories":["biolink:ChemicalEntity"],"is_set":false,"constraints":null}},"edges":{"e01":{"subject":"n0","object":"n1","predicates":["biolink:correlated_with"],"relation":null,"constraints":null},"e02":{"subject":"n2","object":"n1","predicates":["biolink:gene_associated_with_condition"],"relation":null,"constraints":null},"e03":{"subject":"n2","object":"n3","predicates":["biolink:related_to"],"relation":null,"constraints":null}}},"knowledge_graph":null,"results":null},"logs":[{"timestamp":"2021-08-24 22:36:29.878473","level":"ERROR","message":"Exception 'logs'","code":null}],"status":null}

Any chance you all can work on this query and send me/Hao both the executable query and the associated JSON output, so that Hao and I can figure out what we did wrong and (importantly) I can review the answers? I honestly think this might be the more efficient testing approach.

Handle lookup options

When a lookup query occurs, you can set a parameter on the max number of results, but we are currently ignoring that.

Update all components to reasoner-pydantic 1.2.0.4

We currently have an "X" on the arax webpage for TRAPI 1.2 validation. It's caused by an error in reasoner-pydantic that has been fixed in 1.2.0.4, so we need to update aragorn, ranker, and AC.

Local ARAGORN

ARAGORN receives TRAPI queries, which may contain workflows. If a TRAPI query does not contain a workflow, a default workflow is run.

One of the possible workflow elements is "lookup" which consults data sources to provide all possible answers to the question. In current ARAGORN, lookup is implemented by making an asynchronous TRAPI call to strider, which implements a federated lookup to all translator KPs.

We want to make a Local ARAGORN. This would be another instance of the container, with a different configuration (Helm chart?). When a lookup operation happens, Local ARAGORN will not call strider, but will instead consult a single automat endpoint (probably either robokop or covidkop). It's not quite as simple as replacing the strider URL with the automat URL, because automat can only be consulted synchronously.

Additionally, there will soon be another operation (infer or creative lookup) which will have different implementations in the two ARAGORNs: Federated ARAGORN will use mined rules to query strider, while Local ARAGORN will simply consult a database of pre-calculated results.

Implement Aragorn Service

We want a TRAPI 0.9.2 interface that the ARS can hit that does the following:

  1. Call strider
  2. Take the result from strider and send it to the ranker
    A. Omnicorp
    B. Weight
    C. Score
  3. Optionally send the result of scoring to Answer Coalesce
  4. Return the result

To handle coalescence we want an option at the level of message. So we want

{
  "message": {...},
  "coalesce": ""
}

Valid options for coalesce should be "none", "graph", "ontology", and "property". "graph", "ontology", and "property" should be passed as options to the AC service. If coalesce is "none", don't call the coalescer.

If the option does not exist, ... we decided on Friday to not coalesce, but I think we'll want to change that soon. Probably default to "graph" IMO.
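
The flow described in this issue can be sketched as a small orchestration function. The services are passed in as plain callables so the example runs without any network access; `run_aragorn`, `stub`, and the `default` parameter are names invented for this sketch, and the default of "none" simply mirrors the decision mentioned above.

```python
VALID_COALESCE = ("none", "graph", "ontology", "property")


def run_aragorn(request, strider, ranker_steps, coalescer, default="none"):
    """Run strider, the ranker steps, and optionally the coalescer, in that order."""
    method = request.get("coalesce", default)
    if method not in VALID_COALESCE:
        raise ValueError(f"coalesce must be one of {VALID_COALESCE}, got {method!r}")
    message = strider(request["message"])
    for step in ranker_steps:  # omnicorp, weight, score
        message = step(message)
    if method != "none":
        message = coalescer(message, method)
    return message


# Stub services that only record the order in which they were called.
calls = []


def stub(name):
    def service(message, *args):
        calls.append(":".join([name, *args]))
        return message
    return service


ranker = [stub("omnicorp"), stub("weight"), stub("score")]
run_aragorn({"message": {}, "coalesce": "graph"}, stub("strider"), ranker, stub("coalesce"))
```

After this call, `calls` lists strider, the three ranker steps, and finally `coalesce:graph`. Changing the default to "graph" later would be a one-argument change.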

Make use of asyncquery when calling components

With PR 29, we'll stand up an asyncquery version of aragorn. However, internally, aragorn will still be connecting to strider, AC, and ranker synchronously. We will next want to add asyncquery at this lower level so that we don't needlessly hold open a bunch of connections & fail on long running stuff.

Move endpoints to config

ARAGORN is a simple workflow engine, and it parcels out work to services at other endpoints. The urls are currently hardcoded, but these should be pulled out into a config to be changed on the fly.
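
One common option is to read the URLs from environment variables, which a Helm chart or Kubernetes config can then set per deployment. The sketch below assumes that approach; the variable names and the localhost URLs are placeholders, not values from the ARAGORN codebase.

```python
import os

# Placeholder defaults; none of these names or URLs come from the ARAGORN codebase.
DEFAULT_ENDPOINTS = {
    "STRIDER_URL": "http://localhost:8001/query",
    "RANKER_URL": "http://localhost:8002",
    "COALESCE_URL": "http://localhost:8003/coalesce",
}


def load_endpoints(environ=None) -> dict:
    """Return service URLs, letting environment variables override the defaults."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name, url) for name, url in DEFAULT_ENDPOINTS.items()}


# Passing a dict instead of os.environ keeps the example deterministic.
endpoints = load_endpoints({"STRIDER_URL": "http://strider.example/query"})
```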

Implement a data overlay

We now use omnicorp to add overlay edges, but I would like to be able to get overlay edges from everywhere, both to improve scoring and to provide extra context.

Provide evidence by adding other edges/nodes

Suppose somebody looks up A->B. We go find that edge, and provide the evidence that we found in the KPs, like papers or p-values, etc...

But are there other graph elements that provide extra support (or, conversely, that reduce support)? For instance, if I find A-not->B, that's something I should know, and it should affect the ranking.

Or maybe we know that A->C->B can imply A->B or is often associated with A->B. If we then go find some C, does returning that information help convince a user?

@schatzkara @kennethmorton

Bad answer to (procedures treating cataracts)

Query:
https://github.com/NCATSTranslator/testing/blob/main/ars-requests/not-none/1.2/cataractTreatment.json
Results:
https://arax.ncats.io/?r=1ef9d36e-dd59-4248-b0fc-fb588a387010

The query is "procedure that treats cataracts". RTX KG2 is returning a bunch of answers relating to kidneys, with very general (low IC) terms for the procedure (e.g. "Therapeutic Procedure").

Then AC happily says, "hey what do a bunch of kidney diseases and eye diseases have in common?" and finds some garbage high level node like "disease by anatomical region" and merges everything together.

Then the ranker looks at that, says "great, so many nodes!" and gives a high score.

So I think that there are multiple things that could be done here, affecting different components:

  1. I think RTX shouldn't be returning those results; I have filed an issue with them
  2. Should strider try to verify the subclass of and filter in cases when it thinks the KPs are wrong? How much trust vs verify do we need in strider?
  3. AC should probably be tuned; I doubt that disease by anatomical feature should ever be considered an enrichment?
  4. Ranker should downweight this answer based on the low IC of the "disease by anatomical feature"

Convert to 1.0

We will want a TRAPI 1.0 version of this interface as well.

Operations and Workflow

  1. We need to expose available operations (once they are defined)
  2. We need to handle the workflow input in the extended TRAPI schema. This will require calling individual services in the order prescribed in the input.
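
Both points can be covered by a single table of operations: its keys are what the service can advertise, and the workflow is executed by looking each step up in order. In the sketch below the workflow is a plain list of operation names, as in the `workflow` array echoed in the log output elsewhere on this page; the real schema may wrap each step in an object. `run_workflow` and `tracer` are hypothetical names.

```python
def run_workflow(message: dict, workflow: list, operations: dict) -> dict:
    """Apply each named operation to the message, in the order given."""
    unknown = [name for name in workflow if name not in operations]
    if unknown:
        raise ValueError(f"Unsupported operations: {unknown}")
    for name in workflow:
        message = operations[name](message)
    return message


def tracer(name):
    """Stand-in operation that records its name in the message."""
    def operation(message):
        return {**message, "trace": message.get("trace", []) + [name]}
    return operation


operations = {
    name: tracer(name)
    for name in ("lookup", "enrich_results", "connect_knodes", "score")
}
available = sorted(operations)  # what the service could advertise as supported
output = run_workflow({}, ["lookup", "score"], operations)
```

Validating the whole workflow before running anything means an unsupported operation is rejected up front rather than after an expensive lookup.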

Incorrect node category found - glucose as gene/protein

Glucose was returned as a gene or protein.

Query:
{
    "edges": {
        "N1": {
            "constraints": [],
            "object": "n1",
            "predicates": ["biolink:has_normalized_google_distance_with"],
            "subject": "n0"
        },
        "N2": {
            "constraints": [],
            "object": "n2",
            "predicates": ["biolink:has_normalized_google_distance_with"],
            "subject": "n0"
        },
        "N3": {
            "constraints": [],
            "object": "n2",
            "predicates": ["biolink:has_normalized_google_distance_with"],
            "subject": "n1"
        },
        "e00": {
            "constraints": [],
            "object": "n1",
            "subject": "n0"
        },
        "e01": {
            "constraints": [],
            "object": "n1",
            "subject": "n2"
        }
    },
    "nodes": {
        "n0": {
            "categories": ["biolink:SmallMolecule"],
            "constraints": [],
            "ids": ["UMLS:C0034407"],
            "is_set": false,
            "name": "Quinazolines"
        },
        "n1": {
            "categories": ["biolink:Gene"],
            "constraints": [],
            "is_set": false
        },
        "n2": {
            "categories": ["biolink:Gene"],
            "constraints": [],
            "ids": [
                "NCBIGene:10628",
                "NCBIGene:22861",
                "NCBIGene:51085",
                "NCBIGene:1490",
                "NCBIGene:389692",
                "NCBIGene:3480",
                "NCBIGene:598",
                "NCBIGene:2308",
                "NCBIGene:22877",
                "NCBIGene:2033"
            ],
            "is_set": true,
            "name": "TXNIP, NLRP1, MLXIPL, CTGF, MAFA, IGF1R, BCL2L1, FOXO1, MLXIP, EP300"
        }
    }
}

Doubling Answers

This standup query:

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "id": "NCBIGENE:1017",
                    "category": "biolink:Gene"
                },
                "n1": {
                    "category": "biolink:Pathway"
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1"
                }
            }
        }
    }
}

Returns 116 results from strider.

AC does not do any aggregation, because we can't aggregate on pathways at the moment.

So we should get 116 results back.

We somehow get 232.

If I run each component by hand, I end up with only 116, i.e. none of the components directly seem to be doubling the answers.
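
To narrow down where the doubling happens, it may help to check the output of each stage for exact repeats. The helper below is a diagnostic sketch, not part of aragorn: it keys each result by its canonical JSON and reports the ones that occur more than once. Results that differ only in the order of list items would not be caught, since `sort_keys` only normalizes dictionary keys. The `EX:` identifiers are placeholders.

```python
import json
from collections import Counter


def duplicate_counts(results: list) -> dict:
    """Map each repeated result (as canonical JSON) to how many times it appears."""
    counts = Counter(json.dumps(r, sort_keys=True) for r in results)
    return {key: n for key, n in counts.items() if n > 1}


unique = [{"node_bindings": {"n1": [{"id": f"EX:{i}"}]}} for i in range(3)]
doubled = unique + [dict(r) for r in unique]
report = duplicate_counts(doubled)
```

If the 232 results are exact copies of the 116, every entry in the report would have a count of 2.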

Aragorn calling COHD with duplicate queries

I was testing the following query through ARS, and noticed Aragorn is calling COHD four times, two of which are duplicate queries. Two of the queries are inverses of each other (the subject and object nodes are swapped in the edge). COHD returns essentially the same results either way, but it seems reasonable for an ARA to do this. But the second two queries are exact duplicates of the first two. This issue is potentially related to #10.

The duplication of queries also occurred on the larger 3-hop query graph for December Demo's Workflow B, with Aragorn sending COHD 4 queries total (2 duplicates) for the single edge relevant to COHD (same as the query below).

I double checked our registration in SmartAPI, and I don't think COHD is double registered or anything, but please let me know if it might be something on our end that's causing this.

This is not a high priority issue for us.

{
  "message": {
    "query_graph": {
      "nodes": {
        "n1": {
          "categories": [
            "biolink:DiseaseOrPhenotypicFeature"
          ],
          "name": "Disease Or Phenotypic Feature"
        },
        "n0": {
          "ids": [
            "SNOMEDCT:197358007"
          ],
          "categories": [
            "biolink:DiseaseOrPhenotypicFeature"
          ],
          "name": "drug-induced liver injury"
        }
      },
      "edges": {
        "e0": {
          "subject": "n0",
          "object": "n1",
          "predicates": [
            "biolink:correlated_with"
          ]
        }
      }
    }
  },
  "logs": [],
  "status": null
}

Raising 500

m = {"message":{"query_graph":{
  "edges": {
    "e01": {
      "constraints": [],
      "object": "n0",
      "predicates": [
        "biolink:has_manifestation"
      ],
      "subject": "n1"
    }
  },
  "nodes": {
    "n0": {
      "categories": [
        "biolink:Disease"
      ],
      "constraints": [],
      "ids": [
        "MONDO:0004995"
      ],
      "fulltextname": "n0"
    },
    "n1": {
      "categories": [
        "biolink:PathologicalProcess"
      ],
      "constraints": [],
      "fulltextname": "n1"
    }
  }
}}}

Returns fine from strider with 14 results, then aragorn throws a 500.

Change default behavior

The default behavior now is to run answer coalescence (graph style). We have 3 AC types, as well as "none". Often we will want none, and more to the point,

  1. The user won't know which one they want a priori
  2. There is no control on the ARS to specify what they do want

So the default is very important.

I think we need to run all 3 types of coalescence, merge those results with all the original results, and return the combined set.
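
As a sketch of that proposal: call the coalescer once per method, then concatenate the original results with each coalesced batch, skipping exact repeats. The AC service is represented by an injected callable; `coalesce_all` and `fake_coalescer` are hypothetical names, and the stub output format is invented purely for the example.

```python
import json


def coalesce_all(results, coalescer, methods=("graph", "ontology", "property")):
    """Return the original results followed by every distinct coalesced result."""
    merged, seen = [], set()
    for batch in [results, *(coalescer(results, m) for m in methods)]:
        for result in batch:
            key = json.dumps(result, sort_keys=True)
            if key not in seen:
                seen.add(key)
                merged.append(result)
    return merged


def fake_coalescer(results, method):
    """Stand-in for the AC service: one summary result per method."""
    return [{"coalesced_by": method, "size": len(results)}]


originals = [{"id": "r1"}, {"id": "r2"}]
combined = coalesce_all(originals, fake_coalescer)
```

Here `combined` holds the two originals followed by one result per coalescence type.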

Add tests

There are currently no tests on this repository or service.

Improve error handling

  1. Rather than checking for 0-size responses from the component tools, we should check for status codes.
  2. We should pass along errors and logs from the underlying elements, esp strider.
  3. If a component fails, we should return whatever we got up to that point, rather than returning nothing. Everything after strider is a valid response, even if it does not have everything we want.
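
The three points above could be combined roughly as follows. Each component is modeled as a callable returning a `(status_code, body)` pair, which is an assumption of this sketch rather than how aragorn calls its services today; `run_components` and the stub services are made-up names.

```python
def run_components(message: dict, components: list) -> dict:
    """Call components in order; stop at the first non-200 and keep what we have."""
    logs = []
    for name, call in components:
        status, body = call(message)
        logs.extend(body.get("logs", []))  # pass along the component's own logs
        if status != 200:
            logs.append({"level": "ERROR", "message": f"{name} returned status {status}"})
            break
        message = body["message"]
    return {"message": message, "logs": logs}


def strider(message):
    return 200, {"message": {**message, "results": ["r1", "r2"]},
                 "logs": [{"level": "INFO", "message": "2 results"}]}


def ranker(message):
    return 500, {}


def coalescer(message):
    raise AssertionError("should not be reached after a failure")


response = run_components(
    {"query_graph": {}},
    [("strider", strider), ("ranker", ranker), ("coalescer", coalescer)],
)
```

When the ranker fails, the caller still gets strider's two results, plus strider's log entry and an error entry naming the failed component.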

ARAGORN removing parts of strider's message when empty?

Query:

{
    "message": {
        "query_graph": {
            "edges": {
                "e01": {
                    "object": "n0",
                    "subject": "n1",
                    "predicates": [
                        "biolink:negatively_regulates_entity_to_entity"
                    ]
                }
            },
            "nodes": {
                "n0": {
                    "ids": [
                        "NCBIGene:23221"
                    ],
                    "categories": [
                        "biolink:Gene"
                    ]
                },
                "n1": {
                    "categories": [
                        "biolink:Gene"
                    ]
                }
            }
        }
    }
}

Sent directly to strider, this produces no results, but this message:

{
    "query_graph": {
        "nodes": {
            "n1": {
                "ids": null,
                "categories": [
                    "biolink:Gene"
                ],
                "is_set": false,
                "constraints": null
            },
            "n0": {
                "ids": [
                    "NCBIGene:23221"
                ],
                "categories": [
                    "biolink:Gene"
                ],
                "is_set": false,
                "constraints": null
            }
        },
        "edges": {
            "e01": {
                "subject": "n1",
                "object": "n0",
                "predicates": [
                    "biolink:negatively_regulates_entity_to_entity"
                ],
                "relation": null,
                "constraints": null
            }
        }
    },
    "knowledge_graph": {
        "nodes": {},
        "edges": {}
    },
    "results": []
}

But when we call aragorn, we only get back the qg in the message:

{
    "query_graph": {
        "nodes": {
            "n0": {
                "ids": [
                    "NCBIGene:23221"
                ],
                "categories": [
                    "biolink:Gene"
                ],
                "is_set": false
            },
            "n1": {
                "categories": [
                    "biolink:Gene"
                ],
                "is_set": false
            }
        },
        "edges": {
            "e01": {
                "subject": "n1",
                "object": "n0",
                "predicates": [
                    "biolink:negatively_regulates_entity_to_entity"
                ]
            }
        }
    }
}

This is legal TRAPI, but it creates problems for downstream components (like arax) that don't expect this. Also, I don't see why we would do it.
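
Until the cause is tracked down, a small guard on the outgoing message would keep downstream consumers happy. This is only a post-processing sketch and says nothing about where the fields are being dropped; `with_empty_sections` is a hypothetical helper.

```python
def with_empty_sections(message: dict) -> dict:
    """Return a copy of message in which knowledge_graph and results are always present."""
    filled = dict(message)
    if filled.get("knowledge_graph") is None:
        filled["knowledge_graph"] = {"nodes": {}, "edges": {}}
    if filled.get("results") is None:
        filled["results"] = []
    return filled


qg_only = {"query_graph": {"nodes": {}, "edges": {}}}
filled = with_empty_sections(qg_only)
```

The output then matches the shape strider returns for an empty answer, and the input message is left unmodified.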

Multiple deployments

We need 2 deployments of ARAGORN. One Prod and one Dev.

  • At runtime, there needs to be a config defining the environment
  • Based on the config, use the right components (all sub tools will need multiple deployments as well)
  • deploy both
  • Update smart api registry to point at the two different aragorns, with the correct x-maturity levels

Aragorn returning 2-hop results on 1-hop query (Workflow B.2x)

Top ranking results from B.2x queries have what looks like a 2-hop query format with a set of DiseaseOrPhenotypicFeature nodes in the middle and ChemicalEntity nodes on each end. Lower ranked results have the expected structure.

Sample Query:

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": ["MESH:D000077385"],
                    "categories": [
                        "biolink:ChemicalEntity"
                    ],
                    "name": "Silybin"
                },
                "n1": {
                    "categories": [
                        "biolink:DiseaseOrPhenotypicFeature"
                    ]
                }
            },
            "edges": {
                "e0": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:related_to"]
                }
            }
        }
    }
}

Results:
https://arax.ncats.io/?r=964ae5cc-f9f1-4917-8ebb-0b95322e5fbf
