wikibrowser-service's Introduction

wikibrowser-service (ConceptMap)

Facility that browses Wikipedia augmented with semantic capabilities

Enhancements to consider

  • In claims and relatedclaims return enough items to show 100 items in every property

  • Fix the /claims service issue in which, when an item has multiple pictures, one row per picture is returned. For example, Douglas Adams's alma mater Q691283 appears four times in the results.

  • Add spu, spf, spr, exp1, exp* buttons to the two top-level relationship headings

    • Perhaps the spu button could return all shortest paths, or just one shortest path that excludes obvious connections. Here are a few queries for reference:

      • The returned collection of paths does not contain any path that contains an Item node whose itemId is equal to “Q5”: MATCH p=allShortestPaths( (a:Item {itemId:"Q6294"})-[*]-(b:Item {itemId:"Q359442"}) ) WHERE NONE(x IN NODES(p) WHERE x:Item AND x.itemId = "Q5") RETURN p;

      • The returned collection of paths does not contain any path that contains a relationship whose propId is equal to “P31”: MATCH p=allShortestPaths( (a:Item {itemId:"Q6294"})-[*]-(b:Item {itemId:"Q359442"}) ) WHERE NONE(x IN RELATIONSHIPS(p) WHERE x.propId = "P31") RETURN p;

      • The returned collection of paths does not contain any path that contains an Item node whose itemId is equal to “Q5” or a relationship whose propId is equal to “P1343”: MATCH p=allShortestPaths( (a:Item {itemId:"Q23"})-[*..2]-(b:Item {itemId:"Q9696"}) ) WHERE NONE(x IN NODES(p) WHERE x:Item AND x.itemId = "Q5") AND NONE(y IN RELATIONSHIPS(p) WHERE y.propId = "P1343") RETURN p;

  • Include all items (not just ones with English labels) in WikidataNeo4jProcessor, by using getLabels() if findLabel() returns null

  • Make query limits (# of rows returned in a given query) configurable, or at least a constant that may be changed in one place

  • Implement in SPARQL a query similar to the Neo4j query that returns the results necessary to display pinned nodes and their relationships.
    This could potentially eliminate the need for such a large Neo4j DB. Here is a start for such a query: PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX wikibase: <http://wikiba.se/ontology#> PREFIX wd: <http://www.wikidata.org/entity/> PREFIX p: <http://www.wikidata.org/prop/direct/> SELECT ?from ?rel ?to WHERE { VALUES ?from {wd:Q2 wd:Q5 wd:Q405 wd:Q525 } . VALUES ?to {wd:Q2 wd:Q5 wd:Q405 wd:Q525 } . }

  • Move Wikidata relationships to left side, and concept map to middle?

  • Add ?parent to results from traversal sparql query & merge into neo4j? (or just do a batch load periodically)

  • [] Implement authentication

  • [] Enable user to save a concept map for later retrieval

    • Relationship between user and the nodes in the map, as well as an indication of main node and language
  • [] Enable user to associate external resources with Q items

  • [] Create utility that loads Neo4j QUICKLY from Wikidata dump

  • Create option that displays only those relationship types that were used in the latest traversal (or some other way of displaying a single type of relationship)

  • Associate image with each Q item

  • Put level of detail in properties that enables traversing lists such as US Presidents

  • Show tabs for Wikibooks, Wikinews, Wikiquote, Wikisource, Wikivoyage, commons, and Wikiversity when present

  • Implement visjs library for directed graph functionality

    • Make the concept map not jump around so much when drawing
  • Create Getting Started guide for ConceptMap in Slides

  • Put Wikipedia header on right side of app

  • Highlight node that is represented by selected Wikipedia article

  • Consider showing all relationships for a node when double-clicking it: (perhaps showing nodes as dashed line borders that may be pinned/unpinned)

  • On a node that represents a person, if there are family relationships then show options to traverse the CHILD relationship (forward, reverse, forward all levels, reverse all levels)

    • Investigate other such relationship types (e.g. parent taxon, subclass of, child, followed by)
    • Related task: Consider inferring Neo4j labels from properties (e.g. instance of human is a person)
  • Compute and display nodes for shortest path between two selected nodes. Related ideas:

    • Allow specification of relationship types (e.g. CAST_MEMBER, SHARES_BORDER_WITH, MEMBER_OF)
  • Investigate techniques like AngryLoki uses in https://github.com/AngryLoki/wikidata-graph-builder like https://wiki.blazegraph.com/wiki/index.php/RDF_GAS_API

  • Implement Item and Property autocompletion using Select2 (see https://github.com/AngryLoki/wikidata-graph-builder and https://select2.github.io/)

  • Explore idea of associating tags from external systems (e.g. StackOverflow) to Q items, and inferring relationships among them

  • Create UI for editing Wikidata items

  • Enable user to change panel widths

  • Enable full screen concept map, perhaps automatically so on small mobile devices

  • Export concept map to PDF

  • Implement caching of /claims results

  • Import Wikidata data into Neo4j

  • Implement erase map button

  • Provide ability to associate links to external articles

  • Make concept map resize when browser window is resized (without requiring reloading)

  • Create embeddable mini concept map viewer

  • Create image/snippet component that steps through an ordered list of Q items

  • In the /claims and /relatedclaims (and /claimsxml) endpoints, consider returning Q-numbers when there isn't an article in the requested language.

  • Implement configuration server (override configurable items such as those in application.properties, and make starting Q item a configurable item in application.properties)

  • Break up the one microservice into several, each implementing one or more of the following endpoints?

    1. /articlesearch, /idlocator, /langlinks (calls https://xx.wikipedia.org/w/api.php?action=query&...)
    2. /claims, /claimsxml, /relatedclaims (calls https://query.wikidata.org/bigdata/namespace/wdq/sparql?... and populates Neo4j database using Spring Data Neo4j)
    3. /locator (calls https://www.wikidata.org/w/api.php?action=wbgetentities&...)
    4. /wikipage (retrieves mobile wikipedia page https://xx.m.wikipedia.org/...)
    5. /bitly (calls https://api-ssl.bitly.com/v3/shorten?...)
    6. /graph (calls hosted Neo4j Cypher transactional endpoint)
  • Implement an API gateway? (perhaps not necessary or even advisable if the endpoints aren't broken up into several microservices)

  • Implement service discovery?

  • Implement patterns like bulkhead and circuit-breaker so that failures in external services (Neo4j, Wikimedia REST calls, etc.) don't cause:

    • services to hang
    • services to require a restart
    • bad error messages to be returned?
  • Implement blue/green deploy for zero downtime
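
The three reference queries listed under the spu button idea differ only in their endpoints, hop limit, and exclusion lists, so they could plausibly be generated from one template. The following is a minimal Python sketch under that assumption (Python is used only for brevity). The function name `build_shortest_path_query` and the parameter names `fromId`, `toId`, `excludedItemIds`, and `excludedPropIds` are hypothetical and not part of wikibrowser-service.

```python
from typing import Any, Dict, Optional, Sequence, Tuple


def build_shortest_path_query(
    from_id: str,
    to_id: str,
    exclude_item_ids: Sequence[str] = (),
    exclude_prop_ids: Sequence[str] = (),
    max_hops: Optional[int] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Return (cypher, params) for an allShortestPaths query with exclusions."""
    if max_hops is not None and max_hops < 1:
        raise ValueError("max_hops must be a positive integer")
    # The hop bound is validated above and inlined into the pattern;
    # item and property IDs are passed as query parameters instead.
    hops = "*" if max_hops is None else f"*..{max_hops}"

    filters = []
    if exclude_item_ids:
        filters.append(
            "NONE(x IN NODES(p) WHERE x:Item AND x.itemId IN $excludedItemIds)"
        )
    if exclude_prop_ids:
        filters.append(
            "NONE(y IN RELATIONSHIPS(p) WHERE y.propId IN $excludedPropIds)"
        )
    where = f" WHERE {' AND '.join(filters)}" if filters else ""

    cypher = (
        "MATCH p=allShortestPaths("
        f"(a:Item {{itemId: $fromId}})-[{hops}]-(b:Item {{itemId: $toId}})"
        f"){where} RETURN p"
    )
    params = {
        "fromId": from_id,
        "toId": to_id,
        "excludedItemIds": list(exclude_item_ids),
        "excludedPropIds": list(exclude_prop_ids),
    }
    return cypher, params
```

Calling `build_shortest_path_query("Q23", "Q9696", ["Q5"], ["P1343"], max_hops=2)` yields a parameterized equivalent of the third reference query. Passing the IDs as parameters rather than concatenating them into the query text avoids quoting problems.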
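
To illustrate the circuit-breaker idea mentioned above, here is a minimal, single-threaded Python sketch. It is not taken from wikibrowser-service: the class names, the threshold, and the timeout are all assumptions, and a real deployment would more likely use an existing resilience library. After a configurable number of consecutive failures the breaker rejects calls immediately instead of letting them hang, then allows one trial call once the timeout has elapsed.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock  # injectable so tests need not sleep
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Open: fail fast rather than waiting on a broken dependency.
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func()
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each outbound request, for example `breaker.call(lambda: run_sparql(query))`, where `run_sparql` is a hypothetical helper, and translate `CircuitOpenError` into a clear "service temporarily unavailable" message.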

Items to Fix

  • [] Create unit tests
  • [] Fix issues that occur when an article title contains an ampersand (e.g. not finding article, and thumbnail lookup problems)
  • [] Handle failures, such as a failed SPARQL query, more gracefully
  • If relationship is removed from Wikidata, delete it from Neo4j at the same point at which relationships are added (MERGE)
  • [] Add a language property to Item nodes stored in Neo4j that aren't currently in English, and use that property to invoke the /thumbnail endpoint from WikiVisGraphController. Perhaps, in addition to or instead of that, use the item ID to get thumbnails (currently slower though)
  • Create process that generates MERGE code for missing properties in ItemRepository.java
  • [] Make the graph not so jumpy, and consider implementing sigmajs or visjs
  • WikiSearchController search() method probably needs to do a query to get the article URL for the article title (as it currently returns titles with spaces)
  • Make iFrame height resize when new web page loads
  • Draw a line for each relationship in the same direction between two nodes
  • Diagnose issue on Safari/Mac where the bitly link isn't created when the app is started from conceptmap.cfapps.io, but is created when started from conceptmap.io
  • Handle multiple languages in Neo4j database
  • Handle the case when the user types fast (use timer?)
  • Consider comparing nearmatch with similar entry in dropdown, and use the latter for the search
  • Make it so that hitting Enter key doesn't trigger search when Search button isn't enabled?
  • Render TOC entries in mobile Wikipedia
  • Watch for duplicated language code (e.g. frfr) in console.
  • Check out following warnings: 02:25:53.431 [main] WARN o.s.d.n.m.Neo4jPersistentProperty - No identity field found for class of type: com.javafxpert.wikibrowser.model.conceptmap.GraphRelationFar when creating persistent property for field: private java.lang.String com.javafxpert.wikibrowser.model.conceptmap.GraphRelationFar.startNode
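
For the item above about deleting relationships that have been removed from Wikidata, one possible approach is to compare the relationships already stored for an item with the ones just fetched, and derive both the deletions and the merges from the difference. The Python sketch below assumes a relationship is fully identified by its (from item, property, to item) triple; the `Rel` type and `diff_relationships` function are hypothetical, and the sample IDs are for illustration only.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Rel(NamedTuple):
    from_id: str  # e.g. "Q42"
    prop_id: str  # e.g. "P69"
    to_id: str    # e.g. "Q691283"


def diff_relationships(
    stored: Iterable[Rel], current: Iterable[Rel]
) -> Tuple[List[Rel], List[Rel]]:
    """Return (to_delete, to_merge) for syncing stored relationships.

    stored:  relationships currently in the graph database for an item
    current: relationships just returned by the Wikidata query
    """
    stored_set, current_set = set(stored), set(current)
    to_delete = sorted(stored_set - current_set)  # gone from Wikidata
    to_merge = sorted(current_set - stored_set)   # new in Wikidata
    return to_delete, to_merge
```

Running the deletions at the same point as the MERGE statements keeps both changes tied to a single fetch of the item's claims.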

Code refactoring

  • Organize/eliminate duplication in classes that hold deserialized objects
    • Start with renaming idlocator package to wpquery and IdLocatorResponse to WpQueryResponse
    • Standardize on Near/Far naming?
    • Add @JsonProperty("foo") to all JSON deserialize classes, adjusting singular/plural as makes sense
  • Change IdLocator endpoint parameter from name to title
  • Move WIKIDATA_ITEM_BASE, etc. to WikiBrowserProperties class
  • Remove D3js related code

I18N considerations:

  • Make WikiPageController#generateWikiBrowserPage() write the lang in generated links e.g. href=wikipage?name=Terre&lang=fr
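
As a sketch of the link format described above, the Python snippet below builds the href with both parameters URL-encoded, so that titles containing characters such as an ampersand survive the round trip. The helper name `wikipage_href` is hypothetical; the actual change would be made in the Java method `WikiPageController#generateWikiBrowserPage()`.

```python
from urllib.parse import urlencode


def wikipage_href(name: str, lang: str) -> str:
    """Build a relative wikipage link that carries the article language."""
    # urlencode percent-encodes reserved characters in each value.
    return "wikipage?" + urlencode({"name": name, "lang": lang})


print(wikipage_href("Terre", "fr"))  # wikipage?name=Terre&lang=fr
```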

Miscellaneous notes:

  • For Neo4j performance, CREATE INDEX ON :Item(itemId)
  • To remove all outgoing relationships from a node, MATCH (a:Item {itemId: "Q43274"})-[r]->(b) DELETE r

wikibrowser-service's People

Contributors

javafxpert, kaleb-hutchy

wikibrowser-service's Issues

Documentation: What do button labels mean?

If this is taken from some standard set of defs - is there a link?
If custom, a legend in online help screen would help.

Very promising project once I learn how to use it!
