smurp / huviz

interactive, customizable semantic web visualization

License: GNU General Public License v3.0

Makefile 0.56% Python 5.25% Shell 0.05% CSS 4.51% JavaScript 82.72% HTML 4.38% Dockerfile 0.03% EJS 2.50%

coffeescript javascript linked-open-data semantic-web graph-visualization canvas-animation svg-animations

huviz's People

antimony27 diversusandtim dnorman siryyy

huviz's Issues

make treepicker display very populated branches as a text flow rather than vertical list

This is motivated by compactness of a flowed representation. This would make it possible to have (say) 100+ colored leaves in only as many lines as there are branches.

use spatial index to speed find_node_or_edge_closest_to_pointer()

research an appropriate algorithm

look to gaming (moving things, collision detection, high frequency updates)

do this in a data-driven way.

We should measure how long the current brute-force table-scan approach takes. How many milliseconds does it take to loop through 100 nodes? 500? At some point a single scan takes longer than one tick (1/60 sec)! Once we've implemented an attempted solution, measure again.
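A rough sketch of that kind of measurement, written in Python only for illustration. The real numbers have to come from the browser, and `closest_node_brute_force` and `time_scan` are hypothetical stand-ins for find_node_or_edge_closest_to_pointer() and its timing harness, not the actual HuViz code.

```python
import math
import random
import time


def closest_node_brute_force(nodes, px, py):
    """Scan every node and return the one nearest to the pointer (px, py)."""
    best, best_d2 = None, math.inf
    for node in nodes:
        d2 = (node["x"] - px) ** 2 + (node["y"] - py) ** 2
        if d2 < best_d2:
            best, best_d2 = node, d2
    return best


def time_scan(node_count, repeats=200):
    """Return the mean milliseconds per brute-force scan over node_count nodes."""
    rng = random.Random(0)  # fixed seed so runs are comparable
    nodes = [{"id": i, "x": rng.uniform(0, 1000), "y": rng.uniform(0, 1000)}
             for i in range(node_count)]
    start = time.perf_counter()
    for _ in range(repeats):
        closest_node_brute_force(nodes, rng.uniform(0, 1000), rng.uniform(0, 1000))
    return (time.perf_counter() - start) * 1000 / repeats


if __name__ == "__main__":
    frame_budget_ms = 1000 / 60  # one tick at 60 frames per second
    for count in (100, 500, 5000):
        print(f"{count} nodes: {time_scan(count):.3f} ms "
              f"(tick budget {frame_budget_ms:.2f} ms)")
```

Running the same harness before and after any indexing change gives the before/after comparison asked for here.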

review and implementation

After a search for algorithms let's review the candidates together and give the best one a try.

algorithm candidates

Consider the R-tree
https://en.wikipedia.org/wiki/R-tree

This survey of spatial index strategies might offer other approaches
https://medium.com/@agafonkin/a-dive-into-spatial-search-algorithms-ebd0c5e39d2a

The puzzle is whether the cost of maintaining the R-tree would be recouped.

Perhaps gaming situations -- where many moving things need to have their collisions detected -- offer analogs which have documented indexing solutions.
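One candidate often used in games for many moving objects is a uniform grid (spatial hash), which is much cheaper to maintain than an R-tree. The sketch below is an assumption-laden illustration, not a recommendation: `GridIndex` and its methods are made-up names, and the lookup only finds nodes within one cell size of the pointer, which seems acceptable for hover and click picking.

```python
import math
from collections import defaultdict


class GridIndex:
    """Uniform grid: each node is filed under the cell containing its position."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(dict)  # (col, row) -> {node_id: (x, y)}
        self.where = {}                 # node_id -> (col, row)

    def _cell(self, x, y):
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def move(self, node_id, x, y):
        """Insert a node or update its position; constant work per moved node."""
        new_cell = self._cell(x, y)
        old_cell = self.where.get(node_id)
        if old_cell is not None and old_cell != new_cell:
            del self.cells[old_cell][node_id]
        self.cells[new_cell][node_id] = (x, y)
        self.where[node_id] = new_cell

    def closest_within(self, px, py):
        """Return the id of the nearest node within cell_size of (px, py), or None."""
        col, row = self._cell(px, py)
        best_id, best_d2 = None, self.cell_size ** 2
        # Any node within cell_size of the pointer lies in the 3x3 block of cells.
        for dc in (-1, 0, 1):
            for dr in (-1, 0, 1):
                bucket = self.cells.get((col + dc, row + dr), {})
                for node_id, (x, y) in bucket.items():
                    d2 = (x - px) ** 2 + (y - py) ** 2
                    if d2 <= best_d2:
                        best_id, best_d2 = node_id, d2
        return best_id
```

Maintenance is one dictionary update per moved node per tick, so whether it pays off still has to be settled by measurement.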

show writer names on the timeline

The Orlando timeline http://orlando.dev.semandra.com:9999/timeline.html does not show writer names, even though the names are present in the Timeline.Event title attribute. See the JFK assassination timeline for an example of this working: http://www.simile-widgets.org/timeline/

Ability to combine nodes

Give the user the ability to drag one node onto another and combine them into a single node.

Very advanced feature: Ideally this should allow people to merge nodes in the actual entity collection, if they have the authority to do so.

Filters

Ability to filter by:
Author/nonauthor/gender of author
Temporal range
Geographical range?
Type of data

Nodes: exclude

Exclude particular nodes from an existing graph

should we add taxonomic_parent column to regex file? No!

This would make it possible to replace predicate_to_type in orlandoScrape which maps predicates onto urls like xfn:kin . This would hook up with the edgepicker being powered by a taxonomy

Upon reflection, it seems that predicate_to_type should be removed and just replaced with an Orlando ontology namespace. Such as orl: perhaps.

Nodes: allow users to select what nodes they want to see

Data export

The ability to push the data that is in the center of the ring to a format suitable for use in other graph tools for additional editing/formatting. Gephi is a likely candidate for a default export format.
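As a possible starting point, a plain CSV edge list with Source and Target columns is one of the formats Gephi's spreadsheet importer accepts. A minimal sketch, assuming the graph can be handed over as (subject, predicate, object) triples; `write_gephi_edges` is a hypothetical name.

```python
import csv
import io


def write_gephi_edges(triples, out):
    """Write (subject, predicate, object) triples as a Source/Target/Label edge table."""
    writer = csv.writer(out)
    writer.writerow(["Source", "Target", "Label"])
    for subject, predicate, obj in triples:
        writer.writerow([subject, obj, predicate])


if __name__ == "__main__":
    buffer = io.StringIO()
    write_gephi_edges([("abdyma", "childOf", "someParent")], buffer)
    print(buffer.getvalue())
```

When writing to a real file, open it with newline="" so the csv module controls the line endings.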

display predicate hierarchy in TreePicker

The predicate hierarchy should be displayed. Orlando's comes ultimately from the orlandoOntology.owl, but for generality we should not expect that the particular ontological choices made in its creation will be the same as those made by other workers in theirs -- if they even have one. Further, we would like to be able to provide control over that hierarchy and possibly restrict which predicates are displayed in the treepicker.

So. We should create a translator from the .owl to the .json format which already drives the treepicker.

Data: link back to source data for context

restrict edge display under graphcommandlanguage control

implement orlandoScrape.py --predicates

Make it possible to specify which regexes are employed by providing a --predicates regex which is applied to the names of the predicates.

orlandoScrape.py --predicates "standardName|dateOf.*"

Would only produce output for standardName, dateOfBirth and dateOfDeath

This would simplify generation of the timeline and other selective uses of orlandoScrape.py, obviating the creation of temporary files containing just the regexes desired for a particular pull.
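A minimal sketch of how the option might be wired up, assuming the scraper keeps its regexes keyed by predicate name. `select_predicates` and `parse_args` are hypothetical helpers, not existing orlandoScrape.py functions. The pattern is applied with a full match, so "dateOf.*" selects dateOfBirth and dateOfDeath while "standardName" selects only the exact name.

```python
import argparse
import re


def select_predicates(all_names, pattern):
    """Return the predicate names that fully match pattern (all of them if None)."""
    if pattern is None:
        return list(all_names)
    compiled = re.compile(pattern)
    return [name for name in all_names if compiled.fullmatch(name)]


def parse_args(argv=None):
    """Parse the command line; --predicates is the option proposed in this issue."""
    parser = argparse.ArgumentParser(description="Scrape predicates from Orlando XML")
    parser.add_argument("--predicates", metavar="REGEX", default=None,
                        help="only use regexes whose predicate name matches REGEX")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    known = ["standardName", "dateOfBirth", "dateOfDeath", "childOf"]
    print(select_predicates(known, args.predicates))
```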

re-enable the display of TTL files, so ontology can be displayed

Oh this would be so handy, to be able to view the orlando_ontology.ttl file using Huviz! The barrier is just that the .ttl reader has not been converted to use the new streaming / event-driven import tech. This is not strictly needed, but boy would it be handy and an excellent test of how generic the tool is.

Free text search

Search for a string of text

Most desirable: select all the nodes whose material contains that string and graph any edges between those nodes.

Also interesting: graph all relations embedded in all objects (in the case of Orlando this would be entries) that contain that string.

If possible, also more narrowly, graph only relationships between the subject of the entry and other entities contained within the structural tag that contains the string, or within x characters of that string.
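The "most desirable" variant above could be sketched roughly as follows, assuming each node's searchable material is available as a string and edges are (source, target) pairs. The names are hypothetical, and case-insensitive matching is an assumption.

```python
def search_subgraph(node_text, edges, query):
    """Return (matching node ids, edges whose two ends both match the query)."""
    needle = query.lower()
    matched = {node_id for node_id, text in node_text.items()
               if needle in text.lower()}
    kept_edges = [(a, b) for a, b in edges if a in matched and b in matched]
    return matched, kept_edges


if __name__ == "__main__":
    texts = {"w1": "Wrote a gothic novel", "w2": "Gothic revival poet", "w3": "Playwright"}
    links = [("w1", "w2"), ("w2", "w3")]
    print(search_subgraph(texts, links, "gothic"))
```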

Lassoing / selecting groups of nodes

Ability to select several nodes and save them as a "tear off" group to which you could apply actions:
show or hide labels
show or hide edges
discard all
discard rest/restrict active graph to them alone

Screenshots

Ability to save screenshots in different file formats and with different resolutions

shortest paths between nodes

Select two nodes that are not connected. Have the option to see the shortest path(s) between them.
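For unweighted edges a breadth-first search is enough to find one shortest path. A minimal sketch, assuming edges are treated as undirected (source, target) pairs; `shortest_path` is a hypothetical name, and it returns a single path rather than all of them.

```python
from collections import deque


def shortest_path(edges, source, target):
    """Return one shortest list of node ids from source to target, or None."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    previous = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for nxt in neighbors.get(current, ()):
            if nxt not in previous:
                previous[nxt] = current
                queue.append(nxt)
    return None
```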

make regexes robust wrt tags with line breaks between attributes

Getting childOf to work for abdyma required putting \s* in the regex to cope with a line break inside the tag itself, between its < and >.

This trickery should be implemented across most of the regexes
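To illustrate the trick: in Python's re module \s also matches newlines, so replacing literal spaces with \s+ and \s* makes a pattern tolerant of tags wrapped across lines. The MEMBER/RELATION tag below is only an assumed example, not necessarily the tag behind childOf.

```python
import re

# Brittle: assumes exactly one space between the tag name and its attribute.
BRITTLE = re.compile(r'<MEMBER RELATION="(\w+)">')

# Tolerant: \s+ and \s* also match line breaks and indentation inside the tag.
TOLERANT = re.compile(r'<MEMBER\s+RELATION\s*=\s*"(\w+)"\s*>')

wrapped = '<MEMBER\n    RELATION="FATHER">'
single_line = '<MEMBER RELATION="FATHER">'

if __name__ == "__main__":
    print(BRITTLE.search(wrapped))            # no match on the wrapped tag
    print(TOLERANT.search(wrapped).group(1))  # the tolerant form still captures it
```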

Edges: identify by predicate/type/tag

Nodes: allow persons, places, organizations or titles to be nodes

Tag-sensitive search

Ability to search within a particular tag
Recognize that this is likely to be a costly feature and that we may not be able to do it.

only leaves should be colored in colored treepicker

HuViz2

Next iteration of TM&V prototype

Edges: exclude

Exclude particular (individual) edges (not their types) from the graph

verify the regexes against writers who should have both birth and death dates

orientation of labels

Get labels right side up on the left outside of the graph

Pin nodes

Ability to click on and then pull a node either from the periphery into the active space of the graph or from another location in the space of the graph and have it stick when you release your mouse

in timeline multiple dateOfBirth or dateOfDeath should use earliestStart and earliestEnd

http://simile-widgets.org/wiki/Timeline_EventSources

There are indeed writers with multiple deaths (maybe births too).

Data: include bibliographical data

complete parser for graphcommandlanguage

people with just birth or death dates but not both could have imputed lifespans

Writers with just a birthdate or a deathdate should have an imputed 50yr lifespan and earliestStart or earliestEnd applied as needed to ensure that all lives are durationEvents.

People without death dates but who might still be alive should have their imputed death at the present moment.
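A sketch of those rules, working in whole years for simplicity. The 50-year span comes from the proposal above; the 100-year cutoff for "might still be alive" is purely an assumption, as are the function and constant names.

```python
IMPUTED_LIFESPAN_YEARS = 50     # from the proposal above
MAX_PLAUSIBLE_AGE_YEARS = 100   # assumption: cutoff for "might still be alive"


def impute_lifespan(birth_year, death_year, current_year):
    """Return (start, end, imputed), where imputed lists the filled-in end points."""
    if birth_year is None and death_year is None:
        return None  # nothing to anchor a lifespan on
    imputed = []
    if birth_year is None:
        birth_year = death_year - IMPUTED_LIFESPAN_YEARS
        imputed.append("start")
    elif death_year is None:
        if current_year - birth_year <= MAX_PLAUSIBLE_AGE_YEARS:
            death_year = current_year  # possibly still alive: end at the present
        else:
            death_year = birth_year + IMPUTED_LIFESPAN_YEARS
        imputed.append("end")
    return birth_year, death_year, imputed
```

The `imputed` list is there so the timeline can mark the uncertain end of each duration event rather than presenting a guess as fact.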

make graphcommands appear as shareable links

as described by stefan

Ability to label the graph

Ability to create a label for the graph
--auto generated label that summarizes search/selection criteria, at least a translation of the choices in the graph, the name of the dataset on which the graph is based, and perhaps also the date
--custom label that can be typed in by the user

Link back to data

Provide a snippet of text (try 5 words on each side of the node tag) within the graph interface itself
Provide link back to the relevant structural tag within the published CUP Orlando

make a click complete a command

Imagine being able to control what a click on a node means by selecting a verb first. In other words an incomplete command will be completed by clicking on a node.

Edges: weight them

Weight edges according to the number of links involved, either through multigraph or thickening of lines. Ideally hang on to the colouring of the edges as well, but this may not be possible. This may need to be a view that one turns on and off given the way it may impact overall readability.
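Counting the links per node pair is the easy part. A minimal sketch, assuming edges arrive as (subject, predicate, object) triples and that direction is ignored when weighting; `edge_weights` is a hypothetical name.

```python
from collections import Counter


def edge_weights(triples):
    """Count how many links connect each unordered pair of nodes."""
    weights = Counter()
    for subject, _predicate, obj in triples:
        weights[frozenset((subject, obj))] += 1
    return weights


if __name__ == "__main__":
    sample = [("a", "childOf", "b"), ("b", "parentOf", "a"), ("a", "friendOf", "c")]
    for pair, count in edge_weights(sample).items():
        print(sorted(pair), count)
```

The count could then drive line thickness, or the number of parallel edges drawn in a multigraph view.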

make orlandoScrape generate predicates based on the owl

show labels on graph

So there are now labels on everything.

Hmm. A little overwhelming (somehow the charm is lost!)

Alternatives to this appear to be:

label-as-you-type: a field where entered text causes only matching labels to appear instantly (Is this your filter-as-you-type feature, Stefan?)
hover-to-label: nodes within some distance of the mouse have their labels shown
hold-down-spacebar-to-show-labels: when the spacebar is held down, labels are shown (not iPad/smartphone friendly)
show-labels-checkbox: a checkbox in a control panel toggles label display
context-menu-show-per-node-options: right-clicking (or hold on touch devices) brings up a menu of options on a per-node basis, including show-label

add button to 'hide all links'

break out the comma-delimited details in STANDARD NAME=

Here are some examples of where the_name has more than one comma:

        "Smith, Maria,, 1773 - 1829"
        "Abergavenny, Frances Neville,,, Baroness", 
        "Abergavenny, Frances Neville,,, Baroness", 
        "Rutland, Thomas Manners,,, third Earl of", 
        "Anne,, second queen of Henry VIII", 
        "Bergavenny, Joan,,, Lady"
        "Abergavenny, Frances Neville,,, Baroness", 
        "Rutland, Thomas Manners,,, third Earl of", 
        "Anne,, second queen of Henry VIII",

Based on the above it looks like the format is:

Surname,GivenName,Comment,YearOfBirth - YearOfDeath,Title

Could someone confirm this? This stuff could be turned into knowledge as appropriate.
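If the guessed format holds, splitting would look roughly like this. The field names mirror the guess above and are unconfirmed, and `parse_standard_name` is a hypothetical helper.

```python
FIELDS = ("surname", "given_name", "comment", "years", "title")


def parse_standard_name(the_name):
    """Split a standard name on commas into the five guessed fields."""
    parts = [part.strip() for part in the_name.split(",")]
    parts += [""] * (len(FIELDS) - len(parts))  # pad short names with blanks
    record = dict(zip(FIELDS, parts))
    record["extra"] = parts[len(FIELDS):]       # anything beyond the guessed format
    return record


if __name__ == "__main__":
    for name in ("Smith, Maria,, 1773 - 1829",
                 "Abergavenny, Frances Neville,,, Baroness",
                 "Anne,, second queen of Henry VIII"):
        print(parse_standard_name(name))
```

A non-empty `extra` list would flag names that do not fit the guess.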

Edges: select/deselect

Select the edges that you want to see
Select all edges
Deselect the ones you don't want to see

Reify Data and Extract Structural IDs from Divs

Modify regex scraper to pull the structural ID from the container DIV for every piece of RDF extracted so that it can be used to link back to the source.

Output file is to be tab separated and use a concatenated form of the triple as the label.

populate edgepicker with external mintree file

The edge picker should be populated by a file something structurally like:
https://github.com/smurp/huviz/blob/master/orlando_tag_tree_PRETTY.json
but the leaves and branches actually shown should continue to be driven by the predicates which are actually present in the viewed dataset.

The motivation for this is so we can impose a hierarchy on the predicates.

Such files could be manually or automatically authored (eg extracted from the OWL ontology).

One might imagine that ideally in the long run the edge picker would be powered directly by the ontology, but my suspicion is that one would usually want some indirection in there for creative control.

Possible Controls

edgepicker_shows_only_used controls whether the whole taxonomy is displayed vs whether just the present ones are.
edgepicker_taxonomy_file presents a file picklist of alternative predicate taxonomies
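What edgepicker_shows_only_used might do can be sketched as a recursive prune. This assumes, without having checked the real JSON file, that the taxonomy can be read as nested dicts mapping each name to its children; `prune_taxonomy` is a hypothetical name.

```python
def prune_taxonomy(tree, used):
    """Keep only predicates present in `used`, plus the branches that lead to them."""
    pruned = {}
    for name, children in tree.items():
        kept_children = prune_taxonomy(children, used)
        if name in used or kept_children:
            pruned[name] = kept_children
    return pruned


if __name__ == "__main__":
    taxonomy = {"family": {"childOf": {}, "parentOf": {}},
                "dates": {"dateOfBirth": {}}}
    print(prune_taxonomy(taxonomy, {"childOf"}))
```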

type-ahead search / filter-as-you-type

[Starting a new thread for this as it seems distinct from labels.]

Are these the same things? I think Stéfan raised that question but I can't find that bit of the email thread again.

If filter and search are the same (which they may not be):
Would a filter/type ahead apply only to what's already in the graph, or would it go to the dataset on the server and fetch any new (matching) data?

"loading" message

Need an indication that the switch to the different dataset has registered, and that the system is loading.

add regexes for BIRTH and DEATH DATERANGE and DATESTRUCT

see dant__

<DEATH>
  <DATERANGE CERTAINTY="ROUGHLYDATED" FROM="1321-09-13" TO="1321-09-14">

see starma

<DEATH>
    <DATESTRUCT CERTAINTY="C" VALUE="1838-04-">
     <SEASON>Spring</SEASON>
     <YEAR>1838</YEAR>
    </DATESTRUCT>
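Candidate regexes for the two cases, as a sketch only. They assume the attributes always appear in the order shown in the samples, that the dant__ snippet is really an element whose angle brackets were lost when it was pasted, and they allow line breaks between attributes. The group names are made up.

```python
import re

DATERANGE = re.compile(
    r'<DATERANGE\s+CERTAINTY\s*=\s*"(?P<certainty>[^"]*)"'
    r'\s+FROM\s*=\s*"(?P<start>[^"]*)"'
    r'\s+TO\s*=\s*"(?P<end>[^"]*)"')

DATESTRUCT = re.compile(
    r'<DATESTRUCT\s+CERTAINTY\s*=\s*"(?P<certainty>[^"]*)"'
    r'\s+VALUE\s*=\s*"(?P<value>[^"]*)"')

dant_sample = '<DATERANGE CERTAINTY="ROUGHLYDATED" FROM="1321-09-13" TO="1321-09-14">'
starma_sample = '<DEATH>\n    <DATESTRUCT CERTAINTY="C" VALUE="1838-04-">'

if __name__ == "__main__":
    print(DATERANGE.search(dant_sample).groupdict())
    print(DATESTRUCT.search(starma_sample).groupdict())
```

Note that the starma value "1838-04-" is a partial date, so whatever consumes the match has to cope with a missing day.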

State saving

Ability to save state of a graph so that it can be:
saved
reloaded
shared