smurp / huviz
interactive, customizable semantic web visualization
License: GNU General Public License v3.0
This is motivated by compactness of a flowed representation. This would make it possible to have (say) 100+ colored leaves in only as many lines as there are branches.
We should measure how long the brute-force table-scan approach takes. How many milliseconds does it take to loop through 100 nodes? 500? At some point it takes longer than the tick period of 1/60 sec! Once we've implemented an attempted solution, measure again.
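A minimal timing sketch for the measurement proposed above. It assumes the table scan is a nearest-node lookup over every node, which the note does not actually state. The names nearest_node and time_scan_ms are hypothetical. Python timings will not match the browser's, so the point is only the shape of the harness: time the scan at several node counts and compare against the frame budget.

```python
import random
import time

FRAME_BUDGET_MS = 1000 / 60  # one tick at 60 frames per second


def nearest_node(nodes, x, y):
    """Brute-force scan: return the (x, y) node closest to the given point."""
    return min(nodes, key=lambda n: (n[0] - x) ** 2 + (n[1] - y) ** 2)


def time_scan_ms(node_count, repeats=50, seed=0):
    """Average milliseconds taken by one full scan over node_count nodes."""
    rng = random.Random(seed)
    nodes = [(rng.uniform(0, 1000), rng.uniform(0, 1000)) for _ in range(node_count)]
    start = time.perf_counter()
    for _ in range(repeats):
        nearest_node(nodes, 500.0, 500.0)
    return (time.perf_counter() - start) * 1000 / repeats


if __name__ == "__main__":
    for count in (100, 500, 5000):
        ms = time_scan_ms(count)
        print(f"{count} nodes: {ms:.3f} ms per scan (budget {FRAME_BUDGET_MS:.2f} ms)")
```

Running the same harness again after any index is introduced gives the before/after comparison the note asks for.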
After a search for algorithms let's review the candidates together and give the best one a try.
Consider the R-tree
https://en.wikipedia.org/wiki/R-tree
This survey of spatial index strategies might offer other approaches
https://medium.com/@agafonkin/a-dive-into-spatial-search-algorithms-ebd0c5e39d2a
The puzzle is whether the cost of maintaining the R-tree would be recouped.
Perhaps gaming situations -- where many moving things need to have their collisions detected -- offer analogs which have documented indexing solutions.
The Orlando timeline http://orlando.dev.semandra.com:9999/timeline.html does not show writer names despite their being in the Timeline.Event title attribute. See the JFK assassination timeline for an example of this working: http://www.simile-widgets.org/timeline/
Give the user the ability to drag one node onto another and combine them into a single node.
Very advanced feature: Ideally this should allow people to merge nodes in the actual entity collection, if they have the authority to do so.
Ability to filter by:
Author/nonauthor/gender of author
Temporal range
Geographical range?
Type of data
Exclude particular nodes from an existing graph
This would make it possible to replace predicate_to_type in orlandoScrape, which maps predicates onto URLs like xfn:kin. This would hook up with the edgepicker being powered by a taxonomy.
Upon reflection, it seems that predicate_to_type should be removed and just replaced with an Orlando ontology namespace, such as orl: perhaps.
The ability to push the data that is in the center of the ring to a format suitable for use in other graph tools for additional editing/formatting. Gephi is a likely candidate for a default export format.
The predicate hierarchy should be displayed. Orlando's comes ultimately from orlandoOntology.owl, but for generality we should not expect that the particular ontological choices made in its creation will be the same as those made by other workers in theirs, if they even have one. Further, we would like to be able to provide control over that hierarchy and possible restrictions on which predicates are displayed in the treepicker.
So. We should create a translator from the .owl to the .json format which already drives the treepicker.
Make it possible to specify which regexes are employed by providing a --predicates regex which is applied to the names of the predicates.
orlandoScrape.py --predicates "standardName|dateOf.*"
Would only produce output for standardName, dateOfBirth and dateOfDeath
This would simplify generation of the timeline and other selective uses of orlandoScrape.py, obviating the creation of temporary files containing just the regexes desired for a particular pull.
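A rough sketch of how such a flag could be wired up with argparse. The function names and the sample predicate list are illustrative only and are not taken from orlandoScrape.py. The sketch assumes the regex must match the whole predicate name (re.fullmatch), which is one reading of "applied to the names of the predicates".

```python
import argparse
import re


def build_parser():
    """Command-line parser with the proposed --predicates option."""
    parser = argparse.ArgumentParser(description="Sketch of a --predicates filter")
    parser.add_argument("--predicates", default=".*",
                        help="regex applied to predicate names")
    return parser


def select_predicates(names, pattern):
    """Keep only the predicate names that the regex matches in full."""
    regex = re.compile(pattern)
    return [name for name in names if regex.fullmatch(name)]


if __name__ == "__main__":
    args = build_parser().parse_args()
    known = ["standardName", "dateOfBirth", "dateOfDeath", "childOf"]  # illustrative
    print(select_predicates(known, args.predicates))
```

With --predicates "standardName|dateOf.*" this keeps standardName, dateOfBirth and dateOfDeath and drops childOf. Using re.search instead would also accept names that merely contain a match.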
Oh this would be so handy, to be able to view the orlando_ontology.ttl file using Huviz! The barrier is just that the .ttl reader has not been converted to use the new streaming / event-driven import tech. This is not strictly needed, but boy would it be handy and an excellent test of how generic the tool is.
Search for a string of text
Most desirable: select all the nodes whose material contains that string and graph any edges between those nodes.
Also interesting: graph all relations embedded in all objects (in the case of Orlando this would be entries) that contain that string.
If possible, also more narrowly, graph only relationships between the subject of the entry and other entities contained within the structural tag that contains the string or within x characters of that string.
Ability to select several nodes and save them as a "tear off" group to which you could apply actions:
show or hide labels
show or hide edges
discard all
discard rest/restrict active graph to them alone
Ability to save screenshots in different file formats and with different resolutions
Select two nodes that are not connected. Have the option to see the shortest path(s) between them.
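One way to find the shortest path(s) is a breadth-first search that remembers every predecessor lying on a shortest route. The sketch below assumes an undirected, unweighted graph given as a list of node pairs. The function name shortest_paths is hypothetical and this is not how Huviz stores its edges.

```python
from collections import deque


def shortest_paths(edges, start, goal):
    """Return every shortest path from start to goal in an undirected graph."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Breadth-first search recording each node's distance from start and
    # all predecessors that lie on some shortest path to it.
    dist = {start: 0}
    preds = {start: []}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                preds[nxt] = [node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                preds[nxt].append(node)

    if goal not in dist:
        return []  # the two nodes are not connected at all

    def unwind(node):
        """Follow predecessor links back from node to enumerate paths."""
        if node == start:
            return [[start]]
        return [path + [node] for p in preds[node] for path in unwind(p)]

    return unwind(goal)
```

When two routes tie for shortest, both are returned, which matches the "path(s)" wording in the request.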
Getting childOf to work for abdyma required putting \s* in the regex to cope with a line break WITHIN the tag, between < and >. This trickery should be implemented across most of the regexes.
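To illustrate the whitespace problem, here is a small sketch comparing a strict pattern with a whitespace-tolerant one. The tag name, attribute name and sample value are made up for the example and are not copied from the Orlando markup or from the scraper's actual regexes.

```python
import re

# Strict: assumes exactly one space between the tag name and the attribute.
STRICT = re.compile(r'<NAME STANDARD="([^"]+)">')

# Tolerant: \s matches spaces, tabs and newlines, so a line break anywhere
# between < and > no longer defeats the match.
TOLERANT = re.compile(r'<NAME\s+STANDARD\s*=\s*"([^"]+)"\s*>')

# Hypothetical markup with a line break inside the tag.
sample = '<CHILD><NAME\n   STANDARD="Example, Person"></CHILD>'

if __name__ == "__main__":
    print(STRICT.findall(sample))    # the line break inside the tag defeats it
    print(TOLERANT.findall(sample))
```

The same substitution (a literal space replaced by \s+ or \s*) can be applied pattern by pattern to the other regexes.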
Ability to search within a particular tag
Recognize that this is likely to be a costly feature and that we may not be able to do it.
Next iteration of TM&V prototype
Exclude particular (individual) edges (not their types) from the graph
Get labels right side up on the left outside of the graph
Ability to click on and then pull a node either from the periphery into the active space of the graph or from another location in the space of the graph and have it stick when you release your mouse
http://simile-widgets.org/wiki/Timeline_EventSources
There are indeed writers with multiple deaths (maybe births too).
Writers with just a birthdate or a deathdate should have an imputed 50yr lifespan and earliestStart or earliestEnd applied as needed to ensure that all lives are durationEvents.
People without death dates but who might still be alive should have their imputed death at the present moment.
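The imputation rules above could be sketched as follows. The function name, the year-only granularity and the may_be_alive flag are assumptions for illustration. How to decide that someone "might still be alive" is left to the caller.

```python
from datetime import date

IMPUTED_LIFESPAN_YEARS = 50  # the imputed lifespan proposed in the note


def impute_life(birth_year=None, death_year=None, may_be_alive=False, today=None):
    """Return (start_year, end_year, imputed_fields) for a duration event."""
    if birth_year is None and death_year is None:
        raise ValueError("need at least one of birth_year or death_year")
    today = today or date.today()
    imputed = []
    if birth_year is None:
        # Only the death is known: assume a 50-year life ending then.
        birth_year = death_year - IMPUTED_LIFESPAN_YEARS
        imputed.append("birth")
    elif death_year is None:
        # Only the birth is known: end now if possibly alive, else after 50 years.
        death_year = today.year if may_be_alive else birth_year + IMPUTED_LIFESPAN_YEARS
        imputed.append("death")
    return birth_year, death_year, imputed
```

The returned list of imputed fields lets the timeline mark which end of the life is a guess, for example by setting the corresponding earliest/latest attribute on the event.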
as described by stefan
Ability to create a label for the graph
--auto generated label that summarizes search/selection criteria, at least a translation of the choices in the graph, the name of the dataset on which the graph is based, and perhaps also the date
--custom label that can be typed in by the user
Provide a snippet of text (try 5 words on each side of the node tag) within the graph interface itself
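A minimal sketch of the "5 words on each side" idea, assuming the node's tag has already been reduced to a single whitespace-delimited token in plain text. The function name snippet and the sample sentence are hypothetical.

```python
def snippet(text, target, width=5):
    """Return up to `width` words either side of the first word containing target."""
    words = text.split()
    for i, word in enumerate(words):
        if target in word:
            start = max(0, i - width)
            return " ".join(words[start:i + width + 1])
    return None  # target does not occur in the text


if __name__ == "__main__":
    text = "one two three four five six TAG seven eight nine ten eleven twelve"
    print(snippet(text, "TAG"))
```

Near the start or end of the text the snippet is simply shorter on that side.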
Provide link back to the relevant structural tag within the published CUP Orlando
Imagine being able to control what a click on a node means by selecting a verb first. In other words an incomplete command will be completed by clicking on a node.
Weight edges according to the number of links involved, either through multigraph or thickening of lines. Ideally hang on to the colouring of the edges as well, but this may not be possible. This may need to be a view that one turns on and off given the way it may impact overall readability.
So there are now labels on everything.
Hmm. A little overwhelming (somehow the charm is lost!)
Alternatives to this appear to be:
Here are some examples of where the_name has more than one comma:
"Smith, Maria,, 1773 - 1829"
"Abergavenny, Frances Neville,,, Baroness"
"Rutland, Thomas Manners,,, third Earl of"
"Anne,, second queen of Henry VIII"
"Bergavenny, Joan,,, Lady"
Based on the above it looks like the format is:
Surname,GivenName,Comment,YearOfBirth - YearOfDeath,Title
Could someone confirm this? This stuff could be turned into knowledge as appropriate.
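If the guessed format holds, the_name could be split into fields roughly like this. The field names and the function parse_the_name are hypothetical, and the five-field layout itself is still the unconfirmed guess from above.

```python
import re

# Assumed (unconfirmed) field order: Surname,GivenName,Comment,Years,Title
FIELDS = ("surname", "given_name", "comment", "years", "title")


def parse_the_name(the_name):
    """Split the_name on commas into the five guessed fields."""
    parts = [part.strip() for part in the_name.split(",")]
    parts += [""] * (len(FIELDS) - len(parts))  # pad names with fewer commas
    head, extra = parts[:4], parts[4:]
    # Anything beyond the fourth comma is kept together as the title.
    record = dict(zip(FIELDS, head + [", ".join(extra)]))
    match = re.fullmatch(r"(\d{4})\s*-\s*(\d{4})", record["years"])
    record["birth_year"] = int(match.group(1)) if match else None
    record["death_year"] = int(match.group(2)) if match else None
    return record


if __name__ == "__main__":
    for name in ("Smith, Maria,, 1773 - 1829",
                 "Abergavenny, Frances Neville,,, Baroness",
                 "Anne,, second queen of Henry VIII"):
        print(parse_the_name(name))
```

All of the listed examples fit this layout, but a name whose comment or title itself contains a comma would be split incorrectly, which is one more reason to get the format confirmed.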
Select on the edges that you want to see
Select all edges
Deselect the ones you don't want to see
Modify regex scraper to pull the structural ID from the container DIV for every piece of RDF extracted so that it can be used to link back to the source.
Output file is to be tab separated and use a concatenated form of the triple as the label.
The edge picker should be populated by a file something structurally like:
https://github.com/smurp/huviz/blob/master/orlando_tag_tree_PRETTY.json
but the leaves and branches actually shown should continue to be driven by the predicates which are actually present in the viewed dataset.
The motivation for this is so we can impose a hierarchy on the predicates.
Such files could be manually or automatically authored (eg extracted from the OWL ontology).
One might imagine that ideally in the long run the edge picker would be powered directly by the ontology, but my suspicion is that one would usually want some indirection in there for creative control.
edgepicker_shows_only_used
controls whether the whole taxonomy is displayed vs whether just the present ones are.
edgepicker_taxonomy_file
presents a file picklist of alternative predicate taxonomies
[Starting a new thread for this as it seems distinct from labels.]
Are these the same things? I think Stéfan raised that question but I can't find that bit of the email thread again.
If filter and search are the same (which they may not be):
Would a filter/type ahead apply only to what's already in the graph, or would it go to the dataset on the server and fetch any new (matching) data?
Need an indication that the switch to the different dataset has registered, and that the system is loading.
see dant__
DEATH
DATERANGE CERTAINTY="ROUGHLYDATED" FROM="1321-09-13" TO="1321-09-14"
see starma
<DEATH>
<DATESTRUCT CERTAINTY="C" VALUE="1838-04-">
<SEASON>Spring</SEASON>
<YEAR>1838</YEAR>
</DATESTRUCT>
Ability to save state of a graph so that it can be:
saved
reloaded
shared