h-rea / hrea

A ValueFlows / REA economic network coordination system implemented on Holochain and with supplied Javascript GraphQL libraries

Home Page: https://docs.hrea.io

License: Other

JavaScript 24.34% Shell 0.60% Rust 58.29% HTML 0.17% CSS 0.04% TypeScript 16.44% Nix 0.11%

hrea's People

Contributors

adaburrows, connoropolous, hackmd-deploy, jamesray1, jhonatanhern, leosprograms, oro13, pegahvaezi, pospi, sqykly, steveej, tommycp3, weswalla


hrea's Issues

Explore alternative record architecture without "key indexes"

When I ran through the first pass of inter-DNA linking, we were storing "base" entries (*) as the address of the first version of an entry to keep consistent IDs. This was mostly to allow for consistent record IDs between entry updates, across networks. After implementing the second pass, which does not use "base" entries as targets but instead writes metadata around the target link as a JSON-based entry, storage of such consistent record hashes seems less necessary.

(*)(which have since been renamed to "key indexes"; please substitute as appropriate when you see the older terminology)

It may be possible to link directly between entries whilst always referring to them by their consistent initial hash without incurring any additional storage overhead. For example, we would no longer need the consistently-identified EVENT_BASE_ENTRY_TYPE linking to the underlying EVENT_ENTRY_TYPE that has a roaming hash.

  • When creating new records via create_record, instead of storing the base entry address, just return the initial hash coming from commit_entry.
  • When calling update_record there would no longer be any need to dereference the entry; however, reading the entry metadata in order to determine the most recent version hash may be necessary. The initial hash (rather than most recent entry hash) would be returned from this method as an identifier for the record that remains consistent between updates.
    • Another option is that update_record should accept a revision ID (read: actual entry hash) rather than a record ID (read: hash of first entry); which would necessitate returning this record metadata in responses (see #40). This method would also be better for avoiding undesirable update conflicts.
  • delete_record may have the same revision ID / record ID concern as for update, with the addition that there is no longer any base entry to delete.
  • read_record_entry takes the record ID initially returned from commit_entry and follows the update metadata through to the latest version of the entry automatically- there is no longer any reason to dereference the base entry. We may optionally wish to validate that no previous versions of the provided entry address exist, to ensure that revision IDs cannot be incorrectly used as record locators.

Aside from restructuring the zome link! definitions to remove the indirection, I don't think anything needs to change in the linking API. Provided all links continue to use the initial version of an entry, they should all still be readable in a single query for field traversals. It'll just be different link type names.
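
As a rough sketch of the simplified helper signatures this would enable (illustrative only- Address and ZomeApiResult below are local stand-ins rather than the real HDK types, and the bodies are stubbed):

// Illustrative stand-ins, not the HDK's own types.
type Address = String;
type ZomeApiResult<T> = Result<T, String>;

/// Create a record and return the hash of its *initial* entry as the record ID,
/// with no separate "key index" / base entry being written.
fn create_record<T>(_entry: T) -> ZomeApiResult<Address> {
    unimplemented!("commit the entry and return the initial hash from commit_entry")
}

/// Update by record ID: resolve the latest revision via entry metadata, write the
/// new version against it, and hand back the unchanged initial hash as the ID.
fn update_record<T>(_record_id: &Address, _new_entry: T) -> ZomeApiResult<Address> {
    unimplemented!("update latest revision, then return the record ID unchanged")
}

/// Read by record ID: follow update metadata from the initial entry through to the
/// newest revision; no base entry needs to be dereferenced first.
fn read_record_entry<T>(_record_id: &Address) -> ZomeApiResult<T> {
    unimplemented!()
}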

Allow per-record flexibility in cross-DNA linking by implementing URI resolver logic

Currently, the limited nature of bridging configurations means we are experimenting with a 1:1 relationship between different modules of the system (eg. "planning" & "observation")- in other words, records can only be linked to a single destination network that is known ahead of time; rather than being able to link anywhere. Eventually we want to be at a place where multiple modules of the same type can be connected to a DNA (eg. multiple "observation" DNAs referencing shared "planning" space) and linked arbitrarily.

This will mean that records can no longer be referenced by unique bridge ID and instead must be referenced by DNA hash. We will thus need to conform to a URI spec (proposal at end of this document) and implement appropriate resolvers; which at this time can only be created via introspection of an agent's local capability registry.

Holochain has plans to support this, and it will require some changes to the bridges definitions in zome.json (as yet unknown). In addition, it will also require some helper functions to be implemented.

  • create helper to retrieve appropriate bridge ID name for DNA hash
  • create helper to call a method on a target DNA when provided with the DNA hash
  • implement URI resolver method using DNA --> bridge ID helper
  • create helper to call methods on every connected target DNA which fulfills a particular zome trait
  • change anywhere that record IDs are returned as Addresses to return URIs instead (hopefully limited to construct_response helpers)
  • change all linking methods to accept record URIs instead of Addresses, and infer the necessary bridge ID (eg. BRIDGED_PLANNING_DHT) from the network portion of the URI

Note that by the time this is attempted, some or all of the above may have been provided in the HDK.

A PoC can be implemented by allowing a many:many configuration between the observation and planning DNAs.
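
As a sketch of the resolver logic (the URI format shown is hypothetical- the actual spec is still only a proposal- and the HashMap stands in for whatever bridge/capability registry ends up being introspectable):

use std::collections::HashMap;

/// Hypothetical record URI format: "hc:<dna_hash>/<entry_address>".
fn parse_record_uri(uri: &str) -> Option<(&str, &str)> {
    let rest = uri.strip_prefix("hc:")?;
    let mut parts = rest.splitn(2, '/');
    Some((parts.next()?, parts.next()?))
}

/// Resolve the "network" portion of a URI (a DNA hash) to the locally configured
/// bridge ID used to reach that DNA, eg. "BRIDGED_PLANNING_DHT".
fn resolve_bridge_id<'a>(
    bridge_registry: &'a HashMap<String, String>, // DNA hash -> bridge ID
    dna_hash: &str,
) -> Option<&'a str> {
    bridge_registry.get(dna_hash).map(String::as_str)
}

A linking helper would then parse the record URI, resolve the bridge ID, and direct its cross-DNA call through that bridge rather than through a hard-coded 1:1 bridge name.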

Ontology and Taxonomy and how they are related

I know both of those words have different meanings to different people, so I'll define how I am using them here, and why the difference matters.

By Ontology, I mean "a set of concepts and categories in a subject area or domain that shows their properties and the relations between them".

Ontologies can be domain ontologies, covering a specific domain, or upper ontologies, trying to cover all domains. REA is a domain ontology, originally covering accounting, later extended to cover all economic interactions. ValueFlows is a vocabulary based on REA and other sources.

By Taxonomy, I mean "the branch of science concerned with classification". Taxonomies are usually hierarchical, for example, a granny smith apple is an apple is a pome fruit is a fruit is an agricultural product is a product.

Here's a sketch of how taxonomies might fit into ValueFlows (diagram: "VF plus taxonomies"), with the whole VF ontology model shown beneath that excerpt.

Resource Specifications would fit naturally into various taxonomies of resource classifications, like http://aims.fao.org/vest-registry/vocabularies/agrovoc suggested in the diagram, a taxonomy of agricultural products. Other resource classification taxonomies exist, like http://www.productontology.org/ which of course calls itself an ontology. (My goal in this issue is to differentiate concepts and how they are used, not to argue against other uses of the words ontology and taxonomy.)

If you selected an item from Agrovoc or Prodont and used it as a Resource Specification in ValueFlows, it would instantly have all of the relationships with other concepts shown in the larger VF ontology model. If you selected a concept from ValueFlows (like resource classification) you would have a hard time fitting it into Prodont or Agrovoc or any other taxonomy I know of, and if you did, it would only have some hierarchical relationships.

That's the basics. Next I'll get into how groups of agents agree - or fail to agree - on ontologies and taxonomies.

Pattern for CRUD gateway and relational records

There has been some discussion about what's best for managing these kinds of "relationship" records, and how to make things more ergonomic for the caller. We have particularly focused on the logic for update behaviours (adding, removing and editing relationship records), and on "shorthand" data structures for defining dependent records (eg. declaring fulfillment relationships when writing an economicEvent).

After thinking about this for a while my opinion is that we should not bother with these kinds of indirect edit methods and simply treat "relationship" records (like fulfillment) as first-class items with their own CREATE, UPDATE & DELETE methods. The reasoning is as follows:

Create:

There is a small efficiency gain to the caller if they can specify a sub-structure for relationship records when creating a parent. The internal logic for implementation is also deemed to be reasonably simple, as it is simply the composition of the creation of the base record and creation of any child relationship record(s). However, the sub-records for declaring relationships would necessarily differ from the stored records themselves (for example, the fulfilled_by field of the fulfillment would come from the newly created economicEvent). And, in practise, this logic and field munging ends up being cumbersome and somewhat of a burden to implementors. With the goal of making development of HoloREA as accessible as possible, I think we should simply omit this burden. All calls are going through a websocket at the end of the day anyway.

Update & Delete:

Any set-based implementations for UPDATE & DELETE of referenced records within the api involve complex logic for retrieving the affected record and / or managing the change. However, it is pointless for the backend to be forced to run these computations and checks! Any UPDATE or DELETE request is going to originate at the UI layer, by the action of a user who has clicked on some control next to a data item. In all cases, the ID of the affected relationship is already known by the UI at time of operation. There is no need to say "remove this ID from the set of an economicEvent's fulfills", we only need to say "delete the fulfillment with this ID". And the same applies to UPDATE.

Delete:

Perhaps worth forking to a separate issue if there is contention, but specific to DELETE is the way that we manage the deletion- do we remove links, or records? I think this will be a case-by-case thing, but expect that in most cases we would want to remove the record rather than the link. This allows UIs to load previous versions of records where they have been deleted, by following the dangling link pointer back to the previously available version of the referenced record.

Either way it seems as though the decision to make historical information easily visible or not will fall to the UI layer, and the particular norms and requirements of the context in which a consuming application is being built.
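
For illustration, the first-class API surface argued for above would look something like the following (stand-in types, not the actual zome code):

// Stand-ins for the real HDK types.
type Address = String;
type ZomeApiResult<T> = Result<T, String>;

pub struct Fulfillment {
    pub fulfilled_by: Address, // EconomicEvent ID
    pub fulfills: Address,     // Commitment ID
    // ...remaining VF fields
}

// The UI already knows the affected relationship's ID, so it calls these directly
// rather than expressing edits as set operations on the parent records.
pub fn create_fulfillment(_f: Fulfillment) -> ZomeApiResult<Address> { unimplemented!() }
pub fn update_fulfillment(_id: &Address, _f: Fulfillment) -> ZomeApiResult<Address> { unimplemented!() }
pub fn delete_fulfillment(_id: &Address) -> ZomeApiResult<bool> { unimplemented!() }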

Required integration tests and edge-cases list

  • creating, deleting and re-creating a record results in a new record with a different hash (requires implementing creation timestamps)
  • deleting a record of a different type via a deletion method is not possible
  • create event, link fulfillment, modify event, link fulfillment, read fulfillments; validate length = 2
  • can set same link twice without error
  • can re-add the same links again after previously deleted
  • create process, link 2 commitments, ensure process.committedInputs length = 2 and that both commitments are retrievable via QueryParams.input_of search
  • assertions for link index management behaviour when multiple links are present (see test_record_links_cross_*.js)
  • cross-zome linking errors if target is not present (currently soft-erroring in that the value is not persisted)
  • move events where source and destination resource point to the same resource need to be tested to ensure that quantities are correct in both event creation operation output and subsequent resource reads

How to handle links between DNAs?

I'm running into a decision point where one side of a network fires a call into another network, and the other side only deals with its own internal links.

I'm now wondering under which conditions links between networks need to be updated in both directions- usually it is a "foreign key" type relationship, where one side of the relationship controls the other. So in this case, I can't update Commitment.fulfilledBy to change the associated EconomicEvents; I have to change each EconomicEvent.fulfills for any given commitment.

What's the best way to make this bi-directional? Will all of these kinds of linking operations need to be updated in both directions, or will there always be sender/receiver roles in play? (in this case, 'sender' = EconomicEvent & 'receiver' = the Commitments which an EconomicEvent fulfills).
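
One possible shape for the "sender-driven" option, where the EconomicEvent side writes its own link and then asks the Commitment's network to record the reciprocal one (link_locally and call_remote_zome are hypothetical helpers standing in for the HDK link and bridge calls):

type Address = String;
type ZomeApiResult<T> = Result<T, String>;

fn link_locally(_base: &Address, _target: &Address, _tag: &str) -> ZomeApiResult<()> {
    unimplemented!()
}

fn call_remote_zome(_bridge: &str, _zome: &str, _func: &str, _payload: &str) -> ZomeApiResult<String> {
    unimplemented!()
}

/// Record "event fulfills commitment" in both networks, driven from the event side.
fn link_fulfillment(event: &Address, commitment: &Address) -> ZomeApiResult<()> {
    // 1. local side: EconomicEvent.fulfills
    link_locally(event, commitment, "fulfills")?;
    // 2. remote side: ask the planning DNA to store Commitment.fulfilledBy
    //    (a real payload would carry both IDs; simplified here)
    call_remote_zome("BRIDGED_PLANNING_DHT", "commitment", "link_fulfilled_by", event)?;
    Ok(())
}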

Implement capabilities to restrict permissions between DNAs

Currently everything runs through hc_public and this needs to be constrained to particular functionality-based permissions. That also means that agent and bridge registration callbacks need to grant the appropriate capabilities for newly created networks and newly joining agents.

Implement EconomicResource

These need to be stubbed out and wired up to EconomicEvent API handlers. Note that VF 0.3 defines some fields as "may be calculated"- I think the appropriate behaviour there is to use a value where present and derive the field if an explicit value has not been set. Does that sound correct @fosterlynn?

Investigate using signalling API to wire up cross-DNA data mirroring

At present, data that lives "between" networks (eg. Satisfaction & Fulfillment) requires a "context record" to be kept on either side of the network boundary in order for participants from both networks to have equal access to information.

You can see this implemented in our system where the satisfaction zome in the planning DNA pings the satisfaction zome in the observation DNA in order to create a duplicate of the record.

We currently only allow the creation of satisfactions & fulfillments within the planning DNA space, though ideally both actions should be triggerable via the observation DNA as well. In the current implementation we can't do this, because you can't configure cyclic bridge dependencies. But it does seem like a common practise to have logic in one DNA responding to activity in another.

The signalling API may be the answer to this, provided signals emitted in one DNA are received by all connected DNAs running in the conductor; and no bridging is needed to enable this link.

If we can get this working we should remove the unnecessary bridge and provide for an ability to create fulfillments in the observation DNA as well. We should also decide on some standard message structures in order to wire up similar triggers between inter-network parts of the system.

No direct CRUD (well, CUD)

Looking forward, we want to think about how to have complete record of create, update, delete operations on VF records.

Haven't actually implemented this yet in the ActivityPub track, but there I think it would play out like:

  • Use ActivityPub / ActivityStreams vocabulary for activities called Create, Update, Delete, which would reference VF objects.
  • This would even include EconomicEvent, because although events are the way EconomicResources are created and updated, events themselves may also need to be updated, or occasionally deleted.

It would make sense to me to try to keep the same base vocabulary when doing this in the Holochain track, unless there is something already built into Holochain for this?

Implement Specification DNA

  • implement ResourceSpecification CRUD routes
    • GraphQL mutations & queries
    • integration tests for CRUD methods
    • GraphQL resolver for ResourceSpecification link fields, and integration test for creating EconomicResources in an associated observation DNA which can then be retrieved via resourceSpecification(id: ID).conformingResources (leave this until last)
  • implement ProcessSpecification CRUD routes
    • GraphQL mutations & queries
    • integration tests for CRUD methods
  • implement Unit CRUD routes
    • Indexing to use Unit.id as the ID in getUnit(id: ID) (will require base entries for the unit label index and nonstandard get_unit zome API logic)
    • GraphQL mutations & queries
    • integration tests for CRUD methods
  • implement Action routes for reading of the built-in Action type definitions from lib/vf_core/src/measurement.rs:
    • get_action(id: ActionId)
    • query_actions() - no parameters, just return all of them in an array
    • GraphQL queries
    • integration tests for read methods
  • update new zome modules to use proc macros as per #101

Note that the specification types will refer to classifications. At this stage, these are just string fields which will contain the URIs of semantic web ontology terms.

Units are best identified by their id, rather than entry hash. Every unit in the OM ontology is unique.
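
A sketch of the lookup shape implied by the unit label index (the HashMaps stand in for the base-entry index and entry storage on the DHT):

use std::collections::HashMap;

type Address = String;

pub struct Unit {
    pub id: String,     // OM ontology identifier, globally unique (eg. "kilogram")
    pub label: String,
    pub symbol: String,
}

/// getUnit(id: ID) resolves the unit's ID to an entry address via the index,
/// then loads the stored entry- rather than treating the entry hash as the ID.
pub fn get_unit<'a>(
    index: &HashMap<String, Address>,    // unit ID -> entry address (the "base entries")
    entries: &'a HashMap<Address, Unit>, // entry address -> stored Unit
    id: &str,
) -> Option<&'a Unit> {
    let addr = index.get(id)?;
    entries.get(addr)
}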

How to divide objects between the public and private spheres

Some operations that are critical or basic to the functioning of REA need access to certain fields of certain classes in order to work. For example, a simple trace procedure cannot function without:

  • EconomicEvent
    • inputOf
    • outputOf
  • EconomicResource
    • affectedBy
  • Process & Transfer
    • inputs
    • outputs

Given the clean dichotomy of public and private data natively supported in Holochain as described in #7, we will need some (many? all?) types to have dual representation, i.e. a single object has one public record and one private record.

In this thread, we will discuss and conclude for each VF class:

  • What, if any, fields are so critical that they must be present on the public DHT, or we will lose key functionality over the network.
  • What, if any, fields may be sensitive for security, or which users simply may want to keep hidden from their peers by default? Note that selective sharing (see #7) will allow an authorized agent to retrieve the private entry as well, so this should include fields which are needed within an organization but not by the rest of the world.

Throughout this process, we should remember that the full properties of an object in Holochain must uniquely identify that object. If one record is { foo: "bar" }, there can never be a different object whose only properties are { foo: "bar" }. Therefore, for both the public and private face of the object:

  • there must be at least one required field, and
  • the combined field values must never be able to serve multiple objects over the lifetime of the network.
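
One way to satisfy this (and it lines up with the integration-test note elsewhere about creation timestamps) is to give each face a required timestamp field. The struct shape below is illustrative only:

// Illustrative only: a required creation timestamp guarantees that two otherwise
// identical records hash to different addresses on the DHT.
pub struct PublicEventEntry {
    pub note: Option<String>,
    // ...other domain fields, all potentially repeatable across records...
    pub created_at: u64, // required; { note: "bar" } today != { note: "bar" } tomorrow
}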

In addition, I have made no attempt to use links to or from private entries. They may differ significantly; I seem to recall someone saying that links are on the DHT (public) only. Private data on the local chain may be queried via query, but this is a little different from the "queries" I implement with links.

Implement bidirectional links between `Event.satisfies` & `Intent.satisfiedBy`

To build on #19 & #20- once completed, the read query for Intent.satisfiedBy as exposed by GraphQL should load both the Commitment relationships from the planning DNA & the EconomicEvent relationships from the observation DNA. This will be a chance to think through our naming conventions and patterns as all our prior helper functions will be needed to build up the necessary behaviour.

Sign off on contributor guidelines

I just wanted to draw this out into an explicit decision- I've added a section on contribution agreements to the contributors doc in the repo, in response to more interested third parties and potential contributors reaching out.

Since you two are the other major stakeholders in this at the moment I wanted to run it by your eyes and ask for any input; at least one other person should OK this before we start living by it. Feel free to provide suggestions & comments here, or as PRs.

Finalise fulfillment & satisfaction data structures

These need to provide for the intermediary records (they currently link directly between EconomicEvent, Commitment and Intent) and all CRUD methods as deemed appropriate by #37.

Once finalised we can roll this pattern out to all zome API handlers and complete the Observation and Planning modules.

pubsubhub for offers/wants/status

Is there something akin to a pubsubhub feed aggregator function/happ?

I am interested in that for offers/wants/status-updates.

Any pointers appreciated.

I envisioned a minimum of fields

  • title text
  • description textarea
  • link url
  • metadata as needed: author, date created, tags

Create polished abstractions for graph-like record storage

This basically means bringing the work @sqykly has been doing on reproducing LinkRepo into the shell of the end-to-end system and validating the architecture by implementing our first inter-DNA bidirectional link.

Most of the groundwork and underlying architecture will be similar to the structure shown in these diagrams, and most questions needed per #3 have been answered. It is also clear that if one defines a "record" as a single logical entity within a Holochain app, then there are actually multiple DNA primitives involved in constructing that representation.

It has been observed and discussed that there are some common emerging patterns for record management on Holochain:

  • @sqykly referred to these as "link constraints" in discussion of whether links should have a "bidirectional" attribute added as a "constraint" when building the link index.
  • @pospi calls these "link components" or "link behaviours" and wants to use them as low-level declarative building blocks for building up record storage & query structures. We can work towards these simply by wrapping up common functionality as we go. @pdaoust has also expressed a desire to build up 'small composable helper functions' out of what the HDK provides; we hope this process will eventually yield a new set of simplified methods in hdk::utils.
  • @lucksus has spoken about the extra utility to be found in aggressive decomposition of Holochain app modules into smaller chunks- for example, a separate 'link broadcasting' DNA which tracks the linkages between records in two cooperating DNAs would allow inter-network connectivity to be analysed, and the appearance of new networks to be broadcast into the ecosystem.
  • We've also discussed Holodex with @artbrock, which was a standalone DNA for indexing content on other DNAs (kinda like plugging in ElasticSearch in a traditional cluster); and it seems clear that some features will need to interact with bridged DNAs whilst others will be internal logic.

These things being the case, the evolutionary pattern is likely to be:

  1. higher-order wrapper functions emerge to take care of managing common data structures (in our case, graph databases), and are integrated into DNA zome handler code
  2. some low-level utility DNAs will also emerge, which require external state to be managed in independent networks by way of bridging; in addition to integrating with other DNA code
    • these 'utility' DNAs are likely to have separate user interfaces (in the case of Holodex, a search UI; in the case of link broadcasting, a network map)
    • also likely to have pre-deployed 'recommended' public networks (eg. "the hApp of hApps")
      • will be pairwise modules: 1 side is the DNA; the other side is the utility crate/module to provide wrapper functions for interacting with the microservice
  3. hApp developers then begin to see system behaviours as pluggable, declarative mixins to their zome code...

Progressing through this implementation process should land us on a stable, polished foundation to build out the rest of the app on. This is likely to lead to the creation of some URI resolver logic to more reasonably handle cross-DNA record linkage.

This issue is a place to talk through the high-level design of our graph-like record structures. The lower-level tasks encompassing this work (so far) can be found in #18, #19, #20, #21 & #22.

The other reason to log this task separately is to claim it as a milestone. We should not proceed beyond the implementation of our first set of relationships (satisfactions & fulfillments) until we are happy with the code structure and quality, and comfortable building out the rest of the app to the same standard.

How to manage data privacy on a network-wide DHT

Many users will want to share their data only among their peers in an organization, while exposing enough to do business with others on the same network. Others (e.g. Sensorica) want everything out there for anyone to see and validate. Holochain has only two spheres of access rights:

  • The public DHT, where anyone with an open port can see every bit of every record that you've ever saved, even if it is overwritten or deleted.
  • The private local chain, which is not accessible to anyone other than the machine that saved the data.

In addition, the definitions of data types in the DNA determine immutably where all instances of that type live; there is no API option to make exceptions on a record-to-record granularity. Therefore, we will need to engineer a sensible way to split our data types into public and private aspects. In addition, we will need to implement a mechanism for agents to share their private data with trusted peers at their discretion. I'll be using the term "selective sharing" here, and abbreviating it as SS if I have to type it more than twice.


Selective Sharing

Based on Art's suggestions, the tentative plan for selective sharing is to send data directly peer-to-peer:

  • When private data is created, the owner agent stores the data in their local chain.
  • However, a reference to that data- a placeholder reference (or as I want to call it, a really far pointer)- may be used on the public DHT in a field of another record that needs to refer to private data for its own integrity.
  • When a foreign agent discovers the placeholder during its normal operations crawling the graph on the DHT, it may decide that it needs the full data for its own calculations.
    • In this case, if it believes it has access to the private data, then it can attempt to initiate SS with the owner
      • The foreign agent calls send with an SS message to the owner whose payload is a structure containing:
        • The placeholder about which it is inquiring.
        • The credentials (capability tokens) that prove the foreign agent should be able to access the full record.
        • MAYBE Identity information on the foreign agent in case the owner wishes to examine the agent further or gossip about it.
      • If the owner agent is online,
        • The owner validates the foreign credentials. If they are valid,
          • The owner returns the real private data from its private local store.
          • SOMEHOW, MAYBE the foreign agent promises to share the data if and only if another agent produces the exact credentials that were used.
          • MAYBE the placeholder itself on the DHT is amended to reflect that the foreign agent is an additional source for the data in case the owner is offline.
        • If they are not valid,
          • The owner returns a rude message to that effect and trash-talks the foreign agent among its network peers.
      • MAYBE if the owner is offline, the foreign agent is able to obtain secondary sources for the record by examining the placeholder on the DHT. It should restart its SS procedure using an online secondary source instead of the owner.
      • Otherwise, the SS attempt fails and the foreign agent is very unhappy.

MAYBE indicates that this is something that popped into my head right now and needs vetting or input from @pospi et al.
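
To make the message shape concrete, the SS request/response might look roughly like this (field names are illustrative, not a finalised wire format):

type Address = String;
type CapabilityToken = String;

pub struct SelectiveSharingRequest {
    /// The placeholder ("really far pointer") on the DHT being inquired about.
    pub placeholder: Address,
    /// Credentials proving the requester should be able to access the full record.
    pub credentials: Vec<CapabilityToken>,
    /// MAYBE: identity info so the owner can examine or gossip about the requester.
    pub requester_identity: Option<String>,
}

pub enum SelectiveSharingResponse {
    /// Credentials validated; the private entry data is returned directly peer-to-peer.
    Granted { entry_json: String },
    /// Credentials invalid (the "rude message" case).
    Denied { reason: String },
}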

Please pick this scenario apart, shoot it down, add or subtract things, or whatever so that we can arrive at the best possible selective sharing.

Details of update logic for EconomicResources?

I've got some questions on how to manage the fields of an EconomicResource in regard to event-related functionality through the API, apologies if they are documented elsewhere; if not- this might be a good opportunity to describe the state machine at a high level.

When creating a resource via the additional createResource parameters in the createEconomicEvent GraphQL mutation:

  • stage: should this be provided as an input parameter, just start off as 'none', or should it take on the ProcessSpecification of the related EconomicEvent that resulted in the resource creation, if EconomicEvent.outputOf is set? Or inputOf? Or something else?
  • unitOfEffort: similarly, must this be explicitly provided or does it come from the associated ResourceSpecification?
  • and state comes from the event's action, if the action is pass or fail? If that's going in to the model... how does a "pass" expire for inspections that need periodical review? It doesn't feel right for it to "stick" to the resource...
  • I presume accountingQuantity adjusts in response to e.resourceQuantity based on e.action.resourceEffect? (being either increment or decrement- see the sketch after this list)
  • are any other EconomicResource fields affected by or derived from fields of the associated EconomicEvent that is provided at creation time?
  • Should the resource be linked to the event responsible for creating it somehow? I suppose we would transparently create the resource address as part of the transaction and inject that as resourceInventoriedAs in the stored event?
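
Here is that presumed accountingQuantity adjustment expressed as a sketch (the real rules are exactly what this issue is asking to have specified):

pub enum ResourceEffect {
    Increment,
    Decrement,
}

/// Presumed behaviour only: adjust the resource's accountingQuantity by the
/// event's resourceQuantity according to the action's resourceEffect.
pub fn apply_event(accounting_quantity: f64, resource_quantity: f64, effect: &ResourceEffect) -> f64 {
    match effect {
        ResourceEffect::Increment => accounting_quantity + resource_quantity,
        ResourceEffect::Decrement => accounting_quantity - resource_quantity,
    }
}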

I also have outstanding ambivalence about createResource as a parameter name in the event API, even though we've discussed that it's unavoidable. What if we renamed it to observedResource? I'd feel better about that :P

For updating a resource directly (pretty much only provided for correcting data entry errors...)

  • Are we missing a bunch of fields in the mutations API to mirror the new resource fields in 0.3, or are some omissions intentional? (lot, unitOfEffort, stage, state & containedIn)

For updating a resource in response to an event:

  • Is classifiedAs ever altered by events, or does it retain the value it began with? i.e. do I update r.classifiedAs in response to e.resourceClassifiedAs if e.resourceInventoriedAs is provided? If so, is e.toResourceInventoriedAs modified as well?
  • Is onhandQuantity affected by e.resourceQuantity, or just accountingQuantity?
  • Does containedIn respond to any action types?
  • What about currentLocation?
  • Kinda repeats, but- what is the logic for how to update stage & state?

One other unrelated thing- is onhandQuantity the correct casing, or should the 'H' be uppercase?

Oh, and- deleting records... is this possible? Or should they stick around, even if all the events that created them and interacted with them are deleted? Is it worth implementing extra logic to manage the cleanup?

Development workflow & team coordination bits

So I've seeded this repo with a bunch of boilerplate- basically just trying to dump out all the best practices, tools & conventions that I've personally come to in my time managing development teams and get a stable foundation ready for us to build on. There's a collaborators guide which details dev tools and git conventions, which should give us a good place to start.

I want to make it clear that this is in no way me prescribing a way that we must do things (and on that note, nor is anything else I would say, ever) - just starting the conversation. It may take some dialogue for us to find a place that we both feel makes sense and lets us work together smoothly whilst getting out of our way. I think that's basically the goal here- if we were a team within some enterprise then I would probably have things a bit stricter, but it's important to remember that we are doing this for love of the game and that if it feels too onerous then it can take the joy out of it.

With that in mind, when we're past GFD and you have time to digest this it would be great to hear your thoughts & feelings about what I'm proposing :)

I'd also like to get better at the coordination side this time around, and really embrace having a regular cadence that keeps us all in the loop. My present feelings are that this would be a case of:

  • checking in via shared chat when starting work to let others know what we plan on doing
  • checking in via shared chat when finishing work to let others know what we got done & if there are any blockers
  • always pushing our code up to the relevant feature/ branch when done for the day
  • letting the team know when we will be away for a period of time, and having an understanding of what "our usual work days" are (if any)

Implement running end-to-end system & polish scaffolding

This issue has been filed to record the outstanding cleanup jobs remaining on #13. As you can see there is a lot of tidy-up to do... my goal is to try to keep things moving and avoid a large merge conflict or bottleneck in aligning what we've already done (particularly as I have low bandwidth for development in the coming week).

Though the PR is still rough around the edges, the framework, some patterns and at least the locations of files appear to be coalescing, and I don't want to waste an opportunity for us to be working within the same collaboration space & polishing out the same details- I think things are stable enough for both of us to be working on top of. For completeness, and so you can get a sense of what still needs to be tidied up, here's an exhaustive list of what still needs doing-

code cleanup:

  • move event / commitment zome I/O structs into vf_core, abstract out the pattern
    • use macros to avoid repetition between CREATE, UPDATE, entry (storage) & response (output) structures
  • rename & potentially split "main" zomes, restructure around areas of function
  • vf-graphql integration:
    • implement build process for reference schema (see valueflows/vf-graphql#32)
      • move schema instantiation logic to vf-graphql-holochain package and no longer depend on Webpack for bundling (remove related raw.macro & hack in postinstall.sh as well)
    • fix scalar types setup in schema resolvers config
    • remove temporary git submodule from repository, publish module to NPM & reference as separate node package

documentation:

  • document language around "requests", "records", "entries", "base entries" & "links" (perhaps equivalent to docs for vf_core record macros)
  • document files, methods & modules to Rust standards
  • docs for example GraphiQL explorer
  • finish scaffolding and documenting modules/vf-graphql-holochain

testing:

  • unit tests as appropriate
  • integration tests (awaiting bridging support in nodejs conductor)
    • create event
    • read event (incl. fulfillment field)
    • read fulfillments
    • create event, link fulfillment, modify event, link fulfillment, read fulfillments; validate length = 2

implementation:

  • implement fixed identity pattern for records
  • update address type aliases to use newtype pattern to avoid accidental field mismatches
  • wire up Action fields using static action builtins (note- some entry_address wrapping still needed)
  • improve error handling around complex operations
    • For critical writes (eg. the base entry for an 'entry + links' record structure), handlers should panic before attempting any other operations.
    • For parallel writes (for example, linking a set of EconomicEvents as fulfillments of a Commitment), handlers should pass back ZomeApiResult objects and leave the context handling to the caller. This would usually be done towards the end of the callback handler by partitioning out all the Err responses from sub-operations and returning them in the API response object. You could also do things like pass the errors into some bridged "error logging" Holochain DHT.
  • include field name whitelist parameters within every request to avoid reading links where unnecessary
  • expand fulfillment test implementation to include the intermediary record struct and abstract out a pattern for

My goal is to keep working to tick many of these off; things that are to be left for later will be logged as separate issues. @sqykly please let me know if there is anything here you think should be completed before merging the PR- in other words, that you feel blocks you from being able to continue work after merging #13 into what you're doing. Other than that, I definitely think most of this should be taken care of before proceeding beyond event / fulfillment / commitment, as I want to ensure we've landed on optimal patterns and laid a good foundation for code best-practises before we continue work elsewhere.

There are also some assumptions in my tasks which might require unpacking, please query anything which doesn't make sense. Perhaps this can become the new "optimal architecture" thread, since it seems pretty clear that read & create are mostly solved problems and the only thing we may still have to discuss is update and how that fits in to the project scaffolding I've already created.

There are a couple of other items that can't be tied off just yet until the Holochain core & HDK catch up (CC @philipbeadle @thedavidmeister @willemolding)-

  • Implement bindings into the capabilities register and bridge registration callback to set up inter-network data operations more flexibly.
    • Allow multiple DNAs to interconnect as the "same" network (2 separate observation DNAs referencing a single planning DNA, for example)
  • Integration tests to prove cross-DNA functionality

Agent-centric architecture

I haven't dug into much of the technical architecture of Holochain, so if you @sqykly and @pospi tell me this is dumb, I'll close it. But I have wondered this for quite a while.

Holochain is "agent-centric", we all want to be "agent-centric" too. But how does it work? (I've thought about it quite a bit in our ActivityPub track, so I'll use that to translate over to here.)

Some different aspects of my questions:

  1. Can we make something agent-centric out of what I understand Holochain provides, which is more like user key centric? If not,
  2. Where are we with the extended discussion about group agents that @bhaugen and Paul D were having earlier? (I won't repeat it here, but can pull some things into github later if useful.) Basically, we see the need for group agents to also be part of the agent-centric paradigm. Not so much in terms of having a user key, but in terms of having their own source chain and whatever else is needed for owning their own data. And person agents could have role-based permissions to act on behalf of the group agent based on the group's governance.
  3. How should we think of group agent-ness in relation, say, to hApps and DHTs and such? I picture group agents having their own choice of apps that they use, which would first save their data in the group's source chain, and then...? Interaction from person agents who are part of the group could be carried out from a UI based on the person's home base. But the software would need to be coordinated with the group's software. The group's software would have most of the backend logic. Where does that live in the hApp / DHT / DNA architecture of holochain?

Apologies again for not being very deep into the architecture, for any wrong terms I used, etc.

Manage network hash equivalence for target DNAs between updates

Once we have more flexible links between records (see #49), we need to resolve the issue whereby a destination network may be upgraded and thus the old data becomes unresolvable.

"Source" DNAs will also have to keep a list of equivalences between current & prior versions of "destination" DNAs in order to determine the correct DNA hash to look up the current version of an entry. It is as yet unknown how best to account for this.

Get Nix environment working and documented

We need to align with the officially supported dependency management system.

Work on https://github.com/holo-rea/holo-rea/tree/feature/nix-install-scripts is mostly completed for the 0.0.26 upgrade (quite a few breaking changes), as well as revised setup instructions and install script for downloading a specific version of https://github.com/holochain/holonix. Editor tooling is also working well under the new environment.

Two minor issues left to resolve after the upgrade before this can be completed, see this post.

How should we structure the code to be a framework?

We have said that holo-rea will be an REA framework that others can build on top of with application or domain specific logic. What does that actually mean in the context of holochain architecturally? And when do we want/need to figure this out? Is there a specific group we are working with where we could work through the question in practice?

Or if we do understand it, I'd be interested in hearing how it should work.

Implement zome validation callbacks

See valueflows/vf-apps#5.

Some fields come in either/or groups where at least one of the group is required. These are:

  • Intent, Commitment, EconomicEvent, RecipeFlow, Claim, Process: at least one of the time fields
  • Intent, Commitment, EconomicEvent, RecipeFlow, Claim, Fulfillment, Satisfaction, Settlement: resourceQuantity or effortQuantity (or both)
  • Intent, Commitment, EconomicEvent: resourceClassifiedAs, resourceConformsTo, or resourceInventoriedAs (more than one is fine)

Implement compatibility layer for group agents

Starting a new issue for this, as it seems different enough from agent-centric architecture. Should it be broader? Like how to prepare for group agents or something?

@pospi @sqykly Please feel free to pull the various discussions from elsewhere into this to seed it. And I'll think about how onBehalfOf might (or might not) want to become part of VF, seems like a universal problem. Probably working it through here first though.

Optimal architecture for DHT record logic

This is a thread to discuss Rust's language features and how we best implement the DHT code... presuming Philip doesn't come along with the new GraphQL API generation feature and obviate the need for us to hand-code most of the backend ;)

The first thing about Rust is that it's an ML-derived language and neither of us has learned any ML before. This will probably make the experience somewhat painful for a while until we attain some lightbulb moments, after which it will suddenly become amazing. It might be an idea to have weekly catch-ups where we compare notes on our learning as this will help accelerate each other. I will keep updating this thread as I learn new insights so that you can alert me if you've come to different conclusions.

Type hints:

Something I have seen in a couple of 'best practise' documents is to make type declarations "clear and expressive, without being cumbersome". However, information on how to make such distinctions is lacking.

From what I can tell, the compiler mandates that you declare a return type and parameter types for all functions. I suspect the above guideline is around the 'restrictiveness' of type parameters, and that the best practise is to make your types as generic as possible (eg. using &str instead of String to allow values which are not instances of the String class but can be converted to be used).
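
A small example of the distinction (a generic &str parameter accepting both owned and borrowed strings):

// Taking &str lets callers pass either an owned String or a string literal
// without any conversion noise.
fn greet(name: &str) -> String {
    format!("hello, {}", name)
}

fn main() {
    let owned = String::from("holochain");
    println!("{}", greet(&owned));  // &String coerces to &str
    println!("{}", greet("world")); // literals are already &str
}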

Record architecture:

The topic I've been debating lately is how best to architect the VF core functionality. Rust is not an OO language, and so preferring composition over inheritance is not only good practise here, it's also idiomatic and more performant, and AFAIK OO is not even really an option. Rust's trait system looks to be a really solid and strongly-typed way of dealing with mixins, though- we are in good hands.

The HDK methods for data handling are all quite low-level. Some amount of wrapping them up will be needed, especially around links. And then we need some higher-order wrapping that combines an entry and some links to create a record, like we did with GoChain. I imagine we will want very similar functionality.

The other challenge with the core VF fields is that we probably need to change our way of thinking, because a) you can't inherit structs, and b) traits cannot define fields. Rust really enforces that you keep your data separate from your behaviour. As a consequence, I suspect we will need to use macros to declare our entry types succinctly and avoid having to redeclare fields.

So this is what I came up with as a rough scratchpad for implementing a higher-order struct to manage all the related record data, and a trait implementation for managing it:

#[derive(Eq, PartialEq, Debug)]
enum LinkOrLinkList {
    Link(Address),
    LinkList(Vec<Address>),
}

#[derive(Eq, PartialEq, Debug, Default)]
pub struct Record<T> {
    entry_type: String,
    entry: T,
    address: Option<Address>,
    links: HashMap<String, LinkOrLinkList>,
}

trait LinkedRecord {
    fn commit(self);    // :TODO: return something useful
}

impl<T> LinkedRecord for Record<T> {
    fn commit(mut self) {
        // save entry
        let entry = Entry::App(self.entry_type.into(), self.entry.into());
        let address = hdk::commit_entry(&entry);
        match address {
            Err(e) => { /* :TODO: bail */ }
            Ok(a) => {
                self.address = Some(a);
            }
        }

        // save links
        for (tag, addr) in &self.links {
            match addr {
                LinkOrLinkList::Link(link) => {
                    match &self.address {
                        None => { /* should probably throw some error here, or check `address` above */ },
                        Some(self_addr) => {
                            // :TODO: handle result
                            link_entries(&self_addr, &link, tag.to_string());
                        }
                    }
                },
                LinkOrLinkList::LinkList(links) => {
                    match &self.address {
                        None => { /* ... */ },
                        Some(self_addr) => {
                            for link in links {
                                // :TODO: handle result
                                link_entries(&self_addr, &link, tag.to_string());
                            }
                        }
                    }
                }
            }
        }
    }
}

Some notes about this:

  • There are probably a lot of Rust n00b misconceptions and mistakes in here.
  • The first line of the commit function fails on self.entry.into(), and I couldn't figure out how to declare the type for the LinkedRecord implementation to make that work. I don't really understand the Into and From traits yet, they seem like dark magic.
  • Gave some conditional logic a try with "single link or array of links" via the LinkOrLinkList enum. The way the language handles these felt overbearing at first, but I can see the benefits. It's impossible to miss a condition unless you explicitly want to.
  • Error handling is probably going to be an interesting exercise. The docs on it go really deep.
  • Of course all this needs to be split up and made to do all the cool things your TS LinkRepo does.
  • Writing it took me a very long time. It's going to be a steep learning curve, but on the upside the type system is so good that you practically cannot write invalid code. The downside is how frequently the compiler barfs at you.

Anyway, that's my half-baked thoughts after a day and a half or so of learning Rust. If I'm doing stupid things or on the wrong track I would love to know about it! heh

Implement plumbing to run tests via GraphQL connector

GraphQL is able to detect the presence or absence of mandatory fields in responses, which gives us some error checking for free.

This would also be a good integration test to run over each record type, to ensure that storage & retrieval works for every field and that no typos or other developer errors have occurred between specification and implementation.

It is likely that this is the only integration test we will need for each simple record type, once the core CRUD behaviours are wrapped up by a macro (see #22). Once that task is complete, most API functionality will be tested via unit tests in the macro package or in hdk_graph_helpers. If we can get to a place where the only code in the zome API handlers is declarative use of the graph helper framework code, then the only thing remaining to test for each external API is that all the fields are correctly wired up.

Implement bidirectional links between Event.fulfills & Commitment.fulfilledBy

Following on from #19, the pattern needs to be extended to work for inter-DNA field links. It is difficult to say what the correct flow of data is for managing updates across DNAs, but one goal is for the Fulfillment record to exist on both sides of the network boundary. Ideally fulfillments can be managed in either network and changes would flow between them, but it may be more appropriate to only have one side of the relationship editable. We could start there and see what the added complexity in bidirectional editing looks like?

What is HoloREA?

It's a "modular" VF implementation on Holochain, yes. But the architecture of the program depends strongly on the exact protocol through which a client app interfaces with a "module". We need to decide what "module" really means to us. This issue can serve to collect our thoughts and knowledge in advance of the next meeting, and to hold our conclusions afterward.

Drop-in zomes

A drop-in zome is source code that Holochain developers can include in their own projects. The interface with our code would be cross-zome methods, which have a special declaration in the DNA and a special way they must be called:

let vf_data = call("name_of_holorea_zome", "name_of_holorea_function", argument);

The VF model objects would live on the same DHT as the rest of their app. Any object created by the zome(s) is validated in our own zome. After that, our data is at the mercy of the host app's security strategy, i.e. there is no point in building in access control. The host app ultimately has total control and our zome(s) operate with full transparency to it. It would be trivial for a developer to break our zome in some way, but that transparency also creates possibilities that may be impossible otherwise by allowing the host app to use or redefine our types.

If desired by the host app, the special DNA declaration can also expose our functions as web API or bridge functions. We can't guarantee any part of an API we define will be a part of an app running HoloREA. We also have no configuration data apart from what we require from the host app to run. We can effectively never push an update.

Bridged apps

Like drop-in zomes, only Holochain developers can use a bridged app. Our DNA will define a set of bridge functions, which may be called from the client app. Instead of call, there is a similar HDK function:

let vf_object = bridge("name_of_holorea_app", "name_of_zome", "name_of_function", argument);

Our data lives on a different DHT than the client app. We do all of our own validation and access control. We control the configuration data and our own DNA, so we decide which parts of the API we expose at every level.

I don't think we will have a different DHT for each client app by default; we will need to juggle multiple data sets for different apps. I think we can use configurations to make each client app's HoloREA a different app hash, which would relieve this problem where the client app plays along. Whether this will affect updates to HoloREA, I don't know.

Web API

This is how it worked in the prototype. Every hApp runs a server that serves requests for static content and requests to the web API endpoints its DNA declares. The client app can then be anything, not just another hApp. A web browser can access our default UI (like the REPL), or another app can use our API and serve its own pages (like our separate UI server). Client apps call our functions via requests:

POST holorea-server/fn/holorea-function-name

A web API can also be used as a bridged app if we declare so in our DNA. It could be used as drop-in zomes, but doing so will cede the advantages of being an independent app. It would also make security very difficult. Let's not do that. Bridging might be okay.

In most other characteristics, this situation is identical to a bridged app. The difference is only in how the API functions are declared in the DNA.


Questions, comments, additional facts before I weigh in?

Requirements for connections between ValueFlows/REA concepts

Extracted from https://docs.google.com/presentation/d/15hDyktqVni3NatUeB1wS_zMS0_msXpM1xZNUQ1WU6Ww/edit?usp=sharing

  • An Agent needs to be able to access their Relationships, and the Relationships need to be able to access their Roles.
  • An Agent needs to be able to access all of the Economic Events they participated in, and an Economic Event needs to be able to access its provider and receiver Agents.
  • An Economic Resource needs to be able to access all of the Economic Events that affect it, and an Economic Event needs to be able to access the resource it affects.
  • An Agent needs to be able to access all the Resources for which it has some responsibility, and a Resource needs to be able to access the Agents who are responsible for it. This might be via Economic Events or direct connections.
  • A Process needs to be able to access all of the Economic Events that are connected to it, and an Economic Event needs to be able to access a Process that it may be (optionally) connected to as an input or output.
  • Traversals need to be able to follow connections from one Process to another where the same Resource flows out of one Process and into another.
  • Economic Events need to be able to access Exchange Agreements to which they may (optionally) be connected, and Exchange Agreements need to be able to access all of their connected Economic Events.
  • Traversals need to be able to follow connections between Processes and Exchange Agreements that are connected by Resource flows.
  • Traversals also need to be able to follow Resource flow connections between Exchanges.
  • Processes and Exchange Agreements need to be able to access their planned Commitments, and Commitments need to be able to access their connected Processes and Exchange Agreements.
  • Economic Events need to be able to access Commitments that they fulfill, and Commitments need to be able to access Economic Events that fulfill them.
  • Agents need to be able to access Commitments and Intents that they are involved in.
  • All of the connections of the Knowledge concepts to each other, and to the Plan and Observation concepts, also require mutual accessibility.

Implement ability to augment built-in action types via Specification module

  • define action record types in Specification module, keyed by action ID as a user-supplied string
  • upgrade all record types which reference the action builtins directly to reference a combination of these, and records from an associated Specification zome. For now we can presume that only 1 Specification DNA can be connected to any consuming zome.

Implement Process

Nothing crazy here, just another record type with CRUD routes to define in the observation DNA.

Implement bidirectional links between Commitment.satisfies & Intent.satisfiedBy

This is a good first relationship to implement as it deals with same-DNA field links. Implementation requires:

  • CREATE methods for Commitment & Intent in the planning DNA
    • Ability to create Satisfaction records as sub-records of a new Commitment
    • Ability to create Satisfaction records as sub-records of a new Intent
  • CREATE method for Satisfaction
  • UPDATE methods for Commitment & Intent (non-relational fields only)
  • UPDATE method for Satisfaction
  • DELETE method for Satisfaction
  • DELETE methods for Commitment & Intent

I don't think there is any need to implement an ability to modify Satisfaction records via manipulation of Commitment or Intent- the client needs to be responsible for tracking which records to CREATE / DELETE anyway, so at that point they might as well call UPDATE / DELETE on Satisfaction directly. I do still think it's a nice convenience thing to have an ability to author them at Commitment / Intent logging time but am open to having my mind changed on that. I thought if nothing else it would be a good way to encourage splitting abstractions out.

We also need to define a naming convention for our zome API methods going forward; part of this issue is deciding on those naming conventions.

(BTW: if you think we need to define acceptance criteria for this and other issues I'm about to log, please let me know; otherwise happy for those tests to emerge as makes sense.)

Implement 'stable ID' record pattern

We first need to implement a wrapper for records that allows other DNAs to detect multiple versions of the same entry as one. This allows cross-DNA linking to work predictably, but requires some wrappers to manage the underlying DHT structures in order to do so.

Basically, this means:

  • de-referencing an entry's address the first time it is created and linking the actual entry data off there via an initial_entry link.
  • creating a read wrapper function which will load the most recent version of some such record structure from the DHT by following the previously created link.
  • creating an update wrapper function which follows the initial_entry link and updates the de-referenced entry instead of the ID entry
  • creating a delete wrapper function which knows to delete both the de-referenced entry ID and linked data

Tests should prove that creating & updating a record does not result in loss of data linked to the entry ID.
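
A sketch of the read wrapper described above (get_links and get_latest_version are hypothetical stand-ins for the HDK calls involved):

type Address = String;
type ZomeApiResult<T> = Result<T, String>;

fn get_links(_base: &Address, _tag: &str) -> ZomeApiResult<Vec<Address>> {
    unimplemented!()
}

fn get_latest_version<T>(_initial: &Address) -> ZomeApiResult<T> {
    unimplemented!()
}

/// Read a record by its stable ID entry, regardless of how many times the
/// underlying data entry has been updated since creation.
fn read_record<T>(id_entry: &Address) -> ZomeApiResult<T> {
    let initial = get_links(id_entry, "initial_entry")?
        .into_iter()
        .next()
        .ok_or_else(|| "no initial_entry link found".to_string())?;
    get_latest_version(&initial)
}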

Code quality / review pass for Multi-DNA PoC

Common abstractions

Careful thought needs to be taken before addressing these items. Advanced Rust skills are needed. Don't presume that an issue from this list should be fixed in a particular way, or if indeed it is an issue at all.

  • update terminology of hdk_graph_helpers methods for clarity and consistency once abstractions stabilise:
    • change record_helpers to entry_helpers
    • change link_helpers to index_helpers
    • change rpc_helpers to remote_index_helpers
      • move the raw RPC method out of this module into standalone rpc_helpers
  • review Newtype alias implementation in type_aliases.rs, ensure it has zero runtime performance impact, most ergonomic possible development experience & correct usage documentation
  • optimise record helpers (eg. hdk_graph_helpers::records::create_record) to take payloads by reference
  • ensure handling of vector types is most efficient & flexible possible; pass as slices if possible, avoid cloning data
  • link helpers should return ZomeApiResult<Vec<(A, ZomeApiResult<R>)>>, not ZomeApiResult<Vec<(A, Option<R>)>>
  • try_decode_entry (& dependants) don't need to return Options, should return ZomeApiResult<R>
  • get_linked_addresses_as_type & get_linked_remote_addresses_as_type need to take refs
  • make link_entries_bidir handle errors nicely- we want to be able to more safely link to entries which don't exist (should be a no-op with returned Err)
  • return errors instead of discarding with get_links_and_load_type
  • pass response errors through from read_from_zome without discarding the inner error (investigate error chaining?)

Record CRUD API gateway

  • wire up newtypes to records & through zome API gateway (vf_core/src/type_aliases.rs)
  • create trait to enforce consistency in construct_response call signature (potentially via partial application of first 2 params)
  • change call signature for all receive_query_ API calls to consistently take parameters by name (as struct, not fn args- see satisfactions API as example)
  • move default_false serde field filler out of record classes into hdk_graph_helpers
  • check that all remaining record fields for 3 main flow classes (event, intent & commitment) are done

Best practise

  • improve error handling around complex operations:
    • For critical writes (eg. the base entry for an 'entry + links' record structure), handlers should panic before attempting any other operations.
    • For parallel writes (for example, linking a set of EconomicEvents as fulfillments of a Commitment), handlers should pass back ZomeApiResult objects and leave the context handling to the caller. This would usually be done towards the end of the callback handler by partitioning out all the Err responses from sub-operations and returning them in the API response object. You could also do things like pass the errors into some bridged "error logging" Holochain DHT.
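
As a rough illustration of the parallel-write handling described in the last point (a plain Result standing in for ZomeApiResult):

type ZomeApiResult<T> = Result<T, String>;

/// Partition sub-operation results so the handler can return successes alongside
/// any errors in its API response object (or forward errors to an error-logging DHT).
fn partition_write_results<T>(results: Vec<ZomeApiResult<T>>) -> (Vec<T>, Vec<String>) {
    let mut ok = Vec::new();
    let mut errs = Vec::new();
    for result in results {
        match result {
            Ok(value) => ok.push(value),
            Err(e) => errs.push(e),
        }
    }
    (ok, errs)
}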

Implement EconomicResource deletion

Following on from #53.

This is going to require figuring out how this should work. Speccing out "delete" functionality may play into it. There is also a failing test in resource_links.js which needs to be uncommented once the correct logic has been sorted out.

So, how should this work? Deleting the associated event deletes the resource as well if it's the only attached event? Or should there be a separate API entrypoint for deleting resources separately to events?
