cafincubator / midden
A research metadata catalog and metadata editor that integrates into common workflows used in academic research.
License: Creative Commons Zero v1.0 Universal
To reproduce:
Go to "Catalog", click magnifying glass to preview dataset, close, click any other magnifying glass to preview second dataset
Error:
crit: Microsoft.AspNetCore.Components.WebAssembly.Rendering.WebAssemblyRenderer[100]
Unhandled exception rendering component: Map container is already initialized.
Error: Map container is already initialized.
at i._initContainer (https://unpkg.com/[email protected]/dist/leaflet.js:5:37578)
at initialize (https://unpkg.com/[email protected]/dist/leaflet.js:5:26026)
at new i (https://unpkg.com/[email protected]/dist/leaflet.js:5:2616)
at Object.t.map (https://unpkg.com/[email protected]/dist/leaflet.js:5:141663)
at Module.create (https://meta.cafltar.org/geojsonMap.js:23:24)
at https://meta.cafltar.org/_framework/blazor.webassembly.js:1:3942
at new Promise (<anonymous>)
at Object.beginInvokeJSFromDotNet (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:3908)
at Object.w [as invokeJSFromDotNet] (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:64218)
at _mono_wasm_invoke_js_blazor (https://meta.cafltar.org/_framework/dotnet.5.0.9.js:1:190800)
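A possible fix, sketched in TypeScript on the assumption that geojsonMap.js creates a new Leaflet map each time a preview opens: remember the live map per container id and call its remove() method (which Leaflet maps provide) before creating another in the same container. MapLike, liveMaps, and createOrReplaceMap are hypothetical names for illustration, not the actual Midden code.

```typescript
// Minimal shape needed from a map object; Leaflet's map instances have remove().
interface MapLike {
  remove(): unknown;
}

// Live map per container element id (hypothetical bookkeeping).
const liveMaps: Record<string, MapLike | undefined> = {};

function createOrReplaceMap<T extends MapLike>(
  containerId: string,
  factory: (id: string) => T, // e.g. (id) => L.map(id) when Leaflet is loaded
): T {
  const existing = liveMaps[containerId];
  if (existing !== undefined) {
    existing.remove(); // release the container before reusing it
    delete liveMaps[containerId];
  }
  const created = factory(containerId);
  liveMaps[containerId] = created;
  return created;
}
```

An alternative would be to dispose of the map when the preview modal closes; the sketch above only guards the create path.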
Currently, this wiki page is a stub: https://github.com/CafIncubator/Midden/wiki/Using-the-Midden-Editor (updated to here: https://github.com/CafIncubator/Midden/wiki/Using-the-Midden-Dataset-Editor)
These instructions should be completed in order to support creation of .midden files.
This will also aid in creating the context mentioned in #77
I need to clean up some of the architecture. Right now some "pages" handle no logic, and all the logic, event handling, etc. occurs in the components. For example, the Catalog page has a CatalogViewer. The CatalogViewer handles event updates to State and such things. This forced me to create a second component, ProjectCatalogViewer, which is very similar to the CatalogViewer. But then I have a MetadataView page that handles the event logic and passes a Metadata to a MetadataDetails component.
I'm not being consistent with the way I implement logic.
I think a better way to deal with this might be for the page to handle updating state and pass that along to the component. This will also allow the CatalogViewer and VariableViewer to reference the same underlying List<Metadata> instead of each having to create a subset of the list (in the case of Project-specific or Zone-specific pages).
Proposal: Rewrite CatalogViewer and VariableViewer to only iterate on a List<Metadata> (the variable viewer will still need to create a List<CatalogVariableViewerViewModel> from the List<Metadata>).
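The split proposed above could look roughly like this. It is a TypeScript sketch of the idea rather than the C# components themselves: the page decides which datasets are in scope, and the viewers only iterate over the list they are handed. The field names on Metadata and the VariableRow shape are assumptions made for illustration.

```typescript
// Hypothetical shapes; the real Metadata class has more fields.
interface Variable { name: string; description: string; units: string; }
interface Metadata { name: string; zone: string; project: string; variables: Variable[]; }
interface VariableRow { variable: Variable; datasetName: string; }

// Page-level logic: choose the datasets in scope (all, by zone, by project, or both).
function filterCatalog(all: Metadata[], zone?: string, project?: string): Metadata[] {
  return all.filter(
    (m) =>
      (zone === undefined || m.zone === zone) &&
      (project === undefined || m.project === project),
  );
}

// Component-level logic for the variable viewer: flatten whatever list it receives.
function toVariableRows(datasets: Metadata[]): VariableRow[] {
  const rows: VariableRow[] = [];
  for (const m of datasets) {
    for (const v of m.variables) {
      rows.push({ variable: v, datasetName: m.name });
    }
  }
  return rows;
}
```

With this shape, a project-specific page and the full catalog page would share one viewer and differ only in the arguments passed to the filter.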
These are tasks / suggestions that were discussed during a presentation of how Projects were implemented in the latest dev build:
catalog/projects
)
datasetCount
in catalog.json that is populated by the collatorcatalog
)
Currently it's difficult to read pre-generated tags in the Editor.
In the metadata / dataset viewer, it's not clear what zone the data belong to. Zone (and project) is only indicated through the breadcrumb path. It would be good to put more emphasis on zone (and maybe project) when displaying datasets.
Add feature to allow viewers to filter datasets by area of interest or other spatial filters (distance from a point?).
Look into NetTopologySuite: https://github.com/NetTopologySuite/NetTopologySuite
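For the "distance from a point" case, a rough sketch of the filter is below. It uses the haversine great-circle formula directly and is not NetTopologySuite. It also assumes each dataset can be reduced to a single representative point (for example, a centroid of its areaOfInterest); the location property and function names are hypothetical.

```typescript
interface LatLon { lat: number; lon: number; }

const EARTH_RADIUS_KM = 6371; // mean Earth radius

// Great-circle distance between two points, in kilometers (haversine formula).
function haversineKm(a: LatLon, b: LatLon): number {
  const toRad = (deg: number): number => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) * Math.sin(dLat / 2) +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) *
      Math.sin(dLon / 2) * Math.sin(dLon / 2);
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Keeps only the items whose representative point lies within maxKm of center.
function withinDistance<T extends { location: LatLon }>(
  items: T[],
  center: LatLon,
  maxKm: number,
): T[] {
  return items.filter((item) => haversineKm(item.location, center) <= maxKm);
}
```

Polygon-based filters (intersects, contains) would need a real geometry library, which is where NetTopologySuite would come in.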
Create map visualization of geojson specified in the areaOfInterest field.
Probably use leaflet. Two options to explore:
We need a deep philosophical/metaphysical/existential discussion on pages, links, and nav.
Midden used to be just datasets but now there are tags, projects, zones, etc. It makes some sense to list these elements but this complicates navigation.
In the 0.2-dev.2 build there are catalog pages for the above elements. But things get weird fast. For example:
catalog/dataset lists all datasets for all zones. catalog/zones lists all the data zones. If you follow a zone link it goes to catalog/zones/{specific-zone}, which lists all the datasets in that zone. But shouldn't this be the dataset catalog, just filtered by zone? Something like catalog/datasets/zones/{specific-zone}? But if we have that, then what does catalog/datasets/zones list? All data for all zones? That's the same as catalog/datasets!
Lots of breaking changes but lots of improvements. See latest version here: https://github.com/ant-design-blazor/ant-design-blazor/releases
The Home page in the web app needs some love.
Consider:
The subtitle is "Filtered project". It should be "Filtered by project".
An error occurs when loading the catalog.json into the web Catalog when using a "" in the item name attribute in a midden file. Looks like the editor can create it, but when I tried to see the item in the catalog, I was getting an error (not able to open the item).
The Midden version is currently only seen in the Editor. Either including it in the header or the footer should work so that it can be visible in Insights, Catalog, and homepage.
Make keywords clickable so you can quickly see other datasets/metadata associated with that keyword. Would also be handy if you can assign projects as keywords on datasets, linking multiple projects to one dataset.
Reproduce:
In Catalog, click the preview button, then click the "View Page" button.
The following is the error message:
blazor.webassembly.js:1
crit: Microsoft.AspNetCore.Components.WebAssembly.Rendering.WebAssemblyRenderer[100]
Unhandled exception rendering component: Cannot read properties of null (reading 'removeChild')
TypeError: Cannot read properties of null (reading 'removeChild')
at e (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:10331)
at e (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:10303)
at Object.e [as removeLogicalChild] (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:10303)
at e.applyEdits (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:33040)
at e.updateComponent (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:32271)
at Object.t.renderBatch (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:12134)
at Object.window.Blazor._internal.renderBatch (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:61913)
at Object.w [as invokeJSFromDotNet] (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:64435)
at _mono_wasm_invoke_js_blazor (https://meta.cafltar.org/_framework/dotnet.5.0.9.js:1:190800)
at wasm_invoke_iiiiii (wasm://wasm/00aba242:wasm-function[5611]:0xdda7f)
Microsoft.JSInterop.JSException: Cannot read properties of null (reading 'removeChild')
TypeError: Cannot read properties of null (reading 'removeChild')
at e (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:10331)
at e (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:10303)
at Object.e [as removeLogicalChild] (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:10303)
at e.applyEdits (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:33040)
at e.updateComponent (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:32271)
at Object.t.renderBatch (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:12134)
at Object.window.Blazor._internal.renderBatch (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:61913)
at Object.w [as invokeJSFromDotNet] (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:64435)
at _mono_wasm_invoke_js_blazor (https://meta.cafltar.org/_framework/dotnet.5.0.9.js:1:190800)
at wasm_invoke_iiiiii (wasm://wasm/00aba242:wasm-function[5611]:0xdda7f)
at Microsoft.JSInterop.WebAssembly.WebAssemblyJSRuntime.InvokeUnmarshalled[Int32,RenderBatch,Object,Object](String identifier, Int32 arg0, RenderBatch arg1, Object arg2, Int64 targetInstanceId)
at Microsoft.JSInterop.WebAssembly.WebAssemblyJSRuntime.InvokeUnmarshalled[Int32,RenderBatch,Object](String identifier, Int32 arg0, RenderBatch arg1)
at Microsoft.AspNetCore.Components.WebAssembly.Rendering.WebAssemblyRenderer.UpdateDisplayAsync(RenderBatch& batch)
at Microsoft.AspNetCore.Components.RenderTree.Renderer.ProcessRenderQueue()
Likely an issue with modal component?
A lot of cleanup can be done with how sub-catalogs (aka filtered catalogs) are handled.
No need to show 'em; hide 'em!
Currently, the dataset catalog shows a "path" where it's "{ZoneName} / {ProjectName}". The ZoneName links to a dataset catalog filtered by zone. The ProjectName links to a dataset catalog filtered by ZoneName AND ProjectName.
It would be more useful, and more intuitive, if the ProjectName links to a dataset catalog filtered ONLY by project.
To make this more intuitive, the formatting of the ZoneName and ProjectName as a "path" should be removed.
app-config.json gets cached so configuration updates are not seen until the browser cache is cleared.
This was a problem with catalog.json as well. It was solved by adding a "?{guid}" at the end of the download path. See CatalogReaderHttp.cs.
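The same trick could be applied to app-config.json. The sketch below is in TypeScript rather than the C# of CatalogReaderHttp.cs, and the function name and the "v" parameter name are assumptions. It appends a throwaway token so the browser treats each request as a new URL.

```typescript
// Appends a unique query parameter so cached copies of the file are bypassed.
// The default token is time-based; a GUID would serve the same purpose.
function withCacheBuster(path: string, token: string = Date.now().toString(36)): string {
  const separator = path.indexOf("?") === -1 ? "?" : "&";
  return path + separator + "v=" + encodeURIComponent(token);
}
```

For example, withCacheBuster("app-config.json", "abc") yields "app-config.json?v=abc". The trade-off is that the file is never served from cache, which is acceptable for a small config file but worth reconsidering for larger downloads.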
Currently, there is no context provided for the various input fields within the editor. Consider adding an icon that shows a tooltip or popup when hovered or clicked. This tooltip/popup should give a description of the field and, possibly, simple instructions.
More instruction is needed for users of Midden. Should consider using the Wiki for:
This may be messy....
Issues arise when WASM is cached in the browser; the updated client does not get loaded and catalog.json can be outdated. Need to investigate ways to check things like the assembly version for the client and the last-updated time for catalog.json against the cache. Or, maybe, simply disable the cache for now?
See:
Midden gives the following error when loading any page:
Error while trying to use the following icon from the Manifest: https://meta.cafltar.org/icon-512.png (Download error or resource isn't a valid image)
Likely a residual reference to the removed image.
Currently, there is no readme. One should:
Update 'app-config.json' to be less CAF-specific and expand on fields relating to "best practices"; e.g. ISO standard tags.
Input like description, methods, derivedWorks could be highly enhanced if basic markdown support was enabled. Support for bullets, links, and code blocks will greatly aid in readability.
Explore:
Some datasets are updated periodically such as timeseries data or drone flights. It would be useful for those browsing the metadata to know when datasets have been updated.
The crawlers could read file metadata and determine when files in the dataset folder were last updated.
One issue is that the catalog will only reflect information accurate to when the catalog itself was last updated. This could cause information to be misleading as data could be updated after the catalog was generated.
It is reasonable that a data zone is not necessarily linked 1:1 to the technology of the data store. For example, "raw" data could be in Azure Data Lake, Google Drive, or an FTP server. Similarly, "scratch" data could be in Dropbox, Drive, OneDrive, etc. It's probably best practice to have a 1:1 relationship, but discussion should occur on whether or not Midden should enforce that.
Consider adding a prefix to the "datasetPath" variable. e.g. "GoogleWorkspaceSharedDrive//relative/path/to/dataset".
In earlier versions of Midden, the number of fields displayed/required depended on the data zone the metadata was for. Now that Midden allows custom data zones, it's difficult to assign required fields (unless this is customizable in the app-config.json file, but that would be a beast to deal with). Now all metadata fields are displayed at all times. This is overwhelming and goes against one of the tenets of Midden (which is basically: get some metadata, with a low barrier, even if it's just a one-sentence description).
As a (temporary?) fix, implement a "complex view" toggle that shows all fields. The default view just shows name, zone, project, description, contact, variables (which only includes name, description, units).
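The toggle could be as small as a field whitelist. Below is a TypeScript sketch of the idea; the identifiers are hypothetical and the field keys simply mirror the list above. The per-variable restriction (name, description, units) would need a similar whitelist of its own.

```typescript
// Fields shown in the default (simple) view, taken from the list above.
const SIMPLE_VIEW_FIELDS: string[] = [
  "name", "zone", "project", "description", "contact", "variables",
];

// Returns the fields the editor should render for the current view mode.
function visibleFields(allFields: string[], complexView: boolean): string[] {
  if (complexView) return allFields;
  return allFields.filter((field) => SIMPLE_VIEW_FIELDS.indexOf(field) !== -1);
}
```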
Some goofiness leaked into the repo over time. Fix 'em:
Midden needs a killer favicon instead of the default
Locally cache catalog.json and check remote file when it was last updated before downloading
Creating a metadata file using the Editor could take significant time. A feature to use local storage to save changes is essential to prevent data/time loss.
Potentially:
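For the draft-saving feature, a minimal sketch in TypeScript is below. It is written against a small storage interface that the browser's window.localStorage satisfies, so it can be exercised without a browser. In a Blazor WebAssembly app this would presumably be reached through JS interop or a wrapper library, which is not shown. The key name and function names are hypothetical.

```typescript
// Subset of the Web Storage API; window.localStorage has these three methods.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const DRAFT_KEY = "midden-editor-draft"; // hypothetical key name

// Serializes the in-progress metadata so it survives a reload or crash.
function saveDraft<T>(store: KeyValueStore, draft: T): void {
  store.setItem(DRAFT_KEY, JSON.stringify(draft));
}

// Returns the saved draft, or null when none exists or it cannot be parsed.
function loadDraft<T>(store: KeyValueStore): T | null {
  const raw = store.getItem(DRAFT_KEY);
  if (raw === null) return null;
  try {
    return JSON.parse(raw) as T;
  } catch (e) {
    return null; // corrupt draft: ignore it rather than break the editor
  }
}

// Call after a successful save/export so stale drafts are not offered again.
function clearDraft(store: KeyValueStore): void {
  store.removeItem(DRAFT_KEY);
}
```

Saving on every field change (or on a short timer) keeps the loss window small; the draft is untyped JSON, so a schema change between versions would need handling.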
If a field like "derivedFrom" is added to the dataset metadata, then a lineage graph could be constructed
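As a sketch of what that could enable, the TypeScript below walks "derivedFrom" links to list every upstream dataset. It assumes the field holds the names of parent datasets; that shape, and the DatasetNode and ancestorsOf names, are hypothetical since the field does not exist yet.

```typescript
// derivedFrom is the proposed field; here it is assumed to list parent dataset names.
interface DatasetNode { name: string; derivedFrom: string[]; }

// Returns every upstream dataset reachable through derivedFrom links.
function ancestorsOf(name: string, datasets: DatasetNode[]): string[] {
  const parents: Record<string, string[] | undefined> = {};
  for (const d of datasets) parents[d.name] = d.derivedFrom;

  const seen: string[] = [];
  const stack: string[] = (parents[name] ?? []).slice();
  while (stack.length > 0) {
    const current = stack.pop() as string;
    if (seen.indexOf(current) !== -1) continue; // already visited; also guards against cycles
    seen.push(current);
    for (const p of parents[current] ?? []) stack.push(p);
  }
  return seen;
}
```

The same parent map could feed a graph-drawing component; only the traversal is shown here.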
A Dashboard page needs to be created. This will:
Currently, the .midden metadata schema has fields that were hand-picked from various metadata standards (ISO 19115, Project Open Data) and chosen as a result of decisions made internally within USDA LTAR. The fields were named using a self-defined naming scheme. This decision was made to simplify the creation of metadata by the researchers and also for convenience of development.
Care was taken to ensure that the metadata were at least mostly compatible for export to ISO 19115, Project Open Data, and EPA guidelines. However, there are advantages of adopting a single standard instead of using a combination of them (and thus no standard). This should be discussed.
Explain how versioning works in the README
It's fairly common for data dictionaries to already be defined within spreadsheets or as separate CSV files. We should allow users to upload data dictionaries instead of making them reproduce the work through the editor.
Functionality should allow various formats. At a minimum, should support the CAF default; csv file with FieldName, Description, Units columns.
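A minimal sketch of reading the CAF default format is below, in TypeScript. It assumes a header row containing exactly the FieldName, Description, and Units column names, handles double-quoted cells, and does not handle quoted cells that span multiple lines. The VariableEntry shape and function names are hypothetical.

```typescript
interface VariableEntry { name: string; description: string; units: string; }

// Splits one CSV line, honoring double-quoted cells and "" as an escaped quote.
function splitCsvLine(line: string): string[] {
  const cells: string[] = [];
  let cell = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (inQuotes) {
      if (ch === '"' && line.charAt(i + 1) === '"') { cell += '"'; i++; }
      else if (ch === '"') inQuotes = false;
      else cell += ch;
    } else if (ch === '"') inQuotes = true;
    else if (ch === ",") { cells.push(cell); cell = ""; }
    else cell += ch;
  }
  cells.push(cell);
  return cells;
}

// Parses a data dictionary with FieldName, Description, and Units columns.
function parseDataDictionary(csv: string): VariableEntry[] {
  const lines = csv.split(/\r?\n/).filter((l) => l.trim() !== "");
  const headerLine = lines.shift();
  if (headerLine === undefined) return [];
  const header = splitCsvLine(headerLine).map((h) => h.trim());
  const iName = header.indexOf("FieldName");
  const iDesc = header.indexOf("Description");
  const iUnits = header.indexOf("Units");
  if (iName < 0 || iDesc < 0 || iUnits < 0) {
    throw new Error("Expected FieldName, Description, and Units columns");
  }
  return lines.map((line) => {
    const cells = splitCsvLine(line);
    return {
      name: (cells[iName] ?? "").trim(),
      description: (cells[iDesc] ?? "").trim(),
      units: (cells[iUnits] ?? "").trim(),
    };
  });
}
```

Supporting other formats would mostly mean mapping different header names onto the same three fields.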
Currently the editor does not validate user input. It should.
The collate command of the CLI should support Azure Files.
Give that team some cred!
Set Quality Control and Processing to Unknown when loading a data dictionary. Alternatively, investigate why Processing does not appear to be empty
Projects in Midden are essential to organization yet there is no metadata to specify project information (purpose, members, start date, end date, whatever). Consider a metadata file to specify project information.
This will allow an "overview" of projects within an organization and better inform potential datasets, grouped methods, results, maybe documents (manuscripts? related literature?)
This will require a separate metadata file extension? Or a top-level specification in a midden file on whether it's a dataset or project. Maybe a new aggregator?
Currently, the entire catalog.json file is loaded into memory and persists throughout the app lifecycle. This isn't sustainable for large catalogs.
At some point, optimization steps should be taken to ease the computational burden. Consider paging and/or streaming.
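The paging arithmetic itself is simple, as the TypeScript sketch below shows with hypothetical names. On its own it only limits what is rendered; reducing memory would also require the catalog to be split or served in slices (for example, by the collator or a server), which is an assumption beyond what is described here.

```typescript
interface Page<T> { items: T[]; page: number; pageCount: number; }

// Returns one page of items; page numbers are 1-based and clamped to the valid range.
function paginate<T>(all: T[], page: number, pageSize: number): Page<T> {
  if (pageSize < 1) throw new Error("pageSize must be at least 1");
  const pageCount = Math.max(1, Math.ceil(all.length / pageSize));
  const clamped = Math.min(Math.max(1, page), pageCount);
  const start = (clamped - 1) * pageSize;
  return { items: all.slice(start, start + pageSize), page: clamped, pageCount };
}
```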
Create a private function that returns file objects (including parent info) instead of IDs for Drive, and use it instead of getFilesnames. Change getFilesnames to return the actual names instead of IDs.
Currently, there are no easy ways to view project information while viewing dataset details. There is no link that goes to the project page (clicking the breadcrumbs goes to a dataset view for the specific zone and project).
Consider adding a link (with respective icons) for the zone and project that the dataset belongs to. Pretty much exactly how the dataset card shows it. Probably under the title.
Could also have a pop-up that displays project info?