cafincubator / midden

A research metadata catalog and metadata editor that integrates into common workflows used in academic research.

License: Creative Commons Zero v1.0 Universal

HTML 31.29% CSS 1.61% C# 65.59% JavaScript 1.29% Batchfile 0.22%

academic data data-catalog data-management data-science metadata research research-data-management

midden's Issues

Error when previewing datasets: map container already initialized

To reproduce:

Go to "Catalog", click magnifying glass to preview dataset, close, click any other magnifying glass to preview second dataset

Error:

crit: Microsoft.AspNetCore.Components.WebAssembly.Rendering.WebAssemblyRenderer[100]
      Unhandled exception rendering component: Map container is already initialized.
      Error: Map container is already initialized.
          at i._initContainer (https://unpkg.com/leaflet@<version>/dist/leaflet.js:5:37578)
          at initialize (https://unpkg.com/leaflet@<version>/dist/leaflet.js:5:26026)
          at new i (https://unpkg.com/leaflet@<version>/dist/leaflet.js:5:2616)
          at Object.t.map (https://unpkg.com/leaflet@<version>/dist/leaflet.js:5:141663)
          at Module.create (https://meta.cafltar.org/geojsonMap.js:23:24)
          at https://meta.cafltar.org/_framework/blazor.webassembly.js:1:3942
          at new Promise (<anonymous>)
          at Object.beginInvokeJSFromDotNet (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:3908)
          at Object.w [as invokeJSFromDotNet] (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:64218)
          at _mono_wasm_invoke_js_blazor (https://meta.cafltar.org/_framework/dotnet.5.0.9.js:1:190800)

Finish instructions for using the Editor

Currently, this wiki page is a stub: https://github.com/CafIncubator/Midden/wiki/Using-the-Midden-Editor (updated to here: https://github.com/CafIncubator/Midden/wiki/Using-the-Midden-Dataset-Editor)

These instructions should be completed in order to support creation of .midden files.

This will also aid in creating the context mentioned in #77

Rewrite pages and components to have consistent state and business logic

I need to clean up some of the architecture. Right now some "pages" handle no logic, and all the logic, event handling, etc. occurs in the components. For example, the Catalog page has a CatalogViewer. The CatalogViewer handles event updates to State and such things. This forced me to create a second component, ProjectCatalogViewer, which is very similar to the CatalogViewer. But then I have a MetadataView page that handles the event logic and passes a Metadata to a MetadataDetails component.

I'm not being consistent with the way I implement logic.

I think a better way to deal with this might be for the page to handle updating state and pass that along to the component. This will also allow the CatalogViewer and VariableViewer to reference the same underlying List<Metadata> instead of each having to create a subset of the list (in the case of Project-specific or Zone-specific pages).

Proposal: Rewrite CatalogViewer and VariableViewer to only iterate on a List<Metadata> (the variable viewer will still need to create a List<CatalogVariableViewerViewModel> from the List<Metadata>)

Enhance UI for Projects

These are tasks / suggestions that were discussed during a presentation of how Projects were implemented in the latest dev build:

Projects page (catalog/projects)
- Render projects with markdown instead of simple text. Also set a max height and either use a scroll bar or a "more..." link to expand
- Show the number of datasets per project; don't worry about performance until it becomes an issue. Can add a field like datasetCount in catalog.json that is populated by the collator
Strip whitespace when assigning the relationship between projects and datasets via LINQ to avoid input issues
Insights page
- Show datasets without projects and/or projects without datasets to highlight potential issues
Catalog page (catalog)
- Remove tabs for Datasets, Variables, Projects. Instead, make them sub-items, each with its own URL; this makes the back button work better
Don't use .mippen; use .md files instead (this has been addressed between that meeting and this note)
Create a project editor that is a markdown editor and allows downloading and loading of the project file (like the metadata editor)
- Add the "Project" editor as a sub-menu to the current "Editor" nav menu. Also add a "Dataset" sub-item under the same parent that links to the current editor
Consider a "Variable" tab in the Project-detail pages, in addition to the datasets that are currently listed
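
The whitespace-stripping and dataset-count suggestions above can be sketched as follows. This is a Python illustration only, not Midden's C#/LINQ code; the `project` key and the `count_datasets_per_project` helper are hypothetical names.

```python
from collections import Counter


def normalize(name):
    """Trim surrounding whitespace so ' Soil Health' matches 'Soil Health '."""
    return (name or "").strip()


def count_datasets_per_project(projects, datasets):
    """Return {project name: dataset count}, matching on trimmed names.

    `projects` is a list of project names; `datasets` is a list of dicts
    with a hypothetical "project" key (assumed, not Midden's schema).
    """
    counts = Counter(normalize(d.get("project")) for d in datasets)
    return {normalize(p): counts[normalize(p)] for p in projects}
```

A value like this could feed a `datasetCount` field written by the collator, so the web app would not need to recompute it on every page load.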

Set size of Tag select component in editor

Currently it's difficult to read pre-generated tags in the Editor.

Emphasize zone more when displaying datasets

In the metadata / dataset viewer, it's not clear what zone the data belong to. Zone (and project) is only indicated through the breadcrumb path. It would be good to put more emphasis on zone (and maybe project) when displaying datasets.

Allow spatial filters

Add feature to allow viewers to filter datasets by area of interest or other spatial filters (distance from a point?).

Look into NetTopologySuite: https://github.com/NetTopologySuite/NetTopologySuite
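
A "distance from a point" filter could look roughly like the sketch below. It is written in Python for illustration and assumes each dataset has been reduced to a single hypothetical `center` coordinate pair (latitude, longitude); real areaOfInterest values are GeoJSON geometries, which a library such as NetTopologySuite would handle more faithfully.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def datasets_within(datasets, lat, lon, radius_km):
    """Keep datasets whose hypothetical "center" lies within radius_km of the point."""
    return [d for d in datasets if haversine_km(lat, lon, *d["center"]) <= radius_km]
```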

Map visualization for area of interest geojson

Create map visualization of geojson specified in the areaOfInterest field.

Probably use leaflet. Two options to explore:

https://github.com/Mehigh17/BlazorLeaflet (more community involvement but seems abandoned)
https://github.com/darnton/LeafletBlazor (seems abandoned as well)

Revisit links and pages

We need a deep philosophical/metaphysical/existential discussion on pages, links, and nav.

Midden used to be just datasets but now there are tags, projects, zones, etc. It makes some sense to list these elements but this complicates navigation.

In the 0.2-dev.2 build there are catalog pages for the above elements. But things get weird fast. For example:

catalog/datasets lists all datasets for all zones. catalog/zones lists all the data zones. If you follow a zone link it goes to catalog/zones/{specific-zone} that lists all the datasets in that zone. But shouldn't this be the dataset catalog, just filtered by zone? Something like: catalog/datasets/zones/{specific-zone}? But if we have that, then what does catalog/datasets/zones list? All data for all zones? That's the same as catalog/datasets!

Update to latest version of Ant Design Blazor

Lots of breaking changes but lots of improvements. See latest version here: https://github.com/ant-design-blazor/ant-design-blazor/releases

Redesign Home

The Home page in the web app needs some love.

Consider:

Icons/graphics for each link (Insights, Catalog, Editor) instead of just text
Functional insights like "Recent Updates" to show new datasets
Graphics/logos?
Help documents? Introduction material (what is Midden?)?

Typo in /catalog/projects/{ProjectName}

The subtitle is "Filtered project". It should be "Filtered by project".

Reevaluate how zones, projects, datasets are displayed

Catalog: Put dataset name first in card title, make it a link to the metadata page; remove "View" button
- Consider adding zone and project id in card body
- Use tag style for this? Or put in Tag section? Probably put up top, above description, as tags

Error when using a "\" in the item name attribute

An error occurs when loading catalog.json into the web Catalog if a "\" is used in the item name attribute in a midden file. It looks like the editor can create such a file, but when I tried to see the item in the catalog, I got an error (not able to open the item).
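
The cause has not been confirmed in this issue, but one plausible explanation is a backslash that reaches catalog.json without being escaped. The Python sketch below only illustrates that general JSON behaviour; it is not Midden code, and `try_parse` is a hypothetical helper.

```python
import json

# JSON text with a raw backslash: "\d" is not a valid JSON escape sequence.
raw_text = r'{"name": "raw\data"}'


def try_parse(text):
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


# A JSON serializer escapes the backslash, so the value survives a round trip.
escaped_text = json.dumps({"name": "raw\\data"})
```

If the file is written by string concatenation rather than by a serializer, the first case can occur; checking how the name is written to the .midden file and to catalog.json would be a reasonable first step.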

Include version in all web app pages

The Midden version is currently only seen in the Editor. Either including it in the header or the footer should work so that it can be visible in Insights, Catalog, and homepage.

Clickable Keywords

Make keywords clickable so you can quickly see other datasets/metadata associated with that keyword. Would also be handy if you can assign projects as keywords on datasets, linking multiple projects to one dataset.

Error: Unhandled exception rendering component: Cannot read properties of null (reading 'removeChild')

Reproduce:

In Catalog, click the preview button, then click the "View Page" button.

The following is the error message:

blazor.webassembly.js:1 
        
       crit: Microsoft.AspNetCore.Components.WebAssembly.Rendering.WebAssemblyRenderer[100]
      Unhandled exception rendering component: Cannot read properties of null (reading 'removeChild')
      TypeError: Cannot read properties of null (reading 'removeChild')
          at e (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:10331)
          at e (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:10303)
          at Object.e [as removeLogicalChild] (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:10303)
          at e.applyEdits (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:33040)
          at e.updateComponent (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:32271)
          at Object.t.renderBatch (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:12134)
          at Object.window.Blazor._internal.renderBatch (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:61913)
          at Object.w [as invokeJSFromDotNet] (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:64435)
          at _mono_wasm_invoke_js_blazor (https://meta.cafltar.org/_framework/dotnet.5.0.9.js:1:190800)
          at wasm_invoke_iiiiii (wasm://wasm/00aba242:wasm-function[5611]:0xdda7f)
Microsoft.JSInterop.JSException: Cannot read properties of null (reading 'removeChild')
TypeError: Cannot read properties of null (reading 'removeChild')
    at e (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:10331)
    at e (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:10303)
    at Object.e [as removeLogicalChild] (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:10303)
    at e.applyEdits (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:33040)
    at e.updateComponent (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:32271)
    at Object.t.renderBatch (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:12134)
    at Object.window.Blazor._internal.renderBatch (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:61913)
    at Object.w [as invokeJSFromDotNet] (https://meta.cafltar.org/_framework/blazor.webassembly.js:1:64435)
    at _mono_wasm_invoke_js_blazor (https://meta.cafltar.org/_framework/dotnet.5.0.9.js:1:190800)
    at wasm_invoke_iiiiii (wasm://wasm/00aba242:wasm-function[5611]:0xdda7f)
   at Microsoft.JSInterop.WebAssembly.WebAssemblyJSRuntime.InvokeUnmarshalled[Int32,RenderBatch,Object,Object](String identifier, Int32 arg0, RenderBatch arg1, Object arg2, Int64 targetInstanceId)
   at Microsoft.JSInterop.WebAssembly.WebAssemblyJSRuntime.InvokeUnmarshalled[Int32,RenderBatch,Object](String identifier, Int32 arg0, RenderBatch arg1)
   at Microsoft.AspNetCore.Components.WebAssembly.Rendering.WebAssemblyRenderer.UpdateDisplayAsync(RenderBatch& batch)
   at Microsoft.AspNetCore.Components.RenderTree.Renderer.ProcessRenderQueue()

Likely an issue with modal component?

Update to .NET 6

Handling of sub/filtered catalogs should be improved

A lot of cleanup can be done with how sub-catalogs (aka filtered catalogs) are handled.

URLs should look more like a REST API; /catalog/{zoneName} should be something like /catalog/zones/{zoneName}
- Issue #71 is related: the URL should be something like /tags/{tagName}
Page header can be improved. "Work Zone Catalog" seems awkward. Maybe something like: "Catalog filtered by zone: Work". Probably better options than this.
There's a lot of code copy/pasted for these filtered catalogs; see ZoneCatalogMetadataViewer, ProjectCatalogMetadataViewer, CatalogMetadataViewer (and soon TagMetadataViewer). Figure out a way to unify this. Create a single component with parameters for what's being filtered? Pass a LINQ function to the component?
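
The "pass a function to the component" idea can be sketched as a single viewer that accepts a predicate. This is a Python illustration of the pattern under assumed field names (`zone`, `project`, `tags`); in the actual Blazor app the equivalent would likely be a component parameter of type `Func<Metadata, bool>`, which is an assumption rather than existing code.

```python
def filter_catalog(datasets, predicate=None):
    """Return the datasets accepted by `predicate`; no predicate means the full catalog."""
    return [d for d in datasets if predicate is None or predicate(d)]


def by_zone(zone):
    """Predicate factory: datasets in the given zone."""
    return lambda d: d.get("zone") == zone


def by_project(project):
    """Predicate factory: datasets in the given project, regardless of zone."""
    return lambda d: d.get("project") == project


def by_tag(tag):
    """Predicate factory: datasets carrying the given tag."""
    return lambda d: tag in d.get("tags", [])
```

With this shape, the zone, project, and tag pages differ only in the predicate they pass, so the copy/pasted viewers could collapse into one.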

Hide void fields in Metadata Details view

No need to show 'em; hide 'em!

Project links in dataset catalog should link to a filtered catalog by just project

Currently, the dataset catalog shows a "path" where it's "{ZoneName} / {ProjectName}". The ZoneName links to a dataset catalog filtered by zone. The ProjectName links to a dataset catalog filtered by ZoneName AND ProjectName.

It would be more useful, and more intuitive, if the ProjectName links to a dataset catalog filtered ONLY by project.

To make this more intuitive, the formatting of the ZoneName and ProjectName as a "path" should be removed.

Stop app-config.json from being cached

app-config.json gets cached so configuration updates are not seen until the browser cache is cleared.

This was a problem with catalog.json as well. It was solved by adding a "?{guid}" at the end of the download path. See CatalogReaderHttp.cs.
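
The "?{guid}" approach described above amounts to appending a unique query string to each request. A minimal Python sketch of the idea follows; `cache_busted_url` is a hypothetical name, and the actual implementation lives in CatalogReaderHttp.cs.

```python
import uuid


def cache_busted_url(path):
    """Append a fresh GUID so the browser treats each request as a new URL."""
    separator = "&" if "?" in path else "?"
    return f"{path}{separator}{uuid.uuid4()}"
```

The trade-off is that the file is never served from cache, which is acceptable for a small configuration file but means every page load fetches it again.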

Provide context to input fields in the Editor

Currently, there is no context provided for the various input fields within the editor. Consider adding an icon that shows a tooltip or popup when hovered or clicked. This tooltip/popup should give a description of the field and, possibly, simple instructions.

Flesh out wiki

More instruction is needed for users of Midden. Should consider using the Wiki for:

Best practices of data organization - Use of data zones, projects, datasets in own directory
Catalog/Editor: Setup/installation - Supported platforms (GitHub Pages, Azure Static Web Sites, self-hosting)
Catalog/Editor: Configuration - Customizing website
CLI: Installation and overview
CLI: Configuration
Example workflow: Use editor to create metadata, download, save to data store, use CLI to collate, update catalog.json, update Midden (if needed)

Move from schema version 0.1.0-alpha4 to 0.1

This may be messy....

Fix issues with caching

Issues arise when the WASM client is cached in the browser: the updated client does not get loaded, and catalog.json can be outdated. Need to investigate ways to check things like the client's assembly version and catalog.json's last-updated time against the cache. Or, maybe, simply disable caching for now?

See:

Error downloading icon-512.png

Midden gives the following error when loading any page:

Error while trying to use the following icon from the Manifest: https://meta.cafltar.org/icon-512.png (Download error or resource isn't a valid image)

Likely a residual reference to the removed image.

Create readme

Currently, there is no readme. One should:

Describe project, scope, license, etc.
Contribution guidelines (so optimistic!)
Explain current features; supported data stores, supported static website hosts
Provide some guidance on setup and use
Roadmap?

Generalize and expand upon app-config.json

Update 'app-config.json' to be less CAF-specific and expand on fields relating to "best practices"; e.g. ISO standard tags.

Support markdown

Input like description, methods, derivedWorks could be highly enhanced if basic markdown support was enabled. Support for bullets, links, and code blocks will greatly aid in readability.

Explore:

Include "LastUpdated" to metadata and display in Catalog

Some datasets are updated periodically such as timeseries data or drone flights. It would be useful for those browsing the metadata to know when datasets have been updated.

The crawlers could read file metadata and determine when files in the dataset folder were last updated.

One issue is that the catalog will only reflect information accurate to when the catalog itself was last updated. This could cause information to be misleading as data could be updated after the catalog was generated.
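
Reading file metadata to derive a "LastUpdated" value might look like the sketch below. It is a Python illustration that assumes a dataset is a local directory; `last_updated` is a hypothetical helper, and cloud data stores would need their own APIs for modification times.

```python
import os
from datetime import datetime, timezone


def last_updated(dataset_dir):
    """Return the newest file modification time under dataset_dir (UTC), or None if empty."""
    newest = None
    for root, _dirs, files in os.walk(dataset_dir):
        for filename in files:
            mtime = os.path.getmtime(os.path.join(root, filename))
            if newest is None or mtime > newest:
                newest = mtime
    return None if newest is None else datetime.fromtimestamp(newest, tz=timezone.utc)
```

Storing the catalog's own generation time next to this value would let the viewer show both, which partly addresses the staleness concern above.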

[CLI] Specify the data store in the path variable

It is reasonable that a data zone is not necessarily linked 1:1 to the technology of the data store. For example, "raw" data could be in Azure Data Lake, Google Drive, or an FTP server. Similarly, "scratch" data could be in Dropbox, Drive, OneDrive, etc. It's probably best practice to have a 1:1 relationship, but there should be a discussion about whether or not Midden should enforce that.

Consider adding a prefix to the "datasetPath" variable. e.g. "GoogleWorkspaceSharedDrive//relative/path/to/dataset".
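
Parsing such a prefixed path could be as simple as splitting on the first "//". The Python sketch below assumes that separator, as in the example above; `split_dataset_path` is a hypothetical helper, and a path that already contains "//" (for example a URL) would need a different delimiter.

```python
def split_dataset_path(dataset_path):
    """Split 'Store//relative/path' into (store, relative_path).

    Returns (None, dataset_path) when no store prefix is present.
    """
    store, separator, relative = dataset_path.partition("//")
    if not separator:
        return None, dataset_path
    return store, relative
```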

[Editor] Create a simplified view

In earlier versions of Midden, the number of fields displayed/required depended on the data zone the metadata was for. Now that Midden allows custom data zones, it's difficult to assign required fields (unless this is customizable in the app-config.json file, but that would be a beast to deal with). Now all metadata fields are displayed at all times. This is overwhelming and goes against one of the tenets of Midden (which is basically: get some metadata, with a low barrier, even if it's just a one-sentence description).

As a (temporary?) fix, implement a "complex view" toggle that shows all fields. The default view just shows name, zone, project, description, contact, variables (which only includes name, description, units).

Fix repo oddities

Some goofiness leaked into the repo over time. Fix 'em:

Remove the Azure static web app yml file (or rename?)
Reset catalog.json to correct "template" version instead of CAF version

Add favicon

Midden needs a killer favicon instead of the default

Optimize tool using local caching

Locally cache catalog.json, and check when the remote file was last updated before downloading it again

[Editor] Allow "saving" of edits using local storage

Creating a metadata file using the Editor could take significant time. A feature to use local storage to save changes is essential to prevent data/time loss.

Potentially:

Enable a "save" button to manually save edits
Wait for OnFieldChanged support (ant-design-blazor/ant-design-blazor#1128 (comment)) and save on each field change

Consider data lineage

If a field like "derivedFrom" is added to the dataset metadata, then a lineage graph could be constructed
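
As a rough illustration of how a lineage graph could be built from such a field, the Python sketch below assumes each dataset record has a `name` and an optional `derivedFrom` list of source dataset names. Both the list form and the helper names are assumptions, since the field does not exist yet.

```python
from collections import defaultdict


def build_lineage(datasets):
    """Map each source dataset name to the sorted names of datasets derived from it."""
    children = defaultdict(list)
    for d in datasets:
        for source in d.get("derivedFrom", []):
            children[source].append(d["name"])
    return {source: sorted(names) for source, names in children.items()}


def descendants(lineage, name):
    """All datasets reachable downstream of `name` (iterative walk, safe against cycles)."""
    seen, stack = set(), list(lineage.get(name, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(lineage.get(current, []))
    return seen
```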

Create Dashboard page

A Dashboard page needs to be created. This will:

Provide summaries of the data catalog using figures and statistics (bar chart of datasets per zone, datasets per project, common tags, and so on)
Provide suggestions, or insights that guide suggestions; e.g. how many datasets do not have any tags?
[Likely much later] Metrics on use; What are most visited datasets?
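
The summaries listed above reduce to a few counting passes over the catalog. The Python sketch below shows the idea with assumed field names (`zone`, `project`, `tags`); it is not the catalog.json schema, and `summarize` is a hypothetical helper.

```python
from collections import Counter


def summarize(datasets):
    """Compute simple dashboard statistics from a list of dataset dicts."""
    return {
        "per_zone": Counter(d.get("zone", "unknown") for d in datasets),
        "per_project": Counter(d.get("project", "unknown") for d in datasets),
        "tag_counts": Counter(tag for d in datasets for tag in d.get("tags", [])),
        "untagged": sum(1 for d in datasets if not d.get("tags")),
    }
```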

Adopt specific metadata standard?

Currently, the .midden metadata schema has fields that were hand-picked from various metadata standards (ISO 19115, Project Open Data) and chosen as a result of decisions made internally within USDA LTAR. The fields were named using a self-defined naming scheme. This decision was made to simplify the creation of metadata by the researchers and also for convenience of development.

Care was taken to ensure that the metadata were at least mostly compatible for export to ISO 19115, Project Open Data, and EPA guidelines. However, there are advantages of adopting a single standard instead of using a combination of them (and thus no standard). This should be discussed.

Specify versioning strategy

Explain how versioning works in the README

Allow uploading of data dictionaries to define variables

It's fairly common for data dictionaries to be already defined within spreadsheets or as separate CSV files. We should allow users to upload data dictionaries instead of making them reproduce the work through the editor.

Functionality should allow various formats. At a minimum, should support the CAF default; csv file with FieldName, Description, Units columns.
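
A reader for the CAF default format could look like the following Python sketch. The column names come from the description above; the output keys (`name`, `description`, `units`) and the `read_data_dictionary` helper are assumptions for illustration, not the editor's actual model.

```python
import csv
import io

REQUIRED_COLUMNS = ("FieldName", "Description", "Units")


def read_data_dictionary(csv_text):
    """Parse a data dictionary CSV into a list of variable dicts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"Missing columns: {', '.join(missing)}")
    return [
        {
            "name": (row["FieldName"] or "").strip(),
            "description": (row["Description"] or "").strip(),
            "units": (row["Units"] or "").strip(),
        }
        for row in reader
    ]
```

Failing early on missing columns gives the user a clear message instead of silently creating variables with empty fields.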

Editor validation

Currently the editor does not validate user input. It should.

Support Azure Files

The collate command of the CLI should support Azure Files.

Add link to Ant Design Blazor in footer

Give that team some cred!

Set default variable meta when loading data dictionary

Set Quality Control and Processing to Unknown when loading a data dictionary. Alternatively, investigate why Processing does not appear to be empty

Consider project-level metadata

Projects in Midden are essential to organization yet there is no metadata to specify project information (purpose, members, start date, end date, whatever). Consider a metadata file to specify project information.

This will allow an "overview" of projects within an organization and better inform potential datasets, grouped methods, results, maybe documents (manuscripts? related literature?)

This will require a separate metadata file extension? Or a top-level specification in a midden file on whether it's a dataset or project. Maybe a new aggregator?

Could also have a pop-up that displays project info?
