jncc / datahub

The JNCC datahub - our online web repository of open data and publications.

C# 11.08% CSS 59.04% JavaScript 19.41% Dockerfile 0.09% HTML 9.49% Python 0.90%

datahub's People

Contributors

cathyjinjncc, completer, dependabot[bot], felixmasonjncc, jonparsonsjncc, mattdebont, shenavall73

Forkers

uk-gov-mirror

datahub's Issues

Asset page design

There are several different types of resources or datasets:

  • Open Data datasets (with or without a DOI)
  • Publications
  • Scientific documents
  • ... what else?

Elastic Search PoC

Update the Lambda function to upsert the record into an Elastic Search index.
This includes creating an Elastic Search instance.
Check that basic full-text search works.
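
As a rough illustration of the upsert and the full-text check, the sketch below only builds the HTTP requests and does not send them. It assumes the typeless `_update` endpoint with `doc_as_upsert` (older Elasticsearch versions include a mapping type in the path); the function names, index name and record shape are hypothetical, not taken from the Datahub code.

```python
import json
from urllib.parse import quote


def build_upsert_request(index, record_id, record):
    """Return (method, path, body) for an upsert of one record.

    With doc_as_upsert, the document is created if the id does not exist
    yet and merged into the existing document otherwise.
    """
    path = f"/{quote(index, safe='')}/_update/{quote(record_id, safe='')}"
    body = json.dumps({"doc": record, "doc_as_upsert": True})
    return "POST", path, body


def build_search_request(index, text):
    """Return (method, path, body) for a basic full-text search."""
    path = f"/{quote(index, safe='')}/_search"
    body = json.dumps({"query": {"simple_query_string": {"query": text}}})
    return "POST", path, body
```

The Lambda function would send these with whatever HTTP client it already uses. Keeping request construction separate from sending makes the PoC easy to check without a running instance.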

Automate deployment

Matt has a new deployment mechanism to replace Felix's - but it needs to be automated.

The full deployment steps are now in readme.md, but the process itself is still manual.

Which object store can we use?

Questions:

  • Can we migrate from data.jncc.gov.uk to S3 later without changing the URL?

These questions are now less relevant:

  • Can we use Defra ongoing?
  • Is there an easier way to extract things from glacier?

Home page design

The home page is a search page.

  • Search box
  • Count of datasets (and other statistics?)
  • Display latest datasets

Create Topcat data export json blob [PoC]

...in the format from the catalog/inventory, and filter. Note: tweaking is needed for multiple resources that are not viewable in the UI. Include a DOI record (type doi into the search to bring it up).

Asset data structure design

Design notes / discussion for the main Datahub Asset class.

Common Standards Monitoring

  • What are CSM collections for?
  • Are CSM data assets reports? Currently they're 'series' - how are genuine series going to be handled by the data structure?

General design

  • Data Format is currently missing

Keyword-based search

For starters, we just need single-keyword search, reachable both by URL and from a keyword on the asset page.
Adjust the search page to show that it's a keyword search.
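
One possible shape for this, as a sketch only: read a single keyword from the search page URL and turn it into an exact-match query. The `keyword` query parameter and the `keywords` field name are assumptions for illustration, and a `term` query only matches exactly if that field is indexed as a keyword type.

```python
from urllib.parse import urlsplit, parse_qs


def keyword_from_url(url, param="keyword"):
    """Return the first value of the keyword parameter, or None if absent."""
    values = parse_qs(urlsplit(url).query).get(param, [])
    keyword = values[0].strip() if values else ""
    return keyword or None


def keyword_query(keyword, field="keywords"):
    """Build an exact-match (term) query body for a single keyword."""
    return {"query": {"term": {field: keyword}}}
```

A keyword link on the asset page would then just be a link to the search page with that parameter set, so both entry points share one code path.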

How can we get the data.gov.uk URL of a dataset?

It would be nice to have a link to the data.gov.uk page, if it exists, on our resource / asset landing page.

This could be tricky. The current plan is to (logically at least) publish to the Datahub first, then DGU.

  • Is the data.gov.uk URL conventional or calculable?

Needs investigation. Could fundamentally affect the publishing mechanism design.

Short Abstracts have excessive whitespace and redundant 'Show more' / 'Show less'

The 'Show more' / 'Show less' implementation was a fairly quick CSS-only solution for long abstracts in the proof-of-concept. It doesn't work very well if the abstract is shorter than 10 lines: the 'Show more' button always shows, and there's excessive whitespace.

This is a definite problem for some records with shorter abstracts.

GDPR - extract email addresses from MNR database for Tetrienne

Tetrienne and MNR staff are sending existing users a copy of their privacy policy as part of GDPR readiness.

They need a list of email addresses to send the message to, covering both individual and organisational emails.

Please extract these from the DB as CSV / Excel and send to Tet.

How / should we make full Elastic Search capabilities available?

How should we (and should we at all) expose this in the Search page URL, so that e.g. website pages can link to it?

Need to understand more about how Elastic Search works, whether this is possible, and then whether this is desirable.

The alternative is to have our own search parameters much like Topcat.

Dynamo/Lambda PoC

Make an HTTP API which accepts a JSON record and upserts it into a Dynamo table.
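
A minimal sketch of such a handler, assuming an API Gateway proxy-style event with a JSON string in `body` and a table keyed on an `id` attribute (both are assumptions, not confirmed by the issue). The table is passed in so the handler can be exercised without AWS; in a real deployment it would be a boto3 DynamoDB `Table` resource.

```python
import json
from decimal import Decimal


def make_handler(table):
    """Build a Lambda-style handler that upserts JSON records into `table`.

    `table` is any object with a put_item(Item=...) method, such as a
    boto3 DynamoDB Table resource.
    """
    def handler(event, context=None):
        try:
            # boto3 rejects Python floats for DynamoDB, so parse them as Decimal.
            record = json.loads(event.get("body") or "", parse_float=Decimal)
        except ValueError:
            return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}
        if not isinstance(record, dict) or not record.get("id"):
            return {"statusCode": 400, "body": json.dumps({"error": "missing id"})}
        # put_item replaces any existing item with the same key: an upsert.
        table.put_item(Item=record)
        return {"statusCode": 200, "body": json.dumps({"id": record["id"]})}

    return handler
```

Posting the same `id` twice leaves one item holding the later version of the record, which is the upsert behaviour the PoC asks for.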

Search

Implement the Search page.

Productionise Datahub Backend search scripts

Complete the implementation of a command line tool for Creating, Populating and Deleting index entries.

  • Command line app
  • Ingest PDF examples
  • Working both locally and with signed requests on AWS
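
The command line surface might look something like the sketch below. The program name, subcommand names, arguments and default endpoint are all hypothetical; request signing for AWS is deliberately left out and would sit behind the `--endpoint` choice.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per index operation."""
    parser = argparse.ArgumentParser(prog="datahub-index")
    parser.add_argument(
        "--endpoint",
        default="http://localhost:9200",
        help="search endpoint; defaults to a local instance",
    )
    sub = parser.add_subparsers(dest="command", required=True)

    create = sub.add_parser("create", help="create an index")
    create.add_argument("index")

    populate = sub.add_parser("populate", help="ingest files into an index")
    populate.add_argument("index")
    populate.add_argument("files", nargs="+", help="e.g. PDF examples")

    delete = sub.add_parser("delete", help="delete entries from an index")
    delete.add_argument("index")
    delete.add_argument("ids", nargs="+")

    return parser


def main(argv=None):
    """Parse arguments and report what would run (placeholder dispatch)."""
    args = build_parser().parse_args(argv)
    print(f"{args.command} on {args.index} via {args.endpoint}")
    return args
```

For example, `main(["populate", "datahub", "a.pdf", "b.pdf"])` parses to the `populate` command with two files against the local default endpoint.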
