dzhw / metadatamanagement

Metadatamanagement (MDM) - Data Search for Higher Education Research and Science Studies

License: GNU Affero General Public License v3.0

Java 52.78% HTML 1.07% CSS 0.01% JavaScript 40.40% Shell 0.07% Python 0.03% RobotFramework 2.69% Batchfile 0.01% Dockerfile 0.02% HCL 0.48% Smarty 0.21% SCSS 1.92% TypeScript 0.32%

spring-boot angularjs angular-material grunt maven elasticsearch mongodb saucelabs robotframework docker-compose

metadatamanagement's Introduction

Metadatamanagement (MDM)

The MDM holds the metadata of the data packages which are available in our Research Data Center FDZ. It enables researchers to browse our data packages before signing a contract for using the data.

Developing the MDM system

Please check out the development branch before starting to code, then create a new branch named after your username followed by the issue number of the backlog item you will be working on:

git checkout development
git checkout -b rreitmann/issue1234

Before you can build this project, you must install and configure the following dependencies on your machine:

Java: You need to install the Java 15 SDK on your system. On Ubuntu you should use SDKMAN! (sdk install java 15.0.2.hs-adpt)
Maven: You need to install Maven 3.6.1 or above on your system. On Ubuntu you should use SDKMAN! (sdk install maven)
Node.js: Node.js 16 and npm (bundled with Node.js) are required as well. On Ubuntu you should install Node.js using NVM (nvm install v16)

On Windows, patch.exe has to exist in the PATH. It is distributed as part of git bash, or can be downloaded manually from GnuWin32.
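
As a quick sanity check before building, the prerequisites above can be probed from a script. The following is only a minimal sketch: it assumes the tools are exposed as the executables java, mvn, node and npm, it checks presence on the PATH but not versions, and the function name missing_tools is made up for illustration.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executable names assumed for the prerequisites listed above.
# On Windows you may additionally want to check for "patch".
REQUIRED_TOOLS = ["java", "mvn", "node", "npm"]


def missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools were found on the PATH.")
```

The lookup function is passed in as a parameter so the check can be exercised without touching the real PATH.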

Running on your local machine

Make sure that you have read-write-access on the data directory (in your project directory) for Elasticsearch and MongoDB. Specifically Mac users need to run the following command to create all data directories before bringing up the containers for the first time:

mkdir -p data/elasticsearch/data data/mongodb/db data/mongodb/logs

Otherwise your Docker Host will attempt to change permissions on the directories and fail.

Use docker-compose up to create all containers initially. MongoDB and Elasticsearch will be listening on their default ports. MailDev will show all locally sent email on port 8081, and the identity provider can be set up on port 8082. Any time after that, use either docker-compose up or docker-compose start.

In case Elasticsearch does not start successfully, you might need to increase its memory limit (mem_limit: 512m), e.g. to 1024m. This change requires removing and rebuilding the container.

You can get a MongoDB dump and restore it locally:

$ wget https://metadatamanagement-public.s3.eu-central-1.amazonaws.com/20220926_metadatamanagement_e2e.zip
$ unzip 20220926_metadatamanagement_e2e.zip
$ mv dump/metadatamanagement data/mongodb/db/
$ docker exec -it mongodb bash
mongo$ cd /data/mongodb/db
mongo$ mongorestore ./metadatamanagement --db=metadatamanagement
mongo$ exit
$ rm -r dump

You will need to set up your ~/.m2/settings.xml so that Maven can download a dependency from GitHub:

 <settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0
                        http://maven.apache.org/xsd/settings-1.0.0.xsd">
    <servers>
      <server>
        <id>github</id>
        <username>${GITHUB_USERNAME}</username>
        <password>${GITHUB_TOKEN}</password>
      </server>
    </servers>
  </settings>

Run mvn first to start the Spring backend and to make sure the frontend Angular constants module has been generated by the Maven Plugin. Run npm --prefix mdm-frontend start to start the Angular Frontend.

mvn clean install -f maven-plugin/pom.xml
mvn spring-boot:run

In order for all external services to work on your local machine, you need to set the following variables in application-local.yml:

dara:
    endpoint: "https://labs.da-ra.de/dara/"
    username: {see s3://metadatamanagement-private/sensitive_variables.tf}
    password: {see s3://metadatamanagement-private/sensitive_variables.tf}

If you run the backend on your machine for the first time, or you have restored a MongoDB dump, you need to set up/reindex the Elasticsearch indices. To do so, log in to the application as admin, go to Administration on the left, navigate to External Services and then click the red Reindex button for the Elasticsearch service. Reindexing can take up to 1 hour.

If you want to build a docker image for the metadatamanagement server app you can run

mvn deploy

This image can be run with all its dependent containers by

docker-compose -f docker-compose.yml -f docker-compose-app.yml up -d --build

Building for the dev environment

Our CI pipeline runs automatic checks and tests, and it optimizes the metadatamanagement client for the dev environment. To be sure you won't fail the build, run the following before pushing to GitHub:

mvn -Pdev clean verify

This will concatenate and minify CSS and JavaScript files using grunt. It will also modify the index.html so it references these new files.

We test our project continuously with the Robot Framework. Test Developers can get further info here.

Authentication

When an analysis package or data package is released with version >=1.0.0, the user can optionally post a message about the release on X (formerly Twitter).
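
The version condition above can be illustrated with a small comparison. This is only a sketch, assuming versions are plain major.minor.patch strings; the function names are hypothetical and do not come from the MDM code base.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split 'major.minor.patch' into integers (assumed version format)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_announceable(version: str) -> bool:
    """True when the release version is at least 1.0.0."""
    # Tuples compare element by element, so (0, 9, 12) < (1, 0, 0).
    return parse_version(version) >= (1, 0, 0)
```

Comparing integer tuples rather than raw strings avoids surprises with multi-digit components such as 0.10.0.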

To set this up, you need an X Developer Account (Free Access Level) and your project's API credentials: the consumer key and the consumer secret. Be aware that the current Free Access Level is limited to 50 tweets/24h; 1,500 tweets/month; 1 environment; 1 project.

Make your consumer key and consumer secret available to the application.yml of the current stage through sensitive_variables.tf, just like other highly sensitive data.

[application.yml]
...
tweet:
  consumerkey: ${vcap.services.tweet.credentials.consumerkey}
  consumersecret: ${vcap.services.tweet.credentials.consumersecret}
  oauthtoken: ${vcap.services.tweet.credentials.oauthtoken}
  oauthtokensecret: ${vcap.services.tweet.credentials.oauthtokensecret}
  ...

Create your oauthtoken and oauthtokensecret by following the three steps of the Postman Twitter examples: Twitter OAuth 1.0a flow test.

step oauth/request_token:

Execute the request with your consumer key and consumer secret from your Developer Account. An OAUTH_TOKEN_FROM_STEP1 and OAUTH_TOKEN_SECRET_FROM_STEP1 will be returned.

step oauth/authorize:

Visit https://api.twitter.com/oauth/authorize?oauth_token={OAUTH_TOKEN_FROM_STEP1}&oauth_token_secret={OAUTH_TOKEN_SECRET_FROM_STEP1}&oauth_callback_confirmed=true with OAUTH_TOKEN_FROM_STEP1 and OAUTH_TOKEN_SECRET_FROM_STEP1 from the first step, and authenticate your app.

After being redirected to X, open the network tab of your browser's developer tools and copy the value of oauth_token as OAUTH_TOKEN_FROM_STEP2 and the value of oauth_verifier as OAUTH_VERIFIER_FROM_STEP2 from this GET request:

GET 'http://twitter.com/?oauth_token={OAUTH_TOKEN_FROM_STEP2}&oauth_verifier={OAUTH_VERIFIER_FROM_STEP2}'
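
Instead of copying the two values by hand, they can be read from the redirect URL. The following is a minimal sketch using only the Python standard library; the function name extract_step2_values is hypothetical, and it assumes the redirect URL has the query parameters shown in the GET request of this step.

```python
from typing import Tuple
from urllib.parse import parse_qs, urlparse


def extract_step2_values(redirect_url: str) -> Tuple[str, str]:
    """Return (oauth_token, oauth_verifier) from the redirect URL of step 2."""
    query = parse_qs(urlparse(redirect_url).query)
    # parse_qs maps each key to a list of values; a missing key raises KeyError.
    return query["oauth_token"][0], query["oauth_verifier"][0]
```

For example, passing the URL copied from the network tab yields the pair to use as OAUTH_TOKEN_FROM_STEP2 and OAUTH_VERIFIER_FROM_STEP2 in the next step.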

step oauth/access_token:

Insert the OAUTH_TOKEN_FROM_STEP2 and OAUTH_VERIFIER_FROM_STEP2 from step 2 into the third request (If you are using Postman like the linked Twitter example, select No Auth instead of OAuth 1.0).

POST 'https://api.twitter.com/oauth/access_token?oauth_token={OAUTH_TOKEN_FROM_STEP2}&oauth_verifier={OAUTH_VERIFIER_FROM_STEP2}'

Add the returned values for oauth_token and oauth_token_secret from step 3 to the sensitive_variables.tf.
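
The mapping from the step 3 response to the configuration keys can be sketched as follows. This assumes the response body is form-encoded (key=value pairs joined by &); the function name credentials_from_response is hypothetical, and the returned keys mirror oauthtoken and oauthtokensecret from the tweet section of application.yml shown earlier.

```python
from typing import Dict
from urllib.parse import parse_qs


def credentials_from_response(body: str) -> Dict[str, str]:
    """Map a form-encoded step 3 response body to the tweet config keys."""
    fields = parse_qs(body)
    # Any additional fields in the response are ignored here.
    return {
        "oauthtoken": fields["oauth_token"][0],
        "oauthtokensecret": fields["oauth_token_secret"][0],
    }
```

The resulting values are secrets and belong in sensitive_variables.tf, not in version control.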

For further details also see Authentication OAuth FAQ.

Big Thanks

Cross-browser Testing Platform and Open Source ❤️ Provided by Sauce Labs

Continuous Integration Platform provided by GitHub Actions

metadatamanagement's Issues

Fulltext search for variables

As any user I want to do fulltext search in all attributes of a variable in order to look at the details afterwards.

Use country flags in navbar

Edit existing variable

As publisher I want to edit existing variables,...

Implement Data Versioning

As publisher I want to create new versions of variables from existing versions.

Use BMBF Logo

As user I want to see the BMBF Logo including a hint.

Suggest search terms as I type

As public user I want to get suggestions for search terms when I type my search query (see google).

Metadata must not be publicly available before published explicitly

As publisher I want to mark variables as publicly available in order to present only consistent variables.

Add summarized statistics to variables

As public user I want to see summarized statistics (e.g. mean value) of a variable.

Show similar variables

As public user I want to find a list of similar variables (variables which have a similar indicator) in order to compare variables to each other.

Import variables from DDI

As publisher I want to import the metadata of variables from the DDI lifecycle format in order to make them available in the system.

Create variable report

As publisher I want to create a variable report according to the FDZ Layout which I can edit with my tools and convert to a PDF (barrier-free) and publish as part of the SUF.

Fill tex template with metadata for variable report

As publisher I want to be able to upload a Tex template. This template will be filled with metadata and returned to the publisher.

Implement health monitoring

As admin I want to get notified if the system is not healthy in order to be able to fix the system as fast as possible.

Quality rating for metadata

As publisher I want to rate the quality of a variable in order to view the ratings as a report.

Add tooltip/popover to pageheader

As Public User I want to get an explanation of what I can do on the page.

Create domain object dataset

As public user I want to create datasets which contain 1..n variables in order to search them later.

User wants to create account

As publisher I want to create an account in order to be able to work as publisher.

Create survey title filter

As public user I want to filter variables by survey title,...

Highlight query in variable search results

As public user I want to see where my search query matches the result,...

Export variables of a dataset as DDI

As publisher I want to export the variables of a dataset as DDI in order to make them available for other institutions.

User needs to login

As Publisher I want to log in in order to be able to delete variables, edit variables, ...

Revoking publishing of variables

As publisher I want to hide variables from the publicly available sites in order to work on inconsistent variables before republishing them.

Ensure independence of unit tests from example data

As developer I want to write tests which do not depend on the example data.

Create CUF order

As public user I want to order a CUF in order to do teaching with it.

Search for attachments like the data set report

As public user I want to search for any kind of attachment stored in our database...

Show picture of question in questionnaire for variable

As public user I want to see a picture of the indicator used to measure the variable in order to get an idea for the quality of the variable.

Automatic test of elasticsearch mapping definitions

As developers we want a unit test which ensures that

there is a mapping file for the english and the german index
all attributes are covered by the mapping file
in order to prevent missing attributes in the elasticsearch index.

Create scale level filter

As Public User I want to filter variables by scale level,...

Create SUF order

As publisher I want to be able to order a SUF in order to be able to do research with it.

Report search query statistics

As publisher I want to view frequency distributions of search queries placed by public users in order to see which documents are more valuable than others.

Add frequency distributions to variables

As public user I want to see frequency distributions of variables.

Create error view when document not found

As public user I want to see an error message when a (for instance internationalized) variable cannot be found.

Implement Pessimistic Offline Lock

As publisher I want to know if another user is currently modifying the variable which I want to modify in order to prevent losing work.

Report documentation state

As publisher I want to view a report about the state of documentation (completeness) of a dataset in order to see the work in progress.

Delete users

As admin I want to delete unused user accounts.

Create variables

As publisher I want to create new variables in order to search them later on.

Add FDZ logo to navbar

Search-as-you-type with pjax

As developers we want to integrate jquery pjax in order to increase cross browser compatibility when doing partial page requests.

Automatic HTML5 W3C validation

As developers we want a unit test which ensures that all pages contain valid HTML5 in order to ensure cross browser compatibility.

Implement data backups

As publisher I want to be sure that persistent data is not lost on system failures.

Create an about us (legal info)

As user I want to see a legal info (impressum).

Add explaining text to bucket sizes (tooltip/popover)

As Public User I want to get an explanation for the numbers which are displayed on the variable search page next to the survey title for instance.

Create filter by survey period

As public user I want to filter variables by survey period,...

Role assignment

As admin I want to assign roles to registered users in order to grant them rights for their work.

Find datasets with similar variables

As public user I want to see the list of datasets with similar variables in order to be able to justify how comparable these variables are.

Too many matches when searching for variables

When searching variables by "allbus ordinal" (https://metadatamanagement.cfapps.io/de/variables/search?query=allbus+ordinal&_surveyTitle=on&_scaleLevel=on&_scaleLevel=on&dateRange.startDate=&dateRange.endDate=) the search returns nominal variables as well.