GithubHelp home page GithubHelp logo

humlab-sead / sead_change_control Goto Github PK

View Code? Open in Web Editor NEW
1.0 2.0 0.0 120.47 MB

Sane SEAD change control using Sqitch.

PLpgSQL 98.13% Shell 1.66% Batchfile 0.02% Makefile 0.15% Python 0.04%
sead visead sqitch schema-changes database

sead_change_control's People

Contributors

johanvonboer avatar roger-mahler avatar

Stargazers

 avatar

Watchers

 avatar  avatar

sead_change_control's Issues

20200203_DML_FACET_UI_ELEMENTS_UPDATE: Correct Filter group and item titles

Correct filter visible elements as follows:

measured_values - measurements
MS -Magnetic sus.
LOI - Loss on Ignition
PT? - Phosphates

space_time - space/time
Site - Sites
Country - Countries
Sample group - Sample groups

Taxa - Taxon

Reorder filter groups:
space/time
ecology
measurements
taxonomy

Reorder items within space/time filter
Master datasets
Countries
Sites
Sample groups
Time periods
Geochronology

20190101_DDL_FACET_SCHEMA: New facet "Master datasets"

A new facet with the following JSON specification:

{
	"facet_id": 38,
	"facet_code": "dataset_master",
	"display_title": "Master datasets",
	"description": "Master datasets",
	"facet_group_id": "2",
	"facet_type_id": 1,
	"category_id_expr": "tbl_dataset_masters.master_set_id ",
	"category_name_expr": "tbl_dataset_masters.master_name",
	"sort_expr": "tbl_dataset_masters.master_name",
	"is_applicable": true,
	"is_default": false,
	"aggregate_type": "count",
	"aggregate_title": "Number of samples",
	"aggregate_facet_code": "result_facet",
	"tables": [
    	{
    		"sequence_id": 1,
    		"table_name": "tbl_dataset_masters",
    		"udf_call_arguments": null,
    		"alias":  null
    	}
    ],
	"clauses": [  ]
}

New facet "Dataset methods"

A new facet with the following specification:

{
	"facet_id": 39,
	"facet_code": "dataset_methods",
	"display_title": "Dataset methods",
	"description": "Dataset methods",
	"facet_group_id": "2",
	"facet_type_id": 1,
	"category_id_expr": "tbl_methods.method_id ",
	"category_name_expr": "tbl_methods.method_name",
	"sort_expr": "tbl_methods.method_name",
	"is_applicable": true,
	"is_default": false,
	"aggregate_type": "count",
	"aggregate_title": "Number of datasets",
	"aggregate_facet_code": "result_facet_dataset",
	"tables": [
    	{
    		"sequence_id": 1,
    		"table_name": "tbl_dataset_masters",
    		"udf_call_arguments": null,
    		"alias":  null
    	},
    	{
    		"sequence_id": 2,
    		"table_name": "tbl_datasets",
    		"udf_call_arguments": null,
    		"alias":  null
    	}
    ],
	"clauses": [  ]
}

New result facet that targets datasets

A new result facet that target datasets is added using the following JSON specification:

{
   "facet_id": 40,
   "facet_code": "result_facet_dataset",
   "display_title": "Datasets",
   "description": "Datasets",
   "facet_group_id":"99",
   "facet_type_id": 1,
   "category_id_expr": "tbl_datasets.dataset_id",
   "category_name_expr": "tbl_datasets.dataset_name",
   "sort_expr": "tbl_datasets.dataset_name",
   "is_applicable": false,
   "is_default": false,
   "aggregate_type": "count",
   "aggregate_title": "Number of datasets",
   "aggregate_facet_code": null,
   "tables": [
   {
   	"sequence_id": 1,
   	"table_name": "tbl_datasets",
   	"udf_call_arguments": null,
   	"alias":  null
   } ],
   "clauses": [  ]
}

20190503_DDL_TAXA_ATTRIBUTE_TYPE_LENGTH SqlException

Import fails with exception "ERROR: value too long for type character varying(25)" when values are inserted into tbl_taxa_measured_attributes, column attribute_type.

Sample values that exceeeds 25 characters:

  • Length larva final (10th) instar
  • Length (brachypterous females)
  • Length female brachypterous
  • Length female macropterous
  • Length, head-end of elytra

20200203_DML_METHOD_UPDATE: Update characters in method description

20200203_DML_METHOD_UPDATE

Method description needs editing to remove linebreak between 'use.' and 'See' as it causes problem when exporting data.

"Swedish Rikets Nät (National Grid) system. Full name "RT 90 2.5 gon V 0:-15". X = south-north, Y = west-east. Essentially superceded by SWEREF 99 although still in extensive use.
See http://www.lantmateriet.se/templates/LMV_Page.aspx?id=4766&lang=EN (NOTE: include URL as biblio link)"

Integration of temperature proxy data

Timeline needs temperature data in the database. I have created the table tbl_temperatures for this purpose, but it should be integrated into the sead change control so that it carries over with different database versions. The name is also wrong since according to Phil this is not actually temperature data but only a proxy for it. Perhaps this data should also be inserted into an existing table somewhere rather than having it's own? If you don't mind I'll leave it up to you @visead and @roger-mahler to figure out where/how to best integrate this data into the SEAD database in a more proper manner.

Attaching SQL for creating the table and inserting the data.

tbl_temperatures.zip

Consolidate public schema.

[ x ] Setup sqitch project

Compare the different SEAD versions for schema changes, excluding bugs-related data.
Create a DDL change (using sqitch) for each identified difference.

tbl_measured_value_dimensions and tbl_analysis_entity_dimensions are empty

These two tables are currently empty and are not in use. The tables seem to be meant for another layer of storing information about subsamples and their measured dimensions e.g. before and after burning weights for LOI analysis.

Although I think this function seem to exist indirectly in other populated tables, but I might be cofusing things.

Since they aren't in use and you can get sample information form other tables I would suggest looking into perhaps removing them

Deploy of submission failed: "Cannot change to directory ./general"

Errour output from sqitch command:

Cannot change to directory ./general: No such file or directory

Trace begun at /bin/../lib/perl5/App/Sqitch.pm line 279
App::Sqitch::_parse_core_opts('App::Sqitch', 'ARRAY(0x558dc7e04d30)') called at /bin/../lib/perl5/App/Sqitch.pm line 182
App::Sqitch::go('App::Sqitch') called at /bin/sqitch line 17

Raä-IDs

Should be next to the site name in the site filter, but requires back-end support

20170101_DDL_BIBLIO_REFACTOR_MODEL: Biblio refactoring MAL data transfer

tbl_biblio is being simplified and existing MAL data in this and related tables need to be merged into the simplified fields. The order of fields to be concatenated will depend on the publication type - e.g. book vs journal article.

create string for title and full_reference

Phil will create the list for each type
e.g.
Publication_type = journal then
tbl_biblio.title = tbl_biblio.title & " " tbl...

for publication_type_id
9
5
4
6
12
26
23
22

Isotope: Remaining errors in Isotope data

Remaining errors in Isotope data:

  • Reference to missing method_id:
    • method_id 23 in tbl_isotope_measurement
    • method_id 5 in tbl_sample_groups
  • Reference to missing sample_type_id:
    • method_id 15 in tbl_physical_samples

SEAD @2019.12: Release checklist

Deploy new SEAD database

Note! This checklist _only_deploys the database. Relevant systems need to be deployed desperately and in conjunction.

Prerequisites

  • Docker
  • Git

Description

This will be the first deploy of SEAD database using the new change control system. The starting point of the deployment is an SQL dump of sead_master_9 stored as zipped file sead_master_9_public.sql.gz in folder starting_point in the sead_change_control repository.

The overall deployment process is as follows:

  1. Prepare deploy
  2. Create new staging
  3. Import pending submissions
  4. Deploy submissions
  5. Import bugs data
  6. Switch production database
  7. Deploy relevant systems

Prepare deploy

Make sure you have an up-to-date version of the SEAD repositories sead_change_control and sead_clearinghouse*.

Run git clone if you haven't checked out the repository before, or update the code using git pull

% mkdir -p ~/source && cd ~/source
% git clone https://github.com/humlab-sead/sead_change_control.git
% git clone https://github.com/humlab-sead/sead_clearinghouse.git
% git clone https://github.com/humlab-sead/sead_clearinghouse_import.git

Freeze deployment by adding tag @2019.12 to all Sqitch plans (for each subfolder xyz) to be included in the release (this can also be done manually by editing the .plan files):

% sqitch tag --tag @2019.12 --plan-file ./xyz/sqitch.plan --note "2019 December release"

Add tag @2019.12 to all relevant repositories.

Create new staging

$ ./bin/deploy_staging --user humlab_admin --target-db-name sead_staging --create-database --source-type dump --on-conflict drop --deploy-to-tag @2019.12

Import pending submissions

The following workflow imports and commits submissions that have been prepared in Excel files i.e. Ceramics, Dendrochronology and Isotope data. The flow assumes that relevant updates to the SEAD database model have been committed and all lookup data reference by the new data submissions has been inserted (separate change requests).

Checkout or update systems sead_clearinghouse and sead_clearinghouse_import.

  • Prepare submission data (Excel) files and copy them to sead_clearinghouse_import/data/input.
  • Import data to SEAD Clearinghouse System using sead_clearinghouse_import. Edit options in ~/sorce/sead_clearinghouse_import/runner.py i.e. pecify name of input files and data types.
  • Verify imported data using SEAD Clearinghouse.
  • Commit & deploy submission data to SEAD Change Control System
  • Deploy submission change request to SEAD staging

Run the following commands to import the data to Clearinghouse:

$ cd ~/source/sead_clearinghouse_import
$ pipenv shell
$ vi runner.py # edit run options
$ SEAD_CH_PASSWORD=qwerty python runner.py

Store staging phase 1 database:

create database sead_staging_phase_2 owner sead_master template sead_staging;

Deploy submissions to SEAD Change Control system:

$ cd ~/source/sead_clearinghouse/transport_system$ .
$ ./deploy_submission.sh --dbname=sead_staging --id=1 --force --add-change-request
$ ./deploy_submission.sh --dbname=sead_staging --id=2 --force --add-change-request
$ ./deploy_submission.sh --dbname=sead_staging --id=3 --force --add-change-request
$ ./deploy_submission.sh --dbname=sead_staging --id=4 --force --add-change-request

Store staging phase 2 database:

create database sead_staging_phase_2 owner sead_master template sead_staging;

Deploy submissions

$ sqitch deploy --target staging --mode change --no-verify -C ./submissions 20191220_DML_SUBMISSION_CERAMICS_001_COMMIT
$ sqitch deploy --target staging --mode change --no-verify -C ./submissions 20191220_DML_SUBMISSION_DENDRO_BUILDING_002_COMMIT
$ sqitch deploy --target staging --mode change --no-verify -C ./submissions 20191220_DML_SUBMISSION_DENDRO_ARCHEOLOGY_003_COMMIT
$ sqitch --target staging --mode change --no-verify -C ./submissions 20191220_DML_SUBMISSION_ISOTOPE_004_COMMIT

Store staging phase 3 database:

create database sead_staging_phase_3 owner sead_master template sead_staging;

Import bugs data

$ cd ~/source/sead_bugs_import || cd ~/source && git clone https://github.com/humlab-sead/sead_bugs_import
$ cd ~/source/sead_bugs_import && git pull
$ mvn -Dmaven.test.skip=true clean
$ mvn -Dmaven.test.skip=true package
$ vi config/application.properties # set sead_staging as target database
$ java -jar target/bugs.import-0.1-SNAPSHOT.jar --file=./bugsdata/bugsdata_20190503.mdb  > bugsimport_20191220.log 2>&1 &
$ Execute (not yet created post_bugs_import.sh) or
select bugs_import.post_import_updates()

Create Bugs import change request:

% cd ~/source/sead_change_control/
% sqitch add --change-name 20191221_DML_SUBMISSION_BUGS_20190303_COMMIT --note "Initial Bugs Import" --chdir ./submissions
% cd source/sead_change_control/submissions/deploy/20191221_DML_SUBMISSION_BUGS_20190303_COMMIT
% __navicat public schema diff pre/post bugs > public_data_diff.sql__
% pg_dump --data-only --blobs -d sead_staging --schema bugs_import -h seadserv.humlab.umu.se -F p -U humlab_admin -f bugs_import_schema.sql
% gzip bugs_import_schema.sql
% gzip public_data_diff.sql

Switch production database

-- alter database sead_production rename sead_yyyymm;
create database sead_production owner sead_master template sead_staging;

Deploy relevant systems

Rebuild SEAD Query API:

% cd ~/source/sead_query_api/
% # wget https://github.com/humlab-sead/..././docker-build.sh && chmod +x ./docker-build.sh
% vi conf/appsettings.Production.json # edit run settings
% docker-compose down
% ./docker-build.sh

Start API

% docker-compose up -d

20200315_DML_ECOCODE_ADD_SYSTEM_4: Anolds & van der Maarel (plants)

tbl.ecocode_sytem.name 'Anolds & van der Maarel (plants)' was not transferred from sead master 8 during migration. This needs doing, including code assignations to taxa and reference for system.
System provides ecology codes for plants in the Netherlands in environments as classified for biodiversity surveys. Part of Ida Lundberg's Magister project.

20190101_DDL_FACET_SCHEMA: Region facet request

Can we have a "Region" facet under the space-time group?

Contents would be something like (filtering sites?):

select location_name, site_id, location_type_id
from tbl_locations
inner join tbl_site_locations on tbl_locations.location_id=tbl_site_locations.location_id
where tbl_locations.location_type_id = 2 or tbl_locations.location_type_id = 7 or tbl_locations.location_type_id = 14 or tbl_locations.location_type_id = 16 or tbl_locations.location_type_id = 18
order by location_name

Use case: A point of entry for all sites in Västerbotten linked from https://sparfran10000ar.se/

Wrong record_type for some methods?

Methods relating to geoarchaeology might have been marked with the wrong record_type. They have been marked as "Non-biological taxa" (record_type_id 10) instead of "Soil chemistry/property" (record_type_id 12), you can see the mixup when searching in the webbrowser and sorting the sites based on their record types in the table-view

Only samples containing soil chemistry seem to have this error

20191213_BUGS_IMPORT_POST_APPLY: Bugs Sample codes need to be stored as alt refs in SEAD

In BugsCEP, unique identifier for a sample is TSample.SampleCODE.
This is stored in bugs_trace during import but not stored in the main SEAD database.
SampleCODE needs storing in SEAD as:
tbl_sample_alt_refs.alt_ref with type "Database identifier" (new alt_ref_type)

This is needed for tracing samples through SEAD to Bugs, for example when Francesca creates a matrix of all insect samples.

20191125_DML_CERAMICS_LOOKUP: Ceramics lookup data

CH import system has guards that prevent import of new lookup data via Clearinghouse. The ceramics data import files were created before these guards were activated.

The lookup data are added to SEAD via a change request via the change control system, and the Excel is updated with the new SEAD system identities returned by the inserts.

20200203_DML_ADD_YEAR_TYPES: Populate tbl_years_types

Populate tbl_years_type as follows (values from Bugs plus an extra 'Unknown' - which maps to nulls and '?' in Bugs import)

years_type_id name description date_updated
  Calendar Calendar years  
  C14 Radiocarbon years  
  Radiometric Radiometric years (but not C14)  
  Unknown Unknown years type (either to be defined or unspecified in source)  

Update all MAL-datasets that is NULL to ID 2

Erik mail May 5, 2018:

I have troubles with the datasets in sead_master_9. There are a bunch in there without master_set_id. I don’t know if these should be there or not?

Should we have some kind of collection of all the lookup data somewhere so we can include it into a datastore separately?`

Filter has corrupt name

Filter with facetCode "tbl_denormalized_measured_values_37" has a DisplayTitle of "P┬░", which is obviously due to some sort of encoding translation error.

Dendrochronology record type needed

Dendrochronology data needs a record type to enable some query and filter functionality.

tbl_record_types.record_name = 'Dendrochronology'
tbl_record_types.record_name = SEE ROGER'S QUERY

Update tbl_methods.record_type_id to match the new record_type

20200210_DML_MAL_QOS_UPDATE_01: MAL dataset Quality Assurance (part #1)

64 MAL sites in tbl_sites has missing or erroneous coordinates when reviewing SEAD master 9. These sites are somewhat misplaced with some being found in water and others remaining in the general area of their correct location. Of these 64 sites only 53 were corrected due to some sites currently missing metadata or lack coordinate information.

tbl_sites should be corrected with the following list below which has correct coordinates,

site_id altitude latitude_dd longitude_dd national_site_identifier site_description site_name site_preservation_status_id date_updated
57 15 56.8882889 12.5898222 Skrea Raä 162   Skrea Raä 162   2013-05-16 10:39:58.584644+02
78   55.5672222226 12.9844444 MHM12753; MHM12879   Citytunnelprojektet delområde 5   2013-11-13 15:33:32.218+01
84   55.5638888889 12.9794444 MHM12878; MHM12752   Citytunnelprojektet delområde 4   2013-11-13 15:33:32.218+01
85   68.6027777778 19.8483333     Guomojávrrit   2013-11-13 15:33:32.218+01
90   57.4808833337 12.6857750 Örby 98:1   Örby Raä 98   2013-11-13 15:33:32.218+01
97   70.4894444441 29.2500000     Oarddojávri   2013-11-13 15:33:32.218+01
100   70.3221555559 25.1738639     Ruksesbákti, Indre Sandvik   2013-11-13 15:33:32.218+01
105   55.5611111111 12.9716667 MHM 12877   Citytunnelprojektet delområde 3   2013-11-13 15:33:32.218+01
111   61.2238444448 11.4983500     Melvold   2013-11-13 15:33:32.218+01
119   59.1138888889 9.7980556 136604/136599   Sundsaasen 2   2013-11-13 15:33:32.218+01
121   61.2758333337 11.6069444     Tiertjern   2013-11-13 15:33:32.218+01
122   55.9519444444 11.8752778 NFH A2039   Hilleröd NFH A2039   2013-11-13 15:33:32.218+01
123   70.6983333330 23.6280556     Skjærvika   2013-11-13 15:33:32.218+01
124   58.4809444448 13.4798472 Ledsjö 153:1   Ledsjö Raä 153   2013-11-13 15:33:32.218+01
128   55.1130555556 14.8705556 BMR 3054   BMR 3054 Vestre Indlæg   2013-11-13 15:33:32.218+01
132   61.2220416670 11.4927944     Stræten terasse   2013-11-13 15:33:32.218+01
135   70.9619444444 26.6700000     Sværholt   2013-11-13 15:33:32.218+01
139   63.5861750 19.7208000 Nordmaling 524:1   Bothnia line JP 71F1   2013-11-13 15:33:32.218+01
143   61.3736111114 11.5144444     Knubbetjern   2013-11-13 15:33:32.218+01
153   61.3844444441 11.5433333     Knubba   2013-11-13 15:33:32.218+01
159   70.0669444445 27.5586111     Láksjohka   2013-11-13 15:33:32.218+01
163   58.1368305552 12.9879639     Västerbitterna 3:17   2013-11-13 15:33:32.218+01
165   61.2266527781 11.4976694     Rødstranda   2013-11-13 15:33:32.218+01
176   63.5844444441 19.5839944     Bothnia line JP 71C1/71:24   2013-11-13 15:33:32.218+01
178   60.5190944448 15.4388000     Yttre Medväga   2013-11-13 15:33:32.218+01
186 36 63.8449999997 20.1447222 Umeå socken 229   Prästsjödiket Umeå   2013-11-13 15:33:32.218+01
193   63.7231278 20.1977083     Bothnia line JP 73G   2013-11-13 15:33:32.218+01
203   63.6594167 19.9786472     Bothnia line JP 72E   2013-11-13 15:33:32.218+01
209   63.2226500 18.0569333     Bothnia line jp 31J1   2013-11-13 15:33:32.218+01
213   59.3803250 15.6283583     Alväng, Götlunda   2013-11-13 15:33:32.218+01
241   58.3846361108 13.8942833 Skövde 158:1   Skövde Raä 158   2013-11-13 15:33:32.218+01
243   61.3833333330 11.5288889     Deset Knubben   2013-11-13 15:33:32.218+01
254 20 63.3916194 19.2470694 Grundsunda 30:1   Grundsunda Raä 30   2013-11-13 15:33:32.218+01
257   55.5647222222 12.9733333 MHM 12751   Citytunnelprojektet delområde Hotelltomten   2014-02-19 15:28:44.312+01
258   59.5261111 18.9977778     Ägglösen   2014-02-19 15:28:44.312+01
268   59.5927222219 16.4485111     Västra Skälby   2014-02-19 15:28:44.312+01
288 44 66.0204306 22.6298389 Töre 405:2   Töre 405:2   2014-02-19 15:28:44.312+01
289   59.5347222 18.5644444     Norra Småholmen   2014-02-19 15:28:44.312+01
291   65.9296833337 22.8561528 Töre 510   Kosjärv   2014-02-19 15:28:44.312+01
292 50 66.0250167 22.6166667 Töre 408:1   Töre 408:1   2014-02-19 15:28:44.312+01
294   65.9783861114 19.1512028     Åssjiejávrátje   2014-02-19 15:28:44.312+01
305   59.6830555559 18.9266667     Fåröarna   2014-02-19 15:28:44.312+01
320   59.5152777778 18.7777778     Bergskärit   2014-02-19 15:28:44.312+01
328 25 59.6732250 17.8628361 Odensala 402:1   Odensala Raä 402   2014-02-19 15:28:44.312+01
337 15 59.6973222 18.4795028 Skederid 190 Settlement site Skederid Raä 190   2014-02-19 15:28:44.312+01
345   61.1331666670 16.9196917     Själstuga   2014-02-19 15:28:44.312+01
350   59.5775000 18.9636111     Stora Halmören   2014-02-19 15:28:44.312+01
354 5 58.7519722 17.0113527778 Nyköping 231:1 Stadslager, ca 1150x1000 m (N-S), medeltida och yngre, i stadscentrum. Åkroken   2014-02-19 15:28:44.312+01
358   59.6833333330 18.9552778     Marskärskobben   2014-02-19 15:28:44.312+01
373   62.5760694448 12.5677750 Härjedalen 169   Hedningsgärdet   2014-02-19 15:28:44.312+01
374   59.5655555556 18.9725000     Stora Träskär   2014-02-19 15:28:44.312+01
384 45 66.0217556 22.6267111     Töre 422   2014-02-19 15:28:44.312+01
386   64.9952583 21.3126556     Tåme   2014-02-19 15:28:44.312+01

Inconsistent sqitch.plan prevents deploy

Manual changes (delete) in plan of deployed changes prevents deploy.
Changes:
[x] Make system forward only for now (remove reverts)
[x] Make all changes idempotent
[x] Do a sqitch rebase - does a revert and deploy in sequence

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.