humlab-sead / sead_change_control
Sane SEAD change control using Sqitch.
Phil is unsure how pre-treatment etc. is handled, and how data should be selected for range (histogram) filters for geochemical analysis methods.
Correct the visible filter labels as follows:
measured_values - measurements
MS -Magnetic sus.
LOI - Loss on Ignition
PT? - Phosphates
space_time - space/time
Site - Sites
Country - Countries
Sample group - Sample groups
Taxa - Taxon
Reorder filter groups:
space/time
ecology
measurements
taxonomy
Reorder items within space/time filter
Master datasets
Countries
Sites
Sample groups
Time periods
Geochronology
A new facet with the following JSON specification:
{
"facet_id": 38,
"facet_code": "dataset_master",
"display_title": "Master datasets",
"description": "Master datasets",
"facet_group_id": "2",
"facet_type_id": 1,
"category_id_expr": "tbl_dataset_masters.master_set_id ",
"category_name_expr": "tbl_dataset_masters.master_name",
"sort_expr": "tbl_dataset_masters.master_name",
"is_applicable": true,
"is_default": false,
"aggregate_type": "count",
"aggregate_title": "Number of samples",
"aggregate_facet_code": "result_facet",
"tables": [
{
"sequence_id": 1,
"table_name": "tbl_dataset_masters",
"udf_call_arguments": null,
"alias": null
}
],
"clauses": [ ]
}
A new facet with the following specification:
{
"facet_id": 39,
"facet_code": "dataset_methods",
"display_title": "Dataset methods",
"description": "Dataset methods",
"facet_group_id": "2",
"facet_type_id": 1,
"category_id_expr": "tbl_methods.method_id ",
"category_name_expr": "tbl_methods.method_name",
"sort_expr": "tbl_methods.method_name",
"is_applicable": true,
"is_default": false,
"aggregate_type": "count",
"aggregate_title": "Number of datasets",
"aggregate_facet_code": "result_facet_dataset",
"tables": [
{
"sequence_id": 1,
"table_name": "tbl_dataset_masters",
"udf_call_arguments": null,
"alias": null
},
{
"sequence_id": 2,
"table_name": "tbl_datasets",
"udf_call_arguments": null,
"alias": null
}
],
"clauses": [ ]
}
A new result facet that targets datasets is added using the following JSON specification:
{
"facet_id": 40,
"facet_code": "result_facet_dataset",
"display_title": "Datasets",
"description": "Datasets",
"facet_group_id":"99",
"facet_type_id": 1,
"category_id_expr": "tbl_datasets.dataset_id",
"category_name_expr": "tbl_datasets.dataset_name",
"sort_expr": "tbl_datasets.dataset_name",
"is_applicable": false,
"is_default": false,
"aggregate_type": "count",
"aggregate_title": "Number of datasets",
"aggregate_facet_code": null,
"tables": [
{
"sequence_id": 1,
"table_name": "tbl_datasets",
"udf_call_arguments": null,
"alias": null
} ],
"clauses": [ ]
}
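The three specifications refer to each other through aggregate_facet_code (facet 39 points at the new result facet 40). A minimal sketch of a consistency check, assuming the specifications are available as JSON; the function name `missing_aggregate_targets` is hypothetical, and treating "result_facet" as already present in the facet table is an assumption:

```python
import json

def missing_aggregate_targets(facets, existing_codes=()):
    """Return facet codes whose aggregate_facet_code is not defined anywhere."""
    known = {f["facet_code"] for f in facets} | set(existing_codes)
    return [
        f["facet_code"]
        for f in facets
        if f.get("aggregate_facet_code") is not None
        and f["aggregate_facet_code"] not in known
    ]

# Trimmed copies of the three specifications (only the keys the check needs).
specs = json.loads("""
[
  {"facet_id": 38, "facet_code": "dataset_master", "aggregate_facet_code": "result_facet"},
  {"facet_id": 39, "facet_code": "dataset_methods", "aggregate_facet_code": "result_facet_dataset"},
  {"facet_id": 40, "facet_code": "result_facet_dataset", "aggregate_facet_code": null}
]
""")

# Assumption: "result_facet" already exists in the facet table.
print(missing_aggregate_targets(specs, existing_codes=["result_facet"]))
# Without that assumption, dataset_master is reported as dangling.
print(missing_aggregate_targets(specs))
```

A check like this could run before the change request is deployed, so that a facet never points at a result facet that has not been added yet.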
The change requests that add ecocode system 4 are commented out of the SCC plan.
For example, see: http://supersead.humlab.umu.se/site/102
Import fails with exception "ERROR: value too long for type character varying(25)" when values are inserted into tbl_taxa_measured_attributes, column attribute_type.
Sample values that exceed 25 characters:
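A minimal sketch of how offending values could be picked out before an import, assuming the candidate attribute_type values are available as a list of strings. The function name `values_exceeding` and the sample strings are made up for illustration and are not taken from the import data:

```python
def values_exceeding(values, max_length=25):
    """Return the distinct values longer than max_length, longest first."""
    too_long = {v for v in values if v is not None and len(v) > max_length}
    return sorted(too_long, key=lambda v: (-len(v), v))

# Made-up attribute_type values, not taken from the import data.
candidates = ["length", "width of the pronotum at base", None, "length"]
for value in values_exceeding(candidates):
    print(len(value), value)
```

The list it produces shows whether widening the column or shortening the values is the more reasonable fix.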
Add a rudimentary description of SEAD CCS in README.
Update NULL MAL master_set_id in tbl_datasets.
Resolves #100
Method description needs editing to remove the line break between 'use.' and 'See', as it causes problems when exporting data.
"Swedish Rikets Nät (National Grid) system. Full name "RT 90 2.5 gon V 0:-15". X = south-north, Y = west-east. Essentially superceded by SWEREF 99 although still in extensive use.
See http://www.lantmateriet.se/templates/LMV_Page.aspx?id=4766&lang=EN (NOTE: include URL as biblio link)"
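A minimal sketch of the clean-up, assuming the description is handled as a plain string before being written back. The helper name `collapse_line_breaks` is hypothetical:

```python
import re

def collapse_line_breaks(text):
    """Replace each run of line breaks (and surrounding spaces) with one space."""
    return re.sub(r"[ \t]*[\r\n]+[ \t]*", " ", text).strip()

description = (
    "Essentially superceded by SWEREF 99 although still in extensive use.\n"
    "See http://www.lantmateriet.se/templates/LMV_Page.aspx?id=4766&lang=EN"
)
print(collapse_line_breaks(description))
```

The same pattern also handles Windows-style line endings, which matters if the text was originally entered through a desktop client.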
Timeline needs temperature data in the database. I have created the table tbl_temperatures for this purpose, but it should be integrated into the SEAD change control so that it carries over between database versions. The name is also wrong, since according to Phil this is not actually temperature data but only a proxy for it. Perhaps this data should be inserted into an existing table somewhere rather than having its own? If you don't mind, I'll leave it up to you, @visead and @roger-mahler, to figure out where and how best to integrate this data into the SEAD database in a more proper manner.
Attaching SQL for creating the table and inserting the data.
[x] Set up sqitch project
Compare the different SEAD versions for schema changes, excluding bugs-related data.
Create a DDL change (using sqitch) for each identified difference.
These two tables are currently empty and not in use. They seem to be meant as another layer for storing information about subsamples and their measured dimensions, e.g. before and after burning weights for LOI analysis.
I think this function exists indirectly in other populated tables, although I might be confusing things.
Since the tables aren't in use and the sample information is available from other tables, I suggest looking into removing them.
This query should give 0 count as result:
SELECT count(*)
FROM tbl_ecocodes e
JOIN tbl_ecocode_definitions ed USING (ecocode_definition_id)
JOIN tbl_ecocode_groups eg USING (ecocode_group_id)
WHERE ecocode_system_id is null;
On creation of each Bugs dataset, tbl_dataset_submissions entries need to be created automatically.
Submission type as follows:
"submission_type_id";"submission_type";"description"
5;"Compilation into SEAD from another database";"Single dataset from another database submission into SEAD"
Error output from sqitch command:
Cannot change to directory ./general: No such file or directory
Trace begun at /bin/../lib/perl5/App/Sqitch.pm line 279
App::Sqitch::_parse_core_opts('App::Sqitch', 'ARRAY(0x558dc7e04d30)') called at /bin/../lib/perl5/App/Sqitch.pm line 182
App::Sqitch::go('App::Sqitch') called at /bin/sqitch line 17
Name of column facet.facet.facet_key changed to facet_code.
Reason for this is that name should be in line with used terminology.
The update is currently missing in the refactor DDL. An SQL update of the new fields authors, title and full_reference, based on the old fields, is needed.
Project for roles and privileges.
Should be next to the site name in the site filter, but requires back-end support
tbl_biblio is being simplified and existing MAL data in this and related tables need to be merged into the simplified fields. The order of fields to be concatenated will depend on the publication type - e.g. book vs journal article.
create string for title and full_reference
Phil will create the list for each type
e.g.
Publication_type = journal then
tbl_biblio.title = tbl_biblio.title & " " tbl...
for publication_type_id
9
5
4
6
12
26
23
22
Automate as much as possible.
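A minimal sketch of the type-dependent concatenation, assuming each tbl_biblio row is available as a dictionary. The field names and the field orders in `FIELD_ORDER` are hypothetical placeholders, since the actual list per publication type is still to be provided:

```python
# Hypothetical field orders; the real lists per publication type are still to be defined.
FIELD_ORDER = {
    "journal": ["authors", "year", "title", "journal", "volume", "pages"],
    "book": ["authors", "year", "title", "publisher", "place"],
}

def build_full_reference(record, publication_type):
    """Join the non-empty fields of a record in the order set for its type."""
    parts = [
        str(record[field]).strip()
        for field in FIELD_ORDER[publication_type]
        if record.get(field) not in (None, "")
    ]
    return " ".join(parts)

# Invented record, for illustration only.
article = {"authors": "Doe, J.", "year": 2001, "title": "An example title.",
           "journal": "Example Journal", "volume": "12", "pages": None}
print(build_full_reference(article, "journal"))
```

Keeping the field order in one table per publication type means the merge can be rerun unchanged once the agreed lists replace the placeholders.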
method_id:
method_id 23 in tbl_isotope_measurement
method_id 5 in tbl_sample_groups
sample_type_id:
method_id 15 in tbl_physical_samples
Note! This checklist _only_ deploys the database. Relevant systems need to be deployed separately and in conjunction.
This will be the first deploy of the SEAD database using the new change control system. The starting point of the deployment is an SQL dump of sead_master_9, stored as the zipped file sead_master_9_public.sql.gz in the folder starting_point in the sead_change_control repository.
The overall deployment process is as follows:
Make sure you have an up-to-date version of the SEAD repositories sead_change_control and sead_clearinghouse*. Run git clone if you haven't checked out the repository before, or update the code using git pull:
% mkdir -p ~/source && cd ~/source
% git clone https://github.com/humlab-sead/sead_change_control.git
% git clone https://github.com/humlab-sead/sead_clearinghouse.git
% git clone https://github.com/humlab-sead/sead_clearinghouse_import.git
Freeze deployment by adding tag @2019.12 to all Sqitch plans (for each subfolder xyz) to be included in the release (this can also be done manually by editing the .plan files):
% sqitch tag --tag @2019.12 --plan-file ./xyz/sqitch.plan --note "2019 December release"
Add tag @2019.12 to all relevant repositories.
$ ./bin/deploy_staging --user humlab_admin --target-db-name sead_staging --create-database --source-type dump --on-conflict drop --deploy-to-tag @2019.12
The following workflow imports and commits submissions that have been prepared in Excel files, i.e. Ceramics, Dendrochronology and Isotope data. The flow assumes that relevant updates to the SEAD database model have been committed and that all lookup data referenced by the new data submissions has been inserted (separate change requests).
Check out or update the systems sead_clearinghouse and sead_clearinghouse_import.
Input folder: sead_clearinghouse_import/data/input.
In sead_clearinghouse_import, edit the options in ~/source/sead_clearinghouse_import/runner.py, i.e. specify the names of the input files and the data types.
Run the following commands to import the data to Clearinghouse:
$ cd ~/source/sead_clearinghouse_import
$ pipenv shell
$ vi runner.py # edit run options
$ SEAD_CH_PASSWORD=qwerty python runner.py
Store staging phase 1 database:
create database sead_staging_phase_1 owner sead_master template sead_staging;
Deploy submissions to SEAD Change Control system:
$ cd ~/source/sead_clearinghouse/transport_system
$ ./deploy_submission.sh --dbname=sead_staging --id=1 --force --add-change-request
$ ./deploy_submission.sh --dbname=sead_staging --id=2 --force --add-change-request
$ ./deploy_submission.sh --dbname=sead_staging --id=3 --force --add-change-request
$ ./deploy_submission.sh --dbname=sead_staging --id=4 --force --add-change-request
Store staging phase 2 database:
create database sead_staging_phase_2 owner sead_master template sead_staging;
$ sqitch deploy --target staging --mode change --no-verify -C ./submissions 20191220_DML_SUBMISSION_CERAMICS_001_COMMIT
$ sqitch deploy --target staging --mode change --no-verify -C ./submissions 20191220_DML_SUBMISSION_DENDRO_BUILDING_002_COMMIT
$ sqitch deploy --target staging --mode change --no-verify -C ./submissions 20191220_DML_SUBMISSION_DENDRO_ARCHEOLOGY_003_COMMIT
$ sqitch deploy --target staging --mode change --no-verify -C ./submissions 20191220_DML_SUBMISSION_ISOTOPE_004_COMMIT
Store staging phase 3 database:
create database sead_staging_phase_3 owner sead_master template sead_staging;
$ [ -d ~/source/sead_bugs_import ] || git clone https://github.com/humlab-sead/sead_bugs_import ~/source/sead_bugs_import
$ cd ~/source/sead_bugs_import && git pull
$ mvn -Dmaven.test.skip=true clean
$ mvn -Dmaven.test.skip=true package
$ vi config/application.properties # set sead_staging as target database
$ java -jar target/bugs.import-0.1-SNAPSHOT.jar --file=./bugsdata/bugsdata_20190503.mdb > bugsimport_20191220.log 2>&1 &
Execute post_bugs_import.sh (not yet created) or run:
select bugs_import.post_import_updates();
Create Bugs import change request:
% cd ~/source/sead_change_control/
% sqitch add --change-name 20191221_DML_SUBMISSION_BUGS_20190303_COMMIT --note "Initial Bugs Import" --chdir ./submissions
% cd ~/source/sead_change_control/submissions/deploy/20191221_DML_SUBMISSION_BUGS_20190303_COMMIT
% __navicat public schema diff pre/post bugs > public_data_diff.sql__
% pg_dump --data-only --blobs -d sead_staging --schema bugs_import -h seadserv.humlab.umu.se -F p -U humlab_admin -f bugs_import_schema.sql
% gzip bugs_import_schema.sql
% gzip public_data_diff.sql
-- alter database sead_production rename to sead_yyyymm;
create database sead_production owner sead_master template sead_staging;
Rebuild SEAD Query API:
% cd ~/source/sead_query_api/
% # wget https://github.com/humlab-sead/..././docker-build.sh && chmod +x ./docker-build.sh
% vi conf/appsettings.Production.json # edit run settings
% docker-compose down
% ./docker-build.sh
Start API
% docker-compose up -d
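The tagging step in the checklist above has to be repeated for each subfolder that holds a Sqitch plan. A minimal sketch that builds those commands for review rather than running them, assuming every plan lives at `<subfolder>/sqitch.plan` directly under the repository root; the helper name `tag_commands` is hypothetical, while the `sqitch tag` options are the ones used in the checklist:

```python
import shlex
from pathlib import Path

def tag_commands(repo_root, tag, note):
    """Build one `sqitch tag` command per subfolder that has a sqitch.plan."""
    commands = []
    for plan in sorted(Path(repo_root).glob("*/sqitch.plan")):
        commands.append([
            "sqitch", "tag", "--tag", tag,
            "--plan-file", f"./{plan.parent.name}/sqitch.plan",
            "--note", note,
        ])
    return commands

# Print the commands for the current directory; nothing is executed.
for command in tag_commands(".", "@2019.12", "2019 December release"):
    print(shlex.join(command))
```

Printing the commands first makes it easy to drop subfolders that should not be part of the release before anything is tagged.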
tbl_ecocode_definitions.label needs to be changed to tbl_ecocode_definitions.name
Affects supersead facet queries.
Results from column name change between sead master 8 and 9.
The ecocode system 'Anolds & van der Maarel (plants)' (tbl_ecocode_systems.name) was not transferred from sead master 8 during migration. This needs doing, including the code assignments to taxa and the reference for the system.
System provides ecology codes for plants in the Netherlands in environments as classified for biodiversity surveys. Part of Ida Lundberg's Magister project.
Can we have a "Region" facet under the space-time group?
Contents would be something like (filtering sites?):
select location_name, site_id, location_type_id
from tbl_locations
inner join tbl_site_locations on tbl_locations.location_id=tbl_site_locations.location_id
where tbl_locations.location_type_id in (2, 7, 14, 16, 18)
order by location_name
Use case: A point of entry for all sites in Västerbotten linked from https://sparfran10000ar.se/
Methods relating to geoarchaeology might have been marked with the wrong record_type. They have been marked as "Non-biological taxa" (record_type_id 10) instead of "Soil chemistry/property" (record_type_id 12). The mix-up is visible when searching in the web browser and sorting the sites by record type in the table view.
Only samples containing soil chemistry seem to have this error
New alt ref type needed for SEAD - Bugs traceability.
In BugsCEP, unique identifier for a sample is TSample.SampleCODE.
This is stored in bugs_trace during import but not stored in the main SEAD database.
SampleCODE needs storing in SEAD as:
tbl_sample_alt_refs.alt_ref with type "Database identifier" (new alt_ref_type)
This is needed for tracing samples through SEAD to Bugs, for example when Francesca creates a matrix of all insect samples.
System is now in the wrong schema.
The change 20170906_DDL_RELATIVE_DATES_ALTER_RELATION ran OK, but the schema is still unchanged.
Fixed ID is needed to avoid id clashes.
CH import system has guards that prevent import of new lookup data via Clearinghouse. The ceramics data import files were created before these guards were activated.
The lookup data are added to SEAD via a change request via the change control system, and the Excel is updated with the new SEAD system identities returned by the inserts.
The previous QSEAD system counted values by grouping on physical_sample_id, not analysis_entity_id. These two approaches give totally different results.
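A minimal sketch of why the two groupings differ, assuming a physical sample can be linked to more than one analysis entity. The rows and the helper name `count_distinct` are invented for illustration:

```python
# Invented rows: (analysis_entity_id, physical_sample_id, site).
rows = [
    (1, 10, "Site A"),
    (2, 10, "Site A"),  # same sample analysed a second time
    (3, 11, "Site A"),
    (4, 12, "Site B"),
]

def count_distinct(rows, key_index):
    """Count distinct key values per site."""
    seen = {}
    for row in rows:
        seen.setdefault(row[2], set()).add(row[key_index])
    return {site: len(keys) for site, keys in seen.items()}

print(count_distinct(rows, 0))  # grouped on analysis_entity_id
print(count_distinct(rows, 1))  # grouped on physical_sample_id
```

Under this assumption the count per physical sample can never exceed the count per analysis entity, so any comparison between the old and new systems has to state which key was used.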
Populate tbl_years_type as follows (values from Bugs plus an extra 'Unknown', which maps to nulls and '?' in the Bugs import):
| years_type_id | name | description | date_updated |
|---|---|---|---|
| | Calendar | Calendar years | |
| | C14 | Radiocarbon years | |
| | Radiometric | Radiometric years (but not C14) | |
| | Unknown | Unknown years type (either to be defined or unspecified in source) | |
Erik mail May 5, 2018:
I have troubles with the datasets in sead_master_9. There are a bunch in there without master_set_id. I don’t know if these should be there or not?
Should we have some kind of collection of all the lookup data somewhere so we can include it into a datastore separately?
Filter with facetCode "tbl_denormalized_measured_values_37" has a DisplayTitle of "P┬░", which is obviously due to some sort of encoding translation error.
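The garbled title is consistent with the UTF-8 bytes of "P°" having been decoded as code page 437 somewhere along the way. A minimal sketch of reversing that, assuming this is indeed the cause; the helper name `repair_mojibake` is hypothetical:

```python
def repair_mojibake(text, wrong_encoding="cp437"):
    """Undo a UTF-8 string that was decoded with the wrong single-byte code page."""
    try:
        return text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text  # not this kind of damage; leave unchanged

print(repair_mojibake("P┬░"))
```

Repairing the stored value only treats the symptom; the client encoding used when the facet titles were loaded would still need checking.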
Dendrochronology data needs a record type to enable some query and filter functionality.
tbl_record_types.record_name = 'Dendrochronology'
tbl_record_types.record_name = SEE ROGER'S QUERY
Update tbl_methods.record_type_id to match the new record_type
Resolves humlab-sead/sead_bugs_import#13.
20190503_DDL_BUGS_ADD_TRANSLATIONS (updated)
Resolved by sqitchers/sqitch#459 (comment).
A new field containing a short description of the facet. Related to humlab-sead/sead_query_api/issues/54
87 records in tbl_methods lack record_type. All data related to these methods will be filtered out from SEAD Query API reporting queries since all facet tables are inner joined.
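A minimal sketch of the effect, using an in-memory SQLite database in place of the SEAD PostgreSQL database. The two tables are cut down to a few hypothetical columns and the rows are invented:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE tbl_record_types (record_type_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tbl_methods (method_id INTEGER PRIMARY KEY, name TEXT, record_type_id INTEGER);
    INSERT INTO tbl_record_types VALUES (12, 'Soil chemistry/property');
    INSERT INTO tbl_methods VALUES (1, 'with record type', 12), (2, 'without record type', NULL);
""")

inner = con.execute(
    "SELECT m.method_id FROM tbl_methods m "
    "JOIN tbl_record_types r USING (record_type_id) ORDER BY m.method_id"
).fetchall()
outer = con.execute(
    "SELECT m.method_id FROM tbl_methods m "
    "LEFT JOIN tbl_record_types r USING (record_type_id) ORDER BY m.method_id"
).fetchall()

print(inner)  # the method without a record type is dropped
print(outer)  # the method without a record type is kept
```

Assigning a record type to the affected methods fixes the data at the source, whereas switching to outer joins would change the behaviour of every facet query.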
64 MAL sites in tbl_sites have missing or erroneous coordinates in SEAD master 9. These sites are misplaced to varying degrees: some end up in water, while others remain in the general area of their correct location. Only 53 of the 64 sites were corrected, because the remaining sites currently lack metadata or coordinate information.
tbl_sites should be corrected using the list below, which has the correct coordinates:
site_id | altitude | latitude_dd | longitude_dd | national_site_identifier | site_description | site_name | site_preservation_status_id | date_updated |
---|---|---|---|---|---|---|---|---|
57 | 15 | 56.8882889 | 12.5898222 | Skrea Raä 162 | | Skrea Raä 162 | | 2013-05-16 10:39:58.584644+02 |
78 | | 55.5672222226 | 12.9844444 | MHM12753; MHM12879 | | Citytunnelprojektet delområde 5 | | 2013-11-13 15:33:32.218+01 |
84 | | 55.5638888889 | 12.9794444 | MHM12878; MHM12752 | | Citytunnelprojektet delområde 4 | | 2013-11-13 15:33:32.218+01 |
85 | | 68.6027777778 | 19.8483333 | | | Guomojávrrit | | 2013-11-13 15:33:32.218+01 |
90 | | 57.4808833337 | 12.6857750 | Örby 98:1 | | Örby Raä 98 | | 2013-11-13 15:33:32.218+01 |
97 | | 70.4894444441 | 29.2500000 | | | Oarddojávri | | 2013-11-13 15:33:32.218+01 |
100 | | 70.3221555559 | 25.1738639 | | | Ruksesbákti, Indre Sandvik | | 2013-11-13 15:33:32.218+01 |
105 | | 55.5611111111 | 12.9716667 | MHM 12877 | | Citytunnelprojektet delområde 3 | | 2013-11-13 15:33:32.218+01 |
111 | | 61.2238444448 | 11.4983500 | | | Melvold | | 2013-11-13 15:33:32.218+01 |
119 | | 59.1138888889 | 9.7980556 | 136604/136599 | | Sundsaasen 2 | | 2013-11-13 15:33:32.218+01 |
121 | | 61.2758333337 | 11.6069444 | | | Tiertjern | | 2013-11-13 15:33:32.218+01 |
122 | | 55.9519444444 | 11.8752778 | NFH A2039 | | Hilleröd NFH A2039 | | 2013-11-13 15:33:32.218+01 |
123 | | 70.6983333330 | 23.6280556 | | | Skjærvika | | 2013-11-13 15:33:32.218+01 |
124 | | 58.4809444448 | 13.4798472 | Ledsjö 153:1 | | Ledsjö Raä 153 | | 2013-11-13 15:33:32.218+01 |
128 | | 55.1130555556 | 14.8705556 | BMR 3054 | | BMR 3054 Vestre Indlæg | | 2013-11-13 15:33:32.218+01 |
132 | | 61.2220416670 | 11.4927944 | | | Stræten terasse | | 2013-11-13 15:33:32.218+01 |
135 | | 70.9619444444 | 26.6700000 | | | Sværholt | | 2013-11-13 15:33:32.218+01 |
139 | | 63.5861750 | 19.7208000 | Nordmaling 524:1 | | Bothnia line JP 71F1 | | 2013-11-13 15:33:32.218+01 |
143 | | 61.3736111114 | 11.5144444 | | | Knubbetjern | | 2013-11-13 15:33:32.218+01 |
153 | | 61.3844444441 | 11.5433333 | | | Knubba | | 2013-11-13 15:33:32.218+01 |
159 | | 70.0669444445 | 27.5586111 | | | Láksjohka | | 2013-11-13 15:33:32.218+01 |
163 | | 58.1368305552 | 12.9879639 | | | Västerbitterna 3:17 | | 2013-11-13 15:33:32.218+01 |
165 | | 61.2266527781 | 11.4976694 | | | Rødstranda | | 2013-11-13 15:33:32.218+01 |
176 | | 63.5844444441 | 19.5839944 | | | Bothnia line JP 71C1/71:24 | | 2013-11-13 15:33:32.218+01 |
178 | | 60.5190944448 | 15.4388000 | | | Yttre Medväga | | 2013-11-13 15:33:32.218+01 |
186 | 36 | 63.8449999997 | 20.1447222 | Umeå socken 229 | | Prästsjödiket Umeå | | 2013-11-13 15:33:32.218+01 |
193 | | 63.7231278 | 20.1977083 | | | Bothnia line JP 73G | | 2013-11-13 15:33:32.218+01 |
203 | | 63.6594167 | 19.9786472 | | | Bothnia line JP 72E | | 2013-11-13 15:33:32.218+01 |
209 | | 63.2226500 | 18.0569333 | | | Bothnia line jp 31J1 | | 2013-11-13 15:33:32.218+01 |
213 | | 59.3803250 | 15.6283583 | | | Alväng, Götlunda | | 2013-11-13 15:33:32.218+01 |
241 | | 58.3846361108 | 13.8942833 | Skövde 158:1 | | Skövde Raä 158 | | 2013-11-13 15:33:32.218+01 |
243 | | 61.3833333330 | 11.5288889 | | | Deset Knubben | | 2013-11-13 15:33:32.218+01 |
254 | 20 | 63.3916194 | 19.2470694 | Grundsunda 30:1 | | Grundsunda Raä 30 | | 2013-11-13 15:33:32.218+01 |
257 | | 55.5647222222 | 12.9733333 | MHM 12751 | | Citytunnelprojektet delområde Hotelltomten | | 2014-02-19 15:28:44.312+01 |
258 | | 59.5261111 | 18.9977778 | | | Ägglösen | | 2014-02-19 15:28:44.312+01 |
268 | | 59.5927222219 | 16.4485111 | | | Västra Skälby | | 2014-02-19 15:28:44.312+01 |
288 | 44 | 66.0204306 | 22.6298389 | Töre 405:2 | | Töre 405:2 | | 2014-02-19 15:28:44.312+01 |
289 | | 59.5347222 | 18.5644444 | | | Norra Småholmen | | 2014-02-19 15:28:44.312+01 |
291 | | 65.9296833337 | 22.8561528 | Töre 510 | | Kosjärv | | 2014-02-19 15:28:44.312+01 |
292 | 50 | 66.0250167 | 22.6166667 | Töre 408:1 | | Töre 408:1 | | 2014-02-19 15:28:44.312+01 |
294 | | 65.9783861114 | 19.1512028 | | | Åssjiejávrátje | | 2014-02-19 15:28:44.312+01 |
305 | | 59.6830555559 | 18.9266667 | | | Fåröarna | | 2014-02-19 15:28:44.312+01 |
320 | | 59.5152777778 | 18.7777778 | | | Bergskärit | | 2014-02-19 15:28:44.312+01 |
328 | 25 | 59.6732250 | 17.8628361 | Odensala 402:1 | | Odensala Raä 402 | | 2014-02-19 15:28:44.312+01 |
337 | 15 | 59.6973222 | 18.4795028 | Skederid 190 | Settlement site | Skederid Raä 190 | | 2014-02-19 15:28:44.312+01 |
345 | | 61.1331666670 | 16.9196917 | | | Själstuga | | 2014-02-19 15:28:44.312+01 |
350 | | 59.5775000 | 18.9636111 | | | Stora Halmören | | 2014-02-19 15:28:44.312+01 |
354 | 5 | 58.7519722 | 17.0113527778 | Nyköping 231:1 | Stadslager, ca 1150x1000 m (N-S), medeltida och yngre, i stadscentrum. | Åkroken | | 2014-02-19 15:28:44.312+01 |
358 | | 59.6833333330 | 18.9552778 | | | Marskärskobben | | 2014-02-19 15:28:44.312+01 |
373 | | 62.5760694448 | 12.5677750 | Härjedalen 169 | | Hedningsgärdet | | 2014-02-19 15:28:44.312+01 |
374 | | 59.5655555556 | 18.9725000 | | | Stora Träskär | | 2014-02-19 15:28:44.312+01 |
384 | 45 | 66.0217556 | 22.6267111 | | | Töre 422 | | 2014-02-19 15:28:44.312+01 |
386 | | 64.9952583 | 21.3126556 | | | Tåme | | 2014-02-19 15:28:44.312+01 |
Manual changes (deletes) in the plan of already deployed changes prevent deploy.
Changes:
[x] Make system forward only for now (remove reverts)
[x] Make all changes idempotent
[x] Do a sqitch rebase - does a revert and deploy in sequence
Identity of facet group ROOT (facet.facet_group.facet_group_id) changed from 0 to 999. Value 0 is reserved for undefined.