
hochfrequenz / mig_ahb_utility_stack


MIG AHB Utility Stack (MAUS): A Script to Match the Message Implementation Guide (MIG) with the Anwendungshandbuch (AHB)

License: MIT License

Python 100.00%
anwendungshandbuch ahb mig bdew edi message-implementation-guide energiewirtschaft marktkommunikation

mig_ahb_utility_stack's Introduction

MIG AHB Utility Stack (MAUS) 🐭

ℹ If you're looking for a wrapper around the new (2024) BDEW XMLs for MIG and AHB, check out our fundamend repository. This maus package uses a different data format because it predates 2024.

This repository contains the Python package maus.
MAUS is an acronym for MIG AHB Utility Stack where MIG stands for Message Implementation Guide and AHB stands for Anwendungshandbuch.
The maus software/package allows matching single lines from the AHB with fields specified in the MIG.
This package is necessary because EDI@Energy does not provide any real technical and machine-readable description of the MIGs and AHBs, only PDFs.
MAUS can also be used purely as a data model (maus.model), without the MIG/AHB matching logic included in the package.

We're all hoping for the day of true digitization on which this repository will become obsolete.

What Problem Does It Solve?

Imagine you scraped the AHB PDFs into something machine-readable. Machine-readability in this context means that for each field/piece of information inside the AHB you can easily access

  • segment group (e.g. "SG4")
  • segment (e.g. "LOC")
  • data element ID (e.g. "3225")
  • AHB Expressions (e.g. "Muss [123] U ([456] O [789])[904]")

The exact data format (be it CSV, JSON, XML ...) is not important beyond an initial deserialization.
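As a minimal sketch of what "machine-readable" could mean here, the four pieces of information listed above fit into a small record type. The class name ScrapedAhbLine and the helper lines_for_segment are hypothetical and only serve as illustration; they are not the maus.model API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ScrapedAhbLine:
    """One line of a scraped AHB (hypothetical structure, not the maus.model API)."""

    segment_group: Optional[str]  # e.g. "SG4"; None if the line has no segment group
    segment: str  # e.g. "LOC"
    data_element_id: str  # e.g. "3225"
    ahb_expression: str  # e.g. "Muss [123] U ([456] O [789])[904]"


def lines_for_segment(lines: List[ScrapedAhbLine], segment: str) -> List[ScrapedAhbLine]:
    """Return all scraped lines that belong to the given segment code."""
    return [line for line in lines if line.segment == segment]


# Example values taken from the list above
example = ScrapedAhbLine("SG4", "LOC", "3225", "Muss [123] U ([456] O [789])[904]")
```

Whether such records were deserialized from CSV, JSON or XML does not matter once they exist in this form.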

(BTW: The AHB Expression can be parsed and evaluated using the 🦅 AHBicht Library or our AHBicht REST API which is publicly available.)

Imagine you also had a machine-readable version of the MIG -- spoiler: Hochfrequenz can help you with that (please contact @JoschaMetze for a demo) -- you still wouldn't be able to make use of your data, because the MIG data and AHB data are still unrelated. MAUS creates a connection between machine-readable AHBs and machine-readable MIGs. This makes it possible to associate certain lines from the AHB with certain fields in the MIG and is the basis for a meaningful content evaluation/validation of EDIFACT messages, or, to be more precise, validation of data structures that might be converted to EDIFACT.

Code Quality / Production Readiness

  • The code has at least 95% unit test coverage. ✔️
  • The code is rated 10/10 in pylint and type checked with mypy (PEP 561). ✔️
  • The code is MIT licensed. ✔️
  • There are only a few dependencies. ✔️

Installation

For the bare maus data model and matching logic, pip install maus is sufficient. Only if the MIG you're using is based on XML (namely the Hochfrequenz XML-based MIG representation) do you need pip install maus[xml]. If you want to use the CLI tool, run pip install maus[cli].

Once installed, you can either use the package and its data model in your own Python code or use the mapping logic (as of now only for the Hochfrequenz EDIFACT XML templates) via the CLI:

maus --flat_ahb_path flat_ahb_by_kohlrahbi.ahb.json --sgh_path path_to_segment_group_hierarchy.sgh.json --template_path UTILMD5.2e.template --output_path file_to_be_created.maus.json

The CLI tool is available not only via pip but also as a standalone executable in the respective release assets.

Development

Please follow the detailed instructions in the README of our Python Template Repository on how to set up your local development environment (tl;dr: tox).

Contribute

You are very welcome to contribute to this repository by opening a pull request against the main branch.

mig_ahb_utility_stack's People

Contributors

deltadaniel, dependabot[bot], hf-aschloegl, hf-kklein, hf-krechan, joschametze, lord-haffi, olgatrotsenko


mig_ahb_utility_stack's Issues

Move to CodeQL Action V2

In our public repos, we use CodeQL to scan the code:
https://github.com/search?q=org%3AHochfrequenz+%22codeql-action%22&type=code

CodeQL Action v1 won't be supported anymore by December 2022:

CodeQL Action v1 will be deprecated on December 7th, 2022. Please upgrade to v2. For more information, see https://github.blog/changelog/2022-04-27-code-scanning-deprecation-of-codeql-action-v1/

Please upgrade all our repositories that use CodeQL v1 (see search query above) to CodeQL v2.
It's a rather dumb task that is not urgent.

Fix `RemovedInMarshmallow4Warning`s

anwendungshandbuch.py:193: RemovedInMarshmallow4Warning: The 'default' argument to fields is deprecated. Use 'dump_default' instead.
maus_version = fields.String(required=False, allow_none=True, default=_VERSION)

anwendungshandbuch.py:194: RemovedInMarshmallow4Warning: The 'default' argument to fields is deprecated. Use 'dump_default' instead.
description = fields.String(required=False, allow_none=True, default=None)

anwendungshandbuch.py:195: RemovedInMarshmallow4Warning: The 'default' argument to fields is deprecated. Use 'dump_default' instead.
direction = fields.String(required=False, allow_none=True, default=None)

Wrong value type for MP-ID Absender (Ex. 11042)

The field MP-ID Absender should be a free-text field.

{
    "data_element_id": "3039",
    "discriminator": "$[\"Dokument\"][0][\"Nachricht\"][0][\"MP-ID Absender\"][0][\"MP-ID\"]",
    "value_pool": [],
    "value_type": "VALUE_POOL"
},
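A quick way to spot entries like the one above is to walk the JSON and collect everything that claims VALUE_POOL but carries an empty value_pool. This is only a sketch: the function name find_empty_value_pools is hypothetical, and it assumes nothing about the surrounding .maus.json layout beyond the two keys shown in the snippet.

```python
import json
from typing import Any, Dict, Iterator


def find_empty_value_pools(node: Any) -> Iterator[Dict[str, Any]]:
    """Recursively yield every dict that claims VALUE_POOL but has no values."""
    if isinstance(node, dict):
        if node.get("value_type") == "VALUE_POOL" and node.get("value_pool") == []:
            yield node
        for child in node.values():
            yield from find_empty_value_pools(child)
    elif isinstance(node, list):
        for child in node:
            yield from find_empty_value_pools(child)


# The data element from the issue, embedded as a raw string so the JSON escapes survive
snippet = json.loads(r'''
{
    "data_element_id": "3039",
    "discriminator": "$[\"Dokument\"][0][\"Nachricht\"][0][\"MP-ID Absender\"][0][\"MP-ID\"]",
    "value_pool": [],
    "value_type": "VALUE_POOL"
}
''')

suspicious = list(find_empty_value_pools([snippet]))
```

An empty value pool cannot be satisfied by any input, so every hit is a candidate for a free-text field.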

Harmonize `DataElementValueType` and `DataElementDataType`

Downstream services (namely ahbicht) now need to decide whether to use the name DataType or ValueType. This confusion/ambiguity is unnecessary.

Ideas

  • From the DataElement's perspective the name "value type" seems more appropriate imo, because it characterizes the data element's value.
  • For downstream dependents (including non-Python ones) it is easier if we change the type name rather than the attribute name, because this doesn't affect the JSON serialization.
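The second idea can be sketched as follows: keep one enum, expose the other name as an alias, and leave the serialized attribute name untouched. The class DataElementSketch and the helper to_json are hypothetical stand-ins; the members TEXT and VALUE_POOL are the values that appear in the maus JSON elsewhere in these issues. Which of the two names survives is an assumption here.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class DataElementValueType(str, Enum):
    """The single remaining enum; mixing in str keeps JSON serialization trivial."""

    TEXT = "TEXT"
    VALUE_POOL = "VALUE_POOL"


# Alias so that code importing the old type name keeps working during a transition
DataElementDataType = DataElementValueType


@dataclass
class DataElementSketch:
    """Hypothetical stand-in for a data element; only the attribute name matters here."""

    data_element_id: str
    value_type: DataElementValueType


def to_json(element: DataElementSketch) -> str:
    """Serialize the element; the JSON key stays 'value_type' whatever the type is called."""
    return json.dumps(asdict(element), sort_keys=True)
```

Because only the Python type name changes, a non-Python consumer reading the JSON would not notice the rename.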

Abschnitts-Kontrollsegment has no Segment-Gruppe in all MSCONS and INVOICE messages

In the docx file MSCONSAHB-informatorischeLesefassung3.1aKonsolidierteLesefassungmitFehlerkorrekturenStand27.09.2022_20230331_20221001.docx on page 51 we have the following situation:

image

As a result, the FlatAHB instance cannot be created.

The following Prüfidentifikatoren have this issue:

# MSCONS
13002
13003
13005
13006
13007
13008
13009
13010
13011
13012
13013
13014
13015
13016
13017
13018
13019
13020
13021
13022
13023
13025
13026
# INVOICE
31001
31002
31003
31004
31005
31006
31007
31008
31009
31010
31011

Fix address paths in 11042.maus

(May also be relevant for other Prüfis and other fields.)
If you look at the addresses, e.g. in the 11042.maus, you can see that for the "Korrespondenzanschrift des Kunden des Messstellenbetreibers" the fields Name, Ort, Postleitzahl, etc. only have their own name as Discriminator/Path, but we need an EDISeedPath. Only "Struktur" has a full path as Discriminator/Path, namely $[\"Dokument\"][0][\"Nachricht\"][0][\"Vorgang\"][0][\"Korrespondenzanschrift des Kunden des Messstellenbetreibers\"][0][\"Struktur\"]. We would expect and need this for the other address fields as well.
Example Ort:

  • current state: "discriminator": "Ort"
  • expected value: "discriminator": "$[\"Dokument\"][0][\"Nachricht\"][0][\"Vorgang\"][0][\"Korrespondenzanschrift des Kunden des Messstellenbetreibers\"][0][\"Ort\"]"

For now this has been hardcoded in https://github.com/Hochfrequenz/edifact-templates/pull/151/files (there you can also see reality vs. expectations).
⚠ Take care not to overwrite this state for the wimbee-backend.
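One conceivable way to derive the expected paths, sketched under the assumption that all address fields are siblings of "Struktur" in the path: take the known "Struktur" discriminator and swap its last key. The function name sibling_discriminator is hypothetical and not part of maus.

```python
def sibling_discriminator(struktur_path: str, field_name: str) -> str:
    """Replace the last key of a bracket-style path with another field name."""
    # Position where the final ["..."] key (here: "Struktur") begins
    last_key_start = struktur_path.rindex('["')
    return f'{struktur_path[:last_key_start]}["{field_name}"]'


struktur = (
    '$["Dokument"][0]["Nachricht"][0]["Vorgang"][0]'
    '["Korrespondenzanschrift des Kunden des Messstellenbetreibers"][0]["Struktur"]'
)
ort = sibling_discriminator(struktur, "Ort")
```

Note that the backslashes in the JSON file are only string escapes; the in-memory path contains plain double quotes.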

Package Build Fails: HTTPError: 400 Bad Request from https://upload.pypi.org/legacy/: File already exists

The pipeline: "Build and publish Python 🐍 distributions 📦 to PyPI and TestPyPI" does not work.

What happened:

https://github.com/Hochfrequenz/mig_ahb_utility_stack/runs/4581156534?check_suite_focus=true

Run pypa/gh-action-pypi-publish@master
/usr/bin/docker run --name a6825ce3aad4b5f5d402d96be993ed9e93a9d_1f59ca --label 6a6825 --workdir /github/workspace --rm -e INPUT_PASSWORD -e INPUT_USER -e INPUT_REPOSITORY_URL -e INPUT_PACKAGES_DIR -e INPUT_VERIFY_METADATA -e INPUT_SKIP_EXISTING -e INPUT_VERBOSE -e HOME -e GITHUB_JOB -e GITHUB_REF -e GITHUB_SHA -e GITHUB_REPOSITORY -e GITHUB_REPOSITORY_OWNER -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RETENTION_DAYS -e GITHUB_RUN_ATTEMPT -e GITHUB_ACTOR -e GITHUB_WORKFLOW -e GITHUB_HEAD_REF -e GITHUB_BASE_REF -e GITHUB_EVENT_NAME -e GITHUB_SERVER_URL -e GITHUB_API_URL -e GITHUB_GRAPHQL_URL -e GITHUB_REF_NAME -e GITHUB_REF_PROTECTED -e GITHUB_REF_TYPE -e GITHUB_WORKSPACE -e GITHUB_ACTION -e GITHUB_EVENT_PATH -e GITHUB_ACTION_REPOSITORY -e GITHUB_ACTION_REF -e GITHUB_PATH -e GITHUB_ENV -e RUNNER_OS -e RUNNER_ARCH -e RUNNER_NAME -e RUNNER_TOOL_CACHE -e RUNNER_TEMP -e RUNNER_WORKSPACE -e ACTIONS_RUNTIME_URL -e ACTIONS_RUNTIME_TOKEN -e ACTIONS_CACHE_URL -e GITHUB_ACTIONS=true -e CI=true -v "/var/run/docker.sock":"/var/run/docker.sock" -v "/home/runner/work/_temp/_github_home":"/github/home" -v "/home/runner/work/_temp/_github_workflow":"/github/workflow" -v "/home/runner/work/_temp/_runner_file_commands":"/github/file_commands" -v "/home/runner/work/mig_ahb_utility_stack/mig_ahb_utility_stack":"/github/workspace" 6a6825:ce3aad4b5f5d402d96be993ed9e93a9d "token" "***" "" "dist" "true" "false" "false"
Checking dist/maus-0.0.2-py3-none-any.whl: PASSED
Checking dist/maus-0.0.2.tar.gz: PASSED
Uploading distributions to https://upload.pypi.org/legacy/
Uploading maus-0.0.2-py3-none-any.whl

0%| | 0.00/23.1k [00:00<?, ?B/s]
35%|███▍ | 8.00k/23.1k [00:00<00:00, 48.4kB/s]
100%|██████████| 23.1k/23.1k [00:00<00:00, 49.8kB/s]
Error during upload. Retry with the --verbose option for more details.
HTTPError: 400 Bad Request from https://upload.pypi.org/legacy/
File already exists. See https://pypi.org/help/#file-name-reuse for more information.

What I expected:

The package to be updated on pypi: https://pypi.org/project/maus/ (still 0.0.2, released on 2021-12-13)

Fix Release Workflow

https://github.com/Hochfrequenz/mig_ahb_utility_stack/actions/runs/4566827926/jobs/8059895848

Warning: Unexpected input(s) 'token', valid inputs are ['config-name', 'name', 'tag', 'version', 'publish', 'prerelease', 'commitish', 'header', 'footer', 'disable-releaser', 'disable-autolabeler']
Run release-drafter/release-drafter@v5
"pull_request_target.opened" is not a known webhook name (https://developer.github.com/v3/activity/events/types/)
"pull_request_target.reopened" is not a known webhook name (https://developer.github.com/v3/activity/events/types/)
"pull_request_target.synchronize" is not a known webhook name (https://developer.github.com/v3/activity/events/types/)
"pull_request_target.edited" is not a known webhook name (https://developer.github.com/v3/activity/events/types/)
Warning: Hochfrequenz/mig_ahb_utility_stack: Invalid config file
{
name: 'event',
id: '4566827926',
stack: 'Error: Configuration file .github/release-drafter.yml is not found. The configuration file must reside in your default branch.\n' +
' at getConfig (/home/runner/work/_actions/release-drafter/release-drafter/v5/dist/index.js:142719:13)\n' +
' at async drafter (/home/runner/work/_actions/release-drafter/release-drafter/v5/dist/index.js:142347:20)\n' +
' at async Promise.all (index 0)',
type: 'Error'
}
Error: Invalid config file

MIG.json + AHB.json => MAUS.json

  • Transaktionsgrund in the MIG + Transaktionsgrund in the AHB ==> JSONPath for the EdifactSeed
  • Multiple possible values in the AHB ==> value pool per data element/per JSONPath + attached conditions


{
    "$['Nachricht'][0]['Dokument'][0]['Vorgang'][0]['Transaktionsgrund']": {
        "ahb_expression": "Muss[2061]",
        "dataelements": {
            "9015": {
                "mig_name": "Statuskategorie",
                "possible_values": {
                    "7": {
                        "meaning": "Transaktionsgrund",
                        "ahb_expression": "X"
                    }
                }
            },
            "9013": {
                "mig_name": "Statusanlass",
                "possible_values": {
                    "E01": {
                        "meaning": "Einzug / Auszug(Umzug)",
                        "ahb_expression": "X"
                    },
                    "E02": {
                        "meaning": "Einzug / Neuanlage",
                        "ahb_expression": "X"
                    },
                   ...E03, ZJ4...
                }
            }
        }
    }
}
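The target structure above could be assembled roughly like this. This is a sketch under stated assumptions, not the actual maus matching logic: the function build_maus_entry and the shape of its inputs (a mapping of data element ID to MIG name, and a flat list of AHB values with the keys data_element_id, qualifier, meaning and ahb_expression) are hypothetical.

```python
from typing import Any, Dict, List


def build_maus_entry(
    field_ahb_expression: str,
    mig_names: Dict[str, str],
    ahb_values: List[Dict[str, str]],
) -> Dict[str, Any]:
    """Combine MIG names and AHB values into one entry per JSONPath."""
    # Start with one (still empty) value pool per data element known from the MIG
    dataelements: Dict[str, Any] = {
        de_id: {"mig_name": mig_name, "possible_values": {}}
        for de_id, mig_name in mig_names.items()
    }
    # Attach each AHB value, including its condition, to the matching data element
    for value in ahb_values:
        pool = dataelements[value["data_element_id"]]["possible_values"]
        pool[value["qualifier"]] = {
            "meaning": value["meaning"],
            "ahb_expression": value["ahb_expression"],
        }
    return {"ahb_expression": field_ahb_expression, "dataelements": dataelements}


entry = build_maus_entry(
    "Muss[2061]",
    {"9015": "Statuskategorie", "9013": "Statusanlass"},
    [
        {"data_element_id": "9015", "qualifier": "7", "meaning": "Transaktionsgrund", "ahb_expression": "X"},
        {"data_element_id": "9013", "qualifier": "E01", "meaning": "Einzug / Auszug(Umzug)", "ahb_expression": "X"},
    ],
)
```

The resulting entry would then be stored under its JSONPath key, as in the example above.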

Package upload (v0.1.14) fails because of unresolved dependencies in private submodule helper script

type_check run-test: commands[2] | mypy --show-error-codes tests/integration_tests
tests/integration_tests/edifact-templates/extract_all_conditions_from_ahbs.py:10: error: Cannot find implementation or library stub for module named "ahbicht.mapping_results" [import]
tests/integration_tests/edifact-templates/extract_all_conditions_from_ahbs.py:10: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
Found 1 error in 1 file (checked 16 source files)
ERROR: InvocationError for command /home/runner/work/mig_ahb_utility_stack/mig_ahb_utility_stack/.tox/type_check/bin/mypy --show-error-codes tests/integration_tests (exited with code 1)
___________________________________ summary ____________________________________
unit_tests: commands succeeded
integration_tests: commands succeeded
linting: commands succeeded
coverage: commands succeeded
ERROR: type_check: commands failed

I don't get it. Why does the usual CI just pass but the upload behaves differently?
There are in fact unresolved (ahbicht) dependencies in the script mentioned above but they shouldn't let the package build fail.

https://github.com/Hochfrequenz/mig_ahb_utility_stack/runs/6796916740?check_suite_focus=true#step:5:175

Empfänger and Absender MP-ID have the wrong data element type in maus and are missing their ahb_expression

Probably also relevant for other pruefis, but here in the 11042.maus the MP-ID for Absender and Empfänger should be a Freefield Data Element and not a Value Pool: https://github.com/Hochfrequenz/edifact-templates/blob/03559bccf5f29d3c457667aec393e57d5338cbe8/maus/FV2110/UTILMD/11042_maus.json#L108
Also they are missing their ahb_expression:
Current state:

{
"data_element_id": "3039",
"discriminator": "$[\"Dokument\"][0][\"Nachricht\"][0][\"MP-ID Absender\"][0][\"MP-ID\"]",
"value_pool": [],
"value_type": "VALUE_POOL"
}

Expected:

{
"ahb_expression": "X",
"data_element_id": "3039",
"discriminator": "$[\"Dokument\"][0][\"Nachricht\"][0][\"MP-ID Absender\"][0][\"MP-ID\"]",
"entered_input": null,
"value_type": "TEXT"
}
  • AHB-Expression Absender
  • value_type Absender
  • AHB-Expression Empfänger
  • value_type Empfänger

This can be hardcoded if it is urgent, but a hardcoded fix always risks being overwritten when the maus is regenerated.

What's the problem with validator.optional and a default string?

src\maus\models\anwendungshandbuch.py:156: error: Argument "validator" to "field" has incompatible type "Callable[[Any, Attribute[Optional[str]], Optional[str]], Any]"; expected "Union[Callable[[Any, Attribute[str], str], Any], Sequence[Callable[[Any, Attribute[str], str], Any]], None]" [arg-type]

Distinguish two UTILMD 6063


Problem: The predecessor qualifier Z01 comes from SG8, and SG9 is handled separately, independently of its parent SG8.

MAUS CLI crashes if started without argument

Actual Behaviour:

> PS  .\maus_cli.exe
Traceback (most recent call last):
  File "maus\__init__.py", line 83, in <module>
  File "click\core.py", line 1130, in __call__
  File "click\core.py", line 1055, in main
  File "click\core.py", line 1404, in invoke
  File "click\core.py", line 760, in invoke
  File "maus\__init__.py", line 46, in main
TypeError: expected str, bytes or os.PathLike object, not NoneType
[17792] Failed to execute script '__init__' due to unhandled exception!

Expected Behaviour: Meaningful error message
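The TypeError in the traceback suggests that a None value reaches a path-handling call. A minimal guard, sketched here with the standard library only, could turn that into a readable message; the helper name require_path is hypothetical. Since the traceback shows the CLI is built on click, marking the options as required=True in click.option would probably be the more idiomatic fix.

```python
from pathlib import Path
from typing import Optional


def require_path(value: Optional[str], option_name: str) -> Path:
    """Convert a CLI option to a Path or fail with a message that names the option."""
    if value is None:
        raise ValueError(
            f"Missing required option {option_name}; run the tool with --help to list all options."
        )
    return Path(value)
```

Called as require_path(None, "--flat_ahb_path"), this raises a ValueError that names the missing option instead of surfacing an unhandled TypeError.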
