ssim's People

Contributors

ramonvanschaik, rok

ssim's Issues

License Change

Is there any possibility of changing the license to MIT? People writing tools cannot link to GPL code without open sourcing their code.

Relaxing the compulsory seat order

Currently the algorithm requires the seat configuration to be specified in a compulsory order (the code comment reads "seat designator should be in fixed order according to standard"). However, SSIM files are sometimes received that do not follow this fixed order, e.g. J50O24Y202VV10 instead of J50Y202O24VV10. As a result, _explode_aircraft_configuration_string() reads only part of the aircraft_configuration_string.

I would recommend focusing on the contents of aircraft_configuration_string instead of the fixed order in seat_class_designators, i.e. replacing

    string_remainder = aircraft_configuration_string.rstrip()

    seat_class_designators = [
        "P",
        "F",
        "A",
        "J",
        "C",
        "D",
        "I",
        "Z",
        "W",
        "S",
        "Y",
        "B",
        "H",
        "K",
        "L",
        "M",
        "N",
        "Q",
        "T",
        "V",
        "X",
        "G",
        "U",
        "E",
        "O",
        "R",
    ]

    cargo_designators = ["LL", "PP"]  # LL: unit load devices (containers), PP: pallets

    integer_designators = {"seats": seat_class_designators, "cargo": cargo_designators}

    acv_info = {}

    # seat designator should be in fixed order according to standard
    for designator_type, designators in integer_designators.items():
        for designator in designators:
            # only continue if the string starts with the designator and either
            # the designator is multi-character, or the string does not start
            # with a doubled letter (e.g. "VV", "BB") or with "V V"
            if string_remainder.startswith(designator) and (
                len(designator) > 1
                or not (
                    (len(string_remainder) > 1 and string_remainder[1] == string_remainder[0])
                    or string_remainder.startswith("V V")
                )
            ):

                acv_info_key = designator_type + "_" + designator

                # for next iteration, remove designator from beginning of string
                string_remainder = string_remainder[len(designator):]

                # the standard specifies that an int may follow. If not, use an empty string (to later distinguish it from NaN if the data is put in a data frame)
                acv_info_val = ""
                acv_regex = re.search(r"^\d*", string_remainder).group()

                # if int found, add it to total and remove it from string to process as well
                if acv_regex:
                    acv_info_val = int(acv_regex)
                    if designator_type == "seats":
                        if designator_type in acv_info.keys():
                            acv_info[designator_type] += acv_info_val
                        else:
                            acv_info[designator_type] = acv_info_val

                    string_remainder = string_remainder[len(acv_regex):]

                # store found information
                acv_info[acv_info_key] = acv_info_val

    # the remainder are general designators
    if string_remainder.startswith("BB"):
        acv_info["BB"] = ""

    # aircraft type
    if string_remainder.startswith("VV"):
        acv_info["VV"] = string_remainder[2:]
    elif string_remainder.startswith("V V"):  # aircraft type alt. Assuming it won't appear together with VV.
        acv_info["V V"] = string_remainder[3:]
    elif len(string_remainder.strip()):
        log_text = (
            "After trying to process aircraft configuration string, there should be no remainder. However, the following string remains in this instance: (%s)"
            % string_remainder
        )
        if raw_line:
            log_text += "\n Raw slot line: " + raw_line
        logging.warning(log_text)

by e.g.

    string_remainder = aircraft_configuration_string.rstrip().replace(' ', '')

    seat_class_designators = [
        "P",
        "F",
        "A",
        "J",
        "C",
        "D",
        "I",
        "Z",
        "W",
        "S",
        "Y",
        "B",
        "H",
        "K",
        "L",
        "M",
        "N",
        "Q",
        "T",
        "V",
        "X",
        "G",
        "U",
        "E",
        "O",
        "R",
    ]

    # unit load devices (containers), number of pallets
    cargo_designators = [
        "LL", 
        "PP"
    ]  

    # general designators
    general_designators = [
        "VV", 
        "BB"
    ]  

    allowed_designators = {
        **{x: "seats" for x in seat_class_designators},
        **{y: "cargo" for y in cargo_designators},
        **{z: "general" for z in general_designators}
    }

    acv_info = {}

    included_designators = re.findall(r'(\w+?)(\d+)', string_remainder)
    for designator_pair in included_designators:
        assert len(designator_pair) == 2, \
            "designator_pair does not contain type and number information"

        designator = designator_pair[0]
        acv_info_val = int(designator_pair[1])
        
        if designator in allowed_designators.keys():
            acv_info_type = allowed_designators[designator]
        else:
            log_text = (
                "Designator (%s) is not recognised as valid type. Please check schedule"
                % designator
            )
            if raw_line:
                log_text += "\n Raw slot line: " + raw_line
            logging.warning(log_text)
            acv_info_type = "unknown"
        acv_info_key = acv_info_type + "_" + designator            

        if acv_info_type == "seats":
            if acv_info_type in acv_info.keys():
                acv_info[acv_info_type] += acv_info_val
            else:
                acv_info[acv_info_type] = acv_info_val

        # store found information
        acv_info[acv_info_key] = acv_info_val
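
The order-independent idea above can also be tried in isolation. The following is only a sketch under stated assumptions, not the library's implementation: the function name `parse_configuration` and the constant names are hypothetical, the designator sets are copied from the lists quoted in this issue, and a doubled letter such as "VV" or "LL" is always read as one two-character designator.

```python
import re

# Designator sets copied from the lists quoted in this issue.
SEAT_DESIGNATORS = set("PFAJCDIZWSYBHKLMNQTVXGUEOR")
CARGO_DESIGNATORS = {"LL", "PP"}
GENERAL_DESIGNATORS = {"VV", "BB"}

# A token is a doubled letter (e.g. "LL", "VV") or a single letter,
# optionally followed by digits. The doubled form is tried first so that
# "VV" is not read as seat class "V". This is an assumption of the sketch.
TOKEN = re.compile(r"(([A-Z])\2|[A-Z])(\d*)")


def parse_configuration(config):
    """Parse a configuration string without assuming a designator order."""
    info = {}
    for match in TOKEN.finditer(config.replace(" ", "")):
        designator, digits = match.group(1), match.group(3)
        if designator in CARGO_DESIGNATORS:
            kind = "cargo"
        elif designator in GENERAL_DESIGNATORS:
            kind = "general"
        elif designator in SEAT_DESIGNATORS:
            kind = "seats"
        else:
            kind = "unknown"
        # Keep an empty string when no number follows the designator.
        value = int(digits) if digits else ""
        info[kind + "_" + designator] = value
        if kind == "seats" and digits:
            info["seats"] = info.get("seats", 0) + value
    return info
```

With this sketch, `parse_configuration("J50O24Y202VV10")` and `parse_configuration("J50Y202O24VV10")` return the same dictionary, because nothing depends on the position of a designator in the string.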

Allow multiple encoding formats apart from UTF-8

Currently the algorithm only seems to accept SSIM files in UTF-8 encoding. A UTF-16 file raises an error when it is read.

With a small change, more encodings could be accepted, e.g. by replacing

    with open(file, "r") as f:
        text = f.read()

in read(), by

    import os
    import chardet

    # sample the first bytes of the file and let chardet guess the encoding
    sample_size = min(32, os.path.getsize(file))
    with open(file, "rb") as f:
        raw = f.read(sample_size)
    encoding = chardet.detect(raw)["encoding"]

    with open(file, "r", encoding=encoding) as f:
        text = f.read()
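
If adding chardet as a dependency is not wanted, a narrower alternative is to look only for a byte order mark (BOM) using the standard library. This is a different technique from the chardet suggestion above: it recognises only files that start with a BOM and otherwise falls back to a default. The names `sniff_encoding` and `read_text` are hypothetical.

```python
import codecs

# Byte order marks, checked longest first: the UTF-32 LE mark begins with
# the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_encoding(path, default="utf-8"):
    """Return a codec name based on the file's BOM, or the default."""
    with open(path, "rb") as f:
        head = f.read(4)
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default


def read_text(path):
    """Read a text file, decoding it according to its BOM if present."""
    with open(path, "r", encoding=sniff_encoding(path)) as f:
        return f.read()
```

The "utf-16", "utf-32" and "utf-8-sig" codecs consume the BOM while decoding, so the returned text starts directly with the first record.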

Error on import

Probably missing 'install_requires' in setup.py:

Python 3.6.5 (default, Dec 26 2018, 13:41:35)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.2.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import ssim

ModuleNotFoundError Traceback (most recent call last)
in
----> 1 import ssim

***/ssim_parser/sim_env/lib/python3.6/site-packages/ssim/__init__.py in <module>
----> 1 from .ssim import read, expand_slots

***/ssim_parser/sim_env/lib/python3.6/site-packages/ssim/ssim.py in <module>
8 import re
9 from datetime import datetime, timedelta, date
---> 10 from dateutil.rrule import rrule, WEEKLY
11 import sys
12 import os

ModuleNotFoundError: No module named 'dateutil'

ValueError: Could not find required header information in file:

Hello,

Thanks for providing this utility. I am having some trouble processing a sample SSIM file. The error I get is:

Traceback (most recent call last):
  File "/usr/local/bin/ssim", line 11, in <module>
    load_entry_point('ssim==0.2.1', 'console_scripts', 'ssim')()
  File "/usr/local/lib/python2.7/dist-packages/ssim/__main__.py", line 15, in main
    slots, _, _ = ssim.read(input_file)
  File "/usr/local/lib/python2.7/dist-packages/ssim/ssim.py", line 537, in read
    slots, header, footer = _parse_slotfile(text, year_prefix=year_prefix)
  File "/usr/local/lib/python2.7/dist-packages/ssim/ssim.py", line 236, in _parse_slotfile
    raise ValueError('Could not find required header information in file:\n%s' % text[0:200])
ValueError: Could not find required header information in file:
1AIRLINE STANDARD SCHEDULE DATA SET      COPYRIGHT 2008, OAG WORLDWIDE LTD.  ALL RIGHTS RESERVED.                       18MAY08                                                                001000001

I assume there is a problem with my data. The first 20 lines look like:

00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
2LCA R0008S08 07MAY0800XXX0007MAY08OAG SSIM PRODUCT             18MAY08C                                                                                                                    EN1937000002
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
3 CA  1010101J12MAY0813MAY0812      PEK08000800+08003 HKG11251125+08001 738CDIJYBMHKLQGSXVUZWTE     XX                 II                                        M                              00000003
4 CA  1010101J              AB010PEKHKGCX 6101 /KA 1101                                                                                                                                           000004
4 CA  1010101J              AB109PEKHKGB B B B B B B B B B B B B B B B B B B B                                                                                                                    000005
4 CA  1010101J              AB503PEKHKG  9                                                                                                                                                        000006
4 CA  1010101J              AB505PEKHKGET                                                                                                                                                         000007
3 CA  1010201J14MAY0815MAY08  34    PEK08000800+08003 HKG11251125+08001 738CDIJYBMHKLQGSXVUZWTE     XX                 II                                        M                              00000008
4 CA  1010201J              AB010PEKHKGCX 6101 /KA 1101                                                                                                                                           000009
4 CA  1010201J              AB109PEKHKGB B B B B B B B B B B B B B B B B B B B                                                                                                                    000010
4 CA  1010201J              AB503PEKHKG  9                                                                                                                                                        000011
4 CA  1010201J              AB505PEKHKGET                                                                                                                                                         000012

Do you know what the problem might be?
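
One way to inspect a file like this before parsing is to tally the lines by their first character, which in the sample above appears to be the record type (1, 2, 3, 4), and to flag the zero-filled lines. This is only a diagnostic sketch and does not identify the cause of the error; the helper names `tally_record_types` and `is_padding` are hypothetical.

```python
from collections import Counter


def tally_record_types(text):
    """Count non-empty lines by their first character."""
    return Counter(line[0] for line in text.splitlines() if line)


def is_padding(line):
    """Return True for a non-empty line that consists only of zeros."""
    return line != "" and set(line) == {"0"}
```

Comparing the tally with what the parser expects to find at the top of the file may help narrow down whether the header lines or the zero-filled lines are the problem.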

Linting test fails

The Black check verifies that the codebase is formatted correctly. It currently is not, and it would be great if it were.

Automated logic tests

We have some logic tests for the parser. It would be great to run them after every commit to avoid introducing bugs.
We could use GitHub Actions to run these tests quite easily.

Trigger warning if 00XXX00 in SSIM

When the end date (or start date) of a slot line in an SSIM file is unknown, the airline sets it to 00XXX00. The parser then uses the specified season to determine the end date. In 'normal' situations, an SSIM file contains only one season. However, files currently cover very short timespans because of all the uncertainty involved. In combination with a nearby season change, this means that a file can contain multiple seasons. The specified season (user input) can be arbitrary or based on the current situation. Right now, it goes unnoticed that a) an infinity indicator (00XXX00) is present in the flight schedule and b) the end date of flights is limited by the selected season.

An example is presented in the figure below: an SSIM file that contains flights for winter season W20 (first line) and summer season S21 (second and third lines). If the user specifies season="W20" because it is the current season, the last slot line is expanded for the timespan [6th April 2021, enddate(W20) = 27th March 2021], whose end lies before its start. However, no warning is raised.
[Figure: SSIM excerpt with one W20 slot line followed by two S21 slot lines]

Therefore it might be useful to warn the user that the infinity indicator 00XXX00 has been found and has been set to the end date of the selected season, e.g. by adding the following code to _expand() in ssim.py

    log_text = (
        "00XXX00 has been found -> set to end of season " + season
    )
    logging.warning(log_text)

inside the following check:

    period_of_operation_to = record["period_of_operation_to"]
    if period_of_operation_to in infinity_indicators:
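
The suggested warning can be sketched as a standalone helper. This is an illustration under assumptions, not the code in ssim.py: the function name `check_open_ended`, its parameters and the `INFINITY_INDICATORS` set are hypothetical, and the season end date is passed in rather than looked up. It also adds a second warning for the case in the example, where the season end falls before the start of the slot line.

```python
import logging
from datetime import date

# Hypothetical stand-in for the parser's infinity indicators.
INFINITY_INDICATORS = {"00XXX00"}


def check_open_ended(period_to, period_from, season, season_end):
    """Log and return warnings for one slot line (the list may be empty).

    period_to is the raw end-date field, period_from and season_end are
    datetime.date objects, and season is the user-selected season code.
    """
    messages = []
    if period_to in INFINITY_INDICATORS:
        messages.append(
            "%s has been found -> set to end of season %s (%s)"
            % (period_to, season, season_end.isoformat())
        )
        if season_end < period_from:
            messages.append(
                "End of season %s (%s) lies before the start date %s"
                % (season, season_end.isoformat(), period_from.isoformat())
            )
    for message in messages:
        logging.warning(message)
    return messages
```

For the example above, `check_open_ended("00XXX00", date(2021, 4, 6), "W20", date(2021, 3, 27))` produces both warnings.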

dict contains fields not in fieldnames: 'seats_F'

Hello,

I am trying to parse an SSIM file but cannot, because I get the following error. Do you know what it means?

[2023-09-20 19:37:19,608] INFO: Expanded 15546 slots into 384146 flights.
Traceback (most recent call last):
  File "/usr/local/bin/ssim", line 8, in <module>
    sys.exit(main())
  File "/Users/miguelgarciaalfaro/Library/Python/3.8/lib/python/site-packages/ssim/__main__.py", line 23, in main
    dict_writer.writerows(flights)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/csv.py", line 157, in writerows
    return self.writer.writerows(map(self._dict_to_list, rowdicts))
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/csv.py", line 149, in _dict_to_list
    raise ValueError("dict contains fields not in fieldnames: "
ValueError: dict contains fields not in fieldnames: 'seats_F'

Thank you
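
The traceback shows csv.DictWriter rejecting a row that has a key ('seats_F') missing from its fieldnames. One possible cause, which is an assumption because the code that builds the field names is not shown here, is that the field names are taken from only some of the flights while other flights carry extra seat classes. A minimal sketch of a workaround collects the keys of all rows first; the names `collect_fieldnames` and `write_flights` are hypothetical.

```python
import csv
import io


def collect_fieldnames(rows):
    """Return every key that occurs in any row, in first-seen order."""
    fieldnames = {}
    for row in rows:
        for key in row:
            fieldnames.setdefault(key, None)
    return list(fieldnames)


def write_flights(rows, stream):
    """Write rows as CSV, leaving cells empty where a row lacks a key."""
    writer = csv.DictWriter(
        stream, fieldnames=collect_fieldnames(rows), restval=""
    )
    writer.writeheader()
    writer.writerows(rows)
```

For example, writing two flights to an `io.StringIO()` buffer, one with only `seats_Y` and one with only `seats_F`, yields a header containing both columns and an empty cell where a flight has no value.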
