mlbelobraydi / txrrc_data_harvest

Script for accessing and organizing oil and gas well data from the Texas Railroad Commission

License: The Unlicense

Jupyter Notebook 95.67% Python 4.33%
gas hacktoberfest oil texas-railroad-commission txrrc-data-harvest

txrrc_data_harvest's People

Contributors

dcslagel, deflateawning, mlbelobraydi

txrrc_data_harvest's Issues

TXRRC file location changes

Describe the bug
TXRRC is no longer using an FTP server, so the documentation and code are out of date.

To Reproduce
Attempt to download any file over FTP, or to connect to the FTP server.

Expected behavior
Connections should succeed.

Additional context
It might be good to have a config file that points to the file locations, so that a location change is made in one place and cascades to all the code that depends on it.
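
A minimal sketch of what such a config could look like, assuming a single Python module that every download script imports. The base URL, the dataset names, and the relative paths are placeholders invented for illustration; the real locations are not given in this issue.

```python
from urllib.parse import urljoin

# Hypothetical placeholder values: the real host and paths are not stated here.
DATA_SOURCES = {
    "base_url": "https://example.invalid/rrc/",
    "files": {
        "oil_ledger": "oil/ledger.zip",
        "well_bore": "wells/wellbore.zip",
    },
}


def file_url(name, config=DATA_SOURCES):
    """Return the full download URL for a named dataset."""
    try:
        relative_path = config["files"][name]
    except KeyError:
        raise KeyError(f"unknown dataset: {name!r}") from None
    return urljoin(config["base_url"], relative_path)
```

With this layout, a move to a new host only requires editing `base_url`, and every caller of `file_url` picks up the change.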

Comp-3 function

Is your feature request related to a problem? Please describe.
The production data has several fields that are comp-3 (packed decimal).

Describe the solution you'd like
A comp-3 field occupies fewer bytes than the number of digits it decodes to. The pic_signed function does not account for this, so an additional function needs to be created.

Describe alternatives you've considered
I tried to find a way to modify pic_signed, but that isn't possible; comp-3 will require a new function.

Additional context
Information on Comp-3 can be found here:
http://www.3480-3590-data-conversion.com/article-packed-fields.html
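
As a rough sketch of the requested function, assuming the standard packed-decimal layout: each byte holds two 4-bit digits, except the last byte, whose low nibble is the sign (0xB or 0xD negative, the other values from 0xA to 0xF positive or unsigned). The name `unpack_comp3` and its parameters are hypothetical and are not part of the existing code.

```python
from decimal import Decimal


def unpack_comp3(raw: bytes, decimals: int = 0) -> Decimal:
    """Decode a packed-decimal (COMP-3) field into a Decimal.

    `decimals` is the number of implied decimal places (the digits after
    the V in the PIC clause).
    """
    if not raw:
        raise ValueError("empty COMP-3 field")

    # Split every byte into its high and low nibble.
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)

    # The final nibble is the sign; everything before it is a digit.
    sign_nibble = nibbles.pop()
    if sign_nibble < 0xA:
        raise ValueError(f"invalid sign nibble: {sign_nibble:#x}")
    if any(digit > 9 for digit in nibbles):
        raise ValueError("invalid digit nibble in COMP-3 field")

    value = 0
    for digit in nibbles:
        value = value * 10 + digit
    if sign_nibble in (0xB, 0xD):
        value = -value

    # Shift the decimal point left by the implied number of places.
    return Decimal(value).scaleb(-decimals)
```

An n-byte field therefore carries 2n - 1 digits, which is why the byte positions in the layout are shorter than the decoded digit count.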

Workflow documentation?

Nice work so far. It might be helpful to provide some documentation detailing the order in which the scripts should be run, as well as an overview of what each script does (outside of the comments in each notebook). This would make it easier for folks to pick it up and run with it. Looking forward to digging in and seeing what all this is capable of.

python struct format generated from Cobol copybooks at RRC

Is your feature request related to a problem? Please describe.
I wanted to generate Python struct formats for the COBOL copybooks and, for the computational numeric fields, emit the hex for the signed/unsigned values of specific fields. I am partway there, but I wanted to bring this to your attention to see if you think this would be good. This way, no one would need to hand-code the parsing of the structures.

Describe the solution you'd like
I would like to be able to take the copybook from a full COBOL program, parse the data division, and generate struct formats so that each section can be parsed directly in Python without hand-coding the field lengths, as seems to be the direction now. I am working on the Oil Ledger files now, using the copybook defined in the Oil Ledger PDF.

Describe alternatives you've considered
I considered writing a parser for the copybook myself, but a Cobol84.g4 grammar file exists for ANTLR4, so I can just use that and generate a Listener that walks the symbol table and generates the struct formats.

Additional context
I am adding unit tests to make sure the code works as I tweak it.
I would like to integrate this into your repo and contribute to that.
My main interest is in parsing out as much oil/gas well data as possible so I can continue my machine learning project, which will look for aberrations in production data for wells over time.
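
To illustrate the idea of deriving struct formats from PIC clauses, here is a simplified sketch. It does not use ANTLR or the Cobol84.g4 grammar; it swaps in a small regular expression that only understands X, 9, S, and V pictures with DISPLAY or COMP-3 usage, and it assumes any DISPLAY sign is overpunched rather than stored in a separate byte. The function names and the sample field names are hypothetical.

```python
import re
import struct

_REPEAT = re.compile(r"([X9])\((\d+)\)")


def expand_picture(picture: str) -> str:
    """Expand repeat counts, e.g. 'S9(3)V99' becomes 'S999V99'."""
    return _REPEAT.sub(lambda m: m.group(1) * int(m.group(2)), picture.upper())


def field_length(picture: str, usage: str = "DISPLAY") -> int:
    """Return the storage size in bytes of one elementary field."""
    expanded = expand_picture(picture)
    # S (sign) and V (implied decimal point) occupy no storage of their own.
    positions = expanded.count("9") + expanded.count("X")
    if usage == "DISPLAY":
        return positions
    if usage == "COMP-3":
        # Two digits per byte, plus one nibble for the sign.
        return positions // 2 + 1
    raise ValueError(f"unsupported usage: {usage}")


def struct_format(fields) -> str:
    """Build a struct format that slices a record into raw byte fields."""
    return "".join(f"{field_length(pic, usage)}s" for _name, pic, usage in fields)


def record_size(fields) -> int:
    """Return the total record length implied by the field list."""
    return struct.calcsize(struct_format(fields))
```

For a made-up layout such as `[("LEASE-NO", "9(5)", "DISPLAY"), ("OPERATOR-NAME", "X(10)", "DISPLAY"), ("PROD-AMT", "S9(7)V99", "COMP-3")]`, `struct_format` yields `"5s10s5s"`. Each slice from `struct.unpack` is still raw bytes and needs its own numeric or text decoding afterwards.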

Longitude not being recognized as negative

@skylerbast, I'll be pushing the bytes version of the code, and I'm not sure whether the values are signed or unsigned. It is now capturing the last digits of the value, but I'm not sure the section below is working correctly.

If the penultimate nibble == 0xD, then the number is negative. Otherwise,
it is either positive or unsigned.
val = (val * (-1 if signed_raw[-1] >> 4 == 0xD else 1)) / 10**decimal

Would it be possible to chat with you about how this is supposed to work?
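
For reference, a minimal sketch of how the sign check behaves if the field is an EBCDIC zoned decimal with an overpunched sign, which is an assumption: the issue does not confirm the field type. In that encoding every byte carries one digit in its low nibble, and the high nibble of the last byte is the sign (0xD negative, 0xC or 0xF positive or unsigned). The function name `decode_zoned` is hypothetical.

```python
def decode_zoned(raw: bytes, decimals: int = 0) -> float:
    """Decode an EBCDIC zoned-decimal field with an overpunched sign."""
    if not raw:
        raise ValueError("empty zoned-decimal field")

    value = 0
    for byte in raw:
        digit = byte & 0x0F  # low nibble of every byte is a digit
        if digit > 9:
            raise ValueError(f"invalid digit nibble in byte {byte:#04x}")
        value = value * 10 + digit

    # High nibble of the last byte (the penultimate nibble) is the sign.
    if raw[-1] >> 4 == 0xD:
        value = -value
    return value / 10**decimals
```

Under this assumption the bytes `F0 F9 F7 D5` with one decimal place decode to -97.5. The check only works on the raw bytes: once the field has been decoded to text, the sign nibble is no longer available. In a packed (COMP-3) field the sign is the final nibble instead, so the same test would read the wrong nibble.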

supporting systems with limited memory.

Is your feature request related to a problem? Please describe.
No

Describe the solution you'd like
Currently the script opens and decodes the entire file in memory. This can cause issues on systems with limited memory (less than 8 GB of RAM). It may be better to read the file in parts and release each part once it has been processed, so that memory use stays low.

Describe alternatives you've considered
opening and reading by line
decoding as necessary
writing results to disk and not holding it in memory
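
The three alternatives above can be combined in a generator. This is a sketch under the assumption that the input consists of fixed-length records with no line terminators; `record_length` and the `decode` callback are hypothetical stand-ins for the real layout and parsing code.

```python
import csv


def iter_records(stream, record_length):
    """Yield one fixed-length record at a time from a binary stream."""
    while True:
        record = stream.read(record_length)
        if not record:
            return
        if len(record) < record_length:
            raise ValueError("truncated trailing record")
        yield record


def convert(src, dst, record_length, decode):
    """Decode records one by one and write each result straight to CSV.

    `src` is a binary stream, `dst` is a text stream, and `decode` turns
    one raw record into a list of column values.
    """
    writer = csv.writer(dst)
    for record in iter_records(src, record_length):
        writer.writerow(decode(record))
```

Only one record is held in memory at a time. When `src` comes from `open(path, "rb")`, Python's buffered reader keeps the small reads efficient.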

Additional context
Any changes will need to be tested with limited memory.

Preserve original entry of date information along with conversion

Is your feature request related to a problem? Please describe.
Date fields do not always contain values that convert cleanly to a datetime. The original data needs to be preserved so that any usable information in the entry can be recovered for manual correction.

Describe the solution you'd like
Preserve the original value alongside the datetime conversion.

Describe alternatives you've considered
If we keep the nulls, the original data also needs to be added back in. Is it possible to parse and correct out-of-range months and days, with a flag column that distinguishes actual from estimated dates?

Additional context
This is important for completions tracking. A month and/or year is better than nothing. DSTs need to be linked to the right open section.
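
One possible shape for this, sketched under the assumption that the raw field is eight characters in YYYYMMDD order (the issue does not state the format). Out-of-range months and days are clamped to the nearest valid value and flagged as estimated; the function name and the dictionary keys are hypothetical.

```python
import calendar
from datetime import date


def parse_date(raw: str) -> dict:
    """Convert a YYYYMMDD string while keeping the original entry.

    Returns the untouched original, the converted date (or None), and a
    flag that is True when the month or day had to be adjusted.
    """
    result = {"original": raw, "date": None, "estimated": False}
    text = raw.strip()
    if len(text) != 8 or not (text.isascii() and text.isdigit()):
        return result

    year, month, day = int(text[:4]), int(text[4:6]), int(text[6:])
    if year == 0:
        return result  # datetime.date cannot represent year 0

    # Clamp the month to 1..12 and the day to the length of that month.
    clamped_month = min(max(month, 1), 12)
    last_day = calendar.monthrange(year, clamped_month)[1]
    clamped_day = min(max(day, 1), last_day)

    result["date"] = date(year, clamped_month, clamped_day)
    result["estimated"] = (clamped_month, clamped_day) != (month, day)
    return result
```

An entry such as `19850000` keeps its year and comes back as 1985-01-01 with `estimated` set, while unparseable entries return a null date with the original text intact for manual review.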
