Comments (5)

rtobar avatar rtobar commented on July 23, 2024

@szampier the third variant is actually just like crc32, but it's consistent across Python versions; it's described on the same page of the documentation. Have you given it a try? As the docs say, crc32 was kept for backwards compatibility, but crc32z should be preferred in newer installations.

from ngas.

szampier avatar szampier commented on July 23, 2024

My point is that crc32 is not backward compatible: when we upgrade to Python 3, half of the checksums in the DB (the negative ones) will become invalid, and this will break the DCC. I know that we can update the checksums in the database, but we also have a number of tools which rely on the signed checksum. The third variant doesn't help in this case either. What I'm asking for is a variant (preferably the existing crc32) which consistently generates a signed integer independently of the Python version. Something like:

checksum = checksum if checksum < 2**31 else checksum - 2**32
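To make the suggestion above concrete, here is a minimal sketch of how the conversion could be wrapped around `zlib.crc32`. The helper names `to_signed32` and `signed_crc32` are hypothetical and are not part of NGAS; the extra mask is only there so that the function gives the same result whether the input is already signed (Python 2 style) or unsigned (Python 3 style).

```python
import zlib


def to_signed32(checksum):
    """Map a 32-bit checksum onto the signed range [-2**31, 2**31).

    Hypothetical helper, not NGAS code. Masking first makes the function
    idempotent: an already-signed value maps to itself.
    """
    checksum &= 0xffffffff
    return checksum if checksum < 2**31 else checksum - 2**32


def signed_crc32(data):
    """CRC32 of `data` as a signed integer, independent of Python version."""
    return to_signed32(zlib.crc32(data))
```

With this, a value such as 4294967295 (what Python 3 would report) becomes -1 (what Python 2 would have stored), and values below 2**31 are left unchanged.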

rtobar avatar rtobar commented on July 23, 2024

@szampier I see, sorry I didn't get your meaning the first time, I think I see the issue now.

In hindsight I should probably have implemented the crc32 checksum variant as you suggest when porting the code to Python 3; that way existing checksums would have been valid across Python versions without any further changes. Sadly I didn't have this insight, and chose a different path: keeping the implementation of crc32 as-is, but offering a new variant that would give the same value in both Python versions -- useful, but not in your case.

Since crc32 has been out there for a while already, even in Python 3 installations, I'd be hesitant to change its meaning now. A new variant would then have to be added that is backwards compatible with the existing Python 2 values of crc32. Do you think you could provide a patch implementing this? Otherwise, when I get some time I can give this a try.

I don't think I've ever dealt with ngamsCrc32.c, as I haven't really minded the C library that much -- I have fixed a few things here and there, but overall it's pretty difficult to maintain and it has too much technical debt accumulated.

As a side note, I don't really like how checksums are stored in NGAS: we store a string representation of an integer, which in turn is a representation of a sequence of bytes, when it would be far better to simply store the bytes directly. Not much can be done about this one though, unless we add a new column to the ngas_files table and change the code.
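As a rough illustration of the difference between the two representations, the sketch below converts a stored checksum string to its 4 raw bytes and back. The function names and the big-endian byte order are assumptions made for the example only; nothing here reflects how NGAS or the ngas_files table actually works.

```python
import struct


def crc32_to_bytes(value):
    """Pack a crc32 (int or its string form, signed or unsigned) into 4 bytes.

    Hypothetical helper; big-endian order is an assumption for this sketch.
    """
    return struct.pack(">I", int(value) & 0xffffffff)


def crc32_from_bytes(raw):
    """Unpack 4 big-endian bytes into an unsigned 32-bit integer."""
    return struct.unpack(">I", raw)[0]
```

Note that the signed and unsigned string forms of the same checksum ("-1" and "4294967295", for instance) collapse to the same 4 bytes, which is the ambiguity that storing the bytes directly would avoid.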

szampier avatar szampier commented on July 23, 2024

Hi @rtobar, after checking the code more carefully I realized that NGAS already handles the above issue by providing a variant-specific comparison function (the equals field in checksum_info), which is then used to check the checksum against a reference value stored in the database or provided by the user. The good news is that for crc32 the comparison function masks both checksums with 0xffffffff before comparing them. This means that we don't have a problem with negative checksums in the DB when using Python 3, as I initially thought 😄 .
I successfully tested the following cases with NGAS running on Python 3 and using crc32:

  • executing the CHECKFILE command on a file which has a negative checksum in the database
  • archiving a file which has a negative (signed) checksum and passing the checksum alongside the archive command. NGAS checks the validity of the provided checksum in this case.

We need to perform more tests to be sure that everything works fine when running the new NGAS on Python 3 but so far it looks good.
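For readers following along, the masking comparison described above can be sketched roughly as follows. This is not the actual NGAS implementation; `crc32_equals` is a hypothetical name, and the `int()` conversion assumes the reference value may arrive as a string, as it would when read from the database.

```python
def crc32_equals(a, b):
    """Compare two crc32 values regardless of signedness.

    Hypothetical stand-in for a variant-specific comparison function.
    Masking with 0xffffffff maps a signed value such as -1 and its
    unsigned counterpart 4294967295 onto the same 32-bit pattern.
    """
    return (int(a) & 0xffffffff) == (int(b) & 0xffffffff)
```

So a negative checksum stored by a Python 2 installation compares equal to the unsigned value computed under Python 3, while genuinely different checksums still compare unequal.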

rtobar avatar rtobar commented on July 23, 2024

@szampier ahhhh it's reassuring to know that my former self thought about this more carefully than I remembered! I should have read the code before replying, thanks for the great news :-).

There is obviously still room for problems if external tools try to compare these checksums, but within NGAS it sounds like all checks are done correctly.
