cipher-project-1's Introduction

Hi there 👋

cipher-project-1's People

Contributors

amp813, evanrichter, mittelmg

cipher-project-1's Issues

ability to spell check a very close plaintext into a perfectly plausible plaintext

since we know the exact wordlist used, we should be able to take a very close plaintext and use "spell checking" to replace the few words that don't quite match with exact words from the wordlist.

even better would be to do a few "spell checks", note which key index was involved in each correction, and try to apply that pattern to the rest of the text, seeing if that makes more words match automatically. this technique needs the assumed keylength that was guessed in previous steps.

this function would be applied after #20, but can be developed and tested in parallel. For testing, use keys that are mostly zeros, for example: [ 0, 0, 0, 0, 0, 0, -2, 0, 0, 1, 0, 0 ] with a simple RepeatingKey schedule, to simulate a "close" plaintext. Also throw in a light PeriodicRand in some tests.
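A minimal sketch of the "spell check" step, assuming the candidate plaintext is split on single spaces and that closeness is measured with Levenshtein edit distance. That metric is a choice made for this example, not something the project specifies, and the names `edit_distance`, `correct_word`, `spell_check`, and `max_dist` are all hypothetical.

```rust
/// Levenshtein edit distance between two strings, counted in chars.
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the first i-1 chars of `a` and the first j chars of `b`
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for i in 1..=a.len() {
        // cur[0] = i: deleting i chars turns a[..i] into the empty string
        let mut cur = vec![i; b.len() + 1];
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            cur[j] = (prev[j] + 1) // deletion
                .min(cur[j - 1] + 1) // insertion
                .min(prev[j - 1] + cost); // substitution or match
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Returns the closest wordlist entry, if one lies within `max_dist` edits.
/// On ties, the entry that appears first in the wordlist wins.
fn correct_word<'a>(word: &str, wordlist: &[&'a str], max_dist: usize) -> Option<&'a str> {
    wordlist
        .iter()
        .map(|&w| (edit_distance(word, w), w))
        .filter(|&(d, _)| d <= max_dist)
        .min_by_key(|&(d, _)| d)
        .map(|(_, w)| w)
}

/// Replaces every space-separated word with its closest wordlist entry,
/// leaving words with no close enough match untouched.
fn spell_check(plaintext: &str, wordlist: &[&str], max_dist: usize) -> String {
    plaintext
        .split(' ')
        .map(|w| correct_word(w, wordlist, max_dist).unwrap_or(w))
        .collect::<Vec<_>>()
        .join(" ")
}
```

This covers only the word-by-word correction. Recording which key index each correction touched, and propagating that to the rest of the text, would need the guessed keylength and is left out here.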

Implement the example encryption

In the project description, one possible method of encryption is given. We should implement it so we can verify that we can crack it.

integer underflow in OffsetReverse

When rustc compiles in release mode, integer overflow checks are disabled by default, so this subtraction silently wraps on underflow instead of panicking:

let inverted_index = last_char - (index % eff_key_length);

When this happens, the returned key index wraps around and is actually quite large, around 0xffffffffffffffff! This makes the encryptor pick a random char to insert instead, and thus makes the cracking more difficult than intended.

we can try usize::saturating_sub() to floor the subtraction result at 0, but I'm not sure that's what is intended.
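To compare the options, here is a small sketch of the same subtraction written with the explicit `usize` methods from the standard library. The three function names are hypothetical stand-ins for the expression above, and which behavior OffsetReverse is supposed to have is still the open question.

```rust
/// Returns None on underflow so the caller must decide what to do.
/// Assumes eff_key_length is non-zero (the modulo panics otherwise).
fn inverted_index_checked(last_char: usize, index: usize, eff_key_length: usize) -> Option<usize> {
    last_char.checked_sub(index % eff_key_length)
}

/// Clamps the result at 0 on underflow.
fn inverted_index_saturating(last_char: usize, index: usize, eff_key_length: usize) -> usize {
    last_char.saturating_sub(index % eff_key_length)
}

/// Wraps around on underflow in every build profile; this is what the
/// plain `-` operator ends up doing in a default release build.
fn inverted_index_wrapping(last_char: usize, index: usize, eff_key_length: usize) -> usize {
    last_char.wrapping_sub(index % eff_key_length)
}
```

With `last_char = 2`, `index = 5`, and `eff_key_length = 7`, the checked version returns `None`, the saturating version returns `0`, and the wrapping version returns `usize::MAX - 2`. The checked version at least makes the underflow visible at the call site instead of hiding it.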

given keylength, crack ciphertext by "ranking" the plaintext output

after guessing key values, there needs to be a way to figure out the best candidates for the true key value.

we have access to the dictionary of plaintext words, so we can compute the expected character frequency with either strategy:

  • just read every word from the dict and get a frequency. simple, words are sampled randomly so should be representative
  • take the dict and build a plaintext of 10,000 words or so. then take character frequency of that

then we need a way to measure how "close" a string of characters is to the expected character frequency distribution
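A rough sketch of the ranking idea, assuming the expected distribution is built from some sample text (the dictionary words joined together, or a generated plaintext) and that closeness is the sum of squared differences between relative frequencies. The metric and the names `char_frequencies` and `frequency_distance` are assumptions made for this example; a chi-squared statistic would be another reasonable choice.

```rust
use std::collections::HashMap;

/// Relative frequency of each char in `text` (values sum to 1 for non-empty input).
fn char_frequencies(text: &str) -> HashMap<char, f64> {
    let mut counts: HashMap<char, f64> = HashMap::new();
    let mut total = 0.0;
    for c in text.chars() {
        *counts.entry(c).or_insert(0.0) += 1.0;
        total += 1.0;
    }
    if total > 0.0 {
        for v in counts.values_mut() {
            *v /= total;
        }
    }
    counts
}

/// Sum of squared differences between the candidate's char frequencies and
/// the expected ones. Lower means closer; 0.0 means identical distributions.
fn frequency_distance(candidate: &str, expected: &HashMap<char, f64>) -> f64 {
    let observed = char_frequencies(candidate);
    let mut dist = 0.0;
    // chars the expected distribution knows about
    for (c, e) in expected {
        let o = observed.get(c).copied().unwrap_or(0.0);
        dist += (o - e).powi(2);
    }
    // chars that appear only in the candidate (expected frequency is 0)
    for (c, o) in &observed {
        if !expected.contains_key(c) {
            dist += o.powi(2);
        }
    }
    dist
}
```

Candidate plaintexts can then be sorted by `frequency_distance`, lowest first, to pick the most plausible key values.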

Define a "Scheduler" trait

all key scheduling algorithms take the same inputs and produce the same type of output, so it should be a trait.

then any cipher can use any key scheduler by calling the trait function

or maybe this could just be a function type
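Both options could look roughly like the sketch below. The signature is an assumption: it supposes a scheduler maps a message position and the key length to a key index. The real schedulers may take other inputs, such as the plaintext length. `RepeatingKey` is named in another issue, but its body here, and the helper `scheduled_shifts`, are guesses for illustration only.

```rust
/// Assumed shape: position in the message + key length -> index into the key.
trait Scheduler {
    fn schedule(&self, index: usize, key_length: usize) -> usize;
}

/// Simplest schedule: cycle through the key in order.
struct RepeatingKey;

impl Scheduler for RepeatingKey {
    fn schedule(&self, index: usize, key_length: usize) -> usize {
        index % key_length
    }
}

/// Hypothetical cipher-side helper: any cipher can use any scheduler through
/// the trait. Returns the shift applied at each position. Assumes a non-empty key.
fn scheduled_shifts<S: Scheduler>(scheduler: &S, key: &[i8], message_len: usize) -> Vec<i8> {
    (0..message_len)
        .map(|i| key[scheduler.schedule(i, key.len())])
        .collect()
}

/// The plain function type alternative, with the same assumed signature.
type SchedulerFn = fn(usize, usize) -> usize;

fn repeating_key(index: usize, key_length: usize) -> usize {
    index % key_length
}
```

The trait leaves room for schedulers that carry state or configuration (a seed, for example), while the function type is lighter if every scheduler is a pure function of its arguments.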

confirm keylength guessing with randomized tests

keylength guessing works most of the time but that's not good enough.

In the tests that pass, the correct keylength or multiple of the keylength appears in the top 5 results, out of 70 or so key lengths guessed.

There seems to be an issue where longer keys are slightly favored over shorter keys, so I need to adjust the chunked hamming distance weights and penalize keylength a little bit.
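For reference, one way a chunked Hamming distance score with a keylength penalty could look. This is a sketch under assumptions, not the project's actual scoring: it compares adjacent key-sized chunks, normalizes by the number of bytes compared, and adds a linear `penalty` per key byte. The names `hamming_bits`, `keylength_score`, and `rank_keylengths` are hypothetical.

```rust
/// Number of differing bits between two byte slices (over the shorter length).
fn hamming_bits(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Average bit distance per byte between adjacent key-sized chunks, plus a
/// linear penalty on the key length. Lower is better. Returns None when the
/// ciphertext does not contain at least two full chunks.
fn keylength_score(ciphertext: &[u8], key_length: usize, penalty: f64) -> Option<f64> {
    if key_length == 0 {
        return None;
    }
    let chunks: Vec<&[u8]> = ciphertext.chunks_exact(key_length).collect();
    if chunks.len() < 2 {
        return None;
    }
    let pairs = chunks.len() - 1;
    let total: u32 = chunks.windows(2).map(|w| hamming_bits(w[0], w[1])).sum();
    let per_byte = total as f64 / (pairs * key_length) as f64;
    Some(per_byte + penalty * key_length as f64)
}

/// Scores every key length from 1 to max_len and sorts best (lowest) first.
fn rank_keylengths(ciphertext: &[u8], max_len: usize, penalty: f64) -> Vec<(usize, f64)> {
    let mut scored: Vec<(usize, f64)> = (1..=max_len)
        .filter_map(|k| keylength_score(ciphertext, k, penalty).map(|s| (k, s)))
        .collect();
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored
}
```

With a small positive `penalty`, a true keylength and its multiples no longer tie: the shorter one ranks first. The right penalty value would have to come out of the randomized tests.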
