pyca / cryptography

cryptography is a package designed to expose cryptographic primitives and recipes to Python developers.

Home Page: https://cryptography.io

License: Other

Shell 0.05% Python 74.60% Rust 25.34%

python cryptography

cryptography's People

lvh dstufft exarkun tiran reaperhulk wallrj alex ivoz pombredanne cyli dlitz public jmvrbanac nidhog glyph rgbkrk georgedorn dmendiza sholsapp cipherself parlarjb manuels bluehorn jonathan-hepp chrisglass manishearth fedor-brunner smurfix zxc911 hacosta fangli ferencyao habnabit hellais antonukvat pyaa jgiannuzzi kimvais lukasa securapawn saschpe deed02392 gooddingo sankarshan-mudkavi anotherthomas gantenbein ashfall ashumeow derwolfe valorien mytemp crc32a timic nryoung msystechnologiesllc b-rich leizhang kennwhite chjwdzhr chjwdzhrxp genba akgood bdpayne twodimension lf2225 frewsxcv captainwong bartjkdp pombreda src0108 michael-hart elbow-jason maxking esaul majord4m4ge rodneyrichardson viraptor rpidanny hasimir sashka bwhmather mlalic jonahgraham vdcow delroth callidus orcasgit sigma-random mark-adams greghaynes emonty richmoore schlenk samucc abdulsidibe ghisvail julian yxjsolid gosom lmctv

cryptography's Issues

Using OpenSSL backend should not produce warnings on OS X

We should get the build warning-clean.

Figure out if we got the CTR API right

Right now we just do what OpenSSL does, i.e. the nonce is a random blob. Sometimes "nonce" means something else.

/cc @lvh

Mission statement and target audience

As discussed on Twitter, let's add a mission statement to cryptography.

It should explain:

reason for a new Python crypto package
purpose
target audience
target platforms (Python 2.7+, 3.2+, PyPy)
supported OS (Linux, *BSD, Windows, ???)
supported architectures (X86, X86_64, X32, ARMv6+ ?)
supported compilers (GCC 4.x, clang, MSVC)
C standard (C89 or partial C99, optional GNU extension?)

Document target platforms

Document backwards compatibility policy

Add key derivation functions (low level only)

I propose the addition of key derivation and key stretching algorithms to cryptography. KDFs are state of the art algorithms to securely hash passwords. The three most well-known algorithms are:

HMAC based PBKDF2
bcrypt (blowfish crypt)
scrypt

OpenSSL has PKCS5_PBKDF2_HMAC() http://bugs.python.org/issue18582 .

OpenSSL also has blowfish but the API is not suitable for bcrypt. The bcrypt algorithm needs one low level function to modify blowfish's internal state. bcrypt also implements its own base64 encoding that is not compatible with standard base64.

scrypt can be implemented on top of OpenSSL's HMAC_SHA256().

IMHO the API should be as low level as possible so 3rd party libraries can be built on top of the APIs. The low level API shall neither handle encodings nor parse crypt(3) / shadow(5) password entries. Password, salt and key shall be bytes. E.g.::

pbkdf2_hmac(hash_name, password_bytes, salt_bytes, iterations, keylen) -> key
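A minimal sketch of what such a bytes-in, bytes-out call could look like. It delegates to the standard library's hashlib.pbkdf2_hmac (Python 3.4+) purely for illustration; it is not this project's API, and the iteration count below is an arbitrary example value.

```python
import hashlib


def pbkdf2_hmac(hash_name, password_bytes, salt_bytes, iterations, keylen):
    """Derive `keylen` bytes of key material; every input and output is bytes."""
    # No encoding handling and no crypt(3)-style parsing, as proposed above.
    return hashlib.pbkdf2_hmac(
        hash_name, password_bytes, salt_bytes, iterations, keylen
    )


# Example call: the salt and iteration count are illustrative values only.
key = pbkdf2_hmac("sha256", b"password", b"salt", 100000, 32)
```

Anything that deals with text passwords or stored hash formats would then sit in a higher layer built on top of this function.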

Features needed to replace pycrypto in twisted-conch

Document how to contribute private things

The CONTRIBUTING file, once merged, suggests PEP 8, which suggests underscores:

Even with __all__ set appropriately, internal interfaces (packages, modules, classes, functions, attributes or other names) should still be prefixed with a single leading underscore.

However, on the initial repo pull request, @alex commented:

I hate naming modules with an underscore prefix. I think not documenting c should be enough. I agree that OpenSSL is an implementation detail.

FWIW, I disagree on both counts:

I love the _ prefix because it's unambiguous and impossible to miss. A note in a module somewhere is easy to miss, and no note at all even easier :)
Undocumented is not the same as private, it's just undocumented.
Private things ought to be documented too (particularly since we're doing crypto, and using things the wrong way is typically disastrous)

... but that's just my two satoshis.

Either way, this should be documented in the CONTRIBUTING file.

Check IV lengths are correct for cipher

OpenSSL can't be bothered to check these itself, because crypto is real easy and not important.
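A rough sketch of the kind of check meant here, assuming a mode such as CBC where the IV must be exactly one cipher block long. The helper name and error message are hypothetical, not existing code in this repository.

```python
def check_iv_length(iv, block_size_bits):
    """Raise ValueError unless `iv` is exactly one cipher block long."""
    expected = block_size_bits // 8
    if len(iv) != expected:
        raise ValueError(
            "Invalid IV size ({0} bytes); expected {1} bytes".format(
                len(iv), expected
            )
        )


# AES has a 128-bit block, so a 16-byte IV passes silently.
check_iv_length(b"\x00" * 16, 128)
```

Doing this in Python before the IV reaches the backend means a wrong length fails loudly instead of being silently accepted.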

STACK_OF -dependent functions in openssl bindings ssl.py

The following functions were in https://github.com/exarkun/opentls/blob/b38a0cb646c34fe272fd747d6ec096416e279ef8/tls/c/ssl.py#L285:

struct stack_st_X509 *SSL_get_peer_cert_chain(const SSL *);
struct stack_st_SSL_CIPHER *SSL_get_ciphers(const SSL *);
struct stack_st_X509_NAME *SSL_get_client_CA_list(const SSL *);
struct stack_st_X509_NAME *SSL_CTX_get_client_CA_list(const SSL_CTX *);
void SSL_CTX_set_client_CA_list(SSL_CTX *, struct stack_st_X509_NAME *);

However, OpenSSL 0.9.8 and 1.0.0 both use STACK_OF instead of stack_st_X509. stack_st_X509 is only valid if DEBUG_SAFESTACK is defined in safestack.h.

PyOpenSSL seems to require this, but not sure how to treat it with CFFI.

The X509 bindings seem to use stack_st_X509 fine without any errors.

Fix OpenSSL bindings under OpenSSL 0.9.8

Here is a strategy we can use for things with significantly different signatures:

ffi.cdef("""
int PyCryptography_HMAC_new(...);
""")

ffi.verify("""
int PyCryptography_HMAC_new(...) {
#if OPENSSL_VERSION_NUMBER >= 0x10000000L  /* assumed cutoff: OpenSSL 1.0.0 */
    return HMAC_new(...);
#else
    HMAC_new(...);
    return 0;
#endif
}
""")

Wrap GPGME

Since we want to be the single answer to all crypto questions on Python, how about wrapping GPGME too?

ISTM that most people should just use that anyway, cf “GPG for data at rest. TLS for data in motion.”

Use abstract base classes

How do you feel about using ABCs in cryptography?

For example:

from abc import ABCMeta

class CryptoHash(metaclass=ABCMeta):
    pass

class Cipher(metaclass=ABCMeta):
    pass

class BlockCipher(Cipher):
    pass

class StreamCipher(Cipher):
    pass

We could also use ABCs to flag algorithms as weak, broken or deprecated:

class WeakenedAlgorithm(metaclass=ABCMeta):
    pass

class BrokenAlgorithm(metaclass=ABCMeta):
    pass

class DeprecatedAlgorithm(metaclass=ABCMeta):
    pass

Example with RC4:

class RC4(StreamCipher):
    pass

BrokenAlgorithm.register(RC4)
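To illustrate what register() buys us, here is a small self-contained sketch. The class names repeat the ones from the proposal; the is_broken helper is hypothetical.

```python
from abc import ABCMeta


class StreamCipher(metaclass=ABCMeta):
    pass


class BrokenAlgorithm(metaclass=ABCMeta):
    pass


class RC4(StreamCipher):
    pass


# Registration makes RC4 a "virtual subclass": isinstance() and issubclass()
# succeed, but RC4's real inheritance chain (its MRO) is left untouched.
BrokenAlgorithm.register(RC4)


def is_broken(algorithm):
    """Return True if the algorithm instance has been flagged as broken."""
    return isinstance(algorithm, BrokenAlgorithm)
```

A caller could then warn on or refuse anything for which is_broken() returns True, without the algorithm classes needing to inherit from the flag classes.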

Add Support for CAST5 (CAST-128) CBC/CFB/OFB

OpenSSL supports these modes, but we have disabled them in our backend due to a lack of test vectors.

If we still can't find any we should consider generating some against OpenSSL's implementation and documenting their provenance. We can use them for validation as we expand our backend support.

Add DSA

Things that need to happen:

Pen a good answer to the question of "Why?"

We know there are a lot of problems with the current libraries, but we should expand upon our basic answer to Why? in the README.

Ideally we should answer:

Why a new binding to OpenSSL?
Why a new low level API?
Why a new "For Humans" API?

Data loss with some chunk sizes

>>> from cryptography.primitives.block import BlockCipher, ciphers, modes
>>> cipher = BlockCipher(ciphers.AES(b"\x00" * 16), modes.CBC(b"\x00" * 16))
<cryptography.primitives.block.base.BlockCipher object at 0x1066b2f90>
>>> cipher.encrypt(b"a")
''
>>> cipher.encrypt(b"a" * 63)
'\x92l\x0f\xea\x05\xaf\xd6_Y1\xd2\x9a\x9ck?\x9c\x04 73\xc9e\x9b*\x8b\xe6\xff\xb2\x83\x80\xb3z\x8c\x96\xcc\x16=^\x95\x8f\xb5\xfa6\xd77\xbfi\xaf\x97\xfa\x1f\x1fhrd\x16d\xcf\xb0\xd8V\x87\x1e'
>>> len(_)
63
>>> cipher.finalize()
''

Add PGP key ids to AUTHORS.rst?

How do you feel about PGP key ids in the AUTHORS.rst file? As security-conscious citizens we should encourage people to use PGP/GPG, especially if somebody wants to report a bug securely.

Write release automation software

Compile error on Debian

Debian's standard OpenSSL appears to not include SSLv2. This was seen in Debian Unstable, but most likely applies to Debian 7.

These are the important bits of the stack trace:

cryptography/hazmat/backends/openssl/__pycache__/_cffi__xca198f2bxe6136d81.c: In function ‘_cffi_const_SSL_OP_MSIE_SSLV2_RSA_PADDING’:
cryptography/hazmat/backends/openssl/__pycache__/_cffi__xca198f2bxe6136d81.c:3132:20: error: ‘SSL_OP_MSIE_SSLV2_RSA_PADDING’ undeclared (first use in this function)

The full stack trace is available at http://pb.codehash.net/hokahaku .

The output of openssl version is OpenSSL 1.0.1f-dev xx XXX xxxx.

Evaluate the use of zero-buffer

For efficient buffering

Write some fuzzer tests

Once we have decryption support we should write some tests that do some fuzzing. The basic idea is to generate random strings, run a round trip of encrypt/decrypt, and make sure the output matches the input. The goal is that fuzzing might find something the test vectors don't cover. For example, maybe none of the vectors have a null byte inside the string and we accidentally used ffi.string where ffi.buffer should be used.
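A minimal sketch of such a round-trip fuzzer. The XOR function is only a stand-in for a real encrypt/decrypt pair, and all names here are hypothetical; the fixed cases make sure inputs with embedded null bytes are always exercised.

```python
import os
import random


def xor_cipher(key, data):
    """Stand-in for a real cipher: XOR with a repeating key is its own inverse."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def fuzz_round_trip(encrypt, decrypt, iterations=200, max_len=256, seed=0):
    """Round-trip random inputs and fail on the first mismatch."""
    rng = random.Random(seed)  # seeded so a failure can be reproduced
    # Always include the empty string and inputs containing null bytes.
    cases = [b"", b"\x00", b"a\x00b"]
    for _ in range(iterations):
        length = rng.randrange(max_len)
        cases.append(bytes(rng.getrandbits(8) for _ in range(length)))
    for plaintext in cases:
        if decrypt(encrypt(plaintext)) != plaintext:
            raise AssertionError("round trip failed for %r" % (plaintext,))
    return len(cases)


key = os.urandom(16)
checked = fuzz_round_trip(
    lambda data: xor_cipher(key, data),
    lambda data: xor_cipher(key, data),
)
```

Swapping the two lambdas for a real cipher's encrypt and decrypt paths would turn this into the test described above.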

Things needed to implement fernet

https://github.com/fernet/spec/blob/master/Spec.md

PKCS 7 padding (same as PKCS 5 padding)
AES-128 CBC
HMAC
SHA256
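As a sketch of the first item, PKCS 7 padding can be written in a few lines of pure Python. The function names are hypothetical, and this version makes no attempt to run in constant time, which a real implementation would need to consider.

```python
def pkcs7_pad(data, block_size=16):
    """Append N bytes of value N so the result is a multiple of block_size."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded, block_size=16):
    """Strip PKCS 7 padding, raising ValueError if it is malformed."""
    if not padded or len(padded) % block_size:
        raise ValueError("Invalid padded length")
    pad_len = padded[-1]
    if pad_len < 1 or pad_len > block_size:
        raise ValueError("Invalid padding")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("Invalid padding")
    return padded[:-pad_len]
```

Note that input which is already block-aligned still gains a full block of padding, so unpadding is always unambiguous.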

Jenkins CI

As seen in #405, we have now reached the point of no return on coverage. To stay at 100% we can't use Travis alone.

Advantages/Drawbacks time!

Advantages

Supports Windows/Mac.
Allows us to test against a much wider set of OpenSSL versions (and other backends as we add them)
Jenkins build speeds can be much faster than Travis, especially during peak hours.

Disadvantages

Coveralls integration is much less elegant. Combined coverage is available, but it won't automatically leave comments on PRs (although we could hypothetically write something to do this in Jenkins)
Have to administer our own test infrastructure.
Jenkins is uglier and harder to use.

If we're ready to do this (and to land CommonCrypto or Windows support we require it) the first step will be to get the coveralls repo token from coveralls.io. After we confirm the integration works (and iron out whatever bugs crop up) we'll have to turn off Travis support (unless we are okay with coveralls putting inaccurate coverage comments on our PRs).

Re-seed PRNG after fork

When forking a process we should reseed the PRNG for any loaded backend to prevent multiple processes from generating identical/similar "random" values.

Python and Ruby have already walked some of this path:
MRI Ruby
CPython

Unfortunately there appears to be an issue with the current approach in Python, so we may have to wait to see if a more reliable approach is developed.
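For illustration, here is how a reseed hook could be wired up with os.register_at_fork, which exists in Python 3.7 and later on platforms that support fork(). ToyPRNG is a hypothetical stand-in for a backend's generator state, not a real backend.

```python
import os


class ToyPRNG(object):
    """Stand-in for backend PRNG state that must differ between processes."""

    def __init__(self):
        self.reseed()

    def reseed(self):
        # Pull fresh seed material from the operating system.
        self.seed = os.urandom(32)


prng = ToyPRNG()

# Only register the hook where the interpreter provides it.
if hasattr(os, "register_at_fork"):
    # The callback runs in the child right after fork(), so parent and
    # child no longer share identical generator state.
    os.register_at_fork(after_in_child=prng.reseed)
```

This only covers forks made through the Python interpreter; a fork performed directly by C code would not trigger the hook.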

Include CPRNGs

Every decent cryptographic library needs proper crypto pseudo random number generators.

Most (all?) Unix-like operating systems have /dev/random and /dev/urandom. /dev/urandom is non-blocking and sufficient for most crypto stuff except for long-lived keys (e.g. ssh, TLS and PGP private keys). The API needs some flags to classify entropy of a CPRNG and its blocking state.
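As a point of reference, Python already exposes the operating system's generator through os.urandom. A thin wrapper might look like the sketch below; the function name is hypothetical, and it does not model the entropy or blocking flags discussed above.

```python
import os


def random_bytes(length):
    """Return `length` bytes from the operating system's random source."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return os.urandom(length)


# For example, 32 bytes of key material.
key = random_bytes(32)
```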

OpenSSL has RAND_pseudo_bytes() and RAND_bytes() as documented at http://www.openssl.org/docs/crypto/RAND_bytes.html . We should also consider http://www.openssl.org/docs/crypto/RAND_add.html and EGD, too.

OpenSSL's RAND generator has a twist: it must be reset on fork(). Otherwise parent and child generate the same random values. Postgres suffered from the issue and now calls RAND_cleanup() on fork(). It's an unsolved issue in Python, too. See http://bugs.python.org/issue16500 for my proposal of an atfork module.

No mention of who owns the copyright

IIUC, you have to do this (the license even says so), but it's good enough to just name "the authors of the ``cryptography`` library (see AUTHORS)" as the copyright holder and then add an AUTHORS file.

Improve GCM / AEADContext documentation

It's apparently not very clear:

https://twitter.com/marcinw/status/414037537049440256

We should consider moving the interface docs to the bottom and having more examples / prose up top.

Clearly communicate the required properties of values

A lot of the values our API asks for have very specific properties that we can't really test for. Things like a requirement to be non predictable, or to only ever be used once. Typically giving values that don't match the expected properties can severely compromise the security of the system even to the point of plaintext or key recovery.

So we need to figure out a good way to very clearly and unambiguously communicate exactly what properties a value is expected to have. If we can work it then even a distinction between MUST have and SHOULD have would be nice as well. Simple prose is probably not enough for this as it leaves interpretation open to the reader and the costs of getting it wrong can be dire.

Bind enough of OpenSSL for pyOpenSSL to use it

A probably incomplete list of things we need, or which we need to improve (based on exarkun/opentls@master...err-load-rand-strings ):

Better organize tests

Hash Object Instantiation

We currently require you to instantiate a hash object before passing it into the Hash().

from cryptography.hazmat.backends.openssl.backend import backend
from cryptography.hazmat.primitives import hashes

sha1 = hashes.Hash(hashes.SHA1(), backend)

Since the hash objects contain no state, does it make sense to have users pass the hash class rather than an instance of it?

sha1 = hashes.Hash(hashes.SHA1, backend)

If we'd prefer to continue to require object instantiation, should we require that when doing a hash_supported check for consistency? Right now the OpenSSL backend will work either way.

Do we actually set-up threading properly for OpenSSL? TLDR; Nope.

OpenSSL needs you to call at least CRYPTO_set_id_callback() (or CRYPTO_THREADID_set_callback(), depending on version). The CPython built-in _ssl module binds these itself using whatever version of OpenSSL Python is linked to.

I suspect that if import _ssl never happens, this initialization will never be done. Potentially CPython is not even using the same version of OpenSSL we are linked to and so we might not benefit from it anyway. (e.g. on Windows where OpenSSL is often statically linked to CPython.)

If we set our own callbacks and we are using the same OpenSSL as CPython when _ssl is imported our callbacks will be replaced. This is probably fine?

If they are using different versions of OpenSSL we might have some interesting dlopen binding issues to address too.

Things needed to replace PyCrypto in paramiko

Paramiko is an SSH client implementation:

Split up backends and bindings.

This way pyOpenSSL and others can import things without running any initialization code.

Hasher.hexdigest() is documented as returning a str, but returns unicode under Python 2

It should return a native string under both py2 and py3.
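One possible way to get a native string on both major versions, sketched with a hypothetical helper: hexlify always returns bytes, which is already the native str type on Python 2 and only needs decoding on Python 3.

```python
import binascii


def hexdigest_native(digest_bytes):
    """Hex-encode a digest and return it as the interpreter's native str."""
    hexed = binascii.hexlify(digest_bytes)
    # On Python 2, str is bytes, so no decoding is needed there.
    return hexed if str is bytes else hexed.decode("ascii")
```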

Replace many of the asserts in the backend with real error handling

We need to understand the underlying OpenSSL error conditions and be able to write tests for them.

Binding.is_available can print to stderr

Specifically it'll print an error message, you can see this if you run tests/hazmat/bindings/test_bindings.py

Privatize Backend lib and ffi attributes.

#380 means these don't need to be public anymore.

Split up and rename API

First of all, API shouldn't be named API; it should be backend or something, since everything is an API. Second, it should be split up and built by composition:

Instead of putting all the features for all the primitives on a single class, the core backend object should just expose attributes which have all the behavior: backend.ciphers, backend.hashes, etc. Then the backend can implement different interfaces to indicate what it makes available.
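A rough sketch of the composition idea. Every class and method name here is hypothetical, and hashlib stands in for a real backend implementation.

```python
import hashlib


class HashBackend(object):
    """All hash-related behavior lives on its own small object."""

    def supported(self, name):
        return name in hashlib.algorithms_available

    def digest(self, name, data):
        return hashlib.new(name, data).digest()


class Backend(object):
    """The core object only exposes per-primitive attributes."""

    def __init__(self, hashes=None, ciphers=None):
        self.hashes = hashes
        # None signals that this backend offers no cipher support.
        self.ciphers = ciphers


backend = Backend(hashes=HashBackend())
```

Callers would reach behavior through backend.hashes or backend.ciphers, and a backend that lacks a primitive simply leaves that attribute unset.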

Features needed to implement Keyczar

https://code.google.com/p/keyczar/wiki/KeyczarSummary

Secure memory wiping

The patch in http://bugs.python.org/issue17405 might be interesting for cryptography. It contains my research on secure memory wiping and a C89 implementation of C11's memset_s() function.

Quote:
Compilers like GCC optimize away code like memset(var, 0, sizeof(var)) if the code occurs at the end of a function and var is not used anymore [1]. But security relevant code like hash and encryption use this to overwrite sensitive data with zeros.

The code in _sha3module.c uses memset() to clear its internal state. The other hash modules don't clear their internal states yet.

There are a couple of solutions for the problem:

C11 [ISO/IEC 9899:2011] has a memset_s() function
MSVC has SecureZeroMemory()
GCC can disable the optimization with #pragma GCC optimize ("O0") since GCC 4.4
[2] contains an example for a custom implementation of memset_s() with volatile.

[1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=8537
[2] https://www.securecoding.cert.org/confluence/display/seccode/MSC06-C.+Be+aware+of+compiler+optimization+when+dealing+with+sensitive+data
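On the Python side, the closest analogue is overwriting a mutable buffer in place. The sketch below uses ctypes.memset on a bytearray; the helper name is hypothetical. It only illustrates the overwrite itself: immutable bytes or str objects cannot be wiped this way, and the interpreter may hold other copies of the data.

```python
import ctypes


def wipe(buf):
    """Overwrite a bytearray in place with zeros."""
    length = len(buf)
    if length == 0:
        return
    # Map the bytearray's own memory as a C array (no copy is made).
    raw = (ctypes.c_char * length).from_buffer(buf)
    ctypes.memset(ctypes.addressof(raw), 0, length)


secret = bytearray(b"hunter2")
wipe(secret)
```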

OpenSSL bindings don't work on OS X

$ py.test -x
======================================== test session starts =========================================
platform darwin -- Python 2.7.5 -- pytest-2.5.0
plugins: cache, cov, xdist
collecting 0 items / 1 errors
=============================================== ERRORS ===============================================
________________________________ ERROR collecting tests/test_utils.py ________________________________
tests/conftest.py:2: in pytest_generate_tests
>       from cryptography.hazmat.bindings import _ALL_BACKENDS
cryptography/hazmat/bindings/__init__.py:14: in <module>
>   from cryptography.hazmat.bindings import openssl
cryptography/hazmat/bindings/openssl/__init__.py:14: in <module>
>   from cryptography.hazmat.bindings.openssl.backend import backend
cryptography/hazmat/bindings/openssl/backend.py:455: in <module>
>   backend = Backend()
cryptography/hazmat/bindings/openssl/backend.py:71: in __init__
>       self._ensure_ffi_initialized()
cryptography/hazmat/bindings/openssl/backend.py:116: in _ensure_ffi_initialized
>           libraries=["crypto", "ssl"],
../../.virtualenvs/cryptography-dev/lib/python2.7/site-packages/cffi/api.py:339: in verify
>       lib = self.verifier.load_library()
../../.virtualenvs/cryptography-dev/lib/python2.7/site-packages/cffi/verifier.py:75: in load_library
>           return self._load_library()
../../.virtualenvs/cryptography-dev/lib/python2.7/site-packages/cffi/verifier.py:151: in _load_library
>       return self._vengine.load_library()
../../.virtualenvs/cryptography-dev/lib/python2.7/site-packages/cffi/vengine_cpy.py:138: in load_library
>           raise ffiplatform.VerificationError(error)
E           VerificationError: importing '/Users/alex_gaynor/projects/cryptography/cryptography/hazmat/bindings/openssl/__pycache__/_cffi__x29084331x3a5e42a8.so': dlopen(/Users/alex_gaynor/projects/cryptography/cryptography/hazmat/bindings/openssl/__pycache__/_cffi__x29084331x3a5e42a8.so, 2): Symbol not found: _ENGINE_set_default_pkey_asn1_meths
E             Referenced from: /Users/alex_gaynor/projects/cryptography/cryptography/hazmat/bindings/openssl/__pycache__/_cffi__x29084331x3a5e42a8.so
E             Expected in: flat namespace
E            in /Users/alex_gaynor/projects/cryptography/cryptography/hazmat/bindings/openssl/__pycache__/_cffi__x29084331x3a5e42a8.so
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
====================================== 1 error in 1.06 seconds =======================================
[08:27:28] (cryptography-dev) ~/p/cryptography (master)
$

Looks to be a missing symbol.

Memory Leak Checks

At some point we should figure out how to add a memory leak check (probably valgrind-based) to our CI pipeline.

hashlib compatible interface

We should provide an interface to our cryptographic hashes that is compatible with hashlib.
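For reference, this is the hashlib surface a compatible interface would need to mirror, shown with the standard library itself: the name, digest_size and block_size attributes, plus update(), digest(), hexdigest() and copy().

```python
import hashlib

h = hashlib.new("sha256")
h.update(b"a")
h.update(b"bc")  # repeated update() calls are equivalent to one call on b"abc"

# copy() forks the running state so a common prefix is hashed only once.
clone = h.copy()
clone.update(b"d")

hex_value = h.hexdigest()  # digest as a hex string
raw_value = h.digest()     # digest as raw bytes
```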

Add a primitive API for RSA

Here's a proposal:

Some open questions about:

how to expose the from_* methods sanely
API to generate a new key pair?
what haven't we thought of yet

class RSAPublicKey(object):
    def __init__(self, modulus, public_exponent):
        pass

    @classmethod
    def from_x509(cls, ...):
        return cls()

    @classmethod
    def from_pkcs1(cls, ...):
        return cls()

    @classmethod
    def from_openssh(cls, text):
        return cls()

    def encrypt(self, plaintext):
        return ciphertext

    def verify(self, message, signature):
        pass  # raises an exception on bad signature

    @property
    def keysize(self):
        return keysize_in_bits


class RSAPrivateKey(object):
    def __init__(self, modulus, public_exponent, private_exponent, p, q,
                 crt_coefficient):
        pass

    @classmethod
    def from_pkcs1(cls, ...):
        return cls()

    @classmethod
    def from_pkcs8(cls, ...):
        return cls()

    @classmethod
    def from_openssh(cls, text):
        return cls()

    def encrypt(self, plaintext):
        return ciphertext

    def decrypt(self, ciphertext):
        return plaintext

    def sign(self, message):
        return signature

    def verify(self, message, signature):
        pass  # raises an exception on bad signature

    @property
    def keysize(self):
        return keysize_in_bits

    def publickey(self):
        return RSAPublicKey(modulus, public_exponent)

SHA-3 (Keccak)

I'm the author of Python 3.4's sha3 module and https://bitbucket.org/tiran/pykeccak/ . OpenSSL doesn't provide SHA-3 yet. I'm willing to re-license and adapt my code for cryptography if you are interested in SHA-3 support.

SHA-3 is not finalized yet so we may want to wait, see http://bugs.python.org/issue16113 and http://csrc.nist.gov/groups/ST/hash/sha-3/timeline_fips.html .

standard library hmac compatible interface

We should provide an interface to our HMAC support that is compatible with the standard library.
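For reference, the standard library usage that a compatible interface would need to support looks roughly like this. The key and message are illustrative values.

```python
import hashlib
import hmac

# Incremental use: new(key, msg, digestmod) followed by update().
mac = hmac.new(b"key", b"part one", hashlib.sha256)
mac.update(b" part two")
tag = mac.digest()

# One-shot computation over the same message for comparison.
expected = hmac.new(b"key", b"part one part two", hashlib.sha256).digest()

# compare_digest compares in a way designed to resist timing attacks.
ok = hmac.compare_digest(tag, expected)
```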

Verify function on Hash

After discussion on IRC around #315 it was decided not to include a verify function on Hash yet. One suggestion @dreid brought up was to split HashContext into a more specific context for HMACs too, and also to change the exceptions being thrown to be more specific to what they are being used for.

It should be an error to provide ``tag`` when using GCM in encrypt mode

Right now it's silently ignored, and that's bad. Don't do that.
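A sketch of the check being asked for. The GCM class below is a simplified stand-in and the helper name is hypothetical; neither is the project's actual code.

```python
class GCM(object):
    """Simplified stand-in for a GCM mode object."""

    def __init__(self, initialization_vector, tag=None):
        self.initialization_vector = initialization_vector
        self.tag = tag


def check_gcm_encrypt_mode(mode):
    """Reject a caller-supplied tag when setting up encryption."""
    if mode.tag is not None:
        # The tag is an output of encryption, so passing one in is an error.
        raise ValueError("tag must be None when encrypting")


# No tag supplied, so this passes silently.
check_gcm_encrypt_mode(GCM(b"\x00" * 12))
```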
