GithubHelp home page GithubHelp logo

aykutalparslan / ferrite Goto Github PK

View Code? Open in Web Editor NEW
137.0 14.0 22.0 11.41 MB

Experimental Telegram Server

License: GNU Affero General Public License v3.0

C# 100.00%
telegram mtproto server telegram-server

ferrite's People

Contributors

alikhadivi avatar aykutalparslan avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

ferrite's Issues

Implement Account.UpdateProfile

Updates user profile.

userEmpty#d3bc4b7a id:long = User;
user#3ff6ecb0 flags:# self:flags.10?true contact:flags.11?true mutual_contact:flags.12?true deleted:flags.13?true bot:flags.14?true bot_chat_history:flags.15?true bot_nochats:flags.16?true verified:flags.17?true restricted:flags.18?true min:flags.20?true bot_inline_geo:flags.21?true support:flags.23?true scam:flags.24?true apply_min_photo:flags.25?true fake:flags.26?true id:long access_hash:flags.0?long first_name:flags.1?string last_name:flags.2?string username:flags.3?string phone:flags.4?string photo:flags.5?UserProfilePhoto status:flags.6?UserStatus bot_info_version:flags.14?int restriction_reason:flags.18?Vector<RestrictionReason> bot_inline_placeholder:flags.19?string lang_code:flags.22?string = User;
---functions---
account.updateProfile#78515775 flags:# first_name:flags.0?string last_name:flags.1?string about:flags.2?string = User;

DH exchange initiation - implement req_pq_multi and write tests

  1. Client sends query to server

req_pq_multi#be7e8ef1 nonce:int128 = ResPQ;

The value of nonce is selected randomly by the client (random number) and identifies the client within this communication. Following this step, it is known to all.

  1. Server sends response of the form

resPQ#05162463 nonce:int128 server_nonce:int128 pq:string server_public_key_fingerprints:Vector long = ResPQ;

Here, string pq is a representation of a natural number (in binary big endian format). This number is the product of two different odd prime numbers. Normally, pq is less than or equal to 2^63-1. The value of server_nonce is selected randomly by the server; following this step, it is known to all.

server_public_key_fingerprints is a list of public RSA key fingerprints (64 lower-order bits of SHA1 (server_public_key); the public key is represented as a bare type rsa_public_key n:string e:string = RSAPublicKey, where, as usual, n and е are numbers in big endian format serialized as strings of bytes, following which SHA1 is computed) received by the server.

All subsequent messages contain the pair (nonce, server_nonce) both in the plain-text, and the encrypted portions which makes it possible to identify a “temporary session” — one run of the key generation protocol described on this page that uses the same (nonce, server_nonce) pair. An intruder could not create a parallel session with the server with the same parameters and reuse parts of server- or client-encrypted messages for its own purposes in such a parallel session, because a different server_nonce would be selected by the server for any new “temporary session”.

Implement MTProto Intermediate Transport

Intermediate
In case 4-byte data alignment is needed, an intermediate version of the original protocol may be used.

  • Overhead: small
  • Minimum envelope length: 4 bytes
  • Maximum envelope length: 4 bytes

Payload structure:

+----+----...----+
+len.+  payload  +
+----+----...----+

Before sending anything into the underlying socket (see transports), the client must first send 0xeeeeeeee as the first int (four bytes, the server will not send 0xeeeeeeee as the first int in the first reply).
Then, payloads are wrapped in the following envelope:

  • Length: payload length encoded as 4 length bytes (little endian)
  • Payload: the MTProto payload

Implement MTProto Session

Mobile Protocol: Detailed Description

As of version 4.6, major Telegram clients are using MTProto 2.0. MTProto v.1.0 is deprecated and is currently being phased out.

This article describes the basic layer of the MTProto protocol version 2.0 (Cloud chats, server-client encryption). The principal differences from version 1.0 (described here for reference) are as follows:

  • SHA-256 is used instead of SHA-1;
  • Padding bytes are involved in the computation of msg_key;
  • msg_key depends not only on the message to be encrypted, but on a portion of auth_key as well;
  • 12..1024 padding bytes are used instead of 0..15 padding bytes in v.1.0.

See also: MTProto 2.0: Secret Chats, end-to-end encryption

Protocol description

Before a message (or a multipart message) is transmitted over a network using a transport protocol, it is encrypted in a certain way, and an external header is added at the top of the message that consists of a 64-bit key identifier auth_key_id (that uniquely identifies an authorization key for the server as well as the user) and a 128-bit message key msg_key.

The authorization key auth_key combined with the message key msg_key define an actual 256-bit key aes_key and a 256-bit initialization vector aes_iv, which are used to encrypt the message using AES-256 encryption in infinite garble extension (IGE) mode. Note that the initial part of the message to be encrypted contains variable data (session, message ID, sequence number, server salt) that obviously influences the message key (and thus the AES key and iv). In MTProto 2.0, the message key is defined as the 128 middle bits of the SHA-256 of the message body (including session, message ID, padding, etc.) prepended by 32 bytes taken from the authorization key. In the older MTProto 1.0, the message key was computed as the lower 128 bits of SHA-1 of the message body, excluding the padding bytes.

Multipart messages are encrypted as a single message.

MTProto server-client encryption, cloud chats

Got questions about this setup? — Check out the Advanced FAQ!

Note 1

Each plaintext message to be encrypted in MTProto always contains the following data to be checked upon decryption in order to make the system robust against known problems with the components:

  • server salt (64-Bit)
  • session id
  • message sequence number
  • message length
  • time
Note 2

Telegram's End-to-end encrypted Secret Chats are using an additional layer of encryption on top of the described above. See Secret Chats, End-to-End encryption for details.

Terminology

Authorization Key (auth_key)

A 2048-bit key shared by the client device and the server, created upon user registration directly on the client device by exchanging Diffie-Hellman keys, and never transmitted over a network. Each authorization key is user-specific. There is nothing that prevents a user from having several keys (that correspond to “permanent sessions” on different devices), and some of these may be locked forever in the event the device is lost. See also Creating an Authorization Key.

Server Key

A 2048-bit RSA key used by the server digitally to sign its own messages while registration is underway and the authorization key is being generated. The application has a built-in public server key which can be used to verify a signature but cannot be used to sign messages. A private server key is stored on the server and changed very infrequently.

Key Identifier (auth_key_id)

The 64 lower-order bits of the SHA1 hash of the authorization key are used to indicate which particular key was used to encrypt a message. Keys must be uniquely defined by the 64 lower-order bits of their SHA1, and in the event of a collision, an authorization key is regenerated. A zero key identifier means that encryption is not used which is permissible for a limited set of message types used during registration to generate an authorization key in a Diffie-Hellman exchange. For MTProto 2.0, SHA1 is still used here, because auth_key_id should identify the authorization key used independently of the protocol version.

Session

A (random) 64-bit number generated by the client to distinguish between individual sessions (for example, between different instances of the application, created with the same authorization key). The session in conjunction with the key identifier corresponds to an application instance. The server can maintain session state. Under no circumstances can a message meant for one session be sent into a different session. The server may unilaterally forget any client sessions; clients should be able to handle this.

Server Salt

A (random) 64-bit number changed every 30 minutes (separately for each session) at the request of the server. All subsequent messages must contain the new salt (although, messages with the old salt are still accepted for a further 1800 seconds). Required to protect against replay attacks and certain tricks associated with adjusting the client clock to a moment in the distant future.

Message Identifier (msg_id)

A (time-dependent) 64-bit number used uniquely to identify a message within a session. Client message identifiers are divisible by 4, server message identifiers modulo 4 yield 1 if the message is a response to a client message, and 3 otherwise. Client message identifiers must increase monotonically (within a single session), the same as server message identifiers, and must approximately equal unixtime*2^32. This way, a message identifier points to the approximate moment in time the message was created. A message is rejected over 300 seconds after it is created or 30 seconds before it is created (this is needed to protect from replay attacks). In this situation, it must be re-sent with a different identifier (or placed in a container with a higher identifier). The identifier of a message container must be strictly greater than those of its nested messages.

Important: to counter replay-attacks the lower 32 bits of msg_id passed by the client must not be empty and must present a fractional part of the time point when the message was created.

Content-related Message

A message requiring an explicit acknowledgment. These include all the user and many service messages, virtually all with the exception of containers and acknowledgments.

Message Sequence Number (msg_seqno)

A 32-bit number equal to twice the number of “content-related” messages (those requiring acknowledgment, and in particular those that are not containers) created by the sender prior to this message and subsequently incremented by one if the current message is a content-related message. A container is always generated after its entire contents; therefore, its sequence number is greater than or equal to the sequence numbers of the messages contained in it.

Message Key (msg_key)

In MTProto 2.0, the middle 128 bits of the SHA-256 hash of the message to be encrypted (including the internal header and the padding bytes for MTProto 2.0), prepended by a 32-byte fragment of the authorization key.

In MTProto 1.0, message key was defined differently, as the lower 128 bits of the SHA-1 hash of the message to be encrypted, with padding bytes excluded from the computation of the hash. Authorization key was not involved in this computation.

Internal (cryptographic) Header

A header (16 bytes) added before a message or a container before it is all encrypted together. Consists of the server salt (64 bits) and the session (64 bits).

External (cryptographic) Header

A header (24 bytes) added before an encrypted message or a container. Consists of the key identifier auth_key_id (64 bits) and the message key msg_key (128 bits).

Payload

External header + encrypted message or container.

Defining AES Key and Initialization Vector

The 2048-bit authorization key (auth_key) and the 128-bit message key (msg_key) are used to compute a 256-bit AES key (aes_key) and a 256-bit initialization vector (aes_iv) which are subsequently used to encrypt the part of the message to be encrypted (i. e. everything with the exception of the external header that is added later) with AES-256 in infinite garble extension (IGE) mode.

For MTProto 2.0, the algorithm for computing aes_key and aes_iv from auth_key and msg_key is as follows.

  • msg_key_large = SHA256 (substr (auth_key, 88+x, 32) + plaintext + random_padding);
  • msg_key = substr (msg_key_large, 8, 16);
  • sha256_a = SHA256 (msg_key + substr (auth_key, x, 36));
  • sha256_b = SHA256 (substr (auth_key, 40+x, 36) + msg_key);
  • aes_key = substr (sha256_a, 0, 8) + substr (sha256_b, 8, 16) + substr (sha256_a, 24, 8);
  • aes_iv = substr (sha256_b, 0, 8) + substr (sha256_a, 8, 16) + substr (sha256_b, 24, 8);

where x = 0 for messages from client to server and x = 8 for those from server to client.

For the obsolete MTProto 1.0, msg_key, aes_key, and aes_iv were computed differently (see this document for reference).

The lower-order 1024 bits of auth_key are not involved in the computation. They may (together with the remaining bits or separately) be used on the client device to encrypt the local copy of the data received from the server. The 512 lower-order bits of auth_key are not stored on the server; therefore, if the client device uses them to encrypt local data and the user loses the key or the password, data decryption of local data is impossible (even if data from the server could be obtained).

In MTProto 1.0, when AES was used to encrypt a block of data of a length not divisible by 16 bytes, the data was padded with 0 to 15 random padding bytes random_padding to a length divisible by 16 bytes prior to encryption. In MTProto 2.0, this padding is taken into account when computing msg_key. Note that MTProto 2.0 requires from 12 to 1024 bytes of padding, still subject to the condition that the resulting message length be divisible by 16 bytes.

Using MTProto 2.0 instead of MTProto 1.0

A client may either use only MTProto 2.0 or only MTProto 1.0 in the same TCP connection. The server detects the protocol used by the first message received from the client, and then uses the same encryption for its messages, and expects the client to use the same encryption henceforth. We recommend using MTProto 2.0; MTProto 1.0 is deprecated and supported for backward compatibility only.

Important Checks

When an encrypted message is received, it must be checked that msg_key is in fact equal to the 128 middle bits of the SHA-256 of the decrypted data with a 32-byte fragment of auth_key prepended to it, and that msg_id has even parity for messages from client to server, and odd parity for messages from server to client.

In addition, the identifiers (msg_id) of the last N messages received from the other side must be stored, and if a message comes in with msg_id lower than all or equal to any of the stored values, the message is to be ignored. Otherwise, the new message msg_id is added to the set, and, if the number of stored msg_id values is greater than N, the oldest (i. e. the lowest) is forgotten.

On top of this, msg_id values that belong over 30 seconds in the future or over 300 seconds in the past are to be ignored. This is especially important for the server. The client would also find this useful (to protect from a replay attack), but only if it is certain of its time (for example, if its time has been synchronized with that of the server).

Certain client-to-server service messages containing data sent by the client to the server (for example, msg_id of a recent client query) may, nonetheless, be processed on the client even if the time appears to be “incorrect”. This is especially true of messages to change server_salt and notifications of invalid client time. See Mobile Protocol: Service Messages.

Storing an Authorization Key on a Client Device

It may be suggested to users concerned with security that they password protect the authorization key in approximately the same way as in ssh. This can be accomplished by prepending the value of cryptographic hash function, such as SHA-256, of the key to the front of the key, following which the entire string is encrypted using AES in CBC mode and a key equal to the user’s (text) password. When the user inputs the password, the stored protected password is decrypted and verified by checking the SHA-256 value. From the user’s standpoint, this is practically the same as using an application or a website password.

Unencrypted Messages

Special plain-text messages may be used to create an authorization key as well as to perform a time synchronization. They begin with auth_key_id = 0 (64 bits) which means that there is no auth_key. This is followed directly by the message body in serialized format without internal or external headers. A message identifier (64 bits) and body length in bytes (32 bytes) are added before the message body.

Only a very limited number of messages of special types can be transmitted as plain text.

Schematic Presentation of Messages

Encrypted Message

auth_key_idint64 msg_keyint128 encrypted_databytes

MTProto 2.0 uses 12..1024 padding bytes, instead of the 0..15 used in MTProto 1.0

Creating an Authorization Key

An authorization key is normally created once for every user during the application installation process immediately prior to registration. Registration itself, in actuality, occurs after the authorization key is created. However, a user may be prompted to complete the registration form while the authorization key is being generated in the background. Intervals between user key strokes may be used as a source of entropy in the generation of high-quality random numbers required for the creation of an authorization key.

See Creating an Authorization Key.

During the creation of the authorization key, the client obtains its server salt (to be used with the new key for all communication in the near future). The client then creates an encrypted session using the newly generated key, and subsequent communication occurs within that session (including the transmission of the user's registration information and phone number validation) unless the client creates a new session. The client is free to create new or additional sessions at any time by choosing a new random session_id.

Implement MTProto PaddedIntermediate Transport

Padded intermediate
Padded version of the intermediate protocol, to use with obfuscation enabled to bypass ISP blocks.

  • Overhead: small-medium
  • Minimum envelope length: random
  • Maximum envelope length: random

Before sending anything into the underlying socket (see transports), the client must first send 0xdddddddd as the first int (four bytes, the server will not send 0xdddddddd as the first int in the first reply).
Then, payloads are wrapped in the following envelope:

+----+----...----+----...----+
|tlen|  payload  |  padding  |
+----+----...----+----...----+

Envelope description:

  • Total length: payload+padding length encoded as 4 length bytes (little endian)
  • Payload: the MTProto payload
  • Padding: A random padding string of length 0-15

Implement MTProto Abridged Transport

Abridged
The lightest protocol available.

  • Overhead: Very small
  • Minimum envelope length: 1 byte
  • Maximum envelope length: 4 bytes

Payload structure:

+-+----...----+
|l|  payload  |
+-+----...----+
OR

+-+---+----...----+
|h|len|  payload  +
+-+---+----...----+

Before sending anything into the underlying socket (see transports), the client must first send 0xef as the first byte (the server will not send 0xef as the first byte in the first reply).
Then, payloads are wrapped in the following envelope:

  • Length: payload length, divided by four, and encoded as a single byte, only if the resulting packet length is a value between 0x01..0x7e.
  • Payload: the MTProto payload

If the packet length divided by four is bigger than or equal to 127 (>= 0x7f), the following envelope must be used, instead:

  • Header: A single byte of value 0x7f
  • Length: payload length, divided by four, and encoded as 3 length bytes (little endian)
  • Payload: the MTProto payload

Request for Message Status Information

If either party has not received information on the status of its outgoing messages for a while, it may explicitly request it from the other party:

msgs_state_req#da69fb52 msg_ids:Vector long = MsgsStateReq;

Max 8192 IDs are allowed per constructor.

The response to the query contains the following information:

Informational Message regarding Status of Messages

msgs_state_info#04deb57d req_msg_id:long info:string = MsgsStateInfo;

Here, info is a string that contains exactly one byte of message status for each message from the incoming msg_ids list:

  • 1 = nothing is known about the message (msg_id too low, the other party may have forgotten it)
  • 2 = message not received (msg_id falls within the range of stored identifiers; however, the other party has certainly not received a message like that)
  • 3 = message not received (msg_id too high; however, the other party has certainly not received it yet)
  • 4 = message received (note that this response is also at the same time a receipt acknowledgment)
  • +8 = message already acknowledged
  • +16 = message not requiring acknowledgment
  • +32 = RPC query contained in message being processed or processing already complete
  • +64 = content-related response to message already generated
  • +128 = other party knows for a fact that message is already received

This response does not require an acknowledgment. It is an acknowledgment of the relevant msgs_state_req, in and of itself.

Note that if it turns out suddenly that the other party is missing a message that appears to have been sent to it, the message must not be re-sent on its own with the same msg_id. Instead, it can be either wrapped in a container, or the status of the message can be checked using msgs_state_req and if the message wasn't received, then it must be re-sent with a new msg_id.

Test/Debug TCP Server

Test the TCP Server with TDLIB and various Telegram clients to make sure all of the transport protocols and the TCP server itself was implemented correctly.
Write unit tests.
Write performance tests.

Write a code generator for TL Schema

TL Language
TL (Type Language) serves to describe the used system of types, constructors, and existing functions. In fact, the combinator description format presented in Binary Data Serialization is used.

See also:

Polymorphism in TL
Advanced topics:

Dependent types in TL

Formal description of TL

Formal description of TL combinators

Type serialization

TL schema for serialization of TL schemas

Optional combinator parameters and their values

Binary serialization and abstract TL types

Formal description of templates in TL

Overview
A TL program usually consists of two sections separated by keyword ---functions---. The first section consists of declarations of built-in types and aggregate types (i.e. their constructors). The second section consists of the declared functions, i.e. functional combinators.

Actually, both the first and second sections consist of combinator declarations, each of which ends with a semicolon. However, the first section contains only constructors, while the second section only involves functions. Each combinator is declared using a “combinator declaration” in the format explained above. However, the combinator number and field names may be explicitly assigned.

If additional type declarations are required after functions have been declared, the keyword (section divider) ---types--- is used. Furthermore, a functional combinator may be declared in the type section if its result type begins with an exclamation point (in fact, when the function section is interpreted, this exclamation point is added automatically).

To explicitly define 32-bit names of combinators, a hash mark (#) is added immediately after the combinator’s name, followed by 8 hexadecimal digits.

Namespaces
Composite constructions like <namespace_identifier>.<constructor_identifier> and <namespace_identifier>.<Type_identifier> can be used as constructor- or type identifiers. The portion of the identifier to the left of the period is called the namespace. Moreover, the rule about a first uppercase letter in type identifiers and lowercase letter in constructor identifiers applies to the part of the construction after the period. For example, auth.Message would be a type, while auth.std_message would be a constructor.

Namespaces do not require a special declaration.

Comments
Comments are the same as in C++.

Example

// built-in types
int#a8509bda ? = Int;
long ? = Long;
double ? = Double;
string ? = String;
null = Null;

vector {t:Type} # [ t ] = Vector t;
coupleInt {alpha:Type} int alpha = CoupleInt<alpha>;
coupleStr {gamma:Type} string gamma = CoupleStr gamma;  
/* The name of the type variable is irrelevant: "gamma" could be replaced with "alpha"; 
   However, the combinator number will depend on the specific choice. */

intHash {alpha:Type} vector<coupleInt<alpha>> = IntHash<alpha>;
strHash {alpha:Type} (vector (coupleStr alpha)) = StrHash alpha;
intSortedHash {alpha:Type} intHash<alpha> = IntSortedHash<alpha>;
strSortedHash {alpha:Type} (strHash alpha) = StrSortedHash alpha;

// custom types
pair x:Object y:Object = Pair;
triple x:Object y:Object z:Object = Triple;

user#d23c81a3 id:int first_name:string last_name:string = User;
no_user#c67599d1 id:int = User;
group id:int title:string last_name:string = Group;
no_group = Group;

---functions---

// Maybe some built-in arithmetic functions; inverse quotes make "identifiers" out of arbitrary non-alphanumeric strings
`+` Int Int = Int;
`-` Int Int = Int;
`+` Double Double = Double;
// ...

// API functions (aka RPC functions)
getUser#b0f732d5 int = User;
getUsers#2d84d5f5 (Vector int) = Vector User;

In this case, the user constructor has been explicitly assigned a number (0xd23c81a3); In fact, this was not necessary, since this value is the CRC32 of the string "user id:int first_name:string last_name:string = User", which would have been used by default.

Special constructors are not required for Vector int, Vector User, Vector Object, etc. -- the same universal constructor can be used everywhere:

vector#1cb5c415 {t:Type} # [ t ] = Vector t;

Note that when the getUsers (Vector int) = Vector User; constructor number is calculated, the CRC32 of the string "getUsers Vector int = Vector User” is computed (from which all parentheses have been removed).

Notation T0<T1,T2,...,Tn> is syntactic sugar for (T0 (T1) (T2) ... (Tn)). For example, Vector and (Vector User) are entirely interchangeable.

Example of an RPC query
Suppose we want to call getUsers([2,3,4]). This query will be serialized into a sequence of 32-bit integers as follows:

0x2d84d5f5 0x1cb5c415 0x3 0x2 0x3 0x4

Please note that TL serialization yields sequences of 32-bit integers. When it has to be embedded into a byte stream, for example a network packet, each 32-bit integer is represented by four bytes in little-endian order. In this way the above query corresponds to the following byte stream:

F5 D5 84 2D 15 C4 B5 1C 03 00 00 00 02 00 00 00 03 00 00 00 04 00 00 00

The response might look something like this:

0x1cb5c415 0x3 0xd23c81a3 0x2 0x74655005 0x00007265 0x72615006 0x72656b 0xc67599d1 0x3 0xd23c81a3 0x4 0x686f4a04 0x6e 0x656f4403

This roughly corresponds to

[{"id":2,"first_name":"Peter", "last_name":"Parker"},{},{"id":4,"first_name":"John","last_name":"Doe"}]

Note that in both cases the same universal constructor vector#1cb5c415 is used: in the request to serialize the value of type Vector int, and in the serialization of the value of type Vector User in the response. There is no ambiguity because in both cases the type of the value being (de)serialized is known before its (de)serialization begins. For example, after receiving the query, the server sees that the first part is 0x2d84d5f5, which corresponds to the combinator getUsers#2d84d5f5 (Vector int) = Vector User. Thus, it is understood that what follows will be a value of type Vector int. After receiving the response to this query, the client knows that it must receive a value of type Vector User and it deserializes the response accordingly.

Implement AES IGE Encryption

Telegram uses AES-256 in infinite garble extension (IGE) mode.

IGE (Infinite Garble Extension) chaining sequences are as follows:

Encryption: y_i = f(x_i ⊕ y_{i−1}) ⊕ x_{i−1}
Decryption: x_i = f^-1(y_i ⊕ x_{i-1}) ⊕ y_{i-1}
  
Gligor, V. D., Donescu, P., & Katz, J. (2000, November). 
On message integrity in symmetric encryption. 
In 1st NIST Workshop on AES Modes of Operation.

Acknowledgment of Receipt

Receipt of virtually all messages (with the exception of some purely service ones as well as the plain-text messages used in the protocol for creating an authorization key) must be acknowledged. This requires the use of the following service message (not requiring an acknowledgment):

msgs_ack#62d6b459 msg_ids:Vector<long> = MsgsAck;

A server usually acknowledges the receipt of a message from a client (normally, an RPC query) using an RPC response. If a response is a long time coming, a server may first send a receipt acknowledgment, and somewhat later, the RPC response itself.

A client normally acknowledges the receipt of a message from a server (usually, an RPC response) by adding an acknowledgment to the next RPC query if it is not transmitted too late (if it is generated, say, 60-120 seconds following the receipt of a message from the server). However, if for a long period of time there is no reason to send messages to the server or if there is a large number of unacknowledged messages from the server (say, over 16), the client transmits a stand-alone acknowledgment.

Max 8192 IDs are allowed per constructor.

Implement Auth.AcceptLoginToken

Accept QR code login token, logging in the app that generated it.

Returns info about the new session.

For more info, see login via QR code.

authorization#ad01d61d flags:# current:flags.0?true official_app:flags.1?true password_pending:flags.2?true encrypted_requests_disabled:flags.3?true call_requests_disabled:flags.4?true hash:long device_model:string platform:string system_version:string api_id:int app_name:string app_version:string date_created:int date_active:int ip:string country:string region:string = Authorization;
---functions---
auth.acceptLoginToken#e894ad4d token:bytes = Authorization;

Handle Transport Errors

Transport errors
In the event of a transport error (missing auth key, transport flood, etc.), the server may send a packet with a signed little-endian number of 4 bytes, whose absolute value contains the error code (the error itself is actually negative).

For example, error Code 403 corresponds to situations where the corresponding HTTP error would have been returned by the HTTP protocol.

Error 404 (auth key not found) is returned when the specified auth key ID cannot be found by the DC.

Error 429 (transport flood) is returned when too many transport connections are established to the same IP in a too short lapse of time, or if any of the container/service message limits are reached.

Implement MTProto Full Transport

Full
The basic MTProto transport protocol

  • Overhead: medium
  • Minimum envelope length: 12 bytes (length+seqno+crc)
  • Maximum envelope length: 12 bytes (length+seqno+crc)

Payload structure:

+----+----+----...----+----+
|len.|seq.|  payload  |crc.|
+----+----+----...----+----+

Envelope description:

  • Length: length+seqno+payload+crc length encoded as 4 length bytes (little endian, the length of the length field must be included, too)
  • Seqno: the TCP sequence number for this TCP connection (different from the MTProto sequence number): the first packet sent is numbered 0, the next one 1, etc.
  • payload: MTProto payload
  • crc: 4 CRC32 bytes computed using length, sequence number, and payload together.

Implement Transport Obfuscation

Transport obfuscation
Transport obfuscation is required to use the websocket transports.

Transport obfuscation to prevent ISP blocks is implemented using the following protocol, situated under the MTProto transport in the ISO/OSI stack, see the recap; this means that the payload is first wrapped in the MTProto transport envelope (all transports are supported), and then obfuscated:

Prior to establishing the connection (and eventually sending the protocol header of a specific MTProto transport), a 64-byte (512-bit) random initialization payload is generated.
During the generation process, special care must be taken in order to avoid a situation where that the first int (first four bytes) of the random string are equal to one of the known protocol identifiers (see above).
In particular, the first four bytes must not be equal to 0xdddddddd (padded intermediate), 0xeeeeeeee (intermediate), POST, GET, HEAD, or any of the HTTP methods that are accepted by the MTProto servers.
The first byte must also not be equal to 0xef (abridged).
Bytes 4-8 must also not be equal to 0x00000000, since that would indicate use of the full transport with the initial TCP sequence number (0).

The protocol identifier, if present, must be inserted in the initialization payload at byte offset 56: if its length is less than 4, it must be padded using the protocol identifier itself, to make its length 4 (0xef => 0xefefefef): the standalone protocol identifier must be not be sent aftwerwards.

Extended Voluntary Communication of Status of One Message

Normally used by the server to respond to the receipt of a duplicate msg_id, especially if a response to the message has already been generated and the response is large. If the response is small, the server may re-send the answer itself instead. This message can also be used as a notification instead of resending a large message.

msg_detailed_info#276d3ec6 msg_id:long answer_msg_id:long bytes:int status:int = MsgDetailedInfo;
msg_new_detailed_info#809db6df answer_msg_id:long bytes:int status:int = MsgDetailedInfo;

The second version is used to notify of messages that were created on the server not in response to an RPC query (such as notifications of new messages) and were transmitted to the client some time ago, but not acknowledged.

Currently, status is always zero. This may change in future.

This message does not require an acknowledgment.

Implement Account.UpdateUsername

Changes username for the current user.

userEmpty#d3bc4b7a id:long = User;
user#3ff6ecb0 flags:# self:flags.10?true contact:flags.11?true mutual_contact:flags.12?true deleted:flags.13?true bot:flags.14?true bot_chat_history:flags.15?true bot_nochats:flags.16?true verified:flags.17?true restricted:flags.18?true min:flags.20?true bot_inline_geo:flags.21?true support:flags.23?true scam:flags.24?true apply_min_photo:flags.25?true fake:flags.26?true id:long access_hash:flags.0?long first_name:flags.1?string last_name:flags.2?string username:flags.3?string phone:flags.4?string photo:flags.5?UserProfilePhoto status:flags.6?UserStatus bot_info_version:flags.14?int restriction_reason:flags.18?Vector<RestrictionReason> bot_inline_placeholder:flags.19?string lang_code:flags.22?string = User;
---functions---
account.updateUsername#3e0bdd7c username:string = User;

Notice of Ignored Error Message

In certain cases, a server may notify a client that its incoming message was ignored for whatever reason. Note that such a notification cannot be generated unless a message is correctly decoded by the server.

bad_msg_notification#a7eff811 bad_msg_id:long bad_msg_seqno:int error_code:int = BadMsgNotification;
bad_server_salt#edab447b bad_msg_id:long bad_msg_seqno:int error_code:int new_server_salt:long = BadMsgNotification;

Here, error_code can also take on the following values:

  • 16: msg_id too low (most likely, client time is wrong; it would be worthwhile to synchronize it using msg_id notifications and re-send the original message with the “correct” msg_id or wrap it in a container with a new msg_id if the original message had waited too long on the client to be transmitted)
  • 17: msg_id too high (similar to the previous case, the client time has to be synchronized, and the message re-sent with the correct msg_id)
  • 18: incorrect two lower order msg_id bits (the server expects client message msg_id to be divisible by 4)
  • 19: container msg_id is the same as msg_id of a previously received message (this must never happen)
  • 20: message too old, and it cannot be verified whether the server has received a message with this msg_id or not
  • 32: msg_seqno too low (the server has already received a message with a lower msg_id but with either a higher or an equal and odd seqno)
  • 33: msg_seqno too high (similarly, there is a message with a higher msg_id but with either a lower or an equal and odd seqno)
  • 34: an even msg_seqno expected (irrelevant message), but odd received
  • 35: odd msg_seqno expected (relevant message), but even received
  • 48: incorrect server salt (in this case, the bad_server_salt response is received with the correct salt, and the message is to be re-sent with it)
  • 64: invalid container.

The intention is that error_code values are grouped (error_code >> 4): for example, the codes 0x40 - 0x4f correspond to errors in container decomposition.

Notifications of an ignored message do not require acknowledgment (i.e., are irrelevant).

Important: if server_salt has changed on the server or if client time is incorrect, any query will result in a notification in the above format. The client must check that it has, in fact, recently sent a message with the specified msg_id, and if that is the case, update its time correction value (the difference between the client’s and the server’s clocks) and the server salt based on msg_id and the server_salt notification, so as to use these to (re)send future messages. In the meantime, the original message (the one that caused the error message to be returned) must also be re-sent with a better msg_id and/or server_salt.

In addition, the client can update the server_salt value used to send messages to the server, based on the values of RPC responses or containers carrying an RPC response, provided that this RPC response is actually a match for the query sent recently. (If there is doubt, it is best not to update since there is risk of a replay attack).

Presenting proof of work; Server authentication implement req_DH_params and write tests

  1. Client sends query to server

req_DH_params#d712e4be nonce:int128 server_nonce:int128 p:string q:string public_key_fingerprint:long encrypted_data:string = Server_DH_Params

Here, encrypted_data is obtained as follows:

  • new_nonce := another (good) random number generated by the client; after this query, it is known to both client and server;

  • data := a serialization of

p_q_inner_data_dc#a9f55f95 pq:string p:string q:string nonce:int128 server_nonce:int128 new_nonce:int256 dc:int = P_Q_inner_data;
  • or of
p_q_inner_data_temp_dc#56fddf88 pq:string p:string q:string nonce:int128 server_nonce:int128 new_nonce:int256 dc:int 
expires_in:int = P_Q_inner_data;
  • encrypted_data := RSA_PAD (data, server_public_key), where RSA_PAD is a version of RSA with a variant of OAEP+ padding
    explained below in 4.1).

Someone might intercept the query and replace it with their own, independently decomposing pq into factors instead of the client. The only field that it makes sense to modify is new_nonce which would be the one an intruder would have to re-generate (because an intruder cannot decrypt the encrypted data sent by the client). Since all subsequent messages are encrypted using new_nonce or contain new_nonce_hash, they will not be processed by the client (an intruder would not be able to make it look as though they had been generated by the server because they would not contain new_nonce). Therefore, this intercept will only result in the intruder’s completing the authorization key generation protocol in place of the client and creating a new key (that has nothing to do with the client); however, the same effect could be achieved simply by creating a new key in one's own name.

An alternative form of inner data (p_q_inner_data_temp_dc) is used to create temporary keys, that are only stored in the server RAM and are discarded after at most expires_in seconds. The server is free to discard its copy earlier. In all other respects the temporary key generation protocol is the same. After a temporary key is created, the client usually binds it to its principal authorisation key by means of the auth.bindTempAuthKey method, and uses it for all client-server communication until it expires; then a new temporary key is generated. Thus Perfect Forward Secrecy (PFS) in client-server communication is achieved. Read more about PFS »

1.1 RSA_PAD(data, server_public_key) mentioned above is implemented as follows:

  • data_with_padding := data + random_padding_bytes; -- where random_padding_bytes are chosen so that the resulting length of data_with_padding is precisely 192 bytes, and data is the TL-serialized data to be encrypted as before. One has to check that data is not longer than 144 bytes.
  • data_pad_reversed := BYTE_REVERSE(data_with_padding); -- is obtained from data_with_padding by reversing the byte order.
  • a random 32-byte temp_key is generated.
  • data_with_hash := data_pad_reversed + SHA256(temp_key + data_with_padding); -- after this assignment, data_with_hash is exactly 224 bytes long.
  • aes_encrypted := AES256_IGE(data_with_hash, temp_key, 0); -- AES256-IGE encryption with zero IV.
  • temp_key_xor := temp_key XOR SHA256(aes_encrypted); -- adjusted key, 32 bytes
  • key_aes_encrypted := temp_key_xor + aes_encrypted; -- exactly 256 bytes (2048 bits) long
  • The value of key_aes_encrypted is compared with the RSA-modulus of server_pubkey as a big-endian 2048-bit (256-byte) unsigned integer. If key_aes_encrypted turns out to be greater than or equal to the RSA modulus, the previous steps starting from the generation of new random temp_key are repeated. Otherwise the final step is performed:
  • encrypted_data := RSA(key_aes_encrypted, server_pubkey); -- 256-byte big-endian integer is elevated to the requisite power from the RSA public key modulo the RSA modulus, and the result is stored as a big-endian integer consisting of exactly 256 bytes (with leading zero bytes if required).
  1. Server responds with:

server_DH_params_ok#d0e8075c nonce:int128 server_nonce:int128 encrypted_answer:string = Server_DH_Params;

If the query is incorrect, the server returns a -404 error and the handshake must be restarted (any subsequent request also returns -404, even if it is correct).

Here, encrypted_answer is obtained as follows:

  • new_nonce_hash := 128 lower-order bits of SHA1 (new_nonce);

  • answer := serialization

server_DH_inner_data#b5890dba nonce:int128 server_nonce:int128 g:int dh_prime:string g_a:string server_time:int = 
    Server_DH_inner_data;
  • answer_with_hash := SHA1(answer) + answer + (0-15 random bytes); such that the length be divisible by 16;

  • tmp_aes_key := SHA1(new_nonce + server_nonce) + substr (SHA1(server_nonce + new_nonce), 0, 12);

  • tmp_aes_iv := substr (SHA1(server_nonce + new_nonce), 12, 8) + SHA1(new_nonce + new_nonce) + substr (new_nonce, 0, 4);

  • encrypted_answer := AES256_ige_encrypt (answer_with_hash, tmp_aes_key, tmp_aes_iv); here, tmp_aes_key is a 256-bit key, and tmp_aes_iv is a 256-bit initialization vector. The same as in all the other instances that use AES encryption, the encrypted data is padded with random bytes to a length divisible by 16 immediately prior to encryption.

Following this step, new_nonce is still known to client and server only. The client is certain that it is the server that responded and that the response was generated specifically in response to client query req_DH_params, since the response data are encrypted using new_nonce.

Client is expected to check whether p = dh_prime is a safe 2048-bit prime (meaning that both p and (p-1)/2 are prime, and that 2^2047 < p < 2^2048), and that g generates a cyclic subgroup of prime order (p-1)/2, i.e. is a quadratic residue mod p. Since g is always equal to 2, 3, 4, 5, 6 or 7, this is easily done using quadratic reciprocity law, yielding a simple condition on p mod 4g -- namely, p mod 8 = 7 for g = 2; p mod 3 = 2 for g = 3; no extra condition for g = 4; p mod 5 = 1 or 4 for g = 5; p mod 24 = 19 or 23 for g = 6; and p mod 7 = 3, 5 or 6 for g = 7. After g and p have been checked by the client, it makes sense to cache the result, so as not to repeat lengthy computations in future.

If the verification takes too long time (which is the case for older mobile devices), one might initially run only 15 Miller--Rabin iterations for verifying primeness of p and (p - 1)/2 with error probability not exceeding one billionth, and do more iterations later in the background.

Another optimization is to embed into the client application code a small table with some known "good" couples (g,p) (or just known safe primes p, since the condition on g is easily verified during execution), checked during code generation phase, so as to avoid doing such verification during runtime altogether. Server changes these values rarely, thus one usually has to put the current value of server's dh_prime into such a table. For example, current value of dh_prime equals (in big-endian byte order)

C7 1C AE B9 C6 B1 C9 04 8E 6C 52 2F 70 F1 3F 73 98 0D 40 23 8E 3E 21 C1 49 34 D0 37 56 3D 93 0F 48 19 8A 0A A7 C1 40 58 22 94 93 D2 25 30 F4 DB FA 33 6F 6E 0A C9 25 13 95 43 AE D4 4C CE 7C 37 20 FD 51 F6 94 58 70 5A C6 8C D4 FE 6B 6B 13 AB DC 97 46 51 29 69 32 84 54 F1 8F AF 8C 59 5F 64 24 77 FE 96 BB 2A 94 1D 5B CD 1D 4A C8 CC 49 88 07 08 FA 9B 37 8E 3C 4F 3A 90 60 BE E6 7C F9 A4 A4 A6 95 81 10 51 90 7E 16 27 53 B5 6B 0F 6B 41 0D BA 74 D8 A8 4B 2A 14 B3 14 4E 0E F1 28 47 54 FD 17 ED 95 0D 59 65 B4 B9 DD 46 58 2D B1 17 8D 16 9C 6B C4 65 B0 D6 FF 9C A3 92 8F EF 5B 9A E4 E4 18 FC 15 E8 3E BE A0 F8 7F A9 FF 5E ED 70 05 0D ED 28 49 F4 7B F9 59 D9 56 85 0C E9 29 85 1F 0D 81 15 F6 35 B1 05 EE 2E 4E 15 D0 4B 24 54 BF 6F 4F AD F0 34 B1 04 03 11 9C D8 E3 B9 2F CC 5B

Complete DH key exchange - implement set_client_DH_params and write tests

  1. Client computes random 2048-bit number b (using a sufficient amount of entropy) and sends the server a message

set_client_DH_params#f5045f1f nonce:int128 server_nonce:int128 encrypted_data:string = Set_client_DH_params_answer;

Here, encrypted_data is obtained thus:

  • g_b := pow(g, b) mod dh_prime;
  • data := serialization
  client_DH_inner_data#6643b654 nonce:int128 server_nonce:int128 retry_id:long g_b:string = Client_DH_Inner_Data
  • data_with_hash := SHA1(data) + data + (0-15 random bytes); such that length be divisible by 16;
  • encrypted_data := AES256_ige_encrypt (data_with_hash, tmp_aes_key, tmp_aes_iv);

The retry_id field is equal to zero at the time of the first attempt; otherwise, it is equal to auth_key_aux_hash from the previous failed attempt (see Item 9).

  1. Thereafter, auth_key equals pow(g, {ab}) mod dh_prime; on the server, it is computed as pow(g_b, a) mod dh_prime, and on the client as (g_a)^b mod dh_prime.

  2. auth_key_hash is computed := 64 lower-order bits of SHA1 (auth_key). The server checks whether there already is another key with the same auth_key_hash and responds in one of the following ways.

DH key exchange complete
Server responds in one of three ways:

dh_gen_ok#3bcbf734 nonce:int128 server_nonce:int128 new_nonce_hash1:int128 = Set_client_DH_params_answer;
dh_gen_retry#46dc1fb9 nonce:int128 server_nonce:int128 new_nonce_hash2:int128 = Set_client_DH_params_answer;
dh_gen_fail#a69dae02 nonce:int128 server_nonce:int128 new_nonce_hash3:int128 = Set_client_DH_params_answer;

new_nonce_hash1, new_nonce_hash2, and new_nonce_hash3 are obtained as the 128 lower-order bits of SHA1 of the byte string derived from the new_nonce string by adding a single byte with the value of 1, 2, or 3, and followed by another 8 bytes with auth_key_aux_hash. Different values are required to prevent an intruder from changing server response dh_gen_ok into dh_gen_retry.
auth_key_aux_hash is the 64 higher-order bits of SHA1(auth_key). It must not be confused with auth_key_hash.
In the other case, the client goes to generating a new b. In the first case, the client and the server have negotiated auth_key, following which they forget all other temporary data, and the client creates another encrypted session using auth_key. At the same time, server_salt is initially set to substr(new_nonce, 0, 8) XOR substr(server_nonce, 0, 8). If required, the client stores the difference between server_time received in 5) and its local time, to be able always to have a good approximation of server time which is required to generate correct message identifiers.

IMPORTANT: Apart from the conditions on the Diffie-Hellman prime dh_prime and generator g, both sides are to check that g, g_a and g_b are greater than 1 and less than dh_prime - 1. We recommend checking that g_a and g_b are between 2^{2048-64} and dh_prime - 2^{2048-64} as well.

Quick Ack Support

Quick ack
These MTProto transport protocols have support for quick acknowledgment.
In this case, the client sets the highest-order length bit in the query packet, and the server responds with a special 4 bytes as a separate packet.
They are the 32 higher-order bits of SHA256 of the encrypted portion of the packet prepended by 32 bytes from the authorization key (the same hash as computed for verifying the message key), with the most significant bit set to make clear that this is not the length of a regular server response packet; if the abridged version is used, bswap is applied to these four bytes.

Implement Auth.ExportLoginToken

Generate a login token, for login via QR code.
The generated login token should be encoded using base64url, then shown as a tg://login?token=base64encodedtoken URL in the QR code.

For more info, see login via QR code.

auth.loginToken#629f1980 expires:int token:bytes = auth.LoginToken;
auth.loginTokenMigrateTo#68e9916 dc_id:int token:bytes = auth.LoginToken;
auth.loginTokenSuccess#390d5c5e authorization:auth.Authorization = auth.LoginToken;
---functions---
auth.exportLoginToken#b7e085fe api_id:int api_hash:string except_ids:Vector<long> = auth.LoginToken;

Voluntary Communication of Status of Messages

Either party may voluntarily inform the other party of the status of the messages transmitted by the other party.

msgs_all_info#8cc0d131 msg_ids:Vector long info:string = MsgsAllInfo

All message codes known to this party are enumerated, with the exception of those for which the +128 and the +16 flags are set. However, if the +32 flag is set but not +64, then the message status will still be communicated.

This message does not require an acknowledgment.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.