Wednesday, March 29, 2017

Python: Using PyCrypto Library, Part 1

Last week, the instructor of ICS 444 gave us a new assignment that requires using one of Python's cryptographic libraries. I decided to use a library called PyCrypto. In this blog post, I will discuss the following topics regarding the library:
  1. Installing the library in Windows 7 OS.
  2. Computing SHA512 digest of a message.
  3. Using MD5 + AES to Hash and Encrypt a Message.
Note that I have never used the Python programming language before, so I will explain only the parts of the language that I use.

Installing The Library in Windows

Installing the PyCrypto library on Windows is very simple. All we have to do is execute the following command in cmd.exe:

pip install --use-wheel --no-index --find-links=https://github.com/sfbahr/PyCrypto-Wheels/raw/master/pycrypto-2.6.1-cp35-none-win_amd64.whl pycrypto

After executing the command, the library will be installed. Note that the given command is for 64-bit Python 3.5 on Windows (the 'cp35' and 'win_amd64' parts of the file name). If your Python installation is 32-bit, change pycrypto-2.6.1-cp35-none-win_amd64.whl to pycrypto-2.6.1-cp35-none-win32.whl.
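If you are not sure which wheel matches your installation, Python itself can report whether it is a 32-bit or 64-bit build. The following is a minimal sketch that uses only the standard library; the function name 'python_bitness' is made up for this illustration.

```python
import struct
import sys


def python_bitness():
    """Return 32 or 64 depending on the running interpreter build."""
    # A C pointer ('P') is 4 bytes on 32-bit builds and 8 bytes on 64-bit builds.
    return struct.calcsize('P') * 8


print('Python %d.%d, %d-bit' % (sys.version_info[0], sys.version_info[1], python_bitness()))
```

A result of 64 means the win_amd64 wheel is the one to use; 32 means the win32 wheel.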


Applying PyCrypto Algorithms

Suppose that we have a text file named "PythonCrypto.txt" that contains the following text:
Cryptography is the practice and study of techniques for secure communication in the presence of third parties called adversaries.   These adversaries are often referred to as Eve in cryptography, while the sender and recipient of messages are called Alice and Bob respectively. More generally, cryptography is about constructing and analyzing protocols that prevent third parties or the public from reading private messages.

The algorithms of the PyCrypto library will be applied to the content of this file. The first thing we will do is read the content of the file as follows:

file = open("PythonCrypto.txt", 'r')
message = file.read()
# print the file content to check that it was read correctly
print(message)

This code snippet only reads the content of a file. The function open() takes two parameters here: the first is the name of the file and the second is the mode in which the file is opened. 'r' stands for 'read' and 'w' stands for 'write'. After opening the file, we read its content and store it in the variable 'message' using the method 'read()'. The text that comes after the '#' is a comment. A comment is a piece of text that does not affect the output of a program; it is only used to describe what the code is doing. 'print()' is a built-in Python function that shows output in the console.
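The snippet above never closes the file it opens. A common way to handle that in Python is the 'with' statement, which closes the file automatically. The sketch below is only an illustration: the helper name 'read_text_file' and the throwaway demo file are assumptions, not part of the assignment.

```python
import os
import tempfile


def read_text_file(path, encoding='utf-8'):
    """Return the whole content of a text file as a string."""
    # The 'with' statement closes the file automatically, even if read() fails.
    with open(path, 'r', encoding=encoding) as text_file:
        return text_file.read()


# Demo on a throwaway file so the sketch does not depend on PythonCrypto.txt existing.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(demo_path, 'w', encoding='utf-8') as demo_file:
    demo_file.write('Cryptography is the practice of secure communication.')

message = read_text_file(demo_path)
print(message)
```

Passing the encoding explicitly also avoids depending on the default encoding of the operating system.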

Computing SHA512 Hash 

Now that we have the content of the file, let's try to apply some algorithms to it. The first thing we will do is compute the SHA512 digest of the text. To do that, we have to tell Python that we are going to use a function from the PyCrypto library. The statement used to do that is as follows:

from Crypto.Hash import SHA512

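This statement is called an import statement. By convention, import statements are placed at the top of the Python file; strictly speaking, an import only has to run before the first use of the imported name. Now, let's compute the digest. The code for that is as follows: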

from Crypto.Hash import SHA512
file = open("PythonCrypto.txt", 'r')
message = file.read()
print(message)
digest = SHA512.new(message).hexdigest()
print(digest)

If we run the code as it is shown above, we will get the following error message:

TypeError: Unicode-objects must be encoded before hashing

This error means that the hash function works on bytes, not on text, so we have to convert the string to bytes by choosing a text encoding. To do that, we call the method 'encode()' as follows:

from Crypto.Hash import SHA512
file = open("PythonCrypto.txt", 'r')
message = file.read()
print(message)
digest = SHA512.new(message.encode('utf-8')).hexdigest()
print(digest)

This will solve the problem and will show the SHA512 digest of the message. It will be something like this:

6881eae82c054693543a9111f612b7252a19003483f75cd86455d3b72cfb6bbe32882d275f8ce26346f5b48d345d459796a0067031186f9a3196ac9b6964b2a3

This is a hexadecimal representation of the digest. Now, instead of printing the digest, let's store it in a file called "SHA512Digest.txt". The code for performing such a task is as follows:

from Crypto.Hash import SHA512
file = open("PythonCrypto.txt", 'r')
message = file.read()
print(message)
digest = SHA512.new(message.encode('utf-8')).hexdigest()
print(digest)
file = open("SHA512Digest.txt", 'w')
file.write(digest)
file.close()

The given code will create a new file called "SHA512Digest.txt". After that, the method 'write()' will write the digest to the file. The method 'close()' will close the file after writing the digest to it.
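For readers who want to cross-check the digest without PyCrypto, the same steps can be sketched with 'hashlib' from the Python standard library. This is a stand-in, not the PyCrypto API; since both implement the same SHA-512 standard, the hex digests should match for the same input bytes. The helper names and the demo text are assumptions made for this illustration.

```python
import hashlib
import os
import tempfile


def sha512_hex(text, encoding='utf-8'):
    """Return the SHA-512 digest of a string as hexadecimal text."""
    # The text must be encoded to bytes before hashing.
    return hashlib.sha512(text.encode(encoding)).hexdigest()


def write_digest(text, digest_path):
    """Compute the digest of text, store it in digest_path and return it."""
    digest = sha512_hex(text)
    with open(digest_path, 'w') as digest_file:
        digest_file.write(digest)
    return digest


# Write to a temporary folder so the sketch does not touch existing files.
digest_path = os.path.join(tempfile.mkdtemp(), 'SHA512Digest.txt')
digest = write_digest('Alice and Bob', digest_path)
print(len(digest))  # SHA-512 produces 64 bytes, which is 128 hexadecimal characters
```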

Using MD5 Hash With AES Cipher

Now that we have the basics of using PyCrypto, we will start using it in a slightly more advanced way. We will use the AES cipher to encrypt the text and the MD5 hash function to compute a digest that travels with it. Finally, we will send the message to another program using files.

Hashing + Encryption, What Can we Get?

By using the digest of the message, we can check the integrity of the message that is being sent. If the message has been modified while being sent, the digest computed at the receiver's side will be different. Note that a plain digest only detects changes if the digest itself reaches the receiver unmodified. By using AES, we get confidentiality: no one can read the content of the message except those who have the key to decrypt it.
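The integrity check described above can be sketched in a few lines. The example uses 'hashlib' and 'hmac' from the standard library as stand-ins for PyCrypto; the function names and the sample messages are assumptions made for this illustration.

```python
import hashlib
import hmac


def md5_hex(data):
    """Return the MD5 digest of a bytes object as hexadecimal text."""
    return hashlib.md5(data).hexdigest()


def is_unmodified(received_data, expected_digest):
    """Return True if the digest of received_data equals expected_digest."""
    # hmac.compare_digest compares the two strings in constant time.
    return hmac.compare_digest(md5_hex(received_data), expected_digest)


original = b'Meet me at noon'
sent_digest = md5_hex(original)  # the sender stores this next to the message

print(is_unmodified(b'Meet me at noon', sent_digest))  # True
print(is_unmodified(b'Meet me at nine', sent_digest))  # False
```

Changing even one character of the message produces a different digest, which is what the receiver relies on.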

Basics of AES:

The AES cipher takes two inputs: the data to be encrypted and the key. The cipher works on blocks of exactly 16 bytes. This means that to encrypt a larger data set, we have to divide it into parts of 16 bytes each and apply AES to each part. The key can have one of 3 sizes: 16 bytes, 24 bytes or 32 bytes.
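To make the block and key sizes concrete, here is a small sketch in plain Python (no encryption is performed). The names 'split_into_blocks' and 'is_valid_key' are hypothetical helpers, not part of PyCrypto.

```python
BLOCK_SIZE = 16                 # AES always works on 16-byte blocks
VALID_KEY_SIZES = (16, 24, 32)  # AES-128, AES-192 and AES-256


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Split a bytes object into consecutive chunks of block_size bytes.

    The last chunk may be shorter; it has to be padded before encryption.
    """
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def is_valid_key(key):
    """Return True if the key length is one that AES accepts."""
    return len(key) in VALID_KEY_SIZES


blocks = split_into_blocks(b'0123456789abcdef' * 2 + b'tail')
print([len(block) for block in blocks])                  # [16, 16, 4]
print(is_valid_key(b'1' * 16), is_valid_key(b'1' * 20))  # True False
```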

AES in PyCrypto Library

Before using AES, we have to import it as we did with the SHA algorithm, but AES lives in a different place: 'Crypto.Cipher.AES'. According to the documentation, 'AES.new()' takes 3 parameters: a key, a mode and an initialization vector (IV). Whether the initialization vector is used depends on the mode. If the mode is 'AES.MODE_ECB' or 'AES.MODE_CTR', the initialization vector is ignored. In our case, we will use 'AES.MODE_ECB' and ignore the initialization vector. The following code example shows how to encrypt and decrypt a simple text of size 16 bytes using AES in PyCrypto.

from Crypto.Cipher import AES
cipher = AES.new("1111111111111111", AES.MODE_ECB, b'')
#each letter is one byte 
message = '0123456789abcdef' 
print('Original message = '+message)
encryptedMessage = cipher.encrypt('aaaaaaaaaaaaaaaa')
print('Encrypted Message = '+encryptedMessage)
decryptedMessage = cipher.decrypt(encryptedMessage)
print('Decrypted Message = '+decryptedMessage)


The "b''" means that we are supplying an empty byte string to the algorithm as the initialization vector. It will be ignored since the mode is 'AES.MODE_ECB'. If we run the code as it is, it will fail with the following error:

TypeError: Can't convert 'bytes' object to str implicitly

This error means that the data returned by the AES algorithm is of type bytes and cannot be joined with a string as it is. What we have to do is convert it to a string. After doing that, the code will work fine.

from Crypto.Cipher import AES
cipher = AES.new("1111111111111111", AES.MODE_ECB, b'')
message = '0123456789abcdef' 
print('Original message = '+message)
encryptedMessage = cipher.encrypt('aaaaaaaaaaaaaaaa')
print('Encrypted Message = '+str(encryptedMessage))
decryptedMessage = cipher.decrypt(encryptedMessage)
print('Decrypted Message = '+str(decryptedMessage))

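The output of the given code will be something like this (note that the code encrypts the literal string 'aaaaaaaaaaaaaaaa' rather than the variable 'message', so the decrypted text differs from the printed original message):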

Original message = 0123456789abcdef
Encrypted Message = b'\xc6\xb9\xd1\x02\xdd\xea\xb1i\xf3$\xcb\xdb\xe73\xf0\n'
Decrypted Message = b'aaaaaaaaaaaaaaaa'

We can notice that the decrypted message is shown with the prefix 'b' and quotes, which we don't want to appear with the text. For this reason, we will use a different way to display the decrypted message: 'bytes.decode()'.

from Crypto.Cipher import AES

cipher = AES.new("1111111111111111", AES.MODE_ECB, b'')
message = '0123456789abcdef'
print('Original message = '+message)
encryptedMessage = cipher.encrypt('aaaaaaaaaaaaaaaa')
print('Encrypted Message = '+str(encryptedMessage))
decryptedMessage = cipher.decrypt(encryptedMessage)
print('Decrypted Message = '+bytes.decode(decryptedMessage,'utf-8'))
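The difference between 'str()' and 'decode()' on a bytes value can be seen without any encryption at all. The following sketch uses only built-in Python features; the sample byte values are taken from the output shown earlier.

```python
raw = b'aaaaaaaaaaaaaaaa'

as_repr = str(raw)             # "b'aaaaaaaaaaaaaaaa'" (keeps the b prefix and the quotes)
as_text = raw.decode('utf-8')  # 'aaaaaaaaaaaaaaaa' (the actual text)

print(as_repr)
print(as_text)

# Encrypted output is arbitrary binary data and is usually not valid UTF-8,
# so a hexadecimal string is a safer way to display it.
ciphertext_like = b'\xc6\xb9\xd1\x02'
print(ciphertext_like.hex())   # c6b9d102
```

In short, 'decode()' suits data that is known to be text, while 'hex()' suits raw binary such as ciphertext.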

Using MD5 + AES to Hash and Encrypt a File

Now that we know how to use hash functions and ciphers from the PyCrypto library, we will use them in a simple application. First, we will use AES to encrypt the file "PythonCrypto.txt", storing the encrypted data in one file and the key in another. Then we will compute the MD5 hash of the encrypted data and store it in a third file. Lastly, another program will open the 3 generated files to verify the integrity of the data and decrypt it. The code for the first program is as follows:

from Crypto.Hash import MD5
from Crypto.Cipher import AES


key = "1111111111111111" 
cipher = AES.new(key, AES.MODE_ECB, b'')
file = open('PythonCrypto.txt', 'r')
# notice the 'wb'. the 'b' in here means we are writing binary data to file
encryptedFile = open("AESEncrypted.txt", 'wb')
while True:
    data = file.read(16)
    if data:
        encryptedMessage = cipher.encrypt(data)
        encryptedFile.write(encryptedMessage)
    else:
        encryptedFile.close()
        break
file = open("AESKey.txt", 'w')
file.write(key)
file.close()
 
# notice the 'rb' here. This means we are reading binary data 
encryptedFile = open("AESEncrypted.txt", 'rb')
message = encryptedFile.read()
digest = MD5.new(message).hexdigest()
file = open("MD5Digest.txt", 'w')
file.write(digest)
file.close() 

When we run the code, it will fail and the following error will appear:
ValueError: Input strings must be a multiple of 16 in length
At first sight, the 'data' variable should hold 16 bytes since we pass the number 16 to the 'read()' function. However, when we print the data that we read from the file, we will see something like this:
ï»¿Cryptography
The first 3 characters are not part of the text in the file. They are the byte order mark (BOM), which is used to indicate the encoding of the file. Because the file was opened in text mode, 'read(16)' returns 16 characters rather than 16 bytes, and these 3 characters most likely take more than one byte each when the text is converted to bytes for encryption, so the first chunk ends up longer than 16 bytes. To solve the problem, we have to skip the 3 characters: before we go into the loop, we read 3 characters, and only then enter the loop. The code now becomes as follows:

from Crypto.Hash import MD5
from Crypto.Cipher import AES


key = "1111111111111111" 
cipher = AES.new(key, AES.MODE_ECB, b'')
file = open('PythonCrypto.txt', 'r')
encryptedFile = open("AESEncrypted.txt", 'wb')
file.read(3)
while True:
    data = file.read(16)
    if data:
        encryptedMessage = cipher.encrypt(data)
        encryptedFile.write(encryptedMessage)
    else:
        encryptedFile.close()
        break 
file = open("AESKey.txt", 'w')
file.write(key)
file.close()
encryptedFile = open("AESEncrypted.txt", 'rb')
message = encryptedFile.read()
digest = MD5.new(message).hexdigest()
file = open("MD5Digest.txt", 'w')
file.write(digest)
file.close() 


When we run the code again, we still get the same error. The reason is that the length of the file content is not a multiple of 16 bytes, so the last chunk is shorter than 16 bytes. To solve the problem, we pad the last chunk with white spaces until it is 16 characters long. After doing so, the code should work fine.

from Crypto.Hash import MD5
from Crypto.Cipher import AES


key = "1111111111111111" 
cipher = AES.new(key, AES.MODE_ECB, b'')
file = open('PythonCrypto.txt', 'r')
encryptedFile = open("AESEncrypted.txt", 'wb')
file.read(3)
while True:
    data = file.read(16)
    if data:
        if len(data) != 16:
            while len(data) != 16:
                data += ' '
        encryptedMessage = cipher.encrypt(data)
        encryptedFile.write(encryptedMessage)
    else:
        encryptedFile.close()
        break 
file = open("AESKey.txt", 'w')
file.write(key)
file.close()
encryptedFile = open("AESEncrypted.txt", 'rb')
message = encryptedFile.read()
digest = MD5.new(message).hexdigest()
file = open("MD5Digest.txt", 'w')
file.write(digest)
file.close()
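Padding with spaces works for this file, but the receiver cannot tell padding spaces from spaces that really end the text. A widely used alternative is PKCS#7-style padding, where each padding byte stores the number of padding bytes. The sketch below is not what the program in this post does; it is a plain Python illustration, and the names 'pad' and 'unpad' are chosen for this example.

```python
BLOCK_SIZE = 16


def pad(data, block_size=BLOCK_SIZE):
    """Append N bytes, each of value N, so the length is a multiple of block_size."""
    # A full extra block is added when the data is already aligned,
    # so that unpad() can always find the padding.
    padding_length = block_size - (len(data) % block_size)
    return data + bytes([padding_length]) * padding_length


def unpad(padded, block_size=BLOCK_SIZE):
    """Remove the padding added by pad(); raise ValueError if it looks wrong."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError('padded data must be a non-empty multiple of the block size')
    padding_length = padded[-1]
    if padding_length < 1 or padding_length > block_size:
        raise ValueError('invalid padding length')
    if padded[-padding_length:] != bytes([padding_length]) * padding_length:
        raise ValueError('invalid padding bytes')
    return padded[:-padding_length]


padded = pad(b'ends with space ')
print(len(padded))                         # 32
print(unpad(padded) == b'ends with space ')  # True
```

With this scheme the trailing space of the original text survives the round trip, which space padding followed by stripping would not guarantee.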


The program that encrypts the file and computes the digest looks a little messy. We will restructure it into functions to make it cleaner. We will create 3 functions: one to encrypt the file, one to compute the digest and one to run the other two. The syntax for defining a function in Python is as follows:

def method_name(zero_or_more_params):
    # method code goes here
    pass

As we can see, it is very simple. Note that a function must be defined before the point where it is called. After restructuring the code, it becomes as follows:

from Crypto.Hash import MD5
from Crypto.Cipher import AES


def aes_encrypt(key, key_file_name, file_name, encrypted_file_name):
    cipher = AES.new(key, AES.MODE_ECB, b'')
    file = open(file_name, 'r')
    encrypted_file = open(encrypted_file_name, 'wb')
    file.read(3)
    while True:
        data = file.read(16)
        if data:
            if len(data) != 16:
                while len(data) != 16:
                    data += ' '
            encrypted_message = cipher.encrypt(data)
            encrypted_file.write(encrypted_message)
        else:
            encrypted_file.close()
            break     
    file = open(key_file_name, 'w')
    file.write(key)
    file.close()


def compute_md5_digest(digest_file_name, file_to_digest):
    encrypted_file = open(file_to_digest, 'rb')
    message = encrypted_file.read()
    digest = MD5.new(message).hexdigest()
    md5_file = open(digest_file_name, 'w')
    md5_file.write(digest)
    md5_file.close()


def main():
    key = '1111111111111111'    
    data_file_name = 'PythonCrypto.txt'    
    encrypted_file_name = 'AESEncrypted.txt'    
    key_file_name = 'AESKey.txt'    
    digest_file_name = 'MD5Digest.txt'    
    aes_encrypt(key, key_file_name, data_file_name, encrypted_file_name)
    compute_md5_digest(digest_file_name, encrypted_file_name)

main()

Now we will start working on the other program, which will verify the integrity of the message and decrypt it. The program will be very short. The first thing it will do is open the encrypted file, compute the digest and compare it with the digest that was created by the first program. If the two are the same, then the message was not modified and it is safe to decrypt it. We will write this program as a function. The code for that is as follows:

def program_2():
    print('Running Program #2')
    encrypted_file_name = 'AESEncrypted.txt'     
    key_file_name = 'AESKey.txt' 
    program_1_digest_file = 'MD5Digest.txt' 
    decrypted_file_name = 'AESDecrypt.txt'     
    file = open(program_1_digest_file)
    program_1_digest = file.read()
    file = open(encrypted_file_name,'rb')
    message = file.read()
    digest = MD5.new(message).hexdigest()
    if digest == program_1_digest:
        print('Data Integrity Verified!')
        print('Decrypting data')
        key_file = open(key_file_name)
        key = key_file.read()
        print('Key = '+key)
        encrypted_file = open(encrypted_file_name,'rb')
        decrypted_file = open(decrypted_file_name,'w')
        cipher = AES.new(key, AES.MODE_ECB, b'')
        while True:
            data = encrypted_file.read(16)
            if data:
                decrypted_message = cipher.decrypt(data)
                decrypted_file.write(bytes.decode(decrypted_message, 'utf-8'))
            else:
                print('Decrypting Finished!')
                print('Decrypted Data can be found inside '+decrypted_file_name)
                decrypted_file.close()
                break 
    else:
        print('Data was modified by someone!!')

Now we can merge this function with the previous code to get our full two programs:

from Crypto.Hash import MD5
from Crypto.Cipher import AES


def aes_encrypt(key, key_file_name, file_name, encrypted_file_name):
    print('Encrypting '+file_name+' using AES')
    print('Key = '+key)
    print('File to encrypt = '+file_name)
    print('Encrypted file name = '+encrypted_file_name)
    print('Note: the encrypted data is stored as binary')
    cipher = AES.new(key, AES.MODE_ECB, b'')
    file = open(file_name, 'r')
    encrypted_file = open(encrypted_file_name, 'wb')
    file.read(3)
    while True:
        data = file.read(16)
        if data:
            if len(data) != 16:
                while len(data) != 16:
                    data += ' '
            encrypted_message = cipher.encrypt(data)
            encrypted_file.write(encrypted_message)
        else:
            encrypted_file.close()
            break     
    file = open(key_file_name, 'w')
    file.write(key)
    file.close()
    print('File Encryption Completed!')


def compute_md5_digest(digest_file_name, file_to_digest):
    print('Computing MD5 digest')
    print('Input file = '+file_to_digest)
    print('Output file = '+digest_file_name)
    encrypted_file = open(file_to_digest, 'rb')
    message = encrypted_file.read()
    digest = MD5.new(message).hexdigest()
    md5_file = open(digest_file_name, 'w')
    md5_file.write(digest)
    md5_file.close()
    print('MD5 digest finished!')


def program_1():
    print('Running Program #1')
    key = '1111111111111111'
    data_file_name = 'PythonCrypto.txt'
    encrypted_file_name = 'AESEncrypted.txt' 
    key_file_name = 'AESKey.txt' 
    digest_file_name = 'MD5Digest.txt'     
    aes_encrypt(key, key_file_name, data_file_name, encrypted_file_name)
    compute_md5_digest(digest_file_name, encrypted_file_name)
    print('Program #1 Finished!')


def program_2():
    print('Running Program #2')
    encrypted_file_name = 'AESEncrypted.txt' 
    key_file_name = 'AESKey.txt' 
    program_1_digest_file = 'MD5Digest.txt' 
    decrypted_file_name = 'AESDecrypt.txt'     
    file = open(program_1_digest_file)
    program_1_digest = file.read()
    file = open(encrypted_file_name, 'rb')
    message = file.read()
    digest = MD5.new(message).hexdigest()
    if digest == program_1_digest:
        print('Data Integrity Verified!')
        print('Decrypting data')
        key_file = open(key_file_name)
        key = key_file.read()
        print('Key = '+key)
        encrypted_file = open(encrypted_file_name, 'rb')
        decrypted_file = open(decrypted_file_name, 'w')
        cipher = AES.new(key, AES.MODE_ECB, b'')
        while True:
            data = encrypted_file.read(16)
            if data:
                decrypted_message = cipher.decrypt(data)
                decrypted_file.write(bytes.decode(decrypted_message, 'utf-8'))
            else:
                print('Decrypting Finished!')
                print('Decrypted Data can be found inside '+decrypted_file_name)
                decrypted_file.close()
                break 
    else:
        print('Data was modified by someone!!')
    print('Program #2 Finished!')


# inside main, we can select which program to run. 
# program_1() means the one who will send the data. 
# program_2 means the one who will receive the data.
def main():
    program_1()
    program_2()


main()

The next step will be to learn about asymmetric keys in PyCrypto and how to use them to encrypt and decrypt the file. Basically, we will be using digital signatures.


Common Errors and how to Fix Them:

Error Message: 
ImportError: No module named 'winrandom'

Fix:
Edit the file 'Python35\lib\site-packages\Crypto\Random\OSRNG\nt.py'. Change 'import winrandom' to 'from . import winrandom'.

Error Message: 
ImportError: DLL load failed: The specified module could not be found.
Fix:
This message appears when trying to import something from the library. The reason is that the installation is broken. You have to re-install the PyCrypto library.

Error Message: 
ValueError: Input strings must be a multiple of 16 in length.
This error can appear if an algorithm needs a specific data length. For example, the input to AES in ECB mode must be a multiple of 16 bytes in length; if it is not, this error will appear.
Fix: Check that the data length is a multiple of 16 bytes, and pad the data if it is not.

Error Message:
TypeError: Can't convert 'bytes' object to str implicitly
This error appears when we try to join a string with another data type such as bytes using '+'. Strings can only be concatenated with other strings.
Fix:
Convert the data to a string first, for example with str(data_to_print).

Error Message:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
This error can happen when reading from a file. It means that the file is not encoded in 'utf-8' (a first byte of 0xff usually indicates a UTF-16 file). All you have to do is open the file in a text editor and save it in 'utf-8' encoding.

Error Message:
TypeError: write() argument must be str, not bytes
This error can happen when we try to write data into a file. The method 'write()' only accepts data of type string if the file was opened with 'w'. This means we have to convert our data into a string, for example with 'str(data)', before calling 'write()'. Another solution is to open the file with 'wb' if we are writing raw bytes to it.
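Several of the errors above come down to the byte order mark and the encoding of the input file. Instead of skipping characters by hand or re-saving the file, the encoding can be guessed from the first bytes. The following is only a sketch using the standard library 'codecs' module; the helper names are assumptions, and it deliberately ignores rarer cases such as UTF-32.

```python
import codecs
import os
import tempfile


def detect_bom_encoding(path, default='utf-8'):
    """Guess a text encoding from the byte order mark at the start of a file."""
    with open(path, 'rb') as raw_file:
        head = raw_file.read(3)
    if head.startswith(codecs.BOM_UTF8):
        return 'utf-8-sig'  # decodes UTF-8 and drops the BOM
    if head.startswith(codecs.BOM_UTF16_LE) or head.startswith(codecs.BOM_UTF16_BE):
        return 'utf-16'     # uses the BOM to pick the byte order, then drops it
    return default          # no BOM found (UTF-32 is not handled in this sketch)


def read_text(path):
    """Read a text file, choosing the encoding from its byte order mark."""
    with open(path, 'r', encoding=detect_bom_encoding(path)) as text_file:
        return text_file.read()


# Demo: a UTF-8 file that starts with a BOM, like the one discussed in this post.
demo_path = os.path.join(tempfile.mkdtemp(), 'with_bom.txt')
with open(demo_path, 'wb') as demo_file:
    demo_file.write(codecs.BOM_UTF8 + 'Cryptography'.encode('utf-8'))

print(read_text(demo_path))  # Cryptography
```

Read this way, the text comes back without the 3 extra characters, so no manual 'read(3)' is needed.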


In part 2, we will learn how to use digital signatures to sign our data. Part 2 will be posted in the future.
 






