Starting from:

$25

EECS388-Project 1 Cryptographic Attacks Solved

In this project, you will investigate vulnerabilities in widely used cryptographic hash functions, including length-extension attacks and collision vulnerabilities, and an implementation vulnerability in a popular digital signature scheme. In Part 1, we will guide you through attacking the authentication capability of an imaginary server API. The attack will exploit the length-extension vulnerability of hash functions in the MD5 and SHA family. In Part 2, you will use a cutting-edge tool to generate different messages with the same MD5 hash value (collisions). You’ll then investigate how that capability can be exploited to conceal malicious behavior in software. In Part 3, you will learn about an attack against certain implementations of RSA padding; then, you will forge a digital signature using your own implementation of this attack.

Objectives:

•   Understand how to apply basic cryptographic integrity and authentication primitives.

•   Investigate how cryptographic failures can compromise the security of applications.

•   Appreciate why you should use HMAC-SHA256 as a substitute for common hash functions.

•   Understand why padding schemes are integral to cryptographic security.

Part 1. Length Extension
In most applications, you should use MACs such as HMAC-SHA256 instead of plain cryptographic hash functions (e.g. MD5, SHA-1, or SHA-256), because hashes, also known as digests, fail to match our intuitive security expectations. What we really want is something that behaves like a pseudorandom function, which HMACs seem to approximate and hash functions do not.

One difference between hash functions and pseudorandom functions is that many hashes are subject to length extension. All the hash functions we’ve discussed use a design called the Merkle-Damgård construction. Each is built around a compression function f and maintains an internal state s, which is initialized to a fixed constant. Messages are processed in fixed-sized blocks by applying the compression function to the current state and current block to compute an updated internal state, i.e. si+1 = f(si,bi). The result of the final application of the compression function becomes the output of the hash function.

A consequence of this design is that if we know the hash of an n-block message, we can find the hash of longer messages by applying the compression function for each block bn+1,bn+2,... that we want to add. This process is called length extension, and it can be used to attack many applications

of hash functions.

1.1        Experiment with Length Extension in Python
To experiment with this idea, we’ll use a Python implementation of the MD5 hash function, though SHA-1 and SHA-256 are vulnerable to length extension too. You can download the pymd5 module at https://www.eecs.umich.edu/courses/eecs388/static/project1/pymd5.py and learn

how to use it by running $ pydoc pymd5. To follow along with these examples, run Python in interactive mode ($ python -i) and run the command from pymd5 import md5, padding.

Consider the string “Use HMAC, not hashes”. We can compute its MD5 hash by running:

m = "Use HMAC, not hashes" h = md5()

h.update(m) print h.hexdigest() or, more compactly, print md5(m).hexdigest(). The output should be:

3ecc68efa1871751ea9b0b1a5b25004d

MD5 processes messages in 512-bit blocks, so, internally, the hash function pads m to a multiple of that length. The padding consists of the bit 1, followed by as many 0 bits as necessary, followed by a 64-bit count of the number of bits in the unpadded message. (If the 1 and count won’t fit in the current block, an additional block is added.) You can use the function padding(count) in the pymd5 module to compute the padding that will be added to a count -bit message.

Even if we didn’t know m, we could compute the hash of longer messages of the general form m + padding(len(m)*8) + suffix by setting the initial internal state of our MD5 function to

MD5(m), instead of the default initialization value, and setting the function’s message length counter to the size of m plus the padding (a multiple of the block size). To find the padded message length, guess the length of m and run bits = (length_of_m + len(padding(length_of_m*8)))*8.

The pymd5 module lets you specify these parameters as additional arguments to the md5 object: h = md5(state="3ecc68efa1871751ea9b0b1a5b25004d".decode("hex"), count=512)

Now you can use length extension to find the hash of a longer string that appends the suffix “Good advice." Simply run:

x = "Good advice" h.update(x) print h.hexdigest()

to execute the compression function over x and output the resulting hash. Verify that it equals the MD5 hash of m + padding(len(m)*8) + x. Notice that, due to the length-extension property of MD5, we didn’t need to know the value of m to compute the hash of the longer string—all we needed to know was m’s length and its MD5 hash.

This component is intended to introduce length extension and familiarize you with the Python MD5 module we will be using; you will not need to submit anything for it.

1.2        Conduct a Length Extension Attack
Length extension attacks can cause serious vulnerabilities when people mistakenly try to construct something like an HMAC by using hash(secret k message)[1]. The National Bank of EECS 388, which is not up-to-date on its security practices, hosts an API that allows its client-side applications to perform actions on behalf of a user by loading URLs of the form:

https://eecs388.org/project1/api?token=402a574d265dc212ee64970f159575d0

&user=admin&command1=ListFiles&command2=NoOp

where token is MD5(user’s 8-character password k user= ...[the rest of the URL starting from user= and ending with the last command]).

Using the techniques that you learned in the previous section and without guessing the password, apply length extension to create a URL ending with &command3=UnlockAllSafes that is treated as valid by the server API. You have permission to use our server to check whether your command is accepted.

Hint: You might want to use the quote() function from Python’s urllib module to encode non-ASCII characters in the URL.

Historical fact: In 2009, security researchers found that the API used by the photo-sharing site Flickr suffered from a length-extension vulnerability almost exactly like the one in this exercise.

What to submit         A Python 2.x script named len_ext_attack.py that:

1.    Accepts a valid URL in the same form as the one above as a command line argument.

2.    Modifies the URL so that it will execute the UnlockAllSafes command as the user.

3.    Successfully performs the command on the server and prints the server’s response.

You should make the following assumptions:

•   The input URL will have the same form as the sample above, but we may change the server hostname and the values of token, user, command1, and command2 (although they will remain in the same order). These values may be of substantially different lengths than in the sample.

•   The input URL may be for a user with a different password, but the length of the password will be unchanged.

•   The server’s output might not exactly match what you see during testing.

You can base your code on the following example:

import httplib, urlparse, sys url = sys.argv[1]

# Your code to modify url goes here

parsedUrl = urlparse.urlparse(url)

conn = httplib.HTTPSConnection(parsedUrl.hostname,parsedUrl.port) conn.request("GET", parsedUrl.path + "?" + parsedUrl.query) print conn.getresponse().read()

Part 2. MD5 Collisions
MD5 was once the most widely used cryptographic hash function, but today it is considered dangerously insecure. This is because cryptanalysts have discovered efficient algorithms for finding collisions—pairs of messages with the same MD5 hash value.

The first known collisions were announced on August 17, 2004, by Xiaoyun Wang, Dengguo Feng, Xuejia Lai, and Hongbo Yu. Here’s one pair of colliding messages they published:

Message 1:

d131dd02c5e6eec4693d9a0698aff95c 2fcab58712467eab4004583eb8fb7f89 55ad340609f4b30283e488832571415a 085125e8f7cdc99fd91dbdf280373c5b d8823e3156348f5bae6dacd436c919c6 dd53e2b487da03fd02396306d248cda0 e99f33420f577ee8ce54b67080a80d1e c69821bcb6a8839396f9652b6ff72a70 Message 2:

d131dd02c5e6eec4693d9a0698aff95c 2fcab50712467eab4004583eb8fb7f89 55ad340609f4b30283e4888325f1415a 085125e8f7cdc99fd91dbd7280373c5b d8823e3156348f5bae6dacd436c919c6 dd53e23487da03fd02396306d248cda0 e99f33420f577ee8ce54b67080280d1e c69821bcb6a8839396f965ab6ff72a70

Convert each group of hex strings into a binary file.

(On Linux, run $ xxd -r -p file.hex > file.)

1.    What are the MD5 hashes of the two binary files? Verify that they’re the same.

($ openssl dgst -md5 file1 file2)

2.    What are their SHA-256 hashes? Verify that they’re different.

($ openssl dgst -sha256 file1 file2)

This component is intended to introduce you to MD5 collisions; you will not submit anything for it.

2.1        Generating Collisions Yourself
In 2004, Wang’s method took more than 5 hours to find a collision on a desktop PC. Since then, researchers have introduced vastly more efficient collision finding algorithms. You can compute your own MD5 collisions using a tool written by Marc Stevens that uses a more advanced technique. You can download the fastcoll tool here: http://www.win.tue.nl/hashclash/fastcoll_v1.0.0.5.exe.zip (Windows executable) or http://www.win.tue.nl/hashclash/fastcoll_v1.0.0.5-1_source.zip (source code)

If you are building fastcoll from source, you can compile using this makefile: https://www. eecs.umich.edu/courses/eecs388/static/project1/Makefile. You will also need the

Boost libraries. On Ubuntu, you can install these using apt-get install libboost-all-dev. On OS X, you can install Boost via the Homebrew package manager using brew install boost.

1.    Generate your own collision with this tool. How long did it take?

($ time fastcoll -o file1 file2)

2.    What are your files? To get a hex dump, run $ xxd -p file.

3.    What are their MD5 hashes? Verify that they’re the same.

4.    What are their SHA-256 hashes? Verify that they’re different.

What to submit   A text file named generating_collisions.txt containing your answers.

Format your file using this template. Make sure to follow the format precisely as this part is autograded:

# Question 1

time_for_fastcoll (e.g. 3.456s)

# Question 2 file1: file1_hex_dump file2: file2_hex_dump # Question 3 md5_hash

# Question 4 file1: file1_sha256_hash file2:

file2_sha256_hash

2.2        A Hash Collision Attack
The collision attack lets us generate two messages with the same MD5 hash and any chosen

(identical) prefix. Due to MD5’s length-extension behavior, we can append any suffix to both messages and know that the longer messages will also collide. This lets us construct files that differ only in a binary “blob” in the middle and have the same MD5 hash, i.e. prefix k blobA k suffix and prefix k blobB k suffix.

We can leverage this to create two programs that have identical MD5 hashes but wildly different behaviors. We’ll use Python, but almost any language would do. Put the following three lines into a file called prefix:

#!/usr/bin/python

# -*- coding: utf-8 -*blob = """

and put these three lines into a file called suffix:

"""

from hashlib import sha256 print sha256(blob).hexdigest()

Now use fastcoll to generate two files with the same MD5 hash that both begin with prefix.

($ fastcoll -p prefix -o col1 col2). Then append the suffix to both ($ cat col1 suffix > file1.py; cat col2 suffix > file2.py). Verify that file1.py and file2.py have the

same MD5 hash but generate different output.

Extend this technique to produce another pair of programs, good and evil, that also share the same MD5 hash. One program should execute a benign payload: print "I come in peace." The second should execute a pretend malicious payload: print "Prepare to be destroyed!". Note that we may rename these program before grading them.

What to submit         Two Python 2.x scripts named good.py and evil.py that have the same MD5 hash, have different SHA-256 hashes, and print the specified messages.

Part 3. RSA Signature Forgery
A secure implementation of RSA encryption or digital signatures requires a proper padding scheme. RSA without padding, also known as textbook RSA, has several undesirable properties. One property is that it is trivial for an attacker with only an RSA public key pair (n,e) to produce a mathematically valid (message,signature) pair by choosing an s and returning (se,s).

In order to prevent an attacker from being able to forge valid signatures in this way, RSA implementations use a padding scheme to provide structure to the values that are encrypted or signed. The most commonly used padding scheme in practice is defined by the PKCS #1 v1.5 standard, which can be found at https://tools.ietf.org/html/rfc2313. The standard defines, among other things, the format of RSA keys and signatures and the procedures for generating and validating RSA signatures.

3.1         Validating RSA Signatures
You can experiment with validating RSA signatures yourself. Create a text file called key.pub that contains the following RSA public key:

-----BEGIN PUBLIC KEY-----

MFowDQYJKoZIhvcNAQEBBQADSQAwRgJBALvMZ+YhjwKpteNYzzbPHvPqdvMrs2Rf HeIhK+ui9v0YHNyFW6aBwwG/6sfbvxx4OlePBWjRhpoxDC5A/J+rVXkCAQM=

-----END PUBLIC KEY-----

Confirm that the key has a 512-bit modulus with an exponent of 3. You can view the modulus and public exponent of this key by running:

$ openssl rsa -in key.pub -pubin -text -noout

Create a file containing only the text EECS 388 rul3z! ($ echo -n ’EECS 388 rul3z!’ >

myfile). The following is a base64-encoded signature of the file using the private key corresponding to the public key above.

kbenR8AWBIKUroVOqfUFJKL61+8EFu88yKPRg+o/XGCWLFlxTaVWyDnDE2UciNns NcFQRaNev5+nCxMCUfZbZw==

Copy the base64-encoded signature to a file, and then convert it from base64 to raw binary. ($ base64 --decode -i sig.b64 > sig). Verify the signature against the file you created:

$ openssl dgst -sha1 -verify key.pub -signature sig myfile

We can also use basic math operations in Python to explore this signature further. Remember, RSA ciphertexts, plaintexts, exponents, moduli, and signatures are actually all integers.

Usually, you would use a cryptography library to import a public key. However, for the purposes of this part of the assignment, you can just manually assign the modulus and exponent as integers in Python based on the earlier output from OpenSSL. You may find the following command useful: $ openssl rsa -in key.pub -text -noout -pubin | egrep '^ ' | tr -d ' :\n'

Launch Python in interactive mode and assign the modulus and the exponent to integer variables:

# n is the modulus from the key.

# You can just assign it as a hexadecimal literal--remember to start with 0x # It will look something like:

n = 0x00d56d87ba372303f8...104dd059e49e435f

# e is the exponent from the key e = 3

We can also load the signature into Python. Like the modulus and the exponent, we’ll convert the signature to an integer:

signature = int(open('sig').read().encode('hex'), 16)

Now reverse the signing operation by computing signaturee mod n: x = pow(signature, e, n)

You can print the resulting value as a 64-byte (512-bit) integer in hex: print "%0128x" % x

You should see something like 0001fffff...f82b4557c1d37f99c6d6f5ff00900.

Verify that the last 20 bytes of this value match the SHA-1 hash of your file:

import hashlib m = hashlib.sha1()

m.update("EECS 388 rul3z!") print m.hexdigest()

The hash has been padded using the PKCS #1 v1.5 signature scheme, which specifies that, for a SHA-1 hash with a k-bit RSA key, the signed value will contain the following bytes: 0001FFFFFF ··· FF003021300906052B0E03021A05000414XXXXXXXX ··· XX

 

| {z} | {z } | {z } k/8−38 bytes ASN.1 “magic” bytes denoting type of hash algorithm SHA-1 digest (20 bytes)

The number of FF bytes varies such that the size of the result is equal to the size of the RSA key. Confirm that the value you computed above matches this format. It is crucial for implementations to verify that every bit is exactly as it should be, but sometimes developers can be lazy...

3.2       Bleichenbacher’s Attack
It’s tempting for an implementer to validate the signature padding as follows: (1) confirm that the total length equals the key size; (2) strip off the bytes 0001, followed by any number of FF bytes, then 00; (3) parse the ASN.1 bytes; (4) verify that the next 20 bytes are the correct SHA-1 digest.

This procedure does not check the length of the FF bytes, nor does it verify that the hash is in the least significant (rightmost) bytes of the string. As a result, it will accept malformed values that have “garbage” bytes following the digest, like this example, which has only one FF:

0001FF003021300906052B0E03021A05000414 XXXXXX ··· XX YYYYYYYY ··· YY

 

                            |                                         {z                                         } |             {z             } |                           {z                                        }

                                ASN.1 “magic” bytes denoting type of hash algorithm                  SHA-1 digest (20 bytes) k/8−39 arbitrary bytes

Convince yourself that this value would be accepted by the incorrect implementation described above, and that the bytes at the end labeled YY would be ignored. When an implementation uses this lenient, incorrect parsing, an attacker can easily create forged signatures that it will accept.

This possibility is particularly troubling when RSA is used with e = 3. Consider the case with RSA encryption: If we encrypt an unpadded message m that is much shorter than k-bits, then m3 < n. Thus, the “encrypted” message c = me = m3 mod n does not “wrap around” the modulus n. In this case, RSA doesn’t provide good security, since the attacker can simply take the normal cube root of the ciphertext to find the plaintext, m = c1/3. It’s easy to reverse normal exponentiation, as opposed to modular exponentiation!

Now recall that RSA signature validation is analogous to RSA encryption. If the signature uses e = 3, the validator calculates se = s3 mod n and checks that the result is the correct PKCS-padded digest of the signed message.

Here comes the attack: For a 2048-bit key, a correctly padded value for an RSA signature using a SHA-1 hash should have k/8−38 = 2048/8−38 = 218 bytes of FFs. But what if there were only one FF, as in the example shown above? This would leave space for 217 arbitrary bytes at the end of the value. A correct implementation of signature validation would notice the extra bytes and reject the signature, but the weak implementation described above would ignore these bytes.

To construct a signature that would validate against such an implementation, the attacker needs to find a number x such that x3 < n and where x3 matches the format of the malformed example shown above. Since the implementation ignores the last 217 bytes, the attacker has enough flexibility to find a perfect cube of this form. To do this, you can construct an integer whose most significant bytes have the correct format, including the digest of the target message, and set the last 217 bytes to 00. Then take the cube root, rounding as appropriate.

Historical fact: This attack was discovered by Daniel Bleichenbacher, who presented it in a lightning talk at the rump session at the Crypto 2006 conference. His talk is described in this mailing list post: https://mailarchive.ietf.org/arch/msg/openpgp/5rnE9ZRN1AokBVj3VqblGlP63QE. At

the time, many important implementations of RSA signatures were found to be vulnerable to this attack, including OpenSSL. In 2014, the Firefox TLS implementation was also found to be vulnerable to this type of attack: https://www.mozilla.org/security/advisories/mfsa2014-73/.

3.3        Constructing Forged Signatures
The National Bank of EECS 388 has a website at https://eecs388.org/project1/ that its employees use to initiate wire transfers between bank accounts. To authenticate each transfer request, the control panel requires a signature from a particular 2048-bit RSA key that is listed on the website’s home page. Unfortunately, this control panel is running old, unpatched software that is vulnerable to signature forgery.

Using the signature forgery technique described above, produce an RSA signature that the National Bank of EECS 388 site accepts as valid.

What to submit        A Python script called bleichenbacher.py that:

1.    Accepts a double-quoted string as a command-line argument.

2.    Prints a base64-encoded forged signature of the input string.

You have our permission to use https://eecs388.org/project1/ to test your signatures, but when we grade your program it will not have access to the network.

We have provided a Python library, roots.py, that provides several useful functions that you may wish to use when implementing your solution. You can download roots.py at https://www.eecs. umich.edu/courses/eecs388/static/project1/roots.py. To examine its documentation, run $ pydoc roots.

To use these functions, you will have to import roots.py. You can start with the following template: from roots import * import hashlib import sys message = sys.argv[1]

# Your code to forge a signature goes here. print integer_to_base64(forged_signature)

Part 4. Writeup
1.    With reference to the construction of HMAC, explain how changing the design of the API in Part 1.2 to use token= HMACuser’s password(user=...) would avoid the length extension vulnerability.

2.    Briefly explain why the technique you explored in Part 2.2 poses a danger to systems that rely on digital signatures to verify the integrity of programs before they are installed or executed. Examples include Microsoft Authenticode and most Linux package managers. (You may assume that these systems sign MD5 hashes of the programs.)

3.    Since 2010, NIST has specified that RSA public exponents (e) should be at least 216 +1. Briefly explain why Bleichenbacher’s attack would not work for these keys.


 
[1] k is the symbol for contatenation, i.e. “hello" k “world" = “helloworld".

More products