Viewing C object code using Objdump

In computing, code written in a higher-level language such as C or C++ is translated into a lower-level form by a compiler. The compiler's output is generally assembly or object code, both of which are quite low-level. Using objdump, we can view the object code produced from a compiled C program.

Objdump is a program that displays information about the object files it is given. The file that gcc creates when our code is compiled is in ELF (Executable and Linkable Format); by default it is named “a.out”. We can pass this file to objdump to view the program's object code.
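To give a rough idea of what "ELF" means at the byte level: an ELF file starts with a four-byte magic number, 0x7f followed by the ASCII letters E, L and F. The sketch below checks a buffer for that signature. The function name has_elf_magic is made up for this post and is not part of any library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf starts with the ELF magic
 * number (0x7f 'E' 'L' 'F'), 0 otherwise. */
int has_elf_magic(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    /* A buffer shorter than the magic number cannot match. */
    if (buf == NULL || len < sizeof magic)
        return 0;
    return memcmp(buf, magic, sizeof magic) == 0;
}
```

Reading the first four bytes of a.out with fread() and passing them to this function should give 1 for the executables built in this post.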


I will be compiling my C code in gcc. The flags that I use are listed and described here.

-g               # enable debugging information
-O0              # do not optimize (that's a capital letter and then the digit zero)
-fno-builtin     # do not use builtin function optimizations
-static          # link against static libraries only

The Objdump flags that I will use are listed and described here.

-f          # display header information for the entire file
-s          # display the full contents of each section
-d          # disassemble sections containing code
--source    # (implies -d) show source code, if available, along with disassembly

These tests will be performed on an x86_64 platform, so results may vary on a different architecture.


For the first couple of tests, I will use a simple C hello world program, saved as helloworld.c.

#include <stdio.h>
int main() {
    printf("Hello World!\n");
}

I will compile it with gcc using the -g, -O0 and -fno-builtin flags. The objdump program will be used in all the tests to generate the object code listing from the ELF file.

gcc -g -O0 -fno-builtin helloworld.c
objdump -f -s -d --source a.out

The resulting output is about 18.6 KB in size. If you look at the code under the <main> label, you can see the assembly equivalent of our C code interleaved with the original C source.

0000000000400536 <main>:


#include <stdio.h>

int main() {
  400536:	55                   	push   %rbp
  400537:	48 89 e5             	mov    %rsp,%rbp
    printf("Hello World!\n");
  40053a:	bf e0 05 40 00       	mov    $0x4005e0,%edi
  40053f:	b8 00 00 00 00       	mov    $0x0,%eax
  400544:	e8 c7 fe ff ff       	callq  400410 <printf@plt>
}
  400549:	5d                   	pop    %rbp
  40054a:	c3                   	retq   
  40054b:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)

Next I will add the -static option when compiling the program, then run the output through objdump with the same flags.

gcc -g -O0 -fno-builtin -static helloworld.c

Right away, the objdump output is noticeably larger than in the previous test: it is now 10,607 KB (about 10.6 MB). Looking at the object code, there are also some differences in certain sections and in the disassembly.

 0000000000400b5e <main>:

#include <stdio.h>

int main() {
 400b5e: 55 push %rbp
 400b5f: 48 89 e5 mov %rsp,%rbp
 printf("Hello World!\n");
 400b62: bf 10 09 49 00 mov $0x490910,%edi
 400b67: b8 00 00 00 00 mov $0x0,%eax
 400b6c: e8 0f 0b 00 00 callq 401680 <_IO_printf>
}
 400b71: 5d pop %rbp
 400b72: c3 retq 
 400b73: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
 400b7a: 00 00 00 
 400b7d: 0f 1f 00 nopl (%rax)

We can see that the C library code is now linked directly into our executable, and the call to printf@plt is replaced with a direct call to the <_IO_printf> function.


Now I will compile without the -fno-builtin option.

gcc -g -O0 -static helloworld.c

Examining the objdump output, we notice that there is less of it this time than with the -fno-builtin option: about 10,306 KB.

0000000000400b5e <main>:


#include <stdio.h>

int main() {
 400b5e: 55 push %rbp
 400b5f: 48 89 e5 mov %rsp,%rbp
 printf("Hello World!\n");
 400b62: bf 50 0a 49 00 mov $0x490a50,%edi
 400b67: e8 04 0b 00 00 callq 401670 <_IO_puts>
}
 400b6c: 5d pop %rbp
 400b6d: c3 retq 
 400b6e: 66 90 xchg %ax,%ax

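In the assembly output, the only major difference is that our program now calls the <_IO_puts> function instead of <_IO_printf>. With builtin optimizations enabled, gcc replaces a printf() call that only prints a fixed string ending in a newline with the simpler puts(). The mov $0x0,%eax before the call is gone too, since it is only needed when calling a variadic function such as printf().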
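To make the printf-to-puts swap concrete, here is a small sketch that puts the two forms side by side. The function names are invented for illustration, and the second function is only an approximation of what gcc generates, not something taken from its output.

```c
#include <stdio.h>

/* What the source file says: a printf() call with a constant string
 * and no format specifiers. */
void greet_as_written(void)
{
    printf("Hello World!\n");
}

/* Roughly what gcc emits when builtins are enabled: puts() appends the
 * newline itself, so the trailing "\n" is dropped from the string. */
int greet_as_compiled(void)
{
    return puts("Hello World!");
}
```

Both functions print the same line, which is why the compiler is free to pick the cheaper call.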


For this test we will remove the -g option, so debugging information is no longer included in the object code.

gcc -O0 -static helloworld.c

Now the objdump output is smaller than in the previous test, about 10,116 KB in size.

0000000000400b5e <main>:
 400b5e: 55 push %rbp
 400b5f: 48 89 e5 mov %rsp,%rbp
 400b62: bf 50 0a 49 00 mov $0x490a50,%edi
 400b67: e8 04 0b 00 00 callq 401670 <_IO_puts>
 400b6c: 5d pop %rbp
 400b6d: c3 retq 
 400b6e: 66 90 xchg %ax,%ax

With the -g flag removed, debugging information is absent from the object code, so objdump can no longer show the C source alongside the disassembly.


The next program passes additional arguments to printf(). Notice how the assembly code uses registers to handle the additional arguments.

#include <stdio.h>

int main() {
    printf("Hello World! %d %d %d %d %d %d %d %d %d %d\n",1,2,3,4,5,6,7,8,9,10);
}

We will use the same compilation options as before, this time on the new program, and the same objdump command.

gcc -O0 -static helloworld2.c

Now we can see that more object code is generated for the function. The overall size of the output is 10,705 KB.

 0000000000400b5e <main>:
 400b5e: 55 push %rbp
 400b5f: 48 89 e5 mov %rsp,%rbp
 400b62: 48 83 ec 08 sub $0x8,%rsp
 400b66: 6a 0a pushq $0xa
 400b68: 6a 09 pushq $0x9
 400b6a: 6a 08 pushq $0x8
 400b6c: 6a 07 pushq $0x7
 400b6e: 6a 06 pushq $0x6
 400b70: 41 b9 05 00 00 00 mov $0x5,%r9d
 400b76: 41 b8 04 00 00 00 mov $0x4,%r8d
 400b7c: b9 03 00 00 00 mov $0x3,%ecx
 400b81: ba 02 00 00 00 mov $0x2,%edx
 400b86: be 01 00 00 00 mov $0x1,%esi
 400b8b: bf 50 09 49 00 mov $0x490950,%edi
 400b90: b8 00 00 00 00 mov $0x0,%eax
 400b95: e8 06 0b 00 00 callq 4016a0 <_IO_printf>
 400b9a: 48 83 c4 30 add $0x30,%rsp
 400b9e: c9 leaveq 
 400b9f: c3 retq

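We can see that the address of the format string goes in %edi, and the arguments 1, 2, 3, 4 and 5 are placed in the registers %esi, %edx, %ecx, %r8d and %r9d using mov with $0x1, $0x2, $0x3, $0x4 and $0x5. The remaining arguments 6, 7, 8, 9 and 10 do not fit in registers, so they are pushed onto the stack in reverse order using pushq with $0xa, $0x9, $0x8, $0x7 and $0x6.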
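The pattern in the listing matches the usual x86_64 Linux calling convention, where the first six integer arguments travel in registers and the rest go on the stack. Below is a minimal sketch of that rule. The helper names are hypothetical, and the table uses the 64-bit register names, whereas the listing shows their 32-bit halves (%edi, %esi and so on) because the values are ints.

```c
#include <stddef.h>

/* Integer argument registers in the System V x86_64 calling convention,
 * in the order the arguments appear in the call. */
static const char *const int_arg_regs[] = {
    "rdi", "rsi", "rdx", "rcx", "r8", "r9"
};

#define NUM_INT_ARG_REGS (sizeof int_arg_regs / sizeof int_arg_regs[0])

/* Hypothetical helper: register that carries integer argument `index`
 * (0-based), or NULL when that argument is passed on the stack. */
const char *int_arg_register(size_t index)
{
    return index < NUM_INT_ARG_REGS ? int_arg_regs[index] : NULL;
}

/* Hypothetical helper: how many of `total` integer arguments spill
 * onto the stack. */
size_t stack_arg_count(size_t total)
{
    return total > NUM_INT_ARG_REGS ? total - NUM_INT_ARG_REGS : 0;
}
```

The printf() call has 11 arguments (the format string plus ten integers), so stack_arg_count(11) gives 5, matching the five pushq instructions. Those five 8-byte pushes plus the 8 bytes reserved by sub $0x8,%rsp add up to the 0x30 that add $0x30,%rsp releases after the call.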


I will modify our previous program by moving the printf statement into a function called output that is called from the main function.

#include <stdio.h>

void output();
int main() {
   output(); 
}

void output(){
   printf("Hello World! %d %d %d %d %d %d %d %d %d %d\n",1,2,3,4,5,6,7,8,9,10);
}

This time the objdump output is the same size as before, about 10,705 KB. There are also some noticeable differences in the disassembly.

 0000000000400b5e <main>:
 400b5e: 55 push %rbp
 400b5f: 48 89 e5 mov %rsp,%rbp
 400b62: b8 00 00 00 00 mov $0x0,%eax
 400b67: e8 02 00 00 00 callq 400b6e <output>
 400b6c: 5d pop %rbp
 400b6d: c3 retq 

0000000000400b6e <output>:
 400b6e: 55 push %rbp
 400b6f: 48 89 e5 mov %rsp,%rbp
 400b72: 48 83 ec 08 sub $0x8,%rsp
 400b76: 6a 0a pushq $0xa
 400b78: 6a 09 pushq $0x9
 400b7a: 6a 08 pushq $0x8
 400b7c: 6a 07 pushq $0x7
 400b7e: 6a 06 pushq $0x6
 400b80: 41 b9 05 00 00 00 mov $0x5,%r9d
 400b86: 41 b8 04 00 00 00 mov $0x4,%r8d
 400b8c: b9 03 00 00 00 mov $0x3,%ecx
 400b91: ba 02 00 00 00 mov $0x2,%edx
 400b96: be 01 00 00 00 mov $0x1,%esi
 400b9b: bf 50 09 49 00 mov $0x490950,%edi
 400ba0: b8 00 00 00 00 mov $0x0,%eax
 400ba5: e8 06 0b 00 00 callq 4016b0 <_IO_printf>
 400baa: 48 83 c4 30 add $0x30,%rsp
 400bae: c9 leaveq 
 400baf: c3 retq

Examining the output, we can see that there is now a new function, <output>, which is called from <main>. The <output> function handles the printf statement the same way <main> did in the previous test.
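The bytes next to each callq also show how the call finds its destination: the e8 opcode is followed by a 32-bit little-endian displacement, measured from the address of the instruction after the call. Here is a rough sketch of that arithmetic. Both helper names are made up for this post, and the signed conversion assumes a two's complement compiler such as gcc.

```c
#include <stdint.h>

/* Length in bytes of a near call: opcode e8 plus a 32-bit displacement. */
#define CALL_REL32_LEN 5

/* Hypothetical helper: decode the four little-endian displacement bytes
 * that follow the e8 opcode into a signed 32-bit value. */
int32_t decode_rel32(const uint8_t bytes[4])
{
    uint32_t raw = (uint32_t)bytes[0]
                 | ((uint32_t)bytes[1] << 8)
                 | ((uint32_t)bytes[2] << 16)
                 | ((uint32_t)bytes[3] << 24);
    return (int32_t)raw;
}

/* Hypothetical helper: the displacement is relative to the address of
 * the instruction that follows the call. */
uint64_t call_target(uint64_t call_addr, int32_t rel32)
{
    return call_addr + CALL_REL32_LEN + (uint64_t)(int64_t)rel32;
}
```

For the call in <main> at 0x400b67 with bytes e8 02 00 00 00, the displacement is 2 and the next instruction sits at 0x400b6c, so the target works out to 0x400b6e, the address of <output>.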


This time when compiling we will change the optimization level to 3 using the -O3 flag. The generated code should be more efficient, possibly resulting in smaller output from objdump.

gcc -O3 -static helloworld3.c

The resulting objdump output is now smaller than in the previous test, about 10,008 KB in size.

 00000000004007f0 <main>:
 4007f0: 31 c0 xor %eax,%eax
 4007f2: 48 83 ec 08 sub $0x8,%rsp
 4007f6: e8 75 03 00 00 callq 400b70 <output>
 4007fb: 48 83 c4 08 add $0x8,%rsp
 4007ff: c3 retq

0000000000400b70 <output>:
 400b70: 48 83 ec 10 sub $0x10,%rsp
 400b74: 41 b9 05 00 00 00 mov $0x5,%r9d
 400b7a: 41 b8 04 00 00 00 mov $0x4,%r8d
 400b80: 6a 0a pushq $0xa
 400b82: 6a 09 pushq $0x9
 400b84: b9 03 00 00 00 mov $0x3,%ecx
 400b89: 6a 08 pushq $0x8
 400b8b: 6a 07 pushq $0x7
 400b8d: ba 02 00 00 00 mov $0x2,%edx
 400b92: 6a 06 pushq $0x6
 400b94: be 01 00 00 00 mov $0x1,%esi
 400b99: bf 50 09 49 00 mov $0x490950,%edi
 400b9e: 31 c0 xor %eax,%eax
 400ba0: e8 0b 0b 00 00 callq 4016b0 <_IO_printf>
 400ba5: 48 83 c4 38 add $0x38,%rsp
 400ba9: c3 retq 
 400baa: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)

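Inspecting the assembly in the object code, we can see that both <main> and <output> now contain fewer instructions as a result of optimization level 3. For example, the push %rbp and mov %rsp,%rbp frame setup is gone, and mov $0x0,%eax has been replaced by the shorter xor %eax,%eax.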


Conclusion and Final Thoughts

After seeing how C code is translated by the compiler by viewing object code in objdump, I am now interested in learning assembly language. From what I have heard and seen (in the object code), assembly code seems intimidating and very complex compared to high-level programming languages. Hopefully I will soon be able to understand and write my own code in assembly language.


Software/Hardware used

  • Kali Linux
  • Oracle VM VirtualBox
  • x86-64 system (australia.proximity.on.ca)

References

http://zenit.senecac.on.ca/wiki/index.php/SPO600_Compiled_C_Lab

https://sourceware.org/binutils/docs/binutils/objdump.html
