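When analyzing a software defect, you sometimes need to disassemble an ELF file and then map the resulting assembly code back to the C source code.
For a long function, doing this mapping by hand is slow and tedious. How can the assembly code be matched to the C code quickly?
First, compile the program with the "-g" option so that the compiler generates debugging information.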
gcc test.c -o test.elf -g
Running "readelf -S test.elf" shows several sections whose names begin with ".debug_", as follows.
$ readelf -S test.elf
There are 35 section headers, starting at offset 0x1508:
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  ...
  [27] .debug_aranges    PROGBITS         0000000000000000  0000106b
       0000000000000030  0000000000000000           0     0     1
  [28] .debug_info       PROGBITS         0000000000000000  0000109b
       0000000000000173  0000000000000000           0     0     1
  [29] .debug_abbrev     PROGBITS         0000000000000000  0000120e
       0000000000000096  0000000000000000           0     0     1
  [30] .debug_line       PROGBITS         0000000000000000  000012a4
       0000000000000048  0000000000000000           0     0     1
  [31] .debug_str        PROGBITS         0000000000000000  000012ec
       00000000000000d2  0000000000000001  MS       0     0     1
  ...
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
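When disassembling the ELF file with objdump, pass the "-S" option to interleave the source code with the assembly code (this relies on the debugging information generated by "-g"), and redirect the output to a text file.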
objdump -S test.elf > test_asm.txt
Open the text file. Here is an example.
000000000040056a <func2>:
void func2(int *ptr, int oldval, int newval)
{
40056a: 55 push %rbp
40056b: 48 89 e5 mov %rsp,%rbp
40056e: 48 83 ec 20 sub $0x20,%rsp
400572: 48 89 7d e8 mov %rdi,-0x18(%rbp)
400576: 89 75 e4 mov %esi,-0x1c(%rbp)
400579: 89 55 e0 mov %edx,-0x20(%rbp)
bool b;
b = __atomic_compare_exchange_n(ptr, &oldval, newval,
40057c: 8b 4d e0 mov -0x20(%rbp),%ecx
40057f: 48 8b 75 e8 mov -0x18(%rbp),%rsi
400583: 48 8d 55 e4 lea -0x1c(%rbp),%rdx
400587: 8b 02 mov (%rdx),%eax
400589: f0 0f b1 0e lock cmpxchg %ecx,(%rsi)
40058d: 89 c1 mov %eax,%ecx
40058f: 0f 94 c0 sete %al
400592: 84 c0 test %al,%al
400594: 75 02 jne 400598 <func2+0x2e>
400596: 89 0a mov %ecx,(%rdx)
400598: 88 45 ff mov %al,-0x1(%rbp)
false, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
printf("b=%d\n", b);
40059b: 0f b6 45 ff movzbl -0x1(%rbp),%eax
40059f: 89 c6 mov %eax,%esi
4005a1: bf 54 06 40 00 mov $0x400654,%edi
4005a6: b8 00 00 00 00 mov $0x0,%eax
4005ab: e8 60 fe ff ff callq 400410 <printf@plt>
}
4005b0: c9 leaveq
4005b1: c3 retq
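To try this yourself, the function in the listing can be pieced back together from the source lines that objdump interleaved with the assembly. The following is only a sketch: the original test.c is not shown in this article, so the includes and the comments are assumptions, and only the body of func2 is taken from the listing above.

```c
/* Hypothetical reconstruction of func2 from the interleaved listing.
 * The original test.c is not shown in the article; the includes and
 * comments here are assumptions added for illustration. */
#include <stdbool.h>
#include <stdio.h>

void func2(int *ptr, int oldval, int newval)
{
    bool b;

    /* Strong compare-and-swap (weak = false): store newval into *ptr only
     * if *ptr currently equals oldval. On failure, the value found in *ptr
     * is written back into the local oldval and *ptr is left unchanged. */
    b = __atomic_compare_exchange_n(ptr, &oldval, newval,
                                    false, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
    printf("b=%d\n", b);
}
```

A main function that calls func2 still has to be added before the file can be linked into test.elf with the gcc command shown earlier. The addresses and the exact instruction sequence will likely differ from the listing, since they depend on the compiler version and options. In the listing, the "lock cmpxchg" instruction corresponds to the builtin call, and the "sete" / "mov %ecx,(%rdx)" pair corresponds to computing b and writing the observed value back into oldval on failure.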