Sample program

This is a summary drawn from several sources, using the program below.

/*
gcc -c filename.c produces a relocatable object file; do not link it into an executable
*/

int printf(const char* format, ...);

int global_init_var = 84;
int global_uninit_var;

void func1(int i)
{
printf("%d\n", i);
}

int main(void)
{
static int static_var = 85;
static int static_var2;

int a = 1;
int b;
func1(static_var + static_var2 + a + b);

return a;
}

ELF header

On a first pass it is enough to focus on Start of section headers; the goal for now is to understand the file layout.

/*
The header gives the file offset at which the section header table starts, as well as the number of entries in it.
*/
root@L:/home/l/c++# readelf -h ./elfdemo.o
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 1040 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 64 (bytes)
Number of section headers: 14
Section header string table index: 13 /* section index of the section-name string table (.shstrtab) */

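Section header (section descriptor)

As I understand it, the difference between a section and a segment is that sections describe the file as it is stored (the linking view), while segments describe it at run time (the loading view). The example used here is a relocatable file, not an executable, so it has no program headers.

The section header table describes every section in detail.

Structure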

typedef struct
{
Elf64_Word sh_name; /* Section name (string tbl index) */
Elf64_Word sh_type; /* Section type *//* what the section is used for */
Elf64_Xword sh_flags; /* Section flags *//* flag bits */
Elf64_Addr sh_addr; /* Section virtual addr at execution */
Elf64_Off sh_offset; /* Section file offset *//* offset within the file */
Elf64_Xword sh_size; /* Section size in bytes *//* section length */
Elf64_Word sh_link; /* Link to another section */
Elf64_Word sh_info; /* Additional section information */
Elf64_Xword sh_addralign; /* Section alignment *//* alignment; if 8, the start address modulo 8 must be 0 */
Elf64_Xword sh_entsize; /* Entry size if section holds table *//* entry size; 24 for the symbol table */
} Elf64_Shdr;

root@L:/home/l/c++# readelf -S ./elfdemo.o
There are 14 section headers, starting at offset 0x410:

Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .text PROGBITS 0000000000000000 00000040
0000000000000062 0000000000000000 AX 0 0 1
[ 2] .rela.text RELA 0000000000000000 000002f0
0000000000000078 0000000000000018 I 11 1 8
[ 3] .data PROGBITS 0000000000000000 000000a4
0000000000000008 0000000000000000 WA 0 0 4
[ 4] .bss NOBITS 0000000000000000 000000ac
0000000000000008 0000000000000000 WA 0 0 4
[ 5] .rodata PROGBITS 0000000000000000 000000ac
0000000000000004 0000000000000000 A 0 0 1
[ 6] .comment PROGBITS 0000000000000000 000000b0
000000000000002c 0000000000000001 MS 0 0 1
[ 7] .note.GNU-stack PROGBITS 0000000000000000 000000dc
0000000000000000 0000000000000000 0 0 1
[ 8] .note.gnu.pr[...] NOTE 0000000000000000 000000e0
0000000000000020 0000000000000000 A 0 0 8
[ 9] .eh_frame PROGBITS 0000000000000000 00000100
0000000000000058 0000000000000000 A 0 0 8
[10] .rela.eh_frame RELA 0000000000000000 00000368
0000000000000030 0000000000000018 I 11 9 8
[11] .symtab SYMTAB 0000000000000000 00000158
0000000000000138 0000000000000018 12 8 8
[12] .strtab STRTAB 0000000000000000 00000290
000000000000005a 0000000000000000 0 0 1
[13] .shstrtab STRTAB 0000000000000000 00000398
0000000000000074 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
D (mbind), l (large), p (processor specific)

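Symbol table (one of the important sections)

Each entry has the following structure:

/*
1. st_name: the index at which the symbol's name starts; the name itself lives in the string table, described below.
2. st_info: symbol type and binding.
Binding: the symbol's scope and linkage, for example whether it is a local, global, or weak symbol.
Type: what kind of symbol it is, for example a function, a variable, or some special symbol.
3. st_other: symbol visibility. It can follow from the binding or be set explicitly, and it determines whether the symbol can be referenced from outside.
*/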
typedef uint32_t Elf64_Word;
typedef uint16_t Elf64_Section;
typedef uint64_t Elf64_Addr;
typedef uint64_t Elf64_Xword;
typedef struct
{
Elf64_Word st_name; /* Symbol name (string tbl index) */
unsigned char st_info; /* Symbol type and binding */
unsigned char st_other; /* Symbol visibility */
Elf64_Section st_shndx; /* Section index */
Elf64_Addr st_value; /* Symbol value */
Elf64_Xword st_size; /* Symbol size */
} Elf64_Sym;
One struct occupies 4+1+1+2+8+8 = 24 bytes

String table && Shstrtab

These two sections hold the names of symbols and the names of sections, respectively. The names are looked up through the st_name and sh_name indices.

Example


/*
Symbol table
*/
root@L:/home/l/c++# readelf -s elfdemo.o

Symbol table '.symtab' contains 13 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS elfdemo.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1 .text
3: 0000000000000000 0 SECTION LOCAL DEFAULT 3 .data
4: 0000000000000000 0 SECTION LOCAL DEFAULT 4 .bss
5: 0000000000000000 0 SECTION LOCAL DEFAULT 5 .rodata
6: 0000000000000004 4 OBJECT LOCAL DEFAULT 3 static_var.1
7: 0000000000000004 4 OBJECT LOCAL DEFAULT 4 static_var2.0
8: 0000000000000000 4 OBJECT GLOBAL DEFAULT 3 global_init_var
9: 0000000000000000 4 OBJECT GLOBAL DEFAULT 4 global_uninit_var
10: 0000000000000000 43 FUNC GLOBAL DEFAULT 1 func1
11: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND printf
12: 000000000000002b 55 FUNC GLOBAL DEFAULT 1 main
/*
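Hex dump of the symbol table, which shows the format in which it is stored.
The first column is the offset within the section. Note that the byte order is little-endian and that each struct occupies 24 bytes; reading the dump carefully reveals how the entries correspond to the string table.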
*/
root@L:/home/l/c++# readelf -x 11 ./elfdemo.o

Hex dump of section '.symtab':
0x00000000 00000000 00000000 00000000 00000000 ................
0x00000010 00000000 00000000 01000000 0400f1ff ................
0x00000020 00000000 00000000 00000000 00000000 ................
0x00000030 00000000 03000100 00000000 00000000 ................
0x00000040 00000000 00000000 00000000 03000300 ................
0x00000050 00000000 00000000 00000000 00000000 ................
0x00000060 00000000 03000400 00000000 00000000 ................
0x00000070 00000000 00000000 00000000 03000500 ................
0x00000080 00000000 00000000 00000000 00000000 ................
0x00000090 0b000000 01000300 04000000 00000000 ................
0x000000a0 04000000 00000000 18000000 01000400 ................
0x000000b0 04000000 00000000 04000000 00000000 ................
0x000000c0 26000000 11000300 00000000 00000000 &...............
0x000000d0 04000000 00000000 36000000 11000400 ........6.......
0x000000e0 00000000 00000000 04000000 00000000 ................
0x000000f0 48000000 12000100 00000000 00000000 H...............
0x00000100 2b000000 00000000 4e000000 10000000 +.......N.......
0x00000110 00000000 00000000 00000000 00000000 ................
0x00000120 55000000 12000100 2b000000 00000000 U.......+.......
0x00000130 37000000 00000000 7.......
/*
String table
*/
root@L:/home/l/c++# readelf -x 12 ./elfdemo.o

Hex dump of section '.strtab':
0x00000000 00656c66 64656d6f 2e630073 74617469 .elfdemo.c.stati
0x00000010 635f7661 722e3100 73746174 69635f76 c_var.1.static_v
0x00000020 6172322e 3000676c 6f62616c 5f696e69 ar2.0.global_ini
0x00000030 745f7661 7200676c 6f62616c 5f756e69 t_var.global_uni
0x00000040 6e69745f 76617200 66756e63 31007072 nit_var.func1.pr
0x00000050 696e7466 006d6169 6e00 intf.main.

root@L:/home/l/c++# readelf -x 13 ./elfdemo.o

Hex dump of section '.shstrtab':
0x00000000 002e7379 6d746162 002e7374 72746162 ..symtab..strtab
0x00000010 002e7368 73747274 6162002e 72656c61 ..shstrtab..rela
0x00000020 2e746578 74002e64 61746100 2e627373 .text..data..bss
0x00000030 002e726f 64617461 002e636f 6d6d656e ..rodata..commen
0x00000040 74002e6e 6f74652e 474e552d 73746163 t..note.GNU-stac
0x00000050 6b002e6e 6f74652e 676e752e 70726f70 k..note.gnu.prop
0x00000060 65727479 002e7265 6c612e65 685f6672 erty..rela.eh_fr
0x00000070 616d6500 ame.

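About the relocation sections

Because this is a relocatable file, the definitions of functions that live in other files are missing, so the linker has to perform relocation. The .rela sections (here .rela.text and .rela.eh_frame) hold the relocation entries that tell the linker which locations to patch.

Disassembling the file at this point gives the output below. Look closely and you can see that the operand of each call instruction is 00 00 00 00, so it still needs to be relocated.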

root@L:/home/l/c++# objdump -d elfdemo.o

elfdemo.o: file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <func1>:
0: f3 0f 1e fa endbr64
4: 55 push %rbp
5: 48 89 e5 mov %rsp,%rbp
8: 48 83 ec 10 sub $0x10,%rsp
c: 89 7d fc mov %edi,-0x4(%rbp)
f: 8b 45 fc mov -0x4(%rbp),%eax
12: 89 c6 mov %eax,%esi
14: 48 8d 05 00 00 00 00 lea 0x0(%rip),%rax # 1b <func1+0x1b>
1b: 48 89 c7 mov %rax,%rdi
1e: b8 00 00 00 00 mov $0x0,%eax
23: e8 00 00 00 00 call 28 <func1+0x28>
28: 90 nop
29: c9 leave
2a: c3 ret

000000000000002b <main>:
2b: f3 0f 1e fa endbr64
2f: 55 push %rbp
30: 48 89 e5 mov %rsp,%rbp
33: 48 83 ec 10 sub $0x10,%rsp
37: c7 45 f8 01 00 00 00 movl $0x1,-0x8(%rbp)
3e: 8b 15 00 00 00 00 mov 0x0(%rip),%edx # 44 <main+0x19>
44: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # 4a <main+0x1f>
4a: 01 c2 add %eax,%edx
4c: 8b 45 f8 mov -0x8(%rbp),%eax
4f: 01 c2 add %eax,%edx
51: 8b 45 fc mov -0x4(%rbp),%eax
54: 01 d0 add %edx,%eax
56: 89 c7 mov %eax,%edi
58: e8 00 00 00 00 call 5d <main+0x32>
5d: 8b 45 f8 mov -0x8(%rbp),%eax
60: c9 leave
61: c3 ret

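After relocation the symbol table changes: the main difference is that symbol values become addresses relative to the load address:

This can be verified with gdb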

Symbol table '.symtab' contains 41 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS Scrt1.o
2: 000000000000038c 32 OBJECT LOCAL DEFAULT 4 __abi_tag
3: 0000000000000000 0 FILE LOCAL DEFAULT ABS crtstuff.c
4: 0000000000001090 0 FUNC LOCAL DEFAULT 16 deregister_tm_clones
5: 00000000000010c0 0 FUNC LOCAL DEFAULT 16 register_tm_clones
6: 0000000000001100 0 FUNC LOCAL DEFAULT 16 __do_global_dtors_aux
7: 0000000000004018 1 OBJECT LOCAL DEFAULT 26 completed.0
8: 0000000000003dc0 0 OBJECT LOCAL DEFAULT 22 __do_global_dtor[...]
9: 0000000000001140 0 FUNC LOCAL DEFAULT 16 frame_dummy
10: 0000000000003db8 0 OBJECT LOCAL DEFAULT 21 __frame_dummy_in[...]
11: 0000000000000000 0 FILE LOCAL DEFAULT ABS elfdemo.c
12: 0000000000004014 4 OBJECT LOCAL DEFAULT 25 static_var.1
13: 0000000000004020 4 OBJECT LOCAL DEFAULT 26 static_var2.0
14: 0000000000000000 0 FILE LOCAL DEFAULT ABS crtstuff.c
15: 0000000000002110 0 OBJECT LOCAL DEFAULT 20 __FRAME_END__
16: 0000000000000000 0 FILE LOCAL DEFAULT ABS
17: 0000000000003dc8 0 OBJECT LOCAL DEFAULT 23 _DYNAMIC
18: 0000000000002008 0 NOTYPE LOCAL DEFAULT 19 __GNU_EH_FRAME_HDR
19: 0000000000003fb8 0 OBJECT LOCAL DEFAULT 24 _GLOBAL_OFFSET_TABLE_
20: 0000000000001149 43 FUNC GLOBAL DEFAULT 16 func1
21: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_mai[...]
22: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_deregisterT[...]
23: 0000000000004000 0 NOTYPE WEAK DEFAULT 25 data_start
24: 0000000000004018 0 NOTYPE GLOBAL DEFAULT 25 _edata
25: 00000000000011ac 0 FUNC GLOBAL HIDDEN 17 _fini
26: 0000000000000000 0 FUNC GLOBAL DEFAULT UND printf@GLIBC_2.2.5
27: 0000000000004000 0 NOTYPE GLOBAL DEFAULT 25 __data_start
28: 000000000000401c 4 OBJECT GLOBAL DEFAULT 26 global_uninit_var
29: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__
30: 0000000000004008 0 OBJECT GLOBAL HIDDEN 25 __dso_handle
31: 0000000000002000 4 OBJECT GLOBAL DEFAULT 18 _IO_stdin_used
32: 0000000000004028 0 NOTYPE GLOBAL DEFAULT 26 _end
33: 0000000000001060 38 FUNC GLOBAL DEFAULT 16 _start
34: 0000000000004010 4 OBJECT GLOBAL DEFAULT 25 global_init_var
35: 0000000000004018 0 NOTYPE GLOBAL DEFAULT 26 __bss_start
36: 0000000000001174 55 FUNC GLOBAL DEFAULT 16 main
37: 0000000000004018 0 OBJECT GLOBAL HIDDEN 25 __TMC_END__
38: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_registerTMC[...]
39: 0000000000000000 0 FUNC WEAK DEFAULT UND __cxa_finalize@G[...]
40: 0000000000001000 0 FUNC GLOBAL HIDDEN 12 _init
LEGEND: STACK | HEAP | CODE | DATA | WX | RODATA
Start End Perm Size Offset File
0x555555554000 0x555555555000 r--p 1000 0 /home/l/c++/test
0x555555555000 0x555555556000 r-xp 1000 1000 /home/l/c++/test
0x555555556000 0x555555557000 r--p 1000 2000 /home/l/c++/test
0x555555557000 0x555555558000 r--p 1000 2000 /home/l/c++/test
0x555555558000 0x555555559000 rw-p 1000 3000 /home/l/c++/test
0x7ffff7d86000 0x7ffff7d89000 rw-p 3000 0 [anon_7ffff7d86]
pwndbg> x/20gx 0x555555558000+0x14
0x555555558014 <static_var.1>: 0x0000000000000055 0x0000000000000000
0x555555558024: 0x0000000000000000 0x0000000000000000
0x555555558034: 0x0000000000000000 0x0000000000000000
0x555555558044: 0x0000000000000000 0x0000000000000000
0x555555558054: 0x0000000000000000 0x0000000000000000
0x555555558064: 0x0000000000000000 0x0000000000000000
0x555555558074: 0x0000000000000000 0x0000000000000000
0x555555558084: 0x0000000000000000 0x0000000000000000
0x555555558094: 0x0000000000000000 0x0000000000000000
0x5555555580a4: 0x0000000000000000 0x0000000000000000

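Symbols in the symbol table fall roughly into the following categories

(Accuracy uncertain: I have no grounding in how linkers are implemented.)

  • Global symbols without the static keyword, which can be referenced from outside
  • Symbols marked static
  • External symbols referenced by this program, such as printf
  • Section names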

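Miscellaneous notes

C++ supports function overloading: a function's name is decorated (mangled) with its enclosing class, namespace, parameter types and so on, so functions that share a name in the source end up as different symbols after compilation. That is what makes overloading possible, but it also complicates using C libraries from C++. C has no such mechanism, so a C library function declared in C++ code gets its name mangled too, and the linker reports an undefined symbol, as shown below:

extern "C"

/* The C library headers seem to handle this well already: including them directly does not trigger the problem, so treat this as a side note. */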
extern int printf(const char* format, ...);
int main(int argc, char const *argv[])
{
printf("hh");
return 0;
}
This produces:
root@L:/home/l/c++# g++ extern.cpp -o ex
/usr/bin/ld: /tmp/cctEKkS7.o: in function `main':
extern.cpp:(.text+0x23): undefined reference to `printf(char const*, ...)'
collect2: error: ld returned 1 exit status
Written as follows, the declaration tells the compiler not to mangle the name, so the build succeeds:
extern "C" int printf(const char* format, ...);
int main(int argc, char const *argv[])
{
printf("hh");
return 0;
}

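Strong/weak symbols and strong/weak references

Strong and weak symbols resolve clashes between definitions that share a name: when several definitions have the same name, the strong one is preferred. Note that this applies to definitions, not to references (declarations).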

strsym

int printf(const char* format, ...);
extern int ext;
int weak;
int strong = 1;
__attribute__((weak)) int weak2 = 2;
int week2 = 3;

int main ()
{
printf("%d",week2);
return 0;
}
root@L:/home/l/c++# ./strong
3

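Strong and weak references work differently: a weak reference lets the link succeed even when no definition of the external symbol can be found, which makes the program more tolerant of missing definitions.

Note, however, that the program still fails at run time: the call loads an invalid address (0 for an unresolved weak reference) into rip, which causes a segmentation fault (an illegal address access).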

__attribute__((weak)) int foo();
extern int ext;
int weak;
int strong = 1;
__attribute__((weak)) int weak2 = 2;
int week2 = 3;

int main ()
{
foo();
return 0;
}
root@L:/home/l/c++# gcc strong.c -o strong
root@L:/home/l/c++#