04 February 2018

Tools

# generate i386 instructions on a x86-64 machine
gcc m32 -S

Linking & Loading

Programming Language Pragmatics, Fourth Edition

  1. 15.6.1 Relocation and Name Resolution
  2. 15.7.1 Position-Independent Code
  3. 15.7.2 Fully Dynamic (Lazy) Linking

Lazy loading:

Linux

Useful objdump options:

-S
--source
Display source code intermixed with disassembly, if possible. Implies -d.

-d
--disassemble
Display the assembler mnemonics for the machine instructions from objfile.
This option only disassembles those sections which are expected to contain
instructions.

-h # same as readelf -l

Useful readelf options:

-l
--program-headers
--segments
Displays the information contained in the file’s segment headers, if it has any.

--syms              Display the symbol table
-l --program-headers   Display the program headers

-S --section-headers   Display the sections' header
   --sections          An alias for --section-headers

Use objdump to Check Global Variables

foo.c

int global = 0xFFFF;

gcc -g -c foo.c

objdump -s foo.o produces:


foo.o:     file format elf64-x86-64

Contents of section .data:
 0000 ffff0000                             ....
Contents of section .debug_info:
 0000 36000000 04000000 00000801 00000000  6...............
 0010 0c000000 00000000 00000000 00020000  ................
 0020 00000101 32000000 09030000 00000000  ....2...........
 0030 00000304 05696e74 0000               .....int..
Contents of section .debug_abbrev:
 0000 01110125 0e130b03 0e1b0e10 17000002  ...%............
 0010 3400030e 3a0b3b0b 49133f19 02180000  4...:.;.I.?.....
 0020 0324000b 0b3e0b03 08000000           .$...>......
Contents of section .debug_aranges:
 0000 1c000000 02000000 00000800 00000000  ................
 0010 00000000 00000000 00000000 00000000  ................
Contents of section .debug_line:
 0000 22000000 02001c00 00000101 fb0e0d00  "...............
 0010 01010101 00000001 00000100 666f6f2e  ............foo.
 0020 63000000 0000                        c.....
Contents of section .debug_str:
 0000 2f726f6f 742f636f 64652f68 61636b00  /root/code/hack.
 0010 676c6f62 616c0047 4e552043 31312037  global.GNU C11 7
 0020 2e322e30 202d6d74 756e653d 67656e65  .2.0 -mtune=gene
 0030 72696320 2d6d6172 63683d78 38362d36  ric -march=x86-6
 0040 34202d67 202d6673 7461636b 2d70726f  4 -g -fstack-pro
 0050 74656374 6f722d73 74726f6e 6700666f  tector-strong.fo
 0060 6f2e6300                             o.c.
Contents of section .comment:
 0000 00474343 3a202855 62756e74 7520372e  .GCC: (Ubuntu 7.
 0010 322e302d 31756275 6e747531 7e31362e  2.0-1ubuntu1~16.
 0020 30342920 372e322e 3000               04) 7.2.0.

ffff0000 in .data section is the global variable in little endian format.

objdump -D foo.o produces:


foo.o:     file format elf64-x86-64


Disassembly of section .data:

0000000000000000 <global>:
   0:	ff                   	(bad)
   1:	ff 00                	incl   (%rax)
	...

Disassembly of section .debug_info:

0000000000000000 <.debug_info>:
   0:	36 00 00             	add    %al,%ss:(%rax)
   3:	00 04 00             	add    %al,(%rax,%rax,1)
   6:	00 00                	add    %al,(%rax)
   8:	00 00                	add    %al,(%rax)
   a:	08 01                	or     %al,(%rcx)
   c:	00 00                	add    %al,(%rax)
   e:	00 00                	add    %al,(%rax)
  10:	0c 00                	or     $0x0,%al
	...
  1a:	00 00                	add    %al,(%rax)
  1c:	00 02                	add    %al,(%rdx)
  1e:	00 00                	add    %al,(%rax)
  20:	00 00                	add    %al,(%rax)
  22:	01 01                	add    %eax,(%rcx)
  24:	32 00                	xor    (%rax),%al
  26:	00 00                	add    %al,(%rax)
  28:	09 03                	or     %eax,(%rbx)
	...
  32:	03 04 05 69 6e 74 00 	add    0x746e69(,%rax,1),%eax
	...

Disassembly of section .debug_abbrev:

0000000000000000 <.debug_abbrev>:
   0:	01 11                	add    %edx,(%rcx)
   2:	01 25 0e 13 0b 03    	add    %esp,0x30b130e(%rip)        # 30b1316 <global+0x30b1316>
   8:	0e                   	(bad)
   9:	1b 0e                	sbb    (%rsi),%ecx
   b:	10 17                	adc    %dl,(%rdi)
   d:	00 00                	add    %al,(%rax)
   f:	02 34 00             	add    (%rax,%rax,1),%dh
  12:	03 0e                	add    (%rsi),%ecx
  14:	3a 0b                	cmp    (%rbx),%cl
  16:	3b 0b                	cmp    (%rbx),%ecx
  18:	49 13 3f             	adc    (%r15),%rdi
  1b:	19 02                	sbb    %eax,(%rdx)
  1d:	18 00                	sbb    %al,(%rax)
  1f:	00 03                	add    %al,(%rbx)
  21:	24 00                	and    $0x0,%al
  23:	0b 0b                	or     (%rbx),%ecx
  25:	3e 0b 03             	or     %ds:(%rbx),%eax
  28:	08 00                	or     %al,(%rax)
	...

Disassembly of section .debug_aranges:

0000000000000000 <.debug_aranges>:
   0:	1c 00                	sbb    $0x0,%al
   2:	00 00                	add    %al,(%rax)
   4:	02 00                	add    (%rax),%al
   6:	00 00                	add    %al,(%rax)
   8:	00 00                	add    %al,(%rax)
   a:	08 00                	or     %al,(%rax)
	...

Disassembly of section .debug_line:

0000000000000000 <.debug_line>:
   0:	22 00                	and    (%rax),%al
   2:	00 00                	add    %al,(%rax)
   4:	02 00                	add    (%rax),%al
   6:	1c 00                	sbb    $0x0,%al
   8:	00 00                	add    %al,(%rax)
   a:	01 01                	add    %eax,(%rcx)
   c:	fb                   	sti
   d:	0e                   	(bad)
   e:	0d 00 01 01 01       	or     $0x1010100,%eax
  13:	01 00                	add    %eax,(%rax)
  15:	00 00                	add    %al,(%rax)
  17:	01 00                	add    %eax,(%rax)
  19:	00 01                	add    %al,(%rcx)
  1b:	00 66 6f             	add    %ah,0x6f(%rsi)
  1e:	6f                   	outsl  %ds:(%rsi),(%dx)
  1f:	2e 63 00             	movslq %cs:(%rax),%eax
  22:	00 00                	add    %al,(%rax)
	...

Disassembly of section .debug_str:

0000000000000000 <.debug_str>:
   0:	2f                   	(bad)
   1:	72 6f                	jb     72 <global+0x72>
   3:	6f                   	outsl  %ds:(%rsi),(%dx)
   4:	74 2f                	je     35 <.debug_str+0x35>
   6:	63 6f 64             	movslq 0x64(%rdi),%ebp
   9:	65 2f                	gs (bad)
   b:	68 61 63 6b 00       	pushq  $0x6b6361
  10:	67 6c                	insb   (%dx),%es:(%edi)
  12:	6f                   	outsl  %ds:(%rsi),(%dx)
  13:	62 61 6c 00 47       	(bad)
  18:	4e 55                	rex.WRX push %rbp
  1a:	20 43 31             	and    %al,0x31(%rbx)
  1d:	31 20                	xor    %esp,(%rax)
  1f:	37                   	(bad)
  20:	2e 32 2e             	xor    %cs:(%rsi),%ch
  23:	30 20                	xor    %ah,(%rax)
  25:	2d 6d 74 75 6e       	sub    $0x6e75746d,%eax
  2a:	65 3d 67 65 6e 65    	gs cmp $0x656e6567,%eax
  30:	72 69                	jb     9b <global+0x9b>
  32:	63 20                	movslq (%rax),%esp
  34:	2d 6d 61 72 63       	sub    $0x6372616d,%eax
  39:	68 3d 78 38 36       	pushq  $0x3638783d
  3e:	2d 36 34 20 2d       	sub    $0x2d203436,%eax
  43:	67 20 2d 66 73 74 61 	and    %ch,0x61747366(%eip)        # 617473b0 <global+0x617473b0>
  4a:	63 6b 2d             	movslq 0x2d(%rbx),%ebp
  4d:	70 72                	jo     c1 <global+0xc1>
  4f:	6f                   	outsl  %ds:(%rsi),(%dx)
  50:	74 65                	je     b7 <global+0xb7>
  52:	63 74 6f 72          	movslq 0x72(%rdi,%rbp,2),%esi
  56:	2d 73 74 72 6f       	sub    $0x6f727473,%eax
  5b:	6e                   	outsb  %ds:(%rsi),(%dx)
  5c:	67 00 66 6f          	add    %ah,0x6f(%esi)
  60:	6f                   	outsl  %ds:(%rsi),(%dx)
  61:	2e 63 00             	movslq %cs:(%rax),%eax

Disassembly of section .comment:

0000000000000000 <.comment>:
   0:	00 47 43             	add    %al,0x43(%rdi)
   3:	43 3a 20             	rex.XB cmp (%r8),%spl
   6:	28 55 62             	sub    %dl,0x62(%rbp)
   9:	75 6e                	jne    79 <global+0x79>
   b:	74 75                	je     82 <global+0x82>
   d:	20 37                	and    %dh,(%rdi)
   f:	2e 32 2e             	xor    %cs:(%rsi),%ch
  12:	30 2d 31 75 62 75    	xor    %ch,0x75627531(%rip)        # 75627549 <global+0x75627549>
  18:	6e                   	outsb  %ds:(%rsi),(%dx)
  19:	74 75                	je     90 <global+0x90>
  1b:	31 7e 31             	xor    %edi,0x31(%rsi)
  1e:	36 2e 30 34 29       	ss xor %dh,%cs:(%rcx,%rbp,1)
  23:	20 37                	and    %dh,(%rdi)
  25:	2e 32 2e             	xor    %cs:(%rsi),%ch
  28:	30 00                	xor    %al,(%rax)

<global> shows the global variable.

macOS

objdump
dwarfdump
otool

include

gcc -v show paths searched for #include "..." and #include <...>. #include also works for files with a file extension name other than .h. For exampl,e #include "source.c" also works. And the stuff inside <> or "" are pathname. Sor example, #include <rock/lib.h> works.

C++ Trait

Value Categories

Value properties:

  • i: has identity – i.e. and address, a pointer, the user can determine whether two copies are identical, etc.
  • m: can be moved from – i.e. we are allowed to leave to source of a “copy” in some indeterminate, but valid state

Value kinds:

  • iM: has identity and cannot be moved from
  • im: has identity and can be moved from (e.g. the result of casting an lvalue to a rvalue reference)
  • Im: does not have identity and can be moved from.
  • IM: doesn’t have identity and cannot be moved) is not useful in C++ (or, I \ think) in any other language.
             expression
            /        \
           /          \
    glvalue(i)       rvalue(m)
     /     \         /     \
    /       \       /       \
   /         \     /         \
lvalue(iM)   xvalue(im)     prvalue(Im)



    ______ ______
   /      X      \
  /      / \      \
 |   l  | x |  pr  |
  \      \ /      /
   \______X______/
       gl    r    

Value examples:

  • lvalue: *p
  • xvalue: std::move
  • prvalue: a + b

std::move

Templates

Smart Pointer

  • [Differences between unique_ptr and shared_ptr duplicate

MISC

Virtual:

Google C++ Style Guide says:

Explicitly annotate overrides of virtual functions or virtual destructors with exactly one of an override or (less frequently) final specifier. Do not use virtual when declaring an override. Rationale: A function or destructor marked override or final that is not an override of a base class virtual function will not compile, and this helps catch common errors. The specifiers serve as documentation; if no specifier is present, the reader has to check all ancestors of the class in question to determine if the function or destructor is virtual or not.

Assembly

From Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 1: Basic Architecture

RIP + Displacement ⎯ In 64-bit mode, RIP-relative addressing uses a signed 32-bit displacement to calculate the effective address of the next instruction by sign-extend the 32-bit value and add to the 64-bit value in RIP.

OS

Intel term IA-32e EM64T and Intel 64 all means x86-64. AMD term for x86-64 is AMD64.

> lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian

Can you enter x64 32-bit “long compatibility sub-mode” outside of kernel mode?

STL