[PROLOG] 0x001, La mémoire | (French version)

November 1, 2022 malware memory prolog

Second post in our PROLOG series, which aims to provide some quick theoretical reminders on the basics to start reverse engineering healthily: today’s topic, MEMORY.

When one wishes to embark on reverse engineering binaries, it is often thought that assembly language is the complex entry barrier. In reality, assembly language has a very simple syntax, total logic, and thus a very low level of complexity. The complexity arises from a very common initial mistake: starting to read assembly code without first fully mastering the following four elements:

The functioning of memory, particularly the Stack
Registers
Calling conventions defined by the ABI (Application Binary Interface)
The functioning of the Stack ;-)

Memory

Memory is a vast subject. I will summarize here the main elements that will be necessary for our reverse engineering of x64 malware.

Each running process gets its own virtual address space. The amount of space depends on the architecture (32-bit or 64-bit), system parameters, and the operating system. Only a small part of this virtual space within each process is mapped to physical memory. There are different ways to map virtual memory to physical memory using paging and address translation.

The different sections of virtual memory:

Section	Data stored in the section	Example in C
.text	Corresponds essentially to the .text part of the binary executable file. It contains the instructions to be executed. Its size is fixed at runtime when the process is first loaded.
.rodata	Stores initialized global variables (read-only)	`int x = 10;`
.data	Stores initialized global variables	`int x = 10;`
.bss	Stores uninitialized global variables (read/write, but not executable)	`int x;`
HEAP (the heap)	Stores dynamically allocated variables and grows from lower address memory to higher address memory. Memory allocation is controlled by the functions `malloc(), realloc(), and free()`.	`int x = malloc(sizeof(int);`
*shared libraries*
STACK (the stack)	The stack keeps track of function calls (recursively) and grows from higher address memory to lower address memory. The stack stores local variables. If the process is multithreaded, each thread will have a unique stack (but a common heap)

But what is the Stack?

The stack is a volatile, fast-access memory organized according to the LIFO (Last-in, First-out) principle. The PUSH instruction is used to store a value in the stack (called “pushing”) (e.g., PUSH 0xf56a46). The POP instruction is used to extract/unstack the last pushed value and place it in the specified CPU register (e.g., POP ecx).

The EBP register stores the base/beginning address of the current stack frame (it designates the highest address of the current stack frame).
The ESP register stores the top address of the stack, i.e., the current end address of the process’s stack. It designates the lowest address of the stack.
Remember that a PUSH decrements ESP and a POP increments ESP.

When a function is executed, a Stack Frame is created to store its information (e.g., its local variables). This new stack frame is pushed onto the thread’s stack. When this function is finished, the stack frame is discarded. This means that the ESP and EBP registers point again to the caller’s stack frame. The execution flow (whose next instruction address is stored in the EIP register) then continues in the caller at the address following the call. This return address (return address) was previously saved on the stack by the caller (via the CALL instruction).

The value (i.e., its address) of EBP remains fixed for the stack frame.
The value of ESP varies (up or down) depending on the data (number and size) pushed or popped on this stack frame.
You will note that this stack frame measures at a given time (EBP-ESP).

Let’s now look at a diagram of this caller (caller) and callee (callee) story from the perspective of stack frames:

Navigating the Stack

The stack is used to store:

A function’s local variables
Function call arguments
Return addresses

But where do we find this information in the stack, and how do we access it?

We navigate a stack by relative addresses (offsets); relative to its top (i.e., relative to the ESP register), or relative to its base (the EBP register).

Remember that on x86 and x64, we decrement ESP to move up the stack (e.g., as static memory allocations are made) and increment ESP to move down the stack. Initially, this is a bit confusing, but you’ll get used to it quickly: “More to go down”, “Less to go up” ;-) for example, to reserve memory on the stack, we decrease ESP by the size to reserve:

sub esp, <size to reserve>

Passing Arguments via the Stack

There are several conventions that specify how to pass arguments to a function (parameters sent by the caller and thus received by the callee). We will study these different calling conventions (calling conventions) later. What interests us here for the moment is understanding where and how these stack-passed arguments are positioned in memory. The goal is obviously to be able to access the values of these parameters.

Let’s take the following C function call:

int __attribute__((__cdecl__)) additionne(int a,int b, int c);
int somme=additionne(20, 30,40);

Note : at this stage, do not pay attention to the keywords __cdecl__ and __attribute__, which just ensure that the 32-bit C compiler uses the stack (and not the registers) to pass arguments to the function. We will come back to this later when we see the different calling conventions.

Our 32-bit C compiler would have translated this C code into the following assembly code (on a 32-bit x86¹):

push 0x28      ; argument 3 | 40 in decimal
push 0x1e      ; argument 2 | 30 in decimal
push 0x14      ; argument 1 | 20 in decimal
call additionne ; additionne(20,30,40)

The stack frame of the main() function JUST BEFORE EXECUTING THE FIRST INSTRUCTION of the additionne() function would then be as follows:

Before calling the additionne function, the main() function saves the EAX, ECX, and EDX registers, only if they risk being overwritten by the additionne function it is about to call. In this case, our additionne function will only use the EAX and EDX registers.

Then main pushes the 3 arguments onto the stack with which it will call the additionne function.

And finally comes the time for the CALL. In assembly, the CALL instruction performs the following actions:

The content of the EIP register is pushed onto the stack.
Transfers the execution flow to the address of the function to be called (thanks to the special EIP register).

Thus, we obtain a stack frame of main with its return address saved at the top of the stack. This return address will allow the execution flow to resume at the address where it was just before its CALL when it exits the called function (here additionne). Thus, the flow will not suffer any interruption.

In this perspective, main() is the “caller” function of the additionne function. Let’s now focus on the called function: additionne(). Let’s take, for example, its following C code:

int __attribute__((__cdecl__)) additionne(int a,int b, int c) {
    return a+b+c;
}

, whose compilation into 32-bit assembly gives:

0x0000118d         push       ebp                       ; Function prologue 
0x0000118e         mov        ebp, esp                  ;
0x00001190         mov        edx, dword [ebp+8]        ; int a
0x00001193         mov        eax, dword [ebp+12]       ; int b
0x00001196         add        edx, eax                  ; (a+b) into edx
0x00001198         mov        eax, dword [ebp+16]       ; int c
0x0000119b         add        eax, edx                  ; (edx + c) into eax
0x0000119d         pop        ebp                       ;
0x0000119e         ret                                  ; by convention, the function result is placed in eax

We can distinguish 3 parts in this function’s code:

Its prologue
Its processing
Its epilogue

The prologue of the additionne function

0x0000118d         push       ebp                       ; Save EBP on the stack 
0x0000118e         mov        ebp, esp                  ; Set the function's EBP by pointing it to ESP

The role of a function’s prologue is thus:

To save the address stored in EBP on the stack
To build a new empty stack frame for the called function by positioning EBP (the base of the stack) on ESP

The epilogue of the additionne function The epilogue of the function is here constituted by the simple RET instruction; an instruction that performs several actions:

Removes from the stack the previously stored return address (via a POP)
Directs the execution flow to this address (which is the address following the CALL by which we entered the called function)

Okay, you have understood how the stack works, let’s now study how to use it under Windows and Linux? And since computing was not built in one go, each OS has different conventions: let’s go for the ABIs…

The ABI (Application Binary Interface)

An ABI defines how data structures and data are accessed in machine code. For example, calling conventions (which we will see a bit later) are defined within ABIs.

Adhering to an ABI (which may or may not be officially standardized) is generally the work of a compiler (to produce the binary) and an operating system (to execute the binary). However, a developer may have to deal directly with an ABI when writing a program using multiple programming languages (e.g., C for Windows and Assembly), or even compiling a program written in the same language with different compilers.

When handling assembly code resulting from the reverse engineering of a binary program, we are obliged to take into account the ABI it uses.

The details covered by an ABI include the following elements:

Instruction set of the processor, with details such as register structure, stack organization, memory access types, etc.
Sizes, layouts, and alignments of basic data types that the processor can directly access
Calling convention, which controls how function arguments are passed and return values are retrieved. For example, the ABI defines the following:
- Whether all parameters are passed on the stack, and/or some are passed in registers
- Which registers are used for which function parameters
- Whether the first function parameter passed on the stack is pushed first or last
How an application should make system calls to the operating system, and if the ABI specifies direct system calls rather than procedure calls, the system call numbers
In the case of a complete operating system, the OS ABI standardizes the binary format of object files, binary libraries, …

In our quest for malware on 64-bit OSes, we will mainly encounter the following 2 ABIs:

System V AMD
Microsoft x64

Talking about ABIs means talking about calling conventions, and for that, I have prepared the next post for you.

Citations: [1] https://ppl-ai-file-upload.s3.amazonaws.com/web/direct-files/18240784/a25a808b-3de0-4af2-8a8d-f4b604b5c909/paste.txt [2] https://translate.google.com/?hl=fr&op=translate&sl=en&tab=TT&tl=fr [3] https://www.linguee.com/english-french/translation/reverse%2Bengineering.html [4] https://www.lexilogos.com/english/french_translation.htm [5] https://translate.google.fr/?hl=fr [6] https://scd-resnum.univ-lyon3.fr/out/theses/2022_out_zuk_f.pdf [7] https://www.linguee.com/french-english/translation/langage%2Bassembleur.html [8] https://www.youtube.com/watch?v=__R0GiJxims [9] https://www.academia.edu/8785377/French_wordlist_sans_accent [10] https://context.reverso.net/translation/french-english/ [11] https://kaizen-solutions.net/kaizen-insights/articles-et-conseils-de-nos-experts/comprendre-et-utiliser-le-reverse-engineering/ [12] http://sdz.tdct.org/sdz/en-profondeur-avec-l-assembleur.html [13] https://gist.github.com/n4sm/71ecaac0b7a7ef294987f939cf59ab16 [14] https://im2ag-moodle.univ-grenoble-alpes.fr/pluginfile.php/33928/mod_page/intro/LivreALM.pdf [15] https://fr.wikipedia.org/wiki/Assembleur [16] https://ww2.ac-poitiers.fr/lettres/IMG/txt/dic-fr.txt

I specify here on x86, because on ARM and x64 the __cdecl keyword is not taken into account by the compiler. Indeed, the convention requires that on ARM and x64 processors, parameters are passed as much as possible through registers, then only through the stack. ↩︎