Embedded-Hacking/WEEK03/WEEK03-04.md
2026-03-19 15:01:07 -04:00

# Embedded Systems Reverse Engineering
[Repository](https://github.com/mytechnotalent/Embedded-Hacking)
## Week 3
Embedded System Analysis: Understanding the RP2350 Architecture w/ Comprehensive Firmware Analysis
### Non-Credit Practice Exercise 4: Find Your Main Function and Trace Back
#### Objective
Locate the `main()` function, examine its first instructions, identify the first function call, and trace backward to discover where `main()` was called from.
#### Prerequisites
- Raspberry Pi Pico 2 with debug probe connected
- OpenOCD and `arm-none-eabi-gdb` available
- `build\0x0001_hello-world.elf` loaded
- Understanding of function calls and the link register (LR) from previous weeks
#### Task Description
You will use GDB to find `main()`, examine its disassembly, identify the initial function call (`stdio_init_all`), and use the link register to trace backward through the boot sequence.
#### Background Information
Key concepts:
- **Link Register (LR)**: Stores the return address when a function is called
- **Program Counter (PC)**: Points to the currently executing instruction
- **Function prologue**: The setup code at the start of a function, such as saving registers and reserving stack space
- **bl instruction**: "Branch with Link" - calls a function and stores return address in LR
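As a rough model of what `bl` and `blx` do to LR, the sketch below computes the value left in the link register. It is an illustration only: the helper name `lr_after_call` is invented here, and the example address is the `bl stdio_init_all` instruction from the Step 3 disassembly. On a Thumb-only core such as the Cortex-M33, bit 0 of LR is set to mark Thumb state.

```python
def lr_after_call(call_addr: int, insn_size: int) -> int:
    """Value a Thumb bl/blx leaves in LR: next instruction address with bit 0 set."""
    return_addr = call_addr + insn_size  # instruction that follows the call
    return return_addr | 1               # bit 0 marks Thumb state


# `bl stdio_init_all` is a 32-bit (4-byte) instruction at main+6.
lr = lr_after_call(0x1000023A, 4)
print(hex(lr))       # 0x1000023f
print(hex(lr & ~1))  # 0x1000023e, the instruction executed after the return
```

A 16-bit `blx r1` would use `insn_size=2` instead, so its return address is only 2 bytes past the call.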
#### Step-by-Step Instructions
##### Step 1: Connect and Halt
```gdb
(gdb) target extended-remote :3333
(gdb) monitor reset halt
```
##### Step 2: Find the Main Function
```gdb
(gdb) info functions main
```
**Expected output:**
```
All functions matching regular expression "main":
File 0x0001_hello-world.c:
0x10000234 int main(void);
Non-debugging symbols:
0x10000186 platform_entry_arm_a
...
```
Note the address of `main`: **`0x10000234`**
##### Step 3: Examine Instructions at Main
```gdb
(gdb) x/10i 0x10000234
```
**Expected output:**
```
0x10000234 <main>: push {r7, lr}
0x10000236 <main+2>: sub sp, #8
0x10000238 <main+4>: add r7, sp, #0
0x1000023a <main+6>: bl 0x100012c4 <stdio_init_all>
0x1000023e <main+10>: movw r0, #404 @ 0x194
0x10000242 <main+14>: movt r0, #4096 @ 0x1000
0x10000246 <main+18>: bl 0x1000023c <__wrap_puts>
0x1000024a <main+22>: b.n 0x1000023e <main+10>
0x1000024c <runtime_init>: push {r3, r4, r5, r6, r7, lr}
```
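The `movw`/`movt` pair at `main+10` and `main+14` builds one 32-bit value in `r0`, which then becomes the argument to `__wrap_puts`. A minimal sketch of that arithmetic follows; the helper name `movw_movt` is made up for illustration.

```python
def movw_movt(low16: int, high16: int) -> int:
    """Combine the immediates of a movw/movt pair into the 32-bit register value."""
    # movw writes the low halfword (and zeroes the top); movt overwrites the top halfword.
    return ((high16 & 0xFFFF) << 16) | (low16 & 0xFFFF)


r0 = movw_movt(404, 4096)
print(hex(r0))  # 0x10000194
```

The result lies in the same `0x10000000` flash region as the code, which is consistent with `r0` holding the address of a constant string, although the disassembly alone does not prove what is stored there. In GDB, `x/s 0x10000194` is one way to check.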
##### Step 4: Identify the First Function Call
The first function call in `main()` is:
```
0x1000023a <main+6>: bl 0x100012c4 <stdio_init_all>
```
**What does this function do?**
```gdb
(gdb) info functions stdio_init_all
```
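**Answer:** `info functions` only confirms that the symbol exists and shows its address. In the Pico SDK, `stdio_init_all()` initializes every standard I/O backend enabled in the build (USB, UART, etc.) so that `printf()` output has somewhere to go.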
##### Step 5: Set a Breakpoint at Main
```gdb
(gdb) b main
(gdb) c
```
**Expected output:**
```
Breakpoint 1, main () at 0x0001_hello-world.c:5
5 stdio_init_all();
```
##### Step 6: Examine the Link Register
When stopped at `main()`, check what's in the link register:
```gdb
(gdb) info registers lr
```
**Expected output:**
```
lr 0x1000018f 268435855
```
The LR contains the return address, where execution will go when `main()` returns. Bit 0 is set because the caller runs in Thumb state.
##### Step 7: Disassemble the Caller
Subtract 1 to remove the Thumb bit, giving `0x1000018e`, then disassemble from the start of `platform_entry` so the call itself is visible:
```gdb
(gdb) x/7i 0x10000186
```
**Expected output:**
```
0x10000186 <platform_entry>: ldr r1, [pc, #80]
0x10000188 <platform_entry+2>: blx r1
0x1000018a <platform_entry+4>: ldr r1, [pc, #80]
0x1000018c <platform_entry+6>: blx r1 ← This called main
0x1000018e <platform_entry+8>: ldr r1, [pc, #80] ← LR points here
0x10000190 <platform_entry+10>: blx r1
0x10000192 <platform_entry+12>: bkpt 0x0000
```
##### Step 8: Understand the Call Chain
Working backward from `main()`:
```
platform_entry (0x10000186)
 ├─ blx at +2 → runtime_init() (0x1000024c)
 └─ blx at +6 → main() (0x10000234) ← We are here
     └─ bl at +6 → stdio_init_all() (0x100012c4), called next
```
##### Step 9: Verify Platform Entry Calls Main
Look at what `platform_entry` loads before the `blx`:
```gdb
(gdb) x/x 0x100001dc
```
The word stored at `0x100001dc` is the value that `ldr r1, [pc, #80]` at `platform_entry+4` loads into `r1` before the `blx`. It should be the address of `main()`.
**Expected output:**
```
0x100001dc <data_cpy_table+60>: 0x10000235
```
Note: `0x10000235` = `0x10000234` + 1 (Thumb bit), which is the address of `main()`!
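The address `0x100001dc` is not arbitrary: it follows from the PC-relative addressing of the `ldr r1, [pc, #80]` at `0x1000018a`. The sketch below reproduces that calculation and the Thumb-bit adjustment. The function names are invented for illustration, and the rule applied is the standard one for a 16-bit Thumb literal load (PC reads as the instruction address plus 4, rounded down to a multiple of 4).

```python
def thumb_ldr_literal_addr(insn_addr: int, imm: int) -> int:
    """Address read by a 16-bit Thumb `ldr Rt, [pc, #imm]` located at insn_addr."""
    pc = insn_addr + 4       # PC reads as the instruction address plus 4
    return (pc & ~3) + imm   # base is PC rounded down to a 4-byte boundary


def strip_thumb_bit(addr: int) -> int:
    """Clear bit 0 of a Thumb function pointer to get the real code address."""
    return addr & ~1


print(hex(thumb_ldr_literal_addr(0x1000018A, 80)))  # 0x100001dc
print(hex(strip_thumb_bit(0x10000235)))             # 0x10000234
```

Applying the same function to the `ldr` instructions at `0x10000186` and `0x1000018e` gives `0x100001d8` and `0x100001e0`, the neighbouring slots, which presumably hold the pointers for the other two calls.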
##### Step 10: Complete the Boot Trace
You've now traced the complete path:
```
1. Reset (Power-on)
↓
2. Bootrom (0x00000000)
↓
3. Vector Table (0x10000000)
↓
4. _reset_handler (0x1000015c)
↓
5. Data Copy & BSS Clear
↓
6. platform_entry (0x10000186)
↓
7. runtime_init() (first call)
↓
8. main() (second call) ← Exercise focus
↓
9. stdio_init_all() (first line of main)
```
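The hop from the vector table to `_reset_handler` (items 3 and 4) can be sketched as follows, assuming the image starts with a standard Cortex-M vector table: word 0 is the initial stack pointer and word 1 is the reset vector with the Thumb bit set. The sample bytes are hypothetical; in particular the stack pointer value is made up, and the reset vector is simply `0x1000015c | 1`. On real hardware the same two words can be read with `x/2x 0x10000000`.

```python
import struct


def parse_vector_table_head(image: bytes) -> tuple[int, int]:
    """Return (initial_sp, reset_handler_addr) from the first two vector table words."""
    initial_sp, reset_vector = struct.unpack_from("<II", image, 0)  # little-endian words
    return initial_sp, reset_vector & ~1  # clear the Thumb bit of the handler address


# Hypothetical table head: made-up SP, reset vector = 0x1000015c with bit 0 set.
sample = struct.pack("<II", 0x20082000, 0x1000015D)
sp, reset = parse_vector_table_head(sample)
print(hex(sp), hex(reset))  # 0x20082000 0x1000015c
```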
#### Expected Output
- `main()` is at address `0x10000234`
- First function call is `stdio_init_all()` at offset +6
- Link register points to `platform_entry+8` (`0x1000018e` once the Thumb bit is cleared), the instruction after the `blx` at `platform_entry+6`
- `platform_entry` makes three function calls: runtime_init, main, and exit
#### Questions for Reflection
###### Question 1: Why does the link register hold the address of the instruction immediately after the `blx` that called `main()`, with bit 0 set?
###### Question 2: What would happen if `main()` tried to return (instead of looping forever)?
###### Question 3: How can you tell from the disassembly that main contains an infinite loop?
###### Question 4: Why is `stdio_init_all()` called before the printf loop?
#### Tips and Hints
- Use `bt` (backtrace) to see the call stack
- Remember to account for Thumb mode when reading addresses from LR
- Use `info frame` to see detailed information about the current stack frame
- The `push {r7, lr}` at the start of main saves the return address
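To see where the saved return address ends up, the prologue of `main()` (`push {r7, lr}`, `sub sp, #8`, `add r7, sp, #0`) can be modelled on a descending stack. This is a simplified sketch: `run_prologue` is an invented helper, and the register values passed in are placeholders rather than values read from the target.

```python
def run_prologue(sp: int, r7: int, lr: int):
    """Model `push {r7, lr}; sub sp, #8; add r7, sp, #0` on a full-descending stack."""
    stack = {}
    sp -= 8             # push {r7, lr}: two 4-byte registers
    stack[sp] = r7      # lower-numbered register goes to the lower address
    stack[sp + 4] = lr
    sp -= 8             # sub sp, #8: room for locals
    r7 = sp             # add r7, sp, #0: r7 becomes the frame pointer
    return sp, r7, stack


# Placeholder entry state (hypothetical values, not taken from hardware).
entry_lr = 0x10000ABD
sp, r7, stack = run_prologue(sp=0x20082000, r7=0, lr=entry_lr)
print(hex(r7 + 12), hex(stack[r7 + 12]))  # 0x20081ffc 0x10000abd
```

Under this model the saved LR sits at `r7 + 12`, so once stopped inside `main()` after the prologue, `x/x $r7+12` should show the same value as `info registers lr`.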
#### Next Steps
- Set a breakpoint at `stdio_init_all()` and step through its initialization
- Examine what happens after `main()` by looking at the `exit()` function
- Try Exercise 5 in Ghidra for static analysis of the boot sequence
#### Additional Challenge
Create a GDB command that reports the current function and its caller. LR only holds the caller's return address until the current function makes a call of its own, so run it right after stopping at a function's entry breakpoint:
```gdb
(gdb) define calltrace
> printf "current: "
> info symbol $pc
> printf "caller: "
> info symbol ((unsigned int)$lr & ~1)
> end
```
Then try stepping through functions and running `calltrace` at each level to build a complete call graph.