Embedded-Hacking/WEEK04/WEEK04-01.md
Kevin Thomas 29073bd383 Added WEEK04
2026-01-31 14:07:15 -05:00
# Embedded Systems Reverse Engineering
[Repository](https://github.com/mytechnotalent/Embedded-Hacking)
## Week 4
Variables in Embedded Systems: Debugging and Hacking Variables w/ GPIO Output Basics
### Exercise 1: Analyze Variable Storage in Ghidra
#### Objective
Import and analyze the `0x0005_intro-to-variables.bin` binary in Ghidra to understand where variables are stored, identify memory sections, and trace how the compiler optimizes variable usage.
#### Prerequisites
- Ghidra installed and configured
- `0x0005_intro-to-variables.bin` binary available in your build directory
- Understanding of memory sections (`.data`, `.bss`, `.rodata`) from Week 4 Part 2
- Basic Ghidra navigation skills from Week 3
#### Task Description
You will import the binary into Ghidra, configure it for ARM Cortex-M33, analyze the code structure, resolve function names, and locate where the `age` variable is used in the compiled binary.
#### Step-by-Step Instructions
##### Step 1: Start Ghidra and Create New Project
```bash
ghidraRun
```
1. Click **File** → **New Project**
2. Select **Non-Shared Project**
3. Click **Next**
4. Enter Project Name: `week04-ex01-intro-to-variables`
5. Choose a project directory
6. Click **Finish**
##### Step 2: Import the Binary
1. Open your file explorer
2. Find `Embedded-Hacking/0x0005_intro-to-variables/build/0x0005_intro-to-variables.bin`
3. **Drag and drop** the `.bin` file into Ghidra's project window
##### Step 3: Configure Import Settings
When the import dialog appears:
1. Click the three dots (**…**) next to **Language**
2. Search for: `Cortex`
3. Select: **ARM Cortex 32 little endian default**
4. Click **OK**
Now click the **Options…** button:
1. Change **Block Name** to: `.text`
2. Change **Base Address** to: `10000000` (XIP flash base)
3. Click **OK**
Then click **OK** on the main import dialog.
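Setting the base address tells Ghidra that byte 0 of the raw file lives at `0x10000000`, so every address in the listing is the base plus a file offset. A minimal C sketch of that mapping follows; the macro and helper names are hypothetical and are not part of Ghidra or the Pico SDK.

```c
#include <stdint.h>

/* Base address entered in the import dialog (XIP flash base). */
#define XIP_BASE_ADDR 0x10000000u

/* Hypothetical helper: address at which Ghidra shows a given file offset. */
uint32_t offset_to_address(uint32_t file_offset)
{
    return XIP_BASE_ADDR + file_offset;
}

/* Hypothetical helper: file offset of the byte shown at a given address. */
uint32_t address_to_offset(uint32_t address)
{
    return address - XIP_BASE_ADDR;
}
```

For example, `address_to_offset(0x10000234)` yields `0x234`, which is where the bytes shown at that address sit inside the `.bin` file.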
##### Step 4: Analyze the Binary
1. Double-click the imported file in the project window
2. When prompted "Analyze now?" click **Yes**
3. Leave all default analysis options selected
4. Click **Analyze**
5. Wait for analysis to complete (watch bottom-right progress bar)
##### Step 5: Navigate to the Symbol Tree
Look at the left panel for the **Symbol Tree**. Expand **Functions** to see the auto-detected functions.
You should see function names like:
- `FUN_1000019a`
- `FUN_10000210`
- `FUN_10000234`
- Many more...
These are auto-generated names because we're analyzing a raw binary without debug symbols.
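Each default name is the prefix `FUN_` followed by the function's start address in hex, so the name doubles as an address you can jump to. A small sketch of that convention, using a hypothetical helper name:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: build a Ghidra-style default label for a function
 * that starts at the given address, e.g. 0x10000234 -> "FUN_10000234". */
int default_function_name(char *out, size_t out_size, uint32_t address)
{
    return snprintf(out, out_size, "FUN_%08lx", (unsigned long)address);
}
```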
##### Step 6: Identify the Main Function
From Week 3, we know the typical boot sequence:
1. Reset handler copies data
2. `frame_dummy` runs
3. `main()` is called
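The "copies data" step can be pictured with a host-side simulation. This is a simplified sketch, not the actual Pico SDK startup code: the array names are hypothetical stand-ins for regions that a linker script would normally define, and zero-filling `.bss` is included because typical C startup code does it alongside the `.data` copy.

```c
#include <stdint.h>
#include <string.h>

/* Host-side stand-ins for linker-provided regions (hypothetical names). */
const uint8_t flash_data_image[4] = { 0x2b, 0x00, 0x00, 0x00 }; /* initial values kept in flash */
uint8_t ram_data[4];                                            /* .data region in RAM */
uint8_t ram_bss[4] = { 0xff, 0xff, 0xff, 0xff };                /* .bss region, pre-filled with junk */

/* Simplified model of what a reset handler does before main() runs. */
void simulated_reset_handler(void)
{
    memcpy(ram_data, flash_data_image, sizeof ram_data); /* copy .data from flash to RAM */
    memset(ram_bss, 0, sizeof ram_bss);                  /* zero-fill .bss */
    /* A real handler would now run initializers and call main(). */
}
```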
Click on `FUN_10000234` - this should be our `main()` function.
**Look at the Decompile window:**
```c
void FUN_10000234(void)
{
FUN_100030cc();
do {
FUN_10003100("age: %d\r\n", 0x2b);
} while (true);
}
```
**Observations:**
- `FUN_100030cc()` is likely `stdio_init_all()`
- `FUN_10003100()` is likely `printf()`
- The magic value `0x2b` appears (what is this?)
##### Step 7: Convert 0x2b to Decimal
Let's figure out what `0x2b` means:
**Manual calculation:**
- `0x2b` in hexadecimal
- `2 × 16 + 11 = 32 + 11 = 43` in decimal
**In GDB (alternative method):**
```gdb
(gdb) p/d 0x2b
$1 = 43
```
So `0x2b = 43`! This matches our `age = 43` from the source code!
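The same positional arithmetic can be written as a short C function. This is an illustrative sketch with a hypothetical helper name; it multiplies the running value by 16 for each new digit, exactly as in the manual calculation.

```c
#include <stdint.h>

/* Convert a hex string such as "2b" (no "0x" prefix) to its value.
 * Returns 0 for an empty string; stops at the first non-hex character. */
uint32_t hex_to_decimal(const char *hex)
{
    uint32_t value = 0;
    for (; *hex != '\0'; ++hex) {
        uint32_t digit;
        if (*hex >= '0' && *hex <= '9') {
            digit = (uint32_t)(*hex - '0');
        } else if (*hex >= 'a' && *hex <= 'f') {
            digit = (uint32_t)(*hex - 'a') + 10u;
        } else if (*hex >= 'A' && *hex <= 'F') {
            digit = (uint32_t)(*hex - 'A') + 10u;
        } else {
            break;
        }
        value = value * 16u + digit; /* shift earlier digits one hex place left */
    }
    return value;
}
```

Calling `hex_to_decimal("2b")` walks through `2`, then `2 × 16 + 11`, and returns 43.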
##### Step 8: Rename Functions for Clarity
Let's rename the functions to their actual names:
**Rename FUN_10000234 to main:**
1. Right-click on `FUN_10000234` in the Symbol Tree
2. Select **Rename Function**
3. Enter: `main`
4. Press **Enter**
**Update main's signature:**
1. In the Decompile window, right-click on `main`
2. Select **Edit Function Signature**
3. Change to: `int main(void)`
4. Click **OK**
**Rename FUN_100030cc to stdio_init_all:**
1. Click on `FUN_100030cc` in the decompile window
2. Right-click → **Edit Function Signature**
3. Change name to: `stdio_init_all`
4. Change signature to: `bool stdio_init_all(void)`
5. Click **OK**
**Rename FUN_10003100 to printf:**
1. Click on `FUN_10003100`
2. Right-click → **Edit Function Signature**
3. Change name to: `printf`
4. Check the **Varargs** checkbox (printf accepts variable arguments)
5. Click **OK**
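The **Varargs** checkbox corresponds to the `...` in a C prototype. As a reminder of what that looks like in source, here is a hypothetical printf-like helper built on the standard `<stdarg.h>` macros and `vsnprintf`:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical printf-like helper: the fixed parameters end with the format
 * string, and any number of additional arguments may follow it. */
int format_line(char *out, size_t out_size, const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);                               /* start reading the variable arguments */
    int written = vsnprintf(out, out_size, fmt, args); /* format them according to fmt */
    va_end(args);
    return written;
}
```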
##### Step 9: Examine the Optimized Code
After renaming, the decompiled main should now look like:
```c
int main(void)
{
stdio_init_all();
do {
printf("age: %d\r\n", 0x2b);
} while (true);
}
```
**Critical observation:** Where did our `age` variable go?
Original source code:
```c
uint8_t age = 42;
age = 43;
```
The compiler **optimized it completely away**!
**Why?**
1. `age = 42` is immediately overwritten
2. The value `42` is never used
3. The compiler replaces `age` with the constant `43` (`0x2b`)
4. No variable allocation in memory is needed!
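The reasoning above can be restated as two equivalent functions. This is a sketch for illustration with hypothetical function names; the first mirrors the two source lines, the second is what an optimizing compiler is free to reduce it to.

```c
#include <stdint.h>

/* Mirrors the two source lines: the first store is dead because nothing
 * reads age before it is overwritten. */
uint8_t age_as_written(void)
{
    uint8_t age = 42; /* dead store: never read */
    age = 43;         /* the only value that is ever observed */
    return age;
}

/* Equivalent form after constant propagation and dead-store elimination. */
uint8_t age_as_optimized(void)
{
    return 43; /* 0x2b */
}
```

Both functions have the same observable behavior, which is all the compiler is required to preserve.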
##### Step 10: Examine the Assembly Listing
Click on the **Listing** window, which shows the assembly code.
Find the instruction that loads `0x2b`:
```assembly
10000xxx movs r1, #0x2b
10000xxx ...
10000xxx bl printf
```
**What this does:**
- `movs r1, #0x2b` - Moves the immediate value 0x2b (43) into register r1
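- `bl printf` - Calls printf (branch with link). Under the ARM calling convention, r0 holds the pointer to the format string and r1 holds the first variadic argument, here 43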
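To see how the constant is embedded in the instruction itself, the sketch below encodes and decodes the 16-bit Thumb form of `movs Rd, #imm8`. It assumes the compiler chose this narrow encoding, which you should confirm against the bytes in your own listing; the helper names are hypothetical.

```c
#include <stdint.h>

/* Encode the 16-bit Thumb instruction "movs Rd, #imm8" (Rd must be r0-r7).
 * Bit layout: 0 0 1 0 0 | Rd (3 bits) | imm8 (8 bits). */
uint16_t encode_movs_imm8(unsigned rd, uint8_t imm8)
{
    return (uint16_t)(0x2000u | ((rd & 0x7u) << 8) | imm8);
}

/* Extract the immediate from a "movs Rd, #imm8" encoding. */
uint8_t movs_immediate(uint16_t instruction)
{
    return (uint8_t)(instruction & 0xFFu);
}

/* Extract the destination register number from the same encoding. */
unsigned movs_register(uint16_t instruction)
{
    return (instruction >> 8) & 0x7u;
}
```

Under that assumption, `encode_movs_imm8(1, 0x2b)` gives `0x212b`, stored in little-endian order as the bytes `2b 21`, so the low byte of the instruction is the value 43 itself.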
##### Step 11: Document Your Findings
Create a table of your observations:
| Item                  | Value/Location   | Notes                           |
| --------------------- | ---------------- | ------------------------------- |
| Main function address | `0x10000234`     | Start of `main()`               |
| Age value (hex)       | `0x2b`           | Optimized constant              |
| Age value (decimal)   | `43`             | Final value assigned to `age`   |
| Variable in memory?   | No               | Compiler optimized it away      |
| Printf address        | `0x10003100`     | Standard library function       |
| Format string         | `"age: %d\r\n"`  | Located in `.rodata` section    |
#### Expected Output
After completing this exercise, you should be able to:
- Successfully import and configure ARM binaries in Ghidra
- Navigate the Symbol Tree and identify functions
- Understand how compiler optimization removes unnecessary variables
- Convert hexadecimal values to decimal
- Rename functions for better code readability
#### Questions for Reflection
###### Question 1: Why did the compiler optimize away the `age` variable?
###### Question 2: In what memory section would `age` have been stored if it wasn't optimized away?
###### Question 3: Where is the string "age: %d\r\n" stored, and why can't it be in RAM?
###### Question 4: What would happen if we had used `age` in a calculation before reassigning it to 43?
#### Tips and Hints
- Use **CTRL+F** in Ghidra to search for specific values or strings
- The **Data Type Manager** window shows all recognized data types
- If Ghidra's decompiler output looks wrong, try re-analyzing with different options
- Remember: optimized code often looks very different from source code
- **Window** → **Function Graph** shows control flow visually
#### Next Steps
- Proceed to Exercise 2 to learn binary patching
- Try analyzing `0x0008_uninitialized-variables.bin` to see how uninitialized variables behave
- Explore the `.rodata` section to find string literals
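Finding a string literal in a raw image comes down to searching for its bytes. The sketch below is a plain byte search over a buffer, with a hypothetical helper name; loading the `.bin` file into the buffer is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the offset of the first occurrence of needle in image,
 * or -1 if it does not occur. */
long find_bytes(const uint8_t *image, size_t image_len,
                const void *needle, size_t needle_len)
{
    if (needle_len == 0 || needle_len > image_len) {
        return -1;
    }
    for (size_t i = 0; i + needle_len <= image_len; ++i) {
        if (memcmp(image + i, needle, needle_len) == 0) {
            return (long)i;
        }
    }
    return -1;
}
```

Adding the import base address (`0x10000000` in this exercise) to the returned offset gives the address at which Ghidra displays the string.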
#### Additional Challenge
Find the format string "age: %d\r\n" in Ghidra. What address is it stored at? How does the program reference this string in the assembly code? (Hint: Look for an `ldr` instruction that loads the string address into a register.)
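If the compiler used a 16-bit PC-relative `ldr` (an assumption; check your own listing), the instruction does not contain the string address directly. It loads a word from a nearby literal pool, and that word holds the address. The sketch below computes where that literal-pool word lives; the helper name is hypothetical and the addresses in the note are made up for illustration, not taken from this binary.

```c
#include <stdint.h>

/* For a 16-bit Thumb "ldr Rt, [pc, #imm]" located at instruction_address,
 * return the address of the literal-pool word it loads. The PC reads as the
 * instruction address plus 4, rounded down to a multiple of 4, and the
 * 8-bit immediate counts 4-byte words. */
uint32_t ldr_literal_address(uint32_t instruction_address, uint16_t instruction)
{
    uint32_t pc_aligned = (instruction_address + 4u) & ~3u;
    uint32_t imm8 = instruction & 0xFFu;
    return pc_aligned + imm8 * 4u;
}
```

For a made-up instruction `0x4801` at `0x10000236`, the function returns `0x1000023c`; the 32-bit value stored there would be the pointer passed to `printf` in r0.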