mirror of
https://github.com/mytechnotalent/Embedded-Hacking.git
synced 2026-04-05 10:52:30 +02:00
# Embedded Systems Reverse Engineering
[Repository](https://github.com/mytechnotalent/Embedded-Hacking)

## Week 1
Introduction and Overview of Embedded Reverse Engineering: Ethics, Scoping, and Basic Concepts

### Non-Credit Practice Exercise 3: Find Cross-References in Ghidra

#### Objective
Learn how to use Ghidra's cross-reference feature to trace how data flows through code, understanding where specific data is read, written, or referenced.

#### Prerequisites
- Ghidra installed with `0x0001_hello-world` project open
- Completed Exercise 2 (Find Strings) - you should know where the "hello, world" string is located
- CodeBrowser window open with the binary loaded

#### Task Description

In this exercise, you'll:
1. Navigate to a specific data reference in the `main` function
2. Find where a particular data item (`DAT_...`) is used
3. Trace back to see which functions access this data
4. Understand how data flows from memory to the CPU and then to functions

#### Background: What are Cross-References?

A **cross-reference** is a link between different parts of the code:
- **Code → Data**: An instruction reads or writes data
- **Code → Code**: A function calls another function
- **Data → Data**: One data item references another

In this exercise, we're tracking **code → data** references to understand where and how the program uses the "hello, world" string.

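To make the three kinds of links concrete, here is a minimal sketch of a cross-reference table in Python. It is a toy model, not Ghidra's API: the record layout and the `build_xref_index` helper are made up for illustration, the first and last entries reuse the example addresses from this exercise, and the call target is hypothetical.

```python
from collections import defaultdict

# Toy cross-reference records: (from_address, from_function, kind, to_address).
# Addresses reuse this exercise's example values; the call target is hypothetical.
XREFS = [
    (0x10000236, "main", "read", 0x10000244),    # code -> data
    (0x10000238, "main", "call", 0x10002000),    # code -> code (hypothetical target)
    (0x10000244, None, "pointer", 0x100019CC),   # data -> data
]

def build_xref_index(xrefs):
    """Group references by the address they point to."""
    index = defaultdict(list)
    for from_addr, func, kind, to_addr in xrefs:
        index[to_addr].append((from_addr, func, kind))
    return index

index = build_xref_index(XREFS)

# "Who references 0x10000244?" is the question this exercise asks Ghidra.
for from_addr, func, kind in index[0x10000244]:
    print(f"{func}:{from_addr:08x} ({kind})")    # main:10000236 (read)
```

Indexing by the target address is what makes the "references to" question cheap to answer: you look up the data item and get back every place that touches it.
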
#### Step-by-Step Instructions

##### Step 1: Navigate to the main Function

1. In Ghidra's CodeBrowser, use **Navigation → Go To...** (or press **G**)
2. Type `main` and press Enter
3. Ghidra will navigate to the `main` function
4. You should see the disassembly in the Listing view (center panel)

##### Step 2: Locate the `ldr` Instruction

In the `main` function's disassembly, look for an `ldr` (load register) instruction. It should look something like:

```
ldr r0, [DAT_10000244]
```

or similar. This instruction breaks down as follows:
- **`ldr`** = load register (read data from memory)
- **`r0`** = put the data into register `r0`
- **`[DAT_10000244]`** = read the 32-bit value stored at location `DAT_10000244`

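Ghidra shows the target as a label, but a load like this is typically encoded as a PC-relative literal load (`ldr r0, [pc, #imm]`). The sketch below works through that address arithmetic. It assumes the `ldr` sits at `0x10000236` (the example address used later in this exercise) and that it uses the Thumb PC-relative form; the function names are illustrative only.

```python
def thumb_literal_address(instr_addr: int, imm: int) -> int:
    """Address read by a Thumb `ldr rX, [pc, #imm]` located at instr_addr."""
    pc = instr_addr + 4           # in Thumb state, PC reads as the instruction address + 4
    return (pc & ~0x3) + imm      # the base is PC rounded down to a multiple of 4

def thumb_literal_offset(instr_addr: int, literal_addr: int) -> int:
    """Offset that lets the instruction at instr_addr reach literal_addr."""
    return literal_addr - ((instr_addr + 4) & ~0x3)

# Assumed example: the ldr at 0x10000236 reading the literal at 0x10000244.
offset = thumb_literal_offset(0x10000236, 0x10000244)
print(offset)                                            # 12
print(hex(thumb_literal_address(0x10000236, offset)))    # 0x10000244
```

Under these assumptions the raw instruction would be `ldr r0, [pc, #12]`; Ghidra resolves the arithmetic for you and shows the `DAT_10000244` label instead.
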
##### Step 3: Understand the Notation

In Ghidra's Listing (disassembly) notation:
- **`DAT_10000244`** = a data item (not code) at address `0x10000244`
- **`[...]`** = the contents of memory at that address (a dereference), not the address itself
- The value stored there is the address of the "hello, world" string in Flash memory

##### Step 4: Right-Click on the Data Reference

1. In the Listing view, find the `ldr` instruction that loads the string address
2. **Right-click** on the `DAT_...` part (the data reference)
3. A context menu should appear

##### Step 5: Select "References" Option

In the context menu:
1. Look for an option that says **References**
2. Click on it to see a submenu
3. Select **Show References to** (this shows "where is this data used?")

##### Step 6: Review the References Window

A new window should appear showing all the locations where `DAT_10000244` (or whatever the address is) is referenced:

**Expected output might look like:**
```
DAT_10000244 (1 xref):
main:10000236 (read)
```

This means:
- The data at `DAT_10000244` is used in 1 place
- That place is in the `main` function at instruction `10000236`
- It's a **read** operation (the code is reading this data)

##### Step 7: Answer These Questions

###### Question 1: Data Address
- What is the address of the data reference you found? (e.g., `DAT_10000244`)
- __________

###### Question 2: Referenced By
- How many places reference this data?
- __________
- Which function(s) use it?
- __________

###### Question 3: Reference Type
- Is it a read or write operation?
- __________
- Why? (What's the program doing with this data?)
- __________

###### Question 4: The Chain
- The `ldr` instruction loads an address into `r0`
- What happens next? (Hint: Look at the next instruction after the `ldr`)
- __________
- Is there a function call? If so, which one?
- __________

###### Question 5: Understanding the Flow
- **`DAT_10000244`** contains the address of the "hello, world" string
- The `ldr` loads that address into `r0`
- Then a function (probably `printf` or `puts`) is called with `r0` as the argument
- Can you trace this complete flow?

#### Deeper Analysis (Optional Challenge)

##### Challenge 1: Find the Actual String Address
1. Navigate to the `DAT_10000244` location
2. Look at the value stored there
3. Can you decode the hex bytes and find the actual address of "hello, world"?
4. Hint: The RP2350 uses little-endian encoding, so the least significant byte is stored first and the bytes appear "backwards"

**Example:**
If you see the bytes `CC 19 00 10`, read them in reverse order: `10 00 19 CC` = `0x100019CC`

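If you want to check your manual decoding, the same byte reversal can be sketched in a few lines of Python. The byte values are the illustrative ones from the example above; the bytes in your binary may differ.

```python
raw = bytes.fromhex("CC 19 00 10")       # bytes in the order they appear in memory

# "little" means the first byte in memory is the least significant one.
address = int.from_bytes(raw, "little")
print(hex(address))                       # 0x100019cc

# Going the other way: how a 32-bit address is laid out in memory.
print(address.to_bytes(4, "little").hex(" "))   # cc 19 00 10
```
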
##### Challenge 2: Understand the Indirection
1. In C, if we want to hold the address of a string, we write: `const char *ptr = some_string;`
2. Then to use it: `printf(ptr);`
3. In assembly, this becomes:
   - Load the pointer: `ldr r0, [DAT_...]`
   - Call the function: `bl printf`
4. Can you see this pattern in the assembly?

##### Challenge 3: Follow Multiple References
1. Try this with different data items in the binary
2. Find a data item that has **multiple** cross-references
3. What data is used in more than one place?

#### Questions for Reflection

1. **Why does the code need to load an address from memory?**
   - Why can't it just use the address directly?
   - Hint: A single 16-bit Thumb instruction has no room for a full 32-bit address, so the compiler stores the address in a nearby literal pool and loads it relative to the program counter

2. **What's the relationship between `DAT_10000244` and the "hello, world" string?**
   - They're at different addresses - why?
   - Both are in Flash: which one holds the characters, and which one holds a pointer to them?

3. **If we wanted to change what gets printed, where would we modify the code?**
   - Could we just change the string at address `0x100019CC`?
   - Or would we need to change `DAT_10000244`?
   - Or both?

4. **How does this relate to memory layout?**
   - Code section (Flash memory starting at `0x10000000`)
   - Data section (constants/strings)
   - Is everything at different addresses for a reason?

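One way to experiment with question 3 without touching real firmware is a toy Flash image in Python. This is a sketch under stated assumptions: the image is an empty `bytearray`, the addresses are the example values from this exercise, `0x10001A00` is a hypothetical unused location, and the `write` and `printed_text` helpers are made up for illustration.

```python
import struct

BASE = 0x10000000
image = bytearray(0x2000)   # toy Flash image, not a real firmware dump

def write(addr, data):
    """Copy data into the toy image at an absolute Flash address."""
    image[addr - BASE:addr - BASE + len(data)] = data

def printed_text():
    """Follow the pointer at 0x10000244, then read up to the NUL terminator."""
    (ptr,) = struct.unpack_from("<I", image, 0x10000244 - BASE)
    end = image.index(0, ptr - BASE)
    return image[ptr - BASE:end].decode("ascii")

write(0x100019CC, b"hello, world\0")
write(0x10000244, struct.pack("<I", 0x100019CC))
original = printed_text()

# Option 1: overwrite the string bytes in place (same length, so nothing else moves).
write(0x100019CC, b"HELLO, WORLD\0")
in_place = printed_text()

# Option 2: leave the string alone and point the literal at other text.
write(0x10001A00, b"patched\0")                     # hypothetical unused Flash location
write(0x10000244, struct.pack("<I", 0x10001A00))
redirected = printed_text()

print(original, in_place, redirected, sep=" | ")    # hello, world | HELLO, WORLD | patched
```

Either change alters the output in this model. In a real binary, an in-place edit is limited by the space the original string occupies, while redirecting the pointer needs somewhere safe to put the new text.
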
#### Tips and Hints

- If you right-click and don't see "References", try right-clicking directly on the instruction address instead
- You can also use **Search → For Direct References** from the menu to search for references to an address
- In the Decompile view (right side), cross-references may be shown in a different format or with different colors
- Multi-level references: You can right-click on a data item and then follow the chain to another data item

#### Real-World Applications

Understanding cross-references is crucial for:
- **Vulnerability hunting**: Finding where user input flows through the code
- **Firmware patching**: Changing constants, strings, or data values
- **Malware analysis**: Tracking command-and-control server addresses or encryption keys
- **Reverse engineering**: Understanding program logic by following data dependencies

#### Summary

By completing this exercise, you've learned:
1. How to find and interpret cross-references in Ghidra
2. How to trace data from its definition to where it's used
3. How the `ldr` (load) instruction works to pass data to functions
4. The relationship between high-level C code and assembly-level data flow
5. How addresses are loaded indirectly through a pointer stored in Flash (a literal pool entry)

#### Expected Final Understanding

You should now understand this flow:
```
String "hello, world" is stored at address 0x100019CC in Flash
↓
A pointer to this address is stored at DAT_10000244 in Flash
↓
The main() function loads this pointer: ldr r0, [DAT_10000244]
↓
main() calls printf with r0 (the string address) as the argument
↓
printf() reads the bytes at that address and prints them
```
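
The flow above can also be walked through as a small simulation. This is a sketch, not a model of the real hardware: the Flash contents are a toy `bytearray` holding only the two items from the flow, the addresses are the example values used in this exercise, and `load_word` and `read_c_string` are hypothetical helpers standing in for `ldr` and the print routine.

```python
import struct

FLASH_BASE = 0x10000000

# Toy Flash image holding just the two items from the flow above.
flash = bytearray(0x2000)
flash[0x19CC:0x19CC + 13] = b"hello, world\0"              # the string itself
flash[0x0244:0x0244 + 4] = struct.pack("<I", 0x100019CC)   # the pointer (DAT_10000244)

def load_word(addr):
    """Model of `ldr`: read one little-endian 32-bit word."""
    return struct.unpack_from("<I", flash, addr - FLASH_BASE)[0]

def read_c_string(addr):
    """Model of the print routine: read bytes until the NUL terminator."""
    offset = addr - FLASH_BASE
    end = flash.index(0, offset)
    return flash[offset:end].decode("ascii")

r0 = load_word(0x10000244)       # ldr r0, [DAT_10000244]
print(hex(r0))                   # 0x100019cc
print(read_c_string(r0))         # hello, world
```

Note that `r0` only ever holds a 32-bit number. The string is reached by treating that number as an address, which is the indirection the cross-reference in Ghidra makes visible.
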