mirror of
https://github.com/mytechnotalent/Embedded-Hacking.git
synced 2026-04-01 09:00:18 +02:00
Added WEEK01
2
.gitignore
vendored
@@ -41,6 +41,8 @@
*.su
*.idb
*.pdb
*.rep
*.gpr

# Kernel Module Compile Results
*.mod*
121
WEEK01/WEEK01-01.md
Normal file
@@ -0,0 +1,121 @@
|
||||
# Embedded Systems Reverse Engineering
|
||||
[Repository](https://github.com/mytechnotalent/Embedded-Hacking)
|
||||
|
||||
## Week 1: Introduction and Overview of Embedded Reverse Engineering: Ethics, Scoping, and Basic Concepts
|
||||
|
||||
### Exercise 1: Explore in Ghidra
|
||||
|
||||
#### Objective
|
||||
Learn how to navigate Ghidra's Symbol Tree to find and analyze functions, specifically examining the `stdio_init_all` function.
|
||||
|
||||
#### Prerequisites
|
||||
- Ghidra installed and running
|
||||
- `0x0001_hello-world` project already created and imported in Ghidra
|
||||
- The `0x0001_hello-world.elf` file already imported and analyzed
|
||||
|
||||
#### Task Description
|
||||
|
||||
Your goal is to explore the `stdio_init_all` function in Ghidra and understand what it does based on:
|
||||
1. Its decompiled code
|
||||
2. The functions it calls
|
||||
3. The variables it accesses
|
||||
|
||||
#### Step-by-Step Instructions
|
||||
|
||||
##### Step 1: Open Your Ghidra Project
|
||||
|
||||
1. Launch **Ghidra** on your computer
|
||||
2. In the Ghidra Project Manager window, you should see your `0x0001_hello-world` project
|
||||
3. If you don't see it, create a new project or open an existing one
|
||||
4. **Double-click** on the project to open it
|
||||
|
||||
##### Step 2: Access the Symbol Tree
|
||||
|
||||
In the CodeBrowser window that opens:
|
||||
- Look at the left side panel - you should see several tabs
|
||||
- Find and click on the **Symbol Tree** tab (it might be labeled "Symbol Tree" or show a tree icon)
|
||||
- If you don't see it, go to **Window → Symbol Tree** in the menu
|
||||
|
||||
##### Step 3: Expand the Functions List
|
||||
|
||||
1. In the Symbol Tree, look for a folder or section labeled **Functions**
|
||||
2. **Click the arrow/triangle** next to "Functions" to expand it
|
||||
3. This will show you a list of all the functions that Ghidra identified in the binary
|
||||
|
||||
##### Step 4: Find the stdio_init_all Function
|
||||
|
||||
1. In the expanded Functions list, scroll through to find `stdio_init_all`
|
||||
2. **Alternative method**: If the list is long, you can use **Navigation → Go To...** from the menu (or press **G**) and type `stdio_init_all` to jump directly to it
|
||||
3. Once you find it, **click on it** to navigate to that function in the CodeBrowser
|
||||
|
||||
##### Step 5: Examine the Decompiled Code
|
||||
|
||||
Once you've navigated to `stdio_init_all`:
|
||||
- On the **right side** of the window, you should see the **Decompile** view
|
||||
- This shows the C-like code that Ghidra has reconstructed from the assembly
|
||||
- Read through the decompiled code carefully
|
||||
|
||||
##### Step 6: Answer These Questions
|
||||
|
||||
Based on what you see in the decompiled code, answer the following:
|
||||
|
||||
###### Question 1: What does the function return?
|
||||
Look at the return type at the top of the function. Is it `void`, `int`, `bool`, or something else?
|
||||
|
||||
###### Question 2: What parameters does it take?
|
||||
Look at the function signature. Does it take any parameters? (Hint: Look for anything inside the parentheses)
|
||||
|
||||
###### Question 3: What functions does it call?
|
||||
Look for function calls within `stdio_init_all`. What other functions does it call? List them:
|
||||
- Function 1: ________________
|
||||
- Function 2: ________________
|
||||
- Function 3: ________________
|
||||
(There may be more or fewer)
|
||||
|
||||
###### Question 4: What's the purpose?
|
||||
Based on the functions it calls and the overall structure, what do you think `stdio_init_all()` is setting up? Think about what "stdio" stands for:
|
||||
- **std** = Standard
|
||||
- **io** = Input/Output
|
||||
|
||||
What types of I/O might be getting initialized?
|
||||
|
||||
##### Step 7: Explore Called Functions (Optional Challenge)
|
||||
|
||||
If you want to go deeper:
|
||||
|
||||
1. In the Decompile view, **click on one of the functions** that `stdio_init_all` calls
|
||||
2. Ghidra will navigate to that function
|
||||
3. Look at what **that** function does
|
||||
4. Can you build a picture of what's being initialized?
|
||||
|
||||
#### Expected Output
|
||||
|
||||
You should be able to write a brief summary like:
|
||||
|
||||
```
|
||||
stdio_init_all() returns: [your answer]
|
||||
It takes [number] parameters
|
||||
It calls the following functions: [list them]
|
||||
Based on these calls, I believe it initializes: [your analysis]
|
||||
```
|
||||
|
||||
#### Questions for Reflection
|
||||
|
||||
1. Why would we need to initialize standard I/O before using `printf()`?
|
||||
2. Can you find other functions in the Symbol Tree that might be related to I/O?
|
||||
3. How does this function support the `printf("hello, world\r\n")` call in main?
|
||||
|
||||
#### Tips and Hints
|
||||
|
||||
- If you see a function name you don't recognize, you can right-click on it to see more options
|
||||
- The Decompile view is your best friend - it shows you what code is doing in an almost-C format
|
||||
- Don't worry if some variable names are automatic (like `local_4` or `param_1`) - that's normal when symbols aren't available
|
||||
- You can collapse/expand sections in the Decompile view by clicking the arrows next to braces `{}`
|
||||
|
||||
#### Next Steps
|
||||
|
||||
After completing this exercise, you'll have a better understanding of:
|
||||
- How to navigate Ghidra's interface
|
||||
- How to find functions using the Symbol Tree
|
||||
- How to read decompiled code
|
||||
- How initialization functions work in embedded systems
|
||||
162
WEEK01/WEEK01-02.md
Normal file
@@ -0,0 +1,162 @@
|
||||
# Embedded Systems Reverse Engineering
|
||||
[Repository](https://github.com/mytechnotalent/Embedded-Hacking)
|
||||
|
||||
## Week 1: Introduction and Overview of Embedded Reverse Engineering: Ethics, Scoping, and Basic Concepts
|
||||
|
||||
### Exercise 2: Find Strings in Ghidra
|
||||
|
||||
#### Objective
|
||||
Learn how to locate and analyze strings in a binary, understanding where they are stored in memory and how they're used.
|
||||
|
||||
#### Prerequisites
|
||||
- Ghidra installed with `0x0001_hello-world` project open
|
||||
- Basic familiarity with Ghidra's interface (from Exercise 1)
|
||||
- CodeBrowser window open with the binary loaded
|
||||
|
||||
#### Task Description
|
||||
|
||||
In this exercise, you'll find the "hello, world" string in the binary and determine:
|
||||
1. **Where** it's located in memory (its address)
|
||||
2. **How** it's used by the program
|
||||
3. **What** format it's stored in
|
||||
|
||||
#### Step-by-Step Instructions
|
||||
|
||||
##### Step 1: Open the Defined Strings Window
|
||||
|
||||
1. In the CodeBrowser menu, go to **Window** (top menu bar)
|
||||
2. Look for and click on **Defined Strings**
|
||||
3. A new window should appear showing all strings Ghidra found in the binary
|
||||
|
||||
##### Step 2: Understand the Strings Window
|
||||
|
||||
The Defined Strings window shows:
|
||||
- **Address**: The memory location where the string starts
|
||||
- **String**: The actual text content
|
||||
- **Length**: How many bytes the string uses
|
||||
- **Defined**: Whether Ghidra has marked it as data
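To get a feel for where such a list comes from, here is a minimal sketch of a string scan: it walks a block of bytes and reports every run of printable ASCII that ends in a NUL byte. This is not Ghidra's actual analyzer, which is more elaborate. The function name `find_ascii_strings`, the demo bytes, and the base address `0x10001000` are all made up for illustration.

```python
def find_ascii_strings(image: bytes, base: int, min_len: int = 5):
    """Yield (address, text, size_in_bytes) for NUL-terminated printable runs."""
    # Printable ASCII plus tab, line feed, and carriage return
    allowed = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}
    start = None
    for offset, byte in enumerate(image):
        if byte in allowed:
            if start is None:
                start = offset  # a new candidate run begins here
        else:
            # A run only counts as a C string if a NUL byte terminates it
            if start is not None and byte == 0 and offset - start >= min_len:
                text = image[start:offset].decode("ascii")
                yield base + start, text, offset - start + 1  # +1 for the NUL
            start = None


# Hypothetical slice of flash contents; not taken from the real binary
demo = b"\x00\xff\x10hello, world\r\n\x00\x01\x02abc\x00"
found = list(find_ascii_strings(demo, base=0x10001000))
for address, text, size in found:
    print(hex(address), repr(text), size)
```

The short run `abc` is skipped because it is below `min_len`, which is how string scanners avoid reporting random byte sequences as text.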
|
||||
|
||||
##### Step 3: Search for "hello, world"
|
||||
|
||||
1. In the Defined Strings window, look through the list to find `"hello, world"`
|
||||
2. **Search method**: If the window has a search box at the top, you can type to filter. Otherwise, use **Ctrl+F** to open the search function
|
||||
3. Once you find it, **click on it** to highlight the entry
|
||||
|
||||
##### Step 4: Record the Address
|
||||
|
||||
When you find `"hello, world"`, note down:
|
||||
|
||||
**String Address**: ________________
|
||||
|
||||
**Actual String Content**: ________________
|
||||
|
||||
**String Length**: ________________ bytes
|
||||
|
||||
##### Step 5: Double-Click to Navigate
|
||||
|
||||
1. **Double-click** on the `"hello, world"` entry in the Defined Strings window
|
||||
2. Ghidra will automatically navigate you to that address in the CodeBrowser
|
||||
3. You should see the string displayed in the **Listing** view (center panel)
|
||||
|
||||
##### Step 6: Examine the Listing View
|
||||
|
||||
Now that you're at the string's location:
|
||||
|
||||
1. Look at the **Listing view** (center panel) where the string is shown
|
||||
2. You'll see the string in **hex/ASCII** format
|
||||
3. Notice how it appears in memory - each character takes one byte
|
||||
4. Look for the string content: `hello, world\r\n`
|
||||
5. What comes after the string? (Ghidra may show other data nearby)
|
||||
|
||||
##### Step 7: Look at the Cross-References
|
||||
|
||||
To see where this string is **used**:
|
||||
|
||||
1. In the Listing view where the string is displayed, **right-click** on the string
|
||||
2. Select **References** → **Show References to**
|
||||
3. A dialog should appear showing which functions/instructions reference this string
|
||||
4. This tells you which parts of the code use this string
|
||||
|
||||
##### Step 8: Answer These Questions
|
||||
|
||||
Based on what you found:
|
||||
|
||||
###### Question 1: Memory Location
|
||||
- What is the address of the "hello, world" string? __________
|
||||
- Is it in Flash memory (starts with `0x100...`) or RAM (starts with `0x200...`)? __________
|
||||
|
||||
###### Question 2: String Storage
|
||||
- How many bytes does the string take in memory? __________
|
||||
- Can you count the characters? (h-e-l-l-o-,-space-w-o-r-l-d-\r-\n)
|
||||
|
||||
###### Question 3: References
|
||||
- How many times is this string referenced in the code? __________
|
||||
- Which function(s) reference it? (Hint: Look at the cross-references)
|
||||
|
||||
###### Question 4: ASCII Encoding
|
||||
- How is the string encoded in memory?
|
||||
- Is each character one byte or more? __________
|
||||
- What do `\r` and `\n` represent? (Hint: `\r` = carriage return, `\n` = newline)
|
||||
|
||||
## Expected Output
|
||||
|
||||
You should be able to fill in a summary like:
|
||||
|
||||
```
|
||||
String Found: "hello, world\r\n"
|
||||
Address: 0x________
|
||||
Located in: [Flash / RAM]
|
||||
Total Size: ________ bytes
|
||||
Referenced by: [Function names]
|
||||
Used in: [How the program uses it]
|
||||
```
|
||||
|
||||
## Deeper Exploration (Optional Challenge)
|
||||
|
||||
### Challenge 1: Follow the String Usage
|
||||
1. From the cross-references you found, click on the instruction that uses the string
|
||||
2. You should navigate to the `ldr` (load) instruction that loads the string's address into register `r0`
|
||||
3. This is how the `printf` function gets the pointer to the string!
|
||||
|
||||
### Challenge 2: Find Other Strings
|
||||
1. Go back to the Defined Strings window
|
||||
2. Look for other strings in the binary
|
||||
3. Are there any other text strings besides "hello, world"?
|
||||
4. If yes, where are they and what are they used for?
|
||||
|
||||
### Challenge 3: Understand Little-Endian
|
||||
1. When Ghidra shows the string address in the `ldr` instruction, it's showing a number
|
||||
2. Look at the raw bytes of that address value
|
||||
3. Notice how the bytes are stored in "backwards" order? That's little-endian!
|
||||
4. Can you convert the hex bytes to the actual address?
|
||||
|
||||
## Questions for Reflection
|
||||
|
||||
1. **Why is the string stored in Flash memory?** Why not in RAM?
|
||||
2. **How does `printf()` know where to find the string?** (Hint: The address is loaded into `r0`)
|
||||
3. **What would happen if we didn't have the `\r\n` at the end?** How would the output look?
|
||||
4. **Could we modify this string at runtime?** Why or why not?
|
||||
|
||||
## Tips and Hints
|
||||
|
||||
- Strings in compiled binaries are often stored in read-only memory (Flash) to save RAM
|
||||
- The `\r` and `\n` characters are special: they're single bytes (0x0D and 0x0A in hex)
|
||||
- When you see a string in Ghidra's listing, the ASCII representation is shown on the right side
|
||||
- You can scroll left/right in the Listing view to see different representations (hex, ASCII, disassembly)
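If you want to check your byte count away from Ghidra, the short sketch below encodes the string as written in the C source and appends the NUL terminator that every C string literal carries. Treat it as an illustration only: the bytes actually stored in your binary may differ, for example if the compiler rewrites the `printf` call and stores a shortened literal.

```python
text = "hello, world\r\n"
raw = text.encode("ascii") + b"\x00"  # C string literals end with a NUL byte

# One two-digit hex value per byte, the way a hex view would show them
hex_dump = " ".join(f"{b:02X}" for b in raw)
print(hex_dump)
print(len(text), "characters,", len(raw), "bytes including the terminator")
```

Compare the last three values of the dump with the `\r`, `\n`, and terminator bytes you see in the Listing view.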
|
||||
|
||||
## Real-World Application
|
||||
|
||||
Understanding where strings are stored is crucial for:
|
||||
- **Firmware modification**: Finding text messages to modify
|
||||
- **Reverse engineering**: Understanding what a program does by finding its strings
|
||||
- **Vulnerability analysis**: Finding format string bugs or hardcoded credentials
|
||||
- **Localization**: Finding where text needs to be translated
|
||||
|
||||
## Summary
|
||||
|
||||
By completing this exercise, you've learned:
|
||||
1. How to find strings in a binary using Ghidra's Defined Strings window
|
||||
2. How to determine the memory address of a string
|
||||
3. How to follow cross-references to see where strings are used
|
||||
4. How strings are stored in memory and referenced in code
|
||||
5. The relationship between C code (`printf()`) and assembly (`ldr`)
|
||||
203
WEEK01/WEEK01-03.md
Normal file
@@ -0,0 +1,203 @@
|
||||
# Embedded Systems Reverse Engineering
|
||||
[Repository](https://github.com/mytechnotalent/Embedded-Hacking)
|
||||
|
||||
## Week 1: Introduction and Overview of Embedded Reverse Engineering: Ethics, Scoping, and Basic Concepts
|
||||
|
||||
### Exercise 3: Cross-References
|
||||
|
||||
#### Objective
|
||||
Learn how to use Ghidra's cross-reference feature to trace how data flows through code, understanding where specific data is read, written, or referenced.
|
||||
|
||||
#### Prerequisites
|
||||
- Ghidra installed with `0x0001_hello-world` project open
|
||||
- Completed Exercise 2 (Find Strings) - you should know where the "hello, world" string is located
|
||||
- CodeBrowser window open with the binary loaded
|
||||
|
||||
#### Task Description
|
||||
|
||||
In this exercise, you'll:
|
||||
1. Navigate to a specific data reference in the `main` function
|
||||
2. Find where a particular data item (`DAT_...`) is used
|
||||
3. Trace back to see which functions access this data
|
||||
4. Understand how data flows from memory to the CPU and then to functions
|
||||
|
||||
#### Background: What are Cross-References?
|
||||
|
||||
A **cross-reference** is a link between different parts of the code:
|
||||
- **Code → Data**: An instruction reads or writes data
|
||||
- **Code → Code**: A function calls another function
|
||||
- **Data → Data**: One data item references another
|
||||
|
||||
In this exercise, we're tracking **code → data** references to understand where and how the program uses the "hello, world" string.
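One way to picture what a cross-reference database holds is a lookup table from each target address to the places that refer to it. The sketch below builds such a table by hand. The helper name `build_xref_index` is hypothetical, and the addresses are the ones that appear in this week's example output, so your binary may show different values.

```python
from collections import defaultdict

# (from_address, to_address, kind) tuples, entered by hand for illustration
references = [
    (0x10000236, 0x1000156C, "call"),  # main -> stdio_init_all (code -> code)
    (0x1000023A, 0x10000244, "read"),  # main -> DAT_10000244  (code -> data)
]


def build_xref_index(refs):
    """Group references by their target so 'who uses this?' is one lookup."""
    index = defaultdict(list)
    for src, dst, kind in refs:
        index[dst].append((src, kind))
    return index


xrefs_to = build_xref_index(references)
for src, kind in xrefs_to[0x10000244]:
    print(f"DAT_10000244 <- {src:08x} ({kind})")
```

Asking "Show References to" in Ghidra is conceptually the same lookup, just over every reference the analysis discovered.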
|
||||
|
||||
#### Step-by-Step Instructions
|
||||
|
||||
##### Step 1: Navigate to the main Function
|
||||
|
||||
1. In Ghidra's CodeBrowser, use **Navigation → Go To...** (or press **G**)
|
||||
2. Type `main` and press Enter
|
||||
3. Ghidra will navigate to the `main` function
|
||||
4. You should see the disassembly in the Listing view (center panel)
|
||||
|
||||
##### Step 2: Locate the `ldr` Instruction
|
||||
|
||||
In the main function's disassembly, look for an `ldr` (load register) instruction. It should look something like:
|
||||
|
||||
```
|
||||
ldr r0, [DAT_10000244]
|
||||
```
|
||||
|
||||
or similar. This instruction:
|
||||
- **`ldr`** = load register (read data from memory)
|
||||
- **`r0`** = put the data into register `r0`
|
||||
- **`[DAT_10000244]`** = read the 32-bit value stored at location `DAT_10000244` (that value is the string's address)
|
||||
|
||||
##### Step 3: Understand the Notation
|
||||
|
||||
In Ghidra's Listing (disassembly) notation:
- **`DAT_10000244`** = a data item (not code) at address `0x10000244`
- **`[...]`** = the contents of memory at that location (a dereference, not the address itself)
- The value stored there is the address of the "hello, world" string in Flash memory
|
||||
|
||||
##### Step 4: Right-Click on the Data Reference
|
||||
|
||||
1. In the Listing view, find the `ldr` instruction that loads the string address
|
||||
2. **Right-click** on the `DAT_...` part (the data reference)
|
||||
3. A context menu should appear
|
||||
|
||||
##### Step 5: Select "References" Option
|
||||
|
||||
In the context menu:
|
||||
1. Look for an option that says **References**
|
||||
2. Click on it to see a submenu
|
||||
3. Select **Show References to** (this shows "where is this data used?")
|
||||
|
||||
##### Step 6: Review the References Window
|
||||
|
||||
A new window should appear showing all the locations where `DAT_10000244` (or whatever the address is) is referenced:
|
||||
|
||||
**Expected output might look like:**
|
||||
```
DAT_10000244 (1 xref):
main:1000023a (read)
```

This means:
- The data at `DAT_10000244` is used in 1 place
- That place is in the `main` function, at the `ldr` instruction at address `1000023a`
|
||||
- It's a **read** operation (the code is reading this data)
|
||||
|
||||
##### Step 7: Answer These Questions
|
||||
|
||||
###### Question 1: Data Address
|
||||
- What is the address of the data reference you found? (e.g., `DAT_10000244`)
|
||||
- __________
|
||||
|
||||
###### Question 2: Referenced By
|
||||
- How many places reference this data?
|
||||
- __________
|
||||
- Which function(s) use it?
|
||||
- __________
|
||||
|
||||
###### Question 3: Reference Type
|
||||
- Is it a read or write operation?
|
||||
- __________
|
||||
- Why? (What's the program doing with this data?)
|
||||
- __________
|
||||
|
||||
###### Question 4: The Chain
|
||||
- The `ldr` instruction loads an address into `r0`
|
||||
- What happens next? (Hint: Look at the next instruction after the `ldr`)
|
||||
- __________
|
||||
- Is there a function call? If so, which one?
|
||||
- __________
|
||||
|
||||
###### Question 5: Understanding the Flow
|
||||
- **`DAT_10000244`** contains the address of the "hello, world" string
|
||||
- The `ldr` loads that address into `r0`
|
||||
- Then a function (probably `printf` or `puts`) is called with `r0` as the argument
|
||||
- Can you trace this complete flow?
|
||||
|
||||
## Deeper Analysis (Optional Challenge)
|
||||
|
||||
### Challenge 1: Find the Actual String Address
|
||||
1. Navigate to the `DAT_10000244` location
|
||||
2. Look at the value stored there
|
||||
3. Can you decode the hex bytes and find the actual address of "hello, world"?
|
||||
4. Hint: The RP2350 uses little-endian encoding, so the bytes are "backwards"
|
||||
|
||||
**Example:**
|
||||
If you see bytes: `CC 19 00 10`
|
||||
Read backwards: `10 00 19 CC` = `0x100019CC`
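The same conversion can be checked with a few lines of Python. This is just a sketch using the example bytes above; substitute the bytes you actually see in your own binary.

```python
import struct

raw = bytes.fromhex("CC 19 00 10")      # bytes in the order they sit in memory
(address,) = struct.unpack("<I", raw)   # "<I" = little-endian unsigned 32-bit
print(hex(address))

# Going the other way reproduces the original byte order
assert struct.pack("<I", address) == raw
```

The `<` in the format string is what tells `struct` to treat the first byte as the least significant one.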
|
||||
|
||||
### Challenge 2: Understand the Indirection
|
||||
1. In C, if we want to hold the address of a string, we write: `const char *ptr = some_string;`
|
||||
2. Then to use it: `printf(ptr);`
|
||||
3. In assembly, this becomes:
|
||||
- Load the pointer: `ldr r0, [DAT_...]`
|
||||
- Call the function: `bl printf`
|
||||
4. Can you see this pattern in the assembly?
|
||||
|
||||
### Challenge 3: Follow Multiple References
|
||||
1. Try this with different data items in the binary
|
||||
2. Find a data reference that has **multiple** cross-references
|
||||
3. What data is used in more than one place?
|
||||
|
||||
## Questions for Reflection
|
||||
|
||||
1. **Why does the code need to load an address from memory?**
|
||||
- Why can't it just use the address directly?
|
||||
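   - Hint: A 16-bit Thumb instruction has no room for a full 32-bit address, so the compiler stores the address in a nearby literal pool and loads it PC-relative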
|
||||
|
||||
2. **What's the relationship between `DAT_10000244` and the "hello, world" string?**
|
||||
- They're at different addresses - why?
|
||||
- Which is in Flash and which points to where it's stored?
|
||||
|
||||
3. **If we wanted to change what gets printed, where would we modify the code?**
|
||||
- Could we just change the string at address `0x100019CC`?
|
||||
- Or would we need to change `DAT_10000244`?
|
||||
- Or both?
|
||||
|
||||
4. **How does this relate to memory layout?**
|
||||
- Code section (Flash memory starting at `0x10000000`)
|
||||
- Data section (constants/strings)
|
||||
- Is everything at different addresses for a reason?
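When thinking about the memory layout question, a small helper that sorts addresses into regions can be handy. The sketch below is deliberately coarse: the ranges are assumptions based only on the `0x100...` and `0x200...` prefixes used in these exercises, not the chip's exact memory map, and the name `region_of` is made up for this example.

```python
# Coarse, assumed address windows (name, first address, one past the last)
REGIONS = [
    ("Flash (XIP)", 0x10000000, 0x20000000),
    ("RAM (SRAM)", 0x20000000, 0x30000000),
]


def region_of(address):
    """Return the name of the window an address falls into, or 'other'."""
    for name, start, end in REGIONS:
        if start <= address < end:
            return name
    return "other"


for value in (0x10000244, 0x100019CC, 0x20082000, 0xE000ED08):
    print(f"{value:#010x} -> {region_of(value)}")
```

Both the pointer at `DAT_10000244` and the string it points to land in the Flash window, which is worth keeping in mind for question 2.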
|
||||
|
||||
## Tips and Hints
|
||||
|
||||
- If you right-click and don't see "References", try right-clicking directly on the instruction address instead
|
||||
- You can also use **Search → For Cross References** from the menu for a more advanced search
|
||||
- In the Decompile view (right side), cross-references may be shown in a different format or with different colors
|
||||
- Multi-level references: You can right-click on a data item and then follow the chain to another data item
|
||||
|
||||
## Real-World Applications
|
||||
|
||||
Understanding cross-references is crucial for:
|
||||
- **Vulnerability hunting**: Finding where user input flows through the code
|
||||
- **Firmware patching**: Changing constants, strings, or data values
|
||||
- **Malware analysis**: Tracking command-and-control server addresses or encryption keys
|
||||
- **Reverse engineering**: Understanding program logic by following data dependencies
|
||||
|
||||
## Summary
|
||||
|
||||
By completing this exercise, you've learned:
|
||||
1. How to find and interpret cross-references in Ghidra
|
||||
2. How to trace data from its definition to where it's used
|
||||
3. How the `ldr` (load) instruction works to pass data to functions
|
||||
4. The relationship between high-level C code and assembly-level data flow
|
||||
5. How addresses are loaded indirectly through a PC-relative literal pool
|
||||
|
||||
## Expected Final Understanding
|
||||
|
||||
You should now understand this flow:
|
||||
```
|
||||
String "hello, world" is stored at address 0x100019CC in Flash
|
||||
↓
|
||||
A pointer to this address is stored at DAT_10000244 in Flash
|
||||
↓
|
||||
The main() function loads this pointer: ldr r0, [DAT_10000244]
|
||||
↓
|
||||
main() calls printf with r0 (the string address) as the argument
|
||||
↓
|
||||
printf() reads the bytes at that address and prints them
|
||||
```
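The flow above can also be modeled in a few lines of Python. This is a toy simulation, not how the hardware or `printf` is implemented: flash is a plain dictionary, `load_word` stands in for `ldr`, and `read_c_string` stands in for the part of the print routine that walks the bytes. The two addresses are the example values used in this exercise.

```python
STRING_ADDR = 0x100019CC   # where the text lives (example value)
POINTER_ADDR = 0x10000244  # where the pointer to the text lives (example value)

flash = {}  # toy memory: address -> byte value


def store(address, data):
    """Copy bytes into the toy flash, one address per byte."""
    for i, byte in enumerate(data):
        flash[address + i] = byte


store(STRING_ADDR, b"hello, world\r\n\x00")
store(POINTER_ADDR, STRING_ADDR.to_bytes(4, "little"))


def load_word(address):
    """Model of `ldr`: read four bytes and assemble them little-endian."""
    return int.from_bytes(bytes(flash[address + i] for i in range(4)), "little")


def read_c_string(address):
    """Read bytes until the NUL terminator, as a print routine would."""
    out = bytearray()
    while flash[address] != 0:
        out.append(flash[address])
        address += 1
    return out.decode("ascii")


r0 = load_word(POINTER_ADDR)   # step 3: main loads the pointer
message = read_c_string(r0)    # step 5: the callee follows the pointer
print(hex(r0), repr(message))
```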
|
||||
370
WEEK01/WEEK01-04.md
Normal file
@@ -0,0 +1,370 @@
|
||||
# Embedded Systems Reverse Engineering
|
||||
[Repository](https://github.com/mytechnotalent/Embedded-Hacking)
|
||||
|
||||
## Week 1: Introduction and Overview of Embedded Reverse Engineering: Ethics, Scoping, and Basic Concepts
|
||||
|
||||
### Exercise 4: Connect GDB (Preparation for Week 2)
|
||||
|
||||
#### Objective
|
||||
Set up GDB (GNU Debugger) to dynamically analyze the "hello, world" program running on your Pico 2, verifying that your debugging setup works correctly.
|
||||
|
||||
#### Prerequisites
|
||||
- Raspberry Pi Pico 2 with "hello-world" binary already flashed
|
||||
- OpenOCD installed and working
|
||||
- GDB (arm-none-eabi-gdb) installed
|
||||
- Your Pico 2 connected to your computer via a USB CMSIS-DAP debug probe
|
||||
- CMake build artifacts available (`.elf` file from compilation)
|
||||
|
||||
#### Task Description
|
||||
|
||||
In this exercise, you'll:
|
||||
1. Start OpenOCD to provide a debug server
|
||||
2. Connect GDB to the Pico 2 via OpenOCD
|
||||
3. Set a breakpoint at the main function
|
||||
4. Examine registers and memory while the program is running
|
||||
5. Verify that your dynamic debugging setup works
|
||||
|
||||
#### Important Setup Notes
|
||||
|
||||
Before you start, make sure:
|
||||
- Your Pico 2 is **powered on** and connected to your computer
|
||||
- You have **OpenOCD** installed for ARM debugging
|
||||
- You have **GDB** (specifically `arm-none-eabi-gdb`) installed
|
||||
- Your binary file (`0x0001_hello-world.elf`) is available in the `build/` directory
|
||||
|
||||
#### Step-by-Step Instructions
|
||||
|
||||
##### Step 1: Start OpenOCD in Terminal 1
|
||||
|
||||
Open a **new terminal window** (PowerShell, Command Prompt, or WSL):
|
||||
|
||||
**On Windows (Command Prompt; in PowerShell, replace each trailing `^` with a backtick):**
|
||||
```
|
||||
openocd ^
|
||||
-s "C:\Users\flare-vm\.pico-sdk\openocd\0.12.0+dev\scripts" ^
|
||||
-f interface/cmsis-dap.cfg ^
|
||||
-f target/rp2350.cfg ^
|
||||
-c "adapter speed 5000"
|
||||
```
|
||||
|
||||
**Expected Output:**
|
||||
```
|
||||
Open On-Chip Debugger 0.12.0+dev
|
||||
...
|
||||
Info : CMSIS-DAP: SWD detected
|
||||
Info : RP2350 (dual core) detected
|
||||
Info : Using JTAG interface
|
||||
...
|
||||
Info : accepting 'gdb' connection on tcp/3333
|
||||
```

The exact lines vary with your OpenOCD version and debug probe. The final `accepting 'gdb' connection` line only appears once GDB connects in Step 3; until then, OpenOCD simply waits for a connection on port 3333. Leave this terminal open.
|
||||
|
||||
##### Step 2: Start GDB in Terminal 2
|
||||
|
||||
Open a **second terminal window** and navigate to your project directory:
|
||||
|
||||
```
|
||||
arm-none-eabi-gdb -q build/0x0001_hello-world.elf
|
||||
```
|
||||
|
||||
**Expected Output:**
|
||||
```
|
||||
Reading symbols from build/0x0001_hello-world.elf...
|
||||
(gdb)
|
||||
```
|
||||
|
||||
##### Step 3: Connect GDB to OpenOCD
|
||||
|
||||
At the GDB prompt `(gdb)`, type:
|
||||
|
||||
```gdb
|
||||
target extended-remote localhost:3333
|
||||
```
|
||||
|
||||
**Expected Output:**
|
||||
```
|
||||
Remote debugging using localhost:3333
|
||||
(gdb)
|
||||
```
|
||||
|
||||
(If GDB prints a warning at this point, that is normal - you already loaded the .elf file, so it doesn't matter)
|
||||
|
||||
##### Step 4: Reset and Halt the Target
|
||||
|
||||
To reset the Pico 2 and prepare for debugging, type:
|
||||
|
||||
```gdb
|
||||
monitor reset halt
|
||||
```
|
||||
|
||||
**Expected Output:**
|
||||
```
|
||||
(gdb)
|
||||
```
|
||||
|
||||
(This resets the processor and halts it, preventing execution until you tell it to run)
|
||||
|
||||
##### Step 5: Set a Breakpoint at main
|
||||
|
||||
To stop execution at the beginning of the `main` function:
|
||||
|
||||
```gdb
|
||||
b main
|
||||
```
|
||||
|
||||
**Expected Output:**
|
||||
```
|
||||
Breakpoint 1 at 0x10000234: file ../0x0001_hello-world.c, line 4.
|
||||
(gdb)
|
||||
```
|
||||
|
||||
**What this means:**
|
||||
- Breakpoint 1 is set at address `0x10000234`
|
||||
- That's in the file `../0x0001_hello-world.c` at line 4
|
||||
- The breakpoint is at the `main` function
|
||||
|
||||
##### Step 6: Continue Execution to the Breakpoint
|
||||
|
||||
Now let the program run until it hits your breakpoint:
|
||||
|
||||
```gdb
|
||||
c
|
||||
```
|
||||
|
||||
**Expected Output:**
|
||||
```
|
||||
Continuing.
|
||||
|
||||
Breakpoint 1, main () at ../0x0001_hello-world.c:4
|
||||
4 stdio_init_all();
|
||||
(gdb)
|
||||
```
|
||||
|
||||
**Great!** Your program is now halted at the beginning of `main()`.
|
||||
|
||||
##### Step 7: Examine the Assembly with `disas`
|
||||
|
||||
To see the assembly language of the current function:
|
||||
|
||||
```gdb
|
||||
disas
|
||||
```
|
||||
|
||||
**Expected Output:**
|
||||
```
|
||||
Dump of assembler code for function main:
|
||||
=> 0x10000234 <+0>: push {r3, lr}
|
||||
0x10000236 <+2>: bl 0x1000156c <stdio_init_all>
|
||||
0x1000023a <+6>: ldr r0, [pc, #8] @ (0x10000244 <main+16>)
|
||||
0x1000023c <+8>: bl 0x100015fc <__wrap_puts>
|
||||
0x10000240 <+12>: b.n 0x1000023a <main+6>
|
||||
0x10000242 <+14>: nop
|
||||
0x10000244 <+16>: adds r4, r1, r7
|
||||
0x10000246 <+18>: asrs r0, r0, #32
|
||||
End of assembler dump.
|
||||
(gdb)
|
||||
```
|
||||
|
||||
**Interpretation:**
|
||||
- The `=>` arrow shows where we're currently stopped (at `0x10000234`)
|
||||
- We can see the `push`, `bl` (branch and link), `ldr`, and `b.n` (branch) instructions
|
||||
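- This is the exact code you analyzed in the Ghidra exercises!
- The last two lines (`adds` and `asrs` at `0x10000244`) are not real instructions; GDB is disassembling the 4-byte literal that the `ldr` reads, which holds the string's address, as if it were code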
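
To see how `ldr r0, [pc, #8]` ends up pointing at `0x10000244`, you can redo the address arithmetic by hand. The sketch below assumes the usual Thumb rule that the PC reads as the instruction's address plus 4, rounded down to a multiple of 4. The function name `thumb_literal_address` is made up, and the two halfword values are encodings worked out by hand for the `adds` and `asrs` lines, so treat them as illustrative.

```python
def thumb_literal_address(instruction_address, offset):
    """Address read by a Thumb `ldr rX, [pc, #offset]` instruction."""
    pc = instruction_address + 4   # in Thumb state, PC reads 4 bytes ahead
    return (pc & ~0x3) + offset    # align down to 4, then add the offset


literal_address = thumb_literal_address(0x1000023A, 8)
print(hex(literal_address))

# The "instructions" GDB printed at that address are really two 16-bit
# halves of one 32-bit value, stored little-endian (low half first).
low_half = 0x19CC    # shown by GDB as: adds r4, r1, r7
high_half = 0x1000   # shown by GDB as: asrs r0, r0, #32
literal_value = (high_half << 16) | low_half
print(hex(literal_value))
```

The first result should match the address in the `@ (0x10000244 <main+16>)` comment GDB prints next to the `ldr`.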
|
||||
|
||||
##### Step 8: View All Registers with `i r`
|
||||
|
||||
To see the current state of all CPU registers:
|
||||
|
||||
```gdb
|
||||
i r
|
||||
```
|
||||
|
||||
**Expected Output:**
|
||||
```
|
||||
r0 0x0 0
|
||||
r1 0x10000235 268436021
|
||||
r2 0x80808080 -2139062144
|
||||
r3 0xe000ed08 -536810232
|
||||
r4 0x100001d0 268435920
|
||||
r5 0x88526891 -2007865199
|
||||
r6 0x4f54710 83183376
|
||||
r7 0x400e0014 1074659348
|
||||
r8 0x43280035 1126694965
|
||||
r9 0x0 0
|
||||
r10 0x10000000 268435456
|
||||
r11 0x62707361 1651536737
|
||||
r12 0xed07f600 -318245376
|
||||
sp 0x20082000 0x20082000
|
||||
lr 0x1000018f 268435855
|
||||
pc 0x10000234 0x10000234 <main>
|
||||
xpsr 0x69000000 1761607680
|
||||
```
|
||||
|
||||
**Key Registers to Understand:**
|
||||
| Register | Value        | Meaning                                                                  |
| -------- | ------------ | ------------------------------------------------------------------------ |
| `pc`     | `0x10000234` | Program Counter - we're at the start of `main`                           |
| `sp`     | `0x20082000` | Stack Pointer - top of our stack in RAM                                  |
| `lr`     | `0x1000018f` | Link Register - where `main` returns to (bit 0 is set to mark Thumb code) |
| `r0-r3`  | Various      | Will hold function arguments and return values                           |
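
Two of the values in the register dump are easier to read once you pick them apart. The sketch below clears the Thumb bit from `lr` to get the real return address, and pulls the condition flags and the Thumb state bit out of `xpsr`. The helper names are made up for this example, and the register values are the ones from the expected output above; yours may differ.

```python
def strip_thumb_bit(address):
    """Bit 0 of a code pointer marks Thumb state; clear it to get the address."""
    return address & ~1


def decode_xpsr(value):
    """Extract the condition flags (bits 31-28) and the Thumb bit (bit 24)."""
    return {
        "N": (value >> 31) & 1,  # negative
        "Z": (value >> 30) & 1,  # zero
        "C": (value >> 29) & 1,  # carry
        "V": (value >> 28) & 1,  # overflow
        "T": (value >> 24) & 1,  # Thumb state
    }


return_address = strip_thumb_bit(0x1000018F)
flags = decode_xpsr(0x69000000)
print(hex(return_address), flags)
```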
|
||||
|
||||
##### Step 9: Step Into the First Instruction
|
||||
|
||||
To execute one assembly instruction:
|
||||
|
||||
```gdb
|
||||
si
|
||||
```
|
||||
|
||||
**Expected Output:**
|
||||
```
|
||||
0x10000236 in main () at ../0x0001_hello-world.c:5
|
||||
5 stdio_init_all();
|
||||
(gdb)
|
||||
```
|
||||
|
||||
The `pc` should now be at `0x10000236`, which is the next instruction.
|
||||
|
||||
##### Step 10: Answer These Questions
|
||||
|
||||
Based on what you've observed:
|
||||
|
||||
###### Question 1: GDB Connection
|
||||
- Was GDB able to connect to OpenOCD? (Yes/No)
|
||||
- Did the program stop at your breakpoint? (Yes/No)
|
||||
- __________
|
||||
|
||||
###### Question 2: Breakpoint Address
|
||||
- What is the memory address of the `main` function's first instruction?
|
||||
- __________
|
||||
- Is this in Flash memory (0x100...) or RAM (0x200...)?
|
||||
- __________
|
||||
|
||||
###### Question 3: Stack Pointer
|
||||
- What is the value of the Stack Pointer (sp) when you're at `main`?
|
||||
- __________
|
||||
- Is this in Flash or RAM?
|
||||
- __________
|
||||
|
||||
###### Question 4: First Instruction
|
||||
- What is the first instruction in `main`?
|
||||
- __________
|
||||
- What does it do? (Hint: `push` = save to stack)
|
||||
- __________
|
||||
|
||||
###### Question 5: Disassembly Comparison
|
||||
- Look at the disassembly from GDB (Step 7)
|
||||
- Compare it to the disassembly of `main` from Ghidra (Exercise 3)
|
||||
- Are they the same?
|
||||
- __________
|
||||
|
||||
## Deeper Exploration (Optional Challenge)
|
||||
|
||||
### Challenge 1: Step Through stdio_init_all
|
||||
1. Continue stepping: `si` (step into) or `ni` (next instruction)
|
||||
2. Eventually, you'll reach `bl 0x1000156c <stdio_init_all>`
|
||||
3. Use `si` to step **into** that function
|
||||
4. What instructions do you see?
|
||||
5. What registers are being modified?
|
||||
|
||||
### Challenge 2: View Specific Registers
|
||||
Instead of viewing all registers, you can view just a few:
|
||||
```gdb
|
||||
i r pc sp lr r0 r1 r2
|
||||
```
|
||||
This shows only the registers you care about.
|
||||
|
||||
### Challenge 3: Examine Memory
|
||||
To examine memory at a specific address (e.g., where the string is):
|
||||
```gdb
|
||||
x/16b 0x100019cc
|
||||
```
|
||||
This displays 16 bytes (`b` = byte) starting at address `0x100019cc`. To print the same memory as text instead, use `x/s 0x100019cc`. Can you see the "hello, world" string?
|
||||
|
||||
### Challenge 4: Set a Conditional Breakpoint
|
||||
Set a breakpoint that only triggers when a condition is true:
|
||||
```gdb
|
||||
b *0x1000023a if $r0 != 0
|
||||
```
|
||||
This is useful when you want to break on a condition rather than every time.
|
||||
|
||||
## Questions for Reflection
|
||||
|
||||
1. **Why does GDB show both the C source line AND the assembly?**
|
||||
- This is because the .elf file contains debug symbols
|
||||
- What would happen if we used a stripped binary?
|
||||
|
||||
2. **How does GDB know the assembly for each instruction?**
|
||||
- It disassembles the binary on-the-fly based on the architecture
|
||||
|
||||
3. **Why is the Stack Pointer so high (0x20082000)?**
|
||||
- It's at the top of RAM and grows downward
|
||||
- Can you calculate how much RAM this Pico 2 has?
|
||||
|
||||
4. **What's the difference between `si` (step into) and `ni` (next instruction)?**
|
||||
- `si` steps into function calls
|
||||
- `ni` executes entire functions without stopping inside them
|
||||
|
||||
## Important GDB Commands Reference
|
||||
|
||||
| Command | Short Form | What It Does |
|
||||
| ---------------------- | ---------- | ------------------------------------ |
|
||||
| `target extended-remote localhost:3333` | | Connect to OpenOCD |
|
||||
| `monitor reset halt` | | Reset and halt the processor |
|
||||
| `break main` | `b main` | Set a breakpoint at main function |
|
||||
| `continue` | `c` | Continue until breakpoint |
|
||||
| `stepi` | `si` | Step one instruction (into calls) |
|
||||
| `nexti` | `ni` | Step one instruction (over calls) |
|
||||
| `disassemble` | `disas` | Show assembly for current function |
|
||||
| `info registers` | `i r` | Show all register values |
|
||||
| `x/NFU ADDRESS` | `x` | Examine memory (N=count, F=format, U=unit size) |
|
||||
| `quit` | `q` | Exit GDB |
|
||||
|
||||
**Examples for `x` command:**
|
||||
- `x/10i $pc` - examine 10 instructions at program counter
|
||||
- `x/16b 0x20000000` - examine 16 bytes starting at RAM address
|
||||
- `x/4w 0x10000000` - examine 4 words (4-byte values) starting at Flash address
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Problem: "OpenOCD not found"
|
||||
**Solution:** Make sure OpenOCD is in your PATH or use the full path to the executable
|
||||
|
||||
### Problem: "Target not responding"
|
||||
**Solution:**
|
||||
- Check that your Pico 2 is properly connected
|
||||
- Make sure OpenOCD is running and shows "accepting 'gdb' connection"
|
||||
- Restart both OpenOCD and GDB
|
||||
|
||||
### Problem: "Cannot find breakpoint at main"
|
||||
**Solution:**
|
||||
- Make sure you compiled with debug symbols
|
||||
- The .elf file must include symbol information
|
||||
- Try breaking at an address instead: `b *0x10000234`
|
||||
|
||||
### Problem: GDB shows "No source available"
|
||||
**Solution:**
|
||||
- This happens with stripped binaries, or when GDB cannot find the source file on disk
|
||||
- You can still see assembly with `disas`
|
||||
- You can still examine memory and registers
|
||||
|
||||
## Summary
|
||||
|
||||
By completing this exercise, you've:
|
||||
1. ✅ Set up OpenOCD as a debug server
|
||||
2. ✅ Connected GDB to a Pico 2 board
|
||||
3. ✅ Set a breakpoint and halted execution
|
||||
4. ✅ Examined assembly language in a live debugger
|
||||
5. ✅ Viewed CPU registers and their values
|
||||
6. ✅ Verified your dynamic debugging setup works
|
||||
|
||||
You're now ready for Week 2, where you'll:
|
||||
- Step through code line by line
|
||||
- Watch variables and memory change
|
||||
- Understand program flow in detail
|
||||
- Use this knowledge to modify running code
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. **Close GDB**: Type `quit` or `q` to exit
|
||||
2. **Close OpenOCD**: Type `Ctrl+C` in the OpenOCD terminal
|
||||
3. **Review**: Go back to the Ghidra exercises and compare static vs. dynamic analysis
|
||||
4. **Prepare**: Read through Week 2 materials to understand what's coming next
|
||||
622
WEEK01/WEEK01.md
Normal file
@@ -0,0 +1,622 @@
|
||||
# Embedded Systems Reverse Engineering
|
||||
[Repository](https://github.com/mytechnotalent/Embedded-Hacking)
|
||||
|
||||
## Week 1: Introduction and Overview of Embedded Reverse Engineering: Ethics, Scoping, and Basic Concepts
|
||||
|
||||
### 🎯 What You'll Learn This Week
|
||||
|
||||
By the end of this week, you will be able to:
|
||||
- Understand what a microcontroller is and how it works
|
||||
- Know the basic registers of the ARM Cortex-M33 processor
|
||||
- Understand memory layout (Flash vs RAM) and why it matters
|
||||
- Understand how the stack works in embedded systems
|
||||
- Set up and connect GDB to your Pico 2 for debugging
|
||||
- Use Ghidra for static analysis of your binary
|
||||
- Read basic ARM assembly instructions and understand what they do
|
||||
|
||||
---
|
||||
|
||||
### 📚 Part 1: Understanding the Basics
|
||||
|
||||
#### What is a Microcontroller?
|
||||
|
||||
Think of a microcontroller as a tiny computer on a single chip. Just like your laptop has a processor, memory, and storage, a microcontroller has all of these packed into one small chip. The **RP2350** is the microcontroller chip that powers the **Raspberry Pi Pico 2**.
|
||||
|
||||
#### What is the ARM Cortex-M33?
|
||||
|
||||
The RP2350 has several "brains" inside it - we call these **cores**. It contains two ARM Cortex-M33 cores and two Hazard3 RISC-V cores, and the chip selects at boot which architecture runs (two cores are active at a time). In this course, we'll focus on the **ARM Cortex-M33** cores because ARM is more commonly used in the industry.
|
||||
|
||||
#### What is Reverse Engineering?
|
||||
|
||||
Reverse engineering is like being a detective for code. Instead of writing code and compiling it, we take compiled code (the 1s and 0s that the computer actually runs) and figure out what it does. This is useful for:
|
||||
- Understanding how things work
|
||||
- Finding bugs or security issues
|
||||
- Learning how software interacts with hardware
|
||||
|
||||
---
|
||||
|
||||
### 📚 Part 2: Understanding Processor Registers
|
||||
|
||||
#### What is a Register?
|
||||
|
||||
A **register** is like a tiny, super-fast storage box inside the processor. The processor uses registers to hold numbers while it's doing calculations. Think of them like the short-term memory your brain uses when doing math in your head.
|
||||
|
||||
#### The ARM Cortex-M33 Registers
|
||||
|
||||
The ARM Cortex-M33 has several important registers:
|
||||
|
||||
| Register | Also Called | Purpose |
|
||||
| ------------ | -------------------- | ------------------------------------------- |
|
||||
| `r0` - `r12` | General Purpose | Store numbers, pass data between functions |
|
||||
| `r13` | SP (Stack Pointer) | Keeps track of where we are in the stack |
|
||||
| `r14` | LR (Link Register) | Remembers where to go back after a function |
|
||||
| `r15` | PC (Program Counter) | Points to the next instruction to run |
|
||||
|
||||
##### General Purpose Registers (`r0` - `r12`)
|
||||
|
||||
These 13 registers are your "scratch paper." When the processor needs to add two numbers, subtract, or do any calculation, it uses these registers to hold the values.
|
||||
|
||||
**Example:** If you want to add 5 + 3:
|
||||
1. Put 5 in `r0`
|
||||
2. Put 3 in `r1`
|
||||
3. Add them and store the result (8) in `r2`
|
||||
|
||||
##### The Stack Pointer (`r13` / SP)
|
||||
|
||||
The **stack** is a special area of memory that works like a stack of plates:
|
||||
- When you add something, you put it on top (called a **PUSH**)
|
||||
- When you remove something, you take it from the top (called a **POP**)
|
||||
|
||||
The Stack Pointer always points to the top of this stack. On ARM systems, the stack **grows downward** in memory. This means when you push something onto the stack, the address number gets smaller!
|
||||
|
||||
```
|
||||
Higher Memory Address (0x20082000)
|
||||
┌──────────────────┐
|
||||
│ │ ← Stack starts here (empty)
|
||||
├──────────────────┤
|
||||
│ Pushed Item 1 │ ← SP points here after 1 push
|
||||
├──────────────────┤
|
||||
│ Pushed Item 2 │ ← SP points here after 2 pushes
|
||||
└──────────────────┘
|
||||
Lower Memory Address (0x20081FF8)
|
||||
```
|
||||
|
||||
##### The Link Register (`r14` / LR)
|
||||
|
||||
When you call a function, the processor needs to remember where to come back to. The Link Register stores this "return address."
|
||||
|
||||
**Example:**
|
||||
```
|
||||
main() calls print_hello()
|
||||
↓
|
||||
LR = address right after the call in main()
|
||||
↓
|
||||
print_hello() runs
|
||||
↓
|
||||
print_hello() finishes, looks at LR
|
||||
↓
|
||||
Jumps back to main() at the address stored in LR
|
||||
```
|
||||
|
||||
##### The Program Counter (`r15` / PC)
|
||||
|
||||
The Program Counter always points to the **next instruction** the processor will execute. It's like your finger following along as you read a book - it always points to where you are.
|
||||
|
||||
---
|
||||
|
||||
### 📚 Part 3: Understanding Memory Layout
|
||||
|
||||
#### XIP - Execute In Place
|
||||
|
||||
The RP2350 uses something called **XIP (Execute In Place)**. This means the processor can run code directly from the flash memory (where your program is stored) without copying it to RAM first.
|
||||
|
||||
**Key Memory Address:** `0x10000000`
|
||||
|
||||
This is where your program code starts in flash memory. Remember this address - we'll use it a lot!
|
||||
|
||||
#### Memory Map Overview
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────┐
|
||||
│ Flash Memory (XIP) │
|
||||
│ Starts at: 0x10000000 │
|
||||
│ Contains: Your program code │
|
||||
├─────────────────────────────────────┤
|
||||
│ RAM │
|
||||
│ Starts at: 0x20000000 │
|
||||
│ Contains: Stack, Heap, Variables │
|
||||
└─────────────────────────────────────┘
|
||||
```
|
||||
|
||||
#### Stack vs Heap
|
||||
|
||||
| Stack | Heap |
|
||||
| ---------------------------------------- | ---------------------------------- |
|
||||
| Automatic memory management | Manual memory management |
|
||||
| Fast | Slower |
|
||||
| Limited size | More flexible size |
|
||||
| Used for function calls, local variables | Used for dynamic memory allocation |
|
||||
| Grows downward | Grows upward |
|
||||
|
||||
---
|
||||
|
||||
### 📚 Part 3.5: Reviewing Our Hello World Code
|
||||
|
||||
Before we start debugging, let's understand the code we'll be working with. Here's our `0x0001_hello-world.c` program:
|
||||
|
||||
```c
|
||||
#include <stdio.h>
|
||||
#include "pico/stdlib.h"
|
||||
|
||||
int main(void) {
|
||||
stdio_init_all();
|
||||
|
||||
while (true)
|
||||
printf("hello, world\r\n");
|
||||
}
|
||||
```
|
||||
|
||||
#### Breaking Down the Code
|
||||
|
||||
##### The Includes
|
||||
|
||||
```c
|
||||
#include <stdio.h>
|
||||
#include "pico/stdlib.h"
|
||||
```
|
||||
|
||||
- **`<stdio.h>`** - This is the standard input/output library. It gives us access to the `printf()` function that lets us print text.
|
||||
- **`"pico/stdlib.h"`** - This is the Pico SDK's standard library. It provides essential functions for working with the Raspberry Pi Pico hardware.
|
||||
|
||||
##### The Main Function
|
||||
|
||||
```c
|
||||
int main(void) {
|
||||
```
|
||||
|
||||
Every C program starts running from the `main()` function. The `void` means it takes no arguments, and `int` means it returns an integer (though our program never actually returns).
|
||||
|
||||
##### Initializing Standard I/O
|
||||
|
||||
```c
|
||||
stdio_init_all();
|
||||
```
|
||||
|
||||
This function initializes all the standard I/O (input/output) for the Pico. It sets up:
|
||||
- **USB CDC** (so you can see output when connected to a computer via USB)
|
||||
- **UART** (serial communication pins)
|
||||
|
||||
Without this line, `printf()` wouldn't have anywhere to send its output!
|
||||
|
||||
##### The Infinite Loop
|
||||
|
||||
```c
|
||||
while (true)
|
||||
printf("hello, world\r\n");
|
||||
```
|
||||
|
||||
- **`while (true)`** - This creates an infinite loop. The program will keep running forever (or until you reset/power off the Pico).
|
||||
- **`printf("hello, world\r\n")`** - This prints the text "hello, world" followed by a carriage return (`\r`) and newline (`\n`).
|
||||
|
||||
> 💡 **Why `\r\n` instead of just `\n`?**
|
||||
>
|
||||
> In embedded systems, we often use both carriage return (`\r`) and newline (`\n`) together. The `\r` moves the cursor back to the beginning of the line, and `\n` moves to the next line. This ensures proper display across different terminal programs.
|
||||
|
||||
#### What Happens When This Runs?
|
||||
|
||||
1. **Power on** - The Pico boots up and starts executing code from flash memory
|
||||
2. **`stdio_init_all()`** - Sets up USB and/or UART for communication
|
||||
3. **Infinite loop begins** - The program enters the `while(true)` loop
|
||||
4. **Print forever** - "hello, world" is sent over and over as fast as possible
|
||||
|
||||
#### Why This Code is Perfect for Learning
|
||||
|
||||
This simple program is ideal for reverse engineering practice because:
|
||||
- It has a clear, recognizable function call (`printf`)
|
||||
- It has an infinite loop we can observe
|
||||
- It's small enough to understand completely
|
||||
- It demonstrates real hardware interaction (USB/UART output)
|
||||
|
||||
When we debug this code, we'll be able to see how the C code translates to ARM assembly instructions!
|
||||
|
||||
#### Compiling and Flashing to the Pico 2
|
||||
|
||||
Now that we understand the code, let's get it running on our hardware:
|
||||
|
||||
##### Step 1: Compile the Code
|
||||
|
||||
In VS Code, look for the **Compile** button in the status bar at the bottom of the window. This is provided by the Raspberry Pi Pico extension. Click it to compile your project.
|
||||
|
||||
The extension will run CMake and build your code, creating a `.uf2` file that can be loaded onto the Pico 2.
|
||||
|
||||
##### Step 2: Put the Pico 2 in Flash Loading Mode
|
||||
|
||||
To flash new code to your Pico 2, you need to put it into **BOOTSEL mode**:
|
||||
|
||||
1. **Press and hold** the right-most button on your breadboard (the BOOTSEL button)
|
||||
2. **While holding BOOTSEL**, press the white **Reset** button
|
||||
3. **Release the Reset button** first
|
||||
4. **Then release the BOOTSEL button**
|
||||
|
||||
When done correctly, your Pico 2 will appear as a USB mass storage device (like a flash drive) on your computer. This means it's ready to receive new firmware!
|
||||
|
||||
> 💡 **Tip:** You'll see a drive called "RP2350" appear in your file explorer when the Pico 2 is in flash loading mode.
|
||||
|
||||
##### Step 3: Flash and Run
|
||||
|
||||
Back in VS Code, click the **Run** button in the status bar. The extension will:
|
||||
1. Copy the compiled `.uf2` file to the Pico 2
|
||||
2. Reboot the Pico 2 automatically so it starts running your code
|
||||
|
||||
Once flashed, your Pico 2 will immediately start executing the hello-world program. Open PuTTY (or another serial terminal) to watch it print "hello, world" continuously!
|
||||
|
||||
---
|
||||
|
||||
### 📚 Part 4: Dynamic Analysis with GDB
|
||||
|
||||
#### Prerequisites
|
||||
|
||||
Before we start, make sure you have:
|
||||
1. A Raspberry Pi Pico 2 board
|
||||
2. GDB (GNU Debugger) installed
|
||||
3. OpenOCD or another debug probe connection
|
||||
4. The sample "hello-world" binary loaded on your Pico 2
|
||||
|
||||
#### Connecting to Your Pico 2 with OpenOCD
|
||||
|
||||
Open a terminal and start OpenOCD:
|
||||
|
||||
```bat
|
||||
openocd ^
|
||||
-s "C:\Users\flare-vm\.pico-sdk\openocd\0.12.0+dev\scripts" ^
|
||||
-f interface/cmsis-dap.cfg ^
|
||||
-f target/rp2350.cfg ^
|
||||
-c "adapter speed 5000"
|
||||
```
|
||||
|
||||
#### Connecting to Your Pico 2 with GDB
|
||||
|
||||
Open another terminal and start GDB with your binary:
|
||||
|
||||
```bash
|
||||
arm-none-eabi-gdb -q build/0x0001_hello-world.elf
|
||||
```
|
||||
|
||||
Connect to your target:
|
||||
|
||||
```gdb
|
||||
(gdb) target extended-remote localhost:3333
|
||||
(gdb) monitor reset halt
|
||||
```
|
||||
|
||||
#### Basic GDB Commands: Your First Steps
|
||||
|
||||
Now that we're connected, let's learn three essential GDB commands that you'll use constantly in embedded reverse engineering.
|
||||
|
||||
##### Setting a Breakpoint with `b main`
|
||||
|
||||
A **breakpoint** tells the debugger to pause execution when it reaches a specific point. Let's set one at our `main` function:
|
||||
|
||||
```gdb
|
||||
(gdb) b main
|
||||
Breakpoint 1 at 0x10000234: file ../0x0001_hello-world.c, line 5.
|
||||
```
|
||||
|
||||
**What this tells us:**
|
||||
- GDB found our `main` function
|
||||
- It's located at address `0x10000234` in flash memory
|
||||
- The source file and line number are shown (because we have debug symbols)
|
||||
|
||||
Now let's run to that breakpoint:
|
||||
|
||||
```gdb
|
||||
(gdb) c
|
||||
Continuing.
|
||||
|
||||
Breakpoint 1, main () at ../0x0001_hello-world.c:5
|
||||
5 stdio_init_all();
|
||||
```
|
||||
|
||||
The program has stopped right at the beginning of `main`!
|
||||
|
||||
##### Disassembling with `disas`
|
||||
|
||||
The `disas` (disassemble) command shows us the assembly instructions for the current function:
|
||||
|
||||
```gdb
|
||||
(gdb) disas
|
||||
Dump of assembler code for function main:
|
||||
=> 0x10000234 <+0>: push {r3, lr}
|
||||
0x10000236 <+2>: bl 0x1000156c <stdio_init_all>
|
||||
0x1000023a <+6>: ldr r0, [pc, #8] @ (0x10000244 <main+16>)
|
||||
0x1000023c <+8>: bl 0x100015fc <__wrap_puts>
|
||||
0x10000240 <+12>: b.n 0x1000023a <main+6>
|
||||
0x10000242 <+14>: nop
|
||||
0x10000244 <+16>: adds r4, r1, r7
|
||||
0x10000246 <+18>: asrs r0, r0, #32
|
||||
End of assembler dump.
|
||||
```
|
||||
|
||||
**Understanding the output:**
|
||||
- The `=>` arrow shows where we're currently stopped
|
||||
- Each line shows: `address <offset>: instruction operands`
|
||||
- We can see the calls to `stdio_init_all` and `__wrap_puts` (printf was optimized to puts)
|
||||
- The `b.n 0x1000023a` at the end is our infinite loop - it jumps back to reload the string!
|
||||
|
||||
##### Viewing Registers with `i r`
|
||||
|
||||
The `i r` (info registers) command shows the current state of all CPU registers:
|
||||
|
||||
```gdb
|
||||
(gdb) i r
|
||||
r0 0x0 0
|
||||
r1 0x10000235 268436021
|
||||
r2 0x80808080 -2139062144
|
||||
r3 0xe000ed08 -536810232
|
||||
r4 0x100001d0 268435920
|
||||
r5 0x88526891 -2007865199
|
||||
r6 0x4f54710 83183376
|
||||
r7 0x400e0014 1074659348
|
||||
r8 0x43280035 1126694965
|
||||
r9 0x0 0
|
||||
r10 0x10000000 268435456
|
||||
r11 0x62707361 1651536737
|
||||
r12 0xed07f600 -318245376
|
||||
sp 0x20082000 0x20082000
|
||||
lr 0x1000018f 268435855
|
||||
pc 0x10000234 0x10000234 <main>
|
||||
xpsr 0x69000000 1761607680
|
||||
```
|
||||
|
||||
**Key registers to watch:**
|
||||
| Register | Value | Meaning |
|
||||
| -------- | ------------ | ----------------------------------------------- |
|
||||
| `pc` | `0x10000234` | Program Counter - we're at the start of `main` |
|
||||
| `sp` | `0x20082000` | Stack Pointer - top of our stack in RAM |
|
||||
| `lr` | `0x1000018f` | Link Register - where we return after `main` |
|
||||
| `r0-r3` | Various | Will hold function arguments and return values |
|
||||
|
||||
> 💡 **Tip:** You can also use `i r pc sp lr` to show only specific registers you care about.
|
||||
|
||||
#### Quick Reference: Essential GDB Commands
|
||||
|
||||
| Command | Short Form | What It Does |
|
||||
| --------------------- | ---------- | ------------------------------------ |
|
||||
| `break main` | `b main` | Set a breakpoint at main |
|
||||
| `continue` | `c` | Continue execution until breakpoint |
|
||||
| `disassemble` | `disas` | Show assembly for current function |
|
||||
| `info registers` | `i r` | Show all register values |
|
||||
| `stepi` | `si` | Execute one assembly instruction |
|
||||
| `nexti` | `ni` | Execute one instruction (skip calls) |
|
||||
| `x/10i $pc` | | Examine 10 instructions at PC |
|
||||
| `monitor reset halt` | | Reset the target and halt |
|
||||
|
||||
---
|
||||
|
||||
> 💡 **What's Next?** In Week 2, we'll put these GDB commands to work with hands-on debugging exercises! We'll step through code, examine the stack, watch registers change, and ultimately use these skills to modify a running program. The commands you learned here are the foundation for everything that follows.
|
||||
|
||||
---
|
||||
|
||||
### 🔬 Part 5: Static Analysis with Ghidra
|
||||
|
||||
#### Setting Up Your First Ghidra Project
|
||||
|
||||
Before we move on to the hands-on GDB debugging in Week 2, let's set up Ghidra to analyze our hello-world binary. Ghidra is a powerful reverse engineering tool that will help us visualize the disassembly and decompiled code.
|
||||
|
||||
##### Step 1: Create a New Project
|
||||
|
||||
1. Launch Ghidra
|
||||
2. A window will appear - select **File → New Project**
|
||||
3. Choose **Non-Shared Project** and click **Next**
|
||||
4. Enter the Project Name: `0x0001_hello-world`
|
||||
5. Click **Finish**
|
||||
|
||||
##### Step 2: Import the Binary
|
||||
|
||||
1. Open your file explorer and navigate to the `Embedded-Hacking` folder
|
||||
2. **Drag and drop** the `0x0001_hello-world.elf` file into the folder panel within the Ghidra application
|
||||
|
||||
##### Step 3: Understand the Import Dialog
|
||||
|
||||
In the small window that appears, you will see the file identified as an **ELF** (Executable and Linkable Format).
|
||||
|
||||
> 💡 **What is an ELF file?**
|
||||
>
|
||||
> ELF stands for **Executable and Linkable Format**. This format includes **symbols** - human-readable names for functions and variables. These symbols make reverse engineering much easier because you can see function names like `main` and `printf` instead of just memory addresses.
|
||||
>
|
||||
> In future weeks, we will work with **stripped binaries** (`.bin` files) that do not contain these symbols. This is more realistic for real-world reverse engineering scenarios where symbols have been removed to make analysis harder.
|
||||
|
||||
3. Click **Ok** to import the file
|
||||
4. **Double-click** on the file within the project window to open it in the CodeBrowser
|
||||
|
||||
##### Step 4: Auto-Analyze the Binary
|
||||
|
||||
When prompted, click **Yes** to auto-analyze the binary. Accept the default analysis options and click **Analyze**.
|
||||
|
||||
Ghidra will now process the binary, identifying functions, strings, and cross-references. This may take a moment.
|
||||
|
||||
#### Reviewing the Main Function in Ghidra
|
||||
|
||||
Once analysis is complete, let's find our `main` function:
|
||||
|
||||
1. In the **Symbol Tree** panel on the left, expand **Functions**
|
||||
2. Look for `main` in the list (you can also use **Search → For Address or Label** and type "main")
|
||||
3. Click on `main` to navigate to it
|
||||
|
||||
##### What You'll See
|
||||
|
||||
Ghidra shows you two views of the code:
|
||||
|
||||
**Listing View (Center Panel)** - The disassembled ARM assembly:
|
||||
```
|
||||
*************************************************************
|
||||
* FUNCTION
|
||||
*************************************************************
|
||||
int main (void )
|
||||
assume LRset = 0x0
|
||||
assume TMode = 0x1
|
||||
int r0:4 <RETURN>
|
||||
main XREF[3]: Entry Point (*) ,
|
||||
_reset_handler:1000018c (c) ,
|
||||
.debug_frame::00000018 (*)
|
||||
0x0001_hello-world.c:4 (2)
|
||||
0x0001_hello-world.c:5 (2)
|
||||
10000234 08 b5 push {r3,lr}
|
||||
0x0001_hello-world.c:5 (4)
|
||||
10000236 01 f0 99 f9 bl stdio_init_all _Bool stdio_init_all(void)
|
||||
LAB_1000023a XREF[1]: 10000240 (j)
|
||||
0x0001_hello-world.c:7 (6)
|
||||
0x0001_hello-world.c:8 (6)
|
||||
1000023a 02 48 ldr r0=>__EH_FRAME_BEGIN__ ,[DAT_10000244 ] = "hello, world\r"
|
||||
= 100019CCh
|
||||
1000023c 01 f0 de f9 bl __wrap_puts int __wrap_puts(char * s)
|
||||
0x0001_hello-world.c:7 (8)
|
||||
10000240 fb e7 b LAB_1000023a
|
||||
10000242 00 ?? 00h
|
||||
10000243 bf ?? BFh
|
||||
DAT_10000244 XREF[1]: main:1000023a (R)
|
||||
10000244 cc 19 00 10 undefine 100019CCh ? -> 100019cc
|
||||
|
||||
```
|
||||
|
||||
**Decompile View (Right Panel)** - The reconstructed C code:
|
||||
```c
|
||||
int main(void) {
|
||||
stdio_init_all();
|
||||
do {
|
||||
__wrap_puts("hello, world");
|
||||
} while (true);
|
||||
}
|
||||
```
|
||||
|
||||
> 🎯 **Notice how Ghidra reconstructed our original C code!** The decompiler recognized the infinite loop and the `puts` call (the compiler optimized `printf` to `puts` since we're just printing a simple string).
|
||||
|
||||
##### Why We Start with .elf Files
|
||||
|
||||
We're using the `.elf` file because it contains symbols that help us learn:
|
||||
- Function names are visible (`main`, `stdio_init_all`, `__wrap_puts`)
|
||||
- Variable names may be preserved
|
||||
- The structure of the code is easier to understand
|
||||
|
||||
In future weeks, we'll work with `.bin` files that have been stripped of symbols. This will teach you how to identify functions and understand code when you don't have these helpful hints!
|
||||
|
||||
---
|
||||
|
||||
### 📊 Part 6: Summary and Review
|
||||
|
||||
#### What We Learned
|
||||
|
||||
1. **Registers**: The ARM Cortex-M33 has 13 general-purpose registers (`r0`-`r12`), plus special registers for the stack pointer (`r13`/SP), link register (`r14`/LR), and program counter (`r15`/PC).
|
||||
|
||||
2. **The Stack**:
|
||||
- Grows downward in memory
|
||||
- PUSH adds items (SP decreases)
|
||||
- POP removes items (SP increases)
|
||||
- Used to save return addresses and register values
|
||||
|
||||
3. **Memory Layout**:
|
||||
- Code lives in flash memory starting at `0x10000000`
|
||||
- Stack lives at the top of RAM, starting at `0x20082000` and growing downward
|
||||
|
||||
4. **GDB Basics**: We learned the essential commands for connecting to hardware and examining code:
|
||||
|
||||
| Command | What It Does |
|
||||
| --------------------- | -------------------------------------- |
|
||||
| `target extended-remote localhost:3333` | Connect to OpenOCD debug server |
|
||||
| `monitor reset halt` | Reset and halt the processor |
|
||||
| `b main` | Set breakpoint at main function |
|
||||
| `c` | Continue running until breakpoint |
|
||||
| `disas` | Disassemble current function |
|
||||
| `i r` | Show all register values |
|
||||
|
||||
5. **Ghidra Static Analysis**: We set up a Ghidra project and analyzed our binary:
|
||||
- Imported the ELF file with symbols
|
||||
- Found the `main` function
|
||||
- Saw the decompiled C code
|
||||
- Understood how assembly maps to C
|
||||
|
||||
6. **Little-Endian**: The RP2350 stores multi-byte values with the least significant byte at the lowest address, making them appear "backwards" when viewed as a single value.
|
||||
|
||||
#### The Program Flow
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────┐
|
||||
│ 1. push {r3, lr} │
|
||||
│ Save registers to stack │
|
||||
├─────────────────────────────────────────────────────┤
|
||||
│ 2. bl stdio_init_all │
|
||||
│ Initialize standard I/O │
|
||||
├─────────────────────────────────────────────────────┤
|
||||
│ 3. ldr r0, [pc, #8] ────────────────┐ │
|
||||
│ Load address of "hello, world" into r0│ │
|
||||
├─────────────────────────────────────────────────────┤
|
||||
│ 4. bl __wrap_puts │ │
|
||||
│ Print the string │ │
|
||||
├─────────────────────────────────────────────────────┤
|
||||
│ 5. b.n (back to step 3) ────────────────┘ │
|
||||
│ Infinite loop! │
|
||||
└─────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### ✅ Practice Exercises
|
||||
|
||||
Try these on your own to reinforce what you learned:
|
||||
|
||||
#### Exercise 1: Explore in Ghidra
|
||||
1. Open your `0x0001_hello-world` project in Ghidra
|
||||
2. Find the `stdio_init_all` function in the Symbol Tree
|
||||
3. Look at its decompiled code - can you understand what it's setting up?
|
||||
|
||||
#### Exercise 2: Find Strings in Ghidra
|
||||
1. In Ghidra, go to **Window → Defined Strings**
|
||||
2. Look for `"hello, world"` - what address is it at?
|
||||
3. Double-click to navigate to it in the listing
|
||||
|
||||
#### Exercise 3: Cross-References
|
||||
1. In Ghidra, navigate to the `main` function
|
||||
2. Find the `ldr r0, [DAT_...]` instruction that loads the string
|
||||
3. Right-click on `DAT_10000244` and select **References → Show References to**
|
||||
4. This shows you where this data is used!
|
||||
|
||||
#### Exercise 4: Connect GDB (Preparation for Week 2)
|
||||
1. Start OpenOCD and connect GDB as shown in Part 4
|
||||
2. Set a breakpoint at main: `b main`
|
||||
3. Continue: `c`
|
||||
4. Use `disas` to see the assembly
|
||||
5. Use `i r` to see register values
|
||||
|
||||
> 💡 **Note:** The detailed hands-on GDB debugging (stepping through code, watching the stack, examining memory) will be covered in Week 2!
|
||||
|
||||
---
|
||||
|
||||
### 🎓 Key Takeaways
|
||||
|
||||
1. **Reverse engineering combines static and dynamic analysis** - we look at the code (static with Ghidra) and run it to see what happens (dynamic with GDB).
|
||||
|
||||
2. **The stack is fundamental** - understanding how push/pop work is essential for following function calls.
|
||||
|
||||
3. **GDB and Ghidra work together** - Ghidra helps you understand the big picture, GDB lets you watch it happen live.
|
||||
|
||||
4. **Assembly isn't scary** - each instruction does one simple thing. Put them together and you understand the whole program!
|
||||
|
||||
5. **Everything is just numbers** - whether it's code, data, or addresses, it's all stored as numbers in memory.
|
||||
|
||||
---
|
||||
|
||||
### 📖 Glossary
|
||||
|
||||
| Term | Definition |
|
||||
| ------------------- | --------------------------------------------------------- |
|
||||
| **Assembly** | Human-readable representation of machine code |
|
||||
| **Breakpoint** | A marker that tells the debugger to pause execution |
|
||||
| **GDB** | GNU Debugger - a tool for examining running programs |
|
||||
| **Hex/Hexadecimal** | Base-16 number system (0-9, A-F) |
|
||||
| **Little-Endian** | Storing the least significant byte at the lowest address |
|
||||
| **Microcontroller** | A small computer on a single chip |
|
||||
| **Program Counter** | Register that points to the next instruction |
|
||||
| **Register** | Fast storage inside the processor |
|
||||
| **Stack** | Memory region for temporary storage during function calls |
|
||||
| **Stack Pointer** | Register that points to the top of the stack |
|
||||
| **XIP** | Execute In Place - running code directly from flash |
|
||||