feat: Add GCG, Swarm, and Probes diagrams to Chapter 43 and simplify the README version string.

2026-02-12 14:42:46 +00:00 · 2026-01-21 23:15:38 +01:00
parent 03ef7df4b1
commit 24c7745e3d
5 changed files with 13 additions and 1 deletion
--- a/README.md
+++ b/README.md
@@ -161,4 +161,4 @@ The authors and contributors accept no liability for misuse of this material.

 ---

-**Version:** 1.46.154 | **Status:** Gold Master
+**Version:** 1.46.154
--- a/docs/Chapter_43_Future_of_AI_Red_Teaming.md
+++ b/docs/Chapter_43_Future_of_AI_Red_Teaming.md
@@ -25,6 +25,10 @@ Manual prompting is random and hard to scale. The future belongs to algorithms t

 Published by Zou et al. (2023), GCG demonstrated that you can mechanically find a string of characters (a "suffix") that forces any model to comply.

+<p align="center">
+  <img src="assets/Ch43_Graph_GCG.png" width="512" alt="GCG Loss Landscape Optimization">
+</p>
+
 **The Math:**
 $$ \text{minimize } - \log P(Target | Input + Suffix) $$

@@ -61,6 +65,10 @@ Using **SMT Solvers** (Satisfiability Modulo Theories), researchers convert neur

 We are moving from "Chatbots" to "OS-Controlling Agents." How do you Red Team a swarm?

+<p align="center">
+  <img src="assets/Ch43_Diagram_Swarm.png" width="512" alt="Swarm Agent Attack Diagram">
+</p>
+
 ### 43.2.1 The "Agent Turing Test"

 Red Teaming an agent requires testing its **Goal Integrity** over time.
@@ -87,6 +95,10 @@ A philosophical risk with practical implications. Any intelligent agent, regardl

 The ultimate defense is not looking at the output, but looking at the _activations_.

+<p align="center">
+  <img src="assets/Ch43_Schematic_Probes.png" width="512" alt="Neural Network X-Ray Probes">
+</p>
+
 ### 43.3.1 Linear Probes and Steering Vectors

 Research shows that concepts like "Deception" or "Refusal" are represented as **Vectors** in the model's residual stream.
--- a/docs/assets/Ch43_Diagram_Swarm.png
+++ b/docs/assets/Ch43_Diagram_Swarm.png
--- a/docs/assets/Ch43_Graph_GCG.png
+++ b/docs/assets/Ch43_Graph_GCG.png
--- a/docs/assets/Ch43_Schematic_Probes.png
+++ b/docs/assets/Ch43_Schematic_Probes.png