test: Add secret detection benchmark dataset and ground truth

Add comprehensive benchmark dataset with 32 documented secrets for testing
secret detection workflows (gitleaks, trufflehog, llm_secret_detection).

- Add test_projects/secret_detection_benchmark/ with 19 test files
- Add ground truth JSON with precise line-by-line secret mappings
- Update .gitignore with exceptions for benchmark files (not real secrets)

Dataset breakdown:
- 12 Easy secrets (standard patterns)
- 10 Medium secrets (obfuscated)
- 10 Hard secrets (well hidden)
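
Scoring sketch (illustrative only): the ground truth maps each secret to a file and line, so a detector's output can be scored per difficulty tier by exact (file, line) match. The JSON schema is not shown in this commit, so the field names "file", "line" and "difficulty", the function name recall_by_difficulty, and the sample entries below are all assumptions rather than the actual format.

```python
from collections import defaultdict


def recall_by_difficulty(ground_truth, findings):
    """Return {difficulty: (found, total)} using exact (file, line) matches.

    Field names are hypothetical; adapt them to the real ground truth JSON.
    """
    reported = {(f["file"], f["line"]) for f in findings}
    counts = defaultdict(lambda: [0, 0])  # difficulty -> [found, total]
    for secret in ground_truth:
        bucket = counts[secret["difficulty"]]
        bucket[1] += 1
        if (secret["file"], secret["line"]) in reported:
            bucket[0] += 1
    return {d: (found, total) for d, (found, total) in counts.items()}


# Made-up entries for illustration, not taken from the benchmark.
ground_truth = [
    {"file": "config.py", "line": 3, "difficulty": "easy"},
    {"file": "config.py", "line": 9, "difficulty": "easy"},
    {"file": "schema.sql", "line": 4, "difficulty": "medium"},
    {"file": "loader.py", "line": 12, "difficulty": "hard"},
]
findings = [
    {"file": "config.py", "line": 3},
    {"file": "schema.sql", "line": 4},
    {"file": "schema.sql", "line": 7},  # reported, but not in the ground truth
]

result = recall_by_difficulty(ground_truth, findings)
for difficulty, (found, total) in result.items():
    print(f"{difficulty}: {found}/{total}")
```

Exact line matching is the strictest choice; a tolerance of a line or two may be more forgiving for tools that report the start of a multi-line match.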
tduhamel42
2025-10-16 11:46:28 +02:00
parent ca4cd1ee22
commit fde6c86278
22 changed files with 773 additions and 0 deletions
@@ -0,0 +1,15 @@
-- Database initialization script
CREATE DATABASE prod_db;
-- MEDIUM SECRET #18: Secret in SQL comment
-- Connection string: postgresql://admin:Pr0dDB_S3cr3t_P@ss@db.prod.example.com:5432/prod_db
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username VARCHAR(255) NOT NULL,
email VARCHAR(255) NOT NULL
);
-- Insert test data
INSERT INTO users (username, email) VALUES ('admin', 'admin@example.com');