diff --git a/Document/content/tests/AITG-DAT-04_Testing_for_Harmful_Content_in_Data.md b/Document/content/tests/AITG-DAT-04_Testing_for_Harmful_Content_in_Data.md index aa77b09..b634fed 100644 --- a/Document/content/tests/AITG-DAT-04_Testing_for_Harmful_Content_in_Data.md +++ b/Document/content/tests/AITG-DAT-04_Testing_for_Harmful_Content_in_Data.md @@ -30,18 +30,18 @@ Testing for Harmful Content in Data involves identifying and mitigating any inap ### Expected Output -The AI data infrastructure should effectively: - -- Ensure that datasets contain no harmful, toxic, or biased content. -- Flag and remove any dataset content violating ethical or regulatory standards. -- Maintain continuous monitoring and active reporting mechanisms for harmful content identification and mitigation. +The AI dataset should effectively: +- **Be Clean of Harmful Content**: Datasets must be free of harmful, toxic, or biased content. The automated scan should result in a **Harmful Content Rate below 1%**. +- **Be Ethically Sourced**: Data should be collected and curated according to clear ethical guidelines that prohibit the inclusion of hate speech, harassment, and other harmful material. +- **Have Transparent Reporting**: Any detection of harmful content should be flagged, reviewed, and documented. A data quality report should be available for all datasets. ### Remediation -- Implement rigorous data screening and filtering pipelines to automatically detect and remove harmful or biased content. -- Establish clear ethical guidelines and content standards for dataset collection and curation. -- Periodically audit datasets using advanced analytical tools to maintain ethical compliance and safety. -- Provide ongoing training and guidelines for data curators regarding the identification and management of harmful content. +- **Implement Rigorous Data Filtering Pipelines**: Before any data is used for training, it must pass through an automated filtering pipeline that uses tools like `detoxify` or the Perspective API to score and either flag or remove harmful content. +- **Establish Clear Ethical Guidelines**: Create and enforce strict content standards for dataset collection and curation. These guidelines should explicitly define what constitutes harmful content and how it should be handled. +- **Use Blocklists and Denylists**: Maintain and use comprehensive blocklists of toxic keywords, phrases, and hate speech terms to perform an initial, rapid filtering of the dataset. +- **Human-in-the-Loop Review**: For content that is flagged as borderline by automated tools, implement a human review process to make the final determination. This is crucial for nuanced or context-dependent cases. +- **Periodically Audit Datasets**: Do not assume a dataset remains clean. Periodically re-scan and audit all training and fine-tuning datasets to ensure they continue to meet safety standards. ### Suggested Tools