AI Data Transparency: an Exploration Through the Lens of AI Incidents
Summary
An exploration of AI data transparency across systems associated with publicly reported incidents, finding persistently low levels of public documentation about training data and its curation.
Key quotes
low data transparency persists across a wide range of systems
the presence of model or system transparency documentation does not necessarily lead to presence of all desired transparency information
The research uses the AI Incident Database (AIID) and a search protocol adapted from the Stanford Foundation Model Transparency Index to analyze 54 AI systems. It highlights a ‘hierarchy of transparency’ in which generative AI systems are better documented than autonomous driving or facial recognition systems.