Safe(r) Data Infrastructure

  • Ethical Sourcing of Benchmarks

    Developers of nudity detection technology used in content moderation need datasets of nude content to train, test, and benchmark their algorithms.

    Our ongoing research finds that many, if not all, open-source and academic datasets lack the consent of the people depicted in the nude images.

    Our team is investigating approaches to consensually sourcing and governing the use of nude datasets.

    Our initial investigation of the use of nude datasets in machine learning and computer vision research identifies several ethical challenges.

    A full version of this work will be available in October 2025.