Coding Equality: How We Built a Gender-Conscious Voice Model

Share this post

AI for Equality: Building a Gender-Aware, Open Source Voice Recognition Model

At Digitalfems, we’ve partnered with the Catalan Broadcasting Corporation (CCMA) to co-develop a pioneering AI-based voice recognition system aimed at analyzing the presence of women in audiovisual content. The goal? To promote gender equality, increase transparency, and ensure technology works for social justice—not against it.

This pilot project has been more than just a proof of concept. It’s a concrete example of how innovation can go hand in hand with ethics. Our model is built on open-source software and fully compliant with GDPR. We applied strict data minimization principles and avoided using any personal data not essential for the analysis.

What did we do?

  • We developed an automatic system to transcribe, diarize, and identify speakers’ sex in television programs.

  • We designed a set of indicators to measure screen time by gender, analyzing over 300 hours of TV broadcasts.

  • We cleaned and curated the training data to reduce algorithmic bias in voice identification.

  • We built a scalable, replicable infrastructure that can serve as a model for other public media institutions.

Key results

The data shows that female voices accounted for 40.4% of the total speaking time. This figure increases to 49.2% when considering programs hosted and directed by women. However, we also identified significant disparities across time slots and formats: female presence drops notably in late-night programming and talk show formats.

Why is this innovative?

Because we’re not just using AI—we’re using it ethically and through a feminist lens. Because we’ve co-developed an open-source tool that could be reused. And because we’re bridging data science, technology, and policy to move toward more inclusive media systems.

This project positions CCMA as a European frontrunner in gender diversity monitoring through AI. It also shows that it’s not enough to demand more representation—we need to build the tools to measure it and improve it.

🟣 Open tech. Real equality. Responsible AI.