How NLP transforms political analysis: Inside Etienne Ollion’s Textual Politics project

What can millions of newspaper articles teach us about democracy, representation, or inequality?

For Etienne Ollion, sociologist and Hi! PARIS chair recipient, the answer lies not only in the words themselves, but in the tools we use to read them. His project, Textual Politics, uses advances in natural language processing (NLP) to revisit core questions in the political and social sciences, this time, at scale.

“We now have the ability to extract insights from large collections of texts with a precision that rivals human analysis,” Ollion explains. “But we can do it across millions of documents. That changes everything.”

A new lens on traditional media

Unlike most AI projects, which focus on social media, Ollion's work turns to the slower, more detailed world of newspapers, television transcripts, and radio broadcasts. These aren't off-the-cuff posts; they're deep investigations, interviews, and editorials. They carry a different kind of weight.

“Traditional media holds layers of meaning,” he says. “But that richness also makes it harder to study with machines. We want to see how far we can push AI to understand that complexity.”

Working at the crossroads of NLP and the social sciences, Textual Politics brings together researchers who share a common goal: to use AI as a microscope on how power, identity, and narratives are shaped in public discourse.

Behind the firewall

One of the biggest challenges? Gaining access to the data itself. Media archives are valuable and often guarded, especially in an era when content can be used to train large commercial models without consent.

That’s why Ollion’s collaboration with Aday, an established company that archives this data, is key. It enables his team to work directly with extensive corpora of journalistic text while respecting the usage agreements offered by publishers. With access unlocked, the questions become familiar to any social scientist: What are we seeing? What are we missing? And what does it mean?

Interestingly, the team avoids the biggest AI models. “They’re too slow, too opaque, and too resource-hungry,” Ollion says. Instead, they work with smaller models: faster, more transparent, and often better suited to the job.

“Limitations can be a strength. They force us to think more clearly about what we want to learn.”

What AI can (and can’t) tell us

At the heart of the project is a desire to better understand how ideas are presented and whose voices are amplified in the media. One study, for example, looks at which types of sources are most often quoted: a question with direct relevance to debates on political pluralism and media bias.
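To give a flavor of what such a study involves, here is a minimal, hypothetical sketch of quoted-source counting. It is not Ollion's actual pipeline: it uses a naive regular expression to find quote attributions and a toy keyword lexicon (`SOURCE_TYPES`) to bucket speakers, where a real project would rely on trained NLP models for named-entity recognition and coreference.

```python
import re
from collections import Counter

# Naive attribution pattern: a direct quote followed by "said/says <Name>".
# Real journalistic text needs far more robust parsing than this.
ATTRIBUTION = re.compile(r'"[^"]+"\s+(?:said|says)\s+([A-Z][\w. ]+)')

# Toy lexicon mapping title keywords to source categories (illustrative only).
SOURCE_TYPES = {
    "Senator": "politician",
    "Minister": "politician",
    "Professor": "academic",
    "Dr.": "academic",
}

def classify(speaker: str) -> str:
    """Bucket a speaker string into a source category via keyword lookup."""
    for keyword, category in SOURCE_TYPES.items():
        if keyword in speaker:
            return category
    return "other"

def count_source_types(articles: list[str]) -> Counter:
    """Count quoted-source categories across a corpus of article texts."""
    counts = Counter()
    for text in articles:
        for speaker in ATTRIBUTION.findall(text):
            counts[classify(speaker)] += 1
    return counts

articles = [
    '"We must act now," said Senator Jane Doe. '
    '"The data is unambiguous," said Professor Alan Smith.',
    '"Budgets are tight," said Minister Lee.',
]
print(count_source_types(articles))
```

Run over millions of articles, even a crude tally like this begins to show whose voices dominate coverage, which is exactly the kind of pattern the project's more sophisticated models are built to surface reliably.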

The approach is not to replace human interpretation but to enhance it. What would take months or years to study manually can now be mapped and compared at scale. Patterns become visible. Blind spots emerge.

“We’re not trying to automate the work of journalists or sociologists,” says Ollion. “We’re trying to give them better tools.”

A shift in method

Ollion is careful not to make sweeping predictions about the future of AI. But he sees this moment as a turning point for social science.

“Text, image, and sound have always been essential to how we understand society,” he reflects. “But our ability to study them in a way that is both systematic and detailed has been limited. That’s now starting to change.”

The ambition is simple but far-reaching: to explore how AI can help uncover the structures, silences, and signals hidden in our most foundational forms of public communication.

More profiles of our 2025 chairs will follow soon, exploring how AI is reshaping research, media, and the way we understand the world.