• Funding: Artois
• Start year: 2025

Many textual corpora capture discourse on societal topics such as migration, health, politics, and climate. This discourse often involves a diversity of actors. In the case of migration, for instance, it engages a wide range of stakeholders, from migrants themselves to the media, policymakers, and civil society. A key objective is to explore the opinions of these different actors, accounting for the intertextuality of their discourse and explaining where they converge or diverge.

A second type of textual corpus is produced by a single actor; biographical narratives are one example. These narratives often recount a succession of events. Uncovering these events and extracting the causal network that links them is a step towards understanding the underlying complexity of the phenomena they describe.

This thesis fits within this context. It relies on two types of textual corpora: discourses on migration and the biographical trajectories of minors in conflict with the law. The methodological approach will be structured around intertextual analysis and the discovery of causal networks. This requires first developing our own large language model (LLM), and the work will draw on several areas of artificial intelligence, including natural language processing, machine learning, data mining, knowledge representation, and visual analysis.