• Funding : Artois
• Start year : 2021

Data mining encompasses many techniques whose goal is to extract multiple forms of relevant knowledge from large volumes of data. These techniques have a very wide scope of application because of their usefulness in data analysis, especially at a time when information-collection capabilities are evolving rapidly. Indeed, the continuous emergence of new applications and services, together with the remarkable progress of the underlying techniques, makes data mining a central domain in computer science.

Qualitative reasoning is a field of artificial intelligence (AI) with various application domains, such as planning and geographic information systems. The main focus of this thesis is the integration of qualitative reasoning about time and space [6,7] into data mining approaches. This integration is intended in particular to make knowledge extracted from complex data easier to understand, since qualitative reasoning is close to the way humans reason about various types of entities. In this respect, the subject fits well within the CRIL project, which focuses on building explainable artificial intelligence systems. Indeed, the knowledge extraction methods targeted here can be used at different levels in the analysis of AI system behaviors in order to produce explanations that are easy to understand.

More concretely, the aim of this thesis is, on the one hand, to define sufficiently expressive data representation models and, on the other hand, to develop efficient knowledge extraction algorithms. The languages of the proposed models can be built from constraints taken from existing qualitative formalisms for time and space [6] or from formalisms to be defined during the thesis. Hybrid languages combining numerical and qualitative aspects can also be considered and studied. The proposed models and algorithms will have to take into account both the temporal and the spatial dimensions of the data sources considered. They will have to be expressive and generic enough to handle spatio-temporal data across a wide range of applications. It will also be necessary to develop sophisticated measures of the quality of the extracted knowledge [1,4]. The targeted applications concern entities whose spatial component evolves over time and for which trajectory analysis is necessary. Examples include the evolution of cultivable land in agriculture or, in a completely different field, the analysis of players' behaviors in sports. Preliminary results on the integration of qualitative reasoning in data mining have been obtained by a member of the team in [3].
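To give a concrete flavor of what such a qualitative representation could look like, the following minimal sketch uses Allen's interval relations, one of the temporal formalisms surveyed in [6], to describe records of labeled time intervals and to compute the support (relative frequency) of each qualitative pattern. It is only an illustration under stated assumptions: the function names `allen_relation` and `pattern_support`, the record format, and the agricultural sample data are hypothetical and are not taken from the thesis topic or from [3].

```python
from collections import Counter
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (start, end), assumed to satisfy start < end


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the basic Allen relation that holds between intervals a and b."""
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    return "overlaps" if a1 < b1 else "overlapped-by"


def pattern_support(
    records: List[Dict[str, Interval]]
) -> Dict[Tuple[str, str, str], float]:
    """Fraction of records in which each (label, relation, label) pattern holds."""
    counts: Counter = Counter()
    for record in records:
        labels = sorted(record)  # fixed order so each pair is visited once
        for i, x in enumerate(labels):
            for y in labels[i + 1:]:
                counts[(x, allen_relation(record[x], record[y]), y)] += 1
    return {pattern: n / len(records) for pattern, n in counts.items()}


# Hypothetical records: activity periods on three plots, in days.
records = [
    {"sowing": (0, 10), "irrigation": (10, 40)},
    {"sowing": (0, 12), "irrigation": (12, 30)},
    {"sowing": (0, 12), "irrigation": (5, 30)},
]
support = pattern_support(records)
```

Here the numerical endpoints are abstracted into qualitative relations, so two records with different dates but the same temporal arrangement contribute to the same pattern. A spatial counterpart would follow the same scheme with a spatial calculus in place of Allen's relations.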

[1] Jean-François Condotta, Badran Raddaoui, Yakoub Salhi : Quantifying Conflicts for Spatial and Temporal Information. KR 2016 : 443-452.

[2] Yoann Pitarch, Dino Ienco, Elodie Vintrou, Agnès Bégué, Anne Laurent, Pascal Poncelet, Michel Sala, Maguelonne Teisseire : Spatio-temporal data classification through multidimensional sequential patterns : Application to crop mapping in complex landscape. Eng. Appl. Artif. Intell. 37 : 91-102 (2015).

[3] Yakoub Salhi : Qualitative Reasoning and Data Mining. The 26th International Symposium on Temporal Representation and Reasoning, TIME 2019 : 9:1-9:15.

[4] Yakoub Salhi : Inconsistency Measurement for Improving Logical Formula Clustering. IJCAI 2020 : 1891-1897.

[5] Michael Sioutis, Anastasia Paparrizou, Jean-François Condotta : Collective singleton-based consistency for qualitative constraint networks : Theory and practice. Theor. Comput. Sci. 797 : 17-41 (2019).

[6] Frank Dylla, Jae Hee Lee, Till Mossakowski, Thomas Schneider, André van Delden, Jasper van de Ven, Diedrich Wolter : A Survey of Qualitative Spatial and Temporal Calculi : Algebraic and Computational Properties. ACM Comput. Surv. 50(1) : 7:1-7:39 (2017).

[7] Gérard Ligozat : Qualitative Spatial and Temporal Reasoning. ISTE/Wiley, ISBN 978-1-84821-252-7, 2011.