Modern artificial intelligence benefits from two major advantages: data availability and computational power. Data is now available, often in large quantities and from multiple sources, but it can be tainted by various imperfections (missing, inaccurate or heterogeneous data, and so on). Managing this massive and heterogeneous data raises several challenges for the AI community. From data mining to machine learning, several current problems require computationally efficient solutions that provide usable, reliable and explainable results for the user.

Thus, the new “Data” research axis within CRIL has set the following main goals:

  • The proposal of new algorithms for knowledge extraction and machine learning;
  • The study and analysis of fundamental, algorithmic and experimental aspects of knowledge extraction and machine learning techniques;
  • The proposal of efficient solutions for the management of massive, heterogeneous and complex data, integrating confidentiality and reliability aspects;
  • Cross-fertilization and exploitation of strong synergies with the two other thematic axes of CRIL (such as the development of symbolic and declarative approaches for data mining and explainability based on the strength of modern solvers and reasoners);
  • The collection, completion and querying of massive and heterogeneous databases;
  • Modeling and design of knowledge extraction and machine learning pipelines in some application fields.

Keywords:

Data mining and data science

Knowledge extraction (pattern and rule extraction, clustering, communities, …), declarative approaches, data quality

Machine learning

Machine learning and explainability, reliability, calibration

Data management

Query, completion, access control, confidentiality, repair

Applications

Recommendation, anomaly detection, community detection, and so on.