Determining the association constant between a cyclodextrin and a guest molecule is an important task for many applications in industry and academia. However, measuring it is time-consuming and tedious, and requires samples of both molecules. A significant number of association constants, along with related data, are available in the literature, which makes it possible to use machine learning techniques to predict association constants. However, these data are mainly available as tables in articles or appendices, so they must first be converted to a computer-friendly format and curated. Furthermore, the raw data need to be enriched with physicochemical information about each molecule, and when that information is not sufficient to discriminate between molecules, additional data are needed. We present a dataset built from data gathered from the literature. The dataset contains both the original raw data from the articles and the enriched data. We also provide the scripts used to curate and enrich the raw data.

Authors:
Gökhan Tahil, Fabien Delorme, Daniel Le Berre, Éric Monflier, Adlane Sayede, Sébastien Tilloy