Florent Capelli (Université de Lille)

It is well-known that one may faithfully approximate the expected value of a random variable by taking the average of many independant observations of the variables. The average quickly converges to the expected value with good probability. This fact is commonly used when computing statistics in data mining. However, in this setting, the assumption that the observations are independant is usually wrong and we have no more guarantees on the convergence of the average toward the expected value. One way of circumventing this problem would be to only use an independant subset of the observations. Finding a large independant subset is however a hard problem and this method tends to discard precious data. An other method proposed by Wang and Ramon starts by assigning weights to each observation. If the weights are solution to some linear program, then we have the guarantee that the weighted average converges toward the expected value. In practice however, when the number of factor of dependencies is large, the size of such a linear program can be hudge. In this talk, we will present a way of generating an equivalent compressed linear program when the dependencies are succinctly represented. This is joint work with Nicolas Crosetti, Jan Ramon and Joachim Niehren.