On January 20, 2016, we had a seminar by Themis Palpanas about "Exploratory Analysis of Very Large Scientific Data". Themis is a professor of computer science at Université Paris Descartes, where he is a director of the Data Intensive and Knowledge Oriented Systems (diNo) group. The seminar took place at the FACe.
Applications in diverse domains have an increasingly pressing need for techniques that can index and mine very large collections of data series. Examples of such applications come from biology (e.g., genome sequences) and neuroscience (e.g., fMRI and EEG data), as well as from several other scientific and industrial domains. It is not unusual for these applications to involve hundreds of millions to billions of data series, which are often not analyzed in full detail because of their sheer size.
In this talk, we describe the state-of-the-art techniques for indexing and mining truly massive collections of data series, which will enable scientists to easily analyze their data. Moreover, we show how our methods allow mining on datasets that would otherwise be completely intractable, including the first published experiments using one billion data series.
Finally, we present the prototype we have developed, which showcases the above technologies.