Bing Hu, (University of California, Riverside) “Mining Time Series Data: Moving from Toy Problems to Realistic Deployments” (2013)

Inside Signal Processing Newsletter Home Page

News and Resources for Members of the IEEE Signal Processing Society

Bing Hu, (University of California, Riverside) “Mining Time Series Data: Moving from Toy Problems to Realistic Deployments” (2013) Advisor: Eamonn Keogh

Data mining and knowledge discovery have attracted a great deal of research interest over the last decade. Despite this extensive body of research, the dissertation argues that much of the work is less useful than it could be, because the datasets studied and the methods proposed resemble ‘toy examples’ when compared with far more complicated real-world scenarios. It identifies two problems that are widespread in data mining research. First, parameters limit how readily ideas spread through the research community: many proposed methods have several parameters to tune, and the dissertation claims that parameter tuning can kill the usefulness of an algorithm and reduce the number of citations it receives. Second, prevalent assumptions about the data further limit how well these methods apply to real-world problems. The dissertation strives to mitigate both problems. Its contributions are as follows:

First, the dissertation demonstrates a parameter-free framework that uses the minimum description length (MDL) principle to discover the intrinsic features of the data. Knowing the intrinsic cardinality and dimensionality of a time series helps in understanding the underlying meaning of the data before consulting domain experts. In addition, the intrinsic features can be used for dimensionality reduction and have wide application in various lower-bounding techniques. Second, it presents a time series classification framework that makes none of the prevalent assumptions. The framework uses data editing to build a data dictionary automatically, and it can say ‘I do not know’ when an incoming query does not belong to any concept in the training data. The results show that a small fraction of the data can achieve even better classification results than using all of it. Finally, it proposes a dynamically weighted multi-dimensional classification framework that intelligently chooses the weight of each data dimension. Results on extensive datasets from various domains show that this framework is more accurate and more robust to occluded data.
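
To make the ‘I do not know’ idea concrete, here is a minimal sketch of nearest-neighbor classification with a reject option. It is not the dissertation's algorithm: the Euclidean distance, the fixed `threshold`, the tiny hand-written `dictionary`, and the function name `classify_with_reject` are all assumptions chosen for illustration. The abstract does not say how the actual framework builds its data dictionary or decides when to reject a query.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_with_reject(query, dictionary, threshold):
    """Return (label, distance) of the nearest exemplar, or ("unknown", distance).

    dictionary: list of (label, exemplar) pairs (hypothetical data dictionary).
    threshold:  largest nearest-neighbor distance still accepted (an assumption;
                a real system would have to learn or derive this value).
    """
    best_label, best_dist = None, math.inf
    for label, exemplar in dictionary:
        d = euclidean(query, exemplar)
        if d < best_dist:
            best_label, best_dist = label, d
    if best_dist > threshold:
        # The query is far from every known concept: decline to answer.
        return "unknown", best_dist
    return best_label, best_dist


# Toy dictionary with one exemplar per concept (illustrative values only).
dictionary = [
    ("flat", [0.0, 0.0, 0.0, 0.0]),
    ("ramp", [0.0, 1.0, 2.0, 3.0]),
]

print(classify_with_reject([0.1, 0.9, 2.1, 3.0], dictionary, threshold=1.0))
print(classify_with_reject([9.0, -9.0, 9.0, -9.0], dictionary, threshold=1.0))
```

The first query lies close to the `ramp` exemplar and is labeled accordingly; the second is far from both exemplars and is rejected. Note that the fixed threshold is itself a parameter, which is exactly the kind of tuning the dissertation warns against, so treat it only as a stand-in for whatever rejection rule the actual framework uses.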

For details, please visit the thesis page.
