CS PhD Alumnus Awarded SIGKDD Dissertation Award

5/12/2009

CS alumnus Xiaoxin Yin has received the SIGKDD Doctoral Dissertation Award from the ACM Special Interest Group on Knowledge Discovery and Data Mining for work he completed as a PhD student at the University of Illinois. This annual award recognizes excellent research by doctoral candidates in the field of data mining and knowledge discovery.

Yin's dissertation, titled "Scalable Mining and Link Analysis Across Multiple Database Relations," describes his work to develop scalable and accurate approaches for data mining tasks such as classification, clustering, and duplicate detection. His approach uses novel techniques for virtually joining different relations, single-scan algorithms, and multi-resolution data structures to dramatically reduce computational costs. His experiments show that the approaches proposed in the thesis are not only highly efficient and scalable, but also achieve high accuracy in multi-relational data mining.

"Because most real-world relational databases have complicated schemas and contain huge amount of data, efficiency and scalability become our major concerns as well as the accuracy and effectiveness of the algorithms," writes Yin in his thesis.

Yin's work is increasingly important as more and more knowledge is stored in relational databases scattered throughout the web. As he writes in his thesis, "Relational databases are the most popular repository for structured data, and are thus one of the richest sources of knowledge in the world... Unfortunately, most existing data mining approaches can only handle data stored in single tables, and cannot be applied to relational databases. Therefore, it is an urgent task to design data mining approaches that can discover knowledge from multi-relational data."

Yin is currently a researcher in the Internet Services Research Center of Microsoft Research. His research interests include web search quality, link analysis and its application to web search, and multi-relational data mining.


This story was published May 12, 2009.