Georgia Tech Researchers Awarded Best Paper at SIAM International Conference on Data Mining

Wed, 05/09/2012 - 23:00 | Atlanta, GA

For More Information Contact

Joshua Preston



Georgia Institute of Technology researchers Dongryeol Lee, Alexander G. Gray and Richard Vuduc of the College of Computing were awarded Best Paper at the SIAM International Conference on Data Mining on April 26 for their paper “A Distributed Kernel Summation Framework for General-Dimension Machine Learning.”

Kernel summations are a ubiquitous computational bottleneck in many data analysis methods. The paper proposes a hybrid MPI/OpenMP kernel summation framework for scaling many popular data analysis methods. Advantages of the approach include a platform-independent C++ code base built on standard protocols such as MPI and OpenMP; a template code structure that accommodates any multidimensional binary tree and any approximation scheme suitable for high-dimensional problems; and extensibility to a large class of problems that require fast evaluation of kernel sums.
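To make the bottleneck concrete, here is a minimal brute-force sketch of a kernel summation in Python. It is an illustration only and is not the paper's framework, which is described as C++ with MPI and OpenMP. The Gaussian kernel, the bandwidth parameter, and the function names gaussian_kernel and kernel_sums are assumptions chosen for the example. Every query point is compared against every reference point, so the work grows with the product of the two set sizes, which is the cost that tree-based and approximate methods aim to reduce.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2)) for two points.

    The choice of a Gaussian kernel is an assumption made for illustration.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def kernel_sums(queries, references, bandwidth=1.0):
    """Brute-force kernel summation (hypothetical helper).

    For each query point, add up the kernel value against every reference
    point. The cost is len(queries) * len(references) kernel evaluations.
    """
    return [
        sum(gaussian_kernel(q, r, bandwidth) for r in references)
        for q in queries
    ]


# Tiny made-up data set: three reference points and two query points in 2-D.
references = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
queries = [(0.0, 0.0), (5.0, 5.0)]
sums = kernel_sums(queries, references)
```

In this toy data set the first query sits on top of the reference points and receives a large sum, while the distant second query receives a sum close to zero.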

“Researchers have previously parallelized kernel summations in the context of simulations,” says Dongryeol Lee, a Ph.D. candidate in Computer Science. “But this paper is the first serious effort in parallelizing kernel summations in the context of data mining with potentially high-profile scientific applications.”

In data mining, kernel summations appear in the popular kernel methods, which can model complex, nonlinear structures in data. The richer expressiveness of these methods comes at a cost: they require many data points, and hence more computational power to process the collected data, according to Lee. In some cases the collected data must be stored across multiple machines.

Lee says this work is the first in the data mining community to combine algorithmic techniques from high performance computing, computer science, computational physics, computational geometry, and approximation theory in a general framework.

Kernel summations drive algorithms in application areas such as finance, astronomy, and medical science. 

Lee notes some examples: “Fraudulent financial transactions can be detected more quickly using fast kernel summations. Astronomy uses the algorithms to predict redshift of many galaxies and stars, which can shed light onto the ultimate fate of the universe. Medicine uses fast kernel summation algorithms in automated early detection of cancer that can save human lives.”