Research Projects

[Back to Homepage]

Small-Variance Asymptotics
Visual Domain Adaptation
Online Learning
Hashing Algorithms for Image Search
Metric Learning
Scalable Semidefinite Programming
Graph Clustering and Image Segmentation
Topic Tracking


Small-Variance Asymptotics


The design and use of rich probabilistic graphical models have proven to be among the most important research themes in machine learning. For complex models, particularly those involving continuous variables or distributions such as Bayesian nonparametric processes, scalability remains a concern: sampling and variational inference methods simply do not scale to many real-world analysis problems.

We take as a starting point the connection between k-means and mixtures of Gaussians (a similar connection exists between PCA and probabilistic PCA): as the covariances of the clusters in a Gaussian mixture model tend to zero, the complete-data log likelihood approaches the k-means objective and the EM algorithm approaches the k-means algorithm. Recently, we have been extending this idea in two complementary directions: i) to larger graphical models (e.g., hierarchical models and sequential data models), and ii) to models involving Bayesian nonparametric processes. In both cases, one can apply small-variance asymptotics similar to those connecting EM and k-means to these more complex models; the resulting models maintain key properties of the original probabilistic models. Crucially, this yields a number of scalable algorithms for problems where the rich probabilistic models cannot be applied.
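
As a concrete illustration of the kind of algorithm that falls out of this analysis, below is a minimal Python sketch in the spirit of the hard clustering procedure derived from the Dirichlet process mixture in the ICML 2012 paper listed below: each point joins its nearest cluster unless its squared distance to every existing mean exceeds a penalty parameter lam, in which case it starts a new cluster. This is an illustrative sketch only, not the paper's reference implementation.

    import numpy as np

    def dp_means(X, lam, n_iters=20):
        """Minimal DP-means-style hard clustering (illustrative sketch only)."""
        means = [X[0].copy()]                              # start with a single cluster
        for _ in range(n_iters):
            # Assignment step: nearest mean, or a new cluster if every mean is too far away.
            z = np.zeros(len(X), dtype=int)
            for i, x in enumerate(X):
                d2 = np.array([np.sum((x - m) ** 2) for m in means])
                if d2.min() > lam:
                    means.append(x.copy())                 # penalty exceeded: open a new cluster
                    z[i] = len(means) - 1
                else:
                    z[i] = int(d2.argmin())
            # Update step: recompute each cluster mean (empty clusters keep their old mean).
            means = [X[z == k].mean(axis=0) if np.any(z == k) else means[k]
                     for k in range(len(means))]
        return z, np.array(means)

Here the single penalty parameter lam plays, roughly, the combined role of the cluster variance and the Dirichlet process concentration parameter in the underlying probabilistic model.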

Some representative publications:

  • Small-Variance Asymptotics for Exponential Family Dirichlet Process Mixture Models
    Ke Jiang, Brian Kulis, & Michael I. Jordan
    In Neural Information Processing Systems (NIPS), 2012.
    [pdf]

  • Revisiting k-means: New Algorithms via Bayesian Nonparametrics
    Brian Kulis & Michael I. Jordan
    In Proc. 29th International Conference on Machine Learning (ICML), 2012.
    [pdf]

Visual Domain Adaptation


In many applications, the distribution of the data used for training may differ from that of the data seen at test time. In computer vision, for example, we may train an object classifier on images taken from the Internet but then apply it to images taken from a low-resolution robot camera. We focus mainly on the setting where we have a large amount of labeled training data from the source domain and a small amount of labeled data from the target domain. Using this data, we attempt to learn mappings from source to target, and vice versa. Our approach is based on extensions of metric learning: instead of learning a single transformation applied uniformly to all the data, we learn a transformation that maps data from one domain to the other.
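
As a rough illustration of what learning a mapping between domains can look like, the sketch below fits a linear map W from source features to paired target features by regularized least squares. This is only a toy example for intuition; the CVPR 2011 paper below instead learns an asymmetric transform from similarity and dissimilarity constraints and supports kernelization.

    import numpy as np

    def fit_source_to_target(Xs, Xt, reg=1e-3):
        """Fit W so that Xs @ W approximates Xt for corresponding source/target pairs.

        Xs: (n, d_source) source features; Xt: (n, d_target) paired target features.
        Plain ridge regression; an illustrative stand-in, not the method in the paper.
        """
        d = Xs.shape[1]
        W = np.linalg.solve(Xs.T @ Xs + reg * np.eye(d), Xs.T @ Xt)
        return W

    # Usage sketch: map new source-domain points into the target domain, then
    # classify them with a model trained on target-domain data:  Xs_mapped = Xs_new @ W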

Some representative publications:

  • What You Saw is Not What You Get: Domain Adaptation Using Asymmetric Kernel Transforms
    Brian Kulis, Kate Saenko & Trevor Darrell
    In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011.
    Oral Presentation: 3.5% Acceptance Rate
    [pdf]

  • Adapting Visual Category Models to New Domains
    Kate Saenko, Brian Kulis, Mario Fritz & Trevor Darrell
    In Proc. 11th European Conference on Computer Vision (ECCV), 2010.
    [pdf]

Online Learning


There has been considerable interest in online machine learning algorithms, which observe each data point only once and do not store the entire data set. Such algorithms are desirable from both theoretical and practical points of view: worst-case performance bounds can be proven, and the algorithms have shown success on many real-world problems.

My work has examined several online learning problems. At ICML 2010, we analyzed general convex online learning problems and proved bounds for a class of algorithms based on implicit update rules; the resulting algorithms improve on the state of the art for a variety of problems. Earlier, I worked on proving bounds for the online metric learning problem, with a focus on the theoretical properties of online algorithms involving the LogDet divergence, and we also considered how to integrate online learning with hashing for large-scale similarity search.
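
To give the flavor of an implicit update, the sketch below contrasts it with the standard gradient step for online regression with squared loss. Instead of stepping along the gradient at the current iterate, the implicit update solves a small proximal problem for the next iterate; for squared loss this has a simple closed form. This is an illustration of the general idea, not the specific algorithms or analysis from the ICML 2010 paper.

    import numpy as np

    def explicit_update(w, x, y, eta):
        """Standard online gradient step for the loss 0.5 * (w.x - y)^2."""
        return w - eta * (np.dot(w, x) - y) * x

    def implicit_update(w, x, y, eta):
        """Implicit (proximal) step: w_new = argmin_v 0.5*||v - w||^2 + eta*0.5*(v.x - y)^2.

        For squared loss the minimizer has a closed form, which amounts to evaluating
        the gradient at the new iterate rather than the current one.
        """
        return w - (eta * (np.dot(w, x) - y) / (1.0 + eta * np.dot(x, x))) * x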

Some representative publications:

  • Implicit Online Learning
    Brian Kulis & Peter Bartlett
    In Proc. 27th International Conference on Machine Learning (ICML), 2010.
    [pdf]

  • Online Metric Learning and Fast Similarity Search
    Prateek Jain, Brian Kulis, Inderjit Dhillon, & Kristen Grauman
    In Neural Information Processing Systems (NIPS), 2008.
    Oral Presentation: 2.7% Acceptance Rate
    [pdf], Longer version: [pdf]

Hashing Algorithms for Image Search


A common problem with large-scale data is quickly retrieving the nearest neighbors of a query from a large database. In computer vision, for example, this problem arises in content-based image retrieval, 3-d reconstruction, human body pose estimation, and object recognition.

My work has developed algorithms for quickly and accurately performing large-scale image searches using hashing techniques. We developed algorithms for incorporating hashing methods for learned metrics (CVPR 2008) as well as for performing locality-sensitive hashing over arbitrary kernel functions (ICCV 2009), two prominent scenarios arising in modern computer vision applications. We have applied our algorithms to several large-scale data sets, including the 80 million images of the Tiny Image data set.
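
For intuition about the hashing machinery, below is a minimal Python sketch of standard random-hyperplane locality-sensitive hashing, in which each bit of a binary code records which side of a random hyperplane a vector falls on, so that vectors with high cosine similarity tend to collide in the same buckets. The ICCV 2009 paper constructs hash functions of this type when the data are only accessible through a kernel function; the sketch below is the plain Euclidean version, not the kernelized method from the paper.

    import numpy as np

    class RandomHyperplaneLSH:
        """Binary codes from signs of random projections (cosine-similarity LSH)."""

        def __init__(self, dim, n_bits, seed=0):
            rng = np.random.default_rng(seed)
            self.R = rng.standard_normal((dim, n_bits))   # one random hyperplane per bit

        def hash(self, X):
            # Bit b of item i records which side of hyperplane b the point X[i] lies on.
            return (X @ self.R > 0).astype(np.uint8)

    # Usage sketch: hash the database once, then compare a query only against items
    # whose codes agree in most bits, instead of scanning the entire database.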

Some representative publications:

  • Kernelized Locality-Sensitive Hashing for Scalable Image Search
    Brian Kulis & Kristen Grauman
    In Proc. 12th International Conference on Computer Vision (ICCV), 2009.
    [pdf] [webpage and code]

  • Fast Image Search for Learned Metrics
    Prateek Jain, Brian Kulis, & Kristen Grauman
    In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008.
    CVPR 2008 Best Student Paper Award
    [pdf]

Metric Learning


Metric learning is the problem of learning an appropriate distance function for a given task. For example, similarity search over large-scale data relies on the underlying metric, which may change from task to task or over time. Most existing metric learning methods learn a Mahalanobis distance but are limited to low-dimensional and/or small data sets, making them impractical for many real-world problems.

In my recent work, I have shown how to learn Mahalanobis metrics in a variety of settings. These include learning metrics for high-dimensional data and non-linear transformations via kernelization (ICML 2007), learning metrics in online scenarios (NIPS 2008), and learning metrics for large-scale data (CVPR 2008). This work has been applied in conjunction with hashing techniques to a variety of problems in computer vision.
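
To fix notation, a Mahalanobis metric is parameterized by a positive semidefinite matrix A, often factored as A = L^T L, so that d_A(x, y) = (x - y)^T A (x - y) = ||L(x - y)||^2; learning the metric means fitting A (or L) from pairs labeled similar or dissimilar. The toy sketch below takes plain gradient steps on L to pull similar pairs together and push dissimilar pairs apart; it only illustrates the parameterization and is not the LogDet-regularized ITML algorithm from the papers below.

    import numpy as np

    def mahalanobis_sq(L, x, y):
        """Squared Mahalanobis distance (x - y)^T L^T L (x - y)."""
        d = L @ (x - y)
        return float(d @ d)

    def learn_metric(similar, dissimilar, dim, lr=0.01, margin=1.0, n_epochs=50):
        """Toy pairwise metric learning on the factor L (so A = L^T L stays PSD)."""
        L = np.eye(dim)
        for _ in range(n_epochs):
            for x, y in similar:                     # shrink distances between similar pairs
                diff = x - y
                L -= lr * 2.0 * np.outer(L @ diff, diff)
            for x, y in dissimilar:                  # grow distances that fall below the margin
                if mahalanobis_sq(L, x, y) < margin:
                    diff = x - y
                    L += lr * 2.0 * np.outer(L @ diff, diff)
        return L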

Some of our metric learning code is available here.

Some representative publications:

  • Low-Rank Kernel Learning with Bregman Matrix Divergences
    Brian Kulis, Matyas Sustik, & Inderjit Dhillon
    Journal of Machine Learning Research, 10 (Feb): 341--376, 2009.
    [pdf]

  • Online Metric Learning and Fast Similarity Search
    Prateek Jain, Brian Kulis, Inderjit Dhillon, & Kristen Grauman
    In Neural Information Processing Systems (NIPS), 2008.
    Oral Presentation: 2.7% Acceptance Rate
    [pdf], Longer version: [pdf]

  • Information-Theoretic Metric Learning
    Jason Davis, Brian Kulis, Prateek Jain, Suvrit Sra, & Inderjit Dhillon
    In Proc. 24th International Conference on Machine Learning (ICML), 2007.
    ICML 2007 Best Student Paper Award
    [pdf] [code]

Scalable Semidefinite Programming


Semidefinite programming arises in a number of machine learning problems, including recommender systems, graph cut problems, and metric learning. One of the main disadvantages of semidefinite programming is its lack of scalability to large problems.

I have studied algorithms for scaling semidefinite programming to large data sets, with a focus on semidefinite programming problems arising in practical machine learning applications. With colleagues at Microsoft Research, I developed algorithms for low-rank semidefinite programming (SDPs with an additional low-rank constraint) for embedding and clustering problems (AISTATS 2007), and with colleagues at UT Austin and Max Planck Institute, I developed a general, scalable first-order SDP algorithm by analyzing convex perturbations for SDPs (AISTATS 2009).
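
The low-rank idea can be illustrated with a generic Burer-Monteiro-style factorization: replace the PSD variable X with Y Y^T for a tall, thin Y and optimize over Y directly, so that a full n-by-n matrix never needs to be stored or projected. The sketch below applies this to a max-cut-style SDP (minimize <C, X> subject to X PSD and diag(X) = 1), handling the diagonal constraint by row normalization; it is a generic illustration, not the specific algorithms from the AISTATS papers below.

    import numpy as np

    def low_rank_sdp(C, rank=10, lr=0.1, n_iters=500, seed=0):
        """Approximately minimize <C, X> over PSD X with diag(X) = 1 via X = Y Y^T."""
        n = C.shape[0]
        rng = np.random.default_rng(seed)
        Y = rng.standard_normal((n, rank))
        Y /= np.linalg.norm(Y, axis=1, keepdims=True)        # enforce diag(Y Y^T) = 1
        for _ in range(n_iters):
            Y -= lr * 2.0 * (C @ Y)                          # gradient of <C, Y Y^T> for symmetric C
            Y /= np.linalg.norm(Y, axis=1, keepdims=True)    # re-project each row to unit norm
        return Y                                             # X is approximated by Y @ Y.T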

Some representative publications:

  • Convex Perturbations for Scalable Semidefinite Programming
    Brian Kulis, Suvrit Sra, & Inderjit Dhillon
    In Proc. 12th Intl. AISTATS Conference, 2009.
    [pdf]

  • Fast Low-Rank Semidefinite Programming for Embedding and Clustering
    Brian Kulis, Arun Surendran, & John Platt
    In Proc. 11th Intl. AISTATS Conference, 2007.
    [pdf]

Graph Clustering and Image Segmentation


Spectral clustering partitions the nodes of a graph by minimizing one of several graph cut objectives; it has received a significant amount of attention recently and is used in numerous applications. I worked on Graclus, the first algorithm for optimizing spectral clustering objectives that does not use any eigenvector computation. This is possible due to a mathematical connection between spectral clustering and weighted kernel k-means. Our resulting multilevel algorithm outperforms the best existing spectral methods in terms of quality, speed, and memory usage. Applications include fast image segmentation, speech segmentation, and gene network clustering.
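
The key fact behind Graclus is that, for an appropriately constructed kernel and point weights, weighted kernel k-means monotonically improves the same objective as spectral cuts such as normalized cut, so no eigenvector computation is required. Below is a minimal Python sketch of the weighted kernel k-means assignment step, in which the distance from a point to a cluster centroid is computed entirely from kernel values; it is a single-level illustrative sketch, not the multilevel Graclus implementation.

    import numpy as np

    def weighted_kernel_kmeans(K, w, k, n_iters=20, seed=0):
        """Weighted kernel k-means using only the kernel matrix K and point weights w."""
        n = K.shape[0]
        rng = np.random.default_rng(seed)
        z = rng.integers(0, k, size=n)                       # random initial assignments
        for _ in range(n_iters):
            dist = np.full((n, k), np.inf)
            for c in range(k):
                idx = np.where(z == c)[0]
                if idx.size == 0:
                    continue                                 # skip empty clusters
                wc, sc = w[idx], w[idx].sum()
                # ||phi(x_i) - m_c||^2 expressed purely through kernel values:
                cross = (K[:, idx] @ wc) / sc
                within = (wc @ K[np.ix_(idx, idx)] @ wc) / (sc ** 2)
                dist[:, c] = np.diag(K) - 2.0 * cross + within
            z = dist.argmin(axis=1)
        return z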

Graclus is available for download here.

Some representative publications:

  • Weighted Graph Cuts without Eigenvectors: A Multilevel Approach
    Inderjit Dhillon, Yuqiang Guan, & Brian Kulis
    IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 11, pp. 1944--1957, 2007.
    [pdf] [code]

  • Semi-Supervised Graph Clustering: A Kernel Approach
    Brian Kulis, Sugato Basu, Inderjit Dhillon, & Raymond Mooney
    In Proc. 22nd International Conference on Machine Learning (ICML), 2005.
    ICML 2005 Best Student Paper Award
    [pdf]

  • Kernel k-means, Spectral Clustering, and Normalized Cuts
    Inderjit Dhillon, Yuqiang Guan, & Brian Kulis
    In Proc. 10th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2004.
    [pdf]

Topic Tracking


As an undergraduate, I worked on tracking topics in networked data over time. Our goal was to understand how research topics in computer science evolve by analyzing the citation structure of computer science publication networks. We developed an algorithm based on finding natural communities: clusters that are insensitive to small perturbations. We applied our techniques to the CiteSeer network.

Some representative publications:

  • Tracking Evolving Communities in Large Linked Networks
    John Hopcroft, Omar Khan, Brian Kulis, & Bart Selman
    Proc. of the National Academy of Sciences, vol. 101, pp. 5249--5253, April 2004.
    [pdf]

  • Natural Communities in Large Linked Networks
    John Hopcroft, Omar Khan, Brian Kulis, & Bart Selman
    In Proc. 9th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2003.
    [pdf]