Events & calendar

September 15, 2017
Lab meeting focus: Structured stochastic variational inference (Hoffman & Blei 2015) Variational inference.

Presenter: Koushiki Bose

Structured stochastic variational inference (SSVI) allows dependencies between latent variables by using the approximating distribution q(z, \beta) = \prod_k q(\beta_k) \prod_n q(z_n | \beta), where \beta denotes the global parameters and z_n the sample-specific latent variables.
This differs from mean-field VI, which uses the fully factorized q(z, \beta) = \prod_k q(\beta_k) \prod_n q(z_n) and assumes independence of all latent variables and parameters. It also differs from the factorization implied by a hierarchical graphical model, p(y, z, \beta) = p(\beta) \prod_n p(y_n, z_n | \beta), where y_n is the nth of N observations.
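As a toy illustration (not the paper's LDA setup), the difference between the two factorizations can be sketched with Gaussian factors, where the structured q lets each local factor q(z_n | \beta) depend on the global parameter \beta; the dependence through the mean here is a hypothetical choice:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# One global parameter beta and N local latent variables z_1..z_N,
# all with standard-Gaussian variational factors in this toy example.
beta, z = 0.5, rng.normal(size=5)

# Mean-field: q(z, beta) = q(beta) * prod_n q(z_n), everything independent.
def log_q_mean_field(beta, z):
    return norm.logpdf(beta) + norm.logpdf(z).sum()

# Structured: q(z, beta) = q(beta) * prod_n q(z_n | beta); here each
# q(z_n | beta) is a Gaussian whose mean depends on beta.
def log_q_structured(beta, z):
    return norm.logpdf(beta) + norm.logpdf(z, loc=beta).sum()

print(log_q_mean_field(beta, z), log_q_structured(beta, z))
```

The two log-densities differ exactly because the structured factors shift with \beta; under mean-field, changing \beta never moves the local factors.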

To show that SSVI finds fewer local optima than standard stochastic variational inference (SVI), the authors run SSVI and use its result to initialize SVI, showing that this outperforms SVI with random initialization on the LDA model with fixed hyperparameters. In experiments, SSVI was more accurate: it finds fewer local optima and is less sensitive to hyperparameter changes.

SSVI is a compelling approach to variational inference for models where all of the local conditional distributions are tractable (including some models where mean-field VI is not appropriate), though it may be harder to implement.

August 4, 2017
Lab meeting focus: Compatible Conditional Distributions (Arnold & Press 1989) Pseudolikelihoods.

Presenter: Greg Darnell

In our search to understand more about pseudolikelihoods, we came across this really exciting paper. Focusing on the discrete random variable case, we discussed the idea of a compatible model (there exists a joint distribution for (X, Y) with the given families, X|Y and Y|X, as its conditional distributions),
how to show compatibility given a pair of conditional distributions, how to define uniqueness of the joint distribution (a particularly eye-opening approach), and how to extend these ideas to more than two variables.
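For the positive-support discrete case, a compatibility check can be sketched directly: since a common joint would give A[i, j] / B[i, j] = P(X = i) / P(Y = j), the two conditional matrices are compatible iff their entrywise ratio has rank one. The matrices below are invented for illustration:

```python
import numpy as np

# A[i, j] = P(X = i | Y = j), B[i, j] = P(Y = j | X = i), entries positive.
# Compatible iff the ratio matrix A / B factors as u_i * v_j (rank one).
def compatible(A, B, tol=1e-8):
    R = A / B
    scaled = R / R[:, [0]]  # divide each row by its first entry
    return bool(np.allclose(scaled, scaled[[0], :], atol=tol))

# Conditionals derived from an actual joint are compatible...
P = np.array([[0.1, 0.2],
              [0.3, 0.4]])
A = P / P.sum(axis=0)                  # P(X = i | Y = j)
B = P / P.sum(axis=1, keepdims=True)   # P(Y = j | X = i)
print(compatible(A, B))                # True

# ...while an arbitrary pair of conditionals generally is not.
B_bad = np.array([[0.5, 0.5],
                  [0.3, 0.7]])
print(compatible(A, B_bad))            # False
```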

March 27, 2015
Lab meeting focus: Poisson Processes Chapters 4-5.

Presenter: Genna Gliner

Today we discussed chapters four and five of Poisson Processes by J.F.C. Kingman. These two chapters continue the introduction to Poisson processes that we discussed on March 13th, with a focus on Poisson processes on the real line. On the real line, the random points have the additional property that they can be ordered, which leads to some interesting results. For example, if a Poisson process models the arrivals in a queue, the Interval Theorem states that the time between any two consecutive arrivals follows an exponential distribution.

Chapter 5 introduces the marked Poisson process; that is, it develops the theory for the case where the random points can be distinguished by colors or some other label. This leads to the notion of Poisson processes on a product space (the combination of the physical space and the space that contains the labels), and to results like the Displacement Theorem, which states that if the points of a Poisson process are independently and randomly displaced, the displaced points still form a Poisson process.

Our group spent some time brainstorming how Poisson processes are relevant in the context of genetics. While genetic data are not measured in terms of time, there are other quantities in genetic data that share the same property of order; in particular, location in the genome can be ordered. Under this view, we discussed how the Coloring Theorem and Marking Theorem can be applied to SNPs, haplotypes, and other genetic markers of interest whose occurrences can be modeled by a Poisson process.
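The Interval Theorem can be checked empirically. The sketch below simulates a homogeneous Poisson process on an interval (the rate and horizon are arbitrary choices) and inspects the gaps between the ordered points, which should look Exponential(rate) with mean 1/rate:

```python
import numpy as np

rng = np.random.default_rng(1)
rate, T = 2.0, 10_000.0

# Simulate a homogeneous Poisson process on [0, T]: the number of points
# is Poisson(rate * T), and given the count the points are iid Uniform(0, T).
n = rng.poisson(rate * T)
arrivals = np.sort(rng.uniform(0.0, T, size=n))

# Interval Theorem: gaps between consecutive arrivals are iid
# Exponential(rate), so their sample mean should be close to 1/rate.
gaps = np.diff(arrivals)
print(gaps.mean())  # ~ 0.5
```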

March 13, 2015
Lab meeting focus: Poisson Processes Chapters 1-3.

Presenter: Genna Gliner

Today we discussed the first three chapters of Poisson Processes by J.F.C. Kingman. The motivation behind the Poisson process is that the Poisson distribution has many wonderful properties that we can exploit when computing joint probabilities, moments, conditional expectations, and much more. Kingman shows that the nice properties of the Poisson distribution naturally arise, or have counterparts, in the theory of Poisson processes. This leads to behaviors and patterns that can be used to derive analytic formulas and characterize a variety of interesting distributions, similar to the theory of Gaussian processes.

The Poisson distribution can be thought of as modeling the distribution of random points in space. For instance, it can model the number of trees growing in an acre of land. A key (and powerful) property is that the numbers of trees in two disjoint acres of land are independent of each other, and this independence means we can compute joint distributions. While the Poisson distribution models the number of points in a fixed spatial region, a Poisson process models how the count changes as the region changes; it turns out that this count again follows a Poisson distribution. Kingman uses this relationship to prove practical theorems about Poisson processes, including: the Superposition Theorem (the countable union of independent Poisson processes is a Poisson process), the Mapping Theorem (under certain conditions, a transformation of a Poisson process from one state space to another is a Poisson process), Campbell's Theorem (conditions under which the sum of a real-valued function over the points of a Poisson process exists, along with a characterization of its distribution), and Rényi's Theorem (conditions, which do NOT include independence, under which a countable random subset of d-dimensional Euclidean space is a Poisson process).
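The Coloring Theorem mentioned above (coloring each point independently splits a Poisson process into independent Poisson processes with thinned rates) can be sketched in simulation; the rate, horizon, and coloring probability here are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(2)
rate, T, p, reps = 5.0, 1.0, 0.3, 200_000

# Color each point of a rate-5 process red with probability 0.3.
# The red and blue counts should then be independent Poisson variables
# with means p * rate * T and (1 - p) * rate * T.
totals = rng.poisson(rate * T, size=reps)
red = rng.binomial(totals, p)
blue = totals - red

print(red.mean(), blue.mean())        # ~ 1.5 and ~ 3.5
print(np.corrcoef(red, blue)[0, 1])   # ~ 0, despite sharing the totals
```

The near-zero correlation is the surprising part: even though red + blue is fixed within each replicate, the two colored counts are unconditionally independent.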

March 6, 2015
Lab meeting focus: Maximization Expectation.

Presenter: Derek Aguiar

The 2006 Welling and Kurihara paper formalizes Maximization Expectation (ME), an alternating model estimation framework that reverses the roles of expectation and maximization in the well-known EM algorithm. A common characteristic of biological data is that the latent variables far outnumber the model parameters. For example, clustering gene expression data requires a latent cluster assignment for each gene (~20k genes), while the model parameters are confined to the number of clusters and the cluster properties. The ME algorithm combines selection of model structure (e.g., the number of clusters) with hard assignments of latent variables (e.g., the cluster assignments), frequently leading to fast implementations. One of the important contributions of this paper is formalizing the four alternating model learning algorithms and placing them in a proper perspective (image, right).
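A schematic one-dimensional sketch of the ME idea, hard-maximizing the cluster assignments while updating the cluster centres by a posterior mean, might look like the following. This is an illustration, not the paper's Bayesian k-means; the data, initial centres, and prior precision are hypothetical choices:

```python
import numpy as np

rng = np.random.default_rng(3)

# Two well-separated 1-D Gaussian clusters of 50 points each.
X = np.concatenate([rng.normal(-3, 1, 50), rng.normal(3, 1, 50)])
mu = np.array([-1.0, 1.0])   # initial cluster centres
prior_precision = 1e-2       # hypothetical zero-mean prior on the centres

for _ in range(20):
    # M-step over latent variables: hard-assign each point to its
    # nearest centre (maximization replaces the usual E-step).
    z = np.argmin(np.abs(X[:, None] - mu[None, :]), axis=1)
    # E-step over parameters: posterior-mean update of each centre
    # under unit observation variance and the zero-mean prior.
    for k in range(2):
        pts = X[z == k]
        if pts.size:
            mu[k] = pts.sum() / (pts.size + prior_precision)

print(np.sort(mu))  # centres near -3 and 3
```

With many latent variables and few parameters, this reversal keeps the cheap part (hard assignments) per-datum and reserves the distributional treatment for the small set of parameters, which is what makes ME attractive for data like gene expression.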