Background and Research Interests

I am a Computer Science PhD student at Princeton University, where I am advised by Professor Tom Griffiths. I graduated from Columbia University with an M.S. in Data Science and a B.A. with majors in Statistics and Applied Math. I was advised by Professor David Blei during my master's, and by Professor Itsik Pe'er and Professor Andrew Gelman during undergrad.
I use Bayesian statistics to explain both human thinking and deep learning models. I focus on using probabilistic models to distill knowledge from, and instill knowledge into, deep learning models, and on developing scalable and effective approximate inference methods for probabilistic models.

Publications and Preprints


Transport Score Climbing: Variational Inference Using Forward KL and Adaptive Neural Transport
Liyi Zhang,
Christian A. Naesseth,
David M. Blei,
Accepted at Transactions on Machine Learning Research (TMLR). Code.
Variational inference often minimizes the "reverse" Kullback-Leibler (KL) divergence KL(q || p) from the approximate distribution q to the posterior p. Recent work studies the "forward" KL divergence KL(p || q), which, unlike reverse KL, does not lead to variational approximations that underestimate uncertainty. This paper introduces Transport Score Climbing (TSC), a method that optimizes KL(p || q) by using Hamiltonian Monte Carlo (HMC) and a novel adaptive transport map. The transport map improves the trajectory of HMC by acting as a change of variable between the latent variable space and a warped space. TSC uses HMC samples to dynamically train the transport map while optimizing KL(p || q). TSC leverages synergies, where better transport maps lead to better HMC sampling, which then leads to better transport maps. We demonstrate TSC on synthetic and real data. We find that TSC achieves competitive performance when training variational autoencoders on large-scale data.
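
To make the contrast between the two divergences concrete, here is a minimal sketch. It is not the TSC algorithm: there is no HMC and no transport map, and the bimodal target, the grid ranges, and all variable names are assumptions chosen for illustration. It fits a single Gaussian q to a toy target p once by minimizing forward KL (which for a Gaussian family reduces to moment matching) and once by minimizing reverse KL (brute-force grid search).

```python
import numpy as np

# Toy bimodal target p(x): an equal mixture of N(-2, 0.5^2) and N(2, 0.5^2).
# (Illustrative assumption; not a model from the paper.)
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]


def log_normal(x, mu, sigma):
    """Log density of N(mu, sigma^2) evaluated at x."""
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))


log_p = np.logaddexp(log_normal(x, -2.0, 0.5), log_normal(x, 2.0, 0.5)) + np.log(0.5)
p = np.exp(log_p)

# Forward KL(p || q) over Gaussian q is minimized by matching the moments of p.
mu_fwd = np.sum(p * x) * dx
sigma_fwd = np.sqrt(np.sum(p * (x - mu_fwd) ** 2) * dx)


def reverse_kl(mu, sigma):
    """Numerically integrate KL(q || p) for q = N(mu, sigma^2) on the grid."""
    log_q = log_normal(x, mu, sigma)
    return np.sum(np.exp(log_q) * (log_q - log_p)) * dx


# Reverse KL has no closed form here, so search a coarse (mu, sigma) grid.
candidates = [
    (m, s)
    for m in np.linspace(-3.0, 3.0, 61)
    for s in np.linspace(0.2, 3.0, 57)
]
mu_rev, sigma_rev = min(candidates, key=lambda ms: reverse_kl(*ms))

print(f"forward KL fit: mu={mu_fwd:.2f}, sigma={sigma_fwd:.2f}")
print(f"reverse KL fit: mu={mu_rev:.2f}, sigma={sigma_rev:.2f}")
```

In this toy setting the forward-KL fit spreads over both modes, while the reverse-KL fit locks onto a single mode with a much smaller standard deviation, which is the uncertainty underestimation mentioned above.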


Variational Combinatorial Sequential Monte Carlo Methods in Bayesian Phylogenetic Inference
Antonio K. Moretti*,
Liyi Zhang*,
Christian A. Naesseth,
Hadiah Venner,
David M. Blei,
Itsik Pe'er,
Uncertainty in Artificial Intelligence 2021 (UAI). Code.
Bayesian phylogenetic inference is often conducted via local or sequential search over topologies and branch lengths using algorithms such as random-walk Markov chain Monte Carlo (MCMC) or Combinatorial Sequential Monte Carlo (CSMC). However, when MCMC is used for evolutionary parameter learning, convergence requires long runs with inefficient exploration of the state space. We introduce Variational Combinatorial Sequential Monte Carlo (VCSMC), a powerful framework that establishes variational sequential search to learn distributions over intricate combinatorial structures. We then develop nested CSMC, an efficient proposal distribution for CSMC, and prove that nested CSMC is an exact approximation to the (intractable) locally optimal proposal. We use nested CSMC to define a second objective, VNCSMC, which yields tighter lower bounds than VCSMC. We show that VCSMC and VNCSMC are computationally efficient and explore higher-probability spaces than existing methods on a range of tasks.
Variational Combinatorial Sequential Monte Carlo in Bayesian Phylogenetic Inference
Antonio K. Moretti,
Liyi Zhang,
Itsik Pe'er,
Machine Learning in Computational Biology 2020 (MLCB). Code.


Model Stacking in Bayesian Phylogenetic Inference
Columbia Statistics Undergraduate Summer Research Internship with Andrew Gelman
We develop a stacking algorithm for phylogenetic inference by isolating discrete models, using MCMC-based methods in Stan to sample the continuous parameters, and adopting stacking for model combination. Preliminary results support the hypothesis that stacking is less prone than Bayesian Model Averaging to producing spuriously high model posteriors.
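
As a rough illustration of the model-combination step only, the sketch below computes stacking weights from a matrix of leave-one-out predictive densities and compares them with weights proportional to each model's exponentiated total log score. Everything here is an assumption for illustration: the densities are synthetic random numbers rather than output from Stan or any phylogenetic model, the function names are hypothetical, the EM-style update is one simple way to maximize the stacking objective (not necessarily what the project used), and the score-based weights are a pseudo-BMA simplification rather than full Bayesian Model Averaging with marginal likelihoods.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for leave-one-out predictive densities p(y_i | y_-i, M_k):
# rows are n held-out data points, columns are K candidate models.
n, K = 200, 3
loo_dens = rng.gamma(shape=2.0, scale=1.0, size=(n, K)) * np.array([1.0, 1.1, 0.9])


def stacking_objective(w, dens):
    """Mean log score of the weighted predictive mixture."""
    return np.mean(np.log(dens @ w))


def stacking_weights(dens, n_iter=2000):
    """Maximize the stacking objective over the simplex with EM-style updates."""
    w = np.full(dens.shape[1], 1.0 / dens.shape[1])
    for _ in range(n_iter):
        resp = dens * w                          # unnormalized responsibilities (n, K)
        resp /= resp.sum(axis=1, keepdims=True)  # normalize per data point
        w = resp.mean(axis=0)                    # updated weights stay on the simplex
    return w


def pseudo_bma_weights(dens):
    """Weights proportional to exp(sum of log predictive densities) per model."""
    log_score = np.log(dens).sum(axis=0)
    w = np.exp(log_score - log_score.max())      # subtract max for numerical stability
    return w / w.sum()


w_stack = stacking_weights(loo_dens)
w_bma = pseudo_bma_weights(loo_dens)
print("stacking weights:  ", np.round(w_stack, 3))
print("pseudo-BMA weights:", np.round(w_bma, 3))
```

Because the score-based weights exponentiate a sum over all data points, small per-point differences between models compound, which is one way such weights can concentrate almost entirely on a single model.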


Improving Neural Network Robustness with Bayesian Weight Sampling
STCS 6701 Foundations of Graphical Models, Final Project. Code.
Deep neural networks are the state of the art for various tasks and benchmarks, but they are often not robust to slight or even negligible input perturbations. On image classification tasks, for example, models with near-perfect accuracy can easily degrade to near-zero accuracy when the input images change by a tiny amount that is invisible to humans. We propose to improve models' robustness against input perturbations by adding diversity in model weights during training with Bayesian neural networks. Our experiments on an image classification benchmark show that Bayesian neural networks are more robust than non-Bayesian deep neural networks trained with norm-based regularization. We claim that introducing diversity of model weights during model training improves models' robustness against input perturbations.

