# Avoiding Latent Variable Collapse With Generative Skip Models

```bibtex
@article{Dieng2019AvoidingLV,
  title   = {Avoiding Latent Variable Collapse With Generative Skip Models},
  author  = {Adji B. Dieng and Yoon Kim and Alexander M. Rush and David M. Blei},
  journal = {ArXiv},
  year    = {2019},
  volume  = {abs/1807.04863}
}
```
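The paper's central device is to re-inject the latent variable into every layer of the generative network, so the decoder cannot easily ignore it. A minimal numpy sketch of such a skip decoder, where the layer sizes and the use of concatenation skips are illustrative choices, not taken from the paper:

```python
import numpy as np

def skip_decoder(z, weights):
    """Sketch of a generative skip model's decoder: the latent z is
    concatenated onto every hidden layer, so each stage keeps a direct
    path to z and the latent is harder to collapse."""
    h = np.tanh(z @ weights[0])
    for W in weights[1:-1]:
        h = np.tanh(np.concatenate([h, z], axis=-1) @ W)  # skip from z
    return np.concatenate([h, z], axis=-1) @ weights[-1]  # output logits

# Tiny example: 2-D latent, two hidden layers of width 8, 5-D output.
rng = np.random.default_rng(0)
d_z, d_h, d_x = 2, 8, 5
weights = [rng.normal(size=(d_z, d_h)),
           rng.normal(size=(d_h + d_z, d_h)),
           rng.normal(size=(d_h + d_z, d_x))]
x_logits = skip_decoder(rng.normal(size=(3, d_z)), weights)
```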

#### 99 Citations

Lagging Inference Networks and Posterior Collapse in Variational Autoencoders

- Computer Science, Mathematics
- ICLR
- 2019

This paper investigates posterior collapse from the perspective of training dynamics and proposes an extremely simple modification to VAE training to reduce inference lag: guided by the model's current mutual information between latent variable and observation, the inference network is aggressively optimized before each model update.
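The aggressive schedule that this fix describes can be caricatured on a toy objective. In the sketch below, `elbo` is a made-up concave surrogate (not the paper's objective), and `theta` and `phi` stand in for the generative-model and inference-network parameters; the point is only the control flow, in which the inference parameter is re-fit until it has caught up before each model step:

```python
import numpy as np

def elbo(theta, phi):
    """Toy surrogate: the phi-optimum tracks theta, mimicking how the
    approximate posterior should track the true posterior."""
    return -(phi - theta) ** 2 - (theta - 1.0) ** 2

def grad_phi(theta, phi):
    return -2.0 * (phi - theta)

def grad_theta(theta, phi):
    return 2.0 * (phi - theta) - 2.0 * (theta - 1.0)

theta, phi, lr = 5.0, 0.0, 0.1
for step in range(200):
    # "Aggressive" inner loop: update the inference parameter until it
    # has (nearly) converged before each model update.
    for _ in range(100):
        new_phi = phi + lr * grad_phi(theta, phi)
        if abs(new_phi - phi) < 1e-6:
            break
        phi = new_phi
    theta += lr * grad_theta(theta, phi)
```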

Posterior Collapse and Latent Variable Non-identifiability

- 2020

Variational autoencoders (VAEs) model high-dimensional data by positing low-dimensional latent variables that are mapped through a flexible, implicit distribution parametrized by a neural network.…

BIVA: A Very Deep Hierarchy of Latent Variables for Generative Modeling

- Computer Science, Mathematics
- NeurIPS
- 2019

This paper introduces the Bidirectional-Inference Variational Autoencoder (BIVA), characterized by a skip-connected generative model and an inference network formed by a bidirectional stochastic inference path, and shows that BIVA reaches state-of-the-art test likelihoods, generates sharp and coherent natural images, and uses the hierarchy of latent variables to capture different aspects of the data distribution.

Preventing Posterior Collapse with Levenshtein Variational Autoencoder

- Computer Science, Mathematics
- ArXiv
- 2020

Levenshtein VAE produces more informative latent representations than alternative approaches to preventing posterior collapse, and is closely related to optimizing a bound on the intractable Kullback-Leibler divergence of an LD-based kernel density estimator from the model distribution.

Preventing Posterior Collapse With δ-VAEs

- 2019

Due to the phenomenon of “posterior collapse,” current latent variable generative models pose a challenging design choice that either weakens the capacity of the decoder or requires altering the…
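The δ-VAE's committed-rate idea is that the posterior family is deliberately mismatched with the prior so the KL term has a positive floor δ that no parameter setting can drive to zero. The paper achieves this with a structured (autoregressive) prior; the one-dimensional caricature below gets the same effect more simply by fixing the posterior variance away from the prior's, as a hedged illustration:

```python
import numpy as np

def kl_gauss_std_normal(mu, sigma):
    """KL( N(mu, sigma^2) || N(0, 1) ), elementwise and in closed form."""
    return 0.5 * (sigma**2 + mu**2 - 1.0 - 2.0 * np.log(sigma))

# Fixing the posterior std at sigma != 1 gives the KL a positive floor
# delta = KL at mu = 0; no choice of mu can collapse it to zero.
sigma = 0.5
delta = kl_gauss_std_normal(0.0, sigma)
kls = kl_gauss_std_normal(np.linspace(-3, 3, 101), sigma)
```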

A Surprisingly Effective Fix for Deep Latent Variable Modeling of Text

- Computer Science, Mathematics
- EMNLP/IJCNLP
- 2019

A simple fix for posterior collapse is investigated which yields surprisingly effective results and is used to argue that the typical surrogate objective for VAEs may not be sufficient or necessarily appropriate for balancing the goals of representation learning and data distribution modeling.

Discretized Bottleneck in VAE: Posterior-Collapse-Free Sequence-to-Sequence Learning

- Computer Science, Mathematics
- ArXiv
- 2020

This paper proposes a principled approach to eliminating posterior collapse by applying a discretized bottleneck: a shared discrete latent space in which each input learns to choose a combination of shared latent atoms as its representation.
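The discretized bottleneck can be sketched as a VQ-style codebook lookup. The paper lets each input choose a combination of atoms; for brevity the sketch below assigns a single nearest atom, and the codebook size and distance rule are illustrative:

```python
import numpy as np

def quantize(encodings, codebook):
    """Map each continuous encoding to its nearest codebook atom under
    squared Euclidean distance. Because the decoder only ever sees
    codebook atoms, the latent channel cannot go unused the way a
    collapsed continuous posterior can."""
    # (n, 1, d) - (1, K, d) -> (n, K) squared distances
    d2 = ((encodings[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d2.argmin(axis=1)
    return codebook[idx], idx

rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 4))          # K = 16 shared latent atoms
z_q, idx = quantize(rng.normal(size=(5, 4)), codebook)
```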

Sequential Learning and Regularization in Variational Recurrent Autoencoder

- Computer Science
- 2020 28th European Signal Processing Conference (EUSIPCO)
- 2021

Experiments on language modeling and sentiment classification show that the proposed method mitigates posterior collapse and learns meaningful latent features that improve inference and generation for semantic representation.

Characterizing and Avoiding Problematic Global Optima of Variational Autoencoders

- Computer Science, Mathematics
- AABI
- 2019

This paper demonstrates that both issues stem from the fact that the global optima of the VAE training objective often correspond to undesirable solutions and presents a novel inference method, LiBI, mitigating the problems identified in the analysis.

Reweighted Expectation Maximization

- Mathematics, Computer Science
- ArXiv
- 2019

A new EM-based algorithm for fitting deep generative models, called reweighted expectation maximization (REM), is proposed and compared to the VAE and the IWAE on several density estimation benchmarks, where it leads to significantly better performance as measured by log-likelihood.

#### References

Showing 1–10 of 36 references

Spherical Latent Spaces for Stable Variational Autoencoders

- Computer Science
- EMNLP
- 2018

This work experiments with another choice of latent distribution, namely the von Mises-Fisher (vMF) distribution, which places mass on the surface of the unit hypersphere, and shows that vMF-based models learn richer and more nuanced structures in their latent representations than their Gaussian counterparts.

Ladder Variational Autoencoders

- Mathematics, Computer Science
- NIPS
- 2016

A new inference model is proposed, the Ladder Variational Autoencoder, which recursively corrects the generative distribution by a data-dependent approximate likelihood in a process resembling the recently proposed Ladder Network.
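The data-dependent correction in the Ladder VAE combines a bottom-up (data-driven) Gaussian with a top-down (generative) Gaussian by precision weighting, pulling the posterior approximation toward whichever source is more confident. A minimal sketch for diagonal Gaussians:

```python
import numpy as np

def precision_weighted_merge(mu_q, var_q, mu_p, var_p):
    """Merge two Gaussians N(mu_q, var_q) and N(mu_p, var_p) by
    precision weighting: precisions add, and the mean is the
    precision-weighted average of the two means."""
    prec = 1.0 / var_q + 1.0 / var_p
    var = 1.0 / prec
    mu = var * (mu_q / var_q + mu_p / var_p)
    return mu, var

# A confident bottom-up estimate (var 0.1) dominates a vague top-down
# prior (var 1.0), so the merged mean lands close to mu_q = 2.
mu, var = precision_weighted_merge(np.array([2.0]), np.array([0.1]),
                                   np.array([0.0]), np.array([1.0]))
```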

Fixing a Broken ELBO

- Computer Science, Mathematics
- ICML
- 2018

This framework derives variational lower and upper bounds on the mutual information between the input and the latent variable, and uses these bounds to derive a rate-distortion curve that characterizes the tradeoff between compression and reconstruction accuracy.
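The rate-distortion reading of the ELBO is concrete for diagonal Gaussians: the rate is the closed-form KL from the posterior to a standard normal prior, and the distortion is a negative reconstruction log-likelihood. The β weight below stands in for selecting a point on the rate-distortion curve; the specific weighting scheme is illustrative rather than the paper's exact construction:

```python
import numpy as np

def rate(mu, sigma):
    """Rate: KL( q(z|x) || p(z) ) for diagonal Gaussian q and a standard
    normal prior, summed over latent dimensions."""
    return 0.5 * np.sum(sigma**2 + mu**2 - 1.0 - 2.0 * np.log(sigma))

def distortion(x, x_mean, noise_sigma=1.0):
    """Distortion: negative Gaussian reconstruction log-likelihood."""
    return (0.5 * np.sum((x - x_mean) ** 2) / noise_sigma**2
            + 0.5 * x.size * np.log(2 * np.pi * noise_sigma**2))

# The plain negative ELBO weights rate and distortion equally; a beta
# weight trades reconstruction accuracy against compression.
x, x_hat = np.ones(3), np.full(3, 0.9)
mu, sigma = np.full(2, 0.3), np.full(2, 0.8)
beta = 0.5
neg_elbo = distortion(x, x_hat) + rate(mu, sigma)
beta_objective = distortion(x, x_hat) + beta * rate(mu, sigma)
```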

Importance Weighted Autoencoders

- Computer Science, Mathematics
- ICLR
- 2016

The importance weighted autoencoder (IWAE), a generative model with the same architecture as the VAE, but which uses a strictly tighter log-likelihood lower bound derived from importance weighting, shows empirically that IWAEs learn richer latent space representations than VAEs, leading to improved test log-likelihood on density estimation benchmarks.
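The k-sample IWAE bound is simple to compute from log importance weights: it is the log of the mean importance weight, versus the mean of the log weights for the standard ELBO estimate. A numpy sketch on a toy one-dimensional model whose parameters are chosen arbitrarily for illustration:

```python
import numpy as np

def log_mean_exp(a, axis):
    """Numerically stable log of the mean of exp(a) along an axis."""
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.mean(np.exp(a - m), axis=axis,
                               keepdims=True))).squeeze(axis)

def iwae_bound(log_w):
    """IWAE k-sample bound: log (1/k) sum_i w_i, from
    log_w[i] = log p(x, z_i) - log q(z_i | x)."""
    return log_mean_exp(log_w, axis=-1)

def elbo_estimate(log_w):
    """Standard ELBO estimate: the mean of the log weights."""
    return log_w.mean(axis=-1)

# Toy model: p(z) = N(0,1), p(x|z) = N(z,1), q(z|x) = N(0.4,1), x = 1.
rng = np.random.default_rng(0)
k = 64
z = 0.4 + rng.normal(size=k)                       # z_i ~ q(z|x)
log_p = -0.5 * (z**2 + (1.0 - z)**2) - np.log(2 * np.pi)
log_q = -0.5 * (z - 0.4) ** 2 - 0.5 * np.log(2 * np.pi)
log_w = log_p - log_q
```

By Jensen's inequality the IWAE bound is never below the ELBO estimate on the same weights, and it tightens as k grows.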

Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks

- Computer Science, Mathematics
- ICML
- 2017

Adversarial Variational Bayes (AVB) is a technique for training Variational Autoencoders with arbitrarily expressive inference models: an auxiliary discriminative network allows the maximum-likelihood problem to be rephrased as a two-player game, establishing a principled connection between VAEs and Generative Adversarial Networks (GANs).

Tackling Over-pruning in Variational Autoencoders

- Mathematics, Computer Science
- ArXiv
- 2017

The epitomic variational autoencoder (eVAE) is proposed, which makes efficient use of model capacity and generalizes better than the VAE; it helps prevent inactive units since each group is pressured to explain the data.

Auto-Encoding Variational Bayes

- Mathematics, Computer Science
- ICLR
- 2014

A stochastic variational inference and learning algorithm is introduced that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case.

Learning Deep Latent Gaussian Models with Markov Chain Monte Carlo

- Computer Science
- ICML
- 2017

This paper proposes a different approach to deep latent Gaussian models: rather than a variational approximation, it uses Markov chain Monte Carlo (MCMC), which yields higher held-out likelihoods, produces sharper images, and does not suffer from the variational overpruning effect.

How to Train Deep Variational Autoencoders and Probabilistic Ladder Networks

- Computer Science
- ICML 2016
- 2016

This work proposes three advances in training algorithms for variational autoencoders, for the first time allowing deep models of up to five stochastic layers to be trained, using a structure similar to the Ladder network as the inference model, and shows state-of-the-art log-likelihood results for generative modeling on several benchmark datasets.

VAE with a VampPrior

- Computer Science, Mathematics
- AISTATS
- 2018

This paper proposes to extend the variational auto-encoder (VAE) framework with a new type of prior called the "Variational Mixture of Posteriors" prior, or VampPrior for short, which consists of a mixture distribution with components given by variational posteriors conditioned on learnable pseudo-inputs.
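The VampPrior replaces the standard normal prior with a uniform mixture of the variational posteriors evaluated at K pseudo-inputs. In the sketch below the encoder is elided: `pseudo_mu` and `pseudo_sigma` stand in for the encoder's outputs q(z | u_k) at the pseudo-inputs, and all names and sizes are illustrative:

```python
import numpy as np

def log_normal(z, mu, sigma):
    """Diagonal-Gaussian log density, summed over the last axis."""
    return np.sum(-0.5 * ((z - mu) / sigma) ** 2
                  - np.log(sigma) - 0.5 * np.log(2 * np.pi), axis=-1)

def vamp_log_prior(z, pseudo_mu, pseudo_sigma):
    """log p(z) under a VampPrior-style mixture: a uniform mixture of
    K Gaussian components playing the role of q(z | u_k) for learnable
    pseudo-inputs u_k (the encoder itself is elided in this sketch)."""
    K = pseudo_mu.shape[0]
    comp = log_normal(z[None, :], pseudo_mu, pseudo_sigma)   # (K,)
    m = comp.max()
    return m + np.log(np.exp(comp - m).sum()) - np.log(K)    # logsumexp

rng = np.random.default_rng(0)
pseudo_mu = rng.normal(size=(4, 2))      # K = 4 pseudo-input posteriors
pseudo_sigma = np.full((4, 2), 0.7)
lp = vamp_log_prior(np.zeros(2), pseudo_mu, pseudo_sigma)
```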