Improving Data Compression Based On Deep Learning
We generate and use an ever-increasing amount of data in digital form. Technological advancements in digital communication have allowed us to transmit data to almost anyone on the globe. However, storage and transmission capacities do not seem to keep up with the explosive growth of data. Data compression lets us represent data in compact form and therefore store and transmit more data at the same cost. In this thesis, we focus on compression without loss of information, known as lossless compression, of high-dimensional data. Lossless compression can be achieved by finding structure in the data through probabilistic modelling and exploiting that structure with compression algorithms. Probabilistic models based on deep learning have progressed considerably in recent years. Using these models efficiently in compression algorithms, however, is still an open problem. Earlier work has focused on the bits-back scheme, a compression algorithm that uses a latent variable model. Latent variable models based on neural networks can be efficiently optimized for high-dimensional data. We extend both the latent variable model and the bits-back scheme to achieve more effective lossless compression. We call the extensions nested latent variable models and the recursive bits-back scheme, respectively. Through experiments, we verify that the recursive bits-back scheme using nested latent variable models results in lossless compression that is empirically superior to existing techniques. We also conduct an extensive analysis of how different versions of the method compare to earlier work on lossless compression using latent variable models.
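
The link between probabilistic modelling and code length can be illustrated with a minimal sketch. It is not taken from the thesis: the toy model, its probabilities, and all function names below are assumptions chosen for illustration. The sketch uses the standard result that an ideal lossless code spends about -log2 p(x) bits on a symbol with model probability p(x), and that the expected net code length of bits-back coding with an approximate posterior q(z|x) equals the negative evidence lower bound, which matches -log2 p(x) only when q is the exact posterior.

```python
import math

def code_length_bits(p):
    """Ideal code length, in bits, of an outcome with model probability p."""
    return -math.log2(p)

# Hypothetical toy latent variable model: z in {0, 1}, x in {0, 1}.
prior = [0.5, 0.5]                      # p(z)
likelihood = [[0.9, 0.1], [0.2, 0.8]]   # p(x | z), indexed as [z][x]

def marginal(x):
    """p(x) obtained by summing out the latent variable."""
    return sum(prior[z] * likelihood[z][x] for z in range(2))

def posterior(x):
    """Exact posterior p(z | x) via Bayes' rule."""
    px = marginal(x)
    return [prior[z] * likelihood[z][x] / px for z in range(2)]

def bits_back_net_bits(x, q):
    """Expected net bits for bits-back coding of x with posterior q(z | x):
    bits spent on x given z and on z under the prior,
    minus the bits recovered by decoding z with q."""
    return sum(
        q[z] * (code_length_bits(likelihood[z][x])
                + code_length_bits(prior[z])
                - code_length_bits(q[z]))
        for z in range(2) if q[z] > 0
    )

x = 0
ideal = code_length_bits(marginal(x))          # -log2 p(x)
exact = bits_back_net_bits(x, posterior(x))    # uses the true posterior
approx = bits_back_net_bits(x, [0.5, 0.5])     # uses a poor posterior

print(f"ideal: {ideal:.4f} bits")
print(f"bits-back, exact posterior: {exact:.4f} bits")
print(f"bits-back, uniform posterior: {approx:.4f} bits")
```

In this toy setting, the gap between the last two numbers is the price paid for an inexact posterior, which is one reason a better latent variable model and inference procedure can translate into shorter codes.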