LSE Theses Online

Three essays in Bayesian nonparametric machine learning

Liu, Yirui (2022) Three essays in Bayesian nonparametric machine learning. PhD thesis, London School of Economics and Political Science.

Full text (Submitted Version, 3MB): restricted to Repository staff only until 4 January 2025.
Identification Number: 10.21953/lse.00004682

Abstract

Bayesian nonparametric machine learning is widely applied across fields such as bioinformatics, language processing, computer vision, network analysis, economics and finance. In Bayesian nonparametric models, priors such as the Dirichlet process, the Indian buffet process and the Gaussian process induce an infinite-dimensional prior distribution, so that the dimensionality of the parameter space can be inferred from the data. In this thesis, I present a variational inference framework for hierarchical Bayesian nonparametric models and develop state-of-the-art models that combine Bayesian nonparametrics and deep learning for graph data and sequential data.

First, I develop a novel inference method for hierarchical Bayesian nonparametric models, particularly the Dirichlet process model. Current variational inference methods for these models can neither characterize the correlation structure among latent variables, because of the mean-field setting, nor infer the true posterior dimension, because of universal truncation. To overcome these limitations, I propose the conditional and adaptively truncated variational inference method (CATVI), which maximizes the nonparametric evidence lower bound and integrates Monte Carlo sampling into the variational inference framework. CATVI enjoys several advantages over traditional methods, including a smaller divergence between the variational and true posteriors, a reduced risk of underfitting or overfitting, and improved prediction accuracy. Empirical studies on three large datasets show that CATVI, applied to Bayesian nonparametric topic models, substantially outperforms competing models, yielding lower perplexity and clearer topic-word clustering.

Second, I develop a methodology that uses Bayesian nonparametrics to improve the performance of deep graph neural networks (GNNs). Training deep GNNs is challenging, as their performance may deteriorate as the number of hidden message-passing layers grows. The literature has proposed over-smoothing and under-reaching to explain this deterioration. I propose a new explanation, mis-simplification, that is, mistakenly simplifying graphs by preventing self-loops and forcing edges to be unweighted. I show that such simplification can reduce the potential of message-passing layers to capture the structural information of graphs. In view of this, I propose a new framework, the edge enhanced graph neural network (EEGNN). EEGNN uses structural information extracted from the proposed Dirichlet mixture Poisson graph model, a Bayesian nonparametric model for graphs, to improve the performance of various deep message-passing GNNs. Experiments on different datasets show that EEGNN achieves a considerable performance increase over baselines.

Finally, I present the Deep Functional Factor Model (DF2M), a Bayesian nonparametric model for analyzing high-dimensional functional time series. The DF2M makes use of the Indian buffet process and a multi-task Gaussian process with a deep kernel function to capture non-Markovian and nonlinear temporal dynamics. Unlike many black-box deep learning models, the DF2M provides an explainable way to use neural networks, by constructing a factor model and incorporating deep neural networks within the kernel function. Additionally, I develop a computationally efficient variational inference algorithm for inferring the DF2M.
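For intuition, the following is a minimal, hypothetical sketch of the deep-kernel idea described above, written in plain NumPy: a small neural-network feature map composed with an RBF kernel defines a Gaussian process covariance. The network sizes, kernel parameters and time grid are my own arbitrary assumptions; this is not the thesis's DF2M implementation.

    # Minimal sketch: a neural-network feature map feeding an RBF kernel,
    # used as a Gaussian process covariance (illustration only, not DF2M).
    import numpy as np

    rng = np.random.default_rng(0)

    def mlp_features(x, w1, w2):
        """Two-layer tanh network mapping inputs to a feature space."""
        h = np.tanh(x @ w1)
        return np.tanh(h @ w2)

    def deep_rbf_kernel(x, w1, w2, lengthscale=1.0, variance=1.0):
        """RBF kernel evaluated on the network's feature representation."""
        z = mlp_features(x, w1, w2)
        sq_dists = ((z[:, None, :] - z[None, :, :]) ** 2).sum(-1)
        return variance * np.exp(-0.5 * sq_dists / lengthscale ** 2)

    # Time points as 1-d inputs; random (untrained) network weights.
    t = np.linspace(0.0, 1.0, 50).reshape(-1, 1)
    w1 = rng.normal(size=(1, 16))
    w2 = rng.normal(size=(16, 8))

    K = deep_rbf_kernel(t, w1, w2)
    # Draw one sample path from the zero-mean GP prior with this deep kernel.
    sample = rng.multivariate_normal(np.zeros(len(t)), K + 1e-8 * np.eye(len(t)))
    print(sample[:5])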
Empirical results from four real-world datasets demonstrate that the DF2M offers better explainability and superior predictive accuracy compared to conventional deep learning models for high-dimensional functional time series.
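Returning to the first essay, the truncation issue that CATVI addresses can be pictured with a standard truncated stick-breaking construction of the Dirichlet process. The toy NumPy example below is my own sketch, not the CATVI algorithm; the concentration parameter and truncation levels are arbitrary. It only shows how a fixed truncation level interacts with the number of components the prior effectively uses, which is the mismatch that adaptive truncation is meant to avoid.

    # Minimal sketch of truncated stick-breaking for a Dirichlet process prior
    # (illustration of the truncation issue only, not the CATVI method).
    import numpy as np

    rng = np.random.default_rng(1)

    def truncated_stick_breaking(alpha, truncation):
        """Return DP mixture weights from a stick-breaking prior cut at `truncation`."""
        betas = rng.beta(1.0, alpha, size=truncation)
        remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas)[:-1]))
        weights = betas * remaining
        weights[-1] = 1.0 - weights[:-1].sum()  # fold leftover mass into the last atom
        return weights

    alpha = 2.0
    for T in (5, 20, 100):
        w = truncated_stick_breaking(alpha, T)
        used = (np.cumsum(np.sort(w)[::-1]) < 0.99).sum() + 1
        print(f"truncation={T:3d}  components covering 99% of mass: {used}")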

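Similarly, for the second essay, the mis-simplification point can be illustrated with a toy message-passing step: aggregating node features over a weighted graph with self-loops gives a different result from aggregating over a binarised graph with self-loops removed. The example below is my own minimal sketch, not the EEGNN layer itself; the adjacency matrix and node features are made up.

    # Minimal sketch of one message-passing step on (a) a weighted graph with
    # self-loops and (b) a "mis-simplified" version with self-loops removed and
    # edges binarised (illustration only, not the EEGNN layer).
    import numpy as np

    # Weighted adjacency with self-loops (diagonal) for a 3-node graph.
    A_weighted = np.array([[1.0, 2.0, 0.0],
                           [2.0, 1.0, 3.0],
                           [0.0, 3.0, 1.0]])

    # Mis-simplified graph: no self-loops, all surviving edges get weight 1.
    A_simplified = (A_weighted > 0).astype(float)
    np.fill_diagonal(A_simplified, 0.0)

    X = np.array([[1.0], [2.0], [3.0]])  # one feature per node

    def mean_aggregate(A, X):
        """Average neighbours' features, weighted by the adjacency entries."""
        degree = A.sum(axis=1, keepdims=True)
        return (A @ X) / np.maximum(degree, 1e-12)

    print("with weights and self-loops:", mean_aggregate(A_weighted, X).ravel())
    print("mis-simplified:             ", mean_aggregate(A_simplified, X).ravel())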
Item Type: Thesis (PhD)
Additional Information: © 2022 Yirui Liu
Library of Congress subject classification: Q Science > QA Mathematics
Sets: Departments > Statistics
Supervisor: Qiao, Xinghao
URI: http://etheses.lse.ac.uk/id/eprint/4682
