Fundamentals of AI CDT

Development site for the EIT FOAI CDT

Project Title

When are embeddings enough?

Challenge

Many practical applications fine-tune predictors on top of pre-trained embeddings, yet there is no rigorous theoretical framework explaining when this approach is optimal compared with end-to-end learning.

Description

This project aims to develop a theoretical framework, potentially using tools from statistical learning theory and information theory, to understand the conditions under which using pre-trained embeddings is optimal. The student will analyse the interplay between the pre-training data distribution, the downstream task data distribution, the size of the fine-tuning dataset, and the model architecture, deriving bounds and conditions that guide the choice between a frozen-embedding approach (y = f(e(x))) and end-to-end training (y = f(x)).
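
To make the two regimes concrete, here is a minimal PyTorch sketch contrasting them. The encoder, head, and synthetic data are illustrative placeholders, not part of the project specification; in practice e(x) would be a large pre-trained model.

```python
import torch
import torch.nn as nn

# Illustrative stand-ins: in practice e(.) is a pre-trained encoder
# (e.g. a vision or language model) and f(.) a task-specific head.
encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))  # e(x)
head = nn.Linear(16, 1)                                                   # f(.)

x = torch.randn(128, 32)  # synthetic fine-tuning inputs
y = torch.randn(128, 1)   # synthetic regression targets
loss_fn = nn.MSELoss()

def train(params, steps=100):
    """Run a few gradient steps on whichever parameters are passed in."""
    opt = torch.optim.Adam(params, lr=1e-3)
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(head(encoder(x)), y)
        loss.backward()
        opt.step()

# Regime 1: frozen embeddings, y = f(e(x)).
# The encoder is fixed; only the head is trained.
for p in encoder.parameters():
    p.requires_grad_(False)
train(head.parameters())

# Regime 2: end-to-end, y = f(x).
# All parameters, encoder included, receive gradients.
for p in encoder.parameters():
    p.requires_grad_(True)
train(list(encoder.parameters()) + list(head.parameters()))
```

The theoretical question is, roughly, which joint conditions on the pre-training distribution, the downstream distribution, and the fine-tuning sample size guarantee that regime 1 matches or beats regime 2.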

Skills Required

Statistical learning theory, information theory, deep learning fundamentals

Skills to be Developed

Theoretical analysis, generalisation theory, embedding models

Relevant Background Reading

  1. Shachaf, G., Brutzkus, A. and Globerson, A., 2021. A theoretical analysis of fine-tuning with linear teachers. Advances in Neural Information Processing Systems, 34, pp.15382-15394.
  2. Wang, T. and Isola, P., 2020, November. Understanding contrastive representation learning through alignment and uniformity on the hypersphere. In International Conference on Machine Learning (pp. 9929-9939). PMLR.
  3. Deng, Y., Hong, J., Zhou, J. and Mahdavi, M., 2024, April. On the generalization ability of unsupervised pretraining. In International Conference on Artificial Intelligence and Statistics (pp. 4519-4527). PMLR.