Statistical learning and regularisation for regression

PhD thesis of Cyril Goutte

[Homepage] [Publication page] [Matlab page]

The thesis consists of an introduction, 6 chapters and a number of appendices, amounting to 154 pages in all. It was completed in the connectionist group at LIP6 (formerly LAFORIA). Their Theses page provides an alternate source for the manuscript.

If you download some or all of the following files, please notify me (comments are appreciated as well).

The thesis is available in the following formats:

Most of the following postscript files are compressed using gzip 1.2.4 with option -9.
You may have to hold shift while clicking on the links if you wish to download rather than view!

I wish to thank the members of the jury who awarded me the title of doctor: Prof. Sylvie Thiria, Prof. Gérard Govaert, Dr. Anne Guérin-Dugué, Dr. Jan Larsen and Prof. Patrick Gallinari.


Abstract

This thesis deals with the use of statistical learning and regularisation on regression problems, with a focus on time series modelling and system identification. Both linear models and non-linear neural networks are considered as particular modelling techniques.

Linear and non-linear parametric regression are briefly introduced, and their limitations are shown using the bias-variance decomposition of the generalisation error. We then show that, as stated, these problems are ill-posed and therefore need to be regularised. Regularisation introduces a number of hyper-parameters, which are set by estimating the generalisation error. Several such estimation methods are discussed in the course of this work.
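As a rough illustration of the idea in the paragraph above (not code from the thesis), the following Python sketch regularises a linear regression with a ridge (weight decay) penalty and sets its single hyper-parameter by minimising a K-fold cross-validation estimate of the generalisation error. The ridge penalty, the cross-validation estimator, the function names and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: w = (X^T X + lam * I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def cv_error(X, y, lam, n_folds=5):
    """K-fold cross-validation estimate of the mean squared generalisation error."""
    folds = np.array_split(np.arange(len(y)), n_folds)
    errors = []
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        w = ridge_fit(X[train_mask], y[train_mask], lam)
        errors.append(np.mean((X[test_idx] @ w - y[test_idx]) ** 2))
    return float(np.mean(errors))

def select_lambda(X, y, candidates, n_folds=5):
    """Return the candidate with the lowest cross-validation error, plus all scores."""
    scores = [cv_error(X, y, lam, n_folds) for lam in candidates]
    return candidates[int(np.argmin(scores))], scores

# Synthetic data (hypothetical): few samples, many inputs, only three of them relevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 20))
w_true = np.zeros(20)
w_true[:3] = [1.0, -2.0, 0.5]
y = X @ w_true + 0.5 * rng.normal(size=40)

candidates = [0.0, 0.01, 0.1, 1.0, 10.0, 100.0]
best_lam, scores = select_lambda(X, y, candidates)
print("selected hyper-parameter:", best_lam)
```

Cross-validation is only one way to estimate the generalisation error; any other estimator could be substituted for `cv_error` without changing the selection loop.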

These theoretical aspects are applied to two particular problems. First, an iterative method is presented that relies on the generalisation error to extract the relevant delays from time series data. Then a particular regularisation functional is studied, which prunes unnecessary parameters in addition to having a regularising effect. This last part uses Bayesian estimators, and a brief presentation of those estimators is also given in the thesis.
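To give a feel for what extracting relevant delays by generalisation error can look like, here is a minimal Python sketch of a generic greedy forward selection. It is not the algorithm of the thesis: the linear predictor, the hold-out error estimate, the stopping rule, the function names and the synthetic autoregressive series are all assumptions made for illustration.

```python
import numpy as np

def lagged_matrix(series, delays, max_delay):
    """Inputs x(t - d) for each d in delays, and targets x(t), for t >= max_delay."""
    X = np.column_stack([series[max_delay - d : len(series) - d] for d in delays])
    y = series[max_delay:]
    return X, y

def validation_error(series, delays, max_delay, split=0.7):
    """Hold-out estimate of the squared prediction error of a linear model."""
    X, y = lagged_matrix(series, delays, max_delay)
    n_train = int(split * len(y))
    w, *_ = np.linalg.lstsq(X[:n_train], y[:n_train], rcond=None)
    return float(np.mean((X[n_train:] @ w - y[n_train:]) ** 2))

def select_delays(series, max_delay):
    """Greedily add the delay that most reduces the estimated error; stop when none helps."""
    selected, best_err = [], np.inf
    remaining = list(range(1, max_delay + 1))
    while remaining:
        errs = {d: validation_error(series, selected + [d], max_delay) for d in remaining}
        d_best = min(errs, key=errs.get)
        if errs[d_best] >= best_err:
            break
        selected.append(d_best)
        remaining.remove(d_best)
        best_err = errs[d_best]
    return sorted(selected), best_err

# Synthetic series (hypothetical): x(t) depends on delays 1 and 4 only.
rng = np.random.default_rng(0)
series = np.zeros(2000)
for t in range(4, len(series)):
    series[t] = 0.5 * series[t - 1] - 0.4 * series[t - 4] + rng.normal()

selected, err = select_delays(series, max_delay=8)
print("selected delays:", selected, "hold-out error:", err)
```

On data like this the two true delays are normally recovered, although a noisy error estimate can also let a few spurious delays through.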


Multiple files (ca. one per chapter)

Abstract/Résumé Bibliography and index

One big file

The manuscript is available as one large compressed postscript file from the following sources:

Versions might differ slightly (typos, corrections, ...)


Hardcopy

The manuscript was released as LIP6 technical report no. 1997/033. You can obtain a hardcopy of the thesis by requesting Thesis 1997/033 from Rapports-Admin@lip6.fr. (In French: Je souhaite obtenir la these LIP6 1997/033. Merci.)


Slides

The slides of the defence are available in French only. The most convenient format is probably PDF (64 KB). In PostScript, they are a hefty 4.3 MB compressed (9.4 MB uncompressed). They cannot be displayed in Ghostview but should print correctly.


This page was last updated on February 4.
© 1996 by Section for DSP, IMM. No material obtained from this page may be used commercially without permission.