Tutorial: Information Theory and Statistics


By Prof. Bin Yu

Departments of Statistics and EECS

University of California at Berkeley





Information theory deals with a basic challenge in communication: how do we transmit information efficiently? In addressing that issue, information theorists have created a rich mathematical framework for describing communication processes, with tools to characterize the so-called fundamental limits of data compression and transmission.


In this tutorial, we will cover basic concepts and principles from information theory that are useful in statistics (machine learning). Topics include entropy, Kullback-Leibler divergence, mutual information, and the Maximum Entropy principle. Model selection methodologies such as AIC and the Principle of Minimum Description Length will also be covered because of their information-theoretic roots. If time permits, we will cover I-projection and Bregman distance, and their connections to ICA and boosting, respectively.