**Tutorial:** Information Theory and Statistics

By Prof. Bin Yu

Departments of Statistics and EECS

University of California at Berkeley

www.stat.berkeley.edu/~binyu

binyu@stat.berkeley.edu

**Abstract:**

Information Theory deals with a basic challenge in communication: how do we transmit information efficiently? In addressing that question, information theorists have created a rich mathematical framework to describe communication processes, with tools to characterize the so-called fundamental limits of data compression and transmission.

In this tutorial, we will cover basic concepts and principles from information theory that are useful in statistics and machine learning. Topics include entropy, Kullback-Leibler divergence, mutual information, and the Maximum Entropy principle. Model selection methodologies such as AIC and the principle of Minimum Description Length will also be covered because of their information-theoretic roots. If time permits, we will cover I-projection and Bregman distance, and their connections to ICA and boosting, respectively.
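To make the first two quantities mentioned above concrete, here is a minimal Python sketch (not part of the original abstract) that computes entropy and Kullback-Leibler divergence for finite discrete distributions given as lists of probabilities; the function names are illustrative.

```python
import math

def entropy(p):
    """Shannon entropy H(p) = -sum_i p_i * log2(p_i), in bits.
    Terms with p_i = 0 contribute 0 by convention."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) = sum_i p_i * log2(p_i / q_i),
    in bits. Assumes q_i > 0 wherever p_i > 0 (absolute continuity)."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A fair coin carries 1 bit of entropy.
print(entropy([0.5, 0.5]))  # 1.0

# D(p || q) is nonnegative, and zero exactly when p = q.
print(kl_divergence([0.9, 0.1], [0.5, 0.5]))
print(kl_divergence([0.5, 0.5], [0.5, 0.5]))  # 0.0
```

Mutual information can then be expressed through the same machinery, since I(X; Y) is the KL divergence between the joint distribution of (X, Y) and the product of its marginals.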