- Sum over possible values. E[X]=∑_i x_i P(x_i). This is the standard definition of expected value for a discrete random variable. It’s usually only useful if you can easily calculate the probability distribution P(x_i).
- Example: If R is the value of a single roll of a fair six-sided die, then E[R]=(1/6)+(2/6)+(3/6)+(4/6)+(5/6)+(6/6)=21/6=7/2
- Linearity of expectation. E[X+Y]=E[X]+E[Y].
- Example: If S is the sum of two independent rolls R_1 and R_2 of a die, then E[S]=E[R_1+R_2]=E[R_1]+E[R_2]=7/2+7/2=7
- Example: Suppose n people enter a restaurant and leave their hats at reception. At the end of dinner, each person gets a hat back at random. If H is the number of people who correctly get their own hat back, then E[H]=E[H_1]+E[H_2]+...+E[H_n]=nE[H_1]=n(1/n)=1, where H_i is an indicator variable for whether the ith person got their own hat back. Note that H_i and H_j aren’t independent, but linearity of expectation works anyway! H is a classic example of a random variable whose expected value is much easier to compute than its distribution.
- Law of total expectation. E[X]=E[E[X∣Y]]. Sometimes the random variable is complicated, but is simpler if you condition on another random variable.
- Example: Suppose you roll a fair die and record the value. Then you continue rolling the die until you obtain a value at least as large as the first roll. Let N be the number of rolls after the first and X_1 be the value of the first roll. Given X_1=i, each later roll succeeds with probability (7−i)/6, so N is geometric with mean 6/(7−i). Then E[N]=E[E[N∣X_1]]=E[6/(7−X_1)]=∑_{i=1}^{6}(1/6)(6/(7−i))=49/20. In other words, it’s hard to understand N without first conditioning on X_1, but we can calculate E[N] anyway by “averaging” over all the possible values of X_1.
- Symmetry. If (X_1,...,X_n) are exchangeable and ∑_i X_i=T, then E[X_i]=E[T]/n.
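- Example: If you break a stick of length L into n pieces (say, by cutting at n−1 points chosen independently and uniformly at random), then the piece lengths are exchangeable and sum to L, so the expected length of the leftmost piece is L/n.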
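- Recursion. Express the expected value in terms of itself, or in terms of the expected values from a few intermediate states, and solve the resulting equations.
- Example: What is the expected number of flips of a fair coin to see HTH? We can draw a state diagram whose states track the progress so far (nothing, H, HT) and write one equation per state: E_0=1+(1/2)E_H+(1/2)E_0, E_H=1+(1/2)E_H+(1/2)E_HT, E_HT=1+(1/2)E_0. Solving gives E_0=10.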
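- Wald’s equation. If X_1, X_2, … are iid real-valued random variables with finite mean, and N is a nonnegative integer-valued random variable with finite mean that is independent of the X_i (or, more generally, a stopping time for the sequence), then E[X_1 + … + X_N] = E[N] * E[X_1].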
- Tail-sum. If X is nonnegative, then E[X] = ∫_0^∞ P(X > t) dt. For nonnegative integer-valued X this becomes E[X] = ∑_{k≥1} P(X ≥ k).
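
Several of the examples above can be checked exactly with a few lines of Python. The following is a minimal sketch, not a definitive implementation: the function names are hypothetical, the coin and die are assumed fair, and the hat-check value is confirmed by brute-force enumeration rather than by linearity. It uses `fractions.Fraction` so the results are exact rationals.

```python
from fractions import Fraction
from itertools import permutations


def expected_fixed_points(n):
    """E[H] in the hat-check example, by enumerating all n! assignments."""
    perms = list(permutations(range(n)))
    matches = sum(sum(1 for i, hat in enumerate(p) if i == hat) for p in perms)
    return Fraction(matches, len(perms))


def expected_rolls_until_at_least_first(sides=6):
    """E[N] via the law of total expectation, conditioning on the first roll."""
    # Given a first roll of i, each later roll succeeds with probability
    # (sides + 1 - i) / sides, so N is geometric with mean sides / (sides + 1 - i).
    return sum(
        Fraction(1, sides) * Fraction(sides, sides + 1 - i)
        for i in range(1, sides + 1)
    )


def expected_flips_until(pattern):
    """Expected fair-coin flips until `pattern` (a string of H/T) first appears."""
    k = len(pattern)

    def next_state(s, c):
        # Longest suffix of pattern[:s] + c that is also a prefix of pattern.
        t = pattern[:s] + c
        while not pattern.startswith(t):
            t = t[1:]
        return len(t)

    # One equation per state s < k: e[s] = 1 + e[next(H)]/2 + e[next(T)]/2,
    # with e[k] = 0. Stored as an augmented matrix; column k is the right side.
    A = [[Fraction(0)] * (k + 1) for _ in range(k)]
    for s in range(k):
        A[s][s] += 1
        A[s][k] = Fraction(1)
        for c in "HT":
            t = next_state(s, c)
            if t < k:
                A[s][t] -= Fraction(1, 2)

    # Gauss-Jordan elimination over exact fractions.
    for col in range(k):
        pivot = next(r for r in range(col, k) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return A[0][k]


print(expected_fixed_points(5))               # 1
print(expected_rolls_until_at_least_first())  # 49/20
print(expected_flips_until("HTH"))            # 10
```

The recursion solver builds the same kind of state diagram as the HTH example: each state is the longest prefix of the pattern matched so far, and the linear system has one equation per state.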