  • Sum over possible values. $E[X] = \sum_i x_i P(x_i)$. This is the standard definition of expected value. It’s usually only useful if you can easily calculate the probability distribution $P(x_i)$.
    • Example: If $R$ is the value of a single roll of a die, then $E[R] = (1/6) + (2/6) + (3/6) + (4/6) + (5/6) + (6/6) = 7/2$.
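As a quick sanity check, the definition translates directly into code. A minimal sketch in Python (the `pmf` dict is just my encoding of the die’s distribution):

```python
from fractions import Fraction

# Distribution of a single fair die roll: P(R = i) = 1/6 for i = 1..6.
pmf = {i: Fraction(1, 6) for i in range(1, 7)}

# E[R] = sum over values of x * P(x).
print(sum(x * p for x, p in pmf.items()))  # 7/2
```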
  • Linearity of expectation. $E[X + Y] = E[X] + E[Y]$.
    • Example: If $S$ is the sum of two independent rolls of a die, then $E[S] = E[R_1 + R_2] = E[R_1] + E[R_2] = 7$.
    • Example: Suppose $n$ people enter a restaurant and leave their hats at reception. At the end of dinner, each person gets a hat back at random. If $H$ is the number of people who correctly get their own hat back, then $E[H] = E[H_1] + E[H_2] + \cdots + E[H_n] = nE[H_1] = n(1/n) = 1$, where $H_i$ is an indicator variable for whether the $i$th person got their own hat back. Note that $H_i$ and $H_j$ aren’t independent, but linearity of expectation works anyway! $H$ is a classic example of a random variable whose expected value is much easier to compute than its distribution.
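A Monte Carlo sketch of the hat-check example (the `simulate_hats` helper is my own naming); the empirical average stays near 1 regardless of `n`:

```python
import random

def simulate_hats(n, trials=100_000):
    """Average number of people who get their own hat back."""
    total = 0
    for _ in range(trials):
        hats = list(range(n))
        random.shuffle(hats)  # hand the hats back uniformly at random
        total += sum(1 for person, hat in enumerate(hats) if person == hat)
    return total / trials

print(simulate_hats(10))  # approximately 1.0
```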
  • Law of total expectation. $E[X] = E[E[X|Y]]$. Sometimes a random variable is complicated on its own, but becomes simple once you condition on another random variable.
    • Example: Suppose you roll a fair die and record the value. Then you continue rolling the die until you obtain a value at least as large as the first roll. Let $N$ be the number of rolls after the first and $X_1$ be the value of the first roll. Given $X_1 = i$, each later roll succeeds with probability $(7-i)/6$, so $N$ is geometric with mean $6/(7-i)$. Then $E[N] = E[E[N|X_1]] = E[6/(7 - X_1)] = \sum_{i=1}^{6} (1/6) \cdot 6/(7-i) = 49/20$. In other words, it’s hard to understand $N$ without first conditioning on $X_1$, but we can calculate $E[N]$ anyway by “averaging” over all the possible values of $X_1$.
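The outer expectation is just a weighted average over the six possible first rolls, so it’s easy to evaluate exactly, e.g. in Python:

```python
from fractions import Fraction

# E[N] = sum over i of P(X1 = i) * E[N | X1 = i], with E[N | X1 = i] = 6/(7-i).
print(sum(Fraction(1, 6) * Fraction(6, 7 - i) for i in range(1, 7)))  # 49/20
```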
  • Symmetry. If $(X_1, \ldots, X_n)$ are exchangeable and $\sum_i X_i = T$, then $E[X_i] = E[T]/n$.
    • Example: If you break a stick of length $L$ at $n-1$ uniformly random points, the resulting piece lengths are exchangeable and sum to $L$, so the expected length of the leftmost piece is $L/n$.
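A simulation sketch of the stick-breaking example, under my assumption above that the cut points are uniform (the `mean_leftmost_piece` helper is hypothetical naming):

```python
import random

def mean_leftmost_piece(L=1.0, n=4, trials=200_000):
    """Average length of the leftmost piece of a stick cut at n-1 uniform points."""
    total = 0.0
    for _ in range(trials):
        cuts = sorted(random.uniform(0, L) for _ in range(n - 1))
        total += cuts[0]  # the leftmost piece runs from 0 to the first cut
    return total / trials

print(mean_leftmost_piece())  # approximately 0.25, i.e. L / n
```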
  • Recursion.
    • Example: What is the expected number of coin flips to see HTH? We can draw a state diagram whose states track how much of HTH the most recent flips have matched, write one equation per state relating its expected remaining flips to its successors’, and solve the resulting linear system; the answer is 10 (see the sketch below).
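A sketch of that recursion for a fair coin: each state’s expected remaining flips is one plus the average over the two states a flip can lead to, giving a small linear system we can solve directly:

```python
import numpy as np

# States: e0 = no useful suffix, e1 = suffix "H", e2 = suffix "HT".
#   e0 = 1 + (e1 + e0) / 2    (H -> "H", T -> nothing)
#   e1 = 1 + (e1 + e2) / 2    (H -> "H" again, T -> "HT")
#   e2 = 1 + (e0 + 0) / 2     (H -> done, T -> nothing)
A = np.array([
    [ 0.5, -0.5,  0.0],
    [ 0.0,  0.5, -0.5],
    [-0.5,  0.0,  1.0],
])
b = np.array([1.0, 1.0, 1.0])
print(np.linalg.solve(A, b))  # [10.  8.  6.]: 10 flips expected from scratch
```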
  • Wald’s equation. If $X_1, X_2, \ldots$ are iid with finite mean and $N$ is a nonnegative integer-valued random variable independent of the $X_i$ (more generally, a stopping time for the sequence), then $E[X_1 + \cdots + X_N] = E[N] \cdot E[X_1]$.
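For instance, roll a die to get $N$ and then flip $N$ fair coins: Wald predicts the expected number of heads is $E[N] \cdot E[X_1] = (7/2)(1/2) = 7/4$. A quick simulation of this example of mine (with a hypothetical `mean_heads` helper):

```python
import random

def mean_heads(trials=200_000):
    """Flip a die-determined number of fair coins and average the head count."""
    total = 0
    for _ in range(trials):
        n = random.randint(1, 6)  # N is independent of the flips
        total += sum(random.randint(0, 1) for _ in range(n))  # X_1 + ... + X_N
    return total / trials

print(mean_heads())  # approximately 1.75
```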
  • Tail-sum. If $X$ is nonnegative, $E[X] = \int_0^\infty P(X > t)\,dt$.
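For a nonnegative integer-valued $X$ the integral collapses to $E[X] = \sum_{t=0}^{\infty} P(X > t)$, which we can check against the single die roll from above:

```python
from fractions import Fraction

# For the die roll, P(R > t) = (6 - t)/6 for t = 0..5 and 0 afterwards.
print(sum(Fraction(6 - t, 6) for t in range(6)))  # 7/2, matching E[R]
```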