Chernoff Bounds

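The material below is taken from here.
In this article we first establish several preliminary results, then state the Chernoff bound and prove it.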

Let $X$ be a random variable and $a\in\mathbb{R}$. For any $s>0$, the map $x\mapsto e^{sx}$ is increasing and $e^{sX}$ is nonnegative, so Markov's inequality gives Formula 1:

$$\Pr(X\ge a) = \Pr(e^{sX}\ge e^{sa}) \le \frac{E(e^{sX})}{e^{sa}}$$
Similarly, for any $s>0$, Markov's inequality gives Formula 2:

$$\Pr(X\le a) = \Pr(e^{-sX} \ge e^{-sa}) \le \frac{E(e^{-sX})}{e^{-sa}}$$
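Formulas 1 and 2 can be checked numerically on a small example. The sketch below assumes a finite distribution given as a `{value: probability}` dictionary; the fair die, the thresholds 5 and 2, and the function names are illustrative choices, not part of the original text.

```python
import math

def formula1_bound(pmf, a, s):
    """Right-hand side of Formula 1: E[e^{sX}] / e^{sa}, for s > 0."""
    mgf = sum(p * math.exp(s * x) for x, p in pmf.items())
    return mgf / math.exp(s * a)

def formula2_bound(pmf, a, s):
    """Right-hand side of Formula 2: E[e^{-sX}] / e^{-sa}, for s > 0."""
    mgf = sum(p * math.exp(-s * x) for x, p in pmf.items())
    return mgf / math.exp(-s * a)

# Illustrative distribution: a fair six-sided die.
die = {k: 1 / 6 for k in range(1, 7)}
exact_upper = sum(p for x, p in die.items() if x >= 5)  # Pr(X >= 5)
exact_lower = sum(p for x, p in die.items() if x <= 2)  # Pr(X <= 2)

s_values = (0.25, 0.5, 1.0, 2.0)
for s in s_values:
    print(s, formula1_bound(die, 5, s), formula2_bound(die, 2, s))
```

Every printed value is an upper bound on the corresponding exact probability, and the bound varies with $s$, which is why the later proofs pick a specific $s$.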

Let $M_X(s) = E(e^{sX})$ denote the moment generating function of $X$. Taylor expansion gives

$$M_X(s) = E\left(1 + sX + \frac{1}{2}s^2X^2 + \frac{1}{3!}s^3X^3 + \cdots\right) = \sum_{i = 0}^\infty\frac{1}{i!}s^iE(X^i)$$

Lemma 1. Let $X_1, \cdots, X_n$ be independent random variables and let $X=\sum_{i=1}^nX_i$. Then

$$M_X(s) = \prod_{i=1}^nM_{X_i}(s).$$

Proof:

$$M_X(s) = E(e^{sX}) = E\left(e^{s\sum_{i=1}^nX_i}\right) = E\left(\prod_{i=1}^n e^{sX_i}\right) = \prod_{i=1}^nE(e^{sX_i}) = \prod_{i=1}^nM_{X_i}(s)$$

The fourth equality uses independence: the expectation of a product of independent random variables is the product of their expectations.
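As a sanity check on Lemma 1, the sketch below computes $E(e^{sX})$ for a sum of independent 0/1 variables by brute-force enumeration of all outcomes and compares it with the product of the individual moment generating functions. The probabilities in `ps`, the value of `s`, and the function names are arbitrary illustrative choices.

```python
import itertools
import math

def mgf_single(p, s):
    """E[e^{sY}] for Y taking value 1 with probability p and 0 otherwise."""
    return p * math.exp(s) + (1 - p)

def mgf_sum_by_enumeration(ps, s):
    """E[e^{sX}] for X = sum of independent 0/1 variables, summing over all 2^n outcomes."""
    total = 0.0
    for outcome in itertools.product((0, 1), repeat=len(ps)):
        # Independence: the probability of a joint outcome is the product of marginals.
        prob = math.prod(p if x == 1 else 1 - p for p, x in zip(ps, outcome))
        total += prob * math.exp(s * sum(outcome))
    return total

ps = [0.1, 0.5, 0.8]
s = 0.7
lhs = mgf_sum_by_enumeration(ps, s)               # M_X(s) computed directly
rhs = math.prod(mgf_single(p, s) for p in ps)     # product of the M_{X_i}(s)
print(lhs, rhs)
```

The two printed numbers agree up to floating-point rounding.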

Lemma 2. Let $Y$ be a random variable with $\Pr(Y=1)=p$ and $\Pr(Y=0) = 1-p$. Then for any $s\in\mathbb{R}$,

$$M_Y(s)= E(e^{sY})\le e^{p(e^s - 1)}$$

Proof:

$$M_Y(s) = E(e^{sY}) = p\cdot e^s + (1-p)\cdot 1 = 1 + p(e^s - 1)$$

Since $1+y\le e^y$ for every real $y$, setting $y = p(e^s -1)$ gives

$$M_Y(s)\le e^{p(e^s - 1)}$$
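The inequality in Lemma 2 can be spot-checked on a grid of values. This is only a numerical illustration under an arbitrary choice of grid, not a substitute for the proof; the function names are hypothetical.

```python
import math

def mgf_exact(p, s):
    """Exact value 1 + p(e^s - 1) from the proof of Lemma 2."""
    return 1 + p * (math.exp(s) - 1)

def mgf_bound(p, s):
    """Upper bound e^{p(e^s - 1)} from Lemma 2."""
    return math.exp(p * (math.exp(s) - 1))

grid_p = [i / 10 for i in range(11)]           # p = 0.0, 0.1, ..., 1.0
grid_s = [-3 + 0.5 * j for j in range(13)]     # s = -3.0, -2.5, ..., 3.0

# Smallest slack between the bound and the exact value over the grid.
worst_gap = min(mgf_bound(p, s) - mgf_exact(p, s) for p in grid_p for s in grid_s)
print(worst_gap)
```

The slack is never negative on this grid, and it vanishes when $p(e^s-1)=0$, that is, when $p=0$ or $s=0$.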

Chernoff bound. Let $X=\sum_{i=1}^n X_i$, where $X_1, \cdots, X_n$ are mutually independent with $\Pr(X_i = 1) = p_i$ and $\Pr(X_i = 0) = 1-p_i$. Let $\mu = E(X) = \sum_{i=1}^np_i$. Then:

  1. Upper tail:
    $$\forall\delta>0,\ \Pr(X\ge (1+\delta)\mu) \le e^{-\frac{\delta^2}{2 + \delta}\mu}$$

  2. Lower tail:
    $$\forall 0<\delta<1,\ \Pr(X\le(1 - \delta)\mu)\le e^{-\frac{\delta^2}{2}\mu}$$
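To get a feel for how tight the two bounds are, the sketch below compares them with exact binomial tail probabilities, taking all $p_i$ equal to a common $p$. The parameters $n=100$, $p=0.3$, the values of $\delta$, and the function names are illustrative assumptions.

```python
import math

def binom_pmf(n, p, k):
    """Pr(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

EPS = 1e-9  # tolerance so floating-point rounding does not drop a boundary value of k

def upper_tail_exact(n, p, delta):
    mu = n * p
    return sum(binom_pmf(n, p, k) for k in range(n + 1) if k >= (1 + delta) * mu - EPS)

def lower_tail_exact(n, p, delta):
    mu = n * p
    return sum(binom_pmf(n, p, k) for k in range(n + 1) if k <= (1 - delta) * mu + EPS)

def upper_tail_bound(mu, delta):
    """Chernoff upper-tail bound e^{-delta^2 mu / (2 + delta)}."""
    return math.exp(-(delta**2) / (2 + delta) * mu)

def lower_tail_bound(mu, delta):
    """Chernoff lower-tail bound e^{-delta^2 mu / 2}."""
    return math.exp(-(delta**2) / 2 * mu)

n, p = 100, 0.3
deltas = (0.1, 0.5, 0.9)
for d in deltas:
    print(d, upper_tail_exact(n, p, d), upper_tail_bound(n * p, d),
          lower_tail_exact(n, p, d), lower_tail_bound(n * p, d))
```

In each row the exact tail probability sits below the corresponding bound, as the theorem requires.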

Proof: By Lemma 1 and Lemma 2,

$$M_X(s) = \prod_{i = 1}^nM_{X_i}(s)\le \prod_{i=1}^ne^{p_i(e^s-1)}=e^{(e^s -1)\sum_{i=1}^np_i} = e^{(e^s - 1)\mu}$$
We first prove the upper tail of the Chernoff bound.
By Formula 1,

$$\Pr(X\ge a) \le \frac{E(e^{sX})}{e^{sa}}$$

Let $a=(1+\delta)\mu$ and $s = \ln(1+\delta)$; note that $s>0$ because $\delta>0$, so Formula 1 applies. Combining it with the bound $M_X(s)\le e^{(e^s-1)\mu}$ obtained from Lemma 1 and Lemma 2 gives

$$\Pr(X\ge (1+\delta)\mu) \le \frac{E(e^{sX})}{e^{s(1+\delta)\mu}} \le \frac{e^{(e^s - 1)\mu}}{e^{s(1+\delta)\mu}} = \left(\frac{e^{(e^s -1)}}{e^{s(1+\delta)}}\right)^\mu=\left(\frac{e^\delta}{(1+\delta)^{1+\delta}}\right)^\mu$$

Taking the logarithm of the right-hand side,

$$\ln\left(\frac{e^\delta}{(1+\delta)^{1+\delta}}\right)^\mu = \mu(\delta - (1+\delta)\ln(1+\delta))$$

Since

$$\forall x>0,\ \ln(1+x)\ge\frac{x}{1 + x/2}$$

we have

$$\mu(\delta - (1+\delta)\ln(1+\delta))\le -\frac{\delta^2}{2+\delta}\mu$$

Therefore

$$\Pr(X\ge(1+\delta)\mu)\le \left(\frac{e^\delta}{(1+\delta)^{1+\delta}}\right)^\mu\le e^{-\frac{\delta^2}{2+\delta}\mu}$$
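The proof does not say why $s=\ln(1+\delta)$ is chosen. One natural reading is that this value minimizes the exponent of the bound $e^{(e^s-1)\mu}/e^{s(1+\delta)\mu}$ over $s>0$. A short check, where the function $f$ is introduced here only for illustration and $\mu>0$ is assumed:

```latex
\begin{aligned}
f(s)   &= (e^s-1)\mu - s(1+\delta)\mu, \\
f'(s)  &= \mu\bigl(e^s-(1+\delta)\bigr) = 0 \iff s=\ln(1+\delta), \\
f''(s) &= \mu e^s > 0 .
\end{aligned}
```

Since $f$ is convex with its only stationary point at $s=\ln(1+\delta)$, that choice gives the smallest bound obtainable from this argument.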

The proof of the lower tail is similar, except that we apply Formula 2 with $s=-\ln(1 - \delta)$, which is positive for $0<\delta<1$, and use the following inequality:

$$\forall 0 < \delta < 1,\ (1-\delta)\ln(1-\delta)\ge -\delta + \frac{\delta^2}{2}$$
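For completeness, the lower-tail computation can be written out as follows. This is a sketch that fills in the omitted steps under the assumption that Formula 2 is applied with $s=-\ln(1-\delta)>0$, so that $e^{-s}=1-\delta$. It uses the fact that Lemma 2 holds for every real argument, hence $E(e^{-sX})=M_X(-s)\le e^{(e^{-s}-1)\mu}$, together with the inequality $(1-\delta)\ln(1-\delta)\ge-\delta+\delta^2/2$.

```latex
\begin{aligned}
\Pr(X\le(1-\delta)\mu)
  &\le \frac{E(e^{-sX})}{e^{-s(1-\delta)\mu}}
   \le \frac{e^{(e^{-s}-1)\mu}}{e^{-s(1-\delta)\mu}}
   = \left(\frac{e^{-\delta}}{(1-\delta)^{1-\delta}}\right)^{\mu}, \\
\ln\left(\frac{e^{-\delta}}{(1-\delta)^{1-\delta}}\right)^{\mu}
  &= \mu\bigl(-\delta-(1-\delta)\ln(1-\delta)\bigr)
   \le \mu\left(-\delta+\delta-\frac{\delta^2}{2}\right)
   = -\frac{\delta^2}{2}\mu .
\end{aligned}
```

Exponentiating the second line yields $\Pr(X\le(1-\delta)\mu)\le e^{-\frac{\delta^2}{2}\mu}$.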

Non-Bernoulli version of the Chernoff bound:

Let $X_1, \cdots, X_n$ be independent random variables with $a \le X_i \le b$ for $i = 1, \cdots, n$. Let $X=\sum_{i=1}^nX_i$ and $\mu=E(X)$. Then for any $\delta > 0$:

  1. Upper tail:
    $$\Pr(X\ge (1 + \delta)\mu)\le e^{-\frac{2\delta^2\mu^2}{n(b-a)^2}}$$

  2. Lower tail:
    $$\Pr(X\le (1-\delta)\mu)\le e^{-\frac{\delta^2\mu^2}{n(b-a)^2}}$$
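The bounded-variable version can be illustrated the same way. The sketch below assumes independent, identically distributed steps taking values in $\{0,1,2\}$ (so $a=0$, $b=2$), builds the exact distribution of the sum by repeated convolution, and compares the exact tails with the two bounds. The step distribution, $n=30$, and all function names are illustrative choices.

```python
import math

def convolve(pmf_a, pmf_b):
    """Distribution of the sum of two independent finite random variables."""
    out = {}
    for x, px in pmf_a.items():
        for y, py in pmf_b.items():
            out[x + y] = out.get(x + y, 0.0) + px * py
    return out

def sum_pmf(step, n):
    """Exact distribution of X = X_1 + ... + X_n for i.i.d. X_i with pmf `step`."""
    total = {0: 1.0}
    for _ in range(n):
        total = convolve(total, step)
    return total

step = {0: 0.5, 1: 0.3, 2: 0.2}   # each X_i lies in [a, b] = [0, 2]
a, b, n = 0, 2, 30
dist = sum_pmf(step, n)
mu = sum(x * p for x, p in dist.items())   # E(X)

def upper_bound(delta):
    return math.exp(-2 * delta**2 * mu**2 / (n * (b - a) ** 2))

def lower_bound(delta):
    return math.exp(-(delta**2) * mu**2 / (n * (b - a) ** 2))

def upper_exact(delta):
    return sum(p for x, p in dist.items() if x >= (1 + delta) * mu - 1e-9)

def lower_exact(delta):
    return sum(p for x, p in dist.items() if x <= (1 - delta) * mu + 1e-9)

deltas = (0.1, 0.2, 0.4)
for d in deltas:
    print(d, upper_exact(d), upper_bound(d), lower_exact(d), lower_bound(d))
```

Because these bounds depend only on the range $b-a$ and not on the variance of the $X_i$, they can be noticeably looser than the exact tails when the variables concentrate well inside $[a,b]$.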