[Deep Learning] Semi-Supervised Models (Π-Model, Temporal Ensembling, Mean Teacher)

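Two loss terms: the consistency cost and the classification loss. The classification loss is computed on labeled samples only, while the consistency cost needs no labels and can be computed on every sample.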

Consistency cost: $J(\theta)=\mathbb{E}_{x,\eta',\eta}\left[\lVert f(x,\theta',\eta')-f(x,\theta,\eta)\rVert^2\right]$, where $\theta$ and $\eta$ are the student's weights and noise, and $\theta'$ and $\eta'$ are the teacher's weights and noise.
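As a rough illustration of the consistency cost, the sketch below estimates it on a small batch with a toy linear-softmax classifier. The names `predict` and `consistency_cost`, the Gaussian input noise, and the toy model are all assumptions made for the example; they are not taken from any of the three papers.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(x, theta, noise_std, rng):
    """Toy stand-in for f(x, theta, eta): add Gaussian input noise, apply a linear layer, then softmax."""
    noisy = [xi + rng.gauss(0.0, noise_std) for xi in x]
    logits = [sum(w * xi for w, xi in zip(row, noisy)) for row in theta]
    return softmax(logits)

def consistency_cost(batch, theta_student, theta_teacher, noise_std=0.1, seed=0):
    """Batch estimate of E[ ||f(x, theta', eta') - f(x, theta, eta)||^2 ]."""
    rng = random.Random(seed)
    total = 0.0
    for x in batch:
        teacher = predict(x, theta_teacher, noise_std, rng)  # uses theta', eta'
        student = predict(x, theta_student, noise_std, rng)  # uses theta, eta
        total += sum((t - s) ** 2 for t, s in zip(teacher, student))
    return total / len(batch)
```

With identical weights and zero noise the two predictions coincide and the cost is zero; any noise or weight difference makes it positive, which is the signal the student is trained to reduce.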

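How the three semi-supervised models differ (all of them rely on noise perturbation):
Π-Model: $\theta'=\theta$, so the teacher and the student share the same weights and only the noise differs
Temporal Ensembling: $f(x,\theta',\eta')$ is approximated by a weighted average of successive predictions
Mean Teacher: $\theta'_t=\alpha\theta'_{t-1}+(1-\alpha)\theta_t$, so the teacher's weights are an exponential moving average of the student's weights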
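The two averaging schemes can be sketched side by side. `ema_update` follows the Mean Teacher formula directly. `temporal_ensemble_update` assumes that the weighted average in Temporal Ensembling is an exponential moving average of per-sample predictions with a startup bias correction, and that epochs are counted from 1; both function names and the default `alpha` values are hypothetical.

```python
def ema_update(teacher, student, alpha=0.99):
    """Mean Teacher: theta'_t = alpha * theta'_{t-1} + (1 - alpha) * theta_t.

    Weights are represented as flat lists of floats for simplicity.
    """
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]

def temporal_ensemble_update(ensemble, prediction, epoch, alpha=0.6):
    """Temporal Ensembling (assumed form): average predictions, not weights.

    ensemble   -- running average Z of one sample's past predictions
    prediction -- this epoch's prediction z for the same sample
    epoch      -- 1-based epoch counter, used for the bias correction
    Returns the new running average and the bias-corrected training target.
    """
    new_ensemble = [alpha * z_old + (1.0 - alpha) * z_new
                    for z_old, z_new in zip(ensemble, prediction)]
    target = [z / (1.0 - alpha ** epoch) for z in new_ensemble]
    return new_ensemble, target

# Π-Model needs no update step: the teacher simply reuses the student's weights.
```

The practical difference is what gets averaged: Temporal Ensembling stores one averaged prediction per training sample and refreshes it once per epoch, whereas Mean Teacher averages the weights and can refresh them after every training step.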

Interpolation Consistency Training (ICT) model: uses interpolation between samples instead of noise perturbation.
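A minimal sketch of the interpolation idea, assuming the ICT consistency term compares the student's prediction at a mixed input with the same mix of the teacher's predictions at the two original inputs. The helper names `mix` and `ict_consistency` are hypothetical, and the mixing coefficient `lam` is passed in by the caller rather than sampled.

```python
def mix(a, b, lam):
    """Linear interpolation lam * a + (1 - lam) * b, element-wise over two lists."""
    return [lam * ai + (1.0 - lam) * bi for ai, bi in zip(a, b)]

def ict_consistency(u1, u2, lam, student, teacher):
    """Squared distance between the student's prediction at the interpolated
    input and the interpolation of the teacher's predictions at u1 and u2.

    student, teacher -- callables mapping an input list to a prediction list
    """
    pred = student(mix(u1, u2, lam))
    target = mix(teacher(u1), teacher(u2), lam)
    return sum((p - t) ** 2 for p, t in zip(pred, target))

# Usage: a linear model is already consistent under interpolation,
# while a nonlinear one generally is not.
linear = lambda x: [2.0 * x[0] + x[1], -x[0]]
square = lambda x: [x[0] ** 2]
linear_cost = ict_consistency([1.0, 2.0], [3.0, -1.0], 0.3, linear, linear)
square_cost = ict_consistency([0.0], [2.0], 0.5, square, square)
```

Compared with the noise-based models above, the perturbation here comes from moving between two unlabeled samples, so the penalty encourages the model to behave linearly in between them.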