Overview¶
A meaningful and novel approach, though in practice it does not seem to perform much better than the plain cross-entropy loss.
It draws on the following papers:
- [Large Margin Deep Networks for Classification][1]
- [Max-Margin Adversarial (MMA) Training: Direct Input Space Margin Maximization through Adversarial Training][2]
- [DeepFool: a simple and accurate method to fool deep neural networks][3]
Basic idea:
Apply the SVM idea of maximizing the distance to the classification boundary to neural networks. For a multi-class softmax problem, we define the decision boundary between classes $i$ and $j$ as:
$$ D_{(i, j)} \triangleq \{ x | f_{i}(x)=f_{j}(x) \} $$
Here $f_i(x)$ and $f_j(x)$ are clearly the two highest-scoring classes. For example, if the classifier puts 50% each on cat and dog, while elephant, horse, and every other class get essentially 0, then those other classes are irrelevant to the decision boundary.
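As a toy illustration (hypothetical 4-class scores, not taken from any of the papers), only the two largest outputs determine which boundary $D_{(i,j)}$ matters for a sample:

```python
import numpy as np

# Hypothetical softmax outputs for one sample: cat, dog, elephant, horse.
# The sample sits almost on the cat/dog boundary.
probs = np.array([0.49, 0.48, 0.02, 0.01])

# The two highest-scoring classes define the relevant boundary D_(i,j);
# the remaining near-zero classes play no role in it.
i, j = np.argsort(probs)[::-1][:2]
print(i, j)                 # 0 1   -> the cat/dog boundary D_(0,1)
print(probs[i] - probs[j])  # 0.01  -> small margin: x is near f_i(x) = f_j(x)
```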
A neural network is of course a nonlinear classifier, but we can treat it as locally linear, i.e. approximate it with a first-order Taylor expansion. Take a point $x$ and move it by $\delta$ until it reaches the decision boundary: $$ \begin{aligned} f_{i}(x+\delta) & = f_{j}(x+\delta) \\ f_{i}(x)+\delta^{\top} \nabla f_{i}(x) & = f_{j}(x)+\delta^{\top} \nabla f_{j}(x) \end{aligned} $$
Among all perturbations reaching the boundary, the shortest one (in $L_2$) is parallel to $\nabla f_{j}(x)-\nabla f_{i}(x)$, which gives $$ \begin{aligned} \delta &= \frac{f_{i}(x)-f_{j}(x)}{\left\|\nabla f_{j}(x)-\nabla f_{i}(x)\right\|^{2}}\left(\nabla f_{j}(x)-\nabla f_{i}(x)\right) \\ \|\delta\| &= \frac{\left|f_{i}(x)-f_{j}(x)\right|}{\left\|\nabla f_{j}(x)-\nabla f_{i}(x)\right\|} \end{aligned} $$
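A minimal PyTorch sketch of this linearized step, essentially a single DeepFool-style iteration (the `model` interface and the `1e-12` stabilizer are my own assumptions, not from the papers):

```python
import torch

def min_boundary_perturbation(model, x):
    """One linearized step toward the nearest decision boundary,
    i.e. the closed-form delta above (one DeepFool-style iteration).

    Assumes `model(x)` returns a 1-D tensor of logits f(x).
    """
    x = x.clone().detach().requires_grad_(True)
    logits = model(x)
    i, j = torch.topk(logits, 2).indices   # top-2 classes, f_i >= f_j
    margin = logits[i] - logits[j]         # f_i(x) - f_j(x)
    # grad = ∇(f_i - f_j) = ∇f_i - ∇f_j, so w = ∇f_j - ∇f_i = -grad
    grad = torch.autograd.grad(margin, x)[0]
    w = -grad
    # delta = (f_i - f_j) / ||w||^2 * w, whose norm is |f_i - f_j| / ||w||
    delta = (margin / (w.norm() ** 2 + 1e-12)) * w
    return delta.detach()
```

Applied iteratively, recomputing the top-2 pair each step, this is roughly the DeepFool attack; the margin papers above instead use this linearized distance as a quantity to maximize during training.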
TODO¶
References¶
[1]: / "Large Margin Deep Networks for Classification"
[2]: / "Max-Margin Adversarial (MMA) Training: Direct Input Space Margin Maximization through Adversarial Training"
[3]: / "DeepFool: a simple and accurate method to fool deep neural networks"