Dec 6, 2022 - math, learning

Embedded Rotation

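Suppose we want to train a neural network $f_{\theta}$ that has only fully connected layers and no activation layers, so it is just a linear (strictly speaking, affine) transformation. Its parameters can therefore be written as a weight and a bias, $\theta=\{A,b\}$, because any number of such layers collapses into a single linear transformation. Suppose the network takes a 3-dimensional input and outputs a $D$-dimensional feature: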

\[f_{\theta}(n^g)=A_{D\times3}\cdot n^g+b_{D\times1}\]
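The claim that stacked fully connected layers without activations reduce to a single pair $\{A,b\}$ can be checked numerically. The following is a minimal NumPy sketch; the layer widths, the names `W1`, `c1`, `W2`, `c2`, and the random values are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
D, H = 5, 4  # output width and hidden width (arbitrary illustrative values)

# Two stacked fully connected layers with no activation in between.
W1, c1 = rng.normal(size=(H, 3)), rng.normal(size=H)
W2, c2 = rng.normal(size=(D, H)), rng.normal(size=D)

def two_layer(n):
    """Apply the two layers one after the other."""
    return W2 @ (W1 @ n + c1) + c2

# The same map written as a single affine transformation:
# W2 (W1 n + c1) + c2 = (W2 W1) n + (W2 c1 + c2)
A = W2 @ W1        # shape (D, 3)
b = W2 @ c1 + c2   # shape (D,)

def f_theta(n):
    """Collapsed network f_theta(n) = A n + b."""
    return A @ n + b

n_g = rng.normal(size=3)  # a stand-in for a global-space input
print(np.allclose(two_layer(n_g), f_theta(n_g)))
```

The same argument extends to any number of layers, which is why the rest of the derivation works with one $A$ and one $b$.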

Suppose $n^g=M\cdot n^l$, where $M$ is the rotation from local space to global space. Then

\[f_{\theta}(M\cdot n^l)=A_{D\times3}\cdot M\cdot n^l+b_{D\times1}\]

This $f_\theta$ is the upper bound on what a normal feature can achieve.

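Now we look for another neural network $f_{\theta'}$ with $\theta'=\{A',b'\}$ that takes the local-space input $n^l$, outputs a feature, and then has that feature rotated by $M$. We want this pipeline to have the same effect as $f_\theta$. Note that applying the $3\times3$ rotation $M$ to the output is dimensionally consistent as written only when $D=3$; for other $D$, the $M$ acting on the feature would have to be a corresponding $D\times D$ transform.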

\[\begin{align} f_{\theta}(n^g)&=M\cdot f_{\theta'}(n^l)\\ A_{D\times3}\cdot n^g+b_{D\times1}&=M\cdot(A'_{D\times3}\cdot n^l+b'_{D\times1})\\ A_{D\times3}\cdot M\cdot n^l+b_{D\times1}&=M\cdot A'_{D\times3}\cdot n^l+M\cdot b'_{D\times1} \end{align}\]

Since this must hold for every $n^l$, we only need

\[\begin{align} M\cdot A'_{D\times3}&=A_{D\times3}\cdot M\\ M\cdot b'_{D\times1}&=b_{D\times1} \end{align}\]

Since $M$ is invertible (for a rotation, $M^{-1}=M^{\top}$), we get

\[\begin{align} A'_{D\times3}&=M^{-1}\cdot A_{D\times3}\cdot M\\ b'_{D\times1}&=M^{-1}\cdot b_{D\times1} \end{align}\]
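As a sanity check on this result, here is a small NumPy sketch that builds $A'$ and $b'$ from a given $A$, $b$, and $M$ and compares both sides of $f_{\theta}(M\cdot n^l)=M\cdot f_{\theta'}(n^l)$. It assumes $D=3$ so that the same $3\times3$ rotation can act on the feature; the helper names `rotation_z` and `rotation_x`, the angles, and the random parameters are illustrative assumptions.

```python
import numpy as np

def rotation_z(angle):
    """3x3 rotation about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rotation_x(angle):
    """3x3 rotation about the x axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

rng = np.random.default_rng(0)

# Global-space network f_theta with D = 3 (assumption for this sketch).
A = rng.normal(size=(3, 3))
b = rng.normal(size=3)

# Local-to-global rotation M; its inverse is its transpose.
M = rotation_z(0.7) @ rotation_x(-1.2)
M_inv = M.T

# Local-space parameters from the derivation.
A_prime = M_inv @ A @ M
b_prime = M_inv @ b

n_l = rng.normal(size=3)  # local-space input
n_g = M @ n_l             # same vector expressed in global space

lhs = A @ n_g + b                      # f_theta(n^g)
rhs = M @ (A_prime @ n_l + b_prime)    # M . f_theta'(n^l)
print(np.allclose(lhs, rhs))
```

Using `M.T` instead of `np.linalg.inv(M)` relies on $M$ being a pure rotation; for a general invertible $M$ the explicit inverse would be needed.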

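That gives the network we were looking for. It shows that the following two pipelines:

global-space input, linear transformation, output
local-space input, linear transformation, output, then transform back to global space

are equivalent only if the network also receives information about the rotation $M$, because $A'$ and $b'$ both depend on $M$.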
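To see concretely why the rotation has to be supplied, the sketch below computes $A'=M^{-1}\cdot A\cdot M$ for two different rotations and the same $A$. It again assumes $D=3$; the fixed matrix `A`, the helper names `rotation_z` and `local_weight`, and the angles are hypothetical choices for illustration.

```python
import numpy as np

def rotation_z(angle):
    """3x3 rotation about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def local_weight(A, M):
    """A' = M^{-1} A M, using M^{-1} = M^T for a rotation."""
    return M.T @ A @ M

# A fixed global-space weight (arbitrary illustrative values).
A = np.arange(9, dtype=float).reshape(3, 3)

M1 = rotation_z(0.3)
M2 = rotation_z(1.1)

A1 = local_weight(A, M1)
A2 = local_weight(A, M2)

# The required local-space weight changes with the rotation.
weights_match = np.allclose(A1, A2)
print(weights_match)

# Special case: a weight that commutes with every rotation (here the
# identity) yields the same A' regardless of M.
I = np.eye(3)
identity_is_invariant = np.allclose(local_weight(I, M1), local_weight(I, M2))
print(identity_is_invariant)
```

A single fixed $\{A',b'\}$ therefore cannot reproduce $f_\theta$ for all rotations unless $A$ commutes with them (and $b$ is left unchanged by them), which is the sense in which the local-space network needs to know $M$.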