
Three Gradient-Based Optimization Methods and Their Comparison

1. Introduction

2. Preliminaries

$\mathrm{min}L\left(x\right)$ (2.1)

$L\left({x}^{*}\right)\le L\left(x\right),\ \forall x\in {R}^{n}$ (global minimum)

$L\left({x}^{*}\right)\le L\left(x\right),\ \forall x\in N$ (local minimum, $N$ a neighborhood of ${x}^{*}$)

$L\left({x}^{*}\right)<L\left(x\right),\ \forall x\in N,\ x\ne {x}^{*}$ (strict local minimum)

$\nabla L\left({x}^{*}\right)=0$

3. Model

${x}^{*}=\underset{x\in {R}^{D}}{\mathrm{arg}\,\mathrm{min}}\,L\left(x\right),\quad L\left(x\right)=\frac{1}{N}\underset{n=1}{\overset{N}{\sum }}{L}_{n}\left(x\right)$ (3.1)

4. Methods

(1) Gradient Descent (GD)

$\nabla L\left(x\right)=\frac{1}{N}\underset{n=1}{\overset{N}{\sum }}\nabla {L}_{n}\left(x\right)$ (4.1)
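As an illustrative sketch (not code from the paper), the GD update ${x}_{k+1}={x}_{k}-\alpha \nabla L\left({x}_{k}\right)$ with the averaged gradient (4.1) can be written in Python; the quadratic objective, starting point, and step size below are assumptions chosen for demonstration:

```python
import numpy as np

def gradient_descent(grad, x0, lr=0.01, iters=1000):
    """Full-batch GD: x_{k+1} = x_k - lr * grad(x_k), with grad as in (4.1)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x - lr * grad(x)
    return x

# Illustrative quadratic L(x) = ||x||^2 with gradient 2x; the minimizer is the origin.
x_star = gradient_descent(lambda x: 2 * x, x0=[3.0, -2.0])
```

With this convex quadratic the iterates contract geometrically toward the origin, so `x_star` is numerically zero after 1000 steps.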

(2) Stochastic Gradient Descent (SGD)

$E\left[\nabla {L}_{n}\left({x}_{k}\right)\right]=\nabla L\left({x}_{k}\right)$ (4.2)
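A minimal SGD sketch, again with an assumed toy objective: each update uses a single sample gradient $\nabla {L}_{n}\left({x}_{k}\right)$, which by (4.2) is an unbiased estimate of the full gradient. The shuffling scheme and constant step size are illustrative choices:

```python
import numpy as np

def sgd(grad_n, N, x0, lr=0.05, epochs=200, seed=0):
    """SGD: step along one sample gradient at a time; unbiased by (4.2)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(epochs):
        for n in rng.permutation(N):   # one pass over the shuffled samples
            x = x - lr * grad_n(x, n)
    return x

# Toy objective L(x) = (1/N) sum_n (x - a_n)^2, minimized at mean(a) = 2.5.
a = np.array([1.0, 2.0, 3.0, 4.0])
x_hat = sgd(lambda x, n: 2 * (x - a[n]), N=len(a), x0=0.0)
```

With a constant step size the iterate does not converge exactly but oscillates in a small neighborhood of the minimizer, which is the behavior the comparison in this paper examines.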

(3) Mini-Batch Stochastic Gradient Descent (MB-SGD)

$\nabla {L}_{B}\left({x}_{k}\right)=\frac{1}{M}\underset{m=1}{\overset{M}{\sum }}\nabla {L}_{{n}_{m}}\left({x}_{k}\right)$ (4.3)
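The mini-batch variant averages $M$ sample gradients per step, as in (4.3), which reduces the variance of the update relative to plain SGD. A sketch under the same assumed toy objective:

```python
import numpy as np

def minibatch_sgd(grad_n, N, x0, M=2, lr=0.05, epochs=200, seed=0):
    """MB-SGD: each step averages the gradients of a random batch of size M (4.3)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(epochs):
        idx = rng.permutation(N)
        for s in range(0, N, M):
            batch = idx[s:s + M]
            g = sum(grad_n(x, n) for n in batch) / len(batch)  # batch gradient (4.3)
            x = x - lr * g
    return x

# Same toy objective as before: minimizer is mean(a) = 2.5.
a = np.array([1.0, 2.0, 3.0, 4.0])
x_hat = minibatch_sgd(lambda x, n: 2 * (x - a[n]), N=len(a), x0=0.0)
```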

5. Numerical Examples

$\mathrm{min}f\left(x\right)=\frac{1}{n}\underset{i=1}{\overset{n}{\sum }}\left[{\left(1-{x}_{2i-1}\right)}^{2}+10{\left({x}_{2i}-{x}_{2i-1}^{2}\right)}^{2}\right]$ (5.1)

Figure 1. Graph of the objective when $x\in {R}^{2}$

$\nabla f\left(x\right)=\frac{1}{n}{\left[\frac{\partial f\left(x\right)}{\partial {x}_{1}},\frac{\partial f\left(x\right)}{\partial {x}_{2}},\cdots ,\frac{\partial f\left(x\right)}{\partial {x}_{n}}\right]}^{\text{T}}$ (5.2)

$\frac{\partial f\left(x\right)}{\partial {x}_{1}}=-2\left(1-{x}_{1}\right)-40{x}_{1}\left({x}_{2}-{x}_{1}^{2}\right)$ (5.3)

$\frac{\partial f\left(x\right)}{\partial {x}_{2}}=20\left({x}_{2}-{x}_{1}^{2}\right)$ (5.4)

$\frac{\partial f\left(x\right)}{\partial {x}_{3}}=-2\left(1-{x}_{3}\right)-40{x}_{3}\left({x}_{4}-{x}_{3}^{2}\right)$ (5.5)

$\frac{\partial f\left(x\right)}{\partial {x}_{4}}=20\left({x}_{4}-{x}_{3}^{2}\right)$ (5.6)

$\frac{\partial f\left(x\right)}{\partial {x}_{j}}=-2\left(1-{x}_{j}\right)-40{x}_{j}\left({x}_{j+1}-{x}_{j}^{2}\right)$ ($j$ odd) (5.7)

$\frac{\partial f\left(x\right)}{\partial {x}_{j}}=20\left({x}_{j}-{x}_{j-1}^{2}\right)$ ($j$ even) (5.8)
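As a sketch (assuming, per the pairing in (5.1), that $n$ counts the coordinate pairs $\left({x}_{2i-1},{x}_{2i}\right)$, so $x$ has $2n$ components), the objective and its closed-form partials (5.3)–(5.8) can be implemented and sanity-checked against central finite differences:

```python
import numpy as np

def f(x):
    """Objective (5.1): paired coordinates (x_{2i-1}, x_{2i}); n = number of pairs."""
    odd, even = x[0::2], x[1::2]          # 1-based odd / even coordinates
    return np.mean((1 - odd) ** 2 + 10 * (even - odd ** 2) ** 2)

def grad_f(x):
    """Closed-form partials (5.3)-(5.8), with the 1/n prefactor from (5.2)."""
    odd, even = x[0::2], x[1::2]
    g = np.empty_like(x)
    g[0::2] = -2 * (1 - odd) - 40 * odd * (even - odd ** 2)   # j odd, (5.7)
    g[1::2] = 20 * (even - odd ** 2)                          # j even, (5.8)
    return g / (x.size // 2)

def num_grad(func, x, h=1e-6):
    """Central finite differences, used to verify the analytic gradient."""
    g = np.zeros_like(x)
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        g[j] = (func(x + e) - func(x - e)) / (2 * h)
    return g
```

The minimum of this function is 0, attained at the all-ones vector, where every partial vanishes.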

Table 1. Experimental results

$\mathrm{min}f\left(x\right)=\frac{1}{n}\underset{i=1}{\overset{n}{\sum }}\left[{\left({x}_{i}-B\right)}^{2}-10\mathrm{cos}\left(2\pi \left({x}_{i}-B\right)\right)+10\right]+C$ (5.9)

Figure 2. Graph of the objective when $x\in {R}^{1}$, $B=0$, $C=0$

Table 2. The number of local minima changes with the value of n

$\frac{\partial f\left(x\right)}{\partial {x}_{1}}=2\left({x}_{1}-B\right)+20\pi \mathrm{sin}\left(2\pi \left({x}_{1}-B\right)\right)$ (5.10)

$\frac{\partial f\left(x\right)}{\partial {x}_{2}}=2\left({x}_{2}-B\right)+20\pi \mathrm{sin}\left(2\pi \left({x}_{2}-B\right)\right)$ (5.11)

$\vdots$

$\frac{\partial f\left(x\right)}{\partial {x}_{n}}=2\left({x}_{n}-B\right)+20\pi \mathrm{sin}\left(2\pi \left({x}_{n}-B\right)\right)$ (5.12)
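The second test function is a shifted Rastrigin-type objective with minimum value $C$ at $x=B\cdot \mathbf{1}$; its partials all share the form of (5.10)–(5.12). A sketch, with the $1/n$ factor of the objective carried into the gradient:

```python
import numpy as np

def rastrigin(x, B=0.0, C=0.0):
    """Shifted Rastrigin objective; minimum value C attained at x = B * ones."""
    z = x - B
    return np.mean(z ** 2 - 10 * np.cos(2 * np.pi * z) + 10) + C

def rastrigin_grad(x, B=0.0):
    """Partials (5.10)-(5.12), scaled by the 1/n in the objective."""
    z = x - B
    return (2 * z + 20 * np.pi * np.sin(2 * np.pi * z)) / x.size
```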

$|{\left({\bar{x}}_{k}^{*}\right)}_{i}-{\left({x}^{*}\right)}_{i}|<0.25,\ \forall i$ (5.13)
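Criterion (5.13) declares a run successful when every coordinate of the returned point lies within 0.25 of the true minimizer; presumably 0.25 is chosen below the unit spacing of the Rastrigin local minima, so success means landing in the correct basin. A one-line check (function name hypothetical):

```python
import numpy as np

def is_success(x_bar, x_star, tol=0.25):
    """Criterion (5.13): every coordinate within tol of the true minimizer."""
    d = np.abs(np.asarray(x_bar, dtype=float) - np.asarray(x_star, dtype=float))
    return bool(np.all(d < tol))
```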

Table 3. Simulation experiment results

6. Summary and Outlook