梯度下降英语是什么意思

梯度下降步骤可以概括为：

The gradient descent steps can be summarized as:

要更详细地理解梯度下降，请参考：

To get a more detailed understanding of gradient descent, please refer to:

我们使用的是固定学习速率来进行梯度下降。

We used a fixed learning rate for gradient descent.
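A minimal sketch of what a fixed learning rate means in practice: the same step size multiplies the gradient at every iteration. The objective `(x - 3)^2`, the starting point, and the value `0.1` are illustrative assumptions, not taken from the quoted source.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Minimize a 1-D function from its derivative, using a fixed step size."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)  # the learning rate never changes
    return x


# Hypothetical objective f(x) = (x - 3)^2, whose derivative is 2 * (x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)
```

With this step size the distance to the minimum shrinks by a constant factor each iteration, so the result ends up very close to 3.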

如果我们采用批量梯度下降，那么我们会被困在这里，因为这里的梯度始终会指向局部最小值点。

If we're doing Batch Gradient Descent, we will get stuck here, since the gradient will always point to the local minimum.

梯度下降可以被认为是向下走到山谷的底部，而不是爬上山丘。

Gradient Descent can be thought of as climbing down to the bottom of a valley, instead of climbing up a hill.

在导数达到最小误差值之前，我们会一直计算梯度下降，并且每个步骤都会取决于斜率（梯度）的陡度。

We keep running gradient descent until the derivative reaches the point of minimum error, and each step is determined by the steepness of the slope (gradient).

因此，梯度下降倾向于卡在局部最小值，这取决于地形的性质（或ML中的函数）。

Gradient Descent is therefore prone to getting stuck in a local minimum, depending on the nature of the terrain (or function, in ML terms).

随机梯度下降和朴素梯度下降之间唯一的区别是：前者使用了梯度的噪声近似。

The only difference between stochastic gradient descent and vanilla gradient descent is that the former uses a noisy approximation of the gradient.
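The "noisy approximation" can be illustrated with a sketch like the one below. It fits a single weight `w` in `y ≈ w * x`: the vanilla version averages the gradient over the whole dataset, while the stochastic version uses the gradient of one randomly chosen sample. The dataset, learning rate, and function names are assumptions made for illustration.

```python
import random


def full_gradient(w, data):
    """Exact gradient of the mean squared error over the whole dataset."""
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)


def stochastic_gradient(w, data, rng):
    """Noisy estimate of the same gradient from one randomly drawn sample."""
    x, y = rng.choice(data)
    return 2 * (w * x - y) * x


# Hypothetical noise-free data generated by y = 2x.
data = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
rng = random.Random(0)

w_batch = w_sgd = 0.0
for _ in range(500):
    w_batch -= 0.05 * full_gradient(w_batch, data)
    w_sgd -= 0.05 * stochastic_gradient(w_sgd, data, rng)

print(w_batch, w_sgd)
```

Both runs share the same update rule; only the gradient estimate differs. On this toy data both weights approach 2.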

我们使用随机梯度下降训练我们的模型，batch大小为128，momentum为0.9，权重衰减率为0.0005。

We trained our models using stochastic gradient descent with a batch size of 128 examples, momentum of 0.9, and weight decay of 0.0005.
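One common way to combine momentum and weight decay in a single SGD update is sketched below for a scalar weight. The exact update rule, the learning rate `0.01`, and the function name `sgd_momentum_step` are assumptions; the quoted sentence gives only the batch size, momentum, and weight decay values. The gradient passed in is assumed to be already averaged over one mini-batch.

```python
def sgd_momentum_step(w, v, grad, learning_rate=0.01,
                      momentum=0.9, weight_decay=0.0005):
    """One SGD update with momentum and weight decay for a scalar weight.

    v is the velocity carried over from the previous step; grad is the
    loss gradient averaged over the current mini-batch (assumed given).
    """
    v = momentum * v - weight_decay * learning_rate * w - learning_rate * grad
    w = w + v
    return w, v


# Hypothetical values for a single update.
w, v = sgd_momentum_step(w=1.0, v=0.0, grad=0.5)
print(w, v)
```

The velocity term lets past gradients keep influencing the current step, and the weight decay term pulls the weight slightly toward zero on every update.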

现在我们将结合计算图和梯度下降的概念，来看看如何更新逻辑回归的参数。

Now we will take the concept of computation graphs and gradient descent together and see how the parameters of logistic regression can be updated.
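As a rough sketch of that idea, the code below runs a forward pass (`z = w·x + b`, `a = sigmoid(z)`), uses the cross-entropy result `dL/dz = a - y` for the backward pass, and then updates `w` and `b` by gradient descent. The toy dataset (logical AND), learning rate, and epoch count are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, labels, learning_rate=1.0, epochs=2000):
    """Batch gradient descent for logistic regression with cross-entropy loss."""
    n_features = len(samples[0])
    m = len(samples)
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        dw = [0.0] * n_features
        db = 0.0
        for x, y in zip(samples, labels):
            # Forward pass through the computation graph.
            a = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Backward pass: for cross-entropy loss, dL/dz = a - y.
            dz = a - y
            for j in range(n_features):
                dw[j] += x[j] * dz / m
            db += dz / m
        # Gradient descent update of the parameters.
        w = [wi - learning_rate * dwi for wi, dwi in zip(w, dw)]
        b -= learning_rate * db
    return w, b


def predict(w, b, x):
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0


# Hypothetical toy data: the logical AND of two binary inputs.
samples = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [0, 0, 0, 1]
w, b = train_logistic(samples, labels)
predictions = [predict(w, b, x) for x in samples]
print(predictions)
```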

梯度下降的一个直观理解就是想象一条源自山顶的河流。

An intuitive way to think of Gradient Descent is to imagine the path of a river originating from the top of a mountain.

梯度下降区域的房子都是平滑曲线以及密切交织在一起的模式，看起来几乎更像一个丛林，而不是一座城市。

The houses in the gradient descent district are all smooth curves and densely intertwined patterns, almost more like a jungle than a city.

同时，每个最先进的深度学习库包含各种梯度下降优化算法的实现，（例如：lasagne，caffe和keras）。

At the same time, every state-of-the-art Deep Learning library contains implementations of various algorithms to optimize gradient descent (e.g. lasagne's, caffe's, and keras' documentation).

实际上，事实证明，虽然神经网络有时是令人畏惧的结构，但使它们工作的机制出奇地简单：随机梯度下降。

Actually, it turns out that while neural networks are sometimes intimidating structures, the mechanism for making them work is surprisingly simple: stochastic gradient descent.

我通常也把同样的方法用在逻辑回归中，来监测梯度下降，以确保它正确收敛。

I usually apply that same method to logistic regression too, to monitor gradient descent and make sure it is converging correctly.
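One simple way to monitor convergence, sketched here under assumed names and values, is to record the cost after every iteration and check that it never increases. The quadratic cost and the two learning rates are illustrative only; the quoted sentence does not specify how the monitoring is done.

```python
def descend_with_history(cost, grad, x0, learning_rate, steps=50):
    """Run gradient descent and record the cost after every iteration."""
    x = x0
    history = [cost(x)]
    for _ in range(steps):
        x -= learning_rate * grad(x)
        history.append(cost(x))
    return x, history


def is_converging(history):
    """True if the cost never increases from one iteration to the next."""
    return all(later <= earlier for earlier, later in zip(history, history[1:]))


def cost(x):
    return (x - 3) ** 2  # hypothetical cost function


def grad(x):
    return 2 * (x - 3)


_, good_history = descend_with_history(cost, grad, x0=0.0, learning_rate=0.1)
_, bad_history = descend_with_history(cost, grad, x0=0.0, learning_rate=1.1)
print(is_converging(good_history), is_converging(bad_history))
```

A cost that rises between iterations, as in the second run, usually suggests the learning rate is too large.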

然而，它仍然是一个不错的教学工具，有助于全面了解关于梯度下降的一些最重要的想法。

However, it still serves as a decent pedagogical tool to get some of the most important ideas about gradient descent across the board.

请注意，它是在训练神经网络时的go-to算法，它是深度学习中最常见的梯度下降类型。

This is the go-to algorithm when training a neural network, and it is the most common type of gradient descent within deep learning.

虽然GQN的训练十分困难，由于隐变量z的存在，我们可以借助变分推理，并借助SGD（随机梯度下降）进行优化。

Although the GQN training objective is intractable, owing to the presence of latent variables, we can employ variational approximations and optimize with stochastic gradient descent.

更具体地说，它描述了使用具有梯度下降的线性回归的最底层的概念。

More specifically, it describes the underlying concepts of using linear regression with gradient descent.

不幸的是，该函数不适合梯度下降（没有可以下降的梯度：每一处的导数都为0）。

Unfortunately, it is incompatible with gradient descent (there is no gradient to descend: the derivative is null everywhere).

在本篇博客中，我们先回顾了梯度下降的三个变种方法，其中mini-batch是最流行的。

In this blog post, we have initially looked at the three variants of gradient descent, among which mini-batch gradient descent is the most popular.
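The three variants differ only in how many samples feed each update, which a sketch like the following makes concrete: `batch_size` equal to the dataset size gives batch gradient descent, `1` gives stochastic gradient descent, and anything in between gives mini-batch gradient descent. The model `y ≈ w * x`, the data, and the hyperparameters are illustrative assumptions.

```python
import random


def train(data, batch_size, learning_rate=0.05, epochs=200, seed=0):
    """Fit y ≈ w * x by gradient descent on the mean squared error."""
    rng = random.Random(seed)
    samples = list(data)
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(samples)  # visit the samples in a new order each epoch
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= learning_rate * grad
    return w


# Hypothetical noise-free data generated by y = 2x.
data = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]

w_batch = train(data, batch_size=len(data))  # batch: all samples per update
w_sgd = train(data, batch_size=1)            # stochastic: one sample per update
w_mini = train(data, batch_size=2)           # mini-batch: a small subset per update
print(w_batch, w_sgd, w_mini)
```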

下面继续我们的示例问题，梯度下降的过程会像这样：

Continuing with our example problem, the gradient descent procedure would go something like this:

关键在于，该架构的每个组件都是可微分的，使其可以直接使用梯度下降进行训练。

Crucially, every component of the architecture is differentiable, making it straightforward to train with gradient descent.

同时，每个优秀的深度学习库都包含了优化梯度下降的多种算法的实现（比如，lasagne、caffe和keras的文档）。

At the same time, every state-of-the-art Deep Learning library contains implementations of various algorithms to optimize gradient descent (e.g. lasagne's, caffe's, and keras' documentation).

第二种方法通过梯度下降来优化得分（梯度下降是机器学习中常用的一种数学技术，用于进行小幅、渐进的改进），从而得到高度精确的结构。

The second method optimised scores through gradient descent (a mathematical technique commonly used in machine learning for making small, incremental improvements), which resulted in highly accurate structures.

当然，可以为x，y和z选择一些随机的起始值，然后使用梯度下降找到最小的w，但是谁在乎呢？

Sure, we can pick some random starting values for x, y and z, and then use gradient descent to find the smallest w, but who cares?

结果显示在消防训练区周围湿地的全氟辛烷磺酸的含量很高，并在邻近区域呈梯度下降（2.2-0.2微克/升）。

The results showed highly elevated levels of PFOS in a wetland in the vicinity of a fire drill area, with a declining gradient out in the adjacent bay (2.2-0.2 µg/L).

不同于图灵机的是，NTM是一个可微的计算机，能够使用梯度下降进行训练，对于学习程序来说是一个很实用的机制。

Unlike a Turing machine, an NTM is a differentiable computer that can be trained by gradient descent, yielding a practical mechanism for learning programs.
