Examples of using Stochastic gradient descent in English and their translations into Chinese
Add momentum-based stochastic gradient descent to network2.py.
If we define the batch size to be 1, this is called stochastic gradient descent.
An idea called Stochastic Gradient Descent can be used to speed up learning.
Here, we focus on a technique known as data-parallel stochastic gradient descent (SGD).
The use of stochastic gradient descent implicitly carries information about the network state.
The network itself was trained using momentum-based mini-batch stochastic gradient descent.
In stochastic gradient descent we define our cost function as the cost of a single example.
That gives a nice, compact rule for doing stochastic gradient descent with L1 regularization.
Stochastic gradient descent is a simple yet very efficient approach to fit linear models.
So, does this mean that in practice we should always perform this one-example stochastic gradient descent?
In contrast, previous analyses of stochastic gradient descent methods for SVMs require Ω(1/ϵ²) iterations.
I used the stochastic gradient descent optimizer with a learning rate of 0.01 and a momentum of 0.9.
Some of the most popular optimization algorithms used are Stochastic Gradient Descent (SGD), ADAM, and RMSprop.
Stochastic Gradient Descent is sensitive to feature scaling, so it is highly recommended to scale your data.
These deep learning techniques are based on stochastic gradient descent and backpropagation, but also introduce new ideas.
Using small batches of random data is called stochastic training; in this case, stochastic gradient descent.
Variations such as SGD (stochastic gradient descent) or minibatch gradient descent typically perform better in practice.
The weights corresponding to these gates are also updated using BPTT stochastic gradient descent as it seeks to minimize a cost function.
The only difference between stochastic gradient descent and vanilla gradient descent is that the former uses a noisy approximation of the gradient.
Notable examples of this include a demonstration that neural networks trained by stochastic gradient descent can fit randomly-assigned labels [81].
We trained our models using stochastic gradient descent with a batch size of 128 examples, momentum of 0.9, and weight decay of 0.0005.
Actually, it turns out that while neural networks are sometimes intimidating structures, the mechanism for making them work is surprisingly simple: stochastic gradient descent.
The model was trained using batch stochastic gradient descent, with specific values for momentum and weight decay.
Although the GQN training objective is intractable, owing to the presence of latent variables, we can employ variational approximations and optimize with stochastic gradient descent.
To construct such an example, we first need to figure out how to apply our stochastic gradient descent learning algorithm in a regularized neural network.
The class SGDClassifier implements a plain stochastic gradient descent learning routine which supports different loss functions and penalties for classification.
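Several of the examples above mention training with mini-batch stochastic gradient descent using a learning rate of 0.01, momentum of 0.9, weight decay of 0.0005, and a batch size of 128. The sketch below is not taken from any of the quoted sources; it simply illustrates, under assumed details (a linear regression model on synthetic data), how those hyperparameters combine in a momentum-based SGD update.

```python
# A minimal sketch of mini-batch stochastic gradient descent with momentum
# and weight decay, using the hyperparameters that recur in the examples
# above (learning rate 0.01, momentum 0.9, weight decay 0.0005, batch
# size 128). The linear model and synthetic data are illustrative
# assumptions, not part of any quoted source.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: y = X @ w_true + noise
n_samples, n_features = 10_000, 20
X = rng.normal(size=(n_samples, n_features))
w_true = rng.normal(size=n_features)
y = X @ w_true + 0.1 * rng.normal(size=n_samples)

# Model parameters and momentum buffer
w = np.zeros(n_features)
velocity = np.zeros(n_features)

lr, momentum, weight_decay, batch_size = 0.01, 0.9, 0.0005, 128

for epoch in range(5):
    # Shuffle and split the data into random mini-batches
    order = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        idx = order[start:start + batch_size]
        Xb, yb = X[idx], y[idx]

        # Noisy estimate of the mean-squared-error gradient on this
        # mini-batch, plus an L2 (weight decay) term
        grad = 2.0 * Xb.T @ (Xb @ w - yb) / len(idx) + weight_decay * w

        # Momentum update: accumulate a velocity, then take a step
        velocity = momentum * velocity - lr * grad
        w += velocity

    mse = np.mean((X @ w - y) ** 2)
    print(f"epoch {epoch}: mse = {mse:.4f}")
```

Setting batch_size to 1 recovers the one-example stochastic gradient descent mentioned in the examples, while setting it to n_samples recovers plain (batch) gradient descent.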