Q-LEARNING 中文是什么意思 - 中文翻译

名词
q-learning
Q学习
q学习

在 英语 中使用 Q-learning 的示例及其翻译为 中文

{-}
  • Political category close
  • Ecclesiastic category close
  • Programming category close
Reinforced learning, see Q-learning;
强化学习-Qlearning.
The Q-learning update formula is:.
此时的Q-Learning的更新公式为.
One approach to the above-discussed problem is called Q-learning.
解决问题的一种方法叫做Q-learning
Q-learning is about learning Q-values through observations.
Q学习是通过观察来学习Q值的。
A simple description of Q-learning can be summarized as follows:.
Q学习的简单描述可以总结如下:.
Q-learning is one of the easiest Reinforcement Learning algorithms.
Q-Learning是最著名的强化学习算法之一。
A simple description of Q-learning can be summarized as follows:.
Q学习算法的简单描述可以总结如下:.
There is a simple procedure to learn all the Q-values called Q-learning.
它只需要一个简单的被称为Q学习的过程来学习所有的Q值。
These include Q-Learning, SARSA and some other variants.
这些包括Q-学习,SARSA和其他一些变体。
The tools you will use will be TD-Learning, Q-Learning and genetic algorithms.
你将使用的工具将是TD-Learning,Q-Learning和遗传算法。
These include Q-Learning, SARSA and some other variants.
这其中包括Q-Learning、SARSA及其他算法。
The tools you will use will be TD-Learning, Q-Learning and genetic algorithms.
你将使用的工具包括TD-Learning、Q-Learning和遗传算法。
These include Q-Learning, SARSA and some other variants.
这些包括Q-Learning,SARSA和其他一些变体。
The tools that you would use include TD-Learning, Q-Learning and genetic algorithms.
你将使用的工具包括TD-Learning、Q-Learning和遗传算法。
These include Q-Learning, SARSA and some other variants.
这里面包括Q-Learning,SARSA和一些其它变型。
Machine learning approaches such as reinforcement learning andin particular, Q-learning might be applicable in this context.
强化学习、尤其是Q学习等机械学习方法可能适用于这种情况。
Q-learning is a values-based learning algorithm in reinforcement learning.
Q-Learning是强化学习中基于价值的学习算法。
Reinforcement learning(Q-learning, temporal difference learning).
Q-Learning以及时间差学习(Temporaldifferencelearning).
Q-Learning is considered to be one of the most important breakthroughs in Reinforcement Learning.
Q-Learning是最著名的强化学习算法之一。
This is formulated as a Markov Decision Process(MDP), and Q-learning is used to perform the optimization.
这被形式化为了一个马尔可夫决策过程(MDP),然后使用Q学习来执行优化。
The popular Q-learning algorithm is known to overestimate action values under certain conditions.
众所周知,流行的Q学习算法会高估某些条件下的动作值。
In this course, you will be introduced to the foundation of RL methods,such as value/policy iteration, Q-learning, policy gradient, and many more.
在这里您将发现:-RL方法的基础:价值/政策迭代,q学习,政策梯度等。
In 2015, DeepMind showed its Deep Q-learning AI figuring out how to play Atari breakout.
年,DeepMind展示了它的深度Q-learningAI,该AI能够解决如何玩Ataribreakout。
Currently, there are a multitude of algorithms that can be used to perform TD control,including Sarsa, Q-learning, and Expected Sarsa.
目前,有大量算法可用于执行TD控制,包括Sarsa、Q-learning和ExpectedSarsa。
Additionally, Q-learning can handle problems with stochastic transitions and rewards, without requiring adaptations.
此外,Q学习可以处理随机过渡和奖励的问题,而不需要任何适应。
Unlike policy learning, Q-Learning takes two inputs- state and action- and returns a value for each pair.
与策略学习不同,Q-Learning算法有两个输入,分别是状态和动作,并为每个状态动作对返回对应值。
By contrast, Q-learning has no constraint over the next action, as long as it maximizes the Q-value for the next state.
相比之下,Q-learning对下一个动作没有约束,只要它能较大化下一个状态的Q值就行了。
结果: 27, 时间: 0.0335

顶级字典查询

英语 - 中文