Examples of the use of "theta one" in English and their translations into Vietnamese
Just trying to draw the same function J of theta one.
If theta zero is 1.5 and theta one is 0, then the hypothesis function will look like this.
So I'm going to write minimize over theta zero, theta one.
And so I have theta one minus a negative number which means I'm actually going to increase theta, right?
What we did last time was, right, when we only had theta one.
Now, my derivative term, d d theta one J of theta one, when evaluated at this point, is gonna look like, right.
What we want to do is come up with values for the parameters theta zero and theta one.
So how do we come up with values of theta zero and theta one that correspond to a good fit to the data?
The height of the surface of the points indicates the value of J of theta zero, theta one.
In contrast, the cost function, J, that's a function of the parameter theta one, which controls the slope of the straight line.
So, this is a 3-D surface plot, where the axes are labeled theta zero and theta one.
And so, in your gradient descent update, you have theta one gets updated as theta one minus alpha times zero.
All three of these points that I just drew in magenta, they have the same value for J(theta zero, theta one).
Now given these values of theta zero and theta one, we want to plot the corresponding, you know, cost function on the right.
For illustration in this specific presentation, I have initialised theta zero at about 900, and theta one at about minus 0.1, okay?
And, this notation, minimize over theta zero and theta one, this means find me the values of theta zero and theta one that cause this expression to be minimized.
And this second term here, that term is just a partial derivative with respect to theta one that we worked out on the previous line.
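Several of these sentences describe the same procedure: theta one is updated to theta one minus alpha times the partial derivative of J with respect to theta one. A minimal sketch of that simultaneous update for univariate linear regression follows; the function and variable names (gradient_step, xs, ys, alpha) are illustrative assumptions, not identifiers from the course.

```python
def gradient_step(theta0, theta1, xs, ys, alpha):
    """One simultaneous gradient descent step for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    # Prediction errors h(x) - y for every training example
    errors = [(theta0 + theta1 * x) - y for x, y in zip(xs, ys)]
    # Partial derivatives of J(theta0, theta1) = (1 / 2m) * sum of squared errors
    d_theta0 = sum(errors) / m
    d_theta1 = sum(e * x for e, x in zip(errors, xs)) / m
    # Simultaneous update: both derivatives use the old parameter values
    return theta0 - alpha * d_theta0, theta1 - alpha * d_theta1
```

Note that both partial derivatives are computed from the old parameter values before either parameter moves, which is the "simultaneous update" convention these transcript lines refer to.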
What we're going to do in this video is talk about how to go about choosing these two parameter values, theta zero and theta one.
But now we have two parameters, theta zero, and theta one, and so the plot gets a little more complicated.
When we talk about the method in linear regression for how to solve for the parameters, theta zero and theta one, all in one shot.
To introduce a little bit more terminology, these theta zero and theta one, right, these theta i's are what I call the parameters of the model.
And we set the parameter theta zero to be zero. In the next video, we will go back to the original problem formulation and look at some visualizations involving both theta zero and theta one.
And what I want to do is minimize over theta zero and theta one my function J of theta zero comma theta one.
I'm gonna decrease theta one, and we can see this is the right thing to do, because I actually went ahead in this direction, you know, to get me closer to the minimum over there.
Unlike before, unlike the last video, I'm going to keep both of my parameters, theta zero, and theta one, as we generate our visualizations for the cost function.
The idea is we're going to choose our parameters theta zero, theta one so that h(x), meaning the value we predict on input x, is at least close to the values y for the examples in our training set, for our training examples.
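The criterion in this sentence, choosing theta zero and theta one so that h(x) is close to y on the training set, is made precise by the squared-error cost function J that the surrounding sentences discuss. A minimal sketch; the helper name compute_cost is an assumption, not from the source.

```python
def compute_cost(theta0, theta1, xs, ys):
    """Squared-error cost J(theta0, theta1) = (1 / 2m) * sum((h(x) - y)^2),
    where h(x) = theta0 + theta1 * x and m is the number of training examples."""
    m = len(xs)
    return sum(((theta0 + theta1 * x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)
```

A perfect fit drives J to zero; any mismatch between the predictions h(x) and the targets y makes J strictly positive, which is why minimizing J over theta zero and theta one yields a good fit.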
Now here's a different hypothesis that's, you know, still not a great fit for the data, but maybe slightly better. So here, right, that's my point; those are my parameters theta zero, theta one, and so my theta zero value.
So as you vary theta zero and theta one, the two parameters, you get different values of the cost function J(theta zero, theta one) and the height of this surface above a particular point of theta zero, theta one.
And so, what we want is to have software to find the value of theta zero, theta one that minimizes this function and in the next video we start to talk about an algorithm for automatically finding that value of theta zero and theta one that minimizes the cost function J.
It turns out that you can have, you know, negative values of theta one as well. So if theta one is negative, then h of x would be equal to, say, minus 0.5 times x. Then theta one is minus 0.5, and so that corresponds to a hypothesis with a slope of negative 0.5.
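The negative-slope case this sentence describes is easy to check numerically. A small sketch, assuming the hypothesis form h(x) = theta0 + theta1 * x used throughout these lines; the function name h and the default parameter values are illustrative.

```python
def h(x, theta0=0.0, theta1=-0.5):
    # Hypothesis h(x) = theta0 + theta1 * x; with theta1 = -0.5 the line
    # slopes downward, matching the example in the sentence above.
    return theta0 + theta1 * x
```

With theta0 = 0 and theta1 = -0.5, the prediction at x = 2 is -1.0, and increasing x always decreases h(x), i.e. the line has slope negative 0.5.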