Examples of using Loss function in English and their translations into Russian
It is important for the loss function to be convex.
The loss function also affects the convergence rate for an algorithm.
For example, in online classification, the prediction domain and the loss functions are not convex.
The Huber loss function describes the penalty incurred by an estimation procedure f.
In such scenarios, two simple techniques for convexification are used: randomisation and surrogate loss functions.
This familiar loss function is used in Ordinary Least Squares regression.
The optimal regularization in hindsight can be derived for linear loss functions; this leads to the AdaGrad algorithm.
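For illustration, a minimal sketch of an AdaGrad-style per-coordinate update, assuming NumPy; the function name, the step size eta, and the epsilon constant are placeholders chosen for this example, not taken from the source.

import numpy as np

def adagrad_step(w, grad, accum, eta=0.1, eps=1e-8):
    # Accumulate squared gradients per coordinate and scale the step
    # by the inverse square root of that running sum.
    accum = accum + grad ** 2
    w = w - eta * grad / (np.sqrt(accum) + eps)
    return w, accum

w, accum = np.zeros(3), np.zeros(3)
for grad in np.array([[1.0, 0.1, 0.0], [0.5, 0.2, 0.0]]):
    w, accum = adagrad_step(w, grad, accum)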
Different loss functions are used depending on whether the problem is one of regression or one of classification.
The above proved a regret bound for linear loss functions $v_t(w) = \langle w, z_t \rangle$.
If a loss function could be specified then critical areas could be identified in forecasting methodology where research is most needed.
In an inference context the loss function would take the form of a scoring rule.
Train the network, letting NetTrain automatically infer that it should attach cross-entropy loss functions to both outputs.
BrownBoost uses a non-convex potential loss function, thus it does not fit into the AdaBoost framework.
Hyperparameter optimization finds a tuple of hyperparameters that yields an optimal model which minimizes a predefined loss function on given independent data.
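As an illustration of this idea, a hedged sketch of the simplest hyperparameter search, an exhaustive grid over candidate values scored by a predefined loss on held-out data; the helper names candidates, fit, and evaluate_loss are hypothetical placeholders, not part of the source.

from itertools import product

def grid_search(candidates, fit, evaluate_loss):
    # Try every combination of hyperparameter values, train a model with
    # each, and keep the tuple that minimizes the loss on independent data.
    best_params, best_loss = None, float("inf")
    for values in product(*candidates.values()):
        params = dict(zip(candidates.keys(), values))
        model = fit(params)            # train with this hyperparameter tuple
        loss = evaluate_loss(model)    # predefined loss on held-out data
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss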
The most common loss function for regression is the square loss function, also known as the L2-norm.
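For concreteness, a minimal sketch of the square (L2) loss in Python; the NumPy-based implementation and the function name square_loss are assumptions made for this example.

import numpy as np

def square_loss(y_true, y_pred):
    # Square (L2) loss: mean of the squared differences between
    # predictions and targets, the usual regression objective.
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

# Example: predictions [2.5, 0.0, 2.0] against targets [3.0, -0.5, 2.0]
print(square_loss([3.0, -0.5, 2.0], [2.5, 0.0, 2.0]))  # 0.1666...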
For decision-making, Bayesian statisticians might use a Bayes factor combined with a prior distribution and a loss function associated with making the wrong choice.
The choice of loss function is a determining factor in the function $f_S$ that will be chosen by the learning algorithm.
Some classification models, such as naive Bayes, logistic regression and multilayer perceptrons (when trained under an appropriate loss function), are naturally probabilistic.
The choice of loss function here gives rise to several well-known learning algorithms such as regularized least squares and support vector machines.
The posterior gives a universal sufficient statistic for detection applications,when choosing values for the variable subset that minimize some expected loss function, for instance the probability of decision error.
The Huber loss is a loss function used in robust regression that is less sensitive to outliers in data than the squared error loss.
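A hedged sketch of the Huber loss just described, assuming NumPy and a threshold parameter delta chosen for illustration: it is quadratic for small residuals and linear for large ones, which is what makes it less sensitive to outliers.

import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    # Quadratic for residuals smaller than delta, linear beyond it,
    # so large outliers are penalised less harshly than under squared error.
    residual = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    small = np.abs(residual) <= delta
    quadratic = 0.5 * residual ** 2
    linear = delta * (np.abs(residual) - 0.5 * delta)
    return np.mean(np.where(small, quadratic, linear))

# With delta=1, an outlier of size 10 contributes about 9.5
# instead of 50 under the half squared error.
print(huber_loss([0.0, 0.0], [0.5, 10.0], delta=1.0))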
In estimation theory and decision theory, a Bayes estimator or a Bayes action is an estimator or decision rule that minimizes the posterior expected value of a loss function, i.e., the posterior expected loss.
In this case the set of actions is the parameter space, and a loss function details the cost of the discrepancy between the true value of the parameter and the estimated value.
For the Euclidean regularisation, one can show a regret bound of $O(\sqrt{T})$, which can be improved further to $O(\log T)$ for strongly convex and exp-concave loss functions.
These two different scales of loss function for uncertainty are both useful, according to how well each reflects the particular circumstances of the problem in question.
The framework is that of repeated game playing, as follows: for $t = 1, 2, \ldots, T$, the learner receives input $x_t$; the learner outputs $w_t$ from a fixed convex set $S$; nature sends back a convex loss function $v_t : S \rightarrow \mathbb{R}$.
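A minimal sketch of this repeated-game protocol, assuming online projected gradient descent as the learner's update rule; the learning rate eta and the projection onto the unit ball are illustrative choices, not specified by the sentence above.

import numpy as np

def online_gradient_descent(inputs, loss_grad, project, w0, eta=0.1):
    # One pass of the protocol: at each round t the learner plays w_t,
    # nature reveals a convex loss v_t, and the learner takes a
    # projected gradient step within the convex set S.
    w = np.asarray(w0, dtype=float)
    plays = []
    for t, x_t in enumerate(inputs):
        plays.append(w.copy())           # learner outputs w_t
        g = loss_grad(t, w, x_t)         # gradient of v_t at w_t
        w = project(w - eta * g)         # projected gradient step onto S
    return plays

# Example with linear losses v_t(w) = <w, z_t> and S = unit ball.
rng = np.random.default_rng(0)
zs = rng.normal(size=(5, 3))
grad = lambda t, w, x: zs[t]                        # gradient of a linear loss
proj = lambda w: w / max(1.0, np.linalg.norm(w))    # projection onto the unit ball
ws = online_gradient_descent(range(5), grad, proj, w0=np.zeros(3))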
Commonly used loss functions for probabilistic classification include log loss and the Brier score between the predicted and the true probability distributions.
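For concreteness, a small sketch of both scores for a binary classifier; the vectorised NumPy form and the clipping constant eps are assumptions added for numerical safety, not part of the source.

import numpy as np

def log_loss(y_true, p_pred, eps=1e-12):
    # Log loss (cross-entropy) between labels in {0,1} and predicted
    # probabilities of the positive class.
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1 - eps)
    y = np.asarray(y_true, dtype=float)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def brier_score(y_true, p_pred):
    # Brier score: mean squared difference between predicted
    # probabilities and the 0/1 outcomes.
    return np.mean((np.asarray(p_pred, dtype=float) - np.asarray(y_true, dtype=float)) ** 2)

y = [1, 0, 1, 1]
p = [0.9, 0.2, 0.6, 0.4]
print(log_loss(y, p), brier_score(y, p))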
For the final error to be exactly $1 - \operatorname{erf}(\sqrt{c})$, the variance of the loss function must decrease linearly w.r.t. time to form the 0-1 loss function at the end of the boosting iterations.
In this setting, the loss function is given as $V : Y \times Y \to \mathbb{R}$, such that $V(f(x), y)$ measures the difference between the predicted value $f(x)$ and the true value $y$.
In addition to the standard hinge loss $(1 - yf(x))_+$ for labeled data, a loss function $(1 - |f(x)|)_+$ is introduced over the unlabeled data by letting $y = \operatorname{sign}(f(x))$.
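The two hinge terms above can be written directly; the following sketch assumes NumPy and labels in {-1, +1}, with function names chosen for illustration.

import numpy as np

def hinge_loss(y, f_x):
    # Standard hinge loss (1 - y f(x))_+ for labeled data, y in {-1, +1}.
    return np.maximum(0.0, 1.0 - np.asarray(y) * np.asarray(f_x))

def unlabeled_hinge_loss(f_x):
    # Hinge term (1 - |f(x)|)_+ for unlabeled data, obtained by
    # setting y = sign(f(x)) in the standard hinge loss.
    return np.maximum(0.0, 1.0 - np.abs(np.asarray(f_x)))

print(hinge_loss([1, -1], [0.3, 0.8]))       # [0.7, 1.8]
print(unlabeled_hinge_loss([0.3, -0.8]))     # [0.7, 0.2]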