What is 'fit' in machine learning? I noticed in some cases it is a synonym for training.

Can someone please explain in layman's term?

1

Best Answer


A machine learning model is typically specified with some functional form that includes parameters.

An example is a line intended to model data that has an outcome variable y that can be described in terms of a feature x. In that case, the functional form would be:

y = mx + b

fitting the model means finding values for m and b that are in accordance with training data, which is a set of points (x1, y1), (x2, y2), ..., (xN, yN). It may not be possible to set m and b such that the line passes through all training data points, but some loss function could be defined for describing a well-fit line. The fitting algorithm's purpose would be to minimize that loss function. In the case of line fitting, the loss could be the total distance of training data points to the line, but it may be more mathematically convenient to set the loss to the total squared distance of training data points to the line.

In general, a model can be more complex than a line and include many parameters. For some models, the number of parameters is not fixed and can change as part of the fitting process. The features and the outcome variable can be discrete, continuous, and/or multidimensional. For unsupervised problems, there is no outcome variable.

In all these cases, fitting is still analogous to the line example above, where an algorithm is run to find model parameters that in some sense explain the training data. This often involves running some optimization procedure.

A model that is well-fit to the training data may not be well-fit to other non-training data, even if the other data is sampled from the same distribution as the training data. A technique called regularization can be used to address this issue.