How and where the method of least squares is used. Least squares in Excel

17.10.2019

Approximation of experimental data is a method based on replacing the experimentally obtained data with an analytical function that passes as closely as possible to, or coincides with, the initial values at the nodal points (the values obtained during an experiment or observation). There are currently two ways to define such an analytic function:

By constructing an interpolation polynomial of degree n that passes directly through all points of the given data array. In this case the approximating function is represented as an interpolation polynomial in Lagrange form or in Newton form.

By constructing an approximating polynomial of degree n that passes close to the points of the given data array. In this way the approximating function smooths out all the random noise (or errors) that may occur during the experiment: the values measured in the experiment depend on random factors that fluctuate according to their own random laws (measurement or instrument errors, inaccuracies or experimental errors). In this case the approximating function is determined by the least squares method.

The least squares method (in the English literature, Ordinary Least Squares, OLS) is a mathematical method based on determining an approximating function that is built in the closest proximity to the points of a given array of experimental data. The closeness of the initial and approximating functions F(x) is measured numerically: the sum of the squared deviations of the experimental data from the approximating curve F(x) should be the smallest.

Fitting curve constructed by the least squares method

The least squares method is used:

To solve overdetermined systems of equations when the number of equations exceeds the number of unknowns;

To search for a solution in the case of ordinary (not overdetermined) nonlinear systems of equations;

For approximating point values ​​by some approximating function.

The approximating function in the least squares method is determined from the condition of the minimum of the sum of squared deviations of the calculated approximating function from the given array of experimental data. This criterion of the least squares method is written as the following expression:

S = Σᵢ (F(xᵢ) - yᵢ)² → min, i = 1 … N,

where F(xᵢ) are the values of the calculated approximating function at the nodal points xᵢ, and yᵢ is the specified array of experimental data at the nodal points xᵢ.

The quadratic criterion has a number of "good" properties, such as differentiability, and it provides a unique solution to the approximation problem with polynomial approximating functions.

Depending on the conditions of the problem, the approximating function is taken as a polynomial of degree m:

F(x) = a0 + a1·x + a2·x² + … + am·x^m.

The degree m of the approximating polynomial does not depend on the number of nodal points, but it must always be less than the dimension (number of points) of the given array of experimental data.

∙ If the degree of the approximating function is m=1, then we approximate the table function with a straight line (linear regression).

∙ If the degree of the approximating function is m=2, then we approximate the table function with a quadratic parabola (quadratic approximation).

∙ If the degree of the approximating function is m=3, then we approximate the table function with a cubic parabola (cubic approximation).

In the general case, when it is required to construct an approximating polynomial of degree m from the given tabular values, the condition of the minimum of the sum of squared deviations over all nodal points is rewritten in the following form:

S(a0, a1, …, am) = Σᵢ (a0 + a1·xᵢ + a2·xᵢ² + … + am·xᵢ^m - yᵢ)² → min,

where a0, a1, …, am are the unknown coefficients of the approximating polynomial of degree m, and N is the number of specified tabular values.

A necessary condition for the existence of a minimum of a function is that its partial derivatives with respect to the unknown variables a0, a1, …, am equal zero. As a result, we obtain the following system of equations:

∂S/∂ak = 2·Σᵢ (a0 + a1·xᵢ + … + am·xᵢ^m - yᵢ)·xᵢ^k = 0,  k = 0, 1, …, m.

Let us transform the resulting linear system of equations: open the brackets and move the free terms to the right-hand side of the expression. As a result, the system of linear algebraic equations is written in the following form:

a0·Σᵢ xᵢ^k + a1·Σᵢ xᵢ^(k+1) + … + am·Σᵢ xᵢ^(k+m) = Σᵢ yᵢ·xᵢ^k,  k = 0, 1, …, m.

This system of linear algebraic equations can be rewritten in matrix form, written out after this paragraph.

As a result, a system of linear equations of dimension m+1 is obtained, which contains m+1 unknowns. This system can be solved using any method for solving systems of linear algebraic equations (for example, the Gauss method). As a result of the solution, the unknown parameters of the approximating function are found that provide the minimum sum of squared deviations of the approximating function from the original data, i.e. the best possible approximation in the least squares sense. It should be remembered that if even one value of the initial data changes, all the coefficients change their values, since they are completely determined by the initial data.
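Written out explicitly in the notation above (a standard way of presenting it, with all sums running over i = 1 … N), the matrix form of these normal equations is:

```latex
\begin{pmatrix}
N & \sum x_i & \sum x_i^2 & \cdots & \sum x_i^m \\
\sum x_i & \sum x_i^2 & \sum x_i^3 & \cdots & \sum x_i^{m+1} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\sum x_i^m & \sum x_i^{m+1} & \sum x_i^{m+2} & \cdots & \sum x_i^{2m}
\end{pmatrix}
\begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_m \end{pmatrix}
=
\begin{pmatrix} \sum y_i \\ \sum x_i y_i \\ \vdots \\ \sum x_i^m y_i \end{pmatrix}
```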

Approximation of initial data by linear dependence

(linear regression)

As an example, consider the method of determining the approximating function, which is given as a linear relationship F(x) = a + b·x. In accordance with the least squares method, the condition of the minimum of the sum of squared deviations is written as follows:

S(a, b) = Σᵢ (a + b·xᵢ - yᵢ)² → min,

where xᵢ, yᵢ are the coordinates of the nodal points of the table, and a, b are the unknown coefficients of the approximating function, which is given as a linear relationship.

A necessary condition for the existence of a minimum of a function is that its partial derivatives with respect to the unknown variables equal zero. As a result, we obtain the following system of equations:

∂S/∂a = 2·Σᵢ (a + b·xᵢ - yᵢ) = 0,
∂S/∂b = 2·Σᵢ (a + b·xᵢ - yᵢ)·xᵢ = 0.

Let us transform the resulting linear system of equations:

a·N + b·Σᵢ xᵢ = Σᵢ yᵢ,
a·Σᵢ xᵢ + b·Σᵢ xᵢ² = Σᵢ xᵢ·yᵢ.

We solve the resulting system of linear equations. The coefficients of the approximating function in analytical form are determined as follows (Cramer's method):

b = (N·Σᵢ xᵢyᵢ - Σᵢ xᵢ · Σᵢ yᵢ) / (N·Σᵢ xᵢ² - (Σᵢ xᵢ)²),
a = (Σᵢ yᵢ - b·Σᵢ xᵢ) / N.

These coefficients provide the construction of a linear approximating function in accordance with the criterion of minimizing the sum of squared deviations of the approximating function from the given tabular values (experimental data).

Algorithm for implementing the method of least squares

1. Initial data:

Given an array of experimental data with the number of measurements N

The degree of the approximating polynomial (m) is given

2. Calculation algorithm:

2.1. The coefficients for constructing the system of equations of dimension (m+1)×(m+1) are determined:

A[k][j] = Σᵢ xᵢ^(k+j) are the coefficients of the system of equations (left-hand side of the equations), where j is the index of the column number and k is the index of the row number of the square matrix of the system of equations;

B[k] = Σᵢ yᵢ·xᵢ^k are the free terms of the system of linear equations (right-hand side of the equations).

2.2. Formation of the system of linear equations of dimension (m+1)×(m+1).

2.3. Solution of the system of linear equations in order to determine the unknown coefficients of the approximating polynomial of degree m.

2.4. Determination of the sum of squared deviations of the approximating polynomial from the initial values over all nodal points:

S = Σᵢ (a0 + a1·xᵢ + … + am·xᵢ^m - yᵢ)².

The found value of the sum of squared deviations is the minimum possible.
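As an illustration of steps 2.1-2.4, here is a minimal C++ sketch of this algorithm (the function and variable names are chosen for this example and do not come from any particular library): it builds the normal-equations matrix A and right-hand side B described in step 2.1, solves the system by Gaussian elimination, and returns the coefficients together with the sum of squared deviations. For m = 1 it reproduces the linear-regression coefficients derived above.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Fit a polynomial a0 + a1*x + ... + am*x^m to the points (x[i], y[i])
// by the least squares method: build the (m+1)x(m+1) normal equations
// and solve them by Gaussian elimination with partial pivoting.
std::vector<double> fit_polynomial(const std::vector<double>& x,
                                   const std::vector<double>& y,
                                   int m, double* sum_sq_dev = nullptr)
{
    const int n = static_cast<int>(x.size());
    assert(n == static_cast<int>(y.size()) && n > m);

    // Step 2.1: coefficients of the system.
    // A[k][j] = sum_i x_i^(k+j),  B[k] = sum_i y_i * x_i^k.
    std::vector<std::vector<double>> A(m + 1, std::vector<double>(m + 1, 0.0));
    std::vector<double> B(m + 1, 0.0);
    for (int i = 0; i < n; i++) {
        for (int k = 0; k <= m; k++) {
            B[k] += y[i] * std::pow(x[i], k);
            for (int j = 0; j <= m; j++)
                A[k][j] += std::pow(x[i], k + j);
        }
    }

    // Steps 2.2-2.3: solve the linear system A*a = B (Gauss method).
    std::vector<double> a(m + 1, 0.0);
    for (int col = 0; col <= m; col++) {
        int pivot = col;
        for (int row = col + 1; row <= m; row++)
            if (std::fabs(A[row][col]) > std::fabs(A[pivot][col])) pivot = row;
        std::swap(A[col], A[pivot]);
        std::swap(B[col], B[pivot]);
        for (int row = col + 1; row <= m; row++) {
            double f = A[row][col] / A[col][col];
            for (int j = col; j <= m; j++) A[row][j] -= f * A[col][j];
            B[row] -= f * B[col];
        }
    }
    for (int row = m; row >= 0; row--) {
        double s = B[row];
        for (int j = row + 1; j <= m; j++) s -= A[row][j] * a[j];
        a[row] = s / A[row][row];
    }

    // Step 2.4: sum of squared deviations of the polynomial from the data.
    if (sum_sq_dev) {
        double S = 0.0;
        for (int i = 0; i < n; i++) {
            double f = 0.0;
            for (int k = 0; k <= m; k++) f += a[k] * std::pow(x[i], k);
            S += (f - y[i]) * (f - y[i]);
        }
        *sum_sq_dev = S;
    }
    return a;
}
```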

Approximation with Other Functions

It should be noted that when approximating the initial data in accordance with the least squares method, a logarithmic function, an exponential function, and a power function are sometimes used as an approximating function.

Logarithmic approximation

Consider the case when the approximating function is given by a logarithmic function of the form F(x) = a·ln(x) + b.

The method of least squares (LSM) allows you to estimate various quantities using the results of many measurements containing random errors.

Characteristics of the least squares method

The main idea of this method is that the sum of squared errors is taken as the criterion of the accuracy of the solution of the problem, and this sum is sought to be minimized. When using this method, both numerical and analytical approaches can be applied.

In particular, as a numerical implementation, the least squares method implies making as many measurements of an unknown random variable as possible. Moreover, the more calculations, the more accurate the solution will be. On this set of calculations (initial data), another set of proposed solutions is obtained, from which the best one is then selected. If the set of solutions is parametrized, then the least squares method will be reduced to finding the optimal value of the parameters.

As an analytical approach to the implementation of LSM, on the set of initial data (measurements) and the proposed set of solutions a certain functional is defined, which can be expressed by a formula obtained as a certain hypothesis that needs to be confirmed. In this case, the least squares method is reduced to finding the minimum of this functional on the set of squared errors of the initial data.

Note that it is not the errors themselves that are summed, but the squares of the errors. Why? The fact is that the deviations of the measurements from the exact value are often both positive and negative. When determining the average, simple summation can lead to an incorrect conclusion about the quality of the estimate, since the mutual cancellation of positive and negative values reduces the effective power of the set of measurements and, consequently, the accuracy of the estimate.

To prevent this from happening, the squared deviations are summed. Moreover, in order to equalize the dimension of the measured quantity and of the final estimate, the square root is extracted from the sum of squared errors.
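A tiny numerical illustration of this point (the two deviations are made up purely for the example): the deviations +1 and -1 cancel in a plain average, but not in a mean of squares, and taking the square root restores the original units:

```latex
e_1 = +1,\; e_2 = -1:\qquad \frac{e_1 + e_2}{2} = 0,
\qquad \sqrt{\frac{e_1^2 + e_2^2}{2}} = 1.
```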

Some applications of the least squares method

The least squares method is widely used in various fields. For example, in probability theory and mathematical statistics the method is used to determine such a characteristic of a random variable as the standard deviation, which describes the width of the range of values of the random variable.


Introduction

I am a computer programmer. I made the biggest leap in my career when I learned to say: "I do not understand anything!" Now I am not ashamed to tell a luminary of science who is giving me a lecture that I do not understand what he, the luminary, is talking about. And it is very difficult. Yes, it is hard and embarrassing to admit that you don't know. Who likes to admit that he does not know the basics of something? By virtue of my profession, I have to attend a large number of presentations and lectures, where, I confess, in the vast majority of cases I feel sleepy because I do not understand anything. And I don't understand because a huge problem of the current situation in science lies in mathematics: it assumes that all listeners are familiar with absolutely all areas of mathematics (which is absurd). To admit that you do not know what a derivative is (we will get to what it is a little later) is a shame.

But I've learned to say that I don't know what multiplication is. Yes, I don't know what a subalgebra over a Lie algebra is. Yes, I do not know why quadratic equations are needed in life. By the way, if you are sure that you know, then we have something to talk about! Mathematics is a series of tricks. Mathematicians try to confuse and intimidate the public; where there is no confusion, no reputation, no authority. Yes, it is prestigious to speak in the most abstract language possible, which is complete nonsense in itself.

Do you know what a derivative is? Most likely you will tell me about the limit of the difference quotient. In my first year in the mathematics department of St. Petersburg State University, Viktor Petrovich Khavin defined the derivative for me as the coefficient of the first term of the Taylor series of a function at a point (it was a separate gymnastics to define the Taylor series without derivatives). I laughed at this definition for a long time, until I finally understood what it was about. The derivative is nothing more than a measure of how similar the function we are differentiating is to the functions y=x, y=x^2, y=x^3.

I now have the honor of lecturing to students who are afraid of mathematics. If you are afraid of mathematics, we are fellow travelers. As soon as you try to read some text and it seems to you that it is overly complicated, then know that it is badly written. I argue that there is not a single area of mathematics that cannot be explained "on the fingers" without losing accuracy.

The challenge for the near future: I instructed my students to understand what a linear-quadratic controller is. Don't be shy, spend three minutes of your life, follow the link. If you do not understand anything, then we are fellow travelers. I (a professional mathematician-programmer) did not understand anything either. And I assure you, this can be sorted out "on the fingers." At the moment I do not know what it is, but I assure you that we will be able to figure it out.

So, the first lecture that I am going to give to my students, after they come running to me in horror with the words that the linear-quadratic controller is a terrible nasty thing that you will never master in your life, is the method of least squares. Can you solve linear equations? If you are reading this text, then most likely not.

So, given two points (x0, y0), (x1, y1), for example, (1,1) and (3,2), the task is to find the equation of a straight line passing through these two points:

illustration

This straight line should have an equation of the following form:

alpha·x + beta = y.

Here alpha and beta are unknown to us, but two points of this line are known:

alpha·x0 + beta = y0,
alpha·x1 + beta = y1.

You can write this system in matrix form:

((x0, 1), (x1, 1)) · (alpha, beta)ᵀ = (y0, y1)ᵀ.

Here we should make a lyrical digression: what is a matrix? A matrix is nothing but a two-dimensional array. This is a way of storing data, and no other meaning should be attached to it. It is up to us how exactly to interpret a particular matrix. Periodically I will interpret it as a linear mapping, periodically as a quadratic form, and sometimes simply as a set of vectors. This will all be clarified in context.

Let's replace the specific matrices with their symbolic representation:

A·x = b, where x = (alpha, beta)ᵀ.

Then (alpha, beta) can be easily found:

x = A⁻¹·b.

More specifically, for our previous data:

A = ((1, 1), (3, 1)), b = (1, 2)ᵀ, so (alpha, beta) = A⁻¹·b = (1/2, 1/2).

Which leads to the following equation of a straight line passing through the points (1,1) and (3,2):

y = x/2 + 1/2.

Okay, everything is clear here. And let's find the equation of a straight line passing through three points: (x0,y0), (x1,y1) and (x2,y2):

Oh-oh-oh, but we have three equations for two unknowns! The standard mathematician will say that there is no solution. What will the programmer say? He will first rewrite the previous system of equations in the following form:

alpha·i + beta·j = b, where i = (x0, x1, x2)ᵀ, j = (1, 1, 1)ᵀ, b = (y0, y1, y2)ᵀ.

In our case, the vectors i, j, b are three-dimensional; therefore (in the general case) there is no solution to this system. Any vector (alpha*i + beta*j) lies in the plane spanned by the vectors (i, j). If b does not belong to this plane, then there is no solution (equality in the equation cannot be achieved). What to do? Let's look for a compromise. Let's denote by e(alpha, beta) how far we are from achieving equality:

e(alpha, beta) = alpha·i + beta·j - b.

And we will try to minimize this error:

‖e(alpha, beta)‖² → min.

Why a square?

We are looking not just for the minimum of the norm, but for the minimum of the square of the norm. Why? The minimum point itself is the same, but the square gives a smooth function (a quadratic function of the arguments (alpha, beta)), while the length alone gives a cone-shaped function, non-differentiable at the minimum point. Brr. The square is more convenient.

Obviously, the error is minimized when the vector e is orthogonal to the plane spanned by the vectors i and j.

Illustration

In other words: we are looking for a line such that the sum of the squared lengths of the distances from all points to this line is minimal:

UPDATE: I made a mistake here: the distance to the line should be measured vertically, not as an orthogonal projection. This commenter is right.

Illustration

In completely different words (carefully, poorly formalized, but it should be clear on the fingers): we take all possible lines between all pairs of points and look for the average line between all:

Illustration

Another explanation on the fingers: we attach a spring between all data points (here we have three) and the line that we are looking for, and the line of the equilibrium state is exactly what we are looking for.

Quadratic form minimum

So, given the vector b and the plane spanned by the column vectors of the matrix A (in this case (x0, x1, x2) and (1, 1, 1)), we are looking for a vector e with the minimum square of length. Obviously, the minimum is achievable only for a vector e that is orthogonal to the plane spanned by the column vectors of the matrix A:

Aᵀ·e = Aᵀ·(A·x - b) = 0.

In other words, we are looking for a vector x=(alpha, beta) such that:

Aᵀ·A·x = Aᵀ·b.

I remind you that this vector x=(alpha, beta) is the minimum of the quadratic function ||e(alpha, beta)||^2:

||e(alpha, beta)||² = (A·x - b)ᵀ·(A·x - b).

Here it is useful to remember that a matrix can also be interpreted as a quadratic form; for example, the identity matrix ((1,0),(0,1)) can be interpreted as the function x^2 + y^2:

quadratic form

All this gymnastics is known as linear regression.
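A minimal C++ sketch of exactly this construction (the three sample points and all names are made up for the illustration): form the 2x2 system AᵀA·x = Aᵀb described above and solve it directly.

```cpp
#include <array>
#include <cstdio>

int main() {
    // Three sample points (illustrative data): the system alpha*x_i + beta = y_i
    // is overdetermined, so we fit the line y = alpha*x + beta in the least
    // squares sense via the normal equations A^T*A * (alpha, beta)^T = A^T*b.
    const std::array<double, 3> x = {1.0, 2.0, 3.0};
    const std::array<double, 3> y = {1.0, 2.5, 2.0};

    // Entries of A^T*A and A^T*b, where the columns of A are (x0, x1, x2)
    // and (1, 1, 1), and b = (y0, y1, y2).
    double sxx = 0, sx = 0, sxy = 0, sy = 0;
    const double n = static_cast<double>(x.size());
    for (int i = 0; i < 3; i++) {
        sxx += x[i] * x[i];
        sx  += x[i];
        sxy += x[i] * y[i];
        sy  += y[i];
    }

    // Solve the 2x2 system by Cramer's rule.
    const double det   = sxx * n - sx * sx;
    const double alpha = (sxy * n - sy * sx) / det;
    const double beta  = (sxx * sy - sx * sxy) / det;

    std::printf("y = %g * x + %g\n", alpha, beta);
    return 0;
}
```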

Laplace equation with Dirichlet boundary condition

Now the simplest real problem: there is a certain triangulated surface, it is necessary to smooth it. For example, let's load my face model:

The original commit is available. To minimize external dependencies, I took the code of my software renderer, already on Habré. To solve the linear system, I use OpenNL , it's a great solver, but it's very difficult to install: you need to copy two files (.h + .c) to your project folder. All smoothing is done by the following code:

for (int d=0; d<3; d++) {
    nlNewContext();
    nlSolverParameteri(NL_NB_VARIABLES, verts.size());
    nlSolverParameteri(NL_LEAST_SQUARES, NL_TRUE);
    nlBegin(NL_SYSTEM);
    nlBegin(NL_MATRIX);
    // one row per vertex: the new position should stay close to the old one
    for (int i=0; i<(int)verts.size(); i++) {
        nlBegin(NL_ROW);
        nlCoefficient(i, 1);
        nlRightHandSide(verts[i][d]);
        nlEnd(NL_ROW);
    }
    // one row per triangle edge: neighboring vertices should be similar
    // (faces is assumed to be a std::vector of triangles storing three vertex indices)
    for (unsigned int i=0; i<faces.size(); i++) {
        std::vector<int> &face = faces[i];
        for (int j=0; j<3; j++) {
            nlBegin(NL_ROW);
            nlCoefficient(face[ j ], 1);
            nlCoefficient(face[(j+1)%3], -1);
            nlEnd(NL_ROW);
        }
    }
    nlEnd(NL_MATRIX);
    nlEnd(NL_SYSTEM);
    nlSolve();
    for (int i=0; i<(int)verts.size(); i++) {
        verts[i][d] = nlGetVariable(i);
    }
}

The X, Y and Z coordinates are separable, so I smooth them separately. That is, I solve three systems of linear equations, each with as many variables as there are vertices in my model. The first n rows of the matrix A have only one 1 per row, and the first n components of the vector b contain the original model coordinates. That is, I attach a spring between the new vertex position and the old vertex position: the new ones should not be too far from the old ones.

All subsequent rows of the matrix A (faces.size()*3 = the number of edges of all triangles in the mesh) have one occurrence of 1 and one occurrence of -1, while the corresponding components of the vector b are zero. This means that I put a spring on each edge of our triangular mesh: all edges try to make their starting and ending vertices coincide.

Once again: all vertices are variables, and they cannot deviate far from their original position, but at the same time they try to become similar to each other.

Here is the result:

Everything would be fine, the model is indeed smoothed, but it has moved away from its original boundary. Let's change the code a little:

for (int i=0; i<(int)verts.size(); i++) {
    float scale = border[i] ? 1000 : 1;
    nlBegin(NL_ROW);
    nlCoefficient(i, scale);
    nlRightHandSide(scale*verts[i][d]);
    nlEnd(NL_ROW);
}

In our matrix A, for the vertices that are on the boundary, I add not a row from the category v_i = verts[i][d], but 1000*v_i = 1000*verts[i][d]. What does this change? It changes our quadratic form of the error. Now a unit deviation from the original position at a boundary vertex will cost not one unit, as before, but 1000*1000 units. That is, we hang a stronger spring on the boundary vertices, so the solution prefers to stretch the other ones more strongly. Here is the result:

Let's double the strength of the springs between the vertices:
nlCoefficient(face[ j ], 2); nlCoefficient(face[(j+1)%3], -2);

It is logical that the surface has become smoother:

And now even a hundred times stronger:

What is this? Imagine that we have dipped a wire ring in soapy water. As a result, the resulting soap film will try to have as little curvature as possible, while touching our border, the wire ring. This is exactly what we got by fixing the border and asking for a smooth surface inside. Congratulations, we have just solved the Laplace equation with Dirichlet boundary conditions. Sounds cool? But in fact, it is just one system of linear equations to solve.

Poisson equation

Let's have another cool name.

Let's say I have an image like this:

Everything is fine, but I don't like the chair.

I'll cut the picture in half:



And I will select a chair with my hands:

Then I will drag everything that is white in the mask to the left side of the picture, and at the same time I will say throughout the whole picture that the difference between two neighboring pixels should be equal to the difference between two neighboring pixels of the right image:

for (int i=0; i

Here is the result:

Code and pictures are available

Least squares method

The least squares method (OLS, Ordinary Least Squares) is one of the basic methods of regression analysis for estimating the unknown parameters of regression models from sample data. The method is based on minimizing the sum of squares of the regression residuals.

It should be noted that the least squares method itself can be called a method for solving a problem in any area if the solution consists in, or satisfies, some criterion of minimizing the sum of squares of some functions of the unknown variables. Therefore, the least squares method can also be used for an approximate representation (approximation) of a given function by other (simpler) functions, when finding a set of quantities that satisfy equations or constraints whose number exceeds the number of these quantities, and so on.

The essence of the least squares method

Let some (parametric) model of a probabilistic (regression) dependence between the (explained) variable y and a set of factors (explanatory variables) x be given:

y = f(x, b) + ε,

where b is the vector of unknown model parameters and ε is the random model error.

Let there also be sample observations of the values of these variables. Let t be the observation number (t = 1, …, n). Then y_t and x_t are the values of the variables in the t-th observation. Then, for given values of the parameters b, it is possible to calculate the theoretical (model) values of the explained variable y:

ŷ_t = f(x_t, b).

The values of the residuals e_t = y_t - ŷ_t depend on the values of the parameters b.

The essence of LSM (ordinary, classical) is to find parameters b for which the sum of the squares of the residuals (Residual Sum of Squares) is minimal:

RSS(b) = Σ_t e_t² = Σ_t (y_t - f(x_t, b))² → min.

In the general case, this problem can be solved by numerical methods of optimization (minimization). In this case one speaks of nonlinear least squares (NLS or NLLS, Non-Linear Least Squares). In many cases an analytical solution can be obtained. To solve the minimization problem, it is necessary to find the stationary points of the function by differentiating it with respect to the unknown parameters b, equating the derivatives to zero, and solving the resulting system of equations:

Σ_t (y_t - f(x_t, b)) · ∂f(x_t, b)/∂b = 0.

If the random errors of the model are normally distributed, have the same variance, and are not correlated with each other, the least squares parameter estimates are the same as the maximum likelihood method (MLM) estimates.

LSM in the case of a linear model

Let the regression dependence be linear:

y_t = Σ_j b_j·x_tj + ε_t.

Let y be the column vector of observations of the explained variable, and X the matrix of observations of the factors (the rows of the matrix are the vectors of factor values in a given observation, the columns are the vectors of values of a given factor in all observations). The matrix representation of the linear model has the form:

y = X·b + ε.

Then the vector of estimates of the explained variable and the vector of regression residuals will be equal to

ŷ = X·b,  e = y - ŷ = y - X·b;

accordingly, the sum of the squares of the regression residuals will be equal to

RSS = eᵀ·e = (y - X·b)ᵀ·(y - X·b).

Differentiating this function with respect to the parameter vector b and equating the derivatives to zero, we obtain a system of equations (in matrix form):

Xᵀ·X·b = Xᵀ·y.

The solution of this system of equations gives the general formula for the least squares estimates for the linear model:

b̂ = (Xᵀ·X)⁻¹·Xᵀ·y = (Xᵀ·X / n)⁻¹·(Xᵀ·y / n).

For analytical purposes, the last representation of this formula turns out to be useful. If the data in the regression model are centered, then in this representation the first matrix has the meaning of the sample covariance matrix of the factors, and the second is the vector of covariances of the factors with the dependent variable. If, in addition, the data are also normalized by the standard deviation (that is, ultimately standardized), then the first matrix has the meaning of the sample correlation matrix of the factors, and the second vector is the vector of sample correlations of the factors with the dependent variable.
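In symbols, this interpretation of the last representation can be written as follows (a standard way of putting it, with V̂(x) denoting the sample covariance matrix of the factors and ĉov(x, y) the vector of their sample covariances with the dependent variable, for centered data):

```latex
\hat b = \left(\frac{X^\top X}{n}\right)^{-1}\frac{X^\top y}{n}
       = \widehat V(x)^{-1}\,\widehat{\operatorname{cov}}(x, y).
```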

An important property of the OLS estimates for models with a constant term: the line of the constructed regression passes through the center of gravity of the sample data, that is, the equality

ȳ = x̄ᵀ·b̂

is fulfilled.

In particular, in the extreme case when the only regressor is a constant, we find that the OLS estimate of the single parameter (the constant itself) is equal to the mean value of the variable being explained. That is, the arithmetic mean, known for its good properties from the laws of large numbers, is also a least squares estimate: it satisfies the criterion of the minimum sum of squared deviations from it.
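This special case is easy to verify directly: for the model y_t = c + ε_t, minimizing the sum of squared deviations with respect to c gives the sample mean:

```latex
\frac{d}{dc}\sum_{t=1}^{n}(y_t - c)^2 = -2\sum_{t=1}^{n}(y_t - c) = 0
\quad\Longrightarrow\quad
\hat c = \frac{1}{n}\sum_{t=1}^{n} y_t = \bar{y}.
```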

Example: simple (pairwise) regression

In the case of paired linear regression y_t = a + b·x_t + ε_t, the calculation formulas are simplified (you can do without matrix algebra):

b̂ = Σ_t (x_t - x̄)(y_t - ȳ) / Σ_t (x_t - x̄)²,  â = ȳ - b̂·x̄.

Properties of OLS estimates

First of all, we note that for linear models, the least squares estimates are linear estimates, as follows from the above formula. For unbiased OLS estimates, it is necessary and sufficient to fulfill the most important condition of regression analysis: the mathematical expectation of a random error conditional on the factors must be equal to zero. This condition is satisfied, in particular, if

  1. the mathematical expectation of random errors is zero, and
  2. factors and random errors are independent random variables.

The second condition, the condition of exogeneity of the factors, is fundamental. If this property is not satisfied, then we can assume that almost any estimates will be extremely unsatisfactory: they will not even be consistent (that is, even a very large amount of data does not allow obtaining qualitative estimates in this case). In the classical case, a stronger assumption is made, that the factors are deterministic, in contrast to the random error, which automatically means that the exogeneity condition is satisfied. In the general case, for the consistency of the estimates, it is sufficient that the exogeneity condition be satisfied together with the convergence of the matrix XᵀX/n to some non-singular matrix as the sample size increases to infinity.

In order for the estimates of (ordinary) least squares to be, in addition to consistent and unbiased, also efficient (the best in the class of linear unbiased estimates), additional properties of the random error must be fulfilled:

∙ constant (identical) variance of the random errors in all observations (homoscedasticity): V(ε_t) = σ²;

∙ absence of correlation between the random errors of different observations: cov(ε_t, ε_s) = 0 for t ≠ s.

These assumptions can be formulated for the covariance matrix of the random error vector: V(ε) = σ²·I.

A linear model that satisfies these conditions is called classical. The least squares estimators for classical linear regression are unbiased, consistent and the most efficient estimators in the class of all linear unbiased estimators (in the English literature the abbreviation BLUE, Best Linear Unbiased Estimator, is used; in the domestic literature the Gauss-Markov theorem is more often cited). As it is easy to show, the covariance matrix of the vector of coefficient estimates will be equal to

V(b̂) = σ²·(Xᵀ·X)⁻¹.

Generalized least squares

The method of least squares allows for a wide generalization. Instead of minimizing the sum of squares of the residuals, one can minimize some positive definite quadratic form of the residual vector, eᵀ·W·e, where W is some symmetric positive definite weight matrix. Ordinary least squares is the special case of this approach in which the weight matrix is proportional to the identity matrix. As is known from the theory of symmetric matrices (or operators), such matrices admit a decomposition W = Pᵀ·P. Therefore, this functional can be represented as eᵀ·Pᵀ·P·e = (P·e)ᵀ·(P·e), that is, as the sum of the squares of some transformed "residuals" P·e. Thus, we can distinguish a whole class of least squares methods: LS-methods (Least Squares).

It is proved (Aitken's theorem) that for a generalized linear regression model (in which no restrictions are imposed on the covariance matrix of the random errors), the most efficient estimates (in the class of linear unbiased estimates) are those of so-called generalized least squares (GLS, Generalized Least Squares): the LS-method with the weight matrix equal to the inverse covariance matrix of the random errors, W = V⁻¹.

It can be shown that the formula for the GLS estimates of the parameters of the linear model has the form

b̂_GLS = (Xᵀ·V⁻¹·X)⁻¹·Xᵀ·V⁻¹·y.

The covariance matrix of these estimates, respectively, will be equal to

V(b̂_GLS) = (Xᵀ·V⁻¹·X)⁻¹.

In fact, the essence of GLS lies in a certain (linear) transformation (P) of the original data and the application of ordinary least squares to the transformed data. The purpose of this transformation is that for the transformed data the random errors already satisfy the classical assumptions.
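A short sketch of why this works, under the standard assumptions that V = σ²·Ω and that P is chosen so that P·Ω·Pᵀ = I (so that PᵀP = Ω⁻¹): applying P to the model and running ordinary least squares on the transformed data reproduces the GLS formula:

```latex
Py = PXb + P\varepsilon, \qquad V(P\varepsilon) = \sigma^2\, P\Omega P^\top = \sigma^2 I,
\qquad
\hat b = \bigl((PX)^\top PX\bigr)^{-1}(PX)^\top Py
       = \bigl(X^\top\Omega^{-1}X\bigr)^{-1}X^\top\Omega^{-1}y.
```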

Weighted least squares

In the case of a diagonal weight matrix (and hence a diagonal covariance matrix of the random errors), we have so-called weighted least squares (WLS, Weighted Least Squares). In this case, the weighted sum of squares of the residuals of the model is minimized, that is, each observation receives a "weight" that is inversely proportional to the variance of the random error in that observation: Σ_t e_t² / σ_t² → min. In fact, the data are transformed by weighting the observations (dividing by an amount proportional to the assumed standard deviation of the random errors), and ordinary least squares is applied to the weighted data.
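A minimal C++ sketch of weighted least squares for a straight line y = a + b·x (all function and variable names, as well as the sample data, are made up for the illustration; the weights w_t = 1/σ_t² are assumed to be known):

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Weighted least squares fit of y = a + b*x.
// Each observation gets the weight w[i] = 1 / sigma_i^2, so observations with
// a larger error variance influence the result less.
void wls_line(const std::vector<double>& x, const std::vector<double>& y,
              const std::vector<double>& w, double& a, double& b)
{
    double sw = 0, swx = 0, swy = 0, swxx = 0, swxy = 0;
    for (std::size_t i = 0; i < x.size(); i++) {
        sw   += w[i];
        swx  += w[i] * x[i];
        swy  += w[i] * y[i];
        swxx += w[i] * x[i] * x[i];
        swxy += w[i] * x[i] * y[i];
    }
    // Weighted normal equations:
    //   a*sw  + b*swx  = swy
    //   a*swx + b*swxx = swxy
    const double det = sw * swxx - swx * swx;
    b = (sw * swxy - swx * swy) / det;
    a = (swy - b * swx) / sw;
}

int main() {
    // Illustrative data: the last point is assumed to be twice as noisy,
    // so it receives a four times smaller weight (w = 1/sigma^2).
    std::vector<double> x = {0, 1, 2, 3}, y = {0.1, 1.0, 2.1, 4.0};
    std::vector<double> w = {1, 1, 1, 0.25};
    double a, b;
    wls_line(x, y, w, a, b);
    std::printf("y = %g + %g * x\n", a, b);
    return 0;
}
```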

Some special cases of application of LSM in practice

Linear Approximation

Consider the case when, as a result of studying the dependence of a certain scalar quantity y on a certain scalar quantity x (this can be, for example, the dependence of the voltage on the current strength: U = R·I, where R is a constant value, the resistance of the conductor), these quantities were measured, as a result of which the values xᵢ and their corresponding values yᵢ were obtained. The measurement data should be recorded in a table.

Table. Measurement results (measurement No. 1-6; the measured values xᵢ and yᵢ are entered in the corresponding columns).

The question is: what value of the coefficient k can be chosen to best describe the dependence y = kx? According to the least squares method, this value should be such that the sum of the squared deviations of the values yᵢ from the values kxᵢ,

S(k) = Σᵢ (yᵢ - k·xᵢ)²,

is minimal.

The sum of squared deviations has a single extremum, a minimum, which allows us to use this condition. Let us find the value of the coefficient k from it. To do this, we transform its left-hand side as follows:

dS/dk = -2·Σᵢ xᵢ·(yᵢ - k·xᵢ) = 0, whence k = Σᵢ xᵢyᵢ / Σᵢ xᵢ².

The last formula allows us to find the value of the coefficient k, which was required in the problem.

History

Until the beginning of the 19th century, scientists did not have definite rules for solving a system of equations in which the number of unknowns is less than the number of equations; until that time, particular methods were used, depending on the type of equations and on the ingenuity of the calculators, and therefore different calculators, starting from the same observational data, came to different conclusions. Gauss (1795) is credited with the first application of the method, and Legendre (1805) independently discovered and published it under its modern name (French: Méthode des moindres quarrés). Laplace related the method to probability theory, and the American mathematician Adrain (1808) considered its probabilistic applications. The method became widespread and was improved by further research by Encke, Bessel, Hansen and others.

Alternative uses of the least squares method

The idea of ​​the least squares method can also be used in other cases not directly related to regression analysis. The fact is that the sum of squares is one of the most common proximity measures for vectors (the Euclidean metric in finite-dimensional spaces).

One application is "solving" systems of linear equations in which the number of equations is greater than the number of variables:

A·x = b,

where the matrix A is not square, but rectangular.

Such a system of equations, in the general case, has no solution (if the rank is actually greater than the number of variables). Therefore, this system can be "solved" only in the sense of choosing a vector x so as to minimize the "distance" between the vectors A·x and b. To do this, one can apply the criterion of minimizing the sum of squared differences of the left and right sides of the equations of the system, that is, ‖A·x - b‖² → min. It is easy to show that the solution of this minimization problem leads to the solution of the following system of equations:

Aᵀ·A·x = Aᵀ·b.

If some physical quantity depends on another quantity, then this dependence can be investigated by measuring y at different values ​​of x. As a result of measurements, a series of values ​​is obtained:

x 1 , x 2 , ..., x i , ... , x n ;

y 1 , y 2 , ..., y i , ... , y n .

Based on the data of such an experiment, it is possible to plot the dependence y = ƒ(x). The resulting curve makes it possible to judge the form of the function ƒ(x). However, the constant coefficients that enter into this function remain unknown. They can be determined using the least squares method. The experimental points, as a rule, do not lie exactly on the curve. The method of least squares requires that the sum of the squared deviations of the experimental points from the curve, i.e. Σᵢ (yᵢ - ƒ(xᵢ))², be the smallest.

In practice, this method is most often (and most simply) used in the case of a linear relationship, i.e. When

y=kx or y = a + bx.

Linear dependence is very widespread in physics. And even when the dependence is non-linear, they usually try to build a graph in such a way as to get a straight line. For example, if it is assumed that the refractive index of glass n is related to the wavelength λ of the light wave by the relation n = a + b/λ², then the dependence of n on λ⁻² is plotted on the graph.

Consider the dependence y = kx (a straight line passing through the origin). Let us compose the quantity φ, the sum of the squared deviations of our points from the straight line,

φ = Σᵢ (yᵢ - k·xᵢ)².

The value of φ is always positive and turns out to be the smaller, the closer our points lie to the straight line. The method of least squares states that the value of k should be chosen such that φ has a minimum:

dφ/dk = -2·Σᵢ xᵢ·(yᵢ - k·xᵢ) = 0,

or

k = Σᵢ xᵢyᵢ / Σᵢ xᵢ².  (19)

The calculation shows that the root-mean-square error in determining the value of k is equal to

S_k = sqrt( Σᵢ (yᵢ - k·xᵢ)² / ((n - 1)·Σᵢ xᵢ²) ),  (20)

where n is the number of measurements.
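A small C++ sketch of formulas (19) and (20) (the function and variable names are illustrative; with the data of Example 1 below it should give approximately k ≈ 0.3337 and S_k ≈ 0.0058):

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// Least squares fit of y = k*x (a straight line through the origin).
// Returns k by formula (19) and its root-mean-square error by formula (20).
void fit_through_origin(const std::vector<double>& x, const std::vector<double>& y,
                        double& k, double& k_err)
{
    double sxx = 0, sxy = 0;
    for (std::size_t i = 0; i < x.size(); i++) {
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }
    k = sxy / sxx;                                   // formula (19)

    double s2 = 0;                                   // sum of squared residuals
    for (std::size_t i = 0; i < x.size(); i++) {
        const double r = y[i] - k * x[i];
        s2 += r * r;
    }
    const double n = static_cast<double>(x.size());
    k_err = std::sqrt(s2 / ((n - 1.0) * sxx));       // formula (20)
}

int main() {
    // Data of Example 1 below: moments of force M and angular accelerations eps.
    std::vector<double> M   = {1.44, 3.12, 4.59, 5.90, 7.45};
    std::vector<double> eps = {0.52, 1.06, 1.45, 1.92, 2.56};
    double k, k_err;
    fit_through_origin(M, eps, k, k_err);
    std::printf("k = %.4f, S_k = %.6f\n", k, k_err);
    return 0;
}
```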

Let us now consider a somewhat more difficult case, when the points must satisfy the formula y = a + bx (a straight line not passing through the origin).

The task is to find the best values of a and b from the given set of values xᵢ, yᵢ.

Again we compose the quadratic form φ, equal to the sum of the squared deviations of the points xᵢ, yᵢ from the straight line,

φ = Σᵢ (yᵢ - a - b·xᵢ)²,

and find the values a and b for which φ has a minimum:

∂φ/∂a = -2·Σᵢ (yᵢ - a - b·xᵢ) = 0;

∂φ/∂b = -2·Σᵢ xᵢ·(yᵢ - a - b·xᵢ) = 0.

The joint solution of these equations gives

b = Σᵢ (xᵢ - x̄)(yᵢ - ȳ) / Σᵢ (xᵢ - x̄)²,  a = ȳ - b·x̄,  (21), (22)

where x̄ = Σᵢ xᵢ / n and ȳ = Σᵢ yᵢ / n are the mean values of x and y.

The root-mean-square errors of determining a and b are equal to

S_b = sqrt( Σᵢ (yᵢ - b·xᵢ - a)² / ((n - 2)·Σᵢ (xᵢ - x̄)²) ),  (23)

S_a = S_b · sqrt( Σᵢ xᵢ² / n ).  (24)
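A C++ sketch of formulas (21)-(24) (the names are illustrative; with the data of Example 2 below it should reproduce approximately b ≈ 0.00264 and a ≈ 1.17 Ohm):

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// Least squares fit of y = a + b*x with the root-mean-square errors of a and b,
// following formulas (21)-(24).
void fit_line(const std::vector<double>& x, const std::vector<double>& y,
              double& a, double& b, double& a_err, double& b_err)
{
    const double n = static_cast<double>(x.size());
    double xm = 0, ym = 0, sxx = 0;
    for (std::size_t i = 0; i < x.size(); i++) { xm += x[i]; ym += y[i]; sxx += x[i]*x[i]; }
    xm /= n; ym /= n;

    double sdd = 0, sdy = 0;          // sum (x - xm)^2 and sum (x - xm)*y
    for (std::size_t i = 0; i < x.size(); i++) {
        sdd += (x[i] - xm) * (x[i] - xm);
        sdy += (x[i] - xm) * y[i];    // equals sum (x - xm)*(y - ym)
    }
    b = sdy / sdd;                    // formula (21)
    a = ym - b * xm;                  // formula (22)

    double s2 = 0;                    // sum of squared residuals
    for (std::size_t i = 0; i < x.size(); i++) {
        const double r = y[i] - b * x[i] - a;
        s2 += r * r;
    }
    b_err = std::sqrt(s2 / ((n - 2.0) * sdd));   // formula (23)
    a_err = b_err * std::sqrt(sxx / n);          // formula (24)
}

int main() {
    // Data of Example 2 below: temperature t and resistance r.
    std::vector<double> t = {23, 59, 84, 96, 120, 133};
    std::vector<double> r = {1.242, 1.326, 1.386, 1.417, 1.512, 1.520};
    double a, b, a_err, b_err;
    fit_line(t, r, a, b, a_err, b_err);
    std::printf("r = %.4f + %.6f * t  (S_a = %.6f, S_b = %.6f)\n", a, b, a_err, b_err);
    return 0;
}
```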

When processing the measurement results by this method, it is convenient to summarize all the data in a table in which all the sums entering formulas (19)-(24) are first calculated. The forms of these tables are shown in the examples below.

Example 1 The basic equation of the dynamics of rotational motion ε = M/J (a straight line passing through the origin) was studied. For various values ​​of the moment M, the angular acceleration ε of a certain body was measured. It is required to determine the moment of inertia of this body. The results of measurements of the moment of force and angular acceleration are listed in the second and third columns tables 5.

Table 5

| n | M, N·m | ε, s⁻¹ | M² | M·ε | ε - kM | (ε - kM)² |
|---|--------|--------|----|-----|--------|-----------|
| 1 | 1.44 | 0.52 | 2.0736 | 0.7488 | 0.039432 | 0.001555 |
| 2 | 3.12 | 1.06 | 9.7344 | 3.3072 | 0.018768 | 0.000352 |
| 3 | 4.59 | 1.45 | 21.0681 | 6.6555 | -0.08181 | 0.006693 |
| 4 | 5.90 | 1.92 | 34.81 | 11.328 | -0.049 | 0.002401 |
| 5 | 7.45 | 2.56 | 55.5025 | 19.072 | 0.073725 | 0.005435 |
| ∑ | – | – | 123.1886 | 41.1115 | – | 0.016436 |

By formula (19) we determine:

k = Σ Mᵢεᵢ / Σ Mᵢ² = 41.1115 / 123.1886 = 0.3337 kg⁻¹·m⁻².

To determine the root-mean-square error, we use formula (20):

S_k = sqrt( 0.016436 / ((5 - 1) · 123.1886) ) = 0.005775 kg⁻¹·m⁻².

By formula (18) we have:

J = 1/k = 1/0.3337 = 2.996 kg·m²;

S_J = (2.996 · 0.005775)/0.3337 = 0.05185 kg·m².

Given the reliability P = 0.95, according to the table of Student coefficients for n = 5, we find t = 2.78 and determine the absolute error ΔJ = 2.78 · 0.05185 = 0.1441 ≈ 0.2 kg·m².

We write the results in the form:

J = (3.0 ± 0.2) kg·m².


Example 2 We calculate the temperature coefficient of resistance of the metal using the least squares method. Resistance depends on temperature according to a linear law

R_t = R₀·(1 + α·t°) = R₀ + R₀·α·t°.

The free term determines the resistance R₀ at a temperature of 0 °C, and the slope is the product of the temperature coefficient α and the resistance R₀.

The results of measurements and calculations are given in the table ( see table 6).

Table 6
| n | t°, °C | r, Ohm | t - t̄ | (t - t̄)² | (t - t̄)·r | r - bt - a | (r - bt - a)², 10⁻⁶ |
|---|--------|--------|--------|-----------|------------|------------|---------------------|
| 1 | 23 | 1.242 | -62.8333 | 3948.028 | -78.039 | 0.007673 | 58.8722 |
| 2 | 59 | 1.326 | -26.8333 | 720.0278 | -35.581 | -0.00353 | 12.4959 |
| 3 | 84 | 1.386 | -1.83333 | 3.361111 | -2.541 | -0.00965 | 93.1506 |
| 4 | 96 | 1.417 | 10.16667 | 103.3611 | 14.40617 | -0.01039 | 107.898 |
| 5 | 120 | 1.512 | 34.16667 | 1167.361 | 51.66 | 0.021141 | 446.932 |
| 6 | 133 | 1.520 | 47.16667 | 2224.694 | 71.69333 | -0.00524 | 27.4556 |
| ∑ | 515 | 8.403 | – | 8166.833 | 21.5985 | – | 746.804 |
| ∑/n | 85.83333 | 1.4005 | – | – | – | – | – |

By formulas (21), (22) we determine:

b = αR₀ = Σ(tᵢ - t̄)rᵢ / Σ(tᵢ - t̄)² = 21.5985 / 8166.833 = 0.002645 Ohm/deg;

R₀ = r̄ - αR₀·t̄ = 1.4005 - 0.002645 · 85.83333 = 1.1735 Ohm.

Let us find the error in determining α. Since α = b/R₀ = 0.002645/1.1735 = 0.002254 deg⁻¹, then by formula (18) we have:

S_α = α·sqrt( (S_b/b)² + (S_R₀/R₀)² ) = 0.000132 deg⁻¹.

Using formulas (23), (24) we have:

S_b = sqrt( Σ(rᵢ - b·tᵢ - a)² / ((n - 2)·Σ(tᵢ - t̄)²) ) = sqrt( 746.804·10⁻⁶ / (4 · 8166.833) ) = 0.000151 Ohm/deg;

S_a = S_b·sqrt( Σtᵢ² / n ) = 0.014126 Ohm.

Given the reliability P = 0.95, according to the table of Student's coefficients for n = 6, we find t = 2.57 and determine the absolute error Δα = 2.57 · 0.000132 = 0.000338 deg⁻¹.

α = (23 ± 4)·10⁻⁴ deg⁻¹ at P = 0.95.


Example 3 It is required to determine the radius of curvature of a lens from Newton's rings. The radii of Newton's rings r_m were measured and the numbers of these rings m were determined. The radii of Newton's rings are related to the radius of curvature of the lens R and the ring number by the equation

r²_m = m·λ·R - 2·d₀·R,

where d₀ is the thickness of the gap between the lens and the plane-parallel plate (or the deformation of the lens),

λ is the wavelength of the incident light.

λ = (600 ± 6) nm;
r²_m = y;
m = x;
λR = b;
-2d₀R = a,

then the equation will take the form y = a + bx.


The results of measurements and calculations are entered in table 7.

Table 7
| n | x = m | y = r², 10⁻² mm² | m - m̄ | (m - m̄)² | (m - m̄)·y | y - bx - a, 10⁻⁴ | (y - bx - a)², 10⁻⁶ |
|---|-------|------------------|--------|-----------|------------|-------------------|---------------------|
| 1 | 1 | 6.101 | -2.5 | 6.25 | -0.152525 | 12.01 | 1.44229 |
| 2 | 2 | 11.834 | -1.5 | 2.25 | -0.17751 | -9.6 | 0.930766 |
| 3 | 3 | 17.808 | -0.5 | 0.25 | -0.08904 | -7.2 | 0.519086 |
| 4 | 4 | 23.814 | 0.5 | 0.25 | 0.11907 | -1.6 | 0.0243955 |
| 5 | 5 | 29.812 | 1.5 | 2.25 | 0.44718 | 3.28 | 0.107646 |
| 6 | 6 | 35.760 | 2.5 | 6.25 | 0.894 | 3.12 | 0.0975819 |
| ∑ | 21 | 125.129 | – | 17.5 | 1.041175 | – | 3.12176 |
| ∑/n | 3.5 | 20.8548333 | – | – | – | – | – |


