Development of regression model of proteins attackability process in meat food (in vitro)

This article considers the development of a regression model for the digestion of food by proteolytic enzymes in the human body, using correlation analysis. The authors analyze the main nutritional values and the physical and chemical properties of meat products, as well as the modes of heat treatment of semi-finished lamb products. The essential parameters and features are determined in order to find the dependence between the factor values and the efficient (resulting) values of the basic raw material, which affect the quality of the technological processes and, ultimately, of the finished product. The regression model equation is calculated by the Gauss method for solving systems of linear equations. The standard deviations of the parameters are calculated, the initial data are normalized, and the matrices of pair correlation coefficients, together with the lower and upper limits of their values, are compiled. Equations of the regression model of meat protein attackability by proteolytic enzymes in vitro (pepsin, trypsin) are developed. The obtained equation is shown to represent a regression model of the attackability of meat food proteins by enzymes (pepsin, trypsin and chymotrypsin) depending on three essential factors (weight of a meat piece, duration of frying, collagen content in lamb meat). The equation also reflects the process of lamb digestion in the human digestive tract.


Introduction
The rate of protein digestion in the gastrointestinal tract, or the attackability of the proteins of a meat dish by proteolytic enzymes, is one of the important factors that determine the biological value of food products [1][2][3][4][5].
Nowadays the theory and methods of correlation analysis are successfully applied to study mathematical problems and the relations between phenomena and features in various fields of science, technology and the national economy. A correlation relation is considered established when a series of function values corresponds to the same value of the argument. The features that characterize this relation are divided into factor and efficient features: those that affect a certain result are called factor features, and those that respond to the factor features are called efficient features [6].
Correlation-regression analysis is often used in scientific research. When large amounts of statistical data are processed, correlation analysis quantifies the strength of the link between two or more quantitative variables. Regression analysis describes the form of the link between variables, while correlation provides a numerical measure of the strength of the link between two variables [7,8].
The following studies illustrate the application of correlation-regression analysis in various fields of science. The study [9] assessed the capacity of an interactive dual-energy X-ray absorptiometer (DEXA) installed on the slaughter line of a meat processing plant to determine the composition of lamb carcasses. In total, 607 lamb carcasses from 7 slaughter groups were scanned by the DEXA device and later by a computed tomography device to determine the ratio of fat, lean meat and bone in the carcasses. Across the whole data range, the results showed high accuracy in forecasting the carcass fat percentage determined by computed tomography, with a coefficient of determination R² = 0.89, compared with the less accurate 0.69 for lean meat and 0.68 for bone. Accuracy within the seven groups was also high, with mean bias values of 0.66, 0.83 and 0.51.
The studies [10] carried out by the Danish Research Institute of Meat were conducted to develop forecasting models for the shelf life of meat products. Expiration dates of chilled meat were simulated for beef cuts (850 samples), pork cuts (1500 samples) and chicken (1080 samples), for minced beef and minced pork (680 samples of each type of minced meat) and for bacon (1080 samples). The samples of meat and meat products were packed in modified atmosphere into vacuum bags in combination with microwave treatment and kept at various storage temperatures. The research showed that forecasting models can serve as tools for assessing the importance of temperature and packaging changes for the shelf life of various meat products. In the study [11], volatiles were studied during the roasting of beef at a temperature of 180 °C. Seventy volatile substances were identified, including non-aromatic, homocyclic and heterocyclic compounds. A significant positive regression model was constructed to forecast the storage of toluene, benzeneacetaldehyde, 2-formylfuran, pyrazine, 2,6-dimethylpyrazine, 2,3-dimethylpyrazine, 2-acetylthiazole and 2-formyl-3-methylthiophene. Linear and logarithmic regression models were chosen for calculating the aging time.
In the study [12] the authors examined the influence of the weights of carcass parts (thigh, chest, wing, back, stomach, heart) on the whole carcass weight of white turkeys (Big-6). The data were analyzed by regression analysis based on ridge regression and factor analysis. Both regression models were found suitable for forecasting turkey carcass weight. However, the ridge regression method was preferred, as it showed a higher R² value and explained carcass weight better.
In the study [13] three varieties of wheat (PKB Talas, BG Merkur and PKB Lepoklas) harvested in 2009 and 2010 were examined. The correlations between the morphological and yield parameters of the plants were studied: the number of shoots, the number of spikelets per wheat head, the number of grains per wheat head, the weight of 1,000 grains and the weight of grain per wheat head. Across all three varieties, high positive correlations were found between the number of grains per wheat head and the weight of grain per wheat head (> 0.78), between the number of spikelets per wheat head and the number of grains per wheat head (> 0.79), and between the number of spikelets per wheat head and the weight of grain per wheat head (> 0.73). Regression analysis was conducted only as a supplement to the correlation analysis and was presented in the form of charts showing the link between the studied dependent and independent characteristics.
The researchers in the study [14] constructed a quadratic polynomial that describes the effect of three variables (fermentation temperature, X1; amount of inoculation, X2; and concentration of solid substrate, X3) on the yield of monacolin K.
The goals of the study [15] were to define (1) the effect of fertilizers, the environment and their interactions on the thousand-grain weight (TGW), hectolitre weight (HW) and grain yield (GY) of a winter triticale variety and (2) the correlation between these characteristics in various environments. A significant negative correlation was found between GY and TGW in 2015 (−0.392), and a highly significant positive correlation was found in 2013 (0.648) and 2014 (0.493).
Recently, studies have appeared on the effects of meat consumption on the health of the population in the modern world, including issues related to the consumption of saturated fat [16]. The use of regression analysis to determine the effect of stearic acid on the levels of total cholesterol (TC) and low-density lipoprotein (LDL) cholesterol in blood plasma when people of various ages consume fatty foods is described in [17,18,19].
With the rise in living standards and well-being of the peoples of Central Asia, and of the Republic of Uzbekistan in particular, consumer demand for natural food products and for a wide range of lamb dishes cooked in traditional ways is increasing. According to modern concepts of nutritional science, food must have an attractive appearance and high palatability and must also be biologically complete, i.e. contain all the essential amino acids and other important food components in optimal proportions. These food components are well digested in the human body by digestive enzymes [20,21].
The enzymatic hydrolysis of meat proteins by the method of A. A. Pokrovsky and I. D. Ertanov [5] rests on conditions in which the availability of the attacked peptide bonds in meat is determined not only by the physical and chemical parameters of the proteins, but also by characteristics related to the structure and chemical composition of the basic food product. It is important to conduct research in this area using mathematical methods of analysis, as they demonstrate the effectiveness of the approaches and make it possible to determine the calculated parameters of the biological and nutritional value of food products and to compare them with the FAO data.
The purpose of our study was to build a regression model of the most important processes of food digestion by proteolytic enzymes (pepsin, trypsin), using correlation analysis and expressing the analytical dependence of the efficient characteristic (Y, the attackability of meat proteins) on the factor characteristics $x_i$, $i = 1, \dots, n$.
To achieve this goal, the following tasks were solved: on the basis of 20-fold physical and chemical analyses, obtain the necessary numerical parameters of the analyzed main product (meat); from the obtained data, calculate their position and dispersion characteristics; and determine the influence of the significant factors (modes of heat treatment, quantitative content of incomplete proteins) on the efficient features (quality of semi-finished meat products).

Objects and methods
To conduct the research, we selected and prepared sample materials and chose the experimental test methods. The object of the study was the meat productivity and the quality of meat and raw fat of fat-tailed sheep of the "Jaydara" breed, popular in the foothill regions of the Republic of Uzbekistan. The experimental studies analyzed 14 parameters of meat product quality, including the parameter fully covered in this article.
The hydrolysis process (digestion of proteins in vitro) was run in the laboratory of the Department of Food Technology on a special three-cell device that provides continuous mixing and dialysis of the samples; the breakdown products were analyzed by micro-methods that allow simultaneous analysis of a significant number of samples [22].
A meat product sample containing about 150 mg of protein (N2 × 6.25 or × 5.75) is placed in the inner vessel of the device, and 15 ml of 0.02 N HCl solution with pH 1.2 is added to the vessel. To maintain isoionic conditions, 60 ml of the same solution is added to the outer vessel. The test samples are incubated in a thermostat at 37 °C.
The index of attackability of the proteins of semi-finished meat products by proteolytic enzymes (pepsin, trypsin) was estimated from the build-up of hydrolysis products in individual samples. The calculations were made according to the corresponding formulas.
For statistical data processing and for constructing (1) the matrix of pair correlation coefficients and (2) the regression equations, we used the "Data Analysis" tool of the MS Excel spreadsheet processor.

Results and discussion
Before the mathematical analysis, we studied the attackability of proteins in lamb meat food cooked under various modes and durations of heat treatment, taking into account the unequal content of connective tissue proteins (collagen and elastin) in the meat, as well as the content of pure hydroxyproline in the samples under study [21].
The physical and chemical analyses yielded the data that were subsequently used as initial data, where: $X_1$ is the weight of the meat pieces (g); $X_2$ is the deep-frying temperature (°C); $X_3$ is the duration of frying (min); $X_4$ is the collagen content in meat (%); $X_5$ is the elastin content in meat (%); $X_6$ is the hydroxyproline (oxyproline) content in meat (mg%); $Y$ is the degree of attackability (mg of hydrolyzed protein). The layout of the initial data table is compiled on the basis of the laboratory analysis values and is presented in Table 1.

Based on the initial data, the parameters of the multiple regression equation were calculated as follows:

- The arithmetic mean value of each parameter was determined according to formula (1):

$$\bar{X}_k = \frac{1}{n}\sum_{i=1}^{n} x_{ki} \quad (1)$$

where $k$ is the number of the factor (in our case $k = 1, 2, \dots, 6$); $\bar{X}_k$ is the arithmetic mean of the $k$th factor; $x_{ki}$ is the value of the $i$th measurement of the $k$th factor ($i = 1, 2, \dots, n$); $n$ is the number of trials.

The obtained values are presented in the last line of Table 1 and are highlighted in orange.

- The standard deviation of each parameter was determined by formula (2), where $C_{x_k}$ is the mean-square deviation of the $k$th factor.
Further, the initial data (Table 1) were normalized according to formula (3):

$$x^*_{ki} = \frac{x_{ki} - \bar{X}_k}{C_{x_k}} \quad (3)$$

where $x^*_{ki}$ is the normalized value of the $k$th factor.

After normalizing the data, the following results were obtained (Table 2).

- The pair correlation coefficients can be calculated according to formula (4) or with the "Data Analysis" tool of the MS Excel spreadsheet processor, where $R_{ks}$ is the coefficient of correlation between factors $k$ and $s$.
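The preprocessing steps described above (means, standard deviations, normalization and pair correlations) can be sketched in Python with NumPy. This is a minimal illustration, not the authors' Excel workflow: the array `demo` holds made-up numbers rather than the values of Table 1, the function names are hypothetical, and dividing by n in the standard deviation (`ddof=0`) is an assumption, since the denominator used in formula (2) cannot be recovered from the text.

```python
import numpy as np


def describe_and_normalize(data):
    """Return column means, standard deviations and normalized values."""
    data = np.asarray(data, dtype=float)
    means = data.mean(axis=0)
    # ddof=0 divides by n; this choice is an assumption (see the lead-in).
    stds = data.std(axis=0, ddof=0)
    normalized = (data - means) / stds
    return means, stds, normalized


def pair_correlations(data):
    """Pair correlation matrix computed from the normalized columns."""
    _, _, z = describe_and_normalize(data)
    n_trials = z.shape[0]
    return z.T @ z / n_trials


# Hypothetical measurements: rows are trials, columns are X1, X3, X4, Y.
demo = np.array([
    [50.0, 5.0, 2.1, 38.0],
    [60.0, 7.0, 1.9, 44.0],
    [70.0, 9.0, 1.6, 52.0],
    [80.0, 6.0, 2.0, 40.0],
    [90.0, 8.0, 1.7, 49.0],
])

means, stds, z = describe_and_normalize(demo)
corr = pair_correlations(demo)
print(np.round(corr, 2))
```

With the normalization done this way, the averaged products of the normalized columns coincide with the usual Pearson coefficients, so the matrix should match what a spreadsheet correlation tool reports for the same data.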
The results of the calculation are presented in the form of a matrix of pair correlations (Table 3). The data in Table 3 show that the pair correlation coefficients between $Y$ and the factors $X_1$, $X_2$, $X_5$ are rather low (all below 0.4 in absolute value), which means that the link is quite weak. A moderate link is observed between $Y$ and $X_6$ (correlation coefficient −0.50). A strong link is found between $Y$ and $X_3$, $X_4$ (the correlation coefficients exceed 0.9 in absolute value). In addition, a strong link is observed between the variables $X_3$ and $X_4$ (correlation coefficient −0.93), and a moderate link between $X_4$ and $X_6$ (0.57) and between $X_5$ and $X_6$ (0.64), which means that these factors may be collinear.
After finding the pair correlation coefficients, the partial correlation coefficients were determined by formula (5). As a result, the following was obtained: the correlation ratio is high (above 0.9 in absolute value), except for $R_{x_3x_4}$, where the correlation is low; the correlation ratio is low for $R_{x_6x_4}$ and moderate for the others.
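The formula used for the partial correlation coefficients did not survive in the text. One common way to obtain them, shown here only as an assumed illustration, is through the inverse of the pair correlation matrix: the partial correlation of variables i and j, with all the others held fixed, is the negated off-diagonal element of the inverse, scaled by the square roots of the two diagonal elements. The matrix `demo_corr` is hypothetical and is not Table 3; the function name is also an assumption.

```python
import numpy as np


def partial_correlations(corr):
    """Partial correlation of each pair, controlling for all other variables."""
    corr = np.asarray(corr, dtype=float)
    precision = np.linalg.inv(corr)  # inverse of the pair correlation matrix
    scale = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(scale, scale)
    np.fill_diagonal(partial, 1.0)
    return partial


# Hypothetical 3x3 pair correlation matrix (not the values of Table 3).
demo_corr = np.array([
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.4],
    [0.3, 0.4, 1.0],
])
demo_partial = partial_correlations(demo_corr)
print(np.round(demo_partial, 3))
```

For three variables this reduces to the textbook first-order expression, in which the product of the two correlations with the third variable is subtracted from the pair correlation and the result is rescaled.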
To establish a connection between all the factors and the resulting characteristic, the coefficient of multiple correlation was found by formula (6). The calculations gave a coefficient of multiple correlation of 0.98, which indicates a strong correlation between the entire set of factors and the result.

The unadjusted coefficient of multiple determination $R^2 = 0.96$ shows that 96% of the variation in the result is explained by the variation of the factors presented in the equation.

The adjusted coefficient of multiple determination defines the correlation ratio taking into account the degrees of freedom of the total and residual variances and is calculated by the following formula:

$$\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - m - 1} \quad (7)$$

where $n$ is the number of trials and $m$ is the number of factors in the equation.

The adjusted coefficient of multiple determination is 0.94, i.e. close to 1, so the regression equation explains the variation of attackability well.
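As a quick arithmetic check of the reported values 0.98, 0.96 and 0.94, the sketch below applies the standard adjusted-R² expression, which accounts for the degrees of freedom of the total and residual variances. The number of trials (20) is an assumption taken from the "20-fold" analyses mentioned in the introduction, and the function name is hypothetical.

```python
import math


def adjusted_r2(r2, n_trials, n_factors):
    """Adjusted coefficient of determination for a linear model with intercept."""
    return 1.0 - (1.0 - r2) * (n_trials - 1) / (n_trials - n_factors - 1)


r2 = 0.96        # unadjusted coefficient of multiple determination (from the text)
n_trials = 20    # assumption: taken from the "20-fold" analyses
n_factors = 6    # factors X1 ... X6

multiple_r = math.sqrt(r2)
r2_adj = adjusted_r2(r2, n_trials, n_factors)
print(round(multiple_r, 2), round(r2_adj, 2))
```

Under these assumptions the sketch gives about 0.98 and 0.94, in line with the values reported in the text.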
To build a linear multiple regression model, the "Data Analysis" tool in MS Excel was used (Figure 1).

Figure 1. Regression analysis data

Using the values in the "Coefficients" column (Figure 1), we obtain the linear multiple regression equation in the standardized form:

$$\hat{y} = -0.049x_1 + 0.09x_2 + 0.342x_3 - 0.666x_4 - 0.02x_5 + 0.01x_6 \quad (8)$$

The analysis of the data presented in Figure 1 shows that the regression equation is significant: the observed value of the F-statistic, $F_{obs} = 51.64$, exceeds the tabulated value $F_{table}$ at probability $1 - \alpha = 0.95$, and the corresponding significance level reported by Excel is 0.00.
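Standardized coefficients of this kind can also be obtained without a spreadsheet by solving the normal equations written in terms of pair correlations, a linear system of the type handled by Gaussian elimination. The following is a minimal sketch on made-up data, not a reproduction of Figure 1: the arrays `x_demo` and `y_demo` and the function name are hypothetical, and `numpy.linalg.solve` stands in for the manual elimination.

```python
import numpy as np


def standardized_coefficients(x, y):
    """Solve R_xx @ beta = r_xy for the standardized regression coefficients."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    zx = (x - x.mean(axis=0)) / x.std(axis=0)
    zy = (y - y.mean()) / y.std()
    n_trials = len(zy)
    r_xx = zx.T @ zx / n_trials         # correlations among the factors
    r_xy = zx.T @ zy / n_trials         # correlations of each factor with y
    beta = np.linalg.solve(r_xx, r_xy)  # LU-based Gaussian elimination
    r2 = float(beta @ r_xy)             # coefficient of multiple determination
    return beta, r2


# Hypothetical data: 6 trials, 2 factors (not the values of Table 1).
x_demo = np.array([[50, 5], [60, 7], [70, 9], [80, 6], [90, 8], [100, 10]], dtype=float)
y_demo = np.array([38.0, 44.0, 52.0, 40.0, 47.0, 55.0])
beta_demo, r2_demo = standardized_coefficients(x_demo, y_demo)
print(np.round(beta_demo, 3), round(r2_demo, 3))
```

Working with correlations keeps all factors on the same scale, so the magnitudes of the resulting coefficients can be compared directly, which is how the dominant factors are read off a standardized equation.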
Using the partial F-test (Fisher's criterion), we assessed the feasibility of including each factor $x_i$ in the multiple regression equation after the other factors by formula (9). We came to the conclusion that it is feasible to include $x_1$, $x_3$ and $x_4$ in the regression equation.

To switch from standardized values to natural values, formula (10) is used, where $A_i$ is the natural coefficient of the equation; $C_y$ is the standard deviation of the resulting characteristic; $C_{x_i}$ is the standard deviation of the $i$th factor.
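The conversion formula itself is not legible in the text. The usual relation, assumed here, multiplies each standardized coefficient by the ratio of the standard deviation of the result to the standard deviation of the factor. In the sketch below the standardized coefficients are passed as `betas` (a name introduced only for this example), the intercept is recovered from the means (a standard step that the source does not quote), and all input numbers are hypothetical.

```python
def to_natural_scale(betas, std_y, std_x, mean_y, mean_x):
    """Convert standardized coefficients to natural-scale slopes and intercept."""
    slopes = [b * std_y / sx for b, sx in zip(betas, std_x)]
    # Intercept from the means; assumed here, not quoted from the source.
    intercept = mean_y - sum(a * mx for a, mx in zip(slopes, mean_x))
    return intercept, slopes


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
intercept, slopes = to_natural_scale(
    betas=[0.5, -0.2], std_y=4.0, std_x=[2.0, 0.5], mean_y=10.0, mean_x=[3.0, 1.0]
)
print(intercept, slopes)
```

A factor with a small spread, such as a content measured in a narrow percentage range, therefore receives a large natural coefficient even when its standardized coefficient is moderate.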
Thus the regression equation in natural values looks as follows:

$$y = 89.41 - 0.033x_1 + 0.516x_3 - 26.792x_4 \quad (11)$$

This equation represents a regression model of the process of meat protein attackability by enzymes (pepsin, trypsin and chymotrypsin), depending on the three essential factors identified (mass of a piece of meat; duration of frying; collagen content in meat), and reflects the process of digestion of meat products in the human digestive tract.

It can be seen from the equation that, with the other factors unchanged, an increase of 1 unit in the mass of a meat piece decreases attackability by 0.033 units, an increase of 1 unit in the duration of frying increases attackability by 0.516 units, and an increase of 1 unit in the collagen content of the meat decreases attackability by 26.792 units.
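Equation (11) can be evaluated directly for a given sample. The sketch below does this for one hypothetical sample; the function name and the input values are illustrative assumptions, with units taken from the variable definitions (g, min, % and mg of hydrolyzed protein).

```python
def predict_attackability(mass_g, frying_min, collagen_pct):
    """Evaluate regression equation (11) for one sample."""
    return 89.41 - 0.033 * mass_g + 0.516 * frying_min - 26.792 * collagen_pct


# Hypothetical sample: 100 g piece, fried for 10 min, 2 % collagen.
base = predict_attackability(100.0, 10.0, 2.0)
longer_frying = predict_attackability(100.0, 11.0, 2.0)
print(round(base, 3), round(longer_frying - base, 3))
```

The difference between the two calls reproduces the frying-time coefficient, which is the per-unit reading of the coefficients given above. As with any linear regression, such evaluations are meaningful only within the range of the measured data.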

Conclusion
In summary, the analyses, the mathematical calculations and the statistical assessment of the reliability of the obtained regression equation show that correlation-regression analysis can be applied successfully to the assessment of biological and nutritional value, i.e. to the processes of protein breakdown by proteolytic enzymes (in vitro) in various finished lamb culinary products.