Least squares, for example, minimizes the variance of the residuals and is a special case of S-estimators. The choice of cutoff is known to be important for these estimators. The estimators of Rousseeuw and Yohai (1984) were obtained by minimization of a robust measure of scale of the residuals. We find that robust regression applications are appropriate for modeling stock returns in global markets. As noted in Chapter 3, there are many measures of scale. A feasible solution algorithm has also been proposed for least trimmed squares. Rousseeuw and Yohai (1984) calculate efficiencies and rejection points c when ρ equals Tukey's biweight function; their S-estimators can attain an efficiency of up to 33%. Rousseeuw and Yohai (1984) proposed S-estimates, defined by the property of minimizing an M-estimate of the residual scale.
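To make the least-squares connection concrete, here is a small reconstruction of my own (not a quotation from the cited papers): with the quadratic loss ρ(u) = u² and the normalizing constant b = 1, the M-scale that S-estimators minimize reduces to the root mean square of the residuals, so minimizing it is exactly ordinary least squares.

\[
  \frac{1}{n}\sum_{i=1}^{n}\rho\!\left(\frac{r_i(\beta)}{\sigma}\right)=b
  \quad\xrightarrow{\;\rho(u)=u^{2},\;b=1\;}\quad
  \sigma^{2}(\beta)=\frac{1}{n}\sum_{i=1}^{n} r_i(\beta)^{2},
  \qquad
  \hat\beta_{\mathrm{LS}}=\arg\min_{\beta}\,\sigma(\beta).
\]

A bounded ρ, by contrast, caps the influence any single residual can have on the scale, which is what gives S-estimators their robustness.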
Keywords: influence function, robust regression, breakdown point, projection pursuit, least trimmed squares. It turned out that the computation time of existing LTS algorithms grew too fast with the size of the data set, precluding their use for data mining. This is achieved via probabilistic principal component analysis (PPCA), for which new statistically robust variants are derived that also handle missing data. Robust statistics can therefore be viewed as a statistical theory dealing with approximate parametric models, and as a bridge between the Fisherian parametric approach and the fully nonparametric approach. Peter Rousseeuw (born October 1956) is a statistician known for his work on robust statistics and cluster analysis.
Part of the Lecture Notes in Statistics book series (LNS, volume 26). He obtained his PhD in 1981 at the Vrije Universiteit Brussel, following research carried out at the ETH in Zurich in the group of Frank Hampel, which led to a book on influence functions. With the same breakdown value, it has a higher statistical efficiency. In this part of the Monte Carlo study, 200 samples were generated with n = 50.
MM-estimation, introduced by Yohai (1987), combines high-breakdown-value estimation and M-estimation. Part of the Springer Series in Statistics book series (SSS). Following Rousseeuw (1984), the asymptotic breakdown point is then defined as in (2). The results directly paralleled the uncorrected analyses. Least trimmed squares (LTS) regression is a robust method that reduces the impact of outliers.
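To illustrate the LTS idea, here is a minimal sketch of a concentration-step search, written for this overview rather than taken from any published implementation; the function names, the coverage h = 3n/4, and the number of random starts are choices of mine.

import numpy as np

def c_step(X, y, beta, h):
    # One concentration step: keep the h observations with the smallest squared
    # residuals under the current fit, then refit least squares on that subset.
    r2 = (y - X @ beta) ** 2
    keep = np.argsort(r2)[:h]
    beta_new, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    return beta_new

def lts_fit(X, y, h=None, n_starts=50, n_steps=10, seed=None):
    # Crude LTS sketch: random elemental starts, each refined by concentration
    # steps; keep the fit with the smallest sum of the h smallest squared residuals.
    rng = np.random.default_rng(seed)
    n, p = X.shape
    h = h or (3 * n) // 4          # coverage: one common default choice
    best_beta, best_obj = None, np.inf
    for _ in range(n_starts):
        idx = rng.choice(n, size=p, replace=False)
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        for _ in range(n_steps):
            beta = c_step(X, y, beta, h)
        obj = np.sort((y - X @ beta) ** 2)[:h].sum()   # the LTS objective
        if obj < best_obj:
            best_obj, best_beta = obj, beta
    return best_beta

A concentration step never increases the LTS objective, which is why a handful of steps per random start is usually enough to reach a good local optimum.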
The performance of LTS was improved by the FAST-LTS algorithm of Rousseeuw and Van Driessen (1998). Least trimmed squares (LTS; Rousseeuw, 1985), S-estimation (Rousseeuw and Yohai, 1984), and MM-estimation (Yohai, 1987) are elaborated in the book of Rousseeuw and Leroy (2005). The breakdown value is a measure of the proportion of contamination that a procedure can withstand and still maintain its robustness.
The S-estimation (Rousseeuw and Yohai, 1984) and MM-estimation (Yohai, 1987) procedures both correct estimates for the effects of outliers. The asymptotic breakdown point of the S-estimator is given by Rousseeuw and Yohai (1984). A combination of the high-breakdown-value method and M-estimation is MM-estimation (Yohai, 1987).
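To show what the M-estimation half of MM-estimation amounts to in practice, here is a hedged sketch of its second stage as iteratively reweighted least squares. The helper names are mine, the tuning constant c = 4.685 is the value commonly quoted for 95% efficiency at the normal model, and in a real MM fit the starting coefficients beta0 and the fixed scale would come from a preceding high-breakdown (for example S-) estimate.

import numpy as np

def biweight_weights(u, c=4.685):
    # Tukey biweight weights w(u) = (1 - (u/c)^2)^2 for |u| <= c and 0 otherwise;
    # c = 4.685 is the constant usually quoted for 95% efficiency at the normal.
    w = np.zeros_like(u, dtype=float)
    inside = np.abs(u) <= c
    w[inside] = (1.0 - (u[inside] / c) ** 2) ** 2
    return w

def mm_step(X, y, beta0, scale, c=4.685, n_iter=50, tol=1e-8):
    # Second stage of MM-estimation as iteratively reweighted least squares,
    # with the residual scale held fixed at its high-breakdown value.
    beta = np.array(beta0, dtype=float)
    for _ in range(n_iter):
        u = (y - X @ beta) / scale          # standardized residuals
        sw = np.sqrt(biweight_weights(u, c))
        beta_new, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

Because the scale stays fixed at its high-breakdown value, the reweighting can pursue efficiency without giving up the breakdown point inherited from the first stage.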
Then evaluate the objective function, that is, the median of the squared residuals, for this candidate direction. To compute it, they use a modified version of the forward search algorithm.
Rousseeuw and Yohai, Robust regression by means of S-estimators, in Robust and Nonlinear Time Series Analysis, Lecture Notes in Statistics, vol. 26. The MVE algorithm is based on the algorithm used in the MINVOL program by Rousseeuw (1984). It has a higher statistical efficiency than S-estimation. In order to obtain a more efficient estimator, one can turn to the MM robust regression techniques discussed in Maronna, Martin, and Yohai (2006). Examples include least median of squares (LMS; Rousseeuw, 1984), which minimizes the median of the absolute residuals; least trimmed squares (LTS; Rousseeuw, 1985), which minimizes the sum of the q smallest squared residuals; and S-estimation (Rousseeuw and Yohai, 1984), which has a higher statistical efficiency than LTS with the same breakdown value. The paper will provide an overview of robust regression methods, describe the syntax of PROC ROBUSTREG, and illustrate the use of the procedure. In this section the algorithms of Section 2 are applied to simulated data. There are at least two reasons why robust regression techniques are useful tools in robust time series analysis. We refer the reader to the book of Rousseeuw and Leroy for an elaborate description of these robust regression methods. The topic of heteroscedasticity-consistent (HC) standard errors arises in statistics and econometrics in the context of linear regression and time series analysis.
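Returning to the LMS and LTS criteria listed above, the following toy comparison (entirely illustrative; the residual vector, the coverage q = 15, and the function names are made up for this sketch) shows why both are far less sensitive to a single gross outlier than the ordinary sum of squares.

import numpy as np

def lms_objective(residuals):
    # Least median of squares criterion: the median of the squared residuals.
    return np.median(np.asarray(residuals) ** 2)

def lts_objective(residuals, q):
    # Least trimmed squares criterion: the sum of the q smallest squared residuals.
    return np.sort(np.asarray(residuals) ** 2)[:q].sum()

# Tiny comparison: replacing one residual by a gross outlier barely moves the
# robust criteria but changes the ordinary sum of squares dramatically.
rng = np.random.default_rng(0)
r_clean = rng.normal(size=20)
r_dirty = np.append(r_clean[:-1], 50.0)
q = 15
print(lms_objective(r_clean), lms_objective(r_dirty))
print(lts_objective(r_clean, q), lts_objective(r_dirty, q))
print((r_clean ** 2).sum(), (r_dirty ** 2).sum())

The S-criterion plays the same role but replaces the median or the trimming by a robust M-estimate of scale, as sketched further below.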
S-estimation is a high-breakdown-value method introduced by Rousseeuw and Yohai (1984). Citation counts as of March 26, 2017 included: 3,339 for Rousseeuw (1984), the LMS and LTS regression paper (item 21); 765 for Rousseeuw and Yohai (1984), the S-estimators paper in a proceedings book (item 23); 975 for Rousseeuw (1985), the multivariate estimation paper in a proceedings book (item 28); 6,031 for Hampel, Ronchetti, Rousseeuw, and Stahel (1986), the book Robust Statistics (item 1); and 3,990 for Rousseeuw (1987), the silhouettes display for clustering.
See Hampel, Ronchetti, Rousseeuw, and Stahel (1986), Maronna, Martin, and Yohai (2006), and Dell'Aquila and Ronchetti (2006) for an overview. In this paper, we propose to use instead a modification of the C-step algorithm proposed by Rousseeuw and Van Driessen (1999), which is considerably faster. Rousseeuw and Yohai (1984) [24] introduced trimmed least squares (TLS) regression, a highly robust method for fitting a linear regression model.
S-estimators of regression parameters, proposed by Rousseeuw and Yohai (1984), search for the slope and intercept values that minimize some measure of scale associated with the residuals. The analysis is focused on high-breakdown-point and bounded-influence estimators.
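Written out (a reconstruction of the standard definition rather than a verbatim quotation), with residuals $r_i(\beta) = y_i - x_i^{\top}\beta$, the S-estimator minimizes an M-estimate of scale:

\[
  \hat\beta = \arg\min_{\beta}\ \hat\sigma(\beta),
  \qquad\text{where } \hat\sigma(\beta) \text{ solves }
  \frac{1}{n}\sum_{i=1}^{n}\rho\!\left(\frac{r_i(\beta)}{\hat\sigma(\beta)}\right)=b,
\]

with $\rho$ a bounded, even loss such as Tukey's biweight and $b = E_{\Phi}\,\rho(Z)$ chosen so that $\hat\sigma$ is consistent at the normal model.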
Following seminal papers by Box (1953) and Tukey (1960), which demonstrated the need for robust statistical procedures, the theory of robust statistics blossomed in the 1960s and 1970s. The most widely used measure of correlation is the product-moment correlation coefficient. That is, the estimate minimizes the M-scale of the residuals, implicitly defined by an equation of the form given above. For comparison with the partial correlation and linear regression analyses summarized above, we also conducted robust regression analyses using the S-estimation (Rousseeuw and Yohai, 1984) and MM-estimation (Yohai, 1987) procedures, both of which correct estimates for the effects of outliers.
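For concreteness, the M-scale just mentioned can be solved for with a short fixed-point iteration. The sketch below is my own: the function names, the biweight tuning constant c = 1.547 usually associated with a 50% breakdown point, and b = 0.5 are assumptions, not code from any cited package.

import numpy as np

def rho_biweight(u, c=1.547):
    # Tukey biweight rho, rescaled so that it levels off at 1 for |u| >= c;
    # c = 1.547 is the value usually paired with a 50% breakdown point.
    t = np.minimum(np.abs(u) / c, 1.0)
    return 1.0 - (1.0 - t ** 2) ** 3

def m_scale(residuals, b=0.5, c=1.547, n_iter=100, tol=1e-10):
    # Solve (1/n) * sum rho(r_i / s) = b for s by fixed-point iteration;
    # b = 0.5 together with c = 1.547 gives consistency at the normal model.
    r = np.asarray(residuals, dtype=float)
    s = np.median(np.abs(r)) / 0.6745      # MAD-like starting value
    if s == 0.0:
        return 0.0
    for _ in range(n_iter):
        s_new = s * np.sqrt(np.mean(rho_biweight(r / s, c)) / b)
        if s_new == 0.0 or abs(s_new - s) < tol * s:
            return s_new
        s = s_new
    return s

The S-objective for a candidate coefficient vector beta is then simply m_scale(y - X @ beta), and an S-estimate is obtained by searching over beta, for example by refining random starts as in the LTS sketch above.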
M-estimation is the simplest approach both computationally and theoretically, but it cannot handle data which are contaminated in the covariates. Moreover, a choice of cutoff or tuning constant must be made before any of these estimators, apart from Rousseeuw's, is fully defined, and the problem remains of which one to use. Rousseeuw and Leroy provide an applications-oriented introduction to robust regression and outlier detection, emphasising high-breakdown methods which can cope with a sizeable fraction of contamination. S-estimators for regression were introduced by Rousseeuw and Yohai (1984), in the proceedings of the Heidelberg Workshop on Robust and Nonlinear Time Series. Individual differences in the perception of biological motion and fragmented figures are not correlated. The effectiveness and robustness of the proposed balancing procedure are shown by means of simulations of a 180 MW gas turbo-generator of a power plant. In Section 2 we formulate the problem, and in Sections 3 and 4 the optimal score functions are derived. We finally opt for a minimum covariance determinant (MCD) algorithm (Rousseeuw 1984, 1985). The basic idea of the MCD algorithm is to use only an (ideally noise-free) subset of the entire data set to calculate the data centre and covariance.
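A minimal sketch of that subset idea follows; it is written for illustration only (the helper names, the coverage h = 3n/4, and the number of random starts are mine, and this is not the FAST-MCD code of Rousseeuw and Van Driessen).

import numpy as np

def mcd_c_step(X, subset, h):
    # One concentration step: estimate centre and covariance from the current
    # subset, then keep the h points with the smallest Mahalanobis distances.
    mu = X[subset].mean(axis=0)
    cov = np.cov(X[subset], rowvar=False)
    diff = X - mu
    d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.pinv(cov), diff)
    return np.argsort(d2)[:h]

def mcd_estimate(X, h=None, n_starts=20, n_steps=10, seed=None):
    # Crude MCD sketch: random (p+1)-point starts refined by concentration steps;
    # keep the h-subset whose covariance matrix has the smallest determinant.
    rng = np.random.default_rng(seed)
    n, p = X.shape
    h = h or (3 * n) // 4
    best_det, best_subset = np.inf, None
    for _ in range(n_starts):
        subset = rng.choice(n, size=p + 1, replace=False)
        for _ in range(n_steps):
            subset = mcd_c_step(X, subset, h)
        det = np.linalg.det(np.cov(X[subset], rowvar=False))
        if det < best_det:
            best_det, best_subset = det, subset
    mu = X[best_subset].mean(axis=0)
    cov = np.cov(X[best_subset], rowvar=False)
    return mu, cov

Points whose Mahalanobis distance from the resulting robust centre and covariance is large can then be flagged as outliers.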
S-estimators are asymptotically normal with rate of convergence n^(1/2), and their asymptotic efficiencies can be calculated. The asymptotic distribution of MM-estimates has been studied by Yohai (1987) under the assumption that the underlying distribution H equals H0, the central parametric model. Therefore, the outputs may differ slightly from those given in Rousseeuw and Leroy (1987) or those obtained from software based on the older version of PROGRESS.
Least trimmed squares (LTS) estimation is a high-breakdown-value method introduced by Rousseeuw (1984). M-estimation (Huber, 1964, 1973), least median of squares (LMS; Rousseeuw, 1984), least trimmed squares (LTS; Rousseeuw, 1985), S-estimation (Rousseeuw and Yohai, 1984), and MM-estimation (Yohai, 1987) are elaborated in the book of Rousseeuw and Leroy (2005). In addition, asymptotic distributions of the estimators are given, coupled with second-order corrections to the bias of the estimators. Its self-contained treatment allows readers to skip the mathematical material, which is concentrated in a few sections. S-estimators, proposed by Rousseeuw and Yohai (1984), were the first high-breakdown estimators defined by minimizing an M-estimate of scale. Heteroscedasticity-consistent standard errors are also known as Eicker-Huber-White standard errors (also Huber-White standard errors or simply White standard errors), recognizing the contributions of Friedhelm Eicker, Peter J. Huber, and Halbert White.