Quantile regression
Quantile regression is a type of regression analysis used in statistics and econometrics. Whereas the method of least squares estimates the conditional mean of the response variable given certain values of the predictor variables, quantile regression estimates the conditional median or other quantiles of the response variable. Quantile regression is an extension of linear regression and is used when the assumptions of linear regression are not met.
Advantages and applications
Quantile regression is useful when conditional quantile functions are of interest. One advantage of quantile regression, relative to ordinary least squares regression, is that the quantile regression estimates are more robust against outliers in the response measurements. However, the main attraction of quantile regression goes beyond that. Different measures of central tendency and statistical dispersion can be useful to obtain a more comprehensive analysis of the relationship between variables.[1]
In ecology, quantile regression has been proposed and used as a way to discover more useful predictive relationships between variables in cases where there is no relationship or only a weak relationship between the means of such variables. The need for and success of quantile regression in ecology has been attributed to the complexity of interactions between different factors leading to data with unequal variation of one variable for different ranges of another variable.[2]
Another application of quantile regression is in the area of growth charts, where percentile curves are commonly used to screen for abnormal growth.[3][4]
Mathematics
The mathematical forms arising from quantile regression are distinct from those arising in the method of least squares. The method of least squares leads to a consideration of problems in an inner product space, involving projection onto subspaces, and thus the problem of minimizing the squared errors can be reduced to a problem in numerical linear algebra. Quantile regression does not have this structure, and instead leads to problems in linear programming that can be solved by the simplex method.
History
The idea of estimating a median regression slope, a major theorem about minimizing the sum of the absolute deviations, and a geometrical algorithm for constructing median regression were proposed in 1760 by Ruđer Josip Bošković, a Jesuit Catholic priest from Dubrovnik.[1]:4[5] He was interested in the ellipticity of the earth, building on Isaac Newton's suggestion that its rotation could cause it to bulge at the equator with a corresponding flattening at the poles.[6] He finally produced the first geometric procedure for determining the equator of a rotating planet from three observations of a surface feature. More importantly for quantile regression, he developed the first evidence of the least absolute criterion, preceding the least squares method introduced by Legendre in 1805 by about fifty years.[7]
Other thinkers began building upon Bošković's idea, such as Pierre-Simon Laplace, who developed the so-called "méthode de situation." This led to Francis Edgeworth's plural median,[8] a geometric approach to median regression, which is recognized as the precursor of the simplex method.[7] The works of Bošković, Laplace, and Edgeworth were recognized as a prelude to Roger Koenker's contributions to quantile regression.
Median regression computations for larger data sets are quite tedious compared to the least squares method, which is why median regression remained unpopular among statisticians until the widespread adoption of computers in the latter part of the 20th century.
Quantiles
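Let $Y$ be a real-valued random variable with cumulative distribution function $F_Y(y) = P(Y \le y)$. The $\tau$th quantile of $Y$ is given by
$$q_Y(\tau) = F_Y^{-1}(\tau) = \inf\{y : F_Y(y) \ge \tau\},$$
where $\tau \in (0, 1)$.
Define the loss function as $\rho_\tau(m) = m\,(\tau - \mathbb{I}_{(m<0)})$, where $\mathbb{I}$ is an indicator function. A specific quantile can be found by minimizing the expected loss of $Y - u$ with respect to $u$:[1]:5–6
$$q_Y(\tau) = \underset{u}{\arg\min}\; E(\rho_\tau(Y - u)) = \underset{u}{\arg\min}\left\{(\tau - 1)\int_{-\infty}^{u}(y - u)\,dF_Y(y) + \tau\int_{u}^{\infty}(y - u)\,dF_Y(y)\right\}.$$
This can be shown by setting the derivative of the expected loss function to 0 and letting $q_\tau$ be the solution of
$$0 = (1 - \tau)\int_{-\infty}^{q_\tau} dF_Y(y) - \tau\int_{q_\tau}^{\infty} dF_Y(y).$$
This equation reduces to
$$0 = F_Y(q_\tau) - \tau,$$
and then to
$$F_Y(q_\tau) = \tau.$$
Hence $q_\tau$ is the $\tau$th quantile of the random variable $Y$.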
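The differentiation step can be written out explicitly. The following is a sketch rather than a full proof: it writes the loss as $\rho_\tau(m) = m(\tau - \mathbb{I}_{(m<0)})$ and assumes, for illustration only, that $F_Y$ is continuous with density $f_Y$, so that the Leibniz rule applies and the boundary terms vanish because the integrand $(y - u)$ is zero at $y = u$.

$$
\begin{aligned}
\frac{d}{du} E(\rho_\tau(Y - u))
&= (\tau - 1)\,\frac{d}{du}\int_{-\infty}^{u}(y - u) f_Y(y)\,dy + \tau\,\frac{d}{du}\int_{u}^{\infty}(y - u) f_Y(y)\,dy \\
&= (\tau - 1)\left[-\int_{-\infty}^{u} f_Y(y)\,dy\right] + \tau\left[-\int_{u}^{\infty} f_Y(y)\,dy\right] \\
&= (1 - \tau)\,F_Y(u) - \tau\,(1 - F_Y(u)) \\
&= F_Y(u) - \tau .
\end{aligned}
$$

Setting this derivative to zero gives $F_Y(u) = \tau$, so the minimizer is the $\tau$th quantile.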
Example
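Let $Y$ be a discrete random variable that takes values $y_i = i$ with $i = 1, 2, \dots, 9$ with equal probabilities. The task is to find the median of $Y$, and hence the value $\tau = 0.5$ is chosen. The expected loss, $L(u)$, is
$$L(u) = \frac{\tau - 1}{9}\sum_{y_i < u}(y_i - u) + \frac{\tau}{9}\sum_{y_i \ge u}(y_i - u) = \frac{0.5}{9}\left(-\sum_{y_i < u}(y_i - u) + \sum_{y_i \ge u}(y_i - u)\right).$$
Since $0.5/9$ is a constant, it can be taken out of the expected loss function (this is only true if $\tau = 0.5$). Then, at $u = 3$,
$$L(3) \propto \sum_{i=1}^{2} -(i - 3) + \sum_{i=3}^{9}(i - 3) = [(2 + 1) + (0 + 1 + 2 + \dots + 6)] = 24.$$
Suppose that $u$ is increased by 1 unit. Then the expected loss will be changed by $(3) - (6) = -3$ on changing $u$ to 4. If $u = 5$, the expected loss is
$$L(5) \propto \sum_{i=1}^{4}(5 - i) + \sum_{i=5}^{9}(i - 5) = [(4 + 3 + 2 + 1) + (0 + 1 + 2 + 3 + 4)] = 20,$$
and any change in $u$ will increase the expected loss. Thus $u = 5$ is the median. The table below shows the expected loss (divided by $0.5/9$) for different values of $u$.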
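| u | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Expected loss | 36 | 29 | 24 | 21 | 20 | 21 | 24 | 29 | 36 |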
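The table can be reproduced with a few lines of code. This is a minimal sketch: the function names `check_loss` and `expected_loss` are chosen here for illustration, and exact fractions are used so that the rescaled losses come out as whole numbers.

```python
from fractions import Fraction

def check_loss(m, tau):
    """Tilted absolute value loss: m * (tau - 1[m < 0])."""
    return m * (tau - (1 if m < 0 else 0))

def expected_loss(u, values, tau):
    """Expected check loss of Y - u when Y is uniform on `values`."""
    return sum(check_loss(y - u, tau) for y in values) / len(values)

values = range(1, 10)          # Y takes the values 1, ..., 9
tau = Fraction(1, 2)           # median
scale = tau / len(values)      # the constant 0.5/9 factored out in the text

# Expected loss divided by 0.5/9, for each candidate u
table = {u: expected_loss(u, values, tau) / scale for u in values}
best_u = min(table, key=table.get)

print(" ".join(str(table[u]) for u in values))
print("minimiser:", best_u)
```

The minimiser found by the search is the value of u with the smallest entry in the table.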
Intuition
Consider $\tau = 0.5$ and let $q$ be an initial guess for $q_\tau$. The expected loss evaluated at $q$ is
$$L(q) = -0.5\int_{-\infty}^{q}(y - q)\,dF_Y(y) + 0.5\int_{q}^{\infty}(y - q)\,dF_Y(y).$$
In order to minimize the expected loss, we move the value of $q$ a little bit to see whether the expected loss will rise or fall. Suppose we increase $q$ by a small amount. Then, per unit of increase, the change in expected loss would be
$$0.5\left(\int_{-\infty}^{q} 1\,dF_Y(y) - \int_{q}^{\infty} 1\,dF_Y(y)\right).$$
The first term in the parentheses is $F_Y(q)$ and the second term is $1 - F_Y(q)$. Therefore, the change in the expected loss function is negative if and only if $F_Y(q) < 0.5$, that is, if and only if $q$ is smaller than the median. Similarly, if we reduce $q$ by a small amount, the change in the expected loss function is negative if and only if $q$ is larger than the median.
In order to minimize the expected loss function, we would increase (decrease) $q$ if $q$ is smaller (larger) than the median, until $q$ reaches the median. The idea behind the minimization is to count the number of points (weighted with the density) that are larger or smaller than $q$ and then move $q$ to a point where $q$ is larger than $100\tau\%$ of the points.
Sample quantile
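The $\tau$ sample quantile can be obtained by solving the following minimization problem
$$\hat{q}_\tau = \underset{q \in \mathbb{R}}{\arg\min}\sum_{i=1}^{n}\rho_\tau(y_i - q) = \underset{q \in \mathbb{R}}{\arg\min}\left[(\tau - 1)\sum_{y_i < q}(y_i - q) + \tau\sum_{y_i \ge q}(y_i - q)\right],$$
where the function $\rho_\tau$ is the tilted absolute value function. The intuition is the same as for the population quantile.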
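As an illustration of the minimization above, the sketch below searches for the sample quantile directly. It relies on the fact that the objective is piecewise linear and convex in q, so a minimiser can always be found among the data points; the helper names `pinball` and `sample_quantile` and the data values are made up for this example.

```python
def pinball(residual, tau):
    """Tilted absolute value function applied to one residual y_i - q."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def sample_quantile(ys, tau):
    """Minimise sum_i pinball(y_i - q, tau) over q, trying each data point."""
    def objective(q):
        return sum(pinball(y - q, tau) for y in ys)
    return min(sorted(ys), key=objective)

data = [2.0, 7.5, 1.0, 9.0, 4.0, 6.0, 3.5]
q25 = sample_quantile(data, 0.25)   # lower quartile
q50 = sample_quantile(data, 0.5)    # median
print(q25, q50)
```

For this data set the search returns the second smallest observation for the lower quartile and the middle observation for the median, which agrees with the usual order-statistic definition.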
Conditional quantile and quantile regression
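Suppose the $\tau$th conditional quantile function is $Q_{Y|X}(\tau) = X\beta_\tau$. Given the distribution function of $Y$, $\beta_\tau$ can be obtained by solving
$$\beta_\tau = \underset{\beta \in \mathbb{R}^k}{\arg\min}\; E(\rho_\tau(Y - X\beta)).$$
Solving the sample analog gives the estimator of $\beta$:
$$\hat{\beta}_\tau = \underset{\beta \in \mathbb{R}^k}{\arg\min}\sum_{i=1}^{n}\rho_\tau(Y_i - X_i\beta).$$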
Computation
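The minimization problem can be reformulated as a linear programming problem
$$\min_{\beta,\,u^+,\,u^- \,\in\, \mathbb{R}^k \times \mathbb{R}_+^{2n}}\left\{\tau\,1_n' u^+ + (1 - \tau)\,1_n' u^- \;\middle|\; X\beta + u^+ - u^- = Y\right\},$$
where
$$u_j^+ = \max(u_j, 0), \qquad u_j^- = -\min(u_j, 0).$$
Simplex methods[1]:181 or interior point methods[1]:190 can be applied to solve the linear programming problem.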
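The linear programming structure can be illustrated without an LP solver. The sketch below is not the simplex or interior point method: it is a brute-force search that uses the standard fact that a linear program attains its optimum at a vertex, which for this problem corresponds to a fit passing exactly through k observations. It assumes a model with an intercept and one regressor (k = 2), so every candidate is a line through two data points. The function name `fit_line_quantile` and the data are invented for the example, and the approach is only practical for small samples.

```python
from itertools import combinations

def pinball(residual, tau):
    """Tilted absolute value function applied to one residual."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def fit_line_quantile(xs, ys, tau):
    """Quantile regression of y on (1, x) by enumerating candidate vertices.

    Each candidate is the line through two observations with distinct x.
    Returns (intercept, slope) of the candidate with the smallest total loss.
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line has no (intercept, slope) form
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        loss = sum(pinball(y - (intercept + slope * x), tau)
                   for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]

# Four points on the line y = 1 + 2x plus one large outlier at x = 4
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 30.0]
intercept, slope = fit_line_quantile(xs, ys, tau=0.5)
print(intercept, slope)
```

In this example the median regression line passes through the four collinear points and ignores the outlier, which illustrates the robustness property discussed earlier in the article.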
Asymptotic properties
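For $\tau \in (0, 1)$, under some regularity conditions, $\hat{\beta}_\tau$ is asymptotically normal:
$$\sqrt{n}\,(\hat{\beta}_\tau - \beta_\tau) \xrightarrow{d} N\!\left(0,\; \tau(1 - \tau)\,D^{-1}\Omega_x D^{-1}\right),$$
where
$$D = E\!\left(f_Y(X\beta_\tau)\,X'X\right) \quad\text{and}\quad \Omega_x = E(X'X).$$
Direct estimation of the asymptotic variance-covariance matrix is not always satisfactory. Inference for quantile regression parameters can be made with the regression rank-score tests or with the bootstrap methods.[9]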
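To give a flavour of bootstrap inference, the sketch below computes a percentile bootstrap interval for a single sample quantile. This is a simplified stand-in for the regression setting, not the procedure of the cited reference: it resamples observations with replacement and re-estimates the quantile each time. The names `order_quantile` and `bootstrap_interval`, the number of resamples, and the seed are all assumptions made for the example.

```python
import math
import random

def order_quantile(ys, tau):
    """tau-th sample quantile taken as the ceil(n * tau)-th order statistic."""
    ordered = sorted(ys)
    k = max(1, math.ceil(len(ordered) * tau))
    return ordered[k - 1]

def bootstrap_interval(ys, tau, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the tau-th sample quantile."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    n = len(ys)
    estimates = sorted(
        order_quantile([ys[rng.randrange(n)] for _ in range(n)], tau)
        for _ in range(n_boot)
    )
    alpha = (1.0 - level) / 2.0
    lower = estimates[math.floor(alpha * (n_boot - 1))]
    upper = estimates[math.ceil((1.0 - alpha) * (n_boot - 1))]
    return lower, upper

data = [float(v) for v in range(1, 52)]   # 1, 2, ..., 51; sample median is 26
low, high = bootstrap_interval(data, tau=0.5)
print(low, high)
```

For regression quantiles the same idea applies with pairs (x_i, y_i) resampled together and the regression refitted on each resample.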
Equivariance
See invariant estimator for background on invariance or see equivariance.
Scale equivariance
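For any $a > 0$ and $\tau \in [0, 1]$
$$\hat{\beta}(\tau; aY, X) = a\,\hat{\beta}(\tau; Y, X),$$
$$\hat{\beta}(\tau; -aY, X) = -a\,\hat{\beta}(1 - \tau; Y, X).$$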
Shift equivariance
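For any $\gamma \in \mathbb{R}^k$ and $\tau \in [0, 1]$
$$\hat{\beta}(\tau; Y + X\gamma, X) = \hat{\beta}(\tau; Y, X) + \gamma.$$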
Equivariance to reparameterization of design
Let $A$ be any $k \times k$ nonsingular matrix and $\tau \in [0, 1]$
$$\hat{\beta}(\tau; Y, XA) = A^{-1}\hat{\beta}(\tau; Y, X).$$
Invariance to monotone transformations
If $h$ is a nondecreasing function on $\mathbb{R}$, the following invariance property applies:
$$h(Q_{Y|X}(\tau)) \equiv Q_{h(Y)|X}(\tau).$$
Example (1):
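If $W = \exp(Y)$ and $Q_{Y|X}(\tau) = X\beta_\tau$, then $Q_{W|X}(\tau) = \exp(X\beta_\tau)$. The mean regression does not have the same property since $\operatorname{E}(\ln(Y)) \neq \ln(\operatorname{E}(Y))$.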
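The invariance to monotone transformations can be checked numerically on a small sample. The sketch below uses unconditional sample quantiles (no covariates) as a simplification, with the quantile taken as an order statistic; the helper name `order_quantile` and the data values are illustrative assumptions.

```python
import math
import statistics

def order_quantile(ys, tau):
    """tau-th sample quantile taken as the ceil(n * tau)-th order statistic."""
    ordered = sorted(ys)
    k = max(1, math.ceil(len(ordered) * tau))
    return ordered[k - 1]

y = [0.2, 1.5, -0.7, 2.3, 0.9, -1.1, 3.0]
w = [math.exp(v) for v in y]          # W = exp(Y), an increasing transform

# Quantiles commute with the increasing transform ...
quantiles_commute = all(
    order_quantile(w, tau) == math.exp(order_quantile(y, tau))
    for tau in (0.25, 0.5, 0.9)
)

# ... but means do not: the mean of log(W) differs from the log of mean(W)
mean_of_log = statistics.mean(math.log(v) for v in w)
log_of_mean = math.log(statistics.mean(w))

print(quantiles_commute, mean_of_log, log_of_mean)
```

The quantile check holds exactly because an increasing function preserves the ordering of the observations, whereas the two mean-based quantities differ by Jensen's inequality.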
Bayesian methods for quantile regression
Because quantile regression does not normally assume a parametric likelihood for the conditional distributions of Y|X, Bayesian methods rely on a working likelihood. A convenient choice is the asymmetric Laplacian likelihood,[10] because the mode of the resulting posterior under a flat prior is the usual quantile regression estimate. The posterior inference, however, must be interpreted with care. Yang, Wang and He[11] provided a posterior variance adjustment for valid inference. In addition, Yang and He[12] showed that one can have asymptotically valid posterior inference if the working likelihood is chosen to be the empirical likelihood.
Machine learning methods for quantile regression
Beyond simple linear regression, several machine learning methods can be extended to quantile regression. Switching from the squared error to the tilted absolute value loss function allows gradient descent-based learning algorithms to learn a specified quantile instead of the mean. This means that neural network and deep learning algorithms trained by gradient descent can be applied to quantile regression.[13][14] Tree-based learning algorithms are also available for quantile regression (see, e.g., Quantile Regression Forests,[15] a simple generalization of Random Forests).
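The effect of swapping the loss function can be seen in a very small example. The sketch below is not a neural network: it fits a model with a single learnable parameter (a constant prediction) by full-batch subgradient descent on the tilted absolute value loss. The function names, learning rate, and number of steps are assumptions chosen for this toy data set.

```python
def pinball_subgradient(y, q, tau):
    """Subgradient of the tilted absolute value loss of y - q with respect to q."""
    return -tau if y > q else 1.0 - tau

def fit_constant_quantile(ys, tau, learning_rate=1.0, steps=5000):
    """Full-batch subgradient descent for the one-parameter model y_hat = q."""
    q = 0.0
    for _ in range(steps):
        grad = sum(pinball_subgradient(y, q, tau) for y in ys) / len(ys)
        q -= learning_rate * grad
    return q

targets = [float(v) for v in range(101)]   # 0, 1, ..., 100
q90 = fit_constant_quantile(targets, tau=0.9)
q50 = fit_constant_quantile(targets, tau=0.5)
print(round(q90, 2), round(q50, 2))
```

With the squared error loss the same procedure would converge to the mean of the targets; with the tilted loss it settles close to the requested quantile, oscillating slightly because the loss is not differentiable at the data points.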
Censored quantile regression
If the response variable is subject to censoring, the conditional mean is not identifiable without additional distributional assumptions, but the conditional quantile is often identifiable. For recent work on censored quantile regression, see Portnoy[16] and Wang and Wang.[17]
Example (2):
Let $Y^c = \max(0, Y)$ and $Q_{Y|X}(\tau) = X\beta_\tau$. Then $Q_{Y^c|X}(\tau) = \max(0, X\beta_\tau)$. This is the censored quantile regression model: estimated values can be obtained without making any distributional assumptions, but at the cost of computational difficulty,[18] some of which can be avoided by using a simple three-step censored quantile regression procedure as an approximation.[19]
For random censoring on the response variables, the censored quantile regression of Portnoy (2003)[16] provides consistent estimates of all identifiable quantile functions based on reweighting each censored point appropriately.
Implementations
Numerous statistical software packages include implementations of quantile regression:
 Matlab function quantreg[20]
 Eviews, since version 6.
 gretl has the quantreg command.[21]
 R offers several packages that implement quantile regression, most notably quantreg by Roger Koenker,[22] but also gbm,[23] quantregForest,[24] qrnn[25] and qgam[26]
 Python, via Scikit-garden[27] and statsmodels[28]
 SAS through proc quantreg (ver. 9.2) and proc quantselect (ver. 9.3).[29]
 Stata, via the qreg command.[30][31]
 Vowpal Wabbit, via --loss_function quantile.[32]
 Statsmodels package for Python, via QuantReg[33]
 Mathematica package QuantileRegression.m[34] hosted at the MathematicaForPrediction project at GitHub.
References
 [1] Koenker, Roger (2005). Quantile Regression. Cambridge University Press. pp. 146–7. ISBN 9780521608275.
 [2] Cade, Brian S.; Noon, Barry R. (2003). "A gentle introduction to quantile regression for ecologists" (PDF). Frontiers in Ecology and the Environment. 1 (8): 412–420. doi:10.2307/3868138. JSTOR 3868138.
 [3] Wei, Y.; Pere, A.; Koenker, R.; He, X. (2006). "Quantile Regression Methods for Reference Growth Charts". Statistics in Medicine. 25 (8): 1369–1382. doi:10.1002/sim.2271. PMID 16143984.
 [4] Wei, Y.; He, X. (2006). "Conditional Growth Charts (with discussions)". Annals of Statistics. 34 (5): 2069–2097 and 2126–2131. arXiv:math/0702634. doi:10.1214/009053606000000623.
 [5] Stigler, S. (1984). "Boscovich, Simpson and a 1760 manuscript note on fitting a linear relation". Biometrika. 71 (3): 615–620. doi:10.1093/biomet/71.3.615.
 [6] Koenker, Roger (2005). Quantile Regression. Cambridge: Cambridge University Press. p. 2. ISBN 9780521845731.
 [7] Furno, Marilena; Vistocco, Domenico (2018). Quantile Regression: Estimation and Simulation. Hoboken, NJ: John Wiley & Sons. pp. xv. ISBN 9781119975281.
 [8] Koenker, Roger (August 1998). "Galton, Edgeworth, Frisch, and prospects for quantile regression in economics" (PDF). UIUC.edu. Retrieved August 22, 2018.
 [9] Kocherginsky, M.; He, X.; Mu, Y. (2005). "Practical Confidence Intervals for Regression Quantiles". Journal of Computational and Graphical Statistics. 14 (1): 41–55. doi:10.1198/106186005X27563.
 [10] Kozumi, H.; Kobayashi, G. (2011). "Gibbs sampling methods for Bayesian quantile regression" (PDF). Journal of Statistical Computation and Simulation. 81 (11): 1565–1578. doi:10.1080/00949655.2010.496117.
 [11] Yang, Y.; Wang, H.X.; He, X. (2016). "Posterior Inference in Bayesian Quantile Regression with Asymmetric Laplace Likelihood". International Statistical Review. 84 (3): 327–344. doi:10.1111/insr.12114.
 [12] Yang, Y.; He, X. (2012). "Bayesian empirical likelihood for quantile regression". Annals of Statistics. 40 (2): 1102–1131. arXiv:1207.5378. doi:10.1214/12-AOS1005.
 [13] Petneházi, Gábor (2019-08-21). "QCNN: Quantile Convolutional Neural Network". arXiv:1908.07978 [cs.LG].
 [14] Rodrigues, Filipe; Pereira, Francisco C. (2018-08-27). "Beyond expectation: Deep joint mean and quantile regression for spatio-temporal problems". arXiv:1808.08798 [stat].
 [15] Meinshausen, Nicolai (2006). "Quantile Regression Forests" (PDF). Journal of Machine Learning Research. 7 (6): 983–999.
 [16] Portnoy, S. L. (2003). "Censored Regression Quantiles". Journal of the American Statistical Association. 98 (464): 1001–1012. doi:10.1198/016214503000000954.
 [17] Wang, H.; Wang, L. (2009). "Locally Weighted Censored Quantile Regression". Journal of the American Statistical Association. 104 (487): 1117–1128. CiteSeerX 10.1.1.504.796. doi:10.1198/jasa.2009.tm08230.
 [18] Powell, James L. (1986). "Censored Regression Quantiles". Journal of Econometrics. 32 (1): 143–155. doi:10.1016/0304-4076(86)90016-3.
 [19] Chernozhukov, Victor; Hong, Han (2002). "Three-Step Censored Quantile Regression and Extramarital Affairs". J. Amer. Statist. Assoc. 97 (459): 872–882. doi:10.1198/016214502388618663.
 [20] "quantreg(x,y,tau,order,Nboot) - File Exchange - MATLAB Central". www.mathworks.com. Retrieved 2016-02-01.
 [21] "Gretl Command Reference" (PDF). April 2017.
 [22] "quantreg: Quantile Regression". R Project. 2018-12-18.
 [23] "gbm: Generalized Boosted Regression Models". R Project. 2019-01-14.
 [24] "quantregForest: Quantile Regression Forests". R Project. 2017-12-19.
 [25] "qrnn: Quantile Regression Neural Networks". R Project. 2018-06-26.
 [26] "qgam: Smooth Additive Quantile Regression Models". R Project. 2019-05-23.
 [27] "Quantile Regression Forests". Scikit-garden. Retrieved 3 January 2019.
 [28] "Statsmodels: Quantile Regression". Statsmodels. Retrieved 15 November 2019.
 [29] "An Introduction to Quantile Regression and the QUANTREG Procedure" (PDF). SAS Support.
 [30] "qreg — Quantile regression" (PDF). Stata Manual.
 [31] Cameron, A. Colin; Trivedi, Pravin K. (2010). "Quantile Regression". Microeconometrics Using Stata (Revised ed.). College Station: Stata Press. pp. 211–234. ISBN 9781597180733.
 [32] "JohnLangford/vowpal_wabbit". GitHub. Retrieved 2016-07-09.
 [33] "Quantile regression". statsmodels.org. Retrieved 3 January 2019.
 [34] "QuantileRegression.m". MathematicaForPrediction. Retrieved 3 January 2019.
Further reading
 Angrist, Joshua D.; Pischke, Jörn-Steffen (2009). "Quantile Regression". Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press. pp. 269–291. ISBN 9780691120348.
 Koenker, Roger (2005). Quantile Regression. Cambridge University Press. ISBN 9780521608275.