quantile: Sample Quantiles (2024)

quantile    R Documentation

Description

The generic function quantile produces sample quantiles corresponding to the given probabilities. The smallest observation corresponds to a probability of 0 and the largest to a probability of 1.

Usage

quantile(x, ...)

## Default S3 method:
quantile(x, probs = seq(0, 1, 0.25), na.rm = FALSE,
         names = TRUE, type = 7, digits = 7, ...)

Arguments

x

numeric vector whose sample quantiles are wanted, or an object of a class for which a method has been defined (see also ‘details’). NA and NaN values are not allowed in numeric vectors unless na.rm is TRUE.

probs

numeric vector of probabilities with values in [0,1]. (Values up to 2e-14 outside that range are accepted and moved to the nearby endpoint.)

na.rm

logical; if true, any NA and NaN's are removed from x before the quantiles are computed.

names

logical; if true, the result has a names attribute. Set to FALSE for speedup with many probs.

type

an integer between 1 and 9 selecting one of the nine quantile algorithms detailed below to be used.

digits

used only when names is true: the precision to use when formatting the percentages. In R versions up to 4.0.x, this had been set to max(2, getOption("digits")), internally.

...

further arguments passed to or from other methods.
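As a small illustration of the names argument described above, the following sketch compares the named and unnamed results on a toy vector. The data and the variable names (x, q_named, q_plain) are made up for this example.

```r
# Toy data, chosen only for illustration
x <- c(2, 4, 4, 5, 7, 9, 10, 12)
probs <- c(0.1, 0.5, 0.9)

# Default: the result carries a names attribute built from the percentages
q_named <- quantile(x, probs = probs)
names(q_named)

# names = FALSE skips building the labels; the values themselves are unchanged
q_plain <- quantile(x, probs = probs, names = FALSE)
names(q_plain)

all.equal(unname(q_named), q_plain)
```

Dropping the names is mainly worthwhile when probs is long, since formatting the percentage labels is then a noticeable part of the work.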

Details

A vector of length length(probs) is returned; if names = TRUE, it has a names attribute.

NA and NaN values in probs are propagated to the result.
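The two kinds of missing values behave differently, which a short sketch may make clearer: an NA in x is an error unless na.rm = TRUE, while an NA in probs simply yields an NA in the corresponding position of the result. The data below are illustrative only.

```r
# Illustrative data with one missing observation
x <- c(1, 3, NA, 7, 9)

# Without na.rm = TRUE the default method refuses to proceed
res <- try(quantile(x), silent = TRUE)
inherits(res, "try-error")

# With na.rm = TRUE the NA in x is dropped; the NA in probs is propagated
q <- quantile(x, probs = c(0.25, NA, 0.75), na.rm = TRUE)
q
```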

The default method works with classed objects sufficiently like numeric vectors that sort and (not needed by types 1 and 3) addition of elements and multiplication by a number work correctly. Note that as this is in a namespace, the copy of sort in base will be used, not some S4 generic of that name. Also note that there is no check on the ‘correctly’, and so e.g. quantile can be applied to complex vectors which (apart from ties) will be ordered on their real parts.

There is a method for the date-time classes (see "POSIXt"). Types 1 and 3 can be used for class "Date" and for ordered factors.
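A minimal sketch of the last point, using type 1 on a "Date" vector and on an ordered factor. The objects d and f are invented for the example. Because type 1 always returns one of the observed values, no arithmetic on the elements is needed.

```r
# Five dates, ten days apart (illustrative)
d <- as.Date("2024-01-01") + c(0, 10, 20, 30, 40)
q_date <- quantile(d, probs = 0.5, type = 1)
q_date

# An ordered factor; the median category is picked from the sorted levels
f <- factor(c("low", "mid", "mid", "high", "high", "high"),
            levels = c("low", "mid", "high"), ordered = TRUE)
q_fac <- quantile(f, probs = 0.5, type = 1)
q_fac
```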

Types

quantile returns estimates of underlying distribution quantiles based on one or two order statistics from the supplied elements in x at probabilities in probs. One of the nine quantile algorithms discussed in Hyndman and Fan (1996), selected by type, is employed.

All sample quantiles are defined as weighted averages of consecutive order statistics. Sample quantiles of type i are defined by:

Q[i](p) = (1 - γ) x[j] + γ x[j+1],

where 1 ≤ i ≤ 9, (j-m)/n ≤ p < (j-m+1)/n, x[j] is the jth order statistic, n is the sample size, the value of γ is a function of j = floor(np + m) and g = np + m - j, and m is a constant determined by the sample quantile type.
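The definition above can be transcribed almost literally into R. The helper hf_quantile below is a hypothetical name, not part of any package, and it is only a sketch: it clamps the indices j and j+1 to 1..n and omits the floating-point fuzz handling that quantile itself applies, so it is meant for comparison on well-behaved inputs rather than as a replacement.

```r
# Hypothetical helper: direct transcription of
#   Q(p) = (1 - gamma) * x[j] + gamma * x[j+1]
# with j = floor(n*p + m) and g = n*p + m - j.
# 'gamma_of' maps (j, g) to gamma; the default gamma = g is the continuous case.
hf_quantile <- function(x, p, m, gamma_of = function(j, g) g) {
  xs <- sort(x)
  n <- length(xs)
  j <- floor(n * p + m)
  g <- n * p + m - j
  gam <- gamma_of(j, g)
  lo <- xs[pmin(pmax(j, 1), n)]      # x[j], clamped to the sample range
  hi <- xs[pmin(pmax(j + 1, 1), n)]  # x[j+1], clamped to the sample range
  (1 - gam) * lo + gam * hi
}

x <- c(2, 3, 5, 7, 11, 13, 17, 19)   # illustrative data, no ties
p <- c(0.3, 0.62)

q7 <- hf_quantile(x, p, m = 1 - p)   # type 7: m = 1 - p, gamma = g
q6 <- hf_quantile(x, p, m = p)       # type 6: m = p,     gamma = g
q1 <- hf_quantile(x, p, m = 0,       # type 1: gamma = 0 if g = 0, else 1
                  gamma_of = function(j, g) as.numeric(g > 0))

rbind(q7, q6, q1)
```

On these inputs the three rows agree with quantile(x, p, type = 7), type = 6 and type = 1 respectively.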

Discontinuous sample quantile types 1, 2, and 3

For types 1, 2 and 3, Q[i](p) is a discontinuous function of p, with m = 0 when i = 1 and i = 2, and m = -1/2 when i = 3.

Type 1

Inverse of empirical distribution function. γ = 0 if g = 0, and 1 otherwise.

Type 2

Similar to type 1 but with averaging at discontinuities. γ = 0.5 if g = 0, and 1 otherwise (SAS default, see Wicklin (2017)).

Type 3

Nearest even order statistic (SAS default till ca. 2010). γ = 0 if g = 0 and j is even, and 1 otherwise.
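The three discontinuous types differ only at the jump points, where g = 0. A small sketch with four made-up observations shows this. For types 1 and 2 the jump is at p = 0.5 (np = 2). For type 3, with m = -1/2, the cases g = 0 occur at p = 0.375 (j = 1, odd) and p = 0.625 (j = 2, even).

```r
x <- c(10, 20, 30, 40)   # illustrative data, n = 4

# At p = 0.5: type 1 takes x[2], type 2 averages x[2] and x[3]
t1 <- quantile(x, 0.5, type = 1, names = FALSE)
t2 <- quantile(x, 0.5, type = 2, names = FALSE)

# Type 3 at its own g = 0 points: j odd moves up to x[j+1], j even stays at x[j],
# so both probabilities land on the even order statistic x[2]
t3 <- quantile(x, c(0.375, 0.625), type = 3, names = FALSE)

c(type1 = t1, type2 = t2)
t3
```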

Continuous sample quantile types 4 through 9

For types 4 through 9, Q[i](p) is a continuous function of p, with γ = g and m given below. The sample quantiles can be obtained equivalently by linear interpolation between the points (p[k],x[k]) where x[k] is the kth order statistic. Specific expressions for p[k] are given below.

Type 4

m = 0. p[k] = k / n. That is, linear interpolation of the empirical cdf.

Type 5

m = 1/2. p[k] = (k - 0.5) / n. That is a piecewise linear function where the knots are the values midway through the steps of the empirical cdf. This is popular amongst hydrologists.

Type 6

m = p. p[k] = k / (n + 1). Thus p[k] = E[F(x[k])]. This is used by Minitab and by SPSS.

Type 7

m = 1-p. p[k] = (k - 1) / (n - 1). In this case, p[k] = mode[F(x[k])]. This is used by S.

Type 8

m = (p+1)/3. p[k] = (k - 1/3) / (n + 1/3). Then p[k] ≈ median[F(x[k])]. The resulting quantile estimates are approximately median-unbiased regardless of the distribution of x.

Type 9

m = p/4 + 3/8. p[k] = (k - 3/8) / (n + 1/4). The resulting quantile estimates are approximately unbiased for the expected order statistics if x is normally distributed.
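The equivalence with linear interpolation between the points (p[k], x[k]) can be checked with approx() from the stats package. The sketch below builds the knots p[k] for types 4 through 9 from the expressions above and interpolates at two probabilities that lie inside the knot range for every type. The data and the names knots, by_interp and by_quantile are assumptions made for this example.

```r
x <- c(2, 3, 5, 7, 11, 13, 17, 19)   # illustrative data
xs <- sort(x)
n <- length(xs)
k <- seq_len(n)

# Plotting positions p[k] for the six continuous types
knots <- list(
  "4" = k / n,
  "5" = (k - 0.5) / n,
  "6" = k / (n + 1),
  "7" = (k - 1) / (n - 1),
  "8" = (k - 1/3) / (n + 1/3),
  "9" = (k - 3/8) / (n + 1/4)
)

p <- c(0.3, 0.62)

# Linear interpolation through (p[k], x[k]); rule = 2 holds the end values
# constant outside the knot range
by_interp <- sapply(knots, function(pk) approx(pk, xs, xout = p, rule = 2)$y)

# The same quantities from quantile() itself, one column per type
by_quantile <- sapply(4:9, function(typ) quantile(x, p, type = typ, names = FALSE))

all.equal(unname(by_interp), by_quantile)
```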

Further details are provided in Hyndman and Fan (1996) who recommended type 8. The default method is type 7, as used by S and by R < 2.0.0. Makkonen argues for type 6, as already proposed by Weibull in 1939. The Wikipedia page contains further information about availability of these 9 types in software.

Author(s)

of the version used in R >= 2.0.0, Ivan Frohne and Rob J Hyndman.

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Hyndman, R. J. and Fan, Y. (1996) Sample quantiles in statistical packages, American Statistician 50, 361–365. doi:10.2307/2684934.

Wicklin, R. (2017) Sample quantiles: A comparison of 9 definitions; SAS Blog. https://blogs.sas.com/content/iml/2017/05/24/definitions-sample-quantiles.html

Wikipedia: https://en.wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample

See Also

ecdf for empirical distributions of which quantile is an inverse; boxplot.stats and fivenum for computing other versions of quartiles, etc.

Examples

quantile(x <- rnorm(1001)) # Extremes & Quartiles by default
quantile(x, probs = c(0.1, 0.5, 1, 2, 5, 10, 50, NA)/100)

### Compare different types
quantAll <- function(x, prob, ...)
  t(vapply(1:9, function(typ) quantile(x, probs = prob, type = typ, ...),
           quantile(x, prob, type=1, ...)))
p <- c(0.1, 0.5, 1, 2, 5, 10, 50)/100
signif(quantAll(x, p), 4)

## 0% and 100% are equal to min(), max() for all types:
stopifnot(t(quantAll(x, prob=0:1)) == range(x))

## for complex numbers:
z <- complex(real = x, imaginary = -10*x)
signif(quantAll(z, p), 4)