Basic Concepts
We want to move away from “standard” or “typical” approaches to statistical inference, where we assume that our data are drawn from some distributional family, e.g. the standard setup in which $X_1, X_2, \ldots, X_n \sim N(\mu, \sigma^2)$. Here $N(\mu, \sigma^2)$ is the normal distributional family. Similarly, we could have $\text{Pois}(\lambda)$ for a Poisson distribution. In these cases, we’re making assumptions about the underlying distribution. These assumptions may or may not be realistic or valid. In any case, they are restrictive.
Nonparametric statistical methods (sometimes called “distribution-free” methods) aim to relax these assumptions about distributional forms. They will be more general and more robust ^{1}, but we often (though not always) sacrifice power if the data truly come from a particular family, such as the normal, for which optimal tests (such as the z-test or t-test) exist.
The term “nonparametric method” is also used in a variety of ways, which we want to examine:
- Classical approaches, e.g. based on ranks
- Computational approaches, e.g. the bootstrap
- Modern regression (and other) approaches, e.g. smoothing
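To give a feel for why rank-based approaches need so little from the distribution, here is a minimal sketch in Python. Ranks depend on the data only through their ordering, so any strictly increasing transformation (here, the logarithm) leaves them unchanged. The helper `ranks` and the sample values are made up for illustration, and ties are assumed absent.

```python
import math

def ranks(values):
    """Return 1-based ranks of the values (illustrative helper; assumes no ties)."""
    # Indices of the observations, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

sample = [2.7, 0.4, 15.2, 1.1, 6.3]  # made-up positive data

ranks_raw = ranks(sample)
ranks_log = ranks([math.log(x) for x in sample])  # strictly increasing transform

print(ranks_raw)               # [3, 1, 5, 2, 4]
print(ranks_raw == ranks_log)  # True: the ranks are unchanged
```

Any statistic computed from the ranks alone therefore takes the same value on the raw and the transformed data.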
The big question is: if we don’t assume a distributional family, how can we proceed to do inference? What sorts of inferential questions can we ask and answer?
We do still need to make some assumptions (of course), but they can be weaker than what we’re used to. For example, instead of normality, which is a strong assumption, we might assume that the true data distribution is merely symmetric.
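As one illustration of how far a symmetry assumption alone can take us, here is a minimal sketch of an exact sign-flip test. If the data are symmetric about a hypothesized center, each deviation from that center is equally likely to be positive or negative, so we can compare the observed sum of deviations with its value under every possible assignment of signs. The function name `sign_flip_p_value` and the data are hypothetical, and this is only one of several tests that could be built on the symmetry assumption.

```python
from itertools import product

def sign_flip_p_value(data, center=0.0):
    """Two-sided exact p-value for H0: the data are symmetric about `center`.

    Illustrative sketch: enumerates all 2**n sign assignments, so it is
    only practical for small samples.
    """
    deviations = [x - center for x in data]
    observed = abs(sum(deviations))
    count = 0
    total = 0
    for signs in product((1, -1), repeat=len(deviations)):
        flipped = abs(sum(s * d for s, d in zip(signs, deviations)))
        # Small tolerance guards against floating-point rounding.
        if flipped >= observed - 1e-12:
            count += 1
        total += 1
    return count / total

sample = [1.2, 0.8, 2.1, 0.3, 1.7, 0.9]  # made-up data, all above zero
p_value = sign_flip_p_value(sample, center=0.0)
print(p_value)  # 0.03125, i.e. 2 of the 64 sign assignments are as extreme
```

No normality is used anywhere: the reference distribution comes entirely from the symmetry assumption.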
For comparing two samples, rather than assuming that both come from normally distributed populations with possibly different means, we might assume that their distributions are the same (without specifying what that common distribution is) but with a shift in location.
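One common way to write such a location-shift model is sketched below. The symbols $F$, $G$, and $\Delta$ are notation assumed here for illustration: $F$ and $G$ are the distribution functions of the two populations, and $\Delta$ is the size of the shift.

$$
X_1, \ldots, X_m \sim F, \qquad Y_1, \ldots, Y_n \sim G, \qquad G(x) = F(x - \Delta) \ \text{ for all } x .
$$

Equivalently, $Y$ has the same distribution as $X + \Delta$, since $P(X + \Delta \le x) = F(x - \Delta)$. The case $\Delta = 0$ corresponds to the two populations having identical distributions, and the shape of $F$ is left unspecified.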


^{1} That is, the methods will be good in a wider range of applications.