Randomization is the process of making something random; in various contexts this involves, for example:
- generating a random permutation of a sequence (such as when shuffling cards);
- selecting a random sample of a population (important in statistical sampling);
- allocating experimental units via random assignment to a treatment or control condition;
- generating random numbers (see Random number generation); or
- transforming a data stream (such as when using a scrambler in telecommunications).
Randomization is not haphazard. Instead, a random process is a sequence of random variables whose outcomes do not follow a deterministic pattern but evolve according to probability distributions. For example, a random sample of individuals from a population is a sample in which every individual has a known probability of being selected. This contrasts with nonprobability sampling, in which individuals are selected arbitrarily.
Randomization is used in statistics and in gambling.
Randomization is a core principle in statistical theory, whose importance was emphasized by Charles S. Peirce in "Illustrations of the Logic of Science" (1877–1878) and "A Theory of Probable Inference" (1883). Randomization-based inference is especially important in experimental design and in survey sampling. The first use of "randomization" listed in the Oxford English Dictionary is its use by Ronald Fisher in 1926.
In the statistical theory of design of experiments, randomization involves randomly allocating the experimental units across the treatment groups. For example, if an experiment compares a new drug against a standard drug, then the patients should be allocated to either the new drug or the standard drug (the control) using randomization. Randomization reduces confounding by equalising factors (independent variables) that have not been accounted for in the experimental design.
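The allocation described above can be illustrated with a minimal sketch. It assumes a simple two-arm design with groups of (nearly) equal size; the function name `randomize_assignment`, the patient labels, and the seed are hypothetical choices made only for illustration.

```python
import random

def randomize_assignment(units, seed=None):
    """Randomly split units into treatment and control groups.

    Hypothetical helper: shuffles a copy of the units and cuts it in half,
    so every unit has the same chance of landing in either group.
    """
    rng = random.Random(seed)      # seeded only so the example is repeatable
    shuffled = list(units)         # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Illustrative data: ten hypothetical patients.
patients = [f"patient_{i}" for i in range(1, 11)]
groups = randomize_assignment(patients, seed=42)
print(groups["treatment"])
print(groups["control"])
```

Because group membership depends only on the shuffle, it cannot be systematically related to any patient characteristic, measured or unmeasured.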
Some important methods of statistical inference use resampling from the observed data. Multiple alternative versions of the data-set that "might have been observed" are created by randomization of the original data-set, the only one observed. The variation of statistics calculated for these alternative data-sets is a guide to the uncertainty of statistics estimated from the original data.
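One common resampling method of this kind is the bootstrap, which draws from the observed data with replacement. The following is a minimal sketch, not a definitive implementation: the function name `bootstrap_standard_error`, the number of resamples, and the data values are all assumptions introduced for illustration.

```python
import random
import statistics

def bootstrap_standard_error(data, statistic=statistics.mean,
                             n_resamples=2000, seed=None):
    """Estimate the standard error of a statistic by bootstrap resampling.

    Each resample is an alternative data set that "might have been
    observed", drawn with replacement from the original data.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=n)   # sampling with replacement
        estimates.append(statistic(resample))
    # Spread of the statistic across resamples approximates its uncertainty.
    return statistics.stdev(estimates)

# Illustrative (made-up) measurements.
observed = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4]
se = bootstrap_standard_error(observed, seed=0)
print(f"mean = {statistics.mean(observed):.3f}, bootstrap SE = {se:.3f}")
```

Other resampling schemes, such as permutation tests, follow the same pattern but rearrange the data without replacement.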
Although historically "manual" randomization techniques (such as shuffling cards, drawing pieces of paper from a bag, or spinning a roulette wheel) were common, nowadays automated techniques are mostly used. Because both selecting random samples and generating random permutations can be reduced to selecting random numbers, random number generation methods are now most commonly used, whether hardware random number generators or pseudo-random number generators.
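The reduction of permutations and samples to random numbers can be sketched as follows, using the Fisher–Yates shuffle as one well-known example. The helper names are hypothetical, and Python's pseudo-random generator stands in for whatever source of random integers is available; `random.SystemRandom`, which draws on operating-system entropy, could be substituted.

```python
import random

def fisher_yates_shuffle(items, rng):
    """Return a random permutation using only random integers.

    Working from the last position down, each element is swapped with a
    uniformly chosen element at or before it.
    """
    result = list(items)
    for i in range(len(result) - 1, 0, -1):
        j = rng.randint(0, i)               # inclusive on both ends
        result[i], result[j] = result[j], result[i]
    return result

def random_sample(items, k, rng):
    """Select k distinct items: the first k entries of a random permutation."""
    return fisher_yates_shuffle(items, rng)[:k]

rng = random.Random(7)                       # seeded for repeatability
deck = list(range(52))                       # a deck of cards as numbers 0..51
shuffled = fisher_yates_shuffle(deck, rng)
hand = random_sample(deck, 5, rng)
print(shuffled[:10])
print(hand)
```

Shuffling the whole sequence to draw a small sample is wasteful for large populations; it is used here only to show that sampling follows from permutation.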
Randomization is used in optimization to alleviate the computational burden associated with robust control techniques: a sample of values of the uncertainty parameters is randomly drawn, and robustness is enforced for these values only. This approach has gained popularity with the introduction of rigorous theories that allow control over the probabilistic level of robustness (see scenario optimization).
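The idea can be illustrated with a deliberately small toy problem. Everything in it is an assumption made for illustration: the one-dimensional decision variable, the Gaussian uncertainty model, and the function names. The problem is to find the smallest x that satisfies x ≥ δ for every sampled value δ of the uncertain parameter, rather than for all possible values.

```python
import random

rng = random.Random(1)  # seeded for repeatability

def draw_uncertainty():
    """Hypothetical uncertainty model: one draw of the uncertain parameter."""
    return rng.gauss(10.0, 2.0)

def solve_scenario_program(scenarios):
    """Minimize x subject to x >= delta for every sampled delta.

    For this toy problem the optimum is simply the largest sample.
    """
    return max(scenarios)

# Enforce the constraint only on a random sample of scenarios.
scenarios = [draw_uncertainty() for _ in range(200)]
x_star = solve_scenario_program(scenarios)

# Check empirically how often the solution is violated on fresh draws.
fresh = [draw_uncertainty() for _ in range(10_000)]
violation_rate = sum(delta > x_star for delta in fresh) / len(fresh)
print(f"x* = {x_star:.3f}, empirical violation rate = {violation_rate:.4f}")
```

The solution is not guaranteed for every possible value of the uncertainty, but the fraction of unseen values that violate it is typically small and shrinks as more scenarios are sampled.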
Non-algorithmic randomization methods include: