Type inference
Type inference refers to the automatic detection of the data type of an expression in a programming language.
It is a feature present in some strongly statically typed languages. It is often characteristic of functional programming languages in general. Languages that support type inference include C++11, C# (starting with version 3.0), Chapel, Clean, Crystal, D, F#, FreeBASIC, Go, Haskell, Java (starting with version 10), Julia, Kotlin, ML, Nim, OCaml, Opa, RPython, Rust, Scala, Swift, Vala and Visual Basic (starting with version 9.0). The majority of them use a simple form of type inference; the Hindley–Milner type system can provide more complete type inference. The ability to infer types automatically makes many programming tasks easier, leaving the programmer free to omit type annotations while still permitting type checking.
Nontechnical explanation
In some programming languages, all values have a data type explicitly declared at compile time, limiting the values a particular expression can take on at run time. Increasingly, just-in-time compilation renders the distinction between run time and compile time moot. However, historically, languages in which the type of a value is known only at run time are dynamically typed, and languages in which the type of an expression is known at compile time are statically typed. In most statically typed languages, the input and output types of functions and local variables ordinarily must be explicitly provided by type annotations. For example, in C:
int add_one(int x) {
    int result; /* declare integer result */
    result = x + 1;
    return result;
}
The signature of this function definition, int add_one(int x), declares that add_one is a function that takes one argument, an integer, and returns an integer. int result; declares that the local variable result is an integer. In a hypothetical language supporting type inference, the code might be written like this instead:
add_one(x) {
    var result;  /* inferred-type variable result */
    var result2; /* inferred-type variable result #2 */
    result = x + 1;
    result2 = x + 1.0; /* this line won't work (in the proposed language) */
    return result;
}
This is identical to how code is written in the language Dart, except that it is subject to some added constraints as described below. It would be possible to infer the types of all the variables at compile time. In the example above, the compiler would infer that result and x have type integer since the constant 1 is type integer, and hence that add_one is a function int -> int. The variable result2 isn't used in a legal manner, so it wouldn't have a type.
In the imaginary language in which the last example is written, the compiler would assume that, in the absence of information to the contrary, + takes two integers and returns one integer. (This is how it works in, for example, OCaml.) From this, the type inferencer can infer that the type of x + 1 is an integer, which means result is an integer and thus the return value of add_one is an integer. Similarly, since + requires both of its arguments be of the same type, x must be an integer, and thus, add_one accepts one integer as an argument.
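The chain of deductions above can be sketched as a very small constraint solver. The following Python sketch is illustrative only: the hypothetical language is not Python, and the names TypeConflict, resolve, unify and infer_add_one are assumptions made for this example. Unknown types are written as strings starting with "?", and each use of a variable adds one equality constraint. The solver does not backtrack, so it reports an error when x is required to be both an integer and a floating-point number.

```python
class TypeConflict(Exception):
    """Raised when one variable is required to have two different types."""


def resolve(t, subst):
    """Follow bindings until reaching a concrete type or an unbound variable."""
    while t in subst:
        t = subst[t]
    return t


def unify(a, b, subst):
    """Require types a and b to be equal, binding unknowns (names starting with '?')."""
    a, b = resolve(a, subst), resolve(b, subst)
    if a == b:
        return
    if a.startswith("?"):
        subst[a] = b
    elif b.startswith("?"):
        subst[b] = a
    else:
        raise TypeConflict(f"cannot use {a} where {b} is required")


def infer_add_one(include_result2=False):
    """Infer the type of add_one; optionally include the result2 line."""
    subst = {}
    x, result, result2, ret = "?x", "?result", "?result2", "?ret"

    # result = x + 1: '+' is assumed to take two ints and return an int.
    unify(x, "int", subst)        # left operand of '+'
    unify(result, "int", subst)   # value produced by '+'

    if include_result2:
        # result2 = x + 1.0: floating-point '+' needs a float left operand.
        unify(x, "float", subst)  # conflicts with the earlier x : int
        unify(result2, "float", subst)

    unify(ret, result, subst)     # return result;
    return f"{resolve(x, subst)} -> {resolve(ret, subst)}"


print(infer_add_one())  # int -> int
```

Calling infer_add_one(include_result2=True) raises TypeConflict, which corresponds to the comment in the example that the result2 line will not work.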
However, in the subsequent line, result2 is calculated by adding a decimal 1.0 with floating-point arithmetic, causing a conflict in the use of x for both integer and floating-point expressions. The correct type-inference algorithm for such a situation has been known since 1958 and has been known to be correct since 1982. It revisits the prior inferences and uses the most general type from the outset: in this case floating-point. This can however have detrimental implications; for instance, using a floating-point type from the outset can introduce precision issues that would not have been there with an integer type.
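The precision issue can be made concrete with a short experiment. The sketch below uses Python purely for illustration and assumes IEEE 754 double-precision floats, which CPython uses on common platforms. The two function names are hypothetical and stand for the integer and floating-point readings of add_one.

```python
def add_one_as_int(x: int) -> int:
    """add_one with x inferred as an integer: exact arithmetic."""
    return x + 1


def add_one_as_float(x: float) -> float:
    """add_one with x widened to floating-point from the outset."""
    return x + 1.0


big = 2**53 + 1  # an integer that a double cannot represent exactly

print(add_one_as_int(big))           # 9007199254740994
print(add_one_as_float(float(big)))  # 9007199254740992.0
```

The two results differ by 2, because both the conversion of the argument and the addition are rounded to the nearest representable double.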
Frequently, however, degenerate type-inference algorithms are used that cannot backtrack and instead generate an error message in such a situation. This behavior may be preferable as type inference may not always be neutral algorithmically, as illustrated by the prior floating-point precision issue.
An algorithm of intermediate generality implicitly declares result2 as a floating-point variable, and the addition implicitly converts x to a floating point. This can be correct if the calling contexts never supply a floating-point argument. Such a situation shows the difference between type inference, which does not involve type conversion, and implicit type conversion, which forces data to a different data type, often without restrictions.
Finally, a significant downside of complex type-inference algorithms is that the resulting type inference resolution is not going to be obvious to humans (notably because of the backtracking), which can be detrimental as code is primarily intended to be comprehensible to humans.
The recent emergence of just-in-time compilation allows for hybrid approaches where the types of the arguments supplied by the various calling contexts are known at compile time, and the compiler can generate a large number of compiled versions of the same function. Each compiled version can then be optimized for a different set of types. For instance, JIT compilation allows there to be at least two compiled versions of add_one:
 A version that accepts an integer input and uses implicit type conversion.
 A version that accepts a floating-point number as input and uses floating-point instructions throughout.
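The idea of keeping one compiled version per argument type can be imitated in a few lines. This is only a loose sketch in Python: a real JIT compiler emits machine code, whereas here a "compiled version" is an ordinary function chosen by the run-time type of the argument. All names (compiled_versions, specialise, add_one_for_int, add_one_for_float) are hypothetical.

```python
compiled_versions = {}  # maps an argument type to its specialised function


def add_one_for_int(x):
    """Version for integer input; result2 needs a conversion of x."""
    result = x + 1
    result2 = float(x) + 1.0  # stands in for the implicit int -> float conversion
    return result


def add_one_for_float(x):
    """Version for floating-point input; floating-point arithmetic throughout."""
    result = x + 1.0
    result2 = x + 1.0
    return result


def specialise(arg_type):
    """Pick (in a real JIT: generate) the version for the given argument type."""
    if arg_type is int:
        return add_one_for_int
    if arg_type is float:
        return add_one_for_float
    raise TypeError(f"add_one is not defined for {arg_type.__name__}")


def add_one(x):
    """Dispatch on the type of x, creating the specialised version on first use."""
    arg_type = type(x)
    if arg_type not in compiled_versions:
        compiled_versions[arg_type] = specialise(arg_type)
    return compiled_versions[arg_type](x)


print(add_one(1))    # 2
print(add_one(1.5))  # 2.5
```

After both calls, compiled_versions holds one entry for int and one for float, mirroring the two versions listed above.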
Technical description
Type inference is the ability to automatically deduce, either partially or fully, the type of an expression at compile time. The compiler is often able to infer the type of a variable or the type signature of a function, without explicit type annotations having been given. In many cases, it is possible to omit type annotations from a program completely if the type inference system is robust enough, or the program or language is simple enough.
To obtain the information required to infer the type of an expression, the compiler either gathers this information as an aggregate and subsequent reduction of the type annotations given for its subexpressions, or through an implicit understanding of the type of various atomic values (e.g. true : Bool; 42 : Integer; 3.14159 : Real; etc.). It is through recognition of the eventual reduction of expressions to implicitly typed atomic values that the compiler for a type-inferring language is able to compile a program completely without type annotations.
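A minimal sketch of this bottom-up process, written in Python for illustration, is given below. The expression encoding (nested tuples tagged "lit", "add", "less" and "if") and the function name infer are assumptions made for this example; the type names Bool, Integer and Real follow the atomic values mentioned above.

```python
def infer(expr):
    """Infer the type of an expression tree from the types of its literals."""
    tag = expr[0]
    if tag == "lit":
        value = expr[1]
        # bool must be tested first: in Python, bool is a subclass of int.
        if isinstance(value, bool):
            return "Bool"
        if isinstance(value, int):
            return "Integer"
        if isinstance(value, float):
            return "Real"
        raise TypeError(f"no type known for literal {value!r}")
    if tag in ("add", "less"):
        left, right = infer(expr[1]), infer(expr[2])
        if left != right or left not in ("Integer", "Real"):
            raise TypeError(f"cannot apply {tag} to {left} and {right}")
        return left if tag == "add" else "Bool"
    if tag == "if":
        cond, then, other = infer(expr[1]), infer(expr[2]), infer(expr[3])
        if cond != "Bool" or then != other:
            raise TypeError("ill-typed conditional")
        return then
    raise ValueError(f"unknown expression form {tag!r}")


# if 42 < 7 then 3.14159 else 3.14159 + 2.5
example = ("if", ("less", ("lit", 42), ("lit", 7)),
           ("lit", 3.14159),
           ("add", ("lit", 3.14159), ("lit", 2.5)))
print(infer(example))  # Real
```

No annotation appears anywhere in the example expression; its type is obtained solely by reducing it to literals whose types are known.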
In complex forms of higher-order programming and polymorphism, it is not always possible for the compiler to infer as much, and type annotations are occasionally necessary for disambiguation. For instance, type inference with polymorphic recursion is known to be undecidable. Furthermore, explicit type annotations can be used to optimize code by forcing the compiler to use a more specific (faster/smaller) type than it had inferred.[1]
Some methods for type inference are based on constraint satisfaction.[2]
Example
As an example, the Haskell function map applies a function to each element of a list, and may be defined as:
map f [] = []
map f (first:rest) = f first : map f rest
Type inference on the map function proceeds as follows. map is a function of two arguments, so its type is constrained to be of the form a → b → c. In Haskell, the patterns [] and (first:rest) always match lists, so the second argument must be a list type: b = [d] for some type d. Its first argument f is applied to the argument first, which must have type d, corresponding with the type in the list argument, so f :: d → e (:: means "is of type") for some type e. The return value of map f, finally, is a list of whatever f produces, so [e].
Putting the parts together leads to map :: (d → e) → [d] → [e]. Nothing is special about the type variables, so it can be relabeled as
map :: (a → b) → [a] → [b]
It turns out that this is also the most general type, since no further constraints apply. As the inferred type of map is parametrically polymorphic, the types of the arguments and results of f are not inferred, but left as type variables, and so map can be applied to functions and lists of various types, as long as the actual types match in each invocation.
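The derivation for map can be reproduced mechanically by unification. The sketch below is written in Python for illustration and is not how a Haskell compiler is implemented. Types are encoded as tuples, and the helper names (var, fun, lst, resolve, occurs, unify, relabel, show, infer_map) are assumptions made for this example. The three constraints are entered by hand, following the argument above, and the ASCII arrow -> is used in place of →.

```python
def var(name): return ("var", name)
def fun(arg, res): return ("fun", arg, res)
def lst(elem): return ("list", elem)


def resolve(t, subst):
    """Apply the substitution to every variable inside t."""
    if t[0] == "var":
        return resolve(subst[t[1]], subst) if t[1] in subst else t
    return (t[0],) + tuple(resolve(arg, subst) for arg in t[1:])


def occurs(name, t):
    """True if the variable called name appears inside t."""
    if t[0] == "var":
        return t[1] == name
    return any(occurs(name, arg) for arg in t[1:])


def unify(t1, t2, subst):
    """Extend subst so that t1 and t2 become the same type."""
    t1, t2 = resolve(t1, subst), resolve(t2, subst)
    if t1 == t2:
        return
    if t1[0] == "var":
        if occurs(t1[1], t2):
            raise TypeError("infinite type")
        subst[t1[1]] = t2
    elif t2[0] == "var":
        unify(t2, t1, subst)
    elif t1[0] == t2[0]:
        for left, right in zip(t1[1:], t2[1:]):
            unify(left, right, subst)
    else:
        raise TypeError(f"cannot match {show(t1)} with {show(t2)}")


def relabel(t, names):
    """Rename type variables to a, b, c, ... in order of first appearance."""
    if t[0] == "var":
        names.setdefault(t[1], "abcdefghijklmnopqrstuvwxyz"[len(names)])
        return var(names[t[1]])
    return (t[0],) + tuple(relabel(arg, names) for arg in t[1:])


def show(t):
    """Render a type in Haskell-like notation."""
    if t[0] == "var":
        return t[1]
    if t[0] == "list":
        return f"[{show(t[1])}]"
    arg = f"({show(t[1])})" if t[1][0] == "fun" else show(t[1])
    return f"{arg} -> {show(t[2])}"


def infer_map():
    subst = {}
    a, b, c, d, e = (var(n) for n in "abcde")
    map_type = fun(a, fun(b, c))  # map takes two arguments: a -> b -> c
    unify(b, lst(d), subst)       # the patterns [] and (first:rest) match lists
    unify(a, fun(d, e), subst)    # f is applied to first :: d
    unify(c, lst(e), subst)       # the result is a list of what f produces
    return resolve(map_type, subst)


inferred = infer_map()
print(show(inferred))               # (d -> e) -> [d] -> [e]
print(show(relabel(inferred, {})))  # (a -> b) -> [a] -> [b]
```

The variables d and e remain unbound after solving, which is what makes the resulting type polymorphic.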
Hindley–Milner type inference algorithm
The algorithm first used to perform type inference is now informally termed the Hindley–Milner algorithm, although the algorithm should properly be attributed to Damas and Milner.[3]
The origin of this algorithm is the type inference algorithm for the simply typed lambda calculus that was devised by Haskell Curry and Robert Feys in 1958. In 1969 J. Roger Hindley extended this work and proved that their algorithm always inferred the most general type. In 1978 Robin Milner,[4] independently of Hindley's work, provided an equivalent algorithm, Algorithm W. In 1982 Luis Damas[3] finally proved that Milner's algorithm is complete and extended it to support systems with polymorphic references.
Side-effects of using the most general type
By design, type inference, especially correct (backtracking) type inference, introduces use of the most general type appropriate. However, this can have implications, as more general types may not always be algorithmically neutral. The typical cases are:
 floating-point being considered as a general type of integer, while floating-point will introduce precision issues
 variant/dynamic types being considered as a general type of other types, which will introduce casting rules and comparisons that could be different; for instance, such types use the '+' operator for both numeric addition and string concatenation, but which operation is performed is determined dynamically rather than statically
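The second case can be observed in any dynamically typed language. The following Python sketch is given for illustration only, and the function names plus and less_than are hypothetical. It shows the same source-level operator performing different operations depending on the run-time types of its operands.

```python
def plus(left, right):
    """'+' on dynamically typed values: the operation depends on run-time types."""
    return left + right


def less_than(left, right):
    """'<' likewise compares numerically or lexicographically."""
    return left < right


print(plus(1, 2))            # 3 (numeric addition)
print(plus("1", "2"))        # 12 (string concatenation)
print(less_than(10, 9))      # False (numeric comparison)
print(less_than("10", "9"))  # True (lexicographic comparison)
```

If an inference algorithm generalizes a variable to such a type, the choice between these behaviours is deferred to run time rather than fixed when the program is compiled.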
Type inference for natural languages
Type inference algorithms have been used to analyze natural languages as well as programming languages.[5][6][7] Type inference algorithms are also used in some grammar induction[8][9] and constraint-based grammar systems for natural languages.[10]
References
 [1] Bryan O'Sullivan; Don Stewart; John Goerzen (2008). "Chapter 25. Profiling and optimization". Real World Haskell. O'Reilly.
 [2] Talpin, Jean-Pierre, and Pierre Jouvelot. "Polymorphic type, region and effect inference." Journal of Functional Programming 2.3 (1992): 245–271.
 [3] Damas, Luis; Milner, Robin (1982), "Principal type-schemes for functional programs", POPL '82: Proceedings of the 9th ACM SIGPLAN-SIGACT symposium on principles of programming languages (PDF), ACM, pp. 207–212
 [4] Milner, Robin (1978), "A Theory of Type Polymorphism in Programming", Journal of Computer and System Sciences, 17 (3): 348–375, doi:10.1016/0022-0000(78)90014-4
 [5] Center, Artificial Intelligence. Parsing and type inference for natural and computer languages. Diss. Stanford University, 1989.
 [6] Emele, Martin C., and Rémi Zajac. "Typed unification grammars." Proceedings of the 13th conference on Computational linguistics, Volume 3. Association for Computational Linguistics, 1990.
 [7] Pareschi, Remo. "Type-driven natural language analysis." (1988).
 [8] Fisher, Kathleen, et al. "From dirt to shovels: fully automatic tool generation from ad hoc data." ACM SIGPLAN Notices. Vol. 43. No. 1. ACM, 2008.
 [9] Lappin, Shalom; Shieber, Stuart M. (2007). "Machine learning theory and practice as a source of insight into universal grammar" (PDF). Journal of Linguistics. 43 (2): 393–427. doi:10.1017/s0022226707004628.
 [10] Stuart M. Shieber (1992). Constraint-based Grammar Formalisms: Parsing and Type Inference for Natural and Computer Languages. MIT Press. ISBN 978-0-262-19324-5.
External links
 Archived email message by Roger Hindley, explains history of type inference
 Polymorphic Type Inference by Michael Schwartzbach, gives an overview of polymorphic type inference
 Basic Typechecking paper by Luca Cardelli, describes algorithm, includes implementation in Modula-2
 Implementation of Hindley–Milner type inference in Scala, by Andrew Forrest (retrieved July 30, 2009)
 Implementation of Hindley–Milner in Perl 5, by Nikita Borisov at the Wayback Machine (archived February 18, 2007)
 What is Hindley–Milner? (and why is it cool?) Explains Hindley–Milner, examples in Scala