Confusion matrix
In the field of machine learning and specifically the problem of statistical classification, a confusion matrix, also known as an error matrix,[5] is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one (in unsupervised learning it is usually called a matching matrix). Each row of the matrix represents the instances in a predicted class while each column represents the instances in an actual class (or vice versa).[2] The name stems from the fact that it makes it easy to see if the system is confusing two classes (i.e. commonly mislabeling one as another).
Sources: Fawcett (2006),[1] Powers (2011),[2] Ting (2011),[3] and CAWCR[4] 
It is a special kind of contingency table, with two dimensions ("actual" and "predicted"), and identical sets of "classes" in both dimensions (each combination of dimension and class is a variable in the contingency table).
Example
If a classification system has been trained to distinguish between cats and dogs, a confusion matrix will summarize the results of testing the algorithm for further inspection. Assuming a sample of 13 animals (8 cats and 5 dogs), the resulting confusion matrix could look like the table below:

                   Actual class
                   Cat      Dog
Predicted class
  Cat              5        2
  Dog              3        3

In this confusion matrix, of the 8 actual cats, the system predicted that 3 were dogs, and of the 5 actual dogs, it predicted that 2 were cats. All correct predictions lie on the diagonal of the table (5 cats and 3 dogs), so prediction errors are easy to spot visually: they are the values off the diagonal.
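As a minimal sketch of how such a matrix can be tallied, the following Python snippet counts (predicted, actual) label pairs. The function name `confusion_matrix` and the label lists are hypothetical; the lists are constructed only so that the counts match the cat/dog example, with rows as predicted classes and columns as actual classes.

```python
from collections import Counter


def confusion_matrix(actual, predicted, classes):
    """Return a nested dict where matrix[predicted_class][actual_class] is a count."""
    counts = Counter(zip(predicted, actual))
    return {p: {a: counts[(p, a)] for a in classes} for p in classes}


# Illustrative labels (an assumption): 8 actual cats followed by 5 actual dogs.
actual = ["cat"] * 8 + ["dog"] * 5
# 5 cats predicted as cat, 3 cats as dog, 2 dogs as cat, 3 dogs as dog.
predicted = ["cat"] * 5 + ["dog"] * 3 + ["cat"] * 2 + ["dog"] * 3

matrix = confusion_matrix(actual, predicted, ["cat", "dog"])
for predicted_class, row in matrix.items():
    print(predicted_class, row)
```

Libraries differ on orientation (some put actual classes in rows), so it is worth checking which convention a given tool uses before reading off-diagonal cells.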
Table of confusion
In predictive analytics, a table of confusion (sometimes also called a confusion matrix) is a table with two rows and two columns that reports the number of false positives, false negatives, true positives, and true negatives. This allows more detailed analysis than the mere proportion of correct classifications (accuracy). Accuracy alone can give a misleading picture of a classifier's real performance when the data set is unbalanced (that is, when the numbers of observations in different classes vary greatly). For example, if there were 95 cats and only 5 dogs in the data, a particular classifier might classify all the observations as cats. The overall accuracy would be 95%, but in more detail the classifier would have a 100% recognition rate (sensitivity) for the cat class and a 0% recognition rate for the dog class. F1 score is even more unreliable in such cases: computed for the cat class, it would here be about 97.4%. Informedness, by contrast, removes such bias and yields 0 as the probability of an informed decision for any form of guessing (here, always guessing cat).
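The figures in the unbalanced example can be checked with a short Python sketch. It assumes cat is the positive class, so the always-cat classifier gives 95 true positives, 5 false positives, and no negatives of either kind. Informedness is taken here as sensitivity + specificity − 1, a definition this text does not spell out; the helper name `summary` is hypothetical.

```python
def summary(tp, fp, fn, tn):
    """Accuracy, sensitivity, specificity, F1 and informedness from four counts."""
    sensitivity = tp / (tp + fn)   # recognition rate for the positive class
    specificity = tn / (tn + fp)   # recognition rate for the negative class
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
        # Assumed definition: informedness = sensitivity + specificity - 1.
        "informedness": sensitivity + specificity - 1,
    }


# Always predicting "cat" on 95 cats and 5 dogs (cat = positive class).
result = summary(tp=95, fp=5, fn=0, tn=0)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Accuracy and F1 both look strong here, while specificity and informedness show that the classifier carries no information about the dog class.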
Assuming the confusion matrix above, its corresponding table of confusion, for the cat class, would be:
                   Actual class
                   Cat                  Noncat
Predicted class
  Cat              5 True Positives     2 False Positives
  Noncat           3 False Negatives    3 True Negatives
The final table of confusion would contain the average values for all classes combined.
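A per-class table of confusion can be derived from the full matrix by treating one class as positive and all others as negative. The Python sketch below does this for the cat/dog example and then averages the cells over the classes, which is one possible reading of "average values for all classes combined". The function name `table_of_confusion` and the nested-dict layout (`matrix[predicted][actual]`) are assumptions made for illustration.

```python
def table_of_confusion(matrix, positive):
    """Collapse matrix[predicted][actual] into TP/FP/FN/TN for one positive class."""
    classes = list(matrix)
    others = [c for c in classes if c != positive]
    return {
        "TP": matrix[positive][positive],
        # Predicted positive, but actually another class.
        "FP": sum(matrix[positive][a] for a in others),
        # Actually positive, but predicted as another class.
        "FN": sum(matrix[p][positive] for p in others),
        # Neither predicted nor actually positive.
        "TN": sum(matrix[p][a] for p in others for a in others),
    }


# Cat/dog example, rows = predicted class, columns = actual class.
matrix = {"cat": {"cat": 5, "dog": 2}, "dog": {"cat": 3, "dog": 3}}

cat_table = table_of_confusion(matrix, "cat")
dog_table = table_of_confusion(matrix, "dog")

# Cell-wise mean over the per-class tables (an assumed interpretation).
tables = [cat_table, dog_table]
average = {k: sum(t[k] for t in tables) / len(tables) for k in ("TP", "FP", "FN", "TN")}

print(cat_table)
print(dog_table)
print(average)
```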
Let us define an experiment from P positive instances and N negative instances for some condition. The four outcomes can be formulated in a 2×2 confusion matrix, as follows:
                                True condition
Total population                Condition positive               Condition negative
Predicted condition positive    True positive                    False positive, Type I error
Predicted condition negative    False negative, Type II error    True negative

Measures derived from the table:
Prevalence = Σ Condition positive / Σ Total population
Accuracy (ACC) = (Σ True positive + Σ True negative) / Σ Total population
Positive predictive value (PPV), Precision = Σ True positive / Σ Predicted condition positive
False discovery rate (FDR) = Σ False positive / Σ Predicted condition positive
False omission rate (FOR) = Σ False negative / Σ Predicted condition negative
Negative predictive value (NPV) = Σ True negative / Σ Predicted condition negative
True positive rate (TPR), Recall, Sensitivity, probability of detection, Power = Σ True positive / Σ Condition positive
False positive rate (FPR), Fall-out, probability of false alarm = Σ False positive / Σ Condition negative
False negative rate (FNR), Miss rate = Σ False negative / Σ Condition positive
Specificity (SPC), Selectivity, True negative rate (TNR) = Σ True negative / Σ Condition negative
Positive likelihood ratio (LR+) = TPR / FPR
Negative likelihood ratio (LR−) = FNR / TNR
Diagnostic odds ratio (DOR) = LR+ / LR−
F1 score = 2 · Precision · Recall / (Precision + Recall)
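The measures in the table above all follow from the four cell counts. As a rough sketch, the Python function below computes them for the cat-class table of confusion from the earlier example (5 true positives, 2 false positives, 3 false negatives, 3 true negatives). The function name `derived_measures` and the dictionary keys are hypothetical, and the code does not guard against zero denominators.

```python
def derived_measures(tp, fp, fn, tn):
    """Compute the derived measures from the four cells of a 2x2 confusion matrix.

    No guard against empty rows or columns: a zero denominator raises
    ZeroDivisionError.
    """
    total = tp + fp + fn + tn
    tpr = tp / (tp + fn)   # recall, sensitivity
    fpr = fp / (fp + tn)   # fall-out
    fnr = fn / (tp + fn)   # miss rate
    tnr = tn / (fp + tn)   # specificity
    ppv = tp / (tp + fp)   # precision
    lr_pos = tpr / fpr
    lr_neg = fnr / tnr
    return {
        "prevalence": (tp + fn) / total,
        "ACC": (tp + tn) / total,
        "PPV": ppv,
        "FDR": fp / (tp + fp),
        "FOR": fn / (fn + tn),
        "NPV": tn / (fn + tn),
        "TPR": tpr,
        "FPR": fpr,
        "FNR": fnr,
        "TNR": tnr,
        "LR+": lr_pos,
        "LR-": lr_neg,
        "DOR": lr_pos / lr_neg,
        "F1": 2 * ppv * tpr / (ppv + tpr),
    }


# Cat-class counts from the example table of confusion.
measures = derived_measures(tp=5, fp=2, fn=3, tn=3)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Complementary pairs such as PPV and FDR, or TPR and FNR, sum to 1, which gives a quick sanity check on any implementation.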
References
1. Fawcett, Tom (2006). "An Introduction to ROC Analysis" (PDF). Pattern Recognition Letters. 27 (8): 861–874. doi:10.1016/j.patrec.2005.10.010.
2. Powers, David M W (2011). "Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation" (PDF). Journal of Machine Learning Technologies. 2 (1): 37–63.
3. Ting, Kai Ming (2011). Encyclopedia of machine learning. Springer. ISBN 9780387301648.
4. Brooks, Harold; Brown, Barb; Ebert, Beth; Ferro, Chris; Jolliffe, Ian; Koh, Tieh-Yong; Roebber, Paul; Stephenson, David (2015-01-26). "WWRP/WGNE Joint Working Group on Forecast Verification Research". Collaboration for Australian Weather and Climate Research. World Meteorological Organisation. Retrieved 2019-07-17.
5. Stehman, Stephen V. (1997). "Selecting and interpreting measures of thematic classification accuracy". Remote Sensing of Environment. 62 (1): 77–89. doi:10.1016/S0034-4257(97)00083-7.