# Analysis of similarities

Analysis of similarities (ANOSIM) is a non-parametric statistical test widely used in the field of ecology. The test was first suggested by K. R. Clarke[1] as an ANOVA-like test that operates on a ranked dissimilarity matrix instead of on raw data.

Given a matrix of rank dissimilarities between a set of samples, each belonging to exactly one treatment group, ANOSIM tests whether we can reject the null hypothesis that the similarity between groups is greater than or equal to the similarity within the groups.

The test statistic R is calculated in the following way:

${\displaystyle R={\frac {r_{B}-r_{W}}{M/2}}}$

where $r_B$ is the average of rank similarities of pairs of samples (or replicates) originating from different sites, $r_W$ is the average of rank similarities of pairs of replicates within sites, and $M = n(n-1)/2$ is the total number of sample pairs, where $n$ is the number of samples.
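The calculation above can be sketched in Python using only the standard library. This is an illustrative sketch, not a reference implementation: the function names `average_ranks` and `anosim_r` are hypothetical, the input is assumed to be a symmetric dissimilarity matrix given as nested lists, and the ranking convention is assumed to assign rank 1 to the smallest dissimilarity (the most similar pair), with tied values sharing their mean rank.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(dist, groups):
    """Compute R = (r_B - r_W) / (M / 2) from a square dissimilarity matrix.

    Assumes at least one within-group pair and one between-group pair exist.
    """
    n = len(groups)
    pairs = list(combinations(range(n), 2))  # the M = n(n - 1)/2 sample pairs
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    r_w = sum(within) / len(within)
    r_b = sum(between) / len(between)
    m = n * (n - 1) / 2
    return (r_b - r_w) / (m / 2)


# Made-up example: two sites with two replicates each, clearly separated.
dist = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.6],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.6, 0.2, 0.0],
]
groups = ["A", "A", "B", "B"]
r_value = anosim_r(dist, groups)
```

In this made-up example the two within-site pairs receive ranks 1 and 2 and the four between-site pairs receive ranks 3 to 6, so $r_W = 1.5$, $r_B = 4.5$, $M = 6$ and $R = (4.5 - 1.5)/3 = 1$.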

The test statistic R is constrained between the values −1 and 1. Positive values suggest more similarity within sites than between sites, and values close to zero indicate no difference between the between-site and within-site similarities. Negative R values suggest more similarity between sites than within sites and may raise the possibility of wrong assignment of samples to sites.

For the purpose of hypothesis testing, where the null hypothesis is that the similarities within sites are smaller than or equal to the similarities between sites, the R statistic is usually compared to a set of R values obtained by randomly shuffling the site labels among the samples and recalculating R, repeated many times. The proportion of permuted R values that are greater than or equal to the actual R is the p-value for the actual R statistic.
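The permutation procedure can be sketched as follows, again in standard-library Python. The names `rank_pairs`, `r_statistic` and `anosim_permutation_test` are hypothetical, as are the default of 999 permutations and the fixed seed. The sketch also assumes a common convention in which the observed labelling is counted as one of the permutations, so the p-value is (count + 1) / (permutations + 1) and can never be exactly zero.

```python
import random
from itertools import combinations


def rank_pairs(dist):
    """Return all sample pairs and the mean-tie ranks of their dissimilarities."""
    pairs = list(combinations(range(len(dist)), 2))
    values = [dist[i][j] for i, j in pairs]
    # Mean rank of x = (number of smaller values) + (number of equal values + 1) / 2
    ranks = [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]
    return pairs, ranks


def r_statistic(pairs, ranks, groups):
    """R = (r_B - r_W) / (M / 2), where M is the number of pairs."""
    within = [r for (i, j), r in zip(pairs, ranks) if groups[i] == groups[j]]
    between = [r for (i, j), r in zip(pairs, ranks) if groups[i] != groups[j]]
    r_w = sum(within) / len(within)
    r_b = sum(between) / len(between)
    return (r_b - r_w) / (len(pairs) / 2)


def anosim_permutation_test(dist, groups, n_permutations=999, seed=0):
    """Return the observed R and a one-sided permutation p-value."""
    rng = random.Random(seed)
    pairs, ranks = rank_pairs(dist)  # ranks are fixed; only labels are shuffled
    observed = r_statistic(pairs, ranks, groups)
    labels = list(groups)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(labels)  # shuffling keeps the group sizes unchanged
        if r_statistic(pairs, ranks, labels) >= observed:
            at_least_as_large += 1
    # Assumed convention: the observed labelling counts as one permutation.
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up example: six samples on a line, three per site, well separated.
points = [0.0, 1.0, 2.5, 10.0, 11.5, 13.5]
dist = [[abs(a - b) for b in points] for a in points]
groups = ["A", "A", "A", "B", "B", "B"]
observed_r, p_value = anosim_permutation_test(dist, groups)
```

With only six samples in two groups of three there are just 20 distinct ways to assign the labels, so even a perfectly separated data set cannot reach a very small p-value. This illustrates why the number of replicates per site limits the power of the test.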

ANOSIM and NMDS (non-metric multidimensional scaling) both work on ranked dissimilarities, so the two methods go hand in hand. Combining them complements the visualisation of group differences with significance testing.[2]

ANOSIM is implemented in several statistical software packages, including PRIMER, the R package vegan, and PAST.

## References

1. Clarke, K. R. (1993). "Non-parametric multivariate analyses of changes in community structure". Austral Ecology. 18 (1): 117–143. doi:10.1111/j.1442-9993.1993.tb00438.x. ISSN 1442-9985.
2. Buttigieg, Pier Luigi; Ramette, Alban (2014). "A guide to statistical analysis in microbial ecology: a community-focused, living review of multivariate data analyses". FEMS Microbiology Ecology. 90 (3): 543–550. doi:10.1111/1574-6941.12437. ISSN 0168-6496. PMID 25314312.