Latin script in Unicode
The Unicode Standard encodes a large number of characters belonging to the Latin script. As of version 12.0 of the standard, 1,366 characters in the following blocks are classified as belonging to the Latin script:
- Basic Latin, 0000–007F. This block corresponds to ASCII.
- Latin-1 Supplement, 0080–00FF
- Latin Extended-A, 0100–017F
- Latin Extended-B, 0180–024F
- IPA Extensions, 0250–02AF
- Spacing Modifier Letters, 02B0–02FF
- Phonetic Extensions, 1D00–1D7F
- Phonetic Extensions Supplement, 1D80–1DBF
- Latin Extended Additional, 1E00–1EFF
- Superscripts and Subscripts, 2070–209F
- Letterlike Symbols, 2100–214F
- Number Forms, 2150–218F
- Latin Extended-C, 2C60–2C7F
- Latin Extended-D, A720–A7FF
- Latin Extended-E, AB30–AB6F
- Alphabetic Presentation Forms (Latin ligatures), FB00–FB4F
- Halfwidth and Fullwidth Forms, FF00–FFEF
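The block ranges listed above can be turned into a small lookup table. The following is a minimal sketch in Python; the names `LATIN_BLOCKS` and `latin_block` are hypothetical and chosen only for illustration. Note that it reports block membership only: a block in this list can also contain characters that are not Latin script (the ASCII digits in Basic Latin, for example, have the script property Common).

```python
from bisect import bisect_right

# (start, end, block name), copied from the list above and sorted by start.
LATIN_BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x0180, 0x024F, "Latin Extended-B"),
    (0x0250, 0x02AF, "IPA Extensions"),
    (0x02B0, 0x02FF, "Spacing Modifier Letters"),
    (0x1D00, 0x1D7F, "Phonetic Extensions"),
    (0x1D80, 0x1DBF, "Phonetic Extensions Supplement"),
    (0x1E00, 0x1EFF, "Latin Extended Additional"),
    (0x2070, 0x209F, "Superscripts and Subscripts"),
    (0x2100, 0x214F, "Letterlike Symbols"),
    (0x2150, 0x218F, "Number Forms"),
    (0x2C60, 0x2C7F, "Latin Extended-C"),
    (0xA720, 0xA7FF, "Latin Extended-D"),
    (0xAB30, 0xAB6F, "Latin Extended-E"),
    (0xFB00, 0xFB4F, "Alphabetic Presentation Forms"),
    (0xFF00, 0xFFEF, "Halfwidth and Fullwidth Forms"),
]

_STARTS = [start for start, _, _ in LATIN_BLOCKS]


def latin_block(ch):
    """Return the name of the listed block containing ch, or None.

    This checks block membership only, not the Unicode script property.
    """
    cp = ord(ch)
    # Find the last block whose start is <= cp, then check its end.
    i = bisect_right(_STARTS, cp) - 1
    if i >= 0:
        start, end, name = LATIN_BLOCKS[i]
        if cp <= end:
            return name
    return None
```

For example, `latin_block("\u1ebf")` (the Vietnamese letter ế) returns `"Latin Extended Additional"`, while `latin_block("\u03a9")` (Greek capital omega) returns `None`.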
In addition, a number of Latin-like characters are encoded in the Currency Symbols, Control Pictures, CJK Compatibility, Enclosed Alphanumerics, Enclosed CJK Letters and Months, Mathematical Alphanumeric Symbols, and Enclosed Alphanumeric Supplement blocks. Although they look like Latin letters, they have the script property Common, and so do not belong to the Latin script in Unicode terms. The Lisu script also consists almost entirely of Latin letterforms but has its own script property.
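Python's standard `unicodedata` module does not expose the script property itself, so the sketch below illustrates a related point instead: many of these Latin-like characters are compatibility variants that fold to Basic Latin letters under NFKC normalization, whereas a Lisu letter is an independent character and does not. The helper name `compat_fold` is hypothetical, and the three sample characters are chosen here for illustration.

```python
import unicodedata


def compat_fold(ch):
    """Apply NFKC (compatibility) normalization to a single character."""
    return unicodedata.normalize("NFKC", ch)


samples = [
    "\u24b6",      # CIRCLED LATIN CAPITAL LETTER A (Enclosed Alphanumerics)
    "\U0001d400",  # MATHEMATICAL BOLD CAPITAL A (Mathematical Alphanumeric Symbols)
    "\ua4d0",      # LISU LETTER BA (Lisu block, resembles Latin B)
]

for ch in samples:
    folded = compat_fold(ch)
    status = "folds to" if folded != ch else "is unchanged:"
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} {status} {folded!r}")
```

Folding of this kind says nothing about the script property; it only shows that the enclosed and mathematical forms are defined as styled variants of ASCII letters, while the Lisu letter has no such mapping.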
The extended blocks consist mainly of precomposed letters with diacritics, which can equivalently be encoded as a base letter followed by combining diacritics, together with some ligatures. These characters serve the orthographies of various African languages (including click letters in Latin Extended-B) and the Vietnamese alphabet (Latin Extended Additional). Latin Extended-C contains additions for Uighur and the Claudian letters. Latin Extended-D comprises characters that are mostly of interest to medievalists. Latin Extended-E mostly comprises characters used for German dialectology (Teuthonista).
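The equivalence between precomposed letters and combining sequences can be checked with Unicode normalization. The following is a minimal sketch using Python's standard `unicodedata` module; the helper names `code_points` and `canonically_equal` are hypothetical, and the Vietnamese letter ế (U+1EBF, in Latin Extended Additional) is used as the example.

```python
import unicodedata


def code_points(s):
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(c):04X}" for c in s]


def canonically_equal(a, b):
    """True if two strings are canonically equivalent (compared in NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


precomposed = "\u1ebf"  # LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE

# NFD splits the letter into a base letter plus two combining marks.
decomposed = unicodedata.normalize("NFD", precomposed)
print(code_points(decomposed))  # ['U+0065', 'U+0302', 'U+0301']

# NFC recombines the sequence into the single precomposed code point.
recomposed = unicodedata.normalize("NFC", decomposed)
print(code_points(recomposed))  # ['U+1EBF']
```

Text comparison that should treat both encodings as the same letter needs to normalize first, since `"\u00e9" == "e\u0301"` is false as a raw string comparison even though `canonically_equal("\u00e9", "e\u0301")` is true.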
Table of characters
In this table, characters with the Unicode script property Latin are highlighted in colour according to the version of Unicode in which they were introduced. Reserved code points (which may be assigned as characters at a future date) have a grey background. Characters that do not belong to the Latin script have a white background, and the version of Unicode in which they were introduced is therefore not indicated.
|Legend: Unicode version|
|Unicode 1.0||Unicode 5.0|
|Unicode 1.1||Unicode 5.1|
|Unicode 2.0||Unicode 5.2|
|Unicode 2.1||Unicode 6.0|
|Unicode 3.0||Unicode 6.1|
|Unicode 3.1||Unicode 7.0|
|Unicode 3.2||Unicode 8.0|
|Unicode 4.0||Unicode 9.0|
|Unicode 4.1||Unicode 11.0|
|Not Latin script||Reserved|
- Everson, Michael; Dicklberger, Alois; Pentzlin, Karl; Wandl-Vogt, Eveline (2011-06-02). "Revised proposal to encode 'Teuthonista' phonetic characters in the UCS" (PDF).