Locales and the CLDR
The Unicode Common Locale Data Repository (CLDR) is a comprehensive set of locale data, which PHP makes available through its intl extension, a wrapper around the ICU library.
While HTML supports Unicode through UTF-8, JavaScript support has lagged behind, particularly for Unicode normalisation and named categories in regular expressions, which older browsers lack. These shortfalls mean that all processing and verification of entered text has to be done on the backend before the completed version is presented back to the user, rather than many of the checks being done in the browser.
When non-ASCII characters are entered, some may arrive as a sequence of several codepoints even though a single-codepoint version also exists. Normalisation to the composed form converts such sequences to their single-codepoint equivalents. Regex categories allow a related set of characters, such as all uppercase letters in all languages, to be specified with one short, unchanging name, rather than with a complete set of codepoint ranges that would have to change as more scripts are added.
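As a minimal sketch of both ideas on the backend, the following uses the intl extension's Normalizer class and PCRE's Unicode property escapes. The helper names normaliseInput() and isCapitalisedWord() are hypothetical, chosen only for illustration, and the "capitalised word" rule is an assumed example of a check rather than one taken from the product.

```php
<?php
// Requires the intl extension for the Normalizer class.

/**
 * Hypothetical helper: bring user input to NFC, so that a base letter
 * followed by a combining mark becomes the single precomposed codepoint
 * where one exists.
 */
function normaliseInput(string $text): string
{
    $normalised = Normalizer::normalize($text, Normalizer::FORM_C);
    // normalize() returns false on failure (e.g. invalid UTF-8);
    // this sketch simply falls back to the raw input in that case.
    return $normalised === false ? $text : $normalised;
}

/**
 * Hypothetical helper: true when the text is one uppercase letter followed
 * only by letters or combining marks, in any script.
 */
function isCapitalisedWord(string $text): bool
{
    // \p{Lu} = uppercase letter, \p{L} = any letter, \p{M} = combining mark.
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    return preg_match('/^\p{Lu}[\p{L}\p{M}]*$/u', $text) === 1;
}

$decomposed = "E\u{0301}cole";              // "E" + COMBINING ACUTE ACCENT
$composed   = normaliseInput($decomposed);

var_dump($composed === "\u{00C9}cole");     // bool(true): one codepoint for the accented E
var_dump(isCapitalisedWord($composed));     // bool(true)
var_dump(isCapitalisedWord("\u{00E9}cole")); // bool(false): starts with a lowercase letter
```

The pattern never lists codepoint ranges, so it should keep working unchanged as the Unicode data behind PCRE gains new scripts.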
HTML also has limited language support for alphabetic list bullets, with some browsers supporting more than others. The CLDR is produced by the Unicode Consortium itself, making for reliable long-term locale support, so the product had to generate its bullets with the
A side effect of generating the list item characters was that they could be made into links back to the first character of the introduction, making it easier for those relying on keyboards to navigate around lists.
The CLDR allows for quite sophisticated locale support, but I tended to limit how much of it I used. The main advantage is that the CLDR data comes with the ICU library that PHP's intl extension is built on, so as that library is updated, more locales are included automatically. The principal functional uses are formatting dates, times and numbers, though I allow the Hindu-Arabic digits (0-9) to override native numerals on a per-locale basis. This suits the Western Arabic locales, which use the former, while the rest of the Arab world uses native digits.
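The formatting and the digit override can be sketched as follows. This is one possible approach, not necessarily how the product does it: it assumes the override is expressed through ICU's "@numbers=latn" locale keyword, and the helper names formatNumber() and formatDateTime() are hypothetical. The locales shown are arbitrary examples.

```php
<?php
// Requires the intl extension.

/**
 * Hypothetical helper: format a number for a locale, optionally forcing
 * the digits 0-9 through the "@numbers=latn" locale keyword.
 * Assumes $locale is a plain ID such as "ar_EG" with no keywords of its own.
 */
function formatNumber(float $value, string $locale, bool $forceLatinDigits = false): string
{
    if ($forceLatinDigits) {
        $locale .= '@numbers=latn';
    }
    $formatter = new NumberFormatter($locale, NumberFormatter::DECIMAL);
    return $formatter->format($value);
}

/**
 * Hypothetical helper: medium-length date and short time in the
 * conventions of the given locale.
 */
function formatDateTime(DateTimeInterface $when, string $locale): string
{
    $formatter = new IntlDateFormatter(
        $locale,
        IntlDateFormatter::MEDIUM,
        IntlDateFormatter::SHORT,
        $when->getTimezone()
    );
    return $formatter->format($when);
}

$native  = formatNumber(1234.5, 'ar_EG');        // digits chosen by the locale data
$western = formatNumber(1234.5, 'ar_EG', true);  // digits 0-9

echo $native, PHP_EOL, $western, PHP_EOL;

$when = new DateTimeImmutable('2024-03-01 14:30', new DateTimeZone('UTC'));
echo formatDateTime($when, 'fr_FR'), PHP_EOL;
```

The exact strings produced depend on the CLDR data in the installed ICU library, which is why no expected output is shown here.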
The CLDR does provide very sophisticated plural (cardinality) rendering for numbers combined with the things they count. While I did try to use it at the start, it was easier, and perhaps rendered better, to present values in the form Name: number or Name (unit): number. Languages evolve, and while the grammar for rendering cardinality may change over time, especially as more languages adopt gender-neutral terms, the simplified and unified presentation of values is unlikely to need changing.
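To illustrate the trade-off, here is a small sketch contrasting a CLDR plural pattern, via the intl extension's MessageFormatter, with the simplified label layout. The helper labelledValue() and the sample labels are hypothetical, invented for this example.

```php
<?php
// Requires the intl extension.

// CLDR plural rules through an ICU message pattern. Every locale needs its
// own translated pattern, using the plural categories that language has.
$pattern = '{0, plural, one {# file} other {# files}}';
echo MessageFormatter::formatMessage('en', $pattern, [1]), PHP_EOL; // 1 file
echo MessageFormatter::formatMessage('en', $pattern, [3]), PHP_EOL; // 3 files

/**
 * Hypothetical helper for the simplified layout: "Name: number" or
 * "Name (unit): number". Only the number itself is locale-formatted,
 * so no per-language plural grammar is involved.
 */
function labelledValue(string $name, float $value, string $locale, ?string $unit = null): string
{
    $formatter = new NumberFormatter($locale, NumberFormatter::DECIMAL);
    $label = $unit === null ? $name : sprintf('%s (%s)', $name, $unit);
    return sprintf('%s: %s', $label, $formatter->format($value));
}

echo labelledValue('Files', 3, 'en'), PHP_EOL;              // Files: 3
echo labelledValue('Distance', 12.5, 'en', 'km'), PHP_EOL;  // Distance (km): 12.5
```

With the simplified form, a translator only supplies the label and the unit, and the number formatting is left entirely to the locale data.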
Choosing any sophisticated technology means deciding whether to keep up with what is changing rapidly in it, or to find ways to decouple from such dependencies. Those who design on the cutting edge of technology force themselves and their clients into a race to catch up. Making design choices that do not depend on it avoids that rat race and makes things easier all round. While I might be interested in advanced technologies, I do not want to force my end customers to be as well, especially since the target audiences are not the type to be.