
Unicode Common Locale Data Repository (CLDR)

  • xiaofudong1
  • Mar 15, 2025
  • 2 min read

What is CLDR?

According to the Unicode CLDR Project, the Unicode Common Locale Data Repository (CLDR) provides "key building blocks for software to support the world’s languages, with the largest and most extensive standard repository of locale data available." It is widely used for software internationalization and localization, adapting common software functions to the conventions of each language and region.


What Can It Do?



Borrowing an example from the Unicode CLDR (Common Locale Data Repository) lecture, it can power case conversion, accent handling (where applicable), number formatting, date formatting, and more.


In that example, for English (United States), the birth date is shown as July 17, 1954. For German (Germany), CLDR data lets the web page convert the format to 17. Juli 1954. The same logic applies to the rest of the examples on the page.
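As a rough illustration of the date example, the sketch below uses the ECMAScript `Intl.DateTimeFormat` API, which in most JavaScript engines is backed by ICU and therefore by CLDR data. Choosing this API is an assumption made for illustration and does not come from the CLDR lecture; the `formatLongDate` helper and the `birthDate` constant are hypothetical names. The exact output depends on the locale data shipped with the runtime.

```typescript
// Format one date for two locales. The patterns and month names come from
// the locale data bundled with the runtime (typically CLDR via ICU).
const birthDate = new Date(Date.UTC(1954, 6, 17)); // July 17, 1954 (months are 0-based)

// Hypothetical helper: long date without weekday, pinned to UTC so the
// result does not shift with the machine's time zone.
function formatLongDate(date: Date, locale: string): string {
  return new Intl.DateTimeFormat(locale, {
    year: "numeric",
    month: "long",
    day: "numeric",
    timeZone: "UTC",
  }).format(date);
}

console.log(formatLongDate(birthDate, "en-US")); // July 17, 1954
console.log(formatLongDate(birthDate, "de-DE")); // 17. Juli 1954
```

Note that the calling code never spells out a date pattern; it only names the locale and the fields it wants, and the locale data decides the order, punctuation, and month name.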


The Unicode CLDR Project page lists further capabilities:


  • Locale-specific patterns for formatting and parsing: dates, times, timezones, numbers and currency values, measurement units,…

  • Translations of names: languages, scripts, countries and regions, currencies, eras, months, weekdays, day periods, time zones, cities, and time units, emoji characters and sequences (and search keywords),…

  • Language & script information: characters used; plural cases; gender of lists; capitalization; rules for sorting & searching; writing direction; transliteration rules; rules for spelling out numbers; rules for segmenting text into graphemes, words, and sentences; keyboard layouts…

  • Country information: language usage, currency information, calendar preference, week conventions,…

  • Validity: Definitions, aliases, and validity information for Unicode locales, languages, scripts, regions, and extensions,…
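To make the first item in the list concrete, here is a minimal sketch of locale-specific number and currency patterns, again using the ECMAScript `Intl` API as an assumed CLDR-backed consumer. The `amount` value and the `formatEuro` helper are hypothetical, and the output shown assumes a runtime with full locale data.

```typescript
// Illustrative only: the amount and the helper name are made up for this sketch.
const amount = 1234.5;

// Hypothetical helper: format a value as euros using the given locale's conventions.
function formatEuro(value: number, locale: string): string {
  return new Intl.NumberFormat(locale, {
    style: "currency",
    currency: "EUR",
  }).format(value);
}

// Grouping and decimal separators differ by locale.
console.log(new Intl.NumberFormat("en-US").format(amount)); // 1,234.5
console.log(new Intl.NumberFormat("de-DE").format(amount)); // 1.234,5

// Currency symbol placement differs as well.
console.log(formatEuro(amount, "en-US")); // €1,234.50
console.log(formatEuro(amount, "de-DE")); // 1.234,50 € (the space is a no-break space)
```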


Why Is It Needed?


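In English, you form the plural in most cases by simply adding an "s" to the end of the word. In Russian, however, the form of the noun depends on the number: numbers ending in 1 (but not 11) take one form, numbers ending in 2, 3, or 4 (but not 12, 13, or 14) take a second form, and numbers ending in 5 through 9 or 0, as well as 11 through 14, take a third.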


For example:

один доллар - one dollar

четыре доллара - four dollars

сто долларов - one hundred dollars


In some localization practices, the translator may just use parentheses to cover both forms, for example {MoneyAmount} dollar(s). However, for languages that have different plural rules for different numbers, it is not feasible for translators to list every possibility or to cover all scenarios with parentheses. In this case, CLDR can power a library such as ICU to select the correct plural form of the noun.
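The plural selection described here can be sketched with the ECMAScript `Intl.PluralRules` API, which exposes the CLDR plural categories (`one`, `few`, `many`, `other` for Russian). This is an assumed stand-in for what a library such as ICU does internally, not ICU's own message format; the `dollarForms` table and the `formatDollars` helper are hypothetical.

```typescript
// Hypothetical message table: one Russian form of "dollar" per CLDR plural category.
const dollarForms: Record<string, string> = {
  one: "доллар",    // 1, 21, 31, ...
  few: "доллара",   // 2-4, 22-24, ...
  many: "долларов", // 0, 5-20, 25-30, 100, ...
  other: "доллара", // fractional amounts such as 1.5
};

// The rules that map a number to a category come from the runtime's locale data.
const russianPlurals = new Intl.PluralRules("ru");

// Hypothetical helper: pick the noun form that matches the amount.
function formatDollars(count: number): string {
  const category = russianPlurals.select(count);
  return `${count} ${dollarForms[category]}`;
}

console.log(formatDollars(1));   // 1 доллар
console.log(formatDollars(4));   // 4 доллара
console.log(formatDollars(100)); // 100 долларов
console.log(formatDollars(11));  // 11 долларов
console.log(formatDollars(21));  // 21 доллар
```

With this approach the translator supplies one string per category, and the plural rules decide which one applies, so exceptions such as 11 and 21 need no special handling in the application code.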


How to Use It?


Most developers use CLDR indirectly, through software libraries such as ICU, Closure, or TwitterCLDR. These libraries commonly convert the CLDR data into a streamlined format that is convenient for the library to load and use.



