The language of the intellectual content of the element for which this is an attribute.
Typical values are the two-letter, lower-case language codes described by IETF RFC 5646, such as “fr” (French), “en” (English), “de” (German), and “zh” (Chinese). These values are NOT case sensitive, but current best practice uses all lower case. Values can be obtained from the IANA Language Subtag Registry: http://www.iana.org/assignments/language-subtag-registry.
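Because the values are not case sensitive, software that compares them should ignore case. The following is a minimal sketch of such a comparison; the function name `same_language` is hypothetical, and trimming surrounding whitespace is an assumption made here for robustness, not something this attribute definition requires.

```python
def same_language(a: str, b: str) -> bool:
    """Compare two @xml:lang values without regard to case.

    Surrounding whitespace is trimmed first (an assumption of this sketch).
    """
    return a.strip().lower() == b.strip().lower()


# "EN" and "en" name the same language; "fr" and "de" do not.
print(same_language("EN", "en"))
print(same_language("fr", "de"))
```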
Inheritance: The language value inherits down the tree, so an @xml:lang attribute names the language of the element and all its descendants, unless a descendant sets its own @xml:lang attribute. The default value of English (“en”) is set at the top-level element, and can be overridden there or anywhere lower in the document.
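The inheritance rule can be sketched with Python's standard `xml.etree.ElementTree` module. This is an illustration under stated assumptions, not a reference implementation: the helper name `effective_lang` is hypothetical, the element names in the sample document are chosen only for illustration, and the fallback of “en” mirrors the default described above.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_lang(root, target, default="en"):
    """Return the language in effect for `target`, an element under `root`.

    Walks down from the root, carrying the nearest ancestor's xml:lang.
    Returns None if `target` is not part of the tree.
    """
    def walk(elem, inherited):
        # An element's own xml:lang overrides whatever it inherited.
        current = elem.get(XML_LANG, inherited)
        if elem is target:
            return current
        for child in elem:
            found = walk(child, current)
            if found is not None:
                return found
        return None

    return walk(root, default)


# Illustrative document: the second child overrides the inherited value.
doc = ET.fromstring(
    '<article xml:lang="en">'
    "<title>Example</title>"
    '<trans-title xml:lang="fr">Exemple</trans-title>'
    "</article>"
)

print(effective_lang(doc, doc.find("title")))        # inherited from the root
print(effective_lang(doc, doc.find("trans-title")))  # set on the element itself
```

The top-down walk is used because `ElementTree` elements do not keep a reference to their parent; a library that does expose parents could climb from the target instead.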
Script and Language: In some languages, script codes are also critically important; for example, in Japanese, it is necessary to express whether a name is in Kanji or in Kana (Hiragana or Katakana) to determine sort keys. Best practice is to use the full language code plus script code as the value for @xml:lang. In our use of both language and script tagging as values for @xml:lang, we are following the IETF (Internet Engineering Task Force) best practice guideline: Network Working Group Request for Comments: 5646 [Tags for Identifying Languages, A. Phillips and M. Davis, Editors, September 2009]. That document defines a language tag as composed of (in part):
Some sample values of @xml:lang for Chinese and Serbian illustrate this complexity:
Thus, for example, the following are among the expected values of @xml:lang for Japanese, incorporating both a language (“ ja ”) and a script type:
Value | Meaning |
---|---|
An alphanumeric string, which may include hyphens | An abbreviation for a natural language (such as “en” for English or “de” for German) or for a language and a script (“ja-Kana”) |
Default value: en |
Value | Meaning |
---|---|
An alphanumeric string, which may include hyphens | An abbreviation for a natural language (such as “en” for English or “de” for German) or for a language and a script (“ja-Kana”) |
Restriction: This attribute may be specified if the element is used. |
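
A value that combines a language and a script, such as “ja-Kana” in the tables above, can be separated into its parts. The following is a simplified sketch, not a full RFC 5646 parser: the function name `split_lang_script` is hypothetical, and the code assumes a well-formed tag in which the script subtag is the first four-letter alphabetic subtag after the language. It does not handle private-use or grandfathered tags.

```python
def split_lang_script(value):
    """Split an @xml:lang value into (language, script or None).

    Simplified: treats the first subtag as the language and the first
    four-letter alphabetic subtag after it as the script.
    """
    parts = value.strip().split("-")
    language = parts[0].lower()
    script = None
    for sub in parts[1:]:
        if len(sub) == 1:
            # A single-character subtag starts an extension or private-use
            # sequence; no script subtag can follow it.
            break
        if len(sub) == 4 and sub.isascii() and sub.isalpha():
            # Script subtags are conventionally written in title case.
            script = sub.title()
            break
    return language, script


print(split_lang_script("ja-Kana"))
print(split_lang_script("en"))
```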