<content-language> Content Language

Metadata for a document that identifies the primary language(s) used in the document.

Usage/Remarks

Best Practice

The <content-language> element should appear once for each primary language used in the text of a multi-lingual document. For Best Practice, the content of <content-language> should be the two-letter ISO 639 code for the language, for example, “en” for English, “de” for German, or “es” for Spanish.
In addition:
  • There is no value in using <content-language> on a mono-lingual document.
  • The use of a language code value for @xml:lang on a top-level element strongly implies a mono-lingual document.
In conjunction with @xml:lang
For multi-lingual documents, the @xml:lang attribute may be omitted from the top document-level element, or the document may use the three-letter ISO 639-2 code “mul” (written xml:lang="mul"), indicating that multiple primary languages are used.
This tag set is agnostic on how “primary” is defined, leaving that decision to each producer. However, the intent of this element is to record the principal languages used in a multi-lingual document, not to state that a few quotations in another language occur in an essentially mono-lingual document.
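As an illustration, a multi-lingual book whose primary languages are English and German might be tagged along the following lines. This is a minimal sketch, not a complete or valid BITS document: the placement inside <book-meta> is an assumption based on the statement that the element sits in the metadata of a book or book-part, and all other metadata is elided.

```xml
<!-- Sketch only: required metadata and the book body are omitted. -->
<book xml:lang="mul">
  <book-meta>
    <!-- other book metadata elided -->
    <!-- one <content-language> per primary language (placement assumed) -->
    <content-language>en</content-language>
    <content-language>de</content-language>
  </book-meta>
</book>
```

Here the value "mul" on the top-level element signals that no single language applies, and the two <content-language> elements name the primary languages explicitly.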
Related Elements
How to Tag the Language: In BITS, there are two ways to describe the natural language of the content of a document:
  • XML Lang Language Attribute: The @xml:lang attribute can be put on many elements to indicate the language of that element and its descendants. This is an inherited value, so that element and all of its descendants will be in the named language, unless specifically overridden with another @xml:lang attribute. Thus a language code on the top-level element of a document (in the case of BITS, <book> or <book-part-wrapper>) names the only primary language in a mono-lingual document or can be the value "mul" to indicate that the document has multiple primary languages.
  • Content Language Element: The <content-language> element, in the metadata of a book or book-part, identifies the primary language(s) used in the document. The element appears once for each primary language used in the document. For Best Practice, the <content-language> content should be the two-letter ISO 639 code for the language, for example, “en” for English, “de” for German, or “es” for Spanish.
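The inheritance behavior of @xml:lang can be sketched with a fragment such as the one below. It is illustrative only: the intervening structural elements between <book> and <sec> are omitted, and it assumes that <sec> is among the elements that accept @xml:lang.

```xml
<!-- Sketch only: not a complete BITS document; ancestors of <sec> are elided. -->
<book xml:lang="en">
  <!-- metadata and intervening structure elided -->
  <sec>
    <!-- no @xml:lang here, so this section inherits "en" from <book> -->
    <title>Introduction</title>
    <p>This paragraph is in English.</p>
  </sec>
  <sec xml:lang="de">
    <!-- explicit override: this section and its descendants are German -->
    <title>Einleitung</title>
    <p>Dieser Absatz ist auf Deutsch.</p>
  </sec>
</book>
```

Whether a document like this counts as mono-lingual with a passage in another language, or as multi-lingual and therefore a candidate for <content-language>, is left to the producer's definition of “primary”.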
Attributes

Base Attributes

Models and Context
May be contained in
Description
Text, numbers, or special characters, zero or more
Content Model
<!ELEMENT  content-language  
                        (#PCDATA %content-language-elements;)*       >
Expanded Content Model

(#PCDATA)*