Tagging Personal Names

In the metadata of books and book components, the personal names of authors, editors, translators, and other contributors can all be tagged using either the element <name> or the element <string-name>. Both <name> and <string-name> can identify the surname (family name or last name) and given names (first names and middle names) of a person.
Unlike nearly all elements in this Tag Set, the <name> element requires a specific sequence and may not contain spaces (more properly, it will not retain spaces if they are present). The required order for a personal name is:
  • First, one of the following: a <surname>, optionally followed by <given-names>, or <given-names> alone;
  • Next (optionally), a <prefix> such as a formal title (“Senator”);
  • Next (optionally), a <suffix> such as a lineage distinguisher (“Jr.”, “Sr.”, etc.);
as illustrated here:
<name>
  <surname>Jones-Smythe</surname>
  <given-names>Johnathan Irving Browning</given-names>
  <prefix>The Honorable</prefix>
  <suffix>III</suffix>
</name>
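As a further sketch of the same order, the names below show other sequences the model permits: a surname alone, given names alone, and a surname with given names and a suffix but no prefix. The people named here are invented for illustration.
<name>
  <surname>Okafor</surname>
</name>


<name>
  <given-names>Lobsang</given-names>
</name>


<name>
  <surname>Whitfield</surname>
  <given-names>Marcus A</given-names>
  <suffix>Jr.</suffix>
</name>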

String Names

The <string-name> element is a container for personal names where the stricter organization of the <name> element cannot be followed or is not appropriate for other reasons. This is a very loose element, which may contain text, numbers, and special characters as well as any or all of the naming elements, such as <surname> and <given-names>. A <string-name> can be used to hold sort versions of the name, the full name in display order (for example, as a byline that does not require recombining the components), the name components with punctuation and spacing between them, a name where family versus given cannot be determined, etc. In the example below, notice that the <string-name> contains a “comma-space” as well as the <surname> and <given-names> elements:
<string-name>
  <surname>Lincoln</surname>, <given-names>Abraham</given-names>
</string-name>
String names may also be partially tagged, identifying only some of the name components:
<string-name>The Honorable Johnathan Irving 
  Browning <surname>Jones-Smythe</surname>, III
</string-name>

Spaces and Punctuation in Names

The model for <name> does not allow spaces or punctuation between the surname, given names, prefix, or suffix components. Even if one or more spaces exist in the XML source file, whitespace that occurs only between the child elements is not treated as content and is discarded in processing. In other words, in the following XML examples, there will be no space between the surname, the given names, and the suffix. The <name> element cannot preserve these spaces.
<name>
  <surname>Petitti</surname>  <given-names>DB</given-names> <suffix>III</suffix>
</name>


<name>  <surname>Petitti</surname> <given-names>DB</given-names>   </name>
In contrast with <name>, the element <string-name> allows, but does not require, spacing and punctuation within the name:
<string-name>
  <surname>Jefferson</surname>,   <given-names>T</given-names>.
</string-name>


<string-name>
  <surname>Jefferson</surname><given-names>T</given-names>.
</string-name>
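Because <name> cannot hold spacing or punctuation, a display application has to supply them when it assembles the name. The XSLT 1.0 fragment below is a minimal sketch of that idea and is not part of this Tag Set; the inverted “surname, given names, suffix” output format is an assumption chosen for illustration.
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Illustrative only: write a <name> as "Surname, Given-names, Suffix",
       adding the commas and spaces that <name> itself cannot retain. -->
  <xsl:template match="name">
    <xsl:value-of select="surname"/>
    <!-- Separator only when both components are present. -->
    <xsl:if test="surname and given-names">
      <xsl:text>, </xsl:text>
    </xsl:if>
    <xsl:value-of select="given-names"/>
    <xsl:if test="suffix">
      <xsl:text>, </xsl:text>
      <xsl:value-of select="suffix"/>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
Applied to the Petitti <name> examples earlier in this section, this template would write “Petitti, DB, III” when the suffix is present and “Petitti, DB” when it is not, regardless of how the source file was spaced.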

Multi-part Names

Many names have multiple parts, and care should be taken with multi-part names to divide the components into family names (<surname>) and personal names (<given-names>) in a culturally appropriate fashion. The Tag Suite cannot give guidance on how to divide names, but it enables most cultural variations. For example:
<surname>Llanos De La Torre Quiralte</surname>
<given-names>M</given-names>    

<surname>Sánchez Mendoza</surname>   
<given-names>Josquin</given-names>    

<surname>Las Heras</surname>   
<given-names>Juan Fernando</given-names>    

<surname>Lapeyre</surname>   
<given-names>Kenneth Pritchard Carnu</given-names> 

<surname>Ben Gurion</surname>   
<given-names>David</given-names>    

<surname>de la Mare</surname>   
<given-names>Walter John</given-names>

<surname>Toulouse-Lautrec-Monfa, de</surname>   
<given-names>Henri Marie Raymond</given-names>

<name name-style="eastern">
<surname>Zhou</surname>   
<given-names>Xun-Ze</given-names>
</name>    
 
<name name-style="eastern">
<surname>Chou</surname>   
<given-names>Hsun-Tse</given-names>
</name>    
 
<name name-style="eastern">
<surname>Si-Ma</surname>   
<given-names>Mary-Sue</given-names>
</name>    
 
<name name-style="eastern">
<surname>SI-MA</surname>   
<given-names>Mary-Sue</given-names>
</name>    
 
<name>
<given-names>Cai-Rang</given-names>
</name>

Multiple Versions of a Name

Anywhere a person’s name can be used, inside both the metadata and bibliographic citations, this Tag Set allows more than one version of that name to be provided. The <name-alternatives> element is intended to collect multiple versions of a single name without multiplying the number of names. (Three versions of one contributor’s name are not the same as three different contributors.) The element <name-alternatives> works similarly to the <alternatives> construction for graphics, allowing multiple name variations to be linked together as processing alternatives for a single name. How multiple versions of a single name are processed is up to the application. The @specific-use, @content-type, and @xml:lang attributes can be used to distinguish the cases for separate processing. For example, the following names are distinguished by language:
        
...
<contrib contrib-type="author">
<name-alternatives>
<name name-style="eastern" xml:lang="ja-Jpan">
<surname>中西</surname>
<given-names>秀彦</given-names>
</name>
<name name-style="western" xml:lang="en">
<surname>Nakanishi</surname>
<given-names>Hidehiko</given-names>
</name>
<name name-style="eastern" xml:lang="ja-Kana">
<surname>ナカニシ</surname>
<given-names>ヒデヒコ</given-names>
</name>
</name-alternatives>
<xref ref-type="aff" rid="aff2">&ast;&ast;</xref>
</contrib>
...
The <name-alternatives> element can be used to record:
  • A name in multiple languages (For example, a name in Korean or Chinese-Han characters and a transliterated version of the same name in the Latin alphabet);
  • A name in multiple language/script combinations (For example, a name in Japanese [xml:lang="ja-Jpan" for Han + Hiragana + Katakana] and the same name written in Kanji [xml:lang="ja-Hani"]);
  • An alternate name for sorting or searching (For example, a name in French with accented letters (such as an “é”) and a plain-letter lower-ASCII version of the same name with “é” replaced by “e” for sorting. The @specific-use attribute can be used to indicate that the ASCII version is only for “sort”, not for display.);
  • An alternate name for indexing (For example, an XML database may need to record all the name variants found for an individual from “President Thomas Jefferson” to “Long Tom”, with @specific-use used to mark “primary” and “index”.); or
  • Both validated and known-to-be-incorrect names (For example, in the PubMed DTD, there is an attribute called “ValidYN” [valid yes or no], that can be used to record the fact that one version of a name was received, found to be in error and corrected. Only the corrected version should be displayed, but both name variants may be used for searching.).
The name variants inside a <name-alternatives> element within the document metadata do not take a unique identifier (@id) because they all represent the same person. This Tag Set assumes that any necessary unique identifier will be placed on the enclosing element (such as the <contrib> element or the <principal-investigator> element) that contains the name alternatives.
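A sketch of how these pieces might fit together is shown here: the identifier sits on the enclosing <contrib>, and a plain-ASCII variant is marked for sorting only. The name, the @id value, and the @specific-use values “primary” and “sort” are illustrative assumptions, not values required by this Tag Set.
<contrib contrib-type="author" id="contrib-3">
  <name-alternatives>
    <name specific-use="primary" xml:lang="fr">
      <surname>Lefèvre</surname>
      <given-names>Hélène</given-names>
    </name>
    <name specific-use="sort">
      <surname>Lefevre</surname>
      <given-names>Helene</given-names>
    </name>
  </name-alternatives>
</contrib>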
String Name: Both <name> and <string-name> are allowed inside <name-alternatives>. Within a <name-alternatives> grouping, the element <string-name> can be used, for example, to hold an undifferentiated transliteration (that is, one not tagged with specific name elements such as <surname>) or a search-specific name. In the following examples, the <string-name> element is used for a byline-style uninverted display, for an abbreviated form of the name, or for both.
<name-alternatives>
  <string-name specific-use="display">José del Pozo García</string-name>
  <name specific-use="primary" name-style="western">
    <surname>del Pozo García</surname>
    <given-names>José</given-names>
  </name>
  <string-name specific-use="abbrev-form">Pozo Garcia J del</string-name>
</name-alternatives>


<name-alternatives>
  <string-name specific-use="display">PM Sudha</string-name>
  <name specific-use="primary">
    <given-names initials="PM">Sudha</given-names>
  </name>
  <string-name specific-use="abbrev-form">Sudha PM</string-name>
</name-alternatives>


<name-alternatives>
  <name content-type="formal-name" xml:lang="fr">
    <surname>Giscard d'Estaing</surname>   
    <given-names>Valéry Marie René Georges</given-names>
  </name>
  <name content-type="common-name" xml:lang="fr">
    <surname>Giscard d'Estaing</surname>
    <given-names>Valéry</given-names>
  </name>
  <string-name specific-use="abbrev-form">Giscard d'Estaing V</string-name>
</name-alternatives>
Best Practice Caveat: Before this release of this Tag Set, the element <string-name> was not permitted within metadata because this Tag Set strives to regularize names. The element <string-name> is now allowed as one of the alternatives inside a <name-alternatives> grouping. Inside a <name-alternatives> grouping, the element <string-name> should not be used for the primary name, but only to support name variants for such purposes as indexing, searching, or an undifferentiated transliteration. The element <name> should always be used for the primary name.

Name Display Order

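The problem of eastern versus western display of names (for example, Toshiro Mifune versus Mifune Toshiro) can be addressed using the @name-style attribute. The @name-style attribute can record the preferred display order for the name, typically to make the distinction between eastern and western display order. Name ordering information can be used for choosing an inversion algorithm for sorting, for ordering the names for display, or for other processing functions. The values of the @name-style attribute and their approximate meanings are:
  • “western”: the given names are displayed before the surname, and the name sorts by surname;
  • “eastern”: the surname is displayed before the given names, and the name sorts by surname;
  • “islensk”: Icelandic style, in which the given names are displayed before the patronymic or matronymic, and the name sorts by given names;
  • “given-only”: the person has a single given name, which is used for both display and sorting.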
On the whole, this Tag Set can encode many, perhaps most, of the name variations found in the world. Both given names and surnames can be multiple words; there is no need to separate given names into first and middle names. Articles can be kept at the front of a name, or relegated to the rear following a comma. The element <string-name> is available for those who choose not to name a surname or given name, or for the cases where this distinction does not exist or cannot be determined.
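As one way an application might act on the @name-style attribute described above, the XSLT 1.0 fragment below writes eastern-style names surname first and all other names given names first. It is a minimal sketch and not part of this Tag Set; treating every value other than “eastern” (including an absent attribute) as western order is an assumption of the sketch.
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Illustrative only: choose the display order from @name-style. -->
  <xsl:template match="name">
    <xsl:choose>
      <!-- Eastern order: surname, space, given names. -->
      <xsl:when test="@name-style = 'eastern'">
        <xsl:value-of select="surname"/>
        <xsl:if test="surname and given-names">
          <xsl:text> </xsl:text>
        </xsl:if>
        <xsl:value-of select="given-names"/>
      </xsl:when>
      <!-- Assumed fallback: given names, space, surname. -->
      <xsl:otherwise>
        <xsl:value-of select="given-names"/>
        <xsl:if test="surname and given-names">
          <xsl:text> </xsl:text>
        </xsl:if>
        <xsl:value-of select="surname"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
With this template, a <name> holding the surname “Mifune” and the given name “Toshiro” would be written “Mifune Toshiro” when @name-style is “eastern” and “Toshiro Mifune” otherwise.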
There are some areas where this Tag Set cannot provide complete advice, and each Tag Set user must set business rules of its own. These include:
  • How to recognize/differentiate surnames from given names;
  • How to handle all single names (It is usually best practice to tag most westernized single names (“Pele”, “Cher”, and “Ice Cube”) as <surname> elements. Tibetan, Burmese, and Indian single names that are not surnames may be tagged as <given-names>.); and
  • How to treat the article portions of both surnames and given names (such as “de”, “Del”, “Las”, “de la”, etc.). For example, whether “Rudolpho Del Pozo Garcia” (who may also be known as “Rudolpho del Pozo García”) sorts as an initial “P” or as an initial “D” is a business, not a Tag Set, decision.
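The single-name practice described in the list above might be tagged as sketched here. The second name is invented for illustration, and placing the @name-style value “given-only” on it is one possible choice rather than a requirement.
<name>
  <surname>Cher</surname>
</name>


<name name-style="given-only">
  <given-names>Tenzin</given-names>
</name>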
The finer points of personal names are probably best determined by native speakers of the language involved.

Names and String Names in Citations

The names of authors, editors, translators, and other contributors can also be tagged within <element-citation> and <mixed-citation> using <name>, <string-name>, or <person-group>. The elements <name> and <string-name> identify the surname (family name or last name) and given names (first names and middle names) of the person. The element <person-group> is a container for <name> elements, <name-alternatives> elements, and <string-name> elements.
Here is a typical journal article citation that uses <name>, tagged as an element citation:
<element-citation publication-type="journal" publication-format="print">
  <name>
    <surname>Leifer</surname><given-names>BP</given-names>
  </name>
  <article-title>Early diagnosis of Alzheimer&rsquor;s disease: clinical 
  and economic benefits</article-title>
  <source>J Am Geriatr Soc</source>
  <year iso-8601-date="2003-05">2003</year><month>May</month>
  <volume>51</volume>
  <issue>5 Suppl</issue><issue-title>Dementia</issue-title>
  <fpage>S281</fpage><lpage>S288</lpage>
</element-citation>
and here is the same journal citation tagged as a mixed citation:
<mixed-citation publication-type="journal" publication-format="print">
  <string-name><surname>Leifer</surname>, 
    <given-names>BP</given-names>
  </string-name>. <article-title>Early diagnosis of Alzheimer's 
  disease: clinical and economic benefits</article-title>. 
  <source>J Am Geriatr Soc</source>. 
  <year iso-8601-date="2003">2003</year>
  <month>May</month>;
  <volume>51</volume>(<issue>5 Suppl</issue>  
  <issue-title>Dementia</issue-title>):<fpage>S281</fpage>-<lpage>S288</lpage>.
</mixed-citation>

Spacing in Citation Names

Note that even when the <name> element is inside a <mixed-citation>, it cannot preserve spaces between its name components. In the tagged examples below, there will be no space between the surname and given names, no matter which type of citation contains the <name>:
<element-citation publication-type="journal">
  <name>
    <surname>Petitti</surname>  <given-names>DB</given-names>
  </name>
</element-citation>


<mixed-citation publication-type="journal">
  <name>
    <surname>Petitti</surname>  <given-names>DB</given-names>
  </name>
</mixed-citation>
Within both types of citations, the elements <string-name> and <person-group> can be used to preserve punctuation. These elements are typically used in mixed citations to preserve the punctuation and spacing.
String Name: A <string-name> can preserve the punctuation that separates the surname from the initials or the given names, so <string-name> elements are frequently used inside <mixed-citation>s:
<mixed-citation publication-type="journal">
  <string-name>
    <surname>Washington</surname>, <given-names>George</given-names>
  </string-name>. ...
</mixed-citation>
In element-style citations, which do not preserve punctuation or spacing, <string-name> is typically only used to hold name alternatives or unusual names that are not easily broken into <surname> and <given-names> components. For example:
<element-citation publication-type="journal">
  <string-name>His Royal Highness The Prince Charles, 
  Prince of Wales and Earl of Chester</string-name> ...
</element-citation>
The element <string-name> can also be used to preserve the order in which the components of a name were published. All of the following are legal string names:
<string-name>
  <surname>Smith</surname>, <given-names>JH</given-names>
</string-name>


<string-name>
  <given-names>JH</given-names> <surname>Smith</surname>
</string-name>


<string-name>J.H.  <surname>Smith</surname></string-name>
The first example above would not be a valid <name> because of the comma and space between <surname> and <given-names>. The second and third examples would not be valid <name>s because of the name component order; the third also contains untagged text (“J.H.”), which <name> does not allow. The <name> element specifies an order for the name component elements to help users regularize this data.
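For comparison, the same person regularized as a <name> would take the form sketched below, with the components in the required order and the punctuation left for the display application to supply:
<name>
  <surname>Smith</surname>
  <given-names>JH</given-names>
</name>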

Person Groups

The <name> elements within citations may be grouped using the <person-group> element. The <person-group> element is very similar to <contrib-group> in the book and book component metadata in that it can contain <name>, <collab>, or <anonymous> elements. It takes an optional attribute @person-group-type that identifies the type of contributor (for example, editor or illustrator) tagged within the group.
Here is an editor tagged using <person-group> inside <element-citation>:
<element-citation publication-type="journal" publication-format="print">
<source>Folia Primatologica: International Journal of Primatology</source>
<person-group person-group-type="editor">
<name><surname>Crompton</surname><given-names>R.H.</given-names></name>
</person-group>
<publisher-loc>Basel (Switzerland)</publisher-loc>
<publisher-name>S. Karger AG</publisher-name>
<volume>1</volume><year iso-8601-date="1863">1863</year>
<comment> -suspect date, may be 1864</comment>
</element-citation>
And that same person group in a mixed citation:
<mixed-citation publication-type="journal" publication-format="print">
<source>Folia Primatologica: International Journal of Primatology</source>. 
<person-group person-group-type="editor">
<name><surname>Crompton</surname><given-names>R.H.</given-names>
</name></person-group>, editor. <publisher-loc>Basel (Switzerland)</publisher-loc>: 
<publisher-name>S. Karger AG</publisher-name>. Vol. <volume>1</volume> 
<year iso-8601-date="1863">1863</year> -suspect date, may be 1864.</mixed-citation>
Notice that, in the mixed-style example just given, the information that a person is an editor may be there twice, once as loose textual material with a comma and space, and once as a searchable attribute on the <person-group> element.
Within both citation types, but more typically used within mixed citations, a <person-group> allows preservation of the punctuation between names or the punctuation between a name and its affiliation, such as the square brackets below:
<mixed-citation publisher-type="gov">
<person-group person-group-type="author">
<name><surname>Norman</surname>
<given-names>John C</given-names>
</name> [<aff>Texas Heart Institute, Houston, TX</aff>]
</person-group>
</mixed-citation>
The <person-group> element can also be used to tag a person’s name and affiliation or to collect a group of contributors, all of whom have a single affiliation. In the example below, two individuals share an affiliation, tagged in an element-style citation.
<element-citation publication-type="commun">
  <person-group>
    <name>
      <surname>Hennen</surname><given-names>John</given-names>
    </name>
    <name>
      <surname>McDougall</surname><given-names>Jenni</given-names>
    </name>
    <aff>Edinburgh, Scotland</aff>
  </person-group>
  <source>Letter to: Dr. Duncan</source><year>[date unknown]</year>
  <size units="pages">9 p</size><comment>Located at: History of Medicine
  Division, National Library of Medicine, Bethesda, MD; W6 P3 v.1575.
  Observations on the cure of syphilis without mercury.</comment>
</element-citation>
Another possible use of the <person-group> is to hold the element <etal>, to designate unnamed individuals (typically indicated in print with the text “et al.”). Unlike many tag sets, this Tag Set allows <etal> to contain text, so the user may choose between generating text based on the element or including the text inside the element.
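The two choices might look like the sketch below. In the first group the <etal> is empty and the display text would be generated; in the second the text “et al.” is carried inside the element, as would be typical within a <mixed-citation>. The author name is reused from the example above for illustration only.
<person-group person-group-type="author">
  <name><surname>Hennen</surname><given-names>John</given-names></name>
  <etal/>
</person-group>


<person-group person-group-type="author">
  <name><surname>Hennen</surname><given-names>John</given-names></name>, 
  <etal>et al.</etal>
</person-group>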

APA Ellipses Style for Multiple Authors

The 6th edition of the APA Style Guide shortens long author lists when citing a work with more than 7 authors. (Many genomics articles have hundreds of authors.) When there are more than 7 authors, the APA citation lists the first 6 authors, then an ellipsis or the words “et al.”, followed by the last author. Here is an example of such a citation, which has a large number of authors, as it would be shown in APA style for display or print:
Dodge, K. A., Berlin, L. J., Epstein, M., Spitz Roth, A., O'Donnell, K., 
Kauffman, M., . . ., & Christopoulos, C. (2003). The Durham 
Family Initiative: A preventive system of care. Child Welfare, 
83(2), 109-128
Here is the example above tagged as an <element-citation>, using the element <etal> as a placeholder, from which the ellipses could be generated:
<ref id="r1">
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name><surname>Dodge</surname>
<given-names>K. A.</given-names></name>
<name><surname>Berlin</surname>
<given-names>L. J.</given-names></name>
<name><surname>Epstein</surname>
<given-names>M.</given-names></name>
<name><surname>Spitz Roth</surname>
<given-names>A.</given-names></name>
<name><surname>O&#x2019;Donnell</surname>
<given-names>K.</given-names></name>
<name><surname>Kauffman</surname>
<given-names>M.</given-names></name>
<etal/>
<name><surname>Christopoulos</surname>
<given-names>C.</given-names></name>
</person-group>
<year iso-8601-date="2003">2003</year>
<article-title>The Durham Family Initiative: A
preventive system of care</article-title>
<source>Child Welfare</source>
<volume>83</volume><issue>2</issue>
<fpage>109</fpage><lpage>128</lpage>
</element-citation>
</ref>
Here is the example above tagged as a <mixed-citation>. The <etal> element holds the entity reference for the ellipsis, while the comma-space following each author (including the <etal>) and the ampersand before the name of the final author are preserved as text between the elements:
<ref id="r1">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name><surname>Dodge</surname>
<given-names>K. A.</given-names></name>,
<name><surname>Berlin</surname>
<given-names>L. J.</given-names></name>,
<name><surname>Epstein</surname>
<given-names>M.</given-names></name>,
<name><surname>Spitz Roth</surname>
<given-names>A.</given-names></name>,
<name><surname>O&#x2019;Donnell</surname>
<given-names>K.</given-names></name>,
<name><surname>Kauffman</surname>
<given-names>M.</given-names></name>,
<etal>&hellip;</etal>, &amp; <name>
<surname>Christopoulos</surname>
<given-names>C.</given-names></name>
</person-group> (<year iso-8601-date="2003">2003</year>).
<article-title>The Durham Family Initiative: A preventive
system of care</article-title>. <source>Child Welfare</source>,
<volume>83</volume>(<issue>2</issue>),
<fpage>109</fpage>&#x2013;<lpage>128</lpage>.
</mixed-citation>
</ref>
Since this Tag Set allows <etal> to contain text, the user may choose between generated or contained text. Either of the samples above could instead have used <etal>et al.</etal>, <etal>…</etal>, or the empty <etal/> element.
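On the processing side, supporting both choices can be a matter of two template rules. The XSLT 1.0 fragment below is a minimal sketch and not part of this Tag Set; generating an ellipsis (rather than “et al.”) for an empty <etal> is an assumption that matches the APA-style display shown above.
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Empty <etal/>: generate the display text (assumed here to be an ellipsis). -->
  <xsl:template match="etal[not(node())]">
    <xsl:text>&#x2026;</xsl:text>
  </xsl:template>
  <!-- <etal> with content: pass the contained text through unchanged. -->
  <xsl:template match="etal[node()]">
    <xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
The two match patterns are mutually exclusive, so a given <etal> is handled by exactly one of the rules.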