General Introduction
This “Tag Library” is provided as a service to users of ANSI/NISO Z39.96-2015, JATS:
Journal Article Tag Suite; the Tag Library is not part of NISO Z39.96. It contains non-normative
information that is intended to be helpful to users of NISO Z39.96, including:
- Remarks on usage and relationships among elements and attributes;
- Structural Diagrams, showing the element hierarchy;
- Tagging examples;
- Best practice recommendations;
- Implementation advice;
- Discussion of accessibility and the Tag Suite; and
- Pointers to (non-normative) downloadable versions of DTDs, XSDs, and RNGs that implement the NISO JATS Tag Sets.
The intent of the Journal Article Tag Suite is to provide a common
format in which publishers and archives can exchange journal content. The Suite provides a
set of XML schema modules that define elements and attributes for describing the textual and
graphical content of journal articles as well as some non-article material such as letters,
editorials, and book and product reviews.
Introduction to the Journal Archiving Tag Set
Rationale
The Journal Archiving and Interchange Tag Set (“Archiving”) defines elements and attributes that describe the content and metadata
of journal articles, including research and non-research articles, letters, editorials,
and book and product reviews. The Tag Set allows for descriptions of the full article
content or just the article header metadata.
The intent of the Archiving and Interchange Tag Set is to preserve the intellectual content of journals independent of the form in which that content was originally
delivered. This Tag Set enables an archive to capture structural and semantic
components of existing material without modeling any particular sequence or textual
format.
It was planned that Archiving could be used for conversion from a variety of journal
source Tag Sets, with the intent of providing a single format:
- in which publishers could deliver their content to a wide range of archives, and
- into which archives could conveniently translate content from many publishers.
In order to enable description of the content used by the wide array of publishers,
repositories, aggregators, etc., the Tag Set uses many loose structures, including some
elements with nearly all content structures optional. Many attribute values in the Tag
Set are data character values, accommodating any source values. Because some article
components are prescriptive in nature (for example, article metadata, <article-meta>, is a fairly specific sequence), the
Archiving and Interchange Tag Set includes a few completely generic structures for capturing
semantic tagging that is not available natively in the Tag Set. Although publication order cannot always be
preserved, particularly within the metadata, the Archiving and Interchange Tag Set works harder than any
of the other Tag Sets in this Suite to allow almost any publication arrangement and
to allow re-tagging as renaming without rearrangement during conversion.
The Archiving and Interchange Tag Set has a distinct focus on conversion from multiple sources. That
focus has made this Tag Set a large and inclusive one. Many elements have been created
explicitly so that information tagged by publishers would not be discarded when they
converted material from another Tag Set to this one (or one created from this Suite).
Care has also been taken to provide several mechanisms (frequently, information classing
attributes) to preserve the intellectual content of a document structure when that
structure is converted from another Tag Set or schema to this one, even when there is no
exact element equivalent of the structure.
The exact replication of the look and feel of any particular journal has not been a
consideration. Therefore, many purely formatting mechanisms have not been included. At
the same time, Archiving is intended to preserve observed content, without resorting to
stylesheets or generation of textual elements. For that reason, labels, numbers, and
symbols of tables, figures, sidebars, and the like can be recorded as elements, as can
the punctuation and spaces inside bibliographic references and lists.
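As a hedged illustration of preserving observed content, the sketch below reads a figure label that was recorded as tagged text instead of being generated by a stylesheet. The fragment is invented, and the tag names `<fig>`, `<label>`, and `<caption>` are assumptions not given in this section. Only Python's standard library is used.

```python
import xml.etree.ElementTree as ET

# Invented figure: the printed label is stored as content, exactly as observed,
# so no stylesheet has to regenerate "Figure 3." at display time.
SAMPLE = """
<fig id="f3">
  <label>Figure 3.</label>
  <caption><p>Growth over time.</p></caption>
</fig>
"""

fig = ET.fromstring(SAMPLE)

# The label, including its trailing punctuation, is read straight from the document.
label = fig.findtext("label")
```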
Scope
By design, this is a model for journal articles, such as the typical research
article found in an STM journal, and not a model for complete journals. This Tag Set
does not include an overarching model for a collection of articles. In addition, the
following journal material is not described by this Tag Set:
- Company, product, or service display advertising
- Job search or classified advertising
- Calendars, meeting schedules, and conference announcements (except as these can be tagged as ordinary articles, sub-articles, or sections within articles)
- Material specific to an individual journal, such as Author Guidelines, Policy and Scope statements, editorial or advisory boards, detailed indicia, etc.
Structural Overview
The Journal Archiving and Interchange Tag Set defines a document that is a top-level component of a
journal such as an article, a book or product review, or a letter to the editor. Each
such document is composed of one or more parts; if there is more than one part, they
must appear in the following order:
- Front matter (required). The article front matter contains the metadata for the article (also called article header information), for example, the article title, the journal in which it appears, the date and issue of publication, a copyright statement, etc. This is not textual front matter as appears in books; rather, it is bibliographic information about the article and the journal in which it was published.
- Body of the article (optional). The body of the article is the main textual and graphic content of the article. This usually consists of paragraphs and sections, which may themselves contain figures, tables, sidebars (boxed text), etc. The body of the article is optional to accommodate those repositories that just keep article header information and do not tag the textual content.
- Back matter for the article (optional). If present, the article back matter contains information that is ancillary to the main text, such as a glossary, appendix, or list of cited references.
- Floating Material (optional). A publisher may choose to place all the floating objects in an article (such as tables, figures, boxed text sidebars, etc.) into a separate container element outside the narrative flow for convenience of processing.
Following the front, body, back, and floating material, there may be either one
or more responses to the article or one or more subordinate articles:
- Response. A response is a commentary on the article itself, for example, an opinion from an editor on the importance of the article or a reply from the original author to a letter concerning the article.
- Sub-article. A sub-article is a small article that is completely contained inside another article.
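The ordering rule described above can be checked mechanically. The following is a minimal sketch, not part of the Tag Set. It assumes the top-level parts are tagged `<front>`, `<body>`, `<back>`, and `<floats-group>`, followed by either `<response>` or `<sub-article>` elements; these tag names are not given in this overview and are stated here as assumptions. The helper name `parts_in_order` is hypothetical. The check covers only the order of parts and is not a substitute for validation against a DTD, XSD, or RNG.

```python
import xml.etree.ElementTree as ET

# Top-level parts in the order described above. The tag names are
# assumptions for illustration; this overview names the parts only in prose.
PART_ORDER = ["front", "body", "back", "floats-group"]
TRAILING = {"response", "sub-article"}


def parts_in_order(article_xml: str) -> bool:
    """Return True if the top-level parts follow the described order."""
    root = ET.fromstring(article_xml)
    tags = [child.tag for child in root]

    # Front matter is required and must come first.
    if root.tag != "article" or not tags or tags[0] != "front":
        return False

    # Separate the leading parts from trailing responses/sub-articles.
    lead = [t for t in tags if t not in TRAILING]
    trail = tags[len(lead):]
    if tags[:len(lead)] != lead:
        return False  # a response or sub-article appeared too early

    # Leading parts: known names only, each at most once, in fixed order.
    positions = [PART_ORDER.index(t) for t in lead if t in PART_ORDER]
    if len(positions) != len(lead) or positions != sorted(set(positions)):
        return False

    # Trailing parts: all responses or all sub-articles, never a mix.
    return len(set(trail)) <= 1
```

For example, `parts_in_order("<article><front/></article>")` accepts a header-only article, which matches the statement that the body is optional.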
Tag Sets Developed from the Suite
XML schemas (DTDs, XSDs, and RNGs) are provided for 4 different variations of the Archiving Tag Set:
- Archiving Tag Set using XHTML tables and MathML 2.0
- Archiving Tag Set using XHTML tables and MathML 3.0
- Archiving Tag Set using both XHTML tables and OASIS Exchange CALS tables with MathML 2.0
- Archiving Tag Set using both XHTML tables and OASIS Exchange CALS tables with MathML 3.0
This Archiving Tag Set is one of several created from the Suite. Information about the other Tag
Sets may be found at the following site: https://jats.nlm.nih.gov.
How to Read This Tag Library
Terms and Definitions
Element | Elements are nouns, like “speech” and “speaker”, that represent components of journal articles, the articles themselves, and accompanying metadata. |
---|---|
Attribute | Attributes hold facts about an element, such as which type of list (e.g., numbered, bulleted, or plain) is being requested when using the <list> tag, or the name of a pointer to an external file that contains an image. Each attribute has both a name (e.g., @list-type) and a value (e.g., “bullet”). |
Metadata | Data about the data, for example, bibliographic information. The distinction is between metadata elements, which describe an article (such as the name of the journal in which an article was published or the article title), and elements that contain the textual and graphical content of the article. |
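To make the terms "element" and "attribute" concrete, here is a small sketch using Python's standard library. The fragment is invented for illustration. `<list>`, `@list-type`, and the value “bullet” come from the definitions above, while `<list-item>` and `<p>` are assumed tag names that the definitions do not mention.

```python
import xml.etree.ElementTree as ET

# An invented fragment: one <list> element carrying one attribute.
SAMPLE = """
<list list-type="bullet">
  <list-item><p>First point</p></list-item>
  <list-item><p>Second point</p></list-item>
</list>
"""

element = ET.fromstring(SAMPLE)

# The tag name identifies the element.
tag_name = element.tag

# An attribute is a name/value pair held by the element.
list_type = element.get("list-type")

# The element's content is whatever is tagged inside it.
item_texts = [p.text for p in element.iter("p")]
```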
How To Start Using This Tag Library
How you use the documentation will depend on what you need to learn about the
modules and this Tag Set.
Learn this Tag Set
If you want to learn about the elements and the attributes in this Tag Set so you
can tag documents or learn how the journal article model is constructed, here is a
good way to start.
- Read the Tag Library General Introduction, taking particular note of the next section that describes the parts of the Tag Library so you will know what resources are available.
- Next, if you do not know the symbols used in the Document Hierarchy diagrams, read the “Key to the Near & Far® Diagrams”.
- Scan the Document Hierarchy diagrams to get a good sense of the top-level elements and their contents. (Find what is inside an <article>, then what is inside each of the four large pieces of an article, and keep working your way down.)
- Pick an element from one of the diagrams. Look up the element in the Elements Section to find the full name of the element, its definition, usage notes, content allowed, and any attributes. Look up one of the attributes to find its full name, usage notes, and potential values.
Finally, if you are interested in conversion from a particular source:
- Look at an article in a printed or online journal or look at the DTD/schema for the other journal.
- Can all the information you want to store from an article fit into the models shown in the diagrams?
- Do you have, or know how to get, all the information the models require? Will that information always be available for documents that are complete and correct?
- How difficult will it be to identify the parts of the information using the elements and attributes described in these models? Would changes to one or more models make this easier?
- Now look at some non-article content, such as a news column, a book review, or some letters to the editor. Are there tags to handle all these article types and all their components?
Structure of This Tag Library
This Tag Library contains the following sections:
How To Use (Read Me First) | How to make best use of this Tag Library to reference XML tags, become familiar with the Archiving Tag Set as a whole, or see examples of recommended usage. |
---|---|
Root Element | Naming the <article> element as the root of this XML schema (DTD, XSD, RNG). |
General Introduction | This introduction to the contents of this Tag Library, to the design philosophy and intended usage of the JATS DTD Suite, and to the Journal Archiving Tag Set. |
Selecting a Model & Schema | Describing the variant Archiving schemas and how to choose the right one for your implementation. |
Elements Section | Descriptions of the elements used in the Journal Archiving Tag Set and the parts of the JATS DTD Suite used in this Tag Set. The element descriptions are listed in alphabetical order by tag name. [Note: Each element has two names: a “tag name” (formally called an element-type name) that is used in tagged documents, in the DTDs/schemas, and by XML software; and an “element name” (usually longer) that provides a fuller, more descriptive name for the benefit of human readers. For example, a tag name might be <disp-quote> with the corresponding element name Quote, Displayed, or a tag name might be <verse-group> with the corresponding element name Verse Form for Poetry.] |
Attributes Section | Descriptions of the attributes in the Journal Archiving Tag Set. Like elements, attributes also have two names: the shorter machine-readable one and a (usually longer) human-readable one. Attributes are listed in order by the shorter, machine-readable names. For example, an attribute is listed under the short name @list-type rather than the more informal, easier-to-read Type of List. |
Parameter Entities Section | Names (with occasional descriptions) and contents of the parameter entities in the JATS DTD modules. |
Document Hierarchy Diagrams | Tree-like graphical representations of the content of many elements. This can be a fast, visual way to determine the structure of an article or of any element within an article. |
Common Tagging Practice | Tips, tricks, hints, and examples of how (and why) to tag certain structures using this Tag Set. |
Accessibility | Brief description of how NISO JATS approaches the 508 and WCAG 2.0 Accessibility issues. |
Modifying This Tag Set | Implementor’s instructions for using this Tag Set, customizing this Tag Set, or making derivative tag sets based on this one. |
Version 1.1 Change Report | Pointer to the description of the changes made in response to the public comments on the JATS Committee Draft Versions 1.1d1, 1.1d2, and 1.1d3 received through the end of November 2015, that resulted in this NISO JATS 1.1 Tag Set (ANSI/NISO Z39.96-2015). |
Element Context Table | A listing of where each element may be used. All elements in this Tag Set are given in a single alphabetical list. The Element Context Table is formatted in two columns. The first column (“This Element”) names an element, with the name shown in pointy brackets. The second column (“May Be Contained In”) gives, for each element, an alphabetical list of all the elements in which the first-column element may occur. For example, if the first column contains the element <front> and the second column contains only the <article> element, this means that the <front> element may only be used directly inside an <article>. Most elements may be used inside more than one other element. For example, the element <def> (a definition) may be used inside the <abbrev> and the <def-item> elements. The Element Context Table contains the same information that is found on each element page under the heading “This element may be contained in:”. |
Index | Where to find elements, tags, and terms used in this Tag Library. Includes synonyms (terms not used in this Tag Set) that direct the reader to elements used in this Tag Library; for example, “author” is paired with Contributor <contrib>. |
Supporting Documentation Home | The Journal Archiving Tag Set is available in three forms: an XML Document Type Definition (DTD), a W3C XML Schema (XSD), and a RELAX NG Schema (RNG). Each of these formats is available in two forms: a zipped file containing a downloadable version of the schema (often in multiple files), and a readable/browsable version in which the internal markup has been escaped. |
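The Element Context Table described above is, in effect, a lookup from an element to the elements that may contain it. A minimal sketch of that idea follows. The two entries are taken from the examples in the description; the entry for `<def>` lists only the parents named there and may be incomplete. The names `CONTEXT_TABLE` and `may_contain` are hypothetical and do not correspond to anything shipped with the Tag Set.

```python
# "This Element" -> "May Be Contained In", limited to the two examples
# given in the description of the Element Context Table.
CONTEXT_TABLE = {
    "front": ["article"],
    "def": ["abbrev", "def-item"],  # possibly incomplete; consult the real table
}


def may_contain(parent: str, child: str) -> bool:
    """Return True if the table lists `parent` as a container for `child`."""
    return parent in CONTEXT_TABLE.get(child, [])
```

Keying the lookup by the contained element mirrors the two-column layout of the table, so each row maps directly to one dictionary entry.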
Tag Library Typographic Conventions
<alt-text> | The tag name of an element (written in lower case with the entire name surrounded by “< >”) |
Alternate Text Name (for a figure, etc.) | The element name (long descriptive name of an element) or the descriptive name of an attribute (written in title case, with important words capitalized, and the words separated by spaces) |
@name | The “@” sign before a name indicates an attribute name. |
must not | Emphasis to stress a point |