Selecting a Model & Schema

After deciding to use JATS, a project or organization faces several further decisions. In this essay we describe the choices and offer guidance on making them.

Which Tag Set

There are three JATS Tag Sets:
  • Journal Archiving and Interchange is the most permissive of the Tag Sets. The Archiving and Interchange Tag Set defines elements and attributes that describe the content and metadata of journal articles, including research and non-research articles, letters, editorials, and book and product reviews. The Tag Set allows for descriptions of the full article content or just the article header metadata. It also allows for preservation of the sequence of content and generated text. Read more about the intent of this Tag Set at: https://jats.nlm.nih.gov/archiving/
  • The Journal Publishing Tag Set is a moderately prescriptive set, optimized for archives that wish to regularize and control their content rather than accept the sequence and arrangement presented to them by any particular publisher. Publishing is also intended for use by publishers for the initial XML tagging of journal material, usually as converted from an authoring form such as Microsoft Word. Read more about the intent of this Tag Set at: https://jats.nlm.nih.gov/publishing/
  • Article Authoring is the most prescriptive of the Tag Sets. The Article Authoring Tag Set is optimized for authorship of new journal articles, where regularization and control of content is important, and where it is useful rather than harmful to have only one way to tag a structure. This Tag Set is more prescriptive than descriptive and includes many elements whose content must occur in a specified order. Read more about the intent of this Tag Set at: https://jats.nlm.nih.gov/articleauthoring/
In consultation with the partners with which you want to interchange JATS documents, you should select the model that best meets your needs.

Options?

Tables

The Archiving and Publishing Tag Sets provide options for table modeling; they are available with either:
  • a table model based on the XHTML table model, or
  • both the XHTML and OASIS/CALS table models.
The Authoring Tag Set allows only the XHTML table model.
Many JATS users prefer the XHTML-based table model because it is easy to use in web-based and other electronic publications. Some users prefer the OASIS/CALS table model because they have tools that require this model or because they believe that it is easier to format complex print tables using it.
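When deciding between the two table options, it can help to check which table model your existing documents already use. The following is a minimal sketch, not an official JATS tool. It assumes that XHTML-model tables can be recognized by their tr elements and OASIS/CALS-model tables by their tgroup elements, and it compares local element names only, so it does not depend on any particular namespace. The function name, the sample documents, and the namespace URI in the second sample are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def table_models_used(xml_text):
    """Return the set of table models ('xhtml', 'cals') found in a document.

    Assumption: 'tr' marks an XHTML-model table and 'tgroup' marks an
    OASIS/CALS-model table. Only local element names are compared.
    """
    root = ET.fromstring(xml_text)
    names = {local_name(el.tag) for el in root.iter()}
    models = set()
    if "tr" in names:
        models.add("xhtml")
    if "tgroup" in names:
        models.add("cals")
    return models


# Hypothetical sample using the XHTML-based table model.
XHTML_SAMPLE = """
<table-wrap>
  <table><tbody><tr><td>1</td></tr></tbody></table>
</table-wrap>
"""

# Hypothetical sample using the OASIS/CALS table model.
# The namespace URI is a placeholder, not the real one.
CALS_SAMPLE = """
<table-wrap xmlns:o="urn:example:oasis-table">
  <o:table><o:tgroup cols="1"><o:tbody>
    <o:row><o:entry>1</o:entry></o:row>
  </o:tbody></o:tgroup></o:table>
</table-wrap>
"""

if __name__ == "__main__":
    print(table_models_used(XHTML_SAMPLE))
    print(table_models_used(CALS_SAMPLE))
```

If a collection turns out to contain only XHTML-model tables, the XHTML-only option is sufficient. Any OASIS/CALS tables mean you need the option that includes both models.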

Math

All three Tag Sets provide two options for math modeling. Each Tag Set is available with either:
  • MathML version 2.0, or
  • MathML version 3.0.
If you are new to MathML, we recommend starting with version 3.0. If you do not use MathML at all, we recommend leaving the MathML modules in place and simply never referencing them.
We provide versions of the Tag Sets with MathML 2.0 for backwards compatibility. Some users of MathML 2.0 find that their MathML 2.0 is also valid MathML 3.0, so they can move to MathML 3.0 with little effort. However, MathML 3.0 enforces rules that were not enforced in the version of the MathML 2.0 DTD used in JATS. These rules appeared in the textual documentation for MathML 2.0 and in some versions of the MathML 2.0 DTD, but some users may find that their existing documents are not forwards-compatible with MathML 3.0.
Note: MathML 2.0 that is not valid against MathML 3.0 may render incorrectly on display.
Note: It is likely that JATS will move to MathML 3.0, and stop supporting MathML 2.0 in a future version.

Expression Language (Constraint Language)

XML Tag Sets may be expressed in schema languages (also called “Constraint Languages”). XML software uses these schemas to enforce the rules of the Tag Set and to guide authoring applications. The formal rules of JATS are expressed in the prose of ANSI/NISO Z39.96-2012. In addition, for the convenience of users, the NLM provides schemas for several options of each of the Tag Sets.
  • DTD, or Document Type Definitions, are the oldest of XML modeling languages. DTDs are used in many environments that work primarily with textual documents. DTDs are defined in the XML Specification.
  • XSD, or W3C XML Schema, is a modeling language specified by the W3C. Among the strengths of XSD are strong datatyping, context-based models, and the fact that XSD documents are in XML document syntax.
  • RNG, or RELAX NG, is a modeling language for XML documents that enables strong datatyping, a wide variety of modeling types, and both a compact and an XML document syntax.
Users generally choose the modeling language that best suits their tools. It is not unusual for an organization to use one modeling language for authoring or for receiving documents from partners and another in its database environment.
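The choices described so far (Tag Set, table model, MathML version, and schema language) can be recorded and checked with a few lines of code. The sketch below encodes only the constraints stated in this essay: the option lists, and the rule that the Authoring Tag Set allows only the XHTML table model. All names in it (ModelChoice, problems, and the option strings) are hypothetical. It does not claim that a schema is published for every combination that passes the check.

```python
from dataclasses import dataclass

# Option lists as described in this essay (string spellings are hypothetical).
TAG_SETS = ("archiving", "publishing", "authoring")
TABLE_MODELS = ("xhtml", "xhtml+cals")
MATHML_VERSIONS = ("2.0", "3.0")
SCHEMA_LANGUAGES = ("dtd", "xsd", "rng")


@dataclass(frozen=True)
class ModelChoice:
    tag_set: str
    table_model: str
    mathml: str
    schema_language: str


def problems(choice):
    """Return a list of human-readable problems; an empty list means none found."""
    issues = []
    checks = (
        ("tag set", choice.tag_set, TAG_SETS),
        ("table model", choice.table_model, TABLE_MODELS),
        ("MathML version", choice.mathml, MATHML_VERSIONS),
        ("schema language", choice.schema_language, SCHEMA_LANGUAGES),
    )
    for label, value, allowed in checks:
        if value not in allowed:
            issues.append(f"unknown {label}: {value!r}")
    # Stated above: the Authoring Tag Set allows only the XHTML table model.
    if choice.tag_set == "authoring" and choice.table_model == "xhtml+cals":
        issues.append("the Authoring Tag Set allows only the XHTML table model")
    return issues


if __name__ == "__main__":
    for choice in (
        ModelChoice("publishing", "xhtml+cals", "3.0", "rng"),
        ModelChoice("authoring", "xhtml+cals", "3.0", "dtd"),
    ):
        print(choice, problems(choice))
```

A check of this kind is useful mainly as a record of what was agreed with interchange partners. Whether a given combination is actually distributed still has to be confirmed against the published schemas.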

Decision Tree

At each level of the tree below, select only one option. When you get to the end node, you will see the name of the appropriate model for you and a link to that model on the NCBI FTP site:
Archiving
Publishing