How To Make New Tag Sets
Parameter Entities Modules to Customize and Change
Parameter entities are the major mechanism for customizing a tag set or creating a
new tag set from the modules in the Suite. Individual Tag Sets will be constructed by (1)
establishing element and attribute combinations and content models using parameter
entities in one of the Tag-Set-specific customizing modules and (2) choosing appropriate
modules from the Suite that declare the elements needed. For example, if a base Tag
Set contained 6 kinds of lists and 2 table models, a more specific Tag Set, such as an
authoring Tag Set, might use a Customize Classes Module to redefine the List Class to
name only 3 lists and redefine the Display Class to allow only one table model.
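As a minimal sketch of the example above, a Customize Classes Module might contain redefinitions like the following. The element names and the %display.class; entity name are hypothetical, chosen only for illustration. The technique relies on the XML rule that the first declaration of an entity is the binding one, so the customizing module must be read before the Suite defaults.

```dtd
<!-- Hypothetical Customize Classes Module, read BEFORE the default classes
     module so that these declarations take precedence. -->

<!-- Keep only 3 of the base Tag Set's 6 list types (names illustrative). -->
<!ENTITY % list.class     "def-list | list | simple-list" >

<!-- Allow only one table model (entity and element names illustrative). -->
<!ENTITY % display.class  "fig | table-wrap" >
```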
The standard modules to create a customized tag set are: the DTD itself, a module to
name its components, and as many override modules and new elements modules as necessary.
Typical modules for a new Tag Set are:
- DTD — The DTD module (.dtd) for the new base Tag Set. At a minimum, this module declares the top-level element (such as article, book, or report) and any other structural elements unique to the new document type;
- Tag-Set-specific Module of Modules — Module to name all the new modules created expressly for the new Tag Set;
- Class overrides — Tag-Set-specific overrides of the Suite default element classes;
- Mix overrides — Tag-Set-specific overrides of the Suite default class mixes;
- Model overrides — Tag-Set-specific content model overrides for the content models in the modules of the Suite (using “-elements” and “-model” parameter entities); and
- New Models — Tag-Set-specific new elements. (For example, a new Book Tag Set might add book-specific metadata elements.)
Element Classes Concept
Many of the elements in the Journal Archiving Tag Set have been grouped into loose
element classes. There is no hard and fast rule for what constitutes a class; each one
is a design decision, a matter of judgment. These classes are designed to ease
customization to meet the particular needs of new Tag Sets. Base classes for the Suite
are defined in a separate Default Element Classes Module
(%default-classes.ent;).
Content models are built from sequences of elements and from OR groups, which are typically classes or mixes. As an example, the content model for a <p> element is declared to be an OR group (that is, a
choice) of character data (text, numbers, or special characters) and any
of the elements named in the Paragraph Elements mix.
The mix %p-elements; is declared to be a large OR
group of many other element-defining classes: the Display Class No Alternatives Elements, the Block-level Mathematical Expressions and Formulae Class Elements, the List Class Elements, the Citation Class Elements, etc.
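The shape of such a declaration can be sketched as follows. This is a simplified illustration, not the Suite's actual mix: the class contents are placeholders, and the real %p-elements; names many more classes than the three shown here.

```dtd
<!-- Classes: OR groups of element names (contents are placeholders). -->
<!ENTITY % block-display.class "fig | table-wrap" >
<!ENTITY % list.class          "def-list | list" >
<!ENTITY % x.class             "x" >

<!-- Mix: an OR group of classes. It starts with an OR bar so that it can
     follow #PCDATA directly in a mixed content model. -->
<!ENTITY % p-elements
    "| %block-display.class; | %list.class; | %x.class;" >

<!-- Mixed content: character data plus any element named in the mix. -->
<!ELEMENT p (#PCDATA %p-elements;)* >
```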
Design Note: These element classes can be viewed as building blocks for the larger parameter entities that define element mixes. A mix describes a usage circumstance for a group of elements, such as all the paragraph-level elements, all the elements allowed inside a table cell, all the elements inside a paragraph, or all the inline
elements. For example, to add another block display item to the Block Display Class Elements, you would do two things. First, redefine the %block-display.class; parameter entity (and probably also the Display Class No Alternatives Elements parameter entity) in your
Tag-Set-specific Class Override Module, overriding the default parameter entity
defined in the Suite’s Default Element Classes Module.
Second, create a new module containing the Element Declaration of the new block
display item.
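A hedged sketch of that two-part change is shown below. The element name pull-quote, its content model, and the other members listed for %block-display.class; are all hypothetical; a real override would repeat the Suite's full default list and append the new element to it.

```dtd
<!-- Part 1: in the Tag-Set-specific Class Override Module, which is read
     before the Suite defaults so that this declaration is the binding one.
     "fig | table-wrap" stands in for the Suite's real default members. -->
<!ENTITY % block-display.class "fig | table-wrap | pull-quote" >

<!-- Part 2: in a new Tag-Set-specific element module. The model is given
     its own "-model" entity so that later Tag Sets can override it. -->
<!ENTITY % pull-quote-model "(p+)" >
<!ELEMENT pull-quote %pull-quote-model; >
```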
Parameter Entity Names for Classes and Mixes
PARAMETER ENTITY: SAME FUNCTION, SAME NAME — The Suite modules and initial Tag
Sets have used a series of parameter entity naming conventions consistently. While
parsing software cannot enforce these parameter entity naming or usage conventions,
these conventions can make it much easier for a person to know how the content models
work and what must be modified to make a change to this Tag Set.
CLASSES — Classes are functional groupings of
elements used together in an OR group. Each class is named with a parameter entity, and
all class parameter entity names end in the
suffix “.class”:
<!ENTITY % list.class "def-list | list">
A class, by definition, should never be made empty; instead, remove the class from
every model where you do not want its elements included.
MIXES — Mixes are functional OR groups of
classes; mixes should never contain element names directly. All mixes must be declared
after all classes, since mixes are composed of classes. Mix names have no set suffix;
for example, they may end in “-mix” or “-elements”. Content models and content model overrides use mixes and classes for all OR
groups. Only content model sequences are made up of element names directly.
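The following sketch illustrates these conventions together, under stated assumptions: the %emphasis.class; contents, the mix name %cell-content-mix;, and the element cell are hypothetical. It shows classes declared before mixes and a class being left out of a mix rather than being emptied.

```dtd
<!-- Classes come first: OR groups of element names (contents illustrative). -->
<!ENTITY % emphasis.class "bold | italic" >
<!ENTITY % list.class     "def-list | list" >

<!-- Mix override (hypothetical name), read before the default mix: lists
     are not wanted here, so the class is simply left out. -->
<!ENTITY % cell-content-mix "%emphasis.class;" >

<!-- Default mix, read later. It is ignored, because the first declaration
     of an entity is the one that binds. -->
<!ENTITY % cell-content-mix "%emphasis.class; | %list.class;" >

<!-- The content model names the mix, never the elements themselves. -->
<!ELEMENT cell (#PCDATA | %cell-content-mix;)* >
```

Had %list.class; been redefined as an empty string instead, the default mix would expand to "bold | italic | " with a dangling OR bar, which is not a valid content model. Classes must also precede mixes because a parameter entity reference inside an entity value is expanded when that value is declared.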
MODEL OVERRIDES — Parameter entity mixes for
overriding a content model are of two styles: (1) inline mixes and (2) full content model
replacements. These two groupings have been defined and named separately to preserve the
mixed-content or element-content nature of the models in Tag Sets derived from the
Suite.
The inline parameter entities to be intermingled with character data (#PCDATA) in a mixed content model are named with a
suffix “-elements”. For example,
“%copyright-holder-elements;” would be used in the content model for the element <copyright-holder>:
<!ENTITY % copyright-holder-elements "| %subsup.class; | %x.class;" >
<!ELEMENT copyright-holder (#PCDATA %copyright-holder-elements;)* >
All inline mixes begin with an OR bar, so that the mix can be removed leaving just
character data (#PCDATA):
<!ENTITY % rendition-plus "| %all-phrase;" >
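To illustrate why the leading OR bar matters, here is a minimal sketch of removing an inline mix entirely. It reuses the <copyright-holder> example from above; placing the override in a models override module that is read before the Suite declaration is an assumption about how a Tag Set would be organized.

```dtd
<!-- Override, read first: the inline mix is emptied, bar and all. -->
<!ENTITY % copyright-holder-elements "" >

<!-- Unchanged element declaration. With the empty mix it resolves to
     (#PCDATA)*, which is still a valid mixed content model. -->
<!ELEMENT copyright-holder (#PCDATA %copyright-holder-elements;)* >
```

If the OR bar were written in the element declaration instead of in the mix, emptying the mix would leave "(#PCDATA | )*", which does not parse.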
The override of a complete content model will be named with a
suffix “-model” and should include the entire content model, including the enclosing
parentheses:
<!ENTITY % kwd-group-model "(label?, title?, ((%kwd.class; | %x.class;)+ | (%unstructured-kwd-group.class;)* ) )" >
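A sketch of how a full-model override might be applied follows. The restricted model and the contents of %kwd.class; are hypothetical. The element declaration form is inferred from the convention that a “-model” entity carries its own enclosing parentheses, not copied from the Suite.

```dtd
<!-- Placeholder class contents, for illustration only. -->
<!ENTITY % kwd.class "compound-kwd | kwd" >

<!-- Hypothetical override, read before the Suite default: no label, and
     structured keywords only. The parentheses are part of the entity. -->
<!ENTITY % kwd-group-model "(title?, (%kwd.class;)+ )" >

<!-- The element declaration supplies no parentheses of its own. -->
<!ELEMENT kwd-group %kwd-group-model; >
```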
How To Build a New Custom Tag Set
The Concept
The basic idea for a new Tag Set is that all lower-level elements (paragraphs,
lists, figures, etc.) will be defined in modules — either the modules of the
base Suite or in new Tag-Set-specific modules — rather than in the DTD itself. The new
DTD will be fairly short and include only definitions of the topmost elements, at
least the document element and maybe its children.
Modules are defined (declared) using external parameter entities in the
Suite’s Module to Name the Modules or in the
Tag-Set-specific Module of Modules. Modules are referenced in the DTD proper,
in the order needed to define the parameter
entities in sequence.
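As a rough sketch, a Tag-Set-specific Module of Modules contains only external parameter entity declarations such as those below. The system identifiers (file names) are hypothetical, and SYSTEM is used here for brevity; a real Tag Set might use PUBLIC identifiers instead.

```dtd
<!-- Declares (names) the modules; nothing is read in at this point. -->
<!ENTITY % archivecustom-classes.ent SYSTEM "archivecustom-classes.ent" >
<!ENTITY % archivecustom-mixes.ent   SYSTEM "archivecustom-mixes.ent" >
<!ENTITY % archivecustom-models.ent  SYSTEM "archivecustom-models.ent" >
```

The DTD proper then calls a module by referencing its entity on a line of its own, for example %archivecustom-classes.ent;, at the point where that module's declarations are needed.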
This Journal Archiving Tag Set was written as an example of the new best-practice
customization technique. A new variant tag set, written as a DTD that follows this
plan, will probably consist of the following modules:
- A DTD module to define the top-level elements (for example, JATS-archivearticle1.dtd);
- A tag-set-specific Module of Modules to name new non-Suite modules in the Tag Set (for example, %archivecustom-modules.ent;);
- A tag-set-specific definition of element classes to add new classes and override the Suite default classes (for example, %archivecustom-classes.ent;);
- A tag-set-specific definition of element mixes to add new mixes and override the default mixes (for example, %archivecustom-mixes.ent;);
- A tag-set-specific module of content model overrides (for example, %archivecustom-models.ent;);
- Tag-set-specific modules to hold new element declarations; and
- All or most of the modules in the Suite.
Making a Variant Tag Set
To show the process, here is a series of instructions for making a new tag set as a DTD, illustrated by showing how the Journal Archiving Tag Set was created from the
modules of the whole Suite:
- Modules — Write a new tag-set-specific Module of Modules which defines all new customization modules this Tag Set needs. As an example, the Archiving Tag Set created the module %archivecustom-modules.ent;, which contains the definitions of the class-override module %archivecustom-classes.ent;, the mix-override module %archivecustom-mixes.ent;, and the models-override module %archivecustom-models.ent;.
- Class overrides — Write a tag-set-specific class-override module, defining any overrides to the Suite classes, which are defined in the default classes module, %default-classes.ent;. As an example, the Archiving Tag Set created the module %archivecustom-classes.ent;, in which a new model for %contrib-info.class; was declared and an entirely new class %x.class; was added.
- Mix overrides — Write a tag-set-specific mix-override module defining any overrides to the Suite mixes, which are defined in the default mixes module, %default-mixes.ent;. As an example, the Archiving Tag Set created the module %archivecustom-mixes.ent;, in which a new mix %all-phrase; was declared and then used in many existing mixes such as %simple-phrase;.
- Model overrides — Create a tag-set-specific content-model-override module defining any overrides to the content models and attribute lists for the Suite. As an example, the Archiving Tag Set created the module %archivecustom-models.ent;, in which element collections (suffixed “-elements”) that will be mixed with #PCDATA were redefined, full content model overrides (suffixed “-model”) were defined, and some new attributes and attribute lists were added.
- New Elements — Write any new element modules needed. These will define any new block-level or phrase-level elements. As an example, the Archiving Tag Set did not need any elements beyond those in the Suite, but the new Book Tag Set added modules for book metadata and book component parts.
- DTD Module — With those modules in place,
construct a new DTD module. Within that module:
- Use an external parameter entity declaration to name, and then a reference to call, the tag-set-specific Module of Modules (for the Archiving Tag Set, the module %archivecustom-modules.ent;).
- Use an external parameter entity declaration to name, and then a reference to call, the Suite Module of Modules, which names all the potential modules (for the Archiving Tag Set, the module %modules.ent;).
- Use an external parameter entity reference to call the tag-set-specific class overrides (for the Archiving Tag Set, the module %archivecustom-classes.ent;).
- Use an external parameter entity reference to call the Suite default classes (for the Archiving Tag Set, the module %default-classes.ent;).
- Use an external parameter entity reference to call the tag-set-specific mix overrides (for the Archiving Tag Set, the module %archivecustom-mixes.ent;).
- Use an external parameter entity reference to call the Suite default mixes (for the Archiving Tag Set, the module %default-mixes.ent;).
- Use an external parameter entity reference to call the tag-set-specific content model and attribute list overrides (for the Archiving Tag Set, the module %archivecustom-models.ent;).
- Use an external parameter entity reference to call in the standard Common Module (%common.ent;) that defines elements and attributes so common they are used by many modules.
- Use external parameter entity references to call any new tag-set-specific modules that define new block-level or phrase-level elements. For the Archiving Tag Set, there are no such modules, but, for example, the Book Tag Set made from this Tag Suite calls a module that declares the book-specific metadata.
- Select, from the Module of Modules, those modules which contain the elements needed for your Tag Set (for instance, selecting lists and not selecting math elements) and call in each of the modules needed. (The Archiving Tag Set calls these in alphabetical order, since the order does not matter.)
- Define the document element and any other unique elements and entities needed for this Tag Set. For example, the Journal Archiving and Interchange Tag Set declares only a few elements, including <article> (the top-level element) and its potential components: <front>, <body>, <back>, <sub-article>, and <response>.
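The steps above can be sketched as a skeleton DTD module. This is an outline under stated assumptions, not the actual Journal Archiving DTD: the system identifiers are hypothetical, %list.ent; and %para.ent; stand in for whichever Suite modules are selected, and the <article> content model is simplified. It depends on the external module files and will not parse on its own.

```dtd
<!-- Name, then call, the Tag-Set-specific Module of Modules. -->
<!ENTITY % archivecustom-modules.ent SYSTEM "archivecustom-modules.ent" >
%archivecustom-modules.ent;

<!-- Name, then call, the Suite Module of Modules. -->
<!ENTITY % modules.ent SYSTEM "modules.ent" >
%modules.ent;

<!-- Overrides always precede the defaults they override, because the
     first declaration of an entity is the one that binds. -->
%archivecustom-classes.ent;
%default-classes.ent;

%archivecustom-mixes.ent;
%default-mixes.ent;

%archivecustom-models.ent;

<!-- Common elements and attributes used by many modules. -->
%common.ent;

<!-- Selected Suite modules (assumed names); order does not matter here. -->
%list.ent;
%para.ent;

<!-- Finally, the document element (simplified, illustrative model). -->
<!ELEMENT article (front, body?, back?, (sub-article* | response*)) >
```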
Namespaces and MathML
When JATS was first designed, many software tools did not handle multiple redefinitions of the same namespace cleanly and correctly. Therefore, the following namespace prefixes, namespace URIs, and xmlns declarations are declared in the MathML DTD setup modules or in the MathML 2.0 and MathML 3.0 QName modules (and MathML 2.0 and MathML 3.0 schema modules for XSD and RNG):
- XLink
- The XLink prefix is set to “xlink”.
- The XLink namespace URI is set to “http://www.w3.org/1999/xlink”.
- The XLink xmlns pseudo-attribute is set as follows, for use in attribute lists: xmlns:xlink CDATA #FIXED 'http://www.w3.org/1999/xlink'.
- MathML
- The MathML namespace prefix is set to “mml”.
- The MathML namespace URI is set to “http://www.w3.org/1998/Math/MathML”.
- The MathML xmlns pseudo-attribute is set as follows, for use in attribute lists: xmlns:mml CDATA #FIXED 'http://www.w3.org/1998/Math/MathML'.
- W3C Schema Instance
- The W3C Schema Instance namespace prefix is set to “xsi”.
- The W3C Schema Instance namespace URI is set to “http://www.w3.org/2001/XMLSchema-instance”.
- The W3C Schema Instance xmlns pseudo-attribute is set as follows, for use in attribute lists: xmlns:xsi CDATA #FIXED 'http://www.w3.org/2001/XMLSchema-instance'.
Defining these namespaces outside the ordinary JATS modules has an annoying implication for subsetting: if you do not include the MathML setup modules and MathML modules in your tag set, those namespaces will not be defined.
Thus, if you want to use the JATS modules to create a tag set that does not include MathML, there are two options open to you:
- Include the MathML setup modules and MathML DTD modules and ignore them in your tagging and in your documentation; or
- Write your own namespace setup module that declares the namespaces mentioned above.
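For the second option, a namespace setup module could look roughly like the sketch below. The parameter entity names are assumptions modeled on a common prefix / URI / attribute pattern. The JATS modules that consume these definitions may expect specific names, so check them against the MathML setup and QName modules before relying on this.

```dtd
<!-- XLink -->
<!ENTITY % XLINK.prefix "xlink" >
<!ENTITY % XLINK.xmlns  "http://www.w3.org/1999/xlink" >
<!ENTITY % XLINK.xmlns.attrib
    "xmlns:%XLINK.prefix; CDATA #FIXED '%XLINK.xmlns;'" >

<!-- MathML (declared so that the prefix stays reserved) -->
<!ENTITY % MATHML.prefix "mml" >
<!ENTITY % MATHML.xmlns  "http://www.w3.org/1998/Math/MathML" >
<!ENTITY % MATHML.xmlns.attrib
    "xmlns:%MATHML.prefix; CDATA #FIXED '%MATHML.xmlns;'" >

<!-- W3C Schema Instance -->
<!ENTITY % Schema.prefix "xsi" >
<!ENTITY % Schema.xmlns  "http://www.w3.org/2001/XMLSchema-instance" >
<!ENTITY % Schema.xmlns.attrib
    "xmlns:%Schema.prefix; CDATA #FIXED '%Schema.xmlns;'" >
```

Each “.xmlns.attrib” entity expands to the pseudo-attribute text listed above (for example, xmlns:xlink CDATA #FIXED 'http://www.w3.org/1999/xlink'), ready to be referenced inside an attribute list declaration.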