Version 2.1
[Updated versions of the Tag Suite have been released. Current version information is available here.]
There were several rationales for revising the NLM Archiving and Interchange Tag Suite:
- The W3C had produced new DTD and schema versions of MathML;
- The W3C had produced (in association with MathML) new, more complete sets of general character entities;
- The Authoring Tag Set was ready for release, and it seemed prudent both to use the latest versions of everything for the new Tag Set and not to let it get ahead of the other Tag Sets; and
- The AIT Working Group, and others via the listserv, had requested changes since the last release.
Although the changes are fully backwards compatible for XML documents (document instances), the new Archiving Tag Set and the full Tag Suite may not be backwards compatible for all previous customizations.
Summary of Version 2.1 Changes
The major changes for this release were to amend the Module of Modules (%modules.ent;) to reflect the new MathML and character sets, including a revised directory structure for both the MathML modules and the general entity set modules. This necessitated changing the version of all the Tag Suite files, so that they point to the new Module of Modules. Minor changes requested by the committee are also reflected in many of the modules.
Editors' Note: We took the opportunity of the revision to fix typos, alignment errors, element order in parameter entities, and infelicities in the wording of comments. We wish to thank the many users who brought these to our attention.
Changes to the Entire Suite
The version number for the Suite modules was set to “v2.1 20050630”.
Versioning Note: All modules change version numbers at a numbered release, but, for a dot release, a module that has not changed (for example, %phrase.ent;) retains its previous version number. Therefore, only modules that have changed are marked as version 2.1.
MathML DTD Upgrade
The new MathML required no changes to the math setup files %math.ent; and %mathmlsetup.ent;. Suite Version 2.1 adds the latest version of the MathML 2.0 DTD modules (mathml2.dtd,v 1.12 2003/11/04). The following files have been replaced (no new modules were added):
- mathml2.dtd;
- mathml/mmlextra.ent;
- mathml/mmlalias.ent; and
- mathml2-qname-1.mod.
The new parameter entity, %MathMLstrict;, was left at its default, “IGNORE”. Setting this entity to “INCLUDE” would enable marked sections that enforce stricter checking of MathML syntax rules.
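As a minimal sketch of how stricter checking might be switched on for a single document, the entity can be redeclared in the internal subset, which is read before the external DTD so that the first declaration wins. The system identifier archivearticle.dtd is an assumption used only for illustration; substitute whatever identifier your installation uses.

```xml
<!DOCTYPE article SYSTEM "archivearticle.dtd" [
  <!-- Overrides the Suite default of "IGNORE"; the file name above is assumed -->
  <!ENTITY % MathMLstrict "INCLUDE">
]>
```

The same declaration could instead be placed in a customization module, provided that module is read before the MathML DTD is invoked.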
Not only do the new math modules completely replace the old, but there has been a directory level change. The module %mathml2-qname-1.mod; is now inside the top level (as a peer of the %mathml.dtd;) instead of one level down in the mathml directory.
MathML Namespaces
For reasons of backwards compatibility, the MathML prefix for the Suite will continue to be “mml”, although the latest MathML DTD defaults to a prefix of “m”.
(Implementor’s Note: In Version 2.1 (as in all previous versions) the MathML namespace pseudoattribute has been implemented as a FIXED attribute in the DTD. Some XML processors (for example, certain implementations of the MSXML parser) do not recognize the defaulted value and require that the MathML namespace be declared explicitly on the top-level <article> element in the instance. The same implementations also require an explicit pseudoattribute for the XLink namespace.)
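For processors that ignore the defaulted values, a document instance might declare both namespaces explicitly on the top-level element. The following is a sketch only; the comment stands in for real article content.

```xml
<article xmlns:mml="http://www.w3.org/1998/Math/MathML"
         xmlns:xlink="http://www.w3.org/1999/xlink">
  <!-- front, body, and back matter go here -->
</article>
```

Because the declared values match the FIXED values in the DTD, the explicit declarations should remain valid for processors that do honor the defaults.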
MathML Character Set Upgrade
The sets of general entities for special characters for the Suite have always been taken directly from the W3C MathML character sets. Since the MathML site has modified their character entity sets, the Suite was changed to match. The new sets of entities:
- Match Unicode 4.0;
- Have changed the older private use areas to 4.0 mappings; and
- Use a new directory structure, which separates the ISO 8879 (SGML) sets from the ISO 9573-13 (ISO Technical Report) sets.
Therefore a new directory structure was adopted for the sets of character entities in the Suite. To match the new MathML directories, there are now 3 character subdirectories:
iso8879 | Characters defined originally in the SGML specification ISO 8879 (directory patterned on MathML) |
iso9573-13 | Characters originally defined in 8879 but redefined in ISO Tech Report ISO 9573-13 (directory patterned on MathML) |
xmlchars | The three Greek alphabet sets not used in MathML but carried forward because they were used in earlier versions of the DTD Suite (Suite specific) |
The module %xmlspecchars.ent; was modified to invoke the new files and directory structure.
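To illustrate the three subdirectories, entity set calls of roughly the following shape could appear in such a module. This is a sketch under stated assumptions: the parameter entity names and the file names isobox.ent and isogrk1.ent are chosen for illustration, and public identifiers are omitted; only the directory names and isoamsa.ent come from the description above.

```xml
<!-- One illustrative call per subdirectory; entity names are assumed -->
<!ENTITY % ISObox SYSTEM "iso8879/isobox.ent">
%ISObox;

<!ENTITY % ISOamsa SYSTEM "iso9573-13/isoamsa.ent">
%ISOamsa;

<!ENTITY % ISOgrk1 SYSTEM "xmlchars/isogrk1.ent">
%ISOgrk1;
```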
The MathML DTD parameter entity mathml-charent.module was set to IGNORE to get rid of the invocation of the character sets from within the MathML DTD itself. Characters for the Suite must be called independently of MathML so that the Suite can be used without MathML. Since ignoring all of the entity set calls in the MathML DTD also gets rid of the mmlextra and mmlalias calls, those were also added to the MathML setup module, which was already calling the MathML DTD.
(Implementor Alert: On the W3C website, the current MathML DTD includes some entity files in both the 8879 and 9573 sets. For example, isoamsa.ent is in both directories, but the copy in the iso9573-13 directory has added characters. This Suite chose to use the most inclusive of the entity files referenced in the MathML DTD. We also did not fix the well-known "dagger" problem: our entity sets still contain entities for dagger and double dagger in both the isopub and isoamsb modules, and the preferred double dagger within MathML [but not elsewhere] is the MathML alias %ddagger;.)
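The arrangement described above might look roughly like the following in a setup module. This is a sketch, not the Suite's actual text: the parameter entity names ent-mmlextra, ent-mmlalias, and mathml.dtd are assumptions, public identifiers are omitted, and the file paths are taken from the list of replaced files.

```xml
<!-- Switch off the character set calls inside the MathML DTD.
     This must precede the DTD call, since the first declaration wins. -->
<!ENTITY % mathml-charent.module "IGNORE">

<!-- Call the two MathML-specific sets directly (entity names assumed) -->
<!ENTITY % ent-mmlextra SYSTEM "mathml/mmlextra.ent">
%ent-mmlextra;
<!ENTITY % ent-mmlalias SYSTEM "mathml/mmlalias.ent">
%ent-mmlalias;

<!-- Finally, invoke the MathML DTD itself -->
<!ENTITY % mathml.dtd SYSTEM "mathml2.dtd">
%mathml.dtd;
```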
Copyright and Permissions
The copyright information in the previous Suite was limited to a spartan <copyright-statement> and <copyright-year>. A new <copyright-holder> element was added and a new wrapper element <permissions> was added to consolidate the copyright and licensing information. The model for <permissions> is the following, in order, even for Archiving (Green):
- copyright-statement
- copyright-year
- copyright-holder
- license
For backwards compatibility, <permissions> was added to places (such as <article-meta>) where one of the copyright elements was also allowed. The <permissions> element does not replace the copyright elements that were already there; it is allowed in addition to them. The documentation will explain that using the <permissions> wrapper is best practice, but previously tagged material will not need to be changed.
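A tagged instance following the model above might look like this sketch. The publisher name and license wording are placeholders, and the use of <p> inside <license> is an assumption about the license model rather than something stated here.

```xml
<permissions>
  <copyright-statement>Copyright 2005 Example Publishing Group</copyright-statement>
  <copyright-year>2005</copyright-year>
  <copyright-holder>Example Publishing Group</copyright-holder>
  <license>
    <!-- Paragraph content inside license is assumed -->
    <p>Placeholder license text.</p>
  </license>
</permissions>
```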
In order to make this change, the following were moved to the common module:
- %license-atts;
- %license-model;
- <copyright-year>
- <license>
Permissions have been added to:
- <appendix>
- <article-meta>
- The parameter entity display-back-matter.class and thus to the default of the following elements:
  - <array>
  - <boxed-text>
  - <chem-struct-wrapper>
  - <disp-formula>
  - <disp-quote>
  - <fig>
  - <graphic>
  - <preformat>
  - <statement>
  - <supplementary-material>
  - <table-wrap>
  - <table-wrap-foot>
  - <verse-group>
Minor Base Suite Changes
The following changes were made in several modules. Each module has an updated change history.
- List Item: The attribute list for <list-item> was made into a parameter entity, so that, for example, individual DTDs could change the attribute “id” from CDATA to type ID. (There was already a parameter entity for <list> to allow the same change.)
- Titles and Subtitles
  - The new element <journal-subtitle> was defined in %common.ent; and used in <journal-meta>.
  - Added <journal-subtitle> to <journal-meta> through the parameter entity %journal-meta-model;.
  - Added <journal-subtitle> to the references class.
  - The xml:lang attribute was associated with the <subtitle> element.
  - Added the optional <trans-subtitle> element to the <article-meta> model through the parameter entity %title-group-model;.
- Attribute Changes
  - Journal Identifier Attributes: The parameter entity %related-article-atts; was changed to use the parameter entity %journal-id-atts;. The entity %journal-id-atts; was moved to %common.ent; to allow this use.
  - doaj and manuscript: The new values “doaj” (Directory of Open Access Journals) and “manuscript” (Manuscript) were added to %pub-id-types; as well as to the list of suggested journal ID types.
  - Hard-coded Date Attributes: In the common modules, the parameter entity %date-atts; was defined but not used on the <date> element. Since the attribute list was hard-coded at the element, it could not be overridden. The parameter entity is now used, allowing the override to work as designed.
  - X-Generated Text: Added the xml:space attribute with a value of “preserve” to the <x> element (per list request).
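As an illustration of the List Item change, a customization module could redeclare the attribute list entity before the Suite's own declaration is read. The entity name list-item-atts is an assumption, since the name is not given above; check the list module for the actual name before relying on this sketch.

```xml
<!-- In a customization module read before the Suite's list module.
     Entity name is assumed; the first declaration takes precedence. -->
<!ENTITY % list-item-atts
           "id         ID         #IMPLIED">
```

The same override pattern would apply to %date-atts; now that the <date> element uses that entity rather than a hard-coded attribute list.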
OASIS XML Catalog
Both an old-style OASIS SOCAT and an XML catalog file will be delivered with the Suite. The XML catalog contains instructions for modifying and setting up the catalog.
<!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
  "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd">
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" prefer="public">
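For orientation, an entry inside such a catalog maps a public identifier to a local file using the OASIS `public` element. The sketch below uses a deliberately hypothetical public identifier and path; the real values are those supplied in the catalog file delivered with the Suite.

```xml
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" prefer="public">
  <!-- Hypothetical entry: identifier and path are placeholders -->
  <public publicId="-//EXAMPLE//DTD Placeholder Tag Set v2.1//EN"
          uri="placeholder/tagset.dtd"/>
</catalog>
```

With prefer="public", a resolver that finds a matching public identifier uses the mapped uri in place of the system identifier given in the document.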
Tag Sets
These Tag Sets are available in version 2.1:
National Center for Biotechnology Information
U.S. National Library of Medicine
8600 Rockville Pike, Bethesda, MD 20894
Last updated: September 14, 2012