Introduction to the BITS Book Tag Set
The Book Interchange Tag Suite (BITS) contains, at least initially, an XML model for STM books that is based on the Journal Article Tag Suite (ANSI/NISO Z39.96-2012). The intent of BITS is to provide a common format in which publishers and archives can exchange book content, whether an entire book or merely a book part such as a chapter. The Suite provides a set of XML schema modules that define elements and attributes for describing the textual and graphical content of books and book components as well as a packaging element for book part interchange.
The spring-board for developing BITS was the observation that a JATS-based XML book model, or a set of related book models, would be useful to a wide variety of publishers of professional and scholarly books, especially but not exclusively to publishers who are already using one of the NISO JATS journal article models and looking for a compatible model for their books. Just as the wide support of the NISO Journal Article Tag Sets enabled many journal publishers to move to XML, a well-supported XML book model (appropriate to their needs and compatible with current citation management tools) would enable book publishers to move to XML and through XML to electronic publication and archiving.
Why add yet another book model to the world? By developing book model(s) based on the existing NISO JATS Tag Suite, we hope to enable publishers to include their books in the systems that already create, manage, publish, and archive their journal articles and to build on the investment they, as well as their suppliers and vendors, have made in learning and developing around JATS.
The goal for BITS is to make a tag set adequate for supporting interchange, archiving, format-conversion, and publishing for scientific, technical, and medical books. As was true for JATS, the intent of BITS is to support marking up the content of material so that it can be reused, repurposed, and made more discoverable. This purpose implies, as it does in JATS, that the ability to reproduce a particular book format is not a goal.
The BITS Book Interchange DTD is a superset customization of the ANSI/NISO JATS Z39.96-2012 Journal Archiving and Interchange Tag Set, with added material to describe STM books, book components such as chapters, and information concerning the inclusion of books and book components in book series. The Tag Set describes both the metadata and the narrative content of a book, both the metadata and narrative content for book components, and collection-level metadata for book sets and book series, when a book part is associated with one or more such collections.
The BITS book model is intended to support scholarly, reference, higher education, medical, and technical books. Just as the NISO JATS journal article models do not attempt to support magazines or any of a wide variety of other serial publications, the book models are not intended to describe trade books, cookbooks, grade school textbooks, or any of the wide variety of books outside the scientific, technical, and medical realm. Although this work is being supported by the National Library of Medicine, this book model should be usable beyond life sciences publishing, as the NISO JATS journal article models are useful in physics, social sciences, linguistics, and poetry.
An explicit goal was the creation of models that would enable the construction of books comprised of articles. The intent was to enable article bodies to pass nearly unchanged into book parts, with changes to only the outer wrapping element and certain book-specific metadata reflecting the move from an issue of a journal to the chapter of a book.
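As a rough illustration of this goal, the two separate fragments below sketch how the body of a journal article might carry over into a book part. The titles, the paragraph text, and the @book-part-type value “chapter” are hypothetical; the exact content models should be checked against the element pages.

```xml
<!-- Fragment 1 (JATS): an article as it might appear in a journal issue -->
<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>Hypothetical Article Title</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Methods</title>
      <p>Narrative text of the article.</p>
    </sec>
  </body>
</article>

<!-- Fragment 2 (BITS): the same body reused inside a book part.
     Only the outer wrapper and the metadata differ. -->
<book-part book-part-type="chapter">
  <book-part-meta>
    <title-group>
      <title>Hypothetical Article Title</title>
    </title-group>
  </book-part-meta>
  <body>
    <sec>
      <title>Methods</title>
      <p>Narrative text of the article.</p>
    </sec>
  </body>
</book-part>
```

In this sketch the <body> content is identical in both fragments, which is the point of keeping the JATS names and models wherever books and articles share a structure.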
The BITS Book Interchange Tag Set was constructed using the modules in the NISO JATS and adding modules to define components that are specific to books. BITS was based on NISO JATS Version 1.1 (ANSI/NISO Z39.96-2015), which is the latest iteration of the original NLM journal article DTDs. This relationship between JATS and BITS has been quite strictly defined: “The models should be as similar as possible and only as different as necessary.” Therefore, if JATS has a named structure that also occurs in books, the JATS name (and, to the extent possible, the JATS content model and attributes) was used for BITS.
Book Part Naming
Concerning the many terms used to name components of books (such as chapter, part, unit, module, lesson, segment, division, and section), BITS is entirely agnostic. BITS divides books into book parts and leaves it up to the publisher to call them chapters or units or anything else.
There are, however, a few named book parts in BITS, largely in the narrative front matter and the back matter of a book. Because BITS, like its JATS parent, does not lead publishers but tries to consolidate current publishing practice, the following publisher-requested named book parts have been built into BITS:
- Table of Contents (a structural Table of Contents that can be edited)
- Index (a structural index that can be edited)
The named structures for Tables of Contents and Indexes have many specific, unique included elements. In contrast, the named front matter parts, such as “Preface”, are modeled as generic structures. This means that a publisher can choose to use the named book parts or use the element <book-part> to tag all the parts of the book.
Publishers also give many names to collections of books and/or book components, such as book sets, book series, monograph series, and the like. BITS is entirely agnostic concerning such collective nouns and merely collects the metadata naming such a grouping.
Book Structural Overview
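There are two top-level elements in the BITS Book Tag Set:
- Book (<book>). A complete book, or the metadata for a book.
- Book Part Wrapper (<book-part-wrapper>). A single book part packaged for interchange.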
Just as a NISO JATS top-level article element may contain only the metadata for an article and none of the narrative text, a BITS top-level book element may contain only the metadata for a book and not the narrative text of the book or any book part. This allows publishers and archives to use JATS and BITS for exchange of metadata even when not preserving the textual content of a document in XML.
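A metadata-only book might look something like the following minimal sketch. The title, year, and publisher name are hypothetical, and the order of the children of <book-meta> is given from memory of the model, so it should be verified against the <book-meta> element page.

```xml
<!-- A book document that carries metadata only: no front matter,
     no body, and no back matter are present. -->
<book>
  <book-meta>
    <book-title-group>
      <book-title>Hypothetical Book Title</book-title>
    </book-title-group>
    <pub-date>
      <year>2015</year>
    </pub-date>
    <publisher>
      <publisher-name>Hypothetical Press</publisher-name>
    </publisher>
  </book-meta>
</book>
```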
If both the metadata and the text of a book are to be tagged in XML, a book may be composed of the following components:
- Collection Metadata (optional, repeatable). Bibliographic metadata describing a book set or series to which this book or book part belongs. A book or book part may be part of many collections.
- Book Metadata (optional). The book metadata element (<book-meta>) contains the metadata for the book, for example, the title of the book, the date of publication, the publisher, a copyright statement, etc. This is not the textual front matter that appears at the beginning of a book, rather this is bibliographic information about the book.
- Front Matter (optional). If present, the front matter element (<front-matter>) contains the textual front material for a book, such as a Dedication, Foreword, or Preface. (Note: This is a different naming than in JATS, where “front matter” refers to the metadata of the journal article.)
- Body of the Book (optional). If present, the body of the book element (<book-body>) contains the narrative of the work, the main textual and graphic content of the book. The body of a book contains book parts (<book-part>), which may be called parts, sections, chapters, modules, lessons, or whatever divisions a publisher has named. Book parts are recursive, so they may contain other book parts. For example, “Part 3” of a book could contain several “Chapter”s, each of which could have a foreword, the body of the chapter, one or more appendices, and a reference list.
- Back Matter for the Book (optional). If present, the book back matter element (<book-back>) contains information that is ancillary to the main text, such as a glossary, appendix, or list of cited references. The back matter may also contain floating material (<floats-group>), a container element for all the “floating” objects (such as tables, figures, and sidebars) in a book. The back matter of book parts (<back>) and the <book-part-wrapper> element may also contain their own, separate Floating Material (<floats-group>).
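The components listed above can be sketched as a skeleton document. This is an illustration only: all titles and text are hypothetical, the @book-part-type values are made up, and two details are assumptions to confirm on the element pages, namely that a nested <book-part> sits inside the <body> of its parent book part and that a <preface> holds its text in a <named-book-part-body>.

```xml
<book>
  <!-- Collection Metadata (optional, repeatable) -->
  <collection-meta>
    <title-group>
      <title>Hypothetical Monograph Series</title>
    </title-group>
  </collection-meta>

  <!-- Book Metadata (optional): bibliographic information about the book -->
  <book-meta>
    <book-title-group>
      <book-title>Hypothetical Book Title</book-title>
    </book-title-group>
  </book-meta>

  <!-- Front Matter (optional): textual front material -->
  <front-matter>
    <preface>
      <named-book-part-body>
        <p>Text of the preface.</p>
      </named-book-part-body>
    </preface>
  </front-matter>

  <!-- Body of the Book (optional): recursive book parts -->
  <book-body>
    <book-part book-part-type="part">
      <book-part-meta>
        <title-group>
          <label>Part 3</label>
          <title>Hypothetical Part Title</title>
        </title-group>
      </book-part-meta>
      <body>
        <!-- A book part nested inside another book part -->
        <book-part book-part-type="chapter">
          <book-part-meta>
            <title-group>
              <title>Hypothetical Chapter Title</title>
            </title-group>
          </book-part-meta>
          <body>
            <p>Narrative text of the chapter.</p>
          </body>
        </book-part>
      </body>
    </book-part>
  </book-body>

  <!-- Back Matter for the Book (optional): ancillary material -->
  <book-back>
    <ref-list>
      <title>References</title>
      <ref id="ref1">
        <mixed-citation>Hypothetical cited work.</mixed-citation>
      </ref>
    </ref-list>
  </book-back>
</book>
```

Because every component is optional, any of the five top-level children could be dropped from this skeleton without affecting the others.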
A Book Part Wrapper
The second top-level element in this Tag Set is the book part wrapper (<book-part-wrapper>), which contains a single book part to be interchanged, along with the metadata that describes the book part and collection metadata that describes a grouping (such as a virtual book) of which the book part is a member. A book part may be associated with many collections. The collection metadata for each book can be stored in the book document instance.
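A book part packaged for interchange might be sketched as follows. The titles and text are hypothetical, and the presence of <book-meta> inside the wrapper is an assumption about the content model that should be checked on the <book-part-wrapper> element page.

```xml
<book-part-wrapper>
  <!-- A collection (here, a virtual book) to which the book part belongs -->
  <collection-meta>
    <title-group>
      <title>Hypothetical Virtual Book</title>
    </title-group>
  </collection-meta>

  <!-- Metadata for the book from which the part was taken (assumed allowed) -->
  <book-meta>
    <book-title-group>
      <book-title>Hypothetical Book Title</book-title>
    </book-title-group>
  </book-meta>

  <!-- The single book part being interchanged -->
  <book-part book-part-type="chapter">
    <book-part-meta>
      <title-group>
        <title>Hypothetical Chapter Title</title>
      </title-group>
    </book-part-meta>
    <body>
      <p>Narrative text of the chapter.</p>
    </body>
  </book-part>
</book-part-wrapper>
```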
The Roads Not Taken
The BITS Book Tag Set is not based on the NLM Book (Bookshelf) Tag Set that was part of the NLM predecessor to JATS. Instead, the BITS book model is based on the NISO JATS Journal Archiving and Interchange Tag Set (known as “Green” from the colors of the Tag Library pages), with book metadata in BITS replacing the journal and issue metadata from JATS Archiving.
By design, there are some structures which have been modeled as elements in other public book DTDs and schemas that were not named explicitly in BITS, including:
- Book Metadata — The book metadata is held in the element <book-meta>, which is not named “front” as is the corresponding element holding the article metadata for a JATS journal article. Book metadata is unlike that for journal articles, making this one of the real changes from ANSI/NISO JATS.
- CCC Statement — There is no element with this name in BITS. In BITS, this may be tagged as a license paragraph (<license-p>) inside a license statement (<license>), which may include the price tagged as a <price>. The @license-type attribute may be set to “CCC-statement” to identify the information.
- Colophon — In BITS, this may be tagged as a paragraph, as a section within the body of the book, as a book part in the back matter of a book, or as a book part.
- Contributor List — There is no special structure for these lists; they can be tagged using ordinary structures (lists, definition lists, tables, paragraphs, etc.) within a <front-matter-part>, within the narrative front matter, or within the back matter. Such lists should be written in addition to the contributor names (<contrib>) listed in the metadata of a book or book part.
- Frontispiece — No special structure was named in BITS; the material should be tagged using ordinary structures within a <front-matter-part> as part of the narrative front matter. While the <styled-content> element “could” be used to capture the special formatting typical in a Frontispiece, this is discouraged. The typical purpose for tagging a Frontispiece is to make the information content discoverable, not to replicate the look and feel of the document.
- Introduction — There will be no explicitly named “Introduction” element, because this name may be applied at many levels: to a front-matter component, a section of the body, or an entire book part.
- Map Group — This NLM-specific element was used in the previous NLM Book Tag Set; it was not replicated in BITS.
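The CCC Statement entry in the list above can be sketched as the following fragment. The wording of the statement, the price, and the @currency value are hypothetical; the enclosing <permissions> element is where a license statement normally lives in JATS-based metadata.

```xml
<permissions>
  <!-- @license-type identifies this license as a CCC statement -->
  <license license-type="CCC-statement">
    <license-p>Authorization to photocopy items for internal or personal
      use is granted by the hypothetical publisher, provided that the fee of
      <price currency="USD">$10.00</price> per copy is paid directly to the
      Copyright Clearance Center.</license-p>
  </license>
</permissions>
```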
How to Read This Tag Library
Terms and Definitions
Elements are nouns, like “speech” and “speaker”, that represent components of books and book parts, the full text of the book or book part, and accompanying metadata.
Attributes hold facts about an element, such as which type of list (e.g., numbered, bulleted, or plain) is being requested when using the List (<list>) tag, or the name of a pointer to an external file that contains an image. Each attribute has both a name (e.g., @list-type) and a value (e.g., “bullet”).
Metadata is data about the data, for example, bibliographic information. The distinction is between metadata elements, which describe a book or a book part (such as the name of the book or a chapter title), and elements that contain the textual and graphical content of the book or book part.
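The following fragment is a small sketch of elements and attributes working together. The list items and the file name “figure1.tif” are hypothetical.

```xml
<!-- The element <list> with the attribute @list-type set to "bullet" -->
<list list-type="bullet">
  <list-item>
    <p>First item</p>
  </list-item>
  <list-item>
    <p>Second item</p>
  </list-item>
</list>

<!-- An attribute that points to an external file containing an image -->
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="figure1.tif"/>
```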
How To Start Using This Tag Library
How you use the documentation will depend on what you need to learn about the modules and this Tag Set.
Learn this Tag Set
If you want to learn about the elements and the attributes in this Tag Set so you can tag documents or learn how the BITS book models are constructed, here is a good way to start.
- Read the Tag Library General Introduction, taking particular note of the next section that describes the parts of the Tag Library so you will know what resources are available.
- Next, if you do not know the symbols used in the Document Hierarchy diagrams, read the “Key to the Near & Far® Diagrams”.
- Scan the Document Hierarchy diagrams to get a good sense of the top-level elements and their contents. (Find what is inside a <book>, then what is inside each of the large pieces of a book, such as the narrative front matter or the body containing <book-part>s. Keep working your way down, from largest to smallest.)
- Pick an element from one of the diagrams. (Look up the element in the Elements Section to find the full element, the definition, usage notes, content allowed inside the element, where the element may be used, and a list of any attributes. Look up one of the attributes to find its full name, usage notes, potential values, and whether it has a default.)
Finally, if you are interested in conversion from a particular source:
- Look at a book or chapter in a printed or online book source (and look at the DTD/schema for the other publication if there is one).
- Can all the information you want to store from a book or chapter fit into the models shown in the diagrams?
- Do you have, or know how to get, all the information the models require? Will that information always be available for documents that are complete and correct?
- How difficult will it be to identify the parts of the information using the elements and attributes described in these models? Would changes to one or more models make this easier?
Structure of This Tag Library
This Tag Library contains the following sections:
How To Use (Read Me First)
How to make best use of this Tag Library to reference XML tags, become familiar with the BITS Tag Set as a whole, or see examples of recommended usage.
General Introduction
This introduction to the contents of this Tag Library, to the design philosophy and intended usage of the JATS DTD Suite, and to the BITS Tag Set.
Elements Section
Descriptions of the elements used in the BITS Tag Set and the parts of the JATS DTD Suite used in this Tag Set. The element descriptions are listed in alphabetical order by tag name.
[Note: Each element has two names: a “tag name” (formally called an element-type name) that is used in tagged documents, in the DTDs/schemas, and by XML software; and an “element name” (usually longer) that provides a fuller, more descriptive name for the benefit of human readers. For example, a tag name might be <disp-quote> with the corresponding element name Quote, Displayed, or a tag name might be <verse-group> with the corresponding element name Verse Form for Poetry.]
Attributes Section
Descriptions of the attributes used in the BITS Tag Set. Like elements, attributes also have two names: the shorter machine-readable one and a (usually longer) human-readable one. Attributes are listed in order by the shorter, machine-readable names. For example, an attribute is listed under the short name @list-type rather than under the more informal, easier-to-read name Type of List.
Parameter Entity Section
Names (with occasional descriptions) and contents of the parameter entities in the JATS and BITS DTD modules.
Document Hierarchy Diagrams
Tree-like graphical representations of the content of many elements. This can be a fast, visual way to determine the structure of a book, a book part, or any complex element within a book.
Common Tagging Practice
Tips, tricks, hints, and examples of how (and why) to tag certain structures using this Tag Set.
Brief description of how NISO JATS approaches the 508 and WCAG 2.0 Accessibility issues.
Modifying This Tag Set
Implementor’s instructions for using this Tag Set, customizing this Tag Set, or making derivative tag sets based on this one.
Element Context Table
A listing of where each element may be used. All elements in this Tag Set are given in a single alphabetical list.
The Context Table is formatted in two columns. The first column (“This Element”) names an element, with the name shown in pointy brackets. The second column (“May Be Contained In”) gives, for each element, an alphabetical list of all the elements in which the first-column element may occur. For example, if the first column contains the element <book-back> and the second column contains only the <book> element, this means that the <book-back> element may only be used directly inside a <book>. Most elements may be used inside more than one other element. For example, the element <def> (a definition) may be used inside the <abbrev> and the <def-item> elements.
The Context Table contains the same information that is found on each element page under the heading “This element may be contained in”.
Supporting Documentation Home
The BITS Tag Set is available in three forms: an XML Document Type Definition (DTD), a W3C XML Schema (XSD), and a RELAX NG Schema (RNG). Each of these formats is available in two forms: a zipped file containing a downloadable version of the schema (often in multiple files), and a readable/browsable version in which the internal markup has been escaped.
Tag Library Typographic Conventions
|The tag name of an element (written in lower case with the entire name surrounded by “< >”)
Alternate Text Name (for a figure, etc.) |The element name (long descriptive name of an element) or the descriptive name of an attribute (written in title case, with important words capitalized, and the words separated by spaces)
|The “@” sign before a name indicates an attribute name.
|Emphasis to stress a point