Introduction to the BITS Book Tag Set
The Book Interchange Tag Suite (BITS) contains, at least initially, an XML model for STM books that is based on the Journal Article Tag Suite (ANSI/NISO Z39.96-2021). The intent of BITS is to provide a common format in which publishers and archives can exchange book content, whether an entire book or merely a book part such as a chapter. The Suite provides a set of XML schema modules that define elements and attributes for describing the textual and graphical content of books and book components as well as a packaging element for book part interchange.
The springboard for developing BITS was the observation that a JATS-based XML book model, or a set of related book models, would be useful to a wide variety of publishers of professional and scholarly books, especially but not exclusively to publishers who are already using one of the NISO JATS journal article models and looking for a compatible model for their books. Just as the wide support of the NISO Journal Article Tag Sets enabled many journal publishers to move to XML, a well-supported XML book model (appropriate to their needs and compatible with current citation management tools) would enable book publishers to move to XML and through XML to electronic publication and archiving.
Why add yet another book model to the world? By developing book model(s) based on the existing NISO JATS Tag Suite, we hope to enable publishers to include their books in the systems that already create, manage, publish, and archive their journal articles and to build on the investment they, as well as their suppliers and vendors, have made in learning and developing around JATS.
The goal for BITS is to make a tag set adequate for supporting interchange, archiving, format-conversion, and publishing for scientific, technical, and medical books. As was true for JATS, the intent of BITS is to support marking up the content of material so that it can be reused, repurposed, and made more discoverable. This purpose implies, as it does in JATS, that the ability to reproduce a particular book format is not a goal.
The BITS Book Interchange DTD is a superset customization of the ANSI/NISO JATS Z39.96-2021 Journal Archiving and Interchange Tag Set, with added material to describe STM books, book components such as chapters, and information concerning the inclusion of books and book components in book series. The Tag Set describes both the metadata and the narrative content of a book, both the metadata and narrative content for book components, and collection-level metadata for book sets and book series, when a book part is associated with one or more such collections.
The BITS book model is intended to support scholarly, reference, higher education, medical, and technical books. Just as the NISO JATS journal article models do not attempt to support magazines or any of a wide variety of other serial publications, the book models are not intended to describe trade books, cookbooks, grade school textbooks, legal documents, or any of the wide variety of books outside the scientific, technical, and medical realm. Although this work is being supported by the National Library of Medicine, this book model should be usable beyond life sciences publishing, as the NISO JATS journal article models are useful in physics, social sciences, linguistics, and poetry.
An explicit goal was the creation of models that would enable the construction of books composed of articles. The intent was to enable article bodies to pass nearly unchanged into book parts, with changes to only the outer wrapping element and certain book-specific metadata reflecting the move from an issue of a journal to the chapter of a book.
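The re-wrapping described above can be sketched in a few lines of Python. This is only an illustration, not a conversion tool defined by BITS: the sample article, the function name `article_to_book_part`, and the choice to carry over only the title are assumptions, and the output has not been validated against the BITS DTD.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed JATS article used only for illustration.
ARTICLE_XML = """
<article>
  <front>
    <article-meta>
      <title-group><article-title>Sample Title</article-title></title-group>
    </article-meta>
  </front>
  <body><sec><title>Methods</title><p>Text.</p></sec></body>
  <back><ref-list><title>References</title></ref-list></back>
</article>
"""

def article_to_book_part(article, part_type="chapter"):
    """Re-wrap a JATS <article> as a BITS <book-part> (illustrative sketch)."""
    book_part = ET.Element("book-part", {"book-part-type": part_type})
    # Book-part metadata takes the place of the article's <front>.
    # Assumption: only the plain-text title is carried over here.
    meta = ET.SubElement(book_part, "book-part-meta")
    title_group = ET.SubElement(meta, "title-group")
    title = ET.SubElement(title_group, "title")
    title.text = article.findtext("front/article-meta/title-group/article-title")
    # The body and back matter are copied without modification.
    for name in ("body", "back"):
        child = article.find(name)
        if child is not None:
            book_part.append(copy.deepcopy(child))
    return book_part

article = ET.fromstring(ARTICLE_XML)
book_part = article_to_book_part(article)
print(ET.tostring(book_part, encoding="unicode"))
```

Only the outer element and the metadata wrapper change; the `<body>` and `<back>` subtrees are deep-copied as they stand, which is the property the design goal aims for.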
The BITS Book Interchange Tag Set was constructed using the modules in the NISO JATS and adding modules to define components that are specific to books. BITS 2.0 was based on ANSI/NISO JATS Version 1.1 (ANSI/NISO Z39.96-2015). BITS 2.1 is based on ANSI/NISO JATS Version 1.3 (ANSI/NISO Z39.96-2021), which is the latest iteration of the original NLM journal article DTDs. This relationship between JATS and BITS has been quite strictly defined: “The models should be as similar as possible and only as different as necessary”. Therefore, if JATS has a named structure that also occurs in books, BITS uses the JATS name (and, to the extent possible, the JATS content model and attributes), either directly or in extended form.
Book Part Naming
Concerning the many terms used to name components of books (such as chapter, part, unit, module, lesson, segment, division, and section), BITS is entirely agnostic. BITS divides books into book parts and leaves it up to the publisher to call them chapters or units or anything else.
There are, however, a few named book parts in BITS, largely in the narrative front matter and the back matter of a book. Because BITS, like its JATS parent, does not lead publishers but tries to consolidate current publishing practice, the following publisher-requested named book parts have been built into BITS:
- Table of Contents (a structural Table of Contents that can be edited)
- Index (a structural index that can be edited)
The named structures for Tables of Contents and Indexes have many specific, unique included elements. In contrast, the named front matter parts, such as “Preface”, are modeled as generic structures. This means that a publisher can choose to use the named book parts or use the element <book-part> to tag all the parts of the book.
Publishers also give many names to collections of books and/or book components, such as book sets, book series, monograph series, and the like. BITS is entirely agnostic concerning such collective nouns and merely collects the metadata naming such a grouping.
Book Structural Overview
There are two top-level elements in the BITS Book:
- the Book element (<book>), to contain an entire document such as a textbook or a monograph; and
- the Book Part Wrapper element (<book-part-wrapper>), to contain a book part such as a “chapter” or “module” that needs to be handled as a discrete unit.
Just as a NISO JATS top-level <article> element may contain only the metadata for an article and none of the narrative text, a BITS top-level <book> element may contain only the metadata for a book and not the narrative text of the book or any book part. This allows publishers and archives to use JATS and BITS for exchange of metadata even when not preserving the textual content of a document in XML.
If both the metadata and the text of a book are to be tagged in XML, a book may be composed of the following components:
- Processing Metadata (optional). The metadata that concerns the XML file rather than the contents of the book.
- Collection Metadata (optional, repeatable). Bibliographic metadata describing a book set or series to which this book or book part belongs. A book or book part may be part of many collections.
- Book Metadata (optional). The book metadata element (<book-meta>) contains the metadata for the book, for example, the title of the book, the date of publication, the publisher, a copyright statement, etc. This is not the textual front matter that appears at the beginning of a book; rather, it is bibliographic information about the book.
- Front Matter (optional). If present, the front matter element (<front-matter>) contains the textual front material for a book, such as a Dedication, Foreword, or Preface. (Note: This naming differs from JATS, where “front matter” refers to the metadata of the journal article.)
- Body of the Book (optional). If present, the body of the book element (<book-body>) contains the narrative of the work, the main textual and graphic content of the book. The body of a book contains book parts (<book-part>), which may be called parts, sections, chapters, modules, lessons, or whatever divisions a publisher has named.
Book parts are recursive, so they may contain other book parts. For example, “Part 3” of a book could contain several “Chapter”s, each of which could have a foreword, the body of the chapter, one or more appendices, and a reference list.
- Back Matter for the Book (optional). If present, the book back matter element (<book-back>) contains information that is ancillary to the main text, such as a glossary, appendix, or list of cited references. The back matter may also contain floating material (<floats-group>), a container element for all the “floating” objects (such as tables, figures, and sidebars) in a book. The back matter of book parts (<back>) and the <book-part-wrapper> element may also contain their own, separate Floating Material (<floats-group>).
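The component list above can be pictured with a small skeleton. The sketch below embeds a hypothetical, heavily abbreviated <book> in a Python string and checks its shape with the standard library. The titles are invented, the placement of the nested <book-part> inside the outer part's <body> is an assumption about the content model, and the instance has not been validated against the BITS DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical skeleton; every component shown is optional in BITS.
BOOK_XML = """
<book>
  <processing-meta/>
  <collection-meta>
    <title-group><title>Hypothetical Series</title></title-group>
  </collection-meta>
  <book-meta>
    <book-title-group><book-title>Hypothetical Book</book-title></book-title-group>
  </book-meta>
  <front-matter>
    <preface><title>Preface</title><p>Preface text.</p></preface>
  </front-matter>
  <book-body>
    <book-part book-part-type="part">
      <book-part-meta><title-group><title>Part 3</title></title-group></book-part-meta>
      <body>
        <!-- Book parts are recursive: a part holding a chapter. -->
        <book-part book-part-type="chapter">
          <book-part-meta><title-group><title>Chapter 1</title></title-group></book-part-meta>
          <body><p>Chapter text.</p></body>
          <back><ref-list><title>References</title></ref-list></back>
        </book-part>
      </body>
    </book-part>
  </book-body>
  <book-back>
    <floats-group/>
  </book-back>
</book>
"""

book = ET.fromstring(BOOK_XML)

# Top-level components, in document order.
top_level = [child.tag for child in book]

# All book parts at any depth, outermost first.
nested_types = [bp.get("book-part-type") for bp in book.iter("book-part")]

print(top_level)
print(nested_types)
```

Removing everything except <book-meta> would leave a metadata-only book, which is the exchange scenario described earlier in this section.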
A Book Part Wrapper
The second top-level element in this Tag Set is the book part wrapper (<book-part-wrapper>), which contains a single book part to be interchanged, along with the metadata that describes the book part and collection metadata that describes a grouping (such as a virtual book) of which the book part is a member. A book part may be associated with many collections. The collection metadata for each book can be stored in the book document instance.
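As a rough illustration of the wrapper, the following sketch shows a single chapter that belongs to two collections. The collection titles and chapter content are invented for the example, only the components named in this section are shown, and the instance has not been validated against the BITS DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: one book part that is a member of two collections.
WRAPPER_XML = """
<book-part-wrapper>
  <collection-meta>
    <title-group><title>Hypothetical Monograph Series</title></title-group>
  </collection-meta>
  <collection-meta>
    <title-group><title>Hypothetical Virtual Book</title></title-group>
  </collection-meta>
  <book-part book-part-type="chapter">
    <book-part-meta><title-group><title>Chapter 7</title></title-group></book-part-meta>
    <body><p>Chapter text.</p></body>
  </book-part>
  <floats-group/>
</book-part-wrapper>
"""

wrapper = ET.fromstring(WRAPPER_XML)

# The wrapper carries exactly one book part for interchange.
book_parts = wrapper.findall("book-part")

# Each <collection-meta> names one grouping the part belongs to.
collection_titles = [
    cm.findtext("title-group/title") for cm in wrapper.findall("collection-meta")
]

print(len(book_parts))
print(collection_titles)
```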
The Roads Not Taken
The BITS Book Tag Set is not based on the NLM Book (Bookshelf) Tag Set that was part of the NLM predecessor to JATS. Instead, the BITS book model is based on the NISO JATS Journal Archiving and Interchange Tag Set (known as “Green” from the colors of the Tag Library pages), with book metadata in BITS replacing the journal and issue metadata from JATS Archiving.
By design, there are some structures which have been modeled as elements in other public book DTDs and schemas that were not named explicitly in BITS, including:
- Book Metadata — The book metadata is held in the element <book-meta>, which is not named “front” as is the corresponding element holding the article metadata for a JATS journal article. Book metadata is unlike that for journal articles, making this one of the real changes from ANSI/NISO JATS.
- CCC Statement — There is no element with this name in BITS. In BITS, this may be tagged as a license paragraph (<license-p>) inside a license statement (<license>), which may include the price tagged as a <price>. The @license-type attribute may be set to “CCC-statement” to identify the information.
- Colophon — In BITS, this may be tagged as a paragraph, as a section within the body of the book, as a book part in the back matter of a book, or as a book part.
- Contributor List — There is no special structure for these lists; they can be tagged using ordinary structures (lists, definition lists, tables, paragraphs, etc.) within a <front-matter-part>, within the narrative front matter, or within the back matter. Such lists should be written in addition to the contributor names (<contrib>) listed in the metadata of a book or book part.
- Frontispiece — No special structure was named in BITS; the material should be tagged using ordinary structures within a <front-matter-part> as part of the narrative front matter. While the <styled-content> element “could” be used to capture the special formatting typical in a Frontispiece, this is discouraged. The typical purpose for tagging a Frontispiece is to make the information content discoverable, not to replicate the look and feel of the document.
- Introduction — There will be no explicitly named “Introduction” element, because this name may be applied at many levels: to a front-matter component, a section of the body, or an entire book part.
- Map Group — This NLM-specific element was used in the previous NLM Book Tag Set; it was not replicated in BITS.
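To make the CCC Statement entry in the list above concrete, the sketch below tags a statement as a license and then retrieves it by its @license-type value. The statement wording and the price are invented, and wrapping the license in <permissions> is an assumption carried over from JATS practice.

```python
import xml.etree.ElementTree as ET

# Hypothetical statement text and price, for illustration only.
PERMISSIONS_XML = """
<permissions>
  <license license-type="CCC-statement">
    <license-p>Authorization to photocopy items for internal use is granted
    on payment of a fee of <price>$10.00</price> per copy.</license-p>
  </license>
</permissions>
"""

permissions = ET.fromstring(PERMISSIONS_XML)

# Select only the licenses identified as CCC statements.
ccc_licenses = permissions.findall("license[@license-type='CCC-statement']")

# Collect any prices tagged inside those statements.
prices = [p.text for lic in ccc_licenses for p in lic.iter("price")]

print(len(ccc_licenses))
print(prices)
```

Because the statement is identified by an attribute value rather than a dedicated element, a consumer that does not recognize “CCC-statement” can still process it as an ordinary license.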