Introduction to the BITS Book Tag Set
The Book Interchange Tag Suite (BITS) contains, at least initially, an XML model for scientific, technical, and medical (STM) books that is based on the Journal Article Tag Suite (ANSI/NISO Z39.96-2021). The intent of BITS is to provide a common format in which publishers and archives can exchange book content, whether an entire book or merely a book part such as a chapter. The Suite provides a set of XML schema modules that define elements and attributes for describing the textual and graphical content of books and book components, as well as a packaging element for book part interchange.
Rationale
The springboard for developing BITS was the observation that a JATS-based XML
book model, or a set of related book models, would be useful to a wide variety of
publishers of
professional and scholarly books, especially but not exclusively to publishers who
are already using
one of the NISO JATS journal article models and looking for a compatible model for
their books. Just as the wide support of the NISO Journal Article Tag Sets enabled
many journal publishers to move to XML, a well-supported XML book model (appropriate
to their needs and compatible with current citation management tools) would enable
book publishers to move to XML and through XML to electronic publication and archiving.
Why add yet another book model to the world? By developing book model(s) based on
the
existing NISO JATS Tag Suite, we hope to enable publishers to include their books
in the systems that already create, manage, publish, and archive their journal articles
and to build on the investment they, as well as their suppliers and vendors, have
made in learning and developing around JATS.
Purpose
The goal for BITS is to make a tag set adequate for supporting interchange,
archiving, format-conversion, and publishing for scientific, technical, and medical
books. As was
true for JATS, the intent of BITS is to support marking up the content of material
so that it can be reused, repurposed, and made more discoverable. This purpose implies,
as it does in JATS, that the ability to reproduce a particular book format is not a goal.
Scope
The BITS Book Interchange DTD is a superset customization of the ANSI/NISO JATS Z39.96-2021
Journal
Archiving and Interchange Tag Set, with added material to describe STM books, book
components such
as chapters, and information concerning the inclusion of books and book components
in book series.
The Tag Set describes both the metadata and the narrative content of a book, both
the metadata and
narrative content for book components, and collection-level metadata for book sets
and book series,
when a book part is associated with one or more such collections.
The BITS book model is intended to support scholarly, reference, higher education,
medical, and technical books. Just as the NISO JATS journal article models do not
attempt to support
magazines or any of a wide variety of other serial publications, the book models are not intended to describe trade books, cookbooks, grade school textbooks, legal documents, or any of the wide variety of books
outside the scientific, technical, and medical realm. Although this work is being
supported by the
National Library of Medicine, this book model should be usable beyond life sciences
publishing, as
the NISO JATS journal article models are useful in physics, social sciences, linguistics,
and poetry.
Design Basis
An explicit goal was the creation of models that would enable the construction of books composed of articles. The intent was to enable article bodies to pass nearly unchanged
into book
parts, with changes to only the outer wrapping element and certain book-specific metadata
reflecting the move from an issue of a journal to the chapter of a book.
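The rewrapping described above can be sketched in a few lines of Python. This is a minimal illustration, not a conversion tool: the sample article, the function name article_to_book_part, and the choice of "chapter" as the book part type are all assumptions made for the example, and a real conversion would need to map far more metadata than the title.

```python
import xml.etree.ElementTree as ET

# A tiny, hypothetical JATS article used only for illustration.
JATS_ARTICLE = """\
<article>
  <front>
    <article-meta>
      <title-group><article-title>Sample Study</article-title></title-group>
    </article-meta>
  </front>
  <body><sec><title>Methods</title><p>Text of the article.</p></sec></body>
  <back><ref-list><title>References</title></ref-list></back>
</article>
"""


def article_to_book_part(article, part_type="chapter"):
    """Rewrap a JATS <article> as a BITS <book-part> (illustrative only)."""
    # New outer wrapper, replacing <article>.
    part = ET.Element("book-part", {"book-part-type": part_type})

    # Book-specific metadata: only the title is carried over in this sketch.
    # findtext() returns plain text, so inline markup in a title would be lost.
    meta = ET.SubElement(part, "book-part-meta")
    title_group = ET.SubElement(meta, "title-group")
    title = ET.SubElement(title_group, "title")
    title.text = article.findtext("front/article-meta/title-group/article-title")

    # The article <body> and <back> are reused without modification.
    for name in ("body", "back"):
        child = article.find(name)
        if child is not None:
            part.append(child)
    return part


article = ET.fromstring(JATS_ARTICLE)
book_part = article_to_book_part(article)
print(ET.tostring(book_part, encoding="unicode"))
```

Only the wrapper and the metadata container change; the body and back matter subtrees are attached as they are, which is the design goal stated above.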
The BITS Book Interchange Tag Set was constructed by taking the NISO JATS modules and adding modules that define components specific to books. BITS
2.0 was based on
ANSI/NISO JATS Version 1.1 (ANSI/NISO Z39.96-2015). BITS 2.1 is based on ANSI/NISO
JATS Version 1.3 (ANSI/NISO Z39.96-2021), which is the latest iteration of the original
NLM journal
article DTDs. This relationship between JATS and BITS has been quite strictly defined:
“The models should be as similar as possible and only as different as necessary”.
Therefore, if JATS has a named structure that also occurs in books, the JATS name (and, to the extent possible, the JATS content model and attributes) was used in BITS, either directly or in extended form.
Book Part Naming
Concerning the many terms used to name components of books (such as chapter, part,
unit, module, lesson, segment, division, and section), BITS is entirely agnostic.
BITS divides books into book parts and leaves it up to the publisher to call them
chapters or units or anything else.
There are, however, a few named book parts in BITS, largely in the narrative front
matter and the back matter of a book. Because BITS, like its JATS parent, does not
lead
publishers but tries to consolidate current publishing practice, the following publisher-requested
named book parts
have been built into BITS:
- Dedication
- Foreword
- Preface
- Table of Contents (a structural Table of Contents that can be edited)
- Index (a structural index that can be edited)
The named structures for Tables of Contents and Indexes have many specific, unique
included elements. In contrast, the named front matter parts, such as “Preface”, are
modeled as generic structures. This means that a publisher can choose to use the named
book parts or use the element <book-part> to tag all the parts of the book.
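The two tagging choices can be compared with a short Python sketch. The element names <preface>, <toc>, <index>, and <named-book-part-body>, and the book-part-type attribute, reflect one reading of the BITS Tag Library and should be checked against the schema; the helper part_kind and the sample fragments are hypothetical.

```python
import xml.etree.ElementTree as ET

# Named book parts listed above (element names assumed from the Tag Library).
NAMED_PARTS = {"dedication", "foreword", "preface", "toc", "index"}

# Choice 1: a named book part.
NAMED = ET.fromstring(
    "<preface><named-book-part-body>"
    "<p>Why this book exists.</p>"
    "</named-book-part-body></preface>"
)

# Choice 2: a generic book part labeled by an attribute.
GENERIC = ET.fromstring(
    '<book-part book-part-type="preface">'
    "<body><p>Why this book exists.</p></body>"
    "</book-part>"
)


def part_kind(elem):
    """Return the publisher-facing kind of a part, whichever way it is tagged."""
    if elem.tag in NAMED_PARTS:
        return elem.tag
    if elem.tag == "book-part":
        return elem.get("book-part-type", "unspecified")
    raise ValueError(f"not a book part: <{elem.tag}>")


print(part_kind(NAMED), part_kind(GENERIC))
```

Either form lets downstream code recover that the content is a preface, which is why the choice can be left to the publisher.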
Publishers also give many names to collections of books and/or book components,
such as book sets, book series, monograph series, and the like. BITS is entirely agnostic
concerning such collective nouns and merely collects the metadata naming such a grouping.
Book Structural Overview
There are two top-level elements in the BITS Book:
- the Book element (<book>), to contain an entire document such as a textbook or a monograph; and
- the Book Part Wrapper element (<book-part-wrapper>), to contain a book part such as a “chapter” or “module” that needs to be handled as a discrete unit.
A Book
Just as a NISO JATS top-level <article> element may contain only the metadata for an article and none of the narrative text,
a BITS top-level <book> element may contain only the metadata for a book and not the narrative text of the
book or any book part. This allows publishers and archives to use JATS and BITS for
exchange of metadata even when not preserving the textual content of a document in
XML.
If both the metadata and the text of a book are to be tagged in XML, a book may be
composed of the following components:
- Processing Metadata (optional). The metadata that concerns the XML file rather than the contents of the book.
- Collection Metadata (optional, repeatable). Bibliographic metadata describing a book set or series to which this book or book part belongs. A book or book part may be part of many collections.
- Book Metadata (optional). The book metadata element (<book-meta>) contains the metadata for the book, for example, the title of the book, the date of publication, the publisher, a copyright statement, etc. This is not the textual front matter that appears at the beginning of a book; rather, it is bibliographic information about the book.
- Front Matter (optional). If present, the front matter element (<front-matter>) contains the textual front material for a book, such as a Dedication, Foreword, or Preface. (Note: This naming differs from JATS, where “front matter” refers to the metadata of the journal article.)
- Body of the Book (optional). If present, the body of the book element (<book-body>) contains the narrative of the work, the main textual and graphic content of the book. The body of a book contains book parts (<book-part>), which may be called parts, sections, chapters, modules, lessons, or whatever divisions a publisher has named.
Book parts are recursive, so they may contain other book parts. For example, “Part 3” of a book could contain several “Chapter”s, each of which could have a foreword, the body of the chapter, one or more appendices, and a reference list.
- Back Matter for the Book (optional). If present, the book back matter element (<book-back>) contains information that is ancillary to the main text, such as a glossary, appendix, or list of cited references. The back matter may also contain floating material (<floats-group>), a container element for all the “floating” objects (such as tables, figures, and sidebars) in a book. The back matter of book parts (<back>) and the <book-part-wrapper> element may also contain their own, separate Floating Material (<floats-group>).
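The component list above can be illustrated with a skeleton <book> and a small Python check of its shape. This is a sketch under stated assumptions: the element names <processing-meta> and <collection-meta> are not given in the text above and are taken from one reading of the BITS Tag Library, nesting a <book-part> inside the <body> of another <book-part> is likewise assumed, and the helpers in_expected_order and outline are hypothetical. The sketch does not validate against the BITS schema.

```python
import xml.etree.ElementTree as ET

# Skeleton book; all content is invented for illustration.
BOOK = """\
<book>
  <processing-meta/>
  <collection-meta>
    <title-group><title>Example Monograph Series</title></title-group>
  </collection-meta>
  <book-meta>
    <book-title-group><book-title>Example Book</book-title></book-title-group>
  </book-meta>
  <front-matter>
    <preface><named-book-part-body><p>Preface text.</p></named-book-part-body></preface>
  </front-matter>
  <book-body>
    <book-part book-part-type="part">
      <book-part-meta><title-group><title>Part 3</title></title-group></book-part-meta>
      <body>
        <book-part book-part-type="chapter"><body><p>Chapter text.</p></body></book-part>
        <book-part book-part-type="chapter"><body><p>More text.</p></body></book-part>
      </body>
    </book-part>
  </book-body>
  <book-back>
    <ref-list><title>References</title></ref-list>
  </book-back>
</book>
"""

# Order of the components as listed above.
EXPECTED_ORDER = [
    "processing-meta", "collection-meta", "book-meta",
    "front-matter", "book-body", "book-back",
]


def in_expected_order(book):
    """True if the children of <book> follow the listed component order."""
    ranks = [EXPECTED_ORDER.index(child.tag) for child in book]
    return ranks == sorted(ranks)


def outline(elem, depth=0):
    """List (depth, book-part-type) pairs for nested <book-part> elements."""
    rows = []
    for child in elem:
        if child.tag == "book-part":
            rows.append((depth, child.get("book-part-type")))
            rows.extend(outline(child, depth + 1))
        else:
            rows.extend(outline(child, depth))
    return rows


book = ET.fromstring(BOOK)
print(in_expected_order(book))
print(outline(book))
```

Because every component is optional, a metadata-only book would simply omit <front-matter>, <book-body>, and <book-back> and still pass the order check.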
A Book Part Wrapper
The second top-level element in this Tag Set is the book part wrapper (<book-part-wrapper>), which contains a single book part to be interchanged, along with the metadata that
describes the book part and collection metadata that describes a grouping (such as
a virtual book) of which the book part is a member. A book part may be associated
with many collections. The collection metadata for each book can be stored in the
book document instance.
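A book part packaged for interchange might look like the following Python sketch, which reads back the collections the part belongs to. The sample content, the collection-type values, and the helper collection_titles are assumptions made for illustration; the exact content model should be checked against the BITS schema.

```python
import xml.etree.ElementTree as ET

# A hypothetical chapter that belongs to two collections.
WRAPPER = """\
<book-part-wrapper>
  <collection-meta collection-type="series">
    <title-group><title>Example Monograph Series</title></title-group>
  </collection-meta>
  <collection-meta collection-type="virtual-book">
    <title-group><title>Selected Chapters on Imaging</title></title-group>
  </collection-meta>
  <book-meta>
    <book-title-group><book-title>Example Book</book-title></book-title-group>
  </book-meta>
  <book-part book-part-type="chapter">
    <book-part-meta><title-group><title>Chapter 7</title></title-group></book-part-meta>
    <body><p>Chapter text.</p></body>
  </book-part>
</book-part-wrapper>
"""


def collection_titles(wrapper):
    """Return the title of every collection the wrapped book part belongs to."""
    return [
        meta.findtext("title-group/title")
        for meta in wrapper.findall("collection-meta")
    ]


wrapper = ET.fromstring(WRAPPER)
print(collection_titles(wrapper))
print(wrapper.findtext("book-part/book-part-meta/title-group/title"))
```

The wrapper carries enough context (collection and book metadata) for the chapter to be handled as a discrete unit without the rest of the book.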
The Roads Not Taken
The BITS Book Tag Set is not based on the NLM Book (Bookshelf) Tag Set that was part
of the NLM predecessor to JATS. Instead, the BITS book model is based on the NISO
JATS Journal
Archiving and Interchange Tag Set (known as “Green” from the colors of the Tag
Library pages), with book metadata in BITS replacing the journal and issue metadata
from JATS
Archiving.
By design, there are some structures which have been modeled as elements in other
public book DTDs and schemas that were not named explicitly in BITS, including:
- Book Metadata — The book metadata is held in the element <book-meta>, which is not named “front” as is the corresponding element holding the article metadata for a JATS journal article. Book metadata is unlike that for journal articles, making this one of the real changes from ANSI/NISO JATS.
- CCC Statement — There is no element with this name in BITS. In BITS, this may be tagged as a license paragraph (<license-p>) inside a license statement (<license>), which may include the price tagged as a <price>. The @license-type attribute may be set to “CCC-statement” to identify the information.
- Colophon — In BITS, this may be tagged as a paragraph, as a section within the body of the book, as a book part in the back matter of a book, or as a book part.
- Contributor List — There is no special structure for these lists; they can be tagged using ordinary structures (lists, definition lists, tables, paragraphs, etc.) within a <front-matter-part>, within the narrative front matter, or within the back matter. Such lists should be written in addition to the contributor names (<contrib>) listed in the metadata of a book or book part.
- Frontispiece — No special structure was named in BITS; the material should be tagged using ordinary structures within a <front-matter-part> as part of the narrative front matter. While the <styled-content> element “could” be used to capture the special formatting typical in a Frontispiece, this is discouraged. The typical purpose for tagging a Frontispiece is to make the information content discoverable, not to replicate the look and feel of the document.
- Introduction — There will be no explicitly named “Introduction” element, because this name may be applied at many levels: to a front-matter component, a section of the body, or an entire book part.
- Map Group — This NLM-specific element was used in the previous NLM Book Tag Set; it was not replicated in BITS.
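As an illustration of the CCC Statement item in the list above, the following Python sketch tags such a statement as a license and reads it back. The wording of the statement, the price, the currency attribute, and the placement of <price> inside <license-p> are assumptions made for the example and should be checked against the BITS schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical permissions block; the statement text and price are invented.
PERMISSIONS = """\
<permissions>
  <copyright-statement>Copyright 2021 Example Press</copyright-statement>
  <license license-type="CCC-statement">
    <license-p>Copying beyond fair use requires payment of
      <price currency="USD">10.00</price> per chapter.</license-p>
  </license>
</permissions>
"""

permissions = ET.fromstring(PERMISSIONS)

# Locate the license by its @license-type value rather than by a dedicated element.
ccc = permissions.find("license[@license-type='CCC-statement']")
price = ccc.find("license-p/price")
print(price.get("currency"), price.text)
```

Identifying the statement by attribute value keeps it discoverable without adding a dedicated element to the Tag Set.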