
What JATS Users Should Know about the Book Interchange Tag Suite (BITS)

Jeffrey Beck

National Center for Biotechnology Information, National Library of Medicine, US National Institutes of Health

beck@ncbi.nlm.nih.gov

Presented at JATS-Con, April 2, 2014

Brief Intro to BITS

The Book Interchange Tag Suite (BITS) is a book model based on the JATS article model.

Version 1.0 was released at the end of 2013.

The schemas are available in DTD, XSD, and RELAX NG.

There is a complete Tag Library available: http://jats.nlm.nih.gov/extensions/bits/tag-library/1.0/

You can also get to it from the JATS page:

http://jats.nlm.nih.gov

BITS Home

BITS Tag Library

Why a new Book Tag Set?

Aren't there already plenty of Book tagging models?

NCBI Book

Designed for NCBI Bookshelf project

Was written as an extension to the NLM DTDs

Much in need of a re-examination

Not updated since NLM Version 3.0.

BITS Working Group

First meeting in person in the Summer of 2010 at NLM

The first task was to define the scope of the project

The Scope of BITS

The scope for BITS is a single completed book or a complete book component such as a chapter, part, or module.

It should define both new and legacy book material, and be able to describe both book sets and series.

For documents where time is a factor such as continually updated books or editions that represent a work in multiple versions, the XML model will represent a snapshot of the book or book component at a single point in time.

Document versioning should be handled by the publication system and not by the document XML.

The scope of BITS includes:

Atlases and Field Guides are not considered to be out-of-scope, but these may need additional semantic metadata to be represented completely.

STM textbooks also are not explicitly out of scope, but they typically have semantic tagging and processing requirements that are beyond what is available in BITS version 1.0.

BITS 1.0 explicitly excludes the following:

What is the relationship to JATS?

BITS is a project of NCBI at the US National Library of Medicine (NLM). While BITS is an extension of NISO Z39.96 JATS, it is not a NISO Standard.

The first design decision that the BITS Working Group made was that the Tag Set will be based on the most recent version of NISO JATS, including the multi-language capabilities of this structure.

This means if JATS has a named structure and that structure occurs in book content, then the JATS name (and to the extent possible, the JATS model) will be used.

This implies that the NISO JATS model will not be improved by the BITS working group. However, the BITS working group made many comments on NISO JATS version 1.0 that were included in JATS version 1.1d1.

Other Working Group Design Decisions

Besides basing the Tag Set on JATS, the Working Group reached the following conclusions, which informed the details of BITS:

The details

Things you get for free as a JATS user

This should look familiar

article structure

Content area

article structure

Content area

article structure

Like Magic!

article structure

Let's not forget metadata

Any element with the same name in JATS and BITS has the same model**

** This is not 100% true. But it's OK. Trust me :)

Structures above Chapter/Article

Now that we know that the content area of a book chapter is similar to the content area of an article, we only have to worry about how all of the chapters fit together.

The recursive <book-part>

Books can have many different levels of content. Compare the Tables of Contents (TOCs) of a book divided into Parts with one divided simply into chapters.

Book with Parts and Chapters

Book Chapters Only

The recursive <book-part>

From the reader's, writer's, publisher's, and editor's perspective, this is not a problem at all. A book can have chapters, or a book can have parts that have chapters, or a book can have sections that have parts that have chapters (that then, of course, have sections inside of chapters).

 

But this gives the XML modeler something to think about.

Do you create an explicit <Section> element that allows <Part>, and <Part> that allows <Chapter> (which gets us down to our basic unit)?

If you do this, then the root element (let's call it <Book>) will either need to allow <Section> and/or <Part> and/or <Chapter>, or you will force anyone tagging a chapters-only book to tag an "empty" copy of each level until they get to the level that has any content.

 

The recursive <book-part>

The simpler strategy (from a model writing point of view) is to have one element (<book-part> in our case) that has a type attribute (@book-part-type) that describes the type (or level) of the book part.

In this way, you can make as many levels as needed, and if @book-part-type is not a controlled list (which it is not in BITS), then you can call them whatever you like.
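To make the recursion concrete, here is a minimal sketch in Python. The XML sample is hypothetical and heavily abbreviated (it has not been validated against the BITS schemas), and the function name `outline` is mine, not part of BITS. It assumes that a nested <book-part> sits inside the <body> of its parent <book-part>.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abbreviated BITS-style sample (not schema-validated).
SAMPLE = """
<book>
  <book-body>
    <book-part book-part-type="part">
      <book-part-meta><title-group><title>Part I</title></title-group></book-part-meta>
      <body>
        <book-part book-part-type="chapter">
          <book-part-meta><title-group><title>Chapter 1</title></title-group></book-part-meta>
          <body><sec><title>Background</title><p>Text.</p></sec></body>
        </book-part>
      </body>
    </book-part>
    <book-part book-part-type="chapter">
      <book-part-meta><title-group><title>Chapter 2</title></title-group></book-part-meta>
      <body><p>Text.</p></body>
    </book-part>
  </book-body>
</book>
"""

def outline(container, depth=0):
    """Return (depth, type, title) for every <book-part>, at any nesting level."""
    rows = []
    for part in container.findall("book-part"):
        title = part.findtext("book-part-meta/title-group/title", default="")
        rows.append((depth, part.get("book-part-type"), title))
        body = part.find("body")
        if body is not None:
            # The same element name recurs one level down, so the same code handles it.
            rows.extend(outline(body, depth + 1))
    return rows

toc = outline(ET.fromstring(SAMPLE).find("book-body"))
for depth, kind, title in toc:
    print("  " * depth + f"{title} [{kind}]")
```

The point of the sketch is that one function covers parts, chapters, and anything else you care to name in @book-part-type; no level needs its own element or its own code path.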

Parts and Chapters

Just Chapters

And then there is this example that I stole from Mulberry

But, books are big

A book represented by a single XML document does not need to be written in a single XML file.

This is a problem that can be solved with entities.

But this is not a great solution, and it has processing repercussions.

BITS - Now with XInclude!

In BITS 1.0, we added the elements <xi:include> and <xi:fallback>, so that books and book parts can be managed as separate files and “included” as needed into a final document.

Just be sure you have an XInclude parser.
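As a rough illustration of what an XInclude-aware step does, here is a sketch using the `ElementInclude` module from the Python standard library. The file names, the in-memory `FILES` dictionary, and the `loader` function are assumptions made so the example is self-contained; a real book would be read from disk. The sketch only exercises <xi:include>, not <xi:fallback>.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical file names and content, held in memory instead of on disk.
FILES = {
    "book.xml": """
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <book-body>
    <xi:include href="chapter1.xml"/>
    <xi:include href="chapter2.xml"/>
  </book-body>
</book>""",
    "chapter1.xml": '<book-part book-part-type="chapter" id="ch1"/>',
    "chapter2.xml": '<book-part book-part-type="chapter" id="ch2"/>',
}

def loader(href, parse, encoding=None):
    """Stand-in for file access: serve included documents from FILES."""
    text = FILES[href]
    return ET.fromstring(text) if parse == "xml" else text

book = ET.fromstring(FILES["book.xml"])
# Replaces each <xi:include> in place with the root element of the referenced document.
ElementInclude.include(book, loader=loader)

ids = [bp.get("id") for bp in book.iter("book-part")]
print(ids)
```

After the include step, the tree looks as if the chapters had been typed directly into the book file, which is what downstream processing usually expects.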

New Things

There are two new book- (and book-part-) level objects that have been added:

These will most likely be used to tag EXISTING Tables of Contents and Indices.

More New Things - <index-term>

<index-term> is an element used in the document flow to tag terms for indexing.

It will most likely be used to CREATE indexes from content.
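Here is a minimal sketch of creating an index from embedded terms. The sample XML is hypothetical and not schema-validated, and it assumes each <index-term> wraps a <term> child. The function name `build_index` and the choice to point index entries at section ids are mine, purely for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, abbreviated sample: index terms embedded in running text.
SAMPLE = """
<book-part book-part-type="chapter">
  <body>
    <sec id="s1"><p>Tag sets<index-term><term>tag sets</term></index-term> vary.</p></sec>
    <sec id="s2"><p>JATS<index-term><term>JATS</term></index-term> and other
      tag sets<index-term><term>tag sets</term></index-term> share names.</p></sec>
  </body>
</book-part>
"""

def build_index(root):
    """Map each term to the ids of the sections it occurs in (flat sections only)."""
    index = defaultdict(list)
    for sec in root.iter("sec"):
        for entry in sec.iter("index-term"):
            term = entry.findtext("term")
            if term is None:
                continue  # skip entries without a <term> child
            if sec.get("id") not in index[term]:
                index[term].append(sec.get("id"))
    # Sort entries case-insensitively, as a printed index would.
    return dict(sorted(index.items(), key=lambda kv: kv[0].lower()))

index = build_index(ET.fromstring(SAMPLE))
for term, targets in index.items():
    print(f"{term}: {', '.join(targets)}")
```

A production index would also need to handle nested sections, nested terms, and cross-references, none of which this sketch attempts.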

More New Things - Questions & Answers

The BITS Working Group proposed a simple but flexible Question and Answer model to the JATS Standing Committee as a comment on JATS 1.0. It included new elements for <question>, <question-wrap>, <answer>, <answer-set>, and <explanation>.

It models only questions and answers and not quizzes or Continuing Medical Education exams, but the elements can be used to build these items.

The JATS Standing Committee did not add the Question and Answer model to JATS 1.1d1 because it was not yet a proven model and there was not a lot of call for it in journals. The BITS Working Group added these elements to BITS 1.0 because of the need for them in book content. Details on questions and answers are available in the Tag Library.
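As a rough sketch of how these elements might be used to build something quiz-like, the Python below pairs each question with its answer. The nesting shown (a <question-wrap> holding one <question> and one <answer>, each with a <p>) is an assumption for illustration and has not been validated; check the Tag Library for the real content models. The function name `qa_pairs` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the exact nesting is an assumption, not taken from the schema.
SAMPLE = """
<sec>
  <title>Review Questions</title>
  <question-wrap>
    <question id="q1"><p>What tag set is BITS based on?</p></question>
    <answer><p>JATS.</p></answer>
  </question-wrap>
  <question-wrap>
    <question id="q2"><p>Which element holds a chapter?</p></question>
    <answer><p>book-part.</p></answer>
  </question-wrap>
</sec>
"""

def qa_pairs(root):
    """Return (question text, answer text or None) for each <question-wrap>."""
    pairs = []
    for wrap in root.iter("question-wrap"):
        question = "".join(wrap.find("question").itertext()).strip()
        answer_el = wrap.find("answer")
        answer = "".join(answer_el.itertext()).strip() if answer_el is not None else None
        pairs.append((question, answer))
    return pairs

pairs = qa_pairs(ET.fromstring(SAMPLE))
for question, answer in pairs:
    print(f"Q: {question}\nA: {answer}")
```

Because the questions and answers are tagged separately, the same source could feed an answer key, a quiz without answers, or a study guide with both.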

Future BITS

BITS will continue to be supported by NCBI.

We expect more changes to come about as the model gets more use.

Please take it, play with it, tag stuff in it, break it, complain about it, and give it a good going-over.

Naughty BITS

Thank you