JATS-Con 2013 Schedule with Abstracts

April 1, 2014

8:00-9:00	Registration
9:00	Welcome and Introductions
9:00-9:15	JATS and NISO — the value of community standardization Todd Carpenter, National Information Standards Organization (NISO) Since its earliest days, the publishing structures that became the Journal Article Tag Suite (JATS) standards have proven themselves valuable as a method for exchanging journal content. Designed by a team with decades of typesetting and markup expertise, the specifications were quickly adopted by preservation communities and as a basis for many of the largest publishers' production processes. As digital publishing evolves, the importance of common vocabulary structures like JATS will only increase, because exchanging digital files is a critical component in a functioning digital content ecosystem. NISO plays a critical role in bringing together content creators, intermediaries, and consumers to develop interoperability standards for the creation, discovery, distribution and preservation of content. With the support and engagement of the National Library of Medicine, NISO has engaged a broader community of participants in the standardization of JATS, and it continues to support the ongoing development and expansion of the standard. During this brief talk, Todd will discuss the value of national standardization of JATS, and the future of interoperable content standards in digital publishing. Video
9:15-9:45	When the "One Size Fits Most" tagset doesn't fit you Tommie Usdin, Mulberry Technologies, Inc. JATS does not actually claim to be a "one size fits all" specification. However, many information content consumers (libraries, archives, on-line services) accept only content that is valid to one of the JATS models, and in many cases specify a subset of the model defined in one of the JATS instantiations (Archiving, Publishing, or Authoring). Thus, content creators find that their vendors and tools often assume that they will be using one of the JATS models "out of the box". This can present a real problem when a publisher has, and wants, information that is not modeled in JATS, or is not modeled in the JATS DTD their vendors and publishing partners require. In this case, the publisher has several options: drop the inconvenient information, use "Custom Metadata", hide the inconvenient information in prose, abuse a tag, suggest a modification of the standard, or modify the tag set to encode the information that matters. None of these options are ideal, and which to choose in large part depends on circumstances. Video
9:45-10:30	The Challenges and Benefits of Automating NLM-to-ePub3 File Conversion Mike Dean, CFA Institute While converting NLM book tag XML to an ePub seems like a relatively straightforward process (hey, an ePub is mostly just HTML, right?), setting up a workflow to do just that is quite challenging. It turns out writing the XSLT could be considered the "easy" part. Other problems, such as dealing with ePub display issues across ebook readers (anything from minor CSS differences to major MathML display problems), deciding what tagging makes the most sense semantically, and figuring out how to give semantic meaning to visual formatting such as table cell shading, add a layer of complexity to the process. This paper discusses the challenges, rewards, and as-yet unresolved problems encountered in the process of creating an NLM-to-ePub3 workflow. Full Paper | Materials | Video
10:30-11:00	Coffee Break
11:00-11:45	Tracking Changes to JATS XML in an Online Proofing System Charles O'Connor, Dartmouth Journal Services Antony Gnanapiragasam, Dartmouth Journal Services Michael Hepp, Dartmouth Journal Services When Dartmouth Journal Services began building ProofExpress, an online, XML-based proofing and editing system for STM journals, we knew that the most difficult challenge would be creating an accurate change-tracking mechanism. Change tracking is an essential feature, both to ensure that author corrections conform to journal style and to catch any changes to data or claims. The system must not only track each insertion, deletion, and formatting change; it must also give production editors the ability to accept or reject changes without breaking the XML. ProofExpress is built on SDL LiveContent Create (formerly Xopus). We use its extensive API to add custom elements and attributes to mark changes in the XML. The XML is then transformed through XSLT to group and nest changes so that they can be acted upon by the production editor. To prevent breaking the XML during this process, a rule engine enforces the order of acceptance and rejection of changes. Full Paper | Materials | Video
11:45-12:30	Case Study on Redlining application using JATS XML at the International Organization for Standardization Chandi Perera, Typefi Systems Redlining is the process of comparing two datasets and displaying the changes in a meaningful and human readable way. Comparing XML files and rendering the results is more complex than just identifying the differences between two files. Using the experiences of the International Organization for Standardization (ISO) as a case study, this paper will describe the process of comparing two versions of a JATS XML file, filtering out changes that have no meaningful impact (e.g. changes in tag order of article-id tags) and ignoring changes that the business requirements deem trivial. The paper will go on to cover identifying and rendering changes to content ranging from simple paragraphs to tables, equations, figures, and lists. The case study will cover how differences are rendered so that the reader can easily understand and follow the changes. The paper will describe the easy wins, the difficulties and impossibilities of a JATS XML redlining workflow. The paper will conclude with what changes can be made to process and content structure to make redlining more effective. Full Paper | Materials | Video
12:30-1:30	Lunch
1:30-3:00	Focused Talks
1:30-2:05	Transforming JATS XML for mobile-optimized consumption Mitra Ahadpour, Substance Abuse and Mental Health Services Administration (SAMHSA) Atul Ganatra, IQ Solutions Adam B. Lee, IQ Solutions The Substance Abuse and Mental Health Services Administration (SAMHSA) has adapted the JATS publishing model to accommodate a robust collection of behavioral health content. Parallel efforts are now underway to take advantage of the possibilities of XML both at the content creation and content dissemination levels. This presentation will focus on the content dissemination end of the lifecycle, and specifically how SAMHSA has implemented processes to transform the Agency's content so that it is optimized for mobile devices. We will discuss the goals of the project, the approaches we evaluated, and the challenges and lessons learned that emerged as we searched for an approach that would work for content of varying lengths and styles. Materials | Video
2:05-2:40	A Publisher's InDesign to BITS and EPUB Infrastructure: Conventions, Configuration, Conversion, Checks Gerrit Imsieke, le-tex publishing services Deploying advanced XML technologies such as XProc, XSLT 2.0, and Schematron, an "ex-post" conversion of InDesign files may be a viable alternative to XML-first publishing production workflows. Full Paper | Video
2:40-3:00	JATS and the Standards Ecosystem Bruce Rosenblum, Inera, Inc. JATS, BITS, and publishing live in an ecosystem of interrelated standards and initiatives. If you don't know what the acronyms ORCID, PIE-J and JAV stand for, this talk will describe what they are and why JATS implementers should be familiar with them and many standards, recommended practices, and other initiatives in the JATS neighborhood. Materials | Video
3:00-3:30	Coffee Break
3:30-4:15	Inconsistent XML as a barrier to reuse of Open Access Content Daniel Mietchen, Open Knowledge Foundation, Germany Chris Maloney, PMC (Contractor with A-Tek, Inc.) Nils Dagsson Moskopp, In this paper, we will describe the current state of some of the tagging of articles within the PMC Open Access subset. As a case study, we will use our experiences developing the Open Access Media Importer, a tool to harvest content from the OA subset and automatically upload it to Wikimedia Commons. Tagging inconsistencies stretch across several aspects of the articles, ranging from licensing to keywords to the MIME types of supplementary materials. While all of these complicate large-scale reuse, the unclear licensing statements required us to implement text mining-like algorithms in order to accurately determine whether or not specific content was compatible with reuse on Wikimedia Commons. Besides presenting examples of incorrectly tagged XML from a range of publishers, we will also explore past and current efforts towards standardization of license tagging, and we will describe a set of recommendations for generators of content on how best to tag certain data so that it is both compatible with existing standards, and consistent and machine-readable. Full Paper | Video
4:15-5:00	The Web, the W3C and the Future of Publishing Liam Quin, The World Wide Web Consortium (W3C) As custodians of the World Wide Web, the Web Consortium (W3C) is both a leader and a follower. We follow because you can't standardise a process or technology until it is in use. We lead, because we guide the new technologies from technical, business, and social perspectives. The Web has already changed publishing, and we are on the brink of even bigger changes. What happens when Web technologies are good enough to replace existing authoring tools? What happens when the Web includes SVG and MathML and can support typography powerful enough to produce printed books? What happens when electronic books and Web sites converge? We're not quite there yet, but W3C is working in this area, working with commercial publishers, with IDPF and other organizations, listening to industry experts and tool-makers, and gently nudging the Web forward all over the world. The difficulty facing publishers today is how to manage when the Web isn't quite ready. The right question to ask is, how do we make the Web ready? In this session Liam Quin from the W3C will describe what W3C is doing in its new Publishing Activity, how it will affect you, and how you can get involved. Video

April 2, 2014

9:00-9:45	Extending JATS to include the NISO/NFAIS Recommended Practices for Online Supplemental Journal Article Materials Karen Gutzman, National Library of Medicine Kimberly A. Tryka, NCBI, National Library of Medicine This paper discusses our experience of creating an extension for JATS that incorporates the NISO "Recommended Practices for Online Supplemental Journal Article Materials" (NISO RP-15-2013). We will discuss our analysis of the recommendations and our comparison of the recommendations with JATS, as well as our thrashing over language and terminology associated with supplementary materials and our eventual creation of the extension. The extension is not part of the official JATS specification; it is a local extension that will be made publicly available for community use and discussion. Full Paper | Materials | Video
9:45-10:30	NLM Conversion to Build "Atomic" Physics Content in an Agile Fashion M. Scott Dineen, The Optical Society Mark Gross, Data Conversion Laboratory, Inc. Devorah Ashlem, Data Conversion Laboratory, Inc. Beth Friedman, Data Conversion Laboratory, Inc. Alexander Schwarzman, The Optical Society Gitty Kupferstein, Data Conversion Laboratory, Inc. When faced with the challenge of converting 8 highly technical journals spanning 95 years, how do you divide responsibility between the content owner and the conversion vendor? Do you spend a year on document analysis and developing conversion specifications, or do you hand the project over to a well-regarded service provider and rely on their expertise entirely? This paper demonstrates how an agile approach to content conversion with close collaboration between the publisher and the conversion vendor has allowed The Optical Society (OSA) and Data Conversion Laboratory, Inc. (DCL) to navigate between the two extremes and create a high-quality digital archive that will serve OSA's strategic aims for developing innovative products and services. Full Paper | Materials | Video
10:30-11:00	Coffee Break
11:00-11:45	What JATS Users should Know about the Book Interchange Tag Suite (BITS) Jeffrey Beck, NCBI/NLM/NIH The Book Interchange Tag Suite (BITS) is a book model based on the JATS article model. There are many things that can be structured the same way in both a Journal Article and a Book (or a part of a book), and some things that are very different. We'll review the things you 'get for free' if you are already familiar with the article model, and what parts of the book model you will need to pay a little more attention to. Full Paper | Materials | Video
11:45-12:30	Formatting JATS: as easy as 1-2-3 Tony Graham, Mentea The JATS preview XSLT stylesheets are written in XSLT 1.0. This presentation describes approaches used when customizing the XSLT 1.0 stylesheets for use with reports from a government body, when adapting the stylesheets for XSLT 2.0 for processing articles for an online journal, and upgrading the stylesheets to XSLT 3.0 as a testbed for XSLT 3.0 techniques. Full Paper | Materials | Video
12:30-1:30	Lunch
1:30-3:00	JATS Open Session
3:00-3:30	Coffee Break
3:30-4:15	mPach: Integrated Publishing and Archiving of Journals in HathiTrust Seth Johnson, Michigan Publishing, University of Michigan Bryan Smith, Michigan Publishing, University of Michigan Kevin S. Hawkins, Michigan Publishing, University of Michigan mPach is a package of tools being developed to provide a modular platform to enable the publication of born-digital open-access journals in the HathiTrust repository. One of the chief technological challenges for this system is the conversion of edited manuscripts to an archivable format. We selected JATS as our preservation format because of the increasing coalescence of the publishing industry around this open, non-proprietary standard. This paper provides a technical overview of the mPach platform, with special attention paid to the design and functionality of Norm, a tool being developed to convert Microsoft Word documents to JATS. Full Paper | Materials | Video
4:15-5:00	Strategic Reading, the Future of Scientific Publishing — something for everyone Allen Renear, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign Materials | Video

Originally Scheduled Papers

The following papers were originally scheduled to be presented at JATS-Con in October 2013, but the authors are unable to be present for the make-up date in April 2014.

Ontology based Biomedical Research Paper Authoring Support Tool

Senator Jeong, National Center for Medical Information and Knowledge, Korea National Institute of Health
Sejin Nam, Biomedical Knowledge Engineering Laboratory, Seoul National University
Hyun-Young Park, National Center for Medical Information and Knowledge, Korea National Institute of Health

Biomedical research papers often follow the IMRAD (Introduction, Methods, Results, and Discussion) structure. Lexical bundles (also known as formulaic patterns) function as basic building blocks of this discourse structure. They are combinations of three or more words that frequently occur in a corpus. For example, the lexical bundle "the purpose of this study was" indicates the research purpose in the Introduction section.

The goal of this study is to develop a biomedical research paper authoring support tool that provides writers with appropriate expressions for a specific discourse purpose in a section.

Lexical bundles were extracted from sentences in 160,150 structured abstracts of the PubMed Central Open Access Subset, and their distribution across IMRAD sections was analyzed. We designed the Lexical Bundle Ontology (LBO), which semantically organizes lexical bundles according to their rhetorical purposes in each IMRAD section of a biomedical research paper. Then, a JATS-compliant authoring support tool was implemented. This tool lists candidate lexical bundles that match the author's discourse purpose in a specific section and helps the author complete sentences. We will present use case scenarios of this authoring support tool. We expect that this tool will help writers organize their ideas and arguments conveniently and will lower the language barrier for non-native English writers.

Full Paper

Perspective on application of journal article tag extensible markup language for scholarly journal articles written in Korean

Sun Huh, Department of Parasitology, College of Medicine, Hallym University
Tae Jin Choi, National Research Foundation, Korea
So hyeong Kim, National Research Foundation, Korea

Korea ranks fifth among countries in the number of PMC journals. As of May 2013, 73 journals from Korea were included in PMC. Since 2013, a variety of agencies that fund research and journal publication have begun to introduce open access full text databases in the fields of medicine, science, and the social sciences and humanities. JATS 1.0 will be used in those databases, since Korean articles can easily be processed as full text XML. It is therefore necessary for editors or publishers to produce full text XML files based on JATS 1.0. I would like to present the current state of the application of JATS 1.0 to academic journals not only in English but also in Korean: technology, training programs, and the Korean Government's policy on open access full text XML. This experience in Korea can serve as a model for constructing mother-language open access full text journal databases based on JATS 1.0. The usefulness of JATS is stressed for scholarly journal publication and the dissemination of information in Korea in all fields, not only in medicine.

Full Paper