JATS-Con 2012 Schedule with Abstracts

Tuesday, October 16, 2012

8:00-9:00	Registration
9:00-9:15	Welcome and Introductions
9:15-9:45	Evolutionary Chaos and the Road Ahead Laura Kelly, NCBI/NLM/NIH The JATS user community is growing in both number and diversity. With the acceptance of the JATS as a NISO standard, that's likely to continue. But what does that mean for the future of the JATS when its development has always been driven by the user community? How does the JATS continue to evolve without descending into chaos? Full Paper \| Materials \| Video
9:45-10:30	How Well do you Know Your Data? Converting an Archive of Proprietary Markup Schemes to JATS: A Case Study Faye Krawitz, American Institute of Physics Jennifer McAndrews, American Institute of Physics Richard O'Keeffe, American Institute of Physics The presentation will describe the challenges, benefits, and opportunities resulting from converting an archival collection of approximately 750,000 files to JATS. The goal was to migrate the American Institute of Physics (AIP) and member society archival collection from multiple generations of proprietary markup to an industry standard to create a true archive, all managed within a new, more controlled content management system. Integral to the process was the adoption and application of the XML technologies XSLT, XPath, and Schematron to transform and check the content. Sounds straightforward, doesn't it? Perform a thorough document analysis, map out the transformation rules, convert the data. But is it? Have you accounted for all historical variations, generated text, metadata, and nomenclature variations on XML file assets? Besides your core, don't forget about reuse for other products, edge cases, online presentation, distribution channels, and staff training! Full Paper \| Materials \| Video
10:30-11:00	Coffee Break
11:00-11:45	Implementing XML for Japanese-language Scholarly Articles Soichi Tokizane, Aichi University Scholarly publishers and typesetting service providers in Japan, together with the J-STAGE e-journal platform, set out to devise a way to implement XML for Japanese-language scholarly articles, and worked with the JATS working group to develop a multi-language extension of JATS. It is now implemented on the new J-STAGE platform, which was launched in May 2012. Full Paper \| Materials \| Video
11:45-12:30	The Front Matters: Capturing Journal Front Matter Content with JATS Rachael Carter, NCBI/NLM/NIH Kathryn Funk, NCBI/NLM/NIH Rebecca Mooney, NCBI/NLM/NIH PMC strives to be a comprehensive archive of biomedical journals. Currently JATS provides no way to capture journal and issue specific front matter content, such as editorial boards, journal philosophy, submission instructions, etc. We developed an extension to the current Tag Suite, the pmc-journalmatter.dtd, that captures these journal artifacts. In mapping multiple publishing models, we found that front matter exists in two basic forms: issue and standing. The issue attribute should be used for administrative materials that are published in an issue. The standing attribute should be used for non-issue based and administrative materials that are not published in an issue, but are static, such as information published on a website. Our DTD aims to be flexible enough to accommodate a variety of user needs. It allows the users to update different elements of journal matter without the burden of updating all elements by offering separate document types. The document types include: author information, issue cover, editorial board, publisher information, and general information. In order to encapsulate front matter content, we introduced new elements that work in conjunction with JATS. This paper will explore the need for a front matter specific JATS extension, the limitations of the article model to represent this type of information, our process and rationale for the data mapping and subsequent development of new elements, and future implementation. Full Paper \| Materials \| Video
12:30-1:30	Lunch
1:30-3:00	Book Tagging Panel Discussion Bruce Rosenblum, Inera, Inc., Moderator Martin Latterner, NCBI/NLM/NIH Wendell Piez, Piez Consulting Services Cindy Maisannes, CFA Institute Daniel Grossberg, Grapevine Publishing Services, LLC Jake Furbush, MIT Press Presentations on tagging book content in XML will be followed by a discussion. Full Paper \| Materials \| Video
3:00-3:30	Coffee Break
3:30-4:15	Author Generated JATS XML Markup Andy Gajetzki, Internet Scientific Publications, LLC Oliver Wenker, Internet Scientific Publications, LLC At Internet Scientific Publications, we have since day one marked up submitted manuscripts using an in-house developed Microsoft Word macro. After 14 years, we feel that this approach is not ideal for two reasons: 1) most errors that exist in the finished XML are introduced during the data-entry / markup stage, and 2) markup represents a significant time expense for our staff that could be better spent elsewhere. Since we only charge at the point an article is accepted for publication, there is a time investment marking up manuscripts that may never be monetarily recouped. Consequently, we have explored the option of allowing authors to mark up their own documents from our submission frontend website. There are drawbacks to this approach, namely the complexity and completeness of JATS and the huge learning curve a non-technical author would encounter, but we have in turn concluded that a majority of the JATS definition does not need to be made available to an author in our frontend application. If an article requires more specific markup that we do not support in the application, we can always fall back to publisher-side markup using our tried and tested Word macro. Quality control occurs later in the pipeline during copy-editing regardless of which markup pathway is followed. To facilitate this, we have created a self-contained Symfony2 bundle that supports manuscript markup utilizing a subset of the JATS Journal Publishing 3.0 tag suite. Much of the front and back matter is captured using simple form inputs and is validated using regular expressions developed from common input patterns. For the body, an HTML5 DOM-based WYSIWYG editor is used. Although the generated markup is HTML5, by using a subset of JATS, we can unambiguously map between the two markup languages. We speculate that Amazon Mechanical Turk could be used to simplify certain article markup tasks such as endnotes, where it would be off-putting for the author to tokenize the citation string. While the distribution model of a final product has not been determined, it will most likely be made available in a dual-licensed manner depending on the commerciality of the customer. Full Paper \| Materials \| Video
4:15-5:00	Mapping JATS to RDF using the SPAR (Semantic Publishing and Referencing) Ontologies Silvio Peroni, University of Bologna Deborah Aleyne Lapeyre, Mulberry Technologies, Inc. David Shotton, University of Oxford We will present a mapping of the metadata and bibliographic references from the Journal Article Tag Suite (JATS) to RDF, using the SPAR (Semantic Publishing and Referencing) ontologies together with elements from other well-known vocabularies. This mapping will permit XML documents marked up using JATS to be converted automatically to RDF, enabling the information contained within those documents to be published to the Semantic Web in a manner that is (hopefully) unambiguous and universally understood. By so doing, we hope to facilitate the publication of bibliographic information on the web as linked open data and to enhance the toolkit for libraries, archives, and publishers who have chosen to encode their journal material in NISO JATS. Full Paper \| Materials \| Materials \| Video

Wednesday, October 17, 2012

9:00-9:45	Implementation of TaxPub, an NLM DTD Extension for Domain-specific Markup in Taxonomy, from the Experience of a Biodiversity Publisher Lyubomir Penev, Institute of Biodiversity and Ecosystem Research, Bulgarian Academy of Sciences, Sofia, Bulgaria & Pensoft Publishers, Sofia, Bulgaria Terence Catapano, Plazi, Bern, Switzerland & Columbia University, New York, NY Donat Agosti, Plazi, Bern, Switzerland Teodor Georgiev, Pensoft Publishers, Sofia, Bulgaria Guido Sautter, Plazi, Bern, Switzerland Pavel Stoev, Institute of Biodiversity and Ecosystem Research, Bulgarian Academy of Sciences, Sofia, Bulgaria & National Natural History Museum, Bulgarian Academy of Sciences, Sofia, Bulgaria TaxPub was created as an XML extension to the general JATS to provide domain-specific markup for prospective publishing in the area of biological systematics. The core idea of the schema is to delimit descriptions of taxa, or treatments, within an article, and to use these individual portions of information for various purposes. TaxPub was developed in close cooperation between the author (Terence Catapano), a community interested in such markup (Plazi), the NLM JATS group, and a journal publisher (Pensoft). Since July 2009, TaxPub has been routinely implemented in the everyday publishing practice of Pensoft, to provide: (1) Semantically enhanced, domain-specific XML versions of articles for archiving in PubMed Central (PMC); (2) Visualization of taxon treatments on PMC; (3) Export of taxon treatments to various aggregators, such as Encyclopedia of Life, Plazi Treatment Repository, and the Wiki Species-ID.net. Full Paper \| Materials \| Video
9:45-10:30	JATS for both Journals and Books? — A Case Study of Adopting JATS to Build a Single Search for Ejournals and Ebooks Wei Zhao, OCUL—Scholars Portal Jayanthy Chengan, OCUL—Scholars Portal Ontario Scholars Portal (SP) Journals is an XML-based digital repository containing over 31,000,000 articles from over 13,400 full-text journals from 24 publishers. Since 2006 it has successfully used the NLM Journal Archiving and Interchange Tag Set for its XML-based e-journals system, which runs on MarkLogic. Scholars Portal Books is a PDF-based platform containing 460,000 ebooks from 25 publishers, running on Ebrary's ISIS system. While PDF is still the dominant format for ebooks, publishers are starting to move from PDF to XML. This article describes a pilot that transforms a publisher's XML book into the NLM book DTD XML format and loads it into MarkLogic alongside Journals, so that users can get book chapter results from the Journals search interface. The article will discuss why the NLM book DTD was chosen, examine the process of transforming and loading the data, analyze the benefits and challenges, and make recommendations for improving the book DTD. Full Paper \| Materials \| Video
10:30-11:00	Coffee Break
11:00-11:45	DtdAnalyzer—a Tool for Analyzing and Manipulating XML DTDs Demian Hess, Avalon Consulting, LLC Audrey Hamelers, NCBI/NLM/NIH Chris Maloney, NCBI/NLM/NIH This paper describes an open-source Java/XSLT application that allows users to analyze and manipulate XML DTDs. The application can be used to generate reference documentation, create reports comparing two DTDs, convert DTDs into Schematron, and automatically scaffold conversion scripts. The DtdAnalyzer tool has been used at the National Center for Biotechnology Information (NCBI), a part of the National Library of Medicine, for a number of years, to help maintain the suite of NLM DTDs, and to develop new conversion applications. NCBI has recently decided to release it as a self-contained application and to move its development onto GitHub (https://github.com/NCBITools/DtdAnalyzer). The heart of the tool is a Java application which converts a DTD into an XML representation, including all the element, attribute, and entity declarations contained inside the DTD. Additionally, DTDs can be annotated with specially-formatted comments which are recognized by this converter, and these annotations are delivered in the XML output. The resulting XML can be transformed to create many useful outputs, and a basic set of transformation stylesheets for the outputs described above is included with this tool. Full Paper \| Materials \| Video
11:45-12:30	Reducing Costs and Expanding XML Submissions with PDF to JATS Conversion Keishi Katoh, Digital Communications Co., Ltd Tokushige Kobayashi, Antenna House, Inc. Kitazawa Mitsuru, Japan Science and Technology Agency (JST) The paper presents a brief overview of the challenges institutions face with the XML-ization of academic journals and the steps being taken in Japan to meet those challenges, with both the new J-STAGE3 implementation and a solution for automatically analyzing and converting PDF into XML for JATS metadata and bibliographic information. J-STAGE3 has fully adopted the metadata and bibliographic JATS format. The automated solution is currently achieving an accuracy rate of more than 90%, and future plans are to expand it to produce full-text XML from PDF. Full Paper \| Materials \| Video
12:30-1:30	Lunch
1:30-3:00	Journal Article Tag Suite Update and Open Discussion
3:00-3:30	Coffee Break
3:30-4:15	Developing a Schematron—Owning Your Content Markup: A Case Study Julie Blair, SAGE Publications Inc. This paper will detail an organization's development and implementation of Schematron in its workflow process to cut down on errors as well as develop consistent markup across articles and journals. The process for developing the Schematron will be explored. This consisted of compiling error reports from 8 months of data as the basis for writing rules. The paper will examine how the Schematron was implemented into a Content Management System and broken up into Phases for the varied workflows of the organization. Upon content ingestion, files are validated against a specific Phase in the Schematron, based on the workflow, and reports are generated if any rules throw an error or warning. The results of the implementation of the Schematron will be summarized. A decline in errors was realized, which reduced the average number of deliveries prior to online approval. The case study demonstrates how introducing Schematron into an XML workflow can help a publisher drive their content markup while reducing publishing delays and cost of corrections. Full Paper \| Materials \| Video
4:15-5:00	Beware! The Spork Jeffrey Beck, NCBI/NLM/NIH Advice, advice, advice. There is so much advice on how to use the JATS. And everyone seems to have an opinion about how you should deal with your content. Full Paper \| Materials \| Video