Notice

N.B. For up-to-date information on the Citation Style Language (CSL), visit the new project home at CitationStyles.org.

Problem

Bibliographic management and citation formatting are central to the practice of all manner of research. The current bibliographic software landscape is divided broadly between a commercial market characterized by buggy software and glacial innovation, and an open software ecosystem built around BibTeX.

BibTeX’s success is a function of three factors. First, BibTeX was designed to meet a real need: allowing LaTeX users to format their manuscripts according to detailed publisher specifications. Second, it has a dedicated styling language to configure such formatting. Finally, it focuses on a single task: bibliographic and citation encoding and formatting. As a result, a variety of tools have been built around it. A GUI application designer can simply focus on how best to manage references, without having to worry about the obscure complexities of bibliographic and citation formatting.

Nevertheless, BibTeX is quite limited. Its data model is unsuitable for demanding users in the social sciences and humanities, it has no international support, its style files are written in an obscure language that is very difficult to work with, and it is limited to LaTeX.

Purpose

XBib provides important building blocks for dramatically improved bibliographic and citation support in XML. The project consists of three key pieces:

  1. Cite Schema: a small namespaced schema for marking up citations in XML; recently approved for inclusion in the OpenOffice file format, it is suitable for embedding in other document formats, including WordML.
  2. Citation Style Language (CSL): an XML language for specifying citation and bibliographic formatting; similar in principle to BibTeX .bst files or the binary style files in commercial products like EndNote or Reference Manager, this styling language has the distinction of being open, easy to use, and feature-rich.
  3. CiteProc: a processor that formats documents based on the specifications in a CSL file; implemented using XSLT 2.0. The design is structured around a driver architecture that allows support for a variety of input and output formats. The initially supported input formats are DocBook NG and MODS. While the focus is on XML formats like DocBook, TEI, WordML and OpenOffice, it is feasible to support non-XML formats such as LaTeX, RTF or Textile.
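
The driver idea behind CiteProc can be illustrated with a minimal sketch. This is not CiteProc itself, which is implemented in XSLT 2.0; it is a Python illustration under stated assumptions. The names Reference, input_driver, output_driver, author_date_style and format_bibliography are all hypothetical, the tab-separated input format is a toy stand-in for formats like MODS or DocBook, and the style function is a stand-in for a CSL file.

```python
import html
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Reference:
    """Format-neutral record that every (hypothetical) input driver produces."""
    author: str
    year: str
    title: str


# Hypothetical driver registries: format name -> driver function.
INPUT_DRIVERS: Dict[str, Callable[[str], List[Reference]]] = {}
OUTPUT_DRIVERS: Dict[str, Callable[[List[str]], str]] = {}


def input_driver(name: str):
    """Register a function that parses one source format into References."""
    def register(func):
        INPUT_DRIVERS[name] = func
        return func
    return register


def output_driver(name: str):
    """Register a function that serializes formatted entries to one target format."""
    def register(func):
        OUTPUT_DRIVERS[name] = func
        return func
    return register


@input_driver("tsv")
def read_tsv(source: str) -> List[Reference]:
    # Toy input format: one "author<TAB>year<TAB>title" record per line.
    refs = []
    for line in source.splitlines():
        if line.strip():
            author, year, title = line.split("\t")
            refs.append(Reference(author, year, title))
    return refs


@output_driver("text")
def write_text(entries: List[str]) -> str:
    return "\n".join(entries)


@output_driver("html")
def write_html(entries: List[str]) -> str:
    items = "".join(f"<li>{html.escape(entry)}</li>" for entry in entries)
    return f"<ul>{items}</ul>"


def author_date_style(ref: Reference) -> str:
    # Stand-in for a style definition; a real CSL file is far richer.
    return f"{ref.author}. {ref.year}. {ref.title}."


def format_bibliography(source: str, input_format: str, output_format: str,
                        style: Callable[[Reference], str] = author_date_style) -> str:
    """Parse with an input driver, apply the style, serialize with an output driver."""
    refs = INPUT_DRIVERS[input_format](source)
    ordered = sorted(refs, key=lambda r: (r.author, r.year))
    return OUTPUT_DRIVERS[output_format]([style(r) for r in ordered])


if __name__ == "__main__":
    sample = "Roe, Richard\t2001\tSecond\nDoe, Jane\t1999\tFirst"  # placeholder data
    print(format_bibliography(sample, "tsv", "text"))
    print(format_bibliography(sample, "tsv", "html"))
```

In this sketch the style function never sees the source or target format, so supporting a new format means registering one more driver rather than touching any style.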

The goals of the XBib project are in some sense quite modest. The aim is not to create bibliographic applications. Instead, the focus is on key tools and standards that are needed to advance the state of the art in a rather neglected but essential aspect of scholarly work: citation and bibliographic formatting. XBib exploits pervasive internet-based protocols and standards such as HTTP, XML and XSLT and emerging library standards such as MODS, SRU/W and CQL where possible, and invents new ones, such as CSL, where necessary. By narrowing the focus to these issues, the hope is that it will be easier for other projects to address these needs with minimal work.

On the other hand, the goals are quite ambitious indeed. XBib aims to provide a common framework for formatting bibliographies and citations across markup languages and document standards. In an ideal world, one could use the same CSL files to format DocBook, TEI, OpenOffice, WordML ... or even LaTeX documents. Moreover, such files should be available via web service access from an online repository.

In a world where XML is finally coming into its own as a useful way to encode information, and where there is increasing interest in principles of the semantic web, it is essential to address this missing piece of the puzzle.

TO DO

The following remains to be done, in order of importance:

  1. Citation style language and stylesheets need to be finalized.
  2. Drivers need to be written for other document formats.
  3. An online style repository and associated web service access need to be implemented.

Acknowledgements

This project builds on earlier work by Markus Hoenicka and Peter Flynn. CiteProc has greatly benefitted from the generous help provided by members of the xsl-list, in particular David Carlisle, Geert Josten, Michael Kay, Wendell Piez, and Jeni Tennison.