version 1.00a64
2 March 2010
This is the site administrator's manual for citeproc-js, a
Javascript implementation of the
Citation Style Language
(CSL) used by Zotero, Mendeley and other popular reference
tools to format citations in any of the hundreds of available CSL
styles. The processor complies with version 1.0 of the CSL
specification, has been written and tested as an independent module,
and can be run by any ECMAscript-compliant interpreter. With an
appropriate supporting environment, [1] it can be deployed in a
browser plugin, as part of a desktop application, or as a formatting
backend for a website or web service.
This manual covers the basic operation of the processor, including the command set, the local system code that must be supplied by the integrator, and the expected format of input data. In addition, notes are provided on the test suite, on the infrastructure requirements for running the processor in particular environments, and on extended functionality that is available to address certain special requirements.
Comments and complaints relating to this document and to the processor itself
will be gladly received and eventually despatched with. The best channel
for providing feedback and getting help is the
project mailing list.
| [1] | For further details on required infrastructure, see the sections Local Environment and Data Input below. |
The processor is written in Javascript, one of the interesting
features of which is the lack of a standard method of I/O. As a
result, the processor must be wrapped in other code to get data in and
out of it, and every installation is going to be a little different.
This manual does not cover the nitty-gritty of setting up the
processor for a particular environment, but
the basic system requirements are described below. If you get stuck
and want advice, or if you find something in this manual that is out
of date or just wrong, please feel free to drop a line to the
project list.
The citeproc-js sources are hosted on
BitBucket.
To obtain the sources, install the
Mercurial version control system
on a computer within your control (if you're on a Linux distro or a Mac,
just do a package install), and run the following command:
hg clone http://bitbucket.org/fbennett/citeproc-js/
This should get you a copy of the sources, and you should be able to exercise the test framework using the ./test.py script.
An ECMAScript interpreter with E4X support is required to run the
processor. The Rhino, Spidermonkey and Tracemonkey Javascript
interpreters all satisfy this requirement. The V8 interpreter used by
Google Chrome and Node does not. The task of tying a Javascript
interpreter into a given web framework or application is beyond
the scope of this manual; but in Python-based environments, the
python-spidermonkey bridge module by Paul Davis may be worth a
look.
Instructions on running the processor test suite can be found in the section Running the test suite at the end of this manual.
The primary source code of the processor is located under ./src, for ease of maintenance. The files necessary for use in a runtime environment are catenated, in the appropriate sequence, in the citeproc.js file, located in the root of the source archive. This file and the test fixtures can be refreshed using the ./test.py -r command.
To build the processor, the citeproc.js source code should be loaded into the Javascript interpreter context, together with a sys object provided by the integrator (see below), and the desired CSL style (as a string).
The processor command set will be a grave disappointment to those well versed in the tormented intricacies of reference management and bibliography formatting. The processor is instantiated with a single command, controlled with three others, and has just two commands for adjustments to its runtime configuration.
A working instance of the processor can (well, must) be created using the CSL.Engine() command, as shown in the code illustration below. This command takes up to three arguments, two of them required, and one of them optional:
Important
See the section Local Environment → System functions below for guidance on the definition of the functions contained in the sys object.
var citeproc = new CSL.Engine(sys,
                              style,
                              lang);
Before citations or a bibliography can be generated, an ordered list of reference items must ordinarily be loaded into the processor using the updateItems() command, as shown below. This command takes a list of item IDs as its sole argument, and will reconcile the internal state of the processor to the provided list of items, making any necessary insertions and deletions, and making any necessary adjustments to internal registers related to disambiguation and so forth.
Hint
The sequence in which items are listed in the argument to updateItems() will be reflected in the ordering of bibliographies only if the style installed in the processor does not impose its own sort order.
var my_ids = [
    "ID-1",
    "ID-53",
    "ID-27"
];

citeproc.updateItems( my_ids );
Note that only IDs may be used to identify items. The ID is an arbitrary, system-dependent identifier, used by the locally customized retrieveItem() method to retrieve actual item data.
The makeBibliography() command does what its name implies. If invoked without an argument, it dumps a formatted bibliography containing all items currently registered in the processor:
var mybib = citeproc.makeBibliography();
Important
Matches against the content of name and date variables are not possible, but empty fields can be matched for all variable types. See the quash example below for details.
The makeBibliography() command accepts one optional argument, which is a nested Javascript object that may contain one of the objects select, include or exclude, and optionally an additional quash object. Each of these four objects is an array containing one or more objects with field and value attributes, each with a simple string value (see the examples below). The matching behavior for each of the four object types, with accompanying input examples, is as follows:
Hint
The target field in the data items registered in the processor may either be a string or an array. In the latter case, an array containing a value identical to the relevant value is treated as a match.
var myarg = {
    "select" : [
        {
            "field" : "type",
            "value" : "book"
        },
        {
            "field" : "categories",
            "value" : "1990s"
        }
    ]
};
var mybib = citeproc.makeBibliography(myarg);
var myarg = {
    "include" : [
        {
            "field" : "type",
            "value" : "book"
        }
    ]
};
var mybib = citeproc.makeBibliography(myarg);
var myarg = {
    "exclude" : [
        {
            "field" : "type",
            "value" : "legal_case"
        },
        {
            "field" : "type",
            "value" : "legislation"
        }
    ]
};
var mybib = citeproc.makeBibliography(myarg);
Hint
An empty string given as the field value will match items for which that field is missing or has a nil value.
var myarg = {
    "include" : [
        {
            "field" : "categories",
            "value" : "classical"
        }
    ],
    "quash" : [
        {
            "field" : "type",
            "value" : "manuscript"
        },
        {
            "field" : "issued",
            "value" : ""
        }
    ]
};
var mybib = citeproc.makeBibliography(myarg);
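The two hints above (array-valued target fields, and the empty string as a match for missing fields) can be sketched as a single matching function. This is only an illustration of the described behavior, not the processor's own code: the name matchesSpec is hypothetical, and treating an empty string in the item data as "nil" is an assumption.

```javascript
// Hypothetical helper: tests one { field, value } spec against one item.
// A sketch of the rules described in the hints; not citeproc-js code.
function matchesSpec(item, spec) {
    var target = item[spec.field];
    // An empty value matches a missing or nil field
    // (an empty string in the data is treated as nil here, by assumption).
    if (spec.value === "") {
        return target === undefined || target === null || target === "";
    }
    // An array-valued field matches if any member equals the value.
    if (Array.isArray(target)) {
        return target.indexOf(spec.value) > -1;
    }
    return target === spec.value;
}

var sampleItem = {
    "id": "ID-1",
    "type": "book",
    "categories": ["1990s", "classical"]
};
var hasCategory = matchesSpec(sampleItem, { "field": "categories", "value": "1990s" });
var lacksDate = matchesSpec(sampleItem, { "field": "issued", "value": "" });
```

How the results for several specs are combined differs between select, include, exclude and quash, and is left out of this sketch.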
The value returned by this command is a two-element list, composed of a Javascript object containing certain formatting parameters, and a list of strings representing bibliography entries. It is the responsibility of the calling application to compose the list into a finished string for insertion into the document. The first element, the object of formatting parameters, contains the key/value pairs shown below; the values shown are the processor defaults in the HTML output mode:
[
    {
        "maxoffset": 0,
        "entryspacing": 1,
        "linespacing": 1,
        "hangingindent": 0,
        "bibstart": "<div class=\"csl-bib-body\">\n",
        "bibend": "</div>"
    },
    [
        "<div class=\"csl-entry\">Book A</div>",
        "<div class=\"csl-entry\">Book C</div>"
    ]
]
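As a rough sketch of how a calling application might compose this return value into a finished string, the helper below wraps the entries in the bibstart and bibend values. The name composeBibliography and the newline used to separate entries are assumptions made for illustration; a real integration may also need to act on the other formatting parameters.

```javascript
// Hypothetical helper: joins the two-element return value of
// makeBibliography() into one string. The newline separator between
// entries is an assumption, not something the processor prescribes.
function composeBibliography(result) {
    var params = result[0];
    var entries = result[1];
    return params.bibstart + entries.join("\n") + "\n" + params.bibend;
}

// Sample data copied from the return value shown above.
var sampleResult = [
    {
        "maxoffset": 0,
        "entryspacing": 1,
        "linespacing": 1,
        "hangingindent": 0,
        "bibstart": "<div class=\"csl-bib-body\">\n",
        "bibend": "</div>"
    },
    [
        "<div class=\"csl-entry\">Book A</div>",
        "<div class=\"csl-entry\">Book C</div>"
    ]
];
var bibHtml = composeBibliography(sampleResult);
```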
Citation commands generate strings for insertion into the text of a target document. Citations can be added to a document in one of two ways: as a batch process (BibTeX, for example, works in this way) or interactively (Endnote, Mendeley and Zotero work in this way, through a connection to the user's word processing software). These two modes of operation are supported in citeproc-js by two separate commands, respectively appendCitationCluster(), and processCitationCluster(). A third, simpler command (makeCitationCluster()), is not covered by this manual. It is primarily useful as a tool for testing the processor, as it lacks any facility for position evaluation, which is needed in production environments. [2]
The appendCitationCluster() and processCitationCluster() commands use a similar input format for citation data, which is described below in the section Data Input → Citation data object.
The appendCitationCluster() command takes a single citation object as argument, and an optional flag to indicate whether a full list of bibliography items has already been registered in the processor with the updateItems() command. If the flag is true, the command should return an array containing exactly one two-element array, consisting of the current index position as the first element, and a string for insertion into the document as the second. To wit:
citeproc.appendCitationCluster(mycitation, true);

[
    [ 5, "(J. Doe 2000)" ]
]
If the flag is false, invocations of the command may return multiple elements in the list, when the processor senses that the additional bibliography items added by the citation require changes to other citations to achieve disambiguation. In this case, a typical return value might look like this:
citeproc.appendCitationCluster(mycitation);

[
    [ 2, "(Jake Doe 2000)" ],
    [ 5, "(John Doe 2000)" ]
]
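A calling application has to write each returned string back to the citation at the given index. The sketch below shows one way to do that against a plain array of rendered strings. The helper name applyCitationUpdates is hypothetical, and the sketch assumes that the returned index can be used directly as a position in the caller's own array.

```javascript
// Hypothetical helper: applies [index, string] pairs, as returned by
// appendCitationCluster(), to an array of rendered citation strings.
// Assumes the index maps directly onto positions in the caller's array.
function applyCitationUpdates(rendered, updates) {
    for (var i = 0; i < updates.length; i++) {
        var index = updates[i][0];
        var text = updates[i][1];
        rendered[index] = text;
    }
    return rendered;
}

var renderedCitations = [
    "(Smith 1999)", "(Roe 2001)", "(Doe 2000)", "(Noakes 1952)", "(Snoakes 1967)"
];
// Sample return value copied from the illustration above.
applyCitationUpdates(renderedCitations, [
    [ 2, "(Jake Doe 2000)" ],
    [ 5, "(John Doe 2000)" ]
]);
```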
The processCitationCluster() command is used to generate and maintain citations dynamically in the text of a document. It takes three arguments: a citation object, a list of citation ID/note index pairs representing existing citations that precede the target citation, and a similar list of pairs for citations coming after the target. Like the appendCitationCluster() command run without a flag, its return array may contain multiple elements, where the edit or addition of a citation triggers changes to other citations:
var citationsPre = [ ["citation-abc", 1], ["citation-def", 2] ];

var citationsPost = [ ["citation-ghi", 4] ];

citeproc.processCitationCluster(citation, citationsPre, citationsPost);

[
    [ 1, "(Ronald Snoakes 1950)" ],
    [ 3, "(Richard Snoakes 1950)" ]
]
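The two lists of citation ID/note index pairs can be derived from a single ordered list kept by the calling application. The sketch below covers the case of inserting a new citation at a given position; the helper name splitCitations and the shape of the bookkeeping list are assumptions for illustration.

```javascript
// Hypothetical helper: splits the document's existing citations, held as
// ordered [citationID, noteIndex] pairs, into the lists that precede and
// follow the position at which a new citation is being inserted.
function splitCitations(pairs, position) {
    return {
        pre: pairs.slice(0, position),
        post: pairs.slice(position)
    };
}

var existingCitations = [
    ["citation-abc", 1],
    ["citation-def", 2],
    ["citation-ghi", 4]
];
// A new citation goes between "citation-def" and "citation-ghi".
var surrounding = splitCitations(existingCitations, 2);
```

The pre and post lists could then be passed as the second and third arguments to processCitationCluster(). When an existing citation is being edited rather than inserted, its own pair would need to be left out of both lists.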
A worked example showing the result of multiple transactions can be
found in the
processor test suite.
The output format of the processor can be changed after instantiation using the setOutputFormat() command. This command is specific to the citeproc-js processor.
Hint
See the section Output Formatting below for notes on defining new output formats.
citeproc.setOutputFormat("rtf");
The processor recognizes abbreviation lists for journal titles, series titles, authorities (such as the Supreme Court of New York), and institution names (such as International Business Machines). A list can be set in the processor using the setAbbreviations() command, with the name of the list as sole argument. The named list is fetched and installed by the sys.getAbbreviations() command, documented below under Local Environment → System Functions.
citeproc.setAbbreviations( "default" );
| [2] | For illustrations of the input syntax for the makeCitationCluster()
command, see any test in the test suite that uses the
CITATION-ITEMS environment; it accepts a bare
array of citationItems objects, as described under
Data Input → Citation data object, below. |
While citeproc-js does a great deal of the heavy lifting needed for correct formatting of citations and bibliographies, a certain amount of programming is required to prepare the environment for its correct operation.
As mentioned above in the section on CSL.Engine(), two functions must be defined separately and supplied to the processor upon instantiation. These functions are used by the processor to obtain locale and item data from the surrounding environment. The exact definition of each may vary from one system to another; those given below assume the existence of a global DATA object in the context of the processor instance, and are provided only for the purpose of illustration.
The retrieveLocale() function is used internally by the processor to retrieve the serialized XML of a given locale. It takes a single RFC 4646 compliant language tag as argument, composed of a single language tag (en) or of a language tag and region subtag (en-US). The name of the XML document in the CSL distribution that contains the relevant locale data may be obtained from the CSL.localeRegistry array. The sample function below is provided for reference only.
sys.retrieveLocale = function(lang){
    var ret = DATA._locales[ CSL.localeRegistry[lang] ];
    return ret;
};
The retrieveItem() function is used by the processor to fetch individual items from storage.
sys.retrieveItem = function(id){
    return DATA._items[id];
};
The getAbbreviations() command is invoked by the processor at startup, and when the setAbbreviations() command is invoked on the instantiated processor. The abbreviation list retrieved by the processor should have the following structure:
ABBREVS = {
    "default": {
        "journal":{
            "Journal of Irreproducible Results":"J. Irrep. Res."
        },
        "series":{
            "International Rescue Wildlife Series":"I.R. Wildlife Series"
        },
        "authority":{
            "United States Patent and Trademark Office": "USPTO"
        },
        "institution":{
            "Bureau of Gaseous Unformed Stuff":"BoGUS"
        }
    }
};
If the object above provides the abbreviation store for the system, an appropriate sys.getAbbreviations() function might look like this:
sys.getAbbreviations = function(name){
    return ABBREVS[name];
};
The locally defined retrieveItem() function must return data for the target item as a simple Javascript object containing recognized CSL fields. [3] The layout of the three field types is described below.
Text and numeric variables are not distinguished in the data layer; both should be presented as simple strings.
{ "title" : "My Anonymous Life",
  "volume" : "10"
}
When present in the item data, CSL name variables must be delivered as a list of Javascript objects, with one object for each name represented by the variable. Simple personal names are composed of family and given elements, containing respectively the family and given name of the individual.
{ "author" : [
    { "family" : "Doe", "given" : "Jonathan" },
    { "family" : "Roe", "given" : "Jane" }
  ],
  "editor" : [
    { "family" : "Saunders",
      "given" : "John Bertrand de Cusance Morant" }
  ]
}
Institutional and other names that should always be presented literally (such as "The Artist Formerly Known as Prince", "Banksy", or "Ramses IV") should be delivered as a single literal element in the name object:
{ "author" : [
    { "literal" : "Society for Putting Things on Top of Other Things" }
  ]
}
Name particles, such as the "von" in "Wernher von Braun", can be delivered separately from the family and given name, as dropping-particle and non-dropping-particle elements. Name suffixes such as the "Jr." in "Frank Bennett Jr." can be delivered as a suffix element.
Hint
A simplified format for delivering particles and name suffixes to the processor is described below in the section Dirty Tricks → Input data rescue → Names.
{ "author" : [
    { "family" : "Humboldt",
      "given" : "Alexander",
      "dropping-particle" : "von"
    },
    { "family" : "Gogh",
      "given" : "Vincent",
      "non-dropping-particle" : "van"
    },
    { "family" : "Stephens",
      "given" : "James",
      "suffix" : "Jr."
    },
    { "family" : "van der Vlist",
      "given" : "Eric"
    }
  ]
}
Names not written in the Latin or Cyrillic scripts [4] are always displayed with the family name first. No special hint is needed in the input data; the processor is sensitive to the character set used in the name elements, and will handle such names appropriately.
{ "author" : [
    { "family" : "村上",
      "given" : "春樹"
    }
  ]
}
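To make the idea of script sensitivity concrete, the sketch below tests whether a name uses only Latin or Cyrillic characters. It is purely illustrative: the function name isByzantine and the Unicode ranges chosen are assumptions, and the processor's own test may well differ.

```javascript
// Illustrative only: the ranges below (Basic Latin through Latin
// Extended-B, Latin Extended Additional, and Cyrillic) are an assumption,
// not the processor's actual character test.
var NON_BYZANTINE_CHAR = /[^\u0000-\u024F\u1E00-\u1EFF\u0400-\u052F]/;

// Hypothetical helper: true if the family and given elements contain
// only characters from the ranges above.
function isByzantine(name) {
    var text = (name.family || "") + (name.given || "");
    return !NON_BYZANTINE_CHAR.test(text);
}

var kanjiName = { "family": "村上", "given": "春樹" };
var romanizedName = { "family": "Murakami", "given": "Haruki" };
var kanjiIsByzantine = isByzantine(kanjiName);
var romanizedIsByzantine = isByzantine(romanizedName);
```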
Hint
When the romanized transliteration is selected from a multi-lingual name field, the static-ordering flag is not required. See the section Dirty Tricks → Multi-lingual content below for further details.
Sometimes it might be desired to handle a Latin or Cyrillic transliteration as if it were a fixed (non-Byzantine) name. This behavior can be prompted by including a static-ordering element in the name object. The actual value of the element is irrelevant, so long as it returns true when tested by the Javascript interpreter.
{ "author" : [
    { "family" : "Murakami",
      "given" : "Haruki",
      "static-ordering" : 1
    }
  ]
}
Date fields are Javascript objects, within which the "date-parts" element is a nested Javascript array containing a start date and optional end date, each of which consists of a year, an optional month and an optional day, in that order if present.
Hint
A simplified format for providing date input
is described below in the section
Dirty Tricks → Input data rescue → Dates.
{ "issued" : {
    "date-parts" : [
      [ "2000", "1", "15" ]
    ]
  }
}
Date elements may be expressed either as numeric strings or as numbers.
{ "issued" : {
    "date-parts" : [
      [ 1895, 11 ]
    ]
  }
}
The year element may be negative, but never zero.
{ "issued" : {
    "date-parts" : [
      [ -200 ]
    ]
  }
}
A season element may also be included. If present, string or number values between 1 and 4 will be interpreted to correspond to Spring, Summer, Fall, and Winter, respectively.
{ "issued" : {
    "date-parts" : [
      [ 1950 ]
    ],
    "season" : "1"
  }
}
Other string values are permitted in the season element, but note that these will appear in the output as literal strings, without localization:
{ "issued" : {
    "date-parts" : [
      [ 1975 ]
    ],
    "season" : "Trinity"
  }
}
For approximate dates, a circa element should be included, with a non-nil value:
{ "issued" : {
    "date-parts" : [
      [ -225 ]
    ],
    "circa" : 1
  }
}
To input a date range, add an array representing the end date, with corresponding elements:
{ "issued" : {
    "date-parts" : [
      [ 2000, 11 ],
      [ 2000, 12 ]
    ]
  }
}
To specify an open-ended range, pass nil values for the end elements:
{ "issued" : {
    "date-parts" : [
      [ 2008, 11 ],
      [ 0, 0 ]
    ]
  }
}
A literal string may be passed through as a literal element:
{ "issued" : {
    "literal" : "13th century"
  }
}
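Systems that hold dates as native Javascript Date objects need a small conversion step to produce the date-parts layout. The sketch below is one possible approach; the helper name toCslDate is hypothetical, and UTC accessors are used here only to keep the result independent of the local time zone.

```javascript
// Hypothetical helper: converts a Javascript Date into a CSL date object
// with a single [year, month, day] entry in "date-parts".
// Javascript months are zero-based, so 1 is added to the month.
function toCslDate(date) {
    return {
        "date-parts": [
            [ date.getUTCFullYear(), date.getUTCMonth() + 1, date.getUTCDate() ]
        ]
    };
}

var issuedDate = toCslDate(new Date(Date.UTC(2000, 0, 15)));
```

This sketch makes no attempt to handle ranges, seasons, approximate dates, or years before the common era.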
A minimal citation data object, used as input by both the processCitationCluster() and appendCitationCluster() commands, has the following form:
{
    "citationItems": [
        {
            "id": "ITEM-1"
        }
    ],
    "properties": {
        "noteIndex": 1
    }
}
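A calling application will usually build this object from a list of item IDs. A minimal sketch, using a hypothetical helper named makeCitation:

```javascript
// Hypothetical helper: builds a minimal citation data object from a list
// of item IDs and a note index.
function makeCitation(ids, noteIndex) {
    var items = [];
    for (var i = 0; i < ids.length; i++) {
        items.push({ "id": ids[i] });
    }
    return {
        "citationItems": items,
        "properties": { "noteIndex": noteIndex }
    };
}

var sampleCitation = makeCitation(["ITEM-1", "ITEM-2"], 1);
```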
The citationItems array is a list of one or more citation item objects, each containing an id used to retrieve the bibliographic details of the target resource. A citation item object may contain one or more additional optional values:
In the properties portion of a citation, the noteIndex value indicates the footnote number in which the citation is located within the document. Citations within the main text of the document have a noteIndex of zero.
The processor will add a number of data items to a citation during processing. Values added at the top level of the citation structure include:
Values added to individual citation item objects may include:
Citations are registered and accessed by the processor internally in arrays and Javascript objects. Calling applications should not need to access this data directly, but it is available in the processor registry, at the following locations:
citeproc.registry.citationreg.citationById

citeproc.registry.citationreg.citationByIndex

citeproc.registry.citationreg.citationByItemId
| [3] | For information on valid CSL variable names, please refer to the CSL specification, available via http://citationstyles.org/. |
| [4] | The Latin and Cyrillic scripts are referred to here collectively as "Byzantine scripts", after the confluence of cultures in the first millennium that spanned both. |
The test fixtures assume HTML output, which the processor supports out
of the box as its default mode. It is currently the only mode
supported in the distributed version of the code, but additional modes
can be created by adding definitions for them to the source file ./src/formats.js.
See
the file itself for details; it's pretty straightforward.
This section presents features of the citeproc-js processor that are not properly speaking a part of the CSL specification. The functionality described here may or may not be found in other CSL 1.0 compliant processors, when they arrive on the scene.
Systems that use a simple two-field entry format can encode non-dropping-particle and dropping-particle elements on a name by including them in the family or given fields, respectively, setting the parse-names flag on the name object to indicate that the processor should perform particle extraction on these fields:
{ "author" : [
    { "family" : "Humboldt",
      "given" : "Alexander von",
      "parse-names" : true
    },
    { "family" : "van Gogh",
      "given" : "Vincent",
      "parse-names" : true
    }
  ]
}
The extraction of "non-dropping" particles is done by scanning the family field for leading terms that contain no uppercase letters. The extraction of "dropping" particles is done by scanning the given field for trailing terms that contain no uppercase letters.
For some names, leading lowercase terms in the family field should be treated as part of the name itself, and not as particles. The parse-names flag should not be set on such names:
{ "author" : [
    { "family" : "van der Vlist",
      "given" : "Eric"
    }
  ]
}
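The extraction rule described above can be sketched roughly as follows. This is not the processor's internal code: the function names are hypothetical, and the choice to always leave at least one term in each field is an assumption made to keep the sketch safe for all-lowercase input.

```javascript
// True if the term contains no uppercase letters.
function hasNoUppercase(term) {
    return term === term.toLowerCase();
}

// Splits a field into whitespace-separated terms, ignoring empty strings.
function splitTerms(text) {
    return (text || "").split(/\s+/).filter(function (t) { return t.length > 0; });
}

// Hypothetical helper: returns a copy of a name object with particles
// extracted, but only when the parse-names flag is set.
function extractParticles(name) {
    var out = {};
    for (var key in name) {
        if (Object.prototype.hasOwnProperty.call(name, key)) {
            out[key] = name[key];
        }
    }
    if (!name["parse-names"]) {
        return out;
    }
    var family = splitTerms(name.family);
    var given = splitTerms(name.given);
    var leading = [];
    var trailing = [];
    // Leading lowercase terms of the family field (assumption: keep one term).
    while (family.length > 1 && hasNoUppercase(family[0])) {
        leading.push(family.shift());
    }
    // Trailing lowercase terms of the given field (assumption: keep one term).
    while (given.length > 1 && hasNoUppercase(given[given.length - 1])) {
        trailing.unshift(given.pop());
    }
    if (leading.length > 0) {
        out["non-dropping-particle"] = leading.join(" ");
        out.family = family.join(" ");
    }
    if (trailing.length > 0) {
        out["dropping-particle"] = trailing.join(" ");
        out.given = given.join(" ");
    }
    return out;
}

var vanGogh = extractParticles({ "family": "van Gogh", "given": "Vincent", "parse-names": true });
var humboldt = extractParticles({ "family": "Humboldt", "given": "Alexander von", "parse-names": true });
var vanDerVlist = extractParticles({ "family": "van der Vlist", "given": "Eric" });
```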
The citeproc-js processor contains its own internal parsing code for raw date strings. Clients may take advantage of the processor's internal parser by supplying date strings as a single raw element:
{ "issued" : {
    "raw" : "25 Dec 2004"
  }
}
Note that the parsing of raw date strings is not part of the CSL 1.0 standard. Clients that need to interoperate with other CSL processors should be capable of preparing input in the form described above under Data Input → Dates.
In ordinary operation, the processor generates citation strings suitable for a given position in the document. To support some use cases, the processor is capable of delivering special-purpose fragments of a citation.
When the makeCitationCluster() command (not documented here) is invoked with a non-nil author-only element, everything but the author name in a cite is suppressed. The name is returned without decorative markup (italics, superscript, and so forth).
var my_ids = [
    ["ID-1", {"author-only": 1}]
];
You might think that printing the author of a cited work, without printing the cite itself, is a useless thing to do. And if that were the end of the story, you would be right ...
To suppress the rendering of names in a cite, include a suppress-author element with a non-nil value in the supplementary data:
var my_ids = [
    ["ID-1", { "locator": "21", "suppress-author": 1 }]
];
This option is useful on its own. It can also be used in combination with the author-only element, as described below.
Calls to makeCitationCluster() with the author-only control element, and to processCitationCluster() or appendCitationCluster() with the suppress-author control element, can be used to produce cites that divide their content into two parts. This permits the support of styles such as the Chinese national standard style GB7714-87, which requires formatting like the following:
The Discovery of Wetness
While it has long been known that rocks are dry [1] and that air is moist [2] it has been suggested by Source [3] that water is wet.
Bibliography
[1] John Noakes, The Dryness of Rocks (1952).
[2] Richard Snoakes, The Moistness of Air (1967).
[3] Jane Roe, The Wetness of Water (2000).
In an author-date style, the same passage should be rendered more or less as follows:
The Discovery of Wetness
While it has long been known that rocks are dry (Noakes 1952) and that air is moist (Snoakes 1967) it has been suggested by Roe (2000) that water is wet.
Bibliography
John Noakes, The Dryness of Rocks (1952).
Richard Snoakes, The Moistness of Air (1967).
Jane Roe, The Wetness of Water (2000).
In both of the example passages above, the cites to Noakes and Snoakes can be obtained with ordinary calls to citation processing commands. The cite to Roe must be obtained in two parts: the first with a call controlled by the author-only element; and the second with a call controlled by the suppress-author element, in that order:
var my_ids = [
    ["ID-3", {"author-only": 1}]
];

var result = citeproc.makeCitationCluster( my_ids );
... and then ...
var citation = {
    "citationItems": [
        { "id": "ID-3", "suppress-author": 1 }
    ],
    "properties": { "noteIndex": 5 }
};

var result = citeproc.processCitationCluster( citation );
In the first call, the processor will automatically suppress decorations (superscripting). Also in the first call, if a numeric style is used, the processor will provide a localized label in lieu of the author name, and include the numeric source identifier, free of decorations. In the second call, if a numeric style is used, the processor will suppress output, since the numeric identifier was included in the return to the first call.
Detailed illustrations of the interaction of these two control elements are in the processor test fixtures in the "discretionary" category:
The version of citeproc-js described by this manual incorporates an experimental mechanism for supporting cross-lingual and mixed-language citation styles, such as 我妻栄 [Wagatsuma Sakae], 債権各論 [Obligations in Detail] (1969). While the scheme described below cannot be considered a permanent and stable solution to the problem of multi-lingual citation management, it provides a platform for proof of concept, and for the development of styles in anticipation of more robust multilingual support when it arrives.
The style tag in a CSL style may contain a default-locale attribute.
Hint
When the default-locale attribute is omitted, the default language is set to en-US.
<style
      xmlns="http://purl.org/net/xbiblio/csl"
      class="in-text"
      version="1.0"
      default-locale="de">
  <info>
    <id />
    <title />
    <updated>2009-08-10T04:49:00+09:00</updated>
  </info>
  <citation>
    <layout>
      <names variable="author">
        <name />
      </names>
    </layout>
  </citation>
</style>
For multi-lingual operation, a style may be set to request alternative
versions and translations of the title field, and of the author
and other name fields, using an extension to the default-locale
attribute. Extensions consist of an extension tag, followed by
a language setting that conforms to
RFC 4646 (typically constructed
from components listed in the
IANA Language Subtag Registry). Recognized extension
tags are as follows:
The tags are applied to a style by appending them to the language string in the default-locale attribute:
<style
      xmlns="http://purl.org/net/xbiblio/csl"
      class="in-text"
      version="1.0"
      default-locale="en-US-x-pri-ja-Hrkt">
Multiple tags may be specified, and their effects are cumulative. For readability, individual tags may be separated by newlines within the attribute. The following will attempt to render titles in either Pinyin transliteration (for Chinese titles) or Hepburn romanization (for Japanese titles), sorting by the transliteration.
<style
      xmlns="http://purl.org/net/xbiblio/csl"
      class="in-text"
      version="1.0"
      default-locale="en-US
          -x-pri-zh-Latn-pinyin
          -x-pri-ja-Latn-hepburn
          -x-sort-zh-Latn-pinyin
          -x-sort-ja-Latn-hepburn">
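To show how an extended default-locale value breaks down into its parts, the sketch below splits the attribute into a base locale and a list of extension tag/language pairs. It illustrates the syntax only: the function name parseDefaultLocale and the shape of the returned object are assumptions, and no check is made against the list of recognized extension tags.

```javascript
// Hypothetical helper: splits a default-locale attribute value into the
// base locale and its "-x-" extensions. Whitespace (including newlines
// used for readability) is removed first.
function parseDefaultLocale(attr) {
    var parts = attr.replace(/\s+/g, "").split("-x-");
    var result = { base: parts[0], extensions: [] };
    for (var i = 1; i < parts.length; i++) {
        var pos = parts[i].indexOf("-");
        if (pos < 0) {
            continue; // an extension tag with no language setting is skipped
        }
        result.extensions.push({
            tag: parts[i].slice(0, pos),
            lang: parts[i].slice(pos + 1)
        });
    }
    return result;
}

var localeSpec = parseDefaultLocale(
    "en-US\n" +
    "    -x-pri-zh-Latn-pinyin\n" +
    "    -x-pri-ja-Latn-hepburn\n" +
    "    -x-sort-zh-Latn-pinyin\n" +
    "    -x-sort-ja-Latn-hepburn"
);
```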
Multi-lingual operation depends upon the presence of alternative representations of field content embedded in the item data. When alternative field content is not available, the "real" field content is used as a fallback. As a result, configuration of language and script selection parameters will have no effect when only a single language is available (as will normally be the case for an ordinary Zotero data store).
For titles, alternative representations are appended directly to the field content, separated by the appropriate language tag with a leading and trailing colon:
{ "title" : "民法 :ja-Latn-hepburn-heploc: Minpō :en: Civil Code"
}
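The layout of such a title field can be illustrated by splitting it on the colon-delimited language tags. The sketch below is not the processor's parser; the function name parseMultilingualTitle and the pattern used to recognize a tag are assumptions.

```javascript
// Hypothetical helper: splits a title into its base content and a map of
// alternative representations keyed by language tag. A tag is assumed to
// be a run of letters, digits and hyphens between two colons.
function parseMultilingualTitle(title) {
    // With a capturing group, split() keeps the captured tags in the result.
    var parts = title.split(/\s*:([A-Za-z0-9\-]+):\s*/);
    var result = { base: parts[0], alternates: {} };
    for (var i = 1; i + 1 < parts.length; i += 2) {
        result.alternates[parts[i]] = parts[i + 1];
    }
    return result;
}

var titleParts = parseMultilingualTitle(
    "民法 :ja-Latn-hepburn-heploc: Minpō :en: Civil Code"
);
```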
For personal names, alternative representations should be presented as separate "name" entries, immediately following the original for the name element to which they apply. For example:
Hint
As described above, fixed ordering is used for non-Byzantine names. When such names are transliterated, the static-ordering element is set on them, to preserve their original formatting behavior.
{ "author" : [
    { "family" : "穂積",
      "given" : "陳重"
    },
    { "family" : ":ja-Latn: Hozumi",
      "given" : "Nobushige"
    },
    { "family" : "中川",
      "given" : "善之助"
    },
    { "family" : ":ja-Latn: Nakagawa",
      "given" : "Zennosuke"
    }
  ]
}
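The relationship between an original name and the alternative entries that follow it can be sketched as a grouping step. This is an illustration of the input layout only; the function name groupNameVariants and the structure it returns are assumptions, not part of the processor.

```javascript
// A family element that starts with ":tag:" marks an alternative entry.
var NAME_VARIANT = /^:([A-Za-z0-9\-]+):\s*(.*)$/;

// Hypothetical helper: attaches each alternative entry to the original
// name that immediately precedes it in the list.
function groupNameVariants(names) {
    var groups = [];
    for (var i = 0; i < names.length; i++) {
        var m = NAME_VARIANT.exec(names[i].family || "");
        if (m && groups.length > 0) {
            groups[groups.length - 1].variants.push({
                lang: m[1],
                family: m[2],
                given: names[i].given
            });
        } else {
            groups.push({ original: names[i], variants: [] });
        }
    }
    return groups;
}

var nameGroups = groupNameVariants([
    { "family": "穂積", "given": "陳重" },
    { "family": ":ja-Latn: Hozumi", "given": "Nobushige" },
    { "family": "中川", "given": "善之助" },
    { "family": ":ja-Latn: Nakagawa", "given": "Zennosuke" }
]);
```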
Citeproc-js ships with a large bundle of test data and a set of scripts that can be used to confirm that the system performs correctly after installation. The tests begin as individual human-friendly fixtures written in a special format, shown in the sample file immediately below. Tests are prepared for use by grinding them into a machine-friendly form (JSON), and by preparing an appropriate Javascript execution wrapper for each. These operations are performed automatically by the top-level test runner script that ships with the sources.
Tests are controlled by the ./test.py script in the root directory of the archive. To run all standard tests in the suite using the Rhino interpreter shipped with the processor, use the following command:
./test.py -s
Options and arguments can be used to select an alternative Javascript interpreter, or to change or limit the set of tests run. The script options are as follows:
The --tracemonkey option requires the jslibs Javascript
development environment. The sources for jslibs can be obtained from
Google Code.
After installation, adjust the path to the jshost utility in ./tests/config/test.cnf.
The human-readable version of each test fixture is composed in the format below. The four sections MODE, RESULT, CSL and INPUT are required, and may be arranged in any order within the fixture file. Text outside of the section delimiters is ignored. The sample file below shows the layout of a typical fixture; see the explanations of the individual sections further below for information on the usage of each.
Hint
Four additional sections are available for special purposes. The optional sections BIBENTRIES, BIBSECTION, CITATIONS and CITATION-ITEMS are also explained below.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 | >>===== MODE =====>>
citation
<<===== MODE =====<<
# Everything between the section blocks is
# ignored. Comment markup can be used for
# clarity, but it is not required.
>>===== RESULT =====>>
John Doe
<<===== RESULT =====<<
>>===== CSL =====>>
<style
xmlns="http://purl.org/net/xbiblio/csl"
class="in-text"
version="1.0">
<info>
<id />
<title />
<updated>2009-08-10T04:49:00+09:00</updated>
</info>
<citation>
<layout>
<names variable="author">
<name />
</names>
</layout>
</citation>
</style>
<<===== CSL =====<<
>>===== INPUT =====>>
[
{
"id":"ID-1",
"type": "book",
"author": [
{ "name":"Doe, John" }
],
"issued": {
"date-parts": [
[
"1965",
"6",
"1"
]
]
}
}
]
<<===== INPUT =====<<
The following four sections (MODE, CSL, INPUT, RESULT) are required in all test fixtures.
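The delimiter layout lends itself to a small parser. The following is a minimal Python sketch under stated assumptions, not the code used by ./test.py: the names `parse_fixture`, `REQUIRED` and `SECTION` are hypothetical, and the sample text is an abbreviated stand-in for a real fixture. It collects the text between matching delimiters, ignores everything else, and reports any missing required section.

```python
import re

# Hypothetical names; this is not taken from ./test.py.
REQUIRED = ("MODE", "CSL", "INPUT", "RESULT")

# Matches ">>===== NAME =====>>", the section body, then the
# closing "<<===== NAME =====<<" carrying the same NAME.
SECTION = re.compile(
    r"^>>=+ ([A-Z-]+) =+>>\n(.*?)\n<<=+ \1 =+<<$",
    re.MULTILINE | re.DOTALL,
)


def parse_fixture(text):
    """Return a dict mapping section names to their raw text."""
    sections = {name: body for name, body in SECTION.findall(text)}
    missing = [name for name in REQUIRED if name not in sections]
    if missing:
        raise ValueError("missing required section(s): " + ", ".join(missing))
    return sections


# Abbreviated stand-in for a fixture; "<style/>" is a placeholder,
# not a valid CSL style.
sample = """\
>>===== MODE =====>>
citation
<<===== MODE =====<<

# Text between the sections is ignored.

>>===== RESULT =====>>
John Doe
<<===== RESULT =====<<

>>===== CSL =====>>
<style/>
<<===== CSL =====<<

>>===== INPUT =====>>
[]
<<===== INPUT =====<<
"""

fixture = parse_fixture(sample)
print(fixture["MODE"])  # citation
```

Because sections are looked up by name, their order in the file does not matter, which matches the rule that the required sections may appear in any order.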
A single string tells whether to test citation or bibliography output. In citation mode, the test is performed using the makeCitationCluster() command if a CITATION-ITEMS area is included in the test fixture, or if neither that nor a CITATIONS area is included. If a CITATIONS area is included, citation mode uses the processCitationCluster() command.
In the case of bibliography mode, the makeBibliography() command is used, with output possibly filtered by the conditions specified in a BIBSECTION area:
>>===== MODE =====>>
citation
<<===== MODE =====<<
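The rules for choosing a command can be summarised in a few lines. This is a sketch of the selection logic as described in this section, not the test script's actual code; `select_command` is a hypothetical name, and `sections` is assumed to be the set of section names found in the fixture.

```python
def select_command(mode, sections):
    """Return the name of the processor command used for a fixture.

    mode     -- the MODE string ("citation" or "bibliography")
    sections -- set of section names present in the fixture
    """
    if mode == "bibliography":
        return "makeBibliography"
    if mode == "citation":
        if "CITATIONS" in sections and "CITATION-ITEMS" in sections:
            # Only one of the two may be used in a single fixture.
            raise ValueError("CITATIONS and CITATION-ITEMS are exclusive")
        if "CITATIONS" in sections:
            return "processCitationCluster"
        # CITATION-ITEMS present, or neither optional area present.
        return "makeCitationCluster"
    raise ValueError("unknown mode: %r" % (mode,))


print(select_command("citation", {"MODE", "CSL", "INPUT", "RESULT"}))
# makeCitationCluster
```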
The code to be used in the test must be valid as a complete, if minimal, CSL style:
>>===== CSL =====>>
<style
xmlns="http://purl.org/net/xbiblio/csl"
class="in-text"
version="1.0">
<info>
<id />
<title />
<updated>2009-08-10T04:49:00+09:00</updated>
</info>
<citation
et-al-min="3"
et-al-use-first="1">
<layout delimiter="; ">
<group delimiter=" ">
<names variable="author">
<name form="short"/>
</names>
<date
variable="issued"
date-parts="year"
form="text"
prefix="("
suffix=")"/>
</group>
</layout>
</citation>
<bibliography>
<layout>
<group delimiter=" ">
<names variable="author">
<name delimiter=" " initialize-with="."/>
</names>
<date
variable="issued"
date-parts="year"
form="text"
prefix="("
suffix=")"/>
</group>
</layout>
</bibliography>
</style>
<<===== CSL =====<<
The INPUT section provides the item data to be registered in the processor. In a simple test fixture that contains none of the optional areas BIBENTRIES, BIBSECTION, CITATIONS or CITATION-ITEMS, a citation or bibliography is requested for all of the items in the INPUT section (where one of those optional sections is included, the testing behavior is slightly different; see the discussion of the relevant sections below for details):
>>===== INPUT =====>>
[
{
"id":"ID-1",
"author": [
{ "name":"Noakes, John" },
{ "name":"Doe, John" },
{ "name":"Roe, Jane" }
],
"issued": {
"date-parts": [
[
2005
]
]
}
},
{
"id":"ID-2",
"author": [
{ "name":"Stoakes, Richard" }
],
"issued": {
"date-parts": [
[
1898
]
]
}
}
]
<<===== INPUT =====<<
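Since the INPUT body is plain JSON, it can be loaded with any JSON parser. The Python sketch below uses an abbreviated copy of the sample (one author per item) and collects the item IDs in the order given; the variable names are illustrative only.

```python
import json

# Abbreviated copy of the INPUT sample, without the delimiters.
input_text = """
[
  {"id": "ID-1",
   "author": [{"name": "Noakes, John"}],
   "issued": {"date-parts": [[2005]]}},
  {"id": "ID-2",
   "author": [{"name": "Stoakes, Richard"}],
   "issued": {"date-parts": [[1898]]}}
]
"""

items = json.loads(input_text)

# IDs in the order the items appear, plus a lookup table by ID.
item_ids = [item["id"] for item in items]
by_id = {item["id"]: item for item in items}

print(item_ids)  # ['ID-1', 'ID-2']
```

In a simple fixture, this list of IDs is the set of items for which output is requested.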
A string to compare with the citation or bibliography output received from the processor.
>>===== RESULT =====>>
(Noakes, et al. 2005; Stoakes 1898)
<<===== RESULT =====<<
Note that in bibliography mode, the HTML string output used for testing is enclosed in a standard set of wrapper tags, which must be written into the result string used for comparison:
>>===== RESULT =====>>
<div class="csl-bib-body">
<div class="csl-entry">J. Noakes, J. Doe, J. Roe (2005)</div>
<div class="csl-entry">R. Stoakes (1898)</div>
</div>
<<===== RESULT =====<<
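When writing bibliography fixtures by hand, the wrapper can be generated rather than typed. The helper below is a hypothetical convenience, not part of the processor or the test script. The tag names and classes are taken from the sample above; the two-space indent before each entry is an assumption, since whitespace inside the wrapper is not specified in this section and should be checked against actual processor output.

```python
def wrap_bibliography(entries, indent="  "):
    """Build an expected RESULT string for bibliography mode.

    entries -- formatted bibliography entries, one string each
    indent  -- whitespace placed before each entry (an assumption)
    """
    lines = ['<div class="csl-bib-body">']
    for entry in entries:
        lines.append('%s<div class="csl-entry">%s</div>' % (indent, entry))
    lines.append("</div>")
    return "\n".join(lines)


expected = wrap_bibliography([
    "J. Noakes, J. Doe, J. Roe (2005)",
    "R. Stoakes (1898)",
])
print(expected)
```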
Four optional sections are available for use in a fixture to exercise special aspects of processor behavior.
The citeproc-js processor maintains a persistent internal registry of citation data, and permits the addition, deletion and rearrangement of registered items. The behavior of the processor across a series of update transactions can be tested by including a BIBENTRIES section. When included, the section should consist of a two-tier list: an outer list of discrete lists of IDs, each of which must correspond to an item registered in the INPUT section:
Hint
The test of output will be run after first updating the processor's internal registry to reflect each of the requested citation sets, and should correctly reflect the last in the series.
>>===== BIBENTRIES =====>>
[
[
"ITEM-1",
"ITEM-2",
"ITEM-3",
"ITEM-4",
"ITEM-5"
],
[
"ITEM-1",
"ITEM-4",
"ITEM-5"
]
]
<<===== BIBENTRIES =====<<
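The effect of a BIBENTRIES series can be modelled without the processor. The sketch below is an illustration only; `final_registry` is a hypothetical name. It checks that every requested ID is known from INPUT and returns the set of IDs the registry should hold after the last update, on the assumption that each list in the series replaces the previously registered set.

```python
def final_registry(input_ids, bibentries):
    """Return the IDs expected in the registry after all updates.

    input_ids  -- IDs of the items supplied in the INPUT section
    bibentries -- the two-tier list from the BIBENTRIES section
    """
    known = set(input_ids)
    registry = []
    for id_list in bibentries:
        unknown = [item_id for item_id in id_list if item_id not in known]
        if unknown:
            raise ValueError("IDs not found in INPUT: " + ", ".join(unknown))
        # Assumption: each update replaces the registered set.
        registry = list(id_list)
    return registry


input_ids = ["ITEM-1", "ITEM-2", "ITEM-3", "ITEM-4", "ITEM-5"]
bibentries = [
    ["ITEM-1", "ITEM-2", "ITEM-3", "ITEM-4", "ITEM-5"],
    ["ITEM-1", "ITEM-4", "ITEM-5"],
]
print(final_registry(input_ids, bibentries))
# ['ITEM-1', 'ITEM-4', 'ITEM-5']
```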
When bibliography mode is used, a BIBSECTION area can be used to limit the output of the bibliography, through the interface described above under the makeBibliography() command:
>>===== BIBSECTION =====>>
{
"include" : [
{
"field" : "categories",
"value" : "classical"
}
],
"quash" : [
{
"field" : "type",
"value" : "manuscript"
},
{
"field" : "issued",
"value" : ""
}
]
}
<<===== BIBSECTION =====<<
When testing in citation mode, the data items to be processed are ordinarily rendered as a single citation. To test operations that depend upon or may be affected by the internal state of the processor across a session, either a CITATION-ITEMS or a CITATIONS section may be included in the test fixture (only one may be used in a single test fixture).
CITATION-ITEMS is the simpler of the two, and is used in most of the standard processor formatting test fixtures. The data in this area should consist of a list of citations, where each citation is itself a list of cites, and each cite is a Javascript object containing at least an item ID:
>>===== CITATION-ITEMS =====>>
[
[
{"id": "ITEM-1"}
],
[
{"id": "ITEM-2", "label": "page", "locator": "23"},
{"id":"ITEM-3"}
]
]
<<===== CITATION-ITEMS =====<<
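The nesting (a list of citations, each a list of cite objects) can be checked mechanically. The following Python sketch is illustrative; `cite_ids` is a hypothetical helper that walks the structure, rejects any cite lacking an `id`, and returns the item IDs grouped by citation.

```python
import json

# Body of the CITATION-ITEMS sample, without the delimiters.
citation_items_text = """
[
  [
    {"id": "ITEM-1"}
  ],
  [
    {"id": "ITEM-2", "label": "page", "locator": "23"},
    {"id": "ITEM-3"}
  ]
]
"""


def cite_ids(citations):
    """Return the item IDs of each citation, in order."""
    result = []
    for citation in citations:
        ids = []
        for cite in citation:
            if "id" not in cite:
                raise ValueError("cite without an id: %r" % (cite,))
            ids.append(cite["id"])
        result.append(ids)
    return result


ids_per_citation = cite_ids(json.loads(citation_items_text))
print(ids_per_citation)  # [['ITEM-1'], ['ITEM-2', 'ITEM-3']]
```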
A CITATIONS area can be used (instead of CITATION-ITEMS) to mimic a series of interactions with a word processor plugin. In this case, the area should contain a list of citation data objects with explicit citationID values and ID list values for subsequent invocations of the processCitationCluster() command, like the following:
>>===== CITATIONS =====>>
[
[
{
"citationID": "CITATION-1",
"citationItems": [
{
"id": "ITEM-1"
}
],
"properties": {
"noteIndex": 1
}
},
[],
[]
],
[
{
"citationID": "CITATION-2",
"citationItems": [
{
"id": "ITEM-2",
"locator": 15
},
{
"id": "ITEM-3"
}
],
"properties": {
"noteIndex": 1
}
},
["CITATION-1"],
[]
]
]
<<===== CITATIONS =====<<
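A CITATIONS series can be sanity-checked before it is handed to the processor. The sketch below assumes that each entry is a three-element list, as in the first entry of the sample: a citation object, a list of IDs of citations preceding it, and a list of IDs of citations following it. It further assumes that any ID named in those lists must belong to a citation from an earlier entry. Both assumptions, and the name `plan_calls`, are illustrative and not taken from the test script.

```python
def plan_calls(entries):
    """Return (citationID, pre_ids, post_ids) for each invocation."""
    known = set()
    calls = []
    for citation, pre, post in entries:
        for citation_id in list(pre) + list(post):
            # Assumption: only earlier citations may be referenced.
            if citation_id not in known:
                raise ValueError("unknown citationID: %s" % citation_id)
        calls.append((citation["citationID"], list(pre), list(post)))
        known.add(citation["citationID"])
    return calls


entries = [
    [
        {"citationID": "CITATION-1",
         "citationItems": [{"id": "ITEM-1"}],
         "properties": {"noteIndex": 1}},
        [],
        [],
    ],
    [
        {"citationID": "CITATION-2",
         "citationItems": [{"id": "ITEM-2", "locator": 15},
                           {"id": "ITEM-3"}],
         "properties": {"noteIndex": 1}},
        ["CITATION-1"],
        [],
    ],
]

for call in plan_calls(entries):
    print(call)
# ('CITATION-1', [], [])
# ('CITATION-2', ['CITATION-1'], [])
```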
When accessed using a Javascript-enabled browser with E4X support
(such as Firefox), the ./demo/demo.html file in the source archive
(or online) will invoke the processor to render a few citations. The
Javascript files accompanying the page in the ./demo directory show
the basic steps required to load and run the processor, whether in
the browser or server-side.