document version 1.00a52
25 March 2011
A warm welcome to Multilingual Zotero (MLZ), an experimental variant of the widely distributed Zotero reference management tool. This document provides a brief overview of the project, and an outline of the features special to the Multilingual version.
Multilingual Zotero is a development branch (a variant) of the main Zotero program. It adds the ability to attach transliterations and translations of names, titles and other fields to Zotero records. Once these variants have been added to a record, citations can be constructed from them (using, for example, the romanized form of Japanese) with standard Zotero styles. Bibliographies, citations, and listings in Zotero itself can also be sorted on a field variant (such as the phonetic transliteration of Chinese text); and records containing multilingual data can be exported and imported, for exchange between database systems. If you work with multiple languages, you will want this tool.
This is an experiment, and you should install it only if you are diligent about backing up your data, and are prepared to deal with the occasional problems that arise with software under active development. That said, the problems are getting smaller, and with proper backups the program can be used (and has been used) for real-world projects.
The ultimate aim of this project is to see the incorporation of multilingual functionality into mainstream Zotero itself. This will take time, but the core developers have been sympathetic (the code of the multilingual branch is hosted on Zotero's own servers), and the prospects for seeing the functionality described here in a stable Zotero release further down the road are very good indeed. So if you choose to wait, you will not wait in vain.
Frank Bennett @ Nagoya
MLZ is a drop-in replacement for the standard Zotero client, but if you have an existing database that you use for important projects, it should be installed in a separate profile. The steps are as follows:
Once the client and word processor plugin are installed, you are ready to go.
This overview assumes that the reader is familiar with the principal features of mainstream Zotero itself. Please refer to the main Zotero website for screencasts, third-party guides (including offerings in Chinese, Danish and French), user forums and other sources of information.
There is also a set of homegrown screencasts that cover features specific to MLZ:
Multilingual support in MLZ is built around the IANA Language Subtag Registry, a consolidated list of languages, regions and script variants maintained pursuant to RFC 5646, an Internet standards document that sets down rules for maintenance of the Registry, and provides guidelines on the proper usage of the language tags defined in it. Language tags built in conformance with RFC 5646 allow the form of entries to be expressed concisely, uniformly, and with a high degree of precision.
The Subtag Registry contains over 4,000 entries, which may be combined to specify a bewildering variety of language variants. This is rather more choice than is needed by any specific researcher or project, and for usability, the MLZ user interface shows only those tags that have been specifically enabled. Language tags are enabled automatically as required when multilingual data is imported into the MLZ database. Language tags can also be defined and managed explicitly through the Zotero Preferences panel.
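To make the structure of these tags concrete, the following is a minimal sketch of how a tag can be split into its subtags. It is not MLZ code: the function name `parse_language_tag` and the regular expression are assumptions made for illustration, and the pattern covers only the common language, script, region and variant shape, ignoring the extended-language, extension, private-use and grandfathered forms that RFC 5646 also permits. It checks the shape of a tag only, not whether each subtag actually appears in the Registry.

```python
import re

# Simplified shape: language[-script][-region][-variant...]
# Assumption: extlang, extension, private-use and grandfathered tags are ignored.
TAG_PATTERN = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*)$"
)


def parse_language_tag(tag):
    """Split a language tag into its subtags (hypothetical helper, shape check only)."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a recognized language tag shape: {tag!r}")
    script = match.group("script")
    region = match.group("region")
    # The variants group looks like "-alalc97-xxxxx"; drop the empty leading piece.
    variants = [v.lower() for v in match.group("variants").split("-") if v]
    return {
        "language": match.group("language").lower(),
        # Conventional casing: script in title case, region in upper case.
        "script": script.title() if script else None,
        "region": region.upper() if region else None,
        "variants": variants,
    }


print(parse_language_tag("ja-alalc97"))
print(parse_language_tag("zh-Hant-TW"))
```

In the first call, `alalc97` is recognized as a variant of the primary language `ja`; in the second, `Hant` is a script subtag and `TW` a region subtag.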
To access the language preferences panel, click on the (gear) icon, and open the Preferences panel in the usual way. MLZ offers a Languages tab in the panel; clicking on it will open the language preferences pane.
Figure 1 shows a pane from a fresh installation of MLZ, in which no languages have yet been defined.
Figure 1: Language preferences panel
To define a language, type a few characters of its name in the text box (the one with grayed-out text reading Add a language), and select it from the drop-down menu that appears as you type. You can define as many languages as you like, and they can be deleted at any time, without adverse effects on your data. If you happen to delete a language from the Preferences panel that is used by an item in your database, the language will be re-added to your Preferences automatically the next time the item is viewed.
Figure 2: Selecting a language
After defining a language tag, you can give it a familiar name. Names can be entered in any script or language that is supported by your computer. In Figure 3, a native-language label is being added for Japanese. The custom label will be finalized (like the existing ones shown for German and Spanish) when the Enter key is pressed.
Figure 3: Editing a language label
For each defined language, a set of tick-boxes appears under the twin headings User interface and Citations and bibliographies. The function of the tick-boxes is largely self-explanatory, but illustrations of their use will be given later in this overview.
Primary language tags (such as de, es, or ja) can be extended with script, region, or variant subtags. For example, if we wish to add romanized Japanese text to names and titles in our database, we would extend the ja tag, to create an additional language tag with the meaning Japanese text written in roman script, using [say] the romanization system adopted by the U.S. Library of Congress.
To extend the Japanese primary tag, we click on the + button to the right of the tag, revealing the selection menu shown in Figure 4. Selecting the item for ALA-LC Romanization, 1997 edition will add a new tag with the value ja-alalc97.
Figure 4: Adding a subtag to a language
The label for the newly added tag can then be edited in the usual way, giving it an easily recognized, human-friendly name such as Roman (ja).
Figure 5: Language preferences with a finished language subtag
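The effect of extending a tag and labelling the result can be sketched in a few lines. This is an illustration only, not the way MLZ stores its preferences: the `VARIANT_SUBTAGS` table is a two-entry excerpt standing in for the full Subtag Registry, and `extend_tag` and the `labels` dictionary (including the Japanese label text) are hypothetical names and values invented for the example.

```python
# Illustrative excerpt only; the real Subtag Registry is far larger.
VARIANT_SUBTAGS = {
    "alalc97": "ALA-LC Romanization, 1997 edition",
    "hepburn": "Hepburn romanization",
}


def extend_tag(primary, variant):
    """Append a known variant subtag to a primary language tag (hypothetical helper)."""
    if variant not in VARIANT_SUBTAGS:
        raise KeyError(f"unknown variant subtag: {variant}")
    return f"{primary}-{variant}"


# Hypothetical stand-in for the labels shown in the Preferences panel.
labels = {"ja": "日本語"}

new_tag = extend_tag("ja", "alalc97")
labels[new_tag] = "Roman (ja)"  # the human-friendly name used in the text above

print(new_tag, "->", labels[new_tag])
```

The primary tag and the extended tag remain separate entries, each with its own label, so native-script and romanized Japanese can be told apart wherever the tags are used.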
It is worth stressing the importance of distinguishing between translation (into another language) and transliteration (of the same language into another script). A romanized Japanese name, for example, should not be tagged as en (English), but as ja-alalc97 or ja-hepburn (that is, Japanese text transliterated according to the rules followed by the U.S. Library of Congress, or Japanese text transliterated using a system that falls within the rough category of Hepburn transliteration rules).
As will become clear below, this distinction is important when generating citations. If romanized names are incorrectly tagged as English, it becomes impossible to distinguish between these transliterated forms (which may be used to replace the native-language form) and English translations of titles (which are used as supplementary information only, and should not replace the original title).
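The consequence of mis-tagging can be shown with a small sketch. It is a deliberately simplified model, not the logic used by MLZ or its citation processor: the function `render_title`, the sample title, and the square-bracket layout for translations are all assumptions made for illustration. A variant whose primary subtag matches the original language is treated as a transliteration and may replace the title; anything else is treated as a translation and is only appended.

```python
def render_title(original, original_language, variants):
    """Pick a display form for a title (hypothetical, simplified model).

    variants maps language tags to alternative forms of the title.
    """
    main = original
    translation = None
    for tag, text in variants.items():
        primary = tag.split("-")[0].lower()
        if primary == original_language.lower():
            # Same language, different script: a transliteration replaces the original.
            main = text
        elif translation is None:
            # Different language: a translation only supplements the title.
            translation = text
    return f"{main} [{translation}]" if translation else main


# Correctly tagged: the romanized form replaces, the English form supplements.
correct = render_title("民法", "ja", {"ja-alalc97": "Minpō", "en": "Civil Code"})

# Mis-tagged: the romanized form is labelled "en", so it is treated as a translation.
mistagged = render_title("民法", "ja", {"en": "Minpō"})

print(correct)    # Minpō [Civil Code]
print(mistagged)  # 民法 [Minpō]
```

In the mis-tagged case the romanized form can no longer stand in for the original title, and there is no slot left for a genuine English translation.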
Once a set of language tags has been registered in the preferences panel, entries for the defined languages can be added to item content. The steps for adding entries and changing field languages are the same for creators (Author, Editor, etc.) and for ordinary fields that are multilingual-aware, with a very slight variation in the placement of menus. The steps for deletion are different for the two types of fields, but the differences are intuitive and largely self-explanatory.
Creators are added from a right-click context menu on the creator type label. After creation, language tags can be edited from a left-click context menu, and entries can be deleted in the usual way, by clicking on the - menu item to the right of each entry. Deletion of the main creator is blocked on creators that have multilingual entries, to prevent accidental loss of data.
A right-click over the creator type label reveals the Add Tag context menu. The menu shows only the language tags that have not yet been added to this creator.
Figure 6: Adding a multilingual creator entry.
To change the language (in our example, I mistakenly entered Japanese text instead of German), use the left-click context menu from the multilingual language label.
Figure 7: Changing the language of a multilingual creator entry.
Multilingual creator entries can be deleted in the usual way.
Figure 8: Deleting a multilingual creator entry.
Ordinary fields that are enabled for multilingual data work in the same way as creators. Fields with multilingual support can be identified by hovering the cursor over the field label, which reveals a thin blue outline. A multilingual field can be removed simply by removing its content.
Entries can be added to a field using the right-click menu from the fieldname label.
Figure 9: Adding a multilingual field.
The language of a multilingual field can be changed in the same way as for multilingual creators, by left-clicking over the language label.
Figure 10: Changing the language on a multilingual field.
To delete a multilingual field, just delete its content and it will disappear.
Figure 11: Deleting a multilingual field.
While not strictly necessary for generating multilingual citations, users ...
Pass.
Pass.
Pass.
The advent of full support for multilingual reference management opens up (literally) a world of possibilities, the extent of which will only become clear in the light of experience with such tools. A couple of immediate thoughts are offered below.
There is a network effect in shared reference archives. Multilingual reference archives and collections that are clean, comprehensive, easily extendable by their users, and tied directly to authoring tools will dramatically lower the barriers to collaboration across language divides. What was once a cumbersome and stovepiped process of document swapping and serial revision will give way, over time, to real-time collaboration on shared documents using shared data.
The infrastructure for such a world is yet incomplete, but as researchers explore the possibilities of this new space, the gap in research efficiency between monolingual and multilingual teams will inevitably begin to close.
With Multilingual Zotero now in place, the next target for close collaboration between authors working over a distance (a frequent requirement in cross-language collaborations) will be real-time, citation-supported document authoring. The Abiword word processor, which implements a document sharing model very similar in concept to Zotero itself (i.e. with local copies updated via a synchronization server) is a promising tool in this regard. Lobbying and programming efforts in this quarter -- or indeed in any quarter that can serve this emerging need -- would be very beneficial to the research community at large.
The end-to-end consumption of structured metadata in the production of published works gives authors a strong incentive to curate their local databases, to assure that citations are formatted correctly in finished manuscripts. RDF data exchange technology opens the possibility of feeding the result of this effort back into public archives such as CiNii, to improve the quality of available metadata, and further lighten the writing process. The gains from such a cycle would be particularly great in the international context, where human intervention in the drafting and proofreading of previously unavailable multilingual content is unavoidable.
Quality control is a clear concern for any such initiative, and the ownership of the underlying metadata provided by aggregators must be respected. One possible approach, once multilingual sync becomes available, would be for site engineers to harvest user-generated content from zotero.org and other sites capable of publishing RDF metadata, to analyze the results by automated means, and to provide the result to the original content suppliers for (voluntary) review, editing and approval. Such a workflow would avoid workload spikes in the maintenance process, and would allow original submitters to retain control over officially published metadata appearing on the aggregator website.
[1] Downloadable with the command: svn co https://www.zotero.org/svn/extension/branches/trunk-multilingual/
[2] Downloadable with the command: hg clone https://fbennett@bitbucket.org/fbennett/citeproc-js