Introduction and Overview
To add publisher information on existing or new articles to INSPIRE, we need to translate this information into our MARC xml. If we are lucky and can get the information as xml as well, the most convenient way is to use an xslt for the translation. However, different publishers use different schemas, which require different xslts. On the other hand, a lot of the translation work is always the same (dividing the authors between 100 and 700, getting 999C5s into the right form, ...). Therefore, I designed a simple intermediate xml schema with human-readable tags, to which the publisher xml is translated. A universal xslt then translates the intermediate xml to the final (INSPIRE) xml.
Files
- intermediate-1.24.xsd: schema for the intermediate xml
- intermediate-1.24_example.xml: a dummy example
- intermediate-1.24.xslt: xslt to translate the intermediate to the final (INSPIRE) xml
- intermediate-1.24.mfd: Altova-file used to create intermediate-1.24.xslt
- 10.1056_564564.xml: translation of the dummy example
Specific publisher example:
- oxford-1.24.xslt: xslt to translate xml from OUP (PTEP) to the intermediate schema
- oxford-1.24.mfd: Altova-file used to create oxford-1.24.xslt
Description
intermediate-1.24.xslt does a couple of things (besides 1:1 mapping):
- dividing authors and editors between 100 (first author) and 700 (rest)
- writing 041 (language) only if the language is not English
- trying to calculate 300 (number of pages) from page range (773__c)
- CC-licence: it is enough to give either the licence code or the URL
- bringing 999C5s into the right form using a look-up table for the journal names which is generated by hand from INSPIRE.Journals and copied into intermediate-1.24.xslt
- bringing 999C5r into the right form for arXiv-numbers
- bringing 999C5h into the right form concatenating authors
- if there is a free-text reference, it is written to 999C5m only if no DOI, pubnote etc. is found
- translating field- and type-codes from short (SPIRES) to long (INSPIRE)
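Some of the steps above can be sketched outside of xslt. The following Python snippet is only an illustration of the logic, not the code in intermediate-1.24.xslt. The function names, the treatment of a missing language or the code "en" as English, and the `report_number` argument (standing in for the "etc." in the free-text rule) are assumptions made for this sketch.

```python
import re


def split_authors(authors):
    """First author goes to 100, all remaining authors go to 700."""
    if not authors:
        return None, []
    return authors[0], list(authors[1:])


def language_field(language):
    """Return the value for 041, or None if the field should be omitted.

    Assumption: a missing language and the code 'en' are treated as English.
    """
    if language is None or language.strip().lower() in ("", "en", "english"):
        return None
    return language.strip()


def page_count(page_range):
    """Try to compute 300 (number of pages) from a 773__c page range.

    Only plain numeric ranges such as '123-130' are handled; article IDs,
    roman numerals and reversed ranges give None, so no 300 is written.
    """
    match = re.fullmatch(r"\s*(\d+)\s*-\s*(\d+)\s*", page_range or "")
    if not match:
        return None
    first, last = int(match.group(1)), int(match.group(2))
    if last < first:
        return None
    return last - first + 1


def reference_misc(free_text, doi=None, pubnote=None, report_number=None):
    """Return the value for 999C5m: the free text, but only if no
    structured information (DOI, pubnote, ...) was found."""
    if doi or pubnote or report_number:
        return None
    return free_text


first, rest = split_authors(["Smith, J.", "Doe, A.", "Roe, B."])
pages = page_count("123-130")
```

Here `first` holds the 100 entry, `rest` the 700 entries, and `pages` the candidate value for 300. Returning None wherever a value cannot be derived mirrors the "trying to calculate" wording above: the field is then left out rather than guessed.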
oxford-1.24.xslt has two additional features (added by hand) which are not in the Altova file:
- lookup table for publisher keys to publisher keywords
- references are in either <nlm-citation> or <citation>; I changed the corresponding <xsl:for-each select="nlm-citation"> by hand instead of redrawing a few dozen lines in the Altova file
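The idea of accepting either citation element can be illustrated with a short Python sketch. It does not reproduce the hand edit in oxford-1.24.xslt. The sample xml (a <ref-list> with <ref> and <source> children) is a made-up fragment in the style of NLM markup, not actual OUP data, and `iter_citations` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: one reference of each kind.
SAMPLE = """<ref-list>
  <ref id="r1"><nlm-citation><source>Phys. Rev. D</source></nlm-citation></ref>
  <ref id="r2"><citation><source>Nucl. Phys. B</source></citation></ref>
</ref-list>"""

CITATION_TAGS = ("nlm-citation", "citation")


def iter_citations(ref_list):
    """Yield the citation element of every <ref>, whichever tag it uses."""
    for ref in ref_list.iter("ref"):
        for child in ref:
            if child.tag in CITATION_TAGS:
                yield child


root = ET.fromstring(SAMPLE)
sources = [citation.findtext("source") for citation in iter_citations(root)]
```

Both references end up in `sources`, so the downstream mapping does not need to know which of the two tags the publisher used.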
Open Questions
- at the moment arXiv-number is written to 037, should there be an entry in 035 as well?
- in FFT: are 'INSPIRE-HIDDEN' and 'INSPIRE-PUBLIC' right?
- using look-up table also for 773? Doing the journal name translation with a different tool separately?
- which tags are missing in intermediate-1.24.xsd?
- which steps are missing in intermediate-1.24.xslt?