RDF XSLT

Creating RDF from your XML source

We include below two XSL transformation examples that can be modified to produce RDF from your XML files. The first example was used on the Whitman Archive to produce RDF from manuscripts encoded in TEI P4 XML. We have included the entire XSLT to help users build their own RDF transformation.

Contributors who use this XSLT will need to write their own mapping to generate valid <collex:genre> values.

The second example was used on Rossetti Archive textual documents encoded in the Rossetti Archive's own markup scheme (similar in several respects to TEI, but significantly divergent and far more than a typical TEI extension). We present only those portions of the XSLT that might prove useful in solving more complex problems. In particular, the many <xsl:if> statements are a good guide for mapping local genre values to the acceptable values of <collex:genre>.

You are welcome and encouraged to modify these to your own needs.

Whitman Archive Sample

Please note: this sample is valid only for TEI P4 and earlier. TEI P5 differs dramatically.


<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:collex="http://www.collex.org/schema#"
  xmlns:whit="http://www.whitmanarchive.org/schema#"
  xmlns:role="http://www.loc.gov/loc.terms/relators/">

  <xsl:output method="xml" encoding="utf-8" indent="yes"/>
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>
  <xsl:template match="TEI.2">
    <rdf:RDF>
      <whit:manuscript rdf:about="http://www.whitmanarchive.org/manuscripts/{@id}">
        <collex:archive>whitman</collex:archive>

        <xsl:if test="teiHeader/fileDesc/titleStmt/author">
          <role:AUT>
            <xsl:value-of select="teiHeader/fileDesc/titleStmt/author"/>
          </role:AUT>
        </xsl:if>
        <dc:title>
          <xsl:value-of select="teiHeader/fileDesc/titleStmt/title[@type='main']"/>
        </dc:title>
        <xsl:if test="teiHeader/fileDesc/sourceDesc/bibl/date">
          <dc:date><xsl:value-of select="teiHeader/fileDesc/sourceDesc/bibl/date"/></dc:date>
        </xsl:if>


        <!-- Whitman Archive @type values must be mapped to official genre values!!! --> 
        <collex:genre><xsl:value-of select="text/@type"/></collex:genre>

        <collex:thumbnail
         rdf:resource="http://www.whitmanarchive.org/testing/MS_page_breaks/MS_page_icon(original).gif"/>
        <collex:image
         rdf:resource="http://www.whitmanarchive.org/servlets/xmlmanuscripts/figures/{@id}.001.jpg"/>

        <!-- <collex:source_xml rdf:resource="http:// - pointer to source XML - "/> -->


        <rdfs:seeAlso 
rdf:resource="http://www.whitmanarchive.org/saxon/servlet/SaxonServlet?source=whitman/xmlmanuscripts/documents/{@id}.xml&amp;style=whitman/xmlmanuscripts/xsl/whitman.xsl&amp;clear-stylesheet-cache=yes"/>
      </whit:manuscript>
    </rdf:RDF>
  </xsl:template>
</xsl:stylesheet>

N.B. In the above example, the <collex:genre> element merely copies the value of the "type" attribute on the <text> element. In practice, the <collex:genre> element should contain one of the standardized genre categories.
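
One way to perform that mapping is a small named template built on <xsl:choose>. The following is only a sketch: the @type values it tests ('poetry', 'letter', 'prose') are hypothetical placeholders rather than the Whitman Archive's actual vocabulary, and the template name "genre" is our own invention. The genre values it emits are taken from those used in the Rossetti example on this page.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:collex="http://www.collex.org/schema#">

  <!-- Hypothetical mapping: the @type values tested below are illustrative
       assumptions. Replace them with the values found in your own files. -->
  <xsl:template name="genre">
    <xsl:param name="type"/>
    <xsl:choose>
      <xsl:when test="$type = 'poetry'">
        <collex:genre>Poetry</collex:genre>
      </xsl:when>
      <xsl:when test="$type = 'letter'">
        <collex:genre>Letters</collex:genre>
      </xsl:when>
      <xsl:when test="$type = 'prose'">
        <collex:genre>Nonfiction</collex:genre>
      </xsl:when>
      <xsl:otherwise>
        <!-- Report unmapped values instead of copying them into the RDF -->
        <xsl:message>Unmapped text/@type: <xsl:value-of select="$type"/></xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

To try it, copy the named template into the stylesheet above and replace the line that writes <collex:genre> with an <xsl:call-template name="genre"> that passes text/@type as the "type" parameter. Reporting unmapped values through <xsl:message> is a design choice that makes gaps in the mapping visible during a batch run.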

Rossetti Archive Example

Here is some truncated Rossetti Archive XML, followed by the XSLT. It shows how we've mapped certain attribute values to NINES-conformant <collex:genre> values. In particular, note how the XSLT creates <collex:genre> values from the "metatype" attribute of <ram> and the "type" attribute of <div0>. Although the present example has a simple @metatype, some cases can have multiple @metatype values (e.g. metatype="web.doublework, web.visual, web.poem"). One can also see how we've generated .txt files from our XML. The tricky bits involve creating a .txt for each titled division, <div0> through <div6> (the most granular level of object in our RDF), inserting line breaks where appropriate, and suppressing unnecessary white space.
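
The XSLT below handles multi-valued @metatype attributes with contains(), which matches substrings. As an alternative (not what the Rossetti Archive stylesheet does), the XPath 2.0 tokenize() function can split the attribute into separate values so that each one is compared exactly. This is a minimal sketch that assumes the values are separated by commas with optional white space, as in the example above.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:collex="http://www.collex.org/schema#">

  <xsl:template match="ram">
    <!-- Split e.g. "web.doublework, web.visual, web.poem" into three strings -->
    <xsl:variable name="metatypes"
      select="tokenize(normalize-space(@metatype), '\s*,\s*')"/>
    <!-- A general comparison (=) is true if ANY item in the sequence matches -->
    <xsl:if test="$metatypes = 'web.poem'">
      <collex:genre>Poetry</collex:genre>
    </xsl:if>
    <xsl:if test="$metatypes = 'web.visual'">
      <collex:genre>Visual Art</collex:genre>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Exact matching avoids accidental hits when one value is a substring of another, but it also means a deliberately loose test such as contains($metatype, 'book') would need to be spelled out as separate comparisons.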

Rossetti XML

<?xml version="1.0" encoding="iso-8859-1"?>
<ram xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:noNamespaceSchemaLocation="file:/C:/xmlediting/ram.xsd" archivetype="rad" type="ms.faircopy"
  id="a.1-1847.princefrag" metatype="web.manuscript" image="a.1-1847.princefrag.1.tif"
  workcode="1-1847.s244" version="princefrag" dblwork="1-1847.s244">
  <ramheader>
    <filedesc>
      <titlestmt>
        <title>The Blessed Damozel (1872 autograph MS fragment)</title>
        <author>Dante Gabriel Rossetti</author>
        [...]
      </titlestmt>
      [...]			
      <sourcedesc>
        <citnstruct>
          <title>[untitled]</title>
          <author>Dante Gabriel Rossetti</author>
          <msprod>
            <date>1872 (circa)</date>
            <type>fair copy</type>
          </msprod>
          <scribe>DGR</scribe>
        </citnstruct>
      </sourcedesc>
    </filedesc>
  </ramheader>
  <text>
    <body>
      <page n="[1]" image="a.1-1847.princefrag.1.tif" width="415" height="600"/>
      <div0 type="ballad" n="1" title="The Blessed Damozel" id="a.1-1847.i1" workcode="1-1847.s244">
        <lg n="1" r="7" type="sexain">
          <l n="1" r="37">Around her, lovers, newly met</l>
          <l n="2" r="38" indent="1">'Mid deathless love's acclaims,</l>
          <l n="3" r="39">Spoke evermore among themselves</l>
        </lg>
      </div0>
    </body>
  </text>
</ram>

Rossetti XSLT

Here's an abridged version of the XSLT that renders this into RDF ...

<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/"
 xmlns:collex="http://www.collex.org/schema#" xmlns:ra="http://www.rossettiarchive.org/schema#"
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
 xmlns:role="http://www.loc.gov/loc.terms/relators/">

 <xsl:output method="xml" encoding="iso-8859-1" indent="yes"/>

 <xsl:strip-space elements="*"/>

 <xsl:variable name="id">
  <xsl:text>http://www.rossettiarchive.org/docs/</xsl:text>
  <xsl:value-of select="substring-after(/ram/@id,'a.')"/>
  <xsl:text>.</xsl:text>
  <xsl:value-of select="/ram/@archivetype"/>
 </xsl:variable>

 <xsl:variable name="idtxt">
  <xsl:value-of select="substring-after(/ram/@id,'a.')"/>
  <xsl:text>.</xsl:text>
  <xsl:value-of select="/ram/@archivetype"/>
 </xsl:variable>

 <xsl:variable name="metatype" select="/ram/@metatype"/>

 <xsl:variable name="type" select="/ram/@type"/>

 <xsl:template match="/">
  <xsl:apply-templates/>
 </xsl:template>

 <xsl:template match="ram">
  <rdf:RDF>
   <ra:rad rdf:about="{$id}">
    <collex:archive>rossetti</collex:archive>
    <collex:source_xml rdf:resource="{$id}.xml"/>
    <rdfs:seeAlso rdf:resource="{$id}.html"/>
    <collex:text rdf:resource="http://www.rossettiarchive.org/docs/{$idtxt}.txt"/>
    <xsl:result-document href="{$idtxt}.txt" method="text" encoding="utf-8"
 indent="no" xml:space="default">
     <xsl:apply-templates select="text" mode="text"/>
    </xsl:result-document>
    <dc:title>
     <xsl:value-of select="ramheader/filedesc/titlestmt/title[1]"/>
    </dc:title>
    <xsl:for-each select="ramheader/filedesc/titlestmt/title[2]|ramheader/filedesc/titlestmt/title[3]|
ramheader/filedesc/titlestmt/title[4]">
     <dcterms:alternative>
      <xsl:value-of select="."/>
     </dcterms:alternative>
    </xsl:for-each> 
    <xsl:apply-templates select="ramheader/filedesc/titlestmt/author"/>
    <xsl:apply-templates select="ramheader/filedesc/titlestmt/editor"/>
    <xsl:if test="ramheader/filedesc/sourcedesc/citnstruct/imprint/publisher">
     <role:PBL>
      <xsl:value-of select="ramheader/filedesc/sourcedesc/citnstruct/imprint/publisher"/>
     </role:PBL>
    </xsl:if>
    <collex:genre>Primary</collex:genre>
    <xsl:if test="contains($metatype, 'web.ephemera')">
     <collex:genre>Ephemera</collex:genre>
    </xsl:if>
    <xsl:if test="contains($metatype, 'web.manuscript')">
     <collex:genre>Manuscript</collex:genre>
    </xsl:if>
    <xsl:if test="contains($metatype, 'book')">
     <collex:genre>Paratext</collex:genre>
    </xsl:if>
    <xsl:if test="contains($metatype, 'web.poem')">
     <collex:genre>Poetry</collex:genre>
    </xsl:if>
    <xsl:if test="contains($metatype, 'web.prose')">
     <!-- Fiction v. Nonfiction tests -->
     <xsl:choose>
      <xsl:when test="($type ='drama') or ($type='fiction')">
       <collex:genre>Fiction</collex:genre>
      </xsl:when>
      <xsl:when test="($type ='criticism') or ($type='letter') or ($type='review')">
       <collex:genre>Nonfiction</collex:genre>
      </xsl:when>
      <xsl:when test="$type ='ms.notebk'">
       <collex:genre>Fiction</collex:genre>
       <collex:genre>Nonfiction</collex:genre>
      </xsl:when>
     </xsl:choose>
    </xsl:if>
    <xsl:if test="contains($metatype, 'web.serial')">
     <collex:genre>Periodical</collex:genre>
    </xsl:if>
    <xsl:if test="contains($metatype, 'web.translation')">
     <collex:genre>Translation</collex:genre>
    </xsl:if>
    <xsl:if test="contains($metatype, 'web.visual')">
     <collex:genre>Visual Art</collex:genre>
    </xsl:if>
    <!-- @type -->
    <xsl:if test="$type ='ms.collection'">
     <collex:genre>Collection</collex:genre>
    </xsl:if>
    <xsl:if test="($type ='criticism') or ($type ='review')">
     <collex:genre>Criticism</collex:genre>
    </xsl:if>
    <xsl:if test="$type ='drama'">
     <collex:genre>Drama</collex:genre>
    </xsl:if>
    <xsl:if test="($type ='pamphlet') or ($type ='broadside')">
     <xsl:if test="not(contains($metatype, 'web.ephemera'))">
      <collex:genre>Ephemera</collex:genre>
     </xsl:if>
    </xsl:if>
    <xsl:if test="$type ='fiction'">
     <collex:genre>Fiction</collex:genre>
    </xsl:if>
    <xsl:if test="$type ='letter'">
     <collex:genre>Letters</collex:genre>
    </xsl:if>
    <xsl:if test="contains($type, 'ms.')">
     <xsl:if test="not(contains($metatype, 'web.manuscript'))">
      <collex:genre>Manuscript</collex:genre>
     </xsl:if>
    </xsl:if>
   [ ... ]
    <dcterms:hasPart rdf:resource="{$id}header"/>
    <xsl:for-each select="//div0|//div1|//div2|//div3|//div4|//div5|//div6">
     <xsl:if test="string-length(@title) > 0">
      <!-- only DIV's with titles pass through -->
      <dcterms:hasPart rdf:resource="{$id}#{@anchor}"/>
     </xsl:if>
    </xsl:for-each>
    <xsl:for-each select="//figure">
     <xsl:variable name="nameWithPrefixSuffix" select="@entity"/>
     <xsl:variable name="nameWithSuffix" select="substring-after($nameWithPrefixSuffix,'a.')"/>
     <xsl:variable name="name" select="substring-before($nameWithSuffix,'.tif')"/>
     <dc:relation rdf:resource="http://www.rossettiarchive.org/img/{$name}"/>
    </xsl:for-each>
    <xsl:for-each select="//page">
     <xsl:variable name="nameWithPrefixSuffix" select="@image"/>
     <xsl:variable name="nameWithSuffix" select="substring-after($nameWithPrefixSuffix,'a.')"/>
     <xsl:variable name="name" select="substring-before($nameWithSuffix,'.tif')"/>
     <!-- escape-uri( , true())-->
     <xsl:if test="$name != '' and $name != 'unavailable'">
      <dc:relation rdf:resource="http://www.rossettiarchive.org/img/{$name}"/>
     </xsl:if>
    </xsl:for-each>
    <xsl:variable name="imageWithPrefixSuffix" select="@image"/>
    <xsl:variable name="imageWithSuffix" select="substring-after($imageWithPrefixSuffix,'a.')"/>
    <xsl:variable name="image" select="substring-before($imageWithSuffix,'.tif')"/>
    <xsl:if test="$image != ''">
     <collex:thumbnail rdf:resource="http://www.rossettiarchive.org/img/thumbs_small/{encode-for-uri(concat($image, '.jpg'))}"/>
     <collex:image rdf:resource="http://www.rossettiarchive.org/img/{$image}.jpg"/>
    </xsl:if>
    <xsl:apply-templates select="/ram/ramheader/profiledesc/date"/>
    <xsl:apply-templates select="/ram/ramheader/filedesc/sourcedesc/citnstruct/imprint/date"/>
    <xsl:apply-templates select="/ram/ramheader/filedesc/sourcedesc/citnstruct/imageprod/date"/>
    <xsl:apply-templates select="/ram/ramheader/filedesc/sourcedesc/citnstruct/msprod/date"/>
   </ra:rad>
   <!-- header -->
   <ra:header rdf:about="{$id}header">
    <collex:archive>rossetti</collex:archive>
    <collex:source_xml rdf:resource="{$id}.xml"/>
    <rdfs:seeAlso rdf:resource="{$id}header.html"/>
    <xsl:result-document href="{$idtxt}header.txt" method="text" encoding="utf-8" indent="no" 
xml:space="default">
     <xsl:apply-templates select="ramheader" mode="text"/>
    </xsl:result-document>
    <dc:title>
     <xsl:text>Commentary for </xsl:text>
     <xsl:value-of select="ramheader/filedesc/titlestmt/title[1]"/>
    </dc:title>
    <dc:source>
     <xsl:value-of select="ramheader/filedesc/titlestmt/title[1]"/>
    </dc:source>
    <role:AUT>Jerome J. McGann</role:AUT>
    <dc:date>2006</dc:date>
    <collex:genre>Secondary</collex:genre>
    <dcterms:isPartOf rdf:resource="{$id}"/>
   </ra:header>
   <xsl:apply-templates select="//div0|//div1|//div2|//div3|//div4|//div5|//div6"/>
   <xsl:apply-templates select="//figure"/>
   <!--<xsl:apply-templates select="//page"/>-->
  </rdf:RDF>
 </xsl:template>

 <!-- div's -->
 <xsl:template match="div0|div1|div2|div3|div4|div5|div6">
  <xsl:if test="string-length(@title) > 0">
   <!-- only DIV's with titles pass through -->
   <xsl:variable name="anchor" select="@anchor"/>
   <xsl:variable name="divtype" select="@type"/>
   <!-- TODO: Implement hierarchical relationships with divs -->
   <ra:div rdf:about="{$id}#{$anchor}">
    <collex:archive>rossetti</collex:archive>
    <collex:source rdf:resource="{$id}.xml"/>
    <rdfs:seeAlso rdf:resource="{$id}.html#{$anchor}"/>
    <rdfs:seeAlso rdf:resource="{$id}.html"/>
    <collex:text rdf:resource="http://www.rossettiarchive.org/docs/{$idtxt}.{@anchor}.txt"/>
    <xsl:result-document href="{$idtxt}.{@anchor}.txt" method="text" encoding="utf-8" indent="no"
      xml:space="default">
     <xsl:apply-templates select=".//divheader|.//p|.//l|.//msadds/trans" mode="text"/>
    </xsl:result-document>
    <dc:title>
     <xsl:value-of select="@title"/>
    </dc:title>
    <dc:source>
     <xsl:value-of select="ancestor::ram/ramheader/filedesc/titlestmt/title[1]"/>
    </dc:source>
    <xsl:apply-templates select="/ram/ramheader/filedesc/titlestmt/author"/>
    <xsl:apply-templates select="/ram/ramheader/filedesc/titlestmt/editor"/>
    <xsl:apply-templates select="/ram/ramheader/profiledesc/date"/>
    <xsl:apply-templates select="/ram/ramheader/filedesc/sourcedesc/citnstruct/imprint/date"/>
    <xsl:apply-templates select="/ram/ramheader/filedesc/sourcedesc/citnstruct/imageprod/date"/>
    <xsl:apply-templates select="/ram/ramheader/filedesc/sourcedesc/citnstruct/msprod/date"/>
    <xsl:if test="$divtype ='anthology'">
     <collex:genre>Collection</collex:genre>
    </xsl:if>
    <xsl:if test="($divtype ='commentary') or ($divtype ='note') or ($divtype ='essay') or 
($divtype ='criticism') or ($divtype ='review') or ($divtype ='art criticism') or 
($divtype ='other.article')">
     <collex:genre>Criticism</collex:genre>
    </xsl:if>
    <xsl:if test="($divtype ='drama') or ($divtype ='drama notes') or ($divtype ='dramatis personae') or 
($divtype ='dialogue')">
     <collex:genre>Drama</collex:genre>
    </xsl:if>
    <xsl:if test="($divtype ='memoranda') or ($divtype ='memorandum') or ($divtype ='picture notes') or 
($divtype ='picture note') or ($divtype ='picturenotes') or ($divtype ='pictures notes') or 
($divtype ='art notes') or ($divtype ='note') or ($divtype ='poem notes') or ($divtype ='poetical words') 
or ($divtype ='notes') or ($divtype ='prose list') or ($divtype ='sketch for a play') or 
($divtype ='drama notes') or ($divtype ='marginalia') or ($divtype ='bibliographic notes') or ($divtype 
='painting.unexecute') or ($divtype ='pamphlet') or ($divtype ='notebook entry') or ($divtype ='list')">
     <collex:genre>Ephemera</collex:genre>
    </xsl:if>
    <xsl:if test="($divtype ='story') or ($divtype ='story notes') or ($divtype ='short story')">
     <collex:genre>Fiction</collex:genre>
    </xsl:if>
    <xsl:if test="($divtype ='epistle') or ($divtype ='letter')">
     <collex:genre>Letters</collex:genre>
    </xsl:if>
    <xsl:if test="($divtype ='notebookentry') or ($divtype ='notebook sketch') or 
($divtype ='notebook sketches') or ($divtype ='notebook') or ($divtype ='copy')">
     <collex:genre>Manuscript</collex:genre>
    </xsl:if>
    <xsl:if test="($divtype ='autobiography') or ($divtype ='biography') or ($divtype ='essay') or 
($divtype ='preface') or ($divtype ='biographical sketch') or ($divtype ='prose description') or 
($divtype ='prose essay')">
     <collex:genre>Nonfiction</collex:genre>
    </xsl:if>
    <xsl:if test="($divtype ='introductory section') or ($divtype ='illustration') or 
($divtype ='frontispiece') or ($divtype ='note') or ($divtype ='preface') or ($divtype ='notes') or 
($divtype ='marginalia') or ($divtype ='index') or ($divtype ='table of contents') or 
(contains($divtype, 'dvertisement')) or ($divtype ='appendix') or ($divtype ='book') or 
($divtype ='proof') or ($divtype ='proofs') or ($divtype ='chapter') or ($divtype ='bookplate') or 
($divtype ='colophon') or ($divtype ='Contents') or ($divtype ='cover') or ($divtype ='cover sheet') or 
($divtype ='coversheet') or ($divtype ='endpaper') or ($divtype ='flyleaf') or ($divtype ='fly-title') or 
($divtype ='front') or ($divtype ='half title') or ($divtype ='half-title') or ($divtype ='imprint') or 
($divtype ='Introduction') or ($divtype ='introduction') or ($divtype ='introductory note') or
($divtype ='transcription') or ($divtype ='title') or ($divtype ='title page') or ($divtype ='titlepage') or 
($divtype ='epigraph') or ($divtype ='cover notes') or ($divtype ='dedication') or ($divtype ='Dedication') 
or ($divtype ='end note') or ($divtype ='engraving') or ($divtype ='errata') or ($divtype ='library notes') 
or ($divtype ='part') or ($divtype ='scene') or ($divtype ='section') or ($divtype ='Section') 
or ($divtype ='subset')">
     <collex:genre>Paratext</collex:genre>
    </xsl:if>
    <xsl:if test="$divtype ='photograph'">
     <collex:genre>Photograph</collex:genre>
    </xsl:if>
    <xsl:if test="(contains($divtype, 'poe')) or (contains($divtype, 'onnet')) or (contains($divtype, 'balla')) 
or (contains($divtype, 'canzon')) or ($divtype ='cantica') or ($divtype ='couplet') or 
($divtype ='dramatic monologue') or ($divtype ='elegy') or ($divtype ='epitaph') or ($divtype ='hymn') or 
($divtype ='extract') or ($divtype ='limerick') or ($divtype ='dialogue') or (contains($divtype, 'lyric')) 
or ($divtype ='madrigal') or ($divtype ='quatorzain') or ($divtype ='quatrain') or ($divtype ='epigram') or 
($divtype ='quintain') or ($divtype ='sestet') or ($divtype ='sestina') or ($divtype ='song') or 
($divtype ='Song') or ($divtype ='stanza') or ($divtype ='strophe') or ($divtype ='rondeau') or 
($divtype ='epistle') or ($divtype ='fragment') or ($divtype ='narrative') or ($divtype ='cantata')">
     <collex:genre>Poetry</collex:genre>
    </xsl:if>
    <xsl:if test="$divtype ='translation'">
     <collex:genre>Translation</collex:genre>
    </xsl:if>
    <xsl:if test="($divtype ='cover') or ($divtype ='drawing') or ($divtype ='engraving') or 
($divtype ='etching') or ($divtype ='portraits')">
     <collex:genre>Visual Art</collex:genre>
    </xsl:if>
    
    <dcterms:isPartOf rdf:resource="{$id}"/>
   </ra:div>
  </xsl:if>
 </xsl:template>

 <!-- figure -->
 <xsl:template match="figure">
  <xsl:variable name="nameWithPrefixSuffix" select="@entity"/>
  <xsl:variable name="nameWithSuffix" select="substring-after($nameWithPrefixSuffix,'a.')"/>
  <xsl:variable name="name" select="substring-before($nameWithSuffix,'.tif')"/>
  <ra:figure rdf:about="http://www.rossettiarchive.org/img/{$name}">
   <collex:archive>rossetti</collex:archive>
   <collex:source_xml rdf:resource="{$id}.xml"/>
   <rdfs:seeAlso rdf:resource="{$id}.html"/>
   <dc:title>
    <xsl:value-of select="@title"/>
   </dc:title>
   <xsl:apply-templates select="/ram/ramheader/filedesc/titlestmt/author"/>
   <xsl:apply-templates select="/ram/ramheader/filedesc/titlestmt/editor"/>
   <xsl:apply-templates select="/ram/ramheader/profiledesc/date"/>
   <xsl:apply-templates select="/ram/ramheader/filedesc/sourcedesc/citnstruct/imprint/date"/>
   <xsl:apply-templates select="/ram/ramheader/filedesc/sourcedesc/citnstruct/imageprod/date"/>
   <xsl:apply-templates select="/ram/ramheader/filedesc/sourcedesc/citnstruct/msprod/date"/>
   <collex:genre>Visual Art</collex:genre>
   <collex:thumbnail rdf:resource="http://www.rossettiarchive.org/img/thumbs_small/{$name}.jpg"/>
   <collex:image rdf:resource="http://www.rossettiarchive.org/img/{$name}.jpg"/>
   <dcterms:isPartOf rdf:resource="{$id}"/>
  </ra:figure>
 </xsl:template>

 <!-- page -->
 <xsl:template match="page">
  <xsl:variable name="nameWithPrefixSuffix" select="@image"/>
  <xsl:variable name="nameWithSuffix" select="substring-after($nameWithPrefixSuffix,'a.')"/>
  <xsl:variable name="name" select="substring-before($nameWithSuffix,'.tif')"/>
  <!-- escape-uri( , true())-->
  <xsl:if test="$name != '' and $name != 'unavailable'">
   <ra:pageimage rdf:about="http://www.rossettiarchive.org/img/{$name}">
    <collex:archive>rossetti</collex:archive>
    <collex:source_xml rdf:resource="{$id}.xml"/>
    <rdfs:seeAlso rdf:resource="{$id}.html"/>
    <dc:title>
     <xsl:value-of select="/ram/ramheader/filedesc/titlestmt/title"/>
     <xsl:text> (Page </xsl:text>
     <xsl:value-of select="@n"/>
     <xsl:text>)</xsl:text>
    </dc:title>
    <xsl:apply-templates select="/ram/ramheader/filedesc/titlestmt/author"/>
    <xsl:apply-templates select="/ram/ramheader/filedesc/titlestmt/editor"/>
    <xsl:apply-templates select="/ram/ramheader/profiledesc/date"/>
    <xsl:apply-templates select="/ram/ramheader/filedesc/sourcedesc/citnstruct/imprint/date"/>
    <xsl:apply-templates select="/ram/ramheader/filedesc/sourcedesc/citnstruct/imageprod/date"/>
    <xsl:apply-templates select="/ram/ramheader/filedesc/sourcedesc/citnstruct/msprod/date"/>
    <collex:genre>Paratext</collex:genre>
    <collex:thumbnail rdf:resource="http://www.rossettiarchive.org/img/thumbs_small/{$name}.jpg"/>
    <collex:image rdf:resource="http://www.rossettiarchive.org/img/{$name}.jpg"/>
    <dcterms:isFormatOf rdf:resource="{$id}"/>
   </ra:pageimage>
  </xsl:if>
 </xsl:template>

 <xsl:template match="/ram/ramheader/profiledesc/date | 
  /ram/ramheader/filedesc/sourcedesc/citnstruct/imprint/date |  
  /ram/ramheader/filedesc/sourcedesc/citnstruct/imageprod/date | 
  /ram/ramheader/filedesc/sourcedesc/citnstruct/msprod/date">
  <xsl:if test="string-length(.) > 0">
   <dc:date>
    <xsl:value-of select="."/>
   </dc:date>
  </xsl:if>
 </xsl:template>

 <xsl:template match="/ram/ramheader/filedesc/titlestmt/author">
  <xsl:variable name="contents" select="."/>
  <xsl:if test="string-length(.) > 0">
   <xsl:if test="(contains($metatype, 'web.poem')) or (contains($metatype, 'web.otherbook')) or 
(contains($metatype, 'web.prose')) or (contains($metatype, 'web.book')) or 
(contains($metatype, 'web.manuscript'))">
    <role:AUT>
     <xsl:choose>
      <xsl:when test="($contents ='Dante Gabriel Rossetti') or ($contents ='DGR') or 
($contents ='Dante Gabriel Rossetti (designer)') or ($contents ='Dante Gabriel Rossetti?')">
       <xsl:text>Dante Gabriel Rossetti</xsl:text>
      </xsl:when>
      <xsl:otherwise>
       <xsl:value-of select="."/>
      </xsl:otherwise>
     </xsl:choose>
    </role:AUT>
   </xsl:if>
   <xsl:if test="contains($metatype, 'web.translation')">
    <role:TRL>
     <xsl:value-of select="."/>
    </role:TRL>
   </xsl:if>
  </xsl:if>
 </xsl:template>

 <xsl:template match="/ram/ramheader/filedesc/titlestmt/editor">
  <xsl:if test="string-length(.) > 0">
   <role:EDT>
    <xsl:value-of select="."/>
   </role:EDT>
  </xsl:if>
 </xsl:template>

 <xsl:template match="citnstruct/title|text//divheader|text//p|text//l|text//msadds/trans|
   item|section[@type='biblio']/p/bibl|
titlestmt/title" mode="text">
  <!-- elements that SHOULD be processed with a hard return -->
  <xsl:apply-templates mode="text"/>
  <xsl:text>
</xsl:text>
 </xsl:template>

 <xsl:template match="text//lb|epage" mode="text">
  <xsl:text>
</xsl:text>
 </xsl:template>

 <xsl:template match="*" mode="text">
  <xsl:apply-templates mode="text"/>
  <xsl:text>
</xsl:text>
 </xsl:template>

 <xsl:template match="addspan|author|bibl|bibl/date|delspan|foreign|hi|pages|quote|title|xref"
   mode="text">
  <!-- elements that SHOULD NOT be processed with a hard return -->
  <xsl:variable name="previousalphanum" select="substring(preceding-sibling::text()[1],
    string-length(preceding-sibling::text()[1])-0)"/>
  <xsl:variable name="succeedingalphanum" select="substring(following-sibling::text()[1], 1, 1)"/>
  <xsl:if test="$previousalphanum =' '">
   <xsl:text> </xsl:text>
  </xsl:if>
  <xsl:apply-templates mode="text"/>
  <xsl:if test="$succeedingalphanum =' '">
   <xsl:text> </xsl:text>
  </xsl:if>
 </xsl:template>

 <xsl:template match="add|del" mode="text">
  <!-- for add and del, no hard return,
       and test if preceding element is an add or del;
       if so, include a space -->
  <xsl:variable name="previousalphanum" select="substring(preceding-sibling::text()[1],
    string-length(preceding-sibling::text()[1])-0)"/>
  <xsl:variable name="succeedingalphanum" select="substring(following-sibling::text()[1], 1, 1)"/>
  <xsl:if test="$previousalphanum =' '">
   <xsl:text> </xsl:text>
  </xsl:if>
  <xsl:if test="(preceding-sibling::del) or (preceding-sibling::add)">
   <xsl:text> </xsl:text>
  </xsl:if>
  <xsl:apply-templates mode="text"/>
  <xsl:if test="$succeedingalphanum =' '">
   <xsl:text> </xsl:text>
  </xsl:if>
 </xsl:template>

 <!-- elements that SHOULD NOT be processed AT ALL -->
 <xsl:template match="citnstruct/artist|citnstruct/author|desc|head|page|pageheader" mode="text"/>

 <xsl:template match="text()" mode="text">
  <xsl:value-of select="normalize-space(.)"/>
 </xsl:template>
</xsl:stylesheet>

Here is the RDF output for the above file. Note the use of dcterms:hasPart for divisional structures within the file that we wanted to expose as separate RDF objects:

<rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/"
 xmlns:ra="http://www.rossettiarchive.org/schema#"
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:role="http://www.loc.gov/loc.terms/relators/"
 xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
 xmlns:collex="http://www.collex.org/schema#"
 xmlns:dcterms="http://purl.org/dc/terms/">

   <ra:rad rdf:about="http://www.rossettiarchive.org/docs/1-1847.princefrag.rad">
      <collex:archive>rossetti</collex:archive>
      <collex:source_xml rdf:resource="http://www.rossettiarchive.org/docs/1-1847.princefrag.rad.xml"/>
      <rdfs:seeAlso rdf:resource="http://www.rossettiarchive.org/docs/1-1847.princefrag.rad.html"/>
      <collex:text rdf:resource="http://www.rossettiarchive.org/docs/1-1847.princefrag.rad.txt"/>
      <dc:title>The Blessed Damozel (1872 autograph MS fragment)</dc:title>
      <role:AUT>Dante Gabriel Rossetti</role:AUT>

      <collex:genre>Primary</collex:genre>
      <collex:genre>Manuscript</collex:genre>
      <dcterms:hasPart rdf:resource="http://www.rossettiarchive.org/docs/1-1847.princefrag.radheader"/>
      <dcterms:hasPart rdf:resource="http://www.rossettiarchive.org/docs/1-1847.princefrag.rad#0.1"/>
      <dc:relation rdf:resource="http://www.rossettiarchive.org/img/1-1847.princefrag.1"/>
      <collex:thumbnail rdf:resource="http://www.rossettiarchive.org/img/thumbs_small/1-1847.princefrag.1.jpg"/>
      <collex:image rdf:resource="http://www.rossettiarchive.org/img/1-1847.princefrag.1.jpg"/>
      <dc:date>1872 (circa)</dc:date>

   </ra:rad>
   <ra:header rdf:about="http://www.rossettiarchive.org/docs/1-1847.princefrag.radheader">
      <collex:archive>rossetti</collex:archive>
      <collex:source_xml rdf:resource="http://www.rossettiarchive.org/docs/1-1847.princefrag.rad.xml"/>
      <rdfs:seeAlso rdf:resource="http://www.rossettiarchive.org/docs/1-1847.princefrag.radheader.html"/>
      <dc:title>Commentary for The Blessed Damozel (1872 autograph MS fragment)</dc:title>
      <dc:source>The Blessed Damozel (1872 autograph MS fragment)</dc:source>

      <role:AUT>Jerome J. McGann</role:AUT>
      <dc:date>2006</dc:date>
      <collex:genre>Secondary</collex:genre>
      <dcterms:isPartOf rdf:resource="http://www.rossettiarchive.org/docs/1-1847.princefrag.rad"/>
   </ra:header>
   <ra:div rdf:about="http://www.rossettiarchive.org/docs/1-1847.princefrag.rad#0.1">
      <collex:archive>rossetti</collex:archive>

      <collex:source rdf:resource="http://www.rossettiarchive.org/docs/1-1847.princefrag.rad.xml"/>
      <rdfs:seeAlso rdf:resource="http://www.rossettiarchive.org/docs/1-1847.princefrag.rad.html#0.1"/>
      <rdfs:seeAlso rdf:resource="http://www.rossettiarchive.org/docs/1-1847.princefrag.rad.html"/>
      <collex:text rdf:resource="http://www.rossettiarchive.org/docs/1-1847.princefrag.rad.0.1.txt"/>
      <dc:title>The Blessed Damozel</dc:title>
      <dc:source>The Blessed Damozel (1872 autograph MS fragment)</dc:source>
      <role:AUT>Dante Gabriel Rossetti</role:AUT>

      <dc:date>1872 (circa)</dc:date>
      <collex:genre>Poetry</collex:genre>
      <dcterms:isPartOf rdf:resource="http://www.rossettiarchive.org/docs/1-1847.princefrag.rad"/>
   </ra:div>
</rdf:RDF>

A particular line may warrant some additional explanation. See how the image thumbnail link is generated:

  <collex:thumbnail rdf:resource="http://www.rossettiarchive.org/img/thumbs_small/{encode-for-uri(concat($image, '.jpg'))}"/>

The "encode-for-uri" function (part of XPath 2.0) percent-encodes any unusual characters so that the result can be interpreted as part of a URL. If your image names contain brackets, ampersands, question marks, or anything else out of the ordinary, you will want to do this so that the URL can be processed correctly.
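
To see the effect in isolation, the following stand-alone stylesheet is a minimal sketch that encodes a made-up file name. The name "draft [2] & notes?" is purely illustrative and does not come from the Rossetti Archive. Run against any XML input with an XSLT 2.0 processor, it writes the encoded name as plain text.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Hypothetical image name containing a space, brackets, an ampersand and a question mark -->
    <xsl:variable name="image" select="'draft [2] &amp; notes?'"/>
    <!-- Writes: draft%20%5B2%5D%20%26%20notes%3F.jpg -->
    <xsl:value-of select="encode-for-uri(concat($image, '.jpg'))"/>
  </xsl:template>
</xsl:stylesheet>
```

Because encode-for-uri also escapes the "/" character, it should be applied only to a single path segment such as the file name, as in the thumbnail line above, and not to a complete URL.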

Additionally, we have a Perl script (db2rdf.pl) which was used to extract bibliographic entries from the Whitman bibliography SQL database.


Walters Art Museum Example (MESA)

Here is the XSLT that we used to generate MESA-compliant RDF from the TEI Manuscript Description (P5) files provided by the Walters Art Museum. The Walters TEI is available from their site. A sample of the resulting RDF is shown below, and also on the MESA RDF page.

Sample XSLT

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:redirect="org.apache.xalan.xslt.extensions.Redirect"
extension-element-prefixes="redirect tei"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:collex="http://www.nines.org/schema#" xmlns:walters="http://thedigitalwalters.org/schema#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:role="http://www.loc.gov/loc.terms/relators/" xmlns:tei="http://www.tei-c.org/ns/1.0">

<xsl:output method="xml" encoding="utf-8" indent="yes"/>
<xsl:strip-space elements="*"/>


<!-- variables -->
<xsl:variable name="baseURL">http://thedigitalwalters.org/Data/WaltersManuscripts</xsl:variable>

<xsl:variable name="newline">
  <xsl:text>
      
  </xsl:text>
</xsl:variable>

<xsl:param name="outputDir">RDF-Walters</xsl:param>

<xsl:template match="/">
  <xsl:apply-templates/>
</xsl:template>

<xsl:template match="tei:TEI">
 <xsl:for-each select="tei:teiHeader">
  <xsl:variable name="id">
   <xsl:value-of select="translate(tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:msIdentifier/tei:idno, '.', '')"/>
  </xsl:variable>
  <xsl:variable name="filename" select="$id"/>
  <xsl:result-document href="{$outputDir}/{$filename}.rdf">
   <xsl:value-of select="$newline"/>
   <rdf:RDF>
    <xsl:value-of select="$newline"/>
    <walters:wam rdf:about="{$baseURL}/ManuscriptDescriptions/{$id}">
     <xsl:value-of select="$newline"/>
     <xsl:apply-templates select="."/>
     <xsl:value-of select="$newline"/>
     <xsl:comment>MESA specific metadata</xsl:comment>
     <xsl:value-of select="$newline"/>
     <collex:federation>MESA</collex:federation>
     <collex:archive>walters</collex:archive>
     <xsl:value-of select="$newline"/>
     <collex:thumbnail rdf:resource="http://idhmc.tamu.edu/image-store/walters/digitalwalters2.jpg"/>
     <collex:source_xml rdf:type="XML">
      <xsl:attribute name="rdf:resource">
       <xsl:value-of select="concat($baseURL, '/ManuscriptDescriptions/', $id, '_tei.xml')"/>
      </xsl:attribute>
     </collex:source_xml>
     <xsl:value-of select="$newline"/>
     <xsl:comment>Link that MESA should send users for this object</xsl:comment>
     <xsl:value-of select="$newline"/>
     <rdfs:seeAlso rdf:resource="{$baseURL}/html/{$id}/"/>
     <xsl:value-of select="$newline"/>
     <xsl:call-template name="hasParts"/>
     <xsl:value-of select="$newline"/>
    </walters:wam>
    <xsl:value-of select="$newline"/>
   </rdf:RDF>
  </xsl:result-document>
 </xsl:for-each>
</xsl:template>

<xsl:template match="tei:teiHeader">
 <xsl:comment>Standard DublinCore metadata</xsl:comment>
 <xsl:value-of select="$newline"/>

 <!-- title -->
 <dc:title>
  <xsl:value-of select="tei:fileDesc/tei:titleStmt/tei:title[@type='common']"/>
 </dc:title>

 <!-- creator  -->
 <role:AUT>
  <xsl:choose>
   <xsl:when test="tei:fileDesc/tei:titleStmt/tei:author/tei:name[@type='authority']">
    <xsl:value-of select="tei:fileDesc/tei:titleStmt/tei:author/tei:name[@type='authority']"/>
   </xsl:when>
   <xsl:when test="tei:fileDesc/tei:titleStmt/tei:author/tei:name[@type='supplied']">
    <xsl:value-of select="tei:fileDesc/tei:titleStmt/tei:author/tei:name[@type='supplied']"/>
   </xsl:when>
   <xsl:when test="tei:fileDesc/tei:titleStmt/tei:author/tei:name[@type='vernacular']">
    <xsl:value-of select="tei:fileDesc/tei:titleStmt/tei:author/tei:name[@type='vernacular']"/>
   </xsl:when>
   <xsl:otherwise>
     <xsl:text>Unknown</xsl:text>
   </xsl:otherwise>
  </xsl:choose>
 </role:AUT>
 <xsl:for-each select="tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:history/tei:provenance">
  <dc:provenance>
   <xsl:value-of select="."/>
  </dc:provenance>
 </xsl:for-each>
 <xsl:if test="tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:history/tei:acquisition">
  <dc:provenance>
   <xsl:value-of select="tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:history/tei:acquisition"/>
  </dc:provenance>
 </xsl:if>

 <!-- MESA required and recommended: discipline, type, language -->

 <collex:discipline>Art History</collex:discipline>
 <dc:type>
  <xsl:choose>
   <xsl:when test="contains(tei:profileDesc/tei:textClass/tei:keywords, 'Codex')">Codex</xsl:when>
   <xsl:otherwise>Single Leaf</xsl:otherwise>
  </xsl:choose>
 </dc:type>
 <dc:language>
  <xsl:value-of select="tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:msContents/tei:textLang/@mainLang"/>
 </dc:language>

<!-- publisher -->
 <role:PBL>
  <xsl:value-of select="tei:fileDesc/tei:publicationStmt/tei:publisher"/>
 </role:PBL>


<!--date -->
 <dc:date>
  <xsl:choose>
   <!-- test for a full range first; otherwise the single-attribute tests below always match before it -->
   <xsl:when test="tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:history/tei:origin/@notBefore and tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:history/tei:origin/@notAfter">
    <xsl:value-of select="tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:history/tei:origin/@notBefore"/>
    <xsl:text>,</xsl:text>
    <xsl:value-of select="tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:history/tei:origin/@notAfter"/>
   </xsl:when>
   <xsl:when test="tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:history/tei:origin/@notBefore">
    <xsl:value-of select="tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:history/tei:origin/@notBefore"/>
   </xsl:when>
   <xsl:when test="tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:history/tei:origin/@notAfter">
    <xsl:value-of select="tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:history/tei:origin/@notAfter"/>
   </xsl:when>
   <xsl:when test="tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:history/tei:origin/@when">
    <xsl:value-of select="tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:history/tei:origin/@when"/>
   </xsl:when>
   <xsl:otherwise>
    <xsl:text>Uncertain</xsl:text>
   </xsl:otherwise>
  </xsl:choose>
 </dc:date>
 <xsl:value-of select="$newline"/>

 <!--freeculture-->
 <collex:freeculture>true</collex:freeculture>
 <xsl:value-of select="$newline"/>

<!-- genre -->


 <xsl:for-each select="tei:profileDesc/tei:textClass/tei:catRef[@scheme='#genres']/@target">
  <xsl:choose>
   <xsl:when test=". = '#genre_1'">
    <collex:genre>Religion</collex:genre>
   </xsl:when>
   <xsl:when test=". = '#genre_2'">
    <collex:genre>Religion</collex:genre>
   </xsl:when>
   <xsl:when test=". = '#genre_3'">
    <collex:genre>Religion</collex:genre>
   </xsl:when>
   <xsl:when test=". = '#genre_4'">
    <collex:genre>Religion</collex:genre>
   </xsl:when>
   <xsl:when test=". = '#genre_5'">
    <collex:genre>Law</collex:genre>
   </xsl:when>
   <xsl:when test=". = '#genre_6'">
    <collex:genre>History</collex:genre>
   </xsl:when>
   <xsl:when test=". = '#genre_7'">
    <collex:genre>Science</collex:genre>
   </xsl:when>
   <xsl:when test=". = '#genre_8'">
    <collex:genre>Science</collex:genre>
   </xsl:when>
   <xsl:when test=". = '#genre_10'">
    <collex:genre>Poetry</collex:genre>
   </xsl:when>
   <xsl:when test=". = '#genre_11'">
    <collex:genre>Philosophy</collex:genre>
   </xsl:when>
   <xsl:when test=". = '#genre_12'">
    <collex:genre>Philosophy</collex:genre>
   </xsl:when>
   <xsl:when test=". = '#genre_13'">
    <collex:genre>Religion</collex:genre>
   </xsl:when>
  </xsl:choose>
 </xsl:for-each>
</xsl:template>

<xsl:template name="hasParts">
 <xsl:for-each select="ancestor-or-self::tei:TEI/tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title[@type='work']">
  <dcterms:hasPart>
   <xsl:attribute name="rdf:resource">
    <xsl:value-of select="normalize-space(.)"/>
   </xsl:attribute>
  </dcterms:hasPart>
 </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

Sample RDF

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:role="http://www.loc.gov/loc.terms/relators/"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:walters="http://thedigitalwalters.org/schema#"
xmlns:collex="http://www.collex.org/schema#"
xmlns:dcterms="http://purl.org/dc/terms/" 
xmlns:dc="http://purl.org/dc/elements/1.1/">
        
        <!-- stable URI -->
      
 <walters:wam rdf:about="http://thedigitalwalters.org/Data/WaltersManuscripts/ManuscriptDescriptions/W768">
      
    <!--Standard DublinCore metadata-->
      
 <dc:title>Walters Ms. W.768, Ethiopic Psalter with Canticles, Song of Songs, and two hymns in praise of Mary</dc:title>

<!-- role information -->
<role:AUT>St. Yared the Aksumite priest St. Ephraem Syrus</role:AUT>
<role:PBL>The Walters Art Museum</role:PBL>

 <!-- add dc:type, dc:provenance, dc:format, etc -->

<dc:provenance>Probably created for one of the princes of the Gonderite royal family, whose reign ended in 1769</dc:provenance>
<dc:provenance>Names of Wälättä Ḥǝywät and Wälättä Kidan, who added texts in the nineteenth century, mentioned repeatedly in later added prayers (fols. 12r, 13v, 31v, and 37r)</dc:provenance>
<dc:provenance>Purchased by the Walters Art Museum through the S. &amp; A. P. Fund, January 1960</dc:provenance>

<collex:discipline>Art History</collex:discipline>
<dc:type>Codex</dc:type>
<dc:language>gez</dc:language>
<collex:genre>Religion</collex:genre> 
       
<dc:date>1700</dc:date>
      
<collex:freeculture>true</collex:freeculture>
      
    
      
    <!--MESA specific metadata-->
      
<collex:federation>MESA</collex:federation>
<collex:archive>walters</collex:archive>
      
<collex:thumbnail rdf:resource="http://www.thedigitalwalters.org/Data/WaltersManuscripts/W768/data/W.768/thumb/W768_000001_thumb.jpg" />
<collex:source_xml rdf:resource="http://thedigitalwalters.org/Data/WaltersManuscripts/ManuscriptDescriptions/W768_tei.xml"/>
      
    <!--Link that MESA should send users for this object-->
      
    <rdfs:seeAlso rdf:resource="http://thedigitalwalters.org/Data/WaltersManuscripts/html/W768/"/>
      

      
      
    </walters:wam>
      
    </rdf:RDF>

Generating Readable Plain Text

Contributors can make their resources full-text searchable in the COLLEX browser by associating a plain-text transcription in a .txt file. Information on how this association appears in the RDF can be found in the Submitting RDF guide.

Generating plain-text files from XML is actually rather simple if the contributor is using XSLT. To process the node which contains the desired text, the following XSLT should be included.

    <xsl:result-document href="{$idtxt}.txt" method="text" encoding="utf-8"
                         indent="no" xml:space="default">
      <xsl:apply-templates select="text" mode="text"/>
    </xsl:result-document>

<xsl:result-document> is the XSLT 2.0 solution for creating multiple output documents. The "href" attribute specifies the name of the output file; because "href" is an attribute value template, the "{ }" curly brackets insert the value of a variable into the URI. See the Rossetti XSLT (near the top of the XSL file) for an example of the $idtxt variable. The "method" must be "text", "encoding" should be specified as "utf-8", "indent" should be set to "no", and "xml:space" should be set to "default". On its own, this will output fairly readable plain text.
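
For readers without the Rossetti file to hand, the following is a minimal, self-contained sketch of how such a variable might be defined and used. It is not the Rossetti definition: the "fulltext/" directory and the assumption that the root element carries an "id" attribute are hypothetical and should be adapted to your own markup.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- hypothetical: build the output name from an id attribute on the root element -->
  <xsl:variable name="idtxt" select="concat('fulltext/', /*/@id)"/>

  <xsl:template match="/">
    <!-- write the transcription to a separate text file named after the id -->
    <xsl:result-document href="{$idtxt}.txt" method="text" encoding="utf-8">
      <xsl:apply-templates select="*/text" mode="text"/>
    </xsl:result-document>
  </xsl:template>

  <!-- trim each text node and collapse its internal runs of white space -->
  <xsl:template match="text()" mode="text">
    <xsl:value-of select="normalize-space(.)"/>
  </xsl:template>

</xsl:stylesheet>
```

Elements inside the selected node that have no template of their own fall through to XSLT's built-in rules, which simply process their children in the same mode.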

However, if the XML node to be transcribed contains a complex hierarchy of subordinate nodes, the output may include hard returns and indentation that the contributor would find less than desirable, especially if the plain-text file is also expected to be used within the Juxta collation tool. Much of this can be fixed with further XSL template matches. In the above example, <xsl:apply-templates> processes the selected node in the "text" mode. In the Rossetti XSL, templates with mode="text" are used specifically to match the subordinate nodes that occur within the text we wish to transcribe.

Below are the XSLT matches for nodes we include in transcriptions. The matches fall into three categories:

  1. elements that should be processed and end with a hard return line break
  2. elements that should be processed and SHOULD NOT end with a hard return line break
  3. elements that should not be processed at all

<!-- catch-all: any element not matched below is processed and ends with a hard return -->
<xsl:template match="*" mode="text">
  <xsl:apply-templates mode="text"/>
  <xsl:text>
</xsl:text>
</xsl:template>

<!-- elements that SHOULD be processed with a hard return -->
<xsl:template match="citnstruct/title|
                     text//divheader|
                     text//p|
                     text//l|
                     text//msadds/trans|
                     item|
                     section[@type='biblio']/p/bibl|
                     titlestmt/title|
                     titlestmt/author|
                     titlestmt/editor" mode="text">
  <xsl:apply-templates mode="text"/>
  <xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="text//lb|epage" mode="text">
  <xsl:text>
</xsl:text>
</xsl:template>

<!-- elements that SHOULD NOT be processed with a hard return -->
<xsl:template match="addspan|
                     author|
                     bibl|
                     bibl/date|
                     delspan|
                     foreign|
                     hi|
                     pages|
                     quote|
                     title|
                     xref" mode="text">
  <xsl:apply-templates mode="text"/>
</xsl:template>
	
<!-- elements that SHOULD NOT be processed AT ALL -->
<xsl:template match="citnstruct/artist|citnstruct/author|desc|head|page|pageheader" mode="text"/>

<!-- removes any extraneous indentation and soft returns -->
<xsl:template match="text()" mode="text">
  <xsl:value-of select="normalize-space(.)"/>
</xsl:template>

Note that the first match above, <xsl:template match="*" mode="text">, acts as a catch-all backstop: any element not specifically listed in the matches that follow is processed and given a hard return. Although this creates some redundancy with the second match (which also processes with line breaks), it may prove useful for initial debugging of the XSLT.

The final match, <xsl:template match="text()" mode="text">, removes unwanted indentations and other white-space.
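
One point worth checking against your own data: normalize-space() also removes white space at the start and end of each text node, so in mixed content such as <p>Some <hi>bold</hi> words</p> the pieces are joined without spaces between them. The variant below is only a sketch of one possible alternative, not the method used for the Rossetti Archive. It collapses each run of white space to a single space instead of trimming it, and relies on <xsl:strip-space> to drop indentation between elements. The "p" element is used purely as an example of a line-ending element.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text" encoding="utf-8"/>

  <!-- drop white-space-only text nodes (indentation between elements) from the source tree -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <xsl:apply-templates mode="text"/>
  </xsl:template>

  <!-- line-ending element: process the children, then emit a hard return -->
  <xsl:template match="p" mode="text">
    <xsl:apply-templates mode="text"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- collapse each run of white space to one space, keeping spaces at node boundaries -->
  <xsl:template match="text()" mode="text">
    <xsl:value-of select="replace(., '\s+', ' ')"/>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the fragment above, this should give "Some bold words" followed by a hard return. The trade-off is that <xsl:strip-space> also discards a space that stands alone between two inline elements, so the result should be compared with the source before adopting it.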