CSCI 5733.1
XML Application Development
Summer 2003
Final Examination

Name:  _________________________________

Time allowed: one hour and 30 minutes.  Total score: 30.

Open: textbook, lecture notes, every file I wrote and posted on my Web page, and your project assignments.  No other materials are allowed.

Answer all questions.  Turn in both question and answer sheets (with the question sheets on top).  Plan your time well.

The academic honesty policy will be followed strictly.  Cheating will result in a failing grade of D or below and a permanent academic record.

(1) Write a simple XSLT program, flat.xsl, that accepts an XML document and flattens it. The program removes all comments, processing instructions (PIs) and text, and keeps the elements and attributes. In addition, every element, except the root itself, becomes a child of the root.

For example, if the input XML document is:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<root a="1" b="y" y="3">
   <p y="y"><r a="1"><q /></r></p>
   <t><r c="1" /></t>
   Hello
   <s y="1">
   Bye
   </s>
</root>

The output XML should be (with the possible exception of whitespace):

<?xml version="1.0" encoding="UTF-8"?>
<root a="1" b="y" y="3">
<p y="y"/>
<r a="1"/>
<q/>
<t/>
<r c="1"/>
<s y="1"/>
</root>

The skeleton of the XSLT program is:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                  version="1.0">
<xsl:output method="xml" version="1.0" indent="yes" />

<!-- Your XSL code here -->

</xsl:stylesheet>

You only need to write your XSLT templates. Hint: when outputting the contents of an element in XSLT, attributes should be output before child elements. It is not necessary to worry about the output encoding.
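
The hint can be tried out from Java before writing flat.xsl. The following is only a minimal sketch and not a solution to the question: the class XsltHintDemo, its helper apply, and the tiny one-element stylesheet are illustrative names made up here, and the sketch assumes the JAXP transformer bundled with the JDK. The template copies the attributes of a before copying its child b. XSLT 1.0 treats adding an attribute after a child as an error, which a processor may either report or recover from by ignoring the attribute.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltHintDemo {
   // Illustrative stylesheet (an assumption, not part of the exam):
   // it rebuilds <a>, writing its attributes before its child <b>.
   static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/a'>"
      + "<xsl:copy>"
      + "<xsl:copy-of select='@*'/>"   // attributes first
      + "<xsl:copy-of select='b'/>"    // then child elements
      + "</xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

   // Hypothetical helper: applies a stylesheet (given as a string)
   // to an XML document (given as a string) and returns the result.
   static String apply(String stylesheet, String input) throws Exception {
      Transformer transformer = TransformerFactory.newInstance()
         .newTransformer(new StreamSource(new StringReader(stylesheet)));
      StringWriter out = new StringWriter();
      transformer.transform(new StreamSource(new StringReader(input)),
                            new StreamResult(out));
      return out.toString();
   }

   public static void main(String[] args) throws Exception {
      // Prints the rebuilt <a> element with its attribute and child.
      System.out.println(apply(STYLESHEET, "<a x='1'><b/></a>"));
   }
}
```

The same apply helper could be pointed at the text of flat.xsl and the sample input above to compare the result with the expected output.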

(2) Consider the following XML file:

<?xml version="1.0" ?>
<!DOCTYPE collection SYSTEM "collection.dtd" >
<collection>
   <authors>
      <author id="_1">Bun Yue</author>
      <author id="_2">Terry Feagin</author>
      <author id="_3">Sadegh Davari</author>
      <author id="_4">Morris Liaw</author>
      <author id="_5">Joe Giarratano</author>
   </authors>
   <categories>
      <category id="c1">Computing</category>
      <category id="c2">Internet</category>
      <category id="c3">Programming</category>
   </categories>
   <books>
      <book ISBN="_123456" >
         <authorRef id="_1" />
         <categoryRef id="c1" />
         <categoryRef id="c2" />
         <title>Fun with XML</title>
         <chapter num="1">Beginning XML</chapter>
         <chapter num="2">intermediate XML</chapter>
      </book>
      <book ISBN="_123457" >
         <authorRef id="_1" />
         <authorRef id="_2" />
         <categoryRef id="c1" />
         <title>Fun with AI</title>
         <chapter num="1">Beginning AI</chapter>
         <chapter num="2">intermediate AI</chapter>
      </book>
      <book ISBN="_123458" >
         <authorRef id="_3" />
         <categoryRef id="c3" />
         <title>Fun with Java</title>
         <chapter num="1">Beginning Java</chapter>
      </book>
      <book ISBN="_123459" >
         <categoryRef id="c1" />
         <title>Fun with Perl</title>
         <chapter num="1">Programming with Perl</chapter>
      </book>
   </books>
</collection>

The meaning of the XML document is quite self-explanatory. In case you need its DTD:

<!ELEMENT collection (authors?, categories?, books?)>
<!ELEMENT authors (author*)>
<!ELEMENT author (#PCDATA)>
<!ATTLIST author id ID #REQUIRED>
<!ELEMENT categories (category*)>
<!ELEMENT category (#PCDATA)>
<!ATTLIST category id ID #REQUIRED>
<!ELEMENT books (book*)>
<!ELEMENT book (authorRef*, categoryRef*, title?, chapter*)>
<!ATTLIST book ISBN ID #REQUIRED>

<!ELEMENT authorRef EMPTY>
<!ATTLIST authorRef id IDREF #REQUIRED>
<!ELEMENT categoryRef EMPTY>
<!ATTLIST categoryRef id IDREF #REQUIRED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT chapter (#PCDATA)>
<!ATTLIST chapter num NMTOKEN #REQUIRED>

Give XPath expressions for the following queries.

(a) The ISBN of all books written by "Bun Yue".
(b) The author elements of the authors with the first name "Sadegh" (assuming that the first name comes first and there are no leading spaces).
(c) The names of the categories of the book with title "Fun with XML".
(d) The total number of books that do not belong to category "c3" (3 in the example above).
(e) A book with no author.
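
An XPath expression can be checked against a document from Java. The following is a minimal sketch under stated assumptions: the class XPathTry, its helpers number and string, and the cut-down SAMPLE document are made up for illustration, and the two expressions shown are deliberately not answers to (a) through (e). It uses the javax.xml.xpath API that ships with the JDK.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathTry {
   // Cut-down sample (an assumption): two books only, no DOCTYPE.
   static final String SAMPLE =
        "<collection><books>"
      + "<book ISBN='_123456'><title>Fun with XML</title>"
      + "<chapter num='1'>Beginning XML</chapter>"
      + "<chapter num='2'>intermediate XML</chapter></book>"
      + "<book ISBN='_123458'><title>Fun with Java</title>"
      + "<chapter num='1'>Beginning Java</chapter></book>"
      + "</books></collection>";

   // Hypothetical helper: evaluates expr and returns the result as a number.
   static double number(String expr, String xml) throws Exception {
      XPath xpath = XPathFactory.newInstance().newXPath();
      Double result = (Double) xpath.evaluate(
         expr, new InputSource(new StringReader(xml)), XPathConstants.NUMBER);
      return result.doubleValue();
   }

   // Hypothetical helper: evaluates expr and returns the string value
   // of the result (for a node set, that of its first node).
   static String string(String expr, String xml) throws Exception {
      XPath xpath = XPathFactory.newInstance().newXPath();
      return xpath.evaluate(expr, new InputSource(new StringReader(xml)));
   }

   public static void main(String[] args) throws Exception {
      // Number of chapter elements in the sample.
      System.out.println(number("count(//chapter)", SAMPLE));
      // Title of the book that has exactly one chapter.
      System.out.println(string("//book[count(chapter)=1]/title", SAMPLE));
   }
}
```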

(3) Using the Java DOM API (JDOM is not acceptable), write a static method Vector getPIs(Node node) that returns a Vector of all processing instructions (of type ProcessingInstruction) in the tree rooted at node.

For example, for the following program, PrintPI.java:

import javax.xml.parsers.*;
import org.xml.sax.SAXException; 
import org.xml.sax.SAXParseException; 
import org.w3c.dom.*;

import java.io.*;
import java.util.*;

public class PrintPI {
   static Document document;

   public static void main(String argv[])
   {   if (argv.length != 1) {
         System.err.println("Usage: java PrintPI xmlfilename");
         System.exit(1);
      }
      DocumentBuilderFactory factory
         = DocumentBuilderFactory.newInstance();
      try {
         DocumentBuilder builder = factory.newDocumentBuilder();
         document = builder.parse(new File(argv[0]));
         Vector pINodes = getPIs(document);
         for (Enumeration e = pINodes.elements() ;
                  e.hasMoreElements() ;) {
            ProcessingInstruction pi
               = (ProcessingInstruction) e.nextElement();
            System.out.println(pi.getTarget() + " => "
               + pi.getData());
         }
      } catch (Throwable e) {
         e.printStackTrace();
      }
   } // main

   // Definition of private static Vector getPIs(Node node) here.
   // ...
}

for the input XML file, PrintPIdat1.xml:

<?xml version="1.0" ?>
<?pi a="1" b="2" ?>
<root>
   <?br ?>
   <?xy ?>
   <a>
      <?br x="1" y="2" z="3" ?>
      hello
      <?hey ?>
      <b><?pi1 ?><?pi2 ?>
      </b>
   </a>
   <d>
      <?kkk l="1" m="2" n="3" ?>
   </d>
</root>
<?pi-end ?>

Running

java PrintPI PrintPIdat1.xml

will output:

pi => a="1" b="2"
br =>
xy =>
br => x="1" y="2" z="3"
hey =>
pi1 =>
pi2 =>
kkk => l="1" m="2" n="3"
pi-end =>

Hint: The method public boolean addAll(Collection c) of the class Vector appends the elements of a collection to a vector if c is not null. If c is null, a NullPointerException will be thrown. The class Vector implements the interface Collection.
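
The behavior described in the hint can be seen in isolation. The following is a minimal sketch and not part of the expected answer: the class AddAllDemo and its helpers merge and throwsOnNull are names made up here, and it uses generics, which the raw Vector in PrintPI.java does not.

```java
import java.util.Vector;

class AddAllDemo {
   // Hypothetical helper: appends extra to target, skipping a null
   // argument so that addAll never sees null.
   static Vector<String> merge(Vector<String> target, Vector<String> extra) {
      if (extra != null) {
         target.addAll(extra);
      }
      return target;
   }

   // Returns true if calling addAll(null) throws a NullPointerException.
   static boolean throwsOnNull() {
      try {
         new Vector<String>().addAll(null);
         return false;
      } catch (NullPointerException e) {
         return true;
      }
   }

   public static void main(String[] args) {
      Vector<String> first = new Vector<String>();
      first.add("pi");
      Vector<String> second = new Vector<String>();
      second.add("br");
      second.add("xy");

      // The elements of second are appended, in order, after those of first.
      System.out.println(merge(first, second));
      // A null argument is skipped by merge, so first is unchanged.
      System.out.println(merge(first, null).size());
      System.out.println(throwsOnNull());
   }
}
```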