CSCI 5733.2
XML Application Development
Spring 2003
Final Examination

Name:  _________________________________

Time allowed: one hour and 20 minutes.  Total score: 100%.

Open: textbook, lecture notes, every file I wrote and posted on my Web page, and your project assignments.  No other materials are allowed.

Answer all questions.  Turn in both question and answer sheets (with the question sheets on top).  Plan your time well.

The academic honesty policy will be followed strictly.  Cheating will result in a failing grade of D or below and a permanent academic record.

(1) Write a simple XSLT program, countText.xsl, that accepts an XML document, removes all comments, processing instructions (PIs) and text, and keeps the elements and attributes. In addition, for every element with child text nodes, add an attribute, textCount, that gives the number of child text nodes.

For example, if the input XML document is:

<?xml version="1.0" ?>
<root a="1" b="y" y="3">
   <p y="y"><q /></p>
   <r c="1" />
   Hello
   <s y="1">
   Bye
   </s>
</root>

The output XML should be (with the possible exception of whitespace):

<?xml version="1.0" encoding="UTF-8"?>
<root a="1" b="y" y="3" textCount="4">
<p y="y">
<q/>
</p>
<r c="1"/>
<s y="1" textCount="1"/>
</root>

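This is because only two elements have child text nodes: <root> has four and <s> has one. Note that whitespace-only text nodes are counted; three of the four text nodes of <root> contain only whitespace.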
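If you want to confirm these counts independently, the following is a minimal sketch that parses the sample input with the JDK's built-in DOM parser and counts the child text nodes of each element. The class name ChildTextCount and its helper methods are made up for this illustration, and it assumes the parser keeps whitespace-only text nodes (the default when no DTD is involved). It is not an XSLT solution; it only checks the numbers quoted above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the exam skeleton.
class ChildTextCount {
    // The sample input of question (1), with its line breaks preserved.
    static final String SAMPLE =
        "<?xml version=\"1.0\" ?>\n"
      + "<root a=\"1\" b=\"y\" y=\"3\">\n"
      + "   <p y=\"y\"><q /></p>\n"
      + "   <r c=\"1\" />\n"
      + "   Hello\n"
      + "   <s y=\"1\">\n"
      + "   Bye\n"
      + "   </s>\n"
      + "</root>\n";

    // Parse an XML string into a DOM document (non-validating).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("cannot parse sample", e);
        }
    }

    // Count only the direct children of e that are text nodes.
    static int countChildText(Element e) {
        int n = 0;
        for (Node c = e.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.TEXT_NODE) {
                n++;
            }
        }
        return n;
    }

    // Count the child text nodes of the first element with the given name.
    static int countFor(String tagName) {
        Element e = (Element) parse(SAMPLE).getElementsByTagName(tagName).item(0);
        return countChildText(e);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"root", "p", "q", "r", "s"}) {
            System.out.println(name + ": " + countFor(name));
        }
    }
}
```

Only <root> and <s> should report a non-zero count, which is why they are the only elements that receive a textCount attribute in the expected output.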

The skeleton of the XSLT program is:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                  version="1.0">
<xsl:output method="xml" version="1.0" indent="yes" />

<!-- Your XSL code here -->

</xsl:stylesheet>

You only need to write your XSLT templates. Hint: when outputting the contents of an element in XSLT, attributes must be output before child elements.

(2) Consider the following XML file:

<?xml version="1.0" ?>
<!DOCTYPE collection SYSTEM "collection.dtd" >
<collection>
   <authors>
      <author id="_1">Bun Yue</author>
      <author id="_2">Terry Feagin</author>
      <author id="_3">Sadegh Davari</author>
      <author id="_4">Morris Liaw</author>
      <author id="_5">Joe Giarratano</author>
   </authors>
   <categories>
      <category id="c1">Computing</category>
      <category id="c2">Internet</category>
      <category id="c3">Programming</category>
   </categories>
   <books>
      <book isbn="_123456" >
         <authorRef id="_1" />
         <categoryRef id="c1" />
         <categoryRef id="c2" />
         <title>Fun with XML</title>
         <chapter num="1">Beginning XML</chapter>
         <chapter num="2">intermediate XML</chapter>
      </book>
      <book isbn="_123457" >
         <authorRef id="_1" />
         <authorRef id="_2" />
         <categoryRef id="c1" />
         <title>Fun with AI</title>
         <chapter num="1">Beginning AI</chapter>
         <chapter num="2">intermediate AI</chapter>
      </book>
      <book isbn="_123458" >
         <authorRef id="_3" />
         <categoryRef id="c3" />
         <title>Fun with Java</title>
         <chapter num="1">Beginning Java</chapter>
      </book>
      <book isbn="_123459" >
         <categoryRef id="c1" />
         <title>Fun with Perl</title>
         <chapter num="1">Programming with Perl</chapter>
      </book>
   </books>
</collection>

The meaning of the XML document is quite self-explanatory. In case you need its DTD:

<!ELEMENT collection (authors?, categories?, books?)>
<!ELEMENT authors (author*)>
<!ELEMENT author (#PCDATA)>
<!ATTLIST author id ID #REQUIRED>
<!ELEMENT categories (category*)>
<!ELEMENT category (#PCDATA)>
<!ATTLIST category id ID #REQUIRED>
<!ELEMENT books (book*)>
<!ELEMENT book (authorRef*, categoryRef*, title?, chapter*)>
<!ATTLIST book isbn ID #REQUIRED>

<!ELEMENT authorRef EMPTY>
<!ATTLIST authorRef id IDREF #REQUIRED>
<!ELEMENT categoryRef EMPTY>
<!ATTLIST categoryRef id IDREF #REQUIRED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT chapter (#PCDATA)>
<!ATTLIST chapter num NMTOKEN #REQUIRED>

Give XPath expressions for the following queries.

(a) All book elements that have two or more chapters.
(b) The isbn of all books under the category "Computing".
(c) The number of books authored by "Bun Yue".
(d) The number of books with no explicit author.
(e) The number of authors (listed in <authors>) who have not written any book (listed in <books>).
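For reference, here is a minimal sketch of how an XPath 1.0 expression can be evaluated against this document with the JDK's javax.xml.xpath package. The class name XPathCheck and its two helper methods are hypothetical, the DOCTYPE line is left out so that collection.dtd does not have to be present, and the expressions shown are deliberately not among queries (a) to (e).

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper class for trying out XPath expressions.
class XPathCheck {
    // The collection document of question (2), without its DOCTYPE line.
    static final String COLLECTION =
        "<collection>"
      + "<authors>"
      + "<author id=\"_1\">Bun Yue</author>"
      + "<author id=\"_2\">Terry Feagin</author>"
      + "<author id=\"_3\">Sadegh Davari</author>"
      + "<author id=\"_4\">Morris Liaw</author>"
      + "<author id=\"_5\">Joe Giarratano</author>"
      + "</authors>"
      + "<categories>"
      + "<category id=\"c1\">Computing</category>"
      + "<category id=\"c2\">Internet</category>"
      + "<category id=\"c3\">Programming</category>"
      + "</categories>"
      + "<books>"
      + "<book isbn=\"_123456\">"
      + "<authorRef id=\"_1\" /><categoryRef id=\"c1\" /><categoryRef id=\"c2\" />"
      + "<title>Fun with XML</title>"
      + "<chapter num=\"1\">Beginning XML</chapter>"
      + "<chapter num=\"2\">intermediate XML</chapter>"
      + "</book>"
      + "<book isbn=\"_123457\">"
      + "<authorRef id=\"_1\" /><authorRef id=\"_2\" /><categoryRef id=\"c1\" />"
      + "<title>Fun with AI</title>"
      + "<chapter num=\"1\">Beginning AI</chapter>"
      + "<chapter num=\"2\">intermediate AI</chapter>"
      + "</book>"
      + "<book isbn=\"_123458\">"
      + "<authorRef id=\"_3\" /><categoryRef id=\"c3\" />"
      + "<title>Fun with Java</title>"
      + "<chapter num=\"1\">Beginning Java</chapter>"
      + "</book>"
      + "<book isbn=\"_123459\">"
      + "<categoryRef id=\"c1\" />"
      + "<title>Fun with Perl</title>"
      + "<chapter num=\"1\">Programming with Perl</chapter>"
      + "</book>"
      + "</books>"
      + "</collection>";

    // Evaluate an expression and return its XPath string value.
    static String evalString(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expr, new InputSource(new StringReader(COLLECTION)));
        } catch (Exception e) {
            throw new IllegalStateException("cannot evaluate " + expr, e);
        }
    }

    // Evaluate an expression and return its XPath number value.
    static double evalNumber(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Double) xpath.evaluate(
                expr, new InputSource(new StringReader(COLLECTION)), XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException("cannot evaluate " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evalNumber("count(/collection/books/book)"));
        System.out.println(evalString("/collection/books/book[@isbn='_123458']/title"));
    }
}
```

The same two helpers can be used to try out candidate expressions for the queries above; count(...) expressions go through evalNumber, and expressions selecting nodes return the string value of the first selected node through evalString.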

(3) Consider the following DTD school.dtd:

<!ELEMENT school (division*)>
<!ELEMENT division (faculty*)>
<!ELEMENT faculty (#PCDATA)>
<!ATTLIST faculty department NMTOKEN #REQUIRED>

Write a Java program, NumberOfDistinctDepartment.java, that uses JDOM (DOM or SAX will not be accepted) to output the number of distinct department values in an XML file that is valid with respect to the DTD file school.dtd.

For example, if the input file school.xml is:

<?xml version="1.0" ?>
<!DOCTYPE school SYSTEM "school.dtd" >
<school>
   <division>
      <faculty department="CSCI">Bun Yue</faculty>
      <faculty department="CSCI">Terry Feagin</faculty>
      <faculty department="SWEN">Sadegh Davari</faculty>
      <faculty department="SWEN">Morris Liaw</faculty>
      <faculty department="CSCI">Joe Giarratano</faculty>
      <faculty department="MATH">Lie June Shiau</faculty>
   </division>
</school>

Running the program:

java NumberOfDistinctDepartment school.xml

should output:

Number of department = 3.

This is because there are three distinct department values: "CSCI", "SWEN" and "MATH".
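To make the expected result concrete, the sketch below recomputes it with the JDK's built-in DOM parser and a java.util.Set. Since it uses DOM rather than JDOM it is not an acceptable answer to this question; it only illustrates what "distinct department values" means for the sample. The class name DistinctDepartmentCheck and the method distinctDepartments are made up for this illustration, and the DOCTYPE line is omitted so that school.dtd does not have to be present.

```java
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; uses JDK DOM, NOT JDOM.
class DistinctDepartmentCheck {
    // The sample school.xml, without its DOCTYPE line.
    static final String SCHOOL =
        "<school>"
      + "<division>"
      + "<faculty department=\"CSCI\">Bun Yue</faculty>"
      + "<faculty department=\"CSCI\">Terry Feagin</faculty>"
      + "<faculty department=\"SWEN\">Sadegh Davari</faculty>"
      + "<faculty department=\"SWEN\">Morris Liaw</faculty>"
      + "<faculty department=\"CSCI\">Joe Giarratano</faculty>"
      + "<faculty department=\"MATH\">Lie June Shiau</faculty>"
      + "</division>"
      + "</school>";

    // Collect every department attribute value; a Set drops the duplicates.
    static Set<String> distinctDepartments(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList faculty = doc.getElementsByTagName("faculty");
            Set<String> departments = new TreeSet<>();
            for (int i = 0; i < faculty.getLength(); i++) {
                Element f = (Element) faculty.item(i);
                departments.add(f.getAttribute("department"));
            }
            return departments;
        } catch (Exception e) {
            throw new IllegalStateException("cannot parse input", e);
        }
    }

    public static void main(String[] args) {
        Set<String> departments = distinctDepartments(SCHOOL);
        System.out.println("Number of department = " + departments.size() + ".");
    }
}
```

A sorted set is used here only so that the collected values come out in a predictable order; any Set implementation would give the same count.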

The following skeleton is provided for you, so you only need to provide the code for the method numDept (and possibly other helper methods).

import java.io.*;
import org.jdom.*;
import org.jdom.input.*;
import org.jdom.output.*;
import java.util.*;

public class NumberOfDistinctDepartment {

  public static void main(String argv[]) {
    if(argv.length == 1) {
      Document document = readDocument(argv[0]);
      if (document != null) {
        int numDept = numDept(document);
        System.out.println("Number of department = "
           + numDept + ".\n\n");
      }
      else {
        System.out.println("Sorry, the input file cannot be parsed.");
      }
    }
  }

  private static Document readDocument(String filename) {
    try {
      SAXBuilder builder = new SAXBuilder();
      Document result = builder.build(new File(filename));
      return result;
    } catch(JDOMException e) {
      e.printStackTrace();
    } catch(NullPointerException e) {
      e.printStackTrace();
    }
    return null;
  }

  // Develop the method numDept here.
}