CSCI 5733.1
XML Application Development
Summer 2003
Mid-Term Examination

Name:  _________________________________

Time allowed: one hour and 30 minutes.  Total score: 30 points.

Open: textbook, lecture notes, every file I wrote and posted on my Web page, and your project assignments.  No other materials are allowed.

Answer all questions.  Turn in both question and answer sheets.  Plan your time well.

The academic honesty policy will be followed strictly.  Cheating will result in a failing grade of D or below and a permanent academic record!

(1) [9 points] Complete the following program, su3t1asax.java, which reads in an XML file and outputs the name of the first deepest element and the number of its attributes using SAX (other methods are not acceptable). The depth of the root element is 0, and the depth of an empty tree is defined to be -1.

For the following input file, data.xml:

<?xml version="1.0" ?>
<root>
   <a>
      <b />
      <b x="1">
         <c x="1" y="2"/>
         <d x="2" y="3"/>
      </b>
   </a>
   <d>
      <e/>
   </d>
</root>

running

java su3t1asax data.xml

output exactly the following:

The first deepest element is <c> and it has 2 attribute.

This is because the first deepest element is <c x="1" y="2"/> at a depth of 3 and it has two attributes: x and y.

You must not write your own program. Instead, you should study the following skeleton and fill in the missing parts, which include only the SAX event handlers, the data members, the method printResult() and any other helper methods you may need.

import java.io.*;
import java.util.*;

import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;

import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;

public class su3t1asax extends DefaultHandler
{
   public static void main(String argv[])
   {   if (argv.length != 1) {
         System.err.println("Usage: java su3t1asax xmlfilename");
         System.exit(1);
      }

      DefaultHandler handler = new su3t1asax();
      SAXParserFactory factory = SAXParserFactory.newInstance();

      try
      {   SAXParser saxParser = factory.newSAXParser();
         saxParser.parse(new File(argv [0]), handler);
         //   print the result.
         printResult();
      } catch (Throwable t)
      {   t.printStackTrace();
      }
   }   //   main

   // Add your handlers, data members, printResult and
   // other methods here...

}   //   su3t1asax.
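
The depth bookkeeping that question (1) relies on can be sketched as follows. This is only an illustration under stated assumptions, not a model answer: the class name DeepestElementSketch and the helper analyze() are hypothetical, and the XML is passed in as a string rather than read from data.xml so that the sketch is self-contained. It counts the root at depth 0 and keeps only the first element seen at the greatest depth.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name; it is not part of the exam skeleton.
class DeepestElementSketch extends DefaultHandler {
    private int depth = -1;            // depth of the innermost open element; -1 before the root
    private int maxDepth = -1;         // greatest depth seen so far; -1 for an empty tree
    private String deepestName = null; // name of the first element found at maxDepth
    private int deepestAttrCount = 0;  // number of attributes of that element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        depth++;
        // Strictly greater: a later element at the same depth does not replace the first one.
        if (depth > maxDepth) {
            maxDepth = depth;
            deepestName = qName;
            deepestAttrCount = attrs.getLength();
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        depth--;
    }

    String result() {
        // Wording copied from the expected output given in the question.
        return "The first deepest element is <" + deepestName + "> and it has "
                + deepestAttrCount + " attribute.";
    }

    // Hypothetical helper: parses an XML string and returns the result line.
    static String analyze(String xml) {
        try {
            DeepestElementSketch handler = new DeepestElementSketch();
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.result();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Same structure as the sample data.xml in the question.
        String xml = "<root><a><b/><b x=\"1\"><c x=\"1\" y=\"2\"/><d x=\"2\" y=\"3\"/></b></a>"
                + "<d><e/></d></root>";
        System.out.println(analyze(xml));
    }
}
```

In this sketch <d x="2" y="3"/> is also at depth 3, but it does not replace <c> because the comparison is strict.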

(2) (a) [3 points] Give the definition of the simple type mySimpleType in XML Schema so that it includes exactly the positive integers divisible by 4: 4, 8, 12, 16, 20, 24, ...

(b) [7 points] A very small library wants to store its book collection in XML. A collection may have up to 12 categories of books. Each category has a unique id, a name and an optional brief description. The category id starts with the character "c". A book may belong to zero or more categories. A book has a unique call number that starts with a capital letter and may contain letters or digits. A book also has a title and zero or more authors. The title of a book may contain any characters, but its first character must be a letter or a digit, such as "123, sing along."

Design your DTD to be as close to the specification above as possible. Both category and book information should be stored. Enforce as many constraints as possible.

(3) (a) [5 points] True or False; no explanation should be given.

(i) XMLReader is a class defined by the SAX API.
(ii) &nbsp; is not a predefined entity in XML.
(iii) In XML Schema, a simple type can be used as the data type of an element.
(iv) xmlns is a predefined namespace prefix in XML.
(v) In non-validating mode, an XML parser will not access the DTD specified in DOCTYPE.

(b) [6 points] Write a CGI-Perl program that uses the XML document stored at http://dcm.uhcl.edu/yue/courses/xml/Summer2003/test/hquote.xml, which has the following DTD:

<!ELEMENT quotes (quote*)>
<!ELEMENT quote (author, content)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT content (#PCDATA)>

There are no attributes. The program generates plain text content (not HTML). Each line depicts one <quote> in the format of

author:content

where author and content are the values of <author> and <content> respectively.
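
The required line format can be illustrated with a short sketch. It is written in Java, the language of the skeleton in question (1), and is not the CGI-Perl program the question asks for: it only shows how each <quote> maps to one author:content line. The class name QuoteFormatSketch, the helper format() and the two sample quotes are hypothetical; the real hquote.xml is not fetched.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical class name, used for illustration only.
class QuoteFormatSketch {

    // Returns one "author:content" line per <quote> element, in document order.
    static String format(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList quotes = doc.getElementsByTagName("quote");
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < quotes.getLength(); i++) {
                Element quote = (Element) quotes.item(i);
                // The DTD guarantees exactly one <author> and one <content> per <quote>.
                String author = quote.getElementsByTagName("author").item(0).getTextContent();
                String content = quote.getElementsByTagName("content").item(0).getTextContent();
                out.append(author).append(':').append(content).append('\n');
            }
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample data that follows the DTD; not the contents of hquote.xml.
        String xml = "<quotes>"
                + "<quote><author>Author One</author><content>First sample quote.</content></quote>"
                + "<quote><author>Author Two</author><content>Second sample quote.</content></quote>"
                + "</quotes>";
        System.out.print(format(xml));
    }
}
```

A CGI program producing this text would also need to send a plain text content type header before the lines themselves.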