Introduction to XPath

by K. Yue

1. Introduction

Resources:

Basics:

2. Path Expression

Example:

Consider film.xml (with data extracted from Sakila).

//films/film

Location path requirements:

Example:

//films/child::film[actor/@id='162']

Consider the location step:

child::film[actor/@id='162']

  1. Node axis: child
  2. Node test: film
  3. Node predicate: [actor/@id='162']

The XPath expression lists all <film> elements that:

  1. are a child node of a <films> element in the document, and
  2. have a child <actor> which has an attribute id with the value of '162'.

Note that actor/@id is a relative path. It is evaluated relative to the context node, which is a film node, and the result is then compared with '162'.
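The three parts of the location step can be mimicked outside XPath. The following is a minimal sketch in Python using xml.etree.ElementTree; the sample document, the actor names and the helper films_with_actor are hypothetical, and the layout of the real film.xml is only assumed from the expressions in these notes.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of film.xml; the real file's layout is an assumption.
SAMPLE = """
<films>
  <film><title>ALPHA</title><actor id="162">ACTOR ONE</actor><actor id="20">ACTOR TWO</actor></film>
  <film><title>BETA</title><actor id="20">ACTOR TWO</actor></film>
</films>
"""

def films_with_actor(films, actor_id):
    """Emulate child::film[actor/@id=...] with <films> as the context node."""
    matches = []
    for node in films:                       # axis: child
        if node.tag != "film":               # node test: film
            continue
        # predicate: at least one child <actor> carries the requested id
        if any(a.get("id") == actor_id for a in node.findall("actor")):
            matches.append(node)
    return matches

films = ET.fromstring(SAMPLE)
titles = [f.findtext("title") for f in films_with_actor(films, "162")]
print(titles)  # ['ALPHA']
```

The predicate is evaluated once per candidate film, with that film as the context node, which is why actor/@id is written as a relative path.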

Node Axes:

From XPath 3.0:

[40]  ForwardAxis  ::=  ("child" "::")
                      | ("descendant" "::")
                      | ("attribute" "::")
                      | ("self" "::")
                      | ("descendant-or-self" "::")
                      | ("following-sibling" "::")
                      | ("following" "::")
                      | ("namespace" "::")
[43]  ReverseAxis  ::=  ("parent" "::")
                      | ("ancestor" "::")
                      | ("preceding-sibling" "::")
                      | ("preceding" "::")
                      | ("ancestor-or-self" "::")
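To get a feel for what a few of these axes select, here is a rough Python sketch with xml.etree.ElementTree. ElementTree supports only a small abbreviated subset of XPath and has no axis syntax, so the parent and following-sibling axes are emulated by hand. The sample document and the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; element names follow the examples in these notes.
films = ET.fromstring(
    "<films>"
    "<film><title>ALPHA</title><actor id='1'>A</actor></film>"
    "<film><title>BETA</title><actor id='2'>B</actor></film>"
    "</films>"
)

# child:: selects direct children of the context node only
child_films = films.findall("film")
# descendant:: selects elements at any depth below the context node
descendant_actors = films.findall(".//actor")
# parent:: ElementTree keeps no parent pointers, so build a map first
parent_of = {child: parent for parent in films.iter() for child in parent}
actor = descendant_actors[0]
# following-sibling:: selects later children of the same parent
siblings = list(films)
following = siblings[siblings.index(child_films[0]) + 1:]

print(len(child_films), len(descendant_actors))   # 2 2
print(parent_of[actor].findtext("title"))         # ALPHA
print([f.findtext("title") for f in following])   # ['BETA']
```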

 

Node test:

Node Predicate:

Shorthand:

Example:

//text()

or

/descendant::*/text()

Both expressions list all text nodes in the document.

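Note the difference between the following two expressions. The first selects the id attribute nodes whose value is '20'; the second selects the <actor> elements that have such an attribute.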

//actor/@id[.='20']
//actor[@id='20']
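The distinction between selecting attributes and selecting elements can be sketched in Python. ElementTree cannot return attribute nodes, so the first expression is approximated by collecting attribute values; the sample data is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data.
films = ET.fromstring(
    "<films><film><title>ALPHA</title>"
    "<actor id='20'>ACTOR TWO</actor><actor id='7'>ACTOR SEVEN</actor>"
    "</film></films>"
)

# Counterpart of //actor[@id='20']: the result items are <actor> elements.
actor_elements = films.findall(".//actor[@id='20']")
# Counterpart of //actor/@id[.='20']: the result items are the id attributes.
# Only their values are collected here, not attribute nodes.
id_values = [a.get("id") for a in films.iter("actor") if a.get("id") == "20"]

print([a.text for a in actor_elements])  # ['ACTOR TWO']
print(id_values)                         # ['20']
```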

//film/actor [position()=2]

or

//film/actor[2]
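Both forms select the second <actor> child of each <film>. Positions start at 1 and are counted per parent, as this small Python sketch with hypothetical data suggests (ElementTree happens to support the abbreviated [2] form):

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: films with three, one and two actors.
films = ET.fromstring(
    "<films>"
    "<film><actor id='1'>A</actor><actor id='2'>B</actor><actor id='3'>C</actor></film>"
    "<film><actor id='4'>D</actor></film>"
    "<film><actor id='5'>E</actor><actor id='6'>F</actor></film>"
    "</films>"
)

# Positions are 1-based and counted among the <actor> children of each film.
second_actors = films.findall("./film/actor[2]")
names = [a.text for a in second_actors]
print(names)  # ['B', 'F']
```

The film with a single actor contributes nothing, since it has no second <actor> child.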

3. Sequence Expressions

Example:

(1, 5 to 8, "Bun Yue", 2.1)

(1+2, 5)

(1 to 50)[. mod 3 = 1]

//film/* | //film

(1, 2, (3, (4, 5))) is the same as (1, 2, 3, 4, 5), because sequences are never nested.
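For readers more used to a general-purpose language, the sequence examples can be approximated in Python. Python lists do nest, so a helper (here the hypothetical flatten) is needed to imitate XPath's automatic flattening.

```python
def flatten(items):
    """Flatten nested lists the way XPath flattens nested sequences."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat

# (1, 5 to 8, "Bun Yue", 2.1): the range 5 to 8 includes both ends
mixed = flatten([1, list(range(5, 9)), "Bun Yue", 2.1])
# (1 to 50)[. mod 3 = 1]: keep the items whose remainder is 1
filtered = [n for n in range(1, 51) if n % 3 == 1]
# (1, 2, (3, (4, 5)))
nested = flatten([1, 2, [3, [4, 5]]])

print(mixed)                                       # [1, 5, 6, 7, 8, 'Bun Yue', 2.1]
print(filtered[:4], filtered[-1], len(filtered))   # [1, 4, 7, 10] 49 17
print(nested)                                      # [1, 2, 3, 4, 5]
```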

4. Other Expressions

Primary expressions

Example:

//film[count(actor)>=10]

or

//film[count(./actor)>=10]

returns all <film> nodes with 10 or more actors.
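The same filter can be sketched in Python, where count(actor) corresponds to the number of <actor> children. The sample data and the helper films_with_min_actors are hypothetical, and a threshold of 2 is used to keep the sample small.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data.
films = ET.fromstring(
    "<films>"
    "<film><title>ALPHA</title><actor id='1'>A</actor><actor id='2'>B</actor></film>"
    "<film><title>BETA</title><actor id='1'>A</actor></film>"
    "</films>"
)

def films_with_min_actors(root, minimum):
    """Counterpart of //film[count(actor) >= minimum]."""
    return [f for f in root.iter("film") if len(f.findall("actor")) >= minimum]

titles = [f.findtext("title") for f in films_with_min_actors(films, 2)]
print(titles)  # ['ALPHA']
```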

//film/actor [position()=2]
//film/actor [fn:position()=2]

or

//film/actor[2]

//film[starts-with(title/text(),'A')]

gives all <film> elements whose titles start with 'A'.

distinct-values(//film/actor[starts-with(text(),'A')])

gives a sequence of the distinct actor names that start with 'A'.
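As a rough Python counterpart of starts-with() and distinct-values(), consider the sketch below, with hypothetical sample data. The order of the values returned by distinct-values() is implementation dependent in XPath; the Python version simply keeps first-seen order.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data.
films = ET.fromstring(
    "<films>"
    "<film><title>ALPHA</title><actor id='1'>ANN</actor><actor id='2'>BOB</actor></film>"
    "<film><title>BETA</title><actor id='1'>ANN</actor><actor id='3'>AL</actor></film>"
    "</films>"
)

# Counterpart of //film[starts-with(title/text(),'A')]
a_titles = [f.findtext("title") for f in films.iter("film")
            if (f.findtext("title") or "").startswith("A")]
# Counterpart of distinct-values(//film/actor[starts-with(text(),'A')]);
# dict.fromkeys drops duplicates while keeping first-seen order.
a_names = list(dict.fromkeys(
    a.text for a in films.iter("actor") if (a.text or "").startswith("A")
))

print(a_titles)  # ['ALPHA']
print(a_names)   # ['ANN', 'AL']
```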

Arithmetic, Comparison and Logical Expressions

For Expression and Variable Binding:

Example:

for $film in (//film) return $film/actor

is the same as:

//film/actor

List the names of all actors who appeared in more than 35 films, using the for expression in XPath:

fn:distinct-values(//film/actor[for $a in . return count(//film/actor[@id = $a/@id]) > 35]/text())

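This can be slow: evaluated naively, the inner count(...) scans every <actor> element in the document again for each <actor> element being tested.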
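When the query is run from a host language instead, the repeated scanning can be avoided by counting once. The following Python sketch is only an illustration of that idea, not part of XPath; the sample data and the helper frequent_actor_names are hypothetical, and small thresholds stand in for 35.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data: actor 1 appears three times, actor 2 twice.
films = ET.fromstring(
    "<films>"
    "<film><actor id='1'>ANN</actor><actor id='2'>BOB</actor></film>"
    "<film><actor id='1'>ANN</actor><actor id='3'>CAT</actor></film>"
    "<film><actor id='1'>ANN</actor><actor id='2'>BOB</actor></film>"
    "</films>"
)

def frequent_actor_names(root, threshold):
    """Names of actors whose id occurs in more than `threshold` <actor> elements."""
    actors = list(root.iter("actor"))
    appearances = Counter(a.get("id") for a in actors)   # single counting pass
    names = [a.text for a in actors if appearances[a.get("id")] > threshold]
    return list(dict.fromkeys(names))                    # distinct, first-seen order

print(frequent_actor_names(films, 1))  # ['ANN', 'BOB']
print(frequent_actor_names(films, 2))  # ['ANN']
```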

Conditional Expressions:

Example:

if (//film[title/text()='ADAPTATION HOLES']) then 'found Holes' else 'no Holes'
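In the test of the conditional, a non-empty node sequence counts as true and an empty one as false. A minimal Python analogue, with hypothetical sample data and a hypothetical helper holes_message, might look like this:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data.
films = ET.fromstring(
    "<films><film><title>ADAPTATION HOLES</title></film>"
    "<film><title>BETA</title></film></films>"
)

def holes_message(root):
    # find() returns None when nothing matches; compare with None explicitly
    # rather than relying on the truth value of an Element.
    match = root.find(".//film[title='ADAPTATION HOLES']")
    return "found Holes" if match is not None else "no Holes"

print(holes_message(films))  # found Holes
```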


Quantified Expressions

Example:

All film elements with an actor whose id is 4 or less:

//film[some $a in actor satisfies $a/@id <= 4]

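This is the same as the following, because a general comparison is true as soon as any item in the sequence satisfies it: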

//film[actor/@id <= 4]

All film elements whose actors all have an id greater than 150 (a film with no actors also qualifies):

//film[every $a in actor satisfies $a/@id > 150]
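The some and every quantifiers behave much like Python's any() and all(). The sketch below uses hypothetical sample data and a hypothetical helper actor_ids; note that a film without actors passes the every test, since the condition holds for each of its zero actors.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; GAMMA has no actors at all.
films = ET.fromstring(
    "<films>"
    "<film><title>ALPHA</title><actor id='3'>A</actor><actor id='160'>B</actor></film>"
    "<film><title>BETA</title><actor id='151'>C</actor><actor id='200'>D</actor></film>"
    "<film><title>GAMMA</title></film>"
    "</films>"
)

def actor_ids(film):
    """Numeric ids of the <actor> children of one film."""
    return [int(a.get("id")) for a in film.findall("actor")]

# some $a in actor satisfies $a/@id <= 4
some_low = [f.findtext("title") for f in films.iter("film")
            if any(i <= 4 for i in actor_ids(f))]
# every $a in actor satisfies $a/@id > 150
every_high = [f.findtext("title") for f in films.iter("film")
              if all(i > 150 for i in actor_ids(f))]

print(some_low)    # ['ALPHA']
print(every_high)  # ['BETA', 'GAMMA']
```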

For filmActor.xml:

//film[every $a in //film[@id=937]/actorIds/actorId/@actorId satisfies ./actorIds/actorId/@actorId = $a]

returns all film elements whose cast includes every actor who appeared in the film with id 937.
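The query is in effect a subset test: the actors of film 937 must be a subset of the actors of each film returned. A minimal Python sketch follows, assuming a miniature filmActor.xml whose root element name <films> and whose data are hypothetical; the helper cast is hypothetical too.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of filmActor.xml; the root name <films> is assumed.
films = ET.fromstring(
    "<films>"
    "<film id='937'><actorIds><actorId actorId='1'/><actorId actorId='2'/></actorIds></film>"
    "<film id='10'><actorIds><actorId actorId='1'/><actorId actorId='2'/>"
    "<actorId actorId='3'/></actorIds></film>"
    "<film id='11'><actorIds><actorId actorId='2'/></actorIds></film>"
    "</films>"
)

def cast(film):
    """Set of actorId values listed under one film."""
    return {a.get("actorId") for a in film.findall("./actorIds/actorId")}

reference = cast(films.find("./film[@id='937']"))
# every actor of film 937 must also appear in the candidate film's cast
matches = [f.get("id") for f in films.iter("film") if reference <= cast(f)]
print(matches)  # ['937', '10']
```

Film 937 itself is among the results, in the sketch as in the XPath expression, since its cast trivially contains all of its own actors.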