TMQL4J is an open source project available under an Apache 2.0 license. The project is hosted on http://code.google.com/p/tmql/. If you find bugs or have feature requests please file a ticket.

1. Scope of this document

The Topic Maps Query Language (TMQL) is a language for querying topic maps in an easy and consistent way, independent of the underlying topic maps engine implementation. It is inspired by languages established in other areas of technology, such as SQL, and by the topic maps query language TOLOG. There is currently no final standard specification for TMQL, so we decided to implement a topic maps query engine based on the draft of 15 August 2008, apart from the extensions developed by the Topic Maps Lab.

The document is divided into three sections.

The first section describes the language as specified in the underlying draft, together with the extensions. It gives the user an overview of the query language and enables them to use the engine by writing queries that extract the information they are interested in. The language part starts with the grammar description, continues with the basic navigation and graph concept of TMQL, and finishes with the complex high-level query languages.

The second section describes the engine itself and gives the user a short overview of the architecture and the plugins available for tmql4j. We describe how to use the engine and how the user can extend or adapt it to realize business use cases in the most comfortable way on top of our core engine.

The last section contains tutorials that combine the knowledge gained in the other two sections by realizing some basic use cases, starting with a simple one and finishing with a complex one.

2. Topic Maps Query Language

This chapter describes the query language.

2.1. The meta model of TMQL

The meta model (or data model) of TMQL differs considerably from the meta model of topic maps (TMDM). Many of the differences are useful, and they reflect the differing views and the extensive discussion around the topic maps meta model. Currently the meta model of the query language includes concepts of the Topic Maps Reference Model (TMRM), such as the navigation concept, which resembles the proxy concept of the reference model. Other important inputs are discussions about changes to the topic maps meta model. Through its editors, some changes discussed in the context of the TMDM also affected the current draft of the query language.

This chapter highlights the main differences between the two conceptions: the TMQL meta model on the one hand and the TMDM on the other.

2.1.1. No topic map construct

In contrast to the topic maps meta model, the topic map construct cannot be accessed directly. It is therefore impossible to get all topics or all associations by starting a navigation at the root node. In the Topic Maps API (TMAPI), the root node appears to be the topic map construct, whose two main methods provide access to all topics and all associations. The TMAPI is currently based on the TMDM conception.

The main drawback is that this information cannot be obtained as simply as with the TMAPI.

Not addressing the topic map item directly is similar to the approach of the CTM syntax. A topic map is never addressed directly by a CTM pattern; it is represented indirectly by the whole document. Likewise, TMQL addresses the topic map item indirectly as the querying context. There is only one way to use the topic map item within a query: the environment variable %_, which represents the queried topic map. It cannot, however, be used as the starting point of a navigation.

2.1.2. Merging the concept of names and occurrences

The topic maps meta model distinguishes the two concepts of names and occurrences. Names are used as human-readable identification of topic items. A name item represents a name of a real-world subject and is represented as a string literal. Occurrences represent the relationship between a subject and an information resource. An occurrence can be used to bind a characteristic information resource to the topic item that contains it, for example its email address or its date of birth.

In the context of information modeling the differences between names and occurrences are quite simple. Names can have variants and always have the datatype string. Occurrences have no variants and can have different datatypes, including string. Once variants are removed from the meta model, names and occurrences differ only in their datatype, and because occurrences can be string literals too, this difference has no relevance. As a result, the current draft of the topic maps query language does not distinguish between names and occurrences. Both concepts are merged into the concept of characteristics. A characteristic item represents a relation between the topic item and an information resource, which can also be a name if the name type is used.

2.2. Language grammar

The grammar of the query language is modeled in three levels. Each level uses and extends the constructs defined by the level below it. Each grammar level is inspired by an industrial standard.

2.2.1. The token level

The token level forms the base of the query language by defining the terminal symbols. It uses regular expressions to specify case-sensitive character patterns for valid terminal symbols. The current draft contains constant tokens representing keywords of the language itself, such as the keyword SELECT. In addition, the draft contains terminal definitions based on regular expressions; such a terminal can be represented by any token literal matching the given regular expression. For example, a variable has to start with the defined prefix $ and has to contain at least one character after the prefix.

Note
Binary infix and unary prefix operators are not defined as part of the grammar itself. They are handled as part of the predefined environment, in the same way as the functions that implement these operators. These tokens are nevertheless reserved and cannot be used with other meanings.

The token level defines a set of delimiting characters that split token representations automatically. All other tokens are not delimiting and must be separated by whitespace characters. Whitespace characters are blanks, tabs and newlines, and any number of them can be placed between tokens. Whitespace is not interpreted except within string literals, XML fragments or CTM streams.

Note
The hash character # introduces comments in a query, which are not interpreted by the query processor. The hash character is recognized as a comment marker only if it is surrounded by whitespace and is not part of a string literal.

2.2.2. The canonical level

The canonical syntax is defined using the context-free grammar notation of XML 1.0 with some conventions. The canonical level defines a set of productions for non-terminals representing the expressions of the query language. The productions use terminals of the token level and are written as terms, that is, sequences of terminals and non-terminals.

2.2.3. The non-canonical level

The non-canonical level is built on top of the canonical level and contains term substitutions that reduce the syntactic overhead of frequently used expressions. The defined shortcuts are equivalent to their expanded forms and do not add any computational complexity.

2.3. Literals and Atoms

Literals are terminals which aren’t reserved keywords of the query language. Non-constant literals normally represent values of constant atoms like a string literal.

Constant literals are called atoms and represent the internal datatypes, which can be used as constants through the identifier of the datatype. The current draft adopts a list of primitive datatypes from the CTM draft and adds the date and dateTime datatypes of XSD so that time values are core types of the queried topic map. The following table lists all predefined atoms and the values by which they can be represented.

datatype    possible values
atom        undefined | boolean | number | date | dateTime | iri | string [ ^^ QIRI ]
undefined   undef
boolean     true | false
number      decimal | integer
decimal     [+-]?[0-9]+(\.[0-9]+)?
integer     [+-]?[0-9]+
date        -? yyyy - mm - dd zzzzzz?
dateTime    -? yyyy - mm - dd T hh : mm : ss (. s+)? (zzzzzz)?
iri         " QIRI "
string      "([^"]|\")*" | """([^"]|\")*""" | '([^']|\')*'

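The patterns in the table can be tried out with a few lines of Python. This is only a minimal sketch: the function name classify_atom, the order in which the patterns are tested, and the restriction to five datatypes are assumptions made for illustration and are not part of the draft or of tmql4j.

```python
import re

# Patterns taken from the table of atoms; the test order is an assumption.
# Only single- and double-quoted strings are covered, not triple quotes.
ATOM_PATTERNS = [
    ("undefined", re.compile(r"undef")),
    ("boolean", re.compile(r"true|false")),
    ("integer", re.compile(r"[+-]?[0-9]+")),
    ("decimal", re.compile(r"[+-]?[0-9]+(\.[0-9]+)?")),
    ("string", re.compile(r'"([^"]|\\")*"')),
    ("string", re.compile(r"'([^']|\\')*'")),
]


def classify_atom(token):
    """Return the name of the first datatype whose pattern matches the whole token."""
    for name, pattern in ATOM_PATTERNS:
        if pattern.fullmatch(token):
            return name
    return None  # not one of the atoms covered by this sketch
```

Because the decimal pattern also accepts plain digit sequences, the sketch tests integer first so that a token such as 42 is reported as an integer.
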
2.4. Topic Identification

The identification of topic items or topic types is one of the fundamental parts of a query language for topic maps, so the draft has to provide a simple way to address topics within a query. The current draft supports two different ways of identifying topics.

Similar to the CTM syntax, a topic item can be addressed by one of its identifiers: a subject-identifier, subject-locator or item-identifier. The query processor automatically resolves the identifier to the topic item it represents.

Note
If two topics use the same IRI as different kinds of identifier, for example one as a subject-locator and the other as a subject-identifier, the processor cannot decide which topic item is meant. In this case the engine always prefers subject-identifiers over subject-locators and subject-locators over item-identifiers.
Address a topic by its subject-identifier
http://psi.ontopedia.net/Puccini

Alternatively, a topic item can be addressed by the string literal of one of its identifiers. The ambiguity is resolved by using one of the three identification axes item, indicators or locators. For more information see the chapter about navigation concepts.

Address a topic by the string literal of its subject-identifier
"http://psi.ontopedia.net/Puccini" << indicators

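The precedence rule from the note above (subject-identifiers before subject-locators before item-identifiers) can be sketched as follows. The dictionary layout of a topic and the function name resolve_topic are hypothetical and only serve the illustration; they do not reflect the tmql4j or TMAPI data structures.

```python
# Identifier kinds in the order the engine prefers them, as described in the note.
IDENTIFIER_KINDS = ("subject_identifiers", "subject_locators", "item_identifiers")


def resolve_topic(iri, topics):
    """Return the first topic carrying the IRI, checking the preferred kinds first."""
    for kind in IDENTIFIER_KINDS:
        for topic in topics:
            if iri in topic[kind]:
                return topic
    return None  # no topic uses this IRI as an identifier
```

With two topics that share an IRI, one as subject-locator and one as subject-identifier, the sketch returns the topic that uses it as subject-identifier, regardless of the order of the input list.
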
2.5. Navigation concept

In line with the Topic Maps Reference Model (TMRM), a topic map is represented as an abstract bidirectional graph of construct nodes. Each node supports a number of defined axes for navigating to a related node. This concept is similar to the XPath navigation of XML documents, except that XPath operates on a tree, which is a special kind of graph.

The navigation concept of TMQL realizes navigation through the abstract bidirectional graph of a topic map. The current draft supports a set of predefined axes on the canonical level, which can be navigated in forward or backward direction. Each axis supports a set of construct types that can serve as the starting point of a navigation step along that axis. Some axes support an optional type filter to control the navigation step.

The syntax of a navigation expression looks like the following production.

NAVIGATION      ::=     STEP [ NAVIGATION ]
STEP            ::=     DIRECTION AXIS [ TOPIC-REF ]
DIRECTION       ::=     << | >>

The following sections describe each axis as a stand-alone construct. By combining axes the user can navigate to every node of the topic map. There are no isolated nodes that cannot be reached by navigation.
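To make the production above concrete, the following sketch splits a navigation expression into its steps. It is not the tmql4j parser: the function name parse_navigation, the whitespace-based tokenization and the tuple representation of a step are assumptions, and only the canonical axis names listed in this chapter are accepted.

```python
# Canonical axes described in this chapter (non-canonical axes are left out).
CANONICAL_AXES = {
    "indicators", "locators", "item", "id", "typed", "types", "supertypes",
    "characteristics", "variants", "datatype", "atomify", "ratomify",
    "players", "roles", "roletypes", "traverse", "scope", "reifier",
}
DIRECTIONS = ("<<", ">>")


def parse_navigation(expression):
    """Parse 'DIRECTION AXIS [TOPIC-REF]' steps into (direction, axis, topic_ref) tuples."""
    tokens = expression.split()
    steps = []
    position = 0
    while position < len(tokens):
        direction = tokens[position]
        if direction not in DIRECTIONS:
            raise ValueError(f"expected << or >>, got {direction!r}")
        if position + 1 >= len(tokens):
            raise ValueError("direction without axis")
        axis = tokens[position + 1]
        if axis not in CANONICAL_AXES:
            raise ValueError(f"unknown axis {axis!r}")
        position += 2
        topic_ref = None
        # Any token that is not a direction is read as the optional topic reference.
        if position < len(tokens) and tokens[position] not in DIRECTIONS:
            topic_ref = tokens[position]
            position += 1
        steps.append((direction, axis, topic_ref))
    if not steps:
        raise ValueError("a navigation needs at least one step")
    return steps
```

For example, the expression ">> characteristics tm:name >> atomify" yields two steps, the first with the type filter tm:name and the second without a filter.
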

2.5.1. The indicators axis

The indicators axis represents the relationship between a topic item and its subject-identifiers. It does not support any optional type arguments and can be navigated in both directions.

If the current node is a topic item the forward navigation returns all subject-identifiers of the current topic item as locator objects.

If the current node is a string literal, the backward navigation returns the topic identified by this subject-identifier, or an empty set if no such topic exists.

Note
The non-canonical level defines a shortcut based on the CTM syntax. The shortcut ~ returns the topic item identified by the literal that follows it, used as subject-identifier.

2.5.2. The locators axis

The locators axis represents the relationship between a topic item and its subject-locators. It does not support any optional type arguments and can be navigated in both directions.

If the current node is a topic item the forward navigation returns all subject-locators of the current topic item as locator objects.

If the current node is a string literal, the backward navigation returns the topic identified by this subject-locator, or an empty set if no such topic exists.

Note
The non-canonical level defines a shortcut based on the CTM syntax. The shortcut = returns the topic item identified by the literal that follows it, used as subject-locator.

2.5.3. The item axis

The item axis represents the relationship between a construct and its item-identifiers. It does not support any optional type arguments and can be navigated in both directions.

If the current node is a construct, the forward navigation returns all item-identifiers of the current construct.

If the current node is a string literal, the backward navigation returns the construct identified by this item-identifier, or an empty set if no such construct exists.

Note
The non-canonical level defines a shortcut based on the CTM syntax. The shortcut ^ returns the construct identified by the literal that follows it, used as item-identifier.

2.5.4. The id axis

The id axis represents the relationship between a topic map construct and its id. It does not support any optional type arguments and can be navigated in both directions.

If the current node is a construct, the forward navigation returns the id of the current construct.

If the current node is a string literal, the backward navigation returns the construct identified by this id, or an empty set if no such construct exists.

2.5.5. The typed axis

The typed axis represents the relationship between typed constructs and the topic type. It does not support any optional type arguments and can be navigated in both directions.

If the current node is a topic type the forward navigation returns all typed constructs being of this type.

If the current node is a typed construct the backward navigation returns the topic type of this construct.

2.5.6. The types axis

The types axis represents the relationship between topic types and their instances. It does not support any optional type arguments and can be navigated in both directions.

If the current node is a topic item, the forward navigation returns all topic types of the current topic.

Note
The non-canonical level defines an instances axis. If the current node is a topic item, the backward navigation along the instances axis also returns all topic types of the current topic.
Note
Whether type-instance associations are handled transitively depends on the pragma definition. If the type relation is handled transitively, the navigation returns all types of the whole type hierarchy: if A is an instance of B and B is a subtype of C, then B and C are types of A.

If the current node is a topic type, the backward navigation returns all instances of the current topic type.

Note
The non-canonical level defines an instances axis. If the current node is a topic type, the forward navigation along the instances axis returns all topic items that are instances of the current topic type.
Note
The non-canonical level defines a shortcut for root navigation. The shortcut // returns all instances of the topic type given by the topic reference that follows it.
Note
Whether type-instance associations are handled transitively depends on the pragma definition. If the instance relation is handled transitively, the navigation returns all instances of the whole type hierarchy: if A is an instance of B and B is a subtype of C, then A is an instance of C.
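The effect of the transitivity setting on the forward navigation of the types axis can be illustrated with a small sketch. The two dictionaries that hold the instance-of and subtype-of relations, the function name types_of and the transitive flag (standing in for the pragma) are assumptions made for this example only.

```python
def types_of(topic, instance_of, subtype_of, transitive=True):
    """Return the types of a topic, optionally following the supertype hierarchy."""
    direct = set(instance_of.get(topic, ()))
    if not transitive:
        return direct
    result = set(direct)
    pending = list(direct)
    while pending:
        current = pending.pop()
        # Every supertype of a type of the topic is a type of the topic as well.
        for parent in subtype_of.get(current, ()):
            if parent not in result:
                result.add(parent)
                pending.append(parent)
    return result
```

With A as an instance of B and B as a subtype of C, the sketch returns B and C in transitive mode and only B otherwise, which matches the example given in the notes.
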

2.5.7. The supertypes axis

The supertypes axis represents the relationship between topic types and their supertypes. It does not support any optional type arguments and can be navigated in both directions.

If the current node is a topic type, the forward navigation returns all topic types acting as supertypes of the current topic type.

Note
The non-canonical level defines a subtypes axis. If the current node is a topic type, the backward navigation along the subtypes axis also returns all topic types acting as supertypes of the current topic type.
Note
Whether supertype-subtype associations are handled transitively depends on the pragma definition. If the supertype relation is handled transitively, the navigation returns all supertypes of the whole type hierarchy: if A is a subtype of B and B is a subtype of C, then B and C are supertypes of A.

If the current node is a topic type, the backward navigation returns all topic types acting as subtypes of the current topic type.

Note
The non-canonical level defines a subtypes axis. If the current node is a topic type, the forward navigation along the subtypes axis returns all topic types acting as subtypes of the current topic type.
Note
Whether supertype-subtype associations are handled transitively depends on the pragma definition. If the subtype relation is handled transitively, the navigation returns all subtypes of the whole type hierarchy: if A is a subtype of B and B is a subtype of C, then A and B are subtypes of C.

2.5.8. The characteristics axis

The characteristics axis represents the relationship between a topic item and its characteristics (the merged concept of names and occurrences). The axis supports an optional type in both directions.

If the current node is a topic item, the forward navigation returns all characteristic items of the current topic item. If the optional type is used, only characteristics of the specified type are returned.

Note
Special predefined topic references identify the TMDM name type and occurrence type as optional type arguments of the characteristics axis. With the topic reference tm:name as optional type argument, the axis returns all characteristics that represent a name item in the sense of the TMDM. With the topic reference tm:occurrence, it returns all characteristics that represent an occurrence item in the sense of the TMDM.

If the current node is a characteristic item, the backward navigation returns the topic item related to it. If the optional type is used, only topic items that are instances of the specified type are returned.
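The forward navigation with and without the optional type, including the two special references tm:name and tm:occurrence, might be modelled like this. The dictionary layout of topics and characteristics (with a kind and a type field) and the function name characteristics are hypothetical and chosen only for the illustration.

```python
TM_NAME = "tm:name"
TM_OCCURRENCE = "tm:occurrence"


def characteristics(topic, type_filter=None):
    """Return the characteristics of a topic, optionally restricted by a type."""
    items = topic["characteristics"]
    if type_filter is None:
        return list(items)
    # The two predefined references select by TMDM kind instead of by topic type.
    if type_filter == TM_NAME:
        return [c for c in items if c["kind"] == "name"]
    if type_filter == TM_OCCURRENCE:
        return [c for c in items if c["kind"] == "occurrence"]
    return [c for c in items if c["type"] == type_filter]
```

A topic with one name and one occurrence of the hypothetical type date-of-birth then yields both items without a filter, one item for tm:name, and one item for date-of-birth.
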

2.5.9. The variants axis

The variants axis represents the relationship between a topic name and its variants. The optional type is supported only for backward navigation.

If the current node is a topic name, the forward navigation returns all variant items of this name. The optional type has no relevance.

If the current node is a variant item, the backward navigation returns the parent topic name. If the optional type is given, only names of that type are returned.

2.5.10. The datatype axis

The datatype axis represents the relationship between an occurrence or variant and its datatype. The optional type is supported only for backward navigation.

If the current node is an occurrence or variant, the forward navigation returns the datatype of this construct. The optional type has no relevance.

If the current node is a locator or string item, the backward navigation returns all variants or occurrences having this datatype. If the optional argument is given, only occurrences of this type are returned.

2.5.11. The atomify axis

The atomify axis represents the relationship between a characteristic or locator object and its literal. The axis does not support any optional type argument and can be navigated in both directions.

If the current node is a characteristic item, the forward navigation returns the literal representing the value of the characteristic item. The literal is interpreted according to the datatype of the characteristic item and is cast automatically. If the current node is a locator object, the forward navigation returns the string literal representing the IRI of this locator object.

Note
If a characteristic or locator item is compared with a literal, it is atomified automatically.
Note
The non-canonical level defines a shortcut that combines the characteristics and atomify axes. The shortcut / returns the literals of all characteristics of the topic item addressed by the current node. Note that the type cannot be omitted when this shortcut is used; it is mandatory.

If the current node is a string literal, the backward navigation returns all characteristic items or locator items using exactly this literal as value.

Note
The non-canonical level defines a shortcut that combines the characteristics and atomify axes. The shortcut \ returns all topics related to at least one characteristic or locator represented by the literal. Note that the type cannot be omitted when this shortcut is used; it is mandatory.

2.5.12. The ratomify axis

The ratomify axis is a special variant of the atomify axis. In contrast to the atomify axis, the left-hand literal is interpreted as a regular expression. The axis does not support any optional type argument and can be navigated in both directions.

If the current node is a characteristic item, the forward navigation returns the literal representing the value of the characteristic item, as with the atomify axis. The literal is interpreted according to the datatype of the characteristic item and is cast automatically. If the current node is a locator object, the forward navigation returns the string literal representing the IRI of this locator object.

Note
If a characteristic or locator item is compared with a literal, it is atomified automatically.

If the current node is a string literal, the backward navigation returns all characteristic items or locator items whose literal matches the given literal interpreted as a regular expression.
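The difference between the backward navigation of the atomify axis (exact match) and of the ratomify axis (regular expression match) can be sketched as below. The item dictionaries and both function names are hypothetical. Whether the engine requires the pattern to match the whole value or only a part of it is not stated here, so the use of a full match is an assumption.

```python
import re


def atomify_backward(literal, items):
    """Return the items whose value is exactly the given literal."""
    return [item for item in items if item["value"] == literal]


def ratomify_backward(pattern, items):
    """Return the items whose value matches the literal read as a regular expression."""
    regex = re.compile(pattern)
    # Assumption: the whole value has to match, not just a substring.
    return [item for item in items if regex.fullmatch(item["value"])]
```

Given items with the values "Puccini", "Pucci" and "Verdi", the literal "Pucci" selects one item through atomify_backward, while the pattern "Pucci.*" selects two through ratomify_backward.
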

2.5.13. The players axis

The players axis represents the relationship between a role item and its players. The optional type argument is supported in both directions.

If the current node is a role item, the forward navigation returns the topic that is the player of the current role item. If the optional type is used, only topic items that are instances of the optional type are returned. If the current node is a topic type, all role items of that type are extracted first.

Note
The non-canonical level defines a shortcut that returns the player of the role item specified by the current node. The optional type can be used as in the canonical production.

If the current node is a topic item, the backward navigation returns all role items played by the topic item. If the optional type is used, only role items of that type are returned.

Note
The non-canonical level defines a shortcut that returns all roles played by the topic item specified by the current node. The optional type can be used as in the canonical production.

2.5.14. The roles axis

The roles axis represents the relationship between an association item and its roles. The optional type argument is supported in both directions.

If the current node is an association item, the forward navigation returns all roles of the association. If the current node is a topic type, all association items of that type are extracted first.

If the current node is a role item, the backward navigation returns the association item that is the parent of this role. The optional type is interpreted as the type of the returned associations.

2.5.15. The roletypes axis

The roletypes axis represents the relationship between an association item and its role types. The optional type argument is supported only for backward navigation.

If the current node is an association item, the forward navigation returns all topic types acting as role types of the association. If the current node is a topic type, all association items of that type are extracted first.

If the current node is a topic type, the backward navigation returns all association items using the topic type as a role type. The optional type has no relevance.

Note
In addition to the current draft, some implementations of the query language support an optional type argument. If the optional type is used, the backward navigation returns only association items of the given association type.

2.5.16. The traverse axis

The traverse axis represents the relationship between topic items that play roles in the same association items. The type argument can be used in both directions.

If the current node is a topic item, the forward navigation returns all topic items connected to the current node by playing a role in at least one common association item. If the optional type is used, only topic items connected through an association item of the optional type are returned.

Note
The non-canonical level defines a shortcut. The shortcut <-> returns all connected topic items. The optional type can be used as in the canonical production.
Note
The result can contain a topic item multiple times if it is connected through more than one association item.

If the current node is an association item, the backward navigation returns all association items connected to it through a common playing topic item. The optional type has no relevance.

Note
The result can contain an association item multiple times if it is connected through more than one topic item.
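The forward navigation of the traverse axis, including the duplicates mentioned in the notes, can be sketched on a simplified model. Representing an association as a dictionary with a type and a flat list of players is an assumption for this example; roles are left out, and the function name traverse is hypothetical.

```python
def traverse(topic, associations, assoc_type=None):
    """Return the topics that share at least one association with the given topic."""
    connected = []
    for association in associations:
        if assoc_type is not None and association["type"] != assoc_type:
            continue
        if topic in association["players"]:
            # Duplicates are kept on purpose: one entry per connecting association.
            connected.extend(p for p in association["players"] if p != topic)
    return connected
```

If two topics are connected through two associations, the other topic appears twice in the result, which corresponds to the behaviour described in the notes.
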

2.5.17. The scope axis

The scope axis represents the relationship between an association item or a characteristic item and its scoping topics. The optional type is not supported.

If the current node is an association item or a characteristic item, the forward navigation returns all themes of the scope of that construct.

If the current node is a topic item, the backward navigation returns all association items and characteristic items containing the topic item as a theme of their scope.

2.5.18. The reifier axis

The reifier axis represents the relationship between an association item or a characteristic item and its reifying topic item. The optional type has no relevance.

If the current node is an association item or a characteristic item, the forward navigation returns the topic item used as reifier of the current construct.

If the current node is a topic item, the backward navigation returns the association item or characteristic item using the topic item as reifier.

2.6. The Environment

The draft of topic maps query language specifies a predefined environment represented as a topic map. The predefined environment contains a set of predefined functions, a set of operators, a set of predefined topic references, a set of prefixes and additional ontology information.

2.6.1. Predefined prefixes

In the context of a TMQL query a topic is represented by a subject-identifier, subject-locator or item-identifier. Each of these identifiers is an IRI given as a string, which has to be known by the underlying topic maps engine. Within one model, most identifiers of a topic map share the same IRI prefix and differ only in a short part at the end. A query that uses several topics would therefore have to repeat many long identifiers that differ only in their final part. The solution is to define a number of prefixes and to use relative IRIs instead of absolute ones.

The current draft of the topic maps query language predefines some prefixes, which can be used without being declared explicitly. The predefined environment contains the following prefixes.

prefix literal   absolute IRI                                   description
tm               http://psi.topicmaps.org/iso13250/model/       This is the namespace for the concepts defined by TMDM (via the TMDM/TMRM mapping).
xsd              http://www.w3.org/2001/XMLSchema#              This is the namespace for the XML Schema Datatypes.
tmql             http://psi.topicmaps.org/tmql/1.0/             Under this prefix the concepts of TMQL itself are located.
fn               http://psi.topicmaps.org/tmql/1.0/functions/   Under this prefix user-callable functions of the predefined TMQL environment are located.
dc               http://purl.org/dc/terms/                      Under this prefix Dublin Core elements are located.
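How a prefixed reference maps to an absolute IRI can be shown with the predefined prefixes listed above. The function name expand and the handling of unknown prefixes (the input is returned unchanged) are assumptions of this sketch, not behaviour documented for the engine.

```python
# Predefined prefixes as listed in the table of the predefined environment.
PREFIXES = {
    "tm": "http://psi.topicmaps.org/iso13250/model/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "tmql": "http://psi.topicmaps.org/tmql/1.0/",
    "fn": "http://psi.topicmaps.org/tmql/1.0/functions/",
    "dc": "http://purl.org/dc/terms/",
}


def expand(reference):
    """Replace a known prefix by its absolute IRI; leave other input untouched."""
    prefix, separator, local_part = reference.partition(":")
    if separator and prefix in PREFIXES:
        return PREFIXES[prefix] + local_part
    return reference
```

For example, fn:string-concat expands to http://psi.topicmaps.org/tmql/1.0/functions/string-concat, while an absolute IRI such as http://psi.ontopedia.net/Puccini is returned as it is because http is not a registered prefix.
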

2.6.2. Predefined functions and operators

Similar to other query languages such as SQL, the current draft specifies a number of functions that can be used to transform tuples or sequences. Each function is represented by a topic in the environment topic map of the runtime container and can be used in a TMQL query like any other topic reference. In addition, TMQL defines an expression type called function-invocation to call a function with a list of arguments. A function is addressed by a topic item reference, followed by a tuple-expression that defines the parameter list passed to the function interpreter.

string-concat

item-identifier    http://psi.topicmaps.org/tmql/1.0/functions/string-concat
symbolic pattern
profile            fn:string-concat (a : string, b : string) return string
precedence         2

The function string-concat concatenates strings. It expects exactly two arguments, given by the tuple-expression that follows. Each argument can be a single string or a set of strings, and the result is accordingly a single string or a set of strings. If both arguments a and b are single strings, the function returns the concatenation of a and b. If one of the arguments is a set of strings, the function returns a set containing the concatenation of the single string with each string of the set. If both arguments are sets, the function returns every combination of a string from the first set with a string from the second set.

fn:string-concat ( a => [ "foo" ] , b => [ "bar" ] )
=> "foobar"

fn:string-concat ( a => [ "foo" , "main" ] , b => [ "bar" ] )
=> [ "foobar" , "mainbar" ]

fn:string-concat ( a => [ "foo" , "main" ] , b => [ "bar" , "menu" ] )
=> [ "foobar" , "mainbar" , "foomenu" , "mainmenu" ]
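
The combination rules of string-concat can be mimicked in a few lines of Python. This is a sketch under assumptions: a Python str stands for a single string, a list stands for a set of strings, and the order of the results is chosen to follow the last example above. It is not the implementation used by the engine.

```python
def string_concat(a, b):
    """Concatenate two strings, or every pairing if either argument is a list."""
    a_is_set = isinstance(a, (list, tuple))
    b_is_set = isinstance(b, (list, tuple))
    if not a_is_set and not b_is_set:
        return a + b
    left = list(a) if a_is_set else [a]
    right = list(b) if b_is_set else [b]
    # Iterate over the second argument in the outer loop to match the example order.
    return [x + y for y in right for x in left]
```

Calling string_concat(["foo", "main"], ["bar", "menu"]) produces the four combinations in the same order as the last example.
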

string-length

item-identifier    http://psi.topicmaps.org/tmql/1.0/functions/string-length
profile            fn:length (s : string) return integer

The function length returns the size of a string literal by counting its characters. It expects exactly one argument, which can be a single string or a sequence of strings. If the argument is a single string, the function returns a single integer value. If the argument is a set of strings, it returns a sequence of integer values.

fn:length ( s => [ "foo" ]  )
=> 3

fn:length ( s => [ "foo" , "main" ] )
=> [ 3 , 4 ]

string-less-than

item-identifier    http://psi.topicmaps.org/tmql/1.0/functions/string-less-than
symbolic pattern   <
profile            fn:string-lt (a : string, b : string) return tuple-sequence
precedence         5

The function string-lt compares two string literals and returns the string literal given by the first argument only if it is lexicographically lower than the second string. The function expects two arguments, which have to be of type string. The first argument can also be a sequence of strings. If the first argument is a set of strings, the function returns a set containing those strings that are lexicographically lower than the second argument. If the first argument is a single string, the function returns it if it is lexicographically lower than the second one; otherwise it returns an empty set. If the second argument is a set of strings, only its first element is used.

1:      fn:string-lt ( a => [ "a" ] , b => [ "aaa" ] )
2:      => "a"
3:
4:      fn:string-lt ( a => [ "a" , "b" ] , b => [ "aaa" ] )
5:      => "a"
6:
7:      fn:string-lt ( a => [ "a" , "b" ] , b => [ "aaa" , "bbb" ] )
8:      => "a"
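
The filtering behavior of the string comparison functions can be sketched in Python. This is an illustration only, not the engine's implementation: the helper name string_filter is hypothetical, Python's built-in string ordering stands in for the lexicographic order, and the sketch always returns a list, whereas the examples above show a single match as a plain string.

```python
import operator


def string_filter(a, b, compare):
    """Keep every string of a for which compare(string, reference) holds."""
    candidates = [a] if isinstance(a, str) else list(a)
    # If the second argument is a sequence, only its first string is used.
    reference = b if isinstance(b, str) else b[0]
    return [s for s in candidates if compare(s, reference)]


# string-lt corresponds to operator.lt; other comparisons swap the operator.
print(string_filter(["a", "b"], ["aaa", "bbb"], operator.lt))
print(string_filter(["a", "b"], ["aaa"], operator.ge))
```
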
string-less-equal-than

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/string-less-equal-than

symbolic pattern

<=

profile

fn:string-leq (a : string, b : string) return tuple-sequence

precedence

5

The function string-leq compares two string literals and returns the string literal given by the first argument only if it is lexicographically lower than or equal to the second string. The function expects two arguments, which have to be of type string. The first argument can also be a sequence of strings. The behavior of the function depends on the given argument types. If the first argument is a set of strings, the function returns a set containing every string that is lexicographically lower than or equal to the second argument. If the first argument is a string, the function returns that string if it is lexicographically lower than or equal to the second one; otherwise it returns an empty set. If the second argument is a set of strings, only its first string is used.

    fn:string-leq ( a => [ "a" ] , b => [ "aaa" ] )
    => "a"

    fn:string-leq ( a => [ "a" , "b" ] , b => [ "aaa" ] )
    => "a"

    fn:string-leq ( a => [ "a" , "b" ] , b => [ "aaa" , "bbb" ] )
    => "a"
string-greater-equal-than

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/string-greater-equal-than

symbolic pattern

>=

profile

fn:string-geq (a : string, b : string) return tuple-sequence

precedence

5

The function string-geq compares two string literals and returns the string literal given by the first argument only if it is lexicographically greater than or equal to the second string. The function expects two arguments, which have to be of type string. The first argument can also be a sequence of strings. The behavior of the function depends on the given argument types. If the first argument is a set of strings, the function returns a set containing every string that is lexicographically greater than or equal to the second argument. If the first argument is a string, the function returns that string if it is lexicographically greater than or equal to the second one; otherwise it returns an empty set. If the second argument is a set of strings, only its first string is used.

    fn:string-geq ( a => [ "a" ] , b => [ "aaa" ] )
    => [ ]

    fn:string-geq ( a => [ "a" , "b" ] , b => [ "aaa" ] )
    => "b"

    fn:string-geq ( a => [ "a" , "b" ] , b => [ "aaa" , "bbb" ] )
    => "b"
string-greater-than

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/string-greater-than

symbolic pattern

>

profile

fn:string-gt (a : string, b : string) return tuple-sequence

precedence

5

The function string-gt compares two string literals and returns the string literal given by the first argument only if it is lexicographically greater than the second string. The function expects two arguments, which have to be of type string. The first argument can also be a sequence of strings. The behavior of the function depends on the given argument types. If the first argument is a set of strings, the function returns a set containing every string that is lexicographically greater than the second argument. If the first argument is a string, the function returns that string if it is lexicographically greater than the second one; otherwise it returns an empty set. If the second argument is a set of strings, only its first string is used.

    fn:string-gt ( a => [ "a" ] , b => [ "aaa" ] )
    => [ ]

    fn:string-gt ( a => [ "a" , "b" ] , b => [ "aaa" ] )
    => "b"

    fn:string-gt ( a => [ "a" , "b" ] , b => [ "aaa" , "bbb" ] )
    => "b"
string-regexp-match

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/string-regexp-match

symbolic pattern

=~

profile

fn:regexp (s : string, re : string) return tuple-sequence

precedence

5

The function regexp checks whether the given string argument matches the regular expression. The function expects exactly two arguments, each of which can be a string or a simple sequence of strings. The behavior of the function depends on the given argument types. If the first argument is a simple string, the result is the string itself if it matches, or an empty sequence if it does not. If the first argument is a set of strings, the function returns a set of all matching strings. If the second argument is a set of strings, only the first one is used.

    fn:regexp ( a => [ "aaa" ] , b => [ "[a]+" ] )
    => "aaa"

    fn:regexp ( a => [ "aaa" , "bbb" ] , b => [ "[a]+" ] )
    => "aaa"

    fn:regexp ( a => [ "aaa" , "bbb" ] , b => [ "[a]+" , "[b]+" ] )
    => "aaa"
substring

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/substring

profile

fn:substring (s : string, f : integer , t : integer ) return string

precedence

5

The function substring returns a substring of the given string argument, addressed by the given indexes. The function expects exactly three arguments: a string and two integers. A sequence of strings is also supported as the first argument. The behavior of the function depends on the given arguments. If the first argument is a string, the function returns a single string representing the substring of the first argument. If the first argument is a sequence, it returns a sequence of substrings. If any of the indexes is out of bounds, the function clamps it to the nearest valid value within the range of the string. The string indexes are zero-based and the upper index is excluded. As the examples show, indexes given as numeric strings are converted to integers, while a non-numeric index raises an error.

    fn:substring ( s => [ "Java-based engine" ] , f => 1 , t => 5 )
    => "ava-"

    fn:substring ( s => [ "Java-based engine" , "foo" ] , f => 1 , t => 5 )
    => [ "ava-" , "oo" ]

    fn:substring ( s => [ "Java-based engine" , "foo" ] , f => -1 , t => 50 )
    =>  [ "Java-based engine" , "foo" ]

    fn:substring ( s => [ "Java-based engine" , "foo" ] , f => "1" , t => "5" )
    =>  [ "ava-" , "oo" ]

    fn:substring ( s => [ "Java-based engine" , "foo" ] , f => "a" , t => "5" )
    =>  raises an error
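
The index handling of substring (zero-based, upper index excluded, out-of-range indexes adjusted to the string bounds) can be sketched in Python. The function below is a hypothetical illustration, not the tmql4j implementation; using a ValueError for non-numeric indexes is an assumption about how the documented error might be modeled.

```python
def substring(s, f, t):
    """Sketch of substring for a string or a list of strings."""
    # Numeric strings are converted; int() raises ValueError otherwise.
    low, high = int(f), int(t)

    def cut(text):
        # Adjust both indexes to the valid range of this string.
        start = max(0, min(low, len(text)))
        end = max(start, min(high, len(text)))
        return text[start:end]

    if isinstance(s, str):
        return cut(s)
    return [cut(text) for text in s]


print(substring(["Java-based engine", "foo"], -1, 50))
```
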
has-datatype

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/has-datatype

profile

fn:has-datatype (s: tuple-sequence) return tuple-sequence

The function has-datatype retrieves the data type for each tuple element in each tuple. The function expects exactly one argument, which has to be a tuple or a tuple sequence. The behavior of the function depends on the given arguments. If the contained element is a name item, the data type is string; for an occurrence item it is the internal data type, and for an atom it is the data type of the atom itself. Any other item results in the data type any. Each data type is an IRI.

    fn:has-datatype ( s => [ "http://tmql4j.topicmapslab.de"^^xsd:anyURI , "aaa" , 5 ] )
    =>  [ "xsd:anyURI" , "xsd:string" , "xsd:integer" ]
has-variant

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/has-variant

profile

fn:has-variant (s: tuple-sequence, s: item-reference) return tuple-sequence

The function has-variant is only supported for topic name items. It expects exactly two arguments: a tuple sequence and a topic reference. For each tuple element in each tuple it retrieves the variant for the given scope. For name items the result is the variant value, if one exists; otherwise it is undef. For all other items the function always returns undef.

slice

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/slice

profile

fn:slice (s : tuple-sequence, low : integer, high : integer) return tuple-sequence

The function slice operates on tuple sequences. It returns the tuples of the sequence whose index lies between the given arguments. The indexes are zero-based and the upper index is excluded. If the indexes are given as strings, they are converted to integer values automatically. The function is used by the select expression to realize the keywords LIMIT and OFFSET. If the integer values are negative or invalid, an error is raised. If the indexes are out of bounds, the function returns those tuples whose index lies within both the tuple sequence and the given range.

    fn:slice  ( s => [ "a" , "b" , "c" , "d" ] , low => 1 , high => 2 )
    => [ "b" ]

    fn:slice  ( s => [ "a" , "b" , "c" , "d" ] , low => 3 , high => 10 )
    => [ "d" ]

    fn:slice  ( s => [ "a" , "b" , "c" , "d" ] , low => "a" , high => "-1" )
    =>  raises an error
count

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/count

profile

fn:count (s : tuple-sequence) return integer

The function count returns the number of tuples in the tuple sequence. The function expects one argument, which should be a tuple sequence or an atom. The behavior of the function depends on the given argument. If an atom is given, the function always returns 1. If a tuple sequence is given, it returns the number of contained items.

    fn:count  ( s => [ "a" , "b" , "c" , "d" ]  )
    => 4

    fn:count  ( s => "b" )
    => 1

    fn:count  ( s => [ ] )
    =>  0
uniq

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/uniq

profile

fn:uniq (s : tuple-sequence) return tuple-sequence

The function uniq removes duplicates from the given tuple sequence. According to our interpretation of the current draft, a tuple sequence can contain the same tuple multiple times. The function removes all repeated occurrences of the same tuple in the given sequence. The function expects exactly one tuple sequence as its argument. It is used by the select expression to realize the keyword UNIQUE. As a result, the indexes of the tuples in the tuple sequence may change.

    fn:uniq  ( s => [ "a" , "b" , "a" , "b" ]  )
    => [ "a" , "b" ]
concat

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/concat

symbolic pattern

UNION

profile

fn:concat (s : tuple-sequence, t : tuple-sequence) return tuple-sequence

precedence

1

The function concat combines two tuple sequences into a new one. It adds all tuples of the second sequence to the first sequence. If one of the arguments is an atom instead of a sequence, a new sequence is created and the atom is added to it. The ordering is preserved during the combination.

    fn:concat  ( s => [ "a" , "b" , "a" , "b" ]  , t => "a" )
    => [ "a" , "b" , "a" , "b" , "a" ]

    fn:concat  ( s => "a"  , t => "b" )
    => [ "a" , "b" ]

    fn:concat  ( s => [ "a" , "b" , "a" , "b" ]  , t =>  [ "a"  , "c" ] )
    => [ "a" , "b" , "a" , "b" , "a" , "c" ]
except

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/except

symbolic pattern

MINUS

profile

fn:except (s : tuple-sequence, t : tuple-sequence) return tuple-sequence

precedence

1

The function except produces a tuple sequence in which all tuples that appear in t are removed from s. If one of the arguments is an atom instead of a sequence, a new sequence is created and the atom is added to it. The ordering is not affected. If the first tuple sequence contains an element of the second one multiple times, all occurrences of this element are removed.

    fn:except  ( s => [ "a" , "b" , "a" , "b" ]  , t => "a" )
    => [ "b" , "b" ]

    fn:except  ( s => "a"  , t => "b" )
    => [ "a" ]

    fn:except  ( s => [ "a" , "b" , "a" , "b" ]  , t =>  [ "a"  , "c" ] )
    => [ "b" , "b" ]
compare

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/compare

symbolic pattern

INTERSECT

profile

fn:compare (s : tuple-sequence, t : tuple-sequence) return tuple-sequence

precedence

1

The function compare produces a tuple sequence of all tuples which appear in both s and t. Any tuple of the sequence s which is not contained in t is removed from s. If one of the arguments is an atom instead of a sequence, a new sequence is created and the atom is added to it. The ordering is not affected. The number of occurrences of a tuple element is not considered.

    fn:compare  ( s => [ "a" , "b" , "a" , "b" ]  , t => "a" )
    => [ "a" , "a" ]

    fn:compare  ( s => "a"  , t => "b" )
    => [ ]

    fn:compare  ( s => [ "a" , "b" , "a" , "b" ]  , t =>  [ "a"  , "c" ] )
    => [ "a" , "a" ]
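
The way except and compare treat duplicates and atoms can be sketched in Python. The names as_sequence, tuple_except and tuple_compare are hypothetical and only illustrate the described semantics with plain lists; they are not part of the engine.

```python
def as_sequence(value):
    """Wrap an atom into a one-element sequence, as described for atoms."""
    return [value] if isinstance(value, str) else list(value)


def tuple_except(s, t):
    """Remove every occurrence of the elements of t from s, keeping order."""
    excluded = set(as_sequence(t))
    return [item for item in as_sequence(s) if item not in excluded]


def tuple_compare(s, t):
    """Keep the elements of s that also occur in t, duplicates included."""
    allowed = set(as_sequence(t))
    return [item for item in as_sequence(s) if item in allowed]


print(tuple_except(["a", "b", "a", "b"], ["a", "c"]))
print(tuple_compare(["a", "b", "a", "b"], ["a", "c"]))
```

Both functions walk the first sequence only, which is why its ordering and its duplicates survive in the result.
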
zigzag

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/zigzag

profile

fn:zigzag (s : tuple-sequence) return tuple-sequence

The function zigzag is the reverse function of zagzig and returns a single tuple filled with all values from all tuples. The function can be used to speed up processing, because access by index within a tuple is faster than access by index within a tuple sequence.

    fn:zigzag  ( s => [ [ "a" , "b" ] , [ "a" , "b" ] ] )
    => [ "a" , "b" , "a" , "b" ]

    fn:zigzag  ( s => [ "a" , [ "a" , "b" ] ] )
    => [ "a" , "a" , "b" ]
zagzig

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/zagzig

profile

fn:zagzig (s : tuple-sequence) return tuple-sequence

The function zagzig is the reverse function of zigzag and returns a tuple sequence filled with singleton tuples, each containing one element of the original tuple. The index of a singleton tuple within the sequence is the same as the index of its item within the original tuple.

    fn:zagzig  ( s =>  [ "a" , "b" , "a" , "b" ] )
    => ( [ "a" ] , [ "b" ] , [ "a" ] , [ "b" ] )
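
The relation between zigzag and zagzig can be sketched in Python with nested lists standing in for tuples and tuple sequences. The function names are hypothetical and the sketch only mirrors the descriptions above.

```python
def zigzag(sequence):
    """Flatten a sequence of tuples (or atoms) into one single tuple."""
    flat = []
    for entry in sequence:
        if isinstance(entry, list):
            flat.extend(entry)
        else:
            flat.append(entry)
    return flat


def zagzig(single_tuple):
    """Turn one tuple into a sequence of singleton tuples, keeping the order."""
    return [[item] for item in single_tuple]


print(zigzag(["a", ["a", "b"]]))
print(zagzig(["a", "b", "a", "b"]))
```
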
url-decode

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/url-decode

profile

fn:url-decode (s : string) return string

The function url-decode decodes the given URL reference. The result will be the reference without any encoded characters.

    fn:url-decode  ( s =>  "http://psi.example.org/Hello%20World" )
    => ( "http://psi.example.org/Hello World" )
url-encode

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/url-encode

profile

fn:url-encode (s : string) return string

The function url-encode encodes the given string literal as a URI reference and escapes all characters that are forbidden by the URI syntax.

    fn:url-encode  ( s =>  "http://psi.example.org/Hello World" )
    => ( "http://psi.example.org/Hello%20World" )
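
For comparison, the same round trip can be reproduced with the Python standard library. This is only an analogy: the wrapper names are hypothetical, and the set of characters the engine leaves unescaped is assumed here to include ':' and '/'.

```python
from urllib.parse import quote, unquote


def url_encode(s):
    # Assumption: ':' and '/' stay unescaped so the IRI structure survives.
    return quote(s, safe=":/")


def url_decode(s):
    return unquote(s)


print(url_encode("http://psi.example.org/Hello World"))
print(url_decode("http://psi.example.org/Hello%20World"))
```
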
fn:min

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/min

profile

fn:min (e : simple-content, i : integers) return integer

The function min is an aggregate function which returns the minimum value of the second argument list. The first argument is interpreted as the context of the second one. The second argument can be an expression that returns a set of integers.

    fn:min  ( e =>  // tm:subject  , i => fn:count ( . / tm:name ) )
    => ( ... the minimum number of names a topic has )

The . in the second argument i is bound to each value returned by the e argument. In this case all topics are selected, and the second expression counts the number of names of each topic instance. The smallest number is returned.

fn:max

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/max

profile

fn:max (e : simple-content, i : integers) return integer

The function max is an aggregate function which returns the maximum value of the second argument list. The first argument is interpreted as the context of the second one. The second argument can be an expression that returns a set of integers.

    fn:max  ( e =>  // tm:subject  , i => fn:count ( . / tm:name ) )
    => ( ... the maximum number of names a topic has )

The . in the second argument i is bound to each value returned by the e argument. In this case all topics are selected, and the second expression counts the number of names of each topic instance. The largest number is returned.
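
The interplay of the context argument e and the expression argument i can be sketched in Python. The data and all names below are invented for illustration: a dictionary maps hypothetical topic identifiers to their names, and the lambda plays the role of fn:count ( . / tm:name ).

```python
def aggregate(context, expression, pick):
    """Evaluate expression once per context item and pick one value."""
    return pick(expression(item) for item in context)


# Hypothetical sample data: topic identifier -> list of topic names.
names_of = {
    "topic-1": ["Alpha"],
    "topic-2": ["Beta", "B", "Bee"],
    "topic-3": [],
}

fewest = aggregate(names_of, lambda topic: len(names_of[topic]), min)
most = aggregate(names_of, lambda topic: len(names_of[topic]), max)
print(fewest, most)
```
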

2.6.3. Predefined topic references

In line with the topic maps meta model, there are some predefined topic references which can be used as topic references in a query. The following table contains all predefined references and a short description of their meaning. In this context the prefix tm is used for the fully qualified IRI http://psi.topicmaps.org/iso13250/model/.

topic reference

description

tm:subject

Representing the topic-type of the TMDM. Every topic item and topic type is an instance of this topic type. Can be used as a wildcard for undefined topic actors.

tm:name

Representing the name-type of the TMDM. Every name item is an instance of this type. Can be used to filter characteristics representing a name item in the meaning of the TMDM.

tm:occurrence

Representing the occurrence-type of the TMDM. Every occurrence item is an instance of this type. Can be used to filter characteristics representing an occurrence item in the meaning of the TMDM.

tm:subclass-of

Representing the type of the supertype-subtype associations of the TMDM. Can be used as association type.

tm:subclass

Representing the type of the subtype role of the TMDM. Can be used as role type.

tm:superclass

Representing the type of the supertype role of the TMDM. Can be used as role type.

tm:type-instance

Representing the type of the type-instance associations of the TMDM. Can be used as association type.

tm:instance

Representing the type of the instance role of the TMDM. Can be used as role type.

tm:type

Representing the type of the type role of the TMDM. Can be used as role type.

Please note: The prefixes tm and tmdm are similar to each other.

2.6.4. Environment clause

To change the environment in the context of one query, the current draft supports a special expression type that adds information to control the querying process. An environment clause can be added at the beginning of a query to define additional environment information. The production looks like the following one.

    ENVIRONMENT-CLAUSE      ::= { DIRECTIVE | PRAGMA }
    DIRECTIVE               ::= PREFIX
    PREFIX                  ::= '%prefix' IDENTIFIER QIRI
    PRAGMA                  ::= '%pragma' IDENTIFIER QIRI

A prefix directive can be used to define a prefix that is only valid for the following query. A prefix is defined by a triple containing the keyword %prefix, the prefix identifier and the fully qualified IRI of the prefix.
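
How a prefix directive could be resolved can be sketched in Python. This is a simplified illustration and not the parser of the engine: the helper names are hypothetical, the directive is assumed to stand on its own line, and the IRI is assumed to be written without surrounding delimiters.

```python
def parse_prefixes(query):
    """Collect '%prefix IDENTIFIER IRI' triples from the lines of a query."""
    prefixes = {}
    for line in query.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "%prefix":
            prefixes[parts[1]] = parts[2]
    return prefixes


def expand(reference, prefixes):
    """Replace a known prefix by its IRI; leave other references untouched."""
    prefix, separator, local = reference.partition(":")
    if separator and prefix in prefixes:
        return prefixes[prefix] + local
    return reference


query = "%prefix ex http://psi.example.org/\n// ex:person"
print(expand("ex:person", parse_prefixes(query)))
```
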

A pragma definition contains additional information to control the querying process.

Pragma

Currently the following pragmas are supported:

name

values

description

taxonometry

tm:transitive or tm:intransitive

Describing the transitivity of the type hierarchy

datatype-binding

true or false

Indicates whether the datatype should be kept when the value of a variant or occurrence is modified. If the value is false, the datatype is set to xsd:string.

datatype-validation

true or false

Indicates whether the value should be validated against the datatype when a variant or occurrence is modified.

template

any template-def

Defines the anonymous template.

2.7. Variables and bindings

During the querying process variables are bound to different values. The processor binds each variable to the values contained in its binding definition. During the iteration each value is checked for satisfaction and validity.

A variable has to start with the prefix $ followed by a number of alphanumeric or underscore characters. The variable can be postfixed by a number of primes. If two variables have the same name except for the number of postfixed primes, the processor interprets them as a protected variable subset. Protected variables cannot be bound to the same value in one iteration.
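
The naming rule and the protection of primed variables can be sketched in Python. The regular expression is one possible reading of the rule above, and the helper names are hypothetical; the sketch is not taken from the engine.

```python
import re

# Assumed reading: '$', then letters, digits or underscores, then primes.
VARIABLE = re.compile(r"^\$([A-Za-z0-9_]+)('*)$")


def base_name(variable):
    """Return the variable name without the prefix and without primes."""
    match = VARIABLE.match(variable)
    if match is None:
        raise ValueError("not a valid variable: " + variable)
    return match.group(1)


def is_valid_binding(bindings):
    """Variables sharing a base name must not share a value."""
    seen = set()
    for variable, value in bindings.items():
        key = (base_name(variable), value)
        if key in seen:
            return False
        seen.add(key)
    return True


print(is_valid_binding({"$p": "topic-a", "$p'": "topic-b"}))
print(is_valid_binding({"$p": "topic-a", "$p'": "topic-a"}))
```
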

The possible bindings of a variable are defined by a variable binding definition called variable-assignment. A binding set is a special expression type containing a number of variable assignments. The productions look like the following.

    BINDING-SET             ::= < VARIABLE-ASSIGNMENT >
    VARIABLE-ASSIGNMENT     ::= VARIABLE 'IN' CONTENT

The content expression returns a set of topic map items which can be bound to the variable identified by its name.

2.7.1. Predefined variables

The current draft contains a number of predefined variables which will be reserved by the processor. The following table contains all predefined variables of the TMQL draft.

variable

description

%_

Representing the queried topic map itself.

$#

Representing the current index of the tuple in the context of the tuple sequence. The index can be used to filter tuples by their index.

$[0-9]+

Representing the tuples of a tuple sequence, similar to the variable $#. The variable $0 is bound to the tuple at the first index, $1 to the tuple at the second, and so on.

All these variables are read-only, except the anonymous variable. The anonymous variable is write-only and can be used if its binding is not needed for the query result. Anonymous variables simply act like wildcards.

2.8. Boolean expressions and boolean primitives

To define conditions which have to be satisfied by the items contained in the result set, the draft supports boolean expressions representing combinations of boolean primitives. A boolean primitive represents an atomic condition, for example a type or scope match.

In the context of a query, boolean expressions and boolean primitives can be used at different positions, for example as filter definitions after navigation steps or as part of where-clauses. For more information please take a look at the corresponding chapter.

There are different types of boolean primitives and boolean combinations. This chapter gives a short overview of boolean primitives and their combination using boolean expressions.

2.8.1. Instance-of expressions

The instance-of expression or ISA-expression belongs to the non-canonical level of the query language and describes the type-instance relation between a topic type and an instance represented by a topic item. The expression should be used to bind variables only to instances of a specific type. The production of an instance-of expression is quite simple and looks like the following definition.

    ISA-EXPRESSION          ::= VALUE-EXPRESSION 'ISA' VALUE-EXPRESSION
    VALUE-EXPRESSION        ::= VARIABLE | TOPIC-REF | ATOM

An instance-of expression is symbolized by the keyword ISA, which is reserved for this expression type. The value-expression in front of the keyword represents the instance. Normally it only contains a variable which will be bound to different topic map items. The second value-expression represents the topic type, given by a topic reference or a navigation expression. The expression returns true only if the value-expression in front of the keyword returns a topic item that is an instance of the topic type returned by the second value-expression. If the first value-expression is a variable, the set of possible variable bindings is reduced to all instances of the topic type returned by the second value-expression.

Note
The production is only valid on the non-canonical level. The expression can be replaced by its canonical counterpart representing a predicate-invocation with the predefined type of a type-instance association and the predefined role types.

2.8.2. Kind-of expressions

The kind-of expression or AKO-expression belongs to the non-canonical level of the query language and describes the supertype-subtype relation between a topic type and its supertypes. The expression should be used to bind variables only to subtypes of a specific type. The production of a kind-of expression is quite simple and looks like the following definition.

    AKO-EXPRESSION          ::= VALUE-EXPRESSION 'AKO' VALUE-EXPRESSION
    VALUE-EXPRESSION        ::= VARIABLE | TOPIC-REF | ATOM

A kind-of expression is symbolized by the keyword AKO, which is reserved for this expression type. The value-expression in front of the keyword represents the subtype. Normally it only contains a variable which will be bound to different topic map items. The second value-expression represents the supertype, given by a topic reference or a navigation expression. The expression returns true only if the value-expression in front of the keyword returns a topic type that is a subtype of the topic type returned by the second value-expression. If the first value-expression is a variable, the set of possible variable bindings is reduced to all subtypes of the topic type returned by the second value-expression.

Note
The production is only valid on the non-canonical level. The expression can be replaced by its canonical counterpart representing a predicate-invocation with the predefined type of a supertype-subtype association and the predefined role types.

2.8.3. Predicate invocations

One of the fundamental concepts of a topic map is the association item. An association item defines a relationship between a set of topic items and carries a defined semantic meaning. Each topic item can act as a player in the context of different role types.

The importance of association items makes it necessary to model associations as part of a query. To express relations between topic items as a condition, the topic maps query language contains a special expression type called predicate-invocation. The syntax of predicates is similar to the CTM syntax for defining associations.

    PREDICATE-INVOCATION    ::= TOPIC-REF '(' < TYPE-PLAYER-DEFINITION > ')'
    TYPE-PLAYER-DEFINITION  ::= TOPIC-REF ':' VALUE-EXPRESSION | '...'

The predicate-invocation starts with a topic reference representing the association type, followed by a set of type-player definitions, each for a specific role construct. If the association type does not exist, an error is raised by the query processor. A predicate can contain any number of role constraints, each defining a type-player combination which has to be satisfied by an association item. A role type is represented by a topic reference, and the player can be a topic reference or a variable. If the player is a variable, the condition checks whether the variable is bound to a topic playing the specific role in an association item that also satisfies the other constraints. Each constraint must be satisfied by the association item.

In contrast to the tolog query language this operation is strict, which means that the association item has to satisfy exactly the defined constraints and must not contain any other role construct. To handle it as a non-strict operation, the ellipsis has to be used at the end of the role constraints.
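
The difference between strict and non-strict matching can be sketched in Python. An association is modeled here as a plain list of (role type, player) pairs, which is an invented simplification; variables as players are not modeled, and the function name is hypothetical.

```python
def matches(association_roles, constraints, strict=True):
    """Check role constraints against the roles of one association."""
    remaining = list(association_roles)
    for constraint in constraints:
        if constraint not in remaining:
            return False
        remaining.remove(constraint)
    # Strict: no role may be left over. Non-strict models the ellipsis.
    return not remaining if strict else True


roles = [("employer", "acme"), ("employee", "alice"), ("position", "editor")]
wanted = [("employer", "acme"), ("employee", "alice")]
print(matches(roles, wanted))
print(matches(roles, wanted, strict=False))
```

With the two constraints above, the three-role association is rejected in strict mode and accepted once the ellipsis behavior is switched on.
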

2.8.4. Quantified expressions

Quantified expressions are special boolean primitives which check a boolean condition in combination with a numerical restriction on the number of satisfying items.

A quantified expression defines a subset of variable bindings which will be checked for satisfaction of the given boolean condition. The required number of satisfying items depends on the type of quantified expression and can be defined as an upper or lower bound. Quantified expressions can be split into two types: for-all clauses and exists clauses.

Forall clauses

The production of a for-all clause looks like the following.

    FORALL-CLAUSE   ::= 'EVERY' BINDING-SET 'SATISFIES' BOOLEAN-EXPRESSION
    BINDING-SET     ::= VARIABLE 'IN' CONTENT

In this context a binding set is used to define a variable binding for the specified variable, given as a literal starting with the prefix $. The keyword IN symbolizes the relation between the variable and the sequence of values the variable can be bound to, which is represented by the following content expression. The content expression can contain any query expression which creates a set of topic map constructs, like navigations. Normally the content expression only contains a simple navigation step using the instances axis.

A for-all clause only returns true if each value of the content satisfies the condition defined by the contained boolean-expression. After the execution of the for-all clause the internal variable is destroyed.

Note
The complexity of quantified expressions is much higher than that of simple expressions because of the internal iteration over a variable binding set.
Exists clauses

An exists clause can be numerically unrestricted or numerically restricted. Unrestricted exists clauses always start with the keyword SOME followed by a binding-set and a boolean-expression, similar to the for-all clause, as the following production shows.

    EXISTS-CLAUSE   ::= 'SOME' BINDING-SET 'SATISFIES' BOOLEAN-EXPRESSION
    BINDING-SET     ::= VARIABLE 'IN' CONTENT

The expression returns true if and only if at least one variable binding of the binding set satisfies the contained boolean condition. If the boolean condition does not depend on new variables, the binding-set can be removed. To avoid ambiguity, the keyword SOME then has to be replaced with the keyword EXISTS, which can also be left out.

    EXISTS-CLAUSE   ::= [ 'EXISTS' ] BOOLEAN-EXPRESSION

Restricted exists clauses define an upper or lower bound of satisfying bindings.

    EXISTS-CLAUSE   ::= 'AT' ( 'LEAST' | 'MOST' ) NUMBER BINDING-SET 'SATISFIES' BOOLEAN-EXPRESSION
    BINDING-SET     ::= VARIABLE 'IN' CONTENT

There are two syntactical forms of numerically restricted exists clauses. If the keyword LEAST is used, the expression returns true if the condition is satisfied by at least n items. The numerical bound n is represented by the NUMBER terminal. If the keyword MOST is used, the expression returns true if the condition is satisfied by at most n items.
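
The truth conditions of the quantified expressions can be summarized in a short Python sketch. The function names are hypothetical; values stands for the content of the binding set and condition for the boolean-expression.

```python
def every(values, condition):
    """EVERY ... SATISFIES: all bindings satisfy the condition."""
    return all(condition(value) for value in values)


def some(values, condition):
    """SOME ... SATISFIES: at least one binding satisfies the condition."""
    return any(condition(value) for value in values)


def at_least(n, values, condition):
    """AT LEAST n ... SATISFIES."""
    return sum(1 for value in values if condition(value)) >= n


def at_most(n, values, condition):
    """AT MOST n ... SATISFIES."""
    return sum(1 for value in values if condition(value)) <= n


lengths = [3, 4, 7]
print(every(lengths, lambda x: x > 2), at_most(1, lengths, lambda x: x > 3))
```
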

2.8.5. Boolean operators

To combine boolean primitives in the context of a boolean condition, the current draft supports boolean expressions. Boolean expressions normally represent a boolean conjunction, disjunction or negation. Sometimes they are used to parenthesize a boolean combination in order to change the evaluation order.

Boolean conjunctions are symbolized by the keyword AND and disjunctions by the keyword OR. The keyword NOT is used to create a negation of a condition.

2.9. Use Expressions

Since version 3.0.0 the tmql4j engine supports the use-expression to enable the modification of the result type and result format. The use-expression only contains two or three tokens representing the requested format of result processing.

    USE-EXPRESSION  ::= 'USE' ( 'CTM' | 'JTMQR' | ( 'TEMPLATE' literal ))

2.9.1. CTM

The token CTM in the use-expression instructs the result processor to return the results as CTM fragments.

Note
The result processor can only return topic or association CTM fragments; any other constructs will be ignored.

2.9.2. JTMQR

JTMQR is a modification of the JSON Topic Maps Notation that enables the representation of TMQL results within the JTM syntax. The syntax is quite simple.

version: "1"
seq: [
    {
        t: [
            i: { topic-map-construct },
            n: decimal or integer,
            s: string
        ],
    }
]

The overall results of the query are contained within the seq item, which contains a JSON array of result tuples t. A result tuple t is also a JSON array containing a set of cell values. A cell value can be a topic map construct represented by the i key, a numerical value represented by the n key or a string literal represented by s.
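
Reading such a result on the client side can be sketched in Python. The concrete JSON serialization used below (every cell as an object with exactly one of the keys i, n or s) is an assumption derived from the description above, and the sample document is invented; the function name is hypothetical.

```python
import json


def read_rows(document):
    """Turn a JTMQR-like document into a list of rows of cell values."""
    data = json.loads(document)
    rows = []
    for entry in data["seq"]:
        row = []
        for cell in entry["t"]:
            # Assumption: each cell carries exactly one of the keys i, n, s.
            for key in ("i", "n", "s"):
                if key in cell:
                    row.append(cell[key])
                    break
        rows.append(row)
    return rows


sample = '{"version": "1", "seq": [{"t": [{"s": "Alice"}, {"n": 3}]}]}'
print(read_rows(sample))
```
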

2.9.3. TEMPLATE

Some query engines provide the functionality of defining templates, which can then be used in later queries.

Definition of Templates

A template is identified by its name, so each name can only be used once for a template. To avoid unexpected side effects, the engine only allows an existing name to be redefined by means of a special keyword.

TEMPLATE-DEF    ::=     ( 'DEFINE' | 'REDEFINE' ) 'TEMPLATE' string-literal string-literal

The keyword DEFINE defines a new template under a name which was not registered before. If the definition uses the same name as another template, an error is raised. To avoid this error the keyword REDEFINE can be used.

The first string literal represents the name of the template. The name is needed to use the template in other queries. The template definition is represented by a simple string and can contain any content. As wildcards it can contain any string literal enclosed by ?.

DEFINE TEMPLATE "myTemplate" """<div>?name?</div>"""

The wildcard ?name? within this template will be replaced by the result column aliased with the same string literal name.
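
The replacement of wildcards can be pictured as a plain text substitution per result row. The following Python sketch is not the engine's implementation; the function name fill_template is hypothetical, and leaving a wildcard without a matching alias untouched is an assumption made for this example.

```python
import re


def fill_template(template, row):
    """Replace every ?alias? wildcard by the value of the aliased column."""
    def substitute(match):
        alias = match.group(1)
        # assumption: wildcards without a matching alias stay as they are
        return str(row.get(alias, match.group(0)))
    return re.sub(r"\?(\w+)\?", substitute, template)


template = "<div>?name?</div>"
# illustrative result rows, keyed by column alias
rows = [{"name": "Leipzig"}, {"name": "Dresden"}]
rendered = [fill_template(template, row) for row in rows]
```

Each result row yields one filled copy of the template, so two rows produce two <div> items.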

Anonymous Definition of Templates

Sometimes a template is only needed for one query or should not be registered with the query engine. For this use case the query engine supports the anonymous definition of a template within the same query whose results are processed with it. Anonymous template definitions are a special kind of pragma definition.

%pragma TEMPLATE """<div>?name?</div>"""
// tm:subject / tm:name AS "name"
USE TEMPLATE

The anonymous template does not have a name, because it is only defined within the query which declares it. The keyword TEMPLATE in the use-expression tells the result processor to use the template whose name follows the keyword. In this case the name is left out, because the anonymous template should be used.

Delete Templates

To delete a template definition from the query processor, the keyword DELETE followed by TEMPLATE and the name of the template can be used.

TEMPLATE-DEL    ::=     'DELETE' 'TEMPLATE' string-literal

Usage of Templates

Defined templates are applied through the use-expression with the keyword TEMPLATE. The keyword may be followed by the name of the template, unless the anonymous template is meant. The template has to contain wildcards enclosed by ? and named with the same literals as the aliases of the result columns.

%pragma TEMPLATE """<div>?name?</div>"""
// tm:subject / tm:name AS "name"
USE TEMPLATE

The result will contain a <div> item for each topic name. The topic name is represented by the ?name? wildcard, which will be replaced by the result column aliased with name.

2.10. Path expression

The current draft of the topic maps query language supports three different types of query expressions. Each of them is inspired by a query language from another industrial standard. One of them is the path expression style, inspired by the XPath language for XML documents.

A path expression represents a set of navigations through the abstract bidirectional topic maps graph. To realize that, the path expression contains navigations as a combination of the defined axes. By combining the navigation axes, a path expression can be used to extract information from a topic map without complex conditions on the result set. In addition, the results can be filtered by using an optional filter expression based on boolean expressions. Currently the filter can only be added at the end of the whole navigation and cannot be used after a single step over a specific axis. This will be changed as soon as possible.

2.10.1. Filter expressions

A filter expression is an optional part of a path expression which reduces the result set to the subset of items matching the given filter definition. The concept is similar to filters in XPath expressions. A filter has to be added at the end of a path expression, enclosed by square brackets.

FILTER-EXPRESSION ::= '[' BOOLEAN-EXPRESSION ']'

The filter definition can contain any boolean expression discussed in the previous chapters of this document, with the exception that no variable bindings are available in the context of the path expression.

In the context of a filter expression the dot . can be used to represent the items of the result set of the path expression. The processor iterates over the navigation result and binds each element to the dot . token, which represents the current node similar to the XPath notation.

For frequently used filters the current draft defines a set of productions on the non-canonical level which are quite simple to use. A non-canonical filter can only be used as a stand-alone expression; if the filter expression makes use of conjunctions or disjunctions, the canonical counterpart has to be used.

Type filter

A type filter simply removes all items which are not an instance of the defined type. The type is defined as a topic reference. The filter is identified by the token ^ followed by the topic reference representing the topic type.

[ ^ TOPIC-REF ]

This production is only valid on the non-canonical level. The canonical counterpart can be used in an equivalent way.

[ . >> types == TOPIC-REF ]

In contrast to the non-canonical type filter the canonical counterpart can be used in the context of boolean combinations.

Note
There is a shortcut defined on the non-canonical level representing a type filter in a shorter way. The shortcut // can be used instead of ^ and the square brackets.

Index filter

If the user is only interested in the element at a specific index, the index filter can be used to extract the element at the position defined by a numerical literal. If the index is out of bounds, the constant topic undef will be returned.

Note
By default topic maps and topic map items are unordered. The order of the items in a navigation result is arbitrary. Because of that the result of an index filter may differ between executions.

An index filter is simply represented by a numerical value. The canonical counterpart makes use of a predefined variable of the query processor representing the current iteration index. The processor simply checks whether the variable is bound to the value the user specified by the numerical literal.

[ NUMBER ]              #non-canonical production

[ $# == NUMBER ]        #canonical counterpart

Index range filter

In addition to the index filter, the user can define a lower and an upper bound which are interpreted as a subset of valid indexes. The processor will return each item with an index which is greater than or equal to the lower bound but less than the upper bound.

Note
The upper bound is always excluded, and all indices are zero-based.
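
The bounds of both index filters can be illustrated with ordinary list indexing. The following Python sketch is only a model of the described behaviour; the function names are hypothetical and None stands in for the constant topic undef.

```python
def index_filter(items, index):
    """[ NUMBER ]: the item at the given zero-based position."""
    # None stands in for undef when the index is out of bounds
    return items[index] if 0 <= index < len(items) else None


def index_range_filter(items, lower, upper):
    """[ NUMBER .. NUMBER ]: lower bound included, upper bound excluded."""
    return [item for i, item in enumerate(items) if lower <= i < upper]


# illustrative navigation result with a fixed order
names = ["a", "b", "c", "d"]
```

For the range 1 .. 3 the sketch returns the items at the indexes 1 and 2, because the upper bound itself is excluded.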

An index range filter is simply represented by two numerical values separated by the special token .. (two dots). The canonical counterpart makes use of a predefined variable of the query processor representing the current iteration index. The processor simply checks whether the variable is bound to a value contained in the defined range.

[ NUMBER .. NUMBER ]                    #non-canonical production

[ NUMBER <= $# AND $# < NUMBER ]        #canonical counterpart

Scope filter

If the user is only interested in association items or characteristic items valid in a specific scope, the scope filter can be used. A scope filter simply defines a topic item representing one theme which has to be part of the scope of the current node. If the current node is a topic item, the filter always fails. The filter is symbolized by the token @.

[ @ TOPIC-REF ]

This production is only valid on the non-canonical level. The canonical counterpart can be used in an equivalent way.

[ . >> scope == TOPIC-REF ]

In contrast to the non-canonical scope filter, the canonical counterpart can be used in the context of boolean combinations.

Note
There is a shortcut defined on the non-canonical level which represents a scope filter in a shorter way and allows the square brackets to be removed.

2.10.2. Projections

Often the user is interested in several different pieces of information related to the same node of the abstract graph, for example all names and all traversal topic players. To realize that, the current draft supports projections, which can be used instead of filter expressions. A projection is used to combine the results of a set of navigations starting at the same node of the abstract graph.

The projection is defined as a comma-separated list of navigation expressions starting at the current node using the dot (.). A projection can be added at the end of a navigation expression but cannot be combined with a filter expression. In addition, a projection must not contain filters itself.

PROJECTION ::=  '(' < VALUE-EXPRESSION > ')'

A projection definition is always enclosed by round brackets. The definition of a projection makes use of value expressions containing simple navigation expressions, which must not themselves contain filters or projections. Multiple projections are separated by a comma ,.

2.10.3. Tuple expressions as path expressions

In addition to simple navigation expressions, a path expression can create a sequence of tuples containing more than one item by using an expression type called tuple expression. Tuple expressions are quite similar to projections because they are based on the same production. The only difference is the missing navigation in front of the projection. A tuple expression is always enclosed by round brackets and can contain an unrestricted number of navigations to extract multiple sets of navigation results. The dot . token cannot be used because of the missing context.

TUPLE-EXPRESSION ::=    '(' < VALUE-EXPRESSION > ')'

A tuple expression can be used to extract a set of topic items by using a comma-separated list of topic references.

On the other hand, tuple expressions are often used as parts of other productions like merge expressions.

Using Alias Expression

In some use cases the numerical index into the result set is not useful. In such cases it is helpful to define an alias in a value expression, which enables string-based access to the result set. An alias expression is a special value expression ending with the two tokens AS and a string literal used as the alias for this index. The index is the same as the index of the value expression within the tuple expression.

Note
If more than one alias expression uses the same string reference, an error is raised.

ALIAS ::=       VALUE-EXPRESSION 'AS' STRING-LITERAL

2.11. Select expressions

The current draft of the topic maps query language supports three different types of query expressions. Each of them is inspired by a query language from another industrial standard. One of them is the select expression style, inspired by SQL, the query language of relational databases.

The select style is represented by the production called select expression. A select expression contains a number of optional sub-expressions similar to the SQL query language.

select-expression       ::=     SELECT  < value-expression >
                                [ FROM  value-expression ]
                                [ WHERE  boolean-expression ]
                                [ GROUP BY  < $[0-9]+ > ]
                                [ ORDER BY  < value-expression > ]
                                [ UNIQUE ]
                                [ OFFSET  value-expression ]
                                [ LIMIT  value-expression ]
                                [ USE use-definition ]

Only the select clause represented by the keyword SELECT is mandatory; all other expressions can be left out.

2.11.1. Processing model

The processing model of a select expression defines the execution order of the contained expressions. The from clause is interpreted first. It defines the context of possible variable bindings for the whole select expression, that is, the subset of topic map constructs the variables can be bound to. If the from clause is missing, the context is the whole queried topic map represented by the variable %_.

After the execution of the from clause the offset and limit clauses are interpreted. The contained value expressions only define numerical literals representing the first selected index and the maximum number of selected values. The results are stored in the processing variables $_lower and $_limit.

Then all free, unbound variables of the where clause are determined. Each variable is bound, using iterations, to every possible value of the context defined by the from clause. During the processing of the where clause each variable binding tuple is checked against the defined boolean condition. The result of the where clause is an unordered sequence of tuples representing all satisfying variable bindings.

If the order-by clause is used, the unordered sequence will be ordered using the defined value expressions.

After the optional sorting of the variable bindings, the select clause is used to transform the variable bindings into the values the user is interested in. The clause normally contains a set of navigations or functions to extract exactly this information.

If there is a group-by clause, the interpreter creates a projection of the results in order to group them using the given indexes. If the clause contains all indexes, the result remains unchanged.

The keyword UNIQUE simply indicates that duplicates are removed from the result set. If the keyword is missing, the result set can contain the same tuple multiple times.

As the last execution step the bound variables $_limit and $_lower are used to extract the expected selection window from the whole result set. To realize that, the function slice is used.
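
The order of these steps can be summarized in a few lines of Python. The sketch below is a simplified model and not the engine's code: the function run_select and its parameters are hypothetical, bindings are modelled as dictionaries, and the group-by step is left out.

```python
def run_select(context, where, select, order_key=None,
               unique=False, offset=0, limit=-1):
    # from + where: keep every binding of the context satisfying the condition
    bindings = [binding for binding in context if where(binding)]
    # order-by: optional ordering of the satisfying bindings
    if order_key is not None:
        bindings = sorted(bindings, key=order_key)
    # select: transform each binding into a result tuple
    rows = [select(binding) for binding in bindings]
    # unique: drop duplicates while keeping the first occurrence
    if unique:
        rows = list(dict.fromkeys(rows))
    # offset/limit: cut the selection window, -1 meaning "no limit"
    end = None if limit < 0 else offset + limit
    return rows[offset:end]


# illustrative context: four bindings, one of them a duplicate
people = [
    {"name": "Bob", "age": 30},
    {"name": "Alice", "age": 25},
    {"name": "Carol", "age": 17},
    {"name": "Alice", "age": 25},
]
window = run_select(
    people,
    where=lambda p: p["age"] >= 18,
    select=lambda p: (p["name"],),
    order_key=lambda p: p["name"],
    unique=True,
    offset=1,
    limit=2,
)
```

The selection window is cut last, so offset and limit always refer to the ordered and de-duplicated result.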

2.11.2. Select clause

The select clause is symbolized by the keyword SELECT and contains a comma-separated list of value expressions to transform the variable bindings of the evaluation into a sequence of tuples. Normally the select clause contains navigation expressions or simple functions.

Note
The select clause must not use variables which are not used in the context of the where clause, and it must not use anonymous variables like $_, because these variables only support write access.

2.11.3. From clause

The from clause is used to define the context of the query execution. The context provides the set of possible variable bindings used by the where clause to evaluate variable bindings. The clause is optional; if it is missing, the context will be bound to the variable %_ representing the queried topic map.

Normally the clause contains a simple content combination or a navigation expression to define the context of the querying process for this select expression.

2.11.4. Where clause

The where clause defines a boolean condition to evaluate the variable bindings. The clause can contain any boolean expression discussed in the previous chapters. A where clause returns a set of satisfying variable bindings.

The select expression can only use variables bound in the context of the where clause. If the where clause is missing, the usage of any variable is invalid.

2.11.5. Group-By clause

The group-by clause is optional and controls the representation of results. Normally the results are represented as two-dimensional tables containing only atomic values within a cell. Using the group-by clause, results can be grouped, resulting in array values within a table cell.

The group-by clause allows the definition of a set of indexes by which the result should be grouped. The indexes are represented by tuple variables starting with the dollar sign $ followed by any combination of digits.

SELECT $t , $t >> indicators
WHERE $t ISA person
GROUP BY $0

The example query returns all person instances and their subject-identifiers. Without the group-by clause, the result processor returns each topic instance n times, once for each subject-identifier of the topic. Using the group-by clause there is only one row for each topic, containing an array at the second position which holds all its subject-identifiers.
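
The effect of grouping on the result table can be modelled as follows. This Python sketch is only an illustration of the described behaviour; the function name group_by and the sample rows with their example.org identifiers are invented.

```python
def group_by(rows, indexes):
    """Group rows by the given column indexes; other columns become arrays."""
    grouped = {}
    for row in rows:
        key = tuple(row[i] for i in indexes)
        # one list per column, collecting the values of all rows of the group
        cells = grouped.setdefault(key, [[] for _ in row])
        for position, value in enumerate(row):
            cells[position].append(value)
    result = []
    for cells in grouped.values():
        # grouped columns keep a single value, all others keep the array
        result.append([values[0] if position in indexes else values
                       for position, values in enumerate(cells)])
    return result


# illustrative ungrouped result: one row per subject-identifier
rows = [
    ("alice", "http://example.org/alice"),
    ("alice", "http://example.org/a"),
    ("bob", "http://example.org/bob"),
]
grouped_rows = group_by(rows, [0])
```

Grouping by index 0 collapses the two rows for the first topic into one row whose second cell is an array.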

2.11.6. Order-By clause

The order-by clause is optional; if it is missing, the sequence of variable bindings produced by the where clause remains unordered. The clause can be used to order them by a comma-separated list of value expressions, normally containing only simple navigations to get literals of internal characteristics of the bound items.

The keywords ASC and DESC can be used after every value expression to define the order direction. If the keyword is missing ASC is used as default.
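
Ordering by several value expressions with individual directions can be sketched with a stable sort. The following Python example is only a model; it assumes that earlier value expressions take precedence over later ones, and the function name order_by is hypothetical.

```python
def order_by(rows, keys):
    """Order rows by a list of (key function, direction) pairs."""
    ordered = list(rows)
    # sort by the least significant key first; the stable sort keeps
    # that order for rows which are equal under the more significant keys
    for key, direction in reversed(keys):
        ordered.sort(key=key, reverse=(direction == "DESC"))
    return ordered


# illustrative rows: (name, number of subject-identifiers)
rows = [("b", 2), ("a", 2), ("c", 1)]
ordered = order_by(rows, [(lambda r: r[1], "DESC"), (lambda r: r[0], "ASC")])
```

The rows are ordered by the second column in descending order first, and rows with equal values there are ordered by the first column in ascending order.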

2.11.7. Unique

By default the result set can contain the same tuple multiple times. Sometimes the user is only interested in the unique set of results. By adding the keyword UNIQUE the result sequence will be reduced to a sequence containing every original tuple only once.

2.11.8. Offset clause

The offset clause is optional and starts with the keyword OFFSET followed by a numerical literal. The numerical value will be bound to the system variable $_lower and will be used to define the first index of the result set to select. All tuples of the result sequence located at a lower index will be ignored and removed from the result set. If the clause is missing the offset will be 0.

2.11.9. Limit clause

The limit clause is optional and starts with the keyword LIMIT followed by a numerical literal. The numerical value will be bound to the system variable $_limit and will be used as the maximum number of selected items. If the result set contains more items than the limit allows, the surplus items will be removed from the final result set. If the limit clause is missing, the limit value will be -1, representing an unlimited selection.

2.12. Flwr expressions

The current draft of the topic maps query language supports three different types of query expressions. Each of them is inspired by a query language from another industrial standard. One of them is the flwr expression style, inspired by programming languages and by FLWOR expressions for XML documents.

The flwr style is represented by the production called flwr expression. A flwr expression contains a number of optional sub-expressions similar to a for loop of modern programming languages.

flwr-expression ::=     [ FOR  variable-assignment ]
                        [ WHERE  boolean-expression ]
                        [ GROUP BY  < $[0-9]+ > ]
                        [ ORDER BY  < value-expression > ]
                        [ UNIQUE ]
                        [ OFFSET  value-expression ]
                        [ LIMIT  value-expression ]
                        RETURN  content
                        [ USE use-definition ]

Only the return clause represented by the keyword RETURN is mandatory.

2.12.1. Processing model

First the variable assignments inside the for clauses are evaluated in lexical order. Each for clause creates a sequence of variable bindings for one specific variable, used as evaluation context for that variable in the where clause. The overall result of all for clauses is an unordered sequence of variable bindings.

After the execution of the for clauses the offset and limit clauses are interpreted. The contained value expressions only define numerical literals representing the first selected index and the maximum number of selected values. The results are stored in the processing variables $_lower and $_limit.

By adding the order-by clause the variable bindings can be ordered in the defined way. The order-by clause contains a set of value expressions defining the values used to order the variable bindings in the given context. The result is a sequence of ordered variable bindings.

All variable bindings are evaluated against the boolean expression contained in the where clause. If the clause is missing, all variable bindings of the for clauses are treated as valid.

If there is a group-by clause, the interpreter creates a projection of the results in order to group them using the given indexes. If the clause contains all indexes, the result remains unchanged.

The keyword UNIQUE simply indicates that duplicates are removed from the result set. If the keyword is missing, the result set can contain the same tuple multiple times.

Finally the return clause is applied iteratively to extract the values using the variable bindings. The return clause creates a tuple sequence from the defined content expression and the set of variable bindings.

As the last execution step the bound variables $_limit and $_lower are used to extract the expected selection window from the whole result set. To realize that, the function slice is used.

The overall result can be a sequence of tuples containing topic map constructs, CTM fragments, XML fragments, JTMQR or any template content.

2.12.2. For clause

The for clause can be used to define a set of variable bindings for one specific variable. A flwr expression can contain any number of for clauses to create variable bindings for the set of variables used in the context of the where clause.

A for clause is symbolized by the keyword FOR followed by a variable assignment for a specific variable. For more information about variables and variable bindings please take a look at the previous chapter.

2.12.3. Where clause

The where clause defines a boolean condition to evaluate the variable bindings. The clause can contain any boolean expression discussed in the previous chapters. A where clause returns a set of satisfying variable bindings.

The return expression can only use variables bound in the context of the where clause or a for clause. If both are missing, the usage of any variable is invalid.

2.12.4. Group-By clause

The group-by clause is optional and controls the representation of results. Normally the results are represented as two-dimensional tables containing only atomic values within a cell. Using the group-by clause, results can be grouped, resulting in array values within a table cell.

The group-by clause allows the definition of a set of indexes by which the result should be grouped. The indexes are represented by tuple variables starting with the dollar sign $ followed by any combination of digits.

WHERE $t ISA person
GROUP BY $0
RETURN $t , $t >> indicators

The example query returns all person instances and their subject-identifiers. Without the group-by clause, the result processor returns each topic instance n times, once for each subject-identifier of the topic. Using the group-by clause there is only one row for each topic, containing an array at the second position which holds all its subject-identifiers.

2.12.5. Order-by clause

The order-by clause is optional; if it is missing, the sequence of variable bindings produced by the where clause remains unordered. The clause can be used to order them by a comma-separated list of value expressions, normally containing only simple navigations to get literals of internal characteristics of the bound items.

The keywords ASC and DESC can be used after every value expression to define the order direction. If the keyword is missing ASC is used as default.

2.12.6. Unique

By default the result set can contain the same tuple multiple times. Sometimes the user is only interested in the unique set of results. By adding the keyword UNIQUE the result sequence will be reduced to a sequence containing every original tuple only once.

2.12.7. Offset clause

The offset clause is optional and starts with the keyword OFFSET followed by a numerical literal. The numerical value will be bound to the system variable $_lower and will be used to define the first index of the result set to select. All tuples of the result sequence located at a lower index will be ignored and removed from the result set. If the clause is missing the offset will be 0.

2.12.8. Limit clause

The limit clause is optional and starts with the keyword LIMIT followed by a numerical literal. The numerical value will be bound to the system variable $_limit and will be used as the maximum number of selected items. If the result set contains more items than the limit allows, the surplus items will be removed from the final result set. If the limit clause is missing, the limit value will be -1, representing an unlimited selection.

2.12.9. Return clause

The return clause is the only mandatory expression of the flwr style. The clause is used to transform the variable binding set evaluated by the where clause and for clauses into a sequence of tuples. Normally the return clause is applied iteratively to transform each combination of valid variable bindings into a tuple sequence. All tuple sequences are combined using the combination operator UNION.

The result of the return clause can be a sequence of different tuple types. Normally the tuples contain a set of topic map constructs represented as simple objects, but the flwr style is the only expression style that can return XML or CTM fragments too. By using special content expressions called TM-content and XML-content the return clause creates simple CTM or XML fragments.

Return XML fragments

By using an xml-content expression inside the return clause, the flwr expression returns a sequence of XML fragments representing the result of the query.

An xml-content is simply represented by XML tags in the context of the return clause, for example like the following one.

RETURN <xml> Text </xml>

The interpretation of the XML content contained in the return clause depends on its type. XML tags and simple text remain uninterpreted and are added to the final XML fragment. All whitespace characters are added in the same way.

By using embedded queries the XML fragment can use the bound variables of the for clauses and where clause or embed an independent sub-query. An embedded query expression is symbolized by the enclosing curly brackets { and }. The content of this query expression can be any expression type described in the draft, like another flwr expression or a path expression. The sub-query is interpreted in the normal way, except that it inherits the variable bindings of the enclosing expression, which means that the embedded query can use all variables bound by the enclosing expression, too.

Note
The embedded query must not overwrite the inherited variables of the enclosing flwr expression.

The result of the embedded query is interpreted and added to the XML fragment. If the embedded query returns any topic map construct, it is transformed into valid XML content using the XTM syntax. If the query returns literal values, they remain unmodified and are added to the XML fragment as simple text content.

Embedded queries can be used at every position, for example as a topic node …

# create a sequence of XML fragments containing one XTM topic node for each person topic
FOR $p IN // person
RETURN <xml> { $p } </xml>

or as an association node …

# create a sequence of XML fragments containing one XTM association node for each played association of each person topic
FOR $p IN // person
RETURN <xml> { $p << players } </xml>

or simply as text content …

# create a sequence of XML fragments containing one XML node for each person topic containing the first name as text content
FOR $p IN // person
RETURN <xml> { $p / tm:name [0] } </xml>

or as an attribute value.

# create a sequence of XML fragments containing one XML node using the first indicator of each person topic as value
FOR $p IN // person
RETURN <person ref="{ $p >> indicators >> atomify [0] }" />

Return CTM fragments

By using a tm-content expression inside the return clause, the flwr expression returns a sequence of CTM fragments representing the result of the query.

The tm-content is simply defined by the enclosing triple quotes, which delimit the CTM stream created by the return clause.

RETURN ''' Text '''

The interpretation of the CTM content contained in the return clause depends on its type. Whitespace and simple text remain uninterpreted and are added to the final CTM stream.

By using embedded queries the CTM fragment can use the bound variables of the for clauses and where clause or embed an independent sub-query. An embedded query expression is symbolized by the enclosing curly brackets { and }. The content of this query expression can be any expression type described in the draft, like another flwr expression or a path expression. The sub-query is interpreted in the normal way, except that it inherits the variable bindings of the enclosing expression, which means that the embedded query can use all variables bound by the enclosing expression, too.

Note
The embedded query must not overwrite the inherited variables of the enclosing flwr expression.

The result of the embedded query is interpreted and added to the final CTM stream. If the embedded query returns any topic map construct, it is transformed into valid CTM content using the CTM syntax. If the query returns literal values, they remain unmodified and are added to the CTM stream as simple text content.

Embedded queries can be used at every position which expects topic map content, for example as a topic …

# create a sequence of CTM fragments containing one topic block for each person topic
FOR $p IN // person
RETURN ''' { $p } '''

or as association items …

# create a sequence of CTM fragments containing one association for each played association
FOR $p IN // person
RETURN ''' { $p << players } '''

or simply as a subject-identifier.

# create a sequence of CTM fragments containing one association definition using the indicator of each person topic
FOR $p IN // person
RETURN ''' life-in ( city : leipzig , person : { $p >> indicators >> atomify [0] } ) '''

2.13. Insert expressions

Part II of the topic maps query language specification contains the modification expressions of the language. The modification part supports four different expression types representing the four operation types that modify a topic map. The insert operation is one of them and enables the creation of new topic map items. Insert operations are represented by insert expressions.

insert-expression ::= INSERT ''' ctm-stream ''' { WHERE query-expression }

The only mandatory part of an insert expression is the insert clause symbolized by the keyword INSERT. The content to be added to the current topic map instance is represented as a CTM fragment, similar to the CTM stream of a return clause.

It is possible to use embedded queries to create connections between existing content and the new items.

2.13.1. Processing model

The processing model of insert expressions is quite simple. If there is a where clause, it is interpreted in the same way as in a select expression or flwr expression. The where clause is used to evaluate a subset of possible variable bindings which should be used in the content of the insert clause. The result of the evaluation is a set of variable bindings.

Then the insert clause is interpreted for each variable binding returned by the where clause. If the CTM fragment contains an embedded query, it is interpreted and the result is added to the CTM stream. If the result is a topic map information item, it is serialized to CTM. Literals remain unmodified. Each variable binding creates one CTM fragment, which is de-serialized to the corresponding topic map fragment and added to the queried topic map.

The overall result of the query will be a singleton sequence of a singleton tuple containing the number of created items.
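
The per-binding loop described above can be pictured with a small, self-contained Java sketch. It is not tmql4j code: the class InsertSketch, the method expand and the plain string substitution that stands in for the evaluation of an embedded query are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names belong to tmql4j.
class InsertSketch {

    // Builds one CTM fragment per variable binding by replacing every
    // embedded reference "{ $name }" with the value bound to $name.
    static List<String> expand(String template, List<Map<String, String>> bindings) {
        List<String> fragments = new ArrayList<>();
        for (Map<String, String> binding : bindings) {
            String fragment = template;
            for (Map.Entry<String, String> entry : binding.entrySet()) {
                fragment = fragment.replace("{ " + entry.getKey() + " }", entry.getValue());
            }
            fragments.add(fragment);
        }
        return fragments;
    }

    public static void main(String[] args) {
        // two bindings, as a where clause such as "$p ISA person" might return them
        List<Map<String, String>> bindings = List.of(
                Map.of("$p", "puccini"),
                Map.of("$p", "verdi"));
        String template = "life-in ( city : leipzig , person : { $p } )";
        // one fragment is produced for each binding
        expand(template, bindings).forEach(System.out::println);
    }
}
```

In the real engine each fragment would then be parsed as CTM and merged into the queried topic map; the sketch stops at the text level.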

2.13.2. Insert clause

The insert clause contains only the keyword INSERT followed by a CTM stream enclosed in triple single quotes.

By using embedded queries, the CTM fragment can use the variables bound by the where clause or embed an independent sub-query. An embedded query expression is delimited by the enclosing curly braces { and }. The content of this query expression can be any expression type described in the draft, such as another flwr expression or a path expression. The sub-query is interpreted in the normal way, except that it inherits the variable bindings of the enclosing expression, which means that the embedded query can also use all variables bound by the enclosing expression.

If the insert clause creates a CTM fragment that is invalid with respect to the current CTM draft, an error is raised.

Note
The embedded query may not overwrite the inherited variables of the enclosing expression.

The result of the embedded query is interpreted and added to the final CTM stream. If the embedded query returns any topic map construct, it is transformed to valid CTM content using the CTM syntax. If the query returns literal values, they are kept unmodified and are added to the CTM stream as simple text content.

Embedded queries can be used at every position where topic map content is expected, for example as a topic …

    # create a copy of each person topic
    INSERT ''' { $p } '''
    WHERE $p ISA person

or as association items …

    # add a reification to the first association played by each person topic
    INSERT ''' { $p << players } ~ reifier '''
    WHERE $p ISA person

or simply as a subject-identifier.

    # create a new association played by each person topic
    INSERT ''' life-in ( city : leipzig , person : { $p >> indicators >> atomify [0] } ) '''
    WHERE $p ISA person

2.13.3. Where clause

The where clause defines a boolean condition used to evaluate the variable bindings. The clause can contain every boolean expression discussed in the previous chapters. A where clause returns the set of variable bindings that satisfy the condition.

The insert expression can only use variables bound in the context of the where clause. If the where clause is missing, the use of any variable is invalid.

2.14. Delete expressions

Part II of the topic maps query language specification contains the modification expressions of the language. The modification part supports four expression types, one for each of the four operations that modify a topic map. The delete operation is one of them and enables the deletion of specific elements of the topic map. Delete operations are represented by delete expressions.

    delete-expression ::= DELETE [ CASCADE ] < value-expression >
                          [ WHERE boolean-expression ]

    delete-expression ::= DELETE CASCADE ALL

The only mandatory part of a delete expression is the delete clause, which consists of the keyword DELETE and a comma-separated list of value expressions. The keyword CASCADE is an optional part of the delete clause. In addition to the delete clause, a where clause can be used to evaluate a set of variable bindings, similar to the insert expression.

The value expressions of the delete clause define exactly which information resources to delete, such as a locator object, an association item or a topic item.

2.14.1. Processing model

The processing model of a delete expression is quite simple. If there is a where clause, it is interpreted in the same way as in a select expression or flwr expression. The where clause is used to evaluate the subset of possible variable bindings that should be used in the content of the delete clause. The result of the evaluation is a set of variable bindings.

The delete clause is executed for each valid variable binding of the where clause. The delete clause contains a set of value expressions that define exactly which information resources to remove. Normally a value expression either names a variable, to remove the bound value, or uses a navigation to reach the item to remove. The keyword CASCADE selects the deletion mode of the query processor. If the keyword is missing, the processor only removes elements on which no other information resource of the same topic map depends. For example, a topic that is used as a theme cannot be removed without this keyword. If the keyword is present, every dependent item is removed too.

The overall result of the query is a singleton sequence of a singleton tuple containing the number of removed items.

2.14.2. Delete clause

The delete clause starts with the keyword DELETE and is the only non-optional part of a delete expression. A delete clause contains a comma-separated list of value expressions used to identify exactly the information resources to remove. Normally a delete clause contains a list of variables or navigations based on variable bindings.

Note
The delete clause can only contain variables which are bound by the where clause.

In addition, the delete clause can contain a set of path expressions without any variables to remove specific items. For example, if a specific topic has to be removed, the delete clause contains only a topic reference to this topic item.

    # remove the topic representing the composer puccini
    DELETE http://psi.ontopedia.net/Puccini

2.14.3. Cascade

Sometimes information items cannot be removed from a topic map because they are connected to other items in a way that does not allow them to be simply removed. For example, if a topic is used as a theme, the meaning of a statement changes when the theme is removed from its scope. The processor stops the execution if other information resources still depend on at least one of the items to remove.

If the keyword CASCADE is used, the processor removes all dependent information items too. Please take care.

If a topic item is removed, the processor removes all of its characteristic items and all played associations. If the deleted topic represents a topic type, all instances and subtypes are removed too. If the topic is a theme, the scoped items are removed too. If the topic item is used as a reifier, the reification is removed, but the reified item is kept.

If an association item is removed, the processor removes all role constructs and destroys each scoping and reification relationship to any topic item.

If a name item is removed, the processor removes all variants too and destroys each scoping and reification relationship to any topic item.

If an occurrence item is removed, the processor destroys each scoping and reification relationship to any topic item.

If a locator object is removed, the processor checks whether the construct using the identifier has to be removed too. The processor removes the identified item as well if there is no other locator identifying the construct.

If a variant item is removed, the processor destroys each scoping and reification relationship to any topic item.

Note
The process is iterative, which means, for example, that if a name is removed because its parent topic is deleted, all of its variants are removed too.

The following table summarizes the dependencies resolved by the deletion processor.

item type | removed dependencies
topic | all characteristics and their dependencies; all associations played; all instances and subtypes; all scoped constructs using this item as theme; reification destroyed
occurrence | reification destroyed
name | all variants; reification destroyed
variant | reification destroyed
association item | all role constructs; reification destroyed
locator | the construct, if there is no other locator
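
The difference between a plain and a cascading deletion can be sketched with a toy dependency graph in Java. This is only an illustration under simplifying assumptions: the class CascadeSketch, its methods and the string identifiers are hypothetical and do not mirror the tmql4j implementation, and only "is removed together with" dependencies are modelled, not the destruction of scoping or reification relationships.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model; it does not use or mirror the tmql4j API.
class CascadeSketch {
    // item -> items that have to be removed together with it
    private final Map<String, Set<String>> dependents = new HashMap<>();
    private final Set<String> items = new HashSet<>();

    void addDependency(String item, String dependent) {
        items.add(item);
        items.add(dependent);
        dependents.computeIfAbsent(item, key -> new LinkedHashSet<>()).add(dependent);
    }

    boolean contains(String item) {
        return items.contains(item);
    }

    // Removes the item and returns the number of removed items.
    // Without cascade the call is refused if anything still depends on the item.
    int delete(String item, boolean cascade) {
        if (!items.contains(item)) {
            return 0;
        }
        boolean inUse = dependents.getOrDefault(item, Set.of()).stream().anyMatch(items::contains);
        if (!cascade && inUse) {
            throw new IllegalStateException(item + " is still in use, CASCADE is required");
        }
        int removed = 0;
        Deque<String> queue = new ArrayDeque<>();
        queue.add(item);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (items.remove(current)) {
                removed++;
                // iterative step: dependents of a removed item are removed too
                queue.addAll(dependents.getOrDefault(current, Set.of()));
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        CascadeSketch map = new CascadeSketch();
        map.addDependency("topic:puccini", "name:puccini");        // characteristic of the topic
        map.addDependency("name:puccini", "variant:puccini-sort"); // variant of the name
        map.addDependency("topic:puccini", "association:born-in"); // association played by the topic
        try {
            map.delete("topic:puccini", false);
        } catch (IllegalStateException e) {
            System.out.println("without CASCADE: " + e.getMessage());
        }
        System.out.println("with CASCADE: " + map.delete("topic:puccini", true) + " items removed");
    }
}
```

The queue makes the iterative character of the process visible: the variant is only reached because its parent name was removed first.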

2.14.4. Where clause

The where clause defines a boolean condition used to evaluate the variable bindings. The clause can contain every boolean expression discussed in the previous chapters. A where clause returns the set of variable bindings that satisfy the condition.

The delete expression can only use variables bound in the context of the where clause. If the where clause is missing, the use of any variable is invalid.

2.14.5. The keyword ALL

The keyword ALL can be used instead of a value-expression in the context of a delete-clause to remove all items of the queried topic map. The keyword CASCADE is mandatory.

2.15. Merge expressions

Part II of the topic maps query language specification contains the modification expressions of the language. The modification part supports four expression types, one for each of the four operations that modify a topic map. The merge operation is one of them and enables the controlled merging of topic map items. Merge operations are represented by merge expressions.

    merge-expression ::= MERGE path-expression

    merge-expression ::= MERGE < value-expression > WHERE boolean-expression

    merge-expression ::= MERGE ALL WHERE boolean-expression

There are three different grammatical productions of a merge expression. Each merge expression starts with the keyword MERGE, which the engine uses to identify the expression type.

2.15.1. Processing model

The processing model of a merge expression is quite simple. If there is a where clause, it is interpreted in the same way as in a select expression or flwr expression. The where clause is used to evaluate the subset of possible variable bindings that should be used in the content of the merge clause. The result of the evaluation is a set of variable bindings.

The merge clause is executed for each valid variable binding of the where clause. The merge clause contains a set of value expressions or one path expression that define exactly which information resources to merge. Normally a value expression either names a variable, to merge the bound value, or uses a navigation to reach the item to merge. The keyword ALL can be used instead of contained expressions to indicate that all result items of the where clause are merged.

The overall result of the query is a singleton sequence of a singleton tuple containing the number of merged items.

2.15.2. Merge clause

The merge clause can contain a set of value expressions that only contain variables bound by the where clause. Alternatively, it can contain a path expression defining the items to merge; in this case a where clause is not supported.

All constructs are merged according to the merging rules of the TMDM.

2.15.3. Where clause

The where clause defines a boolean condition used to evaluate the variable bindings. The clause can contain every boolean expression discussed in the previous chapters. A where clause returns the set of variable bindings that satisfy the condition.

The merge expression can only use variables bound in the context of the where clause. If the where clause is missing, the use of any variable is invalid.

2.15.4. The keyword ALL

The keyword ALL can be used instead of a value-expression in the context of a merge-clause to merge all result items of the where-clause. According to the grammar above, the where-clause is mandatory in this form.

2.16. Update expressions

Part II of the topic maps query language specification contains the modification expressions of the language. The modification part supports four expression types, one for each of the four operations that modify a topic map. The update operation is one of them and enables the modification of existing topic map items. Update operations are represented by update expressions.

    update-expression ::= UPDATE < update-clause > ( WHERE boolean-expression )

    update-clause     ::= topic-def | association-def | anchor { topic-ref } ( SET | ADD | REMOVE ) value-expression

An update-expression has to contain at least one sub-expression: an update-clause, which may also be a topic-def or an association-def. Sometimes the where-clause is not optional. The number of update-clauses is not restricted, but there has to be at least one. The where-clause of an update-expression defines the context of the changes made by the update-clauses. All update-clauses have to act in the same context and apply their changes to the same topic map graph node. An update-clause can change a child node of a topic map item or add a new child node to an existing topic map node.

2.16.1. Processing model

The processing model of an update expression is quite simple. If there is a where clause, it is interpreted in the same way as in a select expression or flwr expression. The where clause is used to evaluate the subset of possible variable bindings that should be used in the content of the update clause. The result of the evaluation is a set of variable bindings.

The update clause is interpreted in the context defined by the where clause. Each update clause contains exactly one atomic update statement for a topic map item, which either adds a new child node or changes the value of an existing child node.

The overall result of the query is a singleton sequence of a singleton tuple containing the number of changed and created items.

2.16.2. Update clause

An update clause starts with an anchor identifier representing the type of child information to be changed by the current expression. The topic reference that follows is optional and can be used with certain anchors to define the type of the new child information. The keyword indicates the kind of change: the creation of new child information ( ADD ), the change of existing information ( SET ) or its removal ( REMOVE ). The value-expression at the end gives the value to be set on the created or modified child node. The possible keywords depend on the given anchor and are listed in the following matrix.

anchor | supported keyword | optional parameter | value-type | current node type | description
locators | ADD | not supported | any iri | a topic | adds a new subject-locator to the given topic
locators | REMOVE | not supported | any iri | a topic | removes a subject-locator from the given topic
indicators | ADD | not supported | any iri | a topic | adds a new subject-identifier to the given topic
indicators | REMOVE | not supported | any iri | a topic | removes a subject-identifier from the given topic
item | ADD | not supported | any iri | any item | adds a new item-identifier to the given item
item | REMOVE | not supported | any iri | any item | removes an item-identifier from the given item
names | ADD | the name type | any string | a topic | adds a new name to the given topic; if the name type is empty, the default name type of the TMDM ( http://www.isotopicmaps.org/sam/sam-model/#d0e2429 ) is used
names | REMOVE | not supported | a name | a topic | removes the current name from the given topic
names | SET | not supported | any string | a name | sets the value of the given name
variants | ADD | the variant theme | any string | a name | adds a new variant to the given name; the variant theme is mandatory and is added as a theme of the scope of the new variant
variants | REMOVE | not supported | a variant | a name | removes the current variant from the given name
variants | SET | not supported | any string | a variant | sets the value of the given variant
occurrences | ADD | the occurrence type | any object | a topic | adds a new occurrence to the given topic; if the occurrence type is empty, an error is raised
occurrences | REMOVE | not supported | an occurrence | a topic | removes the current occurrence from the given topic
occurrences | SET | not supported | any object | an occurrence | sets the value of the given occurrence
datatype | SET | not supported | any string | an occurrence or variant | sets the datatype of the given occurrence or variant
scope | ADD | not supported | a topic | an association, a name or an occurrence | adds a new theme to the scope of the given item
types | ADD | not supported | a topic or an iri | a topic | adds a new type to the given topic; if the type does not exist and the value is an iri, the type is created
types | SET | not supported | a topic or an iri | an association, a name or an occurrence | sets the type of the given item; if the type does not exist and the value is an iri, the type is created
instances | ADD | not supported | a topic or an iri | a topic | adds a new instance to the given type; if the instance does not exist and the value is an iri, the instance is created
supertypes | ADD | not supported | a topic or an iri | a topic | adds a new supertype to the given topic; if the supertype does not exist and the value is an iri, the supertype is created
subtypes | ADD | not supported | a topic or an iri | a topic | adds a new subtype to the given topic; if the subtype does not exist and the value is an iri, the subtype is created
players | SET | the role type | a topic or an iri | a topic | sets the player of all roles or of the given role; if the player does not exist and the value is an iri, the player is created
roles | ADD | the role type | a topic or an iri | an association | adds a new role-player combination to the given association; if the player does not exist and the value is an iri, the player is created
reifier | SET | not supported | a topic, an association, a name or an occurrence | a topic, an association, a name or an occurrence | sets a new reification; the restrictions of the TMDM are checked

Note: The NCL level contains the anchor characteristics, which supports the SET and REMOVE operations. Its interpretation depends on the type of the current node. If the current node is a topic name, the behavior is the same as for the names anchor. If the current node is an occurrence, the behavior is the same as for the occurrences anchor.

Note: If the value of an occurrence or variant is modified, the old datatype is reused. The value is only validated if the pragma is set.
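
The anchor and keyword combinations of the matrix above can be written down as a simple lookup table. The following Java sketch is a hypothetical helper for illustration only: the class UpdateMatrixSketch and the method isSupported are assumptions, the table covers only the anchors of the matrix (not characteristics, topics or associations), and the validation inside tmql4j may be organized differently.

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not part of the tmql4j API.
class UpdateMatrixSketch {
    enum Keyword { SET, ADD, REMOVE }

    // anchor -> keywords listed for it in the matrix
    private static final Map<String, Set<Keyword>> SUPPORTED = new LinkedHashMap<>();
    static {
        SUPPORTED.put("locators", EnumSet.of(Keyword.ADD, Keyword.REMOVE));
        SUPPORTED.put("indicators", EnumSet.of(Keyword.ADD, Keyword.REMOVE));
        SUPPORTED.put("item", EnumSet.of(Keyword.ADD, Keyword.REMOVE));
        SUPPORTED.put("names", EnumSet.of(Keyword.ADD, Keyword.REMOVE, Keyword.SET));
        SUPPORTED.put("variants", EnumSet.of(Keyword.ADD, Keyword.REMOVE, Keyword.SET));
        SUPPORTED.put("occurrences", EnumSet.of(Keyword.ADD, Keyword.REMOVE, Keyword.SET));
        SUPPORTED.put("datatype", EnumSet.of(Keyword.SET));
        SUPPORTED.put("scope", EnumSet.of(Keyword.ADD));
        SUPPORTED.put("types", EnumSet.of(Keyword.ADD, Keyword.SET));
        SUPPORTED.put("instances", EnumSet.of(Keyword.ADD));
        SUPPORTED.put("supertypes", EnumSet.of(Keyword.ADD));
        SUPPORTED.put("subtypes", EnumSet.of(Keyword.ADD));
        SUPPORTED.put("players", EnumSet.of(Keyword.SET));
        SUPPORTED.put("roles", EnumSet.of(Keyword.ADD));
        SUPPORTED.put("reifier", EnumSet.of(Keyword.SET));
    }

    // Returns true if the matrix lists the keyword for the anchor.
    static boolean isSupported(String anchor, Keyword keyword) {
        Set<Keyword> keywords = SUPPORTED.get(anchor);
        return keywords != null && keywords.contains(keyword);
    }

    public static void main(String[] args) {
        System.out.println("names SET: " + isSupported("names", Keyword.SET));
        System.out.println("scope REMOVE: " + isSupported("scope", Keyword.REMOVE));
    }
}
```

A check of this kind only covers the keyword column; the value type and the type of the current node would need separate checks.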

Topic Definition

A topic definition can be used to define a new topic item by one of its identities. A topic can be created by using a string literal as item-identifier, subject-identifier or subject-locator. The type of identity is specified by the navigation pattern following the string literal. If there is no navigation pattern, the default is subject-identifier. The identity type can also be given by the non-canonical shortcut of the corresponding navigation axis.

    topic-def ::= 'topics' 'ADD' string ( '<<' 'indicators' | '~' )? # by subject-identifier

    topic-def ::= 'topics' 'ADD' string ( '<<' 'item' | '!' ) # by item-identifier

    topic-def ::= 'topics' 'ADD' string ( '<<' 'locators' | '=' ) # by subject-locator

The anchor of this update clause is topics.

Association Definition

An association definition is similar to a predicate invocation and can be used to create a new association in the context of the current topic map. The syntax is the same as that of a predicate-invocation, except that the ellipsis is not allowed.

    association-def ::= 'associations' 'ADD' predicate-invocation

The anchor of this update clause is associations.

2.16.3. Where clause

The where clause defines a boolean condition used to evaluate the variable bindings. The clause can contain every boolean expression discussed in the previous chapters. A where clause returns the set of variable bindings that satisfy the condition.

The update expression can only use variables bound in the context of the where clause. If the where clause is missing, the use of any variable is invalid.

3. TMQL4J

The tmql4j engine is the first Java-based engine processing the topic maps query language. The engine is designed as a processing chain of independent modules in a flexible architecture, so that you can integrate your own module extensions and adaptations. The engine supports the draft of 15 August 2008 together with a number of extensions, such as the topic maps modification language ( TMQL-ML ) and the template definition language ( TMQL-TD ).

3.1. Design patterns

3.1.1. Architecture

The TMQL4J engine offers a new abstraction layer on top of the TMAPI. Instead of accessing the Topic Maps engine directly, applications may use a simple TMQL query. The architecture of Topic Maps applications is layered. The base of all applications is formed by the Topic Maps backends, which are managed by the Topic Maps engines. To abstract from the real implementation, TMAPI is used as a standardized interface. The last layer below the application is the TMQL4J engine.

Some special modules are designed to directly access the backend on which the TMAPI engine is based.

3.1.2. Modularization

TMQL4J is designed as a multi-lingual querying engine providing different TMQL versions as well as other topic map query languages like tolog. As base modules of version 3.1.0, the engine supports the TMQL drafts of 2007 and 2010 and experimental SQL and tolog translators.

Each style of the query language is encapsulated in its own plug-in, which can be added to the class path if needed. If a plug-in is missing, the parser and the lexical scanner do not support this style.

Note
Because some parts of the FLWR and SELECT styles of the 2007 draft are based on the PATH style, they depend on it, which means that its jar has to be added to the class path too.

3.1.3. Optimization

TMQL4J realizes an additional abstraction layer over the TMAPI and implements additional features to speed up the querying process, either by using new index implementations and extensions of the TMAPI or by accessing the backend directly.

3.2. Create the Runtime

The whole querying process is encapsulated by a container called runtime. The runtime provides methods to define prefixes and to register new functions or any other extensions.

The runtime cannot be instantiated in the usual way. The user has to use an instance of the runtime factory to obtain an instance of this container class, optionally passing the topic map system as an argument. The container implementation is exposed through an interface to hide internal methods and to unify the use of the engine. The next paragraphs describe each function provided by the runtime interface and how it should be used to realize the business use cases.

1:      TopicMapSystem topicMapSystem = TopicMapSystemFactory.newInstance().newTopicMapSystem();
2:
3:      TopicMap topicMap = topicMapSystem.createTopicMap("http://de.topicmapslab/tmql4j/");
4:
5:      File ltm = new File("src/test/resources/ItalianOpera.ltm");
6:
7:      LTMTopicMapReader reader = new LTMTopicMapReader(topicMap, ltm);
8:      reader.read();
9:
10:     ITMQLRuntime runtime = TMQLRuntimeFactory.newFactory().newRuntime("tmql-2007");
11:
12:     IQuery query = runtime.run(topicMap, "http://psi.ontopedia.net/Puccini");

The short code snippet gives an overview of how to initialize and use the runtime container. As we can see in line 10, the TMQLRuntimeFactory is used to create a new runtime by calling the method newRuntime. In lines 1 to 8 we initialize the topic map instance by importing a topic map from an external LTM file. The last line of the snippet calls the runtime to execute the given query.

The factory supports a set of different methods for initializing a runtime container, expecting a topic map system and/or the name of the language which should be supported by the runtime instance.

3.2.1. Control supported expressions

Using a query instance, the application can control the parsing process by excluding forbidden expression types, such as insert or update expressions. The query only tracks forbidden expressions; every other expression is allowed. If a query contains a forbidden expression type, the parser implementation throws an exception. Please note that the restriction only applies to first-level expressions, that is, to children of the root expression of the generated parsing tree. If the expression is used at a lower level of the tree, it is not restricted and does not cause an exception.

As a special shortcut, the query interface provides the functionality to forbid all expression types that modify the topic map with a single call.

    IQuery query = new TMQLQuery(topicMap, "INSERT ''' myTopic . '''");
    // forbid a single expression type
    query.forbidExpression(InsertExpression.class);

    // forbid every expression type that modifies the topic map
    query.forbidModificationQueries();
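
The restriction to first-level expressions can be pictured with a toy parse tree. The following Java sketch is purely illustrative: the classes FirstLevelCheckSketch and Node, the method isRejected and the expression names used as strings are assumptions and not part of tmql4j, and the sketch returns a boolean where the real parser is described as throwing an exception.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;

// Hypothetical toy parse tree; these classes are not part of tmql4j.
class FirstLevelCheckSketch {

    static final class Node {
        final String type;
        final List<Node> children;

        Node(String type, Node... children) {
            this.type = type;
            this.children = Arrays.asList(children);
        }
    }

    // Returns true if a direct child of the root has a forbidden type.
    // Deeper levels of the tree are deliberately not inspected.
    static boolean isRejected(Node root, Set<String> forbidden) {
        for (Node child : root.children) {
            if (forbidden.contains(child.type)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> forbidden = Set.of("insert-expression");
        Node direct = new Node("query-expression", new Node("insert-expression"));
        Node nested = new Node("query-expression",
                new Node("flwr-expression", new Node("insert-expression")));
        System.out.println("direct child rejected: " + isRejected(direct, forbidden));
        System.out.println("nested child rejected: " + isRejected(nested, forbidden));
    }
}
```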

3.3. Query

A query used in the context of the runtime container can be represented in two different ways: as a string literal or as an instance of a class implementing the IQuery interface. If the query is given as a string literal, the runtime container automatically creates a new IQuery instance representing it, using the QueryFactory of the engine.

3.3.1. Query-Factory

Currently there is no industry standard for a topic maps query language, but every business application needs an easy way to extract the information required by its use cases. Because of this, and because of the missing standard, several query languages for topic maps exist at the moment: tolog, Toma, TMQL and a topic maps path language. Many applications using topic maps are based on one of these query languages and cannot handle any of the others. The tmql4j engine provides the functionality to convert other query formats to a topic maps query language ( TMQL ) query. A developer can use a simple interface to add a new query transformer module to the core implementation of the tmql4j engine and can then use another language, like SQL, to query a topic map.

Because each language has its own syntactic notation, it is not safe to instantiate a query through a specific IQuery implementation class unless the user can be sure that the given literal is valid in the represented query language. Instead of instantiating the query directly, the query factory should be used to create a query from a given string literal. The query factory tries to detect the query language in which the string literal is written and creates a new instance of the corresponding query class. Because the tmql4j engine only handles real TMQL queries, the query factory calls the toTMQL method to transform the specific query to a TMQL query. A developer has to provide this transformation to use his own query language with the tmql4j engine.

3.3.2. tmql4j-tolog PlugIn

The tmql4j-tolog plugin implements a query transformer for the query language tolog, which was established as a de-facto standard as part of the Ontopia topic maps engine. Some parts of the current TMQL draft were inspired by the tolog query language. In combination with the plugin, the tmql4j engine can be used with tolog queries. Please note that the current version of the tmql4j-tolog plugin only supports the querying part of the tolog query language and not the additional modification part developed in 2009. The next version of the plugin will support the whole tolog specification.

3.3.3. Integrate your own query language plug-in

The tmql4j engine can be extended with plugins that integrate your own query language. A new plugin has to contain at least one class that implements the IQuery interface of the tmql4j engine. In addition, the tmql4j engine contains an abstract base implementation of IQuery, which implements the base functions of this interface. The abstract base class can be used to reduce the implementation work.

    public class MyQueryImplementation extends QueryImpl {
        ...
    }

3.4. Prefix Definitions

In the context of a TMQL query, a topic is represented by a subject-identifier, subject-locator or item-identifier. Each of these identifiers is an IRI given as a string, which has to be known by the underlying topic maps engine. Within one model, most identifiers of a topic map share the same beginning of their IRI and differ only in a short part at the end. Because a query usually refers to a set of topics, we would have to write many identifiers that differ only in this short part. The solution to this problem is to define a number of prefixes and to use relative IRIs instead of absolute ones.

Some prefixes are predefined by the current draft of the topic maps query language and can be used without an explicit definition. The predefined environment contains the following prefixes.

prefix literal | absolute IRI | description
tm | http://psi.topicmaps.org/iso13250/model/ | This is the namespace for the concepts defined by TMDM (via the TMDM/TMRM mapping).
xsd | http://www.w3.org/2001/XMLSchema# | This is the namespace for the XML Schema Datatypes.
tmql | http://psi.topicmaps.org/tmql/1.0/ | Under this prefix the concepts of TMQL itself are located.
fn | http://psi.topicmaps.org/tmql/1.0/functions/ | Under this prefix user-callable functions of the predefined TMQL environment are located.
dc | http://purl.org/dc/terms/ | Under this prefix Dublin Core elements are located.

3.4.1. Define Prefixes as Query-Part

The current draft of the query language defines a special expression type to define prefixes as part of the query itself. Prefixes defined this way are only valid for this specific query and not for the whole runtime. If the user wants to define a prefix only for one query, this method should be used. The number of prefixes defined as part of a query is not restricted by the current draft. The prefix definition is part of the environment clause of a TMQL query and starts with the keyword %prefix, followed by the prefix literal and the absolute IRI that the prefix stands for.

    %prefix tmql4j http://tmql4j.topicmapslab.de/

The defined prefixes can be used in QNames, as part of a relative IRI, in the context of the same query to identify a topic or a topic type.

1:      %prefix tmql4j http://tmql4j.topicmapslab.de/ tmql4j:person >> instances
2:
3:      http://tmql4j.topicmapslab.de/person >> instances

The example shows two queries with the same meaning. The query in line 1 defines a prefix in its environment clause and uses the prefix literal in the QName tmql4j:person. As we can see, the prefix and the rest of the IRI are divided by a colon. The second query in line 3 does not use a prefix definition; in this example it is even shorter, because the prefix is used only once. Please note that the benefit of using prefixes grows with the number of repeated IRI parts.
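
How a QName such as tmql4j:person is expanded to an absolute IRI can be sketched in a few lines of Java. The sketch is not the prefix handling of tmql4j: the class PrefixSketch and its methods register and resolve are hypothetical names chosen for illustration, and a reference whose part before the colon is not a registered prefix is simply returned unchanged.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; it is not the tmql4j prefix handler.
class PrefixSketch {
    private final Map<String, String> prefixes = new HashMap<>();

    void register(String literal, String iri) {
        prefixes.put(literal, iri);
    }

    // Expands "prefix:local" to an absolute IRI when the prefix is known;
    // every other reference is returned unchanged.
    String resolve(String reference) {
        int colon = reference.indexOf(':');
        if (colon > 0) {
            String iri = prefixes.get(reference.substring(0, colon));
            if (iri != null) {
                return iri + reference.substring(colon + 1);
            }
        }
        return reference;
    }

    public static void main(String[] args) {
        PrefixSketch prefixes = new PrefixSketch();
        prefixes.register("tmql4j", "http://tmql4j.topicmapslab.de/");
        // both references identify the same topic type
        System.out.println(prefixes.resolve("tmql4j:person"));
        System.out.println(prefixes.resolve("http://tmql4j.topicmapslab.de/person"));
    }
}
```

The absolute IRI in the second call is left alone because http is not a registered prefix literal in this sketch.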

3.4.2. Define Prefixes using Prefix Handler

Sometimes a prefix is used in different queries run by the same tmql4j engine, so it would be inconvenient to define this prefix again as part of each query. The tmql4j engine provides a method to register additional prefixes which are valid for the whole runtime lifecycle. The method expects two arguments, which correspond to the tokens of a prefix definition inside a query. The first argument is the prefix literal and the second is the absolute IRI represented by the prefix.

1:      PrefixHandler handler = runtime.getLanguageContext().getPrefixHandler();
2:
3:      handler.registerPrefix("tmql4j", "http://tmql4j.topicmapslab.de/");

The prefix management is encapsulated by the language context, which can be accessed by the runtime method getLanguageContext, as we can see in line 1. In line 3 the method registerPrefix of the prefix handler is used to add a new prefix definition to the runtime container, with the prefix literal tmql4j and the given IRI. Line 3 has the same effect as the prefix definition of the last section, but in addition each query executed by the runtime container can use the prefix without defining it as part of the query.

3.4.3. Define Default Prefixes using Prefix Handler

For a whole runtime it is possible to define a default prefix without any prefix IRI. The default prefix can be used, in the way, that the relative part can be written without the prefix. The engine transforms automatically the relative IRI to a absolute IRI using this prefix.

1:      PrefixHandler handler = runtime.getLanguageContext().getPrefixHandler();
2:
3:      handler.setDefaultPrefix("http://tmql4j.topicmapslab.de");
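
The effect of registered prefixes and of the default prefix can be illustrated with a small stand-alone sketch. The class PrefixResolutionSketch and its method toAbsoluteIri are hypothetical names used only for illustration and are not part of the tmql4j API. The sketch also assumes that the engine simply appends the local part to the prefix IRI, which is why the IRIs used here end with a slash.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for prefix handling; NOT the tmql4j PrefixHandler. */
final class PrefixResolutionSketch {
    private final Map<String, String> prefixes = new HashMap<>();
    private String defaultPrefix = null;

    void registerPrefix(String literal, String iri) {
        prefixes.put(literal, iri);
    }

    void setDefaultPrefix(String iri) {
        defaultPrefix = iri;
    }

    /** Expands a QName or a relative reference to an absolute IRI (assumed: plain concatenation). */
    String toAbsoluteIri(String reference) {
        int colon = reference.indexOf(':');
        if (colon > 0) {
            String iri = prefixes.get(reference.substring(0, colon));
            if (iri != null) {
                // known prefix literal: replace "literal:" with the registered IRI
                return iri + reference.substring(colon + 1);
            }
            // unknown literal: assume the reference is already an absolute IRI
            return reference;
        }
        // no colon at all: fall back to the default prefix, if one is set
        return defaultPrefix == null ? reference : defaultPrefix + reference;
    }

    public static void main(String[] args) {
        PrefixResolutionSketch handler = new PrefixResolutionSketch();
        handler.registerPrefix("tmql4j", "http://tmql4j.topicmapslab.de/");
        handler.setDefaultPrefix("http://tmql4j.topicmapslab.de/");
        System.out.println(handler.toAbsoluteIri("tmql4j:person"));
        System.out.println(handler.toAbsoluteIri("person"));
    }
}
```

With these assumptions both calls in main resolve to the same absolute IRI, once through the prefix literal and once through the default prefix.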

3.5. Functions

Similar to other query languages like SQL, the current draft specifies a number of functions which can be used to transform tuples or sequences. Each function is represented by a topic in the environment topic map of the runtime container and can be used in a TMQL query like any other topic reference. In addition, TMQL defines an expression type called function-invocation to call a function with a list of arguments. Each function is addressed by a topic item reference followed by a tuple-expression defining the parameter list passed to the function interpreter.

Each function type is handled by a dedicated function interpreter module.

3.5.1. Predefined functions

The current draft of TMQL contains a number of predefined functions which are implemented by the tmql4j engine and can be used as a part of the given query.

string-concat

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/string-concat

symbolic pattern


profile

fn:string-concat (a : string, b : string) return string

precedence

2

The function string-concat combines strings. The function expects exactly two arguments, given by the following tuple-expression. Each argument can be a simple string or a set of strings, and the result is a simple string or a set of strings. The behavior of the function depends on the given argument types. If both arguments a and b are simple strings, the function returns the concatenation of a and b. If one of the arguments is a set of strings, the function returns a set of strings containing the combination of the atomic string with each string of the given set. If both arguments are sets, the function returns every combination of a string of the first set with a string of the second set.

1:      fn:string-concat ( a => [ "foo" ] , b => [ "bar" ] )
2:      => "foobar"
3:
4:      fn:string-concat ( a => [ "foo" , "main" ] , b => [ "bar" ] )
5:      => [ "foobar" , "mainbar" ]
6:
7:      fn:string-concat ( a => [ "foo" , "main" ] , b => [ "bar" , "menu" ] )
8:      => [ "foobar" , "mainbar" , "foomenu" , "mainmenu" ]
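
The combination rule can be sketched in plain Java. The class StringConcatSketch is a hypothetical illustration and not the function interpreter of the engine; a simple string is modelled as a list with one element, and the iteration order is an assumption chosen to reproduce the order of the results listed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of the string-concat semantics; not tmql4j code. */
final class StringConcatSketch {

    /** Combines every string of a with every string of b. */
    static List<String> concat(List<String> a, List<String> b) {
        List<String> result = new ArrayList<>();
        // the outer loop runs over b so that all combinations with b[0] come first
        for (String right : b) {
            for (String left : a) {
                result.add(left + right);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(concat(Arrays.asList("foo"), Arrays.asList("bar")));
        System.out.println(concat(Arrays.asList("foo", "main"), Arrays.asList("bar", "menu")));
    }
}
```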

string-length

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/string-length

profile

fn:length (s : string) return integer

The function length returns the size of a string literal by counting its characters. The function expects exactly one argument, which can be a simple string or a sequence of strings. The behavior of the function depends on the given argument type. If the argument is a simple string, the function returns a single integer value. If the argument is a set of strings, it returns a sequence of integer values.

1:      fn:length ( s => [ "foo" ]  )
2:      => 3
3:
4:      fn:length ( s => [ "foo" , "main" ] )
5:      => [ 3 , 4 ]

string-less-than

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/string-less-than

symbolic pattern

<

profile

fn:string-lt (a : string, b : string) return tuple-sequence

precedence

5

The function string-lt compares two string literals and returns the string literal given by the first argument only if it is lexicographically lower than the second string. The function expects two arguments, which have to be of type string. The first argument can also be a sequence of strings. The behavior of the function depends on the given argument type. If the first argument is a set of strings, the function returns a set containing all strings that are lexicographically lower than the second argument. If the first argument is a string, the function returns that string if it is lexicographically lower than the second one; otherwise it returns an empty set. If the second argument is a set of strings, only its first string is used.

1:      fn:string-lt ( a => [ "a" ] , b => [ "aaa" ] )
2:      => "a"
3:
4:      fn:string-lt ( a => [ "a" , "b" ] , b => [ "aaa" ] )
5:      => "a"
6:
7:      fn:string-lt ( a => [ "a" , "b" ] , b => [ "aaa" , "bbb" ] )
8:      => "a"
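
The comparison functions string-lt, string-leq, string-geq and string-gt differ only in the accepted comparison result, so their behavior can be sketched with one filter. The class StringCompareSketch is hypothetical and not part of the engine. It assumes that the lexicographic order of the engine corresponds to String.compareTo in Java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical illustration of the string comparison functions; not tmql4j code. */
final class StringCompareSketch {

    /** Keeps each string of a whose comparison with the first string of b is accepted. */
    static List<String> filter(List<String> a, List<String> b, IntPredicate accept) {
        List<String> result = new ArrayList<>();
        if (b.isEmpty()) {
            return result;
        }
        // only the first string of the second argument is used
        String bound = b.get(0);
        for (String candidate : a) {
            if (accept.test(candidate.compareTo(bound))) {
                result.add(candidate);
            }
        }
        return result;
    }

    static List<String> stringLt(List<String> a, List<String> b) {
        return filter(a, b, c -> c < 0);
    }

    static List<String> stringGeq(List<String> a, List<String> b) {
        return filter(a, b, c -> c >= 0);
    }

    public static void main(String[] args) {
        System.out.println(stringLt(Arrays.asList("a", "b"), Arrays.asList("aaa", "bbb")));
        System.out.println(stringGeq(Arrays.asList("a", "b"), Arrays.asList("aaa")));
    }
}
```

The variants string-leq and string-gt follow the same pattern with the predicates c <= 0 and c > 0.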

string-less-equal-than

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/string-less-equal-than

symbolic pattern

<=

profile

fn:string-leq (a : string, b : string) return tuple-sequence

precedence

5

The function string-leq compares two string literals and returns the string literal given by the first argument only if it is lexicographically lower than or equal to the second string. The function expects two arguments, which have to be of type string. The first argument can also be a sequence of strings. The behavior of the function depends on the given argument type. If the first argument is a set of strings, the function returns a set containing all strings that are lexicographically lower than or equal to the second argument. If the first argument is a string, the function returns that string if it is lexicographically lower than or equal to the second one; otherwise it returns an empty set. If the second argument is a set of strings, only its first string is used.

1:      fn:string-leq ( a => [ "a" ] , b => [ "aaa" ] )
2:      => "a"
3:
4:      fn:string-leq ( a => [ "a" , "b" ] , b => [ "aaa" ] )
5:      => "a"
6:
7:      fn:string-leq ( a => [ "a" , "b" ] , b => [ "aaa" , "bbb" ] )
8:      => "a"

string-greater-equal-than

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/string-greater-equal-than

symbolic pattern

>=

profile

fn:string-geq (a : string, b : string) return tuple-sequence

precedence

5

The function string-geq compares two string literals and returns the string literal given by the first argument only if it is lexicographically greater than or equal to the second string. The function expects two arguments, which have to be of type string. The first argument can also be a sequence of strings. The behavior of the function depends on the given argument type. If the first argument is a set of strings, the function returns a set containing all strings that are lexicographically greater than or equal to the second argument. If the first argument is a string, the function returns that string if it is lexicographically greater than or equal to the second one; otherwise it returns an empty set. If the second argument is a set of strings, only its first string is used.

1:      fn:string-geq ( a => [ "a" ] , b => [ "aaa" ] )
2:      => [ ]
3:
4:      fn:string-geq ( a => [ "a" , "b" ] , b => [ "aaa" ] )
5:      => "b"
6:
7:      fn:string-geq ( a => [ "a" , "b" ] , b => [ "aaa" , "bbb" ] )
8:      => "b"

string-greater-than

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/string-greater-than

symbolic pattern

>

profile

fn:string-gt (a : string, b : string) return tuple-sequence

precedence

5

The function string-gt compares two string literals and returns the string literal given by the first argument only if it is lexicographically greater than the second string. The function expects two arguments, which have to be of type string. The first argument can also be a sequence of strings. The behavior of the function depends on the given argument type. If the first argument is a set of strings, the function returns a set containing all strings that are lexicographically greater than the second argument. If the first argument is a string, the function returns that string if it is lexicographically greater than the second one; otherwise it returns an empty set. If the second argument is a set of strings, only its first string is used.

1:      fn:string-gt ( a => [ "a" ] , b => [ "aaa" ] )
2:      => [ ]
3:
4:      fn:string-gt ( a => [ "a" , "b" ] , b => [ "aaa" ] )
5:      => "b"
6:
7:      fn:string-gt ( a => [ "a" , "b" ] , b => [ "aaa" , "bbb" ] )
8:      => "b"

string-regexp-match

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/string-regexp-match

symbolic pattern

=~

profile

fn:regexp (s : string, re : string) return tuple-sequence

precedence

5

The function regexp checks whether the given string argument matches the regular expression. The function expects exactly two arguments, each of which can be a string or a simple sequence of strings. The behavior of the function depends on the given argument type. If the first argument is a simple string, the result is the string itself if it matches and an empty sequence if it does not. If the first argument is a set of strings, the function returns a set of all matching strings. If the second argument is a set of strings, only its first string is used.

1:      fn:regexp ( s => [ "aaa" ] , re => [ "[a]+" ] )
2:      => "aaa"
3:
4:      fn:regexp ( s => [ "aaa" , "bbb" ] , re => [ "[a]+" ] )
5:      => "aaa"
6:
7:      fn:regexp ( s => [ "aaa" , "bbb" ] , re => [ "[a]+" , "[b]+" ] )
8:      => "aaa"

substring

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/substring

profile

fn:substring (s : string, f : integer , t : integer ) return string

precedence

5

The function substring returns a substring of the given string argument addressed by the given indexes. The function expects exactly three arguments: a string and two integers. A sequence of strings is supported as the first argument, too. The behavior of the function depends on the given arguments. If the first argument is a string, the function returns a single string representing the substring of the first argument. If the first argument is a sequence, it returns a sequence of substrings. If any of the indexes is out of bounds, the function clamps it to the nearest valid value within the range of the string. The string indexes are zero-based and the upper index is excluded.

1:      fn:substring ( s => [ "Java-based engine" ] , f => 1 , t => 5 )
2:      => "ava-"
3:
4:      fn:substring ( s => [ "Java-based engine" , "foo" ] , f => 1 , t => 5 )
5:      => [ "ava-" , "oo" ]
6:
7:      fn:substring ( s => [ "Java-based engine" , "foo" ] , f => -1 , t => 50 )
8:      =>  [ "Java-based engine" , "foo" ]
9:
10:     fn:substring ( s => [ "Java-based engine" , "foo" ] , f => "1" , t => "5" )
11:     =>  [ "ava-" , "oo" ]
12:
13:     fn:substring ( s => [ "Java-based engine" , "foo" ] , f => "a" , t => "5" )
14:     =>  raises an error
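
The handling of indexes that are out of bounds can be sketched as follows. The class SubstringSketch is a hypothetical illustration and not the implementation of the engine; it only shows one way to clamp both indexes to the valid range before the substring is taken.

```java
/** Hypothetical illustration of the index clamping of substring; not tmql4j code. */
final class SubstringSketch {

    /** Zero-based substring with excluded upper index; both indexes are clamped to the string. */
    static String substring(String s, int from, int to) {
        int lower = Math.max(0, Math.min(from, s.length()));
        // the upper index may neither exceed the string nor fall below the lower index
        int upper = Math.max(lower, Math.min(to, s.length()));
        return s.substring(lower, upper);
    }

    public static void main(String[] args) {
        System.out.println(substring("Java-based engine", 1, 5));
        System.out.println(substring("foo", 1, 5));
        System.out.println(substring("foo", -1, 50));
    }
}
```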

has-datatype

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/has-datatype

profile

fn:has-datatype (s: tuple-sequence) return tuple-sequence

The function has-datatype retrieves the data type of each tuple element in each tuple. The function expects exactly one argument, which has to be a tuple or a tuple sequence. The behavior of the function depends on the given arguments. If the contained element is a name item, the data type is string; for an occurrence item it is the internal data type of the occurrence; and for an atom it is the data type of the atom itself. Any other item results in the data type any. Each data type is an IRI.

1:      fn:has-datatype ( s => [ "http://tmql4j.topicmapslab.de"^^xsd:anyURI , "aaa" , 5 ] )
2:      =>  [ "xsd:anyURI" , "xsd:string" , "xsd:integer" ]

has-variant

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/has-variant

profile

fn:has-variant (s: tuple-sequence, s: item-reference) return tuple-sequence

The function has-variant is only supported for topic name items. The function expects exactly two arguments, which have to be a tuple sequence and a topic reference. For each tuple element in each tuple it retrieves the variant name for the given scope. For name items this is the variant value, if one exists; otherwise it is undef. For all other items the function always returns undef.

slice

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/slice

profile

fn:slice (s : tuple-sequence, low : integer, high : integer) return tuple-sequence

The function slice operates on tuple sequences. It returns the tuples of the sequence whose index lies between the given arguments. The indexes are zero-based and the upper index is excluded. If the indexes are given as strings, they are converted to integer values automatically. The function is used by the select expression to realize the keywords LIMIT and OFFSET. If the integer values are negative or invalid, an error is raised. If the indexes are out of bounds, the function returns the tuples whose index lies both within the tuple sequence and within the given range.

1:      fn:slice  ( s => [ "a" , "b" , "c" , "d" ] , low => 1 , high => 2 )
2:      => [ "b" ]
3:
4:      fn:slice  ( s => [ "a" , "b" , "c" , "d" ] , low => 3 , high => 10 )
5:      => [ "d" ]
6:
7:      fn:slice  ( s => [ "a" , "b" , "c" , "d" ] , low => "a" , high => "-1" )
8:      =>  raises an error
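
The rules stated in the description of slice (zero-based indexes, excluded upper index, an error for negative values and clamping of indexes that are out of bounds) can be sketched in Java. The class SliceSketch is hypothetical and not the implementation of the engine; the use of IllegalArgumentException as error type is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of the slice rules; not tmql4j code. */
final class SliceSketch {

    /** Returns the tuples with index in [low, high), clamped to the size of the sequence. */
    static <T> List<T> slice(List<T> sequence, int low, int high) {
        if (low < 0 || high < 0) {
            // assumption: negative indexes are reported as an illegal argument
            throw new IllegalArgumentException("indexes must not be negative");
        }
        int from = Math.min(low, sequence.size());
        int to = Math.max(from, Math.min(high, sequence.size()));
        return new ArrayList<>(sequence.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> tuples = Arrays.asList("a", "b", "c", "d");
        System.out.println(slice(tuples, 0, 2));
        System.out.println(slice(tuples, 2, 10));
    }
}
```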

count

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/count

profile

fn:count (s : tuple-sequence) return integer

The function count returns the number of tuples of a tuple sequence. The function expects one argument, which should be a tuple sequence or an atom. The behavior of the function depends on the given argument. If an atom is given, the function always returns 1. If a tuple sequence is given, it returns the number of contained items.

1:      fn:count  ( s => [ "a" , "b" , "c" , "d" ]  )
2:      => 4
3:
4:      fn:count  ( s => "b" )
5:      => 1
6:
7:      fn:count  ( s => [ ] )
8:      =>  0

uniq

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/uniq

profile

fn:uniq (s : tuple-sequence) return tuple-sequence

The function uniq removes duplicates from the given tuple sequence. According to our interpretation of the current draft, a tuple sequence can contain the same tuple multiple times. The function removes all repeated occurrences of the same tuple in the given sequence. The function expects exactly one tuple sequence as argument. It is used by the select expression to realize the keyword UNIQUE. The indexes of the tuples in the tuple sequence may change as a result.

1:      fn:uniq  ( s => [ "a" , "b" , "a" , "b" ]  )
2:      => [ "a" , "b" ]

concat

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/concat

symbolic pattern

++

profile

fn:concat (s : tuple-sequence, t : tuple-sequence) return tuple-sequence

precedence

1

The function concat combines two tuple sequences into a new one. The function adds all tuples of the second sequence to the first sequence. If one of the arguments is an atom instead of a sequence, a new sequence is created and the atom is added to it. Any existing ordering is preserved by the combination.

1:      fn:concat  ( s => [ "a" , "b" , "a" , "b" ]  , t => "a" )
2:      => [ "a" , "b" , "a" , "b" , "a" ]
3:
4:      fn:concat  ( s => "a"  , t => "b" )
5:      => [ "a" , "b" ]
6:
7:      fn:concat  ( s => [ "a" , "b" , "a" , "b" ]  , t =>  [ "a"  , "c" ] )
8:      => [ "a" , "b" , "a" , "b" , "a" , "c" ]

except

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/except

symbolic pattern

--

profile

fn:except (s : tuple-sequence, t : tuple-sequence) return tuple-sequence

precedence

1

The function except produces a tuple sequence in which all tuples that appear in t are removed from s. If one of the arguments is an atom instead of a sequence, a new sequence is created and the atom is added to it. The ordering is not affected. If the first tuple sequence contains an element of the second one multiple times, all occurrences of this element are removed.

1:      fn:except  ( s => [ "a" , "b" , "a" , "b" ]  , t => "a" )
2:      => [ "b" , "b" ]
3:
4:      fn:except  ( s => "a"  , t => "b" )
5:      => [ "a" ]
6:
7:      fn:except  ( s => [ "a" , "b" , "a" , "b" ]  , t =>  [ "a"  , "c" ] )
8:      => [ "b" , "b" ]

compare

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/compare

symbolic pattern

==

profile

fn:compare (s : tuple-sequence, t : tuple-sequence) return tuple-sequence

precedence

1

The function compare produces a tuple sequence of all tuples which appear in both s and t. Any tuple of the sequence s which is not contained in t is removed from s. If one of the arguments is an atom instead of a sequence, a new sequence is created and the atom is added to it. The ordering is not affected. The number of occurrences of a tuple element is not considered.

1:      fn:compare  ( s => [ "a" , "b" , "a" , "b" ]  , t => "a" )
2:      => [ "a" , "a" ]
3:
4:      fn:compare  ( s => "a"  , t => "b" )
5:      => [ ]
6:
7:      fn:compare  ( s => [ "a" , "b" , "a" , "b" ]  , t =>  [ "a"  , "c" ] )
8:      => [ "a" , "a" ]
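
The functions except and compare mirror each other: the first removes every tuple of s that occurs in t, the second keeps only those tuples. Both can be sketched with the standard Java collection operations removeAll and retainAll. The class ExceptCompareSketch is a hypothetical illustration and not the implementation of the engine; atoms are modelled as lists with one element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of except and compare; not tmql4j code. */
final class ExceptCompareSketch {

    /** Removes all occurrences of the tuples of t from s; the order of s is kept. */
    static <T> List<T> except(List<T> s, List<T> t) {
        List<T> result = new ArrayList<>(s);
        result.removeAll(t);
        return result;
    }

    /** Keeps only the tuples of s which also appear in t; the order of s is kept. */
    static <T> List<T> compare(List<T> s, List<T> t) {
        List<T> result = new ArrayList<>(s);
        result.retainAll(t);
        return result;
    }

    public static void main(String[] args) {
        List<String> s = Arrays.asList("a", "b", "a", "b");
        List<String> t = Arrays.asList("a", "c");
        System.out.println(except(s, t));
        System.out.println(compare(s, t));
    }
}
```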

zigzag

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/zigzag

profile

fn:zigzag (s : tuple-sequence) return tuple-sequence

The function zigzag is the inverse of zagzig and returns a single tuple filled with all values from all tuples. The function can be used to speed up an application, because index access within a tuple is faster than index access within the tuple sequence.

1:      fn:zigzag  ( s => [ [ "a" , "b" ] , [ "a" , "b" ] ] )
2:      => [ "a" , "b" , "a" , "b" ]
3:
4:      fn:zigzag  ( s => [ "a" , [ "a" , "b" ] ] )
5:      => [ "a" , "a" , "b" ]

zagzig

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/zagzig

profile

fn:zagzig (s : tuple-sequence) return tuple-sequence

The function zagzig is the inverse of zigzag and returns a tuple sequence of singleton tuples, each containing one element of the original tuple. The index of a singleton tuple within the sequence is the same as the index of its item within the original tuple.

1:      fn:zagzig  ( s =>  [ "a" , "b" , "a" , "b" ] )
2:      => [ [ "a" ] , [ "b" ] , [ "a" ] , [ "b" ] ]
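
The relation between zigzag and zagzig can be sketched with nested Java lists, where the outer list models the tuple sequence and each inner list models one tuple. The class ZigZagSketch is a hypothetical illustration and not the implementation of the engine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of zigzag and zagzig; not tmql4j code. */
final class ZigZagSketch {

    /** Flattens a sequence of tuples into one tuple holding all values in order. */
    static List<String> zigzag(List<List<String>> sequence) {
        List<String> tuple = new ArrayList<>();
        for (List<String> part : sequence) {
            tuple.addAll(part);
        }
        return tuple;
    }

    /** Splits one tuple into a sequence of singleton tuples, keeping the index of every item. */
    static List<List<String>> zagzig(List<String> tuple) {
        List<List<String>> sequence = new ArrayList<>();
        for (String item : tuple) {
            sequence.add(Collections.singletonList(item));
        }
        return sequence;
    }

    public static void main(String[] args) {
        List<List<String>> sequence = Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("a", "b"));
        System.out.println(zigzag(sequence));
        System.out.println(zagzig(Arrays.asList("x", "y")));
    }
}
```

Applying zigzag to the result of zagzig yields the original tuple again.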

url-decode

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/url-decode

profile

fn:url-decode (s : string) return string

The function url-decode decodes the given URL reference. The result will be the reference without any encoded characters.

1:      fn:url-decode  ( s =>  "http://psi.example.org/Hello%20World" )
2:      => ( "http://psi.example.org/Hello World" )

url-encode

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/url-encode

profile

fn:url-encode (s : string) return string

The function url-encode encodes the given string literal as a URI reference and escapes all characters forbidden by the URI syntax.

1:      fn:url-encode  ( s =>  "http://psi.example.org/Hello World" )
2:      => ( "http://psi.example.org/Hello%20World" )
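
A comparable encoding and decoding can be reproduced with the Java standard library. The class UrlCodecSketch is a hypothetical illustration; it does not show how the engine implements the two functions. Note that URLDecoder additionally translates the character + to a space, which may differ from the behavior of the engine.

```java
import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLDecoder;

/** Hypothetical illustration of url-encode and url-decode; not tmql4j code. */
final class UrlCodecSketch {

    /** Replaces percent-encoded characters by the characters they stand for. */
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    /** Escapes characters which are not allowed in a URI, for example spaces in the path. */
    static String encode(String raw) {
        try {
            // java.net.URL does not validate the path, so it can split the raw string
            URL url = new URL(raw);
            // the multi-argument URI constructor quotes illegal characters of each part
            URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(),
                    url.getPort(), url.getPath(), url.getQuery(), url.getRef());
            return uri.toASCIIString();
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException("not a valid URL: " + raw, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("http://psi.example.org/Hello World"));
        System.out.println(decode("http://psi.example.org/Hello%20World"));
    }
}
```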

topics-by-subjectidentifier

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/topics-by-subjectidentifier

profile

fn:topics-by-subjectidentifier ((s : string)+) return sequence

The function returns a sequence of topics identified by one of the provided subject-identifiers. The identifiers are given as string literals.

topics-by-subjectlocator

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/topics-by-subjectlocator

profile

fn:topics-by-subjectlocator ((s : string)+) return sequence

The function returns a sequence of topics identified by one of the provided subject-locators. The identifiers are given as string literals.

topics-by-itemidentifier

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/topics-by-itemidentifier

profile

fn:topics-by-itemidentifier ((s : string)+) return sequence

The function returns a sequence of topics identified by one of the provided item-identifiers. The identifiers are given as string literals.

array

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/array

profile

fn:array ( (a : anytype)+ ) return sequence

The array function converts the given list of arbitrary objects to an array, which is returned in the result set without reduction.

3.5.2. Implementing your own function

The tmql4j engine provides an API to integrate your own functions. A function can be used in a query by its identifier. This chapter explains how to define a new function, using the function length described above as an example.

To define our own function we have to create a new function interpreter implementing the API interface IFunctionInvocationInterpreter. The function interpreter is called during the interpretation process to handle the function in relation to the current query. A function interpreter is initialized once for each position in the parsing tree. That means if the function is used as part of the select clause of a select expression, the interpreter is initialized at the moment the function is interpreted for the first time. The generated instance is stored by the internal function handler and called whenever the function at this position is interpreted again. If the function is used twice at different parts of the query, however, two instances are created.

At first we have to create a new function interpreter class as shown in the next code snippet.

1:      public class LengthFunctionInvocationInterpreter extends
2:              ExpressionInterpreterImpl<FunctionInvocation> implements
3:                      IFunctionInvocationInterpreter {
4:              ....
5:      }

A simple naming convention for a new function interpreter is to combine the function identifier, here Length, with the postfix FunctionInvocationInterpreter. The class has to implement the API interface IFunctionInvocationInterpreter as shown in line 3. To use the interpreter in the context of a TMQL query it also has to extend the abstract class ExpressionInterpreterImpl with the type argument FunctionInvocation. An expression interpreter is used by the tmql4j handler during the interpretation process to handle expressions of the specific type given by the type argument.

The new function interpreter has to implement the inherited methods. The first method only returns the internal identifier of the function, which is used to store it as a topic of the internal environment map. The given identifier should be unique to avoid side effects.

1:      public String getItemIdentifier() {
2:              return "fn:length";
3:      }

The second method simply returns the number of arguments the function expects.

1:      public long getRequiredVariableCount() {
2:              return 1;
3:      }

The last method, interpret, represents the core functionality of the function interpreter and is called whenever the function has to be handled.

1:      public void interpret(TMQLRuntime runtime) throws TMQLRuntimeException {
2:              QueryMatches results = new QueryMatches(runtime);
3:              runtime.getRuntimeContext().pushToStack();
4:
5:              /*
6:              * call sub-expression
7:              */
8:              IExpressionInterpreter<?> interpreter = getInterpreters(runtime).get(0);
9:              interpreter.interpret(runtime);
10:
11:             /*
12:              * extract results and check number of parameters
13:              */
14:             QueryMatches parameters = (QueryMatches) runtime.getRuntimeContext().popFromStack().getValue(VariableNames.QUERYMATCHES);
15:             if (parameters.getOrderedKeys().size() < getRequiredVariableCount()) {
16:                     throw new TMQLRuntimeException(getItemIdentifier() + "() requires " + getRequiredVariableCount() + " parameter.");
17:             }
18:
19:             /*
20:              * iterate over parameters
21:              */
22:             for (Map<String, Object> tuple : parameters) {
23:                     Object sequence = tuple.get("$0");
24:                     Map<String, Object> result = new THashMap<String, Object>();
25:                     /*
26:                      * check if value is a sequence
27:                      */
28:                     if (sequence instanceof Collection<?>) {
29:                             ITupleSequence<Integer> lengths = runtime.getProperties().newSequence();
30:                             /*
31:                              * add length of each string to a new sequence
32:                              */
33:                             for (Object obj : (Collection<?>) sequence) {
34:                                     lengths.add(obj.toString().length());
35:                             }
36:                             result.put(QueryMatches.getNonScopedVariable(), lengths);
37:                     }
38:                     /*
39:                      * add length of the string to result tuple
40:                      */
41:                     else {
42:                             result.put(QueryMatches.getNonScopedVariable(), sequence
43:                                             .toString().length());
44:                     }
45:                     results.add(result);
46:             }
47:             runtime.getRuntimeContext().peekFromStack().createAndAddToOrSetTo( VariableNames.QUERYMATCHES, results);
48:     }

The tuple sequence representing the result of an interpretation task of an expression is encapsulated by an instance of the class QueryMatches. In line 2 we create a new instance to store the results.

The processing model of the tmql4j engine is based on indirect communication between the different expressions at the different tree levels. The communication is based on a stack storing all variable bindings which are valid for the current expression instance. If a new expression interpreter gains control of the tmql4j engine, it has to push a new variable set on top of the stack before calling any subexpressions, as you see in line 3. The production of a function in TMQL describes a subexpression containing all parameters of the function invocation, so we have to call the underlying subexpression first, as we see in lines 8 and 9.

To get the results of this interpretation, the interpreter has to pop the variable set which was added before from the internal stack ( line 14 ). The results are stored under the variable VariableNames.QUERYMATCHES; because of that we have to call the getValue method with this variable name.

In lines 15 to 17 we check the number of arguments and raise an exception if it is lower than the number expected by the method getRequiredVariableCount.

In lines 22 to 46 the code realizes the function interpretation. In line 23 we extract the argument representing the tuple sequence by using the index variable $0.

The last step is to store the overall results on the stack to return them to the parent expression handler. The results are also identified by the variable VariableNames.QUERYMATCHES as we see in line 47.

After finishing our implementation we have to register our new function at the tmql4j runtime using the provided interface.

1:      runtime.getLanguageContext().getFunctionRegistry().registerFunction("fn:length", LengthFunctionInvocationInterpreter.class);

Now we can use our new function.

3.6. Engine-Modules

TMQL4J is designed as a process chain of simple modules. Each module provides a simple or atomic function in the context of the querying process and is handled by a processing task of the runtime. The modules can be exchanged to adapt the implementation to your own business cases. The processing chain is encapsulated by the tmql4j runtime container and cannot be accessed directly.

3.6.1. Pre-Processing Module

Before a query is executed, the TMQL processor calls the query instance to perform any pre-processing.

screener

A screener is a special part of a compiler and cleans the given string representation of the TMQL query by removing comments, white spaces and line breaks. The screener implementation of the tmql4j engine is represented by the API interface IScreener. The base implementation of this module is the TMQLScreener, which removes all comments contained in the given query. Comments are introduced by the hash character #. Line breaks are replaced by a single space. The result of the screening process is a single-line string representation of the original query.

1:      SELECT $p # set the variable
2:              WHERE $p ISA person # boolean condition
3:
4:      => SELECT $p WHERE $p ISA person
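
The screening step can be approximated with two regular expressions. The class ScreenerSketch is a hypothetical illustration and not the TMQLScreener of the engine. It deliberately ignores that the character # can also occur inside string literals or IRIs, which a real screener has to protect.

```java
/** Hypothetical illustration of the screening step; not the TMQLScreener. */
final class ScreenerSketch {

    /** Removes comments starting with # and joins all lines into a single line. */
    static String screen(String query) {
        // simplification: everything from # to the end of the line counts as a comment
        String withoutComments = query.replaceAll("#[^\\r\\n]*", "");
        // line breaks, tabs and repeated spaces collapse into one space
        return withoutComments.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String query = "SELECT $p # set the variable\n\tWHERE $p ISA person # boolean condition";
        System.out.println(screen(query));
    }
}
```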

whitespacer

The whitespacer is a special part of the pre-processing module and cleans the white spaces of the given query. The whitespacer adds white spaces which are optional in the current draft and were omitted by the user. If there are multiple white spaces at a position where only one is required by the draft, they are reduced to a single one. Protected white spaces are not changed. A white space is protected if it is part of a string literal or XML content. The whitespacer is represented by the API interface IWhitespacer and the base implementation is TMQLWhiteSpacer.

1:      SELECT $p>>characteristics tm:name
2:              WHERE $p ISA person
3:
4:      => SELECT $p >> characteristics tm:name WHERE $p ISA person

3.6.2. Lexical scanner

The lexical scanner is a special program module of the engine which splits the given query into language-specific elements, called tokens. Each token represents a logically independent and language-specific part of the query, like a keyword or a variable name. All tokens are defined by the lowest grammar level, the token level. The lexical scanner is represented by the API interface ILexer and the base implementation is TMQLLexer.

At first the lexical scanner splits the given string-represented query using a special tokenizer class provided by the internal engine processor. The tokenized string is checked for known keywords element by element, and each element is represented by a language-specific token. The result of the lexical scanning is a chain of string-represented tokens and language-specific tokens.

1:      SELECT $p / tm:name
2:              WHERE $p ISA person
3:
4:      => [ SELECT , $p , >> , characteristics , tm:name , >> , atomify , WHERE , $p , ISA , person ]

3.6.3. Parser Module

The parser is the core module of the tmql4j engine. It converts the lexical tokens to a tree structure representing the semantic structure of the given query. Each node of the parser tree represents a production rule of the TMQL draft and contains a number of children if the production rule contains non-terminals representing further production rules. Leaves can only be simple atoms or simple steps. The tree structure is an instance of the API interface IParserTree and is created by an implementation of IParser. The base implementation of the parser is the class TMQLParser.

SELECT $p / tm:name
        WHERE $p ISA person

=> QueryExpression([SELECT, $p, >>, characteristics, tm:name, >>, atomify, WHERE, $p, ISA, person])
        |--SelectExpression([SELECT, $p, >>, characteristics, tm:name, >>, atomify, WHERE, $p, ISA, person])
        |--SelectClause([SELECT, $p, >>, characteristics, tm:name, >>, atomify])
        |       |--ValueExpression([$p, >>, characteristics, tm:name, >>, atomify])
        |               |--Content([$p, >>, characteristics, tm:name, >>, atomify])
        |                       |--QueryExpression([$p, >>, characteristics, tm:name, >>, atomify])
        |                               |--PathExpression([$p, >>, characteristics, tm:name, >>, atomify])
        |                                       |--PostfixedExpression([$p, >>, characteristics, tm:name, >>, atomify])
        |                                               |--SimpleContent([$p, >>, characteristics, tm:name, >>, atomify])
        |                                                       |--Navigation([>>, characteristics, tm:name, >>, atomify])
        |                                                               |--Step([>>, characteristics, tm:name])
        |                                                               |--Step([>>, atomify])
        |--WhereClause([WHERE, $p, ISA, person])
                |--BooleanExpression([$p, ISA, person])
                        |--BooleanPrimitive([$p, ISA, person])
                                |--ExistsClause([$p, ISA, person])
                                        |--Content([$p, ISA, person])
                                                |--QueryExpression([$p, ISA, person])
                                                        |--PathExpression([$p, ISA, person])
                                                                |--ISAExpression([$p, ISA, person])
                                                                        |--SimpleContent([$p])
                                                                        |--SimpleContent([person])

3.6.4. Interpreter Module

The interpreter module consists of a set of expression interpreters. Each interpreter is responsible for one specific expression type and is instantiated if an expression of this type is part of the parsing tree. The overall result of the interpretation task is an instance of QueryMatches, which is transformed into an IResultSet by the underlying result processor. An expression interpreter is an instance of the API interface IExpressionInterpreter or a subclass of the abstract class ExpressionInterpreterImpl.

3.6.5. Result-Processing Module

The last module of the default processing chain is the result processor. The result processor transforms and auto-reduces the interpretation results. The results are transformed into an instance of IResultSet and reduced to a two-dimensional result: every tuple containing nested sequences is expanded into one flat row per combination of the nested values.

{
        [ "a" , [ "c" , "b" ] ] ,
        [ "x" , [ "y0" , "y1" ] , [ "z0" , "z1" ] ]
}

=>      {       [ "a" , "c" ] ,
                [ "a" , "b" ] ,
                [ "x" , "y0" , "z0" ] ,
                [ "x" , "y1" , "z0" ] ,
                [ "x" , "y0" , "z1" ] ,
                [ "x" , "y1" , "z1" ] }
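
The reduction shown above can be sketched in plain Java. This is only an illustration of the idea, not the code of the tmql4j result processor: the class ResultReductionSketch and its methods are hypothetical, and tuples are modelled as plain java.util.List instances.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not part of tmql4j): flattens nested tuples into plain rows. */
final class ResultReductionSketch {

    /** Expands one tuple; a cell that is a List contributes one row per element. */
    static List<List<Object>> reduceTuple(List<?> tuple) {
        List<List<Object>> rows = new ArrayList<>();
        rows.add(new ArrayList<>());
        for (Object cell : tuple) {
            List<?> values = (cell instanceof List)
                    ? (List<?>) cell
                    : Collections.singletonList(cell);
            List<List<Object>> expanded = new ArrayList<>();
            // Looping over the new values first lets earlier columns vary fastest,
            // which reproduces the row order of the listing above.
            for (Object value : values) {
                for (List<Object> row : rows) {
                    List<Object> copy = new ArrayList<>(row);
                    copy.add(value);
                    expanded.add(copy);
                }
            }
            rows = expanded;
        }
        return rows;
    }

    /** Reduces all tuples and concatenates the resulting rows. */
    static List<List<Object>> reduce(List<? extends List<?>> tuples) {
        List<List<Object>> result = new ArrayList<>();
        for (List<?> tuple : tuples) {
            result.addAll(reduceTuple(tuple));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Object>> input = Arrays.asList(
                Arrays.<Object>asList("a", Arrays.asList("c", "b")),
                Arrays.<Object>asList("x", Arrays.asList("y0", "y1"), Arrays.asList("z0", "z1")));
        for (List<Object> row : reduce(input)) {
            System.out.println(row);
        }
    }
}
```

Each nested sequence multiplies the number of rows, so a tuple with two nested sequences of two values each yields four rows.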

3.6.6. Definition of a new Runtime

The tmql4j engine allows you to design your own query engine and use it alongside the existing ones. To enable this, the user has to implement the ITMQLRuntime interface and provide this implementation through the Java service provider mechanism, with the fully qualified interface name as service identifier.

3.7. Results

The results of a querying process are represented similarly to the Java Database Connectivity ( JDBC ) ResultSet. The result processor transforms the sequence of tuples generated by the interpretation module into an instance of IResultSet. The type of the result set depends on the property value and the expression type. The user can add their own implementation of IResultSet to add specific functions or to represent the data in the desired way.

The tmql4j engine automatically selects the result set implementation which is compatible with the result types. If the query returns XML content, the engine changes the result set class to an implementation of XMLResult. If the interpreter returns CTM content, the result set will be an instance of CTMResult.

The IResultSet interface defines a method to check the type of the result set based on an internal enumeration.

IResultSet<?> set = query.getResults();
return set.getResultType();

=> "TMAPI" or "XML" or "CTM" or "TEMPLATE" or "JTMQR"

3.7.1. Using an IResultSet

The IResultSet is designed similarly to the JDBC ResultSet and can be used in a similar way. The result set allows iterating over the contained results, which are represented by the interface IResult. An instance of IResult represents exactly one tuple of the interpretation result and contains a set of literals and items representing the tuple items. Like the IResultSet, an IResult can also be iterated.

1:      IResultSet<?> set = query.getResults();
2:      for ( IResult result : set ){
3:              for ( Object item : result ){
4:                      ...
5:              }
6:      }

In line 1 we extract the result set from the query instance query using the method getResults. The wildcard ? is used because we do not know the IResult class contained in the result set. Because IResult and IResultSet both extend the Java Iterable interface, we can simply iterate over the contained elements using the for-loop, as shown in lines 2 and 3.

As an alternative, the IResultSet and IResult interfaces provide a set of get methods to directly access any element of the results.

IResultSet#get(int,int)         ::      Object
IResultSet#get(int)             ::      IResult
IResult#get(int)                ::      Object

The get method with two integer arguments accesses the cell element at the given row and column. The indexes are zero-based. The method automatically converts the result to the type of the variable it is bound to. The same handling is provided by the get method of IResult.

The get method with only one argument accesses the whole row of the result set at the specified index.

By using aliases within the query, the user can define string-based indexes similar to JDBC. The string-based indexes can be used with the get methods of IResult or IResultSet.

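IQuery q = runtime.run(tm, "// tm:subject / tm:name AS \"name\"");
IResultSet<?> rs = q.getResults();
// access the cell by row index and column alias
String name = rs.get(0, "name");
// or fetch the row first and access the cell by its alias
IResult r = rs.get(0);
String name_ = r.get("name");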

The get methods using string-indexes are similar to the integer-based ones.

IResultSet#get(int,String)      ::      Object
IResultSet#get(String)          ::      IResult
IResult#get(String)             ::      Object

Sometimes the cell value can be null because the navigation result is empty. For example, if a topic has no names, the navigation to the name literals results in an empty cell. The result set provides special methods to check if a cell value is null.

IResultSet#isNullValue(int,int)         ::      boolean
IResultSet#isNullValue(int,String)      ::      boolean
IResult#isNullValue(int)                ::      boolean
IResult#isNullValue(String)             ::      boolean
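
The interplay of index-based access, alias-based access and null checks can be illustrated with a small stand-in. The class AliasedResultSketch below is hypothetical and does not use the tmql4j classes; it only assumes that every column alias maps to a zero-based column index.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a result set (not the tmql4j IResultSet). */
final class AliasedResultSketch implements Iterable<List<Object>> {

    private final Map<String, Integer> aliases = new LinkedHashMap<>();
    private final List<List<Object>> rows;

    AliasedResultSketch(List<String> columnAliases, List<List<Object>> rows) {
        for (int i = 0; i < columnAliases.size(); i++) {
            aliases.put(columnAliases.get(i), i);
        }
        this.rows = rows;
    }

    /** Cell access by zero-based row and column index. */
    Object get(int row, int column) {
        return rows.get(row).get(column);
    }

    /** Cell access by zero-based row index and column alias. */
    Object get(int row, String alias) {
        Integer column = aliases.get(alias);
        if (column == null) {
            throw new IllegalArgumentException("Unknown alias: " + alias);
        }
        return get(row, column.intValue());
    }

    /** True if the cell is empty, for example because a topic has no name. */
    boolean isNullValue(int row, int column) {
        return get(row, column) == null;
    }

    boolean isNullValue(int row, String alias) {
        return get(row, alias) == null;
    }

    @Override
    public Iterator<List<Object>> iterator() {
        return rows.iterator();
    }

    public static void main(String[] args) {
        AliasedResultSketch rs = new AliasedResultSketch(
                Arrays.asList("topic", "name"),
                Arrays.asList(
                        Arrays.<Object>asList("t1", "Alice"),
                        Arrays.<Object>asList("t2", null)));
        System.out.println(rs.get(0, 1));
        System.out.println(rs.get(0, "name"));
        System.out.println(rs.isNullValue(1, "name"));
        for (List<Object> row : rs) {
            System.out.println(row);
        }
    }
}
```

Checking isNullValue before reading a cell avoids a NullPointerException when the value is used afterwards.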

TMAPI-Results

The base result type is a TMAPI result set containing plain Java objects and TMAPI objects representing the querying results. This result type is generated by the result processor for every expression type, except for flwr-expressions returning XML or CTM content.

TopicMap-Results

The IResultSet interface provides a method toTopicMap to convert the results of the TMQL query to a new topic map instance containing only the topics and associations of the result set.

Please note: If the query returns XML fragments, the method is unsupported by the result set implementation.

CTM-Results

The IResultSet interface provides a method toCtm to convert the results of the tmql query to a CTM topic map serialization.

Please note: If the query returns CTM fragments, the method joins the fragments into one complete CTM file.

Please note: If the query returns XML fragments, the method is unsupported by the result set implementation.

Please note: If the query returns JTMQR, the method is unsupported by the result set implementation.

XTM-Results

The IResultSet interface provides a method toXtm to convert the results of the TMQL query to an XTM topic map serialization. In this case every topic and association is converted to its XTM fragment.

Please note: If the query returns CTM fragments, the method is unsupported by the result set implementation.

Please note: If the query returns XML fragments, the method is unsupported by the result set implementation.

Please note: If the query returns JTMQR, the method is unsupported by the result set implementation.

JTMQR-Results

The IResultSet interface provides a method toJTMQR to convert the results to a JSON representation (JTMQR). The method supports an optional argument to define the version of JTMQR which should be used. The default version is 1.

Please note: If the query returns CTM fragments, the method is unsupported by the result set implementation.

Please note: If the query returns XML fragments, the method is unsupported by the result set implementation.

XML-Results

The IResultSet interface provides a method toXml to convert the results of the tmql query to an XML document. This method is only supported by the flwr result set XMLResult.

Update Results

When using the update-expression of the Topic Maps Modification Language (TMQL-ML), the result contains a set of information about the modified constructs. The update-expression returns the internal ids of the constructs which were created, modified or were the context of any modification. For example, if a name was added to a topic, the result set will contain the name id and the topic id. If the type used for the new name did not exist before, the id of this new type will be contained as well.

The results of the modification are aliased by the type of the construct. The following table contains the aliases and their descriptions.

column alias    description
topics          The id of the topic a name or occurrence was added to, or the id of a topic which had to be created to be used as type, theme or player.
associations    The id of the association a role was added to, or the id of the newly created association.
names           The id of the name which was modified or created, or which a variant was added to.
occurrences     The id of the occurrence which was modified or created.
roles           The id of the role which was modified or created.
variants        The id of the variant which was modified or created.

Overview

method          description
toCtm           Converts the results to a new topic map instance and returns its CTM representation.
toXtm           Converts the results to a new topic map instance and returns its XTM representation.
toTopicMap      Converts the results to a new topic map instance.
toJTMQR         Converts the results to its JTMQR representation.
toXML           Only for XML content (FLWR). Returns the XML fragments as XML document.

Please note: Each method except the toTopicMap method has an overloaded variant with an argument of type OutputStream. This variant writes the result directly to the given stream.

3.8. TMQL API

The topic maps query engine tmql4j is not limited to topic maps engines based on the TMAPI specification. In addition, the engine provides a set of interfaces to add any backend implementation which provides information represented as topic map constructs.

3.8.1. The ITMQLRuntime Interface

The interface ITMQLRuntime specifies a runtime implementation which handles a topic map query language.

public interface ITMQLRuntime {

        public void run(IQuery query) throws TMQLRuntimeException;

        public IQuery run(TopicMap topicMap, String query) throws TMQLRuntimeException;

        public ILanguageContext getLanguageContext();

        public IExtensionPointAdapter getExtensionPointAdapter() throws UnsupportedOperationException;

        public boolean isExtensionMechanismSupported();

        public void setTopicMapSystem(TopicMapSystem system);

        public TopicMapSystem getTopicMapSystem();

        public IParserTree parse(final String query) throws TMQLRuntimeException;

        public IParserTree parse(final IQuery query) throws TMQLRuntimeException;

        public ITmqlProcessor getTmqlProcessor();

        public IConstructResolver getConstructResolver();

         public String getLanguageName();
}

The method getConstructResolver returns the construct resolver of the runtime, which is used to identify a topic map construct by a given identifier. Each method of the resolver takes a reference to the current querying context as first argument and the identifier as second argument. In relation to the TMAPI, the resolver returns an instance of Construct representing any construct of a topic map, like topics, associations or roles. If the construct cannot be found or is unknown to the called backend, null should be returned.

The method getTmqlProcessor is called during the querying process to fetch the processor instance which executes the querying process. The processor encapsulates the lexical scanner, the parser and the result processing modules.

The method getExtensionPointAdapter returns the internal reference of the extension point adapter if the runtime supports the extension mechanism, which can be checked by calling the method isExtensionMechanismSupported. If the method is called although the mechanism is not supported, an UnsupportedOperationException is thrown.

The methods run and parse are called by the upper application.

3.8.2. The IConstructResolver Interface

The interface definition represents a utility module to find a construct by different identifier types, like its subject-identifiers, subject-locators or item-identifiers.

public interface IConstructResolver {

        public Topic getTopicBySubjectIdentifier(final IContext context, final String identifier);

        public Topic getTopicBySubjectLocator(final IContext context, final String identifier);

        public Construct getConstructByItemIdentifier(final IContext context, final String identifier);

        public Construct getConstructByIdentifier(final IContext context, final String identifier);

}

The first parameter of each method contains the current context of the querying process, like the topic map instance itself, the query and additional prefixes defined by the query.

As its name indicates, the method getTopicBySubjectIdentifier returns a topic item by its subject-identifier. As parameters the method gets the current context, which provides access to the topic map containing the information, and the unique subject-identifier of a topic as string-represented IRI. The topic map can also be an abstraction container for the underlying backend. As an alternative, the method getTopicBySubjectLocator returns a topic item identified by its subject-locator. Both methods return the topic item identified by the given IRI. If the topic cannot be resolved, null shall be returned.

In relation to the topic maps data model, the third method getConstructByItemIdentifier has to return the topic map construct identified by the item-identifier given as second argument. If there is no construct with this item-identifier, null shall be returned.

The last method combines the three functions to get a topic map construct by any of its identifiers. Please note that this method can be ambiguous if the topic map contains one construct with the IRI as subject-locator and another one with the same IRI as subject-identifier. If the schema of the queried topic map or the abstraction layer forbids such a situation, the method can be used safely.
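
The ambiguity of the combined lookup can be illustrated with a minimal in-memory sketch. The class ConstructResolverSketch is hypothetical: constructs are represented by plain strings instead of TMAPI objects, the context parameter is omitted, and the lookup order (subject-identifier, subject-locator, item-identifier) is an assumption made for this example only.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory resolver; constructs are represented by plain strings. */
final class ConstructResolverSketch {

    private final Map<String, String> bySubjectIdentifier = new HashMap<>();
    private final Map<String, String> bySubjectLocator = new HashMap<>();
    private final Map<String, String> byItemIdentifier = new HashMap<>();

    void addSubjectIdentifier(String iri, String topic) {
        bySubjectIdentifier.put(iri, topic);
    }

    void addSubjectLocator(String iri, String topic) {
        bySubjectLocator.put(iri, topic);
    }

    void addItemIdentifier(String iri, String construct) {
        byItemIdentifier.put(iri, construct);
    }

    /** Returns the topic with the given subject-identifier or null. */
    String getTopicBySubjectIdentifier(String iri) {
        return bySubjectIdentifier.get(iri);
    }

    /** Returns the topic with the given subject-locator or null. */
    String getTopicBySubjectLocator(String iri) {
        return bySubjectLocator.get(iri);
    }

    /** Returns the construct with the given item-identifier or null. */
    String getConstructByItemIdentifier(String iri) {
        return byItemIdentifier.get(iri);
    }

    /** Tries the three identifier types in a fixed (assumed) order; the first match wins. */
    String getConstructByIdentifier(String iri) {
        String construct = getTopicBySubjectIdentifier(iri);
        if (construct == null) {
            construct = getTopicBySubjectLocator(iri);
        }
        if (construct == null) {
            construct = getConstructByItemIdentifier(iri);
        }
        return construct;
    }

    public static void main(String[] args) {
        ConstructResolverSketch resolver = new ConstructResolverSketch();
        resolver.addSubjectIdentifier("http://example.org/puccini", "topic:composer");
        resolver.addSubjectLocator("http://example.org/puccini", "topic:web-page");
        // Ambiguous IRI: only the topic found by subject-identifier is returned.
        System.out.println(resolver.getConstructByIdentifier("http://example.org/puccini"));
        // Unknown IRI: null is returned.
        System.out.println(resolver.getConstructByIdentifier("http://example.org/unknown"));
    }
}
```

With a fixed lookup order the second topic is silently hidden, which is why the combined method should only be used when such collisions are ruled out.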

3.8.3. The INavigationAxis Interface

The current draft of the topic maps query language contains a special navigation expression type based on the proxies defined by the topic maps reference model. The draft describes 12 axes as an abstraction of the topic maps graph defined by the relations between topic map constructs like associations and roles. For these axes the tmql4j API contains the interface INavigationAxis, which represents one of them.

public interface INavigationAxis {

        TopicMap getTopicMap() throws NavigationException;

        void setTopicMap(TopicMap topicMap);

        Class<? extends IToken> getNavigationType();

        boolean supportsForwardNavigation(final Object construct) throws NavigationException;

        boolean supportsForwardNavigation(final Object construct, final Object optional) throws NavigationException;

        ITupleSequence<?> navigateForward(final Object construct) throws NavigationException;

        ITupleSequence<?> navigateForward(final Object construct, final Object optional) throws NavigationException;

        Class<?> getForwardNavigationResultClass(final Object construct) throws NavigationException;

}

The method setTopicMap is used by the runtime container to pass the current topic map to the navigation axis implementation. The implementation can use the given topic map to realize the navigation step if necessary.

The method getNavigationType returns the class of the token representing this axis. The core engine defines one token for each pre-defined navigation axis.

Some of the TMQL axes support an optional type parameter in addition to the navigation start represented by a topic reference. For that reason most methods of the interface are overloaded, so that they can be used with or without the optional argument. The method supportsForwardNavigation checks if the given start node of the abstract topic maps graph is supported by the axis and, if the optional type is not null, whether it can be used.

The navigation step over the axis is provided by the navigateForward method, which returns a tuple sequence containing all topic map constructs that are target nodes of the navigation step.

The last method getForwardNavigationResultClass returns the class object representing the expected result type of a navigation step of this navigation axis.

In addition to the methods containing the word forward, there are corresponding methods which realize the navigation in backward direction, as described by the current draft.

If any of the navigation functions fails, the method will throw an exception of type NavigationException.
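
The forward and backward directions of one axis can be sketched as follows. The example models the types axis of the draft (forward: the types of a topic, backward: the instances of a type) on a tiny in-memory graph. The class TypesAxisSketch is hypothetical, does not implement INavigationAxis, represents topics by plain strings, and uses IllegalArgumentException as a stand-in for NavigationException.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of one navigation axis over an in-memory type-instance graph. */
final class TypesAxisSketch {

    /** Maps every instance to the sorted set of its types. */
    private final Map<String, Set<String>> typesOfInstance = new HashMap<>();

    void addTypeInstance(String instance, String type) {
        typesOfInstance.computeIfAbsent(instance, key -> new TreeSet<>()).add(type);
    }

    /** In this sketch only topics (strings) are supported as start nodes. */
    boolean supportsNavigation(Object construct) {
        return construct instanceof String;
    }

    /** Forward direction: from a topic to all of its types. */
    List<String> navigateForward(Object construct) {
        check(construct);
        Set<String> types = typesOfInstance.get(construct);
        List<String> result = new ArrayList<>();
        if (types != null) {
            result.addAll(types);
        }
        return result;
    }

    /** Backward direction: from a type to all of its instances. */
    List<String> navigateBackward(Object construct) {
        check(construct);
        List<String> instances = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : typesOfInstance.entrySet()) {
            if (entry.getValue().contains(construct)) {
                instances.add(entry.getKey());
            }
        }
        Collections.sort(instances);
        return instances;
    }

    private void check(Object construct) {
        if (!supportsNavigation(construct)) {
            throw new IllegalArgumentException("Unsupported start node: " + construct);
        }
    }

    public static void main(String[] args) {
        TypesAxisSketch axis = new TypesAxisSketch();
        axis.addTypeInstance("puccini", "composer");
        axis.addTypeInstance("puccini", "person");
        axis.addTypeInstance("verdi", "composer");
        System.out.println(axis.navigateForward("puccini"));
        System.out.println(axis.navigateBackward("composer"));
    }
}
```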

3.9. Prepared Statements

Since version 3.0.0 the tmql4j query suite supports prepared statements to pre-calculate the parsing tree. Prepared statements can contain wildcards, which have to be replaced by values before querying. The main benefit of prepared statements is the reuse of the parsed query tree, which can speed up the application if a query is used more than once.

3.9.1. Create a Prepared Statement

The ITMQLRuntime interface provides methods to create a new prepared statement for a given string representation. The prepared statement may contain wildcards starting with ?.

1:      IPreparedStatement statement = runtime.preparedStatement(" ? >> characteristics ");

3.9.2. Using wildcards

Wildcards can be used in a named or an anonymous variant. All wildcards have to start with ?, followed by alphanumerical characters in case of the named variant. The anonymous wildcard consists of the ? only.

Before the prepared statement can be executed by using the run method, the wildcards have to be bound to specific values. If at least one wildcard is not bound to a value, an error occurs when the run method is called. The value binding can be done in different ways.

Providing Values as Arguments of the Run Method

The run method of a prepared statement supports an unlimited list of object arguments, which are interpreted as values for the wildcards. The wildcards are bound by index, in the same order as the values are passed to the run method.

1:      IPreparedStatement statement = runtime.preparedStatement(" ? >> characteristics ");
2:      Topic topic = topicMap.createTopic();
3:      ...
4:      statement.run(topicMap, topic);

The code at line 4 is interpreted in the following way. The first argument is interpreted as the topic map the query should be executed for. The second argument, the topic, is set as value for the first wildcard ?.

Using the Index-Based Setter

A more intuitive alternative to passing the values to the run method is to use the setters of the IPreparedStatement interface. There is one setter for each supported value type. The binding is done for the index given by the first integer argument of the setter method. The indexes are zero-based.

1:      IPreparedStatement statement = runtime.preparedStatement(" ? >> characteristics ");
2:      Topic topic = topicMap.createTopic();
3:      ...
4:      statement.setTopic(0, topic);
5:      statement.run(topicMap);

Using Named Wildcards

For named wildcards, the prepared statement also provides special setters which expect a string argument as first parameter. This string argument has to be the same as the name of the wildcard. If the string argument starts with ?, this character is removed automatically. Using named wildcards and the special setters, an application can replace more than one wildcard with one call.

1:      IPreparedStatement statement = runtime.preparedStatement(" ?topic >> indicators UNION ?topic >> locators ");
2:      Topic topic = topicMap.createTopic();
3:      ...
4:      statement.setTopic("topic", topic); // or statement.setTopic("?topic", topic);
5:      statement.run(topicMap);

As a special string argument the IPreparedStatement#ANONYMOUS constant can be used to replace all anonymous wildcards by the same value.

Note
Named wildcards can also be set by using the index-based methods, and they do not modify the index of anonymous wildcards.
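
The binding rules described in this section can be sketched with a small stand-alone class. The class WildcardBindingSketch is hypothetical and is not the tmql4j implementation: it works on whitespace-separated tokens and replaces wildcards textually, whereas the engine binds values in the pre-calculated parsing tree. It also assumes that every wildcard occurrence receives a zero-based index in order of appearance, which may differ from the indexing used by the engine.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of index-based and name-based wildcard binding. */
final class WildcardBindingSketch {

    private final String[] tokens;
    /** Token positions of all wildcards, in order of appearance. */
    private final List<Integer> wildcardPositions = new ArrayList<>();
    /** Bound values, keyed by token position. */
    private final Map<Integer, Object> bindings = new HashMap<>();

    WildcardBindingSketch(String query) {
        tokens = query.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].startsWith("?")) {
                wildcardPositions.add(i);
            }
        }
    }

    /** Binds the wildcard with the given zero-based index (assumed: order of appearance). */
    void set(int index, Object value) {
        bindings.put(wildcardPositions.get(index), value);
    }

    /** Binds all wildcards with the given name; a leading "?" is optional. */
    void set(String name, Object value) {
        String wildcard = name.startsWith("?") ? name : "?" + name;
        for (int position : wildcardPositions) {
            if (tokens[position].equals(wildcard)) {
                bindings.put(position, value);
            }
        }
    }

    /** Returns the query with all wildcards replaced; fails if a wildcard is unbound. */
    String bind() {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            String token = tokens[i];
            if (token.startsWith("?")) {
                if (!bindings.containsKey(i)) {
                    throw new IllegalStateException("Unbound wildcard at token " + i);
                }
                token = String.valueOf(bindings.get(i));
            }
            if (i > 0) {
                builder.append(' ');
            }
            builder.append(token);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        WildcardBindingSketch named =
                new WildcardBindingSketch(" ?topic >> indicators UNION ?topic >> locators ");
        named.set("topic", "puccini");
        System.out.println(named.bind());

        WildcardBindingSketch anonymous = new WildcardBindingSketch(" ? >> characteristics ");
        anonymous.set(0, "puccini");
        System.out.println(anonymous.bind());
    }
}
```

One call with a name binds every occurrence of that wildcard, while the index-based call binds exactly one occurrence.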

3.10. TMQL4J Extensions

The tmql4j engine provides a framework which makes it simple to include your own extensions. Using Java service providers, an extension is registered automatically if the implementation class is on the class path.

The engine currently supports two different extension types - language extensions and engine extensions.

3.10.1. Engine Extensions

An engine extension simply adds or changes some functionality of the querying process or the engine workflow. To show how to integrate your own extensions, the following example presents an extension implementing a multi-threaded expression interpreter for quantified-expressions of TMQL.

To integrate an engine extension, the plug-in has to contain an implementation of the IExtensionPoint interface representing the entry point of the plug-in itself. An extension point is used by the runtime to initialize and register the plug-in. Using Java services, the extension adapter finds all implementations of the IExtensionPoint interface.

1:      public interface IExtensionPoint {
2:
3:              /**
4:              * Method called by the runtime to register the extension end-point before
5:              * running the query process.
6:              *
7:              * @param runtime
8:              *            the calling runtime
9:              * @throws TMQLExtensionRegistryException
10:             *             thrown if an exception caused by the internal runtime
11:             */
12:             public void registerExtension(ITMQLRuntime runtime)
13:                     throws TMQLExtensionRegistryException;
14:
15:             /**
16:             * Each extension point has to define an unique extension point id used to
17:             * represent the extension point in context of the current TMQL runtime. If
18:             * two extension points has the same identifier the extension adapter will
19:             * throw an exception during the initialization time of extension points on
20:             * startup.
21:             *
22:             * @return the unique id
23:             */
24:             public String getExtensionPointId();
25:
26:     }

An extension point has to implement only two methods of the interface to be used as a part of the tmql4j engine. The method getExtensionPointId has to return a unique, string-represented identification of the plug-in. The unique id is used by the runtime to identify the plug-in during the runtime process. If two plug-ins are integrated using the same identification string, an exception will be raised.

The main work of the extension point is done by the method registerExtension. The method is called by the extension adapter and gets a reference to the current runtime container which tries to load the plug-in. Using this method the developer can make changes or integrate new interpreters for existing expression types. To register your own interpreter class, use the interpreter registry, which is managed by the language context of the current runtime container.

1:      public class MultiThreadExtensionPoint implements IExtensionPoint {
2:
3:              /**
4:               * {@inheritDoc}
5:               */
6:              @Override
7:              public void registerExtension(ITMQLRuntime runtime)
8:                      throws TMQLExtensionRegistryException {
9:                              try {
10:                                             runtime.getLanguageContext().getInterpreterRegistry().registerInterpreterClass( ExistsClause.class,
                                                                MultiThreadExistsClauseInterpreter.class);
11:                             } catch (TMQLException e) {
12:                                     throw new TMQLExtensionRegistryException(e);
13:                             }
14:             }
15:
16:             /**
17:              * {@inheritDoc}
18:              */
19:             @Override
20:             public String getExtensionPointId() {
21:                     return "tmql4j.extension.mutlithreaded";
22:             }
23:
24:     }

This extension point implementation is used as entry point of the multi-thread extension for quantified-expressions. In line 21 the class returns the string literal tmql4j.extension.mutlithreaded as identification of the current plug-in. In code line 10 the interpreter class MultiThreadExistsClauseInterpreter is registered as the interpreter for all expressions of the type ExistsClause. After this registration the tmql4j engine will create an instance of the interpreter class for each exists clause in the context of the parsing tree.

To register the interpreter class, the runtime reference is used to get the language context by calling the method getLanguageContext. The language context provides the method getInterpreterRegistry to get the internal interpreter registry instance, which should be used to add new interpreters. For this purpose the method registerInterpreterClass with two arguments is used. The first argument represents the expression type as Java class and the second argument is a Java class representing the interpreter.
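
How such a registry could map expression types to interpreter classes is sketched below. The class InterpreterRegistrySketch and its nested types are hypothetical stand-ins and not the tmql4j registry; the sketch assumes that a later registration replaces an earlier one and that interpreters are created reflectively through a no-argument constructor.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a registry mapping expression types to interpreter classes. */
final class InterpreterRegistrySketch {

    /** Stand-in for an expression interpreter. */
    interface Interpreter {
        String interpret();
    }

    /** Stand-in for an expression type of the parsing tree. */
    static class ExistsClause {
    }

    static class DefaultExistsClauseInterpreter implements Interpreter {
        DefaultExistsClauseInterpreter() {
        }

        public String interpret() {
            return "default";
        }
    }

    static class MultiThreadExistsClauseInterpreter implements Interpreter {
        MultiThreadExistsClauseInterpreter() {
        }

        public String interpret() {
            return "multi-threaded";
        }
    }

    private final Map<Class<?>, Class<? extends Interpreter>> interpreters = new HashMap<>();

    /** Registers the interpreter class for the expression type; replaces an earlier entry. */
    void registerInterpreterClass(Class<?> expressionType,
            Class<? extends Interpreter> interpreterClass) {
        interpreters.put(expressionType, interpreterClass);
    }

    /** Creates a new interpreter instance for one expression of the parsing tree. */
    Interpreter newInterpreter(Class<?> expressionType) {
        Class<? extends Interpreter> interpreterClass = interpreters.get(expressionType);
        if (interpreterClass == null) {
            throw new IllegalStateException("No interpreter for " + expressionType.getName());
        }
        try {
            return interpreterClass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        InterpreterRegistrySketch registry = new InterpreterRegistrySketch();
        registry.registerInterpreterClass(ExistsClause.class, DefaultExistsClauseInterpreter.class);
        // The extension replaces the default interpreter of the expression type.
        registry.registerInterpreterClass(ExistsClause.class, MultiThreadExistsClauseInterpreter.class);
        System.out.println(registry.newInterpreter(ExistsClause.class).interpret());
    }
}
```

Because only one interpreter class is kept per expression type, two extensions registering for the same type would overwrite each other in this sketch.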

Please note that plug-ins can interfere with each other. Sometimes the functionality of one plug-in is overlapped by another plug-in. It cannot be predicted which plug-in or which function will be used, because the order in which plug-ins are loaded is not defined. Plug-ins always override functionality of the core engine, which can no longer be used once the plug-in is integrated. Take care!

To use the Java service provider mechanism, the jar file must include a file named de.topicmapslab.tmql4j.extensions.model.IExtensionPoint which contains only one text line with the fully qualified name of the extension point implementation. The file has to be located in the folder META-INF/services.
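
The discovery itself is provided by the class java.util.ServiceLoader of the Java platform. The following sketch shows the lookup with a hypothetical stand-in interface ExtensionPoint; it is not the code of the tmql4j extension adapter. Without a matching provider file in META-INF/services on the class path, the lookup simply finds no implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

/** Hypothetical sketch of discovering extension points with java.util.ServiceLoader. */
final class ExtensionDiscoverySketch {

    /** Stand-in for the IExtensionPoint interface. */
    public interface ExtensionPoint {
        String getExtensionPointId();
    }

    /** Returns the ids of all extension points registered as service providers. */
    static List<String> discoverIds() {
        List<String> ids = new ArrayList<>();
        // ServiceLoader reads the provider files named after the interface
        // from META-INF/services of every jar on the class path.
        for (ExtensionPoint extensionPoint : ServiceLoader.load(ExtensionPoint.class)) {
            ids.add(extensionPoint.getExtensionPointId());
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println("Discovered extension points: " + discoverIds());
    }
}
```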

3.10.2. Language Extensions

As a second plug-in type, the tmql4j engine supports language extensions. A language extension should be used to add additional productions to the core language defined in the current draft. The extension must not override functionality of the current engine. It only adds new expression types and everything which is necessary to realize the interpretation and the usage of the language extension.

To integrate a language extension, the plug-in has to implement the interface ILanguageExtension. A language extension is an extension point and has to implement the same methods. An extension point is used by the runtime to initialize and register the plug-in. Using Java services, the extension adapter finds all implementations of the IExtensionPoint interface.

1:      public interface ILanguageExtension extends IExtensionPoint {
2:
3:              /**
4:               * Checks if the language extension extends the given expression type. If
5:               * the language extension add new productions starting with the given
6:               * expression type it has to return <code>true</code>, <code>false</code>
7:               * otherwise.
8:               *
9:               * @param expressionType
10:              *            the extended expression type
11:              * @return <code>true</code> if the extension based on the expression type.
12:              */
13:             public boolean extendsExpressionType(
14:                     final Class<? extends IExpression> expressionType);
15:
16:             /**
17:              * Return the language extension entry defining the entry point for using
18:              * the extension during the querying process.
19:              *
20:              * @return the extension entry
21:              */
22:             public ILanguageExtensionEntry getLanguageExtensionEntry();
23:
24:     }

A language extension has to implement the two methods defined in lines 13 and 22. The method extendsExpressionType is used by the runtime to check whether the extension is based on the current expression type. If the language extension adds functionality based on a query-expression, like a new sub-expression for creating new content, it has to return true if the parameter expressionType is bound to the QueryExpression class.

The method getLanguageExtensionEntry returns a reference to the language extension entry. A language extension entry is used to integrate the new productions.

1:      public interface ILanguageExtensionEntry {
2:
3:              /**
4:               * Returns the expression type used as anchor in the parsing tree.
5:               *
6:               * @return the expression type as anchor in the parsing tree
7:               */
8:              public Class<? extends IExpression> getExpressionType();
9:
10:             /**
11:              * Checks if the given language-specific tokens matching the new production
12:              * of the language extension. The method has to check if the extension an be
13:              * used for the given sub-query.
14:              *
15:              * @param runtime
16:              *            the current runtime container
17:              * @param tmqlTokens
18:              *            the language-specific tokens
19:              * @param tokens
20:              *            the string-represented tokens
21:              * @return <code>true</code> if the productions can be used,
22:              *         <code>false</code> otherwise.
23:              */
24:             public boolean isValidProduction(final ITMQLRuntime runtime,
25:                     final List<Class<? extends IToken>> tmqlTokens,
26:                     final List<String> tokens);
27:
28:             /**
29:              * Called by the parser to add new sub-tree nodes using the extension
30:              * anchor.
31:              *
32:              * @param runtime
33:              *            the runtime container
34:              * @param tmqlTokens
35:              *            the language-specific tokens
36:              * @param tokens
37:              *            the string-represented tokens
38:              * @param caller
39:              *            the calling expression
40:              * @param autoAdd
41:              *            flag representing if the sub-tree should add automatically
42:              * @return the created expression
43:              * @throws TMQLInvalidSyntaxException
44:              *             thrown if the expression is invalid
45:              * @throws TMQLGeneratorException
46:              *             thrown if the expression can not be created
47:              */
48:             public IExpression parse(final ITMQLRuntime runtime,
49:                     final List<Class<? extends IToken>> tmqlTokens,
50:                     final List<String> tokens, IExpression caller, boolean autoAdd)
51:                     throws TMQLInvalidSyntaxException, TMQLGeneratorException;
52:     }

The language extension entry is the handler of a language extension. During parsing, the entry is called for the expression type returned by the method getExpressionType in order to attach the new extension at the current node of the parsing tree. This takes place in two steps. First, the runtime calls the method isValidProduction with the current list of language-specific tokens to check whether the tokens match the production rule represented by the entry. If the method returns true, the method parse is called. It should create the new sub-tree starting at the current node. If the flag autoAdd is true, the created expression has to be added to the parsing tree automatically.
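The two steps described above can be sketched as follows. This is only a minimal illustration: the interface Entry, the class InsertEntry and the method dispatch are hypothetical, simplified stand-ins that work on plain strings and are not part of the TMQL4J API.

```java
import java.util.List;

public class ExtensionDispatchSketch {

    /** Hypothetical, simplified stand-in for the language extension entry. */
    interface Entry {
        boolean isValidProduction(List<String> tokens);

        String parse(List<String> tokens, boolean autoAdd);
    }

    /** Example entry that accepts token lists starting with the keyword INSERT. */
    static final class InsertEntry implements Entry {
        @Override
        public boolean isValidProduction(List<String> tokens) {
            return !tokens.isEmpty() && "INSERT".equalsIgnoreCase(tokens.get(0));
        }

        @Override
        public String parse(List<String> tokens, boolean autoAdd) {
            // A real entry would build an expression sub-tree and honour autoAdd;
            // in this sketch a string stands in for the created expression.
            return "insert-expression" + tokens.subList(1, tokens.size());
        }
    }

    /** Step 1: check the production. Step 2: parse only if the check succeeded. */
    static String dispatch(Entry entry, List<String> tokens) {
        if (!entry.isValidProduction(tokens)) {
            return null; // in this sketch: the entry is not responsible for these tokens
        }
        return entry.parse(tokens, true);
    }

    public static void main(String[] args) {
        Entry entry = new InsertEntry();
        System.out.println(dispatch(entry, List.of("INSERT", "a", "b")));
        System.out.println(dispatch(entry, List.of("SELECT", "a")));
    }
}
```

The check and the parsing are kept separate so that a caller can ask an entry whether it applies without creating any tree nodes.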

Please note that the interpreter classes and the new tokens have to be registered first, during the registration of the extension point. The following example registers new tokens and new interpreter classes for a new expression representing an insert statement.

1:      /**
2:       * {@inheritDoc}
3:       */
4:      @Override
5:      public void registerExtension(ITMQLRuntime runtime)
6:                      throws TMQLExtensionRegistryException {
7:              /*
8:               * register tokens
9:               */
10:             TokenRegistry registry = runtime.getLanguageContext().getTokenRegistry();
11:             registry.register(Insert.class);
12:
13:             /*
14:              * register expression interpreter
15:              */
16:             InterpreterRegistry interpreterRegistry = runtime.getLanguageContext()
17:                             .getInterpreterRegistry();
18:             try {
19:                     interpreterRegistry.registerInterpreterClass(InsertClause.class,
20:                                     InsertClauseInterpreter.class);
21:                     interpreterRegistry.registerInterpreterClass(
22:                                     InsertExpression.class, InsertExpressionInterpreter.class);
23:                     interpreterRegistry
24:                                     .registerInterpreterClass(
25:                                                     de.topicmapslab.tmql4j.extension.tmml.grammar.expressions.QueryExpression.class,
26:                                                     QueryExpressionInterpreter.class);
27:             } catch (TMQLException e) {
28:                     throw new TMQLExtensionRegistryException(e);
29:             }
30:
31:             entry = new ModificationPartLanguageExtensionEntry();
32:     }

The snippet is an excerpt from the extension point implementation of the modification part extension of TMQL. It registers a new insert-expression type with the current runtime. First, a new language-specific token is added in line 11 using the token registry. A new token is represented by a class implementing the interface IToken. In lines 19 and 21, two new interpreter classes are added to the current interpreter registry. The interpreter classes are used to interpret the expressions during the querying process. As you can see, the extension also registers an extended implementation of a query-expression in line 23. After the registration of this language extension, there are two different query-expressions: the original one and the one of the modification extension. In this case, the core implementation won't be affected by the extension.

Finally, we take a look at the IToken interface.

    public interface IToken {

        /**
         * Method checks if the token can be represented by the given string literal
         *
         * @param runtime
         *            the containing runtime
         * @param literal
         *            the string literal
         * @return <code>true</code> if the literal can be represented by this
         *         token type
         */
        public boolean isToken(final ITMQLRuntime runtime, final String literal);

        /**
         * Method returns the string representation of the current language token
         *
         * @return the literal
         */
        public String getLiteral();

    }

The interface represents a specific token of the TMQL language or of one of its extensions. The method isToken checks whether the given literal can be represented as a token of this type. The method getLiteral returns the string representation of the token type.
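A token class for a new keyword could look roughly like the following sketch. The interface Token is a hypothetical, simplified stand-in for IToken without the runtime argument, and the case-insensitive comparison is an assumption of this example, not documented behaviour of TMQL4J.

```java
public class InsertTokenSketch {

    /** Hypothetical, simplified stand-in for IToken (the runtime argument is omitted). */
    interface Token {
        boolean isToken(String literal);

        String getLiteral();
    }

    /** Token for the keyword INSERT. */
    static final class Insert implements Token {
        @Override
        public boolean isToken(String literal) {
            // Assumption of this sketch: keywords are matched case-insensitively.
            return getLiteral().equalsIgnoreCase(literal);
        }

        @Override
        public String getLiteral() {
            return "INSERT";
        }
    }

    public static void main(String[] args) {
        Token token = new Insert();
        System.out.println(token.isToken("insert"));
        System.out.println(token.isToken("select"));
    }
}
```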

To make use of the Java service provider mechanism, the jar file must include a file named de.topicmapslab.tmql4j.extensions.model.IExtensionPoint that contains a single line of text with the fully qualified name of the extension point implementation. The file has to be located in the folder META-INF/services.
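The lookup of such provider files can be illustrated with the standard class java.util.ServiceLoader. Whether TMQL4J uses exactly this class internally is an assumption of this sketch, and the interface ExtensionPoint is a hypothetical stand-in for IExtensionPoint.

```java
import java.util.ServiceLoader;

public class ExtensionDiscoverySketch {

    /** Hypothetical stand-in for IExtensionPoint. */
    public interface ExtensionPoint {
        String getExtensionPointId();
    }

    /**
     * Counts the providers visible on the class path. A provider is only found
     * if a file in META-INF/services, named after the binary name of the
     * service interface, contains the fully qualified name of its class.
     */
    static int countProviders() {
        int count = 0;
        for (ExtensionPoint point : ServiceLoader.load(ExtensionPoint.class)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Without a matching provider file no implementation is discovered.
        System.out.println("providers found: " + countProviders());
    }
}
```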

You can also create your own pragmas and register them with the core runtime. To define a new pragma, the extension should provide a class implementing the IPragma interface:

    public interface IPragma {

        /**
         * Returns the identifier of the pragma
         *
         * @return the identifier of the pragma
         */
        public String getIdentifier();

        /**
         * Interpret the given pragma
         *
         * @param runtime
         *            the runtime
         * @param context
         *            the context
         * @param value
         *            the value
         * @throws TMQLRuntimeException
         */
        public void interpret(final ITMQLRuntime runtime, final IContext context, final String value) throws TMQLRuntimeException;

    }

Each pragma implementation class should be registered during the initialization process of the extension by calling the PragmaRegistry provided by the language context.
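The interplay of a pragma and its registry can be sketched as follows. All types here are hypothetical simplifications: Pragma stands in for IPragma, a plain map replaces the runtime and the context, and Registry only mimics the idea of the PragmaRegistry. The pragma name max-results is invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class PragmaSketch {

    /** Hypothetical, simplified stand-in for IPragma. */
    interface Pragma {
        String getIdentifier();

        void interpret(Map<String, String> settings, String value);
    }

    /** Invented example pragma that stores a maximum result size. */
    static final class MaxResultsPragma implements Pragma {
        @Override
        public String getIdentifier() {
            return "max-results";
        }

        @Override
        public void interpret(Map<String, String> settings, String value) {
            int limit = Integer.parseInt(value.trim()); // throws NumberFormatException on bad input
            if (limit < 0) {
                throw new IllegalArgumentException("limit must not be negative");
            }
            settings.put(getIdentifier(), Integer.toString(limit));
        }
    }

    /** Hypothetical registry that looks up pragmas by their identifier. */
    static final class Registry {
        private final Map<String, Pragma> pragmas = new HashMap<>();

        void register(Pragma pragma) {
            pragmas.put(pragma.getIdentifier(), pragma);
        }

        /** Returns false if no pragma is registered for the identifier. */
        boolean apply(String identifier, String value, Map<String, String> settings) {
            Pragma pragma = pragmas.get(identifier);
            if (pragma == null) {
                return false;
            }
            pragma.interpret(settings, value);
            return true;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register(new MaxResultsPragma());
        Map<String, String> settings = new HashMap<>();
        System.out.println(registry.apply("max-results", " 25 ", settings));
        System.out.println(settings);
    }
}
```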

3.10.3. Omnigator Plug-in

As one of the first implementations, we provide a generic query plug-in for the Topic Maps browser Omnigator, called tmql4j-ontopia. It makes it possible to query topic maps using the TMQL syntax. The generic plug-in is an extension of TMQL4J.

3.11. MaJorToM PlugIn

The MaJorToM plugin of the tmql4j query engine provides a set of new functions that are only available in the context of the MaJorToM Topic Maps Engine. The new functions support spatial and temporal calculations and the extraction of information.

3.11.1. Best-Label Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/best-label

profile

fn:best-label (i : topic-item) return string

profile

fn:best-label (i : topic-item, theme: topic-item) return string

profile

fn:best-label (i : topic-item, theme: topic-item, strict: boolean) return string

The first argument is the topic item whose best label should be extracted. The second argument is optional and represents a theme that the scope of a name should contain. The third argument is also optional and represents the strict mode of the best-label function. If the value is true and no name item of the topic has a scope containing the given theme, the function will return nothing.

The best-label function exists in three different versions, which differ in the number of arguments. The result of the function is a string literal or nothing, determined by the following rules.

  1. If the topic has no names, continue with rule 6, unless the third argument is given and set to true. In that case nothing will be returned.

  2. The names with the TMDM default-name type will be preferred.

  3. The names with the lowest number of themes in scope will be preferred. By default, the unconstrained scope has the highest priority. If the second argument is given, the scopes containing this theme have the highest priority. If there is no scope with the theme and the third argument is given and set to true, nothing will be returned.

  4. The names with the lowest number of characters will be preferred. The empty string will be ignored.

  5. The names with the lexicographically lowest value will be preferred.

  6. The lexicographically lowest subject-identifier will be returned.

  7. The lexicographically lowest subject-locator will be returned.

  8. The lexicographically lowest item-identifier will be returned.

  9. The id will be returned.
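The ranking of names in rules 2 to 5 can be expressed as a chain of comparisons. The following sketch covers only these rules for the one-argument version of the function; the theme argument, the strict mode and the identifier fall-backs of rules 6 to 9 are left out. The class Name is a hypothetical value holder and not a MaJorToM type.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class BestLabelSketch {

    /** Hypothetical value holder for a topic name. */
    static final class Name {
        final String value;
        final boolean defaultType;
        final int themeCount;

        Name(String value, boolean defaultType, int themeCount) {
            this.value = value;
            this.defaultType = defaultType;
            this.themeCount = themeCount;
        }
    }

    /** Rule 2, then rule 3, then rule 4, then rule 5. */
    static final Comparator<Name> RANKING =
            Comparator.comparing((Name n) -> !n.defaultType) // default-name type first
                    .thenComparingInt((Name n) -> n.themeCount) // fewest themes in scope
                    .thenComparingInt((Name n) -> n.value.length()) // fewest characters
                    .thenComparing((Name n) -> n.value); // lexicographically lowest

    /** Returns the best label, or an empty result if no usable name exists. */
    static Optional<String> bestLabel(List<Name> names) {
        return names.stream()
                .filter(n -> !n.value.isEmpty()) // assumption: empty names are skipped up front
                .min(RANKING)
                .map(n -> n.value);
    }

    public static void main(String[] args) {
        List<Name> names = List.of(
                new Name("Giacomo Puccini", true, 0),
                new Name("Puccini", true, 0),
                new Name("Puccini, Giacomo", false, 0),
                new Name("Pucc", true, 1));
        System.out.println(bestLabel(names).orElse("(no name)"));
    }
}
```

Taking the minimum under a chained comparator is equivalent to filtering the candidates rule by rule, because each later rule only decides between names that are tied on all earlier rules.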

3.11.2. Best-Identifier Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/best-identifier

profile

fn:best-identifier (i : topic-item) return string

profile

fn:best-identifier (i : topic-item, b: prefixed ) return string

The best-identifier function returns the best identifier of the given topic item. The best identifier is determined by the following rules.

  1. If the topic has at least one subject-identifier, the shortest and lexicographically smallest IRI will be returned.

  2. If the topic has at least one subject-locator, the shortest and lexicographically smallest IRI will be returned.

  3. If the topic has at least one item-identifier, the shortest and lexicographically smallest IRI will be returned.

  4. As fall-back the id will be returned.

The optional boolean argument prefixed indicates whether the best identifier should be prefixed with its type. If the argument is true, the following prefixes are used.

type                  prefix
subject-identifier    si:
subject-locator       sl:
item-identifier       ii:
id                    id:
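The selection rules and the prefixes can be combined in a few lines. This is a sketch under two assumptions: "shortest and lexicographically smallest" is read as "shortest first, ties broken lexicographically", and identifiers are passed as plain strings. The class and method names are hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class BestIdentifierSketch {

    /** Shortest IRI first; ties are broken by lexicographic order (assumption). */
    static final Comparator<String> SHORTEST_THEN_LEXICAL =
            Comparator.comparingInt(String::length).thenComparing(Comparator.naturalOrder());

    static String bestIdentifier(List<String> subjectIdentifiers, List<String> subjectLocators,
            List<String> itemIdentifiers, String id, boolean prefixed) {
        String[] prefixes = {"si:", "sl:", "ii:"};
        List<List<String>> groups = List.of(subjectIdentifiers, subjectLocators, itemIdentifiers);
        // Rules 1 to 3: the first non-empty group of identifiers wins.
        for (int i = 0; i < groups.size(); i++) {
            Optional<String> best = groups.get(i).stream().min(SHORTEST_THEN_LEXICAL);
            if (best.isPresent()) {
                return (prefixed ? prefixes[i] : "") + best.get();
            }
        }
        // Rule 4: fall back to the id.
        return (prefixed ? "id:" : "") + id;
    }

    public static void main(String[] args) {
        System.out.println(bestIdentifier(
                List.of(),
                List.of("http://example.org/b", "http://example.org/a"),
                List.of("http://example.org/item"),
                "42", true));
    }
}
```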

3.11.3. Coordinates-In-Distance Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/coordinates-in-distance

profile

fn:coordinates-in-distance (s : string , dist : decimal ) return tuple-sequence

profile

fn:coordinates-in-distance (lat : decimal, long : decimal , dist : decimal ) return tuple-sequence

The function coordinates-in-distance exists in two versions, which differ in the number of arguments. The result of the function is a sequence of occurrences that each represent a WGS84 coordinate located near the given position. The WGS84 coordinate can be given as a string literal matching the following pattern:

<latitude>';'<longitude>(;<altitude>)?

The coordinate can also be given as a latitude and a longitude value. The last decimal argument is the maximum distance in meters between the occurrence value and the given coordinate.

3.11.4. Distance Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/distance

profile

fn:distance (s1 : string , s2 : string ) return decimal

profile

fn:distance (lat1 : decimal, long1 : decimal , lat2 : decimal, long2 : decimal ) return decimal

The function distance returns the distance between two WGS84 coordinates given by string literals or latitude and longitude values. The distance will be returned as decimal value measured in meters.
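To give an impression of such a calculation, the following sketch parses two coordinate literals of the pattern shown for coordinates-in-distance and computes their distance in meters with the haversine formula. The formula and the mean earth radius of 6371000 m are assumptions of this example; the document does not state which method the plugin uses. The altitude is ignored.

```java
public class Wgs84DistanceSketch {

    /** Mean earth radius in meters (assumption of this sketch). */
    static final double EARTH_RADIUS_M = 6_371_000.0;

    /** Parses "latitude;longitude" or "latitude;longitude;altitude" into {lat, lon}. */
    static double[] parse(String literal) {
        String[] parts = literal.split(";");
        if (parts.length < 2 || parts.length > 3) {
            throw new IllegalArgumentException("not a coordinate literal: " + literal);
        }
        return new double[] {
            Double.parseDouble(parts[0].trim()),
            Double.parseDouble(parts[1].trim())
        };
    }

    /** Great-circle distance in meters between two positions given in degrees. */
    static double distance(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
    }

    /** Variant taking two coordinate literals. */
    static double distance(String first, String second) {
        double[] p = parse(first);
        double[] q = parse(second);
        return distance(p[0], p[1], q[0], q[1]);
    }

    public static void main(String[] args) {
        // One degree of longitude along the equator.
        System.out.println(distance("0;0", "0;1;120"));
    }
}
```

A check in the style of coordinates-in-distance would then compare the computed value with the given maximum distance.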

3.11.5. Dates Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/dates

profile

fn:dates ( ) return tuple-sequence

The function dates returns all occurrences representing a date-time value, identified by the datatype xsd:dateTime or xsd:date.

3.11.6. Dates-After Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/dates-after

profile

fn:dates-after ( s : string ) return tuple-sequence

profile

fn:dates-after ( y : integer , m : integer , d : integer ) return tuple-sequence

profile

fn:dates-after ( y : integer , m : integer , d : integer , h : integer , min : integer , s : integer ) return tuple-sequence

The function dates-after returns all occurrences representing a date-time value, identified by the datatype xsd:dateTime or xsd:date, that lies after a specific date. The date can be defined by a single string literal or as a set of integer values representing the year, month and day and, optionally, the hour, minute and second.

3.11.7. Dates-Before Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/dates-before

profile

fn:dates-before ( s : string ) return tuple-sequence

profile

fn:dates-before ( y : integer , m : integer , d : integer ) return tuple-sequence

profile

fn:dates-before ( y : integer , m : integer , d : integer , h : integer , min : integer , s : integer ) return tuple-sequence

The function dates-before returns all occurrences representing a date-time value, identified by the datatype xsd:dateTime or xsd:date, that lies before a specific date. The date can be defined by a single string literal or as a set of integer values representing the year, month and day and, optionally, the hour, minute and second.

3.11.8. Dates-In-Range Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/dates-in-range

profile

fn:dates-in-range ( l : string , u : string ) return tuple-sequence

profile

fn:dates-in-range ( y1 : integer , m1 : integer , d1 : integer , y2 : integer , m2 : integer , d2 : integer ) return tuple-sequence

profile

fn:dates-in-range ( y1 : integer , m1 : integer , d1 : integer , h1 : integer , min1 : integer , s1 : integer , y2 : integer , m2 : integer , d2 : integer , h2 : integer , min2 : integer , s2 : integer ) return tuple-sequence

The function dates-in-range returns all occurrences representing a date-time value, identified by the datatype xsd:dateTime or xsd:date, that lies between two specific dates. Each date can be defined by a single string literal or as a set of integer values representing the year, month and day and, optionally, the hour, minute and second.
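The range check itself can be illustrated with the java.time classes. The sketch works on plain string values instead of occurrences, accepts only literals without a time zone, and treats both bounds as inclusive. All three points are assumptions of this example, since the document does not specify them.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeParseException;
import java.util.List;
import java.util.stream.Collectors;

public class DatesInRangeSketch {

    /** Parses a date-time literal, falling back to a date literal at midnight. */
    static LocalDateTime parse(String literal) {
        try {
            return LocalDateTime.parse(literal);
        } catch (DateTimeParseException e) {
            return LocalDate.parse(literal).atStartOfDay();
        }
    }

    /** Keeps the values lying between lower and upper (both inclusive). */
    static List<String> datesInRange(List<String> values, String lower, String upper) {
        LocalDateTime from = parse(lower);
        LocalDateTime to = parse(upper);
        return values.stream()
                .filter(value -> {
                    LocalDateTime time = parse(value);
                    return !time.isBefore(from) && !time.isAfter(to);
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> values = List.of("1858-12-22", "1924-11-29T10:30:00", "2000-01-01");
        System.out.println(datesInRange(values, "1900-01-01", "1950-12-31"));
    }
}
```

The functions dates-after and dates-before correspond to the same check with only one of the two bounds.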

3.11.9. Get-Association-Types Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/get-association-types

profile

fn:get-association-types ( ) return tuple-sequence

profile

fn:get-association-types ( b : transitive ) return tuple-sequence

The function get-association-types returns all association types of the topic map by using the provided indexes. If the boolean argument is given and set to true, the supertypes of the association types are also contained in the result.
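The effect of the boolean argument can be pictured as extending a set of types by all of their supertypes. The following sketch assumes that the complete supertype chain is followed and represents types as plain strings. It is an illustration only and does not use the MaJorToM indexes; the type names in the example are invented.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TransitiveTypesSketch {

    /** Returns the given types together with all their direct and indirect supertypes. */
    static Set<String> withSupertypes(Set<String> types, Map<String, Set<String>> directSupertypes) {
        Set<String> result = new TreeSet<>(types);
        Deque<String> todo = new ArrayDeque<>(types);
        while (!todo.isEmpty()) {
            String current = todo.pop();
            for (String supertype : directSupertypes.getOrDefault(current, Set.of())) {
                // add() returns false for known types, so cycles terminate.
                if (result.add(supertype)) {
                    todo.push(supertype);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> directSupertypes = Map.of(
                "composed-by", Set.of("created-by"),
                "created-by", Set.of("relation"));
        System.out.println(withSupertypes(Set.of("composed-by"), directSupertypes));
    }
}
```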

3.11.10. Get-Topic-Types Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/get-topic-types

profile

fn:get-topic-types ( ) return tuple-sequence

profile

fn:get-topic-types ( b : transitive ) return tuple-sequence

The function get-topic-types returns all topic types of the topic map by using the provided indexes. If the boolean argument is given and set to true, the supertypes of the topic types are also contained in the result.

3.11.11. Get-Name-Types Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/get-name-types

profile

fn:get-name-types ( ) return tuple-sequence

profile

fn:get-name-types ( b : transitive ) return tuple-sequence

profile

fn:get-name-types ( t : topic-type ) return tuple-sequence

profile

fn:get-name-types ( t : topic-type, b: withDuplicates ) return tuple-sequence

The function get-name-types returns all name types of the topic map by using the provided indexes. If the boolean argument transitive is given and set to true, the supertypes of the name types are also contained in the result. If the first argument is a topic type, the result only contains the types of names owned by at least one instance of the given topic type. If the second boolean argument is missing or set to false, each name type is returned only once.

3.11.12. Get-Occurrence-Types Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/get-occurrence-types

profile

fn:get-occurrence-types ( ) return tuple-sequence

profile

fn:get-occurrence-types ( b : transitive ) return tuple-sequence

profile

fn:get-occurrence-types ( t : topic-type ) return tuple-sequence

profile

fn:get-occurrence-types ( t : topic-type, b: withDuplicates ) return tuple-sequence

The function get-occurrence-types returns all occurrence types of the topic map by using the provided indexes. If the boolean argument transitive is given and set to true, the supertypes of the occurrence types are also contained in the result. If the first argument is a topic type, the result only contains the types of occurrences owned by at least one instance of the given topic type. If the second boolean argument is missing or set to false, each occurrence type is returned only once.

3.11.13. Get-Characteristic-Types Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/get-characteristic-types

profile

fn:get-characteristic-types ( ) return tuple-sequence

profile

fn:get-characteristic-types ( b : transitive ) return tuple-sequence

profile

fn:get-characteristic-types ( t : topic-type ) return tuple-sequence

profile

fn:get-characteristic-types ( t : topic-type, b: withDuplicates ) return tuple-sequence

The function get-characteristic-types returns all characteristic types of the topic map by using the provided indexes. If the boolean argument transitive is given and set to true, the supertypes of the characteristic types are also contained in the result. If the first argument is a topic type, the result only contains the types of characteristics owned by at least one instance of the given topic type. If the second boolean argument is missing or set to false, each characteristic type is returned only once.

3.11.14. Get-Role-Types Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/get-role-types

profile

fn:get-role-types ( ) return tuple-sequence

profile

fn:get-role-types ( b : transitive ) return tuple-sequence

profile

fn:get-role-types ( t : association-type ) return tuple-sequence

profile

fn:get-role-types ( t : association-type, b: withDuplicates ) return tuple-sequence

The function get-role-types returns all role types of the topic map by using the provided indexes. If the boolean argument transitive is given and set to true, the supertypes of the role types are also contained in the result. If the first argument is an association type, the result only contains the types of roles contained in at least one instance of the given association type. If the second boolean argument is missing or set to false, each role type is returned only once.

3.11.15. Get-Supertypes Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/get-supertypes

profile

fn:get-supertypes ( ) return tuple-sequence

profile

fn:get-supertypes ( t: topic-type ) return tuple-sequence

profile

fn:get-supertypes ( s: sequence ) return tuple-sequence

The function get-supertypes returns all topic types of the topic map acting as supertype. If an argument is given, only the supertypes of the provided topic types are returned.

3.11.16. Get-Subtypes Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/get-subtypes

profile

fn:get-subtypes ( ) return tuple-sequence

profile

fn:get-subtypes ( t: topic-type ) return tuple-sequence

profile

fn:get-subtypes ( s: sequence ) return tuple-sequence

The function get-subtypes returns all topic types of the topic map acting as subtype. If an argument is given, only the subtypes of the provided topic types are returned.

3.11.17. Get-Null-Value Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/get-null-value

profile

fn:get-null-value ( ) return null

The function can be used to add a null value to the result set, for example as fall-back if any calculation may be invalid.

3.11.18. Get-Topics-By-Name-Value Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/fn:get-topics-by-name-value

profile

fn:get-topics-by-name-value ( s: string ) return tuple-sequence

profile

fn:get-topics-by-name-value ( s: sequence ) return tuple-sequence

The function returns all topics of the topic map which have at least one name with exactly one of the given values.

Note
This method with one argument is similar to the navigation "value" << atomify tm:name << characteristics.

3.11.19. Get-Topics-By-Name-Regular-Expression Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/fn:get-topics-by-name-regular-expression

profile

fn:get-topics-by-name-regular-expression ( p: pattern ) return tuple-sequence

profile

fn:get-topics-by-name-regular-expression ( s: sequence ) return tuple-sequence

The function returns all topics of the topic map which have at least one name with a value matching one of the given patterns.

Note
This method with one argument is similar to the navigation "pattern" << ratomify tm:name << characteristics.

3.11.20. Get-Topics-By-Occurrence-Value Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/fn:get-topics-by-occurrence-value

profile

fn:get-topics-by-occurrence-value ( s: string ) return tuple-sequence

profile

fn:get-topics-by-occurrence-value ( s: sequence ) return tuple-sequence

The function returns all topics of the topic map which have at least one occurrence with exactly one of the given values.

Note
This method with one argument is similar to the navigation "value" << atomify tm:occurrence << characteristics.

3.11.21. Get-Topics-By-Occurrence-Regular-Expression Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/fn:get-topics-by-occurrence-regular-expression

profile

fn:get-topics-by-occurrence-regular-expression ( p: pattern ) return tuple-sequence

profile

fn:get-topics-by-occurrence-regular-expression ( s: sequence ) return tuple-sequence

The function returns all topics of the topic map which have at least one occurrence with a value matching one of the given patterns.

Note
This method with one argument is similar to the navigation "pattern" << ratomify tm:occurrence << characteristics.

3.11.22. Get-Topics-By-Characteristic-Value Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/fn:get-topics-by-characteristic-value

profile

fn:get-topics-by-characteristic-value ( s: string ) return tuple-sequence

profile

fn:get-topics-by-characteristic-value ( s: sequence ) return tuple-sequence

The function returns all topics of the topic map which have at least one name or occurrence with exactly one of the given values.

Note
This method with one argument is similar to the navigation "value" << atomify << characteristics.

3.11.23. Get-Topics-By-Characteristic-Regular-Expression Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/fn:get-topics-by-characteristic-regular-expression

profile

fn:get-topics-by-characteristic-regular-expression ( p: pattern ) return tuple-sequence

profile

fn:get-topics-by-characteristic-regular-expression ( s: sequence ) return tuple-sequence

The function returns all topics of the topic map which have at least one name or occurrence with a value matching one of the given patterns.

Note
This method with one argument is similar to the navigation "pattern" << ratomify << characteristics.

3.11.24. Remove-Duplicates Function

item-identifier

http://psi.topicmaps.org/tmql/1.0/functions/fn:remove-duplicates

profile

fn:remove-duplicates( ) return void

The function removes all duplicates from the underlying topic map. In detail, all occurrences having the same parent, type, value, scope and datatype are merged into one. All names having the same parent, type, value and scope are merged into one. All variants having the same parent, value, datatype and scope are merged into one. All roles having the same type, player and parent are merged into one. All associations having the same type, scope and set of roles are merged into one.
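The duplicate detection for occurrences can be sketched by grouping on the properties listed above. The class Occurrence is a hypothetical value holder; a real merge would also have to combine item identifiers and reifiers, which this sketch ignores by simply keeping the first construct of each group.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RemoveDuplicatesSketch {

    /** Hypothetical value holder for an occurrence. */
    static final class Occurrence {
        final String parent;
        final String type;
        final String value;
        final Set<String> scope;
        final String datatype;

        Occurrence(String parent, String type, String value, Set<String> scope, String datatype) {
            this.parent = parent;
            this.type = type;
            this.value = value;
            this.scope = scope;
            this.datatype = datatype;
        }

        /** Two occurrences are duplicates if their keys are equal. */
        List<Object> key() {
            return List.of(parent, type, value, scope, datatype);
        }
    }

    /** Keeps the first occurrence of every group of duplicates. */
    static List<Occurrence> merge(List<Occurrence> occurrences) {
        Map<List<Object>, Occurrence> byKey = new LinkedHashMap<>();
        for (Occurrence occurrence : occurrences) {
            byKey.putIfAbsent(occurrence.key(), occurrence);
        }
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        List<Occurrence> occurrences = List.of(
                new Occurrence("puccini", "born", "1858-12-22", Set.of("en", "web"), "xsd:date"),
                new Occurrence("puccini", "born", "1858-12-22", Set.of("web", "en"), "xsd:date"),
                new Occurrence("puccini", "born", "1858-12-22", Set.of("en"), "xsd:date"));
        System.out.println(merge(occurrences).size());
    }
}
```

Since the scope is compared as a set, the order of the themes does not matter.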

3.12. RTM-TMQL integration

Ruby Topic Maps (RTM) is a Topic Maps engine for Ruby. With the RTM-TMQL extension it can be queried using TMQL. To do so, Ruby Topic Maps must be run using JRuby, which runs on the JVM and allows the use of Java libraries like TMQL4J.

3.12.1. Installation

RTM-TMQL is available through RubyGems, the official Ruby package management system. Assuming JRuby is installed and configured, RTM-TMQL can be installed using

Installing RTM-TMQL
gem install rtm-tmql

If there are multiple versions of Ruby installed on the system, the JRuby version of gem is usually available as jgem. Alternatively, gem can be called from JRuby using the parameter -S. If JRuby was installed as administrator, the gem must also be installed as administrator using sudo.

Alternatives for Installation
jgem install rtm-tmql
jruby -S gem install rtm-tmql
sudo jgem install rtm-tmql
sudo jruby -S gem install rtm-tmql

The above command installs RTM-TMQL and all its dependencies. RTM-TMQL includes TMQL4J, so there is no need to install it manually. To run queries, a Topic Maps engine is needed as well. In the following we assume RTM-Ontopia, but others should work, too. RTM-Ontopia can be installed the same way as RTM-TMQL:

Installing RTM-Ontopia
gem install rtm-ontopia

3.12.2. Loading

To be used in an application, RTM-TMQL has to be loaded. As usual in Ruby this is done using the require statement. RTM-TMQL was installed via RubyGems which is consequently needed to find the library. As said above, we will use RTM-Ontopia in the example, so we will load it, too.

Loading RTM-TMQL
require 'rubygems'
require 'rtm/tmql'
require 'rtm/ontopia'

3.12.3. Usage

To address common usage, RTM prepares one TMQL engine per topic map automatically. Once loaded, TMQL queries can be run using the tmql method:

Running a TMQL query
topic_map.tmql("the query")

The above example assumes a topic map is already available in the variable topic_map. The following example shows the creation of such a topic map using a default in-memory connection. Within a Topic Maps engine, each topic map must have a base address which is an IRI according to RFC3987. The topic map contains the letters a to z; each is an instance of letter. The identifier is also used as the default name in this example.

Creating a simple topic map, the letter-map
connection = RTM.connect
letter_map = connection.create "http://example.org/my_tm"
letter_type = letter_map.get!("letter")
('a'..'z').each do |letter|
  topic = letter_map.get!(letter)
  letter_type.add_instance(topic)
  topic["-"] = letter
end

Now that we have a map, we can run a query for all instances of the topic letter. With a single call to tmql, the query is run and the results are returned.

Querying all letters from the letter-map
letter_map.tmql("%prefix lm http://example.org/my_tm // lm:letter")

When first needed, the TMQL runtime is initialized automatically. It is available from the topic map using the tmql_engine method. Using the engine object, the TMQL4J properties can be modified. Currently there are two properties available. First, the topics of the meta model, like the generic tm:name topic, can be materialized into the topic map. The following example shows how to enable this setting:

Setting TMQL properties: Materialize Meta Model
topic_map.tmql_engine.enable_materialize_meta_model(true)

For restricted environments it may be useful to disable updates, so the topic map can only be queried but not modified:

Setting TMQL properties: Disable Updates.
topic_map.tmql_engine.enable_updates(false)

If new properties become available in TMQL4J that are not yet supported by RTM-TMQL, the TMQL4J-internal properties object can be accessed using the properties method of the TMQL engine. The following example shows how updates could be disabled if the direct setter were not available:

Setting TMQL properties: Disable Updates; directly in TMQL4J
topic_map.tmql_engine.properties.enableLanguageExtensionTmqlUl(false)

The same is true for the TMQLRuntime, which is available using the runtime method. If anything has to be changed in the runtime or a method has to be called directly, it can be accessed using this method.

Concluding, RTM-TMQL is just a very thin wrapper layer around TMQL4J which makes calls to the library easier and is closely integrated with RTM.

4. Tutorials

This chapter contains a set of tutorials for learning the topic maps query language and its extensions step by step. The chapter starts with simple examples that become more complex with each step. Please note that each tutorial is based on the previous ones. Explanations of concepts already used in previous tutorials are kept as short as possible.

To create a consistent understanding, we use the well-known topic map about Italian operas. Each tutorial is organized into three parts. First, we define our use case, containing the goals we try to achieve with the query. In the second part, we create the query step by step and explain why we make use of the particular language patterns. In addition, alternatives are shown, because there is usually more than one solution for a use case. The last part is a short summary of the new information introduced by the tutorial.

4.1. Tutorial: Getting a topic

Use Case

We try to get the topic of the topic map representing the composer Puccini.

Solution

From the topic maps data model we know that a topic can be identified by a subject-locator, subject-identifier or item-identifier. In the context of a TMQL query, these identifiers can be used as item references to identify a topic map construct.

Therefore, we first need an identifier of the topic representing the composer Puccini, like http://psi.ontopedia.net/Puccini, and the type of this identifier. We know that the IRI is a subject-identifier of the topic Puccini.

The next question we have to answer is which type of expression to use for our use case. Because the query is quite simple, we decide to use the path expression style.

    http://psi.ontopedia.net/Puccini

The query is quite short, because it only contains the item reference of the wanted topic item Puccini.

Depending on the identifier type, we can also use one of the identifier axes. The indicators axis returns a topic for a string-represented IRI if the IRI is used as subject-identifier.

    "http://psi.ontopedia.net/Puccini" << indicators

Strings are enclosed in double quotes, as we see at the beginning of the query. Starting at the literal node of the abstract topic map, we can use the indicators axis in backward direction to get the topic using this IRI as subject-identifier.

If the IRI is a subject-locator or an item-identifier, the query is similar to the last one except for the name of the axis.

    "http://psi.ontopedia.net/Puccini" << item

Returns the topic item using the string literal http://psi.ontopedia.net/Puccini as item-identifier.

    "http://psi.ontopedia.net/Puccini" << locators

Returns the topic item using the string literal http://psi.ontopedia.net/Puccini as subject-locator.

Summary

We learned that a topic can be represented by an item reference as a part of the query. In addition we can use the string-literal of the IRI as start node of the navigation over the identifier axes ( indicators, locators or item ) in relation to the identifier type.

4.2. Tutorial: Using Prefixes

Use Case

To avoid repeating IRI parts, we want to use prefixes instead of absolute IRIs. As before, we want to extract the topic item representing the composer Puccini.

Solution

Each TMQL query can start with an environment clause containing prefix definitions and ontology definitions. A prefix definition defines a QName for an absolute IRI that is used as a prefix in the query itself.

    %prefix psi http://psi.ontopedia.net/ psi:Puccini

The keyword %prefix introduces a prefix definition as a part of the query. Each prefix definition contains three tokens: the keyword %prefix, the QName identifier ( psi ) and the absolute IRI ( http://psi.ontopedia.net/ ). The remaining part of the query extracts the topic item representing Puccini using an item reference written as a relative IRI with the defined prefix. As in the IRI specification, the QName and the remaining IRI part are separated by a colon.
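Conceptually, the prefix is replaced by the absolute IRI before the item reference is resolved. The following sketch shows this expansion on plain strings. The method expand is hypothetical and only illustrates the idea; it is not the way TMQL4J implements prefix handling.

```java
import java.util.HashMap;
import java.util.Map;

public class PrefixSketch {

    /** Expands "prefix:local" to an absolute IRI if the prefix is defined. */
    static String expand(String reference, Map<String, String> prefixes) {
        int colon = reference.indexOf(':');
        if (colon < 0) {
            return reference; // no prefix part at all
        }
        String iri = prefixes.get(reference.substring(0, colon));
        // Unknown prefixes such as "http" leave the reference untouched.
        return iri == null ? reference : iri + reference.substring(colon + 1);
    }

    public static void main(String[] args) {
        Map<String, String> prefixes = new HashMap<>();
        prefixes.put("psi", "http://psi.ontopedia.net/");
        System.out.println(expand("psi:Puccini", prefixes));
        System.out.println(expand("http://psi.ontopedia.net/Puccini", prefixes));
    }
}
```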

In contrast to the last tutorial, it is not possible to use QNames as parts of string literals. The following query will return an empty result.

    %prefix psi http://psi.ontopedia.net/ "psi:Puccini" << indicators

Summary

This short tutorial shows how to use prefixes as a part of the query. Each prefix definition contains three tokens: the keyword %prefix, the QName identifier ( psi ) and the absolute IRI ( http://psi.ontopedia.net/ ).

4.3. Tutorial: Getting a sequence of all characteristics

Use Case

In the next iteration we try to extract all characteristics of the topic item Puccini.

Solution

In contrast to the topic maps data model, TMQL does not distinguish between the names and the occurrences of a topic item. The current draft combines names and occurrences into the characteristics of a topic item. Consequently, there is only one axis defined to get all characteristics of a topic item as one tuple sequence.

The first step is to get the topic item we are interested in. After that, we extract its characteristics using the navigation axis characteristics of the TMQL draft.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini >> characteristics

The first part of the query defines a prefix, which reduces the absolute IRIs in the remaining query to relative IRIs. After that, we use the item reference psi:Puccini to identify the topic item Puccini. Using the navigation axis characteristics, we navigate from the node representing the topic to all nodes representing its characteristics. The query does not return the literals representing the values of the characteristics. It returns a tuple sequence of topic map items representing the characteristics.

Because we want to query the literals representing the values of the characteristics, we have to add another navigation step at the end of the last query. The current draft provides the navigation axis atomify, which serializes characteristics constructs and locators to their literals, mostly strings.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini >> characteristics >> atomify

Compared to the last query, the additional forward navigation over the atomify axis extracts the values of each characteristic of the topic item Puccini. The query returns a set of string literals representing the values of the characteristics.

The characteristics axis supports an optional type representing the topic type of the characteristics construct. The optional type is specified by a topic item reference after the axis name.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini >> characteristics psi:date_of_birth >> atomify

In contrast to the last query, the result of this query contains only the literals of the characteristics of type psi:date_of_birth.

In most use cases the literals of the characteristics are needed rather than the characteristics constructs themselves. For this reason the non-canonical level contains a term substitution for the two navigation steps >> characteristics >> atomify: the symbol / followed by the type. Please note that the shortcut requires the type; if it is missing, a grammar error is raised.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini / psi:date_of_birth

This query has the same result as the last one.
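
The relation between the two axes and the shortcut can be modelled in a few lines of Python. This is only a sketch of the semantics described above, not the tmql4j implementation; the class Characteristic, the function names and the sample data are invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Characteristic:
    type: str   # simplified type label, e.g. "psi:date_of_birth"
    value: str  # literal value of the name or occurrence


def characteristics(topic, type_=None):
    """Model of '>> characteristics [type]': all constructs, optionally of one type."""
    return [c for c in topic if type_ is None or c.type == type_]


def atomify(constructs):
    """Model of '>> atomify': replace each construct by its literal value."""
    return [c.value for c in constructs]


def shortcut(topic, type_):
    """Model of 'topic / type'; the type is mandatory, as for the shortcut."""
    return atomify(characteristics(topic, type_))


# Invented sample data standing in for the topic item Puccini.
puccini = [
    Characteristic("tm:name", "Giacomo Puccini"),
    Characteristic("psi:date_of_birth", "1858-12-22"),
    Characteristic("psi:webpage", "http://example.org/puccini"),
]

print(atomify(characteristics(puccini)))
print(shortcut(puccini, "psi:date_of_birth"))
```

In this model the shortcut is nothing more than the composition of the two navigation steps, which is why both queries return the same result.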

Summary

This tutorial explains how to extract characteristics or their literals using a TMQL query. It explains the functionality of the characteristics and atomify axes with optional type parameters. It also shows a non-canonical shortcut for the combination of both axes.

4.4. Tutorial: Get names and occurrences

Use Case

The goal is to extract all names and occurrences of the topic Puccini.

Solution

As in the last tutorial, we use the characteristics axis. To extract only the names of a topic item, the optional type is used. According to the topic maps data model, all names are of type topic-name-type. The current draft of TMQL reserves two identifiers that stand for all names or all occurrences when used as the optional parameter of the characteristics axis. The reserved identifiers are tm:name and tm:occurrence.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini >> characteristics tm:name
2:
3:      %prefix psi http://psi.ontopedia.net/ psi:Puccini >> characteristics tm:occurrence

The query in line 1 returns all names of the topic item representing the composer Puccini. It uses the reserved type identifier tm:name as the optional argument of the characteristics navigation axis. Line 3 shows the same query with a different optional type. It returns all occurrences of the topic item Puccini by using the type identifier tm:occurrence.

Summary

This tutorial explains how to use the optional type argument together with the reserved identifiers tm:name and tm:occurrence to extract all names or all occurrences of a topic item.

4.5. Tutorial: Getting all associations played by a topic

Use Case

In this case we want to extract all associations played by the topic item Puccini.

Solution

To extract all associations played by a topic, the current draft contains a navigation axis with the identifier players. In forward direction this axis only supports association items and results in the set of topics playing a role in the association. In backward direction we get all association items in which the current node, which has to be a topic, plays a role.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini << players

The pattern << denotes backward navigation, here over the navigation axis players. The result of the query is a set of association items, which is empty if the topic does not play any role.

Summary

This short tutorial explains how to use the players axis to get all associations played by a topic item.

4.6. Tutorial: Getting all instances of a topic type

Use Case

At this step we want to extract all instances of the topic type Composer.

Solution

The topic type representing a composer has to be identified by an item reference, too. In addition, we have to use the types axis in backward direction to get all instances of the given type.

1:      %prefix psi http://psi.ontopedia.net/ psi:Composer << types

The topic type is addressed by the identifier psi:Composer. By using the types axis in backward direction, we extract all instances of the given type. The axis is interpreted transitively.

In addition to the types axis, the non-canonical level contains an instances axis, which is the reverse of the types axis.

1:      %prefix psi http://psi.ontopedia.net/ psi:Composer >> instances

The result is the same as that of the last query.

If the types or instances axis is used at the start of the whole navigation, there is another shortcut similar to XPath.

1:      %prefix psi http://psi.ontopedia.net/ // psi:Composer

The result is the same as that of the last query. Please note that the shortcut // is only allowed at the beginning of a navigation.

Summary

The tutorial shows how to extract all instances of a given topic type using three different patterns or solutions. We can use the navigation axis types in backward direction and the instances axis in forward direction. In addition to that we can use the shortcut // at the beginning of a query.

4.7. Tutorial: Using a type filter

Use Case

The next use case is to extract all occurrences of the specific type psi:date_of_birth without using the optional type argument.

Solution

In addition to the optional type argument of navigation axes, a path expression supports a filter expression at the end. A filter expression can contain a type filter, which removes all elements that are not an instance of the given type. The filter can only be used at the end of the whole navigation and is only applicable to typed constructs or topic items.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini >> characteristics [ . >> types == psi:date_of_birth ]

A filter definition is always enclosed in square brackets. To extract the type of the current characteristics construct, the types axis is used. The current characteristics construct is identified by the single dot . similar to the XPath notation. The navigation result is compared with the given topic type psi:date_of_birth. If the navigation result contains at least the topic type with the identifier psi:date_of_birth, the filter returns true and the current characteristics construct is added to the result set.
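
The containment check performed by the filter can be sketched in Python. The sketch models only the behaviour described in this tutorial and is not taken from the engine; contains_all, type_filter and the sample constructs are hypothetical names and data.

```python
def contains_all(left, right):
    """Model of '==' as described here: every element of right occurs in left."""
    return all(item in left for item in right)


def type_filter(constructs, wanted_type):
    """Model of '[ . >> types == wanted_type ]' applied to a navigation result."""
    return [c for c in constructs if contains_all(c["types"], [wanted_type])]


# Invented sample data: each construct lists the types reached over the types axis.
constructs = [
    {"value": "1858-12-22", "types": ["psi:date_of_birth"]},
    {"value": "Giacomo Puccini", "types": ["tm:name"]},
]

print(type_filter(constructs, "psi:date_of_birth"))
```

Only the construct whose types contain psi:date_of_birth passes the filter; all other constructs are removed from the result.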

In addition the current draft specifies a shortcut as non-canonical production for type filters. A type filter can also be identified using the pattern ^.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini >> characteristics [ ^ psi:date_of_birth ]

The result is the same as that of the last query.

Summary

This tutorial explains how to use a type filter as part of a path expression. It explains that the dot . can be used to identify the current node of the navigation results. The operator == is used to compare two sequences and returns true if every element of the second sequence is contained in the first one.

In addition the tutorial shows the shortcut for the type filter ^.

4.8. Tutorial: Using an index filter

Use Case

The next step is to extract only the first name of the topic item representing the composer Puccini.

Solution

Another filter type is the index filter, which returns only the subset of the tuple sequence selected by integer literals. An index can be given as a single integer; then the result contains only the element at that index if it exists, otherwise the result is empty.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini >> characteristics tm:name [ 0 ]

The index filter [ 0 ] selects the first element of the navigation results. If Puccini has names, the first one is returned. Please note that topic map constructs are unordered, so the result of the query can differ between executions.

In addition, there is another index filter that specifies an index range. The result of a query using a range filter contains all elements between the given indexes. The lower index is included in the result set, the upper index is excluded.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini >> characteristics tm:name [ 0 .. 1 ]

The index filter [ 0 .. 1 ] selects all elements between index 0 and index 1. Because the lower index is included and the upper index is excluded, the result is the same as that of the last query. Please note that topic map constructs are unordered, so the result of the query can differ between executions.

The filter definitions shown so far are non-canonical shortcuts for more complex expressions that represent an index filter at the canonical level. The following snippet shows the corresponding canonical expressions of the index filters.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini >> characteristics tm:name [ $# == 0 ]
2:
3:      %prefix psi http://psi.ontopedia.net/ psi:Puccini >> characteristics tm:name [ 0 <= $# AND $# < 1]

Both queries have the same results as their non-canonical counterparts. The variable $# represents the current index during the iteration over the navigation result. In line 1 the filter expression contains the condition that the value of the variable $# has to be equal to 0. In line 3 the value of this variable has to be greater than or equal to 0 and less than 1.
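
The correspondence between the shortcuts and the canonical conditions on $# can be sketched in Python. This is a model of the described semantics under the assumption that $# behaves like a zero-based position; the function names and the sample list are invented.

```python
def index_filter(sequence, index):
    """Model of '[ i ]', i.e. the canonical '[ $# == i ]'."""
    return [item for position, item in enumerate(sequence) if position == index]


def range_filter(sequence, lower, upper):
    """Model of '[ lower .. upper ]', i.e. '[ lower <= $# AND $# < upper ]'."""
    return [item for position, item in enumerate(sequence) if lower <= position < upper]


# Invented sample data: one possible order of the names of Puccini.
names = ["Puccini", "Giacomo Puccini", "Puccini, Giacomo"]

print(index_filter(names, 0))
print(range_filter(names, 0, 1))
```

Because the upper bound is excluded, the range 0 .. 1 selects exactly the element that the single index 0 selects, and an index outside the sequence yields an empty result.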

Summary

There are two types of index filters ( a simple index filter and a range filter ). Index filters return a subset of the original results specified by the index bounds. For index ranges the upper bound is excluded and the lower bound is included. At the canonical level the variable $# represents the current index of the tuple.

4.9. Tutorial: Using a scope filter

Use Case

Extract all names of the topic item representing the composer Puccini scoped by the theme psi:short_name.

Solution

Another filter type is the scope filter. A scope filter defines a theme, represented by a topic item reference, which has to be contained in the themes of the scoped construct.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini >> characteristics tm:name [ . >> scope == psi:short_name ]

The filter is defined between the square brackets. The dot . denotes the current node in the context of the abstract topic maps graph. By using the scope axis we extract all themes of the scoped construct. Finally, we require that the themes contain at least the topic item with the reference psi:short_name.

There is a shortcut at the non-canonical grammar level that replaces the pattern . >> scope == with the symbol @. In addition, the square brackets can be omitted.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini >> characteristics tm:name @ psi:short_name

The query has the same result as the last one.

Summary

Scope filters can be used to filter the navigation results by a given theme. There is a canonical filter definition based on the scope axis, which gets all themes of the current node, and a non-canonical shortcut @ representing the same statement.

4.10. Tutorial: Using a boolean filter

Use Case

Extract all occurrences of Puccini scoped by the theme psi:Web, together with all names.

Solution

At first we have to analyze the use case. We have to extract all occurrences scoped by the theme Web in the namespace psi. To realize that, we can use the scope filter and the optional type of the characteristics axis.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini >> characteristics tm:occurrence @ psi:Web

The second challenge is to extract all names of the topic item Puccini without any other restrictions.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini >> characteristics tm:name

The problem is to combine both expressions into one query, according to our principle of extracting everything with a single query. We cannot combine the optional types tm:name and tm:occurrence of the navigation axis because of their different type hierarchies, so we have to replace the optional type with a type filter.

Shortcuts cannot be combined with other filter expressions, so we have to use the canonical syntax of the scope filter instead of @.

Our new filter has to combine different restrictions using conjunctions and disjunctions.

1:      %prefix psi http://psi.ontopedia.net/
2:      psi:Puccini >> characteristics [ . >> types == tm:name OR ( . >> types == tm:occurrence AND . >> scope == psi:Web ) ]

The filter shown in the example is more complex than the previous ones, but it returns the values we want. The first part of the filter, . >> types == tm:name, matches all names of the topic item Puccini. The second part of our filter, ( . >> types == tm:occurrence AND . >> scope == psi:Web ), matches all occurrences scoped by the theme psi:Web. Both expressions are combined as a disjunction, denoted by the keyword OR. Because all elements that do not match the first part of the filter have to be occurrences, the expression . >> types == tm:occurrence always returns true for them and can be removed.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini >> characteristics [ . >> types == tm:name OR . >> scope == psi:Web ]

Summary

Filters can be more complex than a simple index filter. They can contain any boolean condition, but no non-canonical shortcuts. Sometimes it is worth checking whether the boolean expression can be reduced to speed up the execution of the query.
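
Whether such a reduction is safe can be checked by comparing both conditions for every possible input. The following Python sketch does this for the filter of this tutorial, under the assumption stated above that every characteristic is either a name or an occurrence; all function names are invented.

```python
from itertools import product


def full_filter(is_name, is_occurrence, in_web_scope):
    """Model of: types == tm:name OR ( types == tm:occurrence AND scope == psi:Web )."""
    return is_name or (is_occurrence and in_web_scope)


def reduced_filter(is_name, in_web_scope):
    """Model of: types == tm:name OR scope == psi:Web."""
    return is_name or in_web_scope


def reduction_is_safe():
    """Compare both filters for every combination of truth values."""
    # Assumption: a characteristic is an occurrence exactly if it is not a name.
    return all(
        full_filter(is_name, not is_name, in_web_scope) == reduced_filter(is_name, in_web_scope)
        for is_name, in_web_scope in product([True, False], repeat=2)
    )


print(reduction_is_safe())
```

The check only holds as long as the assumption holds. If a construct could be neither a name nor an occurrence, the two filters would no longer be interchangeable.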

4.11. Tutorial: Using projections

Use Case

In the previous examples we only get simple results as a sequence of singleton tuples. Now we want to extract a tuple sequence containing triples of a topic item which is a composer, its names and all associations played.

Solution

A navigation always results in a sequence of singleton tuples containing the result of a navigation step from the source node to the target nodes, such as names or occurrences. Sometimes we want to fetch information starting from the same source node but navigating over different axes. For this purpose the current draft contains a tuple projection expression. A projection projects the current node to a set of target nodes, again by using navigations. A projection always creates a tuple sequence in which each tuple contains one item for each projection.

At first we want to extract the topic items representing an instance of Composer, which we realize with the following query.

1:      %prefix psi http://psi.ontopedia.net/ // psi:Composer

The second tuple item shall contain all names of the composer instance, which can be realized by the following query.

1:      %prefix psi http://psi.ontopedia.net/ // psi:Composer / tm:name

As last tuple item, we need all associations played by the composer instance.

1:      %prefix psi http://psi.ontopedia.net/ // psi:Composer << players

To use a projection, we have to extract the navigation part that is common to all expressions we need to combine. In our example the navigation part // psi:Composer is common to all three queries and can be used as the base of the projection.

1:      %prefix psi http://psi.ontopedia.net/ // psi:Composer ( . , . / tm:name , . << players )

Projections are defined by using round brackets ( and ). The round brackets contain a comma-separated list of projections. In this case the first projection represents the current node by using the dot. The second projection projects the current node to its name literals, and the last one projects it to the associations it plays.

Please note that all projections in the projection list use the same context, which means that the current node is not inherited from previous projections. The context of the current node is defined by the navigation expression in front of the projection definition. The number of projections is not restricted by the current draft.

According to the topic maps data model ( TMDM ), each topic map is unordered. This can be a problem, because the results of a query can differ between executions and some applications need the information in a defined sequence. For this purpose the projection supports a kind of ordering, which the user can define as part of the query. The tuples of the projection sequence can be ordered in different ways. The interpretation starts at the first projection; if two elements of that projection are equal, the next projection is used for ordering, and so on. Please note that the tuples are ordered only if at least one projection contains one of the ordering keywords ASC or DESC. Both keywords have the same meaning as in other languages and need no further explanation. In addition, if the keyword is missing for a projection, the default keyword ASC is added where necessary.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini ( . ASC , . / tm:name )
2:
3:      => {    [ Puccini, "Giacomo Puccini" ],
4:                      [ Puccini, "Puccini" ],
5:                      [ Puccini, "Puccini, Giacomo" ] }

The projection definition contains the keyword ASC as part of the first projection. Because all values of this projection are equal, the order of the second projection is used. Its default order is ASC.

1:      %prefix psi http://psi.ontopedia.net/ psi:Puccini ( . ASC , . / tm:name DESC)
2:
3:      => {    [ Puccini, "Puccini, Giacomo" ],
4:                      [ Puccini, "Puccini" ],
5:                      [ Puccini, "Giacomo Puccini" ] }

The projection definition contains the keyword ASC as part of the first projection. Because all values of this projection are equal, the order of the second projection is used, which is defined by the keyword DESC.
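
The ordering rule, in which earlier projections take precedence and later ones break ties, can be sketched in Python. The sketch is a model of the rule as described in this tutorial and not the engine's implementation; order_tuples and the sample rows are invented, and every projection gets an explicit direction here.

```python
def order_tuples(tuples, directions):
    """Order tuples by projection; directions holds "ASC" or "DESC" per projection."""
    ordered = list(tuples)
    # Sort by the least significant projection first. Python's sort is stable,
    # so earlier projections decide and later projections only break ties.
    for position in reversed(range(len(directions))):
        descending = directions[position] == "DESC"
        ordered.sort(key=lambda row: row[position], reverse=descending)
    return ordered


# Invented sample data: ( topic, name literal ) tuples in arbitrary order.
rows = [
    ("Puccini", "Puccini"),
    ("Puccini", "Puccini, Giacomo"),
    ("Puccini", "Giacomo Puccini"),
]

for row in order_tuples(rows, ["ASC", "DESC"]):
    print(row)
```

With the directions ASC and DESC the model yields the names in the order of the second example above, and with ASC for both projections it yields the order of the first example.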

Summary

This tutorial explains the syntax and usage of projections as one way to create non-singleton result tuples. Projections are always enclosed in round brackets and contain at least one projection tuple. The current node used in projections is always bound to the same value in all projection tuples contained in this projection.

In the last examples we learned how to sort the results using the tuple ordering keywords ASC and DESC. If at least one tuple is ordered by a given keyword, all others are sorted in ascending order, too.

4.12. Section: Select expressions

In the previous tutorials we only used the path expression style of TMQL. Sometimes a use case becomes too complex to handle with this expression type, and we have to use one of the other styles, SELECT or FLWR.

4.12.1. Tutorial: How to use the select style

Use Case

First we try to get all composers using the select style.

Solution

Recalling a previous tutorial, we look at the following query, which realizes our use case but uses the path expression style.

1:      %prefix psi http://psi.ontopedia.net/ // psi:Composer

Now we want to use a select expression to realize the same use case. The results should be equal for both queries. At first we take a look at the grammar rule of the select expression.

1:      SELECT < value-expression >
2:              WHERE boolean-expression
3:              ORDER BY < value-expression >
4:              UNIQUE
5:              LIMIT integer
6:              OFFSET integer

For our example we only need a select clause and a boolean expression.

1:      %prefix psi http://psi.ontopedia.net/
2:      SELECT $composer
3:              WHERE $composer ISA psi:Composer

The first line only defines a simple prefix, which we use in line 3. The where-clause, starting with the keyword WHERE, contains a boolean condition that is checked for each value or item that can be bound to the variable. In line 3 we see the boolean condition $composer ISA psi:Composer.

The keyword ISA represents a type restriction for the left-hand argument. The left-hand argument $composer is bound to each topic that exists in the topic map, and each binding is checked for being an instance of the right-hand argument psi:Composer. The result of this condition is the set of topic items of the type Composer.

After the evaluation of the boolean expression, each binding is interpreted by the select clause ( keyword SELECT ). In line 2 the select clause only adds the binding of the variable $composer to the tuple sequence representing the result of this query.

The overall result will contain exactly all instances of the type composer.
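
The interplay of variable binding, where-clause and select clause can be pictured as a simple loop. The following Python sketch is a deliberately naive model and says nothing about how tmql4j evaluates the query internally; the dictionary topic_map and the function select_isa are invented, and type hierarchies are ignored.

```python
# Invented sample data: topic -> set of its types (type hierarchies are ignored).
topic_map = {
    "Puccini": {"Composer"},
    "Verdi": {"Composer"},
    "Tosca": {"Opera"},
}


def select_isa(topic_map, wanted_type):
    """Model of 'SELECT $t WHERE $t ISA wanted_type'."""
    result = []
    for topic, types in topic_map.items():  # $t is bound to every topic in turn
        if wanted_type in types:            # WHERE keeps the bindings satisfying ISA
            result.append((topic,))         # SELECT adds one singleton tuple per binding
    return result


print(select_isa(topic_map, "Composer"))
```

Each valid binding contributes one singleton tuple, which matches the shape of the results produced by the path expression style.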

Summary

We learned how to use select expressions and how to use variables in the context of such expressions. A variable is bound to each possible value of the topic map and is validated by the condition contained in the boolean expression. The valid variable bindings are passed to the select clause and are transformed or projected to the results the user wants. The select clause can contain more than one value expression, similar to the projection of path expressions, and can use filters, too.

4.12.2. Tutorial: Ordering results

Use Case

Now we want to order the results of the last tutorial by their first name.

Solution

To realize this task, we recall the tuple ordering syntax of the path expression. A tuple expression can be ordered by using the keywords ASC and DESC. Similar to tuple expressions, a select expression contains an order-by clause introduced by the keywords ORDER BY. The order-by clause can contain a set of value expressions separated by commas, and the keyword ASC can be left out.

To order our results we need a short path expression describing the values we use for ordering our result tuples.

1:      $composer >> characteristics tm:name >> atomify [ 0 ]

Our task is to order the composers by their first name. A composer instance is bound to the variable $composer. The name of the composer can be extracted by using the navigation >> characteristics tm:name >> atomify. Because a topic item can have a set of names, we use the filter [ 0 ] to get the first one.

Now we know the expression to get the first name of a composer. We still have to add this expression as part of an order-by clause to our previous select expression to realize the current use case.

1:      %prefix psi http://psi.ontopedia.net/
2:      SELECT $composer
3:              WHERE $composer ISA psi:Composer
4:              ORDER BY $composer >> characteristics tm:name >> atomify [ 0 ]

In line 4 we see the order by clause containing our short navigation getting the first name of a composer.

Summary

Here we make use of the order-by clause to order the variable bindings that satisfy the boolean condition. The overall result is ordered by the given expression. The order-by clause can contain more than one expression to order a result sequence. The second ordering expression is used if at least two values of the first one are equal.

4.12.3. Tutorial: Getting values of specific index range

Use Case

We want to realize a simple website displaying all composers by their name. Because of the huge number of composers we want to implement a paging algorithm, which should be based on TMQL queries. We need a query to extract only the values for the next page between 10 and 20.

Solution

Based on the last query, we have to make two changes. At first we have to extract only the names of the composer instances. The second step is to add a limit and an offset clause to get only the values between the index 10 and 20.

To extract the names we can use the navigation axes characteristics and atomify.

1:      $composer >> characteristics tm:name >> atomify [ 0 ]

The next step is simply to add a selection range to get all tuples between the index 10 and 20.

1:      %prefix psi http://psi.ontopedia.net/
2:      SELECT $composer >> characteristics tm:name >> atomify [ 0 ]
3:              WHERE $composer ISA psi:Composer
4:              ORDER BY $composer >> characteristics tm:name >> atomify [ 0 ]
5:              OFFSET  10
6:              LIMIT 10

The shown syntax is similar to SQL. In line 2 we add the navigation to our select clause to get only the name literals of a composer and not the topic itself. To realize the selection range, we add an offset clause with the keyword OFFSET in line 5. The integer value after the keyword represents the first index to select. The limit clause with the keyword LIMIT in line 6 defines the maximum number of elements to be selected.

Summary

The keyword OFFSET defines an offset clause, which can be used to extract the tuples starting at a specific index, given as a literal after the keyword. To extract only a specific number of elements, the limit clause can be used. The limit clause is defined by the keyword LIMIT and contains a simple integer literal, too. If the number of available elements is smaller than the limit, the result contains fewer elements than the limit value.
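
For the paging use case the effect of both clauses corresponds to taking a slice of the ordered result. The following Python sketch models this under the assumption that the offset is a zero-based index; the function page and the sample data are invented.

```python
def page(rows, offset, limit):
    """Model of 'OFFSET offset LIMIT limit' applied to an ordered result."""
    return rows[offset:offset + limit]


# Invented sample data: 25 already ordered composer names.
composers = [f"Composer {number:02d}" for number in range(25)]

second_page = page(composers, 10, 10)  # tuples with the indexes 10 to 19
last_page = page(composers, 20, 10)    # only five tuples are left
print(len(second_page), len(last_page))
```

The last page is shorter than the limit because fewer elements are available, as described for the limit clause above.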

4.12.4. Tutorial: Use disjunction

Use Case

Extract all operas and all composers as one result set.

Solution

In this use case we have to bind the variable to instances of different types, so we have to use a disjunction in combination with two instance-of expressions.

1:      %prefix psi http://psi.ontopedia.net/
2:      SELECT $composerOrOpera
3:              WHERE
4:                              $composerOrOpera ISA psi:Composer
5:                      OR
6:                              $composerOrOpera ISA psi:Opera

The where-clause contains two instance-of expressions in lines 4 and 6. The expression in line 4 checks whether the variable is bound to a topic item of the type composer, and the expression in line 6 checks whether it is bound to an instance of opera. By using the keyword OR we define a disjunction, which means that at least one of these expressions has to be satisfied. The interpreter stops at the first satisfied expression.

Summary

The keyword OR defines a disjunction; the runtime finishes the current iteration step as soon as at least one expression is satisfied. Conversely, the keyword AND denotes a conjunction of boolean expressions.

4.12.5. Tutorial: Quantified Expression

Use Case

The next task is more complex than the previous ones. We try to extract only composers who have composed at least 10 operas.

Solution

Now we want to extract only a subset of composers. The subset is defined by a new restriction, that the composer has composed at least 10 operas.

The phrase at least suggests that this is a quantified restriction. The restriction is not non-quantified in the sense that the mere existence of a fact satisfies the condition. The satisfaction is bound to a numerical bound, which calls for a quantified expression. Quantified expressions are realized as exists clauses in TMQL. An exists clause can be non-quantified, like the ISA keyword, or quantified, which is what we need here. The syntax of a quantified expression is similar to human speech: "at least some satisfy something". The production of a quantified expression looks like the following one.

1:      AT LEAST < number > < binding-set > SATISFIES < boolean-expression >

The next step is to fill in the information missing in the production rule. The number is 10. The boolean expression has to check the restriction of having composed an opera. In the topic map this is modeled as an association of the type psi:composed_by. The roles are psi:Composer for the composer and psi:Work for the opera or any other work. The condition looks like psi:composed_by ( psi:Work : $opera , psi:Composer : $composer ). The boolean condition depends on the two variables $composer and $opera. The first variable, $composer, is bound by the ISA statement of our last query. To bind the variable $opera we use a binding set, as shown in the given production.

A binding set is defined as a variable in the context of a sequence of possible values. In our example the variable $opera is bound to each instance of opera. The production rule of a binding set looks like the following one.

1:      < variable > IN < context >

Our context is the navigation pattern that results in all instances of the topic type psi:Opera. The final quantified expression looks like this:

1:      AT LEAST 10 $opera IN // psi:Opera SATISFIES psi:composed_by ( psi:Work : $opera , psi:Composer : $composer )

After defining the quantified condition, we have to add it to our select expression as part of the where-clause. Because the variable $composer has to be bound to an instance of a composer and has to satisfy the quantified expression, we make use of a boolean conjunction.

1:      %prefix psi http://psi.ontopedia.net/
2:      SELECT $composer >> characteristics tm:name >> atomify [ 0 ]
3:              WHERE
4:                              $composer ISA psi:Composer
5:                      AND
6:                              AT LEAST 10 $opera IN // psi:Opera SATISFIES psi:composed_by ( psi:Work : $opera , psi:Composer : $composer )

If we change our use case to get all composers who composed at most 10 operas, the query is nearly the same, except that the keyword LEAST is replaced by the keyword MOST.

1:      %prefix psi http://psi.ontopedia.net/
2:      SELECT $composer >> characteristics tm:name >> atomify [ 0 ]
3:              WHERE
4:                              $composer ISA psi:Composer
5:                      AND
6:                              AT MOST 10 $opera IN // psi:Opera SATISFIES psi:composed_by ( psi:Work : $opera , psi:Composer : $composer )

Summary

This tutorial explains the syntax and interpretation of quantified expressions. A quantified expression can restrict the number of satisfying bindings by an upper or a lower bound. A quantified expression with the keywords AT LEAST creates a restriction with a lower bound. The keywords AT MOST describe the reverse case. In addition, if the number of satisfying elements is not important, the keyword SOME can be used.
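
The counting behind quantified expressions can be sketched in Python. The sketch is a model of the description above and not the engine's evaluation strategy; the function names and the sample data are invented, and SOME is modelled here as at least one satisfying binding, which is an assumption of this sketch.

```python
def count_satisfying(bindings, condition):
    """Count the bindings of the binding set that satisfy the condition."""
    return sum(1 for value in bindings if condition(value))


def at_least(number, bindings, condition):
    """Model of 'AT LEAST number $v IN bindings SATISFIES condition'."""
    return count_satisfying(bindings, condition) >= number


def at_most(number, bindings, condition):
    """Model of 'AT MOST number $v IN bindings SATISFIES condition'."""
    return count_satisfying(bindings, condition) <= number


def some(bindings, condition):
    """Assumed model of SOME: at least one binding satisfies the condition."""
    return at_least(1, bindings, condition)


# Invented sample data: ( work, composer ) pairs standing in for psi:composed_by.
composed_by = {("Tosca", "Puccini"), ("Turandot", "Puccini"), ("Aida", "Verdi")}
operas = ["Tosca", "Turandot", "Aida"]
composers = ["Puccini", "Verdi"]

prolific = [
    composer
    for composer in composers
    if at_least(2, operas, lambda opera: (opera, composer) in composed_by)
]
print(prolific)
```

The sample uses the bound 2 instead of 10 to keep the data small; the structure of the condition is the same as in the query of this tutorial.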

4.12.6. Tutorial: Use the forall clause

Use Case

Get all operas that only have name items in the scope Italian.

Solution

In contrast to the non-quantified expression using the keyword SOME, we need an expression that checks that all variable bindings satisfy our condition. This type of expression is called a forall clause and is represented by the keyword EVERY. The syntax of the forall clause is similar to that of the quantified expression, except that the quantification at the beginning of the sub-expression is omitted.

1:      EVERY < binding-set > SATISFIES boolean-expression

As you can see, there is no numerical restriction at the beginning of the expression, because every binding of the following binding set has to satisfy the condition.

As the next step we have to define the condition our variable bindings have to satisfy. First we need all names of an opera so that we can check their scope. To get all names of an opera we can use the characteristics axis.

1:      $opera >> characteristics tm:name

To get the themes of a name element we use the scope axis.

1:      $name >> scope

Now we have to check whether the themes contain the topic with the identifier psi:Italian.

1:      $name >> scope == psi:Italian

In combination, the forall clause looks like the following example.

1:      EVERY $name IN $opera >> characteristics tm:name SATISFIES $name >> scope == psi:Italian

The binding set $name IN $opera >> characteristics tm:name binds each name item of the topic bound to the variable $opera to the variable $name. Each of these bindings has to satisfy the condition $name >> scope == psi:Italian. The overall result contains all operas that only have name items in the scope Italian.

As the last step we have to add the forall clause to a select expression. The following example extends the quantified query of the previous tutorial, so it selects composers and applies the forall clause to each opera bound to $opera.

1:      %prefix psi http://psi.ontopedia.net/
2:      SELECT $composer >> characteristics tm:name >> atomify [ 0 ]
3:              WHERE
4:                              $composer ISA psi:Composer
5:                      AND
6:                              AT MOST 10 $opera IN // psi:Opera
7:                                      SATISFIES
8:                                                      psi:composed_by ( psi:Work : $opera , psi:Composer : $composer )
9:                                              AND
10:                                                     EVERY $name IN $opera >> characteristics tm:name SATISFIES $name >> scope == psi:Italian

Summary

The forall clause is the counterpart of the exists clause and checks whether every binding of a binding set satisfies a given condition. If not, the enclosing variable binding is invalid. A forall clause starts with the keyword EVERY.
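
The behaviour of the forall clause can be sketched with Python's built-in all(). The helper every and the sample names below are assumptions for illustration only, not the engine's API. Note that in this sketch an empty binding set satisfies the clause vacuously; whether the engine treats a topic without names the same way is not stated here.

```python
def every(bindings, condition):
    """Model of EVERY: true only if all bindings satisfy the condition."""
    return all(condition(binding) for binding in bindings)


# Hypothetical sample data: opera -> list of (name value, scope themes)
names = {
    "Tosca": [("Tosca", {"Italian"})],
    "La Boheme": [("La Boheme", {"Italian"}), ("The Bohemians", {"English"})],
}


def only_italian_names(opera):
    """Check that every name of the opera is scoped by the theme Italian."""
    return every(names[opera], lambda name: "Italian" in name[1])


# Keep only the operas whose names all carry the theme Italian.
result = [opera for opera in names if only_italian_names(opera)]
```

Here only "Tosca" remains in result, because "La Boheme" has one name in the scope English.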

4.12.7. Tutorial: Unique results

Use Case

Extract all co-players of any composer instance, but return each element only once.

Solution

First we have to get all co-players of a specific topic item. There are different ways to realize this with a path expression. We can use the players axis twice, once in each direction, starting in the backward direction.

1:      $composer << players >> players

The problem with this expression is that the starting node of the navigation is contained in the result set, but we only want the co-players and not the item itself. The solution is to use the traverse axis, which retrieves exactly the co-players of our topic item.

1:      $composer >> traverse

The overall select expression, using the path expression of the last example, looks like the following one.

1:      %prefix psi http://psi.ontopedia.net/
2:      SELECT $composer >> traverse
3:              WHERE $composer ISA psi:Composer

If we check the result set, we will find multiple instances of the same topic item. The reason is that a topic item can be a co-player of several topic items. For example, an opera can be composed by two composers. To solve this problem we can add the keyword UNIQUE, which reduces the result set to a tuple sequence containing each item only once.

1:      %prefix psi http://psi.ontopedia.net/
2:      SELECT $composer >> traverse
3:              WHERE $composer ISA psi:Composer
4:              UNIQUE

Summary

An item can occur multiple times in the result set because of the strong linkage between the constructs of a topic map. To extract each element only once, the keyword UNIQUE can be added to the select expression.
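
The effect of UNIQUE on a tuple sequence can be pictured as a duplicate filter. The following Python sketch is only a model: the function unique and the sample data are hypothetical, and keeping the first occurrence in its original position is an assumption of this sketch, not a documented guarantee of the engine.

```python
def unique(tuples):
    """Keep the first occurrence of each tuple and drop later duplicates."""
    seen = set()
    result = []
    for row in tuples:
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result


# Hypothetical co-players per composer; "Edgar" is shared by two composers.
co_players = {"Puccini": ["Tosca", "Edgar"], "Fontana": ["Edgar"]}

# One singleton tuple per (composer, co-player) pair, as a select would return.
rows = [(opera,) for composer in co_players for opera in co_players[composer]]
```

Without the filter, rows contains ("Edgar",) twice; unique(rows) contains it once.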

4.13. Section: Flwr expressions

The previous examples and tutorials always used the select or path expression style. The current draft defines another expression style called flwr expression. A flwr expression looks similar to a programming language and is the only expression type that can return special result types like CTM or XTM.

4.13.1. Tutorial: How to use flwr expressions

Use Case

In this case we extract all composers using only the flwr style.

Solution

An earlier tutorial realized this use case with the select style; now we use the flwr style to reach the same goal. First we take a look at the grammar rule of the flwr style.

1:      FOR < binding-set >
2:      WHERE boolean-expression
3:      ORDER BY < value-expression >
4:      RETURN content

The flwr style contains expressions we already know from the select style. Line 2 contains a where clause with a boolean condition that has to be satisfied by all variable bindings of this expression. It can be used like the where clause of a select expression. As we see in line 3, the flwr style also contains an order-by clause, which works like the order-by clause of a select expression we learned in a previous example. The binding set contained in the for clause in line 1 can be used in the same way as the binding sets of quantified expressions.

To realize our use case with the flwr expression style, we need some additional information. First we need a binding definition to get all composers. We already know that a binding set binds a variable iteratively to each possible value of the defined context. To get all composers we can use the instances axis with the known type psi:Composer and bind the values to a variable called $composer.

1:      %prefix psi http://psi.ontopedia.net/
2:      FOR $composer IN // psi:Composer

The flwr style requires a return clause; it cannot be left out. The return clause contains a content definition describing how to export the bindings that satisfy the boolean condition, after the optional ordering. It can contain content definitions of different types, such as simple tuple sequences, CTM or XTM. In this case we only need to export the results as a simple tuple sequence.

1:      %prefix psi http://psi.ontopedia.net/
2:      FOR $composer IN // psi:Composer
3:      RETURN $composer

In line 3 we add a return clause starting with the keyword RETURN to get the results we need. The return clause only contains the simple sub-expression $composer, which returns the value bound to the variable in the current iteration. The overall result is a tuple sequence containing only singleton tuples, each with a topic item of the type composer.

Summary

This first short tutorial describes how to use the flwr style in a similar way to the select style. Both expression types are similar in style and use the same sub-expression types, including the where-clause and the order-by clause. In contrast to the select style, the flwr style does not support the selection of a specific window and cannot remove duplicates using a unique-clause. Because the for-clause of the flwr style already binds the variable to the instances, we do not need an instance-of expression. A flwr-expression can contain any number of for-clauses to bind a set of variables in one query. The return-clause of a flwr expression is mandatory and defines the return values of the expression itself.
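
The order in which the four clauses of a flwr expression are applied can be modelled as a small pipeline. The Python function flwr below and its parameter names are assumptions made for illustration; it is a sketch of the evaluation order described in this tutorial, not the implementation used by the engine.

```python
def flwr(bindings, where=lambda b: True, order_by=None, ret=lambda b: (b,)):
    """Model of FOR / WHERE / ORDER BY / RETURN over one variable."""
    # FOR iterates over the binding set, WHERE filters the bindings.
    rows = [binding for binding in bindings if where(binding)]
    # ORDER BY is optional and sorts the surviving bindings.
    if order_by is not None:
        rows = sorted(rows, key=order_by)
    # RETURN is mandatory and maps each binding to its result tuple.
    return [ret(binding) for binding in rows]


# Hypothetical instances of psi:Composer
composers = ["Verdi", "Puccini", "Bellini"]

# Equivalent in spirit to: FOR $composer IN // psi:Composer RETURN $composer
all_composers = flwr(composers)
```

In this model all_composers is a list of singleton tuples, one per composer, in the order of the binding set.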

4.13.2. Tutorial: Create a topic map fragment

Use Case

A client process wants to extract a specific construct of a topic map managed by a server process. In this case we want to extract a topic map fragment using TMQL and the CTM syntax.

Solution

First we need to know something about the serialization format CTM. CTM is a simple text-based serialization format for topic maps. Each topic map construct is represented by a specific text block. The benefit of CTM is its human-readable syntax. The main drawback is that we need a special parser.

In principle we could realize the use case with each of the three expression types. The problem is that only one of them can export CTM as a core feature of the engine itself. If we do not want to transform the elements to CTM ourselves, we can only use the flwr-expression. The TMQL draft contains a production rule, a subtype of content, that creates a CTM literal in combination with a number of variable bindings. Because the return clause of a flwr expression contains a content expression, we can use this content type in our query.

The last thing to know is what the CTM content looks like. CTM fragments used as part of a TMQL query are enclosed in triple quotes """. The interpreter transforms all content between these quotes to CTM. If the content contains a sub-query, the query is interpreted first and its results are transformed to CTM. The overall result is a set of small CTM fragments representing a fragment of the queried topic map.

1:      %prefix psi http://psi.ontopedia.net/
2:      FOR $composer IN // psi:Composer
3:      RETURN """ { $composer } """

The query looks like the last example except for the last line. In line 3 we change the result type to CTM content using the triple quotes. The curly brackets { and } mark the internal sub-query. All tokens between the two brackets are interpreted as a TMQL query, which can return any content type. If the results are topic map constructs, they are transformed using the CTM syntax. In addition, if a construct is a name item, an occurrence item or a variant, the parent topic item is exported. If the item is an association role, the parent association item is exported. For more information please look at the current CTM draft.

Summary

In this tutorial we learned how to use the flwr style to export content as CTM. Currently only the flwr style supports the result type CTM as part of the return clause. CTM content is always enclosed in triple quotes """. If the content contains sub-queries (enclosed in curly brackets { and }), the results of the sub-query interpretation are transformed to CTM. In addition, if a construct is a name item, an occurrence item or a variant, the parent topic item is exported. If the item is an association role, the parent association item is exported.

4.13.3. Tutorial: Create an XML document

Use Case

As with CTM, the flwr style also supports the transformation to XML, using a special content type called XML-content. In this tutorial we want to extract all composers and their operas as the following XML file.

1:      <xml>
2:              <composer>
3:                      <name>
4:                              <!-- the first name of the composer -->
5:                      </name>
6:                      <composed>
7:                              <opera>
8:                                      <!-- the first name of the opera -->
9:                              </opera>
10:                     </composed>
11:             </composer>
12:     </xml>

Solution

Because we iterate over all composers, we need an enclosing query that creates the root node <xml>. If it is missing, all results are returned as separate XML fragments and not as one XML document.

1:      %prefix psi http://psi.ontopedia.net/
2:      RETURN <xml> </xml>

The query simply returns an XML fragment containing one empty node <xml>.

To include all composers in our XML file we have to use a sub-query, similar to the last tutorial, except that the return type has to be XML too. To realize sub-queries we have to add curly brackets containing our sub-query.

1:      %prefix psi http://psi.ontopedia.net/
2:      RETURN
3:              <xml>
4:                      {
5:                              FOR $composer IN // psi:Composer
6:                              RETURN <composer> <name> { $composer / tm:name [0] } </name> </composer>
7:                      }
8:              </xml>

Between lines 4 and 7 we add the query of the last tutorial to get all composers. To export XML again we have to use a flwr query too, as we see in lines 5 and 6. In line 5 we define a variable binding for each instance of composer, and in line 6 we use a sub-query again to extract the first name of the composer using the characteristics and atomify axes, represented by the shortcut /.

As the next step we have to add a second sub-query, based on the variable binding for $composer, to extract all operas composed by the current composer topic item. To use the variable $composer, we have to embed our query in the return clause of the sub-query that binds this variable.

1:      %prefix psi http://psi.ontopedia.net/
2:      RETURN
3:              <xml>
4:                      {
5:                              FOR $composer IN // psi:Composer
6:                              RETURN <composer>
7:                                                      <name> { $composer / tm:name [0] } </name>
8:                                                      <composed>
9:                                                              {
10:                                                                     FOR $opera IN // psi:Opera
11:                                                                     WHERE psi:composed_by ( psi:Composer : $composer , psi:Work : $opera )
12:                                                                     RETURN <opera> { $opera / tm:name [0] } </opera>
13:                                                             }
14:                                                     </composed>
15:                                             </composer>
16:                     }
17:             </xml>

Between lines 8 and 14 we embed the sub-query to extract all operas composed by the current composer topic item bound to the variable $composer. In line 10 we bind the variable $opera to every instance of the topic type psi:Opera, and in line 11 we use a predicate invocation to check whether there is an association of type psi:composed_by played by the current opera and composer. Each valid binding of $opera is transformed to XML in line 12. Using another sub-query, we extract the first name of the opera and embed it in the <opera> XML tag. The overall result is an XML file that looks like the file we want.

Summary

Flwr expressions can also return XML content. Using XML-content we can extract information from the topic map and add it to a specific XML structure. As with CTM-content, sub-queries can be used to extract values bound to a variable by the enclosing flwr expressions. Each sub-query inherits the bindings of its parent query. If the result of a sub-query is a topic map construct, it is transformed using XTM, an XML-based serialization format for topic maps.
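
The nesting of the two flwr expressions corresponds to two nested loops, where the inner loop sees the variable bound by the outer one. The following Python sketch builds the same document structure with the standard library module xml.etree.ElementTree. The dictionary composed and the function build_document are hypothetical stand-ins for the topic map and the query; they are not part of tmql4j.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: composer name -> names of the composed operas
composed = {"Puccini": ["Tosca", "Turandot"], "Verdi": ["Aida"]}


def build_document(data):
    """Build the <xml> root with one <composer> element per composer."""
    root = ET.Element("xml")
    # Outer loop: corresponds to FOR $composer IN // psi:Composer
    for composer, operas in data.items():
        composer_el = ET.SubElement(root, "composer")
        ET.SubElement(composer_el, "name").text = composer
        composed_el = ET.SubElement(composer_el, "composed")
        # Inner loop: corresponds to the sub-query binding $opera,
        # restricted to the operas of the current composer.
        for opera in operas:
            ET.SubElement(composed_el, "opera").text = opera
    return root


document = ET.tostring(build_document(composed), encoding="unicode")
```

The string document then holds a single XML tree with the root node <xml>, which mirrors the role of the enclosing RETURN in the query.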

4.14. Section: Insert expressions

An insert expression can be used to add new content to the queried topic map. The insert expression is currently not part of the TMQL draft; it is an extension designed by the topic maps lab, with the request that it become part of the final TMQL standard.

Please note that the extension is only supported by the engine if the modification extension plugin is used. If the extension is located in the Java class path, the engine is able to use it.

4.14.1. Tutorial: Insert a new topic

Use Case

Following the rule - keep it simple - we start with the simple use case of adding a new topic with the subject identifier http://tmql4j.topicmapslab.de/types/engine/tmql4j, representing the engine itself as a topic item.

Solution

The insert expression contains two parts, an insert-clause and a where-clause. The where-clause can be used to specify variable bindings that are used in the insert-clause. In our case the where-clause is not necessary. The insert-clause contains a CTM fragment defining the content to add. We do not explain here how to write CTM fragments; please take a look at the current CTM draft for more information.

1:      INSERT ''' http://tmql4j.topicmapslab.de/types/engine/tmql4j . '''

The insert-expression always starts with the keyword INSERT followed by the insert-clause, which defines a CTM fragment enclosed in triple quotes '''. The syntax of the CTM fragment is based on the current CTM draft. In our case we define a topic block using the identifier http://tmql4j.topicmapslab.de/types/engine/tmql4j as subject-identifier. The dot '.' marks the end of the topic definition.

Summary

The insert-expression always starts with the keyword INSERT followed by the insert-clause, which defines a CTM fragment enclosed in triple quotes '''. The syntax of the CTM fragment is based on the current CTM draft. As an optional part, an insert-expression can contain a where-clause defining conditions used to bind variables. Variables can be used within the insert-clause, as we will see in a later example.

4.14.2. Tutorial: Using variables as part of the insert-clause

Use Case

In this case we add a new association of the type can-queried, played by each composer instance and by the topic representing the tmql4j engine that we added in the last tutorial.

Solution

We know that the insert-expression can contain a where-clause to bind variables to specific values. A variable can be used within the CTM fragment of the insert-clause to extend the information of a topic map item that is already contained in the current topic map. Of course we could realize the example without a where-clause if we knew all composer instances and their identifiers, but this would be more complex than using variables.

First we have to create a where condition binding a variable $composer to each instance of http://psi.ontopedia.net/Composer.

1:      $composer ISA http://psi.ontopedia.net/Composer

Or we use a prefix definition

1:      %prefix psi http://psi.ontopedia.net/
2:      $composer ISA psi:Composer

Next we design a CTM fragment creating a new association item of the type can-queried with two roles. The first role is of the type topic-content and is played by the composer instance. The other role is of the type engine and is played by the tmql4j engine.

1:      can-queried ( topic-content : $composer , engine : http://tmql4j.topicmapslab.de/types/engine/tmql4j )

Because we do not want to add the whole topic item as CTM to our fragment, we only extract one identifier of the topic bound to $composer.

1:      can-queried ( topic-content : { $composer >> indicators >> atomify [0] } , engine : http://tmql4j.topicmapslab.de/types/engine/tmql4j )

The last step will be the combination of both elements in our insert-expression.

1:      %prefix psi http://psi.ontopedia.net/
2:      INSERT ''' can-queried ( topic-content : { $composer >> indicators >> atomify [0] } , engine : http://tmql4j.topicmapslab.de/types/engine/tmql4j ) '''
3:              WHERE $composer ISA psi:Composer

The engine creates a temporary topic map containing the association items defined by the insert-expression. All temporary topic maps are then merged into the original one to realize the insertion.

Summary

The where-clause of an insert-expression can be used to add new content to the topic map depending on content already contained in the topic map. By binding variables in the where-clause, the insert-clause can use content of the current topic map.
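
One way to picture the interplay of the where-clause and the insert-clause is a template that is expanded once per variable binding. The Python sketch below models only this expansion step. The function expand_insert, the placeholder syntax and the identifier for Verdi are assumptions made for illustration; how the engine actually evaluates the embedded sub-query is not shown here.

```python
ENGINE = "http://tmql4j.topicmapslab.de/types/engine/tmql4j"


def expand_insert(template, bindings):
    """Produce one CTM fragment per variable binding of the where-clause."""
    return [template.format(**binding) for binding in bindings]


# Hypothetical result of the where-clause: one binding per composer,
# already reduced to a single subject identifier.
bindings = [
    {"composer": "http://psi.ontopedia.net/Puccini"},
    {"composer": "http://psi.ontopedia.net/Verdi"},
]

# The placeholder {composer} stands for the embedded sub-query on $composer.
template = "can-queried ( topic-content : {composer} , engine : " + ENGINE + " )"

fragments = expand_insert(template, bindings)
```

Each entry of fragments is one association definition; in the terms of this tutorial, each of them would end up in a temporary topic map that is merged into the original one.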

4.15. Section: Delete expression

A delete expression can be used to remove existing content from the topic map. The delete expression is currently not part of the TMQL draft; it is an extension designed by the topic maps lab, with the request that it become part of the final TMQL standard.

Please note that the extension is only supported by the engine if the modification extension plugin is used. If the extension is located in the Java class path, the engine is able to use it.

4.15.1. Tutorial: Remove a specific topic item

Use Case

We delete a topic item from the topic map. The topic item is identified by its subject-identifier http://psi.ontopedia.net/Puccini.

Solution

A delete-expression can be split into two parts. The delete-clause defines exactly the topic map items to remove. The where-clause can be used to bind variables or check conditions to select the items to remove; it is optional and can be left out if the delete-clause is simple.

A delete-expression always starts with the keyword DELETE, followed by the delete-clause and the optional where-clause. The items to remove can be addressed by a simple path-expression containing only a topic reference, here the subject-identifier.

1:      DELETE http://psi.ontopedia.net/Puccini

In the context of the opera topic map the execution of this query fails, because the topic item with the identifier http://psi.ontopedia.net/Puccini is used by other topic map constructs, for example as a reifier or as an association player. Without further keywords a delete-expression only removes independent constructs. Dependent constructs are topic items used as theme, type or player.

The delete-expression supports an optional keyword to remove dependent constructs too. If the delete-expression contains the keyword CASCADE, all dependent elements are removed. This means that all scoped elements, all instances or subtypes and all played associations are removed from the topic map, so please use it with care.

1:      DELETE CASCADE http://psi.ontopedia.net/Puccini

All dependencies that can be resolved are changed too. If a reifier is removed, the reification is destroyed but the reified element is kept.

If a topic item is removed, all its names, occurrences and played associations are deleted. If the topic is a type, all instances and subtypes are removed too. If the topic is a theme, the scoped constructs are removed too, because the semantics of the statement would change if only the theme were removed.

If a name item is removed, all its variants are removed too. This process is iterative, which means that if a name is removed because its parent topic was deleted, its variants are removed too.

If an association item is removed, all its role items are removed too. This process is iterative, which means that if an association is removed because one of its players is removed, all other roles are removed too.

Summary

The delete-expression can be split into the delete-clause starting with the keyword DELETE and the optional where-clause. The where-clause can be used to filter the content to remove. The delete-clause can contain a navigation to the item to remove, for example a name of a topic. If the keyword CASCADE is missing, the execution fails if other items depend on the item. With the keyword CASCADE the item and its dependent items are removed too. This process is iterative.
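
The difference between a plain delete and a cascading delete can be modelled on a very small data structure. The following Python sketch covers only one kind of dependency, a topic used as an association player. The dictionary layout, the function delete_topic and the exception DeleteError are assumptions made for this example and do not reflect the engine's internal model.

```python
class DeleteError(Exception):
    """Raised when a topic is still in use and CASCADE was not requested."""


def delete_topic(topic_map, topic, cascade=False):
    """Remove a topic; with cascade=True also remove associations it plays in."""
    used_in = [a for a in topic_map["associations"] if topic in a["players"]]
    if used_in and not cascade:
        # Without CASCADE the topic map is left untouched.
        raise DeleteError(topic + " is still used as a player")
    for association in used_in:
        # Removing an association removes all of its roles, not only the
        # role played by the deleted topic.
        topic_map["associations"].remove(association)
    topic_map["topics"].remove(topic)


def sample_map():
    """Build a hypothetical topic map with one association."""
    return {
        "topics": ["Puccini", "Tosca"],
        "associations": [{"type": "composed_by", "players": ["Tosca", "Puccini"]}],
    }
```

Calling delete_topic(sample_map(), "Puccini") raises DeleteError in this model, while the same call with cascade=True removes the topic together with the association it plays in.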

4.15.2. Tutorial: Using variables to remove content

Use Case

Now we remove all names of all instances of composer.

Solution

In this case we use the where-clause to select all instances of composer and bind them to the variable $composer. To realize the binding to all instances of composer we use the isa-expression.

1:      $composer ISA http://psi.ontopedia.net/Composer

Or we use a prefix definition

1:      %prefix psi http://psi.ontopedia.net/
2:      $composer ISA psi:Composer

To remove only the names of a composer we have to modify the delete-clause, using a navigation-expression that gets all name items.

1:      $composer >> characteristics tm:name

The last step will be the combination of both query parts to delete all names of all composers.

1:      %prefix psi http://psi.ontopedia.net/
2:      DELETE CASCADE $composer >> characteristics tm:name
3:              WHERE $composer ISA psi:Composer

Summary

By using variables, the delete-clause can be used to remove specific elements of a set of topic map constructs, such as each instance of a specific type. The delete-clause can define a simple path-expression to remove specific items contained in a topic map construct, like its names or identifiers.

4.15.3. Tutorial: Delete the topic map content

Use Case

In the last tutorial of this section we want to remove all topic map constructs except the topic map itself.

Solution

To delete all topic map items, the delete-expression can use the special keyword ALL. The keyword CASCADE is mandatory because the topic map constructs depend on each other. The whole delete-expression consists of only three keywords and nothing more.

1:      DELETE CASCADE ALL

Summary

The special keyword ALL can be used to remove the whole content of a topic map.

4.16. Section: Update expression

An update expression can be used to modify or add content to the queried topic map. The update expression is currently not part of the TMQL draft; it is an extension designed by the topic maps lab, with the request that it become part of the final TMQL standard.

4.16.1. Tutorial: Add a new content to a topic item

Use Case

In this case we want to add a new name to the topic item representing the composer Puccini.

Solution

To realize the use case we first need to learn something about update-expressions. An update-expression contains a where-clause defining the context of the changes. The where-clause contains a simple path-expression defining the items that should be updated. This context is used for all changes defined by the update-clauses, and the kind of item it has to return depends on the intended changes.

An update-clause is a triple or a 4-tuple and looks like the following one.

1:      anchor { optional-type } ( SET | ADD ) value-expression

The anchor defines the kind of change. It has to be an identifier, similar to a navigation axis, that identifies the node or edge of the abstract graph that has to be changed.

Because we want to add a new name item, the anchor has to be names, which covers all changes of name items, and we have to use the keyword ADD, which stands for the creation of a new name item. The optional type can be left out because our name has no specific type and the default name type should be used.

1:      names ADD "the new name"

The current context has to be the topic item representing the composer Puccini. To get the topic we simply use its subject-identifier.

1:      http://psi.ontopedia.net/Puccini

Or we use a prefix definition

1:      %prefix psi http://psi.ontopedia.net/
2:      psi:Puccini

As the final step we have to combine all query snippets into the final update-expression.

1:      %prefix psi http://psi.ontopedia.net/
2:      UPDATE names ADD "the new name"
3:              WHERE psi:Puccini

In line 1 we add an environment-clause containing our prefix definition. Please note that an environment-clause always has to be placed at the beginning of the query. In line 3 we use the where-clause to define the update context and set it to the topic item identified by psi:Puccini. In line 2 we add our update-clause starting with the keyword UPDATE. The update-expression contains only one update-clause, which adds a new name with the literal "the new name" to the topic Puccini.

Summary

This tutorial explains how to use an update-clause to add a new name to a specific topic item. An update-expression contains at least one update-clause and a where-clause; more update-clauses can be added if necessary. The required type of the update context depends on the anchor used. For more information take a look at the previous chapter describing the language specification.

4.16.2. Tutorial: Change a topic map construct

Use Case

Now we want to change the value of an occurrence of the same topic item. The occurrence has to be of the type web site and has to be scoped by the theme Web. The new value has to be http://en.wikipedia.org/wiki/Puccini with the correct data type.

Solution

Compared to the last tutorial we need another anchor and another keyword. We want to change the value of an occurrence item of a topic, so we have to use the anchor occurrences and the keyword SET.

1:      occurrences SET "http://en.wikipedia.org/wiki/Puccini"

In our case the literal represents a string value, and the data type of the occurrence would be changed to xsd:string automatically if this query were executed. However, our occurrence should represent an IRI, so the data type has to be changed to xsd:anyIRI. To use a specific data type the TMQL draft supports data-typed literals of the form "literal"^^data-type. So we can change our query to the following one.

1:      occurrences SET "http://en.wikipedia.org/wiki/Puccini"^^xsd:anyIRI

Because the changes affect the occurrence item itself, the current context has to be the occurrence item. To define the context we use a simple path expression with a type and a scope filter to satisfy the requirements of the use case.

1:      %prefix psi http://psi.ontopedia.net/
2:      psi:Puccini >> characteristics psi:website @ psi:Web

First we define the prefix psi representing the absolute IRI http://psi.ontopedia.net/. The rest of the query is the path expression to get all occurrences of the type web site in the scope Web. Our navigation starts at the topic item psi:Puccini and navigates to the occurrences of the type psi:website using the characteristics axis. The scope filter, introduced by the token @, reduces the navigation results to the occurrences in the scope psi:Web.

After the combination our final query looks like the following one.

1:      %prefix psi http://psi.ontopedia.net/
2:      UPDATE occurrences SET "http://en.wikipedia.org/wiki/Puccini"^^xsd:anyIRI
3:              WHERE psi:Puccini >> characteristics psi:website @ psi:Web

Summary

This tutorial explains how to change values of specific topic map constructs, using an occurrence item as the example. It also describes how to specify the data type of a literal used as the new value of an occurrence item.

4.16.3. Tutorial: Combine update-clauses

Use Case

As mentioned before, an update-expression can contain more than one update-clause. Now we combine the update of the last tutorial with adding a new theme, updated, at the same time.

Solution

First we look at our last query.

1:      %prefix psi http://psi.ontopedia.net/
2:      UPDATE occurrences SET "http://en.wikipedia.org/wiki/Puccini"^^xsd:anyIRI
3:              WHERE psi:Puccini >> characteristics psi:website @ psi:Web

The next step is the definition of the new update-clause that adds a new theme to the occurrence item. Because we want to change the scope of the occurrence item, the anchor has to be scope, and the keyword has to be ADD because we add a new theme. The theme is represented by a topic reference, similar to the topic Puccini.

1:      scope ADD psi:updated

Please note that the topic item psi:updated has to exist.

In combination our final query looks like this.

1:      %prefix psi http://psi.ontopedia.net/
2:      UPDATE occurrences SET "http://en.wikipedia.org/wiki/Puccini"^^xsd:anyIRI , scope ADD psi:updated
3:              WHERE psi:Puccini >> characteristics psi:website @ psi:Web

As we see in line 2, both update-clauses are simply combined using a comma.
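
The effect of several update-clauses on one context item can be sketched as a list of (anchor, keyword, value) triples applied in order. The Python code below is a simplified model under stated assumptions: the function apply_updates, the dictionary layout of the occurrence and the key value (standing in for the anchor occurrences) are hypothetical and not taken from the engine.

```python
def apply_updates(construct, clauses):
    """Apply (anchor, keyword, value) clauses to one construct in order."""
    for anchor, keyword, value in clauses:
        if keyword == "SET":
            # SET replaces the current value of the anchor.
            construct[anchor] = value
        elif keyword == "ADD":
            # ADD appends a new value and keeps the existing ones.
            construct.setdefault(anchor, []).append(value)
        else:
            raise ValueError("unknown keyword: " + keyword)
    return construct


# Hypothetical occurrence item selected by the where-clause
occurrence = {"value": "http://example.org/old", "scope": ["Web"]}

apply_updates(occurrence, [
    ("value", "SET", "http://en.wikipedia.org/wiki/Puccini"),
    ("scope", "ADD", "updated"),
])
```

After the call, the occurrence carries the new value and both themes, which corresponds to the comma-separated update-clauses of the final query.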

4.17. Section: Merge expressions

A merge expression can be used to define merge rules that identify topic items which should be merged. The merge expression is currently not part of the TMQL draft; it is an extension designed by the topic maps lab, with the request that it become part of the final TMQL standard.

A merge expression contains two different production rules.

The first possibility is to use a simple path-expression to define the topics to merge. If the path-expression contains a tuple-expression, all items of each tuple of the resulting sequence are merged. If the path-expression is a simple-content production, the whole result set is merged.

The second possibility is to use a comma-separated list of value-expressions that simply contain variables bound by the where-clause. All items of each tuple created by the set of value-expressions are merged into one topic.

4.17.1. Tutorial: Merge two specific topics

Use Case

In this tutorial we learn how to use a simple tuple-expression to merge two specific topics. We want to merge the topics with the identifiers http://psi.ontopedia.net/Puccini and http://psi.ontopedia.net/Puccini_2.

Solution

We use a tuple-expression to merge both topics into one new topic according to the merging rules of the topic maps data model. A tuple-expression is a comma-separated list of value-expressions enclosed in round brackets. A value-expression can represent a topic item by using a simple topic reference.

A tuple-expression returning a tuple sequence that contains our two topic items looks like the following one.

1:      ( http://psi.ontopedia.net/Puccini , http://psi.ontopedia.net/Puccini_2 )

Or we simply use a prefix.

1:      %prefix psi http://psi.ontopedia.net/
2:      ( psi:Puccini , psi:Puccini_2 )

To merge the topics we simply have to add the keyword MERGE in front of our tuple-expression.

1:      %prefix psi http://psi.ontopedia.net/
2:      MERGE psi:Puccini , psi:Puccini_2

Summary

The merge-expression starts with the keyword MERGE followed by a simple path-expression or by a set of value-expressions in combination with a where-clause. If the path-expression contains a tuple-expression, all elements of each tuple of the generated sequence are merged into one item.
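
The rule "all elements of each tuple are merged into one item" can be sketched in a few lines of Python. This is a simplified model and not the TMQL4J implementation: the class Topic, the functions merge_tuple and merge_sequence and the two names are assumptions for illustration, and merging is reduced to the union of identifiers and names, which covers only a small part of the merging rules of the topic maps data model.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Toy topic item holding only identifiers and names."""
    identifiers: set = field(default_factory=set)
    names: set = field(default_factory=set)


def merge_tuple(topics):
    # Merge all topics of one tuple into a single new topic.
    merged = Topic()
    for topic in topics:
        merged.identifiers |= topic.identifiers
        merged.names |= topic.names
    return merged


def merge_sequence(tuple_sequence):
    # One merged topic is produced per tuple of the sequence.
    return [merge_tuple(topics) for topics in tuple_sequence]


# Hypothetical topic data; the names are made up for this sketch.
puccini = Topic({"http://psi.ontopedia.net/Puccini"}, {"Puccini"})
puccini_2 = Topic({"http://psi.ontopedia.net/Puccini_2"}, {"Giacomo Puccini"})

result = merge_sequence([(puccini, puccini_2)])
print(len(result), sorted(result[0].identifiers))
```

A sequence with a single tuple of two topics yields exactly one merged topic that carries the identifiers of both.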

4.17.2. Tutorial: Merge expressions using navigation

Use Case

Now we want to merge all instances of person with the email address tmql4j@topicmapslab.de.

Solution

In this case we cannot use a tuple-expression, because we cannot address the topics to merge directly. We can use a where-clause or simply a path-expression containing a navigation.

First we create a path-expression returning all instances of person, which looks like this.

1:      %prefix psi http://psi.ontopedia.net/
2:      psi:person >> instances

The requirement is to extract only the persons with the specific email address. We assume that the email address is modelled as an occurrence of the type email. To realize our requirement we use a simple filter expression to reduce the set of persons.

1:      %prefix psi http://psi.ontopedia.net/
2:      psi:person >> instances [ . >> characteristics psi:email >> atomify == "tmql4j@topicmapslab.de" ]

In this example we use a filter expression enclosed in square brackets to extract all persons satisfying the given condition. The dot represents one instance of person. Using the characteristics and atomify axes we extract the literal values of the occurrences of the type psi:email. By using the equality operator ==, we check whether the literal is equal to the given string literal.
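
The way the filter works can be sketched as a small Python model. It is only an illustration and not the TMQL4J API: the dictionary persons, the topic identifiers psi:person-a to psi:person-c and the helper functions are assumptions, and we assume that a person passes the filter as soon as one of its email literals equals the given string.

```python
# Hypothetical in-memory data: topic id -> list of (occurrence type, value).
persons = {
    "psi:person-a": [("psi:email", "tmql4j@topicmapslab.de")],
    "psi:person-b": [("psi:email", "someone@example.org")],
    "psi:person-c": [
        ("psi:homepage", "http://example.org/c"),
        ("psi:email", "tmql4j@topicmapslab.de"),
    ],
}


def characteristics(topic_id, occurrence_type):
    # Models ">> characteristics <type>": occurrences of the given type.
    return [occ for occ in persons[topic_id] if occ[0] == occurrence_type]


def atomify(occurrences):
    # Models ">> atomify": reduce each occurrence to its literal value.
    return [value for _, value in occurrences]


def filter_by_email(topic_ids, email):
    # Assumption: a topic is kept if any of its email literals matches.
    return [
        topic_id
        for topic_id in topic_ids
        if email in atomify(characteristics(topic_id, "psi:email"))
    ]


matches = filter_by_email(persons, "tmql4j@topicmapslab.de")
print(matches)
```

The loop variable topic_id plays the role of the dot in the filter: it stands for one instance of person at a time.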

We can simply use the shortcut for the atomification of characteristics to shorten the query.

1:      %prefix psi http://psi.ontopedia.net/
2:      psi:person >> instances [ . / psi:email == "tmql4j@topicmapslab.de" ]

To merge the topics we simply have to add the keyword MERGE in front of our path-expression.

1:      %prefix psi http://psi.ontopedia.net/
2:      MERGE psi:person >> instances [ . / psi:email == "tmql4j@topicmapslab.de" ]

Summary

By using a navigation in combination with filter-expressions, the merge-expression can be used to merge topics satisfying a defined condition.

4.17.3. Tutorial: Complex merge

Use Case

In the last tutorial we try to merge persons born in the same city.

Solution

Because the variables depend on each other, we cannot use a path-expression, so we have to use a where-clause to define our merge rule. The where-clause binds a set of variables. By using a set of value-expressions the user defines which of these variables should be merged. By using the keyword ALL instead, every variable binding is merged into one topic.

First we define our variables, which are bound to two persons and one city. We call them $person, $person' and $city. The postfixed prime tells the query engine that $person and $person' cannot be bound to the same value. The next step is to define the type-instance relationship of each of these variables. We use the isa-expression to bind a variable to the set of instances of a type. By using conjunctions we can define the binding of every variable.

1:      %prefix psi http://psi.ontopedia.net/
2:      $person ISA psi:person
3:      AND
4:      $person' ISA psi:person
5:      AND
6:      $city ISA psi:city

The next step is to define the born-in association played by the city instance and one of these persons. To realize that we use a predicate-invocation.

1:      %prefix psi http://psi.ontopedia.net/
2:      psi:born-in ( psi:person : $person , psi:city : $city )
3:      AND
4:      psi:born-in ( psi:person : $person' , psi:city : $city )

Now we combine both sub-queries.

1:      %prefix psi http://psi.ontopedia.net/
2:      $person ISA psi:person
3:      AND
4:      $person' ISA psi:person
5:      AND
6:      $city ISA psi:city
7:      AND
8:      psi:born-in ( psi:person : $person , psi:city : $city )
9:      AND
10:     psi:born-in ( psi:person : $person' , psi:city : $city )

The overall result of this query is a set of tuples, each containing one city and two persons who were born in this city.
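
How the conjunction of isa-expressions and predicate-invocations produces these tuples can be sketched in Python. This is a toy model and not the evaluation strategy of TMQL4J: the topic identifiers, the set born_in and the function bindings are assumptions for illustration. We also assume that both orders of a pair of persons show up in the result.

```python
from itertools import permutations

# Hypothetical topic map content.
persons = ["psi:person-a", "psi:person-b", "psi:person-c"]
cities = ["psi:city-x", "psi:city-y"]
born_in = {
    ("psi:person-a", "psi:city-x"),
    ("psi:person-b", "psi:city-x"),
    ("psi:person-c", "psi:city-y"),
}


def bindings():
    result = []
    # permutations() never pairs an element with itself, which models the
    # rule that $person and $person' must be bound to different values.
    for person, person_prime in permutations(persons, 2):
        for city in cities:
            # Both born-in predicates have to hold for the same city.
            if (person, city) in born_in and (person_prime, city) in born_in:
                result.append((person, person_prime, city))
    return result


for binding in bindings():
    print(binding)
```

Only person-a and person-b share a city in this data, so person-c never appears in a binding.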

The last step is to define a list of value-expressions that names the items to merge.

1:      %prefix psi http://psi.ontopedia.net/
2:      MERGE $person , $person'
3:              WHERE
4:                      $person ISA psi:person
5:              AND
6:                      $person' ISA psi:person
7:              AND
8:                      $city ISA psi:city
9:              AND
10:                     psi:born-in ( psi:person : $person , psi:city : $city )
11:             AND
12:                     psi:born-in ( psi:person : $person' , psi:city : $city )

In line 2 we add the merge-expression using the keyword MERGE. The merge-expression contains a comma-separated list of value-expressions defining the set of values to merge. In our case we use the variables $person and $person', which should be merged.
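
If more than two persons were born in the same city, the bound pairs overlap. The following Python sketch shows one way to think about the outcome, under the assumption that the merges accumulate, so that all persons connected through such pairs end up in one topic. It uses a small union-find structure. The function merge_pairs and the topic identifiers are assumptions for illustration and do not describe how TMQL4J performs the merge internally.

```python
def merge_pairs(pairs):
    """Group items that are connected through the given pairs."""
    parent = {}

    def find(item):
        # Return the representative of the group the item belongs to.
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]  # path halving
            item = parent[item]
        return item

    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    groups = {}
    for item in parent:
        groups.setdefault(find(item), set()).add(item)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical ($person, $person') bindings: a, b and c share one city,
# d and e share another one.
pairs = [
    ("psi:person-a", "psi:person-b"),
    ("psi:person-b", "psi:person-c"),
    ("psi:person-d", "psi:person-e"),
]
merged_groups = merge_pairs(pairs)
print(merged_groups)
```

Each inner list stands for one topic after merging, so the five persons of this example collapse into two topics.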

Summary

A merge-expression can contain a complex where-clause defining a merge rule by using variables and boolean conditions. In addition to the where-clause, the merge-expression contains a comma-separated list of value-expressions used to define the subset of variables to merge into one topic. If the user wants to merge all items bound by the variables, the keyword ALL can be used as a replacement for the comma-separated list of value-expressions.

5. Additional Material

As additional material for learning more about the language TMQL and the described engine TMQL4J we published a set of tutorial slides (in German).

Part No. | Content | Link
1 | Grammar, TMQL Meta Model, Topic Identity, Navigation Concept Part I | Link
2 | Navigation Concept Part II, Navigation Concept of the new Draft 2010 | Link
3 | Filter, Projection, Operators | Link
4 | Functions (Draft 2007, Draft 2010 and TMQL4J Extensions) | available on 2010-07-09
5 | Sorting Within Tuple Sequences, Environment-Clause, Select Style | available on 2010-07-12
6 | FLW(O)R Style, CTM and XTM Fragments | available on 2010-07-16
7 | Conditional Query, TMQL Part II (Insert, Delete and Merge) | available on 2010-07-19
8 | TMQL Part II (Update), Further Work | available on 2010-07-23
