bxh_eventstats "event" query language

The "event" query language used by bxh_eventstats is a relatively simple way to select events of interest from an XML events file. A query is typically applied to each event in turn and the event is returned if the query result is true. The following sections describe the syntax and semantics of the query language, after which follow the technical specifications for the syntax.

Query syntax and semantics

A query is composed of one or more "conditions" separated by logical operators (i.e. the symbols for "and" and "or"). Parentheses may be used to enforce a particular order of operations.

For the following examples, we will match against the following event taken from an XML events file:

  <event type="shape">
    <onset>60.4</onset>
    <duration>1.5</duration>
    <value name="color">red</value>
    <value name="shape">square</value>
    <value name="position">center</value>
    <value name="response">5</value>
    <value name="correct_response">5</value>
  </event>

Simple parameter matching

The simplest query matches against parameters listed in the event, and consists of a '%' followed by a simple string of characters:

%response

This query is true if the character string "response" matches the name of a value element in the event and the value stored in this element is non-zero. In this case the sample event above does have a value element with name "response", and the stored value is 5, which is non-zero, so this query would return true. Any value element can be queried in this manner, by specifying the name of the value element preceded by a '%' character.
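
As a rough illustration of this rule (not the code used by bxh_eventstats), the following Python sketch parses the sample event and applies the same test. The helper names value_param and param_is_true are hypothetical, and treating non-numeric text as "non-zero" is an assumption that the description above does not settle.

```python
import xml.etree.ElementTree as ET

SAMPLE_EVENT = """
<event type="shape">
  <onset>60.4</onset>
  <duration>1.5</duration>
  <value name="color">red</value>
  <value name="shape">square</value>
  <value name="position">center</value>
  <value name="response">5</value>
  <value name="correct_response">5</value>
</event>
"""

def value_param(event, name):
    """Return the text of the <value name="..."> element, or None if absent."""
    for elem in event.findall("value"):
        if elem.get("name") == name:
            return (elem.text or "").strip()
    return None

def param_is_true(event, name):
    """Hypothetical helper: true if the value element exists and is non-zero."""
    text = value_param(event, name)
    if text is None:
        return False
    try:
        return float(text) != 0
    except ValueError:
        # Assumption: non-numeric text counts as non-zero.
        return True

event = ET.fromstring(SAMPLE_EVENT)
print(param_is_true(event, "response"))  # True: the element exists and holds 5
print(param_is_true(event, "missing"))   # False: no such value element
```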

Various special parameters of the above event are not stored in "value" elements. These are "magic" parameters, and are queried by preceding their name with a '$' character. The current magic parameters are '$type', '$units', '$onset', '$duration', '$name', and '$description'. The following query returns true if the event has an onset element (which it should) and its value is non-zero:

$onset

The '$' and '%' characters preceding the parameter names can in most cases be omitted if there is no ambiguity. A name with no prefix that matches one of the pre-defined magic parameter names listed above is treated as if it started with a '$'. If you need to explicitly match a value element whose name is the same as a pre-defined magic parameter, you will need to precede it with the '%' character. So, in the absence of an ambiguity, the following queries are equivalent to the ones above:

response
onset
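
The prefix rule can be sketched in a few lines of Python. This is only an illustration of the rule as described above; the function name resolve_param and the ("magic" | "value", name) return shape are made up for the example.

```python
# The pre-defined magic parameter names listed in the text above.
MAGIC_NAMES = {"type", "units", "onset", "duration", "name", "description"}

def resolve_param(token):
    """Return ("magic" | "value", bare_name) for a parameter token."""
    if token.startswith("$"):
        return ("magic", token[1:])
    if token.startswith("%"):
        return ("value", token[1:])
    # No prefix: pre-defined magic names win, anything else is a value element.
    return ("magic" if token in MAGIC_NAMES else "value", token)

print(resolve_param("onset"))     # ('magic', 'onset')
print(resolve_param("response"))  # ('value', 'response')
print(resolve_param("%onset"))    # ('value', 'onset')
```

The last call shows the one case where the prefix cannot be dropped: a value element that happens to share its name with a magic parameter.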

Equalities and inequalities

A query may test whether a parameter matches a particular value. This is done with the equality operator "==". The following query returns true if the "response" value element exists and its stored value is equal to 5:

response == 5

To test whether the stored value of the "response" value element is not equal to 5, use the "!=" operator:

response != 5

Queries can match parameters to character strings, not just numbers. Character strings are surrounded by single or double quotes (note that if an application requires your entire query to be surrounded by quotes, you should probably choose a different quote character for the quoted strings within the query itself). The following query tests whether the value stored in the "shape" value element is equal to "square":

shape == 'square'

Various inequality operators are also available. The following queries test whether the onset parameter is respectively "greater than", "less than", "greater than or equal to", or "less than or equal to" 42.5:

onset > 42.5
onset < 42.5
onset >= 42.5
onset <= 42.5

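Parameter names and numeric literals can appear on either or both sides of an equality or inequality operator. String literals can likewise appear on either side of "==" and "!=" (the grammar does not accept them with inequality operators). The following queries are all valid: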

5 == response                 (true)
5 == 5                        (true, always)
5 == 4                        (false, always)
response == correct_response  (true)

The results of evaluating the above queries on the sample event are shown in parentheses. The last query returns true if the "response" and "correct_response" value elements exist and if their stored values are the same.
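
The comparisons in this section can be mimicked with the Python sketch below. It is not the tool's implementation: the dict holding the sample event, the compare helper, and in particular the coercion rule (compare numerically when both sides look like numbers, otherwise compare as strings) are assumptions made for the example.

```python
import operator

OPS = {"==": operator.eq, "!=": operator.ne,
       "<": operator.lt, ">": operator.gt,
       "<=": operator.le, ">=": operator.ge}

# The sample event flattened into a dict (an illustrative representation).
EVENT = {"onset": "60.4", "duration": "1.5", "color": "red", "shape": "square",
         "position": "center", "response": "5", "correct_response": "5"}

def as_number(text):
    """Return text as a float, or None if it is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None

def compare(left, op, right):
    """Compare two operand strings, numerically when both look like numbers."""
    lnum, rnum = as_number(left), as_number(right)
    if lnum is not None and rnum is not None:
        return OPS[op](lnum, rnum)
    if op in ("==", "!="):
        return OPS[op](left, right)
    return False  # assumption: inequalities on non-numbers fail

print(compare(EVENT["response"], "==", "5"))                        # True
print(compare(EVENT["shape"], "==", "square"))                      # True
print(compare(EVENT["onset"], ">", "42.5"))                         # True
print(compare(EVENT["response"], "==", EVENT["correct_response"]))  # True
```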

Logical operators and parentheses

More complex queries can be created by joining two queries with a logical operator. If you choose any two valid queries, represented by QUERY1 and QUERY2, the following query is true if and only if both QUERY1 and QUERY2 are true:

QUERY1 & QUERY2

The following query is true if either or both of QUERY1 and QUERY2 are true:

QUERY1 | QUERY2

Parentheses are useful to add readability or to enforce an order of operations. The following query is true if and only if QUERY, the query inside the parentheses, is true:

( QUERY )

The '&' operator binds tighter than the '|' operator, so the following two queries are equivalent:

QUERY1 & QUERY2 | QUERY3 & QUERY4
( QUERY1 & QUERY2 ) | ( QUERY3 & QUERY4 )
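
Python's '&' and '|' operators on booleans happen to follow the same precedence, so the equivalence can be checked exhaustively with a short script. The function names below are invented for the example; the four arguments stand for the truth values of QUERY1 through QUERY4.

```python
from itertools import product

def flat(q1, q2, q3, q4):
    # Python's & also binds tighter than |, mirroring the rule above.
    return q1 & q2 | q3 & q4

def grouped(q1, q2, q3, q4):
    return (q1 & q2) | (q3 & q4)

def other_grouping(q1, q2, q3, q4):
    # What the query would mean if '|' bound tighter instead.
    return q1 & (q2 | q3) & q4

combos = list(product([False, True], repeat=4))
print(all(flat(*c) == grouped(*c) for c in combos))         # True
print(all(flat(*c) == other_grouping(*c) for c in combos))  # False
```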

Shortcuts

Another way to test if a parameter matches a number or string is to specify it in parentheses after the parameter name. The following two queries are equivalent:

response(5)
response == 5

Here is an example using string literals:

shape('square')
shape == 'square'

You can also specify inequality operators. The following two queries are equivalent:

onset(>=10.5)
onset >= 10.5

A numeric range can be specified as two numbers separated by a hyphen ('-'). The query returns true if the parameter's value is in the (inclusive) range:

onset(10.5-64.2)
( onset >= 10.5 & onset <= 64.2 )

Any of the above types of shortcuts may be combined in the parentheses, separated by commas. In this case, the query returns true if the parameter's value matches any one or more of the specified tests. The following two queries are equivalent:

response(5,6,8-10,<=3)
( response == 5 | response == 6 | ( response >= 8 & response <= 10 ) | response <= 3 )
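
The shortcut list can be read as "any of these tests passes". The Python sketch below expresses that reading; it is an illustration only. The helpers passes and matches_tests are hypothetical, and splitting the list on commas is a simplification that would break for string literals containing a comma.

```python
import operator
import re

INEQ = {"<=": operator.le, ">=": operator.ge, "<": operator.lt, ">": operator.gt}
NUM = r"(?:\d+\.\d*|\d+|\.\d+)"
RANGE_RE = re.compile(rf"({NUM})\s*-\s*({NUM})")

def passes(value, test):
    """Check one test from the parenthesised list against value."""
    test = test.strip()
    if test[:1] in ("'", '"'):             # string literal: exact match
        return str(value) == test[1:-1]
    try:
        number = float(value)
    except ValueError:
        return False                        # numeric test, non-numeric value
    for symbol in ("<=", ">=", "<", ">"):   # two-character operators first
        if test.startswith(symbol):
            return INEQ[symbol](number, float(test[len(symbol):]))
    match = RANGE_RE.fullmatch(test)
    if match:                               # inclusive range LOW-HIGH
        return float(match.group(1)) <= number <= float(match.group(2))
    return number == float(test)

def matches_tests(value, testlist):
    """True if value passes any of the comma-separated tests."""
    return any(passes(value, t) for t in testlist.split(","))

print(matches_tests(5, "5,6,8-10,<=3"))       # True  (equals 5)
print(matches_tests(9, "5,6,8-10,<=3"))       # True  (inside 8-10)
print(matches_tests(7, "5,6,8-10,<=3"))       # False (no test passes)
print(matches_tests("square", "'square'"))    # True
```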

That's all

Arbitrarily complex queries can be constructed using the above rules. However, because the "event" query language is deliberately simple, certain types of queries may be impossible to express in it. If you need something more complex, some tools support queries using the "XPath" query language. XPath is a much more complex language, though, and using it to its fullest potential requires a thorough understanding of the structure of the XML events file.


Technical specs

The following sections are only likely to be of interest to developers.

TOKENIZING

Tokenizing is greedy: the largest match wins, so '>=' is read as a single token rather than as a '>' token followed by more input. In any of the rules below, using the first valid match will ensure proper tokenizing. Arbitrary amounts of whitespace may separate tokens.

TOKEN      ::=  NUMTOKEN
             |  STRTOKEN
             |  PARAMTOKEN
             |  PAREN
             |  ","
             |  "-"
             |  "&"
             |  "|"
             |  INEQ_OP
             |  EQ_OP

NUMTOKEN   ::=  DIGIT+ "." DIGIT*
             |  DIGIT+
             |  "." DIGIT+

STRTOKEN   ::=  "'" STRCHAR1+ "'"
             |  '"' STRCHAR2+ '"'

PARAMTOKEN ::=  "$" PARAMSTART PARAMCHAR*
             |  "%" PARAMSTART PARAMCHAR*
             |      PARAMSTART PARAMCHAR*

PARAMSTART ::=  "_" | LETTER

PARAMCHAR  ::=  "." | "_" | LETTER | DIGIT

STRCHAR1   ::= any ASCII character except single quote (')

STRCHAR2   ::= any ASCII character except double quote (")

PAREN      ::=  "(" | ")"

INEQ_OP  ::=  "<=" | ">=" | "<" | ">"

EQ_OP    ::=  "==" | "!="
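
One possible way to implement these rules is a single regular expression whose alternatives are tried in order, so that listing the longer form first gives the greedy behaviour described above. The Python sketch below is an assumption-laden illustration, not the tool's tokenizer: LETTER is taken to mean ASCII letters, the group names are invented, and the single-character tokens are folded into two groups for brevity.

```python
import re

# Alternatives are tried left to right, so longer forms are listed first.
TOKEN_RE = re.compile(r"""
    (?P<NUM>    \d+\.\d* | \d+ | \.\d+ )
  | (?P<STR>    '[^']+' | "[^"]+" )
  | (?P<PARAM>  [$%]?[A-Za-z_][A-Za-z0-9_.]* )
  | (?P<INEQ>   <= | >= | < | > )
  | (?P<EQ>     == | != )
  | (?P<PAREN>  [()] )
  | (?P<PUNCT>  [,&|-] )
""", re.VERBOSE)

def tokenize(query):
    """Split a query string into (kind, text) pairs, skipping whitespace."""
    tokens, pos = [], 0
    while pos < len(query):
        if query[pos].isspace():
            pos += 1
            continue
        match = TOKEN_RE.match(query, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}: {query[pos]!r}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("onset>=10.5"))
# [('PARAM', 'onset'), ('INEQ', '>='), ('NUM', '10.5')]
```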

SYNTAX

For readability, allowable whitespace is not specified here. All rules accept whitespace between contiguous components except rules that match single tokens, i.e. NUMTOKEN, STRTOKEN, and PARAMTOKEN.

QUERY      ::= QUERY "|" QUERY
             | AQUERY

AQUERY     ::= AQUERY "&" AQUERY
             | PCQUERY

PCQUERY    ::= "(" QUERY ")"
             | CONDITION

CONDITION  ::= PARAM_NUM     INEQ_OP PARAM_NUM
             | PARAM_NUM_STR EQ_OP   PARAM_NUM_STR
             | PARAMTOKEN "(" TESTLIST ")"
             | PARAMTOKEN

PARAM_NUM  ::= PARAMTOKEN
             | NUMTOKEN

PARAM_NUM_STR ::= PARAMTOKEN
             | NUMTOKEN
             | STRTOKEN

TESTLIST   ::= VALTEST "," TESTLIST
             | VALTEST

VALTEST    ::= INEQ_OP NUMTOKEN
             | NUMTOKEN "-" NUMTOKEN
             | NUMTOKEN
             | STRTOKEN

NUMTOKEN   ::= DIGIT+ "." DIGIT*
             | DIGIT+
             | "." DIGIT+

STRTOKEN   ::= "'" STRCHAR_S+ "'"
             | '"' STRCHAR_D+ '"'

PARAMTOKEN ::= '$' PARAMSTART PARAMCHAR*
             | '%' PARAMSTART PARAMCHAR*
             |     PARAMSTART PARAMCHAR*

PARAMSTART ::= '_'
             | LETTER

PARAMCHAR  ::= '.'
             | '_'
             | LETTER
             | DIGIT

STRCHAR_S  ::= any ASCII character except single quote (')

STRCHAR_D  ::= any ASCII character except double quote (")

INEQ_OP    ::= "<="
             | ">="
             | "<"
             | ">"

EQ_OP      ::= "=="
             | "!="
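
For developers, the grammar above can be turned into a small recursive-descent parser. The Python sketch below is one way to do it under stated assumptions, not the parser used by bxh_eventstats: the left-recursive QUERY and AQUERY rules are rewritten as loops, LETTER is taken to mean ASCII letters, the tuple-based tree and the labels "or", "and", "in", "range", and "test" are invented for the example, and the sketch does not enforce which operand kinds each operator accepts.

```python
import re

# Two-character operators are listed before the one-character tokens.
TOKEN_RE = re.compile(
    r"\s*(\d+\.\d*|\d+|\.\d+|'[^']+'|\"[^\"]+\"|[$%]?[A-Za-z_][A-Za-z0-9_.]*"
    r"|<=|>=|==|!=|[<>(),&|-])"
)
INEQ_OPS = ("<=", ">=", "<", ">")
EQ_OPS = ("==", "!=")
RESERVED = set(INEQ_OPS + EQ_OPS + ("(", ")", ",", "-", "&", "|"))

def tokenize(query):
    tokens, pos, query = [], 0, query.strip()
    while pos < len(query):
        match = TOKEN_RE.match(query, pos)
        if not match:
            raise ValueError(f"bad token at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, query):
        self.tokens, self.pos = tokenize(query), 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise ValueError(f"expected {expected!r}, got {token!r}")
        self.pos += 1
        return token

    def operand(self):
        token = self.take()
        if token in RESERVED:
            raise ValueError(f"expected a parameter or literal, got {token!r}")
        return token

    def query(self):                      # QUERY ::= AQUERY ("|" AQUERY)*
        node = self.aquery()
        while self.peek() == "|":
            self.take()
            node = ("or", node, self.aquery())
        return node

    def aquery(self):                     # AQUERY ::= PCQUERY ("&" PCQUERY)*
        node = self.pcquery()
        while self.peek() == "&":
            self.take()
            node = ("and", node, self.pcquery())
        return node

    def pcquery(self):                    # PCQUERY ::= "(" QUERY ")" | CONDITION
        if self.peek() == "(":
            self.take("(")
            node = self.query()
            self.take(")")
            return node
        return self.condition()

    def condition(self):
        left = self.operand()
        nxt = self.peek()
        if nxt == "(":                    # PARAMTOKEN "(" TESTLIST ")"
            self.take("(")
            tests = [self.valtest()]
            while self.peek() == ",":
                self.take()
                tests.append(self.valtest())
            self.take(")")
            return ("in", left, tests)
        if nxt in INEQ_OPS or nxt in EQ_OPS:
            return (self.take(), left, self.operand())
        return ("test", left)             # bare PARAMTOKEN

    def valtest(self):
        token = self.take()
        if token in INEQ_OPS:             # INEQ_OP NUMTOKEN
            return (token, self.operand())
        if self.peek() == "-":            # NUMTOKEN "-" NUMTOKEN
            self.take()
            return ("range", token, self.operand())
        return ("==", token)              # NUMTOKEN | STRTOKEN

def parse(query):
    parser = Parser(query)
    tree = parser.query()
    if parser.peek() is not None:
        raise ValueError(f"unexpected trailing token {parser.peek()!r}")
    return tree

print(parse("a & b | c"))
# ('or', ('and', ('test', 'a'), ('test', 'b')), ('test', 'c'))
```

Because aquery is called from inside query, '&' groups more tightly than '|' without any explicit precedence table.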