git-svn-id: svn://svn.open-ils.org/ILS/trunk@13134 dcc99617-32d9-48b4-a31d-7c20da2025e4
authorkgs <kgs@dcc99617-32d9-48b4-a31d-7c20da2025e4>
Tue, 12 May 2009 01:16:48 +0000 (01:16 +0000)
committerkgs <kgs@dcc99617-32d9-48b4-a31d-7c20da2025e4>
Tue, 12 May 2009 01:16:48 +0000 (01:16 +0000)
docs/TechRef/JSONGrammar.xml [new file with mode: 0644]
docs/TechRef/TechRef.xml [new file with mode: 0644]

diff --git a/docs/TechRef/JSONGrammar.xml b/docs/TechRef/JSONGrammar.xml
new file mode 100644 (file)
index 0000000..4812128
--- /dev/null
@@ -0,0 +1,1223 @@
+<?xml version="1.0" encoding="utf-8"?>
+
+<sect1 version="5.0" xmlns="http://docbook.org/ns/docbook"
+       xmlns:xi="http://www.w3.org/2003/XInclude" 
+       xmlns:xlink="http://www.w3.org/1999/xlink">
+
+       <title>Grammar of JSON Queries</title>
+
+       <para>
+               <author>
+                       <personname>
+                               <firstname>Scott</firstname>
+                               <surname>McKellar</surname>
+                       </personname>
+                       <affiliation>
+                               <orgname>Equinox Software, Inc.</orgname>
+                       </affiliation>
+               </author>
+       </para>
+
+
+       <sect2>
+               <title>Introduction</title>
+               <para> The format of this grammar approximates Extended Backus-Naur notation. However it is
+                       intended as input to human beings, not to parser generators such as Lex or Yacc. Do not
+                       expect formal rigor. Sometimes narrative text will explain things that are clumsy to
+                       express in formal notation. More often, the text will restate or summarize the formal
+                       productions. </para>
+               <para> Conventions: </para>
+               <orderedlist>
+                       <listitem>
+                               <para>The grammar is a series of productions.</para>
+                       </listitem>
+                       <listitem>
+                               <para>A production consists of a name, followed by "::=", followed by a definition
+                                       for the name. The name identifies a grammatical construct that can appear on the
+                                       right side of another production.</para>
+                       </listitem>
+                       <listitem>
+                               <para>Literals (including punctuation) are enclosed in 'single quotes', or in
+                                       "double quotes" if case is not significant.</para>
+                       </listitem>
+                       <listitem>
+                               <para>A single quotation mark within a literal is escaped with a preceding
+                                       backslash: 'dog\'s tail'.</para>
+                       </listitem>
+                       <listitem>
+                               <para>If a construct can be defined more than one way, then the alternatives may
+                                       appear in separate productions; or, they may appear in the same production,
+                                       separated by pipe symbols. The choice between these representations is of only
+                                       cosmetic significance.</para>
+                       </listitem>
+                       <listitem>
+                               <para>A construct enclosed within square brackets is optional.</para>
+                       </listitem>
+                       <listitem>
+                               <para>A construct enclosed within curly braces may be repeated zero or more
+                                       times.</para>
+                       </listitem>
+                       <listitem>
+                               <para>JSON allows arbitrary white space between tokens. To avoid ugly clutter, this
+                                       grammar ignores the optional white space. </para>
+                       </listitem>
+                       <listitem>
+                               <para>In many cases a production defines a JSON object, i.e. a list of name-value
+                                       pairs, separated by commas. Since the order of these name/value pairs is not
+                                       significant, the grammar will not try to show all the possible sequences. In
+                                       general it will present the required pairs first, if any, followed by any
+                                       optional elements.</para>
+                       </listitem>
+               </orderedlist>
+
+               <para> Since both EBNF and JSON use curly braces and square brackets, pay close attention to
+                       whether these characters are in single quotes. If they're in single quotes, they are
+                       literal elements of the JSON notation. Otherwise they are elements of the EBNF notation.
+               </para>
+       </sect2>
+
+       <sect2>
+               <title>Primitives</title>
+               <para> We'll start by defining some primitives, to get them out of the way. They're mostly
+                       just what you would expect. </para>
+
+               <productionset>
+                       <production xml:id="ebnf.string">
+                               <lhs> string </lhs>
+                               <rhs> '"' chars '"' </rhs>
+                       </production>
+
+                       <production xml:id="ebnf.chars">
+                               <lhs> chars </lhs>
+                               <rhs> any valid sequence of UTF-8 characters, with certain special characters
+                                       escaped according to JSON rules </rhs>
+                       </production>
+
+                       <production xml:id="ebnf.integer_literal">
+                               <lhs> integer_literal </lhs>
+                               <rhs> [ sign ] digit { digit } </rhs>
+                       </production>
+
+                       <production xml:id="ebnf.sign">
+                               <lhs> sign </lhs>
+                               <rhs> '+' | '-' </rhs>
+                       </production>
+
+                       <production xml:id="ebnf.digit">
+                               <lhs> digit </lhs>
+                               <rhs>digit = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'</rhs>
+                       </production>
+
+                       <production xml:id="ebnf.integer_string">
+                               <lhs> integer_string </lhs>
+                               <rhs> '"' integer_literal '"' </rhs>
+                       </production>
+
+                       <production xml:id="ebnf.integer">
+                               <lhs> integer </lhs>
+                               <rhs> integer_literal | integer_string </rhs>
+                       </production>
+
+                       <production xml:id="ebnf.number">
+                               <lhs> number </lhs>
+                               <rhs> any valid character sequence that is numeric according to JSON rules </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> When json_query requires an integral value, it will usually accept a quoted string
+                       and convert it to an integer by brute force – to zero if necessary. Likewise it may
+                       truncate a floating point number to an integral value. Scientific notation will be
+                       accepted but may not give the intended results. </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.boolean">
+                               <lhs> boolean </lhs>
+                               <rhs> 'true' | 'false' | string | number </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> The preferred way to encode a boolean is with the JSON reserved word true or false,
+                       in lower case without quotation marks. The string <literal>true</literal>, in upper,
+                       lower, or mixed case, is another way to encode true. Any other string evaluates to
+                       false. </para>
+               <para> As an accommodation to perl, numbers may be used as booleans. A numeric value of 1
+                       means true, and any other numeric value means false. </para>
+               <para> Any other valid JSON value, such as an array, will be accepted as a boolean but
+                       interpreted as false. </para>
+               <para> The last couple of primitives aren't really very primitive, but we introduce them
+                       here for convenience: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.class_name">
+                               <lhs> class_name </lhs>
+                               <rhs> string </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> A class_name is a special case of a string: the name of a class as defined by the
+                       IDL. The class may refer either to a database table or to a source_definition, which is
+                       a subquery. </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.field_name">
+                               <lhs> field_name </lhs>
+                               <rhs> string </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> A field_name is another special case of a string: the name of a non-virtual field as
+                       defined by the IDL. A field_name is also a column name for the table corresponding to
+                       the relevant class. </para>
+
+       </sect2>
+
+       <sect2>
+               <title>Query</title>
+
+               <para> The following production applies not only to the main query but also to most
+                       subqueries. </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.query">
+                               <lhs> query </lhs>
+                               <rhs> '{'<sbr/> '"from"' ':' from_list<sbr/> [ ',' '"select"' ':' select_list
+                                       ]<sbr/> [ ',' '"where"' ':' where_condition ]<sbr/> [ ',' '"having"' ':'
+                                       where_condition ]<sbr/> [ ',' '"order_by"' ':' order_by_list ]<sbr/> [ ','
+                                       '"limit"' ':' integer ]<sbr/> [ ',' '"offset"' ':' integer ]<sbr/> [ ','
+                                       '"distinct"' ':' boolean ]<sbr/> [ ',' '"no_i18n"' ':' boolean ]<sbr/> '}'
+                               </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> Except for the <literal>"distinct"</literal> and <literal>no_i18n</literal> entries,
+                       each name/value pair represents a major clause of the SELECT statement. The name/value
+                       pairs may appear in any order. </para>
+               <para> There is no name/value pair for the GROUP BY clause, because json_query generates it
+                       automatically according to information encoded elsewhere. </para>
+               <para> The <literal>"distinct"</literal> entry, if present and true, tells json_query that
+                       it may have to create a GROUP BY clause. If not present, it defaults to false. </para>
+               <para> The <literal>"no_i18n"</literal> entry, if present and true, tells json_query to
+                       suppress internationalization. If not present, it defaults to false. (Note that
+                               <literal>"no_i18n"</literal> contains the digit one, not the letter ell.) </para>
+               <para> The values for <literal>limit</literal> and <literal>offset</literal> provide the
+                       arguments of the LIMIT and OFFSET clauses, respectively, of the SQL statement. Each
+                       value should be non-negative, if present, or else the SQL won't work. </para>
+
+       </sect2>
+
+       <sect2><title>FROM Clause</title>
+               <para>
+                       The object identified by <literal>“from”</literal> encodes the FROM clause of
+                       the SQL.  The associated value may be a string, an array, or a JSON object.
+               </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.from_list_0">
+                               <lhs> from_list </lhs>
+                               <rhs> class_name </rhs>
+                       </production>
+
+               </productionset>
+
+               <para>
+                       If <literal>from_list</literal> is a <literal>class_name</literal>, the
+                       json_query inserts the corresponding table name or subquery into the FROM
+                       clause, using the <literal>class_name</literal> as an alias for the table
+                       or subquery.  The class must be defined as non-virtual in the IDL.
+               </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.from_list_1">
+                               <lhs> from_list </lhs>
+                               <rhs> '['  string  {  ','  parameter  }  ']' </rhs>
+                       </production>
+
+                       <production xml:id="ebnf.parameter">
+                               <lhs> parameter </lhs>
+                               <rhs> string  |  number  |  'null' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para>
+                       If from_list is a JSON array, then it represents a table-like function from
+                       which the SQL statement will select rows, using a SELECT clause consisting
+                       of “SELECT *” (regardless of the select_list supplied by the method parameter).
+               </para>
+               <para>
+                       The first entry in the array is the name of the function.  It must be a string
+                       naming a stored function.  Each subsequent entry is a function parameter.  If
+                       it is a string or a number, json_query will insert it into a comma-separated
+                       parameter list, enclosed in quotes, with any special characters escaped as needed.
+                       If it is the JSON reserved word <literal>null</literal>, json_query will insert
+                       it into the parameter list as a null value.
+               </para>
+               <para>
+                       If <literal>from_list</literal> is a JSON object, it must contain exactly one entry.
+                       The key of this entry must be the name of a non-virtual class defined in the IDL.
+                       This class will be the top-level class of the FROM clause, the only one named
+                       outside of a JOIN clause.
+               </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.from_list_2">
+                               <lhs> from_list </lhs>
+                               <rhs> '{' class_name ':' join_list '}' </rhs>
+                       </production>
+
+                       <production xml:id="ebnf.join_list_0">
+                               <lhs> join_list </lhs>
+                               <rhs> class_name </rhs>
+                       </production>
+
+                       <production xml:id="ebnf.join_list_1">
+                               <lhs> join_list </lhs>
+                               <rhs> '{' join_def { ',' join_def } '}' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para>
+                       If the associated data is a <literal>class_name</literal>, json_query will
+                       construct an INNER JOIN clause joining the class to the top-level clause,
+                       using the columns specified by the IDL for such a join.
+               </para>
+               <para>
+                       Otherwise, the associated data must be a JSON object with one or more entries,
+                       each entry defining a join:
+               </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.join_def">
+                               <lhs> join_def </lhs>
+                               <rhs>
+                                       class_name  ':'<sbr/>
+                                       '{'<sbr/>
+                                       [  '”type”'      ':'  string      ]<sbr/>
+                                       [  '”field”'     ':'  field_name  ]<sbr/>
+                                       [  '”fkey”'      ':'  field_name  ]<sbr/>
+                                       [  '”filter”'    ':'  where_condition  ]<sbr/>
+                                       [  '”filter_op”' ':'  string      ]<sbr/>
+                                       [  '”join”'      ':'  join_list   ]<sbr/>
+                                       '}'
+                               </rhs>
+                       </production>
+
+               </productionset>
+
+               <para>
+                       The data portion of the <literal>“join_type”</literal> entry tells json_query
+                       whether to use a left join, right join, full join, or inner join.  The values
+                       <literal>“left”</literal>, <literal>“right”</literal>, and <literal>“full”</literal>,
+                       in upper, lower, or mixed case, have the obvious meanings.  If the
+                       <literal>“join_type”</literal> entry has any other value, or is not present,
+                       json_query constructs an inner join.
+               </para>
+               <para>
+                       The <literal>“field”</literal> and <literal>“fkey”</literal> attributes specify the
+                       columns to be equated in the join condition.  The <literal>“field”</literal>
+                       attribute refers to the column in the joined table, i.e. the one named by the
+                       <literal>join_def</literal>.  The <literal>“fkey”</literal> attribute refers to the
+                       corresponding column in the other table, i.e. the one named outside the
+                       <literal>join_def</literal> – either the top-level table or a table named by some
+                       other <literal>join_def</literal>.
+               </para>
+               <para>
+                       It may be tempting to suppose that <literal>“fkey”</literal> stands for “foreign key”,
+                       and therefore refers to a column in the child table that points to the key of a
+                       parent table.  Resist the temptation; the labels are arbitrary.  The json_query
+                       method doesn't care which table is the parent and which is the child.
+               </para>
+               <para>
+                       These relationships are best explained with an example.  The following
+                       <literal>from_list</literal>:
+               </para>
+
+               <informalexample>
+                       <programlisting language="JSON">
+       {
+           "aou": {
+               "asv": {
+                   "type" : "left",
+                   "fkey" : "id",
+                   "field" : "owner"
+               }
+           }
+       }
+                       </programlisting>
+               </informalexample>
+
+               <para>
+                       ...turns into the following FROM clause:
+               </para>
+
+               <informalexample>
+                       <programlisting language="SQL">
+       FROM
+           actor.org_unit AS "aou"
+               LEFT JOIN action.survey AS "asv"
+                   ON ( "asv".owner = "aou".id )
+                       </programlisting>
+               </informalexample>
+
+               <para>
+                       Note in this example that <literal>“fkey”</literal> refers to a column of the
+                       class <literal>“aou”</literal>, and <literal>“field”</literal> refers to a
+                       column of the class <literal>“asv”</literal>.
+               </para>
+               <para>
+                       If you specify only one of the two columns, json_query will try to identify the
+                       other one from the IDL. However, if you specify only the column from the parent
+                       table, this attempt will probably fail.
+               </para>
+               <para>
+                       If you specify both columns, json_query will use the column names you specify,
+                       without verifying them with a lookup in the IDL.  By this means you can perform
+                       a join using a linkage that the IDL doesn't define.  Of course, if the columns
+                       don't exist in the database, the query will fail when json_query tries to execute it.
+               </para>
+               <para>
+                       Using the columns specified, either explicitly or implicitly, the json_query
+                       method constructs a join condition.  With raw SQL it is possible (though
+                       rarely useful) to join two tables by an inequality.  However the json_query
+                       method always uses a simple equality condition.
+               </para>
+               <para>
+                       Using a <literal>“filter”</literal> entry in the join_def, you can apply one
+                       or more additional conditions to the JOIN clause, typically to restrict the
+                       join to certain rows of the joined table.  The data associated with the
+                       <literal>“filter”</literal> key is the same sort of
+                       <literal>where_condition</literal> that you use for a WHERE clause
+                       (discussed below).
+               </para>
+               <para>
+                       If the string associated with the <literal>“filter_op”</literal> entry is
+                       <literal>“OR”</literal> in upper, lower, or mixed case, then the json_query
+                       method uses OR to connect the standard join condition to any additional
+                       conditions supplied by a <literal>“filter”</literal> entry.
+               </para>
+               <para>
+                       (Note that if the <literal>where_condition</literal> supplies multiple
+                       conditions, they will be connected by AND.  You will probably want to move
+                       them down a layer – enclose them in parentheses, in effect – to avoid a
+                       confusing mixture of ANDs and ORs.)
+               </para>
+               <para>
+                       If the <literal>“filter_op”</literal> entry carries any other value, or if
+                       it is absent, then the json_query method uses AND.  In the absence of a
+                       <literal>“filter”</literal> entry, <literal>“filter_op”</literal> has no effect.
+               </para>
+               <para>
+                       A <literal>“join”</literal> entry in a <literal>join_def</literal> specifies
+                       another layer of join.  The class named in the subjoin is joined to the class
+                       named by the <literal>join_def</literal> to which it is subordinate.  By this
+                       means you can encode multiple joins in a hierarchy.
+               </para>
+       </sect2>
+
+       <sect2><title>SELECT Clause</title>
+               <para>
+                       If a query does not contain an entry for <literal>“select”</literal>, json_query
+                       will construct a default SELECT clause.  The default includes every non-virtual
+                       field from the top-level class of the FROM clause, as defined by the IDL.  The
+                       result is similar to SELECT *, except:
+               </para>
+
+               <itemizedlist>
+                       <listitem>
+                               <para>The default includes only the fields defined in the IDL.</para>
+                       </listitem>
+                       <listitem>
+                               <para>The columns will appear in the same order in which they appear in the IDL,
+                               regardless of the order in which the database defines them.</para>
+                       </listitem>
+               </itemizedlist>
+               
+               <para>
+                       There are other ways to specify a default SELECT list, as shown below.
+               </para>
+               <para>
+                       If a <literal>"select"</literal> entry is present, the associated value must
+                       be a JSON object, keyed on class names:
+               </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.select_list">
+                               <lhs> select_list </lhs>
+                               <rhs> '{' class_name ':' field_list { ',' class_name ':' field_list } '}' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para>
+                       The <literal>class_name</literal> must identify either the top-level class or
+                       a class belonging to one of the joins.  Otherwise json_query will silently
+                       ignore the <literal>select_list</literal>.
+               </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.field_list_0">
+                               <lhs> field_list </lhs>
+                               <rhs> 'null'  |  '”*”' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para>
+                       If a field_list is either the JSON reserved word <literal>null</literal>
+                       (in lower case) or an asterisk in double quotes, json_query constructs a
+                       default SELECT list – provided that the class is the top-level class of the
+                       query.  If the class belongs to a join somewhere, json_query ignores the
+                       <literal>field_list</literal>.
+               </para>
+               <para>
+                       More commonly, the <literal>field_list</literal> is a JSON array of zero or
+                       more field specifications:
+               </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.field_list_1">
+                               <lhs> field_list </lhs>
+                               <rhs> '['  [  field_spec  {  ','  field_spec  }  ]  ']' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para>
+                       If the array is empty, json_query will construct a default SELECT list for
+                       the class – again, provided that the class is the top-level class in the query.
+               </para>
+               <para>
+                       In the simplest case, a field specification may name a non-virtual field
+                       defined in the IDL:
+               </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.field_spec_0">
+                               <lhs> field_spec </lhs>
+                               <rhs> field_name </rhs>
+                       </production>
+
+               </productionset>
+
+               <para>
+                       In some cases json_query constructs a call to the
+                       <literal>oils_i18n_xlate</literal> function to internationalize the value of the
+                       selected column.  Specifically, it does so if all the following are true:
+               </para>
+
+               <itemizedlist>
+                       <listitem>
+                               <para>the settings file defines a locale;</para>
+                       </listitem>
+                       <listitem>
+                               <para>in the field definition for the field in the IDL, the tag
+                               <literal>“il8n”</literal> is present and true;</para>
+                       </listitem>
+                       <listitem>
+                               <para>the query does <emphasis>not</emphasis> include the
+                               <literal>"no_il8n"</literal> tag (or includes it with a value of false).</para>
+                       </listitem>
+               </itemizedlist>
+               
+               <para>
+                       A field specification may be a JSON object:
+               </para>
+               
+               <productionset>
+                       
+                       <production xml:id="ebnf.field_spec_1">
+                               <lhs> field_spec </lhs>
+                               <rhs>
+                                       '{'<sbr/>
+                                       '”column”'  ':'  <sbr/>
+                                       [ ',' '”alias”'  ':'  string  ]<sbr/>
+                                       [ ',' '”aggregate”'  ':'  boolean  ]<sbr/>
+                                       [ ',' transform_spec  ]<sbr/>
+                                       '}'
+                               </rhs>
+                       </production>
+
+               </productionset>
+
+               <para>
+                       The <literal>“column”</literal> entry provides the column name, which must
+                       be defined as non-virtual in the IDL.
+               </para>
+               <para>
+                       The <literal>“alias”</literal> entry provides a column alias.  If no alias
+                       is specified, json_query uses the column name as its own alias.
+               </para>
+               <para>
+                       The <literal>“aggregate”</literal> entry has no effect on the SELECT clause
+                       itself.  Rather, it affects the construction of a GROUP BY class.  If there
+                       is an <literal>“aggregate”</literal> entry for any field, then json_query builds
+                       a GROUP BY clause listing every column that is <emphasis>not</emphasis> tagged
+                       for aggregation (or that carries an <literal>“aggregate”</literal> entry with
+                       a value of false).  If <emphasis>all</emphasis> columns are tagged for
+                       aggregation, then json_query omits the GROUP BY clause.
+               </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.transform_spec_0">
+                               <lhs> transform_spec </lhs>
+                               <rhs>
+                                       '”transform”'  ':'  string  ]<sbr/>
+                                       [ ',' '”result_field”  ':'  string  ]<sbr/>
+                                       [ ',' '”params”  ':' param_list  ]
+                               </rhs>
+                       </production>
+
+               </productionset>
+
+               <para>
+                       When a <literal>transform_spec</literal> is present, json_query selects the
+                       return value of a function instead of selecting the column directly.  The entry
+                       for <literal>“transform”</literal> provides the name of the function, and the
+                       column name (as specified by the <literal>“column”</literal> tag), qualified by
+                       the class name, is the argument to the function.  For example, you might use such
+                       a function to format a date or time, or otherwise transform a column value.
+                       You might also use an aggregate function such as SUM, COUNT, or MAX (possibly
+                       together with the <literal>“aggregate”</literal> tag).
+               </para>
+               <para>
+                       The <literal>“result_field”</literal> entry, when present, specifies a subcolumn
+                       of the function's return value.  The resulting SQL encloses the function call
+                       in parentheses, and follows it with a period and the subcolumn name.
+               </para>
+               <para>
+                       The <literal>“params”</literal> entry, if present, provides a possibly empty
+                       array of additional parameter values, either strings, numbers, or nulls:
+               </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.param_list">
+                               <lhs> param_list </lhs>
+                               <rhs> '['  [  parameter  {  ','  parameter  }  ]  ']' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para>
+                       Such parameter values are enclosed in single quotes, with any special characters
+                       escaped as needed, and inserted after the column name as additional parameters
+                       to the function.  You might, for example, use an additional parameter to provide
+                       a format string for a reformatting function.
+               </para>
+       </sect2>
+
+       <sect2><title>WHERE Clause</title>
+               <para> There are two types of <literal>where_condition</literal>: objects and arrays.
+                       Of these, the object type is the more fundamental, and occurs at some level in every
+                       <literal>where_condition</literal>.  The array type is mainly a way of circumventing
+                       a limitation of the object type. </para>
+               <para> The object type of <literal>where_condition</literal> is a comma-separated list
+                       of one or more <literal>conditions</literal>: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.where_condition_0">
+                               <lhs> where_condition </lhs>
+                               <rhs> '{' condition { ',' condition } '}' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> The generated SQL will include a code fragment for each <literal>condition</literal>,
+                       joined by AND (or in some cases by OR, as described below). </para>
+               <para> As usual for entries in a JSON object, each <literal>condition</literal> consists
+                       of a unique string to serve as a key, a colon, and an associated value. </para>
+               <para> The key string may be the name of a column belonging to the relevant table, or
+                       it may be an operator string.  In order to distinguish it from any possible column
+                       name, an operator string always begins with a plus sign or minus sign. </para>
+               <para> JSON requires that every key string be unique within an object.  This requirement
+                       imposes some awkward limitations on a JSON query.  For example, you might want to
+                       express two conditions for the same column: id &gt; 10 and id != 25.  Since each of
+                       those conditions would have the same key string, namely “id”, you can't put them
+                       into the same JSON object. </para>
+               <para> The solution is to put such conflicting conditions in separate JSON objects, and
+                       put the objects into an array: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.where_condition_1">
+                               <lhs> where_condition </lhs>
+                               <rhs> '[' where_condition { ',' where_condition } ']' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> The resulting SQL encloses each subordinate set of <literal>conditions</literal>
+                       in parentheses, and connects the sets with AND (or in some cases OR, as described
+                       below).  It's possible to put only a single <literal>where_condition</literal> in
+                       the array; the result is to add a layer of parentheses around the condition. </para>
+               <para> There are two kinds of <literal>condition</literal> where the operator begins
+                       with a plus sign.  In the simpler case, the associated data is simply a column name:
+               </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.condition_0">
+                               <lhs> condition </lhs>
+                               <rhs> plus_class ':' field_name </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> A <literal>plus_class</literal> is a string that begins with a plus sign.
+                       The rest of the string, after the plus sign, must be the class name for the table
+                       to which the column belongs. </para>
+               <para> If the column is a boolean, then the resulting SQL uses it (as qualified by the
+                       class name) as a stand-alone condition. </para>
+               <para> Otherwise, this kind of syntax provides a way to place a column on the right side
+                       of a comparison operator.  For example: </para>
+
+               <informalexample>
+                       <programlisting language="JSON">
+       {
+           "from":"aou",
+           "select": { "aou":[ "id", "name" ] },
+           "where": {
+               "id": {
+                   "&gt;": { "+aou":"parent_ou" }
+               }
+           }
+       }
+                       </programlisting>
+               </informalexample>
+
+               <para> The resulting SQL: </para>
+
+               <informalexample>
+                       <programlisting language="SQL">
+       SELECT
+           "aou".id AS "id",
+           "aou".name AS "name"
+       FROM
+           actor.org_unit AS "aou"
+       WHERE
+           (
+               "aou".id &gt; (  "aou".parent_ou  )
+           );
+                       </programlisting>
+               </informalexample>
+
+               <para> The other type of <literal>condition</literal> that uses a
+                       <literal>plus_class</literal> applies a specified class name to a
+                       <literal>where_condition</literal>: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.condition_1">
+                               <lhs> condition </lhs>
+                               <rhs> plus_class ':' where_condition </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> The resulting SQL is enclosed in parentheses, and qualifies the columns with
+                       the specified class name.  This syntax provides a mechanism to shift the class
+                       context – i.e. to refer to one class in a context that would otherwise refer to
+                       a different class. </para>
+               <para> Ordinarily the class name must be a valid non-virtual class defined in the IDL,
+                       and applicable to the associated <literal>where_condition</literal>.  There is at
+                       least one peculiar exception.  The JSON fragment: </para>
+
+               <informalexample>
+                       <programlisting language="JSON">
+       "+abc": { "+xyz":"frobozz" }
+                       </programlisting>
+               </informalexample>
+
+               <para> ...is rendered as: </para>
+
+               <informalexample>
+                       <programlisting language="SQL">
+       (  "xyz".frobozz  )
+                       </programlisting>
+               </informalexample>
+
+               <para> ...even though neither <literal>“abc”</literal>, nor <literal>“xyz”</literal>,
+                       nor <literal>“frobozz”</literal> is defined in the IDL.  The class name
+                       <literal>“abc”</literal> isn't used at all because the <literal>“+xyz”</literal>
+                       operator overrides it.  Such a query won't fail until json_query tries
+                       to execute it in the database. </para>
+               <para> The other operators that may occur at this level all begin with a minus sign,
+                       and they all represent familiar SQL operators.  For example, the
+                       <literal>“-or”</literal> operator joins the conditions within a
+                       <literal>where_condition</literal> by OR (instead of the default AND), and
+                       encloses them all in parentheses: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.condition_2">
+                               <lhs> condition </lhs>
+                               <rhs> '”-or”' ':' where_condition </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> In fact the <literal>“-or”</literal> operator is the only way to get OR into
+                       the WHERE clause. </para>
+               <para> The <literal>“-and”</literal> operator is similar, except that it uses AND: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.condition_3">
+                               <lhs> condition </lhs>
+                               <rhs> '”-and”' ':'  where_condition </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> Arguably the <literal>“-and”</literal> operator is redundant, because you can
+                       get the same effect by wrapping the subordinate <literal>where_condition</literal>
+                       in a JSON array.  Either technique merely adds a layer of parentheses, since AND
+                       connects successive conditions by default. </para>
+               <para> The <literal>“-not”</literal> operator expands the subordinate
+                       <literal>where_condition</literal> within parentheses, and prefaces the result
+                       with NOT: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.condition_4">
+                               <lhs> condition </lhs>
+                               <rhs> '”-not”' ':'  where_condition </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> The <literal>“-exists”</literal> or <literal>“-not-exists”</literal> operator
+                       constructs a subquery within an EXISTS  or NOT EXISTS clause, respectively: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.condition_5">
+                               <lhs> condition </lhs>
+                               <rhs> '”-exists”' ':' query </rhs>
+                       </production>
+
+                       <production xml:id="ebnf.condition_6">
+                               <lhs> condition </lhs>
+                               <rhs> '”-not-exists”' ':' query </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> The remaining kinds of <literal>condition</literal> all have a
+                       <literal>field_name</literal> on the left and some kind of <literal>predicate</literal>
+                       on the right.  A <literal>predicate</literal> places a constraint on the value of
+                       the column – or, in some cases, on the value of the column as transformed by some
+                       function call: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.condition_7">
+                               <lhs> condition </lhs>
+                               <rhs> field_name ':' predicate </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> The simplest such constraint is to require that the column have a specific value,
+                       or be null: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.predicate_0">
+                               <lhs> predicate </lhs>
+                               <rhs> lit_value | 'null' </rhs>
+                       </production>
+
+                       <production xml:id="ebnf.lit_value">
+                               <lhs> lit_value </lhs>
+                               <rhs> string | number </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> You can also compare a column to a literal value using some kind of inequality.
+                       However it's a bit more complicated because you have to specify what kind of comparison
+                       to make: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.compare_op_0">
+                               <lhs> predicate </lhs>
+                               <rhs> '{' compare_op ':' lit_value '}' </rhs>
+                       </production>
+
+                       <production xml:id="ebnf.compare_op_1">
+                               <lhs> compare_op </lhs>
+                               <rhs> string </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> A <literal>compare_op</literal> is a string that defines a comparison operator.
+                       Valid values include the following: </para>
+
+               <programlisting language="SQL">
+       =    &lt;&gt;   !=
+       &lt;    &gt;    &lt;=   &gt;=
+       ~    ~*   !~   !~*
+       like      ilike
+       similar to
+               </programlisting>
+
+               <para> Strictly speaking, json_query accepts any <literal>compare_op</literal>
+                       that doesn't contain semicolons or white space (or
+                       <literal>“similar to”</literal> as a special exception).  As a result, it
+                       is possible – and potentially useful – to use a custom operator like
+                       <literal>“&gt;100*”</literal> in order to insert an expression that would
+                       otherwise be difficult or impossible to create through a JSON query.  The ban
+                       on semicolons and white space prevents certain kinds of SQL injection. </para>
+               <para> Note that json_query does <emphasis>not</emphasis> accept two operators that
+                       PostgreSQL <emphasis>does</emphasis> accept: <literal>“is distinct from”</literal>
+                       and <literal>“is not distinct from”</literal>. </para>
+               <para> You can also compare a column to a null value: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.predicate_1">
+                               <lhs> predicate </lhs>
+                               <rhs> '{' compare_op ':' 'null' '}' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> The equality operator <literal>“=”</literal> turns into IS NULL.  Any other
+                       operator turns into IS NOT NULL. </para>
+               <para> When a <literal>compare_op</literal> is paired with an array, it defines a
+                       function call: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.predicate_2">
+                               <lhs> predicate </lhs>
+                               <rhs> '{' compare_op ':' '[' string { ',' parameter } ']' '}' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> The first entry in the array is the function's name.  Subsequent entries in
+                       the array, if any, represent the parameters of the function call.  They may be
+                       strings, numbers, or nulls.  In the generated SQL, the function call appears on
+                       the right of the comparison. </para>
+               <para> The <literal>“between”</literal> operator creates a BETWEEN clause: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.predicate_3">
+                               <lhs> predicate </lhs>
+                               <rhs> '{'  “between”  ':'  '['  lit_value  ','  lit_value  ']'  '}' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> Although PostgreSQL allows a null value in a BETWEEN clause, json_query
+                       requires literal non-null values.  It isn't sensible to use null values in a
+                       BETWEEN clause.  A few experiments show that the results of the comparison are
+                       peculiar and erratic. </para>
+               <para> There are two ways to create an IN list of allowed values.  The simplest is
+                       to put literal values into a JSON array: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.predicate_4">
+                               <lhs> predicate </lhs>
+                               <rhs> '[' lit_value { ',' lit_value }  ']' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> As with BETWEEN clauses, json_query does not accept null values in an IN list,
+                       even though PostgreSQL does allow them.  Nulls are not sensible in this context
+                       because they never match anything. </para>
+               <para>  </para>
+       </sect2>
+
+       <sect2><title>Having Clause</title>
+               <para>For the HAVING clause, json_query accepts exactly the same syntax as it accepts for
+                       the WHERE clause.</para>
+               <para> The other way to create an IN list is to use an explicit
+                       <literal>“in”</literal> operator with an array of literal values.  This format
+                       also works for the <literal>“not in”</literal> operator: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.predicate_5">
+                               <lhs> predicate </lhs>
+                               <rhs> '{' in_operator ';'  '[' lit_value [ ',' lit_value ]  ']'  '}' </rhs>
+                       </production>
+
+                       <production xml:id="ebnf.in_operator">
+                               <lhs> in_operator </lhs>
+                               <rhs> “in”  |  “not in” </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> Another kind of IN or NOT IN clause uses a subquery instead of a list of
+                       values: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.predicate_6">
+                               <lhs> predicate </lhs>
+                               <rhs> '{' in_operator ':'  query  '}' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> The remaining types of <literal>predicate</literal> can put a function call on
+                       the left of the comparison, by using a <literal>transform_spec</literal> together
+                       with a <literal>“value”</literal> tag.   The <literal>transform_spec</literal> is
+                       optional, and if you don't need it, the same SQL would in many cases be easier to
+                       express by other means. </para>
+               <para> The <literal>transform_spec</literal> construct was described earlier in
+                       connection with the SELECT clause, but here it is again: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.transform_spec_1">
+                               <lhs> transform_spec </lhs>
+                               <rhs>
+                                       '”transform”'  ':'  string  ]<sbr/>
+                                       [ ',' '”result_field”  ':'  string  ]<sbr/>
+                                       [ ',' '”params”  ':' param_list  ]
+                               </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> As in the SELECT clause, the <literal>“transform”</literal> string names the
+                       function.  The first parameter is always the column identified by the field_name.
+                       Additional parameters, if any, appear in the <literal>param_list</literal>.  The
+                       <literal>“result_field”</literal> string, if present, identifies one column of a
+                       multicolumn return value. </para>
+               <para> Here's a second way to compare a value to a literal value (but not to a null
+                       value): </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.predicate_7">
+                               <lhs> predicate </lhs>
+                               <rhs> '{' compare_op ':' '{' '”value”' ':' lit_value<sbr/>
+                                       [ transform_spec ] '}' '}' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> ...and a way to compare a value to a boolean expression: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.predicate_8">
+                               <lhs> predicate </lhs>
+                               <rhs> '{' compare_op ':' '{' '”value”' ':' '{'<sbr/>
+                                       condition { ',' condition } [ transform_spec ] '}' '}' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> The final predicate is another way to put a function call on the right side
+                       of the comparison: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.predicate_9">
+                               <lhs> predicate </lhs>
+                               <rhs> '{' compare_op ':' '{' '”value”' ':' '['<sbr/>
+                                       string { ',' parameter } ']' [ transform_spec ] '}' '}' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> This format is available for the sake of consistency, but offers no advantage
+                       over the simpler version. </para>
+       </sect2>
+
+       <sect2><title>ORDER BY Clause</title>
+               <para> There are two ways to encode an ORDER BY clause: as an array, or as a list.
+                       Either may be empty, in which case the generated SQL will not include an ORDER BY
+                       clause: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.order_by_list_0">
+                               <lhs> order_by_list </lhs>
+                               <rhs> '['  ']'  |  '{'  '}' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> If not empty, the array contains one or more objects, each defining a sort
+                       field: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.order_by_list_1">
+                               <lhs> order_by_list </lhs>
+                               <rhs> '{' sort_field_def  {  ','  sort_field_def }  '}' </rhs>
+                       </production>
+
+                       <production xml:id="ebnf.sort_field_def">
+                               <lhs> sort_field_def </lhs>
+                               <rhs> '{'<sbr/>
+                                       '”class”'  ':'  class_name<sbr/>
+                                       ','  '”field”'  ':'  field_name<sbr/>
+                                       [  ','  '”direction”'  ':'  lit_value  ]<sbr/>
+                                       [  ','  transform_spec  ]<sbr/>
+                                       '}' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> The <literal>“class”</literal> and <literal>“field”</literal> entries are
+                       required, and of course the field must belong to the class.  Furthermore, at
+                       least one field from the class must appear in the SELECT clause. </para>
+               <para> The <literal>“direction”</literal> entry, if present, specifies whether the
+                       sort will be ascending or descending for the column in question.  If the associated
+                       value begins with “D” or “d”, the sort will be descending; otherwise the sort will
+                       be ascending.  If the value is a number, it will be treated as a string that does not
+                       start with “D” or “d”, resulting in an ascending sort. </para>
+               <para> In the absence of a <literal>“direction”</literal> entry, the sort will be
+                       ascending. </para>
+               <para> The <literal>transform_spec</literal> works here the same way it works in the
+                       SELECT clause and the WHERE clause, enabling you to pass the column through a
+                       transforming function before the sort: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.transform_spec_2">
+                               <lhs> transform_spec </lhs>
+                               <rhs>
+                                       '”transform”'  ':'  string  ]<sbr/>
+                                       [ ',' '”result_field”  ':'  string  ]<sbr/>
+                                       [ ',' '”params”  ':' param_list  ]
+                               </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> When the <literal>order_by_list</literal> is an object instead of an array,
+                       the syntax is less verbose, but also less flexible.  The keys for the object are
+                       class names: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.order_by_list_2">
+                               <lhs> order_by_list </lhs>
+                               <rhs> '{' class_name ':' sort_class_def<sbr/>
+                                       { ',' class_name ':' sort_class_def } '}' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> Each class must be referenced in the SELECT clause. </para>
+               <para> As in the SELECT clause, all the fields for a given class must be grouped
+                       together.  You can't sort by a column from one table, then a column from a second
+                       table, then another column from the first table. If you need this kind of sort,
+                       you must encode the ORDER BY clause as an array instead of an object. </para>
+               <para> The data associated with a <literal>class_name</literal> may be either an array
+                       or an object.  If an array, it's simply a list of field names, and each field must
+                       belong to the class: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.sort_class_def_0">
+                               <lhs> sort_class_def </lhs>
+                               <rhs> '['  field_name  { ','  field_name }  ']' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> With this syntax, the direction of sorting will always be ascending. </para>
+               <para> If the data is an object, the keys are field names, and as usual the fields
+                       must belong to the class: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.sort_class_def_1">
+                               <lhs> sort_class_def </lhs>
+                               <rhs> '{'  field_name  ':' sort_class_subdef<sbr/>
+                                       { ','  field_name  ':' sort_class_subdef  }  '}' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> Since the <literal>field_name</literal> is the key for the object, it cannot
+                       appear more than once.  As a result, some kinds of sorts are not possible with this
+                       syntax.  For example, one might want to sort by UPPER( family_name ), and then by
+                       family_name with case unchanged, to make sure that “diBona” comes before “Dibona”.
+                       For situations like this, you must encode the ORDER BY clause as an array rather
+                       than an object. </para>
+               <para> The data associated with each <literal>field_name</literal> may take either of
+                       two forms.  In the simplest case, it's a literal value to specify the direction
+                       of sorting: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.sort_class_subdef_0">
+                               <lhs> sort_class_subdef </lhs>
+                               <rhs> lit_value </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> If the literal is a string starting with “D” or “d”, json_query sorts the field
+                       in descending order.  Otherwise it sorts the field in ascending order. </para>
+               <para> In other cases, the <literal>field_name</literal> may be paired with an object
+                       to specify more details: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.sort_class_subdef_1">
+                               <lhs> sort_class_subdef </lhs>
+                               <rhs> '{'<sbr/>
+                                       [  '”direction”'  ':'  lit_value ]<sbr/>
+                                       [  transform_spec  ]<sbr/>
+                                       '}' </rhs>
+                       </production>
+
+               </productionset>
+
+               <para> As before, the value tagged as <literal>“direction”</literal> specifies the
+                       direction of the sort, depending on the first character.  If not otherwise
+                       specified, the sort direction defaults to ascending. </para>
+               <para> Also as before, the <literal>transform_spec</literal> may specify a function
+                       through which to pass the column. </para>
+               <para> Since both the <literal>“direction”</literal> tag and the
+                       <literal>transform_spec</literal> are optional, the object may be empty: </para>
+
+               <productionset>
+
+                       <production xml:id="ebnf.sort_class_subdef_2">
+                               <lhs> sort_class_subdef </lhs>
+                               <rhs> '{'  '}' </rhs>
+                       </production>
+
+               </productionset>
+       </sect2>
+
+</sect1>
diff --git a/docs/TechRef/TechRef.xml b/docs/TechRef/TechRef.xml
new file mode 100644 (file)
index 0000000..65e86bb
--- /dev/null
@@ -0,0 +1,12 @@
+<?xml version="1.0" encoding="utf-8"?>
+
+<chapter version="5.0" xmlns="http://docbook.org/ns/docbook"
+  xmlns:xi="http://www.w3.org/2003/XInclude"
+  xmlns:xlink="http://www.w3.org/1999/xlink">
+  
+    <title>Evergreen Technical Reference</title>
+
+<xi:include  href="JSONGrammar.xml" 
+    xmlns:xi="http://www.w3.org/2001/XInclude" />
+  
+</chapter>