1 <?xml version="1.0" encoding="utf-8"?>
2 <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V4.5//EN"
3 "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
5 <article xmlns="http://docbook.org/ns/docbook">
8 <title>Grammar of JSON Queries</title>
10 <firstname>Scott</firstname>
11 <surname>McKellar</surname>
15 <sect1><title>Introduction</title>
17 The format of this grammar approximates Extended Backus-Naur notation. However it
18 is intended as input to human beings, not to parser generators such as Lex or
19 Yacc. Do not expect formal rigor. Sometimes narrative text will explain things
20 that are clumsy to express in formal notation. More often, the text will restate
21 or summarize the formal productions.
28 The grammar is a series of productions.
31 A production consists of a name, followed by "::=", followed by a
32 definition for the name. The name identifies a grammatical construct that can
33 appear on the right side of another production.
36 Literals (including punctuation) are enclosed in single quotes, or in double
37 quotes if case is not significant.
40 A single quotation mark within a literal is escaped with a preceding backslash.
43 If a construct can be defined more than one way, then the alternatives may appear
44 in separate productions; or, they may appear in the same production, separated by
45 pipe symbols. The choice between these representations is of only cosmetic
49 A construct enclosed within square brackets is optional.
52 A construct enclosed within curly braces may be repeated zero or more times.
55 JSON allows arbitrary white space between tokens. To avoid ugly clutter, this
56 grammar ignores the optional white space.
59 In many cases a production defines a JSON object, i.e. a list of name-value pairs,
60 separated by commas. Since the order of these name/value pairs is not significant,
61 the grammar will not try to show all the possible sequences. In general it will
62 present the required pairs first, if any, followed by any optional elements.
67 Since both EBNF and JSON use curly braces and square brackets, pay close attention to
68 whether these characters are in single quotes. If they're in single quotes, they are
69 literal elements of the JSON notation. Otherwise they are elements of the EBNF notation.
73 <sect1><title>Primitives</title>
75 We'll start by defining some primitives, to get them out of the way. They're
76 mostly just what you would expect.
94 any valid sequence of UTF-8 characters, with certain special characters
95 escaped according to JSON rules
104 [ sign ] digit { digit }
121 digit = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
131 '”' integer_literal '”'
140 integer_literal | integer_string
149 any valid character sequence that is numeric according to JSON rules
156 When json_query requires an integral value, it will usually accept a quoted string and
157 convert it to an integer by brute force – to zero if necessary. Likewise it may
158 truncate a floating point number to an integral value. Scientific notation will be
159 accepted but may not give the intended results.
169 'true' | 'false' | string | number
176 The preferred way to encode a boolean is with the JSON reserved word true or false,
177 in lower case without quotation marks. The string “<literal>true</literal>”, in
178 upper, lower, or mixed case, is another way to encode true. Any other string
182 As an accommodation to perl, numbers may be used as booleans. A numeric value of 1
183 means true, and any other numeric value means false.
186 Any other valid JSON value, such as an array, will be accepted as a boolean but interpreted
190 The last couple of primitives aren't really very primitive, but we introduce them here
208 A class_name is a special case of a string: the name of a class as defined
209 by the IDL. The class may refer either to a database table or to a
210 source_definition, which is a subquery.
227 A field_name is another special case of a string: the name of a non-virtual
228 field as defined by the IDL. A field_name is also a column name for the
229 table corresponding to the relevant class.
234 <sect1><title>Query</title>
237 The following production applies not only to the main query but also to
249 '”from”' ':' from_list<sbr/>
250 [ ',' '”select”' ':' select_list ]<sbr/>
251 [ ',' '”where”' ':' where_condition ]<sbr/>
252 [ ',' '”having”' ':' where_condition ]<sbr/>
253 [ ',' '”order_by”' ':' order_by_list ]<sbr/>
254 [ ',' '”limit”' ':' integer ]<sbr/>
255 [ ',' '”offset”' ':' integer ]<sbr/>
256 [ ',' '”distinct”' ':' boolean ]<sbr/>
257 [ ',' '”no_i18n”' ':' boolean ]<sbr/>
265 Except for the <literal>“distinct”</literal> and <literal>“no_i18n”</literal>
266 entries, each name/value pair represents a major clause of the SELECT statement.
267 The name/value pairs may appear in any order.
270 There is no name/value pair for the GROUP BY clause, because json_query
271 generates it automatically according to information encoded elsewhere.
274 The <literal>“distinct”</literal> entry, if present and true, tells json_query
275 that it may have to create a GROUP BY clause. If not present, it defaults to false.
278 The <literal>“no_i18n”</literal> entry, if present and true, tells json_query to
279 suppress internationalization. If not present, it defaults to false. (Note that
280 <literal>“no_i18n”</literal> contains the digit one, not the letter ell.)
283 The values for <literal>“limit”</literal> and <literal>“offset”</literal>
284 provide the arguments of the LIMIT and OFFSET clauses, respectively, of the
285 SQL statement. Each value should be non-negative, if present, or else the
291 <sect1><title>FROM Clause</title>
293 The object identified by <literal>“from”</literal> encodes the FROM clause of
294 the SQL. The associated value may be a string, an array, or a JSON object.
311 If <literal>from_list</literal> is a <literal>class_name</literal>, the
312 json_query inserts the corresponding table name or subquery into the FROM
313 clause, using the <literal>class_name</literal> as an alias for the table
314 or subquery. The class must be defined as non-virtual in the IDL.
324 '[' string { ',' parameter } ']'
333 string | number | 'null'
340 If from_list is a JSON array, then it represents a table-like function from
341 which the SQL statement will select rows, using a SELECT clause consisting
342 of “SELECT *” (regardless of the select_list supplied by the method parameter).
345 The first entry in the array is the name of the function. It must be a string
346 naming a stored function. Each subsequent entry is a function parameter. If
347 it is a string or a number, json_query will insert it into a comma-separated
348 parameter list, enclosed in quotes, with any special characters escaped as needed.
349 If it is the JSON reserved word <literal>null</literal>, json_query will insert
350 it into the parameter list as a null value.
353 If <literal>from_list</literal> is a JSON object, it must contain exactly one entry.
354 The key of this entry must be the name of a non-virtual class defined in the IDL.
355 This class will be the top-level class of the FROM clause, the only one named
356 outside of a JOIN clause.
366 '{' class_name ':' join_list '}'
384 '{' join_def { ',' join_def } '}'
391 If the associated data is a <literal>class_name</literal>, json_query will
392 construct an INNER JOIN clause joining the class to the top-level clause,
393 using the columns specified by the IDL for such a join.
396 Otherwise, the associated data must be a JSON object with one or more entries,
397 each entry defining a join:
409 [ '”type”' ':' string ]<sbr/>
410 [ '”field”' ':' field_name ]<sbr/>
411 [ '”fkey”' ':' field_name ]<sbr/>
412 [ '”filter”' ':' where_condition ]<sbr/>
413 [ '”filter_op”' ':' string ]<sbr/>
414 [ '”join”' ':' join_list ]<sbr/>
423 The data portion of the <literal>“join_type”</literal> entry tells json_query
424 whether to use a left join, right join, full join, or inner join. The values
425 <literal>“left”</literal>, <literal>“right”</literal>, and <literal>“full”</literal>,
426 in upper, lower, or mixed case, have the obvious meanings. If the
427 <literal>“join_type”</literal> entry has any other value, or is not present,
428 json_query constructs an inner join.
431 The <literal>“field”</literal> and <literal>“fkey”</literal> attributes specify the
432 columns to be equated in the join condition. The <literal>“field”</literal>
433 attribute refers to the column in the joined table, i.e. the one named by the
434 <literal>join_def</literal>. The <literal>“fkey”</literal> attribute refers to the
435 corresponding column in the other table, i.e. the one named outside the
436 <literal>join_def</literal> – either the top-level table or a table named by some
437 other <literal>join_def</literal>.
440 It may be tempting to suppose that <literal>“fkey”</literal> stands for “foreign key”,
441 and therefore refers to a column in the child table that points to the key of a
442 parent table. Resist the temptation; the labels are arbitrary. The json_query
443 method doesn't care which table is the parent and which is the child.
446 These relationships are best explained with an example. The following <literal>from_list</literal>:
449 <informalexample><programlisting language="JSON">
459 </programlisting></informalexample>
462 ...turns into the following FROM clause:
465 <informalexample><programlisting language="SQL">
467 actor.org_unit AS "aou"
468 LEFT JOIN action.survey AS "asv"
469 ON ( "asv".owner = "aou".id )
470 </programlisting></informalexample>
473 Note in this example that <literal>“fkey”</literal> refers to a column of the
474 class <literal>“aou”</literal>, and <literal>“field”</literal> refers to a
475 column of the class <literal>“asv”</literal>.
478 If you specify only one of the two columns, json_query will try to identify the
479 other one from the IDL. However, if you specify only the column from the parent
480 table, this attempt will probably fail.
483 If you specify both columns, json_query will use the column names you specify,
484 without verifying them with a lookup in the IDL. By this means you can perform
485 a join using a linkage that the IDL doesn't define. Of course, if the columns
486 don't exist in the database, the query will fail when json_query tries to execute it.
489 Using the columns specified, either explicitly or implicitly, the json_query
490 method constructs a join condition. With raw SQL it is possible (though
491 rarely useful) to join two tables by an inequality. However the json_query
492 method always uses a simple equality condition.
495 Using a <literal>“filter”</literal> entry in the join_def, you can apply one
496 or more additional conditions to the JOIN clause, typically to restrict the
497 join to certain rows of the joined table. The data associated with the
498 <literal>“filter”</literal> key is the same sort of
499 <literal>where_condition</literal> that you use for a WHERE clause
503 If the string associated with the <literal>“filter_op”</literal> entry is
504 <literal>“OR”</literal> in upper, lower, or mixed case, then the json_query
505 method uses OR to connect the standard join condition to any additional
506 conditions supplied by a <literal>“filter”</literal> entry.
509 (Note that if the <literal>where_condition</literal> supplies multiple
510 conditions, they will be connected by AND. You will probably want to move
511 them down a layer – enclose them in parentheses, in effect – to avoid a
512 confusing mixture of ANDs and ORs.)
515 If the <literal>“filter_op”</literal> entry carries any other value, or if
516 it is absent, then the json_query method uses AND. In the absence of a
517 <literal>“filter”</literal> entry, <literal>“filter_op”</literal> has no effect.
520 A <literal>“join”</literal> entry in a <literal>join_def</literal> specifies
521 another layer of join. The class named in the subjoin is joined to the class
522 named by the <literal>join_def</literal> to which it is subordinate. By this
523 means you can encode multiple joins in a hierarchy.
527 <sect1><title>SELECT Clause</title>
529 If a query does not contain an entry for <literal>“select”</literal>, json_query
530 will construct a default SELECT clause. The default includes every non-virtual
531 field from the top-level class of the FROM clause, as defined by the IDL. The
532 result is similar to SELECT *, except:
537 The default includes only the fields defined in the IDL.
540 The columns will appear in the same order in which they appear in the IDL,
541 regardless of the order in which the database defines them.
546 There are other ways to specify a default SELECT list, as shown below.
549 If a <literal>“select”</literal> entry is present, the associated value must
550 be a JSON object, keyed on class names:
560 '{' class_name ':' field_list { ',' class_name ':' field_list } '}'
567 The <literal>class_name</literal> must identify either the top-level class or
568 a class belonging to one of the joins. Otherwise json_query will silently
569 ignore the <literal>select_list</literal>.
586 If a field_list is either the JSON reserved word <literal>null</literal>
587 (in lower case) or an asterisk in double quotes, json_query constructs a
588 default SELECT list – provided that the class is the top-level class of the
589 query. If the class belongs to a join somewhere, json_query ignores the
590 <literal>field_list</literal>.
593 More commonly, the <literal>field_list</literal> is a JSON array of zero or
594 more field specifications:
604 '[' [ field_spec { ',' field_spec } ] ']'
611 If the array is empty, json_query will construct a default SELECT list for
612 the class – again, provided that the class is the top-level class in the query.
615 In the simplest case, a field specification may name a non-virtual field
633 In some cases json_query constructs a call to the
634 <literal>oils_i18n_xlate</literal> function to internationalize the value of the
635 selected column. Specifically, it does so if all the following are true:
640 the settings file defines a locale;
643 in the field definition for the field in the IDL, the tag
644 <literal>“il8n”</literal> is present and true;
647 the query does <emphasis>not</emphasis> include the
648 <literal>“no_il8n”</literal> tag (or includes it with a value of false).
653 A field specification may be a JSON object:
664 '”column”' ':' <sbr/>
665 [ ',' '”alias”' ':' string ]<sbr/>
666 [ ',' '”aggregate”' ':' boolean ]<sbr/>
667 [ ',' transform_spec ]<sbr/>
676 The <literal>“column”</literal> entry provides the column name, which must
677 be defined as non-virtual in the IDL.
680 The <literal>“alias”</literal> entry provides a column alias. If no alias
681 is specified, json_query uses the column name as its own alias.
684 The <literal>“aggregate”</literal> entry has no effect on the SELECT clause
685 itself. Rather, it affects the construction of a GROUP BY class. If there
686 is an <literal>“aggregate”</literal> entry for any field, then json_query builds
687 a GROUP BY clause listing every column that is <emphasis>not</emphasis> tagged
688 for aggregation (or that carries an <literal>“aggregate”</literal> entry with
689 a value of false). If <emphasis>all</emphasis> columns are tagged for
690 aggregation, then json_query omits the GROUP BY clause.
700 '”transform”' ':' string ]<sbr/>
701 [ ',' '”result_field” ':' string ]<sbr/>
702 [ ',' '”params” ':' param_list ]
709 When a <literal>transform_spec</literal> is present, json_query selects the
710 return value of a function instead of selecting the column directly. The entry
711 for <literal>“transform”</literal> provides the name of the function, and the
712 column name (as specified by the <literal>“column”</literal> tag), qualified by
713 the class name, is the argument to the function. For example, you might use such
714 a function to format a date or time, or otherwise transform a column value.
715 You might also use an aggregate function such as SUM, COUNT, or MAX (possibly
716 together with the <literal>“aggregate”</literal> tag).
719 The <literal>“result_field”</literal> entry, when present, specifies a subcolumn
720 of the function's return value. The resulting SQL encloses the function call
721 in parentheses, and follows it with a period and the subcolumn name.
724 The <literal>“params”</literal> entry, if present, provides a possibly empty
725 array of additional parameter values, either strings, numbers, or nulls:
735 '[' [ parameter { ',' parameter } ] ']'
742 Such parameter values are enclosed in single quotes, with any special characters
743 escaped as needed, and inserted after the column name as additional parameters
744 to the function. You might, for example, use an additional parameter to provide
745 a format string for a reformatting function.
749 <sect1><title>WHERE Clause</title>
752 <sect1><title>ORDER BY Clause</title>