1 <?xml version="1.0" encoding="utf-8"?>
2 <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V4.5//EN"
3         "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
4
5 <article xmlns="http://docbook.org/ns/docbook">
6
7         <artheader>
8                 <title>Grammar of JSON Queries</title>
9                 <author>
10                         <firstname>Scott</firstname>
11                         <surname>McKellar</surname>
12                 </author>
13         </artheader>
14
15         <sect1><title>Introduction</title>
16                 <para>
17                         The format of this grammar approximates Extended Backus-Naur Form (EBNF).  However,
18                         it is intended as input to human beings, not to parser generators such as Lex or
19                         Yacc.  Do not expect formal rigor.  Sometimes narrative text will explain things
20                         that are clumsy to express in formal notation.  More often, the text will restate
21                         or summarize the formal productions.
22                 </para>
23                 <para>
24                         Conventions:
25                 </para>
26                 <orderedlist>
27                         <listitem>
28                                 The grammar is a series of productions.
29                         </listitem>
30                         <listitem>
31                                 A production consists of a name, followed by "::=", followed by a
32                                 definition for the name.  The name identifies a grammatical construct that can
33                                 appear on the right side of another production.
34                         </listitem>
35                         <listitem>
36                                 Literals (including punctuation) are enclosed in single quotes, or in double
37                                 quotes if case is not significant.
38                         </listitem>
39                         <listitem>
40                                 A single quotation mark within a literal is escaped with a preceding backslash.
41                         </listitem>
42                         <listitem>
43                                 If a construct can be defined more than one way, then the alternatives may appear
44                                 in separate productions; or, they may appear in the same production, separated by
45                                 pipe symbols.  The choice between these representations is of only cosmetic
46                                 significance.
47                         </listitem>
48                         <listitem>
49                                 A construct enclosed within square brackets is optional.
50                         </listitem>
51                         <listitem>
52                                 A construct enclosed within curly braces may be repeated zero or more times.
53                         </listitem>
54                         <listitem>
55                                 JSON allows arbitrary white space between tokens.  To avoid ugly clutter, this
56                                 grammar ignores the optional white space.
57                         </listitem>
58                         <listitem>
59                                 In many cases a production defines a JSON object, i.e. a list of name-value pairs,
60                                 separated by commas.  Since the order of these name/value pairs is not significant,
61                                 the grammar will not try to show all the possible sequences.  In general it will
62                                 present the required pairs first, if any, followed by any optional elements.
63                         </listitem>
64                 </orderedlist>
65
66                 <para>
67                         Since both EBNF and JSON use curly braces and square brackets, pay close attention to
68                         whether these characters are in single quotes.  If they're in single quotes, they are
69                         literal elements of the JSON notation.  Otherwise they are elements of the EBNF notation.
70                 </para>
71         </sect1>
72
73         <sect1><title>Primitives</title>
74                 <para>
75                         We'll start by defining some primitives, to get them out of the way.  They're
76                         mostly just what you would expect.
77                 </para>
78
79                 <productionset>
80                         <production>
81                                 <lhs>
82                                         string
83                                 </lhs>
84                                 <rhs>
85                                         '"' chars '"'
86                                 </rhs>
87                         </production>
88
89                         <production>
90                                 <lhs>
91                                         chars
92                                 </lhs>
93                                 <rhs>
94                                         any valid sequence of UTF-8 characters, with certain special characters
95                                         escaped according to JSON rules
96                                 </rhs>
97                         </production>
98
99                         <production>
100                                 <lhs>
101                                         integer_literal
102                                 </lhs>
103                                 <rhs>
104                                         [ sign ] digit { digit }
105                                 </rhs>
106                         </production>
107
108                         <production>
109                                 <lhs>
110                                         sign
111                                 </lhs>
112                                 <rhs>
113                                         '+' | '-'
114                                 </rhs>
115                         </production>
116
117                         <production>
118                                 <lhs>
119                                         digit
120                                 </lhs>
121                                 <rhs>
122                                         '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
123                                 </rhs>
124                         </production>
125
126                         <production>
127                                 <lhs>
128                                         integer_string
129                                 </lhs>
130                                 <rhs>
131                                         '"'  integer_literal  '"'
132                                 </rhs>
133                         </production>
134
135                         <production>
136                                 <lhs>
137                                         integer
138                                 </lhs>
139                                 <rhs>
140                                         integer_literal  |  integer_string
141                                 </rhs>
142                         </production>
143
144                         <production>
145                                 <lhs>
146                                         number
147                                 </lhs>
148                                 <rhs>
149                                         any valid character sequence that is numeric according to JSON rules
150                                 </rhs>
151                         </production>
152
153                 </productionset>
154
155                 <para>
156                         When json_query requires an integral value, it will usually accept a quoted string and
157                         convert it to an integer by brute force, falling back to zero if necessary.  Likewise
158                         it may truncate a floating point number to an integral value.  Scientific notation
159                         will be accepted but may not give the intended results.
160                 </para>
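                <para>
                        As an illustration, the <literal>integer</literal> production can be checked
                        mechanically.  The following Python sketch tests a decoded JSON value against the
                        grammar only; it does not reproduce the lenient conversions that json_query applies,
                        and the function names are hypothetical.
                </para>
<programlisting language="python"><![CDATA[
```python
import re

# integer_literal ::= [ sign ] digit { digit }
# [0-9] is used instead of \d so that only the ten ASCII digits match.
_INTEGER_LITERAL = re.compile(r"[+-]?[0-9]+")


def is_integer_literal(text):
    """True if text matches the integer_literal production."""
    return _INTEGER_LITERAL.fullmatch(text) is not None


def matches_integer(value):
    """True if a decoded JSON value matches integer_literal | integer_string.

    Assumption: the value has already been decoded by a JSON parser, so an
    integer_literal arrives as a Python int and an integer_string as a str.
    """
    if isinstance(value, bool):
        # bool is a subclass of int in Python, but it is not an integer here.
        return False
    if isinstance(value, int):
        return True
    if isinstance(value, str):
        return is_integer_literal(value)
    return False
```
]]></programlisting>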
161
162                 <productionset>
163
164                         <production>
165                                 <lhs>
166                                         boolean
167                                 </lhs>
168                                 <rhs>
169                                         'true'  |  'false'  |  string  |  number
170                                 </rhs>
171                         </production>
172
173                 </productionset>
174
175                 <para>
176                         The preferred way to encode a boolean is with the JSON reserved word true or false,
177                         in lower case without quotation marks.  The string “<literal>true</literal>”, in
178                         upper, lower, or mixed case, is another way to encode true.  Any other string
179                         evaluates to false.
180                 </para>
181                 <para>
182                         As an accommodation to Perl, numbers may be used as booleans.  A numeric value of 1
183                         means true, and any other numeric value means false.
184                 </para>
185                 <para>
186                         Any other valid JSON value, such as an array, will be accepted as a boolean but interpreted
187                         as false.
188                 </para>
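                <para>
                        The rules above can be summarized in a few lines of code.  This Python sketch is
                        only a restatement of the preceding paragraphs, not the json_query implementation;
                        the function name is hypothetical, and treating the floating point value 1.0 as
                        true is an assumption.
                </para>
<programlisting language="python"><![CDATA[
```python
def interpret_boolean(value):
    """Interpret a decoded JSON value as a boolean, per the rules above."""
    if isinstance(value, bool):
        # The JSON reserved words true and false: the preferred encoding.
        return value
    if isinstance(value, str):
        # The string "true" in any mix of case is true; other strings are false.
        return value.lower() == "true"
    if isinstance(value, (int, float)):
        # A numeric value of 1 is true; any other number is false.
        # Assumption: 1.0 counts as 1.
        return value == 1
    # Arrays, objects, and null are accepted but interpreted as false.
    return False
```
]]></programlisting>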
189                 <para>
190                         The last couple of primitives aren't really very primitive, but we introduce them here
191                         for convenience:
192                 </para>
193
194                 <productionset>
195
196                         <production>
197                                 <lhs>
198                                         class_name
199                                 </lhs>
200                                 <rhs>
201                                         string
202                                 </rhs>
203                         </production>
204
205                 </productionset>
206
207                 <para>
208                         A class_name is a special case of a string: the name of a class as defined
209                         by the IDL.  The class may refer either to a database table or to a
210                         source_definition, which is a subquery.
211                 </para>
212
213                 <productionset>
214
215                         <production>
216                                 <lhs>
217                                         field_name
218                                 </lhs>
219                                 <rhs>
220                                         string
221                                 </rhs>
222                         </production>
223
224                 </productionset>
225
226                 <para>
227                         A field_name is another special case of a string: the name of a non-virtual
228                         field as defined by the IDL.  A field_name is also a column name for the
229                         table corresponding to the relevant class.
230                 </para>
231
232         </sect1>
233
234         <sect1><title>Query</title>
235
236                 <para>
237                         The following production applies not only to the main query but also to
238                         most subqueries.
239                 </para>
240
241                 <productionset>
242
243                         <production>
244                                 <lhs>
245                                         query
246                                 </lhs>
247                                 <rhs>
248                                         '{'<sbr/>
249                                         '"from"'  ':'  from_list<sbr/>
250                                         [ ','  '"select"'    ':'  select_list ]<sbr/>
251                                         [ ','  '"where"'     ':'  where_condition ]<sbr/>
252                                         [ ','  '"having"'    ':'  where_condition ]<sbr/>
253                                         [ ','  '"order_by"'  ':'  order_by_list ]<sbr/>
254                                         [ ','  '"limit"'     ':'  integer ]<sbr/>
255                                         [ ','  '"offset"'    ':'  integer ]<sbr/>
256                                         [ ','  '"distinct"'  ':'  boolean ]<sbr/>
257                                         [ ','  '"no_i18n"'   ':'  boolean ]<sbr/>
258                                         '}'
259                                 </rhs>
260                         </production>
261
262                 </productionset>
263
264                 <para>
265                         Except for the <literal>"distinct"</literal> and <literal>"no_i18n"</literal>
266                         entries, each name/value pair represents a major clause of the SELECT statement.
267                         The name/value pairs may appear in any order.
268                 </para>
269                 <para>
270                         There is no name/value pair for the GROUP BY clause, because json_query
271                         generates it automatically according to information encoded elsewhere.
272                 </para>
273                 <para>
274                         The <literal>"distinct"</literal> entry, if present and true, tells json_query
275                         that it may have to create a GROUP BY clause.  If not present, it defaults to false.
276                 </para>
277                 <para>
278                         The <literal>"no_i18n"</literal> entry, if present and true, tells json_query to
279                         suppress internationalization.  If not present, it defaults to false.  (Note that
280                         <literal>"no_i18n"</literal> contains the digit one, not the letter ell.)
281                 </para>
282                 <para>
283                         The values for <literal>"limit"</literal> and <literal>"offset"</literal>
284                         provide the arguments of the LIMIT and OFFSET clauses, respectively, of the
285                         SQL statement.  If present, each value should be non-negative; otherwise the
286                         generated SQL will fail.
287                 </para>
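                <para>
                        To make the production concrete, the following Python sketch builds a small query
                        object, checks its top-level shape, and encodes it as JSON.  Several details are
                        assumptions made for illustration: the class name <literal>aou</literal>, the use
                        of a bare class name as the from_list (which is defined elsewhere in the grammar),
                        the function name, and the decision to report entry names that the production
                        does not list.  The sketch does not examine the values of the other clauses.
                </para>
<programlisting language="python"><![CDATA[
```python
import json

# The nine entry names that appear in the query production.
KNOWN_KEYS = {
    "from", "select", "where", "having", "order_by",
    "limit", "offset", "distinct", "no_i18n",
}


def check_query_shape(query):
    """Return a list of problems found in a decoded query object."""
    problems = []
    # "from" is the only entry the production does not mark as optional.
    if "from" not in query:
        problems.append('missing required "from" entry')
    # Assumption: names outside the production are reported, not ignored.
    for key in query:
        if key not in KNOWN_KEYS:
            problems.append('unexpected entry "%s"' % key)
    # "limit" and "offset" should be non-negative integers; a quoted
    # integer such as "20" is allowed by the integer production.
    for key in ("limit", "offset"):
        if key in query:
            try:
                count = int(query[key])
            except (TypeError, ValueError):
                problems.append('"%s" is not an integer' % key)
                continue
            if count < 0:
                problems.append('"%s" is negative' % key)
    return problems


# Hypothetical query: the class name "aou" is used only as an example.
query = {"from": "aou", "limit": 10, "offset": "20", "distinct": True}
encoded = json.dumps(query)
```
]]></programlisting>
                <para>
                        Because the name/value pairs may appear in any order, the check works on the
                        decoded object rather than on the text of the JSON string.
                </para>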
288
289         </sect1>
290
291 </article>