1 <?xml version="1.0" encoding="utf-8"?>
3 <sect1 version="5.0" xmlns="http://docbook.org/ns/docbook"
4 xmlns:xi="http://www.w3.org/2003/XInclude" xmlns:xlink="http://www.w3.org/1999/xlink">
7 <title>JSON Queries: A Tutorial</title>
11 <firstname>Scott</firstname>
12 <surname>McKellar</surname>
15 <orgname>Equinox Software, Inc.</orgname>
21 <holder>Equinox Software, Inc.</holder>
25 Licensing: Creative Commons Attribution-Share Alike 3.0 United States License.
30 <title>Introduction</title>
31 <para> The json_query facility provides a way for client applications to query the
32 database over the network. Instead of constructing its own SQL, the application
33 encodes a query in the form of a JSON string and passes it to the json_query service.
34 Then the json_query service parses the JSON, constructs and executes the
35 corresponding SQL, and returns the results to the client application. </para>
36 <para> This arrangement enables the json_query service to act as a gatekeeper, protecting
37 the database from potentially damaging SQL commands. In particular, the generated SQL
38 is confined to SELECT statements, which will not change the contents of the database. </para>
39 <para> In addition, the json_query service sometimes uses its knowledge of the database
40 structure to supply column names and join conditions so that the client application
41 doesn't have to. </para>
42 <para> Nevertheless, the need to encode a query in a JSON string adds complications,
43 because the client needs to know how to build the right JSON. JSON queries are also
44 somewhat limiting -- they can't do all of the things that you can do with raw SQL. </para>
45 <para> This tutorial explains what you can do with a JSON query, and how you can do it. </para>
48 <title>The IDL</title>
49 <para> A JSON query does not refer to tables and columns. Instead, it refers to classes
50 and fields, which the IDL maps to the corresponding database entities. </para>
51 <para> The IDL (Interface Definition Language) is an XML file, typically
52 <filename>/openils/conf/fm_IDL.xml</filename>. It maps each class to a table, view,
53 or subquery, and each field to a column. It also includes information about foreign
54 key relationships. </para>
55 <para> (The IDL also defines virtual classes and virtual fields, which don't correspond
56 to database entities. We won't discuss them here, because json_query ignores them.) </para>
57 <para> When it first starts up, json_query loads a relevant subset of the IDL into memory.
58 Thereafter, it consults its copy of the IDL whenever it needs to know about the
59 database structure. It uses the IDL to validate the JSON queries, and to translate
60 classes and fields to the corresponding tables and columns. In some cases it uses the
61 IDL to supply information that the queries don't provide. </para>
65 <title>Definitions</title>
66 <para> References to "SQL" refer to the dialect implemented by PostgreSQL. This tutorial
67 assumes that you are already familiar with SQL. </para>
68 <para> You should also be familiar with JSON. However it is worth defining a couple of terms
69 that have other meanings in other contexts: </para>
74 <para> An "object" is a JSON object, i.e. a comma-separated list of name:value pairs,
75 enclosed in curly braces, like this:
78 { "a":"frobozz", "b":24, "c":null }
85 <para> An "array" is a JSON array, i.e. a comma-separated list of values, enclosed
86 in square brackets, like this:
89 [ "Goober", 629, null, false, "glub" ]
100 <title>The Examples</title>
101 <para> The test_json_query utility generated the SQL for all of the sample queries in this
102 tutorial. Newlines and indentation were then inserted manually for readability. </para>
103 <para> All examples involve the actor.org_unit table, sometimes in combination with a
104 few related tables. The queries themselves are designed to illustrate the syntax, not
105 to do anything useful at the application level. For example, it's not meaningful to
106 take the square root of an org_unit id, except to illustrate how to code a function call.
107 The examples are like department store mannequins -- they have no brains, they're only
114 <title>Hello, World!</title>
116 <para> The simplest kind of query defines nothing but a FROM clause. For example: </para>
118 <programlisting language="JSON">
126 <para> In this minimal example we select from only one table. Later we will see how to join
127 multiple tables. </para>
128 <para> Since we don't supply a WHERE clause, json_query constructs a default WHERE clause for
129 us, including all the available columns. The resulting SQL looks like this: </para>
131 <programlisting language="SQL">
133 "aou".billing_address AS "billing_address",
134 "aou".holds_address AS "holds_address",
136 "aou".ill_address AS "ill_address",
137 "aou".mailing_address AS "mailing_address",
138 "aou".name AS "name",
139 "aou".ou_type AS "ou_type",
140 "aou".parent_ou AS "parent_ou",
141 "aou".shortname AS "shortname",
142 "aou".email AS "email",
143 "aou".phone AS "phone",
144 "aou".opac_visible AS "opac_visible"
146 actor.org_unit AS "aou" ;
151 <title>Default SELECT Clauses</title>
152 <para> The default SELECT clause includes every column that the IDL defines it as a
153 non-virtual field for the class in question. If a column is present in the database
154 but not defined in the IDL, json_query doesn't know about it. In the case of the
155 example shown above, all the columns are defined in the IDL, so they all show up in
156 the default SELECT clause. </para>
157 <para> If the FROM clause joins two or more tables, the default SELECT clause includes
158 columns only from the core table, not from any of the joined tables. </para>
159 <para> The default SELECT clause has almost the same effect as "<literal>SELECT *</literal>",
160 but not exactly. If you were to "<literal>SELECT * from actor.org_unit_type</literal>
161 in psql, the output would include all the same columns as in the example above, but not in
162 the same order. A default SELECT clause includes the columns in the order in which the IDL
163 defines them, which may be different from the order in which the database defines them. </para>
164 <para> In practice, the sequencing of columns in the SELECT clause is not significant.
165 The result set is returned to the client program in the form of a data structure, which
166 the client program can navigate however it chooses. </para>
170 <title>Other Lessons</title>
171 <para> There are other ways to get a default SELECT clause. However, default SELECT clauses
172 are a distraction at this point, because most of the time you'll specify your own SELECT
173 clause explicitly, as we will discuss later. </para>
174 <para> Let's consider some more important aspects of this simple example -- more important
175 because they apply to more complex queries as well. </para>
180 <para> The entire JSON query is an object. In this simple case the object includes
181 only one entry, for the FROM clause. Typically you'll also have entries for
182 the SELECT clause and the WHERE clause, and possibly for HAVING, ORDER BY,
183 LIMIT, or OFFSET clauses. There is no separate entry for a GROUP BY clause,
184 which you can specify by other means. </para>
188 <para> Although all the other entries are optional, you must include an entry for
189 the FROM clause. You cannot, for example, do a SELECT USER the way you can in
194 <para> Every column is qualified by an alias for the table. This alias is always the
195 class name for the table, as defined in the IDL. </para>
199 <para> Every column is aliased with the column name. There is a way to choose a
200 different column alias (not shown here). </para>
209 <title>The SELECT Clause</title>
211 <para> The following variation also produces a default SELECT clause: </para>
214 <programlisting language="JSON">
224 <para> ...and so does this one: </para>
227 <programlisting language="JSON">
237 <para> While this syntax may not be terribly useful, it does illustrate the minimal structure
238 of a SELECT clause in a JSON query: an entry in the outermost JSON object, with a key of
239 <literal>"select"</literal>. The value associated with this key is another JSON object,
240 whose keys are class names. </para>
241 <para> (These two examples also illustrate another point: unlike SQL, a JSON query doesn't care
242 whether the FROM clause or the SELECT clause comes first.) </para>
243 <para> Usually you don't want the default SELECT clause. Here's how to select only some of the
247 <programlisting language="JSON">
251 "aou":[ "id", "name" ]
257 <para> The value associated with the class name is an array of column names. If you select
258 columns from multiple tables (not shown here), you'll need a separate entry for each
259 table, and a separate column list for each entry. </para>
260 <para> The previous example results in the following SQL: </para>
263 <programlisting language="SQL">
268 actor.org_unit AS "aou" ;
273 <title>Fancier SELECT Clauses</title>
274 <para> The previous example featured an array of column names. More generally, it
275 featured an array of field specifications, and one kind of field specification
276 is a column name. The other kind is a JSON object, with some combination of the
277 following keys: </para>
282 <para> <literal>"column"</literal> -- the column name (required). </para>
286 <para> <literal>"alias"</literal> -- used to define a column alias, which
287 otherwise defaults to the column name. </para>
291 <para> <literal>"aggregate"</literal> -- takes a value of
292 <literal>true</literal> or <literal>false.</literal> Don't worry about
293 this one yet. It concerns the use of GROUP BY clauses, which we will
294 examine later. </para>
298 <para> <literal>"transform"</literal> -- the name of an SQL function to be
303 <para> <literal>"result_field"</literal> -- used with
304 <literal>"transform"</literal>; specifies an output column of a function that
305 returns multiple columns at a time. </para>
309 <para> <literal>"params"</literal> -- used with <literal>"transform"</literal>;
310 provides a list of parameters for the function. They may be strings, numbers,
316 <para> This example assigns a different column alias: </para>
319 <programlisting language="JSON">
325 { "column":"name", "alias":"org_name" }
333 <programlisting language="SQL">
336 "aou".name AS "org_name"
338 actor.org_unit AS "aou" ;
342 <para> In this case, changing the column alias doesn't accomplish much. But if we
343 were joining to the actor.org_unit_type table, which also has a "name" column,
344 we could use different aliases to distinguish them. </para>
345 <para> The following example uses a function to raise a column to upper case: </para>
348 <programlisting language="JSON">
354 { "column":"name", "transform":"upper" }
362 <programlisting language="SQL">
365 upper("aou".name ) AS "name"
367 actor.org_unit AS "aou" ;
371 <para> Here we take a substring of the name, using the <literal>"params"</literal>
372 element to pass parameters: </para>
375 <programlisting language="JSON">
382 "transform":"substr",
392 <programlisting language="SQL">
395 substr("aou".name,'3','5' ) AS "name"
397 actor.org_unit AS "aou" ;
401 <para> The parameters specified with <literal>"params"</literal> are inserted
402 <emphasis>after</emphasis> the applicable column (<literal>"name"</literal> in this
403 case), which is always the first parameter. They are always passed as strings,
404 i.e. enclosed in quotes, even if the JSON expresses them as numbers. PostgreSQL
405 will ordinarily coerce them to the right type. However if the function name is
406 overloaded to accept different types, PostgreSQL may invoke a function other than
407 the one intended. </para>
408 <para> Finally we call a fictitious function <literal>"frobozz"</literal> that returns
409 multiple columns, where we want only one of them: </para>
412 <programlisting language="JSON">
419 "transform":"frobozz",
420 "result_field":"zamzam"
429 <programlisting language="SQL">
432 (frobozz("aou".name ))."zamzam" AS "name"
434 actor.org_unit AS "aou" ;
438 <para> The "frobozz" function doesn't actually exist, but json_query doesn't know
439 that. The query won't fail until json_query tries to execute it in the database. </para>
444 <title>Things You Can't Do</title>
445 <para> You can do some things in a SELECT clause with raw SQL (with psql, for example)
446 that you can't do with a JSON query. Some of them matter and some of them don't. </para>
447 <para> When you do a JOIN, you can't arrange the selected columns in any arbitrary
448 sequence, because all of the columns from a given table must be grouped together.
449 This limitation doesn't matter. The results are returned in the form of a data
450 structure, which the client program can navigate however it likes. </para>
451 <para> You can't select an arbitrary expression, such as
452 <literal>"percentage / 100"</literal> or <literal>"last_name || ', ' || first_name"</literal>.
453 Most of the time this limitation doesn't matter either, because the client program
454 can do these kinds of manipulations for itself. However, function calls may be a problem.
455 You can't nest them, and you can't pass more than one column value to them (and it has
456 to be the first parameter). </para>
457 <para> You can't use a CASE expression. Instead, the client application can do the equivalent
458 branching for itself. </para>
459 <para> You can't select a subquery. In raw SQL you can do something like the following: </para>
462 <programlisting language="SQL">
468 FROM actor.org_unit_type AS aout
469 WHERE aout.id = aou.ou_type
472 actor.org_unit AS aou;
476 <para> This contrived example is not very realistic. Normally you would use a JOIN in this
477 case, and that's what you should do in a JSON query. Other cases may not be so easy
484 <title>The WHERE Clause</title>
485 <para> Most queries need a WHERE clause, as in this simple example: </para>
488 <programlisting language="JSON">
491 "select": { "aou":[ "id", "name" ] },
499 <para> Like the SELECT clause, the WHERE clause gets its own entry in the top-level object
500 of a JSON query. The key is <literal>"where"</literal>, and the associated value is
501 either an object (as shown here) or an array (to be discussed a bit later). Each entry
502 in the object is a separate condition. </para>
503 <para> In this case, we use a special shortcut for expressing an equality condition. The
504 column name is on the left of the colon, and the value to which we are equating it is
505 on the right. </para>
506 <para> Here's the resulting SQL: </para>
509 <programlisting language="SQL">
514 actor.org_unit AS "aou"
520 <para> Like the SELECT clause, the generated WHERE clause qualifies each column name with
521 the alias of the relevant table. </para>
522 <para> If you want to compare a column to NULL, put "<literal>null</literal>" (without
523 quotation marks) to the right of the colon instead of a literal value. The resulting
524 SQL will include <literal>"IS NULL"</literal> instead of an equals sign. </para>
527 <title>Other Kinds of Comparisons</title>
528 <para> Here's the same query (which generates the same SQL) without the special
532 <programlisting language="JSON">
535 "select": { "aou":[ "id", "name" ] },
537 "parent_ou":{ "=":3 }
543 <para> We still have an entry whose key is the column name, but this time the
544 associated value is another JSON object. It must contain exactly one entry,
545 with the comparison operator on the left of the colon, and the value to be
546 compared on the right. </para>
547 <para> The same syntax works for other kinds of comparison operators.
551 <programlisting language="JSON">
554 "select": { "aou":[ "id", "name" ] },
556 "parent_ou":{ ">":3 }
562 <para> ...turns into: </para>
565 <programlisting language="SQL">
570 actor.org_unit AS "aou"
572 "aou".parent_ou > 3 ;
576 <para> The condition '<literal>"=":null</literal>' turns into IS NULL. Any other
577 operator used with <literal>"null"</literal> turns into IS NOT NULL. </para>
578 <para> You can use most of the comparison operators recognized by PostgreSQL: </para>
580 <programlisting language="SQL">
582 < > <= >=
588 <para> The only ones you can't use are <literal>"is distinct from"</literal> and
589 <literal>"is not distinct from"</literal>. </para>
593 <title>Custom Comparisons</title>
594 <para> Here's a dirty little secret: json_query doesn't really pay much attention to the
595 operator you supply. It merely checks to make sure that the operator doesn't contain
596 any semicolons or white space, in order to prevent certain kinds of SQL injection.
597 It also allows <literal>"similar to"</literal> as a special exception. </para>
598 <para> As a result, you can slip an operator of your own devising into the SQL, so long as
599 it doesn't contain any semicolons or white space, and doesn't create invalid syntax.
600 Here's a contrived and rather silly example: </para>
603 <programlisting language="JSON">
606 "select": { "aou":[ "id", "name" ] },
608 "parent_ou":{ "<2+":3 }
614 <para> ...which results in the following SQL: </para>
617 <programlisting language="SQL">
622 actor.org_unit AS "aou"
624 "aou".parent_ou <2+ 3;
628 <para> It's hard to come up with a realistic case where this hack would be useful, but it
629 could happen. </para>
633 <title>Comparing One Column to Another</title>
634 <para> Here's how to put another column on the right hand side of a comparison: </para>
637 <programlisting language="JSON">
640 "select": { "aou":[ "id", "name" ] },
642 "id": { ">": { "+aou":"parent_ou" } }
648 <para> This syntax is similar to the previous examples, except that instead of comparing
649 to a literal value, we compare to an object. This object has only a single entry,
650 whose key is a table alias preceded by a leading plus sign. The associated value is
651 the name of the column. </para>
652 <para> Here's the resulting SQL: </para>
655 <programlisting language="SQL">
660 actor.org_unit AS "aou"
663 "aou".id > ( "aou".parent_ou )
668 <para> The table alias must correspond to the appropriate table. Since json_query doesn't
669 validate the choice of alias, it won't detect an invalid alias until it tries to
670 execute the query. In this simple example there's only one table to choose from. The
671 choice of alias is more important in a subquery or join. </para>
672 <para> The leading plus sign, combined with a table alias, can be used in other situations
673 to designate the table to which a column belongs. We shall defer a discussion of this
674 usage to the section on joins. </para>
678 <title>Testing Boolean Columns</title>
679 <para> In SQL, there are several ways to test a boolean column such as
680 actor.org_unit.opac_visible. The most obvious way is to compare it to true or false: </para>
683 <programlisting language="SQL">
693 <para> In a JSON query this approach doesn't work. If you try it, the "= true" test will
694 turn into IS NULL. Don't do that. Instead, use a leading plus sign, as described in
695 the preceding section, to treat the boolean column as a stand-alone condition: </para>
698 <programlisting language="JSON">
701 "select": { "aou":[ "id" ] },
703 "+aou":"opac_visible"
709 <para> Result: </para>
712 <programlisting language="SQL">
716 actor.org_unit AS "aou"
722 <para> If you need to test for falsity, then write a test for truth and negate it with the
723 <literal>"-not"</literal> operator. We will discuss the "-not" operator later, but
724 here's a preview: </para>
727 <programlisting language="JSON">
730 "select": { "aou":[ "id" ] },
733 "+aou":"opac_visible"
741 <programlisting language="SQL">
745 actor.org_unit AS "aou"
747 NOT ( "aou".opac_visible );
751 <para> You can also compare a boolean column directly to a more complex condition: </para>
754 <programlisting language="JSON">
757 "select": { "aou":[ "id" ] },
760 "=": { "parent_ou":{ ">":3 } }
767 <para> Here we compare a boolean column, not to a literal value, but to a boolean expression.
768 The resulting SQL looks a little goofy, but it works: </para>
771 <programlisting language="SQL">
775 actor.org_unit AS "aou"
778 "aou".opac_visible = ( "aou".parent_ou > 3 )
783 <para> In this case we compare the boolean column to a single simple condition. However you
784 can include additional complications -- multiple conditions, IN lists, BETWEEN clauses,
785 and other features as described below. </para>
789 <title>Multiple Conditions</title>
790 <para> If you need multiple conditions, just add them to the <literal>"where"</literal>
791 object, separated by commas: </para>
794 <programlisting language="JSON">
797 "select": { "aou":[ "id", "name" ] },
799 "parent_ou":{ ">":3 },
800 "id":{ "<>":7 }
806 <para> The generated SQL connects the conditions with AND: </para>
809 <programlisting language="SQL">
814 actor.org_unit AS "aou"
816 "aou".parent_ou > 3
817 AND "aou".id <> 7;
821 <para> Later we will see how to use OR instead of AND. </para>
825 <title>Using Arrays</title>
826 <para> Here's a puzzler. Suppose you need two conditions for the same column. How do
827 you code them in the same WHERE clause? For example, suppose you want something
831 <programlisting language="SQL">
839 AND parent_ou <> 7;
843 <para> You might try a WHERE clause like this: </para>
846 <programlisting language="JSON">
848 "parent_ou":{ ">":3 },
849 "parent_ou":{ "<>":7 }
854 <para> Nope. Won't work. According to JSON rules, two entries in the same object
855 can't have the same key. </para>
856 <para> After slapping yourself in the forehead, you try something a little smarter: </para>
859 <programlisting language="JSON">
869 <para> Nice try, but that doesn't work either. Maybe it ought to work -- at least it's
870 legal JSON -- but, no. </para>
871 <para> Here's what works: </para>
874 <programlisting language="JSON">
877 "select": { "aou":[ "id", "name" ] },
879 { "parent_ou":{ ">":3 } },
880 { "parent_ou":{ "<>":7 } }
886 <para> We wrapped the two conditions into two separate JSON objects, and then wrapped
887 those objects together into a JSON array. The resulting SQL looks like this: </para>
890 <programlisting language="SQL">
895 actor.org_unit AS "aou"
897 ( "aou".parent_ou > 3 )
899 ( "aou".parent_ou <> 7 );
903 <para> That's not quite what we were hoping for, because the extra parentheses are so ugly.
904 But they're harmless. This will do. </para>
905 <para> If you're in the mood, you can use arrays to as many parentheses as
906 you like, even if there is only one condition inside: </para>
909 <programlisting language="JSON">
912 "select": { "aou":[ "id", "name" ] },
916 "parent_ou":{ ">":3 }
923 <para> ...yields: </para>
926 <programlisting language="SQL">
931 actor.org_unit AS "aou"
933 ( ( ( ( ( ( "aou".parent_ou > 3 ) ) ) ) ) );
940 <title>How to OR</title>
941 <para> By default, json_query combines conditions with AND. When you need OR,
942 here's how to do it: </para>
945 <programlisting language="JSON">
948 "select": { "aou":[ "id", "name" ] },
959 <para> We use <literal>"-or"</literal> as the key, with the conditions to be ORed in an
960 associated object. The leading minus sign is there to make sure that the operator
961 isn't confused with a column name. Later we'll see some other operators with leading
962 minus signs. In a couple of spots we even use plus signs. </para>
963 <para> Here are the results from the above example: </para>
966 <programlisting language="SQL">
971 actor.org_unit AS "aou"
975 OR "aou".parent_ou = 3
980 <para> The conditions paired with <literal>"-or"</literal> are linked by OR and enclosed
981 in parentheses, </para>
982 <para> Here's how to do the same thing using an array, except that it produces an extra
983 layer of parentheses: </para>
986 <programlisting language="JSON">
989 "select": { "aou":[ "id", "name" ] },
1001 <programlisting language="SQL">
1004 "aou".name AS "name"
1006 actor.org_unit AS "aou"
1010 OR ( "aou".parent_ou = 3 )
1015 <para> It's possible, though not very useful, to have only a single condition subject to
1016 the <literal>"-or"</literal> operator. In that case, the condition appears by itself,
1017 since there's nothing to OR it to. This trick is another way to add an extraneous
1018 layer of parentheses, </para>
1022 <title>Another way to AND</title>
1023 <para> You can also use the <literal>"-and"</literal> operator. It works just like
1024 <literal>"-or"</literal>, except that it combines conditions with AND instead of OR.
1025 Since AND is the default, we don't usually need a separate operator for it, but it's
1029 In rare cases, nothing else will do -- you can't include two conditions in the same
1030 list because of the duplicate key problem, but you can't combine them with arrays
1031 either. In particular, you might need to combine them within an expression that
1032 you're comparing to a boolean column (see the subsection above on Testing Boolean
1037 <title>Negation with NOT</title>
1038 <para> The <literal>"-not"</literal> operator negates a condition or set of conditions.
1039 For example: </para>
1042 <programlisting language="JSON">
1045 "select": { "aou":[ "id", "name" ] },
1057 <programlisting language="SQL">
1060 "aou".name AS "name"
1062 actor.org_unit AS "aou"
1067 AND "aou".parent_ou = 3
1072 <para> In this example we merely negate a combination of two comparisons. However the
1073 condition to be negated may be as complicated as it needs to be. Anything that can
1074 be subject to <literal>"where"</literal> can be subject to
1075 <literal>"-not"</literal>. </para>
1076 <para> In most cases you can achieve the same result by other means. However the
1077 <literal>"-not"</literal> operator is the only way to represent NOT BETWEEN
1078 (to be discussed later). </para>
1082 <title>EXISTS with Subqueries</title>
1083 <para> Two other operators carry a leading minus sign: <literal>"-exists"</literal>
1084 and its negation <literal>"-not-exists"</literal>. These operators apply to
1085 subqueries, which have the same format as a full query. For example: </para>
1088 <programlisting language="JSON">
1091 "select": { "aou":[ "id", "name" ] },
1095 "select":{ "asv":[ "id" ] },
1106 <programlisting language="SQL">
1109 "aou".name AS "name"
1111 actor.org_unit AS "aou"
1115 SELECT "asv".id AS "id"
1116 FROM action.survey AS "asv"
1117 WHERE "asv".owner = 7
1122 <para> This kind of subquery is of limited use, because its WHERE clause doesn't
1123 have anything to do with the main query. It just shuts down the main query
1124 altogether if it isn't satisfied. </para>
1125 <para> More typical is a correlated subquery, whose WHERE clause refers to a row
1126 from the main query. For example: </para>
1129 <programlisting language="JSON">
1132 "select": { "aou":[ "id", "name" ] },
1136 "select":{ "asv":[ "id" ] },
1138 "owner":{ "=":{ "+aou":"id" }}
1146 <para> Note the use of <literal>"+aou"</literal> to qualify the id column in the
1147 inner WHERE clause. </para>
1150 <programlisting language="SQL">
1153 "aou".name AS "name"
1155 actor.org_unit AS "aou"
1159 SELECT "asv".id AS "id"
1160 FROM action.survey AS "asv"
1161 WHERE ("asv".owner = ( "aou".id ))
1166 <para> This latter example illustrates the syntax, but in practice, it would
1167 probably be more natural to use an IN clause with a subquery (to be discussed
1172 <title>BETWEEN Clauses</title>
1173 <para> Here's how to express a BETWEEN clause: </para>
1176 <programlisting language="JSON">
1179 "select": { "aou":[ "id" ] },
1181 "parent_ou": { "between":[ 3, 7 ] }
1187 <para> The value associated with the column name is an object with a single
1188 entry, whose key is <literal>"between"</literal>. The corresponding
1189 value is an array with exactly two values, defining the range to be
1191 <para> The range bounds must be either numbers or string literals. Although
1192 SQL allows them to be null, a null doesn't make sense in this context,
1193 because a null never matches anything. Consequently json_query doesn't
1195 <para> The resulting SQL is just what you would expect: </para>
1198 <programlisting language="SQL">
1202 actor.org_unit AS "aou"
1204 parent_ou BETWEEN '3' AND '7';
1211 <title>IN and NOT IN Lists</title>
1212 <para> There are two ways to code an IN list. One way is simply to include
1213 the list of values in an array: </para>
1216 <programlisting language="JSON">
1219 "select": { "aou":[ "id", "name" ] },
1221 "parent_ou": [ 3, 5, 7 ]
1227 <para> As with a BETWEEN clause, the values in the array must be numbers or
1228 string literals. Nulls aren't allowed. Here's the resulting SQL, which
1229 again is just what you would expect: </para>
1232 <programlisting language="SQL">
1235 "aou".name AS "name"
1237 actor.org_unit AS "aou"
1239 "aou".parent_ou IN (3, 5, 7);
1243 <para> The other way is similar to the syntax shown above for a BETWEEN clause,
1244 except that the array may include any non-zero number of values: </para>
1247 <programlisting language="JSON">
1250 "select": { "aou":[ "id", "name" ] },
1252 "parent_ou": { "in": [ 3, 5, 7 ] }
1258 <para> This version results in the same SQL as the first one. </para>
1259 <para> For a NOT IN list, you can use the latter format, using the
1260 <literal>"not in"</literal> operator instead of <literal>"in"</literal>.
1261 Alternatively, you can use either format together with the
1262 <literal>"-not"</literal> operator. </para>
1266 <title>IN and NOT IN Clauses with Subqueries</title>
1267 <para> For an IN clause with a subquery, the syntax is similar to the second
1268 of the two formats for an IN list (see the previous subsection). The
1269 <literal>"in"</literal> or <literal>"not in"</literal> operator is paired,
1270 not with an array of values, but with an object representing the subquery.
1271 For example: </para>
1274 <programlisting language="JSON">
1277 "select": { "aou":[ "id", "name" ] },
1282 "select":{ "asv":[ "owner" ] },
1283 "where":{ "name":"Voter Registration" }
1291 <para> The results: </para>
1294 <programlisting language="SQL">
1297 "aou".name AS "name"
1299 actor.org_unit AS "aou"
1304 "asv".owner AS "owner"
1306 action.survey AS "asv"
1308 "asv".name = 'Voter Registration'
1313 <para> In SQL the subquery may select multiple columns, but in a JSON query it
1314 can select only a single column. </para>
1315 <para> For a NOT IN clause with a subquery, use the <literal>"not in"</literal>
1316 operator instead of <literal>"in"</literal>. </para>
1320 <title>Comparing to a Function</title>
1321 <para> Here's how to compare a column to a function call: </para>
1324 <programlisting language="JSON">
1327 "select": { "aou":[ "id", "name" ] },
1329 "id":{ ">":[ "sqrt", 16 ] }
1335 <para> A comparison operator (<literal>">"</literal> in this case) is paired
1336 with an array. The first entry in the array must be a string giving the name
1337 of the function. The remaining parameters, if any, are the parameters. They
1338 may be strings, numbers, or nulls. The resulting SQL for this example: </para>
1341 <programlisting language="SQL">
1344 "aou".name AS "name"
1346 actor.org_unit AS "aou"
1348 "aou".id > sqrt( '16' );
1352 <para> All parameters are passed as quoted strings -- even if, as in this case,
1353 they are really numbers. </para>
1354 <para> This syntax is somewhat limited in that the function parameters must be
1355 constants (hence the use of a silly example). </para>
1359 <title>Putting a Function Call on the Left</title>
1360 <para> In the discussion of the SELECT clause, we saw how you could transform the value
1361 of a selected column by passing it to a function. In the WHERE clause, you can use
1362 similar syntax to transform the value of a column before comparing it to something
1364 <para> For example: </para>
1367 <programlisting language="JSON">
1370 "select": { "aou":[ "id", "name" ] },
1374 "transform":"upper",
1375 "value":"CARTER BRANCH"
1383 <para> The <literal>"transform"</literal> entry gives the name of the function that we
1384 will use on the left side of the comparison. The <literal>"value"</literal> entry
1385 designates the value on the right side of the comparison. </para>
1388 <programlisting language="SQL">
1391 "aou".name AS "name"
1393 actor.org_unit AS "aou"
1395 upper("aou".name ) = 'CARTER BRANCH' ;
1399 <para> As in the SELECT clause, you can pass literal values or nulls to the function
1400 as additional parameters by using an array tagged as
1401 <literal>"params"</literal>: </para>
1404 <programlisting language="JSON">
1407 "select": { "aou":[ "id", "name" ] },
1411 "transform":"substr",
1422 <programlisting language="SQL">
1425 "aou".name AS "name"
1427 actor.org_unit AS "aou"
1429 substr("aou".name,'1','6' ) = 'CARTER' ;
1433 <para> The first parameter is always the column name, qualified by the class name,
1434 followed by any additional parameters (which are always enclosed in quotes even
1435 if they are numeric). </para>
1436 <para> As in the SELECT clause: if the function returns multiple columns, you can specify
1437 the one you want by using a <literal>"result_field"</literal> entry (not shown
1439 <para> If you leave out the <literal>"transform"</literal> entry (or misspell it), the
1440 column name will appear on the left without any function call. This syntax works,
1441 but it's more complicated than it needs to be. </para>
1445 <title>Putting Function Calls on Both Sides</title>
1446 <para> If you want to compare one function call to another, you can use the same syntax
1447 shown in the previous subsection -- except that the <literal>"value"</literal> entry
1448 carries an array instead of a literal value. For example: </para>
1451 <programlisting language="JSON">
1454 "select": { "aou":[ "id", "name" ] },
1458 "transform":"factorial",
1459 "value":[ "sqrt", 1000 ]
1468 <programlisting language="SQL">
1471 "aou".name AS "name"
1473 actor.org_unit AS "aou"
1475 factorial("aou".id ) > sqrt( '1000' ) ;
1479 <para> The format for the right side function is similar to what we saw earlier, in the
1480 subsection Comparing to a Function. Note that there are two different formats for
1481 defining function calls: </para>
1486 <para> For a function call to the left of the comparison, the function name is
1487 tagged as <literal>"transform"</literal>. The first parameter is always the
1488 relevant column name; additional parameters, if any, are in an array tagged
1489 as <literal>"params"</literal>. The entry for
1490 <literal>"result_field"</literal>, if present, specifies a subcolumn. </para>
1494 <para> For a function call to the right of the comparison, the function name is
1495 the first entry in an array, together with any parameters. There's no way to
1496 specify a subcolumn. </para>
1504 <title>Comparing a Function to a Condition</title>
1505 <para> So far we have seen two kinds of data for the <literal>"value"</literal> tag. A
1506 string or number translates to a literal value, and an array translates to a function
1507 call. The third possibility is a JSON object, which translates to a condition. For
1511 <programlisting language="JSON">
1514 "select": { "aou":[ "id", "name" ] },
1518 "value":{ "parent_ou":{ ">":3 } },
1519 "transform":"is_prime"
1527 <para> The function tagged as <literal>"transform"</literal> must return boolean, or else
1528 json_query will generate invalid SQL. The function used here,
1529 <literal>"is_prime"</literal>, is fictitious. </para>
1532 <programlisting language="SQL">
1535 "aou".name AS "name"
1537 actor.org_unit AS "aou"
1540 is_prime("aou".id ) = ( "aou".parent_ou > 3 )
1545 <para> If we left out the <literal>"transform"</literal> entry, json_query would compare
1546 the column on the left (which would to be boolean) to the condition on the right.
1547 The results are similar to those for a simpler format described earlier (see the
1548 subsection Testing Boolean Columns). </para>
1549 <para> In the example above we compared the boolean to a simple condition. However the
1550 expression on the right may include multiple conditions, IN lists, subqueries, and
1551 whatever other complications are necessary. </para>
1555 <title>Things You Can't Do</title>
1556 <para> The WHERE clause is subject to some of the same limitations as the SELECT clause.
1557 However, in the WHERE clause these limitations are more limiting, because the client
1558 program can't compensate by doing some of the work for itself. </para>
1559 <para> You can't use arbitrary expressions in a WHERE condition, such as
1560 <literal>"WHERE id > parent_ou -- 3"</literal>. In some cases you may be able to
1561 contrive a custom operator in order to fake such an expression. However this mechanism
1562 is neither very general nor very aesthetic. </para>
1563 <para> To the right of a comparison operator, all function parameters must be literals or
1564 null. You can't pass a column value, nor can you nest function calls. </para>
1565 <para> Likewise you can't include column values or arbitrary expressions in an IN list
1566 or a BETWEEN clause. </para>
1567 <para> You can't include null values in an IN list or a BETWEEN list, not that you should
1568 ever want to. </para>
1569 <para> As noted earlier: you can't use the comparison operators
1570 <literal>"is distinct from"</literal> or <literal>"is not distinct from"</literal>. </para>
1571 <para> Also as noted earlier: a subquery in an IN clause cannot select more than one
1578 <title>JOIN clauses</title>
1579 <para> Until now, our examples have selected from only one table at a time. As a result,
1580 the FROM clause has been very simple -- just a single string containing the class name of
1581 the relevant table. </para>
1582 <para> When the FROM clause joins multiple tables, the corresponding JSON naturally gets more
1583 complicated. </para>
1584 <para> SQL provides two ways to define a join. One way is to list both tables in the FROM
1585 clause, and put the join conditions in the WHERE clause: </para>
1588 <programlisting language="SQL">
1594 actor.org_unit_type aout
1596 aout.id = aou.ou_type;
1600 <para> The other way is to use an explicit JOIN clause: </para>
1603 <programlisting language="SQL">
1609 JOIN actor.org_unit_type aout
1610 ON ( aout.id = aou.ou_type );
1614 <para> JSON queries use only the second of these methods. The following example expresses
1615 the same query in JSON: </para>
1618 <programlisting language="JSON">
1620 "select": { "aou":[ "id" ], "aout":[ "name" ] },
1628 <para> First, let's review the SELECT clause. Since it selects rows from two different tables,
1629 the data for <literal>"select"</literal> includes two entries, one for each table. </para>
1630 <para> As for the FROM clause, it's no longer just a string. It's a JSON object, with exactly
1631 one entry. The key of this entry is the class name of the core table, i.e. the table
1632 named immediately after the FROM keyword. The data associated with this key contains the
1633 rest of the information about the join. In this simple example, that information consists
1634 entirely of a string containing the class name of the other table. </para>
1635 <para> So where is the join condition? </para>
1636 <para> It's in the IDL. Upon reading the IDL, json_query knows that actor.org_unit has a
1637 foreign key pointing to actor.org_unit_type, and builds a join condition accordingly: </para>
1640 <programlisting language="SQL">
1643 "aout".name AS "name"
1645 actor.org_unit AS "aou"
1646 INNER JOIN actor.org_unit_type AS "aout"
1647 ON ( "aout".id = "aou".ou_type ) ;
1651 <para> In this case the core table is the child table, and the joined table is the parent table.
1652 We could just as well have written it the other way around: </para>
1655 <programlisting language="JSON">
1657 "select": { "aou":[ "id" ], "aout":[ "name" ] },
1666 <programlisting language="SQL">
1669 "aout".name AS "name"
1671 actor.org_unit_type AS "aout"
1672 INNER JOIN actor.org_unit AS "aou"
1673 ON ( "aou".ou_type = "aout".id ) ;
1678 <title>Specifying The Join Columns Explicitly</title>
1679 <para> While it's convenient to let json_query pick the join columns, it doesn't
1680 always work. </para>
1681 <para> For example, the actor.org_unit table has four different address ids, for
1682 four different kinds of addresses. Each of them is a foreign key to the
1683 actor.org_address table. Json_query can't guess which one you want if you
1684 don't tell it. </para>
1685 <para> (Actually it will try to guess. It will pick the first matching link that
1686 it finds in the IDL, which may or may not be the one you want.) </para>
1687 <para> Here's how to define exactly which columns you want for the join: </para>
1690 <programlisting language="JSON">
1692 "select": { "aou":[ "id" ], "aoa":[ "street1" ] },
1696 "fkey":"holds_address",
1705 <para> Before, the table we were joining was represented merely by its class name.
1706 Now it's represented by an entry in a JSON object. The key of that entry is the
1707 class name, and the associated data is another layer of JSON object containing
1708 the attributes of the join. </para>
1709 <para> Later we'll encounter other kinds of join attributes. For now, the only
1710 attributes that we're looking at are the ones that identify the join columns:
1711 <literal>"fkey"</literal> and <literal>"field"</literal>. The hard part is
1712 remembering which is which: </para>
1717 <para> <literal>"fkey"</literal> identifies the join column from the
1718 <emphasis>left</emphasis> table; </para>
1722 <para> <literal>"field"</literal> identifies the join column from the
1723 <emphasis>right</emphasis> table. </para>
1728 <para> When there are only two tables involved, the core table is on the left, and
1729 the non-core table is on the right. In more complex queries neither table may
1730 be the core table. </para>
1731 <para> Here is the result of the preceding JSON: </para>
1734 <programlisting language="SQL">
1737 "aoa".street1 AS "street1"
1739 actor.org_unit AS "aou"
1740 INNER JOIN actor.org_address AS "aoa"
1741 ON ( "aoa".id = "aou".holds_address ) ;
1745 <para> In this example the child table is on the left and the parent table is on the
1746 right. We can swap the tables if we swap the join columns as well: </para>
1749 <programlisting language="JSON">
1751 "select": { "aou":[ "id" ], "aoa":[ "street1" ] },
1756 "field":"holds_address"
1765 <programlisting language="SQL">
1768 "aoa".street1 AS "street1"
1770 actor.org_address AS "aoa"
1771 INNER JOIN actor.org_unit AS "aou"
1772 ON ( "aou".holds_address = "aoa".id ) ;
1776 <para> When you specify both of the join columns, json_query assumes that you know
1777 what you're doing. It doesn't check the IDL to confirm that the join makes sense.
1778 The burden is on you to avoid absurdities. </para>
1782 <title>Specifying Only One Join Column</title>
1783 <para> We just saw how to specify both ends of a join. It turns out that there's a
1784 shortcut -- most of the time you only need to specify one end. Consider the
1785 following variation on the previous example: </para>
1788 <programlisting language="JSON">
1790 "select": { "aou":[ "id" ], "aoa":[ "street1" ] },
1794 "field":"holds_address"
1802 <para> ..which results in exactly the same SQL as before. </para>
1803 <para> Here we specified the join column from the child table, the column that is a
1804 foreign key pointing to another table. As long as that linkage is defined in the IDL,
1805 json_query can look it up and figure out what the corresponding column is in the
1806 parent table. </para>
1807 <para> However this shortcut doesn't work if you specify only the column in the parent
1808 table, because it would lead to ambiguities. Suppose we had specified the id column
1809 of actor.org_address. As noted earlier, there are four different foreign keys from
1810 actor.org_unit to actor.org_address, and json_query would have no way to guess
1811 which one we wanted. </para>
1815 <title>Joining to Multiple Tables</title>
1816 <para> So far we have joined only two tables at a time. What if we need to join one
1817 table to two different tables? </para>
1818 <para> Here's an example: </para>
1821 <programlisting language="JSON">
1823 "select": { "aou":[ "id" ], "aout":[ "depth" ], "aoa":[ "street1" ] },
1828 "fkey":"holds_address"
1836 <para> The first join, to actor.org_unit_type, is simple. We could have specified join
1837 columns, but we don't have to, because json_query will construct that join on the
1838 basis of what it finds in the IDL. Having no join attributes to specify, we leave
1839 that object empty. </para>
1840 <para> For the second join, to actor.org_address, we have to specify at least the join
1841 column in the child table, as discussed earlier. We could also have specified the
1842 join column from the parent table, but we don't have to, so we didn't. </para>
1843 <para> Here is the resulting SQL: </para>
1846 <programlisting language="SQL">
1849 "aout".depth AS "depth",
1850 "aoa".street1 AS "street1"
1852 actor.org_unit AS "aou"
1853 INNER JOIN actor.org_unit_type AS "aout"
1854 ON ( "aout".id = "aou".ou_type )
1855 INNER JOIN actor.org_address AS "aoa"
1856 ON ( "aoa".id = "aou".holds_address ) ;
1860 <para> Since there can be only one core table, the outermost object in the FROM clause
1861 can have only one entry, whose key is the class name of the core table. The next
1862 level has one entry for every table that's joined to the core table. </para>
1866 <title>Nested Joins</title>
1867 <para> Let's look at that last query again. It joins three tables, and the core table
1868 is the one in the middle. Can we make one of the end tables the core table instead? </para>
1869 <para> Yes, we can: </para>
1872 <programlisting language="JSON">
1874 "select": { "aou":[ "id" ], "aout":[ "depth" ], "aoa":[ "street1" ] },
1878 "field":"holds_address",
1880 "aout":{ "fkey":"ou_type" }
1889 <para> The <literal>"join"</literal> attribute introduces another level of join. In this
1890 case <literal>"aou"</literal> is the left table for the nested join, and the right table
1891 for the original join. Here are the results: </para>
1894 <programlisting language="SQL">
1897 "aout".depth AS "depth",
1898 "aoa".street1 AS "street1"
1900 actor.org_address AS "aoa"
1901 INNER JOIN actor.org_unit AS "aou"
1902 ON ( "aou".holds_address = "aoa".id )
1903 INNER JOIN actor.org_unit_type AS "aout"
1904 ON ( "aout".id = "aou".ou_type ) ;
1911 <title>Outer Joins</title>
1912 <para> By default, json_query constructs an inner join. If you need an outer join, you
1913 can add the join type as an attribute of the join: </para>
1916 <programlisting language="JSON">
1918 "select": { "aou":[ "id" ], "aoa":[ "street1" ] },
1922 "field":"mailing_address",
1931 <para> Here we asked for a left outer join. For a right outer join, code
1932 <literal>"type":"right"</literal>. For a full outer join, code
1933 <literal>"type":"full"</literal>. Any other value for "type" results in an inner
1934 join, so watch out for typos. A type of <literal>"rihgt"</literal> will give you
1935 a wrong join instead of a right one. </para>
1936 <para> Here is the resulting SQL for this example: </para>
1939 <programlisting language="SQL">
1942 "aoa".street1 AS "street1"
1944 actor.org_address AS "aoa"
1945 LEFT JOIN actor.org_unit AS "aou"
1946 ON ( "aou".mailing_address = "aoa".id ) ;
1953 <title>Referring to Joined Tables in the WHERE Clause</title>
1954 <para> In the WHERE clause of the generated SQL, every column name is qualified by a
1955 table alias, which is always the corresponding class name. </para>
1956 <para> If a column belongs to the core table, this qualification happens by default.
1957 If it belongs to a joined table, the JSON must specify what class name to use for
1958 an alias. For example: </para>
1961 <programlisting language="JSON">
1963 "select": { "aou":[ "id" ], "aout":[ "name" ] },
1968 "+aou":{ "parent_ou":2 }
1974 <para> Note the peculiar operator <literal>"+aou"</literal> -- a plus sign followed
1975 by the relevant class name. This operator tells json_query to apply the specified
1976 class to the condition that follows. The result: </para>
1979 <programlisting language="SQL">
1982 "aout".name AS "name"
1984 actor.org_unit_type AS "aout"
1985 INNER JOIN actor.org_unit AS "aou"
1986 ON ( "aou".ou_type = "aout".id )
1988 ( "aou".parent_ou = 2 );
1992 <para> The plus-class operator may apply to multiple conditions: </para>
1995 <programlisting language="JSON">
1997 "select": { "aou":[ "id" ], "aout":[ "name" ] },
2012 <programlisting language="SQL">
2015 "aout".name AS "name"
2017 actor.org_unit_type AS "aout"
2018 INNER JOIN actor.org_unit AS "aou"
2019 ON ( "aou".ou_type = "aout".id )
2023 AND "aou".id < 42
2028 <para> For these artificial examples, it would have been simpler to swap the tables,
2029 so that actor.org_unit is the core table. Then you wouldn't need to go through
2030 any special gyrations to apply the right table alias. In a more realistic case,
2031 however, you might need to apply conditions to both tables. Just swapping the
2032 tables wouldn't solve the problem. </para>
2033 <para> You can also use a plus-class operator to compare columns from two different
2037 <programlisting language="JSON">
2039 "select": { "aou":[ "id" ], "aout":[ "name" ] },
2044 "depth": { ">": { "+aou":"parent_ou" } }
2051 <programlisting language="SQL">
2054 "aout".name AS "name"
2056 actor.org_unit_type AS "aout"
2057 INNER JOIN actor.org_unit AS "aou"
2058 ON ( "aou".ou_type = "aout".id )
2061 "aout".depth > ( "aou".parent_ou )
2066 <para> Please don't expect that query to make any sense. It doesn't. But it
2067 illustrates the syntax. </para>
2071 <title>Join Filters</title>
2072 <para> While the above approach certainly works, the special syntax needed is goofy
2073 and awkward. A somewhat cleaner solution is to include a condition in the JOIN
2077 <programlisting language="JSON">
2079 "select": { "aou":[ "id" ], "aout":[ "name" ] },
2094 <programlisting language="SQL">
2096 "aou".id AS "id", "aout".name AS "name"
2098 actor.org_unit_type AS "aout"
2099 INNER JOIN actor.org_unit AS "aou"
2100 ON ( "aou".ou_type = "aout".id
2101 AND "aou".parent_ou = 2 ) ;
2105 <para> By default, json_query uses AND to combine the <literal>"filter"</literal>
2106 condition with the original join condition. If you need OR, you can use the
2107 <literal>"filter_op"</literal> attribute to say so: </para>
2110 <programlisting language="JSON">
2112 "select": { "aou":[ "id" ], "aout":[ "name" ] },
2128 <programlisting language="SQL">
2131 "aout".name AS "name"
2133 actor.org_unit_type AS "aout"
2134 INNER JOIN actor.org_unit AS "aou"
2135 ON ( "aou".ou_type = "aout".id
2136 OR "aou".parent_ou = 2 ) ;
2140 <para> If the data tagged by <literal>"filter_op"</literal> is anything but
2141 <literal>"or"</literal> (in upper, lower, or mixed case), json_query uses AND
2142 instead of OR. </para>
2143 <para> The condition tagged by <literal>"filter"</literal> may be much more complicated.
2144 In fact it accepts all the same syntax as the WHERE clause. </para>
2145 <para> Remember, though, that it all gets combined with the the original join condition
2146 with an AND, or with an OR if you so specify. If you're not careful, the result
2147 may be a confusing mixture of AND and OR at the same level. </para>
2151 <title>Joining to a Subquery</title>
2152 <para> In SQL you can put a subquery in a FROM clause, and select from it as if it were
2153 a table. A JSON query has no way to do that directly. The IDL, however, can define
2154 a class as a subquery instead of as a table. When you SELECT from it, json_query
2155 inserts the corresponding subquery into the FROM clause. For example: </para>
2158 <programlisting language="JSON">
2160 "select":{ "iatc":[ "id", "dest", "copy_status" ] },
2166 <para> There's nothing special-looking about this JSON, but json_query expands it as
2170 <programlisting language="SQL">
2173 "iatc".dest AS "dest",
2174 "iatc".copy_status AS "copy_status"
2179 action.transit_copy t
2180 JOIN actor.org_unit AS s
2181 ON (t.source = s.id)
2182 JOIN actor.org_unit AS d
2185 s.parent_ou <> d.parent_ou
2190 <para> The <literal>"iatc"</literal> class is like a view, except that it's defined in the
2191 IDL instead of the database. In this case it provides a way to do a join that would
2192 otherwise be impossible through a JSON query, because it joins the same table in two
2193 different ways (see the next subsection). </para>
2197 <title>Things You Can't Do</title>
2198 <para> In a JOIN, as with other SQL constructs, there are some things that you can't do with
2199 a JSON query. </para>
2200 <para> In particular, you can't specify a table alias, because the table alias is always the
2201 class name. As a result: </para>
2206 <para> You can't join a table to itself. For example, you can't join actor.org_unit
2207 to itself in order to select the name of the parent for every org_unit. </para>
2211 <para> You can't join to the same table in more than one way. For example, you can't
2212 join actor.org_unit to actor.org_address through four different foreign keys, to
2213 get four kinds of addresses in a single query. </para>
2218 <para> The only workaround is to perform the join in a view, or in a subquery defined in
2219 the IDL as described in the previous subsection. </para>
2220 <para> Some other things, while not impossible, require some ingenuity in the use of join
2222 <para> For example: by default, json_query constructs a join condition using only a single
2223 pair of corresponding columns. As long as the database is designed accordingly, a
2224 single pair of columns will normally suffice. If you ever need to join on more than
2225 one pair of columns, you can use join filters for the extras. </para>
2226 <para> Likewise, join conditions are normally equalities. In raw SQL it is possible
2227 (though rarely useful) to base a join on an inequality, or to use a function call in
2228 a join condition, or to omit any join condition in order to obtain a Cartesian product.
2229 If necessary, you can devise such unconventional joins by combining the normal join
2230 conditions with join filters. </para>
2231 <para> For example, here's how to get a Cartesian product: </para>
2234 <programlisting language="JSON">
2236 "select": { "aou":[ "id" ], "aout":[ "name" ] },
2241 "ou_type":{ "<>": { "+aout":"id" } }
2252 <programlisting language="SQL">
2255 "aout".name AS "name"
2257 actor.org_unit_type AS "aout"
2258 INNER JOIN actor.org_unit AS "aou"
2261 "aou".ou_type = "aout".id
2262 OR ("aou".ou_type <> ( "aout".id ))
2267 <para> Yes, it's ugly, but at least you're not likely to do it by accident. </para>
2272 <title>Selecting from Functions</title>
2273 <para> In SQL, you can put a function call in the FROM clause. The function may return
2274 multiple columns and multiple rows. Within the query, the function behaves like a
2276 <para> A JSON query can also select from a function: </para>
2279 <programlisting language="JSON">
2281 "from": [ "actor.org_unit_ancestors", 5 ]
2286 <para> The data associated with <literal>"from"</literal> is an array instead of a string
2287 or an object. The first element in the array specifies the name of the function.
2288 Subsequent elements, if any, supply the parameters of the function; they must be
2289 literal values or nulls. </para>
2290 <para> Here is the resulting query: </para>
2293 <programlisting language="SQL">
2296 actor.org_unit_ancestors( '5' ) AS "actor.org_unit_ancestors" ;
2300 <para> In a JSON query this format is very limited, largely because the IDL knows
2301 nothing about the available functions. You can't join the function to a table or
2302 to another function. If you try to supply a SELECT list or a WHERE clause,
2303 json_query will ignore it. The generated query will always select every column,
2304 via a wild card asterisk, from every row. </para>
2308 <title>The ORDER BY Clause</title>
2309 <para> In most cases you can encode an ORDER BY clause as either an array or an object.
2310 Let's take a simple example and try it both ways. First the array: </para>
2313 <programlisting language="JSON">
2315 "select":{ "aou":[ "name" ] },
2318 { "class":"aou", "field":"name" }
2324 <para> Now the object: </para>
2327 <programlisting language="JSON">
2329 "select":{ "aou":[ "name" ] },
2338 <para> The results are identical from either version: </para>
2341 <programlisting language="SQL">
2343 "aou".name AS "name"
2345 actor.org_unit AS "aou"
2351 <para> The array format is more verbose, but as we shall see, it is also more flexible.
2352 It can do anything the object format can do, plus some things that the object format
2353 <emphasis>can't</emphasis> do. </para>
2356 <title>ORDER BY as an Array</title>
2357 <para> In the array format, each element of the array is an object defining one of the
2358 sort fields. Each such object must include at least two tags: </para>
2363 <para> The <literal>"class"</literal> tag provides the name of the class,
2364 which must be either the core class or a joined class. </para>
2368 <para> The <literal>"field"</literal> tag provides the field name, corresponding
2369 to one of the columns of the class. </para>
2374 <para> If you want to sort by multiple fields, just include a separate object for each
2376 <para> If you want to sort a field in descending order, add a
2377 <literal>"direction"</literal> tag: </para>
2380 <programlisting language="JSON">
2382 "select":{ "aou":[ "name" ] },
2385 { "class":"aou", "field":"name", "direction":"desc" }
2392 <programlisting language="SQL">
2394 "aou".name AS "name"
2396 actor.org_unit AS "aou"
2402 <para> The string tagged as <literal>"direction"</literal> can be anything -- all that
2403 matters is the first character. If the string starts with "D" or "d", the sort
2404 will be descending. Otherwise it will be ascending. So
2405 <literal>"diplodocus"</literal> or <literal>"Dioscorides"</literal> will work as
2406 well as <literal>"desc"</literal>, but <literal>"going down"</literal> means that
2407 the sort will go up. </para>
2408 <para> You can also pass a column through some kind of transforming function, much as
2409 you can in the SELECT and WHERE clauses, using the <literal>"transform"</literal>
2410 tag. For example, for a case-insensitive sort, you could raise to upper case: </para>
2413 <programlisting language="JSON">
2415 "select":{ "aou":[ "name" ] },
2429 <programlisting language="SQL">
2431 "aou".name AS "name"
2433 actor.org_unit AS "aou"
2439 <para> If you need additional parameters for the function, you can use the
2440 <literal>"params"</literal> tag to pass them: </para>
2443 <programlisting language="JSON">
2445 "select":{ "aou":[ "name" ] },
2451 "transform":"substr",
2459 <para> The additional parameters appear as elements in an array. They may be numbers,
2460 strings, or nulls. </para>
2463 <programlisting language="SQL">
2465 "aou".name AS "name"
2467 actor.org_unit AS "aou"
2469 substr("aou".name,'1','8' );
2473 <para> As we have seen elsewhere, all literal values are passed as quoted strings,
2474 even if they are numbers. </para>
2475 <para> If the function returns multiple columns, you can use the
2476 <literal>"result_field"</literal> tag to indicate which one you want (not shown). </para>
2480 <title>ORDER BY as an Object</title>
2481 <para> When you encode the ORDER BY clause as an object, the keys of the object are
2482 class names. Each class must be either the core class or a joined class. The
2483 data for each class can be either an array or another layer of object. Here's an
2484 example with one of each: </para>
2487 <programlisting language="JSON">
2489 "select":{ "aout":"id", "aou":[ "name" ] },
2490 "from": { "aou":"aout" },
2493 "aou":{ "name":{ "direction":"desc" } }
2499 <para> For the <literal>"aout"</literal> class, the associated array is simply a list
2500 of field names (in this case, just one). Naturally, each field must reside in
2501 the class with which it is associated. </para>
2502 <para> However, a list of field names provides no way to specify the direction of
2503 sorting, or a transforming function. You can add those details only if the
2504 class name is paired with an object, as in the example for the
2505 <literal>"aou"</literal> class. The keys for such an object are field names, and
2506 the associated tags define other details. </para>
2507 <para> In this example, we use the <literal>"direction"</literal> tag to specify that
2508 the name field be sorted in descending order. This tag works the same way here as
2509 described earlier. If the associated string starts with "D" or "d", the sort will
2510 be descending; otherwise it will be ascending. </para>
2511 <para> Here is the resulting SQL: </para>
2514 <programlisting language="SQL">
2516 "aou".name AS "name"
2518 actor.org_unit AS "aou"
2519 INNER JOIN actor.org_unit_type AS "aout"
2520 ON ( "aout".id = "aou".ou_type )
2527 <para> You can also use the <literal>"transform</literal>", <literal>"params"</literal>,
2528 and <literal>"result_field"</literal> tags to specify the use of a transforming
2529 function, as described in the previous subsection. For example: </para>
2532 <programlisting language="JSON">
2534 "select":{ "aou":[ "name", "id" ] },
2538 "name":{ "transform":"substr", "params":[ 1, 8 ] }
2546 <programlisting language="SQL">
2548 "aou".name AS "name",
2551 actor.org_unit AS "aou"
2553 substr("aou".name,'1','8' );
2560 <title>Things You Can't Do</title>
2561 <para> If you encode the ORDER BY clause as an object, you may encounter a couple of
2562 restrictions. </para>
2563 <para> Because the key of such an object is the class name, all the fields from a given
2564 class must be grouped together. You can't sort by a column from one table, followed
2565 by a column from another table, followed by a column from the first table. If you
2566 need such a sort, you must encode the ORDER BY clause in the array format, which
2567 suffers from no such restrictions. </para>
2568 <para> For similar reasons, with an ORDER BY clause encoded as an object, you can't
2569 reference the same column more than once. Although such a sort may seem perverse,
2570 there are situations where it can be useful, provided that the column is passed to
2571 a transforming function. </para>
2572 <para> For example, you might want a case-insensitive sort, except that for any given
2573 letter you want lower case to sort first. For example, you want "diBona" to sort
2574 before "Dibona". Here's a way to do that, coding the ORDER BY clause as an array: </para>
2577 <programlisting language="JSON">
2579 "select":{ "au":[ "family_name", "id" ] },
2582 { "class":"au", "field":"family_name", "transform":"upper" },
2583 { "class":"au", "field":"family_name" }
2590 <programlisting language="SQL">
2592 "au".family_name AS "family_name",
2597 upper("au".family_name ),
2602 <para> Such a sort is not possible where the ORDER BY clause is coded as an object. </para>
2607 <title>The GROUP BY Clause</title>
2608 <para> A JSON query has no separate construct to define a GROUP BY clause. Instead, the
2609 necessary information is distributed across the SELECT clause. However, the way it works
2610 is a bit backwards from what you might expect, so pay attention. </para>
2611 <para> Here's an example: </para>
2614 <programlisting language="JSON">
2618 { "column":"parent_ou" },
2619 { "column":"name", "transform":"max", "aggregate":true }
2627 <para> The <literal>"transform"</literal> tag is there just to give us an excuse to do a GROUP
2628 BY. What's important to notice is the <literal>"aggregate"</literal> tag. </para>
2629 <para> Here's the resulting SQL: </para>
2632 <programlisting language="SQL">
2634 "aou".parent_ou AS "parent_ou",
2635 max("aou".name ) AS "name"
2637 actor.org_unit AS "aou"
2643 <para> The GROUP BY clause references fields from the SELECT clause by numerical reference,
2644 instead of by repeating them. Notice that the field it references, parent_ou, is the
2645 one that <emphasis>doesn't</emphasis> carry the <literal>"aggregate"</literal> tag in
2647 <para> Let's state that more generally. The GROUP BY clause includes only the fields that
2648 do <emphasis>not</emphasis> carry the <literal>"aggregate"</literal> tag (or that carry
2649 it with a value of false). </para>
2650 <para> However, that logic applies only when some field somewhere <emphasis>does</emphasis>
2651 carry the <literal>"aggregate"</literal> tag, with a value of true. If there is no
2652 <literal>"aggregate"</literal> tag, or it appears only with a value of false, then there
2653 is no GROUP BY clause. </para>
2654 <para> If you really want to include every field in the GROUP BY clause, don't use
2655 <literal>"aggregate"</literal>. Use the <literal>"distinct"</literal> tag, as described
2656 in the next section. </para>
2660 <title>The DISTINCT Clause</title>
2661 <para> JSON queries don't generate DISTINCT clauses. However, they can generate GROUP
2662 BY clauses that include every item from the SELECT clause. The effect is the same
2663 as applying DISTINCT to the entire SELECT clause. </para>
2664 <para> For example: </para>
2667 <programlisting language="JSON">
2681 <para> Note the <literal>"distinct"</literal> entry at the top level of the
2682 query object, with a value of "true". </para>
2685 <programlisting language="SQL">
2687 "aou".parent_ou AS "parent_ou",
2688 "aou".ou_type AS "ou_type"
2690 actor.org_unit AS "aou"
2696 <para> The generated GROUP BY clause references every column in the SELECT clause by
2701 <title>The HAVING Clause</title>
2702 <para> For a HAVING clause, add a <literal>"having"</literal> entry at the top level
2703 of the query object. For the associated data, you can use all the same syntax
2704 that you can use for a WHERE clause. </para>
2705 <para> Here's a simple example: </para>
2708 <programlisting language="JSON">
2714 "transform":"count",
2724 "transform":"count",
2733 <para> We use the "aggregate" tag in the SELECT clause to give us a GROUP BY to go
2734 with the HAVING. Results: </para>
2737 <programlisting language="SQL">
2739 "aou".parent_ou AS "parent_ou",
2740 count("aou".id ) AS "id_count"
2742 actor.org_unit AS "aou"
2746 count("aou".id ) > 6 ;
2750 <para> In raw SQL we could have referred to "count( 1 )". But since JSON queries
2751 cannot encode arbitrary expressions, we applied the count function to a column
2752 that cannot be null. </para>
2756 <title>The LIMIT and OFFSET Clauses</title>
2757 <para> To add an LIMIT or OFFSET clause, add an entry to the top level of a query
2758 object. For example: </para>
2761 <programlisting language="JSON">
2764 "aou": [ "id", "name" ]
2767 "order_by": { "aou":[ "id" ] },
2774 <para> The data associated with <literal>"offset"</literal> and <literal>"limit"</literal>
2775 may be either a number or a string, but if it's a string, it should have a number
2777 <para> Result: </para>
2780 <programlisting language="SQL">
2783 "aou".name AS "name"
2785 actor.org_unit AS "aou"