7 The query schema stores user-defined queries in an abstract form. The
8 qstore server reads the query tables, constructs the corresponding SQL,
9 executes the query, and returns the result set. This machinery supports
10 three main kinds of uses:
13 2. Repeated queries for reports or other kinds of extracts
14 3. Identifying rows that may be subject to automated updates
16 Queries may be customized at run time through the use of bind variables.
17 For example, a query might extract circulation statistics for a given
18 branch. It could include a bind variable as a placeholder for the org
19 unit id, which the user would supply at run time. A bind variable may
20 represent a single value or a variable-length list of values.
22 Although there are some limitations, the query tables can represent most
23 of the queries that anyone is likely to want. In particular they
24 support many SQL constructs that json_query does not support.
26 Warning: the machinery comprising qstore and the query tables is a text
27 generator with little understanding of how databases work. Depending on
28 the contents of the query tables, it may generate invalid SQL.
29 PostgreSQL is the final arbiter.
33 The query schema includes the following tables, each of which is
34 described in a later section:
36 1. stored_query – stores the topmost level of a query or subquery:
37 SELECT, UNION, INTERSECT, or EXCEPT. Other tables link to
38 query.stored_query, directly or indirectly.
39 2. query_sequence – specifies the sequence of subordinate queries
40 within a UNION, INTERSECT, or EXCEPT.
41 3. expression – each row represents an expression, often a
42 subexpression of some larger expression.
43 4. from_relation – each row represents a FROM clause, or part of a FROM
44 clause, identifying a table, view, subquery, or function from which the
46 5. select_item – each row specifies the location and content of an
47 entry in a SELECT list.
48 6. order_by_item – each row specifies the location and content of an
49 entry in an ORDER BY list.
50 7. function_sig – represents the names and return types of functions.
51 8. case_branch – represents branches in CASE expressions.
52 9. datatype – defines datatypes that may be used in CAST expressions.
53 10. bind_variable – represents bind variables whose values may be
54 supplied at execution time.
56 Three other tables are currently unused, and will not be discussed in
59 1. record_column – defines column sets for functions in a FROM clause.
60 2. function_param_def – defines the parameters of functions.
61 3. subfield – defines the components of composite types.
63 The latter two may or may not turn out to be useful for the user
68 The stored_query table is the entry point into the query schema. When
69 you want qstore to construct a query, you give it the id of a row in
70 query.stored_query. Then qstore reads that row and all the rows
71 connected to it, directly or indirectly, that collectively define the
74 The columns are as follows:
76 id integer primary key
80 use_all boolean not null default
83 use_distinct boolean not null default
86 from_clause integer points to
89 where_clause integer points to
92 having_clause integer points to
95 limit_count integer points to
98 offset_count integer points to
101 The id is normally assigned by a database sequence.
103 The type column must be one of SELECT, UNION, INTERSECT, or UNION. Most
104 queries, of course, are SELECT statements. Neither the query schema nor
105 qstore supports queries in the form of VALUES lists.
107 The use_all column indicates whether there will be an ALL clause on a
108 UNION, INTERSECT, or UNION. It is not meaningful for a SELECT.
110 The use_distinct column indicates whether there will be a DISTINCT
111 clause. It is meaningful only for a SELECT.
113 The from_clause column is meaningful only for a SELECT. It points to
114 the query.from_relation table to define the top-level or core relation
119 The where_clause and having_clause columns point to the query.expression
120 table to define a WHERE and HAVING clause, respectively. The
121 expressions must evaluate to a boolean result, or else PostgreSQL will
122 reject the query. These columns are meaningful only for a SELECT.
124 The limit_count and offset_count columns point to the query.expression
125 table to define values for a LIMIT and OFFSET clause, respectively. The
126 expressions must evaluate to a numeric result, or else PostgreSQL will
127 reject the query. These columns are meaningful only for a SELECT.
129 For GROUP BY clauses, see the section on the query.select_item table.
133 The query.query_sequence table defines the sequence of subordinate
134 queries within a UNION, INTERSECT, or EXCEPT query. It provides a layer
135 of indirection so that the same query can appear in multiple contexts.
137 Its columns are as follows:
139 id integer primary key
141 parent_query integer not null;
142 points to query.stored_query
144 seq_no integer not null
146 child_query integer not null
148 The id is normally assigned by a database sequence.
150 The parent_query column points to the UNION, INTERSECT, or EXCEPT query
151 to which the subordinate query is subordinate.
153 The seq_no column defines the placement of a given subordinate query
154 within the parent. No two subordinates of the same parent may have the
155 same value for seq_no.
157 The child_query column points to the subordinate query. Typically it
158 points to a SELECT, but it may point to a nested UNION, INTERSECT, or
163 The query.expression table is easily the most complicated of the tables
164 in the query schema. There are many types of expressions, and they may
165 be combined into structures of arbitrary complexity. Expressions may
166 appear in several different places within a query: in a SELECT list, in
167 a WHERE, ORDER BY, or ON clause, or as subexpressions within larger
170 Different kinds of expressions call for different combinations of
171 columns to be populated, as described in the Appendix. However the
172 following columns are relevant to all kinds of expressions:
174 id integer primary key
178 parenthesize boolean not null; default
181 parent_expr integer points to
184 seq_no integer not null;
187 negate boolean not null; default
190 The id is normally assigned by a database sequence.
192 The type column currently has sixteen possible values, which we will
193 examine briefly below after introducing the other columns.
195 If set to true, the parenthesize column tells qstore to enclose the
196 entire expression in parentheses. Usually qstore can figure out for
197 itself when it needs to insert parentheses, but this column is available
200 The parent_expr column identifies the larger expression to which a
201 subexpression belongs. It isn't needed for every subexpression; only
202 for those that may form series of two or more subexpressions, such as
203 the parameters of a function call or the branches of a CASE expression.
205 The seq_no column defines the sequence of subexpressions within the same
206 larger expression. No two expressions with the same parent expression
207 may have the same sequence number.
209 If true, the negate column tells qstore to negate the entire expression
210 by inserting a NOT somewhere. It is sensible to use it only when the
211 expression evaluates to a boolean result.
213 The usage of the remaining columns depends on the value of the type
214 column, as detailed in the Appendix. Here's a summary:
216 The literal column contains a number (as text) or a string literal. It
217 may also contain “true” or “false” as a boolean literal.
219 The column_name column contains the name of a column. It may optionally
220 be qualified by the table_alias column.
222 The left_operand and right_operand columns point to subexpressions to
223 appear with a designated operator. The left_operand operator is also
224 used to point to subexpressions in several other kinds of expressions,
225 such as IN expressions and casts.
227 The function_id column, pointing to a row in query.function_sig, is used
228 to express a function call.
230 The subquery column, pointing to a row in query.stored_query, refers to
233 The cast_type column, pointing to a row in query.datatype, is used to
234 express a CAST expression.
236 The bind_variable column, pointing to a row in query.bind_variable,
237 identifies a placeholder whose value will be supplied by the user when
238 he or she executes the query.
240 Currently there are sixteen allowed values for the type column,
241 signifying sixteen kinds of expressions:
243 1. xbet BETWEEN expression
244 2. xbind bind variable
245 3. xbool boolean literal
246 4. xcase CASE expression
247 5. xcast CAST expression
248 6. xcol column reference
249 7. xex EXISTS expression
250 8. xfunc function call
252 10. xisnull IS NULL expression
254 12. xnum numeric literal
255 13. xop operator with one or two operands
256 14. xser series of subexpressions separated by operators
258 15. xstr string literal
261 For each expression type there is an updatable view containing only the
262 columns that are relevant to that type. The name of the view is the
263 type prefaced by “expr_”; e.g.. query.exp_xbet.
265 Neither the query schema nor qstore tries to determine the datatype of
266 an expression. For example, you can encode a nonsensical expression
267 like 'W' + 3, or NOT CURRENT_DATE. Though qstore will blithely generate
268 the corresponding SQL, PostgreSQL will reject it.
272 A row in query.from_relation defines a table, view, function or subquery
273 in the FROM clause, from which the SELECT will draw its data.
275 Query.from_relation includes the following columns:
277 id integer primary key
285 subquery integer points to
286 query.stored_relation
288 function_call integer points to
293 parent_relation integer points to
296 seq_no integer not null;
301 on_clause integer points to
304 The id is normally assigned by a database sequence.
306 The type must be one of RELATION (meaning table or view), SUBQUERY, or
307 FUNCTION. Depending on the type, different combinations of the other
308 columns may be populated or not populated.
310 The table_name column may be populated for a RELATION to specify the
311 name of a table or view.
313 The class_name column is another way to specify a table or view for a
314 RELATION. If table_name is null, qstore looks up the class_name in the
315 IDL in order to get the name of the table or view – or in some cases the
316 body of a subquery defined in the IDL.
318 If the type is SUBQUERY, then the subquery column must point to a row in
319 query.stored_query. Otherwise this column has no meaning.
321 If the type is FUNCTION, then the function_call column must point to a
322 row in query.expression, and that row must represent a function call
323 expression. Otherwise this column has no meaning.
325 The table_alias column defines an alias to be used for the table, view,
326 subquery, or function. If table_alias is null, but class_name is
327 populated, then qstore will use the class_name as an alias.
329 The parent_relation column is used for joins. If a relation is joined
330 to the top-level relation (the one to which the query.stored_query table
331 points), then parent_relation points to the top level. Otherwise it
332 points to a relation that points to the top level, directly or
335 The seq_no field defines the sequence of relations with the same parent.
336 No two rows with the same value of parent_relation may have the same
339 If parent_relation is populated, then the join_type column must be
340 populated with one of INNER, LEFT, RIGHT or FULL to indicate the type of
343 The on_clause column is meaningful only if parent_relation is populated.
344 It points to a row in query.expression representing the join condition,
345 which must evaluate to a boolean result.
349 Each row in query.select_item represents an item in a SELECT list. The
350 columns are as follows:
352 id integer primary key
354 stored_query integer not null
356 seq_no integer not null
358 expression integer not null
362 grouped_by boolean not null; default false
364 The id is normally assigned by a database sequence.
366 The stored_query column points to the query to whose SELECT list the
367 item belongs. The query must be a SELECT.
369 The seq_no column defines the sequence of items within the SELECT list.
370 No two items within the same SELECT list may have the same value of
373 The expression column points to a row of any type in query.expression.
375 The column_alias column specifies a column alias to be supplied in an AS
376 clause. The generated SQL will enclose the column alias in double
379 The grouped_by column stipulates that the SELECT item be referenced in a
380 GROUP BY clause. The generated SQL references the item by its ordinal
381 position within the list, which may or may not be the same as the value
382 of the seq_no column. It's up to you to ensure that the resulting GROUP
383 BY clause is valid; i.e. if any item is in a GROUP BY clause, then every
384 other item that isn't an aggregate function must also be included in the
387 In SQL it is possible, though seldom useful, to include something in the
388 GROUP BY clause that is not included in the SELECT list. However the
389 query schema provides no way to encode such a query directly. The
390 workaround, should you ever need it, is to do the GROUP BY in a subquery
391 that includes everything it needs in the SELECT list, while the outer
392 query picks out only the items you want to keep.
396 Each row in query.order_by_item specifies an expression in an ORDER BY
397 list. Its columns are as follows:
399 id integer primary key
401 stored_query integer not null;
402 points to query.stored_query
404 seq_no integer not null
406 expression integer not null;
407 points to query.expression
409 The id is normally assigned by a database sequence.
411 The stored_query column identifies the query to which the ORDER BY
412 clause applies. This query must be a SELECT.
414 The seq_no column defines the sequence of items in the ORDER BY clause.
415 No two ORDER BY items for the same query may have the same value in
418 The expression column, pointing to a row in query.expression, identifies
419 an expression by which the query results will be sorted.
421 The generated ORDER BY clause includes the specified expressions bodily,
422 rather than by referring to items by their ordinal position in the
423 SELECT clause. As a result, you can include expressions that aren't in
424 the SELECT clause at all.
426 As a further result, the ORDER by clause becomes ugly and bulky if the
427 expressions are large and complicated. If you really want to reference
428 expressions in the SELECT list by number, use the corresponding numeric
429 constants as your ORDER BY expressions.
431 It may seem confusing that ORDER BY doesn't work the same way as GROUP
432 BY (see the discussion of the latter in the section on the
433 query.select_item table). In SQL, either clause can reference an
434 expression outside of the SELECT clause, but the query schema allows
435 such a reference only for ORDER BY. For GROUP BY you can get the same
436 effect only through an awkward workaround.
438 These design choices reflect a sense that having to use a workaround, in
439 order to list an expression not in the SELECT list, is more likely to be
440 a problem for ORDER BY than for GROUP BY.
444 The query.function_sig table stores information about function
447 id integer primary key
449 function_name text not null
451 return_type integer points to
454 is_aggregate boolean not null; default
457 The id is normally assigned by a database sequence.
459 The function_name column stores the name of the function.
461 The return_type column, pointing to a row in query.datatype, indicates
462 the return type of the function.
464 The is_aggregate column, if true, indicates that the function is an
465 aggregate function such as max() or sum(). Aggregate functions
466 typically don't have specific return types, because the effective return
467 type depends on the type of the argument.
469 Qstore pays attention only to the id and function_name columns; the
470 other two columns may be useful to the user interface. Likewise qstore
471 pays no attention to the query.function_param_def table, which defines
472 the datatypes of the function parameters.
476 The query schema represents a CASE expression as a row in
477 query.expression, with the type column set to “xcase”. For each branch
478 of the CASE expression there is a row in query.case_branch. Its columns
481 id integer primary key
483 parent_expr integer not null;
484 points to query.expression
486 seq_no integer not null
488 condition integer points to
491 result integer not null;
492 points to query.expression
494 The id is normally assigned by a database sequence.
496 The parent_expr column points to a row in query.expression representing
497 the entire CASE expression to which the branch belongs.
499 The seq_no column defines the sequence of branches within the CASE
500 expression. No two branches within the same CASE expression may have
501 the same value of seq_no.
503 The condition column, pointing to a row in query.expression, represents
504 a possible value of the expression being tested. In the generated SQL,
505 the corresponding expression will follow the WHEN keyword.
507 The result column, pointing to a row in query.expression, represents the
508 value to which the CASE expression evaluates if the branch is followed.
509 In the generated SQL, the corresponding expression will follow the THEN
512 If the condition column is null, then the branch is the ELSE branch.
513 There may be no more than one such branch in a given CASE statement,
514 and it must be the last branch.
518 The query schema represents a CAST expression with a row in
519 query.expression, where the type column is set to “xcast”. To identify
520 the datatype to which the operand is being cast, the query.row.datatype
521 column points to a row in query.datatype, which has the following
524 id integer primary key
526 datatype_name text not null
528 is_numeric boolean not null; default false
530 is_composite boolean not null; default
533 The id is normally assigned by a database sequence.
535 The datatype_name column, of course, the name of the datatype.
537 The is_numeric column, if true, indicates that the the type is numeric.
539 The is_composite column, if true, indicates that the datatype is
540 composed of two or more subfields, which may themselves be defined in
541 the query.subfield table.
543 Qstore pays attention only to the datatype_name and id columns. The
544 other two columns, and the query.subfield table, may be useful for the
549 The query.bind_variable table defines variables that may appear within
550 the query. Before executing the query, the user must supply a value for
551 each such variable, or accept the default value if one is defined. The
552 columns are as follows:
554 name text primary key
558 description text not null
564 The name column is the primary key, and contains the name of the
567 Depending on what kind of value the variable may hold, the type column
568 contains one of “string”, “number”, “string_list”, or “number_list”..
569 The first two denote individual scalar values, and the latter two
570 denote comma-separated lists of scalars. A null value may be encoded by
571 the JSON keyword “null”.
573 The description column describes the variable so that the user can know
576 The default_value column, if populated, contains the value that will be
577 used if the user does not specify some other value. This value must be
578 encoded as JSON; a list type must be encoded as a JSON array.
580 The label column is the identifier that will normally be shown to the
581 user. It should be reasonably short and descriptive, but it need not be
582 unique. The name provides uniqueness, and since it will mainly be used
583 internally, need not be as human-friendly as the label.
585 If qstore is asked to generate SQL for query with a bind variable that
586 has not been assigned a value, it will include the bind variable name in
587 the output SQL, preceded by a colon to mark it as a bind variable. Such
588 a query cannot be executed, but it can be displayed to the user for
591 Appendix: Expressions
593 A row in the query.expression table may represent any of several kinds
594 of expressions, as denoted by the contents of the type column. As noted
595 earlier, some of the columns in query.expression apply to all kinds of
596 expressions. The rest apply only to some kinds of expressions and not
597 to others, in various combinations.
599 This appendix discusses each expression type in turn, and how to
604 An “xbet” expression involves three subexpressions:
608 The left_operand column points to subexpression A. There must be
609 exactly two other rows representing subexpressions B and C, whose
610 parent_expr columns point to the “xbet” row.
612 The values of their seq_no columns determine which one comes first.
614 If the negate column is set to true, then the result is a NOT BETWEEN
619 An “xbind” expression refers to a bind variable, i.e. a value or series
620 of values that the user must supply before executing the query. In
621 query.expression, the bind_variable column points to a row in the
622 query.bind_variable table, which defines a name and a label for the
623 bind variable, and possibly a default value.
627 An “xbool” expression is a boolean literal. The literal column contains
628 “true” or “false” in any combination of upper, lower, or mixed case.
632 An “xcase” expression represents a CASE structure, as in the following
645 The left_operand column contains A, the value being tested. Each branch
646 of the CASE is represented by a row in query.case_branch, where the
647 condition column points to subexpressions B and D, and the result column
648 points to subexpressions C, E, and F. For the ELSE branch, the
649 condition column is null.
651 In the query.case_branch table, the seq_no column defines the order in
652 which the branches appear. If there is an ELSE branch, it must come
657 An “xcast” expression casts a subexpression to a datatype:
661 The left_operand column points to A, the expression being cast. The
662 cast_type column points to a row in query.datatype that defines the
665 xcol: Column Reference
667 An “xcol” expression refers to the contents of a column, optionally
668 qualified by an alias for a table, view, or other relation:
672 The column_name column contains the name of the column B. The
673 table_alias column, if not null, contains the alias A. Since qstore
674 always encloses the alias in quotation marks, there is no way to qualify
675 a column name by a raw table name.
679 An “xex” expression is an EXISTS clause with a subquery. The subquery
680 column points to a row in query.stored_query.
682 If the negate column is set to true, the result is a NOT EXISTS
687 An “xfunc” expression is a function call:
691 The function_id column points to a row in query.function_sig that
692 defines the function name A and other aspects of the function
693 signature.. Each parameter B, C, etc. is represented by a row in
694 query.expression, where parent_expr points to the “xfunc” row. The
695 seq_no columns for the various parameters define their positions within
698 If a function returns a composite type, it is possible to specify a
699 subfield of the return value:
701 (A( B, C, D ... )).”E”
703 In such a case, the column_name column contains the subfield name E.
705 Some built-in SQL functions don't follow the usual syntax of
706 parameter-passing. For example, the following function not only don't
707 accept any parameters, they don't even accept empty parentheses:
719 Qstore treats these functions as special exceptions in order to avoid
720 adding empty parentheses.
722 The extract function requires an extra keyword within the parameter
727 ...where A is one of a short list of unquoted strings. Qstore treats
728 calls to extract() as a special exception: pass A as if it were a string
729 literal, and qstore will build the call with a FROM and an unquoted A.
731 Qstore does not currently support other irregular functions.
735 An “xin” expression may take either of two forms. One form involves a
740 The left_operand column contains a pointer to another row in
741 query.expression, representing the value A to be tested. The subquery
742 column points to a row in query.stored_query, defining the subquery.
744 The other form involves a list of values:
748 Again, the left_operand indicates the value to be tested. Each value in
749 the list is represented by a row in query.expression whose parent_expr
750 column points to the “xin” row. The seq_no columns of the subexpression
751 rows define the order of their appearance.
753 If the negate column is set to true, then the result is a NOT IN
758 An “xisnull” expression tests whether a given value is null:
762 The left_operand column points to row in query.expression representing
763 the value to be tested.
765 If the negate column is set to true, then the result is an IS NOT NULL
770 An “xnull” expression represents a null value (and not a test for
775 An “xnum” expression represents a numeric literal. The literal column
776 contains the value as a string. This string may contain leading and/or
777 trailing white space, but otherwise must be numeric – possibly including
778 a leading minus sign, a decimal point, and/or scientific notation.
779 Currently this validation applies JSON's rules, which may differ in
780 some respects from SQL's rules.
784 An “xop” expression consists of an operator and one or two operands:
792 The operator column contains the operator as a string. This string may
793 contain any of the usual SQL operators. It may also contain a
794 non-standard custom operator, as long as it does not include white space
795 or a semicolon. (This support for custom operators was inherited from
796 json_query, where it makes sense. In qstore this support is unnecessary
797 and may be withdrawn in future releases.)
799 As special exceptions, the phrases "similar to", "is distinct from", and
800 "is not distinct from" may be used as binary operators, in any
801 combination of upper, lower, and mixed case, provided that they contain
802 no additional white space.
804 For a binary operator, then the left_operand column points to another
805 row in query.expression that represents the operand to the left of the
806 operator. Likewise the right_operand column identifies the expression
809 A few operators take only one operand. Accordingly only the
810 left_operand or right_operand column should be populated, depending on
811 whether the operand should appear to the left of the operator (such as
812 the factorial operator “!”) or to its right (such as the unary minus
817 An “xser” expression is a series of expressions separated by a specified
818 operator, or (if no operator is specifed) by commas:
820 A operator B operator C operator ... D
824 Typically the operator will be AND or OR, combining multiple conditions
825 in the WHERE clause. It is also possible to use, for example, an
826 arithmetic operator like “+”, or the concatenation operator “||”.
828 If the operator column is null, then qstore separates the expressions
829 with commas. By enclosing such a series in parentheses you can
832 Each subexpression in the series is represented by another row in
833 query.expression, whose parent_expr column points to the “xser” row.
834 The seq_no columns of the subexpressions define the order of their
835 appearance within the series.
837 The same operator is used for the entire series. If you need to combine
838 different operators in the same expression, as in A + B – C, then you
839 must nest multiple “xser” and “xop” expressions as needed.
841 Strictly speaking, the “xser” type isn't necessary. You can create all
842 the same expressions by nesting “xop” expressions, although it may be
843 rather cumbersome to do so. The “xser” type is merely a convenience,
844 making it easier to express certain common constructs.
846 xstr: Character String
848 An “xstr” expression consists of a character string, which must be
849 stored in the literal column. If the string contains any special
850 characters such as quotation marks or backslashes, qstore will escape
851 them as needed when it constructs the query.
855 An “xsubq” expression represents a subquery. The subquery column points
856 to a row in query.stored_query to identify the query.