1.6/development/introduction_to_sql.xml

   1 <?xml version="1.0" encoding="UTF-8"?>\r
   2 <chapter xml:id="intro_to_sql" xmlns="http://docbook.org/ns/docbook" version="5.0" xml:lang="EN"\r
   3     xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:xlink="http://www.w3.org/1999/xlink">\r
   4         <chapterinfo>\r
   5                 <title>Introduction to SQL for Evergreen Administrators</title>\r
   6         </chapterinfo>\r
   7         <abstract id="itnroSQL_abstract">\r
   8                 <simpara>This chapter was taken from Dan Scott's <emphasis>Introduction to SQL for Evergreen Administrators</emphasis>, February 2010.</simpara>\r
   9         </abstract>     \r
  10         <section id="intro_to_databases">\r
  11                         <title>Introduction to SQL Databases</title>\r
  12                         <indexterm><primary>sql</primary></indexterm>\r
  13                 <simplesect>\r
  14                         <title>Introduction</title>\r
  15                         <simpara>Over time, the SQL database has become the standard method of storing,\r
  16                         retrieving, and processing raw data for applications. Ranging from embedded\r
  17                         databases such as SQLite and Apache Derby, to enterprise databases such as\r
  18                         Oracle and IBM DB2, any SQL database offers basic advantages to application\r
  19                         developers such as standard interfaces (Structured Query Language (SQL), Java\r
  20                         Database Connectivity (JDBC), Open Database Connectivity (ODBC), Perl Database\r
  21                         Independent Interface (DBI)), a standard conceptual model of data (tables,\r
  22                         fields, relationships, constraints, etc), performance in storing and retrieving\r
  23                         data, concurrent access, etc.</simpara>\r
  24                         <simpara>Evergreen is built on PostgreSQL, an open source SQL database that began as\r
  25                         <literal>POSTGRES</literal> at the University of California at Berkeley in 1986 as a research\r
  26                         project led by Professor Michael Stonebraker. A SQL interface was added to a\r
  27                         fork of the original POSTGRES Berkelely code in 1994, and in 1996 the project\r
  28                         was renamed PostgreSQL.</simpara>\r
  29                 </simplesect>\r
  30                 <simplesect id="_tables">\r
  31                         <title>Tables</title>\r
  32                         <indexterm><primary>sql</primary><secondary>tables</secondary></indexterm>\r
  33                         <simpara>The table is the cornerstone of a SQL database. Conceptually, a database table\r
  34                         is similar to a single sheet in a spreadsheet: every table has one or more\r
  35                         columns, with each row in the table containing values for each column. Each\r
  36                         column in a table defines an attribute corresponding to a particular data type.</simpara>\r
  37                         <simpara>We&#8217;ll insert a row into a table, then display the resulting contents. Don&#8217;t\r
  38                         worry if the INSERT statement is completely unfamiliar, we&#8217;ll talk more about\r
  39                         the syntax of the insert statement later.</simpara>\r
  40                         <formalpara><title><literal>actor.usr_note</literal> database table</title><para>\r
  41 <programlisting language="sql" linenumbering="unnumbered">\r
  42 evergreen=# INSERT INTO actor.usr_note (usr, creator, pub, title, value)\r
  43     VALUES (1, 1, TRUE, 'Who is this guy?', 'He''s the administrator!');\r
  44 \r
  45 evergreen=# select id, usr, creator, pub, title, value from actor.usr_note;\r
  46  id | usr | creator | pub |      title       |          value\r
  47 ----+-----+---------+-----+------------------+-------------------------\r
  48   1 |   1 |       1 | t   | Who is this guy? | He's the administrator!\r
  49 (1 rows)\r
  50 </programlisting>\r
  51                         </para></formalpara>\r
  52                         <simpara>PostgreSQL supports table inheritance, which lets you define tables that\r
  53                         inherit the column definitions of a given parent table. A search of the data in\r
  54                         the parent table includes the data in the child tables. Evergreen uses table\r
  55                         inheritance: for example, the <literal>action.circulation</literal> table is a child of the\r
  56                         <literal>money.billable_xact</literal> table, and the <literal>money.*_payment</literal> tables all inherit from\r
  57                         the <literal>money.payment</literal> parent table.</simpara>\r
  58                 </simplesect>\r
  59                 <simplesect id="_schemas">\r
  60                         <title>Schemas</title>\r
  61                         <simpara>PostgreSQL, like most SQL databases, supports the use of schema names to group\r
  62                         collections of tables and other database objects together. You might think of\r
  63                         schemas as namespaces if you&#8217;re a programmer; or you might think of the schema\r
  64                         / table / column relationship like the area code / exchange / local number\r
  65                         structure of a telephone number.</simpara>\r
  66                         <table\r
  67                         frame="all"\r
  68                         rowsep="1" colsep="1"\r
  69                         >\r
  70                         <title>Examples: database object names</title>\r
  71                         <?dbhtml table-width="80%"?>\r
  72                         <?dbfo table-width="80%"?>\r
  73                         <tgroup cols="4">\r
  74                         <colspec colname="col_1" colwidth="1.0*"/>\r
  75                         <colspec colname="col_2" colwidth="1.0*"/>\r
  76                         <colspec colname="col_3" colwidth="1.0*"/>\r
  77                         <colspec colname="col_4" colwidth="1.0*"/>\r
  78                         <thead>\r
  79                         <row>\r
  80                         <entry align="left" valign="top">Full name </entry>\r
  81                         <entry align="left" valign="top">Schema name </entry>\r
  82                         <entry align="left" valign="top">Table name </entry>\r
  83                         <entry align="left" valign="top">Field name</entry>\r
  84                         </row>\r
  85                         </thead>\r
  86                         <tbody>\r
  87                         <row>\r
  88                         <entry align="left" valign="top"><simpara>actor.usr_note.title</simpara></entry>\r
  89                         <entry align="left" valign="top"><simpara>actor</simpara></entry>\r
  90                         <entry align="left" valign="top"><simpara>usr_note</simpara></entry>\r
  91                         <entry align="left" valign="top"><simpara>title</simpara></entry>\r
  92                         </row>\r
  93                         <row>\r
  94                         <entry align="left" valign="top"><simpara>biblio.record_entry.marc</simpara></entry>\r
  95                         <entry align="left" valign="top"><simpara>biblio</simpara></entry>\r
  96                         <entry align="left" valign="top"><simpara>record_entry</simpara></entry>\r
  97                         <entry align="left" valign="top"><simpara>marc</simpara></entry>\r
  98                         </row>\r
  99                         </tbody>\r
 100                         </tgroup>\r
 101                         </table>\r
 102                         <simpara>The default schema name in PostgreSQL is <literal>public</literal>, so if you do not specify a\r
 103                         schema name when creating or accessing a database object, PostgreSQL will use\r
 104                         the <literal>public</literal> schema. As a result, you might not find the object that you&#8217;re\r
 105                         looking for if you don&#8217;t use the appropriate schema.</simpara>\r
 106                         <formalpara><title>Example: Creating a table without a specific schema</title><para>\r
 107 <programlisting language="sql" linenumbering="unnumbered">\r
 108 evergreen=# CREATE TABLE foobar (foo TEXT, bar TEXT);\r
 109 CREATE TABLE\r
 110 evergreen=# \d foobar\r
 111    Table "public.foobar"\r
 112  Column | Type | Modifiers\r
 113 --------+------+-----------\r
 114  foo    | text |\r
 115  bar    | text |\r
 116 </programlisting>\r
 117                         </para></formalpara>\r
 118                         <formalpara><title>Example: Trying to access a unqualified table outside of the public schema</title><para>\r
 119                         <programlisting language="sql" linenumbering="unnumbered">evergreen=# SELECT * FROM usr_note;\r
 120                         ERROR:  relation "usr_note" does not exist\r
 121                         LINE 1: SELECT * FROM usr_note;\r
 122                                               ^</programlisting>\r
 123                         </para></formalpara>\r
 124                         <simpara>Evergreen uses schemas to organize all of its tables with mostly intuitive,\r
 125                         if short, schema names. Here&#8217;s the current (as of 2010-01-03) list of schemas\r
 126                         used by Evergreen:</simpara>\r
 127                         <table\r
 128                         frame="all"\r
 129                         rowsep="1" colsep="1"\r
 130                         >\r
 131                         <title>Evergreen schema names</title>\r
 132                         <?dbhtml table-width="80%"?>\r
 133                         <?dbfo table-width="80%"?>\r
 134                         <tgroup cols="2">\r
 135                         <colspec colname="col_1" colwidth="1.0*"/>\r
 136                         <colspec colname="col_2" colwidth="1.0*"/>\r
 137                         <thead>\r
 138                         <row>\r
 139                         <entry align="left" valign="top">Schema name </entry>\r
 140                         <entry align="left" valign="top">Description</entry>\r
 141                         </row>\r
 142                         </thead>\r
 143                         <tbody>\r
 144                         <row>\r
 145                         <entry align="left" valign="top"><simpara><literal>acq</literal></simpara></entry>\r
 146                         <entry align="left" valign="top"><simpara>Acquisitions</simpara></entry>\r
 147                         </row>\r
 148                         <row>\r
 149                         <entry align="left" valign="top"><simpara><literal>action</literal></simpara></entry>\r
 150                         <entry align="left" valign="top"><simpara>Circulation actions</simpara></entry>\r
 151                         </row>\r
 152                         <row>\r
 153                         <entry align="left" valign="top"><simpara><literal>action_trigger</literal></simpara></entry>\r
 154                         <entry align="left" valign="top"><simpara>Event mechanisms</simpara></entry>\r
 155                         </row>\r
 156                         <row>\r
 157                         <entry align="left" valign="top"><simpara><literal>actor</literal></simpara></entry>\r
 158                         <entry align="left" valign="top"><simpara>Evergreen users and organization units</simpara></entry>\r
 159                         </row>\r
 160                         <row>\r
 161                         <entry align="left" valign="top"><simpara><literal>asset</literal></simpara></entry>\r
 162                         <entry align="left" valign="top"><simpara>Call numbers and copies</simpara></entry>\r
 163                         </row>\r
 164                         <row>\r
 165                         <entry align="left" valign="top"><simpara><literal>auditor</literal></simpara></entry>\r
 166                         <entry align="left" valign="top"><simpara>Track history of changes to selected tables</simpara></entry>\r
 167                         </row>\r
 168                         <row>\r
 169                         <entry align="left" valign="top"><simpara><literal>authority</literal></simpara></entry>\r
 170                         <entry align="left" valign="top"><simpara>Authority records</simpara></entry>\r
 171                         </row>\r
 172                         <row>\r
 173                         <entry align="left" valign="top"><simpara><literal>biblio</literal></simpara></entry>\r
 174                         <entry align="left" valign="top"><simpara>Bibliographic records</simpara></entry>\r
 175                         </row>\r
 176                         <row>\r
 177                         <entry align="left" valign="top"><simpara><literal>booking</literal></simpara></entry>\r
 178                         <entry align="left" valign="top"><simpara>Resource bookings</simpara></entry>\r
 179                         </row>\r
 180                         <row>\r
 181                         <entry align="left" valign="top"><simpara><literal>config</literal></simpara></entry>\r
 182                         <entry align="left" valign="top"><simpara>Evergreen configurable options</simpara></entry>\r
 183                         </row>\r
 184                         <row>\r
 185                         <entry align="left" valign="top"><simpara><literal>container</literal></simpara></entry>\r
 186                         <entry align="left" valign="top"><simpara>Buckets for records, call numbers, copies, and users</simpara></entry>\r
 187                         </row>\r
 188                         <row>\r
 189                         <entry align="left" valign="top"><simpara><literal>extend_reporter</literal></simpara></entry>\r
 190                         <entry align="left" valign="top"><simpara>Extra views for report definitions</simpara></entry>\r
 191                         </row>\r
 192                         <row>\r
 193                         <entry align="left" valign="top"><simpara><literal>metabib</literal></simpara></entry>\r
 194                         <entry align="left" valign="top"><simpara>Metadata about bibliographic records</simpara></entry>\r
 195                         </row>\r
 196                         <row>\r
 197                         <entry align="left" valign="top"><simpara><literal>money</literal></simpara></entry>\r
 198                         <entry align="left" valign="top"><simpara>Fines and bills</simpara></entry>\r
 199                         </row>\r
 200                         <row>\r
 201                         <entry align="left" valign="top"><simpara><literal>offline</literal></simpara></entry>\r
 202                         <entry align="left" valign="top"><simpara>Offline transactions</simpara></entry>\r
 203                         </row>\r
 204                         <row>\r
 205                         <entry align="left" valign="top"><simpara><literal>permission</literal></simpara></entry>\r
 206                         <entry align="left" valign="top"><simpara>User permissions</simpara></entry>\r
 207                         </row>\r
 208                         <row>\r
 209                         <entry align="left" valign="top"><simpara><literal>query</literal></simpara></entry>\r
 210                         <entry align="left" valign="top"><simpara>Stored SQL statements</simpara></entry>\r
 211                         </row>\r
 212                         <row>\r
 213                         <entry align="left" valign="top"><simpara><literal>reporter</literal></simpara></entry>\r
 214                         <entry align="left" valign="top"><simpara>Report definitions</simpara></entry>\r
 215                         </row>\r
 216                         <row>\r
 217                         <entry align="left" valign="top"><simpara><literal>search</literal></simpara></entry>\r
 218                         <entry align="left" valign="top"><simpara>Search functions</simpara></entry>\r
 219                         </row>\r
 220                         <row>\r
 221                         <entry align="left" valign="top"><simpara><literal>serial</literal></simpara></entry>\r
 222                         <entry align="left" valign="top"><simpara>Serial MFHD records</simpara></entry>\r
 223                         </row>\r
 224                         <row>\r
 225                         <entry align="left" valign="top"><simpara><literal>stats</literal></simpara></entry>\r
 226                         <entry align="left" valign="top"><simpara>Convenient views of circulation and asset statistics</simpara></entry>\r
 227                         </row>\r
 228                         <row>\r
 229                         <entry align="left" valign="top"><simpara><literal>vandelay</literal></simpara></entry>\r
 230                         <entry align="left" valign="top"><simpara>MARC batch importer and exporter</simpara></entry>\r
 231                         </row>\r
 232                         </tbody>\r
 233                         </tgroup>\r
 234                         </table>\r
 235                         <note><simpara>The term <emphasis>schema</emphasis> has two meanings in the world of SQL databases. We have\r
 236                         discussed the schema as a conceptual grouping of tables and other database\r
 237                         objects within a given namespace; for example, "the <emphasis role="strong">actor</emphasis> schema contains the\r
 238                         tables and functions related to users and organizational units". Another common\r
 239                         usage of <emphasis>schema</emphasis> is to refer to the entire data model for a given database;\r
 240                         for example, "the Evergreen database schema".</simpara></note>\r
 241                 </simplesect>\r
 242                 <simplesect id="_columns">\r
 243                         <title>Columns</title>\r
 244                         <simpara>Each column definition consists of:</simpara>\r
 245                         <itemizedlist>\r
 246                         <listitem>\r
 247                         <simpara>\r
 248                         a data type\r
 249                         </simpara>\r
 250                         </listitem>\r
 251                         <listitem>\r
 252                         <simpara>\r
 253                         (optionally) a default value to be used whenever a row is inserted that\r
 254                              does not contain a specific value\r
 255                         </simpara>\r
 256                         </listitem>\r
 257                         <listitem>\r
 258                         <simpara>\r
 259                         (optionally) one or more constraints on the values beyond data type\r
 260                         </simpara>\r
 261                         </listitem>\r
 262                         </itemizedlist>\r
 263                         <simpara>Although PostgreSQL supports dozens of data types, Evergreen makes our life\r
 264                         easier by only using a handful.</simpara>\r
 265                         <table\r
 266                         frame="all"\r
 267                         rowsep="1" colsep="1"\r
 268                         >\r
 269                         <title>PostgreSQL data types used by Evergreen</title>\r
 270                         <?dbhtml table-width="90%"?>\r
 271                         <?dbfo table-width="90%"?>\r
 272                         <tgroup cols="3">\r
 273                         <colspec colname="col_1" colwidth="1.0*"/>\r
 274                         <colspec colname="col_2" colwidth="1.0*"/>\r
 275                         <colspec colname="col_3" colwidth="2.5*"/>\r
 276                         <thead>\r
 277                         <row>\r
 278                         <entry align="left" valign="top">Type name   </entry>\r
 279                         <entry align="left" valign="top">Description    </entry>\r
 280                         <entry align="left" valign="top">Limits</entry>\r
 281                         </row>\r
 282                         </thead>\r
 283                         <tbody>\r
 284                         <row>\r
 285                         <entry align="left" valign="top"><simpara><literal>INTEGER</literal></simpara></entry>\r
 286                         <entry align="left" valign="top"><simpara>Medium integer</simpara></entry>\r
 287                         <entry align="left" valign="top"><simpara>-2147483648 to +2147483647</simpara></entry>\r
 288                         </row>\r
 289                         <row>\r
 290                         <entry align="left" valign="top"><simpara><literal>BIGINT</literal></simpara></entry>\r
 291                         <entry align="left" valign="top"><simpara>Large integer</simpara></entry>\r
 292                         <entry align="left" valign="top"><simpara>-9223372036854775808 to 9223372036854775807</simpara></entry>\r
 293                         </row>\r
 294                         <row>\r
 295                         <entry align="left" valign="top"><simpara><literal>SERIAL</literal></simpara></entry>\r
 296                         <entry align="left" valign="top"><simpara>Sequential integer</simpara></entry>\r
 297                         <entry align="left" valign="top"><simpara>1 to 2147483647</simpara></entry>\r
 298                         </row>\r
 299                         <row>\r
 300                         <entry align="left" valign="top"><simpara><literal>BIGSERIAL</literal></simpara></entry>\r
 301                         <entry align="left" valign="top"><simpara>Large sequential integer</simpara></entry>\r
 302                         <entry align="left" valign="top"><simpara>1 to 9223372036854775807</simpara></entry>\r
 303                         </row>\r
 304                         <row>\r
 305                         <entry align="left" valign="top"><simpara><literal>TEXT</literal></simpara></entry>\r
 306                         <entry align="left" valign="top"><simpara>Variable length character data</simpara></entry>\r
 307                         <entry align="left" valign="top"><simpara>Unlimited length</simpara></entry>\r
 308                         </row>\r
 309                         <row>\r
 310                         <entry align="left" valign="top"><simpara><literal>BOOL</literal></simpara></entry>\r
 311                         <entry align="left" valign="top"><simpara>Boolean</simpara></entry>\r
 312                         <entry align="left" valign="top"><simpara>TRUE or FALSE</simpara></entry>\r
 313                         </row>\r
 314                         <row>\r
 315                         <entry align="left" valign="top"><simpara><literal>TIMESTAMP WITH TIME ZONE</literal></simpara></entry>\r
 316                         <entry align="left" valign="top"><simpara>Timestamp</simpara></entry>\r
 317                         <entry align="left" valign="top"><simpara>4713 BC to 294276 AD</simpara></entry>\r
 318                         </row>\r
 319                         <row>\r
 320                         <entry align="left" valign="top"><simpara><literal>TIME</literal></simpara></entry>\r
 321                         <entry align="left" valign="top"><simpara>Time</simpara></entry>\r
 322                         <entry align="left" valign="top"><simpara>Expressed in HH:MM:SS</simpara></entry>\r
 323                         </row>\r
 324                         <row>\r
 325                         <entry align="left" valign="top"><simpara><literal>NUMERIC</literal>(precision, scale)</simpara></entry>\r
 326                         <entry align="left" valign="top"><simpara>Decimal</simpara></entry>\r
 327                         <entry align="left" valign="top"><simpara>Up to 1000 digits of precision. In Evergreen mostly used for money\r
 328                         values, with a precision of 6 and a scale of 2 (<literal>####.##</literal>).</simpara></entry>\r
 329                         </row>\r
 330                         </tbody>\r
 331                         </tgroup>\r
 332                         </table>\r
 333                         <simpara>Full details about these data types are available from the\r
 334                         <ulink url="http://www.postgresql.org/docs/8.4/static/datatype.html">data types section of\r
 335                         the PostgreSQL manual</ulink>.</simpara>\r
 336                 </simplesect>\r
 337                 <simplesect id="_constraints">\r
 338                         <title>Constraints</title>\r
 339                         <simplesect id="_prevent_null_values">\r
 340                                 <title>Prevent NULL values</title>\r
 341                                 <simpara>A column definition may include the constraint <literal>NOT NULL</literal> to prevent NULL\r
 342                                 values. In PostgreSQL, a NULL value is not the equivalent of zero or false or\r
 343                                 an empty string; it is an explicit non-value with special properties. We&#8217;ll\r
 344                                 talk more about how to work with NULL values when we get to queries.</simpara>\r
 345                         </simplesect>\r
 346                         <simplesect id="_primary_key">\r
 347                                 <title>Primary key</title>\r
 348                                 <simpara>Every table can have at most one primary key. A primary key consists of one or\r
 349                                 more columns which together uniquely identify each row in a table. If you\r
 350                                 attempt to insert a row into a table that would create a duplicate or NULL\r
 351                                 primary key entry, the database rejects the row and returns an error.</simpara>\r
 352                                 <simpara>Natural primary keys are drawn from the intrinsic properties of the data being\r
 353                                 modelled. For example, some potential natural primary keys for a table that\r
 354                                 contains people would be:</simpara>\r
 355                                 <table\r
 356                                 frame="all"\r
 357                                 rowsep="1" colsep="1"\r
 358                                 >\r
 359                                 <title>Example: Some potential natural primary keys for a table of people</title>\r
 360                                 <?dbhtml table-width="90%"?>\r
 361                                 <?dbfo table-width="90%"?>\r
 362                                 <tgroup cols="3">\r
 363                                 <colspec colname="col_1" colwidth="1.0*"/>\r
 364                                 <colspec colname="col_2" colwidth="2.0*"/>\r
 365                                 <colspec colname="col_3" colwidth="2.0*"/>\r
 366                                 <thead>\r
 367                                 <row>\r
 368                                 <entry align="left" valign="top">Natural key </entry>\r
 369                                 <entry align="left" valign="top">Pros </entry>\r
 370                                 <entry align="left" valign="top">Cons</entry>\r
 371                                 </row>\r
 372                                 </thead>\r
 373                                 <tbody>\r
 374                                 <row>\r
 375                                 <entry align="left" valign="top"><simpara>First name, last name, address</simpara></entry>\r
 376                                 <entry align="left" valign="top"><simpara>No two people with the same name would ever live at the same address, right?</simpara></entry>\r
 377                                 <entry align="left" valign="top"><simpara>Lots of columns force data duplication in referencing tables</simpara></entry>\r
 378                                 </row>\r
 379                                 <row>\r
 380                                 <entry align="left" valign="top"><simpara>SSN or driver&#8217;s license</simpara></entry>\r
 381                                 <entry align="left" valign="top"><simpara>These are guaranteed to be unique</simpara></entry>\r
 382                                 <entry align="left" valign="top"><simpara>Lots of people don&#8217;t have an SSN or a driver&#8217;s license</simpara></entry>\r
 383                                 </row>\r
 384                                 </tbody>\r
 385                                 </tgroup>\r
 386                                 </table>\r
 387                                 <simpara>To avoid problems with natural keys, many applications instead define surrogate\r
 388                                 primary keys. A surrogate primary keys is a column with an autoincrementing\r
 389                                 integer value added to a table definition that ensures uniqueness.</simpara>\r
 390                                 <simpara>Evergreen uses surrogate keys (a column named <literal>id</literal> with a <literal>SERIAL</literal> data type)\r
 391                                 for most of its tables.</simpara>\r
 392                         </simplesect>\r
 393                         <simplesect id="_foreign_keys">\r
 394                                 <title>Foreign keys</title>\r
 395                                 <simpara>Every table can contain zero or more foreign keys: one or more columns that\r
 396                                 refer to the primary key of another table.</simpara>\r
 397                                 <simpara>For example, let&#8217;s consider Evergreen&#8217;s modelling of the basic relationship\r
 398                                 between copies, call numbers, and bibliographic records. Bibliographic records\r
 399                                 contained in the <literal>biblio.record_entry</literal> table can have call numbers attached to\r
 400                                 them. Call numbers are contained in the <literal>asset.call_number</literal> table, and they can\r
 401                                 have copies attached to them. Copies are contained in the <literal>asset.copy</literal> table.</simpara>\r
 402                                 <table\r
 403                                 frame="all"\r
 404                                 rowsep="1" colsep="1"\r
 405                                 >\r
 406                                 <title>Example: Evergreen&#8217;s copy / call number / bibliographic record relationships</title>\r
 407                                 <?dbhtml table-width="100%"?>\r
 408                                 <?dbfo table-width="100%"?>\r
 409                                 <tgroup cols="4">\r
 410                                 <colspec colname="col_1" colwidth="1.0*"/>\r
 411                                 <colspec colname="col_2" colwidth="1.0*"/>\r
 412                                 <colspec colname="col_3" colwidth="1.0*"/>\r
 413                                 <colspec colname="col_4" colwidth="1.0*"/>\r
 414                                 <thead>\r
 415                                 <row>\r
 416                                 <entry align="left" valign="top">Table </entry>\r
 417                                 <entry align="left" valign="top">Primary key </entry>\r
 418                                 <entry align="left" valign="top">Column with a foreign key </entry>\r
 419                                 <entry align="left" valign="top">Points to</entry>\r
 420                                 </row>\r
 421                                 </thead>\r
 422                                 <tbody>\r
 423                                 <row>\r
 424                                 <entry align="left" valign="top"><simpara>asset.copy</simpara></entry>\r
 425                                 <entry align="left" valign="top"><simpara>asset.copy.id</simpara></entry>\r
 426                                 <entry align="left" valign="top"><simpara>asset.copy.call_number</simpara></entry>\r
 427                                 <entry align="left" valign="top"><simpara>asset.call_number.id</simpara></entry>\r
 428                                 </row>\r
 429                                 <row>\r
 430                                 <entry align="left" valign="top"><simpara>asset.call_number</simpara></entry>\r
 431                                 <entry align="left" valign="top"><simpara>asset.call_number.id</simpara></entry>\r
 432                                 <entry align="left" valign="top"><simpara>asset.call_number.record</simpara></entry>\r
 433                                 <entry align="left" valign="top"><simpara>biblio.record_entry.id</simpara></entry>\r
 434                                 </row>\r
 435                                 <row>\r
 436                                 <entry align="left" valign="top"><simpara>biblio.record_entry</simpara></entry>\r
 437                                 <entry align="left" valign="top"><simpara>biblio.record_entry.id</simpara></entry>\r
 438                                 <entry align="left" valign="top"><simpara></simpara></entry>\r
 439                                 <entry align="left" valign="top"><simpara></simpara></entry>\r
 440                                 </row>\r
 441                                 </tbody>\r
 442                                 </tgroup>\r
 443                                 </table>\r
 444                         </simplesect>\r
 445                         <simplesect id="_check_constraints">\r
 446                                 <title>Check constraints</title>\r
 447                                 <simpara>PostgreSQL enables you to define rules to ensure that the value to be inserted\r
 448                                 or updated meets certain conditions. For example, you can ensure that an\r
 449                                 incoming integer value is within a specific range, or that a ZIP code matches a\r
 450                                 particular pattern.</simpara>\r
 451                         </simplesect>\r
 452                 </simplesect>\r
 453                 <simplesect id="_deconstructing_a_table_definition_statement">\r
 454                         <title>Deconstructing a table definition statement</title>\r
 455                         <simpara>The <literal>actor.org_address</literal> table is a simple table in the Evergreen schema that\r
 456                         we can use as a concrete example of many of the properties of databases that\r
 457                         we have discussed so far.</simpara>\r
 458 <programlisting language="sql" linenumbering="unnumbered">\r
 459 CREATE TABLE actor.org_address (\r
 460   id            SERIAL  PRIMARY KEY,      <co id="sqlCO1-1"/>\r
 461   valid         BOOL    NOT NULL DEFAULT TRUE, <co id="sqlCO1-2"/>\r
 462   address_type  TEXT    NOT NULL DEFAULT 'MAILING', <co id="sqlCO1-3"/>\r
 463   org_unit      INT     NOT NULL REFERENCES actor.org_unit (id)  <co id="sqlCO1-4"/>\r
 464                           DEFERRABLE INITIALLY DEFERRED,\r
 465   street1       TEXT    NOT NULL,\r
 466   street2       TEXT, <co id="sqlCO1-5"/>\r
 467   city          TEXT    NOT NULL,\r
 468   county        TEXT,\r
 469   state         TEXT    NOT NULL,\r
 470   country       TEXT    NOT NULL,\r
 471   post_code     TEXT    NOT NULL\r
 472 );\r
 473 </programlisting>\r
 474                         <calloutlist>\r
 475                         <callout arearefs="sqlCO1-1">\r
 476                         <simpara>\r
 477                         The column named <literal>id</literal> is defined with a special data type of <literal>SERIAL</literal>; if\r
 478                         given no value when a row is inserted into a table, the database automatically\r
 479                         generates the next sequential integer value for the column. <literal>SERIAL</literal> is a\r
 480                         popular data type for a primary key because it is guaranteed to be unique - and\r
 481                         indeed, the constraint for this column identifies it as the <literal>PRIMARY KEY</literal>.\r
 482                         </simpara>\r
 483                         </callout>\r
 484                         <callout arearefs="sqlCO1-2">\r
 485                         <simpara>\r
 486                         The data type <literal>BOOL</literal> defines a boolean value: <literal>TRUE</literal> or <literal>FALSE</literal> are the only\r
 487                         acceptable values for the column. The constraint <literal>NOT NULL</literal> instructs the\r
 488                         database to prevent the column from ever containing a NULL value. The column\r
 489                         property <literal>DEFAULT TRUE</literal> instructs the database to automatically set the value\r
 490                         of the column to <literal>TRUE</literal> if no value is provided.\r
 491                         </simpara>\r
 492                         </callout>\r
 493                         <callout arearefs="sqlCO1-3">\r
 494                         <simpara>\r
 495                         The data type <literal>TEXT</literal> defines a text column of practically unlimited length.\r
 496                         As with the previous column, there is a <literal>NOT NULL</literal> constraint, and a default\r
 497                         value of <literal>'MAILING'</literal> will result if no other value is supplied.\r
 498                         </simpara>\r
 499                         </callout>\r
 500                         <callout arearefs="sqlCO1-4">\r
 501                         <simpara>\r
 502                         The <literal>REFERENCES actor.org_unit (id)</literal> clause indicates that this column has a\r
 503                         foreign key relationship to the <literal>actor.org_unit</literal> table, and that the value of\r
 504                         this column in every row in this table must have a corresponding value in the\r
 505                         <literal>id</literal> column in the referenced table (<literal>actor.org_unit</literal>).\r
 506                         </simpara>\r
 507                         </callout>\r
 508                         <callout arearefs="sqlCO1-5">\r
 509                         <simpara>\r
 510                         The column named <literal>street2</literal> demonstrates that not all columns have constraints\r
 511                         beyond data type. In this case, the column is allowed to be NULL or to contain a\r
 512                         <literal>TEXT</literal> value.\r
 513                         </simpara>\r
 514                         </callout>\r
 515                         </calloutlist>\r
 516                 </simplesect>\r
 517                 <simplesect id="_displaying_a_table_definition_using_literal_psql_literal">\r
 518                         <title>Displaying a table definition using <literal>psql</literal></title>\r
 519                         <simpara>The <literal>psql</literal> command-line interface is the preferred method for accessing\r
 520                         PostgreSQL databases. It offers features like tab-completion, readline support\r
 521                         for recalling previous commands, flexible input and output formats, and\r
 522                         is accessible via a standard SSH session.</simpara>\r
 523                         <simpara>If you press the <literal>Tab</literal> key once after typing one or more characters of the\r
 524                         database object name, <literal>psql</literal> automatically completes the name if there are no\r
 525                         other matches. If there are other matches for your current input, nothing\r
 526                         happens until you press the <literal>Tab</literal> key a second time, at which point <literal>psql</literal>\r
 527                         displays all of the matches for your current input.</simpara>\r
 528                         <simpara>To display the definition of a database object such as a table, issue the\r
 529                         command <literal>\d _object-name_</literal>. For example, to display the definition of the\r
 530                         actor.usr_note table:</simpara>\r
 531 <programlisting language="sh" linenumbering="unnumbered">\r
 532 $ psql evergreen <co id="sqlCO2-1"/>\r
 533 psql (8.4.1)\r
 534 Type "help" for help.\r
 535 \r
 536 evergreen=# \d actor.usr_note  <co id="sqlCO2-2"/>\r
 537                                         Table "actor.usr_note"\r
 538    Column    |           Type           |                          Modifiers\r
 539 -------------+--------------------------+-------------------------------------------------------------\r
 540  id          | bigint                   | not null default nextval('actor.usr_note_id_seq'::regclass)\r
 541  usr         | bigint                   | not null\r
 542  creator     | bigint                   | not null\r
 543  create_date | timestamp with time zone | default now()\r
 544  pub         | boolean                  | not null default false\r
 545  title       | text                     | not null\r
 546  value       | text                     | not null\r
 547 Indexes:\r
 548     "usr_note_pkey" PRIMARY KEY, btree (id)\r
 549     "actor_usr_note_creator_idx" btree (creator)\r
 550     "actor_usr_note_usr_idx" btree (usr)\r
 551 Foreign-key constraints:\r
 552     "usr_note_creator_fkey" FOREIGN KEY (creator) REFERENCES actor.usr(id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED\r
 553     "usr_note_usr_fkey" FOREIGN KEY (usr) REFERENCES actor.usr(id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED\r
 554 \r
 555 evergreen=# \q    <co id="sqlCO2-3"/>\r
 556 $\r
 557 </programlisting>\r
 558                         <calloutlist>\r
 559                         <callout arearefs="sqlCO2-1">\r
 560                         <simpara>\r
 561                         This is the most basic connection to a PostgreSQL database. You can use a\r
 562                         number of other flags to specify user name, hostname, port, and other options.\r
 563                         </simpara>\r
 564                         </callout>\r
 565                         <callout arearefs="sqlCO2-2">\r
 566                         <simpara>\r
 567                         The <literal>\d</literal> command displays the definition of a database object.\r
 568                         </simpara>\r
 569                         </callout>\r
 570                         <callout arearefs="sqlCO2-3">\r
 571                         <simpara>\r
 572                         The <literal>\q</literal> command quits the <literal>psql</literal> session and returns you to the shell prompt.\r
 573                         </simpara>\r
 574                         </callout>\r
 575                         </calloutlist>\r
 576                 </simplesect>\r
 577         </section>\r
 578         <section id="basic_sql_queries">\r
 579                 <title>Basic SQL queries</title>\r
 580                 <simplesect id="_the_select_statement">\r
 581                         <title>The SELECT statement</title>\r
 582                         <simpara>The SELECT statement is the basic tool for retrieving information from a\r
 583                         database. The syntax for most SELECT statements is:</simpara>\r
 584                         <blockquote>\r
 585                         <literallayout><literal>SELECT</literal> [<emphasis>columns(s)</emphasis>]\r
 586                           <literal>FROM</literal> [<emphasis>table(s)</emphasis>]\r
 587                           [<literal>WHERE</literal> <emphasis>condition(s)</emphasis>]\r
 588                           [<literal>GROUP BY</literal> <emphasis>columns(s)</emphasis>]\r
 589                           [<literal>HAVING</literal> <emphasis>grouping-condition(s)</emphasis>]\r
 590                           [<literal>ORDER BY</literal> <emphasis>column(s)</emphasis>]\r
 591                           [<literal>LIMIT</literal> <emphasis>maximum-results</emphasis>]\r
 592                           [<literal>OFFSET</literal> <emphasis>start-at-result-#</emphasis>]\r
 593                         ;</literallayout>\r
 594                         </blockquote>\r
 595                         <simpara>For example, to select all of the columns for each row in the\r
 596                         <literal>actor.usr_address</literal> table, issue the following query:</simpara>\r
 597                         <programlisting language="sql" linenumbering="unnumbered">SELECT *\r
 598                           FROM actor.usr_address\r
 599                         ;</programlisting>\r
 600                 </simplesect>\r
 601                 <simplesect id="_selecting_particular_columns_from_a_table">\r
 602                         <title>Selecting particular columns from a table</title>\r
 603                         <simpara><literal>SELECT *</literal> returns all columns from all of the tables included in your query.\r
 604                         However, quite often you will want to return only a subset of the possible\r
 605                         columns. You can retrieve specific columns by listing the names of the columns\r
 606                         you want after the <literal>SELECT</literal> keyword. Separate each column name with a comma.</simpara>\r
 607                         <simpara>For example, to select just the city, county, and state from the\r
 608                         actor.usr_address table, issue the following query:</simpara>\r
 609                         <programlisting language="sql" linenumbering="unnumbered">SELECT city, county, state\r
 610                           FROM actor.usr_address\r
 611                         ;</programlisting>\r
 612                 </simplesect>\r
 613                 <simplesect id="_sorting_results_with_the_order_by_clause">\r
 614                         <title>Sorting results with the ORDER BY clause</title>\r
 615                         <simpara>By default, a SELECT statement returns rows matching your query with no\r
 616                         guarantee of any particular order in which they are returned. To force\r
 617                         the rows to be returned in a particular order, use the ORDER BY clause\r
 618                         to specify one or more columns to determine the sorting priority of the\r
 619                         rows.</simpara>\r
 620                         <simpara>For example, to sort the rows returned from your <literal>actor.usr_address</literal> query by\r
 621                         city, with county and then zip code as the tie breakers, issue the\r
 622                         following query:</simpara>\r
 623 <programlisting language="sql" linenumbering="unnumbered">\r
 624 SELECT city, county, state\r
 625   FROM actor.usr_address\r
 626   ORDER BY city, county, post_code\r
 627 ;\r
 628 </programlisting>\r
 629                 </simplesect>\r
 630                 <simplesect id="_filtering_results_with_the_where_clause">\r
 631                         <title>Filtering results with the WHERE clause</title>\r
 632                         <simpara>Thus far, your results have been returning all of the rows in the table.\r
 633                         Normally, however, you would want to restrict the rows that are returned to the\r
 634                         subset of rows that match one or more conditions of your search. The <literal>WHERE</literal>\r
 635                         clause enables you to specify a set of conditions that filter your query\r
 636                         results. Each condition in the <literal>WHERE</literal> clause is an SQL expression that returns\r
 637                         a boolean (true or false) value.</simpara>\r
 638                         <simpara>For example, to restrict the results returned from your <literal>actor.usr_address</literal>\r
 639                         query to only those rows containing a state value of <emphasis>Connecticut</emphasis>, issue the\r
 640                         following query:</simpara>\r
 641 <programlisting language="sql" linenumbering="unnumbered">\r
 642 SELECT city, county, state\r
 643   FROM actor.usr_address\r
 644   WHERE state = 'Connecticut'\r
 645   ORDER BY city, county, post_code\r
 646 ;\r
 647 </programlisting>\r
 648                         <simpara>You can include more conditions in the <literal>WHERE</literal> clause with the <literal>OR</literal> and <literal>AND</literal>\r
 649                         operators. For example, to further restrict the results returned from your\r
 650                         <literal>actor.usr_address</literal> query to only those rows where the state column contains a\r
 651                         value of <emphasis>Connecticut</emphasis> and the city column contains a value of <emphasis>Hartford</emphasis>,\r
 652                         issue the following query:</simpara>\r
 653 <programlisting language="sql" linenumbering="unnumbered">\r
 654 SELECT city, county, state\r
 655   FROM actor.usr_address\r
 656   WHERE state = 'Connecticut'\r
 657     AND city = 'Hartford'\r
 658   ORDER BY city, county, post_code\r
 659 ;\r
 660 </programlisting>\r
 661                         <note><simpara>To return rows where the state is <emphasis>Connecticut</emphasis> and the city is <emphasis>Hartford</emphasis> or\r
 662                         <emphasis>New Haven</emphasis>, you must use parentheses to explicitly group the city value\r
 663                         conditions together, or else the database will evaluate the <literal>OR city = 'New\r
 664                         Haven'</literal> clause entirely on its own and match all rows where the city column is\r
 665                         <emphasis>New Haven</emphasis>, even though the state might not be <emphasis>Connecticut</emphasis>.</simpara></note>\r
 666                         <formalpara><title>Trouble with OR</title><para>\r
 667 <programlisting language="sql" linenumbering="unnumbered">\r
 668 SELECT city, county, state\r
 669   FROM actor.usr_address\r
 670   WHERE state = 'Connecticut'\r
 671     AND city = 'Hartford' OR city = 'New Haven'\r
 672   ORDER BY city, county, post_code\r
 673 ;\r
 674 \r
 675 -- Can return unwanted rows because the OR is not grouped!\r
 676 </programlisting>\r
 677                         </para></formalpara>\r
 678                         <formalpara><title>Grouped OR&#8217;ed conditions</title><para>\r
 679 <programlisting language="sql" linenumbering="unnumbered">\r
 680 SELECT city, county, state\r
 681   FROM actor.usr_address\r
 682   WHERE state = 'Connecticut'\r
 683     AND (city = 'Hartford' OR city = 'New Haven')\r
 684   ORDER BY city, county, post_code\r
 685 ;\r
 686 \r
 687 -- The parentheses ensure that the OR is applied to the cities, and the\r
 688 -- state in either case must be 'Connecticut'\r
 689 </programlisting>\r
 690                         </para></formalpara>\r
 691                         <simplesect id="_comparison_operators">\r
 692                                 <title>Comparison operators</title>\r
 693                                 <simpara>Here is a partial list of comparison operators that are commonly used in\r
 694                                 <literal>WHERE</literal> clauses:</simpara>\r
 695                                 <simplesect id="_comparing_two_scalar_values">\r
 696                                         <title>Comparing two scalar values</title>\r
 697                                         <itemizedlist>\r
 698                                         <listitem>\r
 699                                         <simpara>\r
 700                                         <literal>x = y</literal> (equal to)\r
 701                                         </simpara>\r
 702                                         </listitem>\r
 703                                         <listitem>\r
 704                                         <simpara>\r
 705                                         <literal>x != y</literal> (not equal to)\r
 706                                         </simpara>\r
 707                                         </listitem>\r
 708                                         <listitem>\r
 709                                         <simpara>\r
 710                                         <literal>x &lt; y</literal> (less than)\r
 711                                         </simpara>\r
 712                                         </listitem>\r
 713                                         <listitem>\r
 714                                         <simpara>\r
 715                                         <literal>x &gt; y</literal> (greater than)\r
 716                                         </simpara>\r
 717                                         </listitem>\r
 718                                         <listitem>\r
 719                                         <simpara>\r
 720                                         <literal>x LIKE y</literal> (TEXT value x matches a subset of TEXT y, where y is a string that\r
 721                                         can contain <emphasis>%</emphasis> as a wildcard for 0 or more characters, and <emphasis>_</emphasis> as a wildcard\r
 722                                         for a single character. For example, <literal>WHERE 'all you can eat fish and chips\r
 723                                         and a big stick' LIKE '%fish%stick'</literal> would return TRUE)\r
 724                                         </simpara>\r
 725                                         </listitem>\r
 726                                         <listitem>\r
 727                                         <simpara>\r
 728                                         <literal>x ILIKE y</literal> (like LIKE, but the comparison ignores upper-case / lower-case)\r
 729                                         </simpara>\r
 730                                         </listitem>\r
 731                                         <listitem>\r
 732                                         <simpara>\r
 733                                         <literal>x IN y</literal> (x is in the list of values y, where y can be a list or a SELECT\r
 734                                         statement that returns a list)\r
 735                                         </simpara>\r
 736                                         </listitem>\r
 737                                         </itemizedlist>\r
 738                                 </simplesect>\r
 739                         </simplesect>\r
 740                 </simplesect>\r
 741                 <simplesect id="_null_values">\r
 742                         <title>NULL values</title>\r
 743                         <simpara>SQL databases have a special way of representing the value of a column that has\r
 744                         no value: <literal>NULL</literal>. A <literal>NULL</literal> value is not equal to zero, and is not an empty\r
 745                         string; it is equal to nothing, not even another <literal>NULL</literal>, because it has no value\r
 746                         that can be compared.</simpara>\r
 747                         <simpara>To return rows from a table where a given column is not <literal>NULL</literal>, use the\r
 748                         <literal>IS NOT NULL</literal> comparison operator.</simpara>\r
 749                         <formalpara><title>Retrieving rows where a column is not <literal>NULL</literal></title><para>\r
 750 <programlisting language="sql" linenumbering="unnumbered">\r
 751 SELECT id, first_given_name, family_name\r
 752   FROM actor.usr\r
 753   WHERE second_given_name IS NOT NULL\r
 754 ;\r
 755 </programlisting>\r
 756                         </para></formalpara>\r
 757                         <simpara>Similarly, to return rows from a table where a given column is <literal>NULL</literal>, use\r
 758                         the <literal>IS NULL</literal> comparison operator.</simpara>\r
 759                         <formalpara><title>Retrieving rows where a column is <literal>NULL</literal></title><para>\r
 760 <programlisting language="sql" linenumbering="unnumbered">\r
 761 SELECT id, first_given_name, second_given_name, family_name\r
 762   FROM actor.usr\r
 763   WHERE second_given_name IS NULL\r
 764 ;\r
 765 \r
 766  id | first_given_name | second_given_name |  family_name\r
 767 ----+------------------+-------------------+----------------\r
 768   1 | Administrator    |                   | System Account\r
 769 (1 row)\r
 770 </programlisting>\r
 771                         </para></formalpara>\r
 772                         <simpara>Notice that the <literal>NULL</literal> value in the output is displayed as empty space,\r
 773                         indistinguishable from an empty string; this is the default display method in\r
 774                         <literal>psql</literal>. You can change the behaviour of <literal>psql</literal> using the <literal>pset</literal> command:</simpara>\r
 775                         <formalpara><title>Changing the way <literal>NULL</literal> values are displayed in <literal>psql</literal></title><para>\r
 776 <programlisting language="sql" linenumbering="unnumbered">\r
 777 evergreen=# \pset null '(null)'\r
 778 Null display is '(null)'.\r
 779 \r
 780 SELECT id, first_given_name, second_given_name, family_name\r
 781   FROM actor.usr\r
 782   WHERE second_given_name IS NULL\r
 783 ;\r
 784 \r
 785  id | first_given_name | second_given_name |  family_name\r
 786 ----+------------------+-------------------+----------------\r
 787   1 | Administrator    | (null)            | System Account\r
 788 (1 row)\r
 789 </programlisting>\r
 790                         </para></formalpara>\r
 791                         <simpara>Database queries within programming languages such as Perl and C have\r
 792                         special methods of checking for <literal>NULL</literal> values in returned results.</simpara>\r
 793                 </simplesect>\r
 794                 <simplesect id="_text_delimiter">\r
 795                         <title>Text delimiter: '</title>\r
 796                         <simpara>You might have noticed that we have been using the <literal>'</literal> character to delimit\r
 797                         TEXT values and values such as dates and times that are TEXT values. Sometimes,\r
 798                         however, your TEXT value itself contains a <literal>'</literal> character, such as the word\r
 799                         <literal>you&#8217;re</literal>. To prevent the database from prematurely ending the TEXT value at the\r
 800                         first <literal>'</literal> character and returning a syntax error, use another <literal>'</literal> character to\r
 801                         escape the following <literal>'</literal> character.</simpara>\r
 802                         <simpara>For example, to change the last name of a user in the <literal>actor.usr</literal> table to\r
 803                         <literal>L&#8217;estat</literal>, issue the following SQL:</simpara>\r
 804                         <formalpara><title>Escaping <literal>'</literal> in TEXT values</title><para>\r
 805 <programlisting language="sql" linenumbering="unnumbered">\r
 806 UPDATE actor.usr\r
 807   SET family_name = 'L''estat'\r
 808   WHERE profile IN (\r
 809     SELECT id\r
 810       FROM permission.grp_tree\r
 811       WHERE name = 'Vampire'\r
 812   )\r
 813         ;</programlisting>\r
 814                         </para></formalpara>\r
 815                         <simpara>When you retrieve the row from the database, the value is displayed with just\r
 816                         a single <literal>'</literal> character:</simpara>\r
 817 <programlisting language="sql" linenumbering="unnumbered">\r
 818 SELECT id, family_name\r
 819   FROM actor.usr\r
 820   WHERE family_name = 'L''estat'\r
 821 ;\r
 822 \r
 823  id | family_name\r
 824 ----+-------------\r
 825   1 | L'estat\r
 826 (1 row)\r
 827 </programlisting>\r
 828                 </simplesect>\r
 829                 <simplesect id="_grouping_and_eliminating_results_with_the_group_by_and_having_clauses">\r
 830                         <title>Grouping and eliminating results with the GROUP BY and HAVING clauses</title>\r
 831                         <simpara>The GROUP BY clause returns a unique set of results for the desired columns.\r
 832                         This is most often used in conjunction with an aggregate function to present\r
 833                         results for a range of values in a single query, rather than requiring you to\r
 834                         issue one query per target value.</simpara>\r
 835                         <formalpara><title>Returning unique results of a single column with <literal>GROUP BY</literal></title><para>\r
 836 <programlisting language="sql" linenumbering="unnumbered">\r
 837 SELECT grp\r
 838   FROM permission.grp_perm_map\r
 839   GROUP BY grp\r
 840   ORDER BY grp;\r
 841 \r
 842  grp\r
 843 -----+\r
 844    1\r
 845    2\r
 846    3\r
 847    4\r
 848    5\r
 849    6\r
 850    7\r
 851   10\r
 852 (8 rows)\r
 853 </programlisting>\r
 854                         </para></formalpara>\r
 855                         <simpara>While <literal>GROUP BY</literal> can be useful for a single column, it is more often used\r
 856                         to return the distinct results across multiple columns. For example, the\r
 857                         following query shows us which groups have permissions at each depth in\r
 858                         the library hierarchy:</simpara>\r
 859                         <formalpara><title>Returning unique results of multiple columns with <literal>GROUP BY</literal></title><para>\r
 860 <programlisting language="sql" linenumbering="unnumbered">\r
 861 SELECT grp, depth\r
 862   FROM permission.grp_perm_map\r
 863   GROUP BY grp, depth\r
 864   ORDER BY depth, grp;\r
 865 \r
 866  grp | depth\r
 867 -----+-------\r
 868    1 |     0\r
 869    2 |     0\r
 870    3 |     0\r
 871    4 |     0\r
 872    5 |     0\r
 873   10 |     0\r
 874    3 |     1\r
 875    4 |     1\r
 876    5 |     1\r
 877    6 |     1\r
 878    7 |     1\r
 879   10 |     1\r
 880    3 |     2\r
 881    4 |     2\r
 882   10 |     2\r
 883 (15 rows)\r
 884 </programlisting>\r
 885                         </para></formalpara>\r
 886                         <simpara>Extending this further, you can use the <literal>COUNT()</literal> aggregate function to\r
 887                         also return the number of times each unique combination of <literal>grp</literal> and <literal>depth</literal>\r
 888                         appears in the table. <emphasis>Yes, this is a sneak peek at the use of aggregate\r
 889                         functions! Keeners.</emphasis></simpara>\r
 890                         <formalpara><title>Counting unique column combinations with <literal>GROUP BY</literal></title><para>\r
 891 <programlisting language="sql" linenumbering="unnumbered">\r
 892 SELECT grp, depth, COUNT(grp)\r
 893   FROM permission.grp_perm_map\r
 894   GROUP BY grp, depth\r
 895   ORDER BY depth, grp;\r
 896 \r
 897  grp | depth | count\r
 898 -----+-------+-------\r
 899    1 |     0 |     6\r
 900    2 |     0 |     2\r
 901    3 |     0 |    45\r
 902    4 |     0 |     3\r
 903    5 |     0 |     5\r
 904   10 |     0 |     1\r
 905    3 |     1 |     3\r
 906    4 |     1 |     4\r
 907    5 |     1 |     1\r
 908    6 |     1 |     9\r
 909    7 |     1 |     5\r
 910   10 |     1 |    10\r
 911    3 |     2 |    24\r
 912    4 |     2 |     8\r
 913   10 |     2 |     7\r
 914 (15 rows)\r
 915 </programlisting>\r
 916                         </para></formalpara>\r
 917                         <simpara>You can use the <literal>WHERE</literal> clause to restrict the returned results before grouping\r
 918                         is applied to the results. The following query restricts the results to those\r
 919                         rows that have a depth of 0.</simpara>\r
 920                         <formalpara><title>Using the <literal>WHERE</literal> clause with <literal>GROUP BY</literal></title><para>\r
 921 <programlisting language="sql" linenumbering="unnumbered">\r
 922 SELECT grp, COUNT(grp)\r
 923   FROM permission.grp_perm_map\r
 924   WHERE depth = 0\r
 925   GROUP BY grp\r
 926   ORDER BY 2 DESC\r
 927 ;\r
 928 \r
 929  grp | count\r
 930 -----+-------\r
 931    3 |    45\r
 932    1 |     6\r
 933    5 |     5\r
 934    4 |     3\r
 935    2 |     2\r
 936   10 |     1\r
 937 (6 rows)\r
 938 </programlisting>\r
 939                         </para></formalpara>\r
 940                         <simpara>To restrict results after grouping has been applied to the rows, use the\r
 941                         <literal>HAVING</literal> clause; this is typically used to restrict results based on\r
 942                         a comparison to the value returned by an aggregate function. For example,\r
 943                         the following query restricts the returned rows to those that have more than\r
 944                         5 occurrences of the same value for <literal>grp</literal> in the table.</simpara>\r
 945                         <formalpara><title><literal>GROUP BY</literal> restricted by a <literal>HAVING</literal> clause</title><para>\r
 946 <programlisting language="sql" linenumbering="unnumbered">\r
 947 SELECT grp, COUNT(grp)\r
 948   FROM permission.grp_perm_map\r
 949   GROUP BY grp\r
 950   HAVING COUNT(grp) &gt; 5\r
 951 ;\r
 952 \r
 953  grp | count\r
 954 -----+-------\r
 955    6 |     9\r
 956    4 |    15\r
 957    5 |     6\r
 958    1 |     6\r
 959    3 |    72\r
 960   10 |    18\r
 961 (6 rows)\r
 962 </programlisting>\r
 963                         </para></formalpara>\r
 964                 </simplesect>\r
 965                 <simplesect id="_eliminating_duplicate_results_with_the_distinct_keyword">\r
 966                         <title>Eliminating duplicate results with the DISTINCT keyword</title>\r
 967                         <simpara><literal>GROUP BY</literal> is one way of eliminating duplicate results from the rows returned\r
 968                         by your query. The purpose of the <literal>DISTINCT</literal> keyword is to remove duplicate\r
 969                         rows from the results of your query. However, it works, and it is easy - so if\r
 970                         you just want a quick list of the unique set of values for a column or set of\r
 971                         columns, the <literal>DISTINCT</literal> keyword might be appropriate.</simpara>\r
 972                         <simpara>On the other hand, if you are getting duplicate rows back when you don&#8217;t expect\r
 973                         them, then applying the <literal>DISTINCT</literal> keyword might be a sign that you are\r
 974                         papering over a real problem.</simpara>\r
 975                         <formalpara><title>Returning unique results of multiple columns with <literal>DISTINCT</literal></title><para>\r
 976 <programlisting language="sql" linenumbering="unnumbered">\r
 977 SELECT DISTINCT grp, depth\r
 978   FROM permission.grp_perm_map\r
 979   ORDER BY depth, grp\r
 980 ;\r
 981 \r
 982  grp | depth\r
 983 -----+-------\r
 984    1 |     0\r
 985    2 |     0\r
 986    3 |     0\r
 987    4 |     0\r
 988    5 |     0\r
 989   10 |     0\r
 990    3 |     1\r
 991    4 |     1\r
 992    5 |     1\r
 993    6 |     1\r
 994    7 |     1\r
 995   10 |     1\r
 996    3 |     2\r
 997    4 |     2\r
 998   10 |     2\r
 999 (15 rows)\r
1000 </programlisting>\r
1001                         </para></formalpara>\r
1002                 </simplesect>\r
1003                 <simplesect id="_paging_through_results_with_the_limit_and_offset_clauses">\r
1004                         <title>Paging through results with the LIMIT and OFFSET clauses</title>\r
1005                         <simpara>The <literal>LIMIT</literal> clause restricts the total number of rows returned from your query\r
1006                         and is useful if you just want to list a subset of a large number of rows. For\r
1007                         example, in the following query we list the five most frequently used\r
1008                         circulation modifiers:</simpara>\r
1009                         <formalpara><title>Using the <literal>LIMIT</literal> clause to restrict results</title><para>\r
1010 <programlisting language="sql" linenumbering="unnumbered">\r
1011 SELECT circ_modifier, COUNT(circ_modifier)\r
1012   FROM asset.copy\r
1013   GROUP BY circ_modifier\r
1014   ORDER BY 2 DESC\r
1015   LIMIT 5\r
1016 ;\r
1017 \r
1018  circ_modifier | count\r
1019 ---------------+--------\r
1020  CIRC          | 741995\r
1021  BOOK          | 636199\r
1022  SER           | 265906\r
1023  DOC           | 191598\r
1024  LAW MONO      | 126627\r
1025 (5 rows)\r
1026 </programlisting>\r
1027                         </para></formalpara>\r
1028                         <simpara>When you use the <literal>LIMIT</literal> clause to restrict the total number of rows returned\r
1029                         by your query, you can also use the <literal>OFFSET</literal> clause to determine which subset\r
1030                         of the rows will be returned. The use of the <literal>OFFSET</literal> clause assumes that\r
1031                         you&#8217;ve used the <literal>ORDER BY</literal> clause to impose order on the results.</simpara>\r
1032                         <simpara>In the following example, we use the <literal>OFFSET</literal> clause to get results 6 through\r
1033                         10 from the same query that we prevously executed.</simpara>\r
1034                         <formalpara><title>Using the <literal>OFFSET</literal> clause to return a specific subset of rows</title><para>\r
1035 <programlisting language="sql" linenumbering="unnumbered">\r
1036 SELECT circ_modifier, COUNT(circ_modifier)\r
1037   FROM asset.copy\r
1038   GROUP BY circ_modifier\r
1039   ORDER BY 2 DESC\r
1040   LIMIT 5\r
1041   OFFSET 5\r
1042 ;\r
1043 \r
1044  circ_modifier | count\r
1045 ---------------+--------\r
1046  LAW SERIAL    | 102758\r
1047  DOCUMENTS     |  86215\r
1048  BOOK_WEB      |  63786\r
1049  MFORM SER     |  39917\r
1050  REF           |  34380\r
1051 (5 rows)\r
1052 </programlisting>\r
1053                         </para></formalpara>\r
1054                 </simplesect>\r
1055         </section>\r
1056         <section id="advanced_sql_queries">\r
1057                 <title>Advanced SQL queries</title>\r
1058                 <simplesect id="_transforming_column_values_with_functions">\r
1059                         <title>Transforming column values with functions</title>\r
1060                         <simpara>PostgreSQL includes many built-in functions for manipulating column data.\r
1061                         You can also create your own functions (and Evergreen does make use of\r
1062                         many custom functions). There are two types of functions used in\r
1063                         databases: scalar functions and aggregate functions.</simpara>\r
1064                         <simplesect id="_scalar_functions">\r
1065                                 <title>Scalar functions</title>\r
1066                                 <simpara>Scalar functions transform each value of the target column. If your query\r
1067                                 would return 50 values for a column in a given query, and you modify your\r
1068                                 query to apply a scalar function to the values returned for that column,\r
1069                                 it will still return 50 values. For example, the UPPER() function,\r
1070                                 used to convert text values to upper-case, modifies the results in the\r
1071                                 following set of queries:</simpara>\r
1072                                 <formalpara><title>Using the UPPER() scalar function to convert text values to upper-case</title><para>\r
1073 <programlisting language="sql" linenumbering="unnumbered">\r
1074 -- First, without the UPPER() function for comparison\r
1075 SELECT shortname, name\r
1076   FROM actor.org_unit\r
1077   WHERE id &lt; 4\r
1078 ;\r
1079 \r
1080  shortname |         name\r
1081 -----------+-----------------------\r
1082  CONS      | Example Consortium\r
1083  SYS1      | Example System 1\r
1084  SYS2      | Example System 2\r
1085 (3 rows)\r
1086 \r
1087 -- Now apply the UPPER() function to the name column\r
1088 SELECT shortname, UPPER(name)\r
1089   FROM actor.org_unit\r
1090   WHERE id &lt; 4\r
1091 ;\r
1092 \r
1093  shortname |       upper\r
1094 -----------+--------------------\r
1095  CONS      | EXAMPLE CONSORTIUM\r
1096  SYS1      | EXAMPLE SYSTEM 1\r
1097  SYS2      | EXAMPLE SYSTEM 2\r
1098 (3 rows)\r
1099 </programlisting>\r
1100                                 </para></formalpara>\r
1101                                 <simpara>There are so many scalar functions in PostgreSQL that we cannot cover them\r
1102                                 all here, but we can list some of the most commonly used functions:</simpara>\r
1103                                 <itemizedlist>\r
1104                                 <listitem>\r
1105                                 <simpara>\r
1106                                 || - concatenates two text values together\r
1107                                 </simpara>\r
1108                                 </listitem>\r
1109                                 <listitem>\r
1110                                 <simpara>\r
1111                                 COALESCE() - returns the first non-NULL value from the list of arguments\r
1112                                 </simpara>\r
1113                                 </listitem>\r
1114                                 <listitem>\r
1115                                 <simpara>\r
1116                                 LOWER() - returns a text value converted to lower-case\r
1117                                 </simpara>\r
1118                                 </listitem>\r
1119                                 <listitem>\r
1120                                 <simpara>\r
1121                                 REPLACE() - returns a text value after replacing all occurrences of a given text value with a different text value\r
1122                                 </simpara>\r
1123                                 </listitem>\r
1124                                 <listitem>\r
1125                                 <simpara>\r
1126                                 REGEXP_REPLACE() - returns a text value after being transformed by a regular expression\r
1127                                 </simpara>\r
1128                                 </listitem>\r
1129                                 <listitem>\r
1130                                 <simpara>\r
1131                                 UPPER() - returns a text value converted to upper-case\r
1132                                 </simpara>\r
1133                                 </listitem>\r
1134                                 </itemizedlist>\r
1135                                 <simpara>For a complete list of scalar functions, see\r
1136                                 <ulink url="http://www.postgresql.org/docs/8.3/interactive/functions.html">the PostgreSQL function documentation</ulink>.</simpara>\r
1137                         </simplesect>\r
1138                         <simplesect id="_aggregate_functions">\r
1139                                 <title>Aggregate functions</title>\r
1140                                 <simpara>Aggregate functions return a single value computed from the the complete set of\r
1141                                 values returned for the specified column.</simpara>\r
1142                                 <itemizedlist>\r
1143                                 <listitem>\r
1144                                 <simpara>\r
1145                                 AVG()\r
1146                                 </simpara>\r
1147                                 </listitem>\r
1148                                 <listitem>\r
1149                                 <simpara>\r
1150                                 COUNT()\r
1151                                 </simpara>\r
1152                                 </listitem>\r
1153                                 <listitem>\r
1154                                 <simpara>\r
1155                                 MAX()\r
1156                                 </simpara>\r
1157                                 </listitem>\r
1158                                 <listitem>\r
1159                                 <simpara>\r
1160                                 MIN()\r
1161                                 </simpara>\r
1162                                 </listitem>\r
1163                                 <listitem>\r
1164                                 <simpara>\r
1165                                 SUM()\r
1166                                 </simpara>\r
1167                                 </listitem>\r
1168                                 </itemizedlist>\r
1169                         </simplesect>\r
1170                 </simplesect>\r
1171                 <simplesect id="_sub_selects">\r
1172                         <title>Sub-selects</title>\r
1173                         <simpara>A sub-select is the technique of using the results of one query to feed\r
1174                         into another query. You can, for example, return a set of values from\r
1175                         one column in a SELECT statement to be used to satisfy the IN() condition\r
1176                         of another SELECT statement; or you could return the MAX() value of a\r
1177                         column in a SELECT statement to match the = condition of another SELECT\r
1178                         statement.</simpara>\r
1179                         <simpara>For example, in the following query we use a sub-select to restrict the copies\r
1180                         returned by the main SELECT statement to only those locations that have an\r
1181                         <literal>opac_visible</literal> value of <literal>TRUE</literal>:</simpara>\r
1182                         <formalpara><title>Sub-select example</title><para>\r
1183 <programlisting language="sql" linenumbering="unnumbered">\r
1184 SELECT call_number\r
1185   FROM asset.copy\r
1186   WHERE deleted IS FALSE\r
1187     AND location IN (\r
1188     SELECT id\r
1189       FROM asset.copy_location\r
1190       WHERE opac_visible IS TRUE\r
1191   )\r
1192 ;\r
1193 </programlisting>\r
1194                         </para></formalpara>\r
1195                         <simpara>Sub-selects can be an approachable way to breaking down a problem that\r
1196                         requires matching values between different tables, and often result in\r
1197                         a clearly expressed solution to a problem. However, if you start writing\r
1198                         sub-selects within sub-selects, you should consider tackling the problem\r
1199                         with joins instead.</simpara>\r
1200                 </simplesect>\r
1201                 <simplesect id="_joins">\r
1202                         <title>Joins</title>\r
1203                         <simpara>Joins enable you to access the values from multiple tables in your query\r
1204                         results and comparison operators. For example, joins are what enable you to\r
1205                         relate a bibliographic record to a barcoded copy via the <literal>biblio.record_entry</literal>,\r
1206                         <literal>asset.call_number</literal>, and <literal>asset.copy</literal> tables. In this section, we discuss the\r
1207                         most common kind of join&#8212;the inner join&#8212;as well as the less common outer join\r
1208                         and some set operations which can compare and contrast the values returned by\r
1209                         separate queries.</simpara>\r
1210                         <simpara>When we talk about joins, we are going to talk about the left-hand table and\r
1211                         the right-hand table that participate in the join. Every join brings together\r
1212                         just two tables - but you can use an unlimited (for our purposes) number\r
1213                         of joins in a single SQL statement. Each time you use a join, you effectively\r
1214                         create a new table, so when you add a second join clause to a statement,\r
1215                         table 1 and table 2 (which were the left-hand table and the right-hand table\r
1216                         for the first join) now act as a merged left-hand table and the new table\r
1217                         in the second join clause is the right-hand table.</simpara>\r
1218                         <simpara>Clear as mud? Okay, let&#8217;s look at some examples.</simpara>\r
1219                         <simplesect id="_inner_joins">\r
1220                                 <title>Inner joins</title>\r
1221                                 <simpara>An inner join returns all of the columns from the left-hand table in the join\r
1222                                 with all of the columns from the right-hand table in the joins that match a\r
1223                                 condition in the ON clause. Typically, you use the <literal>=</literal> operator to match the\r
1224                                 foreign key of the left-hand table with the primary key of the right-hand\r
1225                                 table to follow the natural relationship between the tables.</simpara>\r
1226                                 <simpara>In the following example, we return all of columns from the <literal>actor.usr</literal> and\r
1227                                 <literal>actor.org_unit</literal> tables, joined on the relationship between the user&#8217;s home\r
1228                                 library and the library&#8217;s ID. Notice in the results that some columns, like\r
1229                                 <literal>id</literal> and <literal>mailing_address</literal>, appear twice; this is because both the <literal>actor.usr</literal>\r
1230                                 and <literal>actor.org_unit</literal> tables include columns with these names. This is also why\r
1231                                 we have to fully qualify the column names in our queries with the schema and\r
1232                                 table names.</simpara>\r
1233                                 <formalpara><title>A simple inner join</title><para>\r
1234 <programlisting language="sql" linenumbering="unnumbered">\r
1235 SELECT *\r
1236   FROM actor.usr\r
1237     INNER JOIN actor.org_unit ON actor.usr.home_ou = actor.org_unit.id\r
1238     WHERE actor.org_unit.shortname = 'CONS'\r
1239 ;\r
1240 \r
1241 -[ RECORD 1 ]------------------+---------------------------------\r
1242 id                             | 1\r
1243 card                           | 1\r
1244 profile                        | 1\r
1245 usrname                        | admin\r
1246 email                          |\r
1247 ...\r
1248 mailing_address                |\r
1249 billing_address                |\r
1250 home_ou                        | 1\r
1251 ...\r
1252 claims_never_checked_out_count | 0\r
1253 id                             | 1\r
1254 parent_ou                      |\r
1255 ou_type                        | 1\r
1256 ill_address                    | 1\r
1257 holds_address                  | 1\r
1258 mailing_address                | 1\r
1259 billing_address                | 1\r
1260 shortname                      | CONS\r
1261 name                           | Example Consortium\r
1262 email                          |\r
1263 phone                          |\r
1264 opac_visible                   | t\r
1265 fiscal_calendar                | 1\r
1266 </programlisting>\r
1267                                 </para></formalpara>\r
1268                                 <simpara>Of course, you do not have to return every column from the joined tables;\r
1269                                 you can (and should) continue to specify only the columns that you want to\r
1270                                 return. In the following example, we count the number of borrowers for\r
1271                                 every user profile in a given library by joining the <literal>permission.grp_tree</literal>\r
1272                                 table where profiles are defined against the <literal>actor.usr</literal> table, and then\r
1273                                 joining the <literal>actor.org_unit</literal> table to give us access to the user&#8217;s home\r
1274                                 library:</simpara>\r
1275                                 <formalpara><title>Borrower Count by Profile (Adult, Child, etc)/Library</title><para>\r
1276 <programlisting language="sql" linenumbering="unnumbered">\r
1277 SELECT permission.grp_tree.name, actor.org_unit.name, COUNT(permission.grp_tree.name)\r
1278   FROM actor.usr\r
1279     INNER JOIN permission.grp_tree\r
1280       ON actor.usr.profile = permission.grp_tree.id\r
1281     INNER JOIN actor.org_unit\r
1282       ON actor.org_unit.id = actor.usr.home_ou\r
1283   WHERE actor.usr.deleted IS FALSE\r
1284   GROUP BY permission.grp_tree.name, actor.org_unit.name\r
1285   ORDER BY actor.org_unit.name, permission.grp_tree.name\r
1286 ;\r
1287 \r
1288  name  |        name        | count\r
1289 -------+--------------------+-------\r
1290  Users | Example Consortium |     1\r
1291 (1 row)\r
1292 </programlisting>\r
1293                                 </para></formalpara>\r
1294                         </simplesect>\r
1295                         <simplesect id="_aliases">\r
1296                                 <title>Aliases</title>\r
1297                                 <simpara>So far we have been fully-qualifying all of our table names and column names to\r
1298                                 prevent any confusion. This quickly gets tiring with lengthy qualified\r
1299                                 table names like <literal>permission.grp_tree</literal>, so the SQL syntax enables us to assign\r
1300                                 aliases to table names and column names. When you define an alias for a table\r
1301                                 name, you can access its column throughout the rest of the statement by simply\r
1302                                 appending the column name to the alias with a period; for example, if you assign\r
1303                                 the alias <literal>au</literal> to the <literal>actor.usr</literal> table, you can access the <literal>actor.usr.id</literal>\r
1304                                 column through the alias as <literal>au.id</literal>.</simpara>\r
1305                                 <simpara>The formal syntax for declaring an alias for a column is to follow the column\r
1306                                 name in the result columns clause with <literal>AS</literal> <emphasis>alias</emphasis>. To declare an alias for a table name,\r
1307                                 follow the table name in the FROM clause (including any JOIN statements) with\r
1308                                 <literal>AS</literal> <emphasis>alias</emphasis>. However, the <literal>AS</literal> keyword is optional for tables (and columns as\r
1309                                 of PostgreSQL 8.4), and in practice most SQL statements leave it out.  For\r
1310                                 example, we can write the previous INNER JOIN statement example using aliases\r
1311                                 instead of fully-qualified identifiers:</simpara>\r
1312                                 <formalpara><title>Borrower Count by Profile (using aliases)</title><para>\r
1313 <programlisting language="sql" linenumbering="unnumbered">\r
1314 SELECT pgt.name AS "Profile", aou.name AS "Library", COUNT(pgt.name) AS "Count"\r
1315   FROM actor.usr au\r
1316     INNER JOIN permission.grp_tree pgt\r
1317       ON au.profile = pgt.id\r
1318     INNER JOIN actor.org_unit aou\r
1319       ON aou.id = au.home_ou\r
1320   WHERE au.deleted IS FALSE\r
1321   GROUP BY pgt.name, aou.name\r
1322   ORDER BY aou.name, pgt.name\r
1323 ;\r
1324 \r
1325  Profile |      Library       | Count\r
1326 ---------+--------------------+-------\r
1327  Users   | Example Consortium |     1\r
1328 (1 row)\r
1329 </programlisting>\r
1330                                 </para></formalpara>\r
1331                                 <simpara>A nice side effect of declaring an alias for your columns is that the alias\r
1332                                 is used as the column header in the results table. The previous version of\r
1333                                 the query, which didn&#8217;t use aliased column names, had two columns named\r
1334                                 <literal>name</literal>; this version of the query with aliases results in a clearer\r
1335                                 categorization.</simpara>\r
1336                         </simplesect>\r
1337                         <simplesect id="_outer_joins">\r
1338                                 <title>Outer joins</title>\r
1339                                 <simpara>An outer join returns all of the rows from one or both of the tables\r
1340                                 participating in the join.</simpara>\r
1341                                 <itemizedlist>\r
1342                                 <listitem>\r
1343                                 <simpara>\r
1344                                 For a LEFT OUTER JOIN, the join returns all of the rows from the left-hand\r
1345                                 table and the rows matching the join condition from the right-hand table, with\r
1346                                 NULL values for the rows with no match in the right-hand table.\r
1347                                 </simpara>\r
1348                                 </listitem>\r
1349                                 <listitem>\r
1350                                 <simpara>\r
1351                                 A RIGHT OUTER JOIN behaves in the same way as a LEFT OUTER JOIN, with the\r
1352                                 exception that all rows are returned from the right-hand table participating in\r
1353                                 the join.\r
1354                                 </simpara>\r
1355                                 </listitem>\r
1356                                 <listitem>\r
1357                                 <simpara>\r
1358                                 For a FULL OUTER JOIN, the join returns all the rows from both the left-hand\r
1359                                 and right-hand tables, with NULL values for the rows with no match in either\r
1360                                 the left-hand or right-hand table.\r
1361                                 </simpara>\r
1362                                 </listitem>\r
1363                                 </itemizedlist>\r
1364                                 <formalpara><title>Base tables for the OUTER JOIN examples</title><para>\r
1365 <programlisting language="sql" linenumbering="unnumbered">\r
1366 SELECT * FROM aaa;\r
1367 \r
1368  id | stuff\r
1369 ----+-------\r
1370   1 | one\r
1371   2 | two\r
1372   3 | three\r
1373   4 | four\r
1374   5 | five\r
1375 (5 rows)\r
1376 \r
1377 SELECT * FROM bbb;\r
1378 \r
1379  id | stuff |   foo\r
1380 ----+-------+----------\r
1381   1 | one   | oneone\r
1382   2 | two   | twotwo\r
1383   5 | five  | fivefive\r
1384   6 | six   | sixsix\r
1385 (4 rows)\r
1386 </programlisting>\r
1387                                 </para></formalpara>\r
1388                                 <formalpara><title>Example of a LEFT OUTER JOIN</title><para>\r
1389 <programlisting language="sql" linenumbering="unnumbered">\r
1390 SELECT * FROM aaa\r
1391   LEFT OUTER JOIN bbb ON aaa.id = bbb.id\r
1392 ;\r
1393  id | stuff | id | stuff |   foo\r
1394 ----+-------+----+-------+----------\r
1395   1 | one   |  1 | one   | oneone\r
1396   2 | two   |  2 | two   | twotwo\r
1397   3 | three |    |       |\r
1398   4 | four  |    |       |\r
1399   5 | five  |  5 | five  | fivefive\r
1400 (5 rows)\r
1401 </programlisting>\r
1402                                 </para></formalpara>\r
1403                                 <formalpara><title>Example of a RIGHT OUTER JOIN</title><para>\r
1404 <programlisting language="sql" linenumbering="unnumbered">\r
1405 SELECT * FROM aaa\r
1406   RIGHT OUTER JOIN bbb ON aaa.id = bbb.id\r
1407 ;\r
1408  id | stuff | id | stuff |   foo\r
1409 ----+-------+----+-------+----------\r
1410   1 | one   |  1 | one   | oneone\r
1411   2 | two   |  2 | two   | twotwo\r
1412   5 | five  |  5 | five  | fivefive\r
1413     |       |  6 | six   | sixsix\r
1414 (4 rows)\r
1415 </programlisting>\r
1416                                 </para></formalpara>\r
1417                                 <formalpara><title>Example of a FULL OUTER JOIN</title><para>\r
1418 <programlisting language="sql" linenumbering="unnumbered">\r
1419 SELECT * FROM aaa\r
1420   FULL OUTER JOIN bbb ON aaa.id = bbb.id\r
1421 ;\r
1422  id | stuff | id | stuff |   foo\r
1423 ----+-------+----+-------+----------\r
1424   1 | one   |  1 | one   | oneone\r
1425   2 | two   |  2 | two   | twotwo\r
1426   3 | three |    |       |\r
1427   4 | four  |    |       |\r
1428   5 | five  |  5 | five  | fivefive\r
1429     |       |  6 | six   | sixsix\r
1430 (6 rows)\r
1431 </programlisting>\r
1432                                 </para></formalpara>\r
1433                         </simplesect>\r
1434                         <simplesect id="_self_joins">\r
1435                                 <title>Self joins</title>\r
1436                                 <simpara>It is possible to join a table to itself. You can, in fact you must, use\r
1437                                 aliases to disambiguate the references to the table.</simpara>\r
1438                         </simplesect>\r
1439                 </simplesect>\r
1440                 <simplesect id="_set_operations">\r
1441                         <title>Set operations</title>\r
1442                         <simpara>Relational databases are effectively just an efficient mechanism for\r
1443                         manipulating sets of values; they are implementations of set theory. There are\r
1444                         three operators for sets (tables) in which each set must have the same number\r
1445                         of columns with compatible data types: the union, intersection, and difference\r
1446                         operators.</simpara>\r
1447                         <formalpara><title>Base tables for the set operation examples</title><para>\r
1448 <programlisting language="sql" linenumbering="unnumbered">\r
1449 SELECT * FROM aaa;\r
1450 \r
1451         id | stuff\r
1452         ----+-------\r
1453         1 | one\r
1454         2 | two\r
1455         3 | three\r
1456         4 | four\r
1457         5 | five\r
1458         (5 rows)\r
1459 \r
1460 SELECT * FROM bbb;\r
1461 \r
1462         id | stuff |   foo\r
1463         ----+-------+----------\r
1464         1 | one   | oneone\r
1465         2 | two   | twotwo\r
1466         5 | five  | fivefive\r
1467         6 | six   | sixsix\r
1468 (4 rows)\r
1469 </programlisting>\r
1470                         </para></formalpara>\r
1471                         <simplesect id="_union">\r
1472                                 <title>Union</title>\r
1473                                 <simpara>The <literal>UNION</literal> operator returns the distinct set of rows that are members of\r
1474                                 either or both of the left-hand and right-hand tables. The <literal>UNION</literal> operator\r
1475                                 does not return any duplicate rows. To return duplicate rows, use the\r
1476                                 <literal>UNION ALL</literal> operator.</simpara>\r
1477                                 <formalpara><title>Example of a UNION set operation</title><para>\r
1478 <programlisting language="sql" linenumbering="unnumbered">\r
1479 -- The parentheses are not required, but are intended to help\r
1480 -- illustrate the sets participating in the set operation\r
1481 (\r
1482   SELECT id, stuff\r
1483     FROM aaa\r
1484 )\r
1485 UNION\r
1486 (\r
1487   SELECT id, stuff\r
1488     FROM bbb\r
1489 )\r
1490 ORDER BY 1\r
1491 ;\r
1492 \r
1493  id | stuff\r
1494 ----+-------\r
1495   1 | one\r
1496   2 | two\r
1497   3 | three\r
1498   4 | four\r
1499   5 | five\r
1500   6 | six\r
1501 (6 rows)\r
1502 </programlisting>\r
1503                                 </para></formalpara>\r
1504                         </simplesect>\r
1505                         <simplesect id="_intersection">\r
1506                                 <title>Intersection</title>\r
1507                                 <simpara>The <literal>INTERSECT</literal> operator returns the distinct set of rows that are common to\r
1508                                 both the left-hand and right-hand tables. To return duplicate rows, use the\r
1509                                 <literal>INTERSECT ALL</literal> operator.</simpara>\r
1510                                 <formalpara><title>Example of an INTERSECT set operation</title><para>\r
1511 <programlisting language="sql" linenumbering="unnumbered">\r
1512 (\r
1513   SELECT id, stuff\r
1514     FROM aaa\r
1515 )\r
1516 INTERSECT\r
1517 (\r
1518   SELECT id, stuff\r
1519     FROM bbb\r
1520 )\r
1521 ORDER BY 1\r
1522 ;\r
1523 \r
1524  id | stuff\r
1525 ----+-------\r
1526   1 | one\r
1527   2 | two\r
1528   5 | five\r
1529 (3 rows)\r
1530 </programlisting>\r
1531                                 </para></formalpara>\r
1532                         </simplesect>\r
1533                         <simplesect id="_difference">\r
1534                                 <title>Difference</title>\r
1535                                 <simpara>The <literal>EXCEPT</literal> operator returns the rows in the left-hand table that do not\r
1536                                 exist in the right-hand table. You are effectively subtracting the common\r
1537                                 rows from the left-hand table.</simpara>\r
1538                                 <formalpara><title>Example of an EXCEPT set operation</title><para>\r
1539 <programlisting language="sql" linenumbering="unnumbered">\r
1540 (\r
1541   SELECT id, stuff\r
1542     FROM aaa\r
1543 )\r
1544 EXCEPT\r
1545 (\r
1546   SELECT id, stuff\r
1547     FROM bbb\r
1548 )\r
1549 ORDER BY 1\r
1550 ;\r
1551 \r
1552  id | stuff\r
1553 ----+-------\r
1554   3 | three\r
1555   4 | four\r
1556 (2 rows)\r
1557 \r
1558 -- Order matters: switch the left-hand and right-hand tables\r
1559 -- and you get a different result\r
1560 (\r
1561   SELECT id, stuff\r
1562     FROM bbb\r
1563 )\r
1564 EXCEPT\r
1565 (\r
1566   SELECT id, stuff\r
1567     FROM aaa\r
1568 )\r
1569 ORDER BY 1\r
1570 ;\r
1571 \r
1572  id | stuff\r
1573 ----+-------\r
1574   6 | six\r
1575 (1 row)\r
1576 </programlisting>\r
1577                                 </para></formalpara>\r
1578                         </simplesect>\r
1579                 </simplesect>\r
1580                 <simplesect id="_views">\r
1581                         <title>Views</title>\r
1582                         <simpara>A view is a persistent <literal>SELECT</literal> statement that acts like a read-only table.\r
1583                         To create a view, issue the <literal>CREATE VIEW</literal> statement, giving the view a name\r
1584                         and a <literal>SELECT</literal> statement on which the view is built.</simpara>\r
1585                         <simpara>The following example creates a view based on our borrower profile count:</simpara>\r
1586                         <formalpara><title>Creating a view</title><para>\r
1587 <programlisting language="sql" linenumbering="unnumbered">\r
1588 CREATE VIEW actor.borrower_profile_count AS\r
1589   SELECT pgt.name AS "Profile", aou.name AS "Library", COUNT(pgt.name) AS "Count"\r
1590     FROM actor.usr au\r
1591       INNER JOIN permission.grp_tree pgt\r
1592         ON au.profile = pgt.id\r
1593       INNER JOIN actor.org_unit aou\r
1594         ON aou.id = au.home_ou\r
1595     WHERE au.deleted IS FALSE\r
1596     GROUP BY pgt.name, aou.name\r
1597     ORDER BY aou.name, pgt.name\r
1598 ;\r
1599 </programlisting>\r
1600                         </para></formalpara>\r
1601                         <simpara>When you subsequently select results from the view, you can apply additional\r
1602                         <literal>WHERE</literal> clauses to filter the results, or <literal>ORDER BY</literal> clauses to change the\r
1603                         order of the returned rows. In the following examples, we issue a simple\r
1604                         <literal>SELECT *</literal> statement to show that the default results are returned in the\r
1605                         same order from the view as the equivalent SELECT statement would be returned.\r
1606                         Then we issue a <literal>SELECT</literal> statement with a <literal>WHERE</literal> clause to further filter the\r
1607                         results.</simpara>\r
1608                         <formalpara><title>Selecting results from a view</title><para>\r
1609 <programlisting language="sql" linenumbering="unnumbered">\r
1610 SELECT * FROM actor.borrower_profile_count;\r
1611 \r
1612           Profile           | Library                    | Count\r
1613 ----------------------------+----------------------------+-------\r
1614  Faculty                    | University Library         |   208\r
1615  Graduate                   | University Library         |    16\r
1616  Patrons                    | University Library         |    62\r
1617 ...\r
1618 \r
1619 -- You can still filter your results with WHERE clauses\r
1620 SELECT *\r
1621   FROM actor.borrower_profile_count\r
1622   WHERE "Profile" = 'Faculty';\r
1623 \r
1624  Profile | Library                    | Count\r
1625 ---------+----------------------------+-------\r
1626  Faculty | University Library         |   208\r
1627  Faculty | College Library            |    64\r
1628  Faculty | College Library 2          |   102\r
1629  Faculty | University Library 2       |   776\r
1630 (4 rows)\r
1631 </programlisting>\r
1632                         </para></formalpara>\r
1633                 </simplesect>\r
1634                 <simplesect id="_inheritance">\r
1635                         <title>Inheritance</title>\r
1636                         <simpara>PostgreSQL supports table inheritance: that is, a child table inherits its\r
1637                         base definition from a parent table, but can add additional columns to its\r
1638                         own definition. The data from any child tables is visible in queries against\r
1639                         the parent table.</simpara>\r
1640                         <simpara>Evergreen uses table inheritance in several areas:\r
1641                           * In the Vandelay MARC batch importer / exporter, Evergreen defines base\r
1642                         tables for generic queues and queued records for which authority record and\r
1643                         bibliographic record child tables\r
1644                           * Billable transactions are based on the <literal>money.billable_xact</literal> table;\r
1645                         child tables include <literal>action.circulation</literal> for circulation transactions\r
1646                         and <literal>money.grocery</literal> for general bills.\r
1647                           * Payments are based on the <literal>money.payment</literal> table; its child table is\r
1648                         <literal>money.bnm_payment</literal> (for brick-and-mortar payments), which in turn has child\r
1649                         tables of <literal>money.forgive_payment</literal>, <literal>money.work_payment</literal>, <literal>money.credit_payment</literal>,\r
1650                         <literal>money.goods_payment</literal>, and <literal>money.bnm_desk_payment</literal>. The\r
1651                         <literal>money.bnm_desk_payment</literal> table in turn has child tables of <literal>money.cash_payment</literal>,\r
1652                         <literal>money.check_payment</literal>, and <literal>money.credit_card_payment</literal>.\r
1653                           * Transits are based on the <literal>action.transit_copy</literal> table, which has a child\r
1654                         table of <literal>action.hold_transit_copy</literal> for transits initiated by holds.\r
1655                           * Generic acquisition line items are defined by the\r
1656                         <literal>acq.lineitem_attr_definition</literal> table, which in turn has a number of child\r
1657                         tables to define MARC attributes, generated attributes, user attributes, and\r
1658                         provider attributes.</simpara>\r
1659                 </simplesect>\r
1660         </section>\r
1661         <section id="understanding_query_performance_with_explain">\r
1662                 <title>Understanding query performance with EXPLAIN</title>\r
1663                 <simpara>Some queries run for a long, long time. This can be the result of a poorly\r
1664                 written query&#8212;a query with a join condition that joins every\r
1665                 row in the <literal>biblio.record_entry</literal> table with every row in the <literal>metabib.full_rec</literal>\r
1666                 view would consume a massive amount of memory and disk space and CPU time&#8212;or\r
1667                 a symptom of a schema that needs some additional indexes. PostgreSQL provides\r
1668                 the <literal>EXPLAIN</literal> tool to estimate how long it will take to run a given query and\r
1669                 show you the <emphasis>query plan</emphasis> (how it plans to retrieve the results from the\r
1670                 database).</simpara>\r
1671                 <simpara>To generate the query plan without actually running the statement, simply\r
1672                 prepend the <literal>EXPLAIN</literal> keyword to your query. In the following example, we\r
1673                 generate the query plan for the poorly written query that would join every\r
1674                 row in the <literal>biblio.record_entry</literal> table with every row in the <literal>metabib.full_rec</literal>\r
1675                 view:</simpara>\r
1676                 <formalpara><title>Query plan for a terrible query</title><para>\r
1677 <programlisting language="sql" linenumbering="unnumbered">\r
1678 EXPLAIN SELECT *\r
1679   FROM biblio.record_entry\r
1680     FULL OUTER JOIN metabib.full_rec ON 1=1\r
1681 ;\r
1682 \r
1683                                    QUERY PLAN\r
1684 -------------------------------------------------------------------------------//\r
1685  Merge Full Join  (cost=0.00..4959156437783.60 rows=132415734100864 width=1379)\r
1686    -&gt;  Seq Scan on record_entry  (cost=0.00..400634.16 rows=2013416 width=1292)\r
1687    -&gt;  Seq Scan on real_full_rec  (cost=0.00..1640972.04 rows=65766704 width=87)\r
1688 (3 rows)\r
1689 </programlisting>\r
1690                 </para></formalpara>\r
1691                 <simpara>This query plan shows that the query would return 132415734100864 rows, and it\r
1692                 plans to accomplish what you asked for by sequentially scanning (<emphasis>Seq Scan</emphasis>)\r
1693                 every row in each of the tables participating in the join.</simpara>\r
1694                 <simpara>In the following example, we have realized our mistake in joining every row of\r
1695                 the left-hand table with every row in the right-hand table and take the saner\r
1696                 approach of using an <literal>INNER JOIN</literal> where the join condition is on the record ID.</simpara>\r
1697                 <formalpara><title>Query plan for a less terrible query</title><para>\r
1698 <programlisting language="sql" linenumbering="unnumbered">\r
1699 EXPLAIN SELECT *\r
1700   FROM biblio.record_entry bre\r
1701     INNER JOIN metabib.full_rec mfr ON mfr.record = bre.id;\r
1702                                         QUERY PLAN\r
1703 ----------------------------------------------------------------------------------------//\r
1704  Hash Join  (cost=750229.86..5829273.98 rows=65766704 width=1379)\r
1705    Hash Cond: (real_full_rec.record = bre.id)\r
1706    -&gt;  Seq Scan on real_full_rec  (cost=0.00..1640972.04 rows=65766704 width=87)\r
1707    -&gt;  Hash  (cost=400634.16..400634.16 rows=2013416 width=1292)\r
1708          -&gt;  Seq Scan on record_entry bre  (cost=0.00..400634.16 rows=2013416 width=1292)\r
1709 (5 rows)\r
1710 </programlisting>\r
1711                 </para></formalpara>\r
1712                 <simpara>This time, we will return 65766704 rows - still way too many rows. We forgot\r
1713                 to include a <literal>WHERE</literal> clause to limit the results to something meaningful. In\r
1714                 the following example, we will limit the results to deleted records that were\r
1715                 modified in the last month.</simpara>\r
1716                 <formalpara><title>Query plan for a realistic query</title><para>\r
1717 <programlisting language="sql" linenumbering="unnumbered">\r
1718 EXPLAIN SELECT *\r
1719   FROM biblio.record_entry bre\r
1720     INNER JOIN metabib.full_rec mfr ON mfr.record = bre.id\r
1721   WHERE bre.deleted IS TRUE\r
1722     AND DATE_TRUNC('MONTH', bre.edit_date) &gt;\r
1723         DATE_TRUNC ('MONTH', NOW() - '1 MONTH'::INTERVAL)\r
1724 ;\r
1725 \r
1726                                         QUERY PLAN\r
1727 ----------------------------------------------------------------------------------------//\r
1728  Hash Join  (cost=5058.86..2306218.81 rows=201669 width=1379)\r
1729    Hash Cond: (real_full_rec.record = bre.id)\r
1730    -&gt;  Seq Scan on real_full_rec  (cost=0.00..1640972.04 rows=65766704 width=87)\r
1731    -&gt;  Hash  (cost=4981.69..4981.69 rows=6174 width=1292)\r
1732          -&gt;  Index Scan using biblio_record_entry_deleted on record_entry bre\r
1733                  (cost=0.00..4981.69 rows=6174 width=1292)\r
1734                Index Cond: (deleted = true)\r
1735                Filter: ((deleted IS TRUE) AND (date_trunc('MONTH'::text, edit_date)\r
1736                  &gt; date_trunc('MONTH'::text, (now() - '1 mon'::interval))))\r
1737 (7 rows)\r
1738 </programlisting>\r
1739                 </para></formalpara>\r
1740                 <simpara>We can see that the number of rows returned is now only 201669; that&#8217;s\r
1741                 something we can work with. Also, the overall cost of the query is 2306218,\r
1742                 compared to 4959156437783 in the original query. The <literal>Index Scan</literal> tells us\r
1743                 that the query planner will use the index that was defined on the <literal>deleted</literal>\r
1744                 column to avoid having to check every row in the <literal>biblio.record_entry</literal> table.</simpara>\r
1745                 <simpara>However, we are still running a sequential scan over the\r
1746                 <literal>metabib.real_full_rec</literal> table (the table on which the <literal>metabib.full_rec</literal>\r
1747                 view is based). Given that linking from the bibliographic records to the\r
1748                 flattened MARC subfields is a fairly common operation, we could create a\r
1749                 new index and see if that speeds up our query plan.</simpara>\r
1750                 <formalpara><title>Query plan with optimized access via a new index</title><para>\r
1751 <programlisting language="sql" linenumbering="unnumbered">\r
1752 -- This index will take a long time to create on a large database\r
1753 -- of bibliographic records\r
1754 CREATE INDEX bib_record_idx ON metabib.real_full_rec (record);\r
1755 \r
1756 EXPLAIN SELECT *\r
1757   FROM biblio.record_entry bre\r
1758     INNER JOIN metabib.full_rec mfr ON mfr.record = bre.id\r
1759   WHERE bre.deleted IS TRUE\r
1760     AND DATE_TRUNC('MONTH', bre.edit_date) &gt;\r
1761         DATE_TRUNC ('MONTH', NOW() - '1 MONTH'::INTERVAL)\r
1762 ;\r
1763 \r
1764                                         QUERY PLAN\r
1765 ----------------------------------------------------------------------------------------//\r
1766  Nested Loop  (cost=0.00..1558330.46 rows=201669 width=1379)\r
1767    -&gt;  Index Scan using biblio_record_entry_deleted on record_entry bre\r
1768            (cost=0.00..4981.69 rows=6174 width=1292)\r
1769          Index Cond: (deleted = true)\r
1770          Filter: ((deleted IS TRUE) AND (date_trunc('MONTH'::text, edit_date) &gt;\r
1771            date_trunc('MONTH'::text, (now() - '1 mon'::interval))))\r
1772    -&gt;  Index Scan using bib_record_idx on real_full_rec\r
1773           (cost=0.00..240.89 rows=850 width=87)\r
1774          Index Cond: (real_full_rec.record = bre.id)\r
1775 (6 rows)\r
1776 </programlisting>\r
1777                 </para></formalpara>\r
1778                 <simpara>We can see that the resulting number of rows is still the same (201669), but\r
1779                 the execution estimate has dropped to 1558330 because the query planner can\r
1780                 use the new index (<literal>bib_record_idx</literal>) rather than scanning the entire table.\r
1781                 Success!</simpara>\r
1782                 <note><simpara>While indexes can significantly speed up read access to tables for common\r
1783                 filtering conditions, every time a row is created or updated the corresponding\r
1784                 indexes also need to be maintained - which can decrease the performance of\r
1785                 writes to the database. Be careful to keep the balance of read performance\r
1786                 versus write performance in mind if you plan to create custom indexes in your\r
1787                 Evergreen database.</simpara></note>\r
1788         </section>\r
1789         <section id="inserting_updating_and_deleting_data">\r
1790                 <title>Inserting, updating, and deleting data</title>\r
1791                 <simplesect id="_inserting_data">\r
1792                         <title>Inserting data</title>\r
1793                         <simpara>To insert one or more rows into a table, use the INSERT statement to identify\r
1794                         the target table and list the columns in the table for which you are going to\r
1795                         provide values for each row. If you do not list one or more columns contained\r
1796                         in the table, the database will automatically supply a <literal>NULL</literal> value for those\r
1797                         columns. The values for each row follow the <literal>VALUES</literal> clause and are grouped in\r
1798                         parentheses and delimited by commas. Each row, in turn, is delimited by commas\r
1799                         (<emphasis>this multiple row syntax requires PostgreSQL 8.2 or higher</emphasis>).</simpara>\r
1800                         <simpara>For example, to insert two rows into the <literal>permission.usr_grp_map</literal> table:</simpara>\r
1801                         <formalpara><title>Inserting rows into the <literal>permission.usr_grp_map</literal> table</title><para>\r
1802                         <programlisting language="sql" linenumbering="unnumbered">INSERT INTO permission.usr_grp_map (usr, grp)\r
1803                           VALUES (2, 10), (2, 4)\r
1804                         ;</programlisting>\r
1805                         </para></formalpara>\r
1806                         <simpara>Of course, as with the rest of SQL, you can replace individual column values\r
1807                         with one or more use sub-selects:</simpara>\r
1808                         <formalpara><title>Inserting rows using sub-selects instead of integers</title><para>\r
1809 <programlisting language="sql" linenumbering="unnumbered">\r
1810 INSERT INTO permission.usr_grp_map (usr, grp)\r
1811   VALUES (\r
1812     (SELECT id FROM actor.usr\r
1813        WHERE family_name = 'Scott' AND first_given_name = 'Daniel'),\r
1814     (SELECT id FROM permission.grp_tree\r
1815        WHERE name = 'Local System Administrator')\r
1816   ), (\r
1817     (SELECT id FROM actor.usr\r
1818        WHERE family_name = 'Scott' AND first_given_name = 'Daniel'),\r
1819     (SELECT id FROM permission.grp_tree\r
1820        WHERE name = 'Circulator')\r
1821   )\r
1822 ;\r
1823 </programlisting>\r
1824                         </para></formalpara>\r
1825                 </simplesect>\r
1826                 <simplesect id="_inserting_data_using_a_select_statement">\r
1827                         <title>Inserting data using a SELECT statement</title>\r
1828                         <simpara>Sometimes you want to insert a bulk set of data into a new table based on\r
1829                         a query result. Rather than a <literal>VALUES</literal> clause, you can use a <literal>SELECT</literal>\r
1830                         statement to insert one or more rows matching the column definitions. This\r
1831                         is a good time to point out that you can include explicit values, instead\r
1832                         of just column identifiers, in the return columns of the <literal>SELECT</literal> statement.\r
1833                         The explicit values are returned in every row of the result set.</simpara>\r
1834                         <simpara>In the following example, we insert 6 rows into the <literal>permission.usr_grp_map</literal>\r
1835                         table; each row will have a <literal>usr</literal> column value of 1, with varying values for\r
1836                         the <literal>grp</literal> column value based on the <literal>id</literal> column values returned from\r
1837                         <literal>permission.grp_tree</literal>:</simpara>\r
1838                         <formalpara><title>Inserting rows via a <literal>SELECT</literal> statement</title><para>\r
1839 <programlisting language="sql" linenumbering="unnumbered">\r
1840 INSERT INTO permission.usr_grp_map (usr, grp)\r
1841   SELECT 1, id\r
1842     FROM permission.grp_tree\r
1843     WHERE id &gt; 2\r
1844 ;\r
1845 \r
1846 INSERT 0 6\r
1847 </programlisting>\r
1848                         </para></formalpara>\r
1849                 </simplesect>\r
1850                 <simplesect id="_deleting_rows">\r
1851                         <title>Deleting rows</title>\r
1852                         <simpara>Deleting data from a table is normally fairly easy. To delete rows from a table,\r
1853                         issue a <literal>DELETE</literal> statement identifying the table from which you want to delete\r
1854                         rows and a <literal>WHERE</literal> clause identifying the row or rows that should be deleted.</simpara>\r
1855                         <simpara>In the following example, we delete all of the rows from the\r
1856                         <literal>permission.grp_perm_map</literal> table where the permission maps to\r
1857                         <literal>UPDATE_ORG_UNIT_CLOSING</literal> and the group is anything other than administrators:</simpara>\r
1858                         <formalpara><title>Deleting rows from a table</title><para>\r
1859 <programlisting language="sql" linenumbering="unnumbered">\r
1860 DELETE FROM permission.grp_perm_map\r
1861   WHERE grp IN (\r
1862     SELECT id\r
1863       FROM permission.grp_tree\r
1864       WHERE name != 'Local System Administrator'\r
1865   ) AND perm = (\r
1866     SELECT id\r
1867       FROM permission.perm_list\r
1868               WHERE code = 'UPDATE_ORG_UNIT_CLOSING'\r
1869           )\r
1870 ;\r
1871 </programlisting>\r
1872                         </para></formalpara>\r
1873                         <note><simpara>There are two main reasons that a <literal>DELETE</literal> statement may not actually\r
1874                         delete rows from a table, even when the rows meet the conditional clause.</simpara></note>\r
1875                         <orderedlist numeration="arabic">\r
1876                         <listitem>\r
1877                         <simpara>\r
1878                         If the row contains a value that is the target of a relational constraint,\r
1879                         for example, if another table has a foreign key pointing at your target\r
1880                         table, you will be prevented from deleting a row with a value corresponding\r
1881                         to a row in the dependent table.\r
1882                         </simpara>\r
1883                         </listitem>\r
1884                         <listitem>\r
1885                         <simpara>\r
1886                         If the table has a rule that substitutes a different action for a <literal>DELETE</literal>\r
1887                         statement, the deletion will not take place. In Evergreen it is common for a\r
1888                         table to have a rule that substitutes the action of setting a <literal>deleted</literal> column\r
1889                         to <literal>TRUE</literal>. For example, if a book is discarded, deleting the row representing\r
1890                         the copy from the <literal>asset.copy</literal> table would severely affect circulation statistics,\r
1891                         bills, borrowing histories, and their corresponding tables in the database that\r
1892                         have foreign keys pointing at the <literal>asset.copy</literal> table (<literal>action.circulation</literal> and\r
1893                         <literal>money.billing</literal> and its children respectively). Instead, the <literal>deleted</literal> column\r
1894                         value is set to <literal>TRUE</literal> and Evergreen&#8217;s application logic skips over these rows\r
1895                         in most cases.\r
1896                         </simpara>\r
1897                         </listitem>\r
1898                         </orderedlist>\r
1899                 </simplesect>\r
1900                 <simplesect id="_updating_rows">\r
1901                         <title>Updating rows</title>\r
1902                         <simpara>To update rows in a table, issue an <literal>UPDATE</literal> statement identifying the table\r
1903                         you want to update, the column or columns that you want to set with their\r
1904                         respective new values, and (optionally) a <literal>WHERE</literal> clause identifying the row or\r
1905                         rows that should be updated.</simpara>\r
1906                         <simpara>Following is the syntax for the <literal>UPDATE</literal> statement:</simpara>\r
1907                         <blockquote>\r
1908                         <literallayout><literal>UPDATE</literal> [<emphasis>table-name</emphasis>]\r
1909                           <literal>SET</literal> [<emphasis>column</emphasis>] <literal>TO</literal> [<emphasis>new-value</emphasis>]\r
1910                           <literal>WHERE</literal> [<emphasis>condition</emphasis>]\r
1911                         ;</literallayout>\r
1912                         </blockquote>\r
1913                 </simplesect>\r
1914         </section>\r
1915         <section id="query_requests">\r
1916                 <title>Query requests</title>\r
1917                 <simpara>The following queries were requested by Bibliomation, but might be reusable\r
1918                 by other libraries.</simpara>\r
1919                 <simplesect id="_monthly_circulation_stats_by_collection_code_library">\r
1920                         <title>Monthly circulation stats by collection code / library</title>\r
1921                         <formalpara><title>Monthly Circulation Stats by Collection Code/Library</title><para>\r
1922 <programlisting language="sql" linenumbering="unnumbered">\r
1923 SELECT COUNT(acirc.id) AS "COUNT", aou.name AS "Library", acl.name AS "Copy Location"\r
1924   FROM asset.copy ac\r
1925     INNER JOIN asset.copy_location acl ON ac.location = acl.id\r
1926     INNER JOIN action.circulation acirc ON acirc.target_copy = ac.id\r
1927     INNER JOIN actor.org_unit aou ON acirc.circ_lib = aou.id\r
1928   WHERE DATE_TRUNC('MONTH', acirc.create_time) = DATE_TRUNC('MONTH', NOW() - INTERVAL '3 month')\r
1929     AND acirc.desk_renewal IS FALSE\r
1930     AND acirc.opac_renewal IS FALSE\r
1931     AND acirc.phone_renewal IS FALSE\r
1932   GROUP BY aou.name, acl.name\r
1933   ORDER BY aou.name, acl.name, 1\r
1934 ;\r
1935 </programlisting>\r
1936                         </para></formalpara>\r
1937                 </simplesect>\r
1938                 <simplesect id="_monthly_circulation_stats_by_borrower_stat_library">\r
1939                         <title>Monthly circulation stats by borrower stat / library</title>\r
1940                         <formalpara><title>Monthly Circulation Stats by Borrower Stat/Library</title><para>\r
1941 <programlisting language="sql" linenumbering="unnumbered">\r
1942 SELECT COUNT(acirc.id) AS "COUNT", aou.name AS "Library", asceum.stat_cat_entry AS "Borrower Stat"\r
1943   FROM action.circulation acirc\r
1944     INNER JOIN actor.org_unit aou ON acirc.circ_lib = aou.id\r
1945     INNER JOIN actor.stat_cat_entry_usr_map asceum ON asceum.target_usr = acirc.usr\r
1946     INNER JOIN actor.stat_cat astat ON asceum.stat_cat = astat.id\r
1947   WHERE DATE_TRUNC('MONTH', acirc.create_time) = DATE_TRUNC('MONTH', NOW() - INTERVAL '3 month')\r
1948     AND astat.name = 'Preferred language'\r
1949     AND acirc.desk_renewal IS FALSE\r
1950     AND acirc.opac_renewal IS FALSE\r
1951     AND acirc.phone_renewal IS FALSE\r
1952   GROUP BY aou.name, asceum.stat_cat_entry\r
1953   ORDER BY aou.name, asceum.stat_cat_entry, 1\r
1954 ;\r
1955 </programlisting>\r
1956                         </para></formalpara>\r
1957                 </simplesect>\r
1958                 <simplesect id="_monthly_intralibrary_loan_stats_by_library">\r
1959                         <title>Monthly intralibrary loan stats by library</title>\r
1960                         <formalpara><title>Monthly Intralibrary Loan Stats by Library</title><para>\r
1961 <programlisting language="sql" linenumbering="unnumbered">\r
1962 SELECT aou.name AS "Library", COUNT(acirc.id)\r
1963   FROM action.circulation acirc\r
1964     INNER JOIN actor.org_unit aou ON acirc.circ_lib = aou.id\r
1965     INNER JOIN asset.copy ac ON acirc.target_copy = ac.id\r
1966     INNER JOIN asset.call_number acn ON ac.call_number = acn.id\r
1967   WHERE acirc.circ_lib != acn.owning_lib\r
1968     AND DATE_TRUNC('MONTH', acirc.create_time) = DATE_TRUNC('MONTH', NOW() - INTERVAL '3 month')\r
1969     AND acirc.desk_renewal IS FALSE\r
1970     AND acirc.opac_renewal IS FALSE\r
1971     AND acirc.phone_renewal IS FALSE\r
1972   GROUP by aou.name\r
1973   ORDER BY aou.name, 2\r
1974 ;\r
1975 </programlisting>\r
1976                         </para></formalpara>\r
1977                 </simplesect>\r
1978                 <simplesect id="_monthly_borrowers_added_by_profile_adult_child_etc_library">\r
1979                         <title>Monthly borrowers added by profile (adult, child, etc) / library</title>\r
1980                         <formalpara><title>Monthly Borrowers Added by Profile (Adult, Child, etc)/Library</title><para>\r
1981 <programlisting language="sql" linenumbering="unnumbered">\r
1982 SELECT pgt.name AS "Profile", aou.name AS "Library", COUNT(pgt.name) AS "Count"\r
1983   FROM actor.usr au\r
1984     INNER JOIN permission.grp_tree pgt\r
1985       ON au.profile = pgt.id\r
1986     INNER JOIN actor.org_unit aou\r
1987       ON aou.id = au.home_ou\r
1988   WHERE au.deleted IS FALSE\r
1989     AND DATE_TRUNC('MONTH', au.create_date) = DATE_TRUNC('MONTH', NOW() - '3 months'::interval)\r
1990   GROUP BY pgt.name, aou.name\r
1991   ORDER BY aou.name, pgt.name\r
1992 ;\r
1993 </programlisting>\r
1994                         </para></formalpara>\r
1995                 </simplesect>\r
1996                 <simplesect id="_borrower_count_by_profile_adult_child_etc_library">\r
1997                         <title>Borrower count by profile (adult, child, etc) / library</title>\r
1998                         <formalpara><title>Borrower Count by Profile (Adult, Child, etc)/Library</title><para>\r
1999 <programlisting language="sql" linenumbering="unnumbered">\r
2000 SELECT pgt.name AS "Profile", aou.name AS "Library", COUNT(pgt.name) AS "Count"\r
2001   FROM actor.usr au\r
2002     INNER JOIN permission.grp_tree pgt\r
2003       ON au.profile = pgt.id\r
2004     INNER JOIN actor.org_unit aou\r
2005       ON aou.id = au.home_ou\r
2006   WHERE au.deleted IS FALSE\r
2007   GROUP BY pgt.name, aou.name\r
2008   ORDER BY aou.name, pgt.name\r
2009 ;\r
2010 </programlisting>\r
2011                         </para></formalpara>\r
2012                 </simplesect>\r
2013                 <simplesect id="_monthly_items_added_by_collection_library">\r
2014                         <title>Monthly items added by collection / library</title>\r
2015                         <simpara>We define a <quote>collection</quote> as a shelving location in Evergreen.</simpara>\r
2016                         <formalpara><title>Monthly Items Added by Collection/Library</title><para>\r
2017 <programlisting language="sql" linenumbering="unnumbered">\r
2018 SELECT aou.name AS "Library", acl.name, COUNT(ac.barcode)\r
2019   FROM actor.org_unit aou\r
2020     INNER JOIN asset.call_number acn ON acn.owning_lib = aou.id\r
2021     INNER JOIN asset.copy ac ON ac.call_number = acn.id\r
2022     INNER JOIN asset.copy_location acl ON ac.location = acl.id\r
2023   WHERE ac.deleted IS FALSE\r
2024     AND acn.deleted IS FALSE\r
2025     AND DATE_TRUNC('MONTH', ac.create_date) = DATE_TRUNC('MONTH', NOW() - '1 month'::interval)\r
2026   GROUP BY aou.name, acl.name\r
2027   ORDER BY aou.name, acl.name\r
2028 ;\r
2029 </programlisting>\r
2030                         </para></formalpara>\r
2031                 </simplesect>\r
2032                 <simplesect id="_hold_purchase_alert_by_library">\r
2033                         <title>Hold purchase alert by library</title>\r
2034                         <simpara>in the following set of queries, we bring together the active title, volume,\r
2035                         and copy holds and display those that have more than a certain number of holds\r
2036                         per title. The goal is to UNION ALL the three queries, then group by the\r
2037                         bibliographic record ID and display the title / author information for those\r
2038                         records that have more than a given threshold of holds.</simpara>\r
2039                         <formalpara><title>Hold Purchase Alert by Library</title><para>\r
2040 <programlisting language="sql" linenumbering="unnumbered">\r
2041 -- Title holds\r
2042 SELECT all_holds.bib_id, aou.name, rmsr.title, rmsr.author, COUNT(all_holds.bib_id)\r
2043   FROM\r
2044     (\r
2045       (\r
2046         SELECT target, request_lib\r
2047           FROM action.hold_request\r
2048           WHERE hold_type = 'T'\r
2049             AND fulfillment_time IS NULL\r
2050             AND cancel_time IS NULL\r
2051       )\r
2052       UNION ALL\r
2053       -- Volume holds\r
2054       (\r
2055         SELECT bre.id, request_lib\r
2056           FROM action.hold_request ahr\r
2057             INNER JOIN asset.call_number acn ON ahr.target = acn.id\r
2058             INNER JOIN biblio.record_entry bre ON acn.record = bre.id\r
2059           WHERE ahr.hold_type = 'V'\r
2060             AND ahr.fulfillment_time IS NULL\r
2061             AND ahr.cancel_time IS NULL\r
2062       )\r
2063       UNION ALL\r
2064       -- Copy holds\r
2065       (\r
2066         SELECT bre.id, request_lib\r
2067           FROM action.hold_request ahr\r
2068             INNER JOIN asset.copy ac ON ahr.target = ac.id\r
2069             INNER JOIN asset.call_number acn ON ac.call_number = acn.id\r
2070             INNER JOIN biblio.record_entry bre ON acn.record = bre.id\r
2071           WHERE ahr.hold_type = 'C'\r
2072             AND ahr.fulfillment_time IS NULL\r
2073             AND ahr.cancel_time IS NULL\r
2074       )\r
2075     ) AS all_holds(bib_id, request_lib)\r
2076   INNER JOIN reporter.materialized_simple_record rmsr\r
2077   INNER JOIN actor.org_unit aou ON aou.id = all_holds.request_lib\r
2078     ON rmsr.id = all_holds.bib_id\r
2079   GROUP BY all_holds.bib_id, aou.name, rmsr.id, rmsr.title, rmsr.author\r
2080   HAVING COUNT(all_holds.bib_id) &gt; 2\r
2081   ORDER BY aou.name\r
2082 ;\r
2083 </programlisting>\r
2084                         </para></formalpara>\r
2085                 </simplesect>\r
2086                 <simplesect id="_update_borrower_records_with_a_different_home_library">\r
2087                         <title>Update borrower records with a different home library</title>\r
2088                         <simpara>In this example, the library has opened a new branch in a growing area,\r
2089                         and wants to reassign the home library for the patrons in the vicinity of\r
2090                         the new branch to the new branch. To accomplish this, we create a staging table\r
2091                         that holds a set of city names and the corresponding branch shortname for the home\r
2092                         library for each city.</simpara>\r
2093                         <simpara>Then we issue an <literal>UPDATE</literal> statement to set the home library for patrons with a\r
2094                         physical address with a city that matches the city names in our staging table.</simpara>\r
2095                         <formalpara><title>Update borrower records with a different home library</title><para>\r
2096 <programlisting language="sql" linenumbering="unnumbered">\r
2097 CREATE SCHEMA staging;\r
2098 CREATE TABLE staging.city_home_ou_map (city TEXT, ou_shortname TEXT,\r
2099   FOREIGN KEY (ou_shortname) REFERENCES actor.org_unit (shortname));\r
2100 INSERT INTO staging.city_home_ou_map (city, ou_shortname)\r
2101   VALUES ('Southbury', 'BR1'), ('Middlebury', 'BR2'), ('Hartford', 'BR3');\r
2102 BEGIN;\r
2103 \r
2104 UPDATE actor.usr au SET home_ou = COALESCE(\r
2105   (\r
2106     SELECT aou.id\r
2107       FROM actor.org_unit aou\r
2108         INNER JOIN staging.city_home_ou_map schom ON schom.ou_shortname = aou.shortname\r
2109         INNER JOIN actor.usr_address aua ON aua.city = schom.city\r
2110       WHERE au.id = aua.usr\r
2111       GROUP BY aou.id\r
2112   ), home_ou)\r
2113 WHERE (\r
2114   SELECT aou.id\r
2115     FROM actor.org_unit aou\r
2116       INNER JOIN staging.city_home_ou_map schom ON schom.ou_shortname = aou.shortname\r
2117       INNER JOIN actor.usr_address aua ON aua.city = schom.city\r
2118     WHERE au.id = aua.usr\r
2119     GROUP BY aou.id\r
2120 ) IS NOT NULL;\r
2121 </programlisting>\r
2122                         </para></formalpara>\r
2123                 </simplesect>\r
2124         </section>\r
2125         \r
2126 </chapter>\r