1.6/development/introduction_to_sql.xml

   1 <?xml version="1.0" encoding="UTF-8"?>\r
   2 <chapter xml:id="intro_to_sql" xmlns="http://docbook.org/ns/docbook" version="5.0" xml:lang="EN"\r
   3     xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:xlink="http://www.w3.org/1999/xlink">\r
   4         <chapterinfo>\r
   5                 <title>Introduction to SQL for Evergreen Administrators</title>\r
   6         </chapterinfo>\r
   7         <abstract id="itnroSQL_abstract">\r
   8                 <simpara>This chapter was taken from Dan Scott's <emphasis>Introduction to SQL for Evergreen Administrators</emphasis>, February 2010.</simpara>\r
   9         </abstract>     \r
  10         <section id="intro_to_databases">\r
  11                         <title>Introduction to SQL Databases</title>\r
  12                 <simplesect>\r
  13                         <title>Introduction</title>\r
  14                         <simpara>Over time, the SQL database has become the standard method of storing,\r
  15                         retrieving, and processing raw data for applications. Ranging from embedded\r
  16                         databases such as SQLite and Apache Derby, to enterprise databases such as\r
  17                         Oracle and IBM DB2, any SQL database offers basic advantages to application\r
  18                         developers such as standard interfaces (Structured Query Language (SQL), Java\r
  19                         Database Connectivity (JDBC), Open Database Connectivity (ODBC), Perl Database\r
  20                         Independent Interface (DBI)), a standard conceptual model of data (tables,\r
  21                         fields, relationships, constraints, etc), performance in storing and retrieving\r
  22                         data, concurrent access, etc.</simpara>\r
  23                         <simpara>Evergreen is built on PostgreSQL, an open source SQL database that began as\r
  24                         <literal>POSTGRES</literal> at the University of California at Berkeley in 1986 as a research\r
  25                         project led by Professor Michael Stonebraker. A SQL interface was added to a\r
  26                         fork of the original POSTGRES Berkelely code in 1994, and in 1996 the project\r
  27                         was renamed PostgreSQL.</simpara>\r
  28                 </simplesect>\r
  29                 <simplesect id="_tables">\r
  30                         <title>Tables</title>\r
  31                         <simpara>The table is the cornerstone of a SQL database. Conceptually, a database table\r
  32                         is similar to a single sheet in a spreadsheet: every table has one or more\r
  33                         columns, with each row in the table containing values for each column. Each\r
  34                         column in a table defines an attribute corresponding to a particular data type.</simpara>\r
  35                         <simpara>We&#8217;ll insert a row into a table, then display the resulting contents. Don&#8217;t\r
  36                         worry if the INSERT statement is completely unfamiliar, we&#8217;ll talk more about\r
  37                         the syntax of the insert statement later.</simpara>\r
  38                         <formalpara><title><literal>actor.usr_note</literal> database table</title><para>\r
  39 <programlisting language="sql" linenumbering="unnumbered">\r
  40 evergreen=# INSERT INTO actor.usr_note (usr, creator, pub, title, value)\r
  41     VALUES (1, 1, TRUE, 'Who is this guy?', 'He''s the administrator!');\r
  42 \r
  43 evergreen=# select id, usr, creator, pub, title, value from actor.usr_note;\r
  44  id | usr | creator | pub |      title       |          value\r
  45 ----+-----+---------+-----+------------------+-------------------------\r
  46   1 |   1 |       1 | t   | Who is this guy? | He's the administrator!\r
  47 (1 rows)\r
  48 </programlisting>\r
  49                         </para></formalpara>\r
  50                         <simpara>PostgreSQL supports table inheritance, which lets you define tables that\r
  51                         inherit the column definitions of a given parent table. A search of the data in\r
  52                         the parent table includes the data in the child tables. Evergreen uses table\r
  53                         inheritance: for example, the <literal>action.circulation</literal> table is a child of the\r
  54                         <literal>money.billable_xact</literal> table, and the <literal>money.*_payment</literal> tables all inherit from\r
  55                         the <literal>money.payment</literal> parent table.</simpara>\r
  56                 </simplesect>\r
  57                 <simplesect id="_schemas">\r
  58                         <title>Schemas</title>\r
  59                         <simpara>PostgreSQL, like most SQL databases, supports the use of schema names to group\r
  60                         collections of tables and other database objects together. You might think of\r
  61                         schemas as namespaces if you&#8217;re a programmer; or you might think of the schema\r
  62                         / table / column relationship like the area code / exchange / local number\r
  63                         structure of a telephone number.</simpara>\r
  64                         <table\r
  65                         frame="all"\r
  66                         rowsep="1" colsep="1"\r
  67                         >\r
  68                         <title>Examples: database object names</title>\r
  69                         <?dbhtml table-width="80%"?>\r
  70                         <?dbfo table-width="80%"?>\r
  71                         <tgroup cols="4">\r
  72                         <colspec colname="col_1" colwidth="1.0*"/>\r
  73                         <colspec colname="col_2" colwidth="1.0*"/>\r
  74                         <colspec colname="col_3" colwidth="1.0*"/>\r
  75                         <colspec colname="col_4" colwidth="1.0*"/>\r
  76                         <thead>\r
  77                         <row>\r
  78                         <entry align="left" valign="top">Full name </entry>\r
  79                         <entry align="left" valign="top">Schema name </entry>\r
  80                         <entry align="left" valign="top">Table name </entry>\r
  81                         <entry align="left" valign="top">Field name</entry>\r
  82                         </row>\r
  83                         </thead>\r
  84                         <tbody>\r
  85                         <row>\r
  86                         <entry align="left" valign="top"><simpara>actor.usr_note.title</simpara></entry>\r
  87                         <entry align="left" valign="top"><simpara>actor</simpara></entry>\r
  88                         <entry align="left" valign="top"><simpara>usr_note</simpara></entry>\r
  89                         <entry align="left" valign="top"><simpara>title</simpara></entry>\r
  90                         </row>\r
  91                         <row>\r
  92                         <entry align="left" valign="top"><simpara>biblio.record_entry.marc</simpara></entry>\r
  93                         <entry align="left" valign="top"><simpara>biblio</simpara></entry>\r
  94                         <entry align="left" valign="top"><simpara>record_entry</simpara></entry>\r
  95                         <entry align="left" valign="top"><simpara>marc</simpara></entry>\r
  96                         </row>\r
  97                         </tbody>\r
  98                         </tgroup>\r
  99                         </table>\r
 100                         <simpara>The default schema name in PostgreSQL is <literal>public</literal>, so if you do not specify a\r
 101                         schema name when creating or accessing a database object, PostgreSQL will use\r
 102                         the <literal>public</literal> schema. As a result, you might not find the object that you&#8217;re\r
 103                         looking for if you don&#8217;t use the appropriate schema.</simpara>\r
 104                         <formalpara><title>Example: Creating a table without a specific schema</title><para>\r
 105 <programlisting language="sql" linenumbering="unnumbered">\r
 106 evergreen=# CREATE TABLE foobar (foo TEXT, bar TEXT);\r
 107 CREATE TABLE\r
 108 evergreen=# \d foobar\r
 109    Table "public.foobar"\r
 110  Column | Type | Modifiers\r
 111 --------+------+-----------\r
 112  foo    | text |\r
 113  bar    | text |\r
 114 </programlisting>\r
 115                         </para></formalpara>\r
 116                         <formalpara><title>Example: Trying to access a unqualified table outside of the public schema</title><para>\r
 117                         <programlisting language="sql" linenumbering="unnumbered">evergreen=# SELECT * FROM usr_note;\r
 118                         ERROR:  relation "usr_note" does not exist\r
 119                         LINE 1: SELECT * FROM usr_note;\r
 120                                               ^</programlisting>\r
 121                         </para></formalpara>\r
 122                         <simpara>Evergreen uses schemas to organize all of its tables with mostly intuitive,\r
 123                         if short, schema names. Here&#8217;s the current (as of 2010-01-03) list of schemas\r
 124                         used by Evergreen:</simpara>\r
 125                         <table\r
 126                         frame="all"\r
 127                         rowsep="1" colsep="1"\r
 128                         >\r
 129                         <title>Evergreen schema names</title>\r
 130                         <?dbhtml table-width="80%"?>\r
 131                         <?dbfo table-width="80%"?>\r
 132                         <tgroup cols="2">\r
 133                         <colspec colname="col_1" colwidth="1.0*"/>\r
 134                         <colspec colname="col_2" colwidth="1.0*"/>\r
 135                         <thead>\r
 136                         <row>\r
 137                         <entry align="left" valign="top">Schema name </entry>\r
 138                         <entry align="left" valign="top">Description</entry>\r
 139                         </row>\r
 140                         </thead>\r
 141                         <tbody>\r
 142                         <row>\r
 143                         <entry align="left" valign="top"><simpara><literal>acq</literal></simpara></entry>\r
 144                         <entry align="left" valign="top"><simpara>Acquisitions</simpara></entry>\r
 145                         </row>\r
 146                         <row>\r
 147                         <entry align="left" valign="top"><simpara><literal>action</literal></simpara></entry>\r
 148                         <entry align="left" valign="top"><simpara>Circulation actions</simpara></entry>\r
 149                         </row>\r
 150                         <row>\r
 151                         <entry align="left" valign="top"><simpara><literal>action_trigger</literal></simpara></entry>\r
 152                         <entry align="left" valign="top"><simpara>Event mechanisms</simpara></entry>\r
 153                         </row>\r
 154                         <row>\r
 155                         <entry align="left" valign="top"><simpara><literal>actor</literal></simpara></entry>\r
 156                         <entry align="left" valign="top"><simpara>Evergreen users and organization units</simpara></entry>\r
 157                         </row>\r
 158                         <row>\r
 159                         <entry align="left" valign="top"><simpara><literal>asset</literal></simpara></entry>\r
 160                         <entry align="left" valign="top"><simpara>Call numbers and copies</simpara></entry>\r
 161                         </row>\r
 162                         <row>\r
 163                         <entry align="left" valign="top"><simpara><literal>auditor</literal></simpara></entry>\r
 164                         <entry align="left" valign="top"><simpara>Track history of changes to selected tables</simpara></entry>\r
 165                         </row>\r
 166                         <row>\r
 167                         <entry align="left" valign="top"><simpara><literal>authority</literal></simpara></entry>\r
 168                         <entry align="left" valign="top"><simpara>Authority records</simpara></entry>\r
 169                         </row>\r
 170                         <row>\r
 171                         <entry align="left" valign="top"><simpara><literal>biblio</literal></simpara></entry>\r
 172                         <entry align="left" valign="top"><simpara>Bibliographic records</simpara></entry>\r
 173                         </row>\r
 174                         <row>\r
 175                         <entry align="left" valign="top"><simpara><literal>booking</literal></simpara></entry>\r
 176                         <entry align="left" valign="top"><simpara>Resource bookings</simpara></entry>\r
 177                         </row>\r
 178                         <row>\r
 179                         <entry align="left" valign="top"><simpara><literal>config</literal></simpara></entry>\r
 180                         <entry align="left" valign="top"><simpara>Evergreen configurable options</simpara></entry>\r
 181                         </row>\r
 182                         <row>\r
 183                         <entry align="left" valign="top"><simpara><literal>container</literal></simpara></entry>\r
 184                         <entry align="left" valign="top"><simpara>Buckets for records, call numbers, copies, and users</simpara></entry>\r
 185                         </row>\r
 186                         <row>\r
 187                         <entry align="left" valign="top"><simpara><literal>extend_reporter</literal></simpara></entry>\r
 188                         <entry align="left" valign="top"><simpara>Extra views for report definitions</simpara></entry>\r
 189                         </row>\r
 190                         <row>\r
 191                         <entry align="left" valign="top"><simpara><literal>metabib</literal></simpara></entry>\r
 192                         <entry align="left" valign="top"><simpara>Metadata about bibliographic records</simpara></entry>\r
 193                         </row>\r
 194                         <row>\r
 195                         <entry align="left" valign="top"><simpara><literal>money</literal></simpara></entry>\r
 196                         <entry align="left" valign="top"><simpara>Fines and bills</simpara></entry>\r
 197                         </row>\r
 198                         <row>\r
 199                         <entry align="left" valign="top"><simpara><literal>offline</literal></simpara></entry>\r
 200                         <entry align="left" valign="top"><simpara>Offline transactions</simpara></entry>\r
 201                         </row>\r
 202                         <row>\r
 203                         <entry align="left" valign="top"><simpara><literal>permission</literal></simpara></entry>\r
 204                         <entry align="left" valign="top"><simpara>User permissions</simpara></entry>\r
 205                         </row>\r
 206                         <row>\r
 207                         <entry align="left" valign="top"><simpara><literal>query</literal></simpara></entry>\r
 208                         <entry align="left" valign="top"><simpara>Stored SQL statements</simpara></entry>\r
 209                         </row>\r
 210                         <row>\r
 211                         <entry align="left" valign="top"><simpara><literal>reporter</literal></simpara></entry>\r
 212                         <entry align="left" valign="top"><simpara>Report definitions</simpara></entry>\r
 213                         </row>\r
 214                         <row>\r
 215                         <entry align="left" valign="top"><simpara><literal>search</literal></simpara></entry>\r
 216                         <entry align="left" valign="top"><simpara>Search functions</simpara></entry>\r
 217                         </row>\r
 218                         <row>\r
 219                         <entry align="left" valign="top"><simpara><literal>serial</literal></simpara></entry>\r
 220                         <entry align="left" valign="top"><simpara>Serial MFHD records</simpara></entry>\r
 221                         </row>\r
 222                         <row>\r
 223                         <entry align="left" valign="top"><simpara><literal>stats</literal></simpara></entry>\r
 224                         <entry align="left" valign="top"><simpara>Convenient views of circulation and asset statistics</simpara></entry>\r
 225                         </row>\r
 226                         <row>\r
 227                         <entry align="left" valign="top"><simpara><literal>vandelay</literal></simpara></entry>\r
 228                         <entry align="left" valign="top"><simpara>MARC batch importer and exporter</simpara></entry>\r
 229                         </row>\r
 230                         </tbody>\r
 231                         </tgroup>\r
 232                         </table>\r
 233                         <note><simpara>The term <emphasis>schema</emphasis> has two meanings in the world of SQL databases. We have\r
 234                         discussed the schema as a conceptual grouping of tables and other database\r
 235                         objects within a given namespace; for example, "the <emphasis role="strong">actor</emphasis> schema contains the\r
 236                         tables and functions related to users and organizational units". Another common\r
 237                         usage of <emphasis>schema</emphasis> is to refer to the entire data model for a given database;\r
 238                         for example, "the Evergreen database schema".</simpara></note>\r
 239                 </simplesect>\r
 240                 <simplesect id="_columns">\r
 241                         <title>Columns</title>\r
 242                         <simpara>Each column definition consists of:</simpara>\r
 243                         <itemizedlist>\r
 244                         <listitem>\r
 245                         <simpara>\r
 246                         a data type\r
 247                         </simpara>\r
 248                         </listitem>\r
 249                         <listitem>\r
 250                         <simpara>\r
 251                         (optionally) a default value to be used whenever a row is inserted that\r
 252                              does not contain a specific value\r
 253                         </simpara>\r
 254                         </listitem>\r
 255                         <listitem>\r
 256                         <simpara>\r
 257                         (optionally) one or more constraints on the values beyond data type\r
 258                         </simpara>\r
 259                         </listitem>\r
 260                         </itemizedlist>\r
 261                         <simpara>Although PostgreSQL supports dozens of data types, Evergreen makes our life\r
 262                         easier by only using a handful.</simpara>\r
 263                         <table\r
 264                         frame="all"\r
 265                         rowsep="1" colsep="1"\r
 266                         >\r
 267                         <title>PostgreSQL data types used by Evergreen</title>\r
 268                         <?dbhtml table-width="90%"?>\r
 269                         <?dbfo table-width="90%"?>\r
 270                         <tgroup cols="3">\r
 271                         <colspec colname="col_1" colwidth="1.0*"/>\r
 272                         <colspec colname="col_2" colwidth="1.0*"/>\r
 273                         <colspec colname="col_3" colwidth="2.5*"/>\r
 274                         <thead>\r
 275                         <row>\r
 276                         <entry align="left" valign="top">Type name   </entry>\r
 277                         <entry align="left" valign="top">Description    </entry>\r
 278                         <entry align="left" valign="top">Limits</entry>\r
 279                         </row>\r
 280                         </thead>\r
 281                         <tbody>\r
 282                         <row>\r
 283                         <entry align="left" valign="top"><simpara><literal>INTEGER</literal></simpara></entry>\r
 284                         <entry align="left" valign="top"><simpara>Medium integer</simpara></entry>\r
 285                         <entry align="left" valign="top"><simpara>-2147483648 to +2147483647</simpara></entry>\r
 286                         </row>\r
 287                         <row>\r
 288                         <entry align="left" valign="top"><simpara><literal>BIGINT</literal></simpara></entry>\r
 289                         <entry align="left" valign="top"><simpara>Large integer</simpara></entry>\r
 290                         <entry align="left" valign="top"><simpara>-9223372036854775808 to 9223372036854775807</simpara></entry>\r
 291                         </row>\r
 292                         <row>\r
 293                         <entry align="left" valign="top"><simpara><literal>SERIAL</literal></simpara></entry>\r
 294                         <entry align="left" valign="top"><simpara>Sequential integer</simpara></entry>\r
 295                         <entry align="left" valign="top"><simpara>1 to 2147483647</simpara></entry>\r
 296                         </row>\r
 297                         <row>\r
 298                         <entry align="left" valign="top"><simpara><literal>BIGSERIAL</literal></simpara></entry>\r
 299                         <entry align="left" valign="top"><simpara>Large sequential integer</simpara></entry>\r
 300                         <entry align="left" valign="top"><simpara>1 to 9223372036854775807</simpara></entry>\r
 301                         </row>\r
 302                         <row>\r
 303                         <entry align="left" valign="top"><simpara><literal>TEXT</literal></simpara></entry>\r
 304                         <entry align="left" valign="top"><simpara>Variable length character data</simpara></entry>\r
 305                         <entry align="left" valign="top"><simpara>Unlimited length</simpara></entry>\r
 306                         </row>\r
 307                         <row>\r
 308                         <entry align="left" valign="top"><simpara><literal>BOOL</literal></simpara></entry>\r
 309                         <entry align="left" valign="top"><simpara>Boolean</simpara></entry>\r
 310                         <entry align="left" valign="top"><simpara>TRUE or FALSE</simpara></entry>\r
 311                         </row>\r
 312                         <row>\r
 313                         <entry align="left" valign="top"><simpara><literal>TIMESTAMP WITH TIME ZONE</literal></simpara></entry>\r
 314                         <entry align="left" valign="top"><simpara>Timestamp</simpara></entry>\r
 315                         <entry align="left" valign="top"><simpara>4713 BC to 294276 AD</simpara></entry>\r
 316                         </row>\r
 317                         <row>\r
 318                         <entry align="left" valign="top"><simpara><literal>TIME</literal></simpara></entry>\r
 319                         <entry align="left" valign="top"><simpara>Time</simpara></entry>\r
 320                         <entry align="left" valign="top"><simpara>Expressed in HH:MM:SS</simpara></entry>\r
 321                         </row>\r
 322                         <row>\r
 323                         <entry align="left" valign="top"><simpara><literal>NUMERIC</literal>(precision, scale)</simpara></entry>\r
 324                         <entry align="left" valign="top"><simpara>Decimal</simpara></entry>\r
 325                         <entry align="left" valign="top"><simpara>Up to 1000 digits of precision. In Evergreen mostly used for money\r
 326                         values, with a precision of 6 and a scale of 2 (<literal>####.##</literal>).</simpara></entry>\r
 327                         </row>\r
 328                         </tbody>\r
 329                         </tgroup>\r
 330                         </table>\r
 331                         <simpara>Full details about these data types are available from the\r
 332                         <ulink url="http://www.postgresql.org/docs/8.4/static/datatype.html">data types section of\r
 333                         the PostgreSQL manual</ulink>.</simpara>\r
 334                 </simplesect>\r
 335                 <simplesect id="_constraints">\r
 336                         <title>Constraints</title>\r
 337                         <simplesect id="_prevent_null_values">\r
 338                                 <title>Prevent NULL values</title>\r
 339                                 <simpara>A column definition may include the constraint <literal>NOT NULL</literal> to prevent NULL\r
 340                                 values. In PostgreSQL, a NULL value is not the equivalent of zero or false or\r
 341                                 an empty string; it is an explicit non-value with special properties. We&#8217;ll\r
 342                                 talk more about how to work with NULL values when we get to queries.</simpara>\r
 343                         </simplesect>\r
 344                         <simplesect id="_primary_key">\r
 345                                 <title>Primary key</title>\r
 346                                 <simpara>Every table can have at most one primary key. A primary key consists of one or\r
 347                                 more columns which together uniquely identify each row in a table. If you\r
 348                                 attempt to insert a row into a table that would create a duplicate or NULL\r
 349                                 primary key entry, the database rejects the row and returns an error.</simpara>\r
 350                                 <simpara>Natural primary keys are drawn from the intrinsic properties of the data being\r
 351                                 modelled. For example, some potential natural primary keys for a table that\r
 352                                 contains people would be:</simpara>\r
 353                                 <table\r
 354                                 frame="all"\r
 355                                 rowsep="1" colsep="1"\r
 356                                 >\r
 357                                 <title>Example: Some potential natural primary keys for a table of people</title>\r
 358                                 <?dbhtml table-width="90%"?>\r
 359                                 <?dbfo table-width="90%"?>\r
 360                                 <tgroup cols="3">\r
 361                                 <colspec colname="col_1" colwidth="1.0*"/>\r
 362                                 <colspec colname="col_2" colwidth="2.0*"/>\r
 363                                 <colspec colname="col_3" colwidth="2.0*"/>\r
 364                                 <thead>\r
 365                                 <row>\r
 366                                 <entry align="left" valign="top">Natural key </entry>\r
 367                                 <entry align="left" valign="top">Pros </entry>\r
 368                                 <entry align="left" valign="top">Cons</entry>\r
 369                                 </row>\r
 370                                 </thead>\r
 371                                 <tbody>\r
 372                                 <row>\r
 373                                 <entry align="left" valign="top"><simpara>First name, last name, address</simpara></entry>\r
 374                                 <entry align="left" valign="top"><simpara>No two people with the same name would ever live at the same address, right?</simpara></entry>\r
 375                                 <entry align="left" valign="top"><simpara>Lots of columns force data duplication in referencing tables</simpara></entry>\r
 376                                 </row>\r
 377                                 <row>\r
 378                                 <entry align="left" valign="top"><simpara>SSN or driver&#8217;s license</simpara></entry>\r
 379                                 <entry align="left" valign="top"><simpara>These are guaranteed to be unique</simpara></entry>\r
 380                                 <entry align="left" valign="top"><simpara>Lots of people don&#8217;t have an SSN or a driver&#8217;s license</simpara></entry>\r
 381                                 </row>\r
 382                                 </tbody>\r
 383                                 </tgroup>\r
 384                                 </table>\r
 385                                 <simpara>To avoid problems with natural keys, many applications instead define surrogate\r
 386                                 primary keys. A surrogate primary keys is a column with an autoincrementing\r
 387                                 integer value added to a table definition that ensures uniqueness.</simpara>\r
 388                                 <simpara>Evergreen uses surrogate keys (a column named <literal>id</literal> with a <literal>SERIAL</literal> data type)\r
 389                                 for most of its tables.</simpara>\r
 390                         </simplesect>\r
 391                         <simplesect id="_foreign_keys">\r
 392                                 <title>Foreign keys</title>\r
 393                                 <simpara>Every table can contain zero or more foreign keys: one or more columns that\r
 394                                 refer to the primary key of another table.</simpara>\r
 395                                 <simpara>For example, let&#8217;s consider Evergreen&#8217;s modelling of the basic relationship\r
 396                                 between copies, call numbers, and bibliographic records. Bibliographic records\r
 397                                 contained in the <literal>biblio.record_entry</literal> table can have call numbers attached to\r
 398                                 them. Call numbers are contained in the <literal>asset.call_number</literal> table, and they can\r
 399                                 have copies attached to them. Copies are contained in the <literal>asset.copy</literal> table.</simpara>\r
 400                                 <table\r
 401                                 frame="all"\r
 402                                 rowsep="1" colsep="1"\r
 403                                 >\r
 404                                 <title>Example: Evergreen&#8217;s copy / call number / bibliographic record relationships</title>\r
 405                                 <?dbhtml table-width="100%"?>\r
 406                                 <?dbfo table-width="100%"?>\r
 407                                 <tgroup cols="4">\r
 408                                 <colspec colname="col_1" colwidth="1.0*"/>\r
 409                                 <colspec colname="col_2" colwidth="1.0*"/>\r
 410                                 <colspec colname="col_3" colwidth="1.0*"/>\r
 411                                 <colspec colname="col_4" colwidth="1.0*"/>\r
 412                                 <thead>\r
 413                                 <row>\r
 414                                 <entry align="left" valign="top">Table </entry>\r
 415                                 <entry align="left" valign="top">Primary key </entry>\r
 416                                 <entry align="left" valign="top">Column with a foreign key </entry>\r
 417                                 <entry align="left" valign="top">Points to</entry>\r
 418                                 </row>\r
 419                                 </thead>\r
 420                                 <tbody>\r
 421                                 <row>\r
 422                                 <entry align="left" valign="top"><simpara>asset.copy</simpara></entry>\r
 423                                 <entry align="left" valign="top"><simpara>asset.copy.id</simpara></entry>\r
 424                                 <entry align="left" valign="top"><simpara>asset.copy.call_number</simpara></entry>\r
 425                                 <entry align="left" valign="top"><simpara>asset.call_number.id</simpara></entry>\r
 426                                 </row>\r
 427                                 <row>\r
 428                                 <entry align="left" valign="top"><simpara>asset.call_number</simpara></entry>\r
 429                                 <entry align="left" valign="top"><simpara>asset.call_number.id</simpara></entry>\r
 430                                 <entry align="left" valign="top"><simpara>asset.call_number.record</simpara></entry>\r
 431                                 <entry align="left" valign="top"><simpara>biblio.record_entry.id</simpara></entry>\r
 432                                 </row>\r
 433                                 <row>\r
 434                                 <entry align="left" valign="top"><simpara>biblio.record_entry</simpara></entry>\r
 435                                 <entry align="left" valign="top"><simpara>biblio.record_entry.id</simpara></entry>\r
 436                                 <entry align="left" valign="top"><simpara></simpara></entry>\r
 437                                 <entry align="left" valign="top"><simpara></simpara></entry>\r
 438                                 </row>\r
 439                                 </tbody>\r
 440                                 </tgroup>\r
 441                                 </table>\r
 442                         </simplesect>\r
 443                         <simplesect id="_check_constraints">\r
 444                                 <title>Check constraints</title>\r
 445                                 <simpara>PostgreSQL enables you to define rules to ensure that the value to be inserted\r
 446                                 or updated meets certain conditions. For example, you can ensure that an\r
 447                                 incoming integer value is within a specific range, or that a ZIP code matches a\r
 448                                 particular pattern.</simpara>\r
 449                         </simplesect>\r
 450                 </simplesect>\r
 451                 <simplesect id="_deconstructing_a_table_definition_statement">\r
 452                         <title>Deconstructing a table definition statement</title>\r
 453                         <simpara>The <literal>actor.org_address</literal> table is a simple table in the Evergreen schema that\r
 454                         we can use as a concrete example of many of the properties of databases that\r
 455                         we have discussed so far.</simpara>\r
 456 <programlisting language="sql" linenumbering="unnumbered">\r
 457 CREATE TABLE actor.org_address (\r
 458   id            SERIAL  PRIMARY KEY,      <co id="sqlCO1-1"/>\r
 459   valid         BOOL    NOT NULL DEFAULT TRUE, <co id="sqlCO1-2"/>\r
 460   address_type  TEXT    NOT NULL DEFAULT 'MAILING', <co id="sqlCO1-3"/>\r
 461   org_unit      INT     NOT NULL REFERENCES actor.org_unit (id)  <co id="sqlCO1-4"/>\r
 462                           DEFERRABLE INITIALLY DEFERRED,\r
 463   street1       TEXT    NOT NULL,\r
 464   street2       TEXT, <co id="sqlCO1-5"/>\r
 465   city          TEXT    NOT NULL,\r
 466   county        TEXT,\r
 467   state         TEXT    NOT NULL,\r
 468   country       TEXT    NOT NULL,\r
 469   post_code     TEXT    NOT NULL\r
 470 );\r
 471 </programlisting>\r
 472                         <calloutlist>\r
 473                         <callout arearefs="sqlCO1-1">\r
 474                         <simpara>\r
 475                         The column named <literal>id</literal> is defined with a special data type of <literal>SERIAL</literal>; if\r
 476                         given no value when a row is inserted into a table, the database automatically\r
 477                         generates the next sequential integer value for the column. <literal>SERIAL</literal> is a\r
 478                         popular data type for a primary key because it is guaranteed to be unique - and\r
 479                         indeed, the constraint for this column identifies it as the <literal>PRIMARY KEY</literal>.\r
 480                         </simpara>\r
 481                         </callout>\r
 482                         <callout arearefs="sqlCO1-2">\r
 483                         <simpara>\r
 484                         The data type <literal>BOOL</literal> defines a boolean value: <literal>TRUE</literal> or <literal>FALSE</literal> are the only\r
 485                         acceptable values for the column. The constraint <literal>NOT NULL</literal> instructs the\r
 486                         database to prevent the column from ever containing a NULL value. The column\r
 487                         property <literal>DEFAULT TRUE</literal> instructs the database to automatically set the value\r
 488                         of the column to <literal>TRUE</literal> if no value is provided.\r
 489                         </simpara>\r
 490                         </callout>\r
 491                         <callout arearefs="sqlCO1-3">\r
 492                         <simpara>\r
 493                         The data type <literal>TEXT</literal> defines a text column of practically unlimited length.\r
 494                         As with the previous column, there is a <literal>NOT NULL</literal> constraint, and a default\r
 495                         value of <literal>'MAILING'</literal> will result if no other value is supplied.\r
 496                         </simpara>\r
 497                         </callout>\r
 498                         <callout arearefs="sqlCO1-4">\r
 499                         <simpara>\r
 500                         The <literal>REFERENCES actor.org_unit (id)</literal> clause indicates that this column has a\r
 501                         foreign key relationship to the <literal>actor.org_unit</literal> table, and that the value of\r
 502                         this column in every row in this table must have a corresponding value in the\r
 503                         <literal>id</literal> column in the referenced table (<literal>actor.org_unit</literal>).\r
 504                         </simpara>\r
 505                         </callout>\r
 506                         <callout arearefs="sqlCO1-5">\r
 507                         <simpara>\r
 508                         The column named <literal>street2</literal> demonstrates that not all columns have constraints\r
 509                         beyond data type. In this case, the column is allowed to be NULL or to contain a\r
 510                         <literal>TEXT</literal> value.\r
 511                         </simpara>\r
 512                         </callout>\r
 513                         </calloutlist>\r
 514                 </simplesect>\r
 515                 <simplesect id="_displaying_a_table_definition_using_literal_psql_literal">\r
 516                         <title>Displaying a table definition using <literal>psql</literal></title>\r
 517                         <simpara>The <literal>psql</literal> command-line interface is the preferred method for accessing\r
 518                         PostgreSQL databases. It offers features like tab-completion, readline support\r
 519                         for recalling previous commands, flexible input and output formats, and\r
 520                         is accessible via a standard SSH session.</simpara>\r
 521                         <simpara>If you press the <literal>Tab</literal> key once after typing one or more characters of the\r
 522                         database object name, <literal>psql</literal> automatically completes the name if there are no\r
 523                         other matches. If there are other matches for your current input, nothing\r
 524                         happens until you press the <literal>Tab</literal> key a second time, at which point <literal>psql</literal>\r
 525                         displays all of the matches for your current input.</simpara>\r
 526                         <simpara>To display the definition of a database object such as a table, issue the\r
 527                         command <literal>\d _object-name_</literal>. For example, to display the definition of the\r
 528                         actor.usr_note table:</simpara>\r
 529 <programlisting language="sh" linenumbering="unnumbered">\r
 530 $ psql evergreen <co id="sqlCO2-1"/>\r
 531 psql (8.4.1)\r
 532 Type "help" for help.\r
 533 \r
 534 evergreen=# \d actor.usr_note  <co id="sqlCO2-2"/>\r
 535                                         Table "actor.usr_note"\r
 536    Column    |           Type           |                          Modifiers\r
 537 -------------+--------------------------+-------------------------------------------------------------\r
 538  id          | bigint                   | not null default nextval('actor.usr_note_id_seq'::regclass)\r
 539  usr         | bigint                   | not null\r
 540  creator     | bigint                   | not null\r
 541  create_date | timestamp with time zone | default now()\r
 542  pub         | boolean                  | not null default false\r
 543  title       | text                     | not null\r
 544  value       | text                     | not null\r
 545 Indexes:\r
 546     "usr_note_pkey" PRIMARY KEY, btree (id)\r
 547     "actor_usr_note_creator_idx" btree (creator)\r
 548     "actor_usr_note_usr_idx" btree (usr)\r
 549 Foreign-key constraints:\r
 550     "usr_note_creator_fkey" FOREIGN KEY (creator) REFERENCES actor.usr(id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED\r
 551     "usr_note_usr_fkey" FOREIGN KEY (usr) REFERENCES actor.usr(id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED\r
 552 \r
 553 evergreen=# \q    <co id="sqlCO2-3"/>\r
 554 $\r
 555 </programlisting>\r
 556                         <calloutlist>\r
 557                         <callout arearefs="sqlCO2-1">\r
 558                         <simpara>\r
 559                         This is the most basic connection to a PostgreSQL database. You can use a\r
 560                         number of other flags to specify user name, hostname, port, and other options.\r
 561                         </simpara>\r
 562                         </callout>\r
 563                         <callout arearefs="sqlCO2-2">\r
 564                         <simpara>\r
 565                         The <literal>\d</literal> command displays the definition of a database object.\r
 566                         </simpara>\r
 567                         </callout>\r
 568                         <callout arearefs="sqlCO2-3">\r
 569                         <simpara>\r
 570                         The <literal>\q</literal> command quits the <literal>psql</literal> session and returns you to the shell prompt.\r
 571                         </simpara>\r
 572                         </callout>\r
 573                         </calloutlist>\r
 574                 </simplesect>\r
 575         </section>\r
 576         <section id="basic_sql_queries">\r
 577                 <title>Basic SQL queries</title>\r
 578                 <simplesect id="_the_select_statement">\r
 579                         <title>The SELECT statement</title>\r
 580                         <simpara>The SELECT statement is the basic tool for retrieving information from a\r
 581                         database. The syntax for most SELECT statements is:</simpara>\r
 582                         <blockquote>\r
 583                         <literallayout><literal>SELECT</literal> [<emphasis>columns(s)</emphasis>]\r
 584                           <literal>FROM</literal> [<emphasis>table(s)</emphasis>]\r
 585                           [<literal>WHERE</literal> <emphasis>condition(s)</emphasis>]\r
 586                           [<literal>GROUP BY</literal> <emphasis>columns(s)</emphasis>]\r
 587                           [<literal>HAVING</literal> <emphasis>grouping-condition(s)</emphasis>]\r
 588                           [<literal>ORDER BY</literal> <emphasis>column(s)</emphasis>]\r
 589                           [<literal>LIMIT</literal> <emphasis>maximum-results</emphasis>]\r
 590                           [<literal>OFFSET</literal> <emphasis>start-at-result-#</emphasis>]\r
 591                         ;</literallayout>\r
 592                         </blockquote>\r
 593                         <simpara>For example, to select all of the columns for each row in the\r
 594                         <literal>actor.usr_address</literal> table, issue the following query:</simpara>\r
 595                         <programlisting language="sql" linenumbering="unnumbered">SELECT *\r
 596                           FROM actor.usr_address\r
 597                         ;</programlisting>\r
 598                 </simplesect>\r
 599                 <simplesect id="_selecting_particular_columns_from_a_table">\r
 600                         <title>Selecting particular columns from a table</title>\r
 601                         <simpara><literal>SELECT *</literal> returns all columns from all of the tables included in your query.\r
 602                         However, quite often you will want to return only a subset of the possible\r
 603                         columns. You can retrieve specific columns by listing the names of the columns\r
 604                         you want after the <literal>SELECT</literal> keyword. Separate each column name with a comma.</simpara>\r
 605                         <simpara>For example, to select just the city, county, and state from the\r
 606                         actor.usr_address table, issue the following query:</simpara>\r
 607                         <programlisting language="sql" linenumbering="unnumbered">SELECT city, county, state\r
 608                           FROM actor.usr_address\r
 609                         ;</programlisting>\r
 610                 </simplesect>\r
 611                 <simplesect id="_sorting_results_with_the_order_by_clause">\r
 612                         <title>Sorting results with the ORDER BY clause</title>\r
 613                         <simpara>By default, a SELECT statement returns rows matching your query with no\r
 614                         guarantee of any particular order in which they are returned. To force\r
 615                         the rows to be returned in a particular order, use the ORDER BY clause\r
 616                         to specify one or more columns to determine the sorting priority of the\r
 617                         rows.</simpara>\r
 618                         <simpara>For example, to sort the rows returned from your <literal>actor.usr_address</literal> query by\r
 619                         city, with county and then zip code as the tie breakers, issue the\r
 620                         following query:</simpara>\r
 621 <programlisting language="sql" linenumbering="unnumbered">\r
 622 SELECT city, county, state\r
 623   FROM actor.usr_address\r
 624   ORDER BY city, county, post_code\r
 625 ;\r
 626 </programlisting>\r
 627                 </simplesect>\r
 628                 <simplesect id="_filtering_results_with_the_where_clause">\r
 629                         <title>Filtering results with the WHERE clause</title>\r
 630                         <simpara>Thus far, your results have been returning all of the rows in the table.\r
 631                         Normally, however, you would want to restrict the rows that are returned to the\r
 632                         subset of rows that match one or more conditions of your search. The <literal>WHERE</literal>\r
 633                         clause enables you to specify a set of conditions that filter your query\r
 634                         results. Each condition in the <literal>WHERE</literal> clause is an SQL expression that returns\r
 635                         a boolean (true or false) value.</simpara>\r
 636                         <simpara>For example, to restrict the results returned from your <literal>actor.usr_address</literal>\r
 637                         query to only those rows containing a state value of <emphasis>Connecticut</emphasis>, issue the\r
 638                         following query:</simpara>\r
 639 <programlisting language="sql" linenumbering="unnumbered">\r
 640 SELECT city, county, state\r
 641   FROM actor.usr_address\r
 642   WHERE state = 'Connecticut'\r
 643   ORDER BY city, county, post_code\r
 644 ;\r
 645 </programlisting>\r
 646                         <simpara>You can include more conditions in the <literal>WHERE</literal> clause with the <literal>OR</literal> and <literal>AND</literal>\r
 647                         operators. For example, to further restrict the results returned from your\r
 648                         <literal>actor.usr_address</literal> query to only those rows where the state column contains a\r
 649                         value of <emphasis>Connecticut</emphasis> and the city column contains a value of <emphasis>Hartford</emphasis>,\r
 650                         issue the following query:</simpara>\r
 651 <programlisting language="sql" linenumbering="unnumbered">\r
 652 SELECT city, county, state\r
 653   FROM actor.usr_address\r
 654   WHERE state = 'Connecticut'\r
 655     AND city = 'Hartford'\r
 656   ORDER BY city, county, post_code\r
 657 ;\r
 658 </programlisting>\r
 659                         <note><simpara>To return rows where the state is <emphasis>Connecticut</emphasis> and the city is <emphasis>Hartford</emphasis> or\r
 660                         <emphasis>New Haven</emphasis>, you must use parentheses to explicitly group the city value\r
 661                         conditions together, or else the database will evaluate the <literal>OR city = 'New\r
 662                         Haven'</literal> clause entirely on its own and match all rows where the city column is\r
 663                         <emphasis>New Haven</emphasis>, even though the state might not be <emphasis>Connecticut</emphasis>.</simpara></note>\r
 664                         <formalpara><title>Trouble with OR</title><para>\r
 665 <programlisting language="sql" linenumbering="unnumbered">\r
 666 SELECT city, county, state\r
 667   FROM actor.usr_address\r
 668   WHERE state = 'Connecticut'\r
 669     AND city = 'Hartford' OR city = 'New Haven'\r
 670   ORDER BY city, county, post_code\r
 671 ;\r
 672 \r
 673 -- Can return unwanted rows because the OR is not grouped!\r
 674 </programlisting>\r
 675                         </para></formalpara>\r
 676                         <formalpara><title>Grouped OR&#8217;ed conditions</title><para>\r
 677 <programlisting language="sql" linenumbering="unnumbered">\r
 678 SELECT city, county, state\r
 679   FROM actor.usr_address\r
 680   WHERE state = 'Connecticut'\r
 681     AND (city = 'Hartford' OR city = 'New Haven')\r
 682   ORDER BY city, county, post_code\r
 683 ;\r
 684 \r
 685 -- The parentheses ensure that the OR is applied to the cities, and the\r
 686 -- state in either case must be 'Connecticut'\r
 687 </programlisting>\r
 688                         </para></formalpara>\r
 689                         <simplesect id="_comparison_operators">\r
 690                                 <title>Comparison operators</title>\r
 691                                 <simpara>Here is a partial list of comparison operators that are commonly used in\r
 692                                 <literal>WHERE</literal> clauses:</simpara>\r
 693                                 <simplesect id="_comparing_two_scalar_values">\r
 694                                         <title>Comparing two scalar values</title>\r
 695                                         <itemizedlist>\r
 696                                         <listitem>\r
 697                                         <simpara>\r
 698                                         <literal>x = y</literal> (equal to)\r
 699                                         </simpara>\r
 700                                         </listitem>\r
 701                                         <listitem>\r
 702                                         <simpara>\r
 703                                         <literal>x != y</literal> (not equal to)\r
 704                                         </simpara>\r
 705                                         </listitem>\r
 706                                         <listitem>\r
 707                                         <simpara>\r
 708                                         <literal>x &lt; y</literal> (less than)\r
 709                                         </simpara>\r
 710                                         </listitem>\r
 711                                         <listitem>\r
 712                                         <simpara>\r
 713                                         <literal>x &gt; y</literal> (greater than)\r
 714                                         </simpara>\r
 715                                         </listitem>\r
 716                                         <listitem>\r
 717                                         <simpara>\r
 718                                         <literal>x LIKE y</literal> (TEXT value x matches a subset of TEXT y, where y is a string that\r
 719                                         can contain <emphasis>%</emphasis> as a wildcard for 0 or more characters, and <emphasis>_</emphasis> as a wildcard\r
 720                                         for a single character. For example, <literal>WHERE 'all you can eat fish and chips\r
 721                                         and a big stick' LIKE '%fish%stick'</literal> would return TRUE)\r
 722                                         </simpara>\r
 723                                         </listitem>\r
 724                                         <listitem>\r
 725                                         <simpara>\r
 726                                         <literal>x ILIKE y</literal> (like LIKE, but the comparison ignores upper-case / lower-case)\r
 727                                         </simpara>\r
 728                                         </listitem>\r
 729                                         <listitem>\r
 730                                         <simpara>\r
 731                                         <literal>x IN y</literal> (x is in the list of values y, where y can be a list or a SELECT\r
 732                                         statement that returns a list)\r
 733                                         </simpara>\r
 734                                         </listitem>\r
 735                                         </itemizedlist>\r
 736                                 </simplesect>\r
 737                         </simplesect>\r
 738                 </simplesect>\r
 739                 <simplesect id="_null_values">\r
 740                         <title>NULL values</title>\r
 741                         <simpara>SQL databases have a special way of representing the value of a column that has\r
 742                         no value: <literal>NULL</literal>. A <literal>NULL</literal> value is not equal to zero, and is not an empty\r
 743                         string; it is equal to nothing, not even another <literal>NULL</literal>, because it has no value\r
 744                         that can be compared.</simpara>\r
 745                         <simpara>To return rows from a table where a given column is not <literal>NULL</literal>, use the\r
 746                         <literal>IS NOT NULL</literal> comparison operator.</simpara>\r
 747                         <formalpara><title>Retrieving rows where a column is not <literal>NULL</literal></title><para>\r
 748 <programlisting language="sql" linenumbering="unnumbered">\r
 749 SELECT id, first_given_name, family_name\r
 750   FROM actor.usr\r
 751   WHERE second_given_name IS NOT NULL\r
 752 ;\r
 753 </programlisting>\r
 754                         </para></formalpara>\r
 755                         <simpara>Similarly, to return rows from a table where a given column is <literal>NULL</literal>, use\r
 756                         the <literal>IS NULL</literal> comparison operator.</simpara>\r
 757                         <formalpara><title>Retrieving rows where a column is <literal>NULL</literal></title><para>\r
 758 <programlisting language="sql" linenumbering="unnumbered">\r
 759 SELECT id, first_given_name, second_given_name, family_name\r
 760   FROM actor.usr\r
 761   WHERE second_given_name IS NULL\r
 762 ;\r
 763 \r
 764  id | first_given_name | second_given_name |  family_name\r
 765 ----+------------------+-------------------+----------------\r
 766   1 | Administrator    |                   | System Account\r
 767 (1 row)\r
 768 </programlisting>\r
 769                         </para></formalpara>\r
 770                         <simpara>Notice that the <literal>NULL</literal> value in the output is displayed as empty space,\r
 771                         indistinguishable from an empty string; this is the default display method in\r
 772                         <literal>psql</literal>. You can change the behaviour of <literal>psql</literal> using the <literal>pset</literal> command:</simpara>\r
 773                         <formalpara><title>Changing the way <literal>NULL</literal> values are displayed in <literal>psql</literal></title><para>\r
 774 <programlisting language="sql" linenumbering="unnumbered">\r
 775 evergreen=# \pset null '(null)'\r
 776 Null display is '(null)'.\r
 777 \r
 778 SELECT id, first_given_name, second_given_name, family_name\r
 779   FROM actor.usr\r
 780   WHERE second_given_name IS NULL\r
 781 ;\r
 782 \r
 783  id | first_given_name | second_given_name |  family_name\r
 784 ----+------------------+-------------------+----------------\r
 785   1 | Administrator    | (null)            | System Account\r
 786 (1 row)\r
 787 </programlisting>\r
 788                         </para></formalpara>\r
 789                         <simpara>Database queries within programming languages such as Perl and C have\r
 790                         special methods of checking for <literal>NULL</literal> values in returned results.</simpara>\r
 791                 </simplesect>\r
 792                 <simplesect id="_text_delimiter">\r
 793                         <title>Text delimiter: '</title>\r
 794                         <simpara>You might have noticed that we have been using the <literal>'</literal> character to delimit\r
 795                         TEXT values and values such as dates and times that are TEXT values. Sometimes,\r
 796                         however, your TEXT value itself contains a <literal>'</literal> character, such as the word\r
 797                         <literal>you&#8217;re</literal>. To prevent the database from prematurely ending the TEXT value at the\r
 798                         first <literal>'</literal> character and returning a syntax error, use another <literal>'</literal> character to\r
 799                         escape the following <literal>'</literal> character.</simpara>\r
 800                         <simpara>For example, to change the last name of a user in the <literal>actor.usr</literal> table to\r
 801                         <literal>L&#8217;estat</literal>, issue the following SQL:</simpara>\r
 802                         <formalpara><title>Escaping <literal>'</literal> in TEXT values</title><para>\r
 803 <programlisting language="sql" linenumbering="unnumbered">\r
 804 UPDATE actor.usr\r
 805   SET family_name = 'L''estat'\r
 806   WHERE profile IN (\r
 807     SELECT id\r
 808       FROM permission.grp_tree\r
 809       WHERE name = 'Vampire'\r
 810   )\r
 811         ;</programlisting>\r
 812                         </para></formalpara>\r
 813                         <simpara>When you retrieve the row from the database, the value is displayed with just\r
 814                         a single <literal>'</literal> character:</simpara>\r
 815 <programlisting language="sql" linenumbering="unnumbered">\r
 816 SELECT id, family_name\r
 817   FROM actor.usr\r
 818   WHERE family_name = 'L''estat'\r
 819 ;\r
 820 \r
 821  id | family_name\r
 822 ----+-------------\r
 823   1 | L'estat\r
 824 (1 row)\r
 825 </programlisting>\r
 826                 </simplesect>\r
 827                 <simplesect id="_grouping_and_eliminating_results_with_the_group_by_and_having_clauses">\r
 828                         <title>Grouping and eliminating results with the GROUP BY and HAVING clauses</title>\r
 829                         <simpara>The GROUP BY clause returns a unique set of results for the desired columns.\r
 830                         This is most often used in conjunction with an aggregate function to present\r
 831                         results for a range of values in a single query, rather than requiring you to\r
 832                         issue one query per target value.</simpara>\r
 833                         <formalpara><title>Returning unique results of a single column with <literal>GROUP BY</literal></title><para>\r
 834 <programlisting language="sql" linenumbering="unnumbered">\r
 835 SELECT grp\r
 836   FROM permission.grp_perm_map\r
 837   GROUP BY grp\r
 838   ORDER BY grp;\r
 839 \r
 840  grp\r
 841 -----+\r
 842    1\r
 843    2\r
 844    3\r
 845    4\r
 846    5\r
 847    6\r
 848    7\r
 849   10\r
 850 (8 rows)\r
 851 </programlisting>\r
 852                         </para></formalpara>\r
 853                         <simpara>While <literal>GROUP BY</literal> can be useful for a single column, it is more often used\r
 854                         to return the distinct results across multiple columns. For example, the\r
 855                         following query shows us which groups have permissions at each depth in\r
 856                         the library hierarchy:</simpara>\r
 857                         <formalpara><title>Returning unique results of multiple columns with <literal>GROUP BY</literal></title><para>\r
 858 <programlisting language="sql" linenumbering="unnumbered">\r
 859 SELECT grp, depth\r
 860   FROM permission.grp_perm_map\r
 861   GROUP BY grp, depth\r
 862   ORDER BY depth, grp;\r
 863 \r
 864  grp | depth\r
 865 -----+-------\r
 866    1 |     0\r
 867    2 |     0\r
 868    3 |     0\r
 869    4 |     0\r
 870    5 |     0\r
 871   10 |     0\r
 872    3 |     1\r
 873    4 |     1\r
 874    5 |     1\r
 875    6 |     1\r
 876    7 |     1\r
 877   10 |     1\r
 878    3 |     2\r
 879    4 |     2\r
 880   10 |     2\r
 881 (15 rows)\r
 882 </programlisting>\r
 883                         </para></formalpara>\r
 884                         <simpara>Extending this further, you can use the <literal>COUNT()</literal> aggregate function to\r
 885                         also return the number of times each unique combination of <literal>grp</literal> and <literal>depth</literal>\r
 886                         appears in the table. <emphasis>Yes, this is a sneak peek at the use of aggregate\r
 887                         functions! Keeners.</emphasis></simpara>\r
 888                         <formalpara><title>Counting unique column combinations with <literal>GROUP BY</literal></title><para>\r
 889 <programlisting language="sql" linenumbering="unnumbered">\r
 890 SELECT grp, depth, COUNT(grp)\r
 891   FROM permission.grp_perm_map\r
 892   GROUP BY grp, depth\r
 893   ORDER BY depth, grp;\r
 894 \r
 895  grp | depth | count\r
 896 -----+-------+-------\r
 897    1 |     0 |     6\r
 898    2 |     0 |     2\r
 899    3 |     0 |    45\r
 900    4 |     0 |     3\r
 901    5 |     0 |     5\r
 902   10 |     0 |     1\r
 903    3 |     1 |     3\r
 904    4 |     1 |     4\r
 905    5 |     1 |     1\r
 906    6 |     1 |     9\r
 907    7 |     1 |     5\r
 908   10 |     1 |    10\r
 909    3 |     2 |    24\r
 910    4 |     2 |     8\r
 911   10 |     2 |     7\r
 912 (15 rows)\r
 913 </programlisting>\r
 914                         </para></formalpara>\r
 915                         <simpara>You can use the <literal>WHERE</literal> clause to restrict the returned results before grouping\r
 916                         is applied to the results. The following query restricts the results to those\r
 917                         rows that have a depth of 0.</simpara>\r
 918                         <formalpara><title>Using the <literal>WHERE</literal> clause with <literal>GROUP BY</literal></title><para>\r
 919 <programlisting language="sql" linenumbering="unnumbered">\r
 920 SELECT grp, COUNT(grp)\r
 921   FROM permission.grp_perm_map\r
 922   WHERE depth = 0\r
 923   GROUP BY grp\r
 924   ORDER BY 2 DESC\r
 925 ;\r
 926 \r
 927  grp | count\r
 928 -----+-------\r
 929    3 |    45\r
 930    1 |     6\r
 931    5 |     5\r
 932    4 |     3\r
 933    2 |     2\r
 934   10 |     1\r
 935 (6 rows)\r
 936 </programlisting>\r
 937                         </para></formalpara>\r
 938                         <simpara>To restrict results after grouping has been applied to the rows, use the\r
 939                         <literal>HAVING</literal> clause; this is typically used to restrict results based on\r
 940                         a comparison to the value returned by an aggregate function. For example,\r
 941                         the following query restricts the returned rows to those that have more than\r
 942                         5 occurrences of the same value for <literal>grp</literal> in the table.</simpara>\r
 943                         <formalpara><title><literal>GROUP BY</literal> restricted by a <literal>HAVING</literal> clause</title><para>\r
 944 <programlisting language="sql" linenumbering="unnumbered">\r
 945 SELECT grp, COUNT(grp)\r
 946   FROM permission.grp_perm_map\r
 947   GROUP BY grp\r
 948   HAVING COUNT(grp) &gt; 5\r
 949 ;\r
 950 \r
 951  grp | count\r
 952 -----+-------\r
 953    6 |     9\r
 954    4 |    15\r
 955    5 |     6\r
 956    1 |     6\r
 957    3 |    72\r
 958   10 |    18\r
 959 (6 rows)\r
 960 </programlisting>\r
 961                         </para></formalpara>\r
 962                 </simplesect>\r
 963                 <simplesect id="_eliminating_duplicate_results_with_the_distinct_keyword">\r
 964                         <title>Eliminating duplicate results with the DISTINCT keyword</title>\r
 965                         <simpara><literal>GROUP BY</literal> is one way of eliminating duplicate results from the rows returned\r
 966                         by your query. The purpose of the <literal>DISTINCT</literal> keyword is to remove duplicate\r
 967                         rows from the results of your query. However, it works, and it is easy - so if\r
 968                         you just want a quick list of the unique set of values for a column or set of\r
 969                         columns, the <literal>DISTINCT</literal> keyword might be appropriate.</simpara>\r
 970                         <simpara>On the other hand, if you are getting duplicate rows back when you don&#8217;t expect\r
 971                         them, then applying the <literal>DISTINCT</literal> keyword might be a sign that you are\r
 972                         papering over a real problem.</simpara>\r
 973                         <formalpara><title>Returning unique results of multiple columns with <literal>DISTINCT</literal></title><para>\r
 974 <programlisting language="sql" linenumbering="unnumbered">\r
 975 SELECT DISTINCT grp, depth\r
 976   FROM permission.grp_perm_map\r
 977   ORDER BY depth, grp\r
 978 ;\r
 979 \r
 980  grp | depth\r
 981 -----+-------\r
 982    1 |     0\r
 983    2 |     0\r
 984    3 |     0\r
 985    4 |     0\r
 986    5 |     0\r
 987   10 |     0\r
 988    3 |     1\r
 989    4 |     1\r
 990    5 |     1\r
 991    6 |     1\r
 992    7 |     1\r
 993   10 |     1\r
 994    3 |     2\r
 995    4 |     2\r
 996   10 |     2\r
 997 (15 rows)\r
 998 </programlisting>\r
 999                         </para></formalpara>\r
1000                 </simplesect>\r
1001                 <simplesect id="_paging_through_results_with_the_limit_and_offset_clauses">\r
1002                         <title>Paging through results with the LIMIT and OFFSET clauses</title>\r
1003                         <simpara>The <literal>LIMIT</literal> clause restricts the total number of rows returned from your query\r
1004                         and is useful if you just want to list a subset of a large number of rows. For\r
1005                         example, in the following query we list the five most frequently used\r
1006                         circulation modifiers:</simpara>\r
1007                         <formalpara><title>Using the <literal>LIMIT</literal> clause to restrict results</title><para>\r
1008 <programlisting language="sql" linenumbering="unnumbered">\r
1009 SELECT circ_modifier, COUNT(circ_modifier)\r
1010   FROM asset.copy\r
1011   GROUP BY circ_modifier\r
1012   ORDER BY 2 DESC\r
1013   LIMIT 5\r
1014 ;\r
1015 \r
1016  circ_modifier | count\r
1017 ---------------+--------\r
1018  CIRC          | 741995\r
1019  BOOK          | 636199\r
1020  SER           | 265906\r
1021  DOC           | 191598\r
1022  LAW MONO      | 126627\r
1023 (5 rows)\r
1024 </programlisting>\r
1025                         </para></formalpara>\r
1026                         <simpara>When you use the <literal>LIMIT</literal> clause to restrict the total number of rows returned\r
1027                         by your query, you can also use the <literal>OFFSET</literal> clause to determine which subset\r
1028                         of the rows will be returned. The use of the <literal>OFFSET</literal> clause assumes that\r
1029                         you&#8217;ve used the <literal>ORDER BY</literal> clause to impose order on the results.</simpara>\r
1030                         <simpara>In the following example, we use the <literal>OFFSET</literal> clause to get results 6 through\r
1031                         10 from the same query that we prevously executed.</simpara>\r
1032                         <formalpara><title>Using the <literal>OFFSET</literal> clause to return a specific subset of rows</title><para>\r
1033 <programlisting language="sql" linenumbering="unnumbered">\r
1034 SELECT circ_modifier, COUNT(circ_modifier)\r
1035   FROM asset.copy\r
1036   GROUP BY circ_modifier\r
1037   ORDER BY 2 DESC\r
1038   LIMIT 5\r
1039   OFFSET 5\r
1040 ;\r
1041 \r
1042  circ_modifier | count\r
1043 ---------------+--------\r
1044  LAW SERIAL    | 102758\r
1045  DOCUMENTS     |  86215\r
1046  BOOK_WEB      |  63786\r
1047  MFORM SER     |  39917\r
1048  REF           |  34380\r
1049 (5 rows)\r
1050 </programlisting>\r
1051                         </para></formalpara>\r
1052                 </simplesect>\r
1053         </section>\r
1054         <section id="advanced_sql_queries">\r
1055                 <title>Advanced SQL queries</title>\r
1056                 <simplesect id="_transforming_column_values_with_functions">\r
1057                         <title>Transforming column values with functions</title>\r
1058                         <simpara>PostgreSQL includes many built-in functions for manipulating column data.\r
1059                         You can also create your own functions (and Evergreen does make use of\r
1060                         many custom functions). There are two types of functions used in\r
1061                         databases: scalar functions and aggregate functions.</simpara>\r
1062                         <simplesect id="_scalar_functions">\r
1063                                 <title>Scalar functions</title>\r
1064                                 <simpara>Scalar functions transform each value of the target column. If your query\r
1065                                 would return 50 values for a column in a given query, and you modify your\r
1066                                 query to apply a scalar function to the values returned for that column,\r
1067                                 it will still return 50 values. For example, the UPPER() function,\r
1068                                 used to convert text values to upper-case, modifies the results in the\r
1069                                 following set of queries:</simpara>\r
1070                                 <formalpara><title>Using the UPPER() scalar function to convert text values to upper-case</title><para>\r
1071 <programlisting language="sql" linenumbering="unnumbered">\r
1072 -- First, without the UPPER() function for comparison\r
1073 SELECT shortname, name\r
1074   FROM actor.org_unit\r
1075   WHERE id &lt; 4\r
1076 ;\r
1077 \r
1078  shortname |         name\r
1079 -----------+-----------------------\r
1080  CONS      | Example Consortium\r
1081  SYS1      | Example System 1\r
1082  SYS2      | Example System 2\r
1083 (3 rows)\r
1084 \r
1085 -- Now apply the UPPER() function to the name column\r
1086 SELECT shortname, UPPER(name)\r
1087   FROM actor.org_unit\r
1088   WHERE id &lt; 4\r
1089 ;\r
1090 \r
1091  shortname |       upper\r
1092 -----------+--------------------\r
1093  CONS      | EXAMPLE CONSORTIUM\r
1094  SYS1      | EXAMPLE SYSTEM 1\r
1095  SYS2      | EXAMPLE SYSTEM 2\r
1096 (3 rows)\r
1097 </programlisting>\r
1098                                 </para></formalpara>\r
1099                                 <simpara>There are so many scalar functions in PostgreSQL that we cannot cover them\r
1100                                 all here, but we can list some of the most commonly used functions:</simpara>\r
1101                                 <itemizedlist>\r
1102                                 <listitem>\r
1103                                 <simpara>\r
1104                                 || - concatenates two text values together\r
1105                                 </simpara>\r
1106                                 </listitem>\r
1107                                 <listitem>\r
1108                                 <simpara>\r
1109                                 COALESCE() - returns the first non-NULL value from the list of arguments\r
1110                                 </simpara>\r
1111                                 </listitem>\r
1112                                 <listitem>\r
1113                                 <simpara>\r
1114                                 LOWER() - returns a text value converted to lower-case\r
1115                                 </simpara>\r
1116                                 </listitem>\r
1117                                 <listitem>\r
1118                                 <simpara>\r
1119                                 REPLACE() - returns a text value after replacing all occurrences of a given text value with a different text value\r
1120                                 </simpara>\r
1121                                 </listitem>\r
1122                                 <listitem>\r
1123                                 <simpara>\r
1124                                 REGEXP_REPLACE() - returns a text value after being transformed by a regular expression\r
1125                                 </simpara>\r
1126                                 </listitem>\r
1127                                 <listitem>\r
1128                                 <simpara>\r
1129                                 UPPER() - returns a text value converted to upper-case\r
1130                                 </simpara>\r
1131                                 </listitem>\r
1132                                 </itemizedlist>\r
1133                                 <simpara>For a complete list of scalar functions, see\r
1134                                 <ulink url="http://www.postgresql.org/docs/8.3/interactive/functions.html">the PostgreSQL function documentation</ulink>.</simpara>\r
1135                         </simplesect>\r
1136                         <simplesect id="_aggregate_functions">\r
1137                                 <title>Aggregate functions</title>\r
1138                                 <simpara>Aggregate functions return a single value computed from the the complete set of\r
1139                                 values returned for the specified column.</simpara>\r
1140                                 <itemizedlist>\r
1141                                 <listitem>\r
1142                                 <simpara>\r
1143                                 AVG()\r
1144                                 </simpara>\r
1145                                 </listitem>\r
1146                                 <listitem>\r
1147                                 <simpara>\r
1148                                 COUNT()\r
1149                                 </simpara>\r
1150                                 </listitem>\r
1151                                 <listitem>\r
1152                                 <simpara>\r
1153                                 MAX()\r
1154                                 </simpara>\r
1155                                 </listitem>\r
1156                                 <listitem>\r
1157                                 <simpara>\r
1158                                 MIN()\r
1159                                 </simpara>\r
1160                                 </listitem>\r
1161                                 <listitem>\r
1162                                 <simpara>\r
1163                                 SUM()\r
1164                                 </simpara>\r
1165                                 </listitem>\r
1166                                 </itemizedlist>\r
1167                         </simplesect>\r
1168                 </simplesect>\r
1169                 <simplesect id="_sub_selects">\r
1170                         <title>Sub-selects</title>\r
1171                         <simpara>A sub-select is the technique of using the results of one query to feed\r
1172                         into another query. You can, for example, return a set of values from\r
1173                         one column in a SELECT statement to be used to satisfy the IN() condition\r
1174                         of another SELECT statement; or you could return the MAX() value of a\r
1175                         column in a SELECT statement to match the = condition of another SELECT\r
1176                         statement.</simpara>\r
1177                         <simpara>For example, in the following query we use a sub-select to restrict the copies\r
1178                         returned by the main SELECT statement to only those locations that have an\r
1179                         <literal>opac_visible</literal> value of <literal>TRUE</literal>:</simpara>\r
1180                         <formalpara><title>Sub-select example</title><para>\r
1181 <programlisting language="sql" linenumbering="unnumbered">\r
1182 SELECT call_number\r
1183   FROM asset.copy\r
1184   WHERE deleted IS FALSE\r
1185     AND location IN (\r
1186     SELECT id\r
1187       FROM asset.copy_location\r
1188       WHERE opac_visible IS TRUE\r
1189   )\r
1190 ;\r
1191 </programlisting>\r
1192                         </para></formalpara>\r
1193                         <simpara>Sub-selects can be an approachable way to breaking down a problem that\r
1194                         requires matching values between different tables, and often result in\r
1195                         a clearly expressed solution to a problem. However, if you start writing\r
1196                         sub-selects within sub-selects, you should consider tackling the problem\r
1197                         with joins instead.</simpara>\r
1198                 </simplesect>\r
1199                 <simplesect id="_joins">\r
1200                         <title>Joins</title>\r
1201                         <simpara>Joins enable you to access the values from multiple tables in your query\r
1202                         results and comparison operators. For example, joins are what enable you to\r
1203                         relate a bibliographic record to a barcoded copy via the <literal>biblio.record_entry</literal>,\r
1204                         <literal>asset.call_number</literal>, and <literal>asset.copy</literal> tables. In this section, we discuss the\r
1205                         most common kind of join&#8212;the inner join&#8212;as well as the less common outer join\r
1206                         and some set operations which can compare and contrast the values returned by\r
1207                         separate queries.</simpara>\r
1208                         <simpara>When we talk about joins, we are going to talk about the left-hand table and\r
1209                         the right-hand table that participate in the join. Every join brings together\r
1210                         just two tables - but you can use an unlimited (for our purposes) number\r
1211                         of joins in a single SQL statement. Each time you use a join, you effectively\r
1212                         create a new table, so when you add a second join clause to a statement,\r
1213                         table 1 and table 2 (which were the left-hand table and the right-hand table\r
1214                         for the first join) now act as a merged left-hand table and the new table\r
1215                         in the second join clause is the right-hand table.</simpara>\r
1216                         <simpara>Clear as mud? Okay, let&#8217;s look at some examples.</simpara>\r
1217                         <simplesect id="_inner_joins">\r
1218                                 <title>Inner joins</title>\r
1219                                 <simpara>An inner join returns all of the columns from the left-hand table in the join\r
1220                                 with all of the columns from the right-hand table in the joins that match a\r
1221                                 condition in the ON clause. Typically, you use the <literal>=</literal> operator to match the\r
1222                                 foreign key of the left-hand table with the primary key of the right-hand\r
1223                                 table to follow the natural relationship between the tables.</simpara>\r
1224                                 <simpara>In the following example, we return all of columns from the <literal>actor.usr</literal> and\r
1225                                 <literal>actor.org_unit</literal> tables, joined on the relationship between the user&#8217;s home\r
1226                                 library and the library&#8217;s ID. Notice in the results that some columns, like\r
1227                                 <literal>id</literal> and <literal>mailing_address</literal>, appear twice; this is because both the <literal>actor.usr</literal>\r
1228                                 and <literal>actor.org_unit</literal> tables include columns with these names. This is also why\r
1229                                 we have to fully qualify the column names in our queries with the schema and\r
1230                                 table names.</simpara>\r
1231                                 <formalpara><title>A simple inner join</title><para>\r
1232 <programlisting language="sql" linenumbering="unnumbered">\r
1233 SELECT *\r
1234   FROM actor.usr\r
1235     INNER JOIN actor.org_unit ON actor.usr.home_ou = actor.org_unit.id\r
1236     WHERE actor.org_unit.shortname = 'CONS'\r
1237 ;\r
1238 \r
1239 -[ RECORD 1 ]------------------+---------------------------------\r
1240 id                             | 1\r
1241 card                           | 1\r
1242 profile                        | 1\r
1243 usrname                        | admin\r
1244 email                          |\r
1245 ...\r
1246 mailing_address                |\r
1247 billing_address                |\r
1248 home_ou                        | 1\r
1249 ...\r
1250 claims_never_checked_out_count | 0\r
1251 id                             | 1\r
1252 parent_ou                      |\r
1253 ou_type                        | 1\r
1254 ill_address                    | 1\r
1255 holds_address                  | 1\r
1256 mailing_address                | 1\r
1257 billing_address                | 1\r
1258 shortname                      | CONS\r
1259 name                           | Example Consortium\r
1260 email                          |\r
1261 phone                          |\r
1262 opac_visible                   | t\r
1263 fiscal_calendar                | 1\r
1264 </programlisting>\r
1265                                 </para></formalpara>\r
1266                                 <simpara>Of course, you do not have to return every column from the joined tables;\r
1267                                 you can (and should) continue to specify only the columns that you want to\r
1268                                 return. In the following example, we count the number of borrowers for\r
1269                                 every user profile in a given library by joining the <literal>permission.grp_tree</literal>\r
1270                                 table where profiles are defined against the <literal>actor.usr</literal> table, and then\r
1271                                 joining the <literal>actor.org_unit</literal> table to give us access to the user&#8217;s home\r
1272                                 library:</simpara>\r
1273                                 <formalpara><title>Borrower Count by Profile (Adult, Child, etc)/Library</title><para>\r
1274 <programlisting language="sql" linenumbering="unnumbered">\r
1275 SELECT permission.grp_tree.name, actor.org_unit.name, COUNT(permission.grp_tree.name)\r
1276   FROM actor.usr\r
1277     INNER JOIN permission.grp_tree\r
1278       ON actor.usr.profile = permission.grp_tree.id\r
1279     INNER JOIN actor.org_unit\r
1280       ON actor.org_unit.id = actor.usr.home_ou\r
1281   WHERE actor.usr.deleted IS FALSE\r
1282   GROUP BY permission.grp_tree.name, actor.org_unit.name\r
1283   ORDER BY actor.org_unit.name, permission.grp_tree.name\r
1284 ;\r
1285 \r
1286  name  |        name        | count\r
1287 -------+--------------------+-------\r
1288  Users | Example Consortium |     1\r
1289 (1 row)\r
1290 </programlisting>\r
1291                                 </para></formalpara>\r
1292                         </simplesect>\r
1293                         <simplesect id="_aliases">\r
1294                                 <title>Aliases</title>\r
1295                                 <simpara>So far we have been fully-qualifying all of our table names and column names to\r
1296                                 prevent any confusion. This quickly gets tiring with lengthy qualified\r
1297                                 table names like <literal>permission.grp_tree</literal>, so the SQL syntax enables us to assign\r
1298                                 aliases to table names and column names. When you define an alias for a table\r
1299                                 name, you can access its column throughout the rest of the statement by simply\r
1300                                 appending the column name to the alias with a period; for example, if you assign\r
1301                                 the alias <literal>au</literal> to the <literal>actor.usr</literal> table, you can access the <literal>actor.usr.id</literal>\r
1302                                 column through the alias as <literal>au.id</literal>.</simpara>\r
1303                                 <simpara>The formal syntax for declaring an alias for a column is to follow the column\r
1304                                 name in the result columns clause with <literal>AS</literal> <emphasis>alias</emphasis>. To declare an alias for a table name,\r
1305                                 follow the table name in the FROM clause (including any JOIN statements) with\r
1306                                 <literal>AS</literal> <emphasis>alias</emphasis>. However, the <literal>AS</literal> keyword is optional for tables (and columns as\r
1307                                 of PostgreSQL 8.4), and in practice most SQL statements leave it out.  For\r
1308                                 example, we can write the previous INNER JOIN statement example using aliases\r
1309                                 instead of fully-qualified identifiers:</simpara>\r
1310                                 <formalpara><title>Borrower Count by Profile (using aliases)</title><para>\r
1311 <programlisting language="sql" linenumbering="unnumbered">\r
1312 SELECT pgt.name AS "Profile", aou.name AS "Library", COUNT(pgt.name) AS "Count"\r
1313   FROM actor.usr au\r
1314     INNER JOIN permission.grp_tree pgt\r
1315       ON au.profile = pgt.id\r
1316     INNER JOIN actor.org_unit aou\r
1317       ON aou.id = au.home_ou\r
1318   WHERE au.deleted IS FALSE\r
1319   GROUP BY pgt.name, aou.name\r
1320   ORDER BY aou.name, pgt.name\r
1321 ;\r
1322 \r
1323  Profile |      Library       | Count\r
1324 ---------+--------------------+-------\r
1325  Users   | Example Consortium |     1\r
1326 (1 row)\r
1327 </programlisting>\r
1328                                 </para></formalpara>\r
1329                                 <simpara>A nice side effect of declaring an alias for your columns is that the alias\r
1330                                 is used as the column header in the results table. The previous version of\r
1331                                 the query, which didn&#8217;t use aliased column names, had two columns named\r
1332                                 <literal>name</literal>; this version of the query with aliases results in a clearer\r
1333                                 categorization.</simpara>\r
1334                         </simplesect>\r
1335                         <simplesect id="_outer_joins">\r
1336                                 <title>Outer joins</title>\r
1337                                 <simpara>An outer join returns all of the rows from one or both of the tables\r
1338                                 participating in the join.</simpara>\r
1339                                 <itemizedlist>\r
1340                                 <listitem>\r
1341                                 <simpara>\r
1342                                 For a LEFT OUTER JOIN, the join returns all of the rows from the left-hand\r
1343                                 table and the rows matching the join condition from the right-hand table, with\r
1344                                 NULL values for the rows with no match in the right-hand table.\r
1345                                 </simpara>\r
1346                                 </listitem>\r
1347                                 <listitem>\r
1348                                 <simpara>\r
1349                                 A RIGHT OUTER JOIN behaves in the same way as a LEFT OUTER JOIN, with the\r
1350                                 exception that all rows are returned from the right-hand table participating in\r
1351                                 the join.\r
1352                                 </simpara>\r
1353                                 </listitem>\r
1354                                 <listitem>\r
1355                                 <simpara>\r
1356                                 For a FULL OUTER JOIN, the join returns all the rows from both the left-hand\r
1357                                 and right-hand tables, with NULL values for the rows with no match in either\r
1358                                 the left-hand or right-hand table.\r
1359                                 </simpara>\r
1360                                 </listitem>\r
1361                                 </itemizedlist>\r
1362                                 <formalpara><title>Base tables for the OUTER JOIN examples</title><para>\r
1363 <programlisting language="sql" linenumbering="unnumbered">\r
1364 SELECT * FROM aaa;\r
1365 \r
1366  id | stuff\r
1367 ----+-------\r
1368   1 | one\r
1369   2 | two\r
1370   3 | three\r
1371   4 | four\r
1372   5 | five\r
1373 (5 rows)\r
1374 \r
1375 SELECT * FROM bbb;\r
1376 \r
1377  id | stuff |   foo\r
1378 ----+-------+----------\r
1379   1 | one   | oneone\r
1380   2 | two   | twotwo\r
1381   5 | five  | fivefive\r
1382   6 | six   | sixsix\r
1383 (4 rows)\r
1384 </programlisting>\r
1385                                 </para></formalpara>\r
1386                                 <formalpara><title>Example of a LEFT OUTER JOIN</title><para>\r
1387 <programlisting language="sql" linenumbering="unnumbered">\r
1388 SELECT * FROM aaa\r
1389   LEFT OUTER JOIN bbb ON aaa.id = bbb.id\r
1390 ;\r
1391  id | stuff | id | stuff |   foo\r
1392 ----+-------+----+-------+----------\r
1393   1 | one   |  1 | one   | oneone\r
1394   2 | two   |  2 | two   | twotwo\r
1395   3 | three |    |       |\r
1396   4 | four  |    |       |\r
1397   5 | five  |  5 | five  | fivefive\r
1398 (5 rows)\r
1399 </programlisting>\r
1400                                 </para></formalpara>\r
1401                                 <formalpara><title>Example of a RIGHT OUTER JOIN</title><para>\r
1402 <programlisting language="sql" linenumbering="unnumbered">\r
1403 SELECT * FROM aaa\r
1404   RIGHT OUTER JOIN bbb ON aaa.id = bbb.id\r
1405 ;\r
1406  id | stuff | id | stuff |   foo\r
1407 ----+-------+----+-------+----------\r
1408   1 | one   |  1 | one   | oneone\r
1409   2 | two   |  2 | two   | twotwo\r
1410   5 | five  |  5 | five  | fivefive\r
1411     |       |  6 | six   | sixsix\r
1412 (4 rows)\r
1413 </programlisting>\r
1414                                 </para></formalpara>\r
1415                                 <formalpara><title>Example of a FULL OUTER JOIN</title><para>\r
1416 <programlisting language="sql" linenumbering="unnumbered">\r
1417 SELECT * FROM aaa\r
1418   FULL OUTER JOIN bbb ON aaa.id = bbb.id\r
1419 ;\r
1420  id | stuff | id | stuff |   foo\r
1421 ----+-------+----+-------+----------\r
1422   1 | one   |  1 | one   | oneone\r
1423   2 | two   |  2 | two   | twotwo\r
1424   3 | three |    |       |\r
1425   4 | four  |    |       |\r
1426   5 | five  |  5 | five  | fivefive\r
1427     |       |  6 | six   | sixsix\r
1428 (6 rows)\r
1429 </programlisting>\r
1430                                 </para></formalpara>\r
1431                         </simplesect>\r
1432                         <simplesect id="_self_joins">\r
1433                                 <title>Self joins</title>\r
1434                                 <simpara>It is possible to join a table to itself. You can, in fact you must, use\r
1435                                 aliases to disambiguate the references to the table.</simpara>\r
1436                         </simplesect>\r
1437                 </simplesect>\r
1438                 <simplesect id="_set_operations">\r
1439                         <title>Set operations</title>\r
1440                         <simpara>Relational databases are effectively just an efficient mechanism for\r
1441                         manipulating sets of values; they are implementations of set theory. There are\r
1442                         three operators for sets (tables) in which each set must have the same number\r
1443                         of columns with compatible data types: the union, intersection, and difference\r
1444                         operators.</simpara>\r
1445                         <formalpara><title>Base tables for the set operation examples</title><para>\r
1446 <programlisting language="sql" linenumbering="unnumbered">\r
1447 SELECT * FROM aaa;\r
1448 \r
1449         id | stuff\r
1450         ----+-------\r
1451         1 | one\r
1452         2 | two\r
1453         3 | three\r
1454         4 | four\r
1455         5 | five\r
1456         (5 rows)\r
1457 \r
1458 SELECT * FROM bbb;\r
1459 \r
1460         id | stuff |   foo\r
1461         ----+-------+----------\r
1462         1 | one   | oneone\r
1463         2 | two   | twotwo\r
1464         5 | five  | fivefive\r
1465         6 | six   | sixsix\r
1466 (4 rows)\r
1467 </programlisting>\r
1468                         </para></formalpara>\r
1469                         <simplesect id="_union">\r
1470                                 <title>Union</title>\r
1471                                 <simpara>The <literal>UNION</literal> operator returns the distinct set of rows that are members of\r
1472                                 either or both of the left-hand and right-hand tables. The <literal>UNION</literal> operator\r
1473                                 does not return any duplicate rows. To return duplicate rows, use the\r
1474                                 <literal>UNION ALL</literal> operator.</simpara>\r
1475                                 <formalpara><title>Example of a UNION set operation</title><para>\r
1476 <programlisting language="sql" linenumbering="unnumbered">\r
1477 -- The parentheses are not required, but are intended to help\r
1478 -- illustrate the sets participating in the set operation\r
1479 (\r
1480   SELECT id, stuff\r
1481     FROM aaa\r
1482 )\r
1483 UNION\r
1484 (\r
1485   SELECT id, stuff\r
1486     FROM bbb\r
1487 )\r
1488 ORDER BY 1\r
1489 ;\r
1490 \r
1491  id | stuff\r
1492 ----+-------\r
1493   1 | one\r
1494   2 | two\r
1495   3 | three\r
1496   4 | four\r
1497   5 | five\r
1498   6 | six\r
1499 (6 rows)\r
1500 </programlisting>\r
1501                                 </para></formalpara>\r
1502                         </simplesect>\r
1503                         <simplesect id="_intersection">\r
1504                                 <title>Intersection</title>\r
1505                                 <simpara>The <literal>INTERSECT</literal> operator returns the distinct set of rows that are common to\r
1506                                 both the left-hand and right-hand tables. To return duplicate rows, use the\r
1507                                 <literal>INTERSECT ALL</literal> operator.</simpara>\r
1508                                 <formalpara><title>Example of an INTERSECT set operation</title><para>\r
1509 <programlisting language="sql" linenumbering="unnumbered">\r
1510 (\r
1511   SELECT id, stuff\r
1512     FROM aaa\r
1513 )\r
1514 INTERSECT\r
1515 (\r
1516   SELECT id, stuff\r
1517     FROM bbb\r
1518 )\r
1519 ORDER BY 1\r
1520 ;\r
1521 \r
1522  id | stuff\r
1523 ----+-------\r
1524   1 | one\r
1525   2 | two\r
1526   5 | five\r
1527 (3 rows)\r
1528 </programlisting>\r
1529                                 </para></formalpara>\r
1530                         </simplesect>\r
1531                         <simplesect id="_difference">\r
1532                                 <title>Difference</title>\r
1533                                 <simpara>The <literal>EXCEPT</literal> operator returns the rows in the left-hand table that do not\r
1534                                 exist in the right-hand table. You are effectively subtracting the common\r
1535                                 rows from the left-hand table.</simpara>\r
1536                                 <formalpara><title>Example of an EXCEPT set operation</title><para>\r
1537 <programlisting language="sql" linenumbering="unnumbered">\r
1538 (\r
1539   SELECT id, stuff\r
1540     FROM aaa\r
1541 )\r
1542 EXCEPT\r
1543 (\r
1544   SELECT id, stuff\r
1545     FROM bbb\r
1546 )\r
1547 ORDER BY 1\r
1548 ;\r
1549 \r
1550  id | stuff\r
1551 ----+-------\r
1552   3 | three\r
1553   4 | four\r
1554 (2 rows)\r
1555 \r
1556 -- Order matters: switch the left-hand and right-hand tables\r
1557 -- and you get a different result\r
1558 (\r
1559   SELECT id, stuff\r
1560     FROM bbb\r
1561 )\r
1562 EXCEPT\r
1563 (\r
1564   SELECT id, stuff\r
1565     FROM aaa\r
1566 )\r
1567 ORDER BY 1\r
1568 ;\r
1569 \r
1570  id | stuff\r
1571 ----+-------\r
1572   6 | six\r
1573 (1 row)\r
1574 </programlisting>\r
1575                                 </para></formalpara>\r
1576                         </simplesect>\r
1577                 </simplesect>\r
1578                 <simplesect id="_views">\r
1579                         <title>Views</title>\r
1580                         <simpara>A view is a persistent <literal>SELECT</literal> statement that acts like a read-only table.\r
1581                         To create a view, issue the <literal>CREATE VIEW</literal> statement, giving the view a name\r
1582                         and a <literal>SELECT</literal> statement on which the view is built.</simpara>\r
1583                         <simpara>The following example creates a view based on our borrower profile count:</simpara>\r
1584                         <formalpara><title>Creating a view</title><para>\r
1585 <programlisting language="sql" linenumbering="unnumbered">\r
1586 CREATE VIEW actor.borrower_profile_count AS\r
1587   SELECT pgt.name AS "Profile", aou.name AS "Library", COUNT(pgt.name) AS "Count"\r
1588     FROM actor.usr au\r
1589       INNER JOIN permission.grp_tree pgt\r
1590         ON au.profile = pgt.id\r
1591       INNER JOIN actor.org_unit aou\r
1592         ON aou.id = au.home_ou\r
1593     WHERE au.deleted IS FALSE\r
1594     GROUP BY pgt.name, aou.name\r
1595     ORDER BY aou.name, pgt.name\r
1596 ;\r
1597 </programlisting>\r
1598                         </para></formalpara>\r
1599                         <simpara>When you subsequently select results from the view, you can apply additional\r
1600                         <literal>WHERE</literal> clauses to filter the results, or <literal>ORDER BY</literal> clauses to change the\r
1601                         order of the returned rows. In the following examples, we issue a simple\r
1602                         <literal>SELECT *</literal> statement to show that the default results are returned in the\r
1603                         same order from the view as the equivalent SELECT statement would be returned.\r
1604                         Then we issue a <literal>SELECT</literal> statement with a <literal>WHERE</literal> clause to further filter the\r
1605                         results.</simpara>\r
1606                         <formalpara><title>Selecting results from a view</title><para>\r
1607 <programlisting language="sql" linenumbering="unnumbered">\r
1608 SELECT * FROM actor.borrower_profile_count;\r
1609 \r
1610           Profile           | Library                    | Count\r
1611 ----------------------------+----------------------------+-------\r
1612  Faculty                    | University Library         |   208\r
1613  Graduate                   | University Library         |    16\r
1614  Patrons                    | University Library         |    62\r
1615 ...\r
1616 \r
1617 -- You can still filter your results with WHERE clauses\r
1618 SELECT *\r
1619   FROM actor.borrower_profile_count\r
1620   WHERE "Profile" = 'Faculty';\r
1621 \r
1622  Profile | Library                    | Count\r
1623 ---------+----------------------------+-------\r
1624  Faculty | University Library         |   208\r
1625  Faculty | College Library            |    64\r
1626  Faculty | College Library 2          |   102\r
1627  Faculty | University Library 2       |   776\r
1628 (4 rows)\r
1629 </programlisting>\r
1630                         </para></formalpara>\r
1631                 </simplesect>\r
1632                 <simplesect id="_inheritance">\r
1633                         <title>Inheritance</title>\r
1634                         <simpara>PostgreSQL supports table inheritance: that is, a child table inherits its\r
1635                         base definition from a parent table, but can add additional columns to its\r
1636                         own definition. The data from any child tables is visible in queries against\r
1637                         the parent table.</simpara>\r
1638                         <simpara>Evergreen uses table inheritance in several areas:\r
1639                           * In the Vandelay MARC batch importer / exporter, Evergreen defines base\r
1640                         tables for generic queues and queued records for which authority record and\r
1641                         bibliographic record child tables\r
1642                           * Billable transactions are based on the <literal>money.billable_xact</literal> table;\r
1643                         child tables include <literal>action.circulation</literal> for circulation transactions\r
1644                         and <literal>money.grocery</literal> for general bills.\r
1645                           * Payments are based on the <literal>money.payment</literal> table; its child table is\r
1646                         <literal>money.bnm_payment</literal> (for brick-and-mortar payments), which in turn has child\r
1647                         tables of <literal>money.forgive_payment</literal>, <literal>money.work_payment</literal>, <literal>money.credit_payment</literal>,\r
1648                         <literal>money.goods_payment</literal>, and <literal>money.bnm_desk_payment</literal>. The\r
1649                         <literal>money.bnm_desk_payment</literal> table in turn has child tables of <literal>money.cash_payment</literal>,\r
1650                         <literal>money.check_payment</literal>, and <literal>money.credit_card_payment</literal>.\r
1651                           * Transits are based on the <literal>action.transit_copy</literal> table, which has a child\r
1652                         table of <literal>action.hold_transit_copy</literal> for transits initiated by holds.\r
1653                           * Generic acquisition line items are defined by the\r
1654                         <literal>acq.lineitem_attr_definition</literal> table, which in turn has a number of child\r
1655                         tables to define MARC attributes, generated attributes, user attributes, and\r
1656                         provider attributes.</simpara>\r
1657                 </simplesect>\r
1658         </section>\r
1659         <section id="understanding_query_performance_with_explain">\r
1660                 <title>Understanding query performance with EXPLAIN</title>\r
1661                 <simpara>Some queries run for a long, long time. This can be the result of a poorly\r
1662                 written query&#8212;a query with a join condition that joins every\r
1663                 row in the <literal>biblio.record_entry</literal> table with every row in the <literal>metabib.full_rec</literal>\r
1664                 view would consume a massive amount of memory and disk space and CPU time&#8212;or\r
1665                 a symptom of a schema that needs some additional indexes. PostgreSQL provides\r
1666                 the <literal>EXPLAIN</literal> tool to estimate how long it will take to run a given query and\r
1667                 show you the <emphasis>query plan</emphasis> (how it plans to retrieve the results from the\r
1668                 database).</simpara>\r
1669                 <simpara>To generate the query plan without actually running the statement, simply\r
1670                 prepend the <literal>EXPLAIN</literal> keyword to your query. In the following example, we\r
1671                 generate the query plan for the poorly written query that would join every\r
1672                 row in the <literal>biblio.record_entry</literal> table with every row in the <literal>metabib.full_rec</literal>\r
1673                 view:</simpara>\r
1674                 <formalpara><title>Query plan for a terrible query</title><para>\r
1675 <programlisting language="sql" linenumbering="unnumbered">\r
1676 EXPLAIN SELECT *\r
1677   FROM biblio.record_entry\r
1678     FULL OUTER JOIN metabib.full_rec ON 1=1\r
1679 ;\r
1680 \r
1681                                    QUERY PLAN\r
1682 -------------------------------------------------------------------------------//\r
1683  Merge Full Join  (cost=0.00..4959156437783.60 rows=132415734100864 width=1379)\r
1684    -&gt;  Seq Scan on record_entry  (cost=0.00..400634.16 rows=2013416 width=1292)\r
1685    -&gt;  Seq Scan on real_full_rec  (cost=0.00..1640972.04 rows=65766704 width=87)\r
1686 (3 rows)\r
1687 </programlisting>\r
1688                 </para></formalpara>\r
1689                 <simpara>This query plan shows that the query would return 132415734100864 rows, and it\r
1690                 plans to accomplish what you asked for by sequentially scanning (<emphasis>Seq Scan</emphasis>)\r
1691                 every row in each of the tables participating in the join.</simpara>\r
1692                 <simpara>In the following example, we have realized our mistake in joining every row of\r
1693                 the left-hand table with every row in the right-hand table and take the saner\r
1694                 approach of using an <literal>INNER JOIN</literal> where the join condition is on the record ID.</simpara>\r
1695                 <formalpara><title>Query plan for a less terrible query</title><para>\r
1696 <programlisting language="sql" linenumbering="unnumbered">\r
1697 EXPLAIN SELECT *\r
1698   FROM biblio.record_entry bre\r
1699     INNER JOIN metabib.full_rec mfr ON mfr.record = bre.id;\r
1700                                         QUERY PLAN\r
1701 ----------------------------------------------------------------------------------------//\r
1702  Hash Join  (cost=750229.86..5829273.98 rows=65766704 width=1379)\r
1703    Hash Cond: (real_full_rec.record = bre.id)\r
1704    -&gt;  Seq Scan on real_full_rec  (cost=0.00..1640972.04 rows=65766704 width=87)\r
1705    -&gt;  Hash  (cost=400634.16..400634.16 rows=2013416 width=1292)\r
1706          -&gt;  Seq Scan on record_entry bre  (cost=0.00..400634.16 rows=2013416 width=1292)\r
1707 (5 rows)\r
1708 </programlisting>\r
1709                 </para></formalpara>\r
1710                 <simpara>This time, we will return 65766704 rows - still way too many rows. We forgot\r
1711                 to include a <literal>WHERE</literal> clause to limit the results to something meaningful. In\r
1712                 the following example, we will limit the results to deleted records that were\r
1713                 modified in the last month.</simpara>\r
1714                 <formalpara><title>Query plan for a realistic query</title><para>\r
1715 <programlisting language="sql" linenumbering="unnumbered">\r
1716 EXPLAIN SELECT *\r
1717   FROM biblio.record_entry bre\r
1718     INNER JOIN metabib.full_rec mfr ON mfr.record = bre.id\r
1719   WHERE bre.deleted IS TRUE\r
1720     AND DATE_TRUNC('MONTH', bre.edit_date) &gt;\r
1721         DATE_TRUNC ('MONTH', NOW() - '1 MONTH'::INTERVAL)\r
1722 ;\r
1723 \r
1724                                         QUERY PLAN\r
1725 ----------------------------------------------------------------------------------------//\r
1726  Hash Join  (cost=5058.86..2306218.81 rows=201669 width=1379)\r
1727    Hash Cond: (real_full_rec.record = bre.id)\r
1728    -&gt;  Seq Scan on real_full_rec  (cost=0.00..1640972.04 rows=65766704 width=87)\r
1729    -&gt;  Hash  (cost=4981.69..4981.69 rows=6174 width=1292)\r
1730          -&gt;  Index Scan using biblio_record_entry_deleted on record_entry bre\r
1731                  (cost=0.00..4981.69 rows=6174 width=1292)\r
1732                Index Cond: (deleted = true)\r
1733                Filter: ((deleted IS TRUE) AND (date_trunc('MONTH'::text, edit_date)\r
1734                  &gt; date_trunc('MONTH'::text, (now() - '1 mon'::interval))))\r
1735 (7 rows)\r
1736 </programlisting>\r
1737                 </para></formalpara>\r
1738                 <simpara>We can see that the number of rows returned is now only 201669; that&#8217;s\r
1739                 something we can work with. Also, the overall cost of the query is 2306218,\r
1740                 compared to 4959156437783 in the original query. The <literal>Index Scan</literal> tells us\r
1741                 that the query planner will use the index that was defined on the <literal>deleted</literal>\r
1742                 column to avoid having to check every row in the <literal>biblio.record_entry</literal> table.</simpara>\r
1743                 <simpara>However, we are still running a sequential scan over the\r
1744                 <literal>metabib.real_full_rec</literal> table (the table on which the <literal>metabib.full_rec</literal>\r
1745                 view is based). Given that linking from the bibliographic records to the\r
1746                 flattened MARC subfields is a fairly common operation, we could create a\r
1747                 new index and see if that speeds up our query plan.</simpara>\r
1748                 <formalpara><title>Query plan with optimized access via a new index</title><para>\r
1749 <programlisting language="sql" linenumbering="unnumbered">\r
1750 -- This index will take a long time to create on a large database\r
1751 -- of bibliographic records\r
1752 CREATE INDEX bib_record_idx ON metabib.real_full_rec (record);\r
1753 \r
1754 EXPLAIN SELECT *\r
1755   FROM biblio.record_entry bre\r
1756     INNER JOIN metabib.full_rec mfr ON mfr.record = bre.id\r
1757   WHERE bre.deleted IS TRUE\r
1758     AND DATE_TRUNC('MONTH', bre.edit_date) &gt;\r
1759         DATE_TRUNC ('MONTH', NOW() - '1 MONTH'::INTERVAL)\r
1760 ;\r
1761 \r
1762                                         QUERY PLAN\r
1763 ----------------------------------------------------------------------------------------//\r
1764  Nested Loop  (cost=0.00..1558330.46 rows=201669 width=1379)\r
1765    -&gt;  Index Scan using biblio_record_entry_deleted on record_entry bre\r
1766            (cost=0.00..4981.69 rows=6174 width=1292)\r
1767          Index Cond: (deleted = true)\r
1768          Filter: ((deleted IS TRUE) AND (date_trunc('MONTH'::text, edit_date) &gt;\r
1769            date_trunc('MONTH'::text, (now() - '1 mon'::interval))))\r
1770    -&gt;  Index Scan using bib_record_idx on real_full_rec\r
1771           (cost=0.00..240.89 rows=850 width=87)\r
1772          Index Cond: (real_full_rec.record = bre.id)\r
1773 (6 rows)\r
1774 </programlisting>\r
1775                 </para></formalpara>\r
1776                 <simpara>We can see that the resulting number of rows is still the same (201669), but\r
1777                 the execution estimate has dropped to 1558330 because the query planner can\r
1778                 use the new index (<literal>bib_record_idx</literal>) rather than scanning the entire table.\r
1779                 Success!</simpara>\r
1780                 <note><simpara>While indexes can significantly speed up read access to tables for common\r
1781                 filtering conditions, every time a row is created or updated the corresponding\r
1782                 indexes also need to be maintained - which can decrease the performance of\r
1783                 writes to the database. Be careful to keep the balance of read performance\r
1784                 versus write performance in mind if you plan to create custom indexes in your\r
1785                 Evergreen database.</simpara></note>\r
1786         </section>\r
1787         <section id="inserting_updating_and_deleting_data">\r
1788                 <title>Inserting, updating, and deleting data</title>\r
1789                 <simplesect id="_inserting_data">\r
1790                         <title>Inserting data</title>\r
1791                         <simpara>To insert one or more rows into a table, use the INSERT statement to identify\r
1792                         the target table and list the columns in the table for which you are going to\r
1793                         provide values for each row. If you do not list one or more columns contained\r
1794                         in the table, the database will automatically supply a <literal>NULL</literal> value for those\r
1795                         columns. The values for each row follow the <literal>VALUES</literal> clause and are grouped in\r
1796                         parentheses and delimited by commas. Each row, in turn, is delimited by commas\r
1797                         (<emphasis>this multiple row syntax requires PostgreSQL 8.2 or higher</emphasis>).</simpara>\r
1798                         <simpara>For example, to insert two rows into the <literal>permission.usr_grp_map</literal> table:</simpara>\r
1799                         <formalpara><title>Inserting rows into the <literal>permission.usr_grp_map</literal> table</title><para>\r
1800                         <programlisting language="sql" linenumbering="unnumbered">INSERT INTO permission.usr_grp_map (usr, grp)\r
1801                           VALUES (2, 10), (2, 4)\r
1802                         ;</programlisting>\r
1803                         </para></formalpara>\r
1804                         <simpara>Of course, as with the rest of SQL, you can replace individual column values\r
1805                         with one or more use sub-selects:</simpara>\r
1806                         <formalpara><title>Inserting rows using sub-selects instead of integers</title><para>\r
1807 <programlisting language="sql" linenumbering="unnumbered">\r
1808 INSERT INTO permission.usr_grp_map (usr, grp)\r
1809   VALUES (\r
1810     (SELECT id FROM actor.usr\r
1811        WHERE family_name = 'Scott' AND first_given_name = 'Daniel'),\r
1812     (SELECT id FROM permission.grp_tree\r
1813        WHERE name = 'Local System Administrator')\r
1814   ), (\r
1815     (SELECT id FROM actor.usr\r
1816        WHERE family_name = 'Scott' AND first_given_name = 'Daniel'),\r
1817     (SELECT id FROM permission.grp_tree\r
1818        WHERE name = 'Circulator')\r
1819   )\r
1820 ;\r
1821 </programlisting>\r
1822                         </para></formalpara>\r
1823                 </simplesect>\r
1824                 <simplesect id="_inserting_data_using_a_select_statement">\r
1825                         <title>Inserting data using a SELECT statement</title>\r
1826                         <simpara>Sometimes you want to insert a bulk set of data into a new table based on\r
1827                         a query result. Rather than a <literal>VALUES</literal> clause, you can use a <literal>SELECT</literal>\r
1828                         statement to insert one or more rows matching the column definitions. This\r
1829                         is a good time to point out that you can include explicit values, instead\r
1830                         of just column identifiers, in the return columns of the <literal>SELECT</literal> statement.\r
1831                         The explicit values are returned in every row of the result set.</simpara>\r
1832                         <simpara>In the following example, we insert 6 rows into the <literal>permission.usr_grp_map</literal>\r
1833                         table; each row will have a <literal>usr</literal> column value of 1, with varying values for\r
1834                         the <literal>grp</literal> column value based on the <literal>id</literal> column values returned from\r
1835                         <literal>permission.grp_tree</literal>:</simpara>\r
1836                         <formalpara><title>Inserting rows via a <literal>SELECT</literal> statement</title><para>\r
1837 <programlisting language="sql" linenumbering="unnumbered">\r
1838 INSERT INTO permission.usr_grp_map (usr, grp)\r
1839   SELECT 1, id\r
1840     FROM permission.grp_tree\r
1841     WHERE id &gt; 2\r
1842 ;\r
1843 \r
1844 INSERT 0 6\r
1845 </programlisting>\r
1846                         </para></formalpara>\r
1847                 </simplesect>\r
1848                 <simplesect id="_deleting_rows">\r
1849                         <title>Deleting rows</title>\r
1850                         <simpara>Deleting data from a table is normally fairly easy. To delete rows from a table,\r
1851                         issue a <literal>DELETE</literal> statement identifying the table from which you want to delete\r
1852                         rows and a <literal>WHERE</literal> clause identifying the row or rows that should be deleted.</simpara>\r
1853                         <simpara>In the following example, we delete all of the rows from the\r
1854                         <literal>permission.grp_perm_map</literal> table where the permission maps to\r
1855                         <literal>UPDATE_ORG_UNIT_CLOSING</literal> and the group is anything other than administrators:</simpara>\r
1856                         <formalpara><title>Deleting rows from a table</title><para>\r
1857 <programlisting language="sql" linenumbering="unnumbered">\r
1858 DELETE FROM permission.grp_perm_map\r
1859   WHERE grp IN (\r
1860     SELECT id\r
1861       FROM permission.grp_tree\r
1862       WHERE name != 'Local System Administrator'\r
1863   ) AND perm = (\r
1864     SELECT id\r
1865       FROM permission.perm_list\r
1866               WHERE code = 'UPDATE_ORG_UNIT_CLOSING'\r
1867           )\r
1868 ;\r
1869 </programlisting>\r
1870                         </para></formalpara>\r
1871                         <note><simpara>There are two main reasons that a <literal>DELETE</literal> statement may not actually\r
1872                         delete rows from a table, even when the rows meet the conditional clause.</simpara></note>\r
1873                         <orderedlist numeration="arabic">\r
1874                         <listitem>\r
1875                         <simpara>\r
1876                         If the row contains a value that is the target of a relational constraint,\r
1877                         for example, if another table has a foreign key pointing at your target\r
1878                         table, you will be prevented from deleting a row with a value corresponding\r
1879                         to a row in the dependent table.\r
1880                         </simpara>\r
1881                         </listitem>\r
1882                         <listitem>\r
1883                         <simpara>\r
1884                         If the table has a rule that substitutes a different action for a <literal>DELETE</literal>\r
1885                         statement, the deletion will not take place. In Evergreen it is common for a\r
1886                         table to have a rule that substitutes the action of setting a <literal>deleted</literal> column\r
1887                         to <literal>TRUE</literal>. For example, if a book is discarded, deleting the row representing\r
1888                         the copy from the <literal>asset.copy</literal> table would severely affect circulation statistics,\r
1889                         bills, borrowing histories, and their corresponding tables in the database that\r
1890                         have foreign keys pointing at the <literal>asset.copy</literal> table (<literal>action.circulation</literal> and\r
1891                         <literal>money.billing</literal> and its children respectively). Instead, the <literal>deleted</literal> column\r
1892                         value is set to <literal>TRUE</literal> and Evergreen&#8217;s application logic skips over these rows\r
1893                         in most cases.\r
1894                         </simpara>\r
1895                         </listitem>\r
1896                         </orderedlist>\r
1897                 </simplesect>\r
1898                 <simplesect id="_updating_rows">\r
1899                         <title>Updating rows</title>\r
1900                         <simpara>To update rows in a table, issue an <literal>UPDATE</literal> statement identifying the table\r
1901                         you want to update, the column or columns that you want to set with their\r
1902                         respective new values, and (optionally) a <literal>WHERE</literal> clause identifying the row or\r
1903                         rows that should be updated.</simpara>\r
1904                         <simpara>Following is the syntax for the <literal>UPDATE</literal> statement:</simpara>\r
1905                         <blockquote>\r
1906                         <literallayout><literal>UPDATE</literal> [<emphasis>table-name</emphasis>]\r
1907                           <literal>SET</literal> [<emphasis>column</emphasis>] <literal>TO</literal> [<emphasis>new-value</emphasis>]\r
1908                           <literal>WHERE</literal> [<emphasis>condition</emphasis>]\r
1909                         ;</literallayout>\r
1910                         </blockquote>\r
1911                 </simplesect>\r
1912         </section>\r
1913         <section id="query_requests">\r
1914                 <title>Query requests</title>\r
1915                 <simpara>The following queries were requested by Bibliomation, but might be reusable\r
1916                 by other libraries.</simpara>\r
1917                 <simplesect id="_monthly_circulation_stats_by_collection_code_library">\r
1918                         <title>Monthly circulation stats by collection code / library</title>\r
1919                         <formalpara><title>Monthly Circulation Stats by Collection Code/Library</title><para>\r
1920 <programlisting language="sql" linenumbering="unnumbered">\r
1921 SELECT COUNT(acirc.id) AS "COUNT", aou.name AS "Library", acl.name AS "Copy Location"\r
1922   FROM asset.copy ac\r
1923     INNER JOIN asset.copy_location acl ON ac.location = acl.id\r
1924     INNER JOIN action.circulation acirc ON acirc.target_copy = ac.id\r
1925     INNER JOIN actor.org_unit aou ON acirc.circ_lib = aou.id\r
1926   WHERE DATE_TRUNC('MONTH', acirc.create_time) = DATE_TRUNC('MONTH', NOW() - INTERVAL '3 month')\r
1927     AND acirc.desk_renewal IS FALSE\r
1928     AND acirc.opac_renewal IS FALSE\r
1929     AND acirc.phone_renewal IS FALSE\r
1930   GROUP BY aou.name, acl.name\r
1931   ORDER BY aou.name, acl.name, 1\r
1932 ;\r
1933 </programlisting>\r
1934                         </para></formalpara>\r
1935                 </simplesect>\r
1936                 <simplesect id="_monthly_circulation_stats_by_borrower_stat_library">\r
1937                         <title>Monthly circulation stats by borrower stat / library</title>\r
1938                         <formalpara><title>Monthly Circulation Stats by Borrower Stat/Library</title><para>\r
1939 <programlisting language="sql" linenumbering="unnumbered">\r
1940 SELECT COUNT(acirc.id) AS "COUNT", aou.name AS "Library", asceum.stat_cat_entry AS "Borrower Stat"\r
1941   FROM action.circulation acirc\r
1942     INNER JOIN actor.org_unit aou ON acirc.circ_lib = aou.id\r
1943     INNER JOIN actor.stat_cat_entry_usr_map asceum ON asceum.target_usr = acirc.usr\r
1944     INNER JOIN actor.stat_cat astat ON asceum.stat_cat = astat.id\r
1945   WHERE DATE_TRUNC('MONTH', acirc.create_time) = DATE_TRUNC('MONTH', NOW() - INTERVAL '3 month')\r
1946     AND astat.name = 'Preferred language'\r
1947     AND acirc.desk_renewal IS FALSE\r
1948     AND acirc.opac_renewal IS FALSE\r
1949     AND acirc.phone_renewal IS FALSE\r
1950   GROUP BY aou.name, asceum.stat_cat_entry\r
1951   ORDER BY aou.name, asceum.stat_cat_entry, 1\r
1952 ;\r
1953 </programlisting>\r
1954                         </para></formalpara>\r
1955                 </simplesect>\r
1956                 <simplesect id="_monthly_intralibrary_loan_stats_by_library">\r
1957                         <title>Monthly intralibrary loan stats by library</title>\r
1958                         <formalpara><title>Monthly Intralibrary Loan Stats by Library</title><para>\r
1959 <programlisting language="sql" linenumbering="unnumbered">\r
1960 SELECT aou.name AS "Library", COUNT(acirc.id)\r
1961   FROM action.circulation acirc\r
1962     INNER JOIN actor.org_unit aou ON acirc.circ_lib = aou.id\r
1963     INNER JOIN asset.copy ac ON acirc.target_copy = ac.id\r
1964     INNER JOIN asset.call_number acn ON ac.call_number = acn.id\r
1965   WHERE acirc.circ_lib != acn.owning_lib\r
1966     AND DATE_TRUNC('MONTH', acirc.create_time) = DATE_TRUNC('MONTH', NOW() - INTERVAL '3 month')\r
1967     AND acirc.desk_renewal IS FALSE\r
1968     AND acirc.opac_renewal IS FALSE\r
1969     AND acirc.phone_renewal IS FALSE\r
1970   GROUP by aou.name\r
1971   ORDER BY aou.name, 2\r
1972 ;\r
1973 </programlisting>\r
1974                         </para></formalpara>\r
1975                 </simplesect>\r
1976                 <simplesect id="_monthly_borrowers_added_by_profile_adult_child_etc_library">\r
1977                         <title>Monthly borrowers added by profile (adult, child, etc) / library</title>\r
1978                         <formalpara><title>Monthly Borrowers Added by Profile (Adult, Child, etc)/Library</title><para>\r
1979 <programlisting language="sql" linenumbering="unnumbered">\r
1980 SELECT pgt.name AS "Profile", aou.name AS "Library", COUNT(pgt.name) AS "Count"\r
1981   FROM actor.usr au\r
1982     INNER JOIN permission.grp_tree pgt\r
1983       ON au.profile = pgt.id\r
1984     INNER JOIN actor.org_unit aou\r
1985       ON aou.id = au.home_ou\r
1986   WHERE au.deleted IS FALSE\r
1987     AND DATE_TRUNC('MONTH', au.create_date) = DATE_TRUNC('MONTH', NOW() - '3 months'::interval)\r
1988   GROUP BY pgt.name, aou.name\r
1989   ORDER BY aou.name, pgt.name\r
1990 ;\r
1991 </programlisting>\r
1992                         </para></formalpara>\r
1993                 </simplesect>\r
1994                 <simplesect id="_borrower_count_by_profile_adult_child_etc_library">\r
1995                         <title>Borrower count by profile (adult, child, etc) / library</title>\r
1996                         <formalpara><title>Borrower Count by Profile (Adult, Child, etc)/Library</title><para>\r
1997 <programlisting language="sql" linenumbering="unnumbered">\r
1998 SELECT pgt.name AS "Profile", aou.name AS "Library", COUNT(pgt.name) AS "Count"\r
1999   FROM actor.usr au\r
2000     INNER JOIN permission.grp_tree pgt\r
2001       ON au.profile = pgt.id\r
2002     INNER JOIN actor.org_unit aou\r
2003       ON aou.id = au.home_ou\r
2004   WHERE au.deleted IS FALSE\r
2005   GROUP BY pgt.name, aou.name\r
2006   ORDER BY aou.name, pgt.name\r
2007 ;\r
2008 </programlisting>\r
2009                         </para></formalpara>\r
2010                 </simplesect>\r
2011                 <simplesect id="_monthly_items_added_by_collection_library">\r
2012                         <title>Monthly items added by collection / library</title>\r
2013                         <simpara>We define a <quote>collection</quote> as a shelving location in Evergreen.</simpara>\r
2014                         <formalpara><title>Monthly Items Added by Collection/Library</title><para>\r
2015 <programlisting language="sql" linenumbering="unnumbered">\r
2016 SELECT aou.name AS "Library", acl.name, COUNT(ac.barcode)\r
2017   FROM actor.org_unit aou\r
2018     INNER JOIN asset.call_number acn ON acn.owning_lib = aou.id\r
2019     INNER JOIN asset.copy ac ON ac.call_number = acn.id\r
2020     INNER JOIN asset.copy_location acl ON ac.location = acl.id\r
2021   WHERE ac.deleted IS FALSE\r
2022     AND acn.deleted IS FALSE\r
2023     AND DATE_TRUNC('MONTH', ac.create_date) = DATE_TRUNC('MONTH', NOW() - '1 month'::interval)\r
2024   GROUP BY aou.name, acl.name\r
2025   ORDER BY aou.name, acl.name\r
2026 ;\r
2027 </programlisting>\r
2028                         </para></formalpara>\r
2029                 </simplesect>\r
2030                 <simplesect id="_hold_purchase_alert_by_library">\r
2031                         <title>Hold purchase alert by library</title>\r
2032                         <simpara>in the following set of queries, we bring together the active title, volume,\r
2033                         and copy holds and display those that have more than a certain number of holds\r
2034                         per title. The goal is to UNION ALL the three queries, then group by the\r
2035                         bibliographic record ID and display the title / author information for those\r
2036                         records that have more than a given threshold of holds.</simpara>\r
2037                         <formalpara><title>Hold Purchase Alert by Library</title><para>\r
2038 <programlisting language="sql" linenumbering="unnumbered">\r
2039 -- Title holds\r
2040 SELECT all_holds.bib_id, aou.name, rmsr.title, rmsr.author, COUNT(all_holds.bib_id)\r
2041   FROM\r
2042     (\r
2043       (\r
2044         SELECT target, request_lib\r
2045           FROM action.hold_request\r
2046           WHERE hold_type = 'T'\r
2047             AND fulfillment_time IS NULL\r
2048             AND cancel_time IS NULL\r
2049       )\r
2050       UNION ALL\r
2051       -- Volume holds\r
2052       (\r
2053         SELECT bre.id, request_lib\r
2054           FROM action.hold_request ahr\r
2055             INNER JOIN asset.call_number acn ON ahr.target = acn.id\r
2056             INNER JOIN biblio.record_entry bre ON acn.record = bre.id\r
2057           WHERE ahr.hold_type = 'V'\r
2058             AND ahr.fulfillment_time IS NULL\r
2059             AND ahr.cancel_time IS NULL\r
2060       )\r
2061       UNION ALL\r
2062       -- Copy holds\r
2063       (\r
2064         SELECT bre.id, request_lib\r
2065           FROM action.hold_request ahr\r
2066             INNER JOIN asset.copy ac ON ahr.target = ac.id\r
2067             INNER JOIN asset.call_number acn ON ac.call_number = acn.id\r
2068             INNER JOIN biblio.record_entry bre ON acn.record = bre.id\r
2069           WHERE ahr.hold_type = 'C'\r
2070             AND ahr.fulfillment_time IS NULL\r
2071             AND ahr.cancel_time IS NULL\r
2072       )\r
2073     ) AS all_holds(bib_id, request_lib)\r
2074   INNER JOIN reporter.materialized_simple_record rmsr\r
2075   INNER JOIN actor.org_unit aou ON aou.id = all_holds.request_lib\r
2076     ON rmsr.id = all_holds.bib_id\r
2077   GROUP BY all_holds.bib_id, aou.name, rmsr.id, rmsr.title, rmsr.author\r
2078   HAVING COUNT(all_holds.bib_id) &gt; 2\r
2079   ORDER BY aou.name\r
2080 ;\r
2081 </programlisting>\r
2082                         </para></formalpara>\r
2083                 </simplesect>\r
2084                 <simplesect id="_update_borrower_records_with_a_different_home_library">\r
2085                         <title>Update borrower records with a different home library</title>\r
2086                         <simpara>In this example, the library has opened a new branch in a growing area,\r
2087                         and wants to reassign the home library for the patrons in the vicinity of\r
2088                         the new branch to the new branch. To accomplish this, we create a staging table\r
2089                         that holds a set of city names and the corresponding branch shortname for the home\r
2090                         library for each city.</simpara>\r
2091                         <simpara>Then we issue an <literal>UPDATE</literal> statement to set the home library for patrons with a\r
2092                         physical address with a city that matches the city names in our staging table.</simpara>\r
2093                         <formalpara><title>Update borrower records with a different home library</title><para>\r
2094 <programlisting language="sql" linenumbering="unnumbered">\r
2095 CREATE SCHEMA staging;\r
2096 CREATE TABLE staging.city_home_ou_map (city TEXT, ou_shortname TEXT,\r
2097   FOREIGN KEY (ou_shortname) REFERENCES actor.org_unit (shortname));\r
2098 INSERT INTO staging.city_home_ou_map (city, ou_shortname)\r
2099   VALUES ('Southbury', 'BR1'), ('Middlebury', 'BR2'), ('Hartford', 'BR3');\r
2100 BEGIN;\r
2101 \r
2102 UPDATE actor.usr au SET home_ou = COALESCE(\r
2103   (\r
2104     SELECT aou.id\r
2105       FROM actor.org_unit aou\r
2106         INNER JOIN staging.city_home_ou_map schom ON schom.ou_shortname = aou.shortname\r
2107         INNER JOIN actor.usr_address aua ON aua.city = schom.city\r
2108       WHERE au.id = aua.usr\r
2109       GROUP BY aou.id\r
2110   ), home_ou)\r
2111 WHERE (\r
2112   SELECT aou.id\r
2113     FROM actor.org_unit aou\r
2114       INNER JOIN staging.city_home_ou_map schom ON schom.ou_shortname = aou.shortname\r
2115       INNER JOIN actor.usr_address aua ON aua.city = schom.city\r
2116     WHERE au.id = aua.usr\r
2117     GROUP BY aou.id\r
2118 ) IS NOT NULL;\r
2119 </programlisting>\r
2120                         </para></formalpara>\r
2121                 </simplesect>\r
2122         </section>\r
2123         \r
2124 </chapter>\r