Support Scripts
---------------

Various scripts are included with Evergreen in the `/openils/bin/` directory (and in the source code in `Open-ILS/src/support-scripts` and `Open-ILS/src/extras`). Some of them are used during the installation process, such as `eg_db_config`, while others are usually run as cron jobs for routine maintenance, such as `fine_generator.pl` and `hold_targeter.pl`. Others are useful for less frequent needs, such as the scripts for importing/exporting MARC records. You may explore these scripts and adapt them for your local needs. You are also welcome to share your improvements or ask any questions on the http://evergreen-ils.org/communicate/[Evergreen IRC channel or email lists].

Here is a summary of the most commonly used scripts. The script name links to more thorough documentation, if available.

* action_trigger_aggregator.pl -- Groups together event output for already processed events. Useful for creating files that contain data from a group of events, such as a CSV file with all the overdue data for one day.
* <<_processing_action_triggers,action_trigger_runner.pl>> -- Useful for creating events for specified hooks and running pending events
* authority_authority_linker.pl -- Links reference headings in authority records to main entry headings in other authority records. Should be run at least once a day (only for changed records).
* <<_authority_control_fields,authority_control_fields.pl>> -- Links bibliographic records to the best matching authority record. Should be run at least once a day (only for changed records). You can accomplish this by running _authority_control_fields.pl --days-back=1_
* autogen.sh -- Generates web files used by the OPAC, especially files related to the organization unit hierarchy, fieldmapper IDL, locale selection, facet definitions, compressed JS files, and related cache keys
* clark-kent.pl -- Used to start and stop the reporter (which runs scheduled reports)
* <<_creating_the_evergreen_database,eg_db_config>> -- Creates the database and schema, updates configuration files, and sets the Evergreen administrator username and password
* fine_generator.pl
* hold_targeter.pl
* <<_importing_authority_records_from_command_line,marc2are.pl>> -- Converts authority records from MARC format to Evergreen objects suitable for importing via pg_loader.pl (or parallel_pg_loader.pl)
* marc2bre.pl -- Converts bibliographic records from MARC format to Evergreen objects suitable for importing via pg_loader.pl (or parallel_pg_loader.pl)
* marc2sre.pl -- Converts serial records from MARC format to Evergreen objects suitable for importing via pg_loader.pl (or parallel_pg_loader.pl)
* <<_marc_export,marc_export>> -- Exports authority, bibliographic, and serial holdings records into any of these formats: USMARC, UNIMARC, XML, BRE, ARE
* osrf_control -- Used to start, stop, and send signals to OpenSRF services
* parallel_pg_loader.pl -- Uses the output of marc2bre.pl (or similar tools) to generate the SQL for importing records into Evergreen in a parallel fashion

anchor:_authority_control_fields[]

authority_control_fields: Connecting Bibliographic and Authority records
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

indexterm:[authority control]

This script matches headings in bibliographic records to the appropriate authority records. When it finds a match, it will add a subfield 0 to the matching bibliographic field.
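Because only changed records need to be examined, this script is typically scheduled as a nightly cron job using the `--days-back` option mentioned above. A minimal sketch of such a crontab entry (the run time shown is arbitrary):

----
# Link headings in records changed during the past day, every night at 2:00
0 2 * * * /openils/bin/authority_control_fields.pl --days-back=1
----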
Here is how the matching works:

[options="header",cols="1,1,3"]
|=========================================================
|Bibliographic field|Authority field it matches|Subfields that it examines
|100|100|a,b,c,d,f,g,j,k,l,n,p,q,t,u
|110|110|a,b,c,d,f,g,k,l,n,p,t,u
|111|111|a,c,d,e,f,g,j,k,l,n,p,q,t,u
|130|130|a,d,f,g,h,k,l,m,n,o,p,r,s,t
|600|100|a,b,c,d,f,g,h,j,k,l,m,n,o,p,q,r,s,t,v,x,y,z
|610|110|a,b,c,d,f,g,h,k,l,m,n,o,p,r,s,t,v,w,x,y,z
|611|111|a,c,d,e,f,g,h,j,k,l,n,p,q,s,t,v,x,y,z
|630|130|a,d,f,g,h,k,l,m,n,o,p,r,s,t,v,x,y,z
|648|148|a,v,x,y,z
|650|150|a,b,v,x,y,z
|651|151|a,v,x,y,z
|655|155|a,v,x,y,z
|700|100|a,b,c,d,f,g,j,k,l,n,p,q,t,u
|710|110|a,b,c,d,f,g,k,l,n,p,t,u
|711|111|a,c,d,e,f,g,j,k,l,n,p,q,t,u
|730|130|a,d,f,g,h,j,k,m,n,o,p,r,s,t
|751|151|a,v,x,y,z
|800|100|a,b,c,d,e,f,g,j,k,l,n,p,q,t,u,4
|830|130|a,d,f,g,h,k,l,m,n,o,p,r,s,t
|=========================================================

anchor:_marc_export[]

marc_export: Exporting Bibliographic Records into MARC files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

indexterm:[marc_export]

indexterm:[MARC records,exporting,using the command line]

The following procedure explains how to export Evergreen bibliographic records into MARC files using the *marc_export* support script. All steps should be performed by the `opensrf` user from your Evergreen server.

[NOTE]
Processing time for exporting records depends on several factors, such as the number of records you are exporting. If you are exporting a large number of records, it is recommended that you divide the export ID file (`records.txt`) into batches of a manageable size.

. Create a text file list of the bibliographic record IDs you would like to export from Evergreen. One way to do this is using SQL:
+
[source,sql]
----
SELECT DISTINCT bre.id FROM biblio.record_entry AS bre
    JOIN asset.call_number AS acn ON acn.record = bre.id
    WHERE bre.deleted='false' and owning_lib=101 \g /home/opensrf/records.txt;
----
+
This query creates a file called `records.txt` containing a single column of distinct IDs of non-deleted bibliographic records with holdings owned by the organizational unit with ID 101.
. Navigate to the support-scripts folder:
+
----
cd /home/opensrf/Evergreen-ILS*/Open-ILS/src/support-scripts/
----
. Run *marc_export*, using the ID file you created in step 1 to define which records to export. The following example exports the records in MARCXML format.
+
----
cat /home/opensrf/records.txt | ./marc_export --store -i -c /openils/conf/opensrf_core.xml \
    -x /openils/conf/fm_IDL.xml -f XML --timeout 5 > exported_files.xml
----

[NOTE]
====================
`marc_export` does not output progress as it executes.
====================

Options
^^^^^^^

The *marc_export* support script includes several options. You can find a complete list by running `./marc_export -h`. A few key options are also listed below:

--descendants and --library
+++++++++++++++++++++++++++

The `marc_export` script has two related options, `--descendants` and `--library`. Both options take the shortname of an organizational unit as their argument.

The `--library` option will export records with holdings at the specified organizational unit only. By default, this only includes physical holdings, not electronic ones (also known as located URIs).

The `--descendants` option works much like the `--library` option, except that it is aware of the organizational tree and will export records with holdings at the specified organizational unit and all of its descendants. This is handy if you want to export the records for all of the branches of a system: you can specify this option with the system's shortname instead of specifying multiple `--library` options for each branch.
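For example, the following sketch exports everything held anywhere in a system, combining `--descendants` with the `--items` and `--uris` options described below to capture both physical and electronic holdings. The shortname `SYS1` is hypothetical; substitute your own, and run `./marc_export -h` to confirm the flags supported by your version:

----
# Filter the exported records to one system's holdings (SYS1 is illustrative)
cat /home/opensrf/records.txt | ./marc_export --store -i -c /openils/conf/opensrf_core.xml \
    -x /openils/conf/fm_IDL.xml -f XML --descendants SYS1 --items --uris > sys1_holdings.xml
----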
Both the `--library` and `--descendants` options can be repeated. All of the specified organizational units and their descendants will be included in the output. You can also combine `--library` and `--descendants` options when necessary.

--items
+++++++

The `--items` option will add an 852 field for every relevant item to the MARC record. This 852 field includes the following information:

[options="header",cols="2,3"]
|===================================
|Subfield |Contents
|$b (occurrence 1) |Call number owning library shortname
|$b (occurrence 2) |Item circulating library shortname
|$c |Shelving location
|$g |Circulation modifier
|$j |Call number
|$k |Call number prefix
|$m |Call number suffix
|$p |Barcode
|$s |Status
|$t |Copy number
|$x |Miscellaneous item information
|$y |Price
|===================================

--since
+++++++

You can use the `--since` option to export records modified after a certain date and time.

--store
+++++++

By default, marc_export will use the reporter storage service, which should work in most cases. But if you have a separate reporter database and you know you want to talk directly to your main production database, then you can set the `--store` option to `cstore` or `storage`.

--uris
++++++

The `--uris` option (short form: `-u`) allows you to export records with located URIs (i.e. electronic resources). When used by itself, it will export only records that have located URIs. When used in conjunction with `--items`, it will add records with located URIs but no items/copies to the output. If combined with a `--library` or `--descendants` option, this option will limit its output to those records with URIs at the designated libraries.

The best way to use this option is in combination with `--items` and one of the `--library` or `--descendants` options to export *all* of a library's holdings, both physical and electronic.

anchor:_pingest_pl[]

Parallel Ingest with pingest.pl
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

indexterm:[pingest.pl]

indexterm:[MARC records,importing,using the command line]

A program named pingest.pl allows fast bibliographic record ingest. It performs ingest in parallel so that multiple batches can be done simultaneously. It operates by splitting the records to be ingested into batches and running all of the ingest methods on each batch. You may pass in options to control how many batches are run at the same time, how many records there are per batch, and which ingest operations to skip.

NOTE: The browse ingest is presently done in a single process over all of the input records, as it cannot run in parallel with itself. It does, however, run in parallel with the other ingests.

Command Line Options
^^^^^^^^^^^^^^^^^^^^

pingest.pl accepts the following command line options:

--host::
The server where PostgreSQL runs (either host name or IP address). The default is read from the PGHOST environment variable or "localhost."

--port::
The port that PostgreSQL listens to on host. The default is read from the PGPORT environment variable or 5432.

--db::
The database to connect to on the host. The default is read from the PGDATABASE environment variable or "evergreen."

--user::
The username for database connections. The default is read from the PGUSER environment variable or "evergreen."

--password::
The password for database connections. The default is read from the PGPASSWORD environment variable or "evergreen."

--batch-size::
Number of records to process per batch. The default is 10,000.

--max-child::
Max number of worker processes (i.e. the number of batches to process simultaneously). The default is 8.

--skip-browse::
--skip-attrs::
--skip-search::
--skip-facets::
--skip-display::
Skip the selected reingest component.

--attr::
This option allows the user to specify which record attributes to reingest. It can be used one or more times to specify one or more attributes to ingest. It can be omitted to reingest all record attributes. This option is ignored if the `--skip-attrs` option is used.
+
The `--attr` option is most useful after doing something specific that requires only a partial ingest of records. For instance, if you add a new language to the `config.coded_value_map` table, you will want to reingest the `item_lang` attribute on all of your records. The following command line will do that, and only that, ingest:
+
----
$ /openils/bin/pingest.pl --skip-browse --skip-search --skip-facets \
    --skip-display --attr=item_lang
----

--rebuild-rmsr::
This option will rebuild the `reporter.materialized_simple_record` (rmsr) table after the ingests are complete.
+
This option might prove useful if you want to rebuild the table as part of a larger reingest. If all you wish to do is rebuild the rmsr table, then it would be just as simple to connect to the database server and run the following SQL:
+
[source,sql]
----
SELECT reporter.refresh_materialized_simple_record();
----
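Putting the connection and batching options together, a full reingest might look like the following sketch. All flags are documented above; the host name and the tuning values are purely illustrative, not recommendations:

----
# Full reingest against a remote database, with larger worker pool
# and smaller batches (db.example.org is a placeholder host name)
$ /openils/bin/pingest.pl --host db.example.org --user evergreen \
    --batch-size 5000 --max-child 16
----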
Importing Authority Records from Command Line
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

indexterm:[marc2are.pl]

indexterm:[pg_loader.pl]

indexterm:[MARC records,importing,using the command line]

The major advantages of the command line approach are its speed and its convenience for system administrators who can perform bulk loads of authority records in a controlled environment. For alternate instructions, see the cataloging manual.

. Run *marc2are.pl* against the authority records, specifying the user name, password, and MARC type (USMARC or XML). Use `STDOUT` redirection to either pipe the output directly into the next command or into an output file for inspection. For example, to process a file with authority records in MARCXML format named `auth_small.xml`, using the default user name and password, and directing the output into a file named `auth.are`:
+
----
cd Open-ILS/src/extras/import/
perl marc2are.pl --user admin --pass open-ils --marctype XML auth_small.xml > auth.are
----
+
[NOTE]
The MARC type will default to USMARC if the `--marctype` option is not specified.
. Run *parallel_pg_loader.pl* to generate the SQL necessary for importing the authority records into your system. This script will create files in your current directory with filenames like `pg_loader-output.are.sql` and `pg_loader-output.sql` (which runs the previous SQL file). To continue with the previous example by processing our new `auth.are` file:
+
----
cd Open-ILS/src/extras/import/
perl parallel_pg_loader.pl --auto are --order are auth.are
----
+
[TIP]
To save time for very large batches of records, you could simply pipe the output of *marc2are.pl* directly into *parallel_pg_loader.pl*, as sketched after this procedure.
. Load the authority records from the SQL file that you generated in the previous step into your Evergreen database using the psql tool. Assuming the default user name, host name, and database name for an Evergreen instance, that command looks like:
+
----
psql -U evergreen -h localhost -d evergreen -f pg_loader-output.sql
----
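Putting the whole procedure together, a minimal sketch of the piped variant mentioned in the TIP above might look like this. It reuses the example credentials and file names from the steps, and assumes *parallel_pg_loader.pl* reads records on standard input when no file name is given:

----
cd Open-ILS/src/extras/import/
# Convert and stage in one pass (assumes stdin is accepted when no file is given)
perl marc2are.pl --user admin --pass open-ils --marctype XML auth_small.xml | \
    perl parallel_pg_loader.pl --auto are --order are
# Then load the generated SQL as in the final step above
psql -U evergreen -h localhost -d evergreen -f pg_loader-output.sql
----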
Juvenile-to-adult batch script
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The batch `juv_to_adult.srfsh` script is responsible for toggling a patron from juvenile to adult. It should be set up as a cron job.

This script changes patrons to adult when they reach the age value set in the library setting named "Juvenile Age Threshold" (`global.juvenile_age_threshold`). When no library setting value is present at a given patron's home library, the value passed in to the script will be used as a default.

MARC Stream Importer
~~~~~~~~~~~~~~~~~~~~

indexterm:[MARC records,importing,using the command line]

The MARC Stream Importer can import authority records or bibliographic records. A single running instance of the script can import either type of record, based on the record leader.

This support script has its own configuration file, _marc_stream_importer.conf_, which includes settings related to logs, ports, users, and access control.

_marc_stream_importer.pl_ will typically be located in the _/openils/bin_ directory, and _marc_stream_importer.conf_ in _/openils/conf_.

The importer is even more flexible than the staff client import, including the following options:

* _--bib-auto-overlay-exact_ and _--auth-auto-overlay-exact_: overlay/merge on exact 901c matches
* _--bib-auto-overlay-1match_ and _--auth-auto-overlay-1match_: overlay/merge when exactly one match is found
* _--bib-auto-overlay-best-match_ and _--auth-auto-overlay-best-match_: overlay/merge on best match
* _--bib-import-no-match_ and _--auth-import-no-match_: import when no match is found

One advantage to using this tool instead of the staff client Import interface is that the MARC Stream Importer can load a group of files at once.
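A sketch of one common invocation, combining two of the options above so that bibliographic records with exactly one match are overlaid and unmatched records are imported as new. Connection, port, and spool details are assumed to come from _marc_stream_importer.conf_; passing the configuration file as the first argument is an assumption here, and other required arguments vary by version, so check the script's `--help` output:

----
# A hedged sketch: the overlay/import flags are from the list above;
# the positional config-file argument is an assumption -- verify with --help.
/openils/bin/marc_stream_importer.pl /openils/conf/marc_stream_importer.conf \
    --bib-auto-overlay-1match --bib-import-no-match
----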