Migrating Data

Migrating data into Evergreen can be one of the most daunting tasks for an administrator. This chapter explains some procedures to help you migrate bibliographic records, copies, and patrons into the Evergreen system. It requires advanced ILS administration experience, knowledge of Evergreen data structures, and knowledge of how to export data from your current system (or access to data export files from your current system).

Migrating Bibliographic Records

One of the most important and challenging tasks is migrating your bibliographic records to a new system. The procedure may differ depending on the system from which you are migrating and the content of the MARC records exported from that system. The procedures in this section deal with the process once the data from the existing system has been exported into MARC records. They do not cover exporting data from your existing non-Evergreen system.

Several tools for importing bibliographic records into Evergreen can be found in the Evergreen installation folder (/home/opensrf/Evergreen-ILS-1.6.1.6/Open-ILS/src/extras/import/) and are also available from the Evergreen repository (http://svn.open-ils.org/trac/ILS/browser/branches/rel_1_6_1/Open-ILS/src/extras/import).

Converting MARC records to Evergreen BRE JSON format

If you are starting with MARC records from your existing system or another source, use the marc2bre.pl script to create the JSON representation of a bibliographic record entry (hence "bre") in Evergreen.
marc2bre.pl can perform the following functions:

- Converts MARC-8 encoded records to UTF-8 encoding.
- Converts MARC21 to MARCXML.
- Selects the unique record number field (common choices are '035' or '001'; check your records, as you might be surprised how a supposedly unique field actually has duplicates, though marc2bre.pl will select a unique identifier for subsequent duplicates).
- Extracts certain pertinent fields for indexing and display purposes (along with the complete MARCXML record).
- Sets the ID number of the first record from this batch to be imported into the biblio.record_entry table. (Hint: run the following SQL to determine what this number should be to avoid conflicts:

psql -U postgres evergreen
# SELECT MAX(id)+1 FROM biblio.record_entry;

)
- Ignores 'trash' fields that you do not want to retain in Evergreen.

If you are processing multiple sets of MARC records with marc2bre.pl before loading the records into the database, you will need to keep track of the starting ID number for each subsequent batch of records that you are importing. For example, if you are processing three files of MARC records with 10,000 records each into a clean database, you would use the --startid=1, --startid=10001, and --startid=20001 parameters for each respective file.
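For example (a sketch; the file names are hypothetical and other required options are omitted for brevity):

marc2bre.pl --startid=1     -f file1.mrc > file1.bre
marc2bre.pl --startid=10001 -f file2.mrc > file2.bre
marc2bre.pl --startid=20001 -f file3.mrc > file3.bre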
If you use marc2bre.pl to convert your MARC records from the MARC-8 encoding to the UTF-8 encoding, it relies on the MARC::Charset Perl module to complete the conversion. When importing a large set of items, you can speed up the process by using a utility like marc4j or marcdumper to convert the records to MARCXML and UTF-8 before running them through marc2bre.pl with the --marctype=XML flag to tell marc2bre.pl that the records are already in MARCXML format with the UTF-8 encoding. If you take this approach, then due to a current limitation of MARC::File::XML you have to do a horrible thing and ensure that there are no namespace prefixes in front of the element names. marc2bre.pl cannot parse the following example:
<?xml version="1.0" encoding="UTF-8"?>
<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:record>
    <marc:leader>00677nam a2200193 a 4500</marc:leader>
    <marc:controlfield tag="001">H01-0000844</marc:controlfield>
    <marc:controlfield tag="008">t 060420s1950    xx            000 u fre d</marc:controlfield>
    <marc:datafield tag="040" ind1=" " ind2=" ">
      <marc:subfield code="a">CaOHCU</marc:subfield>
      <marc:subfield code="b">fre</marc:subfield>
    </marc:datafield>
    ...
  </marc:record>
</marc:collection>
But marc2bre.pl can parse the same example with the namespace prefixes removed:

<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <leader>00677nam a2200193 a 4500</leader>
    <controlfield tag="001">H01-0000844</controlfield>
    <controlfield tag="008">t 060420s1950    xx            000 u fre d</controlfield>
    <datafield tag="040" ind1=" " ind2=" ">
      <subfield code="a">CaOHCU</subfield>
      <subfield code="b">fre</subfield>
    </datafield>
    ...
  </record>
</collection>
Converting Records for Import into PostgreSQL

Once you have your records in Open-ILS JSON ingest format, you then need to use pg_loader.pl to convert these records into a set of SQL statements that you can use to load the records into PostgreSQL. The --auto and --order command line options take values (bre, mrd, mfr, etc.) that map to class IDs defined in /openils/conf/fm_IDL.xml.
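For example, a minimal sketch (the input file name is hypothetical; the option usage mirrors the authority-record example later in this chapter):

perl pg_loader.pl --auto bre --order bre evergreen_records.ingest > evergreen_records.sql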
Adding Metarecords to the Database

Once you have loaded the records into PostgreSQL, you can create metarecord entries in the metabib.metarecord table by running the following SQL:

psql evergreen
# \i /home/opensrf/Evergreen-ILS-1.6*/Open-ILS/src/extras/import/quick_metarecord_map.sql

Metarecords are required to place holds on items, among other actions.

Migrating Bibliographic Records Using the ESI Migration Tools

The following procedure explains how to migrate bibliographic records from MARC records into Evergreen. This is a general guide and will need to be adjusted for your specific environment. It does not cover exporting records from specific proprietary ILS systems. For assistance with exporting records from your current system, please refer to the manuals for your system, or ask for help from the Evergreen community.
Download the Evergreen migration utilities from the git repository, using the command git clone git://git.esilibrary.com/git/migration-tools.git to clone the migration tools, and install them.

Add environmental variables for the migration and import tools. These paths must point to:

- the import Perl scripts bundled with Evergreen
- the folder where you extracted the migration tools
- the location of the Equinox-Migration Perl modules
- the location of the Evergreen Perl modules (e.g. perl5)

export PATH=[path to Evergreen]/Open-ILS/src/extras/import: \
/[path to migration-tools]/migration-tools:$PATH:.
export PERL5LIB=/openils/lib/perl5: \
/[path to migration-tools]/Equinox-Migration/lib
Dump the MARC records into MARCXML using yaz-marcdump:

echo '<?xml version="1.0" encoding="UTF-8"?>' > imported_marc_records.xml
yaz-marcdump -f MARC-8 -t UTF-8 -o marcxml imported_marc_records.mrc >> imported_marc_records.xml
Test the validity of the XML file using xmllint:

xmllint --noout imported_marc_records.xml 2> marc.xml.err
Clean up the MARC XML file using the marc_cleanup utility:

marc_cleanup --marcfile=imported_marc_records.xml --fullauto [--renumber-from #] -ot 001

The --renumber-from option is required if you have bibliographic records already in your system. Use it to set the starting id number higher than the last id in the biblio.record_entry table. The marc_cleanup command will generate a file called clean.marc.xml.

Create a fingerprint file using the fingerprinter utility:
fingerprinter -o incumbent.fp -x incumbent.ex clean.marc.xml
fingerprinter is used for deduplication of the incumbent records. The -o option specifies the output file and the -x option specifies the error output file.

Create a fingerprint file for existing Evergreen bibliographic records using the fingerprinter utility, if you have bibliographic records previously imported into your system:
fingerprinter -o production.fp -x production.fp.ex --marctype=MARC21 existing_marc_records.mrc \
--tag=901 --subfield=c
fingerprinter is used here for deduplication of the existing (production) records.

Create a merged fingerprint file, removing duplicate records:

cat production.fp incumbent.fp | sort -r > dedupe.fp
match_fingerprints [-t start id] -o records.merge dedupe.fp
Create a new import XML file using the extract_loadset utility:

extract_loadset -l 1 -i clean.marc.xml -o merged.xml records.merge

Extract all of the currently used TCNs and generate the .bre and .ingest files to prepare for the bibliographic record load:
psql -U evergreen -c "select tcn_value from biblio.record_entry where not deleted" \
| perl -npe 's/^\s+//;' > used_tcns
marc2bre.pl --idfield 903 [--startid=#] --marctype=XML -f merged.xml \
--used_tcn_file=used_tcns > evergreen_bre_import_file.bre

The --startid option needs to match the start id used in earlier steps and must be higher than the largest id value in the biblio.record_entry table. The --idfield option should match the MARC datafield used to store your record ids.

Ingest the bibliographic records into the Evergreen database:
perl direct_ingest.pl evergreen_bre_import_file.bre > evergreen_ingest_file.ingest
perl pg_loader.pl --auto bre --order bre evergreen_ingest_file.ingest > bibrecords.sql

This is a sketch: only the bre class is shown here, and depending on your version you will typically also list the metabib classes (mrd, mfr, etc.) in the pg_loader.pl options.
Load the records using psql and the SQL scripts generated in the previous steps:

psql -U evergreen -f bibrecords.sql evergreen
psql -U evergreen evergreen < quick_metarecord_map.sql > log.create_metabib

The second command creates the metarecord mappings, as described in Adding Metarecords to the Database above; adjust the path to quick_metarecord_map.sql as needed.
Extract holdings from the MARC records for importing copies into Evergreen, using the extract_holdings utility:

extract_holdings --marcfile=clean.marc.xml --holding 999 --copyid 999i --map holdings.map

This command extracts holdings based on the 999 datafield in the MARC records. The copy id is generated from subfield i of the 999 datafield. You may need to adjust these options based on the field used for holdings information in your MARC records.

The --map option (holdings.map) refers to a file used for mapping subfields to the holdings data you would like extracted. Here is an example based on mapping holdings data to the 999 datafield:
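(The map lines below are a sketch; the exact subfield assignments, and the precise map-file syntax, depend on your records and your version of the migration tools.)

callnum 999a
barcode 999i
location 999l
owning_lib 999m
circ_modifier 999t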
Running the extract_holdings script should produce an SQL script, HOLDINGS.pg, similar to:

BEGIN;
egid, hseq, l_callnum, l_barcode, l_location, l_owning_lib, l_circ_modifier,
40 0 HD3616.K853 U54 1997 30731100751928 STACKS FENNELL BOOK
41 1 HV6548.C3 S984 1998 30731100826613 STACKS FENNELL BOOK
41 2 HV6548.C3 S984 1998 30731100804958 STACKS BRANTFORD BOOK
...
Edit the HOLDINGS.pg SQL script like so:

BEGIN;
TRUNCATE TABLE staging_items;
COPY staging_items (egid, hseq, l_callnum, l_barcode, l_location,
l_owning_lib, l_circ_modifier) FROM stdin;
40 0 HD3616.K853 U54 1997 30731100751928 STACKS FENNELL BOOK
41 1 HV6548.C3 S984 1998 30731100826613 STACKS FENNELL BOOK
41 2 HV6548.C3 S984 1998 30731100804958 STACKS BRANTFORD BOOK
\.
COMMIT;

This file can be used for importing holdings into Evergreen. The egid column is critical: it is used to link the volume and copy to the bibliographic record. Please refer to the next section, Adding Copies to Bibliographic Records, for the steps to import your holdings into Evergreen.

Adding Copies to Bibliographic Records

Before bibliographic records can be found in an OPAC search, copies need to be created. It is very important to understand how the various tables relate to each other with regard to holdings maintenance.

The following procedure will guide you through the process of populating Evergreen with volumes and copies. This is a very simple example; the SQL queries may need to be adjusted for the specific data in your holdings.

Create a staging_items staging table to hold the holdings data:
CREATE TABLE staging_items (
        l_callnum text,      -- call number label
        hseq int,            -- holdings sequence number (from extract_holdings output)
        egid int,            -- biblio.record_entry.id
        l_createdate date,   -- copy creation date
        l_location text,     -- shelving location
        l_barcode text,
        l_circ_modifier text,
        l_owning_lib text    -- actor.org_unit.shortname
);
Import the items using the HOLDINGS.pg SQL script created with the extract_holdings utility:

psql -U evergreen -f HOLDINGS.pg evergreen

The file HOLDINGS.pg and/or the COPY query may need to be adjusted for your particular circumstances.
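Before generating locations and call numbers, a quick sanity check can help. The following sketch lists owning_lib values in the staging data that do not match any actor.org_unit shortname (such rows would be silently skipped by the joins below):

-- Staging rows with an owning library that Evergreen does not know about
SELECT DISTINCT l_owning_lib FROM staging_items
WHERE l_owning_lib NOT IN (SELECT shortname FROM actor.org_unit);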
Generate shelving locations from your staging table:

INSERT INTO asset.copy_location (name, owning_lib)
SELECT DISTINCT l.l_location, ou.id
FROM staging_items l
JOIN actor.org_unit ou ON (l.l_owning_lib = ou.shortname);
Generate circulation modifiers from your staging table.
INSERT INTO config.circ_modifier (code, name, description, sip2_media_type, magnetic_media)
SELECT DISTINCT l_circ_modifier AS code,
l_circ_modifier AS name,
LOWER(l_circ_modifier) AS description,
'001' AS sip2_media_type,
FALSE AS magnetic_media
FROM staging_items
WHERE l_circ_modifier NOT IN (SELECT code FROM config.circ_modifier);
Generate call numbers from your staging table:
INSERT INTO asset.call_number (creator,editor,record,label,owning_lib)
SELECT DISTINCT 1, 1, l.egid, l.l_callnum, ou.id
FROM staging_items l
JOIN actor.org_unit ou ON (l.l_owning_lib = ou.shortname);
Generate copies from your staging table:
INSERT INTO asset.copy (
circ_lib, creator, editor, create_date, barcode,
STATUS, location, loan_duration, fine_level, circ_modifier, deposit, ref, call_number)
SELECT DISTINCT ou.id AS circ_lib,
1 AS creator,
1 AS editor,
l.l_createdate AS create_date,
l.l_barcode AS barcode,
0 AS STATUS,
cl.id AS location,
2 AS loan_duration,
2 AS fine_level,
l.l_circ_modifier AS circ_modifier,
FALSE AS deposit,
CASE
WHEN l.l_circ_modifier = 'REFERENCE' THEN TRUE
ELSE FALSE
END AS ref,
cn.id AS call_number
FROM staging_items l
JOIN actor.org_unit ou
ON (l.l_owning_lib = ou.shortname)
JOIN asset.copy_location cl
ON (ou.id = cl.owning_lib AND l.l_location = cl.name)
JOIN metabib.real_full_rec m
ON (m.record = l.egid)
JOIN asset.call_number cn
ON (ou.id = cn.owning_lib
AND m.record = cn.record
AND l.l_callnum = cn.label)
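As a hedged post-load check, the following sketch lists staging barcodes that did not result in a copy (for example, because an org unit, shelving location, or call number failed to match in the joins above):

-- Staging rows whose barcode never made it into asset.copy
SELECT l.l_barcode
FROM staging_items l
LEFT JOIN asset.copy ac ON (ac.barcode = l.l_barcode)
WHERE ac.id IS NULL;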
You should now have copies in your Evergreen database and should be able to search and find the bibliographic records with attached copies.

Migrating Patron Data

This section explains the task of migrating your patron data from comma delimited files into Evergreen. It does not deal with the process of exporting from the non-Evergreen system, since that process may vary depending on where you are extracting your patron records. Patron records could come from an ILS, or from a student database in the case of academic libraries.
When importing records into Evergreen you will need to populate three tables in your Evergreen database:

- actor.usr - The main table for user data.
- actor.card - Stores the barcodes for users; users can have more than one card, but only one can be active at a given time.
- actor.usr_address - Used for storing address information; a user can have more than one address.

Before following the procedures below to import patron data into Evergreen, it is a good idea to examine the fields in these tables in order to decide on a strategy for the data to include in your import. It is important to understand the data types and constraints on each field.

Export the patron data from your existing ILS or from another source into a comma delimited file. The comma delimited file used for importing the records should use UTF-8 character encoding.

Create a staging table. A staging table will allow you to tweak the data before importing. Here is an example SQL statement:
CREATE TABLE students (
student_id int, barcode text, last_name text, first_name text, email text, address_type text, street1 text, street2 text,
city text, province text, country text, postal_code text, phone text, profile int DEFAULT 2,
ident_type int, home_ou int, claims_returned_count int DEFAULT 0, usrname text,
net_access_level int DEFAULT 2, password text
);
Note the DEFAULT values. These allow you to set defaults for your library, or to populate required fields if your data allows NULL values where Evergreen requires a value.

The profile field in the above SQL script refers to the user group and should be an integer referencing the id field in permission.grp_tree. Setting this value will affect the permissions for the user. See the values in permission.grp_tree for possibilities.

ident_type is the identification type used for identifying users. This is an integer value referencing config.identification_type and should match the id values of that table. The default values are 1 for Driver's License, 2 for SSN, and 3 for Other.

home_ou is the home organizational unit for the user. This value needs to match the corresponding id in the actor.org_unit table.

Copy records into the staging table from a comma delimited file:
COPY students (student_id, last_name, first_name, email, address_type, street1, street2, city, province, country, postal_code, phone)
FROM '/home/opensrf/patrons.csv'
WITH CSV HEADER;
The above script will vary depending on the format of your patron load file (patrons.csv). You may want to review the PostgreSQL documentation for the COPY command.

Some fields may need reformatting to fit Evergreen's field formats. Here is an example of SQL to adjust phone numbers in the staging table to fit the Evergreen field:

UPDATE students
SET phone = replace(replace(replace(rpad(substring(phone from 1 for 9), 10, '-') ||
substring(phone from 10), '(', ''), ')', ''), ' ', '-');

Data massaging will be required to fit the formats used in Evergreen.
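Note also that the staging table defines columns, such as usrname, that the sample COPY statement does not populate, while the actor.usr insert below expects usrname to be set. A minimal sketch, assuming (hypothetically) that the student id can serve as the user name:

-- Hedged example: derive a user name where the export did not supply one
UPDATE students SET usrname = student_id::text WHERE usrname IS NULL;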
Insert records from the staging table into the actor.usr Evergreen table:
INSERT INTO actor.usr (
profile, usrname, email, passwd, ident_type, ident_value, first_given_name,
family_name, day_phone, home_ou, claims_returned_count, net_access_level)
SELECT profile, students.usrname, email, password, ident_type, student_id,
first_name, last_name, phone, home_ou, claims_returned_count, net_access_level
FROM students;
Insert records into actor.card from the staging table, linking each barcode to its user in actor.usr:
INSERT INTO actor.card (usr, barcode)
SELECT actor.usr.id, students.barcode
FROM students
INNER JOIN actor.usr
ON students.usrname = actor.usr.usrname;
This assumes a one-to-one card-to-patron relationship. If your patron data import assigns multiple cards to one patron, more complex import scripts that look for active or inactive flags may be required; a sketch follows the next statement. Update the actor.usr.card field with the actor.card.id to associate the active card with the user:
UPDATE actor.usr
SET card = actor.card.id
FROM actor.card
WHERE actor.card.usr = actor.usr.id;
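Where patrons have multiple cards, the simple insert and update above will not suffice. One hedged approach, assuming a hypothetical student_cards staging table with one row per card (usrname, barcode, is_active):

-- Load all cards, flagging only the active one
INSERT INTO actor.card (usr, barcode, active)
SELECT au.id, sc.barcode, sc.is_active
FROM student_cards sc
JOIN actor.usr au ON (sc.usrname = au.usrname);

-- Point actor.usr.card at the active card only
UPDATE actor.usr
SET card = ac.id
FROM actor.card ac
WHERE ac.usr = actor.usr.id
  AND ac.active;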
Insert records into actor.usr_address to add address information for users:
INSERT INTO actor.usr_address (usr, street1, street2, city, state, country, post_code)
SELECT actor.usr.id, students.street1, students.street2, students.city, students.province,
students.country, students.postal_code
FROM students
INNER JOIN actor.usr ON students.usrname = actor.usr.usrname;
Update the actor.usr mailing and billing addresses with the id from the actor.usr_address table:
UPDATE actor.usr
SET mailing_address = actor.usr_address.id, billing_address = actor.usr_address.id
FROM actor.usr_address
WHERE actor.usr.id = actor.usr_address.usr;
This assumes one address per patron. More complex scenarios may require more sophisticated SQL.

Creating an SQL Script for Importing Patrons

The procedure for importing patrons can be automated with the help of an SQL script. Follow these steps to create an import script: create a new file, name it import.sql, and edit the file to look similar to this:
BEGIN;
-- Create staging table.
CREATE TABLE students (
student_id int, barcode text, last_name text, first_name text, email text, address_type text, street1 text, street2 text,
city text, province text, country text, postal_code text, phone text, profile int,
ident_type int, home_ou int, claims_returned_count int DEFAULT 0, usrname text,
net_access_level int DEFAULT 2, password text
);
--Copy records from your import text file
COPY students (student_id, last_name, first_name, email, address_type, street1, street2, city, province, country, postal_code, phone, password)
FROM '/home/opensrf/patrons.csv'
WITH CSV HEADER;
--Insert records from the staging table into the actor.usr table.
INSERT INTO actor.usr (
profile, usrname, email, passwd, ident_type, ident_value, first_given_name, family_name,
day_phone, home_ou, claims_returned_count, net_access_level)
SELECT profile, students.usrname, email, password, ident_type, student_id, first_name,
last_name, phone, home_ou, claims_returned_count, net_access_level FROM students;
--Insert records from the staging table into the actor.card table.
INSERT INTO actor.card (usr, barcode)
SELECT actor.usr.id, students.barcode
FROM students
INNER JOIN actor.usr
ON students.usrname = actor.usr.usrname;
--Update actor.usr.card field with actor.card.id to associate active card with the user:
UPDATE actor.usr
SET card = actor.card.id
FROM actor.card
WHERE actor.card.usr = actor.usr.id;
--INSERT records INTO actor.usr_address from staging table.
INSERT INTO actor.usr_address (usr, street1, street2, city, state, country, post_code)
SELECT actor.usr.id, students.street1, students.street2, students.city, students.province,
students.country, students.postal_code
FROM students
INNER JOIN actor.usr ON students.usrname = actor.usr.usrname;
--Update actor.usr mailing address with id from actor.usr_address table.:
UPDATE actor.usr
SET mailing_address = actor.usr_address.id, billing_address = actor.usr_address.id
FROM actor.usr_address
WHERE actor.usr.id = actor.usr_address.usr;
COMMIT;
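To run the finished script against your database (using the user and database names assumed elsewhere in this chapter):

psql -U evergreen -f import.sql evergreen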
Placing the SQL statements between BEGIN; and COMMIT; creates a transaction block, so that if any SQL statement fails, the entire process is canceled and the database is rolled back to its original state. Lines beginning with -- are comments to let you know what each SQL statement is doing, and are not processed.

Batch Updating Patron Data

For academic libraries, doing batch updates to add new patrons to the Evergreen database is a critical task. The above procedures and import script can easily be adapted to create an update script for importing new patrons from external databases. If the data import file contains only new patrons, the above procedures will work well to insert those patrons. However, if the data load contains all patrons, a second staging table and a procedure to remove existing patrons from that second staging table may be required before importing the new patrons; a sketch follows. Moreover, additional steps to update address information and perhaps delete inactive patrons may also be desired, depending on the requirements of the institution.
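A hedged sketch: the full extract is loaded into a hypothetical second staging table, students_new, and rows for patrons who already exist in Evergreen are removed before the insert steps run:

-- Drop rows for patrons already present in Evergreen
DELETE FROM students_new s
WHERE EXISTS (
  SELECT 1 FROM actor.usr au WHERE au.usrname = s.usrname
);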
After the scripts to import and update patrons have been developed, another important task for library staff is to develop an import strategy and schedule which suits the needs of the library. This could be determined by the registration dates of your institution in the case of academic libraries. It is important to balance the convenience of automated patron loads against the cost of processing those loads versus having staff add patrons manually.

Restoring your Evergreen Database to an Empty State

If you've done a test import of records and you want to quickly get Evergreen back to a pristine state, you can create a clean Evergreen database schema by performing the following:
cd ILS/Open-ILS/src/sql/Pg/

Rebuild the database schema:

./build-db.sh [db-hostname] [db-port] [db-name] [db-user] [db-password] [db-version]

This will remove all of your data from the database and restore the default values.

Exporting Bibliographic Records into MARC files

The following procedure explains how to export Evergreen bibliographic records into MARC files using the marc_export support script. All steps should be performed by the opensrf user from your Evergreen server.

Create a text file list of the bibliographic record ids you would like to export from Evergreen. One way to do this is using SQL:
SELECT DISTINCT bre.id FROM biblio.record_entry AS bre
    JOIN asset.call_number AS acn ON acn.record = bre.id
    WHERE bre.deleted = 'false' AND acn.owning_lib = 101 \g /home/opensrf/records.txt
This query will create a file called records.txt containing a column of distinct ids of items owned by the organizational unit with id 101.

Navigate to the support-scripts folder:

cd /home/opensrf/Evergreen-ILS*/Open-ILS/src/support-scripts/

Run marc_export, using the id file you created above to define which records to export:

cat /home/opensrf/records.txt | ./marc_export -i -c /openils/conf/opensrf_core.xml \
-x /openils/conf/fm_IDL.xml -f XML --timeout 5 > exported_files.xml

The example above exports the records into MARCXML format. For help or for more options when running marc_export, run marc_export with the -h option:

./marc_export -h

Processing time for exporting records will depend on several factors, such as the number of records you are exporting. If you are exporting a large number of records, it is recommended that you divide the export id file (records.txt) into files with a manageable number of records.
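For example (a sketch using the standard split utility; the chunk size and file-name prefix are arbitrary):

split -l 10000 records.txt records_chunk_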
Importing Authority Records

The following procedures explain how to import authority records into Evergreen. All steps should be performed by the opensrf user from your Evergreen server.

Importing Authority Records from the Command Line

The major advantages of the command line approach are its speed and its convenience for system administrators who can perform bulk loads of authority records in a controlled environment.
Run marc2are.pl against the authority records, specifying the user name, password, and MARC type (USMARC or XML). Use
STDOUT redirection
to either pipe the output directly into the next command or into an output file for inspection. For example, to process a set of authority records
named auth_small.xml, using the default user name and password, and directing the output into a file named auth.are:

cd Open-ILS/src/extras/import/
perl marc2are.pl --user admin --pass open-ils auth_small.xml > auth.are

Run pg_loader.pl to generate the SQL necessary for importing the authority records into your system. To save time for very large batches of records, you could
simply pipe the output of marc2are.pl directly into pg_loader.pl.

cd Open-ILS/src/extras/import/
perl pg_loader.pl --auto are --order are auth.are > auth_load.sql

Load the authority records from the SQL file that you generated in the last step into your Evergreen database using the psql tool. Assuming the default user
name, host name, and database name for an Evergreen instance, that command looks like:

psql -U evergreen -h localhost -d evergreen -f auth_load.sql

Importing Authority Records Using the MARC Batch Import/Export Interface from the Staff Client

Good for loading batches of up to 5,000 records (roughly) at a time, the major advantages of importing authority records using the MARC Batch Import/Export interface are
that it does not require command-line or direct database access – good for both security in that it minimizes the number of people who need this access and for spreading the
effort around to others in the library – and it does most of the work (for example, figuring out whether the batch of records is in XML or USMARC format) for you.

To import a set of MARC authority records from the MARC Batch Import/Export interface:

From the Evergreen staff client, select Cataloging → MARC Batch Import/Export. The Evergreen MARC File Upload screen opens, with Import Records as the highlighted tab.

From the Bibliographic records drop-down menu, select Authority records.

Enter a name for the queue (batch import job) in the Create a new upload queue field.

Select the Auto-Import Non-Colliding Records checkbox.

Click the Browse… button to select the file of MARC authorities to import.

Click the Upload button to begin importing the records. The screen displays Uploading… Processing… to show that the records are being transferred to the server, then displays a progress bar to show the actual import
progress. When the staff client displays the progress bar, you can disconnect your staff client safely. Very large batches of records might time out at this
stage.Once the import is finished, the staff client displays the results of the import process. You can manually display the import progress by selecting
the Inspect Queue tab of the MARC Batch Import/Export interface and selecting the queue name. By default, the staff client does not
display records that were imported successfully; it only shows records that conflicted with existing entries in the database. The screen shows the overall
status of the import process in the top right-hand corner, with the Total and Imported number of records for the
queue.