git.evergreen-ils.org Git - Evergreen.git/commit
Avoid data loss by setting MARC::Charset->assume_unicode(1)
author dbs <dbs@dcc99617-32d9-48b4-a31d-7c20da2025e4>
Tue, 3 May 2011 16:34:51 +0000 (16:34 +0000)
committer dbs <dbs@dcc99617-32d9-48b4-a31d-7c20da2025e4>
Tue, 3 May 2011 16:34:51 +0000 (16:34 +0000)
commit fa4ad6fb944c979e3cd8d57daec3b1b4c5e7ba7c
tree d92a84b575827e18b898684de94b086cffb3e2d9
parent f40ca8d7ae973781c40e38586bad56ed92c8d98b

When using MARC::File::XML, MARC::Charset is used to perform character
conversions; however, MARC::File::XML does not tell MARC::Charset that it is
handling Unicode data. Unless MARC::Charset is told that it is handling
Unicode data, it can return an error that results in the loss of data
(typically a subfield containing one or more characters for which
MARC::Charset has no equivalent mapping outside of Unicode).

This problem could be reproduced in authority_control_fields.pl with a
subfield like "von Hans-Christian Müơller". When this subfield was encountered
without assume_unicode(1), a null string was returned for that subfield. If
the record was then written back to the database because an authority match
was found in a different field, the only recourse was to restore the record
from auditor.biblio_record_entry_history. The same sort of problem could occur
for any other script or function that modifies data handed to it
using MARC::File::XML and BinaryEncoding => UTF8.
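
As a rough illustration of the setting described above, a standalone Perl
script that processes MARCXML might enable it as sketched here. Placing the
call near the top of the script, and spelling the import option as 'UTF-8',
are assumptions for illustration; this message does not show the exact lines
changed in each file.

```perl
use strict;
use warnings;

use MARC::Record;
use MARC::Charset;
# Load the XML handler with a UTF-8 binary encoding (spelling assumed here).
use MARC::File::XML (BinaryEncoding => 'UTF-8');

# Tell MARC::Charset to treat incoming data as Unicode, so that characters
# without a non-Unicode mapping are not reported as conversion errors.
MARC::Charset->assume_unicode(1);
```

The call is a class-level setting, so in this sketch it is made once, before
any records are parsed or serialized.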

Signed-off-by: Dan Scott <dscott@laurentian.ca>
git-svn-id: svn://svn.open-ils.org/ILS/branches/rel_2_0@20387 dcc99617-32d9-48b4-a31d-7c20da2025e4
Open-ILS/src/sql/Pg/002.functions.config.sql
Open-ILS/src/sql/Pg/002.schema.config.sql
Open-ILS/src/sql/Pg/011.schema.authority.sql
Open-ILS/src/sql/Pg/020.schema.functions.sql
Open-ILS/src/sql/Pg/030.schema.metabib.sql
Open-ILS/src/sql/Pg/1.6.1-2.0-upgrade-db.sql
Open-ILS/src/sql/Pg/upgrade/0528.schema.functions_assume_unicode.sql [new file with mode: 0644]
Open-ILS/src/support-scripts/authority_control_fields.pl