<?xml version='1.0' encoding='UTF-8'?>
<section xmlns="http://docbook.org/ns/docbook" xmlns:xi="http://www.w3.org/2001/XInclude"
    xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:id="indexedfieldweighting">
    <title>Indexed-Field and Matchpoint Weighting</title>
    <info>
        <abstract>
            <para>This chapter describes indexed-field weighting and matchpoint weighting, which
                control relevance ranking in Evergreen catalog search results.</para>
            <para>
                <tip>
                    <para>In tuning search relevance, it is good practice to make incremental
                        adjustments, capture search logs, and assess results before making further
                        adjustments.</para>
                </tip>
            </para>
        </abstract>
    </info>
    <section>
        <title>Indexed-Field Weighting</title>
        <para>Indexed-field weighting is configured in the Evergreen database in the weight column
            of the config.metabib_field table. The weight column follows the other four columns
            in this table: field_class, name, xpath, and format.</para>
        <para>The following is one representative row from the config.metabib_field table, with
            the columns separated by vertical bars:</para>
        <para>author | conference |
            //mods32:mods/mods32:name[@type='conference']/mods32:namePart[../mods32:role/mods32:roleTerm[text()='creator']]
            | mods32 | 1</para>
        <para>The default value for indexed-field weights in config.metabib_field is 1. Adjust the
            weight of an indexed field to raise or lower the relevance score for matches on that
            field. The weight may be increased or decreased in whole-integer steps.</para>
        <para>For example, if the weight of the title-proper field is increased from 1 to 2, a
            search for <emphasis role="bold">jaguar</emphasis> gives the book titled
            <emphasis role="italic">Aimee and Jaguar</emphasis> twice the relevance, other things
            being equal, of a record that has the term <emphasis role="bold">jaguar</emphasis>
            only in another indexed field that keeps the default weight of 1.</para>
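        <para>The weight change in the example above can be sketched in Python. This is only an
            illustration under stated assumptions: Evergreen stores this table in PostgreSQL, but
            the sketch uses an in-memory SQLite database as a stand-in so that it runs on its own,
            models only the five columns named in this section, and replaces the xpath values with
            a placeholder. The helper name <literal>set_field_weight</literal> is hypothetical and
            is not part of Evergreen.</para>
        <programlisting language="python"><![CDATA[
import sqlite3

# Stand-in for Evergreen's PostgreSQL database: an in-memory SQLite database
# attached under the name "config" so that config.metabib_field resolves.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS config")
conn.execute(
    "CREATE TABLE config.metabib_field ("
    " field_class TEXT, name TEXT, xpath TEXT, format TEXT,"
    " weight INTEGER NOT NULL DEFAULT 1)"
)
conn.executemany(
    "INSERT INTO config.metabib_field (field_class, name, xpath, format)"
    " VALUES (?, ?, ?, ?)",
    [
        ("title", "proper", "(xpath omitted)", "mods32"),
        ("author", "conference", "(xpath omitted)", "mods32"),
    ],
)


def set_field_weight(conn, field_class, name, weight):
    """Set the weight of one indexed field; return the number of rows changed."""
    cur = conn.execute(
        "UPDATE config.metabib_field SET weight = ?"
        " WHERE field_class = ? AND name = ?",
        (weight, field_class, name),
    )
    conn.commit()
    return cur.rowcount


# Raise the title-proper weight from the default of 1 to 2.
changed = set_field_weight(conn, "title", "proper", 2)

# Read the weights back, keyed by (field_class, name).
weights = {
    (field_class, name): weight
    for field_class, name, weight in conn.execute(
        "SELECT field_class, name, weight FROM config.metabib_field"
    )
}
]]></programlisting>
        <para>Against a real Evergreen database, the equivalent
            <literal>UPDATE ... WHERE field_class = 'title' AND name = 'proper'</literal>
            statement would be issued through a PostgreSQL client. Changing one row at a time
            fits the incremental approach recommended at the start of this chapter.</para>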
    </section>
    <section>
        <title>Matchpoint Weighting</title>
        <para>Matchpoint weighting provides another way to fine-tune Evergreen relevance ranking.
            It is configured through floating-point values in the multiplier column of the
            search.relevance_adjustment table.</para>
        <para>The multiplier can be adjusted for one, several, or all of the rows in
            search.relevance_adjustment.</para>
        <para>You can adjust the following three matchpoints:</para>
        <itemizedlist>
            <listitem>
                <para><indexterm>
                        <primary>first_word</primary>
                    </indexterm><literal>first_word</literal> boosts relevance if the query is
                    one term long and matches the first term in the indexed field (search for
                    <emphasis role="bold">twain</emphasis>, get a bonus for
                    <emphasis role="bold">twain, mark</emphasis> but not
                    <emphasis role="bold">mark twain</emphasis>).</para>
            </listitem>
            <listitem>
                <para><indexterm>
                        <primary>word_order</primary>
                    </indexterm><literal>word_order</literal> boosts relevance when the words in
                    the indexed field appear in the same order as the search terms, so that the
                    search <emphasis role="bold">legend suicide</emphasis> ranks the book
                    <emphasis role="italic">Legend of a Suicide</emphasis> above the book
                    <emphasis role="italic">Suicide Legend</emphasis>.</para>
            </listitem>
            <listitem>
                <para><indexterm>
                        <primary>full_match</primary>
                    </indexterm><literal>full_match</literal> boosts relevance when the full
                    query exactly matches the entire indexed field (after space, case, and
                    diacritics are normalized). For example, in a title search for
                    <emphasis role="bold">the future of ice</emphasis>, the title
                    <emphasis role="italic">The Future of Ice</emphasis> receives a relevance
                    boost that <emphasis role="italic">Ice Ages of the Future</emphasis> does
                    not.</para>
            </listitem>
        </itemizedlist>
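        <para>The three matchpoints can be read as simple yes-or-no tests on a query and an
            indexed field value. The following Python sketch is a simplified illustration of the
            descriptions above, not Evergreen's implementation, which runs inside the database.
            The function names mirror the matchpoint names for readability, the
            <literal>normalize</literal> helper is hypothetical, and treating word order as
            "query words appear in the field in the same relative order" is an assumption.</para>
        <programlisting language="python"><![CDATA[
import unicodedata


def normalize(text):
    """Lowercase, drop diacritics and punctuation, and return the list of words."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_marks.lower())
    return cleaned.split()


def first_word(query, field):
    """True when a one-term query equals the first term of the field."""
    q, f = normalize(query), normalize(field)
    return len(q) == 1 and bool(f) and f[0] == q[0]


def word_order(query, field):
    """True when the query words occur in the field in the same relative order."""
    q = normalize(query)
    remaining = iter(normalize(field))
    # "word in remaining" consumes the iterator, so later words must come later.
    return bool(q) and all(word in remaining for word in q)


def full_match(query, field):
    """True when the normalized query equals the entire normalized field."""
    return normalize(query) == normalize(field)
]]></programlisting>
        <para>With these definitions, <literal>first_word("twain", "twain, mark")</literal> is
            true and <literal>first_word("twain", "mark twain")</literal> is false, which matches
            the first example in the list.</para>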
        <para>Here are the default settings of the search.relevance_adjustment table:</para>
        <table xml:id="search.relevance">
            <title>search.relevance_adjustment table</title>
            <tgroup cols="4">
                <thead>
                    <row>
                        <entry>field_class</entry>
                        <entry>name</entry>
                        <entry>bump_type</entry>
                        <entry>multiplier</entry>
                    </row>
                </thead>
                <tbody>
                    <row>
                        <entry>author</entry>
                        <entry>conference</entry>
                        <entry>first_word</entry>
                        <entry>1.5</entry>
                    </row>
                    <row>
                        <entry>author</entry>
                        <entry>corporate</entry>
                        <entry>first_word</entry>
                        <entry>1.5</entry>
                    </row>
                    <row>
                        <entry>author</entry>
                        <entry>other</entry>
                        <entry>first_word</entry>
                        <entry>1.5</entry>
                    </row>
                    <row>
                        <entry>author</entry>
                        <entry>personal</entry>
                        <entry>first_word</entry>
                        <entry>1.5</entry>
                    </row>
                    <row>
                        <entry>keyword</entry>
                        <entry>keyword</entry>
                        <entry>word_order</entry>
                        <entry>10</entry>
                    </row>
                    <row>
                        <entry>series</entry>
                        <entry>seriestitle</entry>
                        <entry>first_word</entry>
                        <entry>1.5</entry>
                    </row>
                    <row>
                        <entry>series</entry>
                        <entry>seriestitle</entry>
                        <entry>full_match</entry>
                        <entry>20</entry>
                    </row>
                    <row>
                        <entry>title</entry>
                        <entry>abbreviated</entry>
                        <entry>first_word</entry>
                        <entry>1.5</entry>
                    </row>
                    <row>
                        <entry>title</entry>
                        <entry>abbreviated</entry>
                        <entry>full_match</entry>
                        <entry>20</entry>
                    </row>
                    <row>
                        <entry>title</entry>
                        <entry>abbreviated</entry>
                        <entry>word_order</entry>
                        <entry>10</entry>
                    </row>
                    <row>
                        <entry>title</entry>
                        <entry>alternative</entry>
                        <entry>first_word</entry>
                        <entry>1.5</entry>
                    </row>
                    <row>
                        <entry>title</entry>
                        <entry>alternative</entry>
                        <entry>full_match</entry>
                        <entry>20</entry>
                    </row>
                    <row>
                        <entry>title</entry>
                        <entry>alternative</entry>
                        <entry>word_order</entry>
                        <entry>10</entry>
                    </row>
                    <row>
                        <entry>title</entry>
                        <entry>proper</entry>
                        <entry>first_word</entry>
                        <entry>1.5</entry>
                    </row>
                    <row>
                        <entry>title</entry>
                        <entry>proper</entry>
                        <entry>full_match</entry>
                        <entry>20</entry>
                    </row>
                    <row>
                        <entry>title</entry>
                        <entry>proper</entry>
                        <entry>word_order</entry>
                        <entry>10</entry>
                    </row>
                    <row>
                        <entry>title</entry>
                        <entry>translated</entry>
                        <entry>first_word</entry>
                        <entry>1.5</entry>
                    </row>
                    <row>
                        <entry>title</entry>
                        <entry>translated</entry>
                        <entry>full_match</entry>
                        <entry>20</entry>
                    </row>
                    <row>
                        <entry>title</entry>
                        <entry>translated</entry>
                        <entry>word_order</entry>
                        <entry>10</entry>
                    </row>
                    <row>
                        <entry>title</entry>
                        <entry>uniform</entry>
                        <entry>first_word</entry>
                        <entry>1.5</entry>
                    </row>
                    <row>
                        <entry>title</entry>
                        <entry>uniform</entry>
                        <entry>full_match</entry>
                        <entry>20</entry>
                    </row>
                    <row>
                        <entry>title</entry>
                        <entry>uniform</entry>
                        <entry>word_order</entry>
                        <entry>10</entry>
                    </row>
                </tbody>
            </tgroup>
        </table>
    </section>
    <section>
        <title>Combining Index Weighting and Matchpoint Weighting</title>
        <para>Index weighting and matchpoint weighting may be combined. The relevance boost of the
            combined weighting is equal to the product of the two values.</para>
        <para>For example, if the weight in config.metabib_field were increased to 2 and the
            multiplier in search.relevance_adjustment were set to 1.2, the combined multiplier
            would be 2 × 1.2 = 2.4, that is, 240% of the unweighted relevance.</para>
        <note>
            <para>In practice, these weights are applied serially (first the index weight, then
                all the matchpoint weights that apply) because they are evaluated at different
                stages of the search process.</para>
        </note>
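        <para>The arithmetic of combined weighting can be sketched in a few lines of Python. The
            function name <literal>combined_multiplier</literal> is hypothetical, and the sketch
            assumes that when several matchpoints apply to the same field, their multipliers are
            multiplied together along with the index weight.</para>
        <programlisting language="python"><![CDATA[
import math


def combined_multiplier(index_weight, matchpoint_multipliers=()):
    """Apply the index weight first, then every matchpoint multiplier that applies."""
    return index_weight * math.prod(matchpoint_multipliers)


# Index weight 2 with a single matchpoint multiplier of 1.2.
example = combined_multiplier(2, [1.2])

# Index weight 2 with the default first_word (1.5) and full_match (20) multipliers.
both_matchpoints = combined_multiplier(2, [1.5, 20])

# No matchpoint applies: only the index weight counts.
index_only = combined_multiplier(2)
]]></programlisting>
        <para>Because the factors multiply, a small change to an index weight is amplified by
            large matchpoint multipliers such as the default full_match value of 20, which is
            one more reason to adjust weights incrementally.</para>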
    </section>
</section>