Oracle8 ConText Cartridge Application Developer's Guide
Release 2.4
A63821-01


5
Query Expression Feedback

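This chapter describes query expression feedback. It covers the feedback process, ConText parse trees, the structure of the feedback table, and how to obtain query expression feedback.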

The Feedback Process

Figure 5-1

 

Query expression feedback is a feature that lets you see how ConText parses a text or theme query expression before you execute the query. Knowing how ConText evaluates an expression is useful for refining and debugging queries. You can also design your application to use the feedback information to help users write better queries.

The diagram above shows how you use query expression feedback. You execute the PL/SQL procedure CTX_QUERY.FEEDBACK, which generates feedback information and stores it in a table. From the data in this feedback table, you can visualize the ConText parse tree to examine how the expression was expanded and parsed. You can then refine the query and re-execute FEEDBACK, or you can execute the real query with CONTAINS for two-step queries, OPEN_CON for in-memory queries, or SELECT for one-step queries.

In text queries, query expression feedback is especially useful for seeing how ConText expands expressions that contain stem, wildcard, thesaurus, fuzzy, soundex, PL/SQL, or SQE operators before you execute the query, because such expressions can expand into many tokens or produce very large hitlists.

In theme queries, query expression feedback is useful for knowing how ConText uses the knowledge catalog to normalize query expressions.

Understanding ConText Parse Trees

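Before ConText executes a query, it parses the expression. The resulting expression can be represented as a parse tree. A ConText parse tree can show operator precedence, query expansions, theme query normalization, query optimization, stopword rewrites, and the decompounding of composite word tokens.

The output table of the FEEDBACK procedure is a graphical representation of a ConText parse tree.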

Operator Precedence

 

Parse trees are read in a depth-first manner and from left to right. This means the first operation is always furthest to the left and at the bottom of the branch. In this way, parse trees illustrate operator precedence.

The example above shows the parse tree for the evaluation of a AND b OR c, where a, b, and c stand for three arbitrary words. Because the AND operation a AND b is the leftmost operation and at the bottom of the tree, it is executed first. The parse tree thus shows that the AND operator has higher precedence than the OR operator. The resulting query is therefore (a AND b) OR c rather than a AND (b OR c).
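One way to confirm this grouping on your own system is to run FEEDBACK on both the bare and the explicitly parenthesized expression, then compare the stored trees. The following is a minimal sketch, not a tested script. It assumes the policy scott.test_policy and the feedback table test_feedback used in the examples later in this chapter, and it assumes that parentheses group sub-expressions as the notation above suggests. The words cat, dog, and bird are hypothetical stand-ins for a, b, and c, and the feedback_id values Implicit and Explicit are arbitrary labels.

```sql
-- Sketch only: policy, table, words, and feedback_id labels are assumptions.
begin
   ctx_query.feedback(
            policy_name => 'scott.test_policy',
            text_query => 'cat AND dog OR bird',
            feedback_table => 'test_feedback',
            sharelevel => 0,
            feedback_id => 'Implicit');
   ctx_query.feedback(
            policy_name => 'scott.test_policy',
            text_query => '(cat AND dog) OR bird',
            feedback_table => 'test_feedback',
            sharelevel => 0,
            feedback_id => 'Explicit');
end;
/

-- List both trees; matching OPERATION and PARENT_ID values suggest
-- that the two expressions were grouped the same way.
select feedback_id, id, parent_id, operation, object_name, position
from test_feedback
where feedback_id in ('Implicit', 'Explicit')
order by feedback_id, id;
```

Choose words that are not stopwords in your policy, because stopword rewrites can change the shape of the tree.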

Query Expansions

 

The above example shows how ConText expands the query comp% OR ?smith. The parse tree shows that before ConText executes the query, the token comp% is expanded to computer and comptroller, while ?smith is expanded to smith and smythe.

ConText parse trees show similar expansions with thesaurus, wildcard, soundex, stem, SQE, and PL/SQL operators. In the case of the wildcard, soundex, and fuzzy operators, ConText obtains the correct word expansions from the index.
 


Note: 

When you include the SQE operator in the feedback expression, the feedback (the expansion of the stored query expression) is based on the current state of the index and takes into account any inserts, updates, or deletes made to the base table. However, unlike a call to CONTAINS, the call to FEEDBACK does not update or refresh the stored query expression.


 
 

Theme Query Normalization

 

You can use query expression feedback to see how ConText interprets theme queries. The feedback information provides the normalized version of the query as obtained from the knowledge catalog.

The example above shows how ConText normalizes the theme query ratified laws to the themes ratification and law. The resulting expression is an AND operation with weights attached to the normal forms: ratification*0.561 AND law*0.438.
 


Note: 

Because numbers are rounded off when displayed, weights might not always add up to 1.000 exactly. 


 
  
See Also: 

For more information about theme queries, see Chapter 4, "Theme Queries"

 
 

Query Optimization

 

The example above shows how ConText optimizes the expression a AND b AND c, where a, b, and c stand for three different words.

In the first step of the parse, ConText evaluates a AND b, then ANDs the result with c. With such a parse tree, ConText must search for all documents that contain a and b, then search for all documents that contain c, and then intersect the two result sets.

The ConText optimizer realizes this query is more efficiently executed by simultaneously searching for all the documents that contain a and b and c, which is illustrated in the second step of the optimizing process.

Stopword Rewrite

 

The example above shows the parse sequence for the stopword transformation:

non_stopword NOT stopword => non_stopword

Assuming the word "that" is a stopword, ConText reduces the query dog NOT that to dog.
 

See Also: 

To learn more about querying with stopwords, see "Querying with Stopwords" in Chapter 3

For a list of all possible stopword transformations, see Appendix D, "Stopword Transformations"

 
 

Decompounding of Composite Word Tokens

 

When using a composite index with German or Dutch text, you can use query feedback to examine how ConText breaks down a composite word query into its subcomposites. Even though ConText does not return documents that contain only subcomposite words in a query, composite word query feedback is useful for verifying where ConText places word boundaries.

The above example shows that ConText breaks down the German composite word Hauptbahnhof into haupt, bahn, bahnen, and hof.
 


Note: 

To obtain composite word query feedback, the policy's lexer must have its COMPOSITE attribute set to 1.

For more information about defining policies, see the Oracle8 ConText Cartridge Administrator's Guide.


 
 

Understanding the Feedback Table

Before you issue a query, you can obtain the parse tree information for the query expression. The procedure CTX_QUERY.FEEDBACK creates a graphical representation of the parse tree and stores this information in a feedback table, which you create before executing CTX_QUERY.FEEDBACK. To reconstruct ConText parse trees, you must understand the structure of this table.

Table Structure

The feedback table has the following structure:

Table 5-1

Column Name  Datatype      Description
-----------  ------------  ---------------------------------------------------
FEEDBACK_ID  VARCHAR2(30)  The value of the feedback_id argument specified in
                           the FEEDBACK call.
ID           NUMBER        A number assigned to each node in the query
                           execution tree. The root operation node has
                           ID = 1. The nodes are numbered in a top-down,
                           left-first manner as they appear in the parse tree.
PARENT_ID    NUMBER        The ID of the execution step that operates on the
                           output of the ID step. Graphically, this is the
                           parent node in the query execution tree. The root
                           operation node (ID = 1) has PARENT_ID = 0.
OPERATION    VARCHAR2(30)  Name of the internal operation performed. Refer to
                           Table 5-2 for possible values.
OPTIONS      VARCHAR2(30)  Characters that describe a variation on the
                           operation described in the OPERATION column. When
                           an OPERATION has more than one OPTIONS value
                           associated with it, the values are concatenated in
                           the order of processing. See Table 5-3 for
                           possible values.
OBJECT_NAME  VARCHAR2(64)  Section name, wildcard term, or term to look up in
                           the index.
POSITION     NUMBER        The order of processing for nodes that all have the
                           same PARENT_ID. The positions are numbered in
                           ascending order starting at 1.
CARDINALITY  NUMBER        Reserved for future use. You should create this
                           column for forward compatibility.

 

OPERATION Column

Table 5-2 lists the possible values for the OPERATION column in the feedback table:

Table 5-2

Operation Value  Query Operator                         Equivalent Symbol
---------------  -------------------------------------  -----------------
ACCUMULATE       ACCUM
AND              AND
COMPOSITE        (none)                                 (none)
EQUIVALENCE      EQUIV
FIRST_NEXT_DOC
MAX_DOC
MINUS            MINUS
NEAR             NEAR
NOT              NOT
NO_HITS          (no hits will result from this query)
OR               OR
PHRASE           (a phrase term)
SECTION          (section)
THRESHOLD
WEIGHT
WITHIN           within                                 (none)
WORD             (a single term)
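Because a NO_HITS node means that the expression cannot return any documents, an application can check for it before running the real query. The following is a minimal sketch. It assumes a feedback table named test_feedback and a feedback_id of Test, the names used in the examples later in this chapter.

```sql
-- Sketch only: table name and feedback_id are assumptions.
-- A count greater than 0 means the parsed expression contains a NO_HITS node.
select count(*) no_hits_nodes
from test_feedback
where feedback_id = 'Test'
and operation = 'NO_HITS';
```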

 

 

OPTIONS Column

Table 5-3 shows the values for the OPTIONS column in the feedback table. When an OPERATION has more than one OPTIONS associated with it, the OPTIONS values are concatenated in the order of processing.

Table 5-3

Options Value  Description
-------------  ---------------------------------------------------------
($)            Stem
(?)            Fuzzy
(!)            Soundex
(T)            Order for ordered Near.
(F)            Order for unordered Near.
(n)            A number associated with Threshold, Weight, Max, or the
               max_span parameter for the Near operator.
(m-n)          First next range (m and n are integers)
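Because OPTIONS values can be concatenated, a search for one particular option is safer with LIKE than with an equality test. The following is a minimal sketch that finds the nodes produced by the fuzzy operator. It assumes a feedback table named test_feedback and a feedback_id of Test, as in the examples later in this chapter.

```sql
-- Sketch only: table name and feedback_id are assumptions.
-- The ? character has no special meaning in a LIKE pattern, so '%(?)%'
-- matches any OPTIONS value that contains the fuzzy marker.
select id, operation, options, object_name
from test_feedback
where feedback_id = 'Test'
and options like '%(?)%';
```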

 

Example

 

The figure above shows how ConText encodes the parse tree for the query comp% OR ?smith, which asks for all documents that contain words beginning with comp or words that are spelled like smith.

Each node is labeled with a value that corresponds to the OPERATION column in the feedback table. The tree above contains one OR node, two EQUIVALENCE nodes, and four WORD nodes.

The ID and PARENT_ID values are listed beside each node. For example, the OR node has an ID of 1 and PARENT_ID of 0, since it is the root node.

The EQUIVALENCE node with ID = 2, PARENT_ID = 1, has an OBJECT_NAME value of COMP%, because this equivalence operation is a result of wildcard term comp%.

The WORD node with ID = 4 has an OBJECT_NAME value of COMPUTER, because in this instance, computer is one of the words that satisfy comp%.
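The relationship between ID, PARENT_ID, and POSITION can be explored one node at a time. The following is a minimal sketch that lists the children of the EQUIVALENCE node with ID = 2 in their order of processing. It assumes the rows are stored in a feedback table named test_feedback under a feedback_id of Test, as in the examples later in this chapter.

```sql
-- Sketch only: table name and feedback_id are assumptions.
-- Children of a node are the rows whose PARENT_ID equals that node's ID.
select id, operation, object_name, position
from test_feedback
where feedback_id = 'Test'
and parent_id = 2
order by position;
```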

Obtaining Query Expression Feedback

To obtain query expression feedback information, you must do the following:

  1. Create the feedback table.
  2. Execute CTX_QUERY.FEEDBACK.
  3. Retrieve data from the feedback table.
  4. Optionally, construct the expansion tree from the table information.

Creating the Feedback Table

For example, to create a feedback table called test_feedback, use the following SQL statement:

create table test_feedback(
         feedback_id varchar2(30),
         id number,
         parent_id number,
         operation varchar2(30),
         options varchar2(30),
         object_name varchar2(64),
         position number,
         cardinality number);

Executing CTX_QUERY.FEEDBACK

To obtain the expansion of a query expression such as comp% OR ?smith, use CTX_QUERY.FEEDBACK as follows:

begin
   -- parse the expression and store its tree in test_feedback
   ctx_query.feedback(
            policy_name => 'scott.test_policy',
            text_query => 'comp% OR ?smith',
            feedback_table => 'test_feedback',
            sharelevel => 0,
            feedback_id => 'Test');
end;
/
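When you refine an expression and run FEEDBACK again with the same feedback_id, you may want to clear the earlier rows first. This chapter does not state whether FEEDBACK replaces existing rows for a feedback_id or adds to them, so the following sketch simply removes them beforehand as a precaution. The table name and feedback_id are the ones used in this example.

```sql
-- Sketch only: a precaution in case FEEDBACK adds rows rather than
-- replacing the ones already stored under the same feedback_id.
delete from test_feedback
where feedback_id = 'Test';

commit;
```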

Retrieving Data from Feedback Table

To read the feedback table, you can select the columns as follows:

select feedback_id, id, parent_id, operation, options, object_name, position
from test_feedback
order by id;

The output is ordered by ID to simulate a hierarchical query:

FEEDBACK_ID   ID PARENT_ID OPERATION    OPTIONS OBJECT_NAME POSITION 
----------- ---- --------- ------------ ------- ----------- -------- 
Test           1         0 OR           NULL    NULL          1 
Test           2         1 EQUIVALENCE  NULL    COMP%         1
Test           3         2 WORD         NULL    COMPTROLLER   1 
Test           4         2 WORD         NULL    COMPUTER      2 
Test           5         1 EQUIVALENCE  (?)     SMITH         2 
Test           6         5 WORD         NULL    SMITH         1 
Test           7         5 WORD         NULL    SMYTHE        2
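To judge how far an expression expands before you run the real query, you can count the WORD nodes under each parent. The following is a minimal sketch against the test_feedback table populated above. The column alias expanded_words is only an illustrative name.

```sql
-- Count the WORD nodes directly under each parent node.
-- Oracle8-style join syntax; p is the parent row, c is the child row.
select p.id, p.object_name, p.options, count(*) expanded_words
from test_feedback p, test_feedback c
where c.parent_id = p.id
and c.feedback_id = p.feedback_id
and p.feedback_id = 'Test'
and c.operation = 'WORD'
group by p.id, p.object_name, p.options
order by p.id;
```

For the seven rows shown above, this returns one row for COMP% and one row for SMITH, each with a count of 2.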

Constructing the Parse Tree

You can optionally construct an approximate graphical representation of the parse tree using a hierarchical query. This type of query outputs rows in a hierarchical manner, where child nodes are indented under parent nodes.

The following statement selects from a populated feedback table, indenting the output according to level:

select lpad(' ',2*(level-1)) || operation operation, options, object_name, 
position
from test_feedback
start with id = 1
connect by prior id = parent_id;

This statement produces hierarchical output for the query comp% OR ?smith as follows:

OPERATION            OPTIONS    OBJECT_NAME          POSITION 
-------------------- ---------- -------------------- -------
OR                   NULL       NULL                        1 
  EQUIVALENCE        NULL       COMP%                       1 
    WORD             NULL       COMPTROLLER                 1 
    WORD             NULL       COMPUTER                    2 
  EQUIVALENCE        (?)        SMITH                       2 
    WORD             NULL       SMITH                       1 
    WORD             NULL       SMYTHE                      2
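If the feedback table holds the results of more than one FEEDBACK call, every tree has a root with ID = 1, so the statement above would mix them together. The following is a minimal sketch of one way to restrict the hierarchical query to a single tree, assuming the feedback_id value Test from this example.

```sql
-- Walk only the tree stored under feedback_id 'Test'.
-- The second CONNECT BY condition keeps child rows from other
-- feedback_id values out of the result.
select lpad(' ',2*(level-1)) || operation operation, options, object_name,
position
from test_feedback
start with id = 1 and feedback_id = 'Test'
connect by prior id = parent_id
and prior feedback_id = feedback_id;
```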



Copyright © 1998 Oracle Corporation. 
All Rights Reserved. 
