10.16.08

Test Post

Posted in Uncategorized at 6:03 pm by Mary McRae

SRU/CQL 2.0 Proposals

October 2008

The table below shows the current proposals for version 2.0 of SRU and
CQL (current as of the above date). These have been proposed by the
OASIS Search Web Services Technical Committee.

For some proposals there is not yet consensus on an approach, and more
than one approach is listed.

Feature

Description

1. Element selection

Example: Client wants MODS records, but only the single element “dateIssued”.

Two possible approaches.

  1. Via element set names.
  2. Create a new schema with just those elements.

Approach 1 would require a protocol change. Approach 2 would not.

This requirement comes from an attempt to represent select
clauses. Consider the following geospatial example:

“Select the geometry and depth from the HYDROGRAPHY
feature for the area of the Grand Banks. The Grand
Banks are bounded by the following box: [-57.9118,46.2023,-46.6873,51.8145].”

In CQL, that might be partially expressed as:

geo.feature=hydrography AND geo.bbox=/nwse "-57.9118,46.2023,-46.6873,51.8145"

But “select the geometry and depth” cannot be represented within
the CQL expression; it could only be represented within the SRU
request, outside of the query.
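To illustrate how the selection sits outside the query, here is a minimal sketch of Approach 2 built on the existing SRU 1.2 request parameters. The endpoint URL and the schema name "hydrography-geometry-depth" are hypothetical; a real server would list its supported schemas via Explain.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint, used only for illustration.
BASE_URL = "http://example.org/sru"


def build_search_url(cql_query, record_schema, maximum_records=10):
    """Build an SRU 1.2 searchRetrieve URL.

    The CQL expression travels in 'query'; the element selection is
    expressed separately, here through 'recordSchema' (Approach 2).
    """
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": cql_query,
        "recordSchema": record_schema,
        "maximumRecords": str(maximum_records),
    }
    return BASE_URL + "?" + urlencode(params)


cql = 'geo.feature=hydrography AND geo.bbox=/nwse "-57.9118,46.2023,-46.6873,51.8145"'
# Hypothetical schema that would contain only geometry and depth.
url = build_search_url(cql, record_schema="hydrography-geometry-depth")

# Round-trip check: the server would see the CQL expression unchanged.
print(parse_qs(urlsplit(url).query)["query"][0])
```

The CQL expression only narrows which records match; which elements come back is decided by a parameter next to it.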

2. Same container

The classic example: “find ‘A’ and ‘B’ within the same
container element ‘C’”.

Introduce a new context set, ‘element’:

  1. A PROX/element.container=C B
    or
  2. A PROX/element.unit=container/distance=0/element.containerName=C B
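The intended semantics of either form can be sketched on the server side as follows. This is only an illustration, assuming records are XML and terms are matched as whole whitespace-separated words; the function name is hypothetical and not part of any proposal.

```python
import xml.etree.ElementTree as ET


def same_container(xml_text, container, a, b):
    """Return True if terms a and b both occur inside one <container> element."""
    root = ET.fromstring(xml_text)
    # iter() visits the root and all descendants with the given tag.
    for elem in root.iter(container):
        words = "".join(elem.itertext()).lower().split()
        if a.lower() in words and b.lower() in words:
            return True
    return False


together = "<record><C>A and B</C></record>"
apart = "<record><C>A x</C><C>B y</C></record>"
print(same_container(together, "C", "A", "B"))
print(same_container(apart, "C", "A", "B"))
```

In the second record both terms occur in a ‘C’ element, but never in the same one, so it does not match.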
3. ‘window’ relation

Find ‘A’, ‘B’, ‘C’, … within a span of X words.

Example:

* dc.title window/distance<5/unit=word "fries salt vinegar"

Finds titles in which “fries”, “salt”, and “vinegar” all occur within a span of 5 words.
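One way to read the ‘window’ relation is as a sliding window over the words of the field. The sketch below assumes that “distance<5” means all terms fit inside 5 consecutive words and that tokenization is plain whitespace splitting; both are assumptions, since the proposal does not fix these details.

```python
def within_window(text, terms, span):
    """Return True if every term occurs inside some run of `span` consecutive words."""
    words = text.lower().split()
    wanted = {term.lower() for term in terms}
    for start in range(len(words)):
        # Words at positions start .. start + span - 1.
        window = set(words[start:start + span])
        if wanted <= window:
            return True
    return False


terms = ["fries", "salt", "vinegar"]
print(within_window("fries with salt and vinegar", terms, 5))
print(within_window("fries are best served hot with salt and vinegar", terms, 5))
```

Unlike a chain of pairwise proximity operators, the window bounds the distance between the outermost terms, and the order of the terms does not matter.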

4. Boolean modifier ‘prox’

A not near B

Example:

A not/prox/unit=word/distance=3/ordered B

Find occurrences of A that are not followed within 3 words by B.
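The semantics of this example can be sketched over a tokenized field. The sketch assumes that “within 3 words” means B appears at one of the three positions immediately after A; the function name is hypothetical.

```python
def not_followed_within(words, a, b, distance):
    """Return the positions of `a` that have no `b` among the next `distance` words."""
    hits = []
    for i, word in enumerate(words):
        following = words[i + 1:i + 1 + distance]
        if word == a and b not in following:
            hits.append(i)
    return hits


words = "a x x b a x x x b".split()
# The first "a" is followed by "b" three words later, so only the second one qualifies.
print(not_followed_within(words, "a", "b", 3))
```

Because of the ‘ordered’ modifier, only words after A are inspected; a B that precedes A does not disqualify the occurrence.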

5. Faceted search

Two possible approaches.

  1. Via scan
    Add the capability within the Scan operation to scan a result
    set: eliminate the scan clause, add a query parameter, and
    enrich the scan response. The facets would then be the terms
    in the scan response, but only for the records that match
    the query.
  2. Via searchRetrieve
    Add a response parameter, “facetResults”, or more
    generally, “additionalSearchInfo”. Develop a “facet” schema
    (or a more general “additionalSearchInfo” schema). Perhaps
    add a request parameter to indicate that faceted results are
    requested, and which facets.
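Under either approach the server computes the same thing: the distinct terms of a field and their counts, restricted to the records that match the query. The following is a minimal sketch with records modelled as plain dictionaries and the query as a predicate; the data and names are made up for illustration, and no response schema is implied.

```python
from collections import Counter


def facet_counts(records, matches, field):
    """Count the values of `field` over the records accepted by `matches`."""
    return Counter(
        record[field]
        for record in records
        if field in record and matches(record)
    )


# Made-up records, for illustration only.
records = [
    {"title": "Grand Banks charts", "subject": "hydrography", "year": "2007"},
    {"title": "Grand Banks fisheries", "subject": "fisheries", "year": "2008"},
    {"title": "Coastal charts", "subject": "hydrography", "year": "2008"},
]


def title_has_charts(record):
    """Stand-in for evaluating the CQL query against a record."""
    return "charts" in record["title"].lower()


subject_facets = facet_counts(records, title_has_charts, "subject")
year_facets = facet_counts(records, title_has_charts, "year")
print(dict(subject_facets))
print(dict(year_facets))
```

With the scan approach these counts would be returned as terms of an enriched scan response; with the searchRetrieve approach they would travel in the proposed “facetResults” element.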
6. Multiple query types

Two possible approaches.

  1. queryType parameter
    Optional. If omitted, there would be a default (either
    a standard-wide default, i.e. “cql”, or a server-specific
    default specified by Explain).
  2. Query parameter name implies query type
    The list of supported query parameter names is specified
    by Explain.
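The difference between the two approaches shows up in how a server would work out the query type from the request parameters. The sketch below assumes a standard-wide default of “cql” and uses a made-up second query type, “searchTerms”, purely for illustration.

```python
DEFAULT_QUERY_TYPE = "cql"  # assumed standard-wide default


def resolve_by_query_type(params):
    """Approach 1: an optional queryType parameter names the type of 'query'."""
    return params.get("queryType", DEFAULT_QUERY_TYPE), params["query"]


def resolve_by_parameter_name(params, supported_names):
    """Approach 2: the name of the parameter carrying the query implies its type."""
    present = [name for name in supported_names if name in params]
    if len(present) != 1:
        raise ValueError("expected exactly one query parameter")
    return present[0], params[present[0]]


# "searchTerms" is a hypothetical query type a server might advertise via Explain.
supported = ["cql", "searchTerms"]
print(resolve_by_query_type({"query": "dc.title = fish"}))
print(resolve_by_parameter_name({"searchTerms": "fish chips"}, supported))
```

Approach 1 keeps a single ‘query’ parameter and adds one optional parameter; Approach 2 adds no parameter but makes the set of legal parameter names depend on the server.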
7. Alternative Response Format

Two possible approaches.

  1. Request parameter
    Add a request parameter, responseFormat.
  2. Bound to a binding
    Thus for SRU 2.0, the response format would always be the SRU
    2.0 response format defined in the protocol. There could
    be a different binding for RSS, etc.
8. Deprecate ‘operation’ and ‘version’ parameters

Make these optional for compatibility with earlier versions.
9. Non-XML Records

Allow non-XML data in the response records, including value
by reference. These would be signaled by additional values for
the recordPacking parameter. Existing values (‘string’,
‘xml’) would be retained, with a value of ‘uri’ to indicate value by
reference, ‘base64’ for base 64, or in general a MIME Content
Transfer Encoding type.
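A client could then dispatch on the recordPacking value when reading each record. The sketch below is hedged: ‘uri’ and ‘base64’ are only the proposed values, the function name is hypothetical, and the ‘string’ branch assumes the client is handed the still-escaped text of the record.

```python
import base64
from xml.sax.saxutils import unescape


def unpack_record(record_data, record_packing):
    """Return the record payload according to its recordPacking value."""
    if record_packing == "xml":
        # Record is embedded as XML; use it as-is.
        return record_data
    if record_packing == "string":
        # Record is XML-escaped text; undo the escaping.
        return unescape(record_data)
    if record_packing == "base64":
        # Proposed value: arbitrary bytes, base64-encoded.
        return base64.b64decode(record_data)
    if record_packing == "uri":
        # Proposed value: the record is passed by reference.
        # Fetching it is left to the caller.
        return record_data
    raise ValueError("unsupported recordPacking: " + record_packing)


encoded = base64.b64encode(b"\x00\x01binary").decode("ascii")
print(unpack_record(encoded, "base64"))
print(unpack_record("&lt;title&gt;A &amp; B&lt;/title&gt;", "string"))
```

Only the ‘base64’ branch returns bytes; the other branches return text, so a caller needs to know which packing was used before interpreting the result.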
10. Result size precision

Allow the client to indicate how much effort the server should
take to determine or estimate the number of records in the result
set. Similarly, allow the response to indicate the (estimated)
accuracy of the result-set size reported.

The server may be able to determine the exact number of records,
or provide a realistic estimate, but it may be an expensive process.
The server might prefer not to go through that process unless the
client requests that it do so. Or the client might want to
explicitly request that the server go through, or not go through,
that process.

The client might want the first 10 records, or any 10 records,
regardless of how many records there are. In that case if the server
goes through the process of determining how many records there
are, it may go through an expensive process for nothing.

Or perhaps the server cannot determine or estimate the number
of records in the result set. The server should be able to report
this condition.