CouchDB: A brief introduction (Part IV)
Using CouchDB-Lucene
CouchDB-Lucene is an interface between CouchDB and Lucene, a full-text indexing library. Through this interface we can run complex queries to retrieve documents.
Configuring CouchDB-Lucene
The first step is to configure CouchDB to run the CouchDB-Lucene process every time CouchDB is launched and every time a document is updated. We also want the CouchDB-Lucene process to be launched when the URL http://url_del_servidor:5984/base_de_datos/_fti is requested (here url_del_servidor stands for the server address and base_de_datos for the database name). To do this, edit the CouchDB configuration file, local.ini, and add these lines:
[external]
fti = /usr/bin/java -jar /path-couchdb-lucene/couchdb-lucene-0.3-SNAPSHOT-jar-with-dependencies.jar -search
[update_notification]
indexer = /usr/bin/java -jar /path-couchdb-lucene/couchdb-lucene-0.3-SNAPSHOT-jar-with-dependencies.jar -index
[httpd_db_handlers]
_fti = {couch_httpd_external, handle_external_req, <<"fti">>}
We also need to adjust the timeout parameter, because launching the Java Virtual Machine and then running the search can add up to a long delay. The value is expressed in milliseconds, so the next lines give us a timeout of 60 seconds:
[couchdb]
os_process_timeout = 60000
With this, CouchDB-Lucene is configured. Now it's time to create the design documents that configure the indexing process. Note that the procedure we have just followed can be used to launch any other process whenever a document is updated in CouchDB.
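As a rough illustration of that last point, here is a minimal sketch of what another update handler could look like. It assumes that CouchDB writes one JSON object per line to the process's standard input, of the form {"type":"updated","db":"..."}; the function name handleNotification and the callback are hypothetical, and the message format should be checked against the CouchDB version in use.

```javascript
// Sketch of a handler for CouchDB update notifications.
// Assumption: each notification is one JSON object per line,
// for example {"type":"updated","db":"database"}.
// handleNotification and onUpdate are hypothetical names.
function handleNotification(line, onUpdate) {
  let message;
  try {
    message = JSON.parse(line);
  } catch (err) {
    // Ignore lines that are not valid JSON.
    return false;
  }
  if (message && message.type === 'updated' && typeof message.db === 'string') {
    onUpdate(message.db);
    return true;
  }
  return false;
}

// Example call with a hand-written notification line.
handleNotification('{"type":"updated","db":"database"}', (db) => {
  console.log('database changed: ' + db);
});
```

In a real handler the lines would be read from standard input rather than passed in by hand, and the script would be registered under [update_notification] in the same way as the indexer above.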
Creating the indexing documents
In order to use CouchDB-Lucene, we must create at least one design document that specifies which fields are indexed and how. This document must have a field called fulltext, which holds an object named after our index. Inside that object there is another object, defaults, with the parameters for the index, and an entry called index that holds the function selecting the fields we want to include. This function is executed once for each document that is added, edited or deleted. An example makes this clearer (the function is shown on several lines for readability; since JSON strings cannot contain line breaks, it must be written on a single line in the stored document):
{
  "_id":"_design/lucene",
  "fulltext": {
    "by_everything": {
      "defaults": {
        "store":"yes"
      },
      "index":"function(doc){
        var ret=new Document();
        function idx(previous,obj) {
          if (previous!='') previous+='.';
          for (var key in obj){
            switch (typeof obj[key]){
              case 'object':
                idx(previous+key, obj[key]);
                break;
              case 'function':
                break;
              default:
                ret.add(obj[key],{'field':previous+key});
                break;
            }
          }
        };
        idx('',doc);
        return ret;
      }"
    }
  }
}
As we can see, we have changed the default value of the store property to "yes". In Lucene, a stored field keeps its original value inside the index, so the value can be returned with the search results; by default, fields are indexed but their values are not stored.
We can also see the code of the index function. This function should return a Document object with the fields that we want to add to the index. To add a field we use the add method, passing the value to index as the first parameter and an object with the options as the second. The second parameter is optional, but we use it here to set the name of the field.
The example function indexes every field of the document. When a field contains nested data, each nested value is indexed under a dotted name of the form "parent.field: value", which is very useful for advanced searches.
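To see which field names the function produces, the same logic can be run outside CouchDB. The sketch below is only an illustration: the Document class is a stand-in that records calls to add (the real class is supplied by CouchDB-Lucene), and the sample document is hypothetical.

```javascript
// Stand-in for the Document class that CouchDB-Lucene provides to index
// functions. This mock only records the calls so they can be inspected.
class Document {
  constructor() {
    this.fields = [];
  }
  add(value, options) {
    this.fields.push({ field: options.field, value: value });
  }
}

// Same logic as the index function in the design document above.
function index(doc) {
  const ret = new Document();
  function idx(previous, obj) {
    if (previous !== '') previous += '.';
    for (const key in obj) {
      switch (typeof obj[key]) {
        case 'object':
          // Recurse into nested data, extending the dotted prefix.
          idx(previous + key, obj[key]);
          break;
        case 'function':
          break;
        default:
          ret.add(obj[key], { field: previous + key });
      }
    }
  }
  idx('', doc);
  return ret;
}

// Hypothetical sample document.
const sample = {
  _id: 'sample-1',
  Element: { Value: 'Fe' },
  Configuration: { Value: 5 }
};

for (const f of index(sample).fields) {
  console.log(f.field + ': ' + f.value);
}
// prints:
// _id: sample-1
// Element.Value: Fe
// Configuration.Value: 5
```

The dotted names printed here are the ones that can later be used as field names in a search query.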
Making advanced searches
To search with CouchDB-Lucene we just need to request the index defined in the design document that we created earlier, passing the search options as URL parameters. For example:
http://localhost:5984/database/_fti/lucene/by_everything/?q=Configuration.Value:[3 TO 9] AND Element.Value:Fe&sort=\Score&include_docs=true
This URL searches for every document whose Configuration field has a Value property between 3 and 9 and whose Element field has a Value property equal to "Fe" (q=Configuration.Value:[3 TO 9] AND Element.Value:Fe). There are two more parameters: sort=\Score, which sorts the results by the Score property in descending order, and include_docs, which returns the full body of each document instead of only its identifier. To sort in ascending order, remove the backslash ( \ ) or replace it with a forward slash ( / ).
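The query above contains spaces, brackets and a backslash, so when the URL is built from a program it is safer to encode the parameters. Here is a minimal sketch using the standard URLSearchParams class; the function buildSearchUrl and its argument names are hypothetical, while the parameters q, sort and include_docs are the ones used in the example URL.

```javascript
// Builds a CouchDB-Lucene search URL with properly encoded parameters.
// buildSearchUrl is a hypothetical helper, not part of CouchDB-Lucene.
function buildSearchUrl(server, database, designDoc, indexName, params) {
  // URLSearchParams percent-encodes reserved characters such as
  // brackets and the backslash, and encodes spaces as '+'.
  const query = new URLSearchParams(params).toString();
  return server + '/' + database + '/_fti/' + designDoc + '/' + indexName + '/?' + query;
}

const url = buildSearchUrl('http://localhost:5984', 'database', 'lucene', 'by_everything', {
  q: 'Configuration.Value:[3 TO 9] AND Element.Value:Fe',
  sort: '\\Score', // a single backslash, escaped in the JavaScript string
  include_docs: 'true'
});

console.log(url);
```

Decoding the resulting query string gives back exactly the values shown in the article's URL.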
Next ...
This is the last article about CouchDB itself. Some things remain to be explained, such as how to change the output format of the views, but that API is changing right now, so it's better to wait for a stable version. The following articles related to CouchDB will talk about the project that we are building with it. The interesting part begins.