Searching is a huge topic, so an entire chapter is devoted to a plugin called Doctrine_Search. Doctrine_Search is a fulltext indexing and searching tool that can be used for indexing and searching both databases and files.
Consider we have a class called NewsItem with the following definition:
Now let's say we have an application where users are allowed to search for different news items. An obvious way to implement this would be to build a form and, based on that form, build DQL queries such as:
Here we tell Doctrine that the NewsItem class acts as searchable (internally Doctrine loads Doctrine_Template_Searchable) and that the fields title and content are marked as fulltext indexed fields. This means that every time a NewsItem is added or updated Doctrine will:
In the NewsItem example the [foreign_keys] would simply contain one field, id, with a foreign key reference to NewsItem(id) and an onDelete => CASCADE constraint.
An example row in this table might look something like:
|| keyword || field || position || id ||
|| database || title || 3 || 1 ||
In this example the word database is the third word of the title field of NewsItem 1.
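To make the row layout more concrete, the following is a minimal plain-PHP sketch of how one text field could be broken into rows of this shape. It is not Doctrine's internal code: buildIndexRows() is a hypothetical helper, positions are assumed to be 1-based to match the example row above, and stop-word handling is left out.
<code type="php">
// Illustration only, not Doctrine's implementation.
// buildIndexRows() is a hypothetical helper that turns one text field
// into (keyword, field, position, id) rows.
function buildIndexRows($id, $field, $text)
{
    // Lowercase the text and split on anything that is not a letter or digit.
    $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    $rows = array();
    foreach ($words as $i => $word) {
        $rows[] = array(
            'keyword'  => $word,
            'field'    => $field,
            'position' => $i + 1, // assumed 1-based, as in the example row
            'id'       => $id,
        );
    }
    return $rows;
}

$rows = buildIndexRows(1, 'title', 'Choosing embedded database engines');
// $rows[2] describes 'database': field 'title', position 3, id 1
</code>
Each word of the field produces one row, so a record with long text fields can contribute many rows to the index table.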
Whenever a searchable record is inserted into the database, Doctrine executes the index building procedure. This happens in the background, as the procedure is invoked by the search listener. The phases of this procedure are:
Sometimes you may not want to update the index table directly when new searchable entries are added. Instead you may want to batch update the index table at certain intervals. To disable direct updates, set the batchUpdates option to true.
<code type="php">
$search->setOption('batchUpdates', true);
</code>
The actual batch updating procedure can be invoked with the batchUpdateIndex() method. It takes two optional arguments: limit and offset. Limit can be used for limiting the number of batch indexed entries while the offset can be used for setting the first entry to start the indexing from.
By default Doctrine uses Doctrine_Search_Analyzer_Standard for analyzing the text. This class does the following:
1. Strips out stop-keywords (such as 'and', 'if' etc.)
Many commonly used words such as 'and', 'if' etc. have no relevance for the search, so they are stripped out in order to keep the index size reasonable.
2. Lowercases all keywords
When searching, 'database' and 'DataBase' should be considered equal, so the standard analyzer lowercases all keywords.
3. Replaces all non-alphanumeric characters with whitespace
In normal text many keywords are followed by non-alphanumeric characters, for example 'database.'. The standard analyzer strips these out so that 'database' matches 'database.'.
4. Replaces all quotation marks with empty strings so that "O'Connor" matches "oconnor"
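The four steps above can be approximated in plain PHP as follows. This is only a sketch and not the code of Doctrine_Search_Analyzer_Standard: sketchAnalyze() and its short stop-word list are hypothetical, and the steps are applied in a different order than they are listed so that quotation marks are removed before the remaining punctuation is turned into whitespace.
<code type="php">
// Illustration only, not the code of Doctrine_Search_Analyzer_Standard.
// sketchAnalyze() and the default stop-word list are hypothetical.
function sketchAnalyze($text, $stopWords = array('and', 'if', 'the', 'a'))
{
    // Lowercase so that 'DataBase' and 'database' compare equal.
    $text = strtolower($text);

    // Remove quotation marks entirely so that "O'Connor" becomes "oconnor".
    $text = str_replace(array("'", '"'), '', $text);

    // Replace every remaining run of non-alphanumeric characters with a space.
    $text = preg_replace('/[^a-z0-9]+/', ' ', $text);

    // Split on whitespace and discard stop words.
    $keywords = array();
    foreach (preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY) as $word) {
        if (!in_array($word, $stopWords, true)) {
            $keywords[] = $word;
        }
    }
    return $keywords;
}

$keywords = sketchAnalyze("O'Connor and the DataBase.");
// $keywords is array('oconnor', 'database')
</code>
Removing quotation marks first matters: if they were replaced with whitespace along with other punctuation, "O'Connor" would be split into two keywords instead of matching "oconnor".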
You can write your own analyzer class by making a class that implements Doctrine_Search_Analyzer_Interface. This analyzer can then be applied to the search object as follows:
<code type="php">
$search->setOption('analyzer', new MyAnalyzer());
</code>
++ Query language
Doctrine_Search provides a query language similar to Apache Lucene. The parser behind Doctrine_Search_Query converts human-readable, easy-to-construct search queries to their complex SQL equivalents.
As stated before, Doctrine_Search can also be used for searching files. Let's say we have a directory which we want to be searchable. First we need to create an instance of Doctrine_Search_File, which is a child of Doctrine_Search providing some extra functionality needed for file searches.
<code type="php">
$search = new Doctrine_Search_File();
</code>
The second thing to do is to generate the index table. By default Doctrine names the database index class FileIndex.
<code type="php">
$search->buildDefinition(); // builds the table and record class definitions
$conn->export->exportClasses(array('FileIndex'));
</code>
Now we can start using the file searcher. First let's index a directory:
<code type="php">
$search->indexDirectory('myfiles');
</code>
The indexDirectory() method iterates recursively through the given directory and analyzes all files within it, updating the index table as necessary.
Finally we can start searching for pieces of text within the indexed files: