Detailed Example of Running Omega for the First Time

This document outlines in detail the steps needed to get an example search engine based on Omega and Xapian up and running. I'll point you to a set of files that you can install on your own system and index. This example uses omindex and omega.

Requirements are:

Apache or another HTTP server that you are familiar with
A C++ compiler

This example was developed on Linux. I have no idea how to get it to run on any other OS, so it's up to you to translate the instructions here to your specific system. I'm running a Debian 3.1 system with Apache 2 and G++ 3.3.
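
Before starting, it may help to confirm that the build tools are actually on your PATH. The following is only a sketch: missing_tools is a hypothetical helper, not part of Xapian or Omega, and the tool names passed to it (g++, make, tar) are assumptions based on the commands used later in this example.

```shell
# Hypothetical helper: print each named tool that is not found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools assumed by the steps in this example; prints nothing if all are present.
missing_tools g++ make tar
```

Any name it prints needs to be installed through your distribution's package manager before you continue.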

First you must install the Xapian libraries. Download the source from the Xapian download site, http://www.xapian.org/download.php

Extract the files from the archive with the following command. Note that the file name will probably be different from this example:

tar xzf xapian-core-0.9.4.tar.gz

This will create a directory, xapian-core-0.9.4, so change to that directory, e.g.

cd xapian-core-0.9.4

And configure via:

./configure

If there are no errors, you can build the libraries with make:

make

Assuming the make went OK and you didn't get any errors, become root (with su or sudo) and install:

su
make install
exit

This will install the Xapian libraries on your system.
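
One way to confirm the install worked is to ask xapian-config, the helper script that xapian-core installs alongside the libraries, for its version. This is a sketch that assumes the default install prefix put it somewhere on your PATH (typically /usr/local/bin); check_xapian is a hypothetical wrapper name.

```shell
# Hypothetical wrapper: report the installed Xapian version, if any.
check_xapian() {
    if command -v xapian-config >/dev/null 2>&1; then
        xapian-config --version
    else
        echo "xapian-config not found on PATH"
    fi
}

check_xapian
```

If it reports that xapian-config was not found, check the install prefix printed by the configure step before moving on to Omega.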

Now that we have Xapian installed, we'll have to install the Omega utilities. To do this, download Omega from the same place you found the Xapian files, then extract, configure, make and install it the same way you did for the libraries. The following commands should work:

cd ~
tar xzf omega-0.9.4.tar.gz
cd omega-0.9.4
./configure
make
su
make install
exit

If you encounter errors during the configure or make steps for either of these packages, please check the README and INSTALL files in each directory for possible additional instructions. If that doesn't help, search the mailing list archive, and post a message to the mailing list if you are still having problems.

If you've gotten this far then we're almost home. The next step is to copy the omega program into your cgi-bin directory. If you don't know where it is, you'll need to look at the Apache (or httpd) configuration files. Here's the section of my Apache config file that tells me where to look:

ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/

So I know to put cgi binaries in the /usr/lib/cgi-bin directory. The next few lines demonstrate copying the omega binary.

cd ~
su
cd omega-0.9.4
cp omega /usr/lib/cgi-bin/omega.cgi
cp omega.conf /usr/lib/cgi-bin/
chmod 755 /usr/lib/cgi-bin/omega.cgi
exit

Some HTTP servers require CGI binaries to have an extension of .cgi, so we use that extension to be sure it works. Note that we've also copied the omega.conf file to the same directory. This is the easiest way to get things to work.
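
If you are unsure where your own server's cgi-bin directory is, a recursive search of the configuration directory for ScriptAlias lines can help. This is a minimal sketch: script_alias_dirs is a hypothetical helper, and /etc/apache2 is only an assumed location for the configuration files (it varies between systems).

```shell
# Hypothetical helper: print the target directory of every ScriptAlias
# directive found under the given configuration directory.
script_alias_dirs() {
    grep -rhi '^[[:space:]]*ScriptAlias[[:space:]]' "$1" 2>/dev/null |
        awk '{ gsub(/"/, "", $3); print $3 }'
}

# Example call (the path is an assumption; adjust it for your server):
#   script_alias_dirs /etc/apache2
```

Commented-out directives are skipped because the pattern only allows whitespace before the word ScriptAlias.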

The next step is to download the sample data and install it on your system. The file is less than 7 MB, so hopefully you've got enough space for it and the download time won't be too bad. Point your browser to http://fayettedigital.com/book/book.0.1.tar.gz and download the file to somewhere convenient. Change to your document root and extract the files. On my system, I used the following commands:

su
cd /var/www
tar xzf ~jim/book.0.1.tar.gz
chmod -R u=rwX,go=rX book
exit

You may also extract the files somewhere else and copy them to your document root. There is nothing magic about the “book” directory.

First, let's examine the /usr/lib/cgi-bin/omega.conf file we just copied. Here is the file as it is released (at least for this version):

database_dir /var/lib/omega/data
template_dir /var/lib/omega/templates
log_dir /var/log/omega
cdb_dir /var/lib/omega/cdb

You may leave the values as they are or you can change them. In either case you'll have to create the missing directories, e.g.

su
mkdir -p /var/lib/omega/data
mkdir /var/lib/omega/templates
mkdir /var/lib/omega/cdb
mkdir /var/log/omega
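
Instead of typing each mkdir by hand, the directory list can be read straight from omega.conf. The sketch below assumes the file keeps the simple "name value" layout shown above; conf_dirs is a hypothetical helper, not an Omega tool.

```shell
# Hypothetical helper: print the value of every *_dir setting in an
# omega.conf-style file, skipping comments and other lines.
conf_dirs() {
    awk '$1 ~ /^[a-z_]+_dir$/ { print $2 }' "$1"
}

# Example use, run as root (the path matches where omega.conf was copied):
#   conf_dirs /usr/lib/cgi-bin/omega.conf | while read -r dir; do
#       mkdir -p "$dir"
#   done
```

This keeps the directories in step with the file if you later change any of the values.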

And copy the templates to the new directory.

cd ~jim/omega-0.9.4
cp templates/* /var/lib/omega/templates

Be sure the templates are readable by others. Now we are ready to index the data we just stored in the $DocumentRoot/book directory.

omindex is the utility that we will use to index the documents. It knows how to parse HTML documents, so we don't have to do anything special. If you wish, you may change the ownership of the /var/lib/omega/data directory to a non-root user and do the indexing as that user, but make sure all the database files are readable by others (chmod 644 /var/lib/omega/data/default/*).
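
Once the index exists, a quick way to spot database files the web server may be unable to read is to list those without the read bit for others. This is a sketch only; unreadable_by_others is a hypothetical helper, and it checks regular files rather than the directories above them.

```shell
# Hypothetical helper: list regular files under a directory that lack the
# read permission bit for "others".
unreadable_by_others() {
    find "$1" -type f ! -perm -o=r -print
}

# Example call once the database exists (prints nothing if all is well):
#   unreadable_by_others /var/lib/omega/data/default
```

Any file it lists can be fixed with the chmod command mentioned above.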

The command I used to index the data, and its output, are as follows:

/usr/local/bin/omindex --db /var/lib/omega/data/default --url /book /var/www/book
[Entering directory /]
Indexing "/ci_01.htm" as text/html ... added.
Indexing "/ci_02.htm" as text/html ... added.
...
Indexing "/Introduction.htm" as text/html ... added.
Indexing "/Jpg4.htm" as text/html ... added.
Indexing "/pato.htm" as text/html ... added.

Let's look at the omindex command. The --db parameter tells it to create a database with the name “default”. That's the name that omega uses as its default. That can be changed, but for this demonstration let's keep it simple. The --url parameter identifies the URL path that will follow the host name. Since we put the documents in /var/www/book we need to specify /book. If we were adding files that were in the document root, we'd use --url /. The last parameter, /var/www/book, tells omindex to look for the documents at that location on disk. Omindex does not crawl the web; it only looks at files on disk.
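
To make the relationship between the URL parameter and the directory argument concrete, the sketch below mimics the mapping described above in plain shell. doc_url is a hypothetical illustration, not omindex's own code: it strips the indexed directory from a file's path and prefixes the URL value.

```shell
# Hypothetical illustration of how a file on disk maps to the URL path
# recorded for it: doc_url <url-prefix> <indexed-directory> <file>
doc_url() {
    base_url=${1%/}   # drop a trailing slash so "/" becomes empty
    root=${2%/}
    file=$3
    printf '%s%s\n' "$base_url" "${file#"$root"}"
}

doc_url /book /var/www/book /var/www/book/ci_01.htm   # prints /book/ci_01.htm
doc_url / /var/www /var/www/ci_01.htm                 # prints /ci_01.htm
```

The second call shows the document-root case, where the URL prefix is just a slash.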


Now test your installation by pointing your browser at http://localhost/cgi-bin/omega.cgi
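
If you would rather check from a terminal first, the sketch below asks the server for the HTTP status code only. It assumes curl is installed; http_status and explain_status are hypothetical helpers, and the hints are general suggestions rather than anything specific to Omega.

```shell
# Hypothetical helper: print only the HTTP status code for a URL
# (curl reports 000 when no connection could be made).
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Hypothetical helper: turn a status code into a hint about what to check.
explain_status() {
    case "$1" in
        200) echo "OK: omega answered the request" ;;
        403) echo "Forbidden: check permissions on the CGI binary" ;;
        404) echo "Not found: check the ScriptAlias directory and file name" ;;
        500) echo "Server error: check the web server error log and omega.conf" ;;
        *)   echo "Unexpected status $1" ;;
    esac
}

# Example use:
#   explain_status "$(http_status http://localhost/cgi-bin/omega.cgi)"
```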