Installing Solr on Ubuntu Linux

Following are instructions for installing the Solr search server on Ubuntu Linux. Setting up Solr involves several manual steps, and most of the other documents I came across on the internet are inadequate in one way or another, so I enlisted the help of colleagues and documented the steps start to finish here.

I found Solr not to my liking: I encountered significant scaling issues when indexing beyond 4-5 million small documents, so I've abandoned this application in favor of more standard, robust solutions with a far larger community (e.g. MySQL) and more ubiquitous technology with a long evolutionary history (RDBMS) behind it. The problem of indexing XML documents is best solved by avoidance. Digitally born data should exist in normalized and relational states from the get-go.

These instructions have been tested with Hardy Heron 8.04, and will likely work with other recent versions of Ubuntu and Debian-based distros with little or no modification.

Before You Start
Solr can be set up several ways -- these instructions lead up to a Solr environment deployed in Tomcat, with separate development and production areas. Once you've done this a couple of times (or carefully read this document a few times), you could set up three environments, just one, or whatever layout suits your needs. Be aware that the configuration files below contain hardcoded paths, so a different layout means editing those paths to match.

(1) Download and install the latest JDK from Sun.

You'll want to get the latest JDK from Sun (http://java.sun.com/javase/downloads/index.jsp) and install it first. At the time these instructions were written, I had installed Sun's jdk1.6.0_10. I'm unsure whether it's required, but I also made sure that "ant" was installed on my Ubuntu box (for ant, I simply used Synaptic, Ubuntu's handy package installer).

I downloaded the Sun JDK to my user home directory and chmod +x'd the .bin executable. I sudo'd to root and executed the file. It made me scroll through the license agreement and then decompressed itself. I then mv'd the resulting directory to /opt/jdk1.6.0_10.

Java needs at least two environment settings in order to be useful. You'll eventually need to set up CLASSPATH as well, but that's not essential for the instructions in this document. I made the following .bashrc additions to both my ordinary user account (/home/{username}/.bashrc), as well as for the root account (/root/.bashrc). Go into each .bashrc file and add the following (which may be slightly different if you chose a different location or have a different version of the JDK):

export PATH=/opt/jdk1.6.0_10/bin:$PATH
export JAVA_HOME=/opt/jdk1.6.0_10

Whenever you make changes to .bashrc you should issue a "source ~/.bashrc" to instruct the shell to re-read the file (otherwise you'd have to log out, and then log back in). You should now be able to type "which java" and see something like this: /opt/jdk1.6.0_10/bin/java, depending on the version you downloaded.
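
If "which java" comes back empty or points somewhere unexpected, a quick check along these lines can narrow down which of the two settings is off. This is only a sketch: check_java_env is a name I made up for illustration, and it assumes a Bourne-style shell such as bash.

```shell
# Hypothetical helper (not part of the JDK): report whether JAVA_HOME
# is set, contains a java executable, and has its bin directory on PATH.
check_java_env() {
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "no java executable under $JAVA_HOME/bin" >&2
        return 1
    fi
    # Wrap PATH in colons so the match works for first, middle and last entries.
    case ":$PATH:" in
        *":$JAVA_HOME/bin:"*)
            echo "ok: $JAVA_HOME/bin is on PATH"
            ;;
        *)
            echo "$JAVA_HOME/bin is not on PATH" >&2
            return 1
            ;;
    esac
}
```

After sourcing .bashrc, running "check_java_env" with no arguments should print the "ok" line if both exports took effect.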

(2) Download and install the latest Tomcat.

Rather than lean on the Tomcat 5.5 version which was part of the Ubuntu repositories at the time of this writing, I downloaded the latest Tomcat: http://tomcat.apache.org. I brought it down to my user directory, decompressing it via gunzip and "tar xvf". It creates a Tomcat directory, populated with everything it needs.

As you use Tomcat over the lifespan of your project/development you may want a more succinct name than something like "apache-tomcat-6.0.16" so I decided to rename (mv) this directory to simply "tomcat6". The instructions which follow in this document will use that abbreviated "tomcat6" convention.
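
The decompress-and-rename steps can be sketched as a small shell function. The names here are my own for illustration: unpack_tomcat is hypothetical, and the archive file name in the usage comment is an assumption based on the version mentioned above, so substitute whatever you actually downloaded.

```shell
# Hypothetical helper: unpack a gzipped tarball in the current directory
# and rename the directory it creates to something shorter.
unpack_tomcat() {
    archive="$1"       # e.g. apache-tomcat-6.0.16.tar.gz (assumed file name)
    unpacked_dir="$2"  # e.g. apache-tomcat-6.0.16
    short_name="$3"    # e.g. tomcat6
    # "tar xzf" does the gunzip and the "tar xf" in one step.
    tar xzf "$archive" && mv "$unpacked_dir" "$short_name"
}

# Usage, from the directory holding the download:
#   unpack_tomcat apache-tomcat-6.0.16.tar.gz apache-tomcat-6.0.16 tomcat6
```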

I then did this:

sudo su
mv tomcat6 /usr/local/

You can move it somewhere else -- I picked this location because a colleague who led me through most of these steps put it in that location on his box and I decided to remain consistent with his setup. Maybe you want it in /usr/share/ or somewhere else. Before going further, you should test Tomcat. At this stage, I'm still sudo'd as root.

cd /usr/local/tomcat6/bin
./startup.sh

You should see a message like this:
Using CATALINA_BASE:   /usr/local/tomcat6
Using CATALINA_HOME:   /usr/local/tomcat6
Using CATALINA_TMPDIR: /usr/local/tomcat6/temp
Using JRE_HOME:       /opt/jdk1.6.0_10
(Note that JRE_HOME is the location of the Sun JDK installed in an earlier step. You really need this -- if Tomcat is pointed at a JRE that you don't want, or can't find one at all, you can't go any further.) Eventually you'll probably want to create a Tomcat-specific user, and give it appropriate/minimal rights, instead of using root.

Go to your browser and type this:

http://localhost:8080/

Go to the Tomcat servlet examples and click a couple of them, then click a couple of the JSP examples as well. They should execute without complaining. At this stage we've installed the latest JDK and the latest Tomcat, and things are talking to one another. If you're getting something wildly different, you can't go any further here. To complete this document, it should be "all systems go" at this point.

Before going further, you should shut Tomcat back down:

cd /usr/local/tomcat6/bin
./shutdown.sh

(3) Download and install Solr.

I downloaded the latest Solr here: http://www.apache.org/dyn/closer.cgi/lucene/solr/. As with Tomcat, I issued gunzip and "tar xvf" to decompress it to my home user directory. It creates a directory called "apache-solr-1.2.0".

We need to manually create some directories within /usr/local/tomcat6. This setup will yield us two Solr locations within your Tomcat instance: one for development, another for production. There are other ways to set up Solr, but if this is your first attempt you may want to follow this convention. It's unclear why /Catalina and /Catalina/localhost aren't created automatically with a Tomcat install. Probably just to keep our salaries up. The /data/solr directory, as you can see, will have an identical structure below it for dev and prod. Each of those directories additionally has corresponding /conf and /data directories below it.

Make these directories:

/usr/local/tomcat6/conf/Catalina
/usr/local/tomcat6/conf/Catalina/localhost
/usr/local/tomcat6/data
/usr/local/tomcat6/data/solr
/usr/local/tomcat6/data/solr/dev
/usr/local/tomcat6/data/solr/dev/conf
/usr/local/tomcat6/data/solr/dev/data
/usr/local/tomcat6/data/solr/prod
/usr/local/tomcat6/data/solr/prod/conf
/usr/local/tomcat6/data/solr/prod/data
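
Rather than typing ten mkdir commands, the list above can be created in one go with "mkdir -p", which also creates any missing parent directories. A minimal sketch follows; make_solr_dirs is a hypothetical helper name, and it takes the Tomcat directory as an argument so you can point it elsewhere if your layout differs.

```shell
# Hypothetical helper: create the directory layout listed above
# underneath the given Tomcat install directory.
make_solr_dirs() {
    tomcat_home="$1"
    # -p creates conf/Catalina on the way to conf/Catalina/localhost.
    mkdir -p "$tomcat_home/conf/Catalina/localhost"
    for area in dev prod; do
        mkdir -p "$tomcat_home/data/solr/$area/conf" \
                 "$tomcat_home/data/solr/$area/data"
    done
}

# Usage (as root, or as whichever user owns the Tomcat directory):
#   make_solr_dirs /usr/local/tomcat6
```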

Now we should copy the Solr "war" file into position for deployment. Go to the directory where you decompressed Solr in an earlier step, and go into the dist subdirectory. For instance: apache-solr-1.2.0/dist.

cp apache-solr-1.2.0.war /usr/local/tomcat6/data/solr

Now, in /usr/local/tomcat6/conf/Catalina/localhost we need to create and save two files which will be read the next time you start Tomcat, and (hopefully) properly deploy Solr. Use a text editor of your choice and create these two files in the /Catalina/localhost subdirectory.

cd /usr/local/tomcat6/conf/Catalina/localhost

solrdev.xml

<Context docBase="/usr/local/tomcat6/data/solr/apache-solr-1.2.0.war" debug="0" crossContext="true">
<Environment name="solr/home" type="java.lang.String" value="/usr/local/tomcat6/data/solr/dev" override="true" />
</Context>

solrprod.xml

<Context docBase="/usr/local/tomcat6/data/solr/apache-solr-1.2.0.war" debug="0" crossContext="true">
<Environment name="solr/home" type="java.lang.String" value="/usr/local/tomcat6/data/solr/prod" override="true" />
</Context>
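
Since the two files differ only in the dev/prod part of the path, they can also be generated from the shell instead of typed by hand. This is just a sketch under the same layout as above: write_solr_contexts is a made-up name, and the war file name is hardcoded to the 1.2.0 release used in this document.

```shell
# Hypothetical helper: write solrdev.xml and solrprod.xml into
# conf/Catalina/localhost under the given Tomcat directory.
# The conf/Catalina/localhost directory must already exist.
write_solr_contexts() {
    tomcat_home="$1"
    war="$tomcat_home/data/solr/apache-solr-1.2.0.war"
    for area in dev prod; do
        # Unquoted EOF so $war, $tomcat_home and $area are expanded.
        cat > "$tomcat_home/conf/Catalina/localhost/solr$area.xml" <<EOF
<Context docBase="$war" debug="0" crossContext="true">
<Environment name="solr/home" type="java.lang.String" value="$tomcat_home/data/solr/$area" override="true" />
</Context>
EOF
    done
}

# Usage:
#   write_solr_contexts /usr/local/tomcat6
```

The file names matter: Tomcat uses the name of each context file as the URL path, which is why these end up at /solrdev and /solrprod.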

There are some sample configuration files which come with the Solr distribution you downloaded. Let's copy those into their proper position. Go to the directory where you decompressed Solr, and into the example/solr/conf subdirectory: apache-solr-1.2.0/example/solr/conf. You should see something like this:
admin-extra.html  schema.xml    solrconfig.xml  synonyms.txt
protwords.txt     scripts.conf  stopwords.txt   xslt
Copy everything here to your development solr configuration directory:

cp -R * /usr/local/tomcat6/data/solr/dev/conf

Do the same for your production location also:

cp -R * /usr/local/tomcat6/data/solr/prod/conf
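
The two copies above can be folded into a loop that doesn't depend on which directory you happen to be in. A sketch, with copy_solr_conf as a hypothetical helper name; it takes the sample conf directory and the Tomcat directory as arguments.

```shell
# Hypothetical helper: copy the sample Solr configuration into both
# the dev and prod conf directories. The target directories must exist.
copy_solr_conf() {
    src="$1"          # e.g. apache-solr-1.2.0/example/solr/conf
    tomcat_home="$2"  # e.g. /usr/local/tomcat6
    for area in dev prod; do
        # "src/." copies the directory's contents (including the xslt
        # subdirectory) rather than the directory itself.
        cp -R "$src"/. "$tomcat_home/data/solr/$area/conf/"
    done
}

# Usage, from the directory where Solr was decompressed:
#   copy_solr_conf apache-solr-1.2.0/example/solr/conf /usr/local/tomcat6
```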

Time to test. Everything should now be in place. Sacrifice a chicken and restart Tomcat:

cd /usr/local/tomcat6/bin
./startup.sh

Go to your browser and type this:

http://localhost:8080/solrprod

and also:

http://localhost:8080/solrdev

At this point you should see a "Welcome to Solr!" message with a "Solr Admin" link. If you can click the link and see an example search interface you've probably successfully installed Solr.
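
If you're working on a box without a browser, the same check can be roughed out from the command line. This sketch assumes curl is installed (it wasn't part of the steps above), and check_url is a name I made up. It only tells you whether the URL answers without an HTTP error, not whether the page content is right.

```shell
# Hypothetical helper: report whether a URL responds without an HTTP error.
# -f makes curl fail on HTTP errors (400 and up), -sS hides the progress
# meter but still shows error messages, --max-time caps the wait in seconds.
check_url() {
    url="$1"
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
        echo "up:   $url"
    else
        echo "down: $url"
        return 1
    fi
}

# Usage, with Tomcat running:
#   check_url http://localhost:8080/solrdev
#   check_url http://localhost:8080/solrprod
```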


The views and opinions expressed in this page are strictly those of the page author.
The contents of this page have not been reviewed or approved by the University of Minnesota.