Collex Getting Started

From ARC Wiki

Revision as of 15:41, 13 November 2009

System Requirements

Collex requires a fairly modern processor (newer than 2001 recommended) and 2 GB of RAM. A dataset comparable to NINES requires 2-3 GB of disk space, but disk usage depends largely on the number of objects indexed into your installation. Collex has been tested on Mac OS X and Solaris, but it should work on any system with the following components:

  • Ruby 1.8.7
  • Ruby on Rails 2.3.2
  • Java 1.6
  • MySQL 5.x (using the charset utf-8; NOT latin1)
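
A quick way to see which of these components are already installed is to ask each tool for its version. The following is a minimal sketch, assuming the tools are on your PATH under their usual names (ruby, rails, java, mysql). The report_version helper is hypothetical and not part of Collex.

```shell
# Print the first line of a tool's version output, or note that it is missing.
# report_version is a hypothetical helper, not part of Collex.
report_version() {
  tool=$1
  shift
  if command -v "$tool" >/dev/null 2>&1; then
    # Some tools (java) print the version on stderr, so merge both streams.
    printf '%s: %s\n' "$tool" "$("$tool" "$@" 2>&1 | head -n 1)"
  else
    printf '%s: not found\n' "$tool"
  fi
}

report_version ruby --version    # Collex expects 1.8.7
report_version rails --version   # Collex expects 2.3.2
report_version java -version     # Collex expects 1.6
report_version mysql --version   # Collex expects 5.x
```

Compare the reported versions by eye against the list above; the sketch does not enforce them.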

Getting the Source Code

The source code is housed in a Subversion repository with the UVa Library. The Collex subdirectory is arranged in the following manner:

- collex
 - trunk
  + solr_1.4
  + web
  + rdf-indexer
 - tags
  + X.Y.ZZZ
 + branches

The "trunk" directory holds the latest version of the software. It is the bleeding edge: fairly stable, and appropriate for development use. Only the directories listed above are current; any others are probably obsolete. The "tags" directory holds releases, each named with a Collex version number in the form "X.Y.ZZZ". We recommend that you check out the highest-numbered tag instead of the trunk "web" directory.
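
One way to find the highest-numbered tag is to list the tags directory and sort the names numerically. The following is a minimal sketch, assuming tag names consist of three dot-separated numbers and that "svn ls" can reach the repository. The highest_tag helper is hypothetical, and the tag names in the demonstration are made up.

```shell
# Read tag names (one per line, as printed by "svn ls") on stdin and
# print the highest three-part version. highest_tag is a hypothetical helper.
highest_tag() {
  sed 's#/$##' |                          # drop the trailing slash svn adds to directories
    grep -E '^[0-9]+\.[0-9]+\.[0-9]+$' |  # keep only names shaped like X.Y.ZZZ
    sort -t. -k1,1n -k2,2n -k3,3n |       # compare each field as a number
    tail -n 1
}

# Offline demonstration with made-up tag names; prints 1.10.0
printf '1.2.3/\n1.10.0/\n1.9.5/\n' | highest_tag
```

Against the real repository this would be run as "svn ls https://subversion.lib.virginia.edu/repos/patacriticism/collex/tags | highest_tag". Sorting field by field matters because a plain text sort would place 1.10.0 before 1.9.5.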

mkdir collex
cd collex
svn co https://subversion.lib.virginia.edu/repos/patacriticism/collex/trunk/solr_1.4 solr_1.4
svn co https://subversion.lib.virginia.edu/repos/patacriticism/collex/trunk/rdf-indexer rdf-indexer
svn co https://subversion.lib.virginia.edu/repos/patacriticism/collex/tags/X.Y.ZZZ web

This will create a "collex" directory with all the needed sources. You will also want to check out some RDF so that you have data to index. The following commands, run from inside the collex directory, check out the RDF for the Rossetti Archive:

mkdir rdf
cd rdf
svn co https://subversion.lib.virginia.edu/repos/patacriticism/nines/rdf/rossetti rossetti

You should have the following folder structure on your computer:

collex
  rdf
    rossetti
      *.rdf
  solr_1.4
    solr, etc...
  rdf-indexer
    *.java, etc.
  web
    app
    etc...
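
The layout can be checked with a short script before moving on. This is a minimal sketch, assuming the directory names shown above; check_layout is a hypothetical helper, not something shipped with Collex.

```shell
# Report which of the expected Collex directories are missing under a root.
# check_layout is a hypothetical helper; the directory names come from the
# folder structure shown above.
check_layout() {
  root=${1:-collex}
  missing=0
  for dir in rdf/rossetti solr_1.4 rdf-indexer web/app; do
    if [ ! -d "$root/$dir" ]; then
      echo "missing: $root/$dir"
      missing=1
    fi
  done
  return "$missing"
}

# Run from the directory that contains "collex".
if check_layout collex; then
  echo "layout looks complete"
fi
```

The function returns a non-zero status when anything is missing, so it can also guard later setup steps.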

Setup

Ruby

Collex has been tested with Ruby 1.8.7, which can be obtained from ruby-lang.org. If you are installing on Solaris, we highly recommend that you use the binary distribution provided by Blastwave.

Gems

Gems are packaged libraries and add-ons for Ruby, and RubyGems is the tool that installs and manages them. Download and install RubyGems from rubygems.org, or from Blastwave if you are using Solaris.

Rails

To install Rails 2.3.2, type the following:

gem install rails -v 2.3.2

Run "rails -v" afterwards to make sure that Rails 2.3.2 is the version installed.

Required Gems

The following gems must be installed for Collex to work:

  • image_science (1.2.1)
  • json_pure (1.1.9)
  • mysql (2.8.1)
  • rails (2.3.2)
  • rake (0.8.7)
  • solr-ruby (0.0.8)

On a computer where you plan to run the indexer, the following are also required:

  • Linguistics (1.0.5)
  • marc (0.3.0)

Install each gem with the "gem install" command; for example, "gem install mysql".
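
To get the exact versions listed above, each install can be pinned with the -v option of "gem install". The following is a minimal sketch that only prints the commands, so nothing is installed until you choose to run them. The gem_install_commands helper is hypothetical, and the list covers only the six gems required on every machine.

```shell
# Print one "gem install" command per required gem, pinned to the version
# listed above. Nothing is installed until the output is piped to sh.
# gem_install_commands is a hypothetical helper.
gem_install_commands() {
  while read -r name version; do
    echo "gem install $name -v $version"
  done <<'EOF'
image_science 1.2.1
json_pure 1.1.9
mysql 2.8.1
rails 2.3.2
rake 0.8.7
solr-ruby 0.0.8
EOF
}

gem_install_commands
```

After reviewing the output, "gem_install_commands | sh" would run the installs. Depending on how Ruby was installed, the commands may need to be run with administrator rights.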

Suggested Gems

You will also need an application server gem to run Collex. Either of the following can be used:

  • mongrel (1.1.5)
  • passenger (2.2.5)

MySQL

Be sure that MySQL is set up to use the utf-8 character encoding.

Consult the MySQL Documentation for information on creating a username (with a password) to access the database. After doing this, edit the collex/web/config/database.yml file so that the username and password fields match up with the database. This permits the Rails application to connect to the database.
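
As an illustration of the two steps above, the sketch below prints the SQL for a utf-8 database with its own user, and a matching entry for database.yml. The database name, user name, and password are hypothetical placeholders, not values Collex prescribes; check the database.yml shipped in collex/web/config for the names your checkout expects.

```shell
# Hypothetical names for illustration only; substitute your own.
DB_NAME=collex_development
DB_USER=collex
DB_PASS=change-me

# Print SQL that creates a utf-8 database and a user allowed to use it.
print_setup_sql() {
  cat <<EOF
CREATE DATABASE $DB_NAME CHARACTER SET utf8 COLLATE utf8_general_ci;
CREATE USER '$DB_USER'@'localhost' IDENTIFIED BY '$DB_PASS';
GRANT ALL PRIVILEGES ON $DB_NAME.* TO '$DB_USER'@'localhost';
EOF
}

# Print a matching "development" entry for collex/web/config/database.yml.
print_database_yml() {
  cat <<EOF
development:
  adapter: mysql
  encoding: utf8
  database: $DB_NAME
  username: $DB_USER
  password: $DB_PASS
  host: localhost
EOF
}

print_setup_sql
print_database_yml
```

The SQL could be applied with "print_setup_sql | mysql -u root -p". Setting the character set when the database is created, and "encoding: utf8" in database.yml, keeps the server and the Rails connection on the same encoding.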

More Information

See the file web/doc/README_FOR_APP for more information on deploying.

Running unit and functional tests

Collex has a number of unit and functional tests. These are run by simply running Rake (in the web directory):

rake

We strongly encourage that all new code come with unit and functional tests, and no new code should break existing tests.

Running the server

Collex needs two server processes to be running: Solr and Rails.

Starting Solr

You will first need to run the "ant" command, which compiles all the sources and packages a web application for the embedded Jetty server.

cd collex/solr_1.4
ant
cd dist/solr
java -jar start.jar

The Solr process stays in the foreground and occupies that shell.
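
Solr takes a moment to start, so scripts that depend on it may want to wait until it answers. The following is a minimal sketch of a generic retry loop. The wait_until helper is hypothetical and not part of Collex or Solr, and the commented example assumes curl is installed.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# wait_until is a hypothetical helper, not part of Collex or Solr.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  while [ "$attempts" -gt 0 ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "$delay"
  done
  return 1
}

# Example: poll the Solr admin page for up to a minute (30 tries, 2 s apart).
# wait_until 30 2 curl -sf http://localhost:8983/solr/admin/stats.jsp
```

The probe is passed in as a command, so the same loop can wait for the Rails server by pointing curl at port 3000 instead.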

Starting Rails

Now open another shell and start the Rails server:

cd collex/web
script/server

This, too, will "occupy" the shell process.

Indexing some RDF

Assuming you downloaded the Rossetti Archive RDF as noted earlier, you can run the indexer from a third shell (you will probably want to open it in another window) as follows:

cd collex/solr_1.4
ant index -Drdf.dir=/path/to/rossetti -Dsolr.url=http://localhost:8983/solr

The indexer will run for several minutes and should conclude with the phrase "BUILD SUCCESSFUL". To confirm that all went well, pull up the administrative interface (http://localhost:8983/solr/admin/stats.jsp), where you should see 22130 documents loaded.
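
The document count can also be checked from the command line. This is a minimal sketch, assuming that Solr's standard /select handler is enabled in this configuration (an assumption; only the admin page is mentioned above) and that curl is installed. The num_found helper is hypothetical; it pulls the numFound attribute out of Solr's XML response.

```shell
# Extract the numFound attribute from a Solr XML response read on stdin.
# num_found is a hypothetical helper, not part of Collex or Solr.
num_found() {
  sed -n 's/.*numFound="\([0-9][0-9]*\)".*/\1/p' | head -n 1
}

# Offline demonstration with a sample response line; prints 22130
printf '<result name="response" numFound="22130" start="0"/>\n' | num_found
```

Against a running server this would be "curl -s 'http://localhost:8983/solr/select?q=*:*&rows=0' | num_found". Asking for zero rows keeps the response small while still reporting the total number of indexed documents.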

Using Collex

Now you can pull up your local Collex installation and see the Rossetti Archive content:

http://localhost:3000/collex

Have fun!