Difference between revisions of "Collex Getting Started"

 
(12 intermediate revisions by 2 users not shown)
== System Requirements ==
 
Collex requires a fairly modern processor (newer than 2001 recommended) and 2 GB of RAM. A dataset comparable to NINES requires 2-3 GB of disk space, but disk usage depends largely on the number of objects indexed into your installation. Collex has been tested on Mac OS X and Solaris, but it should work on any system with the following components:
 
 
* Ruby 1.8.7
* Ruby on Rails 2.3.2
* Java 1.6
* MySQL 5.x (using the charset utf-8; NOT latin1)
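
As a quick check before installing, a small shell sketch like the following can report which of the listed tools are on the PATH. The function name check_tools is hypothetical, and the command names passed to it are assumed to be the usual ones for these components. It only tests that each command exists; it does not verify the version numbers listed above.

```shell
#!/bin/sh
# Hypothetical helper: report whether each named command is on the PATH.
# It does not check versions; compare those by hand against the list above.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed command names for Ruby, Rails, Java and the MySQL client.
check_tools ruby rails java mysql || echo "Install the missing components before continuing."
```

The function returns a non-zero status when anything is missing, so it can also be used as a guard at the top of a longer setup script.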
 
 
 
== Getting the Source Code ==
The source code is housed in a Subversion repository with the UVa Library. The Collex subdirectory is arranged in the following manner:
 
 
<pre>
- collex
  - trunk
    + solr_1.4
    + web
    + rdf-indexer
  - tags
    + X.Y.ZZZ
  + branches
</pre>
 
 
 
The "trunk" directory holds the latest version of the software. It is the bleeding edge, but it is fairly stable and appropriate for development use. Only the directories listed above are current; any others are probably obsolete. The "tags" directory contains tags of the form "9.9.9", each naming a Collex version number. We recommend checking out the highest-numbered tag rather than the "web" directory under trunk.
 
  
<pre>
mkdir collex
cd collex
svn co https://subversion.lib.virginia.edu/repos/patacriticism/collex/trunk/solr_1.4 solr_1.4
svn co https://subversion.lib.virginia.edu/repos/patacriticism/collex/trunk/rdf-indexer rdf-indexer
svn co https://subversion.lib.virginia.edu/repos/patacriticism/collex/tags/X.Y.ZZZ web
</pre>
 
  
This will create a "collex" directory with all the needed sources. Additionally you will want to check out some RDF in order to index some data.  The following will check out the RDF for the Rossetti Archive:
  
<pre>
mkdir rdf
cd rdf
svn co https://subversion.lib.virginia.edu/repos/patacriticism/nines/rdf/rossetti rossetti
</pre>
 
  
You should have the following folder structure on your computer:
  
<pre>
collex
  rdf
    rossetti
      *.rdf
  solr_1.4
    solr, etc...
  rdf-indexer
    *.java, etc.
  web
    app
    etc...
</pre>
 
  
== Setup ==
=== Ant ===
Download and install [http://ant.apache.org Ant] from the Apache site. You typically set the $ANT_HOME environment variable to the installation location. To compile the sources, you will also need to download [http://www.junit.org JUnit] and install its JAR file into $ANT_HOME/lib; otherwise the compilation (which "ant index" runs automatically) will not work.
 
  
=== Ruby ===
Collex has been tested with Ruby 1.8.4, 1.8.5 and 1.8.6. Ruby can be obtained from [http://www.ruby-lang.org ruby-lang.org]. If you are installing on Solaris, we highly recommend that you use the binary distribution provided by [http://www.blastwave.org Blast Wave].
  
 
=== Gems ===
Ruby Gems are library packages that install add-ons for Ruby and make them available for use. Download and install RubyGems from [http://www.rubygems.org/ rubygems.org], or from [http://www.blastwave.org Blast Wave] if you are using Solaris.
=== Rails ===
 
To install Rails, type the following:

<pre>
gem install rails
</pre>

Make sure that you have Rails 1.2.6 installed.
 
 
 
=== Required Gems ===
 
The following gems must be installed for Collex to work:

* mysql
* rake

You would install them by invoking the command "gem install mysql", for example.
 
 
 
=== Suggested Gems ===
 
We recommend installing the mongrel gem, which provides a much better HTTP server for Rails. Once installed, it is used automatically by the script/server command described below. To install it, simply run "gem install mongrel".
 
 
 
=== MySQL ===
 
You should create three databases in MySQL:

# nines_development
# nines_production
# nines_test
 
 
 
Consult the [http://dev.mysql.com/doc/index.html MySQL Documentation] for information on creating a username (with a password) to access the database.  After doing this, edit the collex/web/config/database.yml file so that the username and password fields match up with the database.  This permits the Rails application to connect to the database.
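
The three databases could be created with statements like the ones this sketch prints. The function name collex_db_sql is hypothetical. The utf8 character set follows the system requirements at the top of this page, and piping the output to the mysql client as root is an assumption about your setup.

```shell
#!/bin/sh
# Hypothetical helper: print one CREATE DATABASE statement per environment,
# using the utf8 character set called for in the system requirements.
collex_db_sql() {
    for env in development production test; do
        echo "CREATE DATABASE nines_${env} CHARACTER SET utf8;"
    done
}

# Print the statements for review.
collex_db_sql

# To apply them, pipe the output to the mysql client (assumes a root account):
#   collex_db_sql | mysql -u root -p
```

Printing the SQL first, rather than running it directly, makes it easy to check the database names before anything is created.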
 
 
 
Finally, you should run the rake migration to set up all the database tables:

<pre>
cd collex/web
RAILS_ENV=development rake db:migrate
</pre>
 
 
 
== Running unit and functional tests ==
 
Collex has a number of unit and functional tests. Run them by simply running Rake in the web directory:

<pre>
rake
</pre>

We strongly encourage that all new code come with unit and functional tests, and no code should break existing tests.
 
 
 
== Running the server ==
 
Collex needs two server processes to work: Solr and Rails.
 
 
 
=== Starting Solr ===
 
You will first need to run the "ant" command, which compiles all the sources and packages a web application for the embedded Jetty server.
 
 
 
<pre>
cd collex/solr
ant
cd dist/solr
java -jar start.jar
</pre>
 
 
 
This leaves the shell occupied by the Solr process.
 
 
 
=== Starting Rails ===
 
Now open another shell and start the Rails server:

<pre>
cd collex/web
script/server
</pre>

This, too, will occupy the shell.
 
 
 
== Indexing some RDF ==
 
Assuming you downloaded the Rossetti Archive RDF as noted earlier, you can run the indexer from a third shell (which you will probably want to open in another window) as follows:
 
 
 
<pre>
cd collex/solr
ant index -Drdf.dir=/path/to/rossetti -Dsolr.url=http://localhost:8983/solr
</pre>
 
 
 
It will run for several minutes and should conclude with the phrase "BUILD SUCCESSFUL". To confirm that all went well, open the administrative interface ([http://localhost:8983/solr/admin/stats.jsp http://localhost:8983/solr/admin/stats.jsp]), where you should see 22130 documents loaded.
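
The document count can also be checked from the command line. The sketch below is an assumption-laden illustration: the function name num_found is hypothetical, the sample response is hand-written rather than captured from a real server, and the commented curl call assumes that the standard Solr select handler is enabled at the URL used above.

```shell
#!/bin/sh
# Hypothetical helper: read a Solr XML response on stdin and print the
# value of the first numFound attribute it contains.
num_found() {
    sed -n 's/.*numFound="\([0-9][0-9]*\)".*/\1/p' | head -n 1
}

# Illustrative response fragment (hand-written, not from a real server).
sample='<response><result name="response" numFound="22130" start="0"/></response>'
echo "$sample" | num_found

# Against a running server (assumes the default select handler is enabled):
#   curl -s 'http://localhost:8983/solr/select?q=*:*&rows=0' | num_found
```

Requesting zero rows keeps the response small, since only the total count is of interest here.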
 
 
 
== Using Collex ==
 
Now you can pull up your local Collex installation and see the Rossetti Archive content:
 
 
 
[http://localhost:3000/collex http://localhost:3000/collex]
 
  
Have fun!

* [https://github.com/collex/collex/blob/master/Gemfile Collex Gemfile]
* [https://github.com/collex/arc_inbox/blob/master/Gemfile Arc-Inbox Gemfile]
* [https://github.com/collex/catalog/blob/master/Gemfile Catalog Gemfile]
* [https://github.com/collex/typewright/blob/master/Gemfile Typewright Gemfile]

Latest revision as of 22:43, 1 May 2013

== Getting the Source Code ==

The source code is housed on a public [https://github.com/collex GitHub page]. The Collex repositories are arranged in the following manner:

[https://github.com/collex/catalog Catalog]
A web service that exposes all of the documents stored in the solr index.
[https://github.com/collex/collex Collex]
The public-facing web app for Collex.
[https://github.com/collex/arc_inbox Arc-Inbox]
For accepting contributions to the arc catalog.
[https://github.com/collex/typewright Typewright]
A web service that keeps info on the typewright documents.
[https://github.com/collex/solr Solr]
An instance of solr configured to work with the Collex Catalog.
[https://github.com/collex/rdf-indexer RDF-indexer]
A Java app that indexes the RDF files into the solr index. It also handles text and does some testing of the RDF objects and the solr index.
[https://github.com/collex/collex_wordpress_theme Collex_wordpress_theme]
A child of the Hybrid theme for gluing the WordPress portions to the Rails portions of Collex.

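=== Gems ===

Please see the following links for information about required gems for the Collex software.

* [https://github.com/collex/collex/blob/master/Gemfile Collex Gemfile]
* [https://github.com/collex/arc_inbox/blob/master/Gemfile Arc-Inbox Gemfile]
* [https://github.com/collex/catalog/blob/master/Gemfile Catalog Gemfile]
* [https://github.com/collex/typewright/blob/master/Gemfile Typewright Gemfile]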