Download PSAT

PSAT source code is freely available so that researchers can set up a local installation of the tool to analyze genomes not currently supported by the PSAT web server, such as private genomes not yet released to the public.

Download the source package here:
psat-2-release.tar.gz

PSAT is supported on a Linux platform. PSAT requires the following packages to run:

  1. PostgreSQL database (www.postgresql.org)
  2. Apache web server (www.apache.org)
  3. Perl
  4. PerlMagick (www.imagemagick.org)

In addition, the NCBI blastall program is required to populate the database.
The database and web servers can be run on either the same machine or on different machines.
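
A quick way to see which of these prerequisites are missing is sketched below. The command names (psql for the PostgreSQL client, httpd or apache2 for Apache) are assumptions that depend on your distribution; missing_commands is a hypothetical helper, not part of PSAT.

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Example (assumed command names; the Apache binary is usually httpd or apache2):
#   missing_commands psql perl blastall
# PerlMagick installs the Image::Magick module; this exits non-zero if it is absent:
#   perl -MImage::Magick -e 1
```

An empty result means every listed command was found.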

Installation

Installation instructions can also be found in the README in the source package.

Web Scripts
-----------
You may need root permissions to copy the web scripts to the web root directory.

e.g.
cd NWRCE-psat-2-release
cp -a psat /var/www/html
cp -a cgi-bin/psat /var/www/cgi-bin
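
The web root and CGI directories differ between distributions (Debian-style Apache packages, for example, serve CGI scripts from /usr/lib/cgi-bin by default). The sketch below wraps the two copy commands in a hypothetical helper, install_web_scripts, so the destinations can be passed in; the defaults are the paths used in the example above.

```shell
#!/bin/sh
# install_web_scripts SRC [HTML_ROOT] [CGI_ROOT]
# Copy SRC/psat and SRC/cgi-bin/psat into the given web directories.
install_web_scripts() {
    src=$1
    html_root=${2:-/var/www/html}
    cgi_root=${3:-/var/www/cgi-bin}
    # Refuse to copy anything if the source layout is not what we expect.
    [ -d "$src/psat" ] && [ -d "$src/cgi-bin/psat" ] || {
        echo "psat or cgi-bin/psat not found under $src" >&2
        return 1
    }
    cp -a "$src/psat" "$html_root/" &&
    cp -a "$src/cgi-bin/psat" "$cgi_root/"
}

# Example:
#   install_web_scripts NWRCE-psat-2-release /var/www/html /usr/lib/cgi-bin
```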

Perl Modules
------------
You may need root permissions to copy the PSAT Perl libraries to the Perl modules directory.

1. Determine where your Perl libraries are installed
e.g.

eval "`perl -V:installvendorlib`"; echo $installvendorlib
/usr/lib/perl5/vendor_perl/5.8.5

2. Copy the PSAT modules to this directory, under an NWRCE subdirectory
mkdir /usr/lib/perl5/vendor_perl/5.8.5/NWRCE

cd perl-NWRCE-PSAT-2-release
cp -r PSAT /usr/lib/perl5/vendor_perl/5.8.5/NWRCE
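
The vendor_perl path above is specific to one Perl version. As a sketch, the two steps can be combined so that the directory is read from Perl's own Config module instead of being typed by hand. install_psat_modules is a hypothetical helper; the optional second argument overrides the destination.

```shell
#!/bin/sh
# install_psat_modules SRC [DEST]
# Copy SRC/PSAT into DEST/NWRCE, where DEST defaults to Perl's vendor library directory.
install_psat_modules() {
    src=$1
    dest=$2
    if [ -z "$dest" ]; then
        # Ask Perl where vendor modules are installed.
        dest=$(perl -MConfig -e 'print $Config{installvendorlib}')
    fi
    [ -n "$dest" ] || {
        echo "could not determine the Perl library directory" >&2
        return 1
    }
    mkdir -p "$dest/NWRCE" && cp -r "$src/PSAT" "$dest/NWRCE/"
}

# Example:
#   install_psat_modules perl-NWRCE-PSAT-2-release
```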

Create the Database
-------------------

The SQL scripts that create the database, the database user, and the database tables are in the create directory. Running the createdb.sh shell script executes the psql commands that run these SQL scripts.

You may first need to modify the -U flag of each command in createdb.sh to specify a valid database user.

cd psat_database-2-release/create
sh createdb.sh
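
If several commands in createdb.sh need a different user, the -U values can be rewritten in one pass. The sketch below assumes each command passes the user as "-U name" separated by whitespace, which may not match the actual script; set_db_user is a hypothetical helper that prints the edited script rather than changing the file in place.

```shell
#!/bin/sh
# set_db_user FILE USER
# Print FILE with every "-U <name>" rewritten to "-U USER".
set_db_user() {
    sed -E "s/-U[[:space:]]+[^[:space:]]+/-U $2/g" "$1"
}

# Example: write an edited copy, review it, then run it.
#   set_db_user createdb.sh postgres > createdb.local.sh
#   sh createdb.local.sh
```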

Note:
Two errors at the start of the script are expected on the first run.
The commands that generate these errors are intended for subsequent runs, when the database is reloaded.

ERROR:  database "synteny" does not exist
ERROR:  role "synteny" does not exist
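
To separate these two expected messages from real problems, the script output can be filtered. This is a minimal sketch; unexpected_errors is a hypothetical helper, and it only recognises the two messages quoted above.

```shell
#!/bin/sh
# Read createdb.sh output on stdin and print ERROR lines other than the two expected ones.
unexpected_errors() {
    grep 'ERROR:' |
        grep -v -e 'database "synteny" does not exist' \
                -e 'role "synteny" does not exist'
    return 0   # an empty result is not a failure
}

# Example:
#   sh createdb.sh 2>&1 | unexpected_errors
```

No output means the only errors were the two expected ones.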

Populate the Database
---------------------

Note: It may be useful to populate your database with the sample data first, so that any problems can be resolved before you load your own genomes.
Once the sample database is populated and tested, you can recreate a clean version of the database and provide the
input files for your selected genomes.

1. The blastall program must be installed

2. Ensure that the .ptt and .faa files and the protein BLAST databases for the genomes you want to add to PSAT are available.

    * .ptt and .faa files
	- By default, the scripts look for these files in the populate/genomes directory
	- Specify a different directory with the -p option when running the psat_*.pl scripts
	- We obtained the sample files from the NCBI FTP site
    * BLAST database files
	- By default, the scripts look for the protein BLAST database files in the populate/blast directory
	- Specify a different directory with the -b option when running psat_blastp.pl
	- We generated the sample BLAST databases using formatdb
    * A small set of genome and BLAST database files from Francisella and Pseudomonas is provided as a test example. You may delete these files or add files for your own selected genomes.
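
For your own genomes, the protein BLAST databases can be built from the .faa files with formatdb. The sketch below only prints the commands so they can be reviewed first. It assumes one database per .faa file, named after the accession and written directly into the BLAST directory; compare this with the sample files in populate/blast, since the layout PSAT expects is not stated here. formatdb_commands is a hypothetical helper.

```shell
#!/bin/sh
# formatdb_commands GENOMES_DIR BLAST_DIR
# Print one formatdb command per .faa file found under GENOMES_DIR.
formatdb_commands() {
    find "$1" -name '*.faa' | sort | while read -r faa; do
        name=$(basename "$faa" .faa)
        # -i input FASTA, -p T protein database, -n base name of the output files
        printf 'formatdb -i %s -p T -n %s/%s\n' "$faa" "$2" "$name"
    done
}

# Example: review the commands, then run them.
#   formatdb_commands populate/genomes populate/blast
#   formatdb_commands populate/genomes populate/blast | sh
```

Paths containing spaces would need quoting before the output is piped to sh.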

3. Populate the database with an initial set of comparison and reference genomes

    * cd psat_database-2-release/populate/script
    * Edit the comparison.txt and reference.txt files to specify the set of genomes you want to include. The default specifies a set of example genomes. 

Each line of these files is a path relative to the genomes directory, consisting of the genome name and the genomic element accession, as follows:

Francisella_tularensis_holarctica/NC_007880
Francisella_tularensis_tularensis/NC_006570
Pseudomonas_syringae_tomato_DC3000/NC_004578
Pseudomonas_syringae_tomato_DC3000/NC_004632
Pseudomonas_syringae_tomato_DC3000/NC_004633

    * If necessary, edit initial_population.pl to specify the options that will be used by the psat_*.pl scripts
	  o Directory containing the .faa and .ptt files (-p flag)
	  o Directory containing the protein BLAST databases (-b flag)
	  o blastall program (-l flag)
	  o Database user (-U flag)

    * Run initial_population.pl
          o This runs the appropriate scripts for each genome specified in the comparison.txt and reference.txt files.
          o It also runs scripts that create or drop database indexes to speed up database insertion.
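
Before running initial_population.pl, the entries in comparison.txt and reference.txt can be checked against the genomes directory. The sketch below assumes each entry maps to <entry>.ptt and <entry>.faa under that directory, as in the sample data layout; check_genome_list is a hypothetical helper.

```shell
#!/bin/sh
# check_genome_list LIST_FILE GENOMES_DIR
# Print every expected .ptt/.faa file that is missing; return 1 if any are missing.
check_genome_list() {
    status=0
    # The "|| [ -n ... ]" part handles a last line without a trailing newline.
    while read -r entry || [ -n "$entry" ]; do
        [ -n "$entry" ] || continue          # skip blank lines
        for ext in ptt faa; do
            if [ ! -f "$2/$entry.$ext" ]; then
                printf 'missing: %s/%s.%s\n' "$2" "$entry" "$ext"
                status=1
            fi
        done
    done < "$1"
    return $status
}

# Example, run from the populate/script directory:
#   check_genome_list comparison.txt ../genomes && check_genome_list reference.txt ../genomes
```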

Update Genomes in Database
--------------------------
See psat_example_update.sh for an example.

Configuration File
------------------

1. Edit the conf/psat.conf file to specify PSAT database connection details

2. Create the conf directory and copy the conf file into it

mkdir -p /var/www/conf/nwrce
cp conf/psat.conf /var/www/conf/nwrce

Test
----

Test access to your PSAT installation using a standard web browser:

http://yourserverdomain/psat/index.html
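
The same check can be made from the command line, which is convenient on a server without a browser. This sketch assumes curl is installed; psat_url and http_status are hypothetical helpers, and yourserverdomain is the placeholder used above.

```shell
#!/bin/sh
# psat_url HOST: build the index page URL for a PSAT installation on HOST.
psat_url() {
    printf 'http://%s/psat/index.html\n' "$1"
}

# http_status URL: print the HTTP status code returned for URL.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Example:
#   http_status "$(psat_url yourserverdomain)"
```

A status of 200 means Apache served the page. It does not confirm that the CGI scripts can reach the database, so a browser test of an actual query is still worthwhile.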