PSAT User's Manual
1. Finding Genes
Use the user input page to specify the genome regions in which you want to analyze gene context. You can open this page at any time by selecting the Explore Homologies link on the navigation bar at the top of each page. You can click this link to open a new window in which to work through the steps of this tutorial. In this tutorial, you will investigate the genomic neighborhood surrounding the pilQ gene in F. novicida.
- Select Francisella_tularensis_novicida_U112 as the Reference Genome. You may leave the Comparison Genomes as the default of 'All Genomes'. Later we will perform a more specific synteny analysis by selecting only a subset of these genomes.
- Enter 'pil' as the Gene name to search for. The tool will find all genes with names that contain this string. You can also find genes by specifying a locus tag, gene product description, or genome location. These fields may be helpful, for example, if you are interested in investigating syntenic regions for genes with a particular function or genes that fall within a specific region of the genome. For this part of the tutorial, you can leave these other fields in this section blank.
- Leave the BLAST alignment score thresholds set to the default option (e-value < 0.1, bit score > 200, % identity > 30%). The tool looks for BLAST hits that meet all of these cutoff values when determining which results to display. Later we will look at how the results change when different thresholds are chosen.
- Leave the display option set to Display only first 25 hits. When there are a large number of hits, the tool can take a long time to query the database and display all of them. Often only about the first 25 hits will be of interest, so this default option was introduced to improve efficiency. If you would like to view more hits, you may change this setting.
- Leave the display option set to the default value to show all hits for each comparison genome. Genomes may have duplicated regions and therefore have multiple hits of interest. An option to show only top hits for each comparison genome is also available.
- Leave the display option set to the default value to show only hits with a homolog clustering score of at least 2. Since the homolog clustering score indicates the number of consecutive homologs in a region, a score of 1 essentially indicates an isolated homolog that is not likely to be part of a gene cluster. The default option therefore is to show only results with a score of at least 2. You can still view isolated homologs by specifying a value of 1 in this field, if desired, and we will use this option later in the tutorial. For a more stringent filter based on the homolog clustering score, you can also specify a larger value in this field.
- Check the two options under List hit count in gene list: Display number of BLAST hits and Display number of genomes with hits. These options are unchecked by default so that the gene list page, which is displayed once the form is submitted, loads more quickly. Listing the hit count may provide a useful summary of BLAST results, so you may choose to wait a little longer to obtain these additional details.
- Select the Submit button to submit the form. Next you will inspect the list of genes found.
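To make the combined effect of the filter settings above concrete, the following Python sketch applies the same default cutoffs to a small list of made-up hits. It is only an illustration under stated assumptions, not PSAT's actual implementation: the `Hit` record, the `filter_hits` function, the field names, and the sample values are all hypothetical, and the sketch simply keeps hits in input order because the manual does not state how the tool orders them.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one BLAST hit (not a PSAT data structure)."""
    genome: str
    e_value: float
    bit_score: float
    percent_identity: float
    cluster_score: int  # number of consecutive homologs in the region


def filter_hits(hits, max_e_value=0.1, min_bit_score=200.0,
                min_identity=30.0, min_cluster_score=2, max_hits=25):
    """Keep hits that meet ALL cutoffs, then truncate to the first max_hits."""
    kept = [
        h for h in hits
        if h.e_value < max_e_value
        and h.bit_score > min_bit_score
        and h.percent_identity > min_identity
        and h.cluster_score >= min_cluster_score
    ]
    # Assumption: "first 25 hits" is modeled as the first 25 in input order.
    return kept[:max_hits]


# Made-up example values, chosen only to exercise each cutoff.
sample_hits = [
    Hit("genome_A", 1e-50, 450.0, 62.0, 4),  # meets every cutoff
    Hit("genome_B", 1e-20, 150.0, 45.0, 3),  # bit score too low
    Hit("genome_C", 1e-40, 380.0, 55.0, 1),  # isolated homolog (score 1)
    Hit("genome_A", 1e-30, 300.0, 41.0, 2),  # second hit in the same genome
]

default_view = filter_hits(sample_hits)
with_isolated = filter_hits(sample_hits, min_cluster_score=1)
```

With the defaults, only the two genome_A hits remain, which mirrors the option to show all hits for each comparison genome. Lowering `min_cluster_score` to 1 additionally admits the isolated genome_C homolog, which corresponds to entering 1 in the homolog clustering score field.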