PSAT User's Manual


7. Reviewing homology statistics among multiple genomes

In this part of the tutorial, you will compare several genomes by the percentage of F. tularensis subsp. tularensis SchuS4 genes that have homologs in each of them.

  1. Select Francisella_tularensis_tularensis as the Reference Genome.

  2. Select a few other genomes as the Comparison Genomes. In this example, we selected the range from Escherichia_coli_W3110 to Francisella_tularensis_holarctica.

  3. Select the analysis method Determine homology statistics.

  4. Leave the BLAST alignment score thresholds set to the default option (e-value < 0.1, bit score > 200, % identity > 30%). The tool counts a BLAST hit as a homolog in this analysis only if it meets all of these cutoff values. The results can change dramatically with different threshold scores.

  5. Leave the display option set to the default value, which shows only the top hit for each comparison genome. We assume that the top BLAST hit in each genome is the most likely to be a valid ortholog of a given query gene, so by default the output includes only these most pertinent results and is more manageable to browse. If you want to see the remaining BLAST hits as well, you can select that option instead.

  6. Leave the display option set to the default value, which shows only hits with a homolog clustering score greater than 1. Since the score predicts the number of genes in a syntenic region, a score of 1 essentially indicates an isolated homolog with no synteny. The default option therefore shows only results in which at least 2 genes make up a syntenic region. If you also want isolated homologs included in the analysis, select 'Show hits with any homolog clustering score'.

  7. Select the Submit button to submit the form. The results indicate the number and percentage of Francisella tularensis SchuS4 genes with homologs (satisfying the specified constraints) in each selected comparison genome.

  8. For example, the results show that 161 (10%) of the SchuS4 genes have a homolog in Escherichia coli W3110, whereas 1394 (87%) of the SchuS4 genes have a homolog in the Francisella tularensis FSC 198 strain. As might be expected, the results obtained with the specified settings indicate that the SchuS4 genome is quite similar to the other Francisella strains but not to the Escherichia or Flavobacterium genomes.


  9. Now select the link for genes with homology for the Escherichia coli W3110 genome. This will take you to a page listing the 161 SchuS4 genes that have a homolog in this comparison genome.

    The results list also shows the BLAST hit scores and the homolog clustering score.



  10. Select any of the entries in this list of genes. The page with homolog details and genomic neighborhood visualization will be displayed for the selected gene and its listed homologs. You may further explore the homology and genomic context including synteny in this page.

  11. Go back in your browser to the statistics results page and select one of the links for genes without homology, such as the 209 genes for Francisella tularensis FSC 198. Notice that the page of results now lists the SchuS4 genes without a homolog in the comparison genome. The results do not show any BLAST hit scores because these genes did not have a BLAST hit in the comparison genome that satisfied the specified constraints.

  12. Go back to the input page and edit settings such as the BLAST score thresholds and display options to see how varying the constraints affects which genes are considered homologs. Modifying these settings can greatly change the number and percentage of genes with homologs in each of the comparison genomes.
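
To make the logic of the steps above concrete, here is a minimal Python sketch of how such homology statistics could be computed from a table of BLAST hits. It is not PSAT's actual implementation. The Hit record, the function names, the toy data, and the choice of bit score to rank the top hit are all assumptions made for illustration; only the default cutoffs (e-value < 0.1, bit score > 200, % identity > 30%, homolog clustering score > 1) come from the tutorial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One BLAST hit (hypothetical record layout, not PSAT's schema)."""
    query_gene: str      # gene in the reference genome
    genome: str          # comparison genome containing the hit
    evalue: float
    bit_score: float
    pct_identity: float
    cluster_score: int   # homolog clustering score


def passes_thresholds(hit, max_evalue=0.1, min_bit_score=200.0,
                      min_pct_identity=30.0, cluster_score_above=1):
    """A hit counts only if it meets ALL cutoffs (tutorial defaults).

    Pass cluster_score_above=0 to mimic 'any homolog clustering score'.
    """
    return (hit.evalue < max_evalue
            and hit.bit_score > min_bit_score
            and hit.pct_identity > min_pct_identity
            and hit.cluster_score > cluster_score_above)


def top_hits(hits):
    """Keep the best hit per (gene, genome); ranking by bit score is assumed."""
    best = {}
    for hit in hits:
        key = (hit.query_gene, hit.genome)
        if key not in best or hit.bit_score > best[key].bit_score:
            best[key] = hit
    return best


def homology_statistics(reference_genes, hits, genomes):
    """Return {genome: (count, percent)} of reference genes with a homolog."""
    with_homolog = {genome: set() for genome in genomes}
    for hit in hits:
        if (hit.genome in with_homolog
                and hit.query_gene in reference_genes
                and passes_thresholds(hit)):
            with_homolog[hit.genome].add(hit.query_gene)
    total = len(reference_genes)
    return {genome: (len(genes), round(100 * len(genes) / total))
            for genome, genes in with_homolog.items()}


# Toy data: four reference genes, two comparison genomes "A" and "B".
reference_genes = {"g1", "g2", "g3", "g4"}
hits = [
    Hit("g1", "A", 1e-50, 350.0, 62.0, 3),
    Hit("g1", "A", 1e-20, 210.0, 35.0, 2),  # second, weaker hit for g1
    Hit("g2", "A", 1e-5, 150.0, 40.0, 2),   # fails: bit score too low
    Hit("g3", "A", 1e-40, 300.0, 55.0, 1),  # fails: isolated homolog
    Hit("g1", "B", 1e-60, 400.0, 70.0, 4),
    Hit("g2", "B", 1e-45, 320.0, 58.0, 2),
    Hit("g3", "B", 1e-30, 250.0, 45.0, 2),
]

stats = homology_statistics(reference_genes, hits, ["A", "B"])
for genome, (count, percent) in stats.items():
    print(f"{genome}: {count} genes with homologs ({percent}%)")
```

With this toy data, genome A keeps only one of the four reference genes and genome B keeps three, which shows how a single failed cutoff removes a gene from the count. The same arithmetic is consistent with the tutorial's figures if the SchuS4 total is taken to be 1603 genes (the 1394 with and 209 without a homolog in FSC 198): 161 of 1603 rounds to 10% and 1394 of 1603 rounds to 87%.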




© University of Washington 2008