Frequently Asked Questions

How do I cite PhylomeDB?

Please cite PhylomeDB whenever you use it in published research, whether you downloaded a whole dataset, used it as a source of relevant evolutionary information, or took sequences from it to seed a phylogenetic analysis. Cite the most recent publication of the database, which is currently v4:

PhylomeDB v4: zooming into the plurality of evolutionary histories of a genome. Huerta-Cepas J, Capella-Gutiérrez S, Pryszcz LP, Marcet-Houben M, Gabaldón T. Nucleic Acids Res. 2014 Jan;42(Database issue):D897-902. doi: 10.1093/nar/gkt1177.

If you want to refer to the phylogenetic pipeline used in PhylomeDB, which was described in the v3 paper, please cite:

PhylomeDB v3.0: an expanding repository of genome-wide collections of trees, alignments and phylogeny-based orthology and paralogy predictions. Huerta-Cepas J, Capella-Gutiérrez S, Pryszcz LP, Denisov I, Kormes D, Marcet-Houben M, Gabaldón T. Nucleic Acids Res. 2011 Jan;39(Database issue):D556-60.

Alignments are not displayed properly in my browser

PhylomeDB uses Jalview, an external application that requires Java. Please check that you have the latest version of Java installed and that Java is enabled in your browser. When Java is correctly installed, your browser should display the following examples properly.

You can download the latest version of Java from the Java Download Site.

What types of branch support are used in the phylomes?

The specific methods used for each phylome are explained in the corresponding “phylome information page”. Due to computational and time constraints, we did not use standard bootstrap analyses for most phylomes. Instead, we usually compute approximate Likelihood Ratio Test (aLRT) supports, as implemented in PhyML-aLRT or in PhyML v3.0.

When I press the "show branch support" icon, not all branches display their support

When you press the “branch support” icon, the support values appear as red numbers beneath the branches. If some values do not appear, it is because there is not enough space to show them. In that case, press “force topology” and all branches will be resized so that every support value is displayed.

What do the different branch colors indicate?

Red indicates branches in which our species-overlap algorithm detects a duplication event, whereas speciation events are marked in blue. This color-coding scheme matches the one used in EnsemblCompara trees.
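The idea behind a species-overlap rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PhylomeDB implementation: the Node class and the label_events function are hypothetical, and the rule is taken in its simplest form, where an internal node is called a duplication if any species appears under more than one of its child branches and a speciation otherwise. The thresholds PhylomeDB actually applies may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """Minimal rooted tree node (hypothetical); leaves carry a species label."""
    species: Optional[str] = None
    children: List["Node"] = field(default_factory=list)
    event: Optional[str] = None  # set on internal nodes by label_events


def label_events(node: Node) -> Set[str]:
    """Label every internal node and return the species found below `node`."""
    if not node.children:
        return {node.species}
    seen: Set[str] = set()
    overlap = False
    for child in node.children:
        child_species = label_events(child)
        # Any species shared between child branches implies a duplication.
        if seen & child_species:
            overlap = True
        seen |= child_species
    node.event = "duplication" if overlap else "speciation"
    return seen


# Toy tree: ((human, mouse), (human, fly))
left = Node(children=[Node("human"), Node("mouse")])
right = Node(children=[Node("human"), Node("fly")])
root = Node(children=[left, right])
label_events(root)
print(left.event, right.event, root.event)
```

In this toy tree, human appears on both sides of the root, so the root would be drawn in red (duplication), while the two inner nodes would be drawn in blue (speciation).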

What are "collateral trees" and what are they useful for?

Collateral trees of a given protein are trees in which that protein is present but was not used as the seed for the tree reconstruction. In other words, they are trees for which a paralog of the given protein was used as the seed. Collateral trees may provide additional information on the topological position of a given protein. If several collateral trees support a specific relationship (e.g. an orthology relationship or a duplication), we can regard this as additional evidence for that relationship. Collateral trees also provide information about proteins from organisms that have not been used as the seed in any phylome but are present in the trees.
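As a rough illustration of the definition above, the following Python sketch picks out the collateral trees of a protein from a mapping of seed proteins to tree members. The data structure, the function name and the protein identifiers are all hypothetical and chosen only for the example; they do not reflect how PhylomeDB stores its trees.

```python
from typing import Dict, List, Set


def collateral_trees(protein: str, trees: Dict[str, Set[str]]) -> List[str]:
    """Return the seeds of all trees that contain `protein` without being seeded by it.

    `trees` maps each seed protein id to the set of protein ids in its tree.
    """
    return sorted(
        seed for seed, members in trees.items()
        if protein in members and seed != protein
    )


# Hypothetical phylome with three seed trees.
trees = {
    "P1": {"P1", "P2", "Q7"},
    "P2": {"P2", "P1", "Q7", "Q8"},
    "P3": {"P3", "Q9"},
}

print(collateral_trees("P1", trees))  # ['P2']
print(collateral_trees("Q7", trees))  # ['P1', 'P2']
```

Here Q7 is never a seed, so every tree that contains it is a collateral tree for it.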

I cannot find a tree for a given protein, even though the protein is present in the proteome you used to build the phylome

We cannot reconstruct a tree for sequences that did not produce at least three significant hits in the genomes considered (see the specific parameters and cut-offs in the phylome information page). This is probably why you cannot find a tree for these sequences.
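The minimum-hits rule can be pictured with a small Python sketch. Only the threshold of three hits comes from the text above; the E-value cut-off, the function name and the example hits are assumptions made for illustration, and the real significance criteria are the ones listed in the phylome information page.

```python
from typing import List, Tuple

MIN_SIGNIFICANT_HITS = 3  # from the text above


def can_build_tree(hits: List[Tuple[str, float]], max_evalue: float) -> bool:
    """Return True if enough distinct hits pass the significance cut-off.

    `hits` is a list of (hit_id, e_value) pairs for one seed sequence.
    """
    significant = {hit_id for hit_id, evalue in hits if evalue <= max_evalue}
    return len(significant) >= MIN_SIGNIFICANT_HITS


# Hypothetical search results and cut-off.
searches = {
    "seedA": [("h1", 1e-30), ("h2", 1e-12), ("h3", 1e-8)],
    "seedB": [("h1", 1e-20), ("h4", 0.5)],
}
cutoff = 1e-5  # placeholder value, not a PhylomeDB parameter

for seed, hits in searches.items():
    print(seed, can_build_tree(hits, cutoff))
```

With these toy values, seedA would get a tree and seedB would not.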

I want to add some additional sequences (from a species not included in the phylome) and re-do the tree using the same parameters. How should I proceed?

You can download the sequences from the “clean alignment” link. Add the new sequences, re-align with MUSCLE, trim with trimAl and build the tree with PhyML, using the parameters indicated in the phylome information page. All these programs can be downloaded or are available through the Phylemon webserver.
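If you run the programs locally, the three steps above could be chained roughly as in the following Python sketch. It only builds and prints the command lines. The file names, the JTT model, the trimAl gap threshold and the PhyML branch-support option are placeholders, and the MUSCLE call assumes the v3.x command-line syntax (newer MUSCLE releases use different options). Replace every parameter with the values given in the phylome information page.

```python
from typing import List


def pipeline_commands(fasta_in: str, prefix: str, model: str = "JTT") -> List[List[str]]:
    """Build align / trim / tree commands for a FASTA file of protein sequences."""
    aligned = f"{prefix}.aln.fasta"
    trimmed = f"{prefix}.trimmed.phy"
    return [
        # MUSCLE v3.x syntax (assumption): -in <input> -out <output>
        ["muscle", "-in", fasta_in, "-out", aligned],
        # trimAl: write PHYLIP output; the -gt gap threshold is a placeholder value
        ["trimal", "-in", aligned, "-out", trimmed, "-phylip", "-gt", "0.1"],
        # PhyML: amino-acid data; -b -2 requests Chi2-based aLRT supports (placeholder choice)
        ["phyml", "-i", trimmed, "-d", "aa", "-m", model, "-b", "-2"],
    ]


commands = pipeline_commands("seqs_plus_new.fasta", "mytree")
for cmd in commands:
    print(" ".join(cmd))
```

To actually execute the steps, each command list can be passed to subprocess.run(cmd, check=True), provided the three programs are installed and on your PATH.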

I want to create a direct link from a web page to PhylomeDB resources/results. How should I proceed?

You can find this information in the User's manual section “How to link to PhylomeDB”. You can link to many kinds of resources, such as the information about a specific phylome or the results for a given query (for example, the phylogenetic tree for a protein in a specific phylome).

I find several proteins with identical IDs in the same tree. What are these?

These sequences correspond to different genes in the same genome that encode identical proteins. As UniProt does, we assign the same protein identifier to all proteins within a single species that are identical in sequence. These sequences are usually very recent duplicates (perhaps CNVs) that have not yet diverged at the protein level, or genes that are undergoing gene conversion.
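The effect of this rule can be illustrated with a short Python sketch that groups genes by species and protein sequence. The gene names, species codes and sequences are made up for the example, and the function is a hypothetical helper, not part of PhylomeDB.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_identical(genes: List[Tuple[str, str, str]]) -> Dict[Tuple[str, str], List[str]]:
    """Group gene ids that encode the same protein sequence within one species.

    `genes` holds (gene_id, species, protein_sequence) tuples. Only groups
    with more than one gene are returned.
    """
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for gene_id, species, sequence in genes:
        groups[(species, sequence)].append(gene_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical genes: geneA and geneB encode the same protein in one species.
genes = [
    ("geneA", "Hsap", "MKTAYIAK"),
    ("geneB", "Hsap", "MKTAYIAK"),
    ("geneC", "Hsap", "MKTAYLAK"),
    ("geneD", "Mmus", "MKTAYIAK"),
]

shared = group_identical(genes)
for (species, _sequence), ids in shared.items():
    print(species, ids)
```

In this example geneA and geneB would share one protein identifier and could both appear in the same tree, whereas geneD keeps its own identifier because it belongs to a different species.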