This page is the user manual for the program Gardenia.
Input of Gardenia
Gardenia takes as input a set of RNA sequences together with their secondary structures. Each sequence should be specified in bracket format.
The bracket format consists of three lines. The first line contains a FASTA-like header. The second line contains the nucleotide sequence. The last line contains the associated base pairings, encoded by brackets and dots. A base pair between bases i and j is represented by a ( at position i and a ) at position j. Unpaired bases are represented by dots. Because the secondary structure contains no pseudoknots, this notation defines a unique folding.
>trna E. coli
ggggcuauagcucagcugggagagcgccugcuuugcacgcaggaggucugcgguucgaucccgcauagcuccacca
(((((((..((((........)))).(((((.......))))).....(((((........))))))))))))...
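As an illustration of the bracket format, the following minimal Python sketch reads one three-line record and recovers the base pairs with a stack. It is not part of Gardenia; the function names `parse_bracket_record` and `base_pairs` are hypothetical, and positions are 0-based here.

```python
def parse_bracket_record(text):
    """Parse one three-line record: header, sequence, dot-bracket structure."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if len(lines) != 3 or not lines[0].startswith(">"):
        raise ValueError("expected a '>' header, a sequence and a structure line")
    name, seq, struct = lines[0][1:].strip(), lines[1], lines[2]
    if len(seq) != len(struct):
        raise ValueError("sequence and structure differ in length")
    return name, seq, struct


def base_pairs(struct):
    """Return the sorted list of 0-based (i, j) pairs encoded by the brackets."""
    stack, pairs = [], []
    for pos, ch in enumerate(struct):
        if ch == "(":
            stack.append(pos)            # remember the opening position
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))  # closes the most recent '('
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


# Toy record made up for this example (not a real RNA).
record = """>toy hairpin
gggaaaccc
(((...)))"""
name, seq, struct = parse_bracket_record(record)
print(name, base_pairs(struct))  # toy hairpin [(0, 8), (1, 7), (2, 6)]
```

Because each ) closes the most recently opened (, a structure without pseudoknots yields exactly one set of pairs.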
Edit operations and Scoring system
Edit operations may be divided into two groups: those concerning free bases and those concerning arcs between bases, or hydrogen bonds.
Edit operations for free bases are the same as usual: removing a base (base-deletion), renaming it (base-mismatch), or leaving it untouched (base-match).
| U A U C C | U A U C C | U A U C C |
| U A U C C | U A G C C | U A - C C |
| base-match | base-mismatch | base-deletion |
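The three free-base operations can be read directly off the columns of a pairwise alignment. The short Python sketch below labels each column; the function name `classify_base_columns` is hypothetical, and it assumes that `-` marks a gap and that no column is a gap in both rows.

```python
def classify_base_columns(row_a, row_b, gap="-"):
    """Label each alignment column with the free-base edit operation it shows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    ops = []
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            ops.append("base-deletion")   # one base has no counterpart
        elif a == b:
            ops.append("base-match")      # base left untouched
        else:
            ops.append("base-mismatch")   # base renamed
    return ops


print(classify_base_columns("UAUCC", "UAGCC"))
print(classify_base_columns("UAUCC", "UA-CC"))
```

The two calls use the second and third alignments of the table above; only their middle column differs from a base-match.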
There are five main kinds of edit operations involving base-pairings. Let i,j be two positions in a sequence forming a base pair.
Arc-match: i,j is left untouched.
| ( ) |
| A U C G G U A A C G |
| A U C G C A - A A G |
| ( ) |
Arc-mismatch: i,j is aligned with another base-pairing that is not identical. The cost of the edit operation depends on the number of mutations within the pairing: If only one base changes, then the cost is arc-mismatch (1). If both bases change, then the cost is arc-mismatch (2).
| ( ) | ( ) |
| A U C G G U A A C G | A U C G G U A A C G |
| A U C G G U A G C G | C U C G G U A G C G |
| ( ) | ( ) |
| arc-mismatch (1) | arc-mismatch (2) |
Arc-removing: i,j has no counterpart in the other sequence.
| ( ) |
| A U C G G U A A C G |
| A - C G C A - - A G |
Arc-altering: i,j is aligned with a single free base. The cost of the operation depends on whether the free base is conserved.
| ( ) | ( ) |
| A U C G G U A A C G | A U C G G U A A C G |
| A U C G G U A - C G | C C C G G U A - C G |
| free base conserved | free base modified |
Arc-breaking: the base pair i,j is aligned with two free bases. This operation breaks the pairing between i and j and leaves both bases free. There are three possible weights: arc-breaking (1) is for two identical bases, arc-breaking (2) is for one identical base and one modified base, and arc-breaking (3) is for two modified bases.
| ( ) | ( ) | ( ) |
| A U C G G U A A C G | A U C G G U A A C G | A U C G G U A A C G |
| A U C G G U A A C G | C C C G G U A A C G | C C C G G U A G C G |
| arc-breaking (1) | arc-breaking (2) | arc-breaking (3) |
You can adjust the weight of each edit operation on the web server.
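To make the five arc operations concrete, here is a hedged Python sketch that labels each arc of the first row of a pairwise alignment. It is not Gardenia's code: the function names are hypothetical, and the tables above do not preserve which columns the brackets sit over, so the arc between columns 1 and 7 (0-based) is an assumption made only for this example. The sketch also assumes balanced brackets, and that a base of the second row facing an arc endpoint is either the matching endpoint or a free base.

```python
def arcs_by_column(aligned_struct):
    """Map each '(' column to its ')' column; other characters are ignored."""
    stack, arcs = [], {}
    for col, ch in enumerate(aligned_struct):
        if ch == "(":
            stack.append(col)
        elif ch == ")":
            arcs[stack.pop()] = col       # assumes balanced brackets
    return arcs


def classify_arcs(seq_a, struct_a, seq_b, struct_b, gap="-"):
    """Label every arc of row A with the edit operation the alignment implies."""
    arcs_b = arcs_by_column(struct_b)
    ops = []
    for i, j in sorted(arcs_by_column(struct_a).items()):
        present = [seq_b[k] != gap for k in (i, j)]
        # Number of arc endpoints whose aligned base differs (gaps not counted).
        changed = sum(1 for k in (i, j)
                      if seq_b[k] != gap and seq_a[k] != seq_b[k])
        if arcs_b.get(i) == j:            # aligned with an arc of row B
            op = "arc-match" if changed == 0 else f"arc-mismatch ({changed})"
        elif all(present):                # aligned with two free bases
            op = f"arc-breaking ({changed + 1})"
        elif any(present):                # aligned with a single free base
            op = "arc-altering"
        else:                             # no counterpart at all
            op = "arc-removing"
        ops.append(((i, j), op))
    return ops


seq_a, struct_a = "AUCGGUAACG", ".(.....).."   # assumed arc placement
print(classify_arcs(seq_a, struct_a, "AUCGGUAGCG", ".(.....).."))
print(classify_arcs(seq_a, struct_a, "A-CGCA--AG", ".........."))
print(classify_arcs(seq_a, struct_a, "CCCGGUAGCG", ".........."))
```

The labels returned are the operation names to which the adjustable weights attach. The sketch reports arcs of the first row only, and does not distinguish the two arc-altering cases.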
Command-line version
Gardenia is written in C. The source code is freely available under the GPL license:
gardenia.zip and read me.
It offers more options than the web server.
Output of Gardenia
The result is a multiple sequence alignment.
trna coli (((((((     ..((((........)))).(((((.......)) ))).....(((((.
trna coli ggggcua-----uagcucagcugggagagcgccugcuuugcacgc-aggaggucugcggu
trna2     uccucgguaguauaguggug-aguauccgcgucugu--cacaugcgaga----cccgggu
trna2     (((((((.......((((.. .....))))((((((  .....)).)))    )(((((.
              *       ***    *  *     *** ***     ** ** **     *   ***
trna coli .......))))))))))))...
trna coli ucgaucccgcauagcuccacca
trna2     ucaau-ucccggccgggga--g
trna2     ..... .))))))))))))  .
          ** **  * *        *
Each helix that appears in both sequences is assigned a colour. Asterisks (*) indicate conserved positions, that is, columns where both sequences carry the same base.
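The row of asterisks can be recomputed from the two aligned sequence rows. The sketch below is a minimal Python illustration, assuming that a position counts as conserved when both rows carry the same base and neither is a gap; the function name `conservation_line` is hypothetical.

```python
def conservation_line(row_a, row_b, gap="-"):
    """Return '*' under identical, non-gap columns and ' ' elsewhere."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "*" if a == b and a != gap else " " for a, b in zip(row_a, row_b)
    )


# Sequence rows of the second alignment block shown above.
top = "ucgaucccgcauagcuccacca"
bot = "ucaau-ucccggccgggga--g"
print(top)
print(bot)
print(conservation_line(top, bot).rstrip())
```

Applied to the second block of the output above, this gives asterisks in the same groups as the alignment shows.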