This page is the user manual for the program Gardenia.
Input of Gardenia
Gardenia takes as input a set of RNA sequences with secondary structure. Each sequence should be specified in bracket format.
The bracket format consists of three lines. The first line contains a FASTA-like header. The second line contains the nucleotide sequence. The third line contains the set of associated pairings, encoded by brackets and dots. A base pair between bases i and j (with i before j) is represented by a ( at position i and a ) at position j. Unpaired bases are represented by dots. Because the secondary structure contains no pseudoknots, this notation defines a unique folding.
>trna E. coli
ggggcuauagcucagcugggagagcgccugcuuugcacgcaggaggucugcgguucgaucccgcauagcuccacca
(((((((..((((........)))).(((((.......))))).....(((((........))))))))))))...
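The bracket notation can be decoded with a simple stack. The following is a minimal sketch, not Gardenia's own parser: the function names `pairs_from_brackets` and `parse_bracket_record` are hypothetical, and positions are reported 1-based to match the description above.

```python
def pairs_from_brackets(structure):
    """Return the sorted list of base pairs (i, j), 1-based, of a dot-bracket string."""
    stack = []   # positions of '(' still waiting for their partner
    pairs = []
    for pos, symbol in enumerate(structure, start=1):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def parse_bracket_record(text):
    """Split a three-line record into (name, sequence, list of base pairs)."""
    lines = [line.strip() for line in text.strip().splitlines()]
    if len(lines) != 3:
        raise ValueError("a record must have exactly three lines")
    header, sequence, structure = lines
    if not header.startswith(">"):
        raise ValueError("the header line must start with '>'")
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    return header[1:].strip(), sequence, pairs_from_brackets(structure)


# Toy record used only for illustration.
name, sequence, pairs = parse_bracket_record(">toy hairpin\ngggaaaccc\n(((...)))")
print(name, sequence, pairs)
```

Because the structure has no pseudoknots, each closing bracket always pairs with the most recently opened one, which is why a stack is sufficient.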
Edit operations and Scoring system
Edit operations may be divided into two groups: those concerning free bases and those concerning arcs between bases, that is, hydrogen bonds.
Edit operations for free bases are the same as usual: removing a base (base-deletion), renaming it (base-mismatch), or leaving it untouched (base-match).
U A U C C | U A U C C | U A U C C
U A U C C | U A G C C | U A - C C
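The three operations on free bases can be read directly from an alignment column. The sketch below is an illustration only: `classify_base_column` is a hypothetical helper, and it assumes that a gap (`-`) in either row counts as a base-deletion.

```python
def classify_base_column(a, b):
    """Name the edit operation for one alignment column of two free bases."""
    if a == "-" and b == "-":
        raise ValueError("a column cannot contain two gaps")
    if a == "-" or b == "-":
        return "base-deletion"    # assumption: a gap in either row is a deletion
    if a == b:
        return "base-match"
    return "base-mismatch"


top = "UAUCC"
for bottom in ("UAUCC", "UAGCC", "UA-CC"):
    print(bottom, [classify_base_column(a, b) for a, b in zip(top, bottom)])
```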
There are five main kinds of edit operations involving base-pairings. Let i,j be two positions in a sequence forming a base pair.
Arc-match: i,j is left untouched.
( )
A U C G G U A A C G
A U C G C A - A A G
( )
Arc-mismatch: i,j is aligned with another base-pairing that is not identical. The cost of the edit operation depends on the number of mutations within the pairing: if only one base changes, the cost is arc-mismatch (1); if both bases change, the cost is arc-mismatch (2).
( )                 | ( )
A U C G G U A A C G | A U C G G U A A C G
A U C G G U A G C G | C U C G G U A G C G
( )                 | ( )
Arc-removing: i,j has no counterpart in the other sequence.
( )
A U C G G U A A C G
A - C G C A - - A G
Arc-altering: i,j is aligned with a single free base. The cost of the operation depends on the conservation of the free base.
( )                 | ( )
A U C G G U A A C G | A U C G G U A A C G
A U C G G U A - C G | C C C G G U A - C G
Arc-breaking: the base pair i,j is aligned with two free bases. This operation breaks the pairing between i and j and leaves the bases free. There are three possible weights: arc-breaking (1) is for two identical bases, arc-breaking (2) is for one identical base and one modified base, and arc-breaking (3) is for two modified bases.
( )                 | ( )                 | ( )
A U C G G U A A C G | A U C G G U A A C G | A U C G G U A A C G
A U C G G U A A C G | C C C G G U A A C G | C C C G G U A G C G
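The five arc operations can be summarised as a decision on what the two bases of an arc face in the other sequence. The following is a sketch under stated assumptions, not Gardenia's implementation: `classify_arc` is a hypothetical helper, columns are 0-based, the arc columns used in the example are chosen for illustration, and the bottom bases are assumed to be free whenever they are not paired with each other.

```python
def classify_arc(top, bottom, bottom_arcs, i, j):
    """Name the edit operation applied to the arc (i, j) of the row `top`.

    top, bottom : aligned rows of equal length, '-' marks a gap
    bottom_arcs : set of (column, column) pairs that are base-paired in `bottom`
    i, j        : 0-based alignment columns of the arc in `top`
    """
    x, y = bottom[i], bottom[j]
    changed = (x != top[i]) + (y != top[j])   # number of modified bases (0, 1 or 2)
    if (i, j) in bottom_arcs:
        # The arc faces another arc: identical pairing or mutated pairing.
        return "arc-match" if changed == 0 else f"arc-mismatch ({changed})"
    gaps = (x == "-") + (y == "-")
    if gaps == 2:
        return "arc-removing"                 # no counterpart at all
    if gaps == 1:
        return "arc-altering"                 # weight also depends on the free base
    # Two free bases: (1) both identical, (2) one modified, (3) both modified.
    return f"arc-breaking ({changed + 1})"


top = "AUCGGUAACG"
arc = (1, 7)   # illustrative choice: the U and the A are paired in `top`
print(classify_arc(top, "AUCGCA-AAG", {arc}, *arc))
print(classify_arc(top, "AUCGGUAGCG", {arc}, *arc))
print(classify_arc(top, "A-CGCA--AG", set(), *arc))
print(classify_arc(top, "AUCGGUA-CG", set(), *arc))
print(classify_arc(top, "CCCGGUAACG", set(), *arc))
```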
You can adjust the weight of each edit operation on the web server.
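To see how the weights interact, the sketch below adds up the weights of a list of edit operations. It assumes that the cost of an alignment is the sum of the weights of its operations. The function `alignment_cost` is hypothetical, and the numeric weights are placeholders chosen for this illustration, not Gardenia's default values.

```python
from collections import Counter


def alignment_cost(operations, weights):
    """Sum the weights of a list of edit-operation names."""
    counts = Counter(operations)
    missing = sorted(set(counts) - set(weights))
    if missing:
        raise KeyError("no weight given for: " + ", ".join(missing))
    return sum(weights[name] * n for name, n in counts.items())


# Placeholder weights for illustration only (NOT Gardenia's defaults).
example_weights = {
    "base-match": 0,
    "base-mismatch": 1,
    "base-deletion": 2,
    "arc-match": 0,
    "arc-breaking (2)": 3,
}

operations = ["base-match", "base-mismatch", "arc-match", "arc-breaking (2)"]
print(alignment_cost(operations, example_weights))
```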
Command-line version
Gardenia is written in C. The source code is freely available under the GPL license:
gardenia.zip and read me.
It offers more options than the web server.
Output of Gardenia
The result is a multiple sequence alignment.
trna coli (((((((     ..((((........)))).(((((.......)) ))).....(((((.
trna coli ggggcua-----uagcucagcugggagagcgccugcuuugcacgc-aggaggucugcggu
trna2     uccucgguaguauaguggug-aguauccgcgucugu--cacaugcgaga----cccgggu
trna2     (((((((.......((((.. .....))))((((((  .....)).)))    )(((((.
              *       ***    *  *     *** ***     ** ** **     *   ***

trna coli .......))))))))))))...
trna coli ucgaucccgcauagcuccacca
trna2     ucaau-ucccggccgggga--g
trna2     ..... .))))))))))))  .
          ** **  * *        *

Each helix that appears in both sequences is assigned a colour. Asterisks (*) indicate conserved positions.
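The line of asterisks can be recomputed from two aligned sequence rows. This is a minimal sketch, assuming that a position is conserved when both rows carry the same base and neither is a gap; `conservation_line` is a hypothetical helper, not part of Gardenia.

```python
def conservation_line(row_a, row_b):
    """Return a string with '*' under every column where both rows agree."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "*" if a == b and a != "-" else " "
        for a, b in zip(row_a, row_b)
    )


# Last block of the alignment shown above.
row_coli = "ucgaucccgcauagcuccacca"
row_trna2 = "ucaau-ucccggccgggga--g"
print(row_coli)
print(row_trna2)
print(conservation_line(row_coli, row_trna2))
```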