This page covers the basics of getting started with the miRkwood small RNA-seq web server. A comprehensive user guide is available here.
The main input of miRkwood is a set of reads (ranging from 15 nt to 35 nt approximately) that have been previously mapped on a reference genome and that are stored in a file in BED format. We advise you to first practice with the sample BED file below.
This file contains Illumina reads for Arabidopsis thaliana from SRA SRR960237. We mapped these reads to TAIR 10 and selected two portions of reads, one from chromosome 1 and one from chromosome 4. This gives a total of 319,103 reads.
If you want to know more about how to create this BED file for your own data, check out the detailed instructions.
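Before uploading a BED file, it can be useful to check that the read lengths fall in the expected 15 nt to 35 nt range. The following is a minimal sketch, not part of miRkwood: it assumes a standard six-column BED layout (chromosome, start, end, name, score, strand), and the function names `parse_bed`, `length_histogram` and `count_out_of_range` are hypothetical. The only BED property it relies on is that coordinates are 0-based with an exclusive end, so the read length is `end - start`.

```python
from collections import Counter
from typing import Iterable, Iterator, NamedTuple


class BedRead(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    name: str
    strand: str


def parse_bed(lines: Iterable[str]) -> Iterator[BedRead]:
    """Yield one BedRead per data line (assumes at least six columns)."""
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments and optional track/browser headers.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 6:
            raise ValueError(f"expected at least 6 columns, got: {line!r}")
        yield BedRead(fields[0], int(fields[1]), int(fields[2]),
                      fields[3], fields[5])


def length_histogram(reads: Iterable[BedRead]) -> Counter:
    """Count BED lines per read length (end - start)."""
    return Counter(read.end - read.start for read in reads)


def count_out_of_range(hist: Counter, lo: int = 15, hi: int = 35) -> int:
    """Number of BED lines whose read length is outside [lo, hi]."""
    return sum(n for length, n in hist.items() if not lo <= length <= hi)


if __name__ == "__main__":
    import sys

    # Usage: python check_bed.py sample.bed   (the file name is illustrative)
    with open(sys.argv[1]) as handle:
        hist = length_histogram(parse_bed(handle))
    for length in sorted(hist):
        print(f"{length} nt\t{hist[length]}")
    print("lines outside 15-35 nt:", count_out_of_range(hist))
```

Note that the sketch counts BED lines, not reads: if your file stores a read count in one of its columns, the numbers will differ from the totals reported by the server.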
Input form
You can now open the input form on the web server.
Upload your set of reads: choose the downloaded sample BED file on your local disk.
Select a species: choose the Arabidopsis thaliana TAIR 10 assembly.
Parameters: keep all default options.
Finally, click the Run miRkwood button. Results will be displayed in a couple of minutes.
Results page
The top of the page displays the job ID, which you can save for later use (with the link retrieve a result with an ID, on the main menu).
The results page then has two main parts. The first one (Options summary) is simply a summary of your job parameters. The other one (Results summary) provides the detailed results.
This diagram indicates the proportion of reads found for each category. This is a graphical visualisation of the results displayed below.
-
Total number of reads: 319,103 (29,795 unique reads)
The initial BED file contains 319,103 reads, forming 29,795 unique reads.
-
CoDing Sequences: 2,599 reads
2,599 reads fall within a coding region (annotated as CDS) and are discarded from the analysis. You can list them by clicking on the download link (GFF file).
-
rRNA/tRNA/snoRNA: 6,275 reads
6,275 reads fall within a region annotated as ribosomal RNA, transfer RNA or snoRNA, and are discarded from the analysis. You can list them by clicking on the download link (GFF file).
-
Multiply mapped reads: 1,665 reads
1,665 reads map to more than five locations, and are discarded from the analysis. You can list them by clicking on the download link (BED file).
-
Orphan cluster of reads: 26,826 reads
An orphan cluster is a short region in the genome that is enriched with aligned reads but that shows no secondary structure compatible with a hairpin. You can obtain the list of such clusters by clicking on the download link (BED file).
-
Orphan hairpins: 277 reads
An orphan hairpin is a candidate with a global score of 0 that shows no conservation with miRBase. By default, if you select the option "filter out low quality hairpins", such hairpins are discarded automatically, and you can obtain their list by clicking on the download link (BED file).
-
Unclassified reads: 49,706 reads
49,706 reads do not belong to any orphan cluster or orphan hairpin, and do not fall in any annotated region.
-
Known miRNAs: 5 sequence(s) - 224,134 reads
miRkwood finds 5 pre-miRNAs present in miRBase that intersect with a total of 224,134 reads present in the input BED file. You can display detailed results by clicking on the link see results. From this new page, you will be able to access a short report for each miRNA found: miRBase accession number, positions of the precursor, sequence of the miRNA, secondary structure, read distribution. Further information can then be retrieved from the miRBase website.
-
Novel miRNAs: 30 sequence(s) - 7,621 reads
miRkwood finds 30 new pre-miRNAs that have not been previously reported in miRBase and that are supported by significant read coverage and a stem-loop secondary structure. These 30 miRNAs represent a total of 7,621 reads. You can display detailed results by clicking on the link see results. From this new page, you will be able to access a comprehensive report for each predicted miRNA: positions of the precursor, sequence of the miRNA, secondary structure, read distribution, thermodynamic stability, precision of the duplex processing, conservation, and so on. Predictions are ranked according to a quality score.
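As a sanity check, the eight category counts listed above add up exactly to the 319,103 reads of the input file. The short sketch below verifies this and prints the share of each category. The labels and counts are copied from this page; the assumption that the diagram is derived in this way, and the helper name `proportions`, are ours and not taken from miRkwood.

```python
# Read counts per category, copied from the Results summary above.
CATEGORY_READS = {
    "CoDing Sequences": 2_599,
    "rRNA/tRNA/snoRNA": 6_275,
    "Multiply mapped reads": 1_665,
    "Orphan cluster of reads": 26_826,
    "Orphan hairpins": 277,
    "Unclassified reads": 49_706,
    "Known miRNAs": 224_134,
    "Novel miRNAs": 7_621,
}
TOTAL_READS = 319_103


def proportions(counts: dict[str, int], total: int) -> dict[str, float]:
    """Return the fraction of the total that each category represents."""
    return {name: n / total for name, n in counts.items()}


# The categories account for every read in the input file.
assert sum(CATEGORY_READS.values()) == TOTAL_READS

for name, share in proportions(CATEGORY_READS, TOTAL_READS).items():
    print(f"{name:<25}{share:7.2%}")
```

In this sample, reads overlapping known miRNAs make up roughly 70% of the input, which is why they dominate the diagram.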
If you want to know more about parameter options, export formats, and visualisation tools, please visit our main help page.