The Australian National University
Research School of Biological Sciences
ANU COLLEGE OF SCIENCE
  
    

Medicago-MIRATdb: M. truncatula Putative miRNA and Target Database



Help Page

Frequently Asked Questions:

 

What does "Source" mean? Where do the source sequences come from?

Source is the sequence set from which we predicted miRNA or target sequences. We used two source sequence sets: one for predicting miRNAs and one for predicting target sites. The miRNA source is further divided into non-coding transcripts (putative ncRNAs) and genomic sequences that intervene between EST segments. For the former, we used the 503 ncRNAs predicted by our ncRNA prediction pipeline. For the latter, we mapped the ESTs onto genomic sequences and extracted the genomic sequences lying between EST segments at splice consensus sites. For the target source, we used the TIGR Medicago truncatula Gene Index (GI) sequences (Release 8.0, January 19, 2005; ftp://ftp.tigr.org/pub/data/tgi/Medicago_truncatula/), a total of 36,878 assembled GI transcripts, to predict miRNA target sites.

 

What does "GI" represent?

GI represents the TIGR Gene Index which contains both assembled consensus sequences (TCs) and singleton ESTs.

 

What are miRNA and target gene identifiers?

miRNA identifiers start with "n" followed by a number; the "n" indicates that the miRNA was predicted from an ncRNA transcript. We use the GI accession as the target gene identifier because target sites were predicted from GI sequences.

 

Why do you use different regions to describe miRNAs?

As the miRNA length could not be precisely defined, our prediction pipeline works with three regions: (i) the miRNA::target duplex region, ranging from 18 to 25 nt, which is shown in the miRNA::target alignments; (ii) a 25 nt centered miRNA::miRNA* duplex, which is used in the fold-back filter; and (iii) a 21 nt sequence, which we use to report our predicted miRNAs because 90% of known plant miRNAs are 21 nt long.

 

How do you define miRNA precursor boundaries?

The predicted miRNA precursors are delimited by the 25 nt miRNAs and miRNA*s. The precursor minimum free energies (MFEs), however, were calculated from the sequences bounded by the miRNAs that are matched to target sites and the corresponding miRNA*s. For this reason, the reported MFEs may differ slightly from values computed on the reported precursor sequences.

 

Where is the EST library information from?

We gathered the Medicago truncatula EST library information from TIGR and www.medicago.org. As different libraries may be generated from the same tissue, we include a "tissue category" field to describe the tissue from which an EST was cloned.

 

What is the "number of ESTs" field in your EST info?

It is the number of ESTs assembled into each GI. It can be used as a rough guide to expression abundance.

 

How did you search for miRNA conservation in other plant species?

We used the predicted 21 nt miRNAs to locate homologous mature miRNAs and miRNA*s in Lotus japonicus and Arabidopsis thaliana genomic sequences as well as in EST sequences in dbEST. We determined all sites that are similar to the 21 nt miRNA sequences (fewer than 3 mismatches) and have a matching miRNA* at a distance of 15-400 bp. In addition, the homologous miRNAs were required to reside on the same arm of the precursor loop. The matched homologous miRNA sequences were also required to meet the criteria of the sequence filter and the miRNA fold-back structure filter (see the reference for a detailed description).
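The site search described above can be illustrated with a minimal Python sketch. It is not the pipeline's actual code. It assumes a plain ungapped comparison on one strand, takes "fewer than 3 mismatches" to mean at most 2, and measures the 15-400 bp distance as the gap between the end of the upstream site and the start of the downstream site. The function names and the example sequences are hypothetical.

```python
from typing import List, Tuple


def find_similar_sites(query: str, target: str,
                       max_mismatches: int = 2) -> List[Tuple[int, int]]:
    """Return (start, mismatches) for every ungapped window of `target`
    that differs from `query` at no more than `max_mismatches` positions."""
    k = len(query)
    hits = []
    for start in range(len(target) - k + 1):
        window = target[start:start + k]
        mismatches = sum(1 for a, b in zip(query, window) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


def pair_with_star(mirna_sites: List[Tuple[int, int]],
                   star_sites: List[Tuple[int, int]],
                   length: int = 21,
                   min_gap: int = 15,
                   max_gap: int = 400) -> List[Tuple[int, int, str]]:
    """Pair miRNA sites with miRNA* sites whose gap lies in [min_gap, max_gap].

    The arm label is "5p" when the miRNA lies upstream of the miRNA*,
    otherwise "3p" (assumed labels, used here to compare arms between species).
    """
    pairs = []
    for m_start, _ in mirna_sites:
        for s_start, _ in star_sites:
            if m_start < s_start:
                gap, arm = s_start - (m_start + length), "5p"
            else:
                gap, arm = m_start - (s_start + length), "3p"
            if min_gap <= gap <= max_gap:
                pairs.append((m_start, s_start, arm))
    return pairs


# Made-up sequences for illustration only (not taken from the database).
mirna = "TGACAGAAGAGAGTGAGCACA"
star = "GCTCACTTCTCTTTCTGTCAG"
# Target carries the miRNA with one substitution, a 20 nt spacer, then the miRNA*.
target = "GG" + "AGACAGAAGAGAGTGAGCACA" + "C" * 20 + star + "GG"

pairs = pair_with_star(find_similar_sites(mirna, target),
                       find_similar_sites(star, target))
```

A candidate would then be kept only if its arm label matches the arm of the Medicago miRNA, and it would still have to pass the sequence and fold-back filters, which this sketch does not cover.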

 

What is "CLAN"?

A "CLAN" is a group of highly similar miRNA sequences. The name comes from the program CLANS, which we used to group the miRNAs.

 

How was the p-value of target gene GO classification determined?

We searched for statistically overrepresented GO terms associated with the target genes, using a p-value threshold of < 0.1.
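The answer above does not name the statistical test. As an illustration only, overrepresentation of a GO term is commonly assessed with a one-sided hypergeometric test, sketched below in Python. The choice of test, the function name, and the numbers in the example are all assumptions and are not taken from the database.

```python
from math import comb


def hypergeom_pvalue(k: int, n: int, K: int, N: int) -> float:
    """Upper-tail hypergeometric probability P(X >= k).

    N: genes in the background set
    K: background genes annotated with the GO term
    n: target genes examined
    k: target genes annotated with the GO term
    """
    total = comb(N, n)
    upper = min(n, K)
    # comb() returns 0 when more genes are requested than are available.
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, upper + 1))
    return favourable / total


# Toy numbers: 10 background genes, 5 annotated, 2 targets, both annotated.
p = hypergeom_pvalue(k=2, n=2, K=5, N=10)
is_overrepresented = p < 0.1  # threshold quoted in the answer above
```

With these toy numbers the probability is 10/45, so the term would not pass the 0.1 threshold.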

 

How was the BLAST search against protein databases performed?

We BLAST-searched the target genes against Swiss-Prot, TrEMBL, and NR with an E-value threshold of < 1E-5. The top 3 hits from the BLAST results were recorded.
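The hit selection described above can be sketched in a few lines of Python. This assumes that the hits for one target gene are already available as (subject identifier, E-value) pairs and that "top" means lowest E-value. The function name and the identifiers in the example are hypothetical.

```python
from typing import List, Tuple


def top_hits(hits: List[Tuple[str, float]],
             max_evalue: float = 1e-5,
             n: int = 3) -> List[Tuple[str, float]]:
    """Keep hits with E-value below `max_evalue` and return the `n` best,
    ordered from lowest to highest E-value."""
    passing = [hit for hit in hits if hit[1] < max_evalue]
    passing.sort(key=lambda hit: hit[1])
    return passing[:n]


# Made-up hits for a single target gene.
example_hits = [("P1", 1e-3), ("P2", 1e-30), ("P3", 1e-10),
                ("P4", 1e-6), ("P5", 1e-50)]
best = top_hits(example_hits)
```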

 

FASTA format

A sequence in FASTA format consists of a single description line starting with ">", followed by one or more lines of sequence data.
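As a small illustration of this layout, the following Python sketch reads FASTA-formatted lines into a dictionary. The function name, the identifier "n1", and the sequence are invented for the example and do not come from the database.

```python
from typing import Dict, Iterable, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each description line (without the leading '>') to its sequence,
    joining sequence data that is split across several lines."""
    records: Dict[str, str] = {}
    header: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


# Made-up record: one description line and two lines of sequence data.
example = """>n1 example sequence
ACGUACGUACG
UACGUACGUA
""".splitlines()

records = parse_fasta(example)
```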

 

Do you plan to predict miRNAs in other legumes?

Yes. We next plan to predict miRNAs and their targets in another model legume, Lotus japonicus.

 

Contact

Questions and comments are welcome; send email to Georg.weiller@anu.edu.au