Theoretical Predictions of Non-Coding RNA Genes

Non-coding RNA genes are usually small and unaffected by nonsense or frameshift mutations, so they are difficult to identify with traditional genetic screening methods. At the same time, the completed whole-genome sequences of many species provide an essential resource for the theoretical prediction of non-coding RNA genes. Lifeasible therefore offers a theory-based approach to finding non-coding RNA genes in the genome, which can serve as an initial screen that supplies a large pool of candidates for experimental studies.

Homology search method

Searching a database for homologous sequences is a simple and fast way to identify protein-coding genes. For non-coding RNA genes, we use a profile stochastic context-free grammar (SCFG) model to describe the homology among sequences, and then search the non-coding RNA database for regions similar to the profile. We have also parallelized the algorithm on high-performance computing resources, which significantly improves its speed and makes the method more practical to apply.
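A full profile SCFG search is too large for a short example. As a rough illustration of the kind of dynamic programming it builds on, the sketch below implements Nussinov base-pair maximization, a simpler, non-probabilistic relative of SCFG parsing. This is a swapped-in technique for illustration, not the algorithm used by Lifeasible; the function names and the `min_loop` parameter are assumptions made for this example.

```python
# Nussinov base-pair maximization: a simplified stand-in for the
# structure-aware scoring that a profile SCFG performs.
# All names and parameters here are illustrative assumptions.

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    """Return True for Watson-Crick and G-U wobble pairs."""
    return (a, b) in PAIRS


def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq.

    min_loop is the minimum number of unpaired bases enclosed by a pair.
    """
    seq = seq.upper().replace("T", "U")
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    # Fill the table by increasing subsequence span.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j is unpaired
            # case 2: position j pairs with some k, leaving a loop of
            # at least min_loop bases between k and j
            for k in range(i, j - min_loop):
                if can_pair(seq[k], seq[j]):
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    # A toy hairpin: three G-C pairs closing an AAA loop.
    print(nussinov_max_pairs("GGGAAACCC"))
```

A profile SCFG goes further by attaching position-specific probabilities to each paired and unpaired column of an alignment, so that a candidate region is scored on both sequence and structure conservation.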

Statistical method

We utilize statistical features of the base composition of non-coding RNA genes, such as tetrad distribution rate and AT/AA content. However, this method generally requires relatively long sequences; otherwise, the base composition does not show clear statistical features. We therefore do not recommend this method for short non-coding RNA genes.
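The length limitation can be seen in a minimal sketch of composition features. It assumes that the tetrad statistic refers to overlapping 4-mer (tetranucleotide) frequencies, which is an interpretation rather than something stated above; the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=4):
    """Relative frequency of every overlapping k-mer over A, C, G, T.

    Assumption: "tetrad" statistics are modeled as 4-mer frequencies.
    """
    seq = seq.upper().replace("U", "T")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Ignore k-mers containing ambiguous symbols such as N.
    valid = {kmer: c for kmer, c in counts.items() if set(kmer) <= set("ACGT")}
    total = sum(valid.values())
    freqs = {}
    for letters in product("ACGT", repeat=k):
        kmer = "".join(letters)
        freqs[kmer] = valid.get(kmer, 0) / total if total else 0.0
    return freqs


def at_content(seq):
    """Fraction of unambiguous bases that are A or T (U is counted as T)."""
    seq = seq.upper().replace("U", "T")
    bases = [b for b in seq if b in "ACGT"]
    return sum(b in "AT" for b in bases) / len(bases) if bases else 0.0


if __name__ == "__main__":
    freqs = kmer_frequencies("ACGTACGT")
    observed = sum(1 for f in freqs.values() if f > 0)
    print(observed, "of", len(freqs), "possible 4-mers observed")
```

A sequence of length n contains only n - 3 overlapping 4-mers, while there are 256 possible 4-mers, so a short gene leaves most of the frequency table empty and the estimates become unreliable.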

Comparative genomic analysis method

Comparative genomics provides another way to predict non-coding RNA genes. We use Bayesian posterior probabilities to characterize conserved regions based on their base substitution patterns and combine this with RNA folding models, so that non-coding RNAs of unknown structure can be detected. This method requires whole-genome sequences from at least two species: the species in which the non-coding RNA genes are sought and a related species for comparison. The method is simple to perform, can recognize new non-coding RNA genes, and does not depend on a specific species.
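The idea of scoring substitution patterns with a posterior probability can be sketched as follows. This is a deliberately simplified two-hypothesis model, not the model used in the service: the probabilities `p_rna`, `p_bg`, and `prior` are hypothetical values chosen for illustration, and the list of paired columns is assumed to come from a separate folding step.

```python
import math

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def count_pair_substitutions(seq1, seq2, pairs):
    """Count substitutions at putative paired columns of a pairwise alignment.

    Returns (preserved, broken): columns where a change keeps both
    sequences pairable versus columns where pairing is lost.
    """
    preserved = broken = 0
    for i, j in pairs:
        if seq1[i] == seq2[i] and seq1[j] == seq2[j]:
            continue  # no substitution, so no evidence either way
        if (seq1[i], seq1[j]) in PAIRS and (seq2[i], seq2[j]) in PAIRS:
            preserved += 1
        else:
            broken += 1
    return preserved, broken


def posterior_structured_rna(preserved, broken, p_rna=0.8, p_bg=0.3, prior=0.5):
    """Posterior probability of the 'structured RNA' hypothesis.

    p_rna / p_bg: assumed probability that a substitution preserves pairing
    under the RNA and background models (hypothetical values).
    """
    log_rna = preserved * math.log(p_rna) + broken * math.log(1 - p_rna) + math.log(prior)
    log_bg = preserved * math.log(p_bg) + broken * math.log(1 - p_bg) + math.log(1 - prior)
    m = max(log_rna, log_bg)  # subtract the maximum for numerical stability
    w_rna, w_bg = math.exp(log_rna - m), math.exp(log_bg - m)
    return w_rna / (w_rna + w_bg)


if __name__ == "__main__":
    s1 = "GGGAAACCC"  # toy aligned sequences, not real data
    s2 = "GAAAAACUU"
    counts = count_pair_substitutions(s1, s2, [(0, 8), (1, 7), (2, 6)])
    print(counts, posterior_structured_rna(*counts))
```

Substitutions that keep a column pair complementary push the posterior toward the structured-RNA hypothesis, while substitutions that break pairing push it toward the background hypothesis.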

Neural network method

Artificial neural networks have been used to build general-purpose algorithms to predict and annotate non-coding RNA genes in the genome. As a machine learning method, artificial neural networks are suitable for systems with ill-defined or unknown rules, especially those that cannot be described by precise mathematical expressions, and are most often used for feature extraction and prediction. The artificial neural network method is also highly adaptable and can handle problems that involve large amounts of data and cannot be described symbolically.
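As a minimal sketch of the approach, the following pure-Python network with one hidden layer is trained by full-batch gradient descent on base-frequency features. The architecture, the features, the class name `TinyMLP`, and the toy sequences and labels are all assumptions for illustration; a practical predictor would use richer features and real annotated data.

```python
import math
import random


def base_freqs(seq):
    """Feature vector: relative frequency of A, C, G, T (U counted as T)."""
    seq = seq.upper().replace("U", "T")
    n = max(len(seq), 1)
    return [seq.count(b) / n for b in "ACGT"]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One tanh hidden layer and one sigmoid output (illustrative only)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        p = sigmoid(sum(w * hj for w, hj in zip(self.w2, h)) + self.b2)
        return h, p

    def predict(self, x):
        return self.forward(x)[1]

    def loss(self, xs, ys):
        """Mean binary cross-entropy."""
        eps = 1e-12
        total = 0.0
        for x, y in zip(xs, ys):
            p = self.predict(x)
            total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
        return total / len(xs)

    def train_step(self, xs, ys, lr=0.1):
        """One full-batch gradient descent step via backpropagation."""
        n = len(xs)
        g_w1 = [[0.0] * len(row) for row in self.w1]
        g_b1 = [0.0] * len(self.b1)
        g_w2 = [0.0] * len(self.w2)
        g_b2 = 0.0
        for x, y in zip(xs, ys):
            h, p = self.forward(x)
            d_out = (p - y) / n  # gradient w.r.t. the output pre-activation
            g_b2 += d_out
            for j, hj in enumerate(h):
                g_w2[j] += d_out * hj
                d_hid = d_out * self.w2[j] * (1.0 - hj * hj)  # tanh derivative
                g_b1[j] += d_hid
                for i, xi in enumerate(x):
                    g_w1[j][i] += d_hid * xi
        for j in range(len(self.w2)):
            self.w2[j] -= lr * g_w2[j]
            self.b1[j] -= lr * g_b1[j]
            for i in range(len(self.w1[j])):
                self.w1[j][i] -= lr * g_w1[j][i]
        self.b2 -= lr * g_b2


if __name__ == "__main__":
    # Made-up toy sequences and labels (1 = candidate, 0 = background).
    seqs = ["GGGCGCCC", "GCGGCCGC", "AATTATAA", "TTATAATA"]
    labels = [1, 1, 0, 0]
    xs = [base_freqs(s) for s in seqs]
    net = TinyMLP(n_in=4, n_hidden=3)
    for _ in range(300):
        net.train_step(xs, labels)
    print([round(net.predict(x), 3) for x in xs])
```

The network learns its decision rule from labeled examples rather than from explicit rules, which is the property that makes this family of methods attractive when the distinguishing features of a gene class are not known in advance.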

Because non-coding RNA genes are numerous, serve a wide range of functions, and act through multiple mechanisms, it does not seem feasible to establish a single prediction method that generalizes to all species. The prediction and annotation of non-coding RNA genes provided by Lifeasible therefore combines species-specific and universal methods. Please feel free to contact us with your specific needs so we can provide you with a customized theoretical identification solution.

The services provided by Lifeasible cover all aspects of plant research. Please contact us to find out how we can help you achieve your next research breakthrough.

*If your organization requires the signing of a confidentiality agreement, please contact us by email.

For research use only, not intended for any clinical use.