iEnhancer-ELM: Improve Enhancer Identification by Extracting Multi-scale Contextual Information based on Enhancer Language Models

12/03/2022
by   Jiahao Li, et al.
9

Motivation: Enhancers are important cis-regulatory elements that regulate a wide range of biological functions and enhance the transcription of target genes. Although many state-of-the-art computational methods have been proposed in order to efficiently identify enhancers, learning globally contextual features is still one of the challenges for computational methods. Regarding the similarities between biological sequences and natural language sentences, the novel BERT-based language techniques have been applied to extracting complex contextual features in various computational biology tasks such as protein function/structure prediction. To speed up the research on enhancer identification, it is urgent to construct a BERT-based enhancer language model. Results: In this paper, we propose a multi-scale enhancer identification method (iEnhancer-ELM) based on enhancer language models, which treat enhancer sequences as natural language sentences that are composed of k-mer nucleotides. iEnhancer-ELM can extract contextual information of multi-scale k-mers with positions from raw enhancer sequences. Benefiting from the complementary information of k-mers in multi-scale, we ensemble four iEnhancer-ELM models for improving enhancer identification. The benchmark comparisons show that our model outperforms state-of-the-art methods. By the interpretable attention mechanism, we finds 30 biological patterns, where 40 widely used motif tool (STREME) and a popular dataset (JASPAR), demonstrating our model has a potential ability to reveal the biological mechanism of enhancer. Availability: The source code are available at https://github.com/chen-bioinfo/iEnhancer-ELM Contact: junjiechen@hit.edu.cn and junjie.chen.hit@gmail.com; Supplementary information: Supplementary data are available at Bioinformatics online.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset