Journal Article

Align-gram: Rethinking the Skip-gram Model for Protein Sequence Analysis

Author Affiliations
Bangladesh University of Engineering and Technology, University of Dhaka
Published In: The Protein Journal
Year: 2023
Citations: 3

Abstract

The advent of next-generation sequencing technologies has exponentially increased the volume of biological sequence data. Protein sequences, often called the ‘language of life’, have been analyzed for a multitude of applications and inferences. Owing to the rapid development of deep learning, recent years have seen a number of breakthroughs in Natural Language Processing. Since these methods can perform different tasks when trained on a sufficient amount of data, off-the-shelf models are used for various biological applications. In this study, we investigated the applicability of the popular Skip-gram model to protein sequence analysis and attempted to incorporate some biological insights into it. We propose a novel k-mer embedding…