Browse by author "Pimenta-Zanon, Matheus Henrique"
- An Efficient Feature Extraction Method for Identifying Signatures of Viral Genomic Variants
  Publication. Souza, Felipe Bueno de; Pimenta-Zanon, Matheus Henrique; Henriques, Dora; Pinto, M. Alice; Balsa, Carlos; Rufino, José; Lopes, Fabrício Martins
  Genomic analysis is a powerful way to understand viral pathogens and their variations. However, most of the genomic analysis methods are based on sequence alignment, which has a high computational cost. This study introduces a novel methodology to extract discriminative regions from viral genomes. Using exclusive k-mers through strategically defined sliding windows, our approach identifies genomic regions with high concentrations of variant-specific signatures, showcasing high-accuracy classification while requiring modest computational resources. The data-driven and nonparametric nature of our approach enables pattern extraction without imposing predefined distributions, enhancing both analytical flexibility and result interpretability. By balancing minimal k-mer sizes with maximum discriminative power, our method achieves remarkable generalization capability even with limited training samples. The computational efficiency of the methodology alongside the biological transparency and explainability in the results makes it accessible to research environments with restricted processing capacity, potentially accelerating genomic signature discovery across diverse viral pathogens and contributing to better variant tracking and characterization, thus opening up even more possibilities in genomic analysis studies.
- Resonant recognition model as a preprocessing technique for RNA classification
  Publication. Souza, Felipe Bueno de; Pimenta-Zanon, Matheus Henrique; Henriques, Dora; Pinto, M. Alice; Balsa, Carlos; Rufino, José; Lopes, Fabrício Martins
  The development of high-throughput sequencing technologies, such as RNA-Seq, has enabled the generation of large volumes of biological data. Thus, it is necessary to develop computational methods to interpret this massive volume of data and contribute to knowledge discovery. RNA sequences are products of the transcription of genomic DNA sequences and represent the gene expression process that organisms use to synthesize protein or RNA molecules. These RNA sequences can be compared between organisms of the same or different species to identify functionally similar proteins. There are several classes of RNA sequences (mRNA, rRNA, tRNA, ncRNA, etc.), with different biological functions. The correct identification of each class of RNA sequences is important because of the huge volume of unlabelled data available. In this context, this study proposes an approach based on the Resonant Recognition Model (RRM) for feature extraction and classification of the ncRNA and mRNA classes. To assess the proposed approach, the dataset from the PLEK method was adopted. Despite the reduction of the input data size achieved using the RRM, the results show high accuracy for primary protein sequences translated from RNA sequences, signaling the potential of the proposed approach to classify RNA.
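
The first abstract in this listing mentions "exclusive k-mers" scanned through sliding windows. The sketch below is only an illustration of that general idea, not the authors' implementation: it assumes an exclusive k-mer is one that occurs in the sequences of exactly one variant, and that a window is scored by how many such k-mers start inside it. The function names (`kmers`, `exclusive_kmers`, `window_counts`), the `groups` input layout, and the toy sequences are all hypothetical.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def exclusive_kmers(groups, k):
    """Map each variant label to the k-mers seen in that variant only.

    groups: dict of label -> list of sequences (assumed input layout).
    """
    # Pool the k-mers of every sequence belonging to the same label.
    per_label = {
        label: set().union(*(kmers(s, k) for s in seqs))
        for label, seqs in groups.items()
    }
    exclusive = {}
    for label, own in per_label.items():
        # k-mers observed in any other label are not exclusive.
        others = set().union(
            *(v for other, v in per_label.items() if other != label)
        )
        exclusive[label] = own - others
    return exclusive


def window_counts(seq, signatures, k, window, step):
    """Count signature k-mers that start inside each sliding window.

    Returns a list of (window_start, count) pairs.
    """
    # hits[i] is 1 when the k-mer starting at position i is a signature.
    hits = [
        1 if seq[i:i + k] in signatures else 0
        for i in range(len(seq) - k + 1)
    ]
    last_start = max(len(hits) - window + 1, 1)
    return [
        (start, sum(hits[start:start + window]))
        for start in range(0, last_start, step)
    ]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    groups = {"A": ["ACGTACGT"], "B": ["ACGTTTTT"]}
    signatures = exclusive_kmers(groups, k=3)
    print(signatures)
    print(window_counts("ACGTTTTT", signatures["B"], k=3, window=3, step=1))
```

Windows with the highest counts would be the candidate regions with a high concentration of variant-specific k-mers. How the windows are actually defined, and how k is chosen, is not stated in the abstract, so `window`, `step`, and `k` are left as free parameters here.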
