
"Almeida H" Authored Publications:

1 Improving candidate Biosynthetic Gene Clusters in fungi through reinforcement learning
Authors:Almeida H, Tsang A, Diallo AB
PubMed ID:35762945
Dept Affiliation: CSFG

2 TOUCAN: a framework for fungal biosynthetic gene cluster discovery.
Authors:Almeida H, Palys S, Tsang A, Diallo AB
PubMed ID:33575642
Dept Affiliation: CSFG

3 Machine learning for biomedical literature triage.
Authors:Almeida H, Meurs MJ, Kosseim L, Butler G, Tsang A
PubMed ID:25551575
Dept Affiliation: CSFG

4 mycoCLAP, the database for characterized lignocellulose-active proteins of fungal origin: resource and text mining curation support.
Authors:Strasser K, McDonnell E, Nyaga C, Wu M, Wu S, Almeida H, Meurs MJ, Kosseim L, Powlowski J, Butler G, Tsang A
PubMed ID:25754864
Dept Affiliation: CSFG

 

Title:Improving candidate Biosynthetic Gene Clusters in fungi through reinforcement learning
Authors:Almeida H, Tsang A, Diallo AB
Link:pubmed.ncbi.nlm.nih.gov/35762945/
DOI:10.1093/bioinformatics/btac420
Publication:Bioinformatics (Oxford, England)
Keywords:
PMID:35762945
Category:
Date Added:2022-06-28
Dept Affiliation: CSFG
1 Département d'Informatique, UQAM, Montréal, QC, Canada.
2 Centre for Structural and Functional Genomics, Concordia University, Montréal, QC, Canada.
3 Laboratoire d'Algèbre, de Combinatoire, et d'Informatique Mathématique, UQAM, Montréal, QC, Canada.
4 Centre of Excellence in Research on Orphan Diseases-Courtois Foundation (CERMO-FC), Montréal, QC, Canada.

Description:

Motivation: Precise identification of Biosynthetic Gene Clusters (BGCs) is a challenging task. The performance of BGC discovery tools is limited by their capacity to accurately predict the components belonging to candidate BGCs, and they often overestimate cluster boundaries. To support optimizing the composition and boundaries of candidate BGCs, we propose a reinforcement learning approach relying on protein domains and functional annotations from expert-curated BGCs.

Results: The proposed reinforcement learning method aims to improve candidate BGCs obtained with state-of-the-art tools. It was evaluated on candidate BGCs obtained for two fungal genomes, Aspergillus niger and Aspergillus nidulans. The results show an improvement in gene precision of more than 15% for TOUCAN, fungiSMASH and DeepBGC, and in cluster precision of more than 25% for fungiSMASH and DeepBGC, allowing these tools to obtain almost perfect precision in cluster prediction. This can pave the way for optimizing current predictions of candidate BGCs in fungi, while minimizing the curation effort required by domain experts.
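
Illustration: the gene precision reported above can be read as the fraction of genes in a candidate BGC that also belong to the curated reference cluster. The sketch below is a minimal illustration under that assumption, not the authors' evaluation code. The function name gene_precision and the gene identifiers are hypothetical. It shows how trimming an overestimated cluster boundary raises precision.

```python
def gene_precision(predicted, reference):
    """Fraction of predicted genes that are in the reference cluster.

    Assumed definition: true positives / all predicted genes.
    """
    predicted = set(predicted)
    if not predicted:
        return 0.0
    return len(predicted & set(reference)) / len(predicted)


# Hypothetical curated cluster and two candidate predictions.
reference = {"g3", "g4", "g5", "g6"}

# Candidate with overestimated boundaries: 4 extra flanking genes.
candidate = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]

# Same candidate after removing three of the flanking genes.
trimmed = ["g3", "g4", "g5", "g6", "g7"]

print(gene_precision(candidate, reference))  # 0.5
print(gene_precision(trimmed, reference))    # 0.8
```

In this toy case the trimmed candidate keeps every reference gene while dropping false positives, so precision rises without any loss of true cluster members.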

Availability and implementation: https://github.com/bioinfoUQAM/RL-bgc-components.

Supplementary information: Supplementary data is available at Bioinformatics online.



