
"Butler G" Authored Publications:

No. | Title | Authors | PubMed ID | Dept Affiliation
1 | Ion channel classification through machine learning and protein language model embeddings | Ghazikhani H, Butler G | 39572876 | ENCS
2 | SPOT: A machine learning model that predicts specific substrates for transport proteins | Kroll A, Niebuhr N, Butler G, Lercher MJ | 39325691 | ENCS
3 | Comparative genomic analysis of thermophilic fungi reveals convergent evolutionary adaptations and gene losses | Steindorff AS, Aguilar-Pontes MV, Robinson AJ, Andreopoulos B, LaButti K, Kuo A, Mondo S, Riley R, Otillar R, Haridas S, Lipzen A, Grimwood J, Schmutz J, Clum A, Reid ID, Moisan MC, Butler G, Nguyen TTM, Dewar K, Conant G, Drula E, Henrissat B, Hansel C, Singer S, Hutchinson MI, de Vries RP, Natvig DO, Powell AJ, Tsang A, Grigoriev IV | 39266695 | CSFG
4 | Exploiting protein language models for the precise classification of ion channels and ion transporters | Ghazikhani H, Butler G | 38656743 | CSFG
5 | Enhanced identification of membrane transport proteins: a hybrid approach combining ProtBERT-BFD and convolutional neural networks | Ghazikhani H, Butler G | 37497772 | ENCS
6 | Integrative approach for detecting membrane proteins | Alballa M, Butler G | 33349234 | CSFG
7 | BENIN: Biologically enhanced network inference | Wonkap SK, Butler G | 32698722 | ENCS
8 | TooT-T: discrimination of transport proteins from non-transport proteins | Alballa M, Butler G | 32321420 | CSFG
9 | TranCEP: Predicting the substrate class of transmembrane transport proteins using compositional, evolutionary, and positional information | Alballa M, Aplop F, Butler G | 31935244 | CSFG
10 | Analytical and computational approaches to define the Aspergillus niger secretome | Tsang A, Butler G, Powlowski J, Panisko EA, Baker SE | 19618504 | BIOLOGY
11 | SnowyOwl: accurate prediction of fungal genes by using RNA-Seq and homology information to select among ab initio models | Reid I, O'Toole N, Zabaneh O, Nourzadeh R, Dahdouli M, Abdellateef M, Gordon PM, Soh J, Butler G, Sensen CW, Tsang A | 24980894 | CSFG
12 | Machine learning for biomedical literature triage | Almeida H, Meurs MJ, Kosseim L, Butler G, Tsang A | 25551575 | CSFG
13 | mycoCLAP, the database for characterized lignocellulose-active proteins of fungal origin: resource and text mining curation support | Strasser K, McDonnell E, Nyaga C, Wu M, Wu S, Almeida H, Meurs MJ, Kosseim L, Powlowski J, Butler G, Tsang A | 25754864 | CSFG
14 | An Adaptive Defect Weighted Sampling Algorithm to Design Pseudoknotted RNA Secondary Structures | Zandi K, Butler G, Kharma N | 27499762 | CSFG

 

Title: Ion channel classification through machine learning and protein language model embeddings
Authors: Ghazikhani H, Butler G
Link: https://pubmed.ncbi.nlm.nih.gov/39572876/
DOI: 10.1515/jib-2023-0047
Publication: Journal of integrative bioinformatics
Keywords: Convolutional Neural Network; drug discovery; ion channels; membrane proteins; protein language models; transmembrane proteins
PMID: 39572876
Category:
Date Added: 2024-11-22
Dept Affiliation: ENCS
1 Department of Computer Science and Software Engineering, Concordia University, Montreal, Canada.

Description:

Ion channels are critical membrane proteins that regulate ion flux across cellular membranes, influencing numerous biological functions. The resource-intensive nature of traditional wet lab experiments for ion channel identification has led to an increasing emphasis on computational techniques. This study extends our previous work on protein language models for ion channel prediction, significantly advancing the methodology and performance. We employ a comprehensive array of machine learning algorithms, including k-Nearest Neighbors, Random Forest, Support Vector Machines, and Feed-Forward Neural Networks, alongside a novel Convolutional Neural Network (CNN) approach. These methods leverage fine-tuned embeddings from ProtBERT, ProtBERT-BFD, and MembraneBERT to differentiate ion channels from non-ion channels. Our empirical findings demonstrate that TooT-BERT-CNN-C, which combines features from ProtBERT-BFD and a CNN, substantially surpasses existing benchmarks. On our original dataset, it achieves a Matthews Correlation Coefficient (MCC) of 0.8584 and an accuracy of 98.35 %. More impressively, on a newly curated, larger dataset (DS-Cv2), it attains an MCC of 0.9492 and an ROC AUC of 0.9968 on the independent test set. These results not only highlight the power of integrating protein language models with deep learning for ion channel classification but also underscore the importance of using up-to-date, comprehensive datasets in bioinformatics tasks. Our approach represents a significant advancement in computational methods for ion channel identification, with potential implications for accelerating research in ion channel biology and aiding drug discovery efforts.
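
The abstract reports both accuracy and the Matthews Correlation Coefficient (MCC). As a minimal sketch of how the two metrics are computed from a binary confusion matrix, the Python below uses the standard MCC formula. The function names and the counts are hypothetical and chosen only for illustration; they are not taken from the paper or its datasets.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from binary confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is taken as 0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical, imbalanced test set: 20 positives (ion channels)
# and 980 negatives (non-ion channels). Not data from the paper.
tp, tn, fp, fn = 15, 975, 5, 5

print(f"accuracy = {accuracy(tp, tn, fp, fn):.4f}")  # accuracy = 0.9900
print(f"MCC      = {mcc(tp, tn, fp, fn):.4f}")       # MCC      = 0.7449
```

In this made-up example the classifier misses a quarter of the positives yet still reaches 99% accuracy, while the MCC is noticeably lower. This is one common reason for reporting MCC alongside accuracy when the classes are imbalanced.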




