PERFORM Publications

Keyword search (4,163 papers available)

#	Title	Authors	PubMed ID	Dept
1	Attention-Fusion-Based Two-Stream Vision Transformer for Heart Sound Classification	Ranipa K; Zhu WP; Swamy MNS;	41155032	ENCS
2	Lung Nodule Malignancy Classification Integrating Deep and Radiomic Features in a Three-Way Attention-Based Fusion Module	Khademi S; Heidarian S; Afshar P; Mohammadi A; Sidiqi A; Nguyen ET; Ganeshan B; Oikonomou A;	41150036	ENCS
3	A novel span and syntax enhanced large language model based framework for fine-grained sentiment analysis	Zou H; Wang Y; Huang A;	40876298	ENCS
4	Deformable detection transformers for domain adaptable ultrasound localization microscopy with robustness to point spread function variations	Gharamaleki SK; Helfield B; Rivaz H;	40640235	PHYSICS
5	SAVE: Self-Attention on Visual Embedding for Zero-Shot Generic Object Counting	Zgaren A; Bouachir W; Bouguila N;	39997554	ENCS
6	Semantically-Enhanced Feature Extraction with CLIP and Transformer Networks for Driver Fatigue Detection	Gao Z; Chen X; Xu J; Yu R; Zhang H; Yang J;	39771685	ENCS
7	CosSIF: Cosine similarity-based image filtering to overcome low inter-class variation in synthetic medical image datasets	Islam M; Zunair H; Mohammed N;	38492455	ENCS
8	Enhanced identification of membrane transport proteins: a hybrid approach combining ProtBERT-BFD and convolutional neural networks	Ghazikhani H; Butler G;	37497772	ENCS

Title:	Semantically-Enhanced Feature Extraction with CLIP and Transformer Networks for Driver Fatigue Detection
Authors:	Gao Z, Chen X, Xu J, Yu R, Zhang H, Yang J
Link:	https://pubmed.ncbi.nlm.nih.gov/39771685/
DOI:	10.3390/s24247948
Publication:	Sensors (Basel, Switzerland)
Keywords:	CLIP pre-trained model; Transformer; fatigue detection; instance normalization; semantic analysis;
PMID:	39771685
Date Added:	2025-01-08
Dept Affiliation:	ENCS
Author Affiliations:
1 School of Computer Science and Technology, Tongji University, Shanghai 201804, China.
2 Department of Computer Science, City University of Hong Kong, Hong Kong 999077, China.
3 Key Laboratory of Road and Traffic Engineering of the Ministry of Education, Shanghai 201804, China.
4 College of Transportation Engineering, Tongji University, Shanghai 201804, China.
5 Zhejiang Fengxing Huiyun Technology Co., Ltd., Hangzhou 311107, China.
6 Department of Computer Science and Software Engineering, Concordia University, Montreal, QC H3G 1M8, Canada.

Description:

Drowsy driving is a leading cause of commercial vehicle traffic crashes. The current trend is to train fatigue detection models with deep neural networks on driver video data, but challenges remain: high-level feature extraction is coarse and incomplete, and network architectures still need optimization. This paper pioneers the use of the CLIP (Contrastive Language-Image Pre-training) model for fatigue detection. A Transformer architecture then extracts long-term temporal features from video sequences, enabling more accurate fatigue analysis. The proposed CT-Net (CLIP-Transformer Network) achieves an AUC (Area Under the Curve) of 0.892, a 36% accuracy improvement over the prevalent end-to-end CNN-LSTM (Convolutional Neural Network-Long Short-Term Memory) model, reaching state-of-the-art performance. Experiments show that the CLIP pre-trained model extracts facial and behavioral features from driver video frames more accurately than the ImageNet-based pre-trained model, improving the model's AUC by 7%. Moreover, compared with LSTM, the Transformer captures long-term dependencies among temporal features more flexibly, raising the model's AUC by a further 4%.
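
The results above are reported as AUC (Area Under the Curve). As a rough illustration of what that metric measures, the sketch below computes AUC as the probability that a randomly chosen positive (fatigued) clip receives a higher score than a randomly chosen negative (alert) clip, with ties counted as half. The function name `roc_auc` and the labels and scores are hypothetical and made up for illustration; they are not taken from the paper, and this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 0/1 ground-truth values (1 = fatigued, by assumption).
    scores: iterable of model scores, higher meaning "more likely fatigued".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Made-up example data (hypothetical, not from the paper).
example_labels = [1, 1, 0, 0, 1, 0]
example_scores = [0.9, 0.7, 0.6, 0.2, 0.4, 0.5]
example_auc = roc_auc(example_labels, example_scores)  # 7 of 9 pairs ordered correctly
```

An AUC of 1.0 means every positive clip outranks every negative clip, while 0.5 corresponds to scores that carry no ranking information. The pairwise loop is quadratic in the number of samples, which is fine for a small illustration; a sort-based rank computation would be the usual choice for large evaluation sets.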
