Keyword search (4,163 papers available)

"Transformer" Keyword-tagged Publications:

No.  Title | Authors | PubMed ID | Dept

1. Attention-Fusion-Based Two-Stream Vision Transformer for Heart Sound Classification
   Authors: Ranipa K; Zhu WP; Swamy MNS | PMID: 41155032 | Dept: ENCS
2. Lung Nodule Malignancy Classification Integrating Deep and Radiomic Features in a Three-Way Attention-Based Fusion Module
   Authors: Khademi S; Heidarian S; Afshar P; Mohammadi A; Sidiqi A; Nguyen ET; Ganeshan B; Oikonomou A | PMID: 41150036 | Dept: ENCS
3. A novel span and syntax enhanced large language model based framework for fine-grained sentiment analysis
   Authors: Zou H; Wang Y; Huang A | PMID: 40876298 | Dept: ENCS
4. Deformable detection transformers for domain adaptable ultrasound localization microscopy with robustness to point spread function variations
   Authors: Gharamaleki SK; Helfield B; Rivaz H | PMID: 40640235 | Dept: PHYSICS
5. SAVE: Self-Attention on Visual Embedding for Zero-Shot Generic Object Counting
   Authors: Zgaren A; Bouachir W; Bouguila N | PMID: 39997554 | Dept: ENCS
6. Semantically-Enhanced Feature Extraction with CLIP and Transformer Networks for Driver Fatigue Detection
   Authors: Gao Z; Chen X; Xu J; Yu R; Zhang H; Yang J | PMID: 39771685 | Dept: ENCS
7. CosSIF: Cosine similarity-based image filtering to overcome low inter-class variation in synthetic medical image datasets
   Authors: Islam M; Zunair H; Mohammed N | PMID: 38492455 | Dept: ENCS
8. Enhanced identification of membrane transport proteins: a hybrid approach combining ProtBERT-BFD and convolutional neural networks
   Authors: Ghazikhani H; Butler G | PMID: 37497772 | Dept: ENCS

 

Title: SAVE: Self-Attention on Visual Embedding for Zero-Shot Generic Object Counting
Authors: Zgaren A; Bouachir W; Bouguila N
Link: https://pubmed.ncbi.nlm.nih.gov/39997554/
DOI: 10.3390/jimaging11020052
Publication: Journal of Imaging
Keywords: class-agnostic; object counting; transformers; visual attention; zero-shot
PMID: 39997554 | Date Added: 2025-02-25
Dept Affiliation: ENCS
1 Concordia Institute for Information Systems Engineering (CIISE), Concordia University, Montréal, QC H3G 1M8, Canada.
2 Data Science Laboratory, University of Québec (TÉLUQ), Montréal, QC H2S 3L5, Canada.

Description:

Zero-shot counting is a subcategory of generic visual object counting, which aims to count objects of an arbitrary class in a given image. While few-shot counting relies on supplying exemplars to the model so it can count objects of the same class, zero-shot counting automates the operation for faster processing. This paper proposes a fully automated zero-shot method that outperforms both zero-shot and few-shot methods. Exploiting feature maps from a pre-trained detection-based backbone, we introduce a new Visual Embedding Module designed to generate semantic embeddings enriched with object contextual information. These embeddings are then fed to a Self-Attention Matching Module, which produces an encoded representation for the counting head. Our proposed method outperforms recent zero-shot approaches, achieving the best Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) of 8.89 and 35.83, respectively, on the FSC147 dataset. It also performs competitively against few-shot methods, advancing visual object counting for industrial applications such as tree counting and wildlife animal counting, and for medical applications such as blood cell counting.
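The core operation the abstract describes, applying self-attention over backbone-derived visual embeddings to build an encoded representation for a counting head, can be illustrated with generic scaled dot-product self-attention. This is a minimal NumPy sketch, not the authors' implementation; the function names, projection matrices, and tensor sizes here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(embeddings, w_q, w_k, w_v):
    """Scaled dot-product self-attention over visual patch embeddings.

    embeddings: (n_patches, d) array, e.g. features from a pre-trained backbone.
    w_q, w_k, w_v: (d, d_k) projection matrices (hypothetical, learned in practice).
    Returns an encoded representation of shape (n_patches, d_k).
    """
    q = embeddings @ w_q                    # queries
    k = embeddings @ w_k                    # keys
    v = embeddings @ w_v                    # values
    scores = q @ k.T / np.sqrt(k.shape[-1]) # pairwise patch similarities
    attn = softmax(scores, axis=-1)         # each row sums to 1
    return attn @ v                         # context-weighted representation

# Toy example: 16 patch embeddings of dimension 8 (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
encoded = self_attention(x, w_q, w_k, w_v)
print(encoded.shape)  # (16, 8)
```

In a counting pipeline like the one described, a regression head would then map such an encoded representation to a density map or scalar count.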





BookR developed by Sriram Narayanan
for the Concordia University School of Health
Copyright © 2011-2026