14th International Conference on Computer and Knowledge Engineering
TriMAE: Fashion visual search with Triplet Masked Auto Encoder Vision Transformer
Authors :
Lachin Zamani (1), Reza Azmi (2)
1- Department of Computer Engineering, Faculty of Engineering, Alzahra University, Tehran, Iran
2- Department of Computer Engineering, Faculty of Engineering, Alzahra University, Tehran, Iran
Keywords :
Visual Search, Triplet Network, Masked Auto Encoders Vision Transformer
Abstract :
Visual search identifies images similar to a given query image and returns results ranked by similarity. In apparel shopping, it lets users find desired items based on visual preference. Despite its potential to significantly improve user experience, visual search remains a challenging problem. The challenges include differences in minute details, the presence of multiple garments in a single image, discrepancies between user-taken and catalog images, and the inherent flexibility of clothing. Selecting robust features and improving how similarity and dissimilarity between images are learned can yield better results, and we propose a method along these lines. Convolutional Neural Networks and Vision Transformers are commonly used as the backbone of triplet neural networks for visual search, which are designed to learn the similarities and differences between images. In this research, we combine a triplet neural network with a masked auto-encoder vision transformer model and train the network with a triplet loss function to learn the similarity between images. We evaluate our method on the DeepFashion In-shop dataset, which comprises different categories of clothing images. In extensive experiments on this benchmark, our model achieves a Recall@1 of 93.2% for visual search.
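As a rough illustration of the two quantities the abstract refers to, the following sketch computes a hinge-style triplet loss and a Recall@K score over toy embeddings. It is not the authors' implementation: the Euclidean distance, the margin value of 0.2, and the helper names (`l2_distance`, `triplet_loss`, `recall_at_k`) are assumptions chosen for illustration, and the short vectors stand in for embeddings that a masked auto-encoder vision transformer backbone would produce.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The margin of 0.2 is an assumed default, not a value from the abstract.
    """
    d_pos = l2_distance(anchor, positive)
    d_neg = l2_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def recall_at_k(queries, gallery, k=1):
    """Fraction of queries whose top-k nearest gallery items share their ID.

    queries and gallery are lists of (embedding, item_id) pairs.
    """
    hits = 0
    for q_emb, q_id in queries:
        # Rank gallery items by distance to the query, nearest first.
        ranked = sorted(gallery, key=lambda g: l2_distance(q_emb, g[0]))
        if any(g_id == q_id for _, g_id in ranked[:k]):
            hits += 1
    return hits / len(queries)


# Toy data: hypothetical 2-D embeddings with item IDs.
gallery = [([0.0, 0.0], "a"), ([10.0, 10.0], "b")]
queries = [([0.1, 0.0], "a"), ([0.2, 0.1], "b")]

easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])  # negative is far away
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])  # negative is closer
r1 = recall_at_k(queries, gallery, k=1)
r2 = recall_at_k(queries, gallery, k=2)
```

In this toy setup the loss is zero when the negative is already farther from the anchor than the positive by at least the margin, and positive otherwise, which is what pushes same-item embeddings together and different-item embeddings apart during training. Recall@1 counts a query as correct only when its single nearest gallery item has the same ID.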