Comparison of Techniques for Vector Representation of Images with Neural Networks for Retail Product Retrieval Applications

Authors

  • Gustavo Ramos Lima, Graduate Program in Applied Computing (PPComp), Instituto Federal do Espírito Santo, Serra, ES, Brazil
  • Thiago Oliveira Santos, Department of Informatics, Universidade Federal do Espírito Santo, Vitória, ES, Brazil
  • Patrick Marques Ciarelli, Department of Electrical Engineering, Universidade Federal do Espírito Santo, Vitória, ES, Brazil
  • Filipe Mutz, Graduate Program in Applied Computing (PPComp), Instituto Federal do Espírito Santo; Universidade Federal do Espírito Santo, Vitória, ES, Brazil

DOI:

https://doi.org/10.14210/cotb.v14.p355-362

Abstract

Product retrieval from images has multiple applications ranging
from providing information and recommendations for customers
in supermarkets to automatic invoice generation in smart stores.
However, this task presents important challenges such as the large
number of products, the scarcity of images of items, differences
between real and iconic images of the products, and the constant
changes in the portfolio due to the addition or removal of products.
Hence, this work investigates ways of generating vector representations
of images using deep neural networks such that these
representations can be used for product retrieval even in the face of
these challenges. Experimental analysis evaluated the effect that
network architecture, data augmentation techniques and objective
functions used during training have on representation quality. The
best configuration was achieved by fine-tuning a VGG-16 model
in the task of classifying products using a mix of RandAugment
and AugMix data augmentations and a hierarchical triplet loss as a
regularization function. The representations built using this model
led to a top-1 accuracy of 80.38% and top-5 accuracy of 92.62% in
the Grocery Products dataset.
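
As an illustration of how top-k retrieval accuracy can be computed from vector representations, the following is a minimal sketch and not the authors' evaluation code. It assumes cosine similarity between query and gallery vectors and a nearest-neighbor search over a labeled gallery; the function name `top_k_accuracy` and the toy data are hypothetical, and the abstract does not state which similarity measure or protocol the paper uses.

```python
import numpy as np


def top_k_accuracy(query_vecs, query_labels, gallery_vecs, gallery_labels, k=1):
    """Fraction of queries whose true label appears among the k most
    similar gallery items (cosine similarity is an assumption here)."""
    # L2-normalize rows so that the dot product equals cosine similarity.
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    g = gallery_vecs / np.linalg.norm(gallery_vecs, axis=1, keepdims=True)
    sims = q @ g.T  # shape: (num_queries, num_gallery)
    # Indices of the k most similar gallery items for each query.
    top_k = np.argsort(-sims, axis=1)[:, :k]
    # A query is a hit if any of its k retrieved items has the true label.
    hits = (gallery_labels[top_k] == query_labels[:, None]).any(axis=1)
    return float(hits.mean())


# Toy data (hypothetical): three gallery products, three query images.
gallery_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
gallery_labels = np.array([0, 1, 2])
query_vecs = np.array([[0.9, 0.1], [0.6, 0.8], [-0.2, 1.0]])
query_labels = np.array([0, 0, 2])

top1 = top_k_accuracy(query_vecs, query_labels, gallery_vecs, gallery_labels, k=1)
top2 = top_k_accuracy(query_vecs, query_labels, gallery_vecs, gallery_labels, k=2)
print(f"top-1: {top1:.2f}, top-2: {top2:.2f}")
```

On this toy data only the first query retrieves its own product first, while all three find it within the two nearest items, so accuracy rises from top-1 to top-2 in the same way the reported figures rise from top-1 to top-5.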

Published

03-05-2023

Section

Full Papers