Towards Decrypting Attractiveness via Multi-Modality Cue
ACM Transactions on Multimedia Computing, Communications, and Applications
  • Tam Nguyen, University of Dayton
  • Si Liu, National University of Singapore
  • Bingbing Ni, Advanced Digital Sciences Center
  • Jun Tan, National University of Defense Technology
  • Yong Rui, Microsoft Research Asia
  • Shuicheng Yan, National University of Singapore
Document Type
Article
Publication Date
8-1-2013
Abstract

Decrypting the secret of beauty or attractiveness has been the pursuit of artists and philosophers for centuries. To date, computational models for attractiveness estimation have been actively explored in the computer vision and multimedia communities, yet the focus has remained mainly on facial features. In this article, we conduct a comprehensive study of female attractiveness conveyed by single or multiple modalities of cues, that is, face, dressing, and/or voice; the aim is to discover how the different modalities, individually and collectively, affect the human sense of beauty.

To investigate the problem extensively, we collect the Multi-Modality Beauty (M2B) dataset, which is annotated with attractiveness levels, converted from manual k-wise ratings, and with semantic attributes for each modality. Inspired by the common consensus that middle-level attribute prediction can assist higher-level computer vision tasks, we manually labeled a rich set of attributes for each modality. We then propose a tri-layer Dual-supervised Feature-Attribute-Task (DFAT) network that jointly learns the attribute model and the attractiveness model for single or multiple modalities.
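To make the tri-layer structure concrete, the sketch below shows one plausible wiring of a dual-supervised feature-attribute-task network: input features feed an attribute layer supervised by the manual attribute labels, and the predicted attributes feed a task layer supervised by the attractiveness levels. The PyTorch framing, the sigmoid attribute activations, and the loss weight lam are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a tri-layer feature -> attribute -> task network with
# dual supervision (losses on BOTH the attribute and task layers).
# Layer sizes, activations, and loss weighting are illustrative assumptions.
import torch
import torch.nn as nn

class DFATSketch(nn.Module):
    def __init__(self, feat_dim, num_attrs):
        super().__init__()
        self.attr_layer = nn.Linear(feat_dim, num_attrs)  # feature -> attribute
        self.task_layer = nn.Linear(num_attrs, 1)         # attribute -> attractiveness

    def forward(self, x):
        attrs = torch.sigmoid(self.attr_layer(x))  # predicted attribute scores in (0, 1)
        score = self.task_layer(attrs)             # predicted attractiveness level
        return attrs, score

def dual_supervised_loss(attrs, score, attr_labels, attract_labels, lam=0.5):
    # Attribute-level supervision on the middle layer plus task-level
    # supervision on the output layer, combined with a weighting term.
    attr_loss = nn.functional.binary_cross_entropy(attrs, attr_labels)
    task_loss = nn.functional.mse_loss(score.squeeze(-1), attract_labels)
    return task_loss + lam * attr_loss

# Example usage with random stand-in data:
model = DFATSketch(feat_dim=128, num_attrs=10)
x = torch.randn(4, 128)                               # multi-modality features
attr_labels = torch.randint(0, 2, (4, 10)).float()    # manual attribute labels
attract_labels = torch.rand(4)                        # attractiveness levels
attrs, score = model(x)
loss = dual_supervised_loss(attrs, score, attr_labels, attract_labels)
loss.backward()
```

Because the attribute layer receives its own loss, the middle-level representation stays interpretable as attribute predictions rather than collapsing into an arbitrary hidden layer.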

To remedy the possible loss of information caused by the incompleteness of the manual attributes, we further propose a novel Latent Dual-supervised Feature-Attribute-Task (LDFAT) network, in which latent attributes are combined with the manual attributes to contribute to the final attractiveness estimation. Extensive experimental evaluations on the collected M2B dataset demonstrate the effectiveness of the proposed DFAT and LDFAT networks for female attractiveness prediction.
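In the same hedged spirit, one way to realize the latent-attribute idea is to add unsupervised latent units alongside the manually labeled attribute units: only the manual units receive attribute-level supervision, while both branches feed the final attractiveness predictor, so the task loss alone shapes the latent branch. The branch sizes and activations below are assumptions for illustration, not the paper's specification.

```python
import torch
import torch.nn as nn

class LDFATSketch(nn.Module):
    # Latent attribute units complement the manual ones; no attribute labels
    # exist for the latent branch, so it is trained only through the task loss.
    def __init__(self, feat_dim, num_manual, num_latent):
        super().__init__()
        self.manual = nn.Linear(feat_dim, num_manual)   # supervised by attribute labels
        self.latent = nn.Linear(feat_dim, num_latent)   # no direct supervision
        self.task = nn.Linear(num_manual + num_latent, 1)

    def forward(self, x):
        m = torch.sigmoid(self.manual(x))               # manual attribute predictions
        z = torch.sigmoid(self.latent(x))               # latent attribute activations
        score = self.task(torch.cat([m, z], dim=-1))    # joint attractiveness estimate
        return m, score
```

Training would reuse the dual-supervised loss from the previous sketch, applying the attribute term only to the manual predictions m.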

Inclusive pages
1-20
ISBN/ISSN
1551-6857
Document Version
Postprint
Comments

The document available for download is the authors' accepted manuscript, provided in compliance with the publisher's self-archiving policy. It may differ from the published version, which is available via the link provided. Permission documentation is on file.

Publisher
Association for Computing Machinery
Peer Reviewed
Yes
Citation Information
Tam Nguyen, Si Liu, Bingbing Ni, Jun Tan, Yong Rui, and Shuicheng Yan. "Towards Decrypting Attractiveness via Multi-Modality Cue." ACM Transactions on Multimedia Computing, Communications, and Applications Vol. 9, Iss. 4 (2013).
Available at: http://works.bepress.com/tam-nguyen/5/