MuSe-Sent: Multimodal Sentiment Classification in-the-Wild (MuSe2021)
Author(s)
Stappen, L
Baird, A
Schuller, B
Type
Dataset
Abstract
MuSe-Sent is a sub-challenge of the 2nd Multimodal Sentiment Analysis in-the-Wild Challenge (MuSe 2021). The task is to predict five intensity classes for each of the emotional dimensions (valence and arousal) for segments of audio-video-text data. This package includes only the MuSe-Sent features (all partitions) and the labels of the training and development sets (test scoring via the MuSe website). More: https://www.muse-challenge.org/muse2021

General: The purpose of the Multimodal Sentiment Analysis in Real-life Media Challenge and Workshop (MuSe) is to bring together communities from different disciplines. We introduce the novel dataset MuSe-CAR, which covers a broad range of in-the-wild desiderata. MuSe-CAR is a large (>36 h), multimodal dataset gathered in-the-wild with the intention of furthering the understanding of multimodal sentiment analysis in-the-wild, e.g., the emotional engagement that takes place during product reviews (here, automobile reviews) in which a sentiment is linked to a topic or entity. We have designed MuSe-CAR to be of high voice and video quality, as informative social media video content as well as everyday recording devices have improved in recent years. This enables robust learning even with a high degree of novel, in-the-wild characteristics, for example related to:

i) Video: shot size (a mix of close-up, medium, and long shots), face angle (side, eye, low, high), camera motion (free, free but stable, free but unstable, switch, e.g., zoom, fixed), reviewer visibility (full body, half body, face only, and hands only), highly varying backgrounds, and people interacting with objects (car parts).
ii) Audio: ambient noise (car noises, music), narrator and host diarisation, diverse microphone types, and speaker locations.
iii) Text: colloquialisms and domain-specific terms.
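For orientation, the sketch below shows one plausible way to pair per-segment features with the training/development labels and fit a simple five-class baseline for one dimension (valence). The directory layout, file names, and column names (e.g., features/egemaps/train, segment_id, valence_class) are assumptions for illustration, not the package's documented structure; adapt them to the files actually shipped.

```python
# Minimal baseline sketch for the MuSe-Sent five-class task.
# NOTE: paths, file names, and column names are assumptions for illustration
# only; adjust them to the actual package layout.
import os

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score


def load_partition(feature_dir, label_csv):
    """Mean-pool per-segment feature frames and pair them with segment labels."""
    labels = pd.read_csv(label_csv)  # assumed columns: segment_id, valence_class
    X, y = [], []
    for _, row in labels.iterrows():
        feat_path = os.path.join(feature_dir, f"{row['segment_id']}.csv")
        feats = pd.read_csv(feat_path).select_dtypes("number").to_numpy()
        X.append(feats.mean(axis=0))          # one fixed-length vector per segment
        y.append(int(row["valence_class"]))   # one of the five intensity classes
    return np.vstack(X), np.asarray(y)


X_train, y_train = load_partition("features/egemaps/train", "labels/train.csv")
X_dev, y_dev = load_partition("features/egemaps/devel", "labels/devel.csv")

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
print("Dev F1 (macro):", f1_score(y_dev, clf.predict(X_dev), average="macro"))
```

Mean pooling over frame-level features is chosen here only as the simplest segment-level aggregation; sequence models or other pooling schemes can be substituted without changing the data-loading pattern.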
Version
1
Date Issued
2021-04-01
Online Publication Date
2023-10-09T13:56:27Z