Emotionless: privacy-preserving speech analysis for voice assistants
File(s)
Author(s)
Aloufi, Ranya
Haddadi, Hamed
Boyle, David
Type
Working Paper
Abstract
Voice-enabled interactions provide more human-like experiences in many
popular IoT systems. Cloud-based speech analysis services extract useful
information from voice input using speech recognition techniques. The voice
signal is a rich resource that discloses several possible states of a speaker,
such as emotional state, confidence and stress levels, physical condition, age,
gender, and personal traits. Service providers can therefore build a very
accurate profile of a user's demographic category and personal preferences,
and may compromise privacy. To address this problem, a privacy-preserving
intermediate layer between users and cloud services is proposed to sanitize
the voice input. It aims to maintain utility while preserving user privacy.
It achieves this by collecting real-time speech data and analyzing the signal
to ensure privacy protection prior to sharing the data with service providers.
Specifically, sensitive representations are extracted from the raw signal
using transformation functions and then wrapped via voice conversion
technology. Experimental evaluation based on emotion recognition, used to
assess the efficacy of the proposed method, shows that identification of the
speaker's sensitive emotional state is reduced by ~96%.
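The intermediate layer described above can be pictured as a function that intercepts the raw waveform on-device, perturbs speaker-revealing characteristics, and only then forwards the result to the cloud service. A minimal sketch follows; it is not the paper's actual method (which uses learned transformation functions and voice conversion) — a naive pitch shift via linear-interpolation resampling stands in as a placeholder transformation, and all function names are illustrative.

```python
import math

def pitch_shift(samples, factor):
    """Resample the waveform by `factor` (>1 raises pitch): a crude
    stand-in for the paper's voice-conversion step."""
    n_out = int(len(samples) / factor)
    out = []
    for i in range(n_out):
        pos = i * factor                      # fractional read position
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        # linear interpolation between neighbouring samples
        out.append((1 - frac) * samples[lo] + frac * samples[hi])
    return out

def sanitize(samples):
    """Hypothetical intermediate layer: perturb the signal before it
    leaves the device, then hand it to the cloud service."""
    return pitch_shift(samples, factor=1.25)

# Example: a 1 kHz tone sampled at 8 kHz; after sanitization its
# perceived pitch (and length) differ from the original.
tone = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(800)]
shifted = sanitize(tone)
```

In the actual system the transformation would be chosen so that linguistic content survives (utility) while paralinguistic cues such as emotion are suppressed (privacy); a fixed pitch shift is far too weak for that, and serves here only to show where the layer sits in the data path.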
Date Issued
2019-08-09
Citation
2019
Publisher
arXiv
Copyright Statement
© 2019 The Author(s)
Identifier
http://arxiv.org/abs/1908.03632v1
Subjects
cs.CR
cs.LG
cs.SD
eess.AS
stat.ML
Notes
5 pages, 4 figures; Privacy Preserving Machine Learning Workshop, CCS 2019
Publication Status
Published