Distinguishing synthetic from human voice relies on the techniques of current biometric voice recognition systems, which prevent a person's voice from being mistaken for someone else's, whether the intent is benign or malicious. Steganography makes it possible to hide a message inside a file of no particular value (usually an audio, video, or image file) in such a way that it does not raise the suspicion of any external observer. This article proposes two methods, applicable in a hypothetical VoIP scenario, which allow us to distinguish synthetic speech from a human voice and to insert a text message into the Comfort Noise generated during the pauses of a voice conversation. The first method builds on existing studies of Modulation Features, which concern the temporal analysis of speech signals, while the second proposes a technique derived from Direct Sequence Spread Spectrum, which consists in spreading the energy of the signal to be hidden over a wider transmission band.
Due to space limits, this paper is only an extended abstract. The full version will contain further details on our research.
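The spread-spectrum idea mentioned in the abstract can be illustrated with a minimal sketch. This is a generic textbook form of Direct Sequence Spread Spectrum embedding, not the authors' actual method: the comfort noise is assumed to be plain white Gaussian noise, and the function names, the key, `chips_per_bit`, and `amplitude` are all hypothetical choices made for illustration.

```python
import random


def text_to_bits(text):
    """Turn a UTF-8 string into a list of bits, most significant bit first."""
    return [(byte >> i) & 1 for byte in text.encode("utf-8") for i in range(7, -1, -1)]


def bits_to_text(bits):
    """Inverse of text_to_bits."""
    data = bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, len(bits), 8)
    )
    return data.decode("utf-8")


def pn_sequence(key, length):
    """Pseudo-noise chips (+1/-1) derived from a key shared by both parties."""
    rng = random.Random(key)
    return [1 if rng.getrandbits(1) else -1 for _ in range(length)]


def embed(noise, text, key, chips_per_bit=4096, amplitude=0.25):
    """Spread each message bit over many chips and add it to the cover noise."""
    bits = text_to_bits(text)
    needed = len(bits) * chips_per_bit
    if needed > len(noise):
        raise ValueError("cover noise too short for message")
    chips = pn_sequence(key, needed)
    stego = list(noise)
    for n in range(needed):
        symbol = 1 if bits[n // chips_per_bit] else -1
        stego[n] += amplitude * symbol * chips[n]
    return stego


def extract(stego, n_bits, key, chips_per_bit=4096):
    """Recover bits by correlating each segment with the same chip sequence."""
    chips = pn_sequence(key, n_bits * chips_per_bit)
    bits = []
    for b in range(n_bits):
        start = b * chips_per_bit
        corr = sum(stego[start + i] * chips[start + i] for i in range(chips_per_bit))
        bits.append(1 if corr > 0 else 0)
    return bits


if __name__ == "__main__":
    # Assumption: comfort noise modeled as unit-variance white Gaussian noise.
    cover_rng = random.Random(1)
    cover = [cover_rng.gauss(0.0, 1.0) for _ in range(16 * 4096)]
    stego = embed(cover, "hi", key=42)
    print(bits_to_text(extract(stego, 16, key=42)))
```

Each chip carries the hidden signal well below the noise level, and the receiver recovers a bit only by summing the correlation over many chips with the shared key. A longer chip sequence per bit allows a lower amplitude at the cost of a lower data rate.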
Publication details
2016, Digital-Forensics and Watermarking, Pages 145-159 (volume: 9569)
Synthetic speech detection and audio steganography in VoIP scenarios (04b Conference paper in volume)
Capolupo Daniele, D'Amore Fabrizio
ISBN: 978-3-319-31959-9; 978-3-319-31960-5