153 Pages
English

Very low bit rate parametric audio coding [Electronic resource] / by Heiko Purnhagen

Information

Published 01 January 2008
Language English
Document size 1 MB

Excerpt

Very Low Bit Rate
Parametric Audio Coding
Dissertation approved by the Faculty of Electrical Engineering and Computer Science
of the Gottfried Wilhelm Leibniz Universität Hannover
for the academic degree of Doktor-Ingenieur
by Dipl.-Ing. Heiko Purnhagen
born on 2 April 1969 in Bremen
2008
First examiner: Prof. Dr.-Ing. H. G. Musmann
Second examiner: Prof. Dr.-Ing. U. Zölzer
Date of doctoral examination: 28 November 2008
Acknowledgments
This thesis originates from the work I did as a member of the research staff at the Information Technology Laboratory of the University of Hannover.
First of all, I would like to thank my supervisor, Professor Musmann, for the opportunity to work in the inspiring environment of his institute and the Information Technology Laboratory, for enabling me to participate in the MPEG standardization activities, for the freedom he gave me to pursue my own ideas, and for everything I learned during these years. I would also like to thank Professor Zölzer and Professor Ostermann for being on my committee.
I’m very grateful for the fruitful interactions I had with my colleagues and students. In particular, I would like to thank Bernd Edler, Nikolaus Meine, Charalampos Ferekidis, Andree Buschmann, and Gabriel Gaus, who all contributed, in their own way, to the success of this work. I would also like to thank Frank Feige and Bernhard Feiten (Deutsche Telekom, Berlin) and Torsten Mlasko (Bosch, Hildesheim) for good cooperation and research funding. My work was closely related to the MPEG-4 standardization activities, and I would like to thank Schuyler Quackenbush, Masayuki Nishiguchi, and Jürgen Herre for their support in MPEG.
Many thanks go to Lars “Stockis” Liljeryd, Martin Dietz, Lars Gillner, and all my colleagues at Coding Technologies (now Dolby) for their patience, confidence, and support during the time I needed to finalize this thesis.
Furthermore, I would also like to thank all those people who continue to create and to craft sounds that help me believe that there are still audio signals around that are worthwhile to deal with. These are people like Joachim Deicke, Paul E. Pop, Bugge Wesseltoft, Sofia Jernberg, Fredrik Ljungkvist, Joel Grip, Paal Nilssen-Love, and many other musicians and artists.
Last but not least, I would like to thank my parents and friends for all their support.
Stockholm, December 2008
There is one way to understand another culture. To live it. To move into it, to ask to be tolerated as a guest, to learn the language. At some point, understanding may then come. It will then always be wordless. The moment one grasps what is foreign, one loses the urge to explain it. To explain a phenomenon is to distance oneself from it.
Peter Høeg: Frøken Smillas fornemmelse for sne (1992)
Summary (Kurzfassung)
This thesis presents a parametric audio coding method for very low bit rates. It is based on a generalized approach that unites different source models in a hybrid model and thereby enables the flexible use of a broad range of source and perceptual models. The parametric audio coding method developed here allows efficient coding of arbitrary audio signals at bit rates in the range of about 6 to 16 kbit/s.
Using a hybrid source model requires that the audio signal be decomposed into components, each of which can be adequately reproduced by one of the available source models. Each component is described by a set of model parameters of its source model. The parameters of all components are quantized and coded and then transmitted as a bit stream from the encoder to the decoder. In the decoder, the component signals are synthesized again according to the transmitted parameters and then combined to obtain the output signal.
The hybrid source model developed here combines sinusoids, harmonic tones, and noise components and has an extension for describing fast signal transients. The encoder uses robust algorithms for the automatic decomposition of the input signal into components and for estimating the model parameters of these components. A perceptual model in the encoder controls the signal decomposition and selects the perceptually most important components for transmission. Special coding techniques exploit the statistical dependencies and properties of the quantized parameters for efficient transmission.
The parametric approach makes it possible to extend the coding method with additional functions. The signal synthesis in the decoder allows playback speed and pitch to be changed independently of each other. Bit rate scalability is achieved by transmitting the most important components in a base bit stream and further components in enhancement bit streams. Robustness for error-prone transmission channels is achieved through unequal error protection and techniques for minimizing error propagation and for error concealment.
The resulting coding method was standardized as the Harmonic and Individual Lines plus Noise (HILN) parametric audio coder in the international MPEG-4 Audio standard. Listening tests show that HILN at 6 and 16 kbit/s achieves an audio quality comparable to that of established transform-based audio coders.
Keywords: parametric audio coding, signal decomposition, parameter estimation, source model, perceptual model, MPEG-4 HILN
Abstract
In this thesis, a parametric audio coding system for very low bit rates is presented. It is based on a generalized framework that combines different source models into a hybrid model and thereby permits flexible utilization of a broad range of source and perceptual models. The developed parametric audio coding system allows efficient coding of arbitrary audio signals at bit rates in the range of approximately 6 to 16 kbit/s.
The use of a hybrid source model requires that the audio signal be decomposed into a set of components, each of which can be adequately modeled by one of the available source models. Each component is described by a set of model parameters of its source model. The parameters of all components are quantized and coded and then conveyed as a bit stream from the encoder to the decoder. In the decoder, the component signals are resynthesized according to the transmitted parameters. By combining these signals, the output signal of the parametric audio coding system is obtained.
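
The parameter-based transmission and resynthesis described in this paragraph can be illustrated with a minimal Python sketch. It assumes that every component is a stationary sinusoid described by frequency, amplitude, and phase, and that parameters are quantized with a uniform step size. The function names, the dictionary layout, and the step sizes are illustrative assumptions and do not reflect the HILN bit stream format or the quantizers used in the thesis.

```python
import math

def quantize(value, step):
    """Uniform scalar quantizer (illustrative assumption): snap value to a grid of width step."""
    return round(value / step) * step

def synthesize_component(params, num_samples, sample_rate):
    """Resynthesize one sinusoidal component from its (hypothetical) parameter set."""
    w = 2.0 * math.pi * params["freq_hz"] / sample_rate  # angular frequency per sample
    return [params["amplitude"] * math.sin(w * n + params["phase"])
            for n in range(num_samples)]

def decode_frame(components, num_samples, sample_rate):
    """Combine all resynthesized component signals into the output frame."""
    output = [0.0] * num_samples
    for params in components:
        for n, v in enumerate(synthesize_component(params, num_samples, sample_rate)):
            output[n] += v
    return output

# Toy encoder side: quantize the estimated parameters before "transmission".
estimated = [{"freq_hz": 440.3, "amplitude": 0.52, "phase": 0.0},
             {"freq_hz": 1000.2, "amplitude": 0.26, "phase": math.pi / 2}]
transmitted = [{"freq_hz": quantize(c["freq_hz"], 1.0),
                "amplitude": quantize(c["amplitude"], 0.05),
                "phase": c["phase"]} for c in estimated]

# Toy decoder side: resynthesize and sum the components.
frame = decode_frame(transmitted, num_samples=160, sample_rate=8000.0)
```

The decoder never sees the waveform, only the parameter sets, which is why the bit rate depends on the number and precision of parameters and not on the signal bandwidth.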
The hybrid source model developed here combines sinusoidal trajectories, harmonic tones, and noise components and includes an extension to support fast signal transients. The encoder employs robust algorithms for the automatic decomposition of the input signal into components and for the estimation of the model parameters of these components. A perceptual model in the encoder guides signal decomposition and selects the perceptually most relevant components for transmission. Advanced coding schemes exploit the statistical dependencies and properties of the quantized parameters for efficient transmission.
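
The idea of selecting the perceptually most relevant components for transmission can be sketched as a simple greedy selection under a bit budget. This is only an illustration: the `relevance` scores, the per-component `bits` costs, and the greedy strategy itself are hypothetical stand-ins, not the perceptual model or the bit allocation of the thesis.

```python
def select_components(candidates, bit_budget):
    """Greedily keep the most relevant components that still fit into the bit budget."""
    selected = []
    bits_used = 0
    # Visit candidates from most to least relevant (relevance is a hypothetical score).
    for comp in sorted(candidates, key=lambda c: c["relevance"], reverse=True):
        if bits_used + comp["bits"] <= bit_budget:
            selected.append(comp)
            bits_used += comp["bits"]
    return selected, bits_used

# Hypothetical candidate components for one frame.
candidates = [
    {"name": "sinusoid_a", "relevance": 0.9, "bits": 20},
    {"name": "noise", "relevance": 0.4, "bits": 30},
    {"name": "harmonic_tone", "relevance": 0.8, "bits": 45},
    {"name": "sinusoid_b", "relevance": 0.2, "bits": 15},
]
selected, bits_used = select_components(candidates, bit_budget=80)
```

In this toy frame the noise component is skipped because it would exceed the budget, while the cheaper but less relevant second sinusoid still fits.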
The parametric approach facilitates extensions of the coding system that provide additional functionalities. Independent time scaling and pitch shifting are supported by the signal synthesis in the decoder. Bit rate scalability is achieved by transmitting the perceptually most important components in a base layer bit stream and further components in one or more enhancement layers. Error robustness for operation over error-prone transmission channels is achieved by unequal error protection and by techniques to minimize error propagation and to provide error concealment.
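
Why a parametric decoder can change playback speed and pitch independently can be seen in a small Python sketch, assuming a single stationary sinusoid: the number of synthesized samples controls the duration, while the synthesis frequency controls the pitch. The function `synthesize` and its `time_scale` and `pitch_scale` arguments are hypothetical names chosen for this illustration, not the decoder interface of HILN.

```python
import math

def synthesize(freq_hz, amplitude, duration_s, sample_rate,
               time_scale=1.0, pitch_scale=1.0):
    """Synthesize a sinusoid whose duration and pitch are scaled independently."""
    # Time scaling only changes how many samples are generated.
    num_samples = round(duration_s * time_scale * sample_rate)
    # Pitch shifting only changes the frequency used for synthesis.
    w = 2.0 * math.pi * freq_hz * pitch_scale / sample_rate
    return [amplitude * math.sin(w * n) for n in range(num_samples)]

original = synthesize(500.0, 1.0, 0.01, 8000.0)
slower = synthesize(500.0, 1.0, 0.01, 8000.0, time_scale=2.0)   # twice as long, same pitch
higher = synthesize(500.0, 1.0, 0.01, 8000.0, pitch_scale=2.0)  # same length, one octave up
```

A waveform-based coder would need additional signal processing to separate the two effects, whereas here each one maps to a different synthesis parameter.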
The resulting coding system was standardized as the Harmonic and Individual Lines plus Noise (HILN) parametric audio coder in the international MPEG-4 Audio standard. Listening tests show that HILN achieves an audio quality comparable to that of established transform-based audio coders at 6 and 16 kbit/s.
Keywords: parametric audio coding, signal decomposition, parameter estimation, source model, perceptual model, MPEG-4 HILN
Contents
1 Introduction
2 Fundamentals of Parametric Audio Coding
2.1 Parametric Representations of Audio Signals
2.2 Generalized Framework for Parametric Audio Coding
3 Signal Analysis by Decomposition and Parameter Estimation
3.1 Design of a Hybrid Source Model for Very Low Bit Rate Audio Coding
3.1.1 Modeling of Sinusoidal Trajectories
3.1.2 Modeling of Harmonic Tones
3.1.3 Modeling of Transient Components
3.1.4 Modeling of Noise
3.2 Parameter Estimation for Single Signal Components
3.2.1 Estimation of Sinusoidal Trajectory Parameters
3.2.2 Building Sinusoidal Trajectories
3.2.3 Estimation of Harmonic Tone Parameters
3.2.4 Estimation of Transient Component Parameters
3.2.5 Estimation of Noise Parameters
3.3 Signal Decomposition and Component Selection
3.3.1 Signal Decomposition for Hybrid Models
3.3.2 Perception-Based Decomposition and Component Selection
3.4 Constrained Signal Decomposition and Parameter Estimation
3.4.1 Rate Distortion Optimization
3.4.2 Encoder Implementation Constraints
3.4.3 Complexity of HILN Encoder Implementations
4 Parameter Coding and Signal Synthesis
4.1 Parameter Encoding and Bit Allocation
4.1.1 Quantization of Model Parameters
4.1.2 Entropy Coding of Model Parameters
4.1.3 Joint Coding of a Set of Model Parameters by Subdivision Coding
4.1.4 Bit Allocation in HILN Encoder
4.2 Parameter Decoding and Signal Synthesis
4.2.1 Decoding of Model Parameters
4.2.2 Synthesis Techniques and Implementation Aspects
4.2.3 Complexity of HILN Decoder Implementations
4.2.4 Example of HILN Coding and Signal Reconstruction
4.3 Extensions for Additional Functionalities
4.3.1 Time-Scaling and Pitch-Shifting
4.3.2 Bit Rate Scalability
4.3.3 Error Robustness
5 MPEG-4 HILN Experimental Results
5.1 MPEG-4 Audio and HILN Parametric Audio Coding
5.2 Assessment of Performance of HILN
5.2.1 Results from MPEG-4 Core Experiment Phase
5.2.2 Results of the Verification Listening Test
6 Conclusions
A Listening Test Items
B Subdivision Coding Algorithm
Bibliography