Joint Source-Channel Coding With Time-Varying Channel and Side-Information

Transmission of a Gaussian source over a time-varying Gaussian channel is studied in the presence of time-varying correlated side information at the receiver. A block fading model is considered for both the channel and the side information, whose states are assumed to be known only at the receiver. The optimality of separate source and channel coding in terms of average end-to-end distortion is shown when the channel is static, while the side information state follows a discrete or a continuous and quasiconcave distribution. When both the channel and side information states are time-varying, separate source and channel coding is suboptimal in general. A partially informed encoder lower bound is studied by providing the channel state information to the encoder. Several achievable transmission schemes are proposed based on uncoded transmission, separate source and channel coding, joint decoding, and hybrid digital-analog transmission. Uncoded transmission is shown to be optimal for a class of continuous and quasiconcave side information state distributions, while the channel gain may have an arbitrary distribution. To the best of our knowledge, this is the first example in which uncoded transmission achieves the optimal performance thanks to the time-varying nature of the states, while it is suboptimal in the static version of the same problem. Then, the optimal distortion exponent, which quantifies the exponential decay rate of the expected distortion in the high SNR regime, is characterized for Nakagami distributed channel and side information states, and it is shown to be achieved by hybrid digital-analog and joint decoding schemes in certain cases, illustrating the suboptimality of pure digital or analog transmission in general.


I. INTRODUCTION
Many common applications in wireless networks, such as multimedia signal (voice or video) transmission over cellular networks, or the accumulation of local sensor measurements at a fusion center, require the transmission of a continuous amplitude source signal over a fading channel, to be reconstructed with the minimum possible distortion at the destination. Depending on the application layer requirements, additional delay constraints might be imposed on the system. For example, in video streaming or voice transmission, the source signal has to be reconstructed at the lowest distortion within a certain deadline. In many practical scenarios, in addition to the signal received from the transmitter, the destination might have access to additional side information correlated with the source signal. This correlated side information might be obtained either from other transmitters in the network, or through the destination terminal's own sensing devices. While current transmission schemes do not exploit this extra information in general, the theoretical benefits of having correlated side information are well known [4].
Examples of such scenarios are measurements from other sensors at a fusion center, signals from repeaters in digital TV broadcasting, or relay signals in future mobile networks.
We model this important practical communication scenario as a joint source-channel coding problem of transmitting a Gaussian source over a time-varying Gaussian channel with the minimum average end-to-end distortion in the presence of time-varying correlated side information at the receiver. We consider a block fading model for the states of both the channel and the side information, and these states are assumed to be known perfectly at the receiver.
When both the channel and the side information are static, Shannon's separation theorem applies [5], and the optimal performance is achieved by separate source and channel coding; that is, the concatenation of an optimal Wyner-Ziv source code [4], which exploits the side information available at the decoder, with an optimal capacity-achieving channel code. However, in delay-limited transmission, if the channel and the side information are time-varying, and the channel state information (CSI) is available only at the receiver, the transmitter cannot use the optimal source and channel codes without being prone to outages, and the separation theorem fails. In order to achieve a good performance on average, the transmitter has to adapt to the time-varying nature of both the channel and the side information without knowing their realizations.
Strategies based on separate source and channel coding suffer from the threshold effect and do not adapt well to the uncertainties of the channel [6]. On the other hand, uncoded (analog) transmission is a simple joint source-channel coding scheme that is robust to signal-to-noise ratio (SNR) mismatch, and does not suffer from the threshold effect.
In Gaussian point-to-point channels, uncoded transmission is an alternative optimal scheme in the absence of side information [7], [8]. However, it becomes suboptimal in the presence of correlated side information. In [9], a hybrid digital-analog scheme, called HDA, is proposed and shown to be robust to SNR mismatch; unlike uncoded transmission, HDA is optimal even in the presence of side information at the receiver, or known interference in the channel. HDA is also shown to outperform separate source and channel coding and uncoded transmission in certain static setups, such as the transmission of a Gaussian source in the presence of correlated interference [10], [11], and to achieve the optimal distortion in the transmission of a bivariate Gaussian source over a broadcast channel [12]. In addition to HDA and various other hybrid digital-analog transmission schemes, pure digital joint source-channel coding, based on joint decoding of the channel and source codewords, has also been shown to exhibit improved robustness to the threshold effect, and to achieve the optimal performance in certain broadcasting scenarios [13]-[15].
The characterization of the optimal expected distortion for the proposed model in the absence of time-varying side information has received a lot of interest in recent years. Despite the ongoing efforts, the optimal performance remains an open problem. The expected distortion in this model has been studied using multi-layer source codes concatenated with time-division [16] and superposition [17], [18] coding schemes. More conclusive results on this problem have been obtained by focusing on the high signal-to-noise ratio (SNR) behavior of the expected distortion. The SNR exponent of the expected distortion, called the distortion exponent, is characterized in the multi-antenna setup in certain regimes in [19], [20] and [21], where it is shown that multi-layer source and channel codes, or hybrid digital-analog coding schemes, are needed to achieve the optimal distortion exponent.
The pure source coding version of our problem, in which the channel is considered to be an error-free constant-rate link, is studied in [22], where it is shown that, contrary to the channel coding problem, when the side information follows a continuous quasiconcave fading distribution, a single-layer source code suffices to achieve the optimal performance. Recently, the joint source-channel coding problem has also been considered in [23] and [24]. In [23], the distortion exponent of separate source and channel coding is derived when the side information sequence has two states, the average side information gain does not increase with the SNR, and the channel follows Rayleigh fading. In [24], HDA and joint decoding schemes are considered, and their performance is studied using the distortion loss, which quantifies the loss with respect to a fully informed encoder that perfectly knows the channel and side information states.
In this paper, we consider the joint source-channel coding problem in both the finite and high SNR regimes. We first consider two lower bounds on the expected distortion, obtained by providing the encoder with different channel and side information state information. We then study achievable schemes based on uncoded transmission, separate source and channel coding, joint decoding, as well as hybrid digital-analog transmission, and compare the performance of these schemes with the lower bounds.
The main contributions of the paper are the following: • We prove the optimality of separate source and channel coding when the channel is static and the side information state has a discrete or a continuous quasiconcave gain distribution. Remarkably, most common distributions used to model wireless communication channels, e.g., Rayleigh, Rician, Nakagami, have continuous and quasiconcave gain distributions.
• When both the channel and the side information are time-varying, and the side information gain distribution is discrete or continuous quasiconcave, we derive a lower bound on the expected distortion called the partially informed encoder lower bound, by providing only the current channel state to the encoder while the side information state remains unknown.
• We show that uncoded transmission meets this lower bound when the side information fading state belongs to a certain class of continuous quasiconcave distributions, while separate source and channel coding is suboptimal. This class includes monotonically decreasing pdfs, which arise, for example, under Rayleigh fading. To the best of our knowledge, this is the first result showing the optimality of uncoded transmission in a fading channel scenario in which it would be suboptimal in the static case.
• We propose achievable schemes based on separate source and channel coding (SSCC), joint decoding (JDS) and hybrid digital-analog transmission with a superposed analog layer (SHDA). We show that JDS always outperforms SSCC, and numerically show that SHDA performs very close to the partially informed encoder lower bound, although in general no particular scheme outperforms the others.
• We obtain the distortion exponents corresponding to the proposed upper and lower bounds for Nakagami distributed channel and side information states. We parameterize the uncertainty by the shape parameter, given by $L_c$ for the channel and by $L_s$ for the side information. For $L_c \ge 1$, we characterize the optimal distortion exponent and show that it is achieved by SHDA, in line with the numerical results. For $L_c < 1$, we show that JDS achieves the optimal distortion exponent in certain regimes, while SHDA is suboptimal. However, as $L_s$ increases, the performance of JDS saturates and becomes worse than that of SHDA, whose distortion exponent converges to the upper bound.
We will use the following notation in the rest of the paper. We denote random variables with upper-case letters, e.g., X, their realizations with lower-case letters, e.g., x, and sets with calligraphic letters, e.g., $\mathcal{A}$. We denote by $\mathrm{E}_X[\cdot]$ the expectation with respect to X, and by $\mathrm{E}_{\mathcal{A}}[\cdot]$ the expectation over the set $\mathcal{A}$. We denote by $\mathbb{R}_+^n$ the set of vectors in $\mathbb{R}^n$ with non-negative entries, and by $\mathbb{R}_{++}^n$ the set of vectors in $\mathbb{R}^n$ with strictly positive entries, respectively. We define $(x)^+ \triangleq \max\{0, x\}$. Given two functions $f(x)$ and $g(x)$, we use $f(x) \doteq g(x)$ to denote the exponential equality
$$\lim_{x \to \infty} \frac{\log f(x)}{\log x} = \lim_{x \to \infty} \frac{\log g(x)}{\log x};$$
the exponential inequalities $\,\dot{\le}\,$ and $\,\dot{\ge}\,$ are defined similarly.
The rest of the paper is organized as follows: in Section II we introduce the system model.In Section III we provide some previous results and characterize the optimal performance for a static channel; while in Section IV, we propose upper and lower bounds on the performance.In Section V we prove the optimality of uncoded transmission under certain side information fading distributions.In Section VI we provide numerical results for the finite SNR regime, while in Section VII we consider a high SNR analysis and characterize the optimal distortion exponent.
Finally, in Section VIII we provide the conclusions.

II. SYSTEM MODEL
We consider the transmission of a random source sequence $X^n$ with independent and identically distributed (i.i.d.) entries drawn from a zero-mean, unit-variance real Gaussian distribution, i.e., $X_i \sim \mathcal{N}(0, 1)$, over a time-varying channel (see Fig. 1). An encoder $f^n: \mathbb{R}^n \to \mathbb{R}^n$ maps the source sequence $X^n$ to the channel input $U^n \in \mathbb{R}^n$, i.e., $u^n = f^n(x^n)$, while satisfying an average power constraint
$$\frac{1}{n} \sum_{i=1}^{n} \mathrm{E}[U_i^2] \le 1.$$
The block-fading channel is given by
$$V^n = H_c U^n + N^n,$$
where $H_c \in \mathbb{R}$ is the channel fading state with probability density function (pdf) $p_{H_c}(h_c)$, and $N^n$ is additive white Gaussian noise with $N_i \sim \mathcal{N}(0, 1)$.
In addition, there is an orthogonal block-fading side information channel connecting the source to the destination, which provides an uncoded noisy version of the source sequence to the destination. This second channel models the time-varying correlated side information at the destination. Similarly to the communication channel, we model this side information channel as a memoryless block-fading channel given by
$$Y^n = \Gamma_c X^n + Z^n,$$
where $\Gamma_c \in \mathbb{R}$ is the side information fading state with pdf $p_{\Gamma_c}(\gamma_c)$, $X^n$ is the uncoded channel input, and $Z^n$ is additive white Gaussian noise, i.e., $Z_i \sim \mathcal{N}(0, 1)$, $i = 1, \ldots, n$.
We define $H \triangleq H_c^2 \in \mathbb{R}_+$ and $\Gamma \triangleq \Gamma_c^2 \in \mathbb{R}_+$ as the instantaneous channel gain and the instantaneous side information gain, with pdfs $p_H(h)$ and $p_\Gamma(\gamma)$, respectively.
We assume a stringent delay constraint that imposes each source block of n source samples to be transmitted over one block of the channel, consisting of n channel uses. Both the channel and side information states, $H_c$ and $\Gamma_c$, are assumed to be constant, with values $h_c$ and $\gamma_c$, respectively, for the duration of one channel block, and independent among different blocks. The channel and side information state realizations $h_c$ and $\gamma_c$ are assumed to be known at the receiver, while the encoder is only aware of their distributions.
The decoder reconstructs the source sequence from the channel output $V^n$, the side information sequence $Y^n$, and the channel and side information states $h_c$ and $\gamma_c$, using a mapping $g^n: \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^2 \to \mathbb{R}^n$, i.e., $\hat{X}^n = g^n(V^n, Y^n, h_c, \gamma_c)$. For given channel and side information distributions, we are interested in characterizing the minimum expected distortion, $\mathrm{E}[D]$, where the quadratic distortion between the source sequence and its reconstruction is given by
$$d(X^n, \hat{X}^n) \triangleq \frac{1}{n} \sum_{i=1}^{n} (X_i - \hat{X}_i)^2.$$
The expectation is taken with respect to the source, the channel and side information states, and the noise distributions.
The minimum expected distortion can be expressed as
$$ED^* \triangleq \inf_{f^n, g^n} \mathrm{E}\left[d(X^n, \hat{X}^n)\right].$$

III. PRELIMINARY RESULTS
We first review some of the existing results in the literature for the source coding version of the problem under consideration, in which the fading channel is substituted by an error-free channel of finite capacity. We then focus on the scenario in which the channel is noisy but static, i.e., the channel gain is constant and known both at the encoder and the decoder. We show that separate source and channel coding is optimal in the case of a static channel.

A. Background: Lossy Source Coding with Fading Side Information
The source coding version of this problem, in which the fading channel is substituted by an error-free channel of rate R and a time-varying side information sequence $Y^n$ is available at the destination, is considered in [22]. Here we briefly review the results of [22] which will be used later in the paper.
Let the distribution $p_\Gamma(\gamma)$ be discrete with $M$ states $\gamma_1 < \gamma_2 < \cdots < \gamma_M$, such that $p_i \triangleq \Pr\{\Gamma = \gamma_i\}$, and define $Y_i^n$ as the side information sequence available at the decoder when the realization of the side information fading state is $\gamma_i$.¹ Note that the side information has a degraded structure, characterized by the Markov chain $X - Y_M^n - \cdots - Y_1^n$. This is equivalent to the Heegard-Berger source coding problem with degraded side information [25], in which an encoder is connected by an error-free channel of rate R to $M$ receivers, and receiver $i$ has access to side information $Y_i^n$.

¹To avoid confusion in the indexing, we use $Y_i^n$ to denote all the elements $Y_{i,j}$, $j = 1, \ldots, n$, for the $i$-th side information state.
The minimum expected distortion is given by the solution to the following problem:
$$ED_F(R) = \min_{\mathbf{D}} \; \sum_{i=1}^{M} p_i D_i \quad \text{s.t.} \quad R_{HB}(\mathbf{D}) \le R,$$
where $\mathbf{p} \triangleq [p_1, \ldots, p_M]$, $\mathbf{D} \triangleq [D_1, \ldots, D_M]$, with $D_i$ defined as the achievable distortion at receiver $i$, and $R_{HB}(\mathbf{D})$ is the Heegard-Berger rate-distortion function, given by
$$R_{HB}(\mathbf{D}) = \min_{W_1^M \in \mathcal{P}(\mathbf{D})} \sum_{i=1}^{M} I(X; W_i \mid Y_i, W_1^{i-1}),$$
where $W_1^i$ denotes the auxiliary random variables $W_1, \ldots, W_i$, and $\mathcal{P}(\mathbf{D})$ is the set of random variables $W_1^M$ satisfying the Markov chain condition $W_1^M - X - (Y_1, \ldots, Y_M)$, for which there exist source reconstructions $\hat{X}_i(Y_i, W_1^i)$ satisfying $\mathrm{E}[d_i(X, \hat{X}_i)] \le D_i$, $i = 1, \ldots, M$. When the source $X^n$ is Gaussian, it can be shown that the optimal auxiliary random variables $W_1^M$ minimizing (6) are jointly Gaussian. Then, the minimum expected distortion for a Gaussian source with a finite number of side information states, $ED_F(R)$, can be found by solving the convex optimization problem in [22, Eqs. (59)-(62)], where $D_0 \triangleq \sigma_x^2 = 1$ and $\gamma_0 \triangleq 0$. The Heegard-Berger rate-distortion function also extends to a set of infinitely many degraded fading states [22]. For a countable number of states, the expected distortion, $ED_C(R)$, is given in [22, Eqs. (75)-(78)]. When the side information distribution $p_\Gamma(\gamma)$ is continuous and quasiconcave, the optimal expected distortion is achieved by a single-layer rate allocation, such that all the available rate R is targeted to a single side information state $\tilde{\gamma}$ [22]. Then, the optimal expected distortion is given by
$$ED_Q(R) = \int_0^{\tilde{\gamma}} \frac{p_\Gamma(\gamma)}{1 + \gamma}\, d\gamma + \int_{\tilde{\gamma}}^{\infty} \frac{p_\Gamma(\gamma)}{1 + \gamma + (2^{2R} - 1)(1 + \tilde{\gamma})}\, d\gamma,$$
where the minimizing $\tilde{\gamma}$ is determined as follows. Let a super-level set be defined as $[\gamma_l(\alpha), \gamma_r(\alpha)] \triangleq \{\gamma \,|\, p_\Gamma(\gamma) \ge \alpha\}$. Then, $\tilde{\gamma}$ is defined as the left endpoint of the super-level set induced by $\alpha^*$, i.e., $\tilde{\gamma} = \gamma_l(\alpha^*)$, where $\alpha^* \in [0, \max_\gamma p_\Gamma(\gamma)]$ is found by solving equation (11) (see [22]). When the side information state is Rayleigh distributed, the side information gain $\Gamma$ is exponentially distributed.
Then it can be seen that $\tilde{\gamma} = 0$, and the optimal expected distortion becomes
$$ED_Q(R) = e^{2^{2R}} E_1\!\left(2^{2R}\right),$$
where $E_1(x) \triangleq \int_x^{\infty} \frac{e^{-t}}{t}\, dt$ is the exponential integral [22]. Results in our paper are valid for a discrete, i.e., finite or countable, number of states $\gamma_i$, as well as for continuous quasiconcave distributions of the side information gain. To unify these results, we define the function $ED_s^*(R)$ as the minimum expected distortion in the source coding problem for these three setups.
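As a numerical illustration of the Rayleigh case above, the following sketch (the function name is ours, and a unit-mean exponential gain is assumed) evaluates $e^a E_1(a)$ with $a = 2^{2R}$ by trapezoidal integration of $\int_0^\infty e^{-\gamma}/(a+\gamma)\,d\gamma$:

```python
import math

def ed_source_rayleigh(R, step=1e-3, gamma_max=60.0):
    """Expected distortion E[(2^{2R} + Gamma)^{-1}] for a unit-mean
    exponential side information gain Gamma, i.e., e^a * E1(a) with
    a = 2^{2R}, evaluated by trapezoidal integration (the tail beyond
    gamma_max is negligible, of order e^{-gamma_max})."""
    a = 2.0 ** (2.0 * R)
    n = int(gamma_max / step)
    total = 0.0
    for k in range(n):
        g0, g1 = k * step, (k + 1) * step
        f0 = math.exp(-g0) / (a + g0)
        f1 = math.exp(-g1) / (a + g1)
        total += 0.5 * (f0 + f1) * step
    return total

print(ed_source_rayleigh(0.0))  # ≈ e * E1(1) ≈ 0.5963
print(ed_source_rayleigh(1.0))
```

For $R = 0$ this returns $e\,E_1(1) \approx 0.596$, and the value decreases monotonically in the rate, as expected.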

B. Static Channel and Fading Side Information
In this section we consider a static channel and prove the optimality of separate source and channel coding in this setting. We consider a channel from $U^n$ to $V^n$, not necessarily the fading Gaussian channel characterized in (1), of fixed capacity C. The side information is still block-fading as in (2), with the side information gain following a distribution $p_\Gamma(\gamma)$. Note that this is a joint source-channel coding generalization of the source coding problem reviewed in Section III-A. We denote the minimum expected distortion in the case of a static channel by $ED^*_{sta}$. Optimality of separate source and channel coding can be proven when $\Gamma$, the side information gain, has a finite or countable number of states, or when it has a continuous quasiconcave distribution. This reduces the problem to the source coding problem of Section III-A with $R = C$.

Theorem 1. Assume that the channel is static with capacity C. When the side information gain $\Gamma$ has a discrete number of states, or a continuous quasiconcave pdf $p_\Gamma(\gamma)$, the minimum expected distortion, $ED^*_{sta}$, is achieved by separate source and channel coding, and is given by
$$ED^*_{sta} = ED_s^*(C).$$
Proof: The theorem is first proven when $\Gamma$ has a discrete distribution. Then, to show the optimality of separation when $p_\Gamma(\gamma)$ is continuous and quasiconcave, we construct a lower bound on the expected distortion $ED^*_{sta}$ by discretizing the continuum of analog side information states, and show that this bound is achievable in the limit of finer discretizations. See Appendix I for details.

IV. UPPER AND LOWER BOUNDS
In this section we return to the problem presented in Section II, in which both the channel and the side information are block-fading. We construct two lower bounds on $ED^*$. The first one is obtained by informing the encoder of both the channel and side information states $H$ and $\Gamma$. Then, we construct a tighter lower bound by informing the encoder only of the channel state $H$. Next, we propose achievable schemes based on uncoded transmission, separate source and channel coding, joint decoding, and hybrid digital-analog transmission. The comparison of the proposed upper and lower bounds in different regimes of operation is relegated to Sections V, VI and VII.

A. Informed Encoder Lower Bound
A trivial lower bound on $ED^*$ can be obtained by providing the encoder with the instantaneous states of the channel and the side information. We call this bound the informed encoder lower bound. At each realization, the problem reduces to the systematic model considered in [5] (see also [26]), for which the separation theorem holds.
The encoder compresses the source sequence using Wyner-Ziv source coding considering the side information, and then transmits the compressed bits at the instantaneous capacity of the channel. For states $(h, \gamma)$, the optimal distortion is given by $D_{inf}(h, \gamma) \triangleq (1+h)^{-1}(1+\gamma)^{-1}$. Averaging over the channel and side information states, the informed encoder lower bound on the expected distortion is given by
$$ED^* \ge ED_{inf} \triangleq \mathrm{E}_{H, \Gamma}\!\left[(1+H)^{-1}(1+\Gamma)^{-1}\right].$$
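The informed encoder bound can be evaluated numerically for any fading pair; the minimal Monte Carlo sketch below (the function name is ours, and unit-mean exponential gains, i.e., Rayleigh fading, are an illustrative assumption) estimates $ED_{inf}$:

```python
import random

def informed_bound_mc(samples=200_000, seed=1):
    """Monte Carlo estimate of the informed encoder lower bound
    ED_inf = E[(1+H)^{-1} (1+Gamma)^{-1}], here for unit-mean
    exponential (Rayleigh-fading) gains H and Gamma."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(samples):
        h = rng.expovariate(1.0)  # channel gain H = H_c^2
        g = rng.expovariate(1.0)  # side information gain Gamma
        acc += 1.0 / ((1.0 + h) * (1.0 + g))
    return acc / samples

print(informed_bound_mc())  # roughly (e * E1(1))^2 ≈ 0.356 for this choice
```

Since the gains are independent here, the expectation factors into $(\mathrm{E}[(1+H)^{-1}])^2$, which gives a quick consistency check on the estimate.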

B. Partially Informed Encoder Lower Bound
We can obtain a tighter lower bound by providing the encoder only with the channel realization h. We call this the partially informed encoder lower bound, and denote it by $ED^*_{pi}$. For a given channel state realization h, the setup reduces to the one considered in Section III-B, and for a discrete or continuous quasiconcave $p_\Gamma(\gamma)$, the separation theorem applies for each channel realization. Averaging over the channel states, we have the following lower bound.

Lemma 1. If $p_\Gamma(\gamma)$ is discrete or continuous quasiconcave, the minimum expected distortion is lower bounded by
$$ED^* \ge ED^*_{pi} \triangleq \mathrm{E}_H\!\left[ED_s^*(C(H))\right],$$
where $C(h) \triangleq \frac{1}{2}\log(1+h)$ is the capacity of the channel for a given realization $h = h_c^2$.
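For Rayleigh side information (where the single-layer target state is 0), $ED_s^*(C(h))$ reduces to $\mathrm{E}_\Gamma[(1+h+\Gamma)^{-1}]$, so both bounds can be estimated on the same samples. The sketch below (function name ours; unit-mean exponential gains are an illustrative assumption) confirms numerically that the partially informed bound is the tighter, i.e., larger, of the two:

```python
import random

def bounds_mc(samples=200_000, seed=7):
    """Compares the informed encoder bound E[(1+H)^{-1}(1+Gamma)^{-1}]
    with the partially informed bound, which for Rayleigh side
    information equals E[(1+H+Gamma)^{-1}]. Unit-mean exponential
    gains are assumed for illustration."""
    rng = random.Random(seed)
    ed_inf = ed_pi = 0.0
    for _ in range(samples):
        h = rng.expovariate(1.0)
        g = rng.expovariate(1.0)
        ed_inf += 1.0 / ((1.0 + h) * (1.0 + g))
        ed_pi += 1.0 / (1.0 + h + g)
    return ed_inf / samples, ed_pi / samples

ed_inf, ed_pi = bounds_mc()
print(ed_inf, ed_pi)  # the partially informed bound is the larger one
```

The ordering holds pointwise, since $(1+h)(1+\gamma) \ge 1 + h + \gamma$ for all non-negative gains.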
Providing only the side information state to the encoder does not lead to a tight computable lower bound, since the optimality of separate source and channel coding does not hold in this case. Although the partially informed encoder lower bound is tighter, we will include the informed encoder bound in our analysis, as it provides a benchmark for the performance when both channel and side information states are available at the transmitter, which sheds light on the value of CSI feedback for this joint source-channel coding problem.
Next, we study some achievable schemes (upper bounds) for the lossy systematic joint source-channel coding problem.

C. Uncoded Transmission
Uncoded transmission is a memoryless and zero-delay transmission scheme in which each channel input $U_i$ is generated by scaling the source signal $X_i$ while satisfying the power constraint. In our model both the source variance and the power constraint of the encoder are 1, and hence, no scaling is needed, i.e., $U_i = X_i$. The received signal from the channel is then given by $V_i = h_c X_i + N_i$. The receiver reconstructs each component with a minimum mean-squared error (MMSE) estimator using both the channel output and the side information sequence, i.e., $\hat{X}_i = \mathrm{E}[X_i \mid V_i, Y_i]$, $i = 1, \ldots, n$. The distortion for each source component $X_i$, for given channel and side information realizations $h_c$ and $\gamma_c$, is given by
$$D_u(h, \gamma) \triangleq \frac{1}{1 + h + \gamma}.$$
Averaging over the channel and side information realizations, we have
$$ED_u = \mathrm{E}_{H, \Gamma}\!\left[\frac{1}{1 + H + \Gamma}\right].$$
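The per-state distortion expression can be verified by simulating the MMSE receiver directly. The sketch below (function name and the particular gain values are ours) compares the empirical mean-squared error of $\hat{X}_i = (h_c V_i + \gamma_c Y_i)/(1 + h + \gamma)$ with the closed form $1/(1+h+\gamma)$ for one fixed state pair:

```python
import random, math

def uncoded_mse(h=1.5, g=0.8, n=200_000, seed=3):
    """Simulates uncoded transmission for fixed gains h = h_c^2 and
    g = gamma_c^2: V = h_c X + N, Y = gamma_c X + Z, and the MMSE
    estimate Xhat = (h_c V + gamma_c Y) / (1 + h + g).
    Returns (empirical MSE, closed form 1/(1+h+g))."""
    rng = random.Random(seed)
    hc, gc = math.sqrt(h), math.sqrt(g)
    err = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        v = hc * x + rng.gauss(0.0, 1.0)  # channel output
        y = gc * x + rng.gauss(0.0, 1.0)  # side information
        xhat = (hc * v + gc * y) / (1.0 + h + g)
        err += (x - xhat) ** 2
    return err / n, 1.0 / (1.0 + h + g)

emp, closed = uncoded_mse()
print(emp, closed)  # the two values should agree closely
```

The estimator coefficients follow from the Gaussian posterior: the precisions of the prior, the channel observation, and the side information simply add.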

D. Separate Source and Channel Coding (SSCC)
Next, we consider separate source and channel coding with a single layer, based on Wyner-Ziv source coding using the side information sequence, followed by channel coding. Note that, due to the lack of CSI at the transmitter, the rates of the source and channel codebooks are fixed for all channel and side information states.
The quantization codebook consists of $2^{n(R_c + R_s)}$ length-n codewords $W^n(i)$, $i = 1, \ldots, 2^{n(R_c + R_s)}$, generated through a 'test channel' given by $W = X + Q$, where $Q \sim \mathcal{N}(0, \sigma_q^2)$ is independent of $X$. The generated quantization codewords are then uniformly distributed into $2^{nR_c}$ bins; on average, each bin contains $2^{nR_s}$ codewords. Additionally, a Gaussian channel codebook with $2^{nR_c}$ length-n codewords $U^n(s)$ is generated independently with $U \sim \mathcal{N}(0, 1)$, and the codeword $U^n(s)$, $s \in \{1, \ldots, 2^{nR_c}\}$, is assigned to bin index $s$.
Given a source realization $X^n$, the encoder searches for a quantization codeword $W^n(i)$ that is jointly typical with $X^n$. Assuming one such codeword is found, the channel codeword $U^n(s)$ is transmitted over the channel, where $s$ is the bin index of $W^n(i)$. At reception, the bin index $s$ is recovered with high probability from the channel output provided that
$$R_c \le \frac{1}{2}\log(1 + h).$$
The decoder then looks for a quantization codeword within the estimated bin that is jointly typical with the side information sequence $Y^n$. If the bin index is correct, the correct codeword will be decoded with high probability if
$$R_s \le I(W; Y).$$
If the quantization codeword $W^n$ is successfully decoded, then $\hat{X}^n$ is reconstructed with an optimal MMSE estimator as $\hat{X}_i = \mathrm{E}[X_i \mid W_i, Y_i]$. An outage is declared whenever, due to the randomness of the channel or the side information, the quantization codeword cannot be correctly decoded, i.e., when condition (18) or (19) is not satisfied. In case of an outage, only the side information sequence is used to estimate the source. When the quantization rate is R and the side information state is $\gamma$, the distortion is
$$D_d(R, \gamma) \triangleq \frac{1}{2^{2R} + \gamma}$$
if the quantization codeword is decoded correctly. If an outage occurs, the achievable distortion is given by $D_d(0, \gamma)$.
Then, the expected distortion of SSCC is given by
$$ED_{sb}(R_s, R_c) = \mathrm{E}_{\mathcal{O}_{sb}}\!\left[D_d(0, \Gamma)\right] + \mathrm{E}_{\mathcal{O}_{sb}^c}\!\left[D_d(R_s + R_c, \Gamma)\right],$$
where $\mathcal{O}_{sb}^c$ is the complement of the outage event, defined as
$$\mathcal{O}_{sb}^c \triangleq \left\{(h, \gamma) : R_c \le \tfrac{1}{2}\log(1+h), \; R_s \le I(W; Y)\right\}.$$
Since the source and channel rates $R_s$ and $R_c$ are fixed for all channel and side information states, we can choose them so as to minimize the expected distortion. Thus, we have
$$ED^*_{sb} = \min_{R_s, R_c} ED_{sb}(R_s, R_c).$$
When the side information has a continuous quasiconcave gain distribution, we have a closed-form expression for the optimal source coding rate $R_s$, as given in the next lemma.
Lemma 2. For a fixed channel rate $R_c$, if $p_\Gamma(\gamma)$ is continuous and quasiconcave, the optimal source coding rate is given by
$$R_s = \frac{1}{2}\log\!\left(1 + \tilde{\gamma}\left(1 - 2^{-2R_c}\right)\right),$$
where $\tilde{\gamma}$ is the solution to (11).
Proof: Once the channel rate $R_c$ has been fixed, it follows from the results in Section III-A that $ED_{sb}(R_s, R_c)$ is minimized by compressing the source into a single layer targeted to the side information state $\tilde{\gamma}$, i.e., by setting $R_s = I(W; Y_{\tilde{\gamma}})$, from which the expression for $R_s$ is obtained.
We can reduce the complexity of SSCC by having only a single codeword in each bin, that is, by letting $R_s = 0$. This way, we get rid of the outage event corresponding to a poor side information gain realization.
However, to achieve the same quantization noise, we need to transmit at a higher rate over the channel, which increases the channel outage probability. Without binning, the minimum expected distortion is found as
$$ED^*_{nb} = \min_{R_c} ED_{sb}(0, R_c).$$
Note that when the side information fading distribution is such that $\tilde{\gamma} = 0$, then, from Lemma 2, the optimal source coding rate is $R_s = 0$, i.e., the minimum expected distortion is achieved by ignoring the decoder side information in the encoding process.
In this section, we have only considered a single-layer source coding scheme since, for continuous quasiconcave $p_\Gamma(\gamma)$, the optimal source code uses a single source code layer. However, in the case of a discrete number of side information gain states, the optimal source code employs multiple source layers, one layer targeting each of the side information states [22]. For a channel code at rate $R_c$, the achievable expected distortion can be obtained similarly to the scheme described in this section, using $ED_F(R_c)$ and $ED_C(R_c)$ in (8), for a finite and a countable number of side information states, respectively.
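The SSCC expected distortion admits a direct Monte Carlo evaluation, since both decoding conditions and the decoded/outage distortions are explicit in $(h, \gamma)$. The sketch below (function names and the coarse rate grid are ours; unit-mean exponential gains are an illustrative assumption) estimates $ED_{sb}(R_s, R_c)$ and performs a crude search over rate pairs:

```python
import math, random

def i_wy(R_total, g):
    """I(W;Y) in bits for the test channel W = X + Q, with quantization
    noise variance set by the total quantization rate R_total, and side
    information gain g."""
    if R_total <= 0:
        return 0.0
    sq = 1.0 / (2.0 ** (2.0 * R_total) - 1.0)  # quantization noise var
    vw, vy = 1.0 + sq, 1.0 + g                  # Var(W), Var(Y); Cov^2 = g
    return 0.5 * math.log2(vw * vy / (vw * vy - g))

def ed_sscc(Rc, Rs, samples=60_000, seed=5):
    """Monte Carlo expected distortion of single-layer SSCC with channel
    rate Rc and binning rate Rs, for unit-mean exponential gains."""
    rng = random.Random(seed)
    R = Rc + Rs
    acc = 0.0
    for _ in range(samples):
        h = rng.expovariate(1.0)
        g = rng.expovariate(1.0)
        ok = Rc <= 0.5 * math.log2(1.0 + h) and Rs <= i_wy(R, g)
        acc += 1.0 / (2.0 ** (2.0 * R) + g) if ok else 1.0 / (1.0 + g)
    return acc / samples

# coarse grid search over rate pairs (illustration only)
best = min(((ed_sscc(rc, rs), rc, rs)
            for rc in (0.25, 0.5, 1.0, 1.5)
            for rs in (0.0, 0.25, 0.5)), key=lambda t: t[0])
print(best)
```

Consistently with the remark above, for Rayleigh side information ($\tilde{\gamma} = 0$) the search tends to favor $R_s = 0$, since binning only adds a side-information outage event here.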

E. Joint Decoding Scheme (JDS)
Here, we consider a source-channel coding scheme that does not involve any explicit binning at the encoder and uses joint decoding to reduce the outage probability. This coding scheme was introduced in [14] in the context of broadcasting a common source to multiple receivers with different side information qualities, where it is shown to be optimal for lossless broadcasting over a static channel. The success of the decoding process depends on the joint quality of the channel and the side information states.
At the encoder, a codebook of $2^{nR_j}$ length-n quantization codewords $W^n(i)$ is generated through the test channel $W = X + Q$, where $Q \sim \mathcal{N}(0, \sigma_q^2)$ is independent of $X$. The quantization noise variance is chosen such that $R_j = I(X; W) + \epsilon$, for an arbitrarily small $\epsilon > 0$. Then, an independent Gaussian codebook of size $2^{nR_j}$ is generated with length-n codewords $U^n(i)$, with $U \sim \mathcal{N}(0, 1)$. Given a source outcome $X^n$, the transmitter finds the quantization codeword $W^n(i)$ jointly typical with the source outcome, and transmits the corresponding channel codeword $U^n(i)$ over the channel. At reception, the decoder looks for an index $i$ for which both $(u^n(i), V^n)$ and $(Y^n, w^n(i))$ are jointly typical. The outage event is then given by
$$\mathcal{O}_j \triangleq \left\{(h, \gamma) : R_j > I(U; V) + I(W; Y)\right\}.$$
If decoding is successful, the source $X^n$ is estimated using both the quantization codeword and the side information sequence, while if an outage occurs, the source $X^n$ is reconstructed using only the side information sequence. Then, the expected distortion for the JDS scheme is found as
$$ED_j(R_j) = \mathrm{E}_{\mathcal{O}_j}\!\left[D_d(0, \Gamma)\right] + \mathrm{E}_{\mathcal{O}_j^c}\!\left[D_d(R_j, \Gamma)\right].$$
Similarly to (21), the expected distortion can be optimized over $R_j$ to obtain the minimum expected distortion achieved by JDS, that is, $ED^*_j \triangleq \min_{R_j} ED_j(R_j)$.
In SSCC, the quantization codeword is successfully decoded only if both the channel and the source codes are successfully decoded. On the other hand, JDS decodes the quantization codeword by exploiting the joint quality of both the channel output and the side information sequence. Hence, a bad channel realization can be compensated by a sufficiently good side information realization, or vice versa, reducing the outage probability. Indeed, the minimum expected distortion of JDS is always lower than that of SSCC, as stated in the next lemma.
Lemma 3. The minimum expected distortions of the two schemes satisfy $ED^*_j \le ED^*_{sb}$.

Proof: Consider the SSCC scheme as in Section IV-D with rates $R_c$ and $R_s$. We will show that the JDS scheme with rate $R_j = R_c + R_s$ achieves an expected distortion no larger than that of SSCC. If both schemes are in outage, or if the quantization codeword is decoded successfully in both, they achieve the same distortion. Thus, to prove our claim, it will suffice to show that $\mathcal{O}_{sb} \supseteq \mathcal{O}_j$.
Let $(h, \gamma)$ be such that SSCC is not in outage, i.e., $R_c \le I(U; V) = \frac{1}{2}\log(1+h)$ and $R_s \le I(W; Y)$. Note that, for given $(h, \gamma)$, $R_s$ and $R_c$, $I(U; V)$ and $I(W; Y)$ have the same values for both schemes. Then $R_j = R_c + R_s \le I(U; V) + I(W; Y)$, i.e., JDS also decodes the quantization codeword successfully, and hence $\mathcal{O}_j \subseteq \mathcal{O}_{sb}$. Moreover, if $R_c > I(U; V)$, so that SSCC is in outage, but $I(W; X|Y) \le I(U; V)$, JDS is able to decode the quantization codeword successfully while SSCC remains in outage. This completes the proof.
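The inclusion $\mathcal{O}_j \subseteq \mathcal{O}_{sb}$ can also be observed numerically: evaluating both schemes on the same fading samples with matched rates, JDS is never worse sample by sample. The sketch below (function name ours; unit-mean exponential gains are an illustrative assumption) does exactly this:

```python
import math, random

def ed_pair(Rc, Rs, samples=100_000, seed=11):
    """Monte Carlo comparison of SSCC (rates Rc, Rs) and JDS with the
    matched rate Rj = Rc + Rs, evaluated on the same fading samples,
    for unit-mean exponential gains. Returns (ED_sscc, ED_jds)."""
    rng = random.Random(seed)
    R = Rc + Rs
    sq = 1.0 / (2.0 ** (2.0 * R) - 1.0)         # quantization noise var
    acc_s = acc_j = 0.0
    for _ in range(samples):
        h = rng.expovariate(1.0)
        g = rng.expovariate(1.0)
        cap = 0.5 * math.log2(1.0 + h)           # I(U;V)
        vw, vy = 1.0 + sq, 1.0 + g
        iwy = 0.5 * math.log2(vw * vy / (vw * vy - g))  # I(W;Y)
        d_dec = 1.0 / (2.0 ** (2.0 * R) + g)     # distortion if decoded
        d_out = 1.0 / (1.0 + g)                  # side information only
        acc_s += d_dec if (Rc <= cap and Rs <= iwy) else d_out
        acc_j += d_dec if (R <= cap + iwy) else d_out   # joint decoding
    return acc_s / samples, acc_j / samples

ed_s, ed_j = ed_pair(0.5, 0.25)
print(ed_s, ed_j)  # ED_jds <= ED_sscc, as in the lemma
```

Since whenever both SSCC conditions hold their sum implies the joint decoding condition, the JDS term is pointwise no larger, which is exactly the containment used in the proof.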

F. Superposed Hybrid Digital-Analog Transmission (SHDA)
In this section we consider a hybrid digital-analog transmission scheme over the channel that superposes a coded layer with an uncoded layer, and allocates the power between the two layers. The decoder uses joint decoding to recover the quantization codeword from the channel output and the side information sequence. The uncoded component in the channel causes an interference correlated with the source sequence and, thus, acts as side information in the decoding. On the other hand, if an outage occurs and the quantization codeword is not successfully decoded, the analog component provides additional robustness, since the channel output now contains a noisy uncoded version of the source sequence that is useful for the reconstruction. This scheme was presented without the uncoded layer in [9] for the static setting, i.e., a static channel and static side information available at the receiver, and was shown to be robust against channel SNR mismatch.
The encoder transmits a superposition of digital and analog input signals,
$$U^n = U_d^n + U_a^n,$$
where $U_d^n$ and $U_a^n$ are the length-n channel input vectors corresponding to the digital and analog input signals, respectively. The analog channel input $U_a^n$ is a scaled version of the source sequence $X^n$ with power $P_a$, given by
$$U_a^n = \sqrt{P_a}\, X^n.$$
The digital portion of the transmitted signal, $U_d^n$, is generated as follows. We first define the auxiliary random variable $T \triangleq U_d + \eta X$, where $U_d$ is independent of $X$ and distributed as $U_d \sim \mathcal{N}(0, P_d)$, where $P_d$ is the power allocated to the digital channel input, with $P_d = 1 - P_a$; and $\eta$ and $P_d$ satisfy, for an arbitrarily small $\epsilon > 0$, $R_h = I(T; X) + \epsilon$. Then, we generate a codebook of $2^{nR_h}$ length-n codewords $T^n$ with i.i.d. components according to the auxiliary random variable $T$. For each source outcome, the encoder determines which of the $2^{n(I(T;X)+\epsilon)}$ codewords $T^n$ in the codebook is jointly typical with $X^n$, and transmits $U_d^n = T^n - \eta X^n$. For sufficiently large n, a unique $T^n$ satisfies the joint typicality condition with high probability, since $R_h > I(T; X)$.
At the decoder, given the channel output V^n and the side information sequence Y^n, the receiver looks for an auxiliary codeword T^n that is simultaneously jointly typical with V^n and Y^n. For large enough n, the correct T^n is recovered with high probability provided that condition (25), stated in terms of the matrix C_h and the submatrix of C_h obtained by eliminating its first row and first column, holds. An outage is declared whenever condition (25) fails due to the randomness of the channel and side information states, and the outage event and its probability follow accordingly. If T^n is successfully decoded, each X_i is reconstructed using an MMSE estimator with all the information available at the decoder. If an outage occurs, the receiver instead estimates X^n from V^n and Y^n with an MMSE estimator, and the achieved distortion follows. Finally, the expected distortion for SHDA, ED_shda(P_d, η), is obtained by averaging over the two events. Optimizing over P_d and η, we obtain ED*_shda ≜ min_{P_d,η} ED_shda(P_d, η). Note that uncoded transmission is recovered from ED_shda(P_d, η) by setting P_d = 0, while the hybrid digital-analog (HDA) scheme of [9] is recovered by letting P_a = 0. We define the minimum expected distortion achievable with HDA as ED*_hda ≜ min_η ED_shda(1, η).
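The fallback MMSE estimation step can be illustrated for the uncoded special case (P_d = 0), assuming the standard jointly Gaussian model V = √h X + Z, Y = √γ X + N with unit-variance noises; the gains h and γ below are arbitrary assumed values, and the closed form 1/(1 + h + γ) is the textbook linear-MMSE result under these assumptions, not an expression copied from the paper.

```python
import random

random.seed(1)

h, gamma = 1.7, 0.8   # assumed instantaneous channel and side info gains
n = 400_000

# Jointly Gaussian model for the uncoded (P_d = 0) case:
#   V = sqrt(h)*X + Z,  Y = sqrt(gamma)*X + N,  X, Z, N ~ N(0,1) independent.
X = [random.gauss(0, 1) for _ in range(n)]
V = [h ** 0.5 * x + random.gauss(0, 1) for x in X]
Y = [gamma ** 0.5 * x + random.gauss(0, 1) for x in X]

# Linear MMSE estimator of X from (V, Y): with independent unit-variance
# observation noises it reduces to
#   X_hat = (sqrt(h)*V + sqrt(gamma)*Y) / (1 + h + gamma).
Xhat = [(h ** 0.5 * v + gamma ** 0.5 * y) / (1 + h + gamma)
        for v, y in zip(V, Y)]

mse = sum((x - xh) ** 2 for x, xh in zip(X, Xhat)) / n
print(f"empirical MSE = {mse:.4f}, closed form = {1/(1+h+gamma):.4f}")
```

The empirical MSE matches the closed form 1/(1 + h + γ), i.e., the channel and side information observations contribute additively to the estimation SNR.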

V. OPTIMALITY OF UNCODED TRANSMISSION
In addition to separate source and channel coding, uncoded transmission is well known to achieve the minimum distortion in point-to-point static Gaussian channels [7], [8]. Other setups in which uncoded transmission achieves the optimal performance have received considerable attention in the literature, such as the transmission of noisy observations of a Gaussian source over Gaussian multiple access channels (MACs) [29], and the transmission of correlated Gaussian sources over a Gaussian MAC, for which uncoded transmission is shown to be optimal below a certain SNR threshold in [30].
However, even in a point-to-point Gaussian channel, in the presence of static side information at the decoder, uncoded transmission becomes suboptimal. In this case, separate source and channel coding, concatenating a Wyner-Ziv source code with a capacity achieving channel code [5], or joint source-channel coding through the HDA scheme in [9], is required to achieve the optimal distortion. Surprisingly, in our setting, when Γ has a continuous and quasiconcave distribution for which γ = 0 is the solution to equation (11), uncoded transmission achieves the lower bound ED*_pi in (15) for any channel distribution, while both separate source and channel coding and HDA schemes are suboptimal. As with the other results, the optimality of uncoded transmission in our setting is sensitive to the source and channel distributions.

Theorem 2. Let p_H(h) be an arbitrary pdf, and let p_Γ(γ) be a continuous and quasiconcave pdf satisfying equation (11) for γ = 0. Then, the minimum expected distortion ED* is achieved by uncoded transmission.
Proof: For any pdf satisfying (11) with γ = 0, the partially informed encoder lower bound reduces to the expected distortion of uncoded transmission, where the key step (a) is obtained by substituting γ = 0 in (10). This completes the proof.
Proposition 1. The class of continuous quasiconcave pdfs p_Γ(γ) for which every non-empty super-level set begins at γ = 0 yields γ = 0 as the solution to (11). In particular, any pdf that is continuous and monotonically decreasing in γ ≥ 0 satisfies this condition.
Proof: By definition, γ is the left endpoint of the super-level set induced by α*. For any monotonically decreasing function p_Γ(γ), the left endpoint of the super-level set {γ : p_Γ(γ) ≥ α} is γ = 0; as a consequence, we have γ = 0 for any value of α*.
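The proof can be checked numerically on a grid: for a monotonically decreasing pdf (the exponential, L = 1) every non-empty super-level set starts at γ = 0, whereas for a quasiconcave but non-monotone gamma pdf (L = 2) it generally does not. A minimal sketch:

```python
import math

def gamma_pdf(g, L, theta):
    """pdf of the gamma distribution Upsilon(L, theta) for g >= 0."""
    return g ** (L - 1) * math.exp(-g / theta) / (math.gamma(L) * theta ** L)

def left_endpoint(pdf, alpha, grid):
    """Left endpoint of the super-level set {g : pdf(g) >= alpha} on a grid."""
    pts = [g for g in grid if pdf(g) >= alpha]
    return pts[0] if pts else None

grid = [i * 1e-3 for i in range(10_000)]   # g in [0, 10)

# Exponential pdf (L = 1): monotonically decreasing, so every non-empty
# super-level set starts at g = 0.
exp_pdf = lambda g: gamma_pdf(g, 1.0, 1.0)
print([left_endpoint(exp_pdf, a, grid) for a in (0.1, 0.5, 0.9)])

# Gamma pdf with L = 2: quasiconcave but not decreasing; the super-level
# set for a level below the mode value starts strictly to the right of 0.
g2_pdf = lambda g: gamma_pdf(g, 2.0, 1.0)
print(left_endpoint(g2_pdf, 0.2, grid))
```

The first print gives 0.0 for every level, matching the proof; the second gives a strictly positive endpoint, which is why binning becomes useful for L > 1 later in the paper.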

VI. FINITE SNR RESULTS
In the previous section we have seen the optimality of uncoded transmission when the side information fading state follows a continuous quasiconcave pdf for which γ = 0. The exponential distribution, and the more general family of gamma distributions with shape parameter L ≤ 1, are continuous monotonically decreasing distributions; hence, uncoded transmission is optimal when the side information gain Γ follows one of these distributions.
Gamma distributed fading gains appear, for example, when the channel state follows a Nakagami distribution. The pdf of the gamma distribution with shape parameter L and scale parameter θ, Γ ∼ Υ(L, θ), is given by p_Γ(γ) = γ^{L−1} e^{−γ/θ} / (Ψ(L) θ^L) for γ ≥ 0, where Ψ(z) ≜ ∫_0^∞ t^{z−1} e^{−t} dt is the gamma function. The mean of Γ is E[Γ] = Lθ and its variance is σ²_Γ = Lθ². When L ≤ 1, it is easy to check that p_Γ(γ) is continuous and monotonically decreasing, while it is continuous and quasiconcave for L > 1. Note that when L = 1, the gamma distribution reduces to the exponential distribution.
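The stated moments are easy to verify by sampling; the standard library's `random.gammavariate(L, theta)` draws from Υ(L, θ):

```python
import random

random.seed(2)

L, theta = 2.0, 0.5       # shape and scale of Upsilon(L, theta)
n = 500_000
samples = [random.gammavariate(L, theta) for _ in range(n)]

mean = sum(samples) / n
var = sum((s - mean) ** 2 for s in samples) / n

# E[Gamma] = L*theta and Var = L*theta^2, as stated in the text.
print(f"mean = {mean:.3f}  (L*theta   = {L * theta:.3f})")
print(f"var  = {var:.3f}  (L*theta^2 = {L * theta ** 2:.3f})")
```

With L = 2 and θ = 0.5, the empirical mean and variance match Lθ = 1 and Lθ² = 0.5.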
Parameter L models the side information diversity, since a time-varying side information sequence Y^m with state distribution p_Γ(γ) provides the equivalent information (in the sense of sufficient statistics) provided by L independent side information sequences, each with i.i.d. Rayleigh block-fading gains. We note that, despite the term "diversity", the side information diversity comes from uncoded noisy versions of the source sequence; hence, the gains it provides are limited compared to the channel diversity, which can be better exploited through coding.

Fig. 2. Upper and lower bounds on the expected distortion versus the channel SNR (ρ) for Rayleigh fading channel and side information gain distributions.
To illustrate the performance of the achievable schemes and compare them with the lower bounds, we consider Nakagami fading channel and side information distributions. We consider normalized channel and side information gains such that H_c0 and Γ_c0 capture the randomness in the channels while ρ is the average SNR, and define the associated instantaneous gains H_0 ≜ H²_c0 and Γ_0 ≜ Γ²_c0. We assume that the channel gain H_0 has a gamma distribution with shape parameter L_c > 0 and scale parameter θ_c = L_c^{−1}, i.e., H_0 ∼ Υ(L_c, L_c^{−1}), and similarly, the side information gain follows a gamma distribution with L_s > 0 and θ_s = L_s^{−1}, i.e., Γ_0 ∼ Υ(L_s, L_s^{−1}). We have fixed the values of θ_c and θ_s such that E[H_0] = E[Γ_0] = 1, so that both channels have the same average SNR ρ for any L_c and L_s. Note that the variance of Γ_0 is σ²_{Γ_0} = L_s θ_s² = 1/L_s. Thus, the side information gain becomes more deterministic as L_s increases, and similarly, the channel gain becomes more deterministic as L_c increases.
First we consider the case with L_s = L_c = 1, i.e., both the channel and the side information gains are Rayleigh distributed. In Figure 2 we plot the expected distortion with respect to the channel SNR. As shown in Theorem 2, uncoded transmission achieves the partially informed encoder lower bound ED*_pi, and the minimum expected distortion is given in closed form in (33). We see from the figure that the informed encoder lower bound is significantly loose, especially at high SNR. The gap between the two lower bounds also illustrates the potential performance improvement that could be achieved by increasing the feedback resources: if both the channel and side information states can be fed back to the encoder, instead of only CSI feedback, a significant improvement is possible. In relation to this observation, a problem that requires further research is the allocation of feedback resources between the channel and side information states when a limited feedback channel is available from the decoder to the encoder.
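A Monte Carlo sketch of this setting, assuming the standard MMSE distortion forms D_u = 1/(1 + ρ(H_0 + Γ_0)) for uncoded transmission and D_no = 1/(1 + ρΓ_0) for estimation from the side information alone (forms consistent with the normalization above, but not taken verbatim from the paper):

```python
import random

random.seed(3)

n = 200_000
H0 = [random.expovariate(1.0) for _ in range(n)]   # Rayleigh power gain, L_c = 1
G0 = [random.expovariate(1.0) for _ in range(n)]   # side info gain, L_s = 1

def ED_uncoded(rho):
    # assumed MMSE distortion 1/(1 + rho*(h0 + g0)) for uncoded transmission
    return sum(1.0 / (1.0 + rho * (h + g)) for h, g in zip(H0, G0)) / n

def ED_sideinfo_only(rho):
    # estimation from the side information alone: 1/(1 + rho*g0)
    return sum(1.0 / (1.0 + rho * g) for g in G0) / n

for rho in (1.0, 10.0, 100.0):
    print(rho, ED_uncoded(rho), ED_sideinfo_only(rho))
```

At every SNR the uncoded scheme, which combines both observations, achieves a strictly lower expected distortion than ignoring the channel output.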
SHDA (ED*_shda) also achieves the optimal performance, by allocating all the available power to the analog component and thus reducing to uncoded transmission. Note that while the HDA scheme of [9] cannot reach ED* in the low SNR regime, its performance gets very close to ED* at high SNR values.
The expected distortion achievable by SSCC is minimized without any binning, since we have γ = 0 for Rayleigh fading side information. Hence, R*_s = 0 from Lemma 2, and therefore ED*_sb = ED*_nb. It is interesting to observe that for Rayleigh fading side information states, the uncertainty in the side information renders it useless in transmitting the quantized source codeword, and the side information is ignored to avoid outages in source decoding; it is used only in the estimation step. As will be seen next, this is not the case when the side information fading has a different distribution.
We also observe in Fig. 2 that JDS (ED*_j) outperforms SSCC by exploiting the joint quality of the channel and the side information, as claimed by Lemma 3, but JDS cannot achieve the optimal performance in this setting. Observe also that the expected distortion achieved by MMSE estimation of the source using only the side information, ED*_no, has a constant gap to ED*, as well as to the other schemes, in the high SNR regime. The observations above, including the optimality of uncoded transmission, hold for any L_c value as long as L_s ≤ 1.
This follows from Proposition 1, since p_Γ(γ) is monotonically decreasing if L_s ≤ 1. However, while uncoded transmission is optimal when L_s ≤ 1, this optimality does not hold in general. Next, it will be shown that for a wide variety of channel distributions, while uncoded transmission is suboptimal, SHDA performs very close to the partially informed encoder lower bound.
We consider the case with L_s = 2 and L_c = 1 in Fig. 3. We can see that SHDA achieves the lowest expected distortion among the proposed schemes and performs very close to the lower bound at all SNR values, while uncoded transmission is suboptimal. Although the performance of uncoded transmission is very close to ED*_pi in the low SNR regime, as the SNR increases, the gap between uncoded transmission and the partially informed encoder bound grows. In addition, both SSCC and JDS surpass the performance of uncoded transmission as the SNR increases.
We see that SSCC with and without binning both perform worse than JDS in all SNR regimes and, while binning does not provide significant gains at low SNR, as the SNR increases ED*_sb starts to outperform ED*_nb. On the other hand, ED*_nb lies between ED_u and ED_no; these three schemes have the same decay rate and maintain a constant gap. The rate of decay in the high SNR regime is characterized in Section VII for all the proposed schemes. Similar behavior is observed in Fig. 4 for L_s = 10 and L_c = 1. The minimum distortion among the proposed transmission schemes is achieved by SHDA, which performs very close to the lower bound beyond SNR ≃ 8 dB.
We can observe that as L_s increases, the performance of uncoded transmission moves further away from the lower bound, and JDS outperforms it even at lower SNR values. However, the rate of decay of JDS is worse than the optimal decay in this setting. We also observe that when no binning is considered, the minimum expected distortion achieved by SSCC is still worse than that achieved by uncoded transmission, while the two have the same decay rate in the high SNR regime. The use of binning, however, allows SSCC to outperform uncoded transmission, and even SHDA for SNR values greater than SNR ≃ 37 dB. When L_c < 1, as the SNR increases, JDS performs close to the partially informed lower bound, while the SHDA performance is further from the lower bound. As in the previous scenarios, we observe that uncoded transmission performs close to the lower bound at low SNR values, and that SSCC achieves lower distortion values if binning is considered.
Observe from Fig. 3 and Fig. 4 that, as the side information diversity L_s increases, the gap at any SNR between the informed encoder lower bound and the partially informed encoder lower bound shrinks. The two bounds converge since, for the studied setup, σ²_{Γ_0} = L_s^{−1}: as L_s increases, the variance decreases, and therefore the level of uncertainty in the side information gain state drops. In fact, the two bounds can be shown to converge at any SNR value for any side information gain whose variance vanishes with some parameter, namely L_s, as given in the next lemma.

Lemma 4. Let H be arbitrarily distributed with a finite mean, i.e., E[H] < ∞, and let {Γ_L} be a sequence of side information gain random variables such that, for every L, Γ_L follows an arbitrary distribution with variance σ²_L, where σ²_L → 0 as L → ∞. Then, the partially informed encoder lower bound converges to the informed encoder lower bound, i.e., lim_{L→∞} ED*_pi = ED_inf.

Proof: See Appendix II.
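The convergence mechanism can be illustrated with a simple proxy: for Γ_L ∼ Υ(L, 1/L) (unit mean, variance 1/L), the gap between E[f(Γ_L)] and f(E[Γ_L]) vanishes as L grows. Here f(γ) = 1/(1+γ) stands in for the distortion's dependence on the side information state — an illustrative choice, not the paper's exact bound.

```python
import random

random.seed(4)

def gap(L, n=200_000):
    """|E[f(Gamma_L)] - f(E[Gamma_L])| for f(g) = 1/(1+g) and
    Gamma_L ~ Upsilon(L, 1/L), so that E[Gamma_L] = 1 and Var = 1/L."""
    samples = [random.gammavariate(L, 1.0 / L) for _ in range(n)]
    e_f = sum(1.0 / (1.0 + s) for s in samples) / n
    return abs(e_f - 0.5)          # f(E[Gamma_L]) = f(1) = 1/2

gaps = [gap(L) for L in (1, 10, 100)]
print(gaps)   # shrinks as the side information state becomes deterministic
```

The gap decreases roughly in proportion to the variance 1/L, in line with the Chebyshev argument used in the proof of Lemma 4.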
Although the side information available at the decoder becomes more deterministic with increasing L_s, the channel is still block-fading. Only SHDA performs close to the informed encoder lower bound, i.e., the optimal performance when the current channel and side information states are known. In contrast, the rest of the studied schemes cannot fully exploit the determinism in the side information fading gain for L_c ≥ 1, while for L_c < 1 it appears that JDS is the scheme achieving the lowest expected distortion. The performance of each scheme will be analyzed in the next section in terms of the exponential decay rate of the expected distortion in the high SNR regime.

VII. HIGH SNR ANALYSIS
In the previous section we have seen the optimality of uncoded transmission in certain settings in which the proposed digital schemes are suboptimal. On the other hand, our numerical results have shown that the SHDA scheme performs well for a wide variety of channel distributions, while the optimality of uncoded transmission is very sensitive to the distribution of the side information. We have also observed that JDS outperforms SHDA in certain regimes. Although we have characterized the optimal expected distortion in closed form for the Rayleigh fading scenario in (33), a closed-form expression of the optimal expected distortion for general channel and side information distributions is elusive. Instead, we focus on the high SNR regime and study the exponential decay rate of the expected distortion with increasing SNR, defined as the distortion exponent and denoted by ∆ [31], i.e., ∆ ≜ −lim_{ρ→∞} (log ED)/(log ρ). In this section, we study the distortion exponent for the model considered in Section VI, i.e., Nakagami fading channel and side information gains with H_0 ∼ Υ(L_c, L_c^{−1}) and Γ_0 ∼ Υ(L_s, L_s^{−1}). We are interested in characterizing the maximum distortion exponent over all encoder and decoder pairs, denoted by ∆*(L_s, L_c).
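The definition can be probed numerically. Taking ED_no = E[1/(1 + ρΓ_0)] with Γ_0 ∼ Υ(0.5, 2) (an assumed MMSE form with unit mean, matching the normalization of Section VI), the log-log slope of the expected distortion should approach ∆_no = min{L_s, 1} = 0.5:

```python
import math
import random

random.seed(7)

n = 400_000
Ls, theta = 0.5, 2.0                       # unit-mean gamma side info gain
G0 = [random.gammavariate(Ls, theta) for _ in range(n)]

def ED_no(rho):
    # assumed MMSE distortion when estimating from side information alone
    return sum(1.0 / (1.0 + rho * g) for g in G0) / n

# Distortion exponent estimate: negative slope of log ED versus log rho.
rho1, rho2 = 1e2, 1e5
slope = (math.log(ED_no(rho1)) - math.log(ED_no(rho2))) / math.log(rho2 / rho1)
print(f"estimated distortion exponent ~ {slope:.3f} (predicted 0.5)")
```

The finite-SNR slope already sits close to the predicted exponent, which is why the high SNR analysis below is a useful proxy for the finite SNR comparisons of Section VI.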
We first provide an upper bound on the distortion exponent by studying the partially informed encoder lower bound on the expected distortion in (15). In determining the high SNR behavior of the partially informed encoder lower bound, it is challenging to characterize the optimal SNR exponent of the target side information state γ in (11) for different channel states. Hence, we further bound the expected distortion by considering the ergodic channel capacity as the channel rate.

Lemma 5. The optimal distortion exponent is upper bounded by the exponent of the partially informed encoder lower bound calculated at the ergodic channel capacity, given by
Proof: See Appendix III-A. We will see that ∆_pe(L_s, L_c) is tight only for L_c ≥ 1, while the ergodic channel relaxation is loose for L_c < 1.
In order to tighten the bound in this regime, we consider the distortion exponent of the informed encoder lower bound proposed in Section IV.

Lemma 6. The distortion exponent is upper bounded by the exponent of the informed encoder lower bound, given by
In the next theorem, we combine the two upper bounds into a single upper bound on the distortion exponent.

Theorem 3. For a Nakagami fading channel with H_0 ∼ Υ(L_c, L_c^{−1}) and Nakagami fading side information with Γ_0 ∼ Υ(L_s, L_s^{−1}), the optimal distortion exponent is upper bounded by ∆*(L_s, L_c) ≤ min{∆_pe(L_s, L_c), ∆_inf(L_s, L_c)}.

In Fig. 6 and Fig. 7 we plot the distortion exponent upper and lower bounds with respect to the parameter L_s of the Nakagami distribution, for L_c = 1 and L_c = 0.5, respectively.
Note that for L_c ≥ 1, as L_s increases, the optimal distortion exponent ∆*(L_s, L_c) converges to the informed encoder upper bound, which is obtained by assuming perfect knowledge of both the channel and side information states at the encoder. This observation parallels the result in Lemma 4. However, this is not the case if L_c < 1: while Lemma 4 applies to any channel distribution, the partially informed bound with the ergodic channel relaxation is loose in this regime.
Next, we consider the distortion exponent achievable by transmission schemes proposed in Section IV.The proofs of the corresponding distortion exponent results can be found in Appendix IV.

Lemma 7. The distortion exponent achieved by uncoded transmission is given by ∆_u(L_s, L_c) = min{L_s + L_c, 1}.

As expected from Theorem 2, uncoded transmission achieves the optimal distortion exponent for L_s ≤ 1; however, it is suboptimal for L_s > 1. We note that the distortion exponent of simple MMSE estimation using only the side information sequence, ED_no, is given by ∆_no(L_s, L_c) = min{L_s, 1}.

Lemma 8. The distortion exponent achievable by SSCC with binning is given by
If binning is not used, the achievable distortion exponent is given by ∆_nb(L_s, L_c). From Lemma 2, we know that binning is suboptimal for L_s ≤ 1 irrespective of the channel distribution, and both schemes achieve the same distortion exponent in this regime. Note also that when L_s = 1, SSCC achieves the optimal distortion exponent of 1. However, when L_s > 1, if binning is not used the scheme cannot exploit the side information state properly, and achieves the same distortion exponent as uncoded transmission. This shows that binning is necessary in this regime.

Lemma 9. The distortion exponent achievable by JDS is given by
JDS achieves the same distortion exponent as SSCC for L_s ≤ 1. Interestingly, however, for 1 ≤ L_s ≤ 1 + L_c, JDS achieves the optimal distortion exponent, and it then saturates for L_s > 1 + L_c. Observe that, as L_s increases, the distortion exponent achievable by SSCC converges to that of JDS.
Lemma 10. The distortion exponent achievable by SHDA and HDA is the same, ∆_shda(L_s, L_c) = ∆_hda(L_s, L_c).

Lemma 10 reveals that the robustness provided by the uncoded layer in SHDA is not required in the high SNR regime to achieve the optimal distortion exponent: allocating all the available power to the HDA layer of the SHDA scheme is sufficient. We remark, however, that in terms of the expected distortion in the low SNR regime, pure HDA is not sufficient to achieve a performance close to the lower bound, and the uncoded layer improves the performance in general, as observed in the previous section.
HDA achieves the optimal distortion exponent for L c ≥ 1 while the rest of the proposed schemes are suboptimal.
However, when L c < 1, JDS outperforms HDA for 1 ≤ L s ≤ 2. Nevertheless, as L s increases, HDA converges to the distortion exponent of the informed encoder lower bound, despite the uncertainty in the channel state.
In the limit L_s → ∞ with 0 < L_c ≤ 1, the distortion exponent of HDA converges to that of the informed encoder lower bound. This result suggests that, as the side information fading state becomes more deterministic, the performance of HDA converges to the informed encoder lower bound, while the rest of the schemes perform significantly worse than HDA.
Combining the achievable distortion exponents of the JDS and HDA schemes, we can characterize the optimal distortion exponent ∆ * (L s , L c ) in certain regimes, as given next.

Theorem 4. Consider a Nakagami fading channel with H_0 ∼ Υ(L_c, L_c^{−1}) and Nakagami fading side information with Γ_0 ∼ Υ(L_s, L_s^{−1}). If L_c ≥ 1, the optimal distortion exponent is achieved by the HDA scheme. If L_c < 1 and L_s ≤ 1 + L_c, the optimal distortion exponent is achieved by uncoded transmission and HDA when L_s ≤ 1, and by JDS when 1 ≤ L_s ≤ 1 + L_c.

These analytical results are in line with the numerical analysis carried out in Section VI. For L_s = L_c = 1, all the schemes achieve the optimal distortion exponent ∆*(1, 1) = 1, which is far from the informed encoder upper bound ∆_inf(1, 1) = 2, as observed in Fig. 2. For L_s = 2 and L_c = 1, plotted in Fig. 3, the optimal distortion exponent is ∆*(2, 1) = 3/2, which is achieved by HDA, while uncoded transmission is suboptimal since ∆_u(2, 1) = 1. In this case JDS also achieves the optimal distortion exponent, while SSCC with binning achieves the lower distortion exponent ∆_sb(2, 1) = 4/3. As observed in the numerical analysis, if no binning is used, SSCC achieves the same distortion exponent as uncoded transmission and as estimation using only the side information sequence, i.e., ∆_u(2, 1) = ∆_nb(2, 1) = ∆_no(2, 1) = 1. Although a similar behavior is observed for higher values of L_s, JDS does not achieve the optimal distortion exponent in general: for L_s = 10 and L_c = 1, plotted in Fig. 4, we have ∆*(10, 1) = 19/10, while ∆_j(L_s, 1) = 3/2 for L_s ≥ 2. However, when L_c = 0.5 and L_s = 1.5, plotted in Fig. 5, JDS achieves the optimal distortion exponent ∆*(1.5, 0.5) = 4/3, while HDA achieves the smaller distortion exponent ∆_shda(1.5, 0.5) = 5/4. In this setup the performance of SSCC improves if binning is used, since ∆_sb(1.5, 0.5) = 6/5, while without binning we have ∆_nb(1.5, 0.5) = 1, which coincides with the distortion exponent of uncoded transmission.
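As a bookkeeping check, the exponent values quoted above can be compared as exact fractions; these are the paper's quoted numbers, not re-derived here.

```python
from fractions import Fraction as F

# (L_s, L_c) = (2, 1): HDA/JDS optimal, SSCC with binning below, uncoded at 1.
opt_21, sb_21, u_21 = F(3, 2), F(4, 3), F(1)
assert opt_21 > sb_21 > u_21

# (L_s, L_c) = (10, 1): optimal 19/10, JDS saturated at 3/2 for L_s >= 2.
opt_101, j_101 = F(19, 10), F(3, 2)
assert opt_101 > j_101

# (L_s, L_c) = (1.5, 0.5): JDS optimal, HDA and SSCC strictly below.
opt_h, shda_h, sb_h, nb_h = F(4, 3), F(5, 4), F(6, 5), F(1)
assert opt_h > shda_h > sb_h > nb_h

print("quoted exponent orderings are consistent")
```

The strict orderings confirm the narrative above: no single scheme dominates, and which scheme is exponent-optimal flips with (L_s, L_c).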

VIII. CONCLUSIONS
We have studied the joint source-channel coding problem of transmitting a Gaussian source over a delay-limited block-fading channel when block-fading side information is available at the decoder. We have assumed that only the receiver has full knowledge of the channel and side information states, while the transmitter knows only their distributions. In the case of a static channel, we have shown the optimality of separate source and channel coding when the side information gain follows a discrete or a continuous quasiconcave distribution.
When both the channel and side information states are block-fading, the optimal performance is not known in general. We have proposed achievable schemes based on uncoded transmission, separate source and channel coding, joint decoding, and hybrid digital-analog transmission. We have also derived a lower bound on the expected distortion by providing the encoder with the actual channel state; we call this the partially informed encoder lower bound, since the side information state remains unknown to the encoder. We have shown that this lower bound is tight for a certain class of continuous quasiconcave side information fading distributions, for which the optimal performance is achieved by uncoded transmission. This, to the best of our knowledge, constitutes the first communication scenario in which uncoded transmission is optimal thanks to the existence of fading, while the known digital encoding schemes fall short of the optimal performance. We have also proved that joint decoding outperforms separate source and channel coding, since the success of decoding at the receiver depends on the joint quality of the channel and side information states, rather than being limited by each of them separately. We have also shown numerically that hybrid digital-analog transmission performs very close to the lower bound for a wide range of channel and side information distributions (in particular, we have considered gamma distributed channel and side information gains with different shape parameters). However, it has also been observed that no single transmission scheme outperforms the others in all cases.
In the high SNR regime, we have obtained closed-form expressions for the distortion exponent, i.e., the exponential decay rate of the expected distortion, of the proposed upper and lower bounds for Nakagami distributed channel and side information states. In line with the numerical results in the finite SNR regime, we have shown that hybrid digital-analog transmission outperforms the other schemes in most cases and achieves the optimal distortion exponent for certain values of the channel and side information diversities, while joint decoding achieves the optimal distortion exponent for some values of the side information diversity when the channel diversity is less than one, in which case hybrid digital-analog transmission is suboptimal.

A. Separation for Discrete Distributions
For Γ with two states, the optimality of separation can be obtained as a special case of the model studied in [26]. This result can be extended to M receivers (or states) by combining the converses in [26] and [25, Sec. VII] for M side information states, i.e., Y^n_{i,1}, i = 1, ..., M. The direct part is shown by the concatenation of the optimal source code in [25] and an optimal channel code.
First, we consider the converse, where (a) is due to the definition of capacity, (b) is due to the data processing inequality, (c) is due to the Markov chain condition, and (d) and (e) are due to the chain rule of mutual information. From this point, applying the steps in [25, Sec. VII] with some slight modifications, we define the corresponding random variables for i = 1, ..., n, and note that they satisfy the required Markov chain condition. Applying the usual techniques, by defining the auxiliary random variables Q ∼ Unif[1, n], X_{iQ}, W_{iQ}, and Y_{iQ} for i = 1, ..., M, we obtain the single-letter condition (46). Finally, the right-hand side of (46) is given by the Heegard-Berger rate-distortion function R_HB(D), and depends on the receivers only through the sum of the mutual information terms, each one corresponding to a receiver with side information Y_i, as discussed in [22]. Hence, the converse applies for countably many receivers as well.
The achievability follows from Heegard-Berger source coding [25, Sec. VII] followed by channel coding at a rate arbitrarily close to the channel capacity.

B. Separation for Continuous Quasiconcave Distributions
To prove the optimality of separation when p_Γ(γ) is a continuous quasiconcave distribution, we construct a lower bound on the expected distortion ED*_sta by discretizing the continuum of side information states, and show that this bound is achievable in the limit of finer discretizations.
We divide the side information state γ into a partition s given by the intervals [s_0, s_1), [s_1, s_2), ..., with s_0 = 0 < s_1 < s_2 < ..., and interval lengths ∆s_i ≜ s_i − s_{i−1}. Let γ̄ > 0 denote the left endpoint of the super-level set satisfying (11). The partition is chosen such that, for some index j, we have s_j = γ̄. A fading realization belongs to the interval [s_{i−1}, s_i) with the corresponding probability. We assume that when γ belongs to the interval [s_{i−1}, s_i), a genie substitutes the current side information sequence Y = √γ X + N with a sequence with gain s_i, i.e., Ỹ ≜ √s_i X + N. Note that this receiver performs at least as well, since the original side information sequence can be recovered from Ỹ by scaling and adding noise if required. Hence, the expected distortion for a given partition s, denoted by ED*_gen(s), is a lower bound on the expected distortion of the continuous fading setup. The genie-aided system now consists of a countable number of receivers and, due to the optimality of separation under a countable number of side information states, ED*_gen(s) is achieved by the concatenation of a Heegard-Berger source encoder with side information states s_1, s_2, ...
and a capacity achieving channel code. Then, for a given partition s, we have inequality (48), where ED*_C(·) is defined in (9). With the channel state h_c known, the expected distortion ED*_Q(C) is achievable with separate source and channel coding by concatenating a single layer source encoder for side information state γ and a channel code at a rate arbitrarily close to C. As the partition gets finer, in the sense that max_i ∆s_i → 0, the limiting behavior of ED*_gen(s) can be obtained by noting that, once the optimality of separation is proved for each ED*_gen(s), the problem reduces to the problem studied in [22]. Hence, by [22, Proposition 4] and [22, Proposition 5], ED*_gen(s) converges to ED*_Q(C), i.e., lim_{max_i ∆s_i → 0} ED*_gen(s) = ED*_Q(C). Then, from inequality (48), in the limit of finer partitions we have ED*_sta = ED*_Q(C). This completes the proof.
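The genie's degradation argument — that the original side information statistics can be recovered from Ỹ by scaling and adding independent noise — can be checked empirically; s_i and γ below are arbitrary assumed values with γ < s_i:

```python
import random

random.seed(6)

s_i, gamma = 2.0, 1.2     # genie gain s_i and a true gain gamma < s_i (assumed)
n = 400_000

X = [random.gauss(0, 1) for _ in range(n)]
Ytil = [s_i ** 0.5 * x + random.gauss(0, 1) for x in X]   # genie sequence

# Degrade: scale by sqrt(gamma/s_i) and add independent noise of variance
# 1 - gamma/s_i, giving sqrt(gamma)*X plus noise of total variance 1.
a = (gamma / s_i) ** 0.5
Yrec = [a * yt + random.gauss(0, (1 - gamma / s_i) ** 0.5) for yt in Ytil]

# Second-order statistics should match Y = sqrt(gamma)*X + N.
cov_xy = sum(x * y for x, y in zip(X, Yrec)) / n
var_y = sum(y * y for y in Yrec) / n
print(f"E[XY] = {cov_xy:.3f} (target {gamma ** 0.5:.3f}), "
      f"Var[Y] = {var_y:.3f} (target {1 + gamma:.3f})")
```

The scaled-and-renoised sequence matches the target gain √γ and total variance 1 + γ, confirming that the genie-aided receiver is at least as good as the original one.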

APPENDIX II
PROOF OF LEMMA 4

In order to show the convergence of ED*_pi to ED_inf, we first construct an upper bound on ED*_pi and show that this bound converges to ED_inf for large enough L.

The lower bound ED*_pi is achieved by the concatenation of a capacity achieving channel code with a single-layer source code targeting the side information state γ̄, the solution to (11), for each realization of H. Instead, we consider, for a given L, source coding that targets the state γ̄_L ≜ μ − δ, where μ ≜ E[Γ_L] is the mean of Γ_L and δ is chosen as a function of σ²_L. The expected distortion achieved by this scheme is an upper bound on ED*_pi and is found, similarly to ED*_pi, by averaging ED_Q(R), given as in (10) with γ substituted by γ̄_L, over the pdf p_L(γ) of Γ_L.
Then, we have the following chain of bounds: (a) follows since 1/(1+γ) ≤ 1 in the first integral, and because we reduce the integration region in the third one; (b) follows by bounding the resulting terms; (c) follows since γ̄_L = μ − δ, subtracting the two integrals; (d) follows from a further bound, in which (f) holds since γ ≤ μ + δ in the integration region, and (g) holds since γ̄_L = μ − δ and ∫_{μ−δ}^{μ+δ} p_L(γ) dγ ≤ 1; finally, (e) follows from Chebyshev's inequality.
By the choice of δ, the resulting difference converges to 0 under the assumption that σ²_L → 0 as L → ∞. This completes the proof.

A. Partially Informed Encoder Upper Bound
In Section IV-B we have seen that, for continuous quasiconcave pdfs, ED*_pi is obtained by averaging the expected distortion achievable by the concatenation of a single layer source code designed for the side information state γ(h) and an optimal channel code for the current channel state h. For each h, the optimal γ(h) is determined by solving (11) with R = C(h) = (1/2) log(1 + h). Note that γ(h) is a random variable dependent on the realization of the channel fading H.
An upper bound on the distortion exponent can be found by lower bounding ED*_pi. First, we note that ED*_Q(R) in (10) is a convex function of R; this follows from time-sharing arguments and the convexity of the Heegard-Berger rate-distortion function [25]. Then, by Jensen's inequality, we can evaluate the bound at R = E_H[C(H)], the ergodic capacity of the channel, with γ the solution to (11) for this rate. Note that this γ depends only on the ergodic capacity of the channel and not on the current channel state realization, and therefore is not a random variable, as opposed to γ(h). Now, since C(h) is a concave function of h, applying Jensen's inequality again yields E_H[C(H)] ≤ C(E_H[H]); that is, the ergodic capacity of the channel is lower than the capacity of a static channel with the same average SNR.
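The second Jensen step, E_H[C(H)] ≤ C(E_H[H]), is easy to confirm numerically for a Rayleigh fading gain (a sketch with an assumed average SNR):

```python
import math
import random

random.seed(8)

n = 200_000
rho = 10.0
# Rayleigh fading power gain with unit mean, scaled by the average SNR.
H = [rho * random.expovariate(1.0) for _ in range(n)]
C = lambda h: 0.5 * math.log2(1.0 + h)   # instantaneous capacity C(h)

ergodic = sum(C(h) for h in H) / n       # E[C(H)]
static = C(sum(H) / n)                   # C(E[H]): static channel, same SNR
print(f"E[C(H)] = {ergodic:.3f} <= C(E[H]) = {static:.3f}")
```

The strict gap between the two values is exactly the looseness introduced by the concavity of C(h), which is one reason the ergodic relaxation is not tight for all parameter ranges.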
We define, for γ ≥ 0, the quantity in (53). Then we obtain the bound in (54), where (a) follows from inequality (52), and (b) follows from the definition in (53). Now, we obtain the exponential behavior of ED*_pe. Consider a normalized gamma distributed random variable H_0 ∼ Υ(L, θ) under the change of variables A ≜ −(log H_0)/(log ρ). The pdf p_A(α) follows from this change of variables, and its high SNR exponential behavior is captured by the SNR exponent S_A(α). For the model considered in Section VI, the SNR exponent for the Nakagami fading channel, H_0 ∼ Υ(L_c, L_c^{−1}), is S_A(α) = L_c α for α ≥ 0, and for the Nakagami fading side information, Γ_0 ∼ Υ(L_s, L_s^{−1}), we have S_B(β) = L_s β for β ≥ 0.
Define κ ≜ (log γ)/(log ρ), such that γ = ρ^κ. Applying the change of variables to (53), in the high SNR regime we obtain an exponential-order expression, where we have used the fact that ρ^x + ρ^y ≐ ρ^{max{x,y}} for x, y ≥ 0, and that 1 − β > κ for β ∈ A^c_pe. A similar simplification holds for the remaining term in the high SNR limit. Since the exponents in the integrals do not depend on ρ, the distortion exponent of each integral can be found by applying Varadhan's Lemma [32] separately to each integral term, similarly to the proof of Theorem 4 in [33].
We define ∆_p1(κ) and ∆_p2(κ) in (58) and (59), respectively, and write (54) accordingly. Then, the distortion exponent is upper bounded by the optimization problem in (61). We solve (61) with S_B(β) = L_s β and denote the optimal value by ∆_pe(L_s, L_c). We note that we can restrict the domain of β in (58) and (59) to β ≥ 0 without loss of optimality, since S_B(β) = +∞ for β < 0.
Next, we consider the case κ ≥ 0. Substituting S_B(β) = L_s β into ∆_p1(κ) in (58), we note that the search can be constrained to 0 ≤ β ≤ 1, since any β > 1 can only increase the objective function. We have

On the contrary, for L_s ≤ 1, the objective function is decreasing in β, and is minimized at

Similarly, for ∆_p2(κ) in (59), we have

This problem is minimized by β* = 0, for which ∆_p2(κ) = 1 + κ for 0 ≤ κ < 1, and it has no solution for κ ≥ 1, since there is no feasible β in the optimization set.
Note that when L_s ≤ 1, the side information gain distribution is monotonically decreasing. Then, γ(h) = 0 for any h from Proposition 1, and therefore, from Theorem 2, uncoded transmission achieves the minimum expected distortion, i.e., ED*_pi = ED_u. Comparing with the distortion exponent of uncoded transmission, we observe that the proposed lower bound on ED*_pi is in general not tight, due to inequality (52).

B. Informed Encoder Upper Bound
Expressing the informed encoder lower bound ED_inf in (14) in terms of α and β, the distortion exponent is found by using Varadhan's lemma as

where the distortion exponent is given as the solution to the following optimization problem:

We note that the optimization domain can be reduced to α, β ≥ 0, since S_A(α) = S_B(β) = +∞ for α, β < 0.
Evaluating with S_A(α) = L_c α and S_B(β) = L_s β, the minimum in (66) is achieved by α* = 1 if L_c < 1 and α* = 0 if L_c ≥ 1, and by β* = 1 if L_s < 1 and β* = 0 if L_s ≥ 1. Then, the minimum is found to be
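A grid search makes the minimizers above easy to verify. The separable objective used below, (1 − α)^+ + S_A(α) + (1 − β)^+ + S_B(β) with S_A(α) = L_c α and S_B(β) = L_s β, is an illustrative assumption chosen because it reproduces the stated minimizers (α* = 1 iff L_c < 1, β* = 1 iff L_s < 1), not the paper's own (missing) expression; under it the minimum evaluates to min{1, L_c} + min{1, L_s}.

```python
def min_term(L, steps=4001):
    """Minimize (1 - x)^+ + L*x over a grid on [0, 2] (assumed objective)."""
    pts = [2.0 * i / (steps - 1) for i in range(steps)]
    return min(max(1.0 - x, 0.0) + L * x for x in pts)

def informed_exponent(L_c, L_s):
    """Separable grid search over (alpha, beta) under the assumed objective."""
    return min_term(L_c) + min_term(L_s)

# Matches min{1, L_c} + min{1, L_s} for a few illustrative (L_c, L_s) pairs
for L_c, L_s in [(0.5, 1.5), (2.0, 0.5), (1.0, 1.0)]:
    assert abs(informed_exponent(L_c, L_s)
               - (min(1.0, L_c) + min(1.0, L_s))) < 1e-6
```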

A. Uncoded Transmission
Similarly to the proof in Appendix III-A, applying the change of variables H_0 = ρ^{−A} and Γ_0 = ρ^{−B}, together with Varadhan's lemma, we have

where the distortion exponent is found by substituting S_A(α) = L_c α and S_B(β) = L_s β as

Note that we can constrain the optimization to 0 ≤ α, β ≤ 1 without loss of optimality, since any α, β > 1 yields a larger objective value.

B. Separate Source and Channel Coding (SSCC)
Here we derive the distortion exponent of SSCC. Let us define the events

Event O_1 corresponds to an outage due to the bad quality of the channel, and O_2 corresponds to correct decoding of the channel codeword while an outage occurs due to the bad quality of the side information. In the high SNR regime, we let R_s = (r_s/2) log ρ and R_c = (r_c/2) log ρ, for r_s ≥ 0 and r_c > 0. Note that we allow r_s = 0 so that SSCC can transmit without binning. We have

where we have defined A_sb(ρ) ≜ A_1(ρ) ∪ A_2(ρ); here A_1(ρ) characterizes O_1 in terms of α and β, and is given by

and similarly, for O_2 we have

Using bounding techniques similar to those in Appendix III-A, it is not hard to show that in the high SNR regime we have

The optimal distortion exponent of SSCC can be found by maximizing over the rates as

The distortion exponent is maximized when r_s + r_c > 1, r_c < 1, and r_s < 1. Then, we have ∆_s1(r) = r_s + r_c. The maximum is achieved by the r_c and r_s for which the two terms in the minimization in ∆_s2(r) are equal, i.e.,

Solving this, we obtain

which satisfy r_s < 1, r_c < 1, and r_s + r_c > 1. Note that for L_s = 1 we have r_s = 0, i.e., no binning is optimal, as expected from Lemma 2.
Now we consider the case L_s ≤ 1. In this regime, the side information gain distribution is monotonically decreasing, and hence γ = 0, and from Lemma 2 we have R*_s = 0, i.e., no binning achieves the minimum distortion for SSCC. Next, we derive the distortion exponent when no binning is used, for general L_s, to account for ED*_nb. Letting R_s = 0, the outage event A_2 is empty. Then, we find the distortion exponent of ED_nb(R_c) as

C. Joint Decoding Scheme (JDS)

First, we note that in both ∆_j1(r_j) and ∆_j2(r_j) we can restrict the optimization to 0 ≤ α, β ≤ 1 without loss of optimality. The minimum is achieved by α* = 0 and β* = (1 − r_j)^+ if r_j ≤ 2, in which case it is given by ∆_j1(r_j) = r_j + L_s(1 − r_j)^+, and there is no feasible solution if r_j ≥ 2. Then, the exponent ∆_j1(r_j) is given by the minimum of these solutions, namely

where we have used that for L_s ≤ 1 and 0 ≤ r_j ≤ 1 we have r_j + L_s(1 − r_j)^+ = 1 − (1 − L_s)^+(1 − r_j)^+, and that for L_s ≥ 1 and 0 ≤ r_j ≤ 1 we have min{r_j + L_s(1 − r_j)^+, 1} = 1. Now, we solve ∆_j2(r_j). If r_j < 1 − β, the problem has no feasible solution due to the constraints. If r_j ≥ 1 − β, we have

D. Hybrid Digital-Analog Transmission (HDA)

The outage event O_h in (27) is expressed in terms of α and β by the set A_h(ρ). In the high SNR regime, we let η² = ρ^{r_h}, for r_h ∈ ℝ, and the outage event A_h(ρ) is equivalent to

The distortion exponent for HDA can be optimized over the parameter r_h as

First, we obtain the achievable distortion exponent when r_h < 0. To solve ∆_h1(r_h), note that if 0 ≤ α ≤ 1, there are no feasible solutions. Then, for α > 1, we have

We can constrain the optimization to 0 ≤ β ≤ 1 without loss of optimality, and the minimum is achieved by α* = 2 − β − r_h. If L_s ≥ 1 + L_c, the minimum is achieved by β* = 0, and is given by ∆_h1(r_h) = 1 + L_c(2 − r_h).
If L_s ≤ 1, we have ∆_h1(r_h) ≥ ∆_h2(r_h), and the distortion exponent is maximized by letting r_h → 0, which gives ∆_hda(L_s, L_c) = min{L_s + L_c, 1}. If L_s ≥ 1, we have ∆_hda(L_s, L_c) = 1 for any r_h < 0.
In the following, we derive the distortion exponent achievable by SHDA when r_h ≥ 0. First, we solve ∆_h1(r_h).

Fig. 1. Block diagram of the joint source-channel coding problem with fading channel and side information.
ED*_sb is still far from the lower bound. Finally, in Fig. 5, we consider L_c = 0.5 and L_s = 1.5. Contrary to the previous scenarios, in this setup JDS

Fig. 6. Distortion exponent upper and lower bounds for Nakagami fading channel and side information with L_c = 1, as a function of L_s.

Fig. 7. Distortion exponent upper and lower bounds for Nakagami fading channel and side information with L_c = 0.5, as a function of L_s.
If this condition does not hold, both schemes are in outage and have the same performance. Then, O_sb ⊇ O_j. Conversely, if JDS is in outage, i.e.,