On the Detection of Robust Multidecadal Changes in Earth’s Outgoing Longwave Radiation Spectrum

Differences between Earth’s global mean all-sky outgoing longwave radiation spectrum as observed in 1970 [Interferometric Infrared Spectrometer (IRIS)], 1997 [Interferometric Monitor for Greenhouse Gases (IMG)], and 2012 [Infrared Atmospheric Sounding Instrument (IASI)] are presented. These differences are evaluated to determine whether these are robust signals of multidecadal radiative forcing and hence whether there is the potential for evaluating feedback-type responses. IASI–IRIS differences range from +2 K in the atmospheric window (800–1000 cm⁻¹) to −5.5 K in the 1304 cm⁻¹ CH4 band center. Corresponding IASI–IMG differences are much smaller, at 0.2 and −0.8 K, respectively. More noticeably, IASI–IRIS differences show a distinct step change across the 1042 cm⁻¹ O3 band that is not seen in IASI–IMG comparisons. This step change is a consequence of a difference in behavior when moving from colder to warmer scenes in the IRIS data compared to IASI and IMG. Matched simulations for the relevant periods using ERA reanalyses mimic the spectral behavior shown by IASI and IMG rather than by IRIS. These findings suggest that uncertainties in the spectral response of IRIS preclude the use of these data for quantitative assessments of forcing and feedback processes.


Introduction
Measurements of the total broadband energy reflected and emitted by the Earth-atmosphere system have been made from space for almost four decades from sensors such as the Earth Radiation Budget Experiment (ERBE; Barkstrom 1984), the Clouds and the Earth's Radiant Energy System (CERES; Wielicki et al. 1996), and the Geostationary Earth Radiation Budget experiment (GERB; Harries et al. 2005). These Earth radiation budget (ERB) data have been profoundly useful in a wide variety of environmental and climate studies, leading to improvements in understanding and model development (e.g., Forster and Gregory 2006). Examples include studies of the effects of clouds on the ERB (e.g., Futyan et al. 2005; Potter and Cess 2004; Loeb et al. 2007), the role of water vapor absorption (e.g., Held and Soden 2000; Soden et al. 2005), and the impact of aerosol scattering and absorption (e.g., Loeb and Kato 2002; Ansell et al. 2014). Recent work has also suggested that ERB measurements can help to constrain estimates of climate sensitivity (Tett et al. 2013). However, since these measurements integrate all the energy in the shortwave or longwave at once, compensation effects may occur such that a very small broadband signal results (e.g., Hansen et al. 2005; Huang et al. 2013).
In contrast, if measurements of the outgoing radiation are spectrally resolved it is possible to identify and monitor the effects of many different processes. This potential for the simultaneous detection and attribution of change using spectrally resolved radiation has been recognized for some considerable time (e.g., Charlock 1984; Goody et al. 1995; Slingo and Webb 1997) and forms a substantial part of the rationale behind proposals to establish a high-accuracy, International System of Units (SI) traceable, space-based climate monitoring system (e.g., Fox et al. 2011; Wielicki et al. 2013).
In the light of these efforts, in this paper we revisit measurements in the infrared region of the spectrum to illustrate challenges associated with identifying robust changes in spectral outgoing longwave radiation (OLR) in existing records. We make use of three different sets of observations: first the Interferometric Infrared Spectrometer (IRIS) on Nimbus-4 (Hanel et al. 1972), second the Interferometric Monitor for Greenhouse Gases (IMG) on Advanced Earth Observing Satellite 1 (ADEOS I; Kobayashi 1999), and finally the Infrared Atmospheric Sounding Instrument (IASI) on MetOp-A (Simeoni et al. 2004). These three instruments operated from April 1970 to January 1971, from October 1996 to June 1997, and from June 2007 to the present time, respectively. Currently, over 13 years of OLR data are available from the Atmospheric Infrared Sounder (AIRS; Aumann et al. 2003); however, owing to the spectral gaps in these observations, we choose not to include these in this particular study.
Previous work by Harries et al. (2001) reported the long-term changes in Earth's clear-sky spectral OLR seen between 1997 and 1970 as manifested in IRIS and IMG measurements. This work showed observationally for the first time the impact of increases in well-mixed greenhouse gases such as CO2, CH4, and CFC-11 and -12 on the OLR spectrum. The study was limited to clear-sky conditions and made no attempt to detect the effects of feedback processes. However, work using newer instrumentation such as IASI promises progress toward achieving this ambition. For example, Brindley et al. (2015) use IASI data to show that all-sky interannual variability at the global scale, evaluated over the period 2008-12, is less than 0.17 K across the spectral region 645-1600 cm⁻¹, reducing to less than 0.05 K in the atmospheric window region between 800 and 1200 cm⁻¹. This latter region is particularly sensitive to surface temperature and the presence of cloud. This result therefore suggests that an efficient mechanism for regulating the planetary thermal emission to space, in the face of considerable variability in cloud and surface temperature, is in operation, at least at the global scale.
Estimates of trends in global mean surface temperature since the mid-twentieth century are on the order of 0.1 K decade⁻¹ (e.g., Huang et al. 2015; IPCC 2013), implying a change on the order of 0.4 K between 2012 and 1970. Hence, if we were to assume that the level of interannual variability is relatively constant with time this would imply that, given adequate calibration and sampling, at the global scale even the short observational period available from IRIS may provide insight into important feedback processes when those measurements are compared to those from IASI.
With this aim, here we use the full set of all-sky observations from the IRIS instrument and contrast these to more recent observations from the IASI instrument. While the IMG instrument had much reduced spatial coverage relative to both IASI and IRIS, such that instrument averages may not be representative of the true global mean (Brindley and Harries 2003), comparisons of IMG with IRIS and IASI are also made to assess the consistency of the overall spectral shape. To increase confidence in our findings we have also simulated the outgoing longwave spectrum, taking into account the individual satellite sampling patterns and periods of operation. In section 2 we describe the instruments, observations, and simulation methodology. Section 3 provides examples of the observed differences and an interpretation of their significance, placing these in the context of the simulated spectra. Finally, in section 4 we provide conclusions and discuss the implications of our results.

Methodology: Observations and simulations
a. Satellite observations
IRIS, IMG, and IASI have very different characteristics as summarized in Table 1. To enable meaningful comparisons between the spectra measured by each instrument, several steps were first necessary. These were as follows:
1) To avoid seasonal artifacts, only data from the common overlapping months, April-June (AMJ), were retained for each year of data available.
2) Uniquely, compared to the nadir-only viewing IRIS and IMG instruments, IASI is a cross-track scanning instrument, producing 30 fields of regard (FOR) per scan. Each FOR is itself an array of 2 × 2 pixels. Therefore to minimize any biases due to viewing angle, only ''nadir'' observations are retained. In practice this means those observations within 5° of nadir.
3) To approximate the spatial resolution of IRIS, 16 nadir IASI pixels were averaged to provide a single IRIS-like pixel. Owing to the restricted sampling strategy of IMG, observations from this instrument were retained at their native spatial resolution.
4) The IMG and 16-pixel average IASI spectra were smoothed to match the spectral resolution of IRIS, 2.8 cm⁻¹, using the appropriate Hamming instrument function, and a wavenumber correction was applied to account for the differing solid angles within each instrument.
5) Finally, given the spectral coverage of each individual instrument, the spectral range was reduced to include only those wavenumbers common to all which are not subject to significant radiometric noise, between 700 and 1400 cm⁻¹.
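The spectral smoothing in step 4 can be illustrated with a minimal sketch, assuming the Hamming instrument function is applied as a convolution on a uniformly spaced wavenumber grid. The function names, the `max_opd` parameter (maximum optical path difference, in cm), and the `half_width` truncation are hypothetical choices made for illustration and are not taken from the processing used in this study. The value of `max_opd` that corresponds to the 2.8 cm⁻¹ IRIS resolution depends on how resolution is defined, so it is left as an input.

```python
import numpy as np

def hamming_ils(offset, max_opd):
    """Instrument line shape of a Hamming-apodized interferogram.

    offset  : wavenumber offset from line center (cm^-1)
    max_opd : maximum optical path difference (cm), an assumed input
    """
    x = 2.0 * max_opd * offset
    # Fourier transform of 0.54 + 0.46*cos(pi*d/max_opd) on |d| <= max_opd;
    # np.sinc is the normalized sinc, sin(pi*x)/(pi*x).
    return 2.0 * max_opd * (0.54 * np.sinc(x)
                            + 0.23 * np.sinc(x - 1.0)
                            + 0.23 * np.sinc(x + 1.0))

def smooth_to_low_resolution(wavenumber, radiance, max_opd, half_width=20.0):
    """Convolve a spectrum on a uniform grid with the Hamming line shape."""
    step = wavenumber[1] - wavenumber[0]
    n = int(round(half_width / step))
    kernel = hamming_ils(np.arange(-n, n + 1) * step, max_opd)
    kernel /= kernel.sum()                      # conserve the mean level
    padded = np.pad(radiance, n, mode="edge")   # simple edge treatment
    return np.convolve(padded, kernel, mode="valid")
```

Normalizing the kernel to unit sum means a spectrally flat scene is left unchanged, which is a convenient sanity check before applying the function to real spectra.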
Previous studies have identified potential issues with the earlier instruments (e.g., Aumann et al. 2011) and therefore careful attention was paid to screening the IRIS and IMG data prior to use. While both datasets contain quality flags, even after these had been accounted for a number of clearly erroneous spectra (indicative of calibration targets or a sharp variation in scene type over the acquisition of the interferogram) were still present. Therefore both records were surveyed using principal component techniques to identify outlying spectra for each month, which were then visually inspected to enable identification of additional erroneous spectra. As a result of these tests 2602 additional spectra were removed from the IRIS record, corresponding to just over 1% of the data record for April-June 1970. For IMG, the number of additional spectra removed comprised less than 0.25% of the total observations available for April-June 1997. In both cases, the impact of removing these erroneous spectra was less than 0.5 mK on the global mean brightness temperature.

b. Model simulations
In this study we make use of the radiance simulator (Chen et al. 2013) based on the Principal Component-Based Radiative Transfer Model (PCRTM; Liu et al. 2006). PCRTM is a flexible fast radiative transfer code that exploits redundancy in the information content contained within different spectral channels to generate resolved radiance spectra much more rapidly than is possible using a traditional channel-based approach. Studies using PCRTM to simulate aircraft observations have shown brightness temperature agreement that lies below the instrument noise, reducing to less than 0.5 K across the entirety of the spectral range considered here.
Comparisons with more traditional line-by-line codes show agreement to better than 0.1 K (Liu et al. 2007) and a saving of computational time by a factor of ~4480 (Chen et al. 2013). The model requires vertical profiles of temperature, specific humidity and ozone at the specified 101 PCRTM pressure levels (defined from the surface to 0.005 hPa). The concentrations of well-mixed greenhouse gases can also be specified: in these simulations we include CO2, CH4, CO, and N2O. The simulator can also take surface spectral emissivity into account. A further unique feature is its ability to account for the subgrid variability of cloud fields. This is done by populating clouds into subcolumn grids in a way that is consistent with the overlapping assumptions adopted by the numerical model producing a given simulated or reanalysis cloud field. Radiances of each subcolumn grid are computed separately and then averaged to obtain the radiances for the entire grid. More details about the radiance simulator and its configuration can be found in Chen et al. (2013).
The speed of PCRTM means it was feasible to use the radiance simulator to simulate each spectrum corresponding to the time and location of each individual satellite observation from IRIS, IMG, and IASI, using atmospheric reanalyses to provide the required temperature, humidity, ozone, and cloud properties. For IMG and IASI we use the European Centre for Medium-Range Weather Forecasts (ECMWF) interim reanalysis (ERA-Interim, hereafter ERA-I; Dee et al. 2011). However, this only covers the period from 1979 onward. Therefore, an alternative reanalysis dataset, ERA-20C (Compo et al. 2011), which only assimilates surface pressure reports, was employed for the simulations for the IRIS spectra in 1970. To assess the impact of using different reanalyses, ERA-20C based PCRTM simulations were also performed for the IMG data record in 1997.
To avoid introducing obvious temporal sampling biases, the temperature, water vapor, and ozone reanalysis fields were linearly interpolated from 6-hourly to the satellite overpass time. Since cloud fields are harder to treat in a self-consistent manner, for simplicity the nearest cloud fields in time to that of a given observation were used. The reanalysis data only extend to 1 hPa, so above this level standard profiles from McClatchey et al. (1972) were employed based on the location and season of the observation. For example, for a footprint within 30°S-30°N the tropical profile was used, for footprints falling within 30°-50°N the midlatitude summer profile was used, and so on. In addition, the vertical profile of CO2 was defined by the U.S. 1976 Standard Atmosphere profile and then scaled to correspond to the month of the observation using monthly-mean NOAA ESRL observational data. The CH4, CO, and N2O concentrations were fixed for all simulations and their vertical profiles were also scaled according to the relevant U.S. 1976 Standard Atmosphere profile. Each satellite footprint was matched to a specific surface type based on its geographical location and the USGS International Geosphere-Biosphere Programme (IGBP) land coverage dataset (Loveland et al. 2000), and the corresponding spectral emissivity was obtained from the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) spectral library v2.0 (Wilber et al. 1999; Baldridge et al. 2009).
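The temporal matching described at the start of this paragraph can be sketched as follows. This is an illustrative outline only: the function names and argument layout are hypothetical, and the only assumptions carried over from the text are the 6-hourly spacing of the reanalysis fields, linear interpolation for temperature, water vapor, and ozone, and nearest-in-time selection for cloud.

```python
import numpy as np

def interpolate_to_overpass(field_before, field_after, hours_after_first,
                            step_hours=6.0):
    """Linearly interpolate two bracketing reanalysis fields in time.

    hours_after_first is the overpass time measured from the earlier field.
    """
    weight = hours_after_first / step_hours
    if not 0.0 <= weight <= 1.0:
        raise ValueError("overpass time must lie between the two fields")
    before = np.asarray(field_before, dtype=float)
    after = np.asarray(field_after, dtype=float)
    return (1.0 - weight) * before + weight * after

def nearest_cloud_field(cloud_before, cloud_after, hours_after_first,
                        step_hours=6.0):
    """Pick whichever bracketing cloud field is closer in time (no blending)."""
    if hours_after_first <= step_hours / 2.0:
        return cloud_before
    return cloud_after
```

For an overpass 2 h after the earlier analysis time, the earlier field receives a weight of 2/3 and the later field 1/3, while the cloud field is taken unchanged from the earlier time.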

c. Generation of global mean spectra
Global mean spectra for both the observations and simulations were generated in an identical manner. First, zonal mean radiances for each 10° latitude band were derived for each month. These were then area weighted, creating a mean global monthly value. These monthly global mean radiances were then averaged over the three months April-June, implicitly giving equal weight to each month irrespective of the total number of spectra contributing to each monthly global mean. Finally each 3-month AMJ average global mean radiance spectrum was converted to an equivalent brightness temperature spectrum. Figure 1a shows the AMJ global mean average brightness temperature spectra for IRIS, IMG, and IASI while Fig. 1b illustrates differences between later and earlier measurements. While we show measurements from IASI for 2012 here, as noted earlier, global interannual variability manifested in the IASI data between 2008 and 2012 is less than 0.17 K across the spectral range considered here (Brindley et al. 2015), so in essence any of the years in this period could have been shown without having a noticeable impact on the difference spectrum.
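The averaging sequence described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the processing code used here: the function names are hypothetical, radiances are assumed to be in mW m⁻² sr⁻¹ (cm⁻¹)⁻¹, the area weight of each band is taken as the difference in the sine of its bounding latitudes, and weights are renormalized over bands that contain data.

```python
import numpy as np

C1 = 1.191042e-5   # 2hc^2 in mW m^-2 sr^-1 (cm^-1)^-4
C2 = 1.4387769     # hc/k in K cm

def planck_radiance(wavenumber, temperature):
    """Planck radiance per unit wavenumber (wavenumber in cm^-1)."""
    return C1 * wavenumber**3 / np.expm1(C2 * wavenumber / temperature)

def brightness_temperature(wavenumber, radiance):
    """Invert the Planck function to get equivalent brightness temperature."""
    return C2 * wavenumber / np.log1p(C1 * wavenumber**3 / radiance)

def global_mean_radiance(latitude, radiance, band_width=10.0):
    """Area-weighted mean of zonal-mean spectra for one month.

    latitude : (n_spectra,) footprint latitudes in degrees
    radiance : (n_spectra, n_channels) radiance spectra
    """
    edges = np.arange(-90.0, 90.0 + band_width, band_width)
    band = np.clip(np.digitize(latitude, edges) - 1, 0, edges.size - 2)
    weights = np.sin(np.radians(edges[1:])) - np.sin(np.radians(edges[:-1]))
    total, weight_sum = 0.0, 0.0
    for b, w in enumerate(weights):
        in_band = band == b
        if in_band.any():                      # skip unsampled bands
            total = total + w * radiance[in_band].mean(axis=0)
            weight_sum += w
    return total / weight_sum

def amj_brightness_temperature(wavenumber, monthly_global_means):
    """Equal-weight average of the monthly means, then convert to T_B."""
    mean_radiance = np.mean(monthly_global_means, axis=0)
    return brightness_temperature(wavenumber, mean_radiance)
```

Averaging in radiance space and converting to brightness temperature only at the end follows the order given in the text; because the Planck function is nonlinear, averaging brightness temperatures directly would give a different result.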

Results
From Fig. 1a it is apparent that the global mean spectra from IASI and IMG are very similar across the entire spectral range considered, but IRIS appears to be noticeably cooler in some spectral regions. Figure 1b shows that differences between the IMG and IASI spectra across the atmospheric window regions between 750 and 1000 cm⁻¹ (hereafter referred to as W1) and 1050 to 1250 cm⁻¹ (hereafter referred to as W2) are in general less than 0.4 K. Spectral regions exhibiting larger differences correspond to the Q branch of CO2 at 720 cm⁻¹, the O3 band centered at 1042 cm⁻¹, the Q branch of CH4 centered at 1304 cm⁻¹ and, in particular, strong water vapor absorption lines at wavenumbers greater than 1250 cm⁻¹. All of these larger features are also evident in differences involving the IRIS spectra. However, they tend to be much enhanced in magnitude, with signals reaching −1.5 K in the CO2 Q branch and −5.5 K at the center of the CH4 Q branch. Perhaps of greater interest is the distinct step in the magnitude of the difference across the 1042 cm⁻¹ O3 band. This manifests in the IMG-IRIS and IASI-IRIS differences as a differential window signal of approximately 2 K and 1 K in the W1 and W2 regions respectively and is not apparent in the IASI-IMG difference spectrum. The spectral shape of the differences between IASI/IMG and IRIS is similar to that reported in Harries et al. (2001) even though results there were based on cloud-free spectra. In that study the authors speculated that the differential window signal might be at least partially due to cirrus cloud contamination in the cloud-free subset of spectra used, although no firm conclusion could be drawn. For cirrus to be responsible for the changes seen in this study there would need to have been a noticeable change in their coverage or optical properties between 1970 and 1997, with no significant alteration between 1997 and 2008-12.
Studies have shown that long-term trends in high cloud cover and cloud frequency are highly uncertain, with inferences concerning even the sign of any trend dependent on the dataset used (e.g., Wylie et al. 2005; Warren et al. 2007). Moreover, recent work has suggested that owing to the increased ability of more recent active sensors to detect tenuous clouds, including thin cirrus, it would be difficult to draw conclusions concerning true multidecadal trends (e.g., Stubenrauch et al. 2010, 2013). In essence, although it cannot be unambiguously ruled out, there is little evidence to indicate that the change in cirrus cloud needed to cause the signatures seen in the spectra involving IRIS in Fig. 1b has occurred. Given this absence of evidence, since Fig. 1b indicates that IRIS manifests anomalous behavior compared to the other two instruments, its calibration must be investigated. Unfortunately the available IRIS dataset only consists of the calibrated radiances, not the raw interferograms and calibration parameters such as blackbody temperatures, making it difficult to quantitatively assess the effects of small changes to the latter. Nonetheless, we have used information contained in Hanel et al. (1972) to show that, when compared to an ''ideal emitter,'' the spectral variation reported in the emissivity of the onboard blackbody calibration source (their Fig. 3, which shows a step change in the emissivity estimate) could result in the type of differential window signal seen in Fig. 1b herein. Hence, relatively small errors in the characterization of the emissivity could be responsible for the form of the signal seen.
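One simple way to put a number on the differential window signal discussed above is to compare the mean of a brightness temperature difference spectrum in the two window regions. The sketch below is illustrative only: the function names are hypothetical, and taking an unweighted mean over each window is an assumption rather than the metric used in this study. The default window limits are the 750-1000 cm⁻¹ and 1050-1250 cm⁻¹ ranges defined in the text.

```python
import numpy as np

def window_mean(wavenumber, bt_difference, lower, upper):
    """Mean brightness temperature difference between two wavenumber limits."""
    in_window = (wavenumber >= lower) & (wavenumber <= upper)
    return float(np.mean(bt_difference[in_window]))

def differential_window_signal(wavenumber, bt_difference,
                               w1=(750.0, 1000.0), w2=(1050.0, 1250.0)):
    """Difference between the mean signals in the two window regions (K)."""
    return (window_mean(wavenumber, bt_difference, *w1)
            - window_mean(wavenumber, bt_difference, *w2))
```

Applied to an idealized difference spectrum that is 2 K below the O3 band and 1 K above it, this returns 1 K, whereas a spectrally flat difference returns zero.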
We hypothesize that there is an error in the spectral emissivity applied to the blackbody calibration source used by IRIS, the impact of which would vary depending upon the scene temperature. For example, observations where the field of view is filled with thick cold cloud would result in a spectral radiance that is both lower in amplitude and peaks at lower wavenumbers when compared to a much warmer scene. Therefore, if there is an error in the characterization of the spectral response of IRIS, one would expect to see a dependence of the magnitude of the differences between W1 and W2 on scene temperature that is inconsistent with the behavior of a better-characterized instrument.
To examine this hypothesis, the radiance spectra for each of the three satellite instruments acquired over global oceans during AMJ were converted to equivalent brightness temperature spectra. These brightness temperature spectra were then separated into cold, mild, and warm scenes using the value of the brightness temperature for each spectrum at 1126 cm⁻¹ (T_B1126). Those spectra within the range 220 K < T_B1126 < 250 K were classified as cold, those within the range 250 K < T_B1126 < 280 K were classified as mild, and those within the range 280 K < T_B1126 < 310 K were classified as warm. To quantify how the shape of the spectra differed between W1 and W2 as a function of scene temperature, the brightness temperature difference T_B909 − T_B1250 for each spectrum was calculated, where T_B1250 and T_B909 represent the brightness temperatures at 1250 and 909 cm⁻¹, respectively. These wavenumbers were chosen for consistency with the cloud-screening procedure used in Harries et al. (2001). Figure 2 shows the resulting probability density functions (PDFs) of brightness temperature differences T_B909 − T_B1250 for each satellite instrument for cold, mild, and warm scenes. Each PDF has been normalized using the total number of spectra fulfilling each of the criteria for a particular instrument. Hence each y value represents the fraction of the total number of cold, mild, or warm spectra falling within each 0.5-K bin for each particular instrument. A summary of the mean and associated standard deviation of each PDF is provided in Table 2. The number of spectra contributing to each distribution for each instrument is also given.
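The stratification and normalization just described can be sketched as follows. The scene limits and the 0.5-K bin width come from the text; the function name, the dictionary layout, and the histogram range `diff_range` are hypothetical choices made for illustration.

```python
import numpy as np

# Scene classes defined by the brightness temperature at 1126 cm^-1 (K).
SCENE_LIMITS = {"cold": (220.0, 250.0),
                "mild": (250.0, 280.0),
                "warm": (280.0, 310.0)}

def scene_pdfs(tb909, tb1250, tb1126, bin_width=0.5, diff_range=(-10.0, 20.0)):
    """Fraction of spectra per bin of T_B909 - T_B1250 for each scene class.

    Returns the bin edges and a dict mapping scene name to bin fractions.
    Each class is normalized by the number of spectra in that class.
    """
    tb1126 = np.asarray(tb1126, dtype=float)
    diff = np.asarray(tb909, dtype=float) - np.asarray(tb1250, dtype=float)
    edges = np.arange(diff_range[0], diff_range[1] + bin_width, bin_width)
    pdfs = {}
    for scene, (low, high) in SCENE_LIMITS.items():
        in_scene = (tb1126 > low) & (tb1126 < high)
        counts, _ = np.histogram(diff[in_scene], bins=edges)
        n_scene = int(in_scene.sum())
        pdfs[scene] = counts / n_scene if n_scene else np.zeros(counts.size)
    return edges, pdfs
```

Because each class is normalized by its own sample size, distributions from instruments with very different numbers of spectra can be compared directly.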
First, considering the cold scenes (Fig. 2a), it is evident that the PDFs for all of the instruments are broadly similar, with the position of the peak in each distribution centered at around 0.5 K and their means varying by less than 0.15 K (Table 2). The width of each distribution shows greater variability between the instruments but the standard deviations are still within 0.6 K of each other. Moving to the warmer scenes, Figs. 2b and 2c show that as scene temperature increases all three instrument distributions shift to higher T_B909 − T_B1250 values. This would be expected: the majority of colder scenes are likely representative of optically thicker clouds whose behavior tends to be spectrally flat as compared to the clearer conditions sampled by the warmer scenes. However, while the means of the IASI and IMG distributions are always within 0.3 K of each other, the mean of the IRIS distribution becomes progressively lower as scene temperature increases, with differences reaching up to 1.3 K for the warmest scenes (Table 2). This shift between instruments is manifested clearly in Fig. 2c and reinforces the hypothesis that there may be an underlying issue with the spectral response characterization of the IRIS instrument which becomes increasingly evident as scene temperature increases.
Using the simulations described in section 2b we next investigate whether the differential window signal seen in the IRIS observations exists in the corresponding simulations and whether this signal becomes more marked for warmer scenes. We note that the discrete nature of the reanalyses in both 3D space and time, and the difficulties associated with simulating cloudy cases, particularly in translating their distribution in an equivalent manner to the PCRTM domain (Chen et al. 2013), imply that even if the reanalyses were perfect an identical match between each individual observation and simulation would not be expected. We also avoid commenting on comparisons between simulations generated from different reanalyses that may themselves exhibit different trends with time (e.g., Poli et al. 2013) and instead focus on the PDFs for each instrument, stratifying them in an identical way to Fig. 2. Figure 3 shows the resulting distributions while Table 3 provides analogous information to Table 2 for the simulations, including the IMG simulations calculated using both the ERA-20C and ERA-I reanalysis datasets.
Comparing the numbers of spectra classified as warm, mild, or cold in Table 3 with Table 2 implies that the simulations from both reanalyses tend to be, on average, warmer than their corresponding observations, with a stronger overall warm bias seen for ERA-I. Similarly, the histograms in Fig. 3 are rather more peaked than those seen in Fig. 2. This is likely a reflection of the discrete representation of the atmospheric state as provided by the reanalyses but also hints that these reanalyses do not capture the full variability of the Earth system as manifested in the observations. Nevertheless, Fig. 3 clearly shows that, according to the simulations, all three instruments should show T_B909 − T_B1250 distributions that are consistent with each other in all three scene temperature regimes, in contrast to the IRIS observations.

Discussion and conclusions
Spectrally resolved observations of Earth's outgoing longwave radiation implicitly contain signatures of key climate forcing and feedback processes and as such could, in principle, provide a stringent test of our ability to simulate past climate. In particular, their use could mitigate the possibility that model predictions and observations agree for the wrong physical reasons. Recent work using IASI measurements (Brindley et al. 2015) has suggested that for global mean all-sky conditions the level of interannual variability across the spectrum is small enough that it may be possible to identify robust changes in regions affected by feedback processes (particularly cloud feedback) in comparisons between these more recent measurements and the earlier observations, even given the short record length of the latter. Hence in this study we have performed such a comparison, degrading the spectral resolution of each instrument to match that of the lowest resolution (IRIS), applying additional quality control to the IRIS and IMG datasets, matching the months considered (April-June) to avoid seasonal artifacts and, where possible, making best efforts to match the spatial resolution of the instruments. In addition, we have simulated the observed behavior in each period by effectively flying each instrument through the appropriate ERA-I and ERA-20C reanalysis fields using the fast PCRTM radiative transfer code. Given that studies have already shown the power of comparing IRIS and IMG in identifying the signatures of well-mixed greenhouse gases (Harries et al. 2001), the focus here was intentionally on signals within the atmospheric window region (750-1250 cm⁻¹), which is particularly sensitive to surface temperature and cloud.
Within this window region the observed global mean all-sky differences between IMG and IASI are remarkably consistent, in general agreeing to within 0.4 K. Conversely, observed global mean differences between IRIS and both IASI and IMG show a differential window signal, with differences in the region 750-1000 cm⁻¹ higher than those in the 1050-1250 cm⁻¹ range by up to 1 K. Decomposing the spectra from each instrument into warm, mild, and cold scenes shows that this shift results from a reduced spectral variation in the IRIS record under warmer conditions when compared to both IASI and IMG. Although the lack of an ERA reanalysis product spanning 1970-2012 means that inferences concerning the expected absolute level of brightness temperature change between 1970 and the later years cannot be made, when considering the simulated distributions, all three periods show consistent behavior when stratified according to scene temperature.
FIG. 3. As Fig. 2, but for the PCRTM simulations. IMG results are shown here for the simulations using both the ERA-I and ERA-20C reanalyses.
Given the reported corrections that were made to the IRIS instrument in-orbit calibration (Hanel et al. 1972), and the dependence of the differential window signal on scene temperature that we see in the IRIS observations, we hypothesize that the characterization of the IRIS spectral response in particular is most likely responsible for much of the behavior seen. While this does not alter the fact that clear signatures due to increases in well-mixed greenhouse gases such as CH4 can be identified when the IRIS observations are compared to more recent spectra, we suggest that these uncertainties mean that the dataset cannot be used either to quantify the exact magnitude of these gas forcings or to make inferences about climate feedback processes that might be expected to be manifested in the window regions. This study reinforces the key role that instrument calibration (and knowledge of that calibration) plays in the construction of observational records that can stand the test of time. Calibration accuracy is even more critical when gaps exist between instruments if we wish to be able to make robust claims concerning real changes that have occurred between their observing periods. At present, although the situation has undoubtedly improved over the last decade, we would argue that there is currently no spectrally resolved instrument in space that possesses the level of in-orbit, SI traceable calibration needed to provide a benchmark record of the true climate state. Efforts are ongoing to rectify this situation via, for example, the Climate Absolute Radiance and Refractivity Observatory (CLARREO) mission (Wielicki et al. 2013): we note here that had such an initiative been undertaken in the IRIS era we would have much greater certainty of the changes that have occurred to the radiation field over the past 40 years.
Such a record would have been invaluable for testing the ability of our climate models to correctly capture the behavior of the different physical processes contributing to these changes. However, as a result of our study, we strongly advise, unless the uncertainties surrounding the calibration of the IRIS record can be resolved, that these data are not used to provide a reference point for quantitative long-term assessments of changes in Earth's spectrally resolved OLR.