Incorporating sedimentological data in UK flood frequency estimation

This study presents a new analytical framework for combining historical flood data derived from sedimentological records with instrumental river flow data to increase the reliability of flood risk assessments. Historical flood records were established for two catchments through re‐analysis of sedimentological records; the Nant Cwm‐du, a small, steep upland catchment in the Cambrian Mountains of Wales, and a piedmont reach of the River Severn in mid Wales. The proposed framework is based on maximum likelihood and least‐square estimation methods in combination with a Generalised Logistic distribution; this enables the sedimentological data to be combined effectively with existing instrumental river flow data. The results from this study are compared to results obtained using existing industry standard methods based solely on instrumental data. The comparison shows that inclusion of sedimentological data can have an important impact on flood risk estimates, and that the methods are sensitive to assumptions made in the conversion of the sedimentological records into flood flow data. As current industry standard methods for flood risk analysis are known to be highly uncertain, the ability to include additional evidence of past flood events derived from sedimentological records as demonstrated in this study can have a significant impact on flood risk assessments.


Introduction

Flood frequency estimation
Flood frequency estimates are an essential part of flood risk management, providing information on what flood flows are expected to occur for a given rarity (e.g. the 100 year return period flood). This information underpins many important decisions, such as the design and operation of flood defences, flood mapping, informing planning decisions in flood risk areas and reservoir safety (Institution of Civil Engineers, 2015). Methods described in the Flood Estimation Handbook (FEH) (Institute of Hydrology, 1999), and subsequent major updates (Kjeldsen, 2007;Environment Agency, 2008), are considered the industry standard for flood estimation in the UK and are extensively used by hydrologists from both the public and private sector. A more international perspective on flood estimation techniques is given by Castellarin et al. (2012). Typically, flood frequency estimates (also known as design flood estimates) are derived by fitting statistical models to a series of annual maximum (AMAX) peak flows measured at a gauging station. There are a small number of AMAX records that have more than a century of data (see Longfield and Macklin, 1999;Marsh and Harvey, 2012), however for most locations the gauged record is relatively short (most begin between 1950 and 1980), often much shorter than the rarity of events required for design of flood defences and reservoir spillways, ranging from 100 to 10,000 years. To overcome this, current methods in FEH advocate the 'pooling' of data from hydrologically similar sites, but ideally flood estimates would be based on a long data series representing the flood characteristics at the location of interest. The incorporation of historical records in flood estimation has been advocated in previous studies such as the Flood Studies Report (Natural Environment Research Council, 1975) which contained a database of historical flood data, and Potter (1978) who recognised the benefit of supplementing short gauged records with historical data. 
The FEH also advocates the use of historical flood records; specific guidance was published by Bayliss and Reed (2001) on how to augment flood frequency analysis using historical data. However, the use of historical data is often omitted in practice due to the time and resources required to establish a reliable database of historical events, and the perceived lack of an analytical framework for including the data into operational flood frequency estimation (Kjeldsen et al., 2014b;Environment Agency, 2015).

Documentary evidence of floods
Documentary, epigraphic and sedimentological evidence of past floods in the UK can be used to extend systematic gauged flood records by many hundreds, and in some cases even thousands of years (Macklin et al., 1992;Macklin and Rumsby, 2007;Foulds and Macklin, 2016). Information on historical floods can be found from a variety of documentary sources and can be compiled for most locations in the UK (Kjeldsen et al., 2014b). Macdonald and Sangster (2017) have compiled documentary (and epigraphic) flood records in a number of British catchments from 1500, with a record of all known major floods from 1750. Furthermore, Archer (1999) suggests that there is useful information for at least 150 years in virtually every flood-prone catchment in England. Historical flood data can have a large influence on estimated design flows. For example, Black and Fadipe (2009)  estimated 100 year flood flows at three out of four sites they studied increased by more than 50% as a result of incorporating reliable historical information. Macdonald and Black (2010) also reassessed flood risk at York using documentary records dating back to AD 1263. This study showed that FEH estimates of the 100 year flood were implausibly high as the estimated flow rates had not been reached in the entire 737 year historical series. The preferred result was nearly 20% smaller than the FEH estimate. Macdonald et al. (2014) analysed historical flood data on the Sussex Ouse and found that while the inclusion of historical events resulted in a minor reduction in design flood estimates, the largest impact was on the associated level of uncertainty with a 40% reduction in standard deviation of the predicted 100 year design flood. Recent work by the Environment Agency (2017a) also shows that the inclusion of historical information leads to a higher degree of confidence in the accuracy of flood-frequency estimation as a result of the increased sample size. 
However, it is also possible that the uncertainty of design flood estimates can increase with the inclusion of historical data since the uncertainty surrounding historical sources can be large (Neppel et al., 2010).

Sedimentological evidence of floods
In the UK sedimentological flood data have received less attention by operational hydrologists than historical records when augmenting systematic flood data for flood frequency estimation (Lewin and Macklin, 2010;Naylor et al., 2017). Sedimentological evidence of past floods can be found in channel and floodplain deposits. Flood sediment deposition from rivers in the UK takes place in a variety of sub-environments, including upland gullies and streams (boulder berms), aggrading alluvial fans, channel deposits preserved by lateral migration of rivers across a floodplain, fills in abandoned channels (palaeochannels) and fine-grained overbank sediment deposited incrementally by floodwaters (Jones et al., 2010;Macklin and Lewin, 2008). There are many UK examples of flood chronologies compiled from sedimentological flood evidence, including; overbank deposits on the middle and lower South Tyne (Rumsby and Macklin, 1994) and boulder berms in small incising upland streams (dated by radiocarbon or lichenometry [Macklin et al., 1992;Merrett and Macklin, 1999]). However, to date these long flood chronologies have not been employed in UK flood estimation. In the United States, the use of sedimentological data for flood frequency analyses was advocated as far back as the late-1970s (Costa, 1978;Baker et al., 1979). Sedimentological techniques to estimate design floods have been successfully employed in the United States (see House et al., 2002) and Spain (Benito et al., 2003). This may partly be the result of more favourable river environments for reconstructing sedimentological floods (slackwater deposits in bedrock canyons [although see Carling and Grodeck, 1994]) found in the United States and Mediterranean region, than in the UK.
The use of epigraphic, documentary and sedimentological flood records is not without limitation. Authors have long recognised the challenges around flood record completeness and the reliability of flood magnitude estimates based on epigraphic and documentary sources (e.g. Wallis, 1986; McEwen, 1987). Likewise, long and complete palaeoflood records have often been restricted to rocky gorge sites (Li and Huang, 2017), although physiographical settings outside of these areas are increasingly being used for flood risk analysis (Lam et al., 2017a). Nevertheless, the differential preservation and censoring of potential palaeoflood data (Lewin and Macklin, 2003) need to be considered. Where stable depositional sites can be located and combined with high resolution dating, reliable flood magnitude estimates can be made (Leigh, 2018), with great potential to extend flood records and improve flood risk assessments by reducing uncertainty (Lam et al., 2017b).
For the first time in the UK this paper attempts to incorporate reconstructed flood series from sedimentological evidence into flood frequency estimation, compare the results with systematic data alone, and critique the results. The second aim of this paper is to assess the sensitivity of design flood estimates to decisions used in the translation of sedimentological data into flood flow data.

Study Catchments
Two sites were selected to investigate the incorporation of sedimentary flood records in flood frequency estimation. Both sites offered readily available data on reconstructed sedimentological flood discharges which can be used to develop palaeoflow thresholds. The two sites also represent river environments with very different channel and floodplain dynamics and depositional processes.
The first study site is located at the 'Roundabout' on the River Severn in mid-Wales (

Data structure of combined systematic and historical records
Consider a systematic record of $n$ observed AMAX events ($x_i$, $i = 1, \dots, n$), of which $k_s$ exceed a threshold value $X_0$. Next, a total of $k_h$ historical events have been observed over the $h$ years prior to the start of the systematic record, all with a magnitude ($y_j$, $j = 1, \dots, k_h$) exceeding the same flow threshold value $X_0$. The combined record length is therefore defined as $N = n + h$ and the total number of exceedances of the $X_0$ threshold value is $k = k_s + k_h$.

The Flood Estimation Handbook (FEH) method
Flood frequency estimation in the UK commonly relies on a Generalised Logistic (GLO) distribution defined by its probability density function (Hosking & Wallis, 1997)

$$f(x) = \frac{1}{\alpha}\,\frac{e^{-(1-\kappa)y}}{\left(1+e^{-y}\right)^{2}}, \qquad y = -\frac{1}{\kappa}\ln\left[1-\frac{\kappa(x-\xi)}{\alpha}\right] \qquad (1)$$

where $\xi$, $\alpha$ and $\kappa$ are the location, scale and shape parameters respectively. The corresponding cumulative distribution function (cdf) is defined as

$$F(x) = \frac{1}{1+e^{-y}} \qquad (2)$$

and the resulting design flood event with return period $T$ is defined as the $(1 - 1/T)$ quantile of the distribution, i.e.

$$Q_T = \xi\left[1+\frac{\beta}{\kappa}\left(1-(T-1)^{-\kappa}\right)\right] = \xi\,x_T \qquad (3)$$

where $\beta = \alpha/\xi$. The dimensionless growth factor is denoted $x_T$ and $\xi$ is the location parameter, also known as the index flood. The Flood Estimation Handbook (Institute of Hydrology, 1999) adopted a version of the method of L-moments to fit the three model parameters based on fixing the index flood as the sample median, denoted QMED. Consider a sample of $n$ years of recorded AMAX events ($x_i$, $i = 1, \dots, n$) from which the sample median, QMED, and the sample L-moment ratios, L-CV ($t_2$) and L-skewness ($t_3$), have been calculated. The GLO parameters are then estimated as

$$\hat{\xi} = \mathrm{QMED} \qquad (4)$$

$$\hat{\kappa} = -t_3 \qquad (5)$$

$$\hat{\beta} = \frac{t_2\,\hat{\kappa}\,\sin(\pi\hat{\kappa})}{\hat{\kappa}\,\pi\,(\hat{\kappa}+t_2) - t_2\sin(\pi\hat{\kappa})} \qquad (6)$$

In the case of an ungauged catchment, the three GLO parameters are estimated using a version of the index flood method in which an existing regression model links the index flood to a set of catchment descriptors, and the higher order L-moment ratios ($t_2$ and $t_3$) are estimated as weighted averages of values obtained from a selection of gauged catchments considered hydrologically similar (Environment Agency, 2008). The FEH statistical method is routinely used to estimate design flood events in gauged and ungauged catchments based on observed AMAX data and catchment descriptors, but critically it does not allow for the inclusion of non-systematic historical data such as those obtained from geomorphological studies.
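The FEH fitting step described above can be sketched in a few lines of Python. This is a minimal illustration, not the FEH software: it assumes the standard L-moment relations for the GLO distribution (shape equal to minus the L-skewness, and the scaled parameter computed from L-CV and the shape), and the function names are hypothetical. The input values are the pooled L-moment ratios and QMED quoted later in this paper for Nant Cwm-du.

```python
import math

def glo_params_from_lmoments(qmed, l_cv, l_skew):
    """Return (xi, beta, kappa) for a GLO curve indexed on QMED.

    Assumed relations (require kappa != 0):
      kappa = -L-skewness
      beta  = t2*k*sin(pi*k) / (k*pi*(k + t2) - t2*sin(pi*k))
    """
    kappa = -l_skew
    s = math.sin(math.pi * kappa)
    beta = l_cv * kappa * s / (kappa * math.pi * (kappa + l_cv) - l_cv * s)
    return qmed, beta, kappa

def growth_factor(T, beta, kappa):
    """Dimensionless GLO growth factor for return period T (> 1 year)."""
    return 1.0 + (beta / kappa) * (1.0 - (T - 1.0) ** (-kappa))

def design_flow(T, xi, beta, kappa):
    """Design flood = index flood (QMED) multiplied by the growth factor."""
    return xi * growth_factor(T, beta, kappa)

# Pooled values quoted in the paper for the ungauged Nant Cwm-du catchment
xi, beta, kappa = glo_params_from_lmoments(qmed=1.676, l_cv=0.213, l_skew=0.219)
q100 = design_flow(100, xi, beta, kappa)
print(f"kappa={kappa:.3f}, beta={beta:.3f}, Q100={q100:.2f} m3/s")
```

With these inputs the 100 year estimate comes out close to the 4.5 m³/s reported for the pooled analysis, and the 2 year estimate equals QMED by construction, since the growth factor is 1 at the median.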

Maximum-likelihood estimation
An alternative to the method of L-moments for estimating the three model parameters of the GLO distribution is based on the maximisation of the likelihood function. The use of the maximum-likelihood (ML) method for representing peak flow data that include both systematic and historical flood data is described in detail by Macdonald et al. (2014) and only summarised here. In this case the likelihood function for the combined systematic and non-systematic data is given by Eq. (7), where $f(\cdot)$ and $F(\cdot)$ are the pdf and cdf, respectively, of the GLO distribution as defined in Eqs. (1) and (2). The three GLO parameters are estimated by finding the set of values that maximises the likelihood function in Eq. (7) using the Nelder-Mead numerical optimisation method (Nelder and Mead, 1965).
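As an illustration of how such a likelihood can be evaluated, the sketch below combines GLO densities for the systematic AMAX values with a binomial term for the historical period. It assumes that the historical events are known only as exceedances of the threshold (here written x0) over h years; this is one common way of treating censored historical information and may differ in detail from Eq. (7). The AMAX values and all function names are hypothetical.

```python
import math

def _glo_y(x, xi, alpha, kappa):
    """Reduced variate y; None when x lies outside the GLO support."""
    z = 1.0 - kappa * (x - xi) / alpha
    return -math.log(z) / kappa if z > 0.0 else None

def glo_logpdf(x, xi, alpha, kappa):
    y = _glo_y(x, xi, alpha, kappa)
    if y is None:
        return -math.inf
    # log(1 + exp(-y)), written to avoid overflow for very negative y
    softplus = -y if -y > 35.0 else math.log1p(math.exp(-y))
    return -math.log(alpha) - (1.0 - kappa) * y - 2.0 * softplus

def glo_cdf(x, xi, alpha, kappa):
    y = _glo_y(x, xi, alpha, kappa)
    if y is None:  # above the upper bound (kappa > 0) or below the lower bound (kappa < 0)
        return 1.0 if kappa > 0.0 else 0.0
    if y >= 0.0:
        return 1.0 / (1.0 + math.exp(-y))
    e = math.exp(y)
    return e / (1.0 + e)

def log_likelihood(params, amax, x0, h, k):
    """Systematic AMAX densities plus a binomial term for k exceedances of x0 in h years."""
    xi, alpha, kappa = params
    if alpha <= 0.0 or kappa == 0.0:
        return -math.inf
    ll = sum(glo_logpdf(x, xi, alpha, kappa) for x in amax)
    f0 = glo_cdf(x0, xi, alpha, kappa)
    if not 0.0 < f0 < 1.0:
        return -math.inf
    log_binom = math.lgamma(h + 1) - math.lgamma(k + 1) - math.lgamma(h - k + 1)
    return ll + log_binom + (h - k) * math.log(f0) + k * math.log1p(-f0)

# Hypothetical AMAX sample (m3/s); x0, h and k take the values used at the Roundabout
amax = [212.0, 305.0, 260.0, 410.0, 330.0, 295.0, 388.0, 240.0]
ll_ok = log_likelihood((308.0, 50.0, -0.1), amax, 470.0, 3687, 54)
ll_bad = log_likelihood((308.0, 50.0, 0.5), amax, 470.0, 3687, 54)  # data outside support
```

The negative of `log_likelihood` could then be handed to a Nelder-Mead routine. Returning minus infinity for infeasible parameter sets keeps a derivative-free optimiser away from regions where the observed data fall outside the support of the distribution.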

Least-square estimation
The ML method relies on numerical optimisation and, from experience, it is not always possible to identify a set of model parameters. In such cases, this study has adopted an alternative and robust method based on least-square estimation. Firstly, each observed data point in the combined series of systematic and historical (and/or sedimentological) data is assigned an annual exceedance probability, $p_i$, using the plotting position formula proposed by Hirsch and Stedinger. Each plotting position is transformed into an equivalent return period using the relationship $T_i = 1/p_i$, and finally the magnitude of each event is plotted against its return period. The three GLO parameter values can now be estimated by minimising the sum of squared differences between the observed and predicted flood magnitudes over all return periods (Eq. 10), where $Q_{T_i}(\xi, \alpha, \kappa)$ is the predicted $T_i$ year event as derived from Eq. (3). The numerical optimisation was based on the Nelder-Mead method. In practice, the shape parameter had to be fixed at the pooled estimate obtained from the FEH method to enable a satisfactory solution to be identified; thus only the location and scale parameters, $\xi$ and $\alpha$, were optimised.
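A minimal sketch of the least-square step with a fixed shape parameter follows. Once the shape is fixed, the GLO quantile is linear in the location and scale parameters, so this sketch uses the closed-form ordinary least-squares solution rather than the Nelder-Mead search used in the study; that substitution is a choice made here for brevity. The return periods are taken as given inputs (the plotting position formula is not reproduced), and the data and function names are hypothetical.

```python
def glo_basis(T, kappa):
    """g(T) such that Q_T = xi + alpha * g(T) for a GLO with fixed shape (kappa != 0, T > 1)."""
    return (1.0 - (T - 1.0) ** (-kappa)) / kappa

def fit_location_scale(return_periods, flows, kappa):
    """Ordinary least squares for xi and alpha with kappa held fixed."""
    g = [glo_basis(T, kappa) for T in return_periods]
    n = len(flows)
    g_mean = sum(g) / n
    q_mean = sum(flows) / n
    sxy = sum((gi - g_mean) * (qi - q_mean) for gi, qi in zip(g, flows))
    sxx = sum((gi - g_mean) ** 2 for gi in g)
    alpha = sxy / sxx
    xi = q_mean - alpha * g_mean
    return xi, alpha

# Synthetic check: flows generated from known parameters should be recovered
kappa_fixed = -0.219
true_xi, true_alpha = 1.7, 0.4
periods = [3.0, 8.0, 20.0, 60.0, 150.0]
flows = [true_xi + true_alpha * glo_basis(T, kappa_fixed) for T in periods]
xi_hat, alpha_hat = fit_location_scale(periods, flows, kappa_fixed)
```

Because the fitted events in a palaeoflood record are mostly rare, the basis values cluster at the upper end of the curve, which is consistent with the difficulty reported in the text of estimating the shape parameter freely.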

The Roundabout, River Severn
Previous work at the Roundabout site by Jones et al. (2012) provides sedimentological flood data related to event timing and relative magnitude. Given that the grain size of sediment deposited during flood events is dependent on flow velocity and flood discharge (Church, 1978; Macklin et al., 1992; Knox, 1993, 2003), the authors used the inorganic element ratio of zirconium to rubidium (Zr/Rb) as a geochemical grain size proxy of fluvially deposited fine-grained sediment to reconstruct a c. 3750 year flood record. Systematic AMAX flow data are also available in close geographical proximity to the sedimentary record. Consequently, the availability of systematic and sedimentary flood data allows for the comparison of design flood estimates derived using systematic data alone with those derived using a combination of systematic and sedimentological data.

Systematic data
The first stage of the analysis was to select an appropriate gauging station with systematic AMAX flow data close to the Roundabout site (catchment area 977 km²) on the River Severn. The National River Flow Archive (NRFA) holds AMAX data for the two most suitable sites near the Roundabout: upstream at Abermule (NRFA station 54014, 52°33'13" N, 3°14'4" W, catchment area 580 km²) and downstream at Montford (NRFA station 54005, 52°43'28" N, 2°52'19" W, catchment area 2025 km²) (figure 1[a]). The AMAX record at Abermule only has reliable peak flow data for 45 years from the 1970s onwards, whereas the downstream site at Montford has a 56 year record starting in 1951 but with 8 years of data missing between 1971 and 1978. Systematic flow data for the 1960s period are crucial to this analysis (to allow later definition of $X_0$), so the Severn at Montford was chosen as the most appropriate gauging station with systematic AMAX flows (figure 4). QMED was calculated from the observed AMAX data as the median value of 308.65 m³/s, and a statistical enhanced single-site flood frequency analysis (Environment Agency, 2008) was carried out for Montford. The resulting flood frequency curve represented by a GLO distribution is shown in figure 5, and associated design estimates are given in table 1. This curve represents the estimates that would most likely be derived by practitioners in the absence of historical or sedimentary data.

Incorporation of sedimentological flood data
Since the site at Montford has systematic gauged flow data and non-systematic data in the form of the sedimentological flood record at the Roundabout, the maximum-likelihood (ML) method was used to estimate the parameters for a GLO distribution fitted to the combined data as described in Eq. (7). To implement this approach, data are required on the number ($k$) of sedimentological floods over a given time period ($h$), and a perception (or threshold) flow ($X_0$), above which we are confident that all floods have been identified (from the sedimentological flood record). The calculation of parameters $k$ and $h$ required a reanalysis of the original data from Jones et al. (2012) and the creation of a new composite flood record derived from two sediment cores, described below.

Calculation of $k$ and $h$
Jones et al. (2012) derived the relative magnitude of major flood peaks at the Roundabout from the geochemical analysis of two sediment cores (figures 2(b) and 6), where ln(Zr/Rb) values exceeded 0.35. To ensure that these data were suitable for incorporation into flood frequency analysis at Montford, ln(Zr/Rb) values were reassessed to remove multi-peak flood events in each individual core, and duplicate flood events present in both cores. Multi-peak events were removed from each core using graphical analysis of the data and consideration of sedimentation rates. If there was a cluster of similarly dated ln(Zr/Rb) peaks (within c. 10 years) then the highest peak was selected; if there was no clustering, then the peak was considered to represent a single flood event. The peaks from both cores were then compared to identify duplicate peaks. For each event that was present in both cores, the higher of the two ln(Zr/Rb) values and the mean of the estimated ages of the event were taken for the composite record. Given the subjective nature of this analysis and uncertainty in some of the peak matching, two composite flood records were produced, one where $k$ = 54 peaks (figure 6), and one where $k$ = 45. These two composite records represent the maximum and minimum estimates of the number of flood events over the c. 3750 year record. The value of $h$, the time period of the sedimentological flood data, was defined as the period between the date at the bottom of the oldest sediment core (c. 1736 BC, from core 1) and the start of the gauging period (AD 1951 at Montford); consequently, $h$ = 3687 years.
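The multi-peak rule can be illustrated with a simple rule-based stand-in. The study relied on graphical analysis and sedimentation rates, so the sketch below is only an approximation under a stated assumption: peaks are chained into one cluster whenever consecutive ages differ by no more than a window (10 years here), and the highest value in each cluster is kept. The peak ages and values are hypothetical; the arithmetic for the length of the historical period follows the dates given in the text.

```python
def decluster(peaks, window=10.0):
    """Group peaks dated within `window` years of the previous peak; keep the highest per group.

    peaks: iterable of (age_in_years, ln_zr_rb) tuples.
    """
    kept = []
    group = []
    for age, value in sorted(peaks):
        if group and age - group[-1][0] > window:
            kept.append(max(group, key=lambda p: p[1]))
            group = []
        group.append((age, value))
    if group:
        kept.append(max(group, key=lambda p: p[1]))
    return kept

# Hypothetical peaks from a single core: three closely dated peaks and one isolated peak
peaks = [(1000.0, 0.40), (1004.0, 0.52), (1009.0, 0.38), (1200.0, 0.45)]
events = decluster(peaks)
n_events = len(events)  # the three clustered peaks count as one flood event

# Length of the historical period: c. 1736 BC to AD 1951, following the text
h_years = 1736 + 1951
```

Changing the window alters the event count, which is one way to see why the study reports a maximum and a minimum composite record rather than a single value.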

Calculation of $X_0$
To establish the perception flow, $X_0$, for the composite sedimentological flood record, it was necessary to relate values of ln(Zr/Rb) to gauged flood flows at Montford, since flow cannot be directly estimated from ln(Zr/Rb). Originally, a threshold of ln(Zr/Rb) > 0.35 was used to identify flood peaks in the composite record. However, it was not possible to relate a value of 0.35 to flows at Montford, since there are no ln(Zr/Rb) peaks > 0.35 during the gauging period. To overcome this, peak flow data at Montford were compared to a lower ln(Zr/Rb) value of 0.2 in core 2 (chosen to obtain a manageable number of peaks to compare). The corresponding portion of core 1 had a cracked and uneven surface that produced invalid geochemical results and was therefore excluded from this analysis. Flows at Montford were compared to ln(Zr/Rb) values > 0.2 from core 2, but only four could be confidently linked due to dating uncertainty in the sedimentary record (these events occurred in 1960, 1964, 1965 and 1998 at Montford). Figure 7 shows that there is an apparent linear relationship between the magnitudes of the sedimentological flood data in core 2 and the peak discharges at Montford. However, inferences drawn from such a small dataset are highly uncertain. Notwithstanding this, the fitted linear regression line (R² = 0.92) shows that a ln(Zr/Rb) value of 0.35 corresponds with a discharge of 470 m³/s at Montford. Consequently, $X_0$ was set at 470 m³/s. In other words, all the events in the composite sedimentological records represent individual flood events with a flow greater than 470 m³/s, which effectively gives us $k$ exceedances above the perception flow $X_0$.
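The conversion from a proxy threshold to a discharge threshold can be sketched as a straight-line fit followed by a prediction at ln(Zr/Rb) = 0.35. The four calibration pairs below are synthetic and exactly linear; they are not the values behind figure 7 and were constructed only so that the line passes through 470 m³/s at the threshold. The sketch also flags that the prediction lies outside the calibrated range, which is the situation described in the text.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

# Synthetic calibration pairs: ln(Zr/Rb) peak value vs gauged peak flow (m3/s)
proxy = [0.21, 0.24, 0.27, 0.30]
flow = [330.0, 360.0, 390.0, 420.0]

slope, intercept = fit_line(proxy, flow)
proxy_threshold = 0.35
perception_flow = intercept + slope * proxy_threshold

# True when the threshold is beyond the largest calibrated proxy value
is_extrapolated = proxy_threshold > max(proxy)
print(f"perception flow = {perception_flow:.0f} m3/s, extrapolated: {is_extrapolated}")
```

With only four points, small changes in any one pair move the extrapolated threshold noticeably, which matters because the design estimates are reported to be sensitive to this threshold.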

ML estimation with sedimentary flood data
A flood frequency analysis was conducted that incorporated the composite sedimentological flood data (figure 6) at the Roundabout (for $k$ = 54 and $k$ = 45) and systematic AMAX flow data from Montford. The parameters of the GLO distribution were derived using the ML method; the resulting flood frequency curves and design estimates are presented in figure 8(a) and table 1 respectively. When compared to the results of the enhanced single-site analysis (using systematic data only), incorporation of the c. 3750 year sedimentological record results in lower flood estimates. For the 100 year return period, the incorporation of sedimentological flood data results in a 15% reduction of the peak flow estimate, from 571 m³/s to 485 m³/s (when $k$ = 54). For lower-order return periods, for example the 5 year event, design estimates were reduced by 3%. The sensitivity of the ML approach to the sedimentological inputs was investigated by varying $X_0$, $k$ and $h$. $X_0$ was varied by ±10% and ±20% (figure 8[b]) with $k$ = 54 flood events and $h$ = 3687 years. The method appears quite sensitive to the choice of $X_0$: when $X_0$ was varied by ±20% (compared to the original 470 m³/s), 100 year return period estimates varied by +22% and -19%. For the 5 year return period, the variation was between +4% and -13%. The sensitivity of the ML method to variation in $k$ and $h$ is much less pronounced. $k$ was varied by ±10% and ±20% for the composite record with most flood events ($k$ = 54), with $X_0$ = 470 m³/s and $h$ = 3687 years. Figure 8(c) shows very little difference in design estimates, which varied by only ±2% across return periods up to 200 years. Design estimates also appear quite insensitive to reasonable variation in $h$; $h$ was varied by ±100 years and ±200 years, with $X_0$ = 470 m³/s and $k$ = 54 (figure 8[d]). Design estimates varied by a maximum of ±0.6% across all return periods up to 200 years.
For comparison purposes, a flood frequency curve was also fitted to the systematic data alone, where the GLO parameters were estimated using the ML method. This curve and associated estimates are shown in figure 8 and table 1.

Systematic data
The magnitudes of design floods at Nant Cwm-du were initially estimated using the FEH statistical method. Since the catchment is ungauged, the value of QMED was calculated using FEH catchment descriptors extracted from the FEH Web Service (Centre for Ecology and Hydrology, 2017) and the regression equation outlined by the Environment Agency (2008). Standard methods recommend adjusting QMED by data transfer from five or more donor catchments (Kjeldsen et al., 2014a); however, it was judged that there were no suitable donors available for adjusting QMED due to the very small size (0.85 km²) of the catchment. QMED at Nant Cwm-du was estimated to be 1.676 m³/s. A flood frequency analysis was carried out and a GLO distribution fitted as the pooled growth curve. The pooling group consisted of 17 catchments with a total of 503 years of flood data. Selected plots of catchment descriptors for Nant Cwm-du and the 17 pooling group members are shown in figure 9. This shows that all sites in the pooling group have a larger catchment area than Nant Cwm-du (1.63 km² to 16.64 km²), but are very similar for some influential catchment descriptors such as the influence of upstream lakes and reservoirs, FARL (0.942 to 1), and the extent of urban land-use, URBEXT2000 (0 to 0.016), so the pooling group was judged as acceptable. Weighted L-moment ratios, L-CV (0.213) and L-skewness (0.219), were calculated for the pooling group according to standard methods (Institute of Hydrology, 1999; Environment Agency, 2008) to derive the GLO pooled growth curve. The flood frequency curve was then derived as the product of QMED and the pooled growth curve. 95% confidence intervals for design flows at ungauged sites were also calculated according to the method outlined in Kjeldsen (2015); these are shown alongside the flood frequency curve in figure 10 as dashed and solid black lines respectively.
The values of design flood estimates from a standard FEH statistical pooled analysis at Nant Cwm-du are shown in table 3, and provide the benchmark with which to compare estimates derived using sedimentological flood data.

Incorporation of sedimentological flood data
The second part of this analysis aims to derive design flow estimates using sedimentological flood data for the Nant Cwm-du catchment. Palaeoflow data used for flood frequency analysis on the Nant Cwm-du catchment were taken from a study by Foulds et al. (2014). The study reconstructed sedimentological flood discharges for 29 lichen-dated boulder berm units in the catchment. Each boulder berm unit was dated to ±5 years, meaning that if discrete units were dated within the same 10 year period then they may represent only a single flood event. The full record of 29 boulder berm units was re-examined by comparing potentially overlapping lichen dates with documentary evidence of flood events (from Foulds et al., 2014) to identify units that may represent duplicate flood events. This method suggested that 12 discrete flood events could be identified in the boulder berm record between 1886 and 2014, where the peak flow of each flood event was taken as the maximum reconstructed flow of the berms transported during the same flood. Figure 11 shows the time series of reconstructed flows for all 29 berms and the final 12 berms that were used in the flood frequency analysis below.

Palaeohydraulic reconstruction
Flood flow was reconstructed by measuring the B-axis of the five largest boulders present in each boulder berm unit and then applying a range of standard boulder transport equations to estimate flow (Costa, 1983; Carling, 1986). The Carling (1986) method was considered to be the most appropriate for reconstructing flows in upland UK catchments since it was derived from field data in a similar steep, upland catchment in northern England (Carling, 1983, 1986). The Carling method requires several parameters to be derived, including the Manning's 'n' value (Chow, 1959). Three different n values were chosen to reconstruct flood flows in the Nant Cwm-du catchment: 0.05 and 0.07, which represent the normal and maximum values for boulder streams (Chow, 1959), respectively, and a value derived using the method described by Jarrett (1992). The Jarrett method typically yields much higher n values than those suggested by Chow (1959), in this case between 0.13 and 0.15 (the n value is dependent on the characteristics of individual boulder berm units and so varies). The n values derived using the Jarrett method are comparable to other published values for upland UK streams (e.g. Johnson and Warburton, 2002) and are considered the best estimates for the Nant Cwm-du catchment berms. The best reconstructed flow estimates for the Nant Cwm-du catchment are therefore considered to be those derived using a combination of the Jarrett (1992) and Carling (1986) methods.
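To show why the choice of roughness matters, the sketch below evaluates Manning's equation for a fixed cross-section with the three n values discussed above (0.14 is used as a representative value within the 0.13 to 0.15 Jarrett range). It is not the Carling (1986) reconstruction itself, whose details are not given here; the channel geometry is hypothetical and serves only to isolate the effect of n.

```python
def manning_discharge(n, area_m2, hydraulic_radius_m, slope):
    """Manning's equation in SI units: Q = (1/n) * A * R^(2/3) * S^(1/2)."""
    return (1.0 / n) * area_m2 * hydraulic_radius_m ** (2.0 / 3.0) * slope ** 0.5

# Hypothetical steep upland channel geometry
geometry = dict(area_m2=2.0, hydraulic_radius_m=0.4, slope=0.1)

# Chow normal, Chow maximum, and a representative Jarrett-type value
flows = {n: manning_discharge(n, **geometry) for n in (0.05, 0.07, 0.14)}
ratio = flows[0.05] / flows[0.14]

for n, q in flows.items():
    print(f"n = {n:.2f}: Q = {q:.2f} m3/s")
```

For a fixed geometry the discharge is inversely proportional to n, so moving from n = 0.14 to n = 0.05 multiplies the reconstructed flow by 2.8. The higher Jarrett n values therefore give the lowest reconstructed flows.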

ML estimation with sedimentary flood data
The first method considered to incorporate the sedimentological flood data in flood estimation at Nant Cwm-du was the same as that used at the Roundabout site on the River Severn, where the ML approach was used to estimate the parameters of a GLO distribution in the presence of systematic flow data and historical and/or sedimentological evidence of past floods at the site of interest, as outlined in Eq. (7). Attempts were made to reduce the complexity of the optimisation problem by fixing the value of the shape parameter at the value obtained from the FEH pooled analysis ($\kappa$ = -0.22). However, no sensible set of parameters could be derived using this method and consequently this approach was discounted.

Least-square method with sedimentary flood data
The second method considered to incorporate the boulder berm data at Nant Cwm-du into flood frequency analysis was the least-square technique outlined earlier in the methodology section of this paper. Plotting positions for the 12 reconstructed flood events were calculated and graphed on a peak flow versus reduced variate diagram (figure 12[a], red circles). Initial parameters for the GLO distribution were taken from the results of the pooled flood frequency analysis described earlier, where L-CV = 0.213, L-skewness = 0.219 and QMED = 1.676 m³/s. These values were then used to calculate the equivalent points of the observed data (figure 12[a], blue circles, table 2[a]) assuming a GLO distribution. The GLO parameters were then numerically optimised using the least-square method by searching for a set of GLO parameters that minimised the squared difference between the observed sedimentological flood data (Eq. 10, and figure 12[b], red circles) and the equivalent points from the GLO distribution (figure 12[b], blue circles). After some initial testing, the shape parameter of the GLO distribution was fixed to the value derived from the original FEH pooled frequency analysis ($\kappa$ = -0.219, the negative of the pooled L-skewness). The location (QMED) and scale parameters were optimised in the final analysis shown in table 2(b). Varying the shape parameter produced unrealistic results which were heavily influenced by a lack of low order return period flood events in the boulder berm record. Using a fixed shape parameter derived from the standard FEH technique for ungauged catchments yielded much more sensible results and is considered a pragmatic solution which achieved a reasonable fit of the GLO distribution to the sedimentological flood data. This method yields an optimised set of GLO parameters (shown in table 2[b]) which give the best fit for a flood frequency curve (in terms of the least-squared difference between observed and modelled data) to the 12 sedimentological flood events.
The final flood frequency curve fitted to the sedimentological flood data is shown in figure 12(b).
Design estimates for a range of return periods at Nant Cwm-du are presented in figure 13 and table 3. Estimates have been calculated for the three palaeoflow estimates (derived using different Manning's 'n' values). In all cases the design estimates derived using reconstructed flows from at-site data at Nant Cwm-du significantly exceed those derived from a standard FEH pooled analysis. Taking the 100 year design flood as an example, even the lowest estimate from the sedimentological flood data (6.1 m³/s) is 35% higher than that derived from an ungauged FEH pooled analysis (4.5 m³/s). 100 year estimates derived from sedimentological flood data with lower Manning's 'n' values are 181% and 294% higher than a pooled analysis alone, although both are considered overestimates since the favoured flow reconstruction for Nant Cwm-du uses the Jarrett (1992) and Carling (1986) methods, which yield the 35% higher 100 year design estimate.
Version 5 - Revised manuscript, resubmitted to the Journal of Flood Risk Management on 12 March 2018.

Discussion
This study has shown that sedimentological flood data can be incorporated into flood frequency analysis in the UK using a range of techniques. There are however a number of issues and uncertainties that need to be considered if sedimentological flood data are more routinely used in practice.
Sensitivity analysis on the key parameters required for using the ML method outlined in Environment Agency (2017b) to incorporate sedimentological flood data in flood frequency estimation showed that the selection of perception threshold ( 0 ) can have a significant impact on design estimates. In the Roundabout site example 0 was based on a regression relationship derived using only four gauged AMAX events and ln(Zr/Rb) values. Clearly this may not be a robust relationship, and justifying adopting alternative design estimates based on this rather tenuous threshold would be unlikely in practice. The original study by Jones et al. (2012) did not set out to derive a perception threshold, but rather establish relative flood magnitude based on ln(Zr/Rb) ratios as a proxy for flood discharge. If flood chronologies derived from fine-grained sediment sequences are to be used in flood frequency analysis then future studies should place more emphasis on establishing robust relationships between events in the sedimentary record and estimates of peak flow. Ideally, peak flow estimates for each individual flood event identified in sedimentary records would be known, but as a minimum, the discharge threshold over which all floods in the sedimentary record exceed ( 0 ) should be established. A second issue which introduces uncertainty in the design estimates for Montford was the subjective approach adopted to creating a composite flood record from multiple cores. Multi-peak flood events in individual cores, and duplicate events in multiples cores did not allow an exact value for the number of flood events ( ) to be identified. However, the chosen value of does not appear to significantly influence final design estimates. Likewise, the time period represented by the sedimentological flood record (ℎ) does not appear to have a large impact on the final design estimates when considering dating control uncertainties up to ± 200 years. 
To encourage use of the ML method for incorporating sedimentological flood data into flood frequency analysis, future geomorphological studies should put more emphasis on establishing the three key parameters (the perception threshold, the number of sedimentological flood events and the record length h), paying particular attention to establishing a robust estimate of the perception threshold.
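The role of the three parameters can be made concrete with a sketch of a likelihood that combines gauged and sedimentological information. The form used here is an assumption, not a transcription of Environment Agency (2017b): gauged AMAX flows contribute GLO densities, while the sedimentological record contributes a binomial term stating that k floods exceeded the perception threshold in h years. The names `x0`, `k` and `h`, the GLO parameterisation (location `xi`, scale `alpha`, shape `kappa`, non-zero) and all numerical values are illustrative.

```python
import math


def _log1p_exp_neg(y):
    """Numerically stable log(1 + exp(-y))."""
    return max(-y, 0.0) + math.log1p(math.exp(-abs(y)))


def _reduced_variate(x, xi, alpha, kappa):
    """y = -ln(1 - kappa*(x - xi)/alpha)/kappa, or None outside the support."""
    z = 1.0 - kappa * (x - xi) / alpha
    if z <= 0.0:
        return None
    return -math.log(z) / kappa


def glo_cdf(x, xi, alpha, kappa):
    """GLO non-exceedance probability F(x)."""
    y = _reduced_variate(x, xi, alpha, kappa)
    if y is None:
        # kappa > 0: x is above the upper bound; kappa < 0: below the lower bound.
        return 1.0 if kappa > 0.0 else 0.0
    return math.exp(-_log1p_exp_neg(y))


def glo_logpdf(x, xi, alpha, kappa):
    """Log of the GLO density f(x)."""
    y = _reduced_variate(x, xi, alpha, kappa)
    if y is None:
        return -math.inf
    return -math.log(alpha) - (1.0 - kappa) * y - 2.0 * _log1p_exp_neg(y)


def neg_log_likelihood(params, amax, x0, k, h):
    """Negative log-likelihood for gauged AMAX plus k exceedances of x0 in h years.

    The binomial coefficient is omitted because it does not depend on params.
    """
    xi, alpha, kappa = params
    if alpha <= 0.0 or kappa == 0.0:
        return math.inf
    log_lik = sum(glo_logpdf(q, xi, alpha, kappa) for q in amax)
    f0 = glo_cdf(x0, xi, alpha, kappa)
    if not 0.0 < f0 < 1.0:
        return math.inf
    # (h - k) years stayed below the threshold; k years exceeded it.
    log_lik += (h - k) * math.log(f0) + k * math.log1p(-f0)
    return -log_lik


# HYPOTHETICAL inputs for illustration only; not data from the study.
amax = [210.0, 250.0, 190.0, 320.0, 275.0, 230.0, 300.0, 260.0, 215.0, 340.0]
params = (250.0, 35.0, -0.1)
for x0 in (350.0, 400.0, 450.0):
    print(x0, neg_log_likelihood(params, amax, x0, k=10, h=150))
```

In practice a numerical optimiser would minimise this function over the three GLO parameters. Repeating the fit for a range of `x0`, `k` and `h` values is one way to reproduce the kind of sensitivity analysis described above.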
Version 5 - Revised manuscript, resubmitted to the Journal of Flood Risk Management on 12 March 2018.
The incorporation of historical data in flood frequency estimation at ungauged sites does not appear to have been studied directly in the UK. Consequently, the analysis of the sedimentological flood data from the Nant Cwm-du catchment presented in this paper outlines a simple technique that is an adaptation of a traditional pooled flood frequency analysis. This least square method appears to produce credible design estimates when analysing sedimentological flood data alone. However, there are several sources of unquantified uncertainty that need to be considered. The palaeohydraulic flow reconstruction undertaken by Foulds et al. (2014) yielded three flow estimates for the flood events that formed boulder berms in the Nant Cwm-du catchment. The three markedly differing flood frequency curves shown in Figure 13 highlight the sensitivity of design estimates to the peak flows ascribed to each boulder berm unit. Even if we take the preferred flood frequency curve derived using the Jarrett (1992) and Carling (1986) methods, the effect of including the sedimentological flood data on the uncertainty of the estimated flood frequency curve is difficult to define. It provides an alternative, higher estimate of design floods, but it is difficult to identify any evidence that this alternative is more certain. It is also worth noting that the Carling (1986) method is based on limited field data, and improving the evidence base behind palaeohydraulic reconstruction techniques in the UK should be considered a key area of future research if boulder berm data are to be used in flood frequency analysis. A second area of uncertainty to consider with respect to the boulder berm units at Nant Cwm-du is the likelihood of multiple boulder berms being deposited during a single event. This was addressed in this study by examining the accuracy of lichen dating, cross-referencing with documentary records and removing duplicate events. It may also be beneficial for researchers to investigate more advanced techniques for flood estimation in ungauged catchments in the UK where only documentary or sedimentological flood data are available. One approach may be to adapt the ML method outlined in Environment Agency (2017b) so that flood estimates can be derived using only the part of the likelihood function that describes historical data.
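The least square adjustment of GLO parameters discussed above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure used in this study: the reconstructed flows are treated as the k largest floods in h years, Gringorten plotting positions are assumed, the parameters are refined by a simple compass search, and every numerical value (flows, record length, starting parameters) is hypothetical.

```python
import math


def glo_quantile(f, xi, alpha, kappa):
    """GLO flow with non-exceedance probability f (kappa != 0)."""
    return xi + (alpha / kappa) * (1.0 - ((1.0 - f) / f) ** kappa)


def gringorten_nonexceedance(k, h):
    """Non-exceedance probabilities of the k largest floods in h years, largest first."""
    return [1.0 - (rank - 0.44) / (h + 0.12) for rank in range(1, k + 1)]


def sum_squared_error(params, flows_desc, probs):
    """Squared misfit between reconstructed flows and GLO quantiles."""
    xi, alpha, kappa = params
    if alpha <= 0.0 or kappa == 0.0 or abs(kappa) >= 1.0:
        return math.inf  # keep the search inside a plausible parameter range
    total = 0.0
    for q, f in zip(flows_desc, probs):
        resid = q - glo_quantile(f, xi, alpha, kappa)
        total += resid * resid
    return total


def compass_search(objective, start, steps, tol=1e-6, max_iter=5000):
    """Derivative-free search: try +/- steps per parameter, halve steps when stuck."""
    x, fx, steps = list(start), objective(start), list(steps)
    for _ in range(max_iter):
        improved = False
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[i] += sign * steps[i]
                f_trial = objective(trial)
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
        if not improved:
            steps = [s / 2.0 for s in steps]
            if max(steps) < tol:
                break
    return x, fx


# HYPOTHETICAL inputs for illustration only; not data from the study.
flows = sorted([95.0, 80.0, 72.0, 66.0, 60.0, 55.0,
                51.0, 48.0, 45.0, 42.0, 40.0, 38.0], reverse=True)  # m3/s
record_years = 200
probs = gringorten_nonexceedance(len(flows), record_years)
start = [12.0, 3.0, -0.25]  # stand-in for parameters from a pooled analysis


def objective(p):
    return sum_squared_error(p, flows, probs)


start_sse = objective(start)
fitted, fitted_sse = compass_search(objective, start, steps=[1.0, 1.0, 0.05])
print("start SSE:", start_sse, "fitted SSE:", fitted_sse, "parameters:", fitted)
```

Rerunning the fit with each alternative set of reconstructed flows would show how strongly the fitted curve depends on the palaeohydraulic method, which is the sensitivity highlighted for the boulder berm estimates.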
FEH methods assume stationarity, whereby the statistical properties of flood-generating processes are assumed not to have changed over time. However, hydrological non-stationarity has been noted by several authors (e.g. Strupczewski et al., 2001; Prosdocimi et al., 2014), and geomorphological studies in both the UK (Macklin et al., 1992; Rumsby and Macklin, 1994; Macklin and Rumsby, 2007; Foulds and Macklin, 2016) and internationally (see recent review by Naylor et al., 2017) have long recognised non-stationarity in long-term flood series related to anthropogenic influence and climatic variability. We recommend that future work relating to the incorporation of sedimentological flood data into flood frequency analysis should consider non-stationarity in more detail, building on both recent hydrological (Salas and Obeysekera, 2014; Prosdocimi et al., 2014; Serinaldi and Kilsby, 2015) and geomorphologically based (Toonen et al., 2017) studies.
Finally, we note that geomorphological understanding of floods has a much broader potential role to play in understanding and communicating flood risk than merely nudging a flood frequency curve up or down. In discussing this role, Baker (1994, 153) refers to the "vibrant understanding of real floods as explored by those who interpret the signs of those floods". Some types of sedimentological information can vividly demonstrate the height, extent and power of past floods in a way that effectively convinces stakeholders of the dangers that future floods may pose. Baker (1994, 153) also claims, perhaps provocatively, that "the reality of geomorphological knowledge far outweighs any uncertainty, especially in comparison to the artificial knowledge most often conveyed by conventional hydrology". In our opinion a more objective comparison of the relative uncertainties is needed, and this should be prioritised in future work.

Conclusions
This study has demonstrated that sedimentological flood data from both upland and lowland river environments can be incorporated in UK flood frequency analysis. Sedimentological flood data have the advantage over short systematic records of river flow in that they can provide long-term evidence of flooding at the site of interest. Consequently, flood estimates are site-specific rather than being based on statistical relationships with hydrologically similar, but often remote, catchments. Future work should focus on quantifying the uncertainties associated with flood data derived from flood-sediment archives, and the impact of the incorporation of sedimentological flood data on design estimate uncertainty. Future geomorphological studies should also place more emphasis on developing robust estimates of palaeodischarge, taking account of non-stationarity resulting from long-term channel and floodplain dynamics (Lewin and Macklin, 2010). Once these uncertainties in palaeodischarge estimates have been quantified, the incorporation of sedimentological flood data in UK flood frequency analyses has the potential to greatly improve current and future design flow estimates.

Figure: Regression relationship between palaeoflood magnitude (represented by ln(Zr/Rb)) at the Roundabout and peak discharge gauged at Montford Bridge (NRFA station number 54005). Four ln(Zr/Rb) values could be confidently linked to gauged flood events. These events occurred in 1960, 1964, 1965 and 1998.
Figure 8: Flood frequency curves for the River Severn at Montford. (a) Comparison of flood frequency curves derived using systematic data (enhanced single site with GLO distribution using L-moments and single site with GLO distribution using ML) and a combination of systematic and palaeoflood data (GLO distribution, maximum likelihood method). (b) Sensitivity of design estimates to variation in the perception threshold. (c) Sensitivity of design estimates to variation in the number of sedimentological flood events. (d) Sensitivity of design estimates to variation in the length of the sedimentological flood record h.
Figure 9: Diagnostic plots of catchment descriptors for pooled flood frequency analysis at Nant Cwm-du. Vertical bars represent the distribution and relative frequency of 558 catchments classified by FEH as 'suitable for pooling'. Black circles show catchment descriptor values of pooling group members relative to all 558 catchments available for pooling. The black cross shows the catchment descriptor values for the Nant Cwm-du catchment relative to all 558 catchments available for pooling. AREA = catchment drainage area, SAAR = standard average annual rainfall, BFIHOST = baseflow index, FARL = index of flood attenuation due to reservoirs and lakes, PROPWET = proportion of time when soil moisture deficit ≤ 6 mm during 1961-90, URBEXT2000 = extent of urban and suburban land cover in 2000 expressed as a fraction.
Figure 10: Flood frequency curves for the Nant Cwm-du catchment derived using systematic data (pooled flood frequency analysis fitting a GLO distribution using L-moments), showing 95% confidence limits.
Figure 11: Time series of reconstructed flow records from hydraulic analysis of boulder berm deposits using equations in Carling (1986) and Jarrett (1992) for Nant Cwm-du (after Foulds et al., 2014). (a) Full record of 29 boulder berm units. (b) Final record of 12 boulder berm units used in flood frequency estimation after duplicate events were identified and removed from the record.
Figure 12: Optimisation of the GLO parameters using the least square method at Nant Cwm-du. (a) Flood frequency curve prior to optimisation (using parameters derived from a traditional pooled flood frequency analysis). (b) Flood frequency curve after optimisation using the least square method. Example shown for observed flow estimates derived using Jarrett (1992) and Carling (1986).
Figure 13: Final flood frequency curves for the Nant Cwm-du catchment. Comparison of flood frequency curves derived using systematic data from a pooled flood frequency analysis (GLO distribution derived using L-moment ratios shown with black lines) and estimates derived using sedimentological flood data reconstructed using a range of Manning's 'n' values. Grey shaded area represents the range of estimates derived from sedimentological flood data for the range of Manning's 'n' values.