Groundwater - 2023 - Sun

Research Paper

Evaluation of Hydraulic Conductivity Estimates from Various

Approaches with Groundwater Flow Models

Dongwei Sun
Department of Earth and Environmental Sciences, University of Waterloo, Waterloo, ON,
N2L3G1, Canada.
d34sun@uwaterloo.ca
Ning Luo
Department of Earth and Environmental Sciences, University of Waterloo, Waterloo, ON,
N2L3G1, Canada.
n2luo@uwaterloo.ca
Aaron Vandenhoff
Department of Earth and Environmental Sciences, University of Waterloo, Waterloo, ON,
N2L3G1, Canada.
aaron.vandenhoff@uwaterloo.ca
Wesley McCall
Geoprobe Systems Inc., 1835 Wall St., Salina, KS 67401, USA.
McCallw@geoprobe.com
Zhanfeng Zhao
Key Laboratory of Water Cycle and Related Land Surface Processes, Institute of Geographic
Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China.
zhaozhanfeng@igsnrr.ac.cn
Chenxi Wang
Department of Earth and Environmental Sciences, University of Waterloo, Waterloo, ON,
N2L3G1, Canada.
c592wang@uwaterloo.ca
David L. Rudolph
Department of Earth and Environmental Sciences, University of Waterloo, Waterloo, ON,
N2L3G1, Canada.
drudolph@uwaterloo.ca
Walter A. Illman
Corresponding author: Department of Earth and Environmental Sciences, University of Waterloo,
Waterloo, ON, N2L 3G1, Canada. 519-888-4567
willman@uwaterloo.ca
Conflict of Interest: None

This article has been accepted for publication and undergone full peer review but has not been
through the copyediting, typesetting, pagination and proofreading process which may lead to
differences between this version and the Version of Record. Please cite this article as doi:
10.1111/gwat.13348
This article is protected by copyright. All rights reserved.
17456584, ja, Downloaded from https://ngwa.onlinelibrary.wiley.com/doi/10.1111/gwat.13348 by Cochrane Oman, Wiley Online Library on [31/08/2023]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License

Key Words: Hydraulic conductivity, specific storage, connectivity, grain size analysis,

permeameter test, slug test, direct push, hydraulic profiling tool, inverse modeling, hydraulic

tomography, geological model, groundwater model.

Article impact statement: Evaluating K from various approaches showed that inverse modeling

and data fusion are necessary steps in building robust groundwater models.
Abstract

Significant efforts have been expended for improved characterization of hydraulic

conductivity (K) and specific storage (Ss) to better understand groundwater flow and contaminant

transport processes. Conventional methods including grain size analyses (GSA), permeameter,

slug and pumping tests have been utilized extensively, while Direct Push-based Hydraulic

Profiling Tool (HPT) surveys have been developed to obtain high-resolution K estimates.

Moreover, inverse modeling approaches based on geology-based zonations, and highly

parameterized Hydraulic Tomography (HT) have also been advanced to map spatial variations of

K and Ss between and beyond boreholes. While different methods are available, it is unclear

which one yields K estimates that are most useful for high resolution predictions of groundwater

flow. Therefore, the main objective of this study is to evaluate various K estimates at a highly

heterogeneous field site obtained with three categories of characterization techniques including:

(1) conventional methods (GSA, permeameter and slug tests); (2) HPT surveys; and (3) inverse

modeling based on geology-based zonations and highly parameterized approaches. The

performance of each approach is first qualitatively analyzed by comparing K estimates to site

geology. Then, steady-state and transient groundwater flow models are employed to

quantitatively assess various K estimates by simulating pumping tests not used for parameter

estimation. Results reveal that inverse modeling approaches yield the best drawdown predictions

under both steady and transient conditions. In contrast, conventional methods and HPT surveys

yield biased predictions. Based on our research, it appears that inverse modeling and data fusion

are necessary steps in predicting accurate groundwater flow behavior.


Introduction

Significant research has been conducted over the last several decades to better understand

groundwater flow and contaminant transport processes. Groundwater flow patterns, contaminant

transport and their subsurface distributions have been found to be primarily governed by the spatial

distribution of hydraulic conductivity (K) and specific storage (Ss), while the accurate delineation

of such parameters is very difficult in complex groundwater flow systems due to high degrees of

geological heterogeneity. Inaccurate hydraulic parameter estimates will lead to poor groundwater

flow and solute transport predictions. In addition, as clearly demonstrated by Rehfeldt et al.

(1992) and Yeh et al. (1995), to accurately forecast the migration of a tracer plume, the number of

K estimates required to adequately capture the heterogeneity significantly increases for a site with

high degrees of geological variability. The large number of K measurements required to capture

heterogeneity presents significant challenges to implementing conventional site characterization

techniques.

Conventional methods such as empirical-relation-based grain size analyses (GSA),

laboratory permeameter analyses of core samples, slug and pumping tests have been used in water-

supply investigations for several decades. However, most of them are not capable of providing

reliable and sufficient information on local heterogeneity efficiently (Butler, 2005; Alexander et

al., 2011). For example, laboratory analyses of core samples, such as GSA and permeameter tests,

can provide small-scale estimates of K at sampling locations. However, they are usually time-

consuming, suffer from low sample recovery rates for coarse-grained materials, and are subject to

potential errors from the use of repacked samples in laboratory experiments, whose conditions

may deviate significantly from in situ conditions (Klute and Dirksen, 1986;
White, 1988). Moreover, information on K variability between boreholes cannot be delineated

without interpolating point-scale measurements.

Single well response tests or slug tests are usually conducted to provide point-scale K and Ss

estimates of materials representing a small volume surrounding the screened interval. While these

estimates are useful, they may not be representative of large-scale groundwater flow and solute

transport behavior. Also, considerable care must be taken as conditions at and near the well will

have significant impacts on K and Ss estimates (Beckie and Harvey, 2002; Butler, 2019). In

addition, using a solution (e.g., Hvorslev, 1951) that ignores inertial mechanisms can lead to a

significant overestimation of K (Butler et al. 2003). Therefore, appropriate slug test models should

be selected and used for data analysis to minimize interpretation errors.

To obtain larger-scale estimates of K and Ss representative of large-scale groundwater flow

and solute transport behavior at a site, pumping or injection tests with observation wells are

conducted. Various analytical solutions are available that can be used to obtain large-scale

estimates of K and Ss (e.g., Theis, 1935; Cooper and Jacob, 1946). In addition, some solutions yield

important insights on flow geometry and anisotropy in K (e.g., Neuman et al., 1984; Hsieh and

Neuman 1985). However, estimates of K and Ss from the traditional interpretation of

pumping/injection tests are averaged parameters over large volumes that are frequently impacted

by the scale effect (Clauser, 1992; Rovey and Cherkauer, 1995; Butler and Healey, 1998;

Vesselinov et al. 2001; Illman, 2006). Moreover, Wu et al. (2005) demonstrated that the estimated

K and Ss values from type curve and straight-line analyses of pumping tests in heterogeneous

aquifers are highly dependent on pumping and monitoring locations. Therefore, while the

estimates are useful for various applications, it is unclear what these parameters mean and how
useful they are in groundwater models. This is one important reason why novel approaches are

necessary for higher resolution subsurface characterization of K and Ss heterogeneity.
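
As a minimal illustration of the straight-line analysis mentioned above, the following Python sketch applies the Cooper and Jacob (1946) approximation to late-time drawdown at a single observation well. The function name, variable names, and synthetic input values are assumptions made for illustration only; they are not NCRS data, and the sketch assumes a homogeneous, confined aquifer.

```python
import math


def cooper_jacob_fit(times_s, drawdowns_m, q_m3_s, r_m):
    """Estimate transmissivity T [m^2/s] and storativity S [-] from late-time
    drawdown using the Cooper-Jacob straight line s = a + b*log10(t)."""
    x = [math.log10(t) for t in times_s]
    n = len(x)
    x_mean = sum(x) / n
    s_mean = sum(drawdowns_m) / n
    # Least-squares slope b (drawdown per log cycle of time) and intercept a
    b = sum((xi - x_mean) * (si - s_mean) for xi, si in zip(x, drawdowns_m)) / sum(
        (xi - x_mean) ** 2 for xi in x
    )
    a = s_mean - b * x_mean
    transmissivity = math.log(10.0) * q_m3_s / (4.0 * math.pi * b)
    t0 = 10.0 ** (-a / b)  # time at which the fitted line reaches zero drawdown
    storativity = 2.25 * transmissivity * t0 / r_m**2
    return transmissivity, storativity


# Synthetic late-time data generated from assumed "true" values (illustrative only)
T_TRUE, S_TRUE, Q, R = 1.0e-3, 1.0e-4, 1.0e-3, 5.0
times = [1.0e3, 3.0e3, 1.0e4, 3.0e4, 1.0e5]
drawdowns = [
    Q / (4.0 * math.pi * T_TRUE) * math.log(2.25 * T_TRUE * t / (R**2 * S_TRUE))
    for t in times
]
T_est, S_est = cooper_jacob_fit(times, drawdowns, Q, R)
print(T_est, S_est)
```

Dividing the fitted transmissivity by the aquifer thickness gives a single effective K. In a heterogeneous aquifer, repeating the fit for different pumping and observation wells would generally return different values, which is the location dependence noted above.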

To better capture subsurface heterogeneity, significant efforts have been expended to map

the spatial distribution of K. One such example is the invention of various direct push (DP)

methods over the last three decades. These include DP slug test (DPST), DP permeameter (DPP),

DP injection logging (DPIL), and hydraulic profiling tool (HPT) that have been developed as

efficient alternatives to conventional well-based approaches for providing high-resolution vertical

profiles of K variability in shallow, unconsolidated aquifers (Hinsby et al., 1992; Butler et al., 2007;

Dietrich et al., 2008; McCall and Christy, 2010; Geoprobe, 2015). Specifically, the HPT can

rapidly obtain high-resolution (~1.5 cm) K profiles based on the ratio of water injection rate to

corrected down-hole water pressure measured in situ (McCall and Christy, 2020).

Most approaches described above can only provide K variations in the immediate vicinity of

a well or DP location, while reliable information away from or between boreholes is difficult to

obtain. As a result, inverse modeling methods (e.g., Poeter and Hill, 1997; Carrera et al., 2005)

have been developed as an alternative approach to estimate K and Ss by calibrating a groundwater

flow model consisting of geological zonations with ambient or anthropogenically modified

hydraulic head fields. Calibration of groundwater models through trial-and-error or with assistance

from nonlinear regression tools such as UCODE (Poeter and Hill, 1998) and PEST (Doherty, 2015)

can produce representative values of K and Ss if the zonation is accurate (Zhao et al., 2016; Tong

et al., 2021) and if many observed heads are available to derive statistically representative values

for each zone (Yeh et al., 2015). However, when the geological models are inaccurate, structural

noise is introduced (Doherty and Welter, 2010) and parameter estimates from inverse models can

be unrealistic with wide confidence intervals (e.g., Zhao et al., 2016; Luo et al., 2017).
More recently, hydraulic tomography (HT) has been developed as a new site characterization

approach to yield high resolution K and Ss estimates from deterministic or geostatistical inverse

modeling of multiple pumping tests. Specifically, HT uses the same equipment as traditional

pumping or injection tests and collects drawdown/buildup-time datasets at several surrounding

observation wells from tests conducted at different wells. The drawdown/buildup-time dataset

through a single test and the corresponding interpretation with an appropriate inverse model yields

a snapshot of K and Ss heterogeneity. Repeating these tests at different locations and their

interpretation yields many snapshots of K and Ss heterogeneity through multiple tests. However,

quantitative synthesis of these images to accurate K and Ss values requires advanced inverse

modeling techniques. Over the last two decades, HT has been tested through a number of synthetic

(e.g., Yeh and Liu, 2000; Bohling et al., 2002; Xiang et al., 2009; Zhu and Yeh, 2005; Hu et al.,

2011), laboratory (e.g., Liu et al., 2007; Berg and Illman, 2011a; Zhao et al., 2015, 2022; Luo et

al., 2017; Jiang et al., 2021), and field studies (Bohling et al., 2007; Straface et al., 2007; Illman et

al., 2009; Berg and Illman, 2011b; Huang et al., 2011; Castagna et al., 2011; Brauchler et al., 2013;

Cardiff et al., 2013; Zhao and Illman, 2018, 2022a; Fischer et al. 2018; Zha et al., 2016, 2019;

Tiedeman and Barrash, 2020; Luo et al., 2022; Ning et al., 2023; Zhao et al., 2023). HT data from

pumping or injection tests can be inverted sequentially or simultaneously, while treating the

medium to be homogeneous, consisting of geology-based zonations, or highly parameterized

(Illman et al., 2015). Steady-state hydraulic tomography (SSHT) can provide K estimates, while

transient hydraulic tomography (THT) can provide both K and Ss estimates. When pumping and

monitoring locations are sparse, HT yields smooth K and Ss distributions (Illman et al., 2009; Berg

and Illman, 2011b; Cardiff et al., 2013) that could also benefit from regularization of the inverse

problem (Doherty, 2015). For example, the integration of accurate geological information into HT
has shown that salient inter- and intra-layer heterogeneities of K can be imaged effectively (Zhao

et al., 2016; Luo et al., 2017) for both aquifer and aquitard units (Zhao and Illman, 2018).

Based on diverse data collection and interpretation approaches that have been developed, we

can classify the aforementioned approaches into three categories of site characterization

methodologies. The first category consists of conventional methods including GSA, permeameter,

slug and pumping tests. The second category includes DP approaches with various tools such as

DPST, DPP, DPIL, and HPT. The third category consists of inverse modeling methods with

various degrees of model parameterization ranging from geological zonations to a highly

parameterized geostatistics-based HT approach.

A question frequently encountered by hydrogeologists is which approach should be adopted

to obtain K estimates at a given site for groundwater modeling. A significant amount of research

has been conducted to examine the effectiveness of different approaches (Butler, 2005; Chapuis et

al., 2005; Butler et al., 2007; Alexander et al., 2011; Vienken and Dietrich, 2011; Liu et al., 2012;

Brauchler et al., 2013; Rosas et al., 2014; Zhao and Illman, 2018). For example, Vienken and

Dietrich (2011) utilized various empirical formulae to analyze grain size sieve results revealing

that mean K values varied by several orders of magnitude among the formulae. Alexander et al.

(2011) compared several conventional methods including GSA, permeameter, slug and pumping

tests showing that K estimates varied significantly from one method to another. Liu et al. (2012)

assessed multiple DP approaches including DPST, DPP, and DPIL. Zhao and Illman (2018)

evaluated inverse models built with different conceptualizations including effective parameters,

geology-based zonations, and a highly parameterized geostatistics-based HT approach.

Thus far, only a few studies have compared approaches of different categories. Butler et al.

(2007) assessed the first two categories including GSA and DP methods based on DPST and DPP.
Brauchler et al. (2013) compared the last two categories including DPIL and HT. However, there

is no consensus on which approach yields K estimates that are most representative of a field site

and useful for groundwater flow modeling.

The main objective of this study is to evaluate various K estimates from three categories of

site characterization methods at the well-studied North Campus Research Site (NCRS). The NCRS

is located on the University of Waterloo campus in Waterloo, Ontario, Canada (Alexander et al.,

2011), which is underlain by a multiple aquifer-aquitard system consisting of highly heterogeneous

glaciofluvial deposits. We choose what we believe are the most widely utilized site

characterization methods for K heterogeneity. Approaches evaluated include: (1) conventional

methods [Case 1a: GSA; Case 1b: Permeameter Tests; Case 1c: Slug Tests]; (2) HPT surveys with

three different formulae [Case 2a: McCall and Christy (2010); Case 2b: Borden et al. (2021); Case

2c: Zhao and Illman (2022b)]; and (3) various inverse modeling approaches [Case 3a: PEST

Calibrated Geological Model; Case 3b: Averaged THT Geological Model; Case 3c: Highly

Parameterized THT Model]. It is crucial to understand that each approach differs in terms of the

scale and resolution at which heterogeneity is captured as well as the types and quantity of data

that they rely on. Therefore, the performance of each approach is first qualitatively analyzed by

comparing K estimates to site geology. Then, we quantitatively evaluate the K estimates through

the independent prediction of pumping tests or other drawdown-inducing events that have not been

used during model calibrations as advocated by Illman et al. (2007) and Liu et al. (2007).

Specifically, a three-dimensional (3-D) forward groundwater model is developed using

HydroGeoSphere (HGS) (Aquanty, 2019) for forward simulations of steady-state drawdown data

from seven independent pumping tests that are not used for K estimation by any method evaluated

in this study. Then, transient forward runs are performed for simulations of transient drawdown
data from the same pumping tests. Methods yielding K estimates that result in the smallest

discrepancies between simulated and observed drawdowns are considered the most reliable for the

NCRS.
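
The comparison of simulated and observed drawdowns described above can be sketched as follows. The specific metrics used here (mean absolute error, mean squared error, and the slope, intercept, and R² of a line fitted to a scatterplot of simulated versus observed values) are assumptions chosen for illustration; the text above states only that the smallest discrepancies identify the most reliable method. The function name and sample numbers are hypothetical.

```python
def drawdown_misfit(simulated, observed):
    """Summarize the discrepancy between simulated and observed drawdowns [m]."""
    n = len(observed)
    if n < 2 or n != len(simulated):
        raise ValueError("need two or more paired simulated/observed values")
    residuals = [s - o for s, o in zip(simulated, observed)]
    mean_abs_error = sum(abs(e) for e in residuals) / n
    mean_sq_error = sum(e * e for e in residuals) / n

    # Least-squares line: simulated = slope * observed + intercept
    o_mean = sum(observed) / n
    s_mean = sum(simulated) / n
    sxx = sum((o - o_mean) ** 2 for o in observed)
    syy = sum((s - s_mean) ** 2 for s in simulated)
    sxy = sum((o - o_mean) * (s - s_mean) for o, s in zip(observed, simulated))
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("values must not all be identical")
    slope = sxy / sxx
    intercept = s_mean - slope * o_mean
    r_squared = sxy * sxy / (sxx * syy)
    return {
        "mean_abs_error": mean_abs_error,
        "mean_sq_error": mean_sq_error,
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
    }


# Hypothetical drawdowns (m): the simulation is offset by 0.1 m everywhere
metrics = drawdown_misfit([0.2, 0.3, 0.5], [0.1, 0.2, 0.4])
print(metrics)
```

A slope near one with an intercept near zero would indicate unbiased predictions, whereas a systematic offset such as the one in this example shows up in the intercept and the mean absolute error even when R² is high.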

Description of Field Site and Data Used for Analyses

Site Description and Hydrogeology

The shallow subsurface beneath the NCRS is comprised of the Waterloo Moraine, which is

a highly heterogeneous mixture of glaciofluvial deposits. Deposits around and below the surface

are mostly an outcome of advances and retreats of the Laurentide ice sheet lobes during glaciations.

Tills covering and concealing the bedrock are laid down directly by the ice, mixing all sizes of

materials from clay to boulders (Karrow, 1993).

Karrow (1979) drilled a 50-meter-long borehole to obtain a continuous core sampling of the

materials down to the bedrock. According to the drilling report, below the top organic soil is a thin

silt layer, followed by the Tavistock till which is composed of sandy-to-clayey silt, but only exists

as erosional remnants. This till is underlain by a three-meter-thick sand sequence, followed by the

silty clay Maryhill till and the dense Catfish Creek till, which consist of silty sand and stony silt.

The Catfish Creek till extends approximately 20 meters below the ground surface and has been

treated to be the lower hydraulic barrier of the NCRS (Alexander et al. 2011). Subsequent work

by Sebol (2000) and Alexander et al. (2011) revealed that the primary characteristic of the site is

the alternating and interfingering multi-aquifer-aquitard system consisting of two high-K units

separated by a discontinuous low-K layer. The lower aquifer consists of sandy gravel, while the

upper aquifer is comprised of sand to sandy silt. Hydraulic connections are known to be provided

by the low K layer in between, and the aquifer is semi-confined. Aquitards are also found above
and below the two aquifers. Local stratigraphy is discontinuous with the presence of stratigraphic

windows, rendering the site highly heterogeneous.

Available Field Data and the 19-layer Geological Model

The schematic configuration of wells at the NCRS in plan view is shown in Figure 1a. The

blue dashed box represents a nine-well pumping and observation network. Initially, Alexander et

al. (2011) installed four continuous multichannel tubing wells (CMT1 – CMT4), each with seven

observation ports, and a pumping well (PW1) screened at eight different elevations (i.e., PW1-1 ~

PW1-8) (Figure 1b). Continuous sediment core samples were collected with recovery rates ranging

from 69% to 83% during well installations. Sample recovery was good, but they reported the

presence of periodic gaps in profiles that corresponded with less consolidated aquifer units

(Alexander et al., 2011).

To provide a comprehensive K profile along each borehole, 270 GSA and 471 falling head

permeameter tests were initially carried out using core samples from CMT1 – CMT4 and PW1 by

Alexander et al. (2011). Twenty-eight slug tests were also performed, covering the seven monitoring

ports of each of the four CMT systems. Later, two multi-screened wells (PW3, PW5) and two well clusters (PW2, PW4)

were installed and described by Berg and Illman (2011b). Fifteen additional slug tests were

performed at various intervals of PW1, PW3 and PW5 by Xie (2015) and interpreted using various

analytical models (Hvorslev, 1951; Bouwer and Rice, 1976; Hyder et al., 1994). Nine pumping

tests at PW1-3, PW1-4, PW1-5, PW3-3, PW3-4, PW4-3, PW5-3, PW5-4, and PW5-5 were

conducted mainly within aquifer layers during a HT survey by Berg and Illman (2011b). Zhao and

Illman (2018) then conducted six additional pumping/injection tests at PW1-1, PW1-6, PW1-7,

PW2-3, PW3-1, and PW5-1 with longer durations to stress both aquifer and aquitard units.
An additional 171 permeameter tests were performed by Zhao and Illman (2017) with core

samples collected from PW2, PW3, PW4, and PW5 wells.

To date, a total of 270 GSA, 642 permeameter analyses of core samples, 43 slug tests, and

15 pumping and injection tests have been performed within the CMT and PW system. Moreover,

geophysical surveys were also performed at the NCRS, with Geoprobe DP surveys first conducted

in April of 2015 at eight locations to obtain electrical conductivity (EC) profiles (Williamson,

2016). During the summer of 2019, Sun et al. (2022) carried out HPT surveys at 11 DP locations

(HPT1 – HPT10 and HPT6-2). Figure 1b is the 3-D perspective view of wells and DP locations

(HPT1 ~ HPT6-2) within and around the 15 m × 15 m well clustering area, along with illustrations

of pumping and observation locations, bentonite sealings and high-resolution HPT survey intervals.

Figure 2 is the cross-sectional view (orientations of cross sections are indicated on Figure

1a) of the 3-D geological zonation model created by Zhao and Illman (2017) for the NCRS,

containing 19 different layers representing seven different material types. The model was

constructed by examining lithology information obtained from 18 boreholes completed to different

depths at the site. Further details on the construction of the geological model are provided in Zhao

and Illman (2017).

The geological model is 70 m × 70 m × 17 m in extent and is constructed with the

commercial software Leapfrog Geo (ARANZ Geo. Limited, 2015), which interpolates various data

types to quickly construct geological models. Locations of the CMT and PW wells and screened

intervals are shown in the C-C’ and D-D’ cross sections in Figure 2, and A-A’ and B-B’ cross

sections in Figure S1 of the Supporting Information (SI) section. The interpolated geology between

boreholes based on known lithology is a reasonable representation of the site. The complex and
truncated layering of different sediment types indicates the highly heterogeneous nature of the

glaciofluvial deposit at the NCRS.

Insert Figures 1 and 2 here

Description of Various K Estimation Methods

Case 1a: Empirical Formulae Applied to Grain Size Analyses (GSA) Results

The first method considered in this study is the application of various empirical formulae to

results from GSA. Specifically, many empirical formulae have been developed to establish

relationships between K and particle size statistics (Vienken and Dietrich, 2011; Rosas et al. 2014;

Devlin, 2015). This method is cost-efficient compared to other conventional approaches when it

comes to obtaining rapid estimates of K, avoiding the need to conduct permeameter tests

on core samples or to install wells for slug or pumping tests. However, the

highly heterogeneous condition at the NCRS leads to significant challenges to the analysis as most

equations described in the literature are developed for relatively permeable materials such as sand

(e.g., Krumbein and Monk, 1943; Kozeny, 1953). Thus, it is hard to determine if one dedicated

empirical relationship is suitable for various unconsolidated materials. In this study, three different

models were applied to derive K estimates from core samples of different materials. Specifically,

the Hazen (1911) model was used for coarse-grained sediments, the Puckett et al. (1985)

relationship for fine-grained sediments, and the Barr (2001) formula for intermediate-grained

sediments, with details provided in the Supporting Information (SI) section.
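
To make the grain-size approach concrete, the sketch below interpolates the effective grain size d10 from a cumulative sieve curve and applies the commonly cited form of the Hazen relationship, K = C d10², with K in cm/s and d10 in cm. The coefficient C = 100 is a textbook-style default assumed here; the exact formulations and coefficients used in this study are those given in the SI section, and the sieve values below are hypothetical.

```python
import math


def grain_size_at_percent(sizes_mm, percent_finer, p=10.0):
    """Interpolate the grain size [mm] at which p percent of the sample is finer,
    using linear interpolation in log10(grain size)."""
    pairs = sorted(zip(percent_finer, sizes_mm))
    for (p_lo, d_lo), (p_hi, d_hi) in zip(pairs, pairs[1:]):
        if p_lo <= p <= p_hi and p_hi > p_lo:
            frac = (p - p_lo) / (p_hi - p_lo)
            log_d = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10.0**log_d
    raise ValueError("requested percentile lies outside the measured curve")


def hazen_k_cm_per_s(d10_mm, c=100.0):
    """Hazen-type estimate K = C * d10^2 (K in cm/s, d10 converted to cm).
    C = 100 is an assumed default; reported values vary with sorting."""
    d10_cm = d10_mm / 10.0
    return c * d10_cm**2


# Hypothetical two-point segment of a sieve curve (illustrative only)
d10 = grain_size_at_percent([0.1, 0.4], [5.0, 15.0], p=10.0)
k_est = hazen_k_cm_per_s(d10)
print(d10, k_est)
```

Because the estimate scales with the square of d10, small errors in the fine tail of the sieve curve translate into large differences in K, which is one reason a single empirical relationship is hard to justify across sediment types.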

Case 1b: Permeameter Tests

Another traditional method for obtaining K estimates is to conduct laboratory permeameter

analyses of repacked samples retrieved during well drilling and borehole logging. During previous
work by Alexander et al. (2011) and Zhao and Illman (2017), a total of 642 temperature-corrected

falling head permeameter analyses were performed on repacked samples to estimate K based on a

formula provided in Freeze and Cherry (1979). Details are provided in the SI section.
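
For reference, the standard falling-head relationship, K = (a L / (A t)) ln(H0/H1), can be evaluated as in the sketch below. The temperature correction is represented only by a user-supplied viscosity ratio, which is an assumption made for illustration; the correction actually applied in this study is described in the SI section. The function name, parameter names, and input values are hypothetical.

```python
import math


def falling_head_k(tube_area, sample_area, sample_length, elapsed_time,
                   head_start, head_end, viscosity_ratio=1.0):
    """Falling-head permeameter estimate of K.

    tube_area (a) and sample_area (A) share one area unit; sample_length (L)
    and both heads share one length unit; K is returned in that length unit
    per unit of elapsed_time. viscosity_ratio is mu(test temperature) divided
    by mu(reference temperature); 1.0 means no temperature correction
    (assumed placeholder for the correction described in the SI section).
    """
    if head_end <= 0.0 or head_start <= head_end:
        raise ValueError("heads must satisfy head_start > head_end > 0")
    k_test = (tube_area * sample_length) / (sample_area * elapsed_time) * math.log(
        head_start / head_end
    )
    return k_test * viscosity_ratio


# Hypothetical test: a = 1 cm^2, A = 10 cm^2, L = 10 cm, t = 100 s, H0/H1 = e
k_cm_s = falling_head_k(1.0, 10.0, 10.0, 100.0, math.e, 1.0)
print(k_cm_s)
```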

Permeameter tests preferentially determine vertical K. As reported by

Klute and Dirksen (1986), K values of repacked samples estimated in the laboratory can be

artificially lower than those from intact samples. In addition, the extraction and repacking

processes may induce fractures and destroy the internal structures that are well-preserved in intact

samples. Sudicky (1988) demonstrated that the potential error caused by using repacked samples

in permeameter tests is small compared to the K heterogeneity. Moreover, it is very difficult to

recover substantial intact core samples from highly permeable zones (Butler, 2005; Alexander et

al., 2011). Therefore, underprediction of K is possible for permeameter tests conducted with

materials from highly permeable intervals.

Case 1c: Slug Tests

At the NCRS, 28 slug tests were conducted by Alexander et al. (2011) in all seven monitoring

intervals of the four CMT wells (i.e., CMT1 – CMT4) and 15 tests by Xie (2015) at open intervals

of PW1, PW3, and PW5 wells, resulting in a total of 43 tests. Those tests

yielded head response data that are amenable to standard slug test analysis solutions. For this study,

all tests were interpreted with the Hvorslev (1951) model with details provided in the SI section.

As the slug tests at the site were conducted along existing observation intervals and not with DP

equipment to obtain high resolution K estimates, results are grouped as part of conventional

methods.
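
A minimal sketch of a Hvorslev (1951) type interpretation is given below. It uses the widely quoted form K = r_c² ln(L_e/R_w) / (2 L_e T0), which is generally applied when L_e/R_w exceeds about 8, and estimates the basic time lag T0 by fitting a line through the origin to the logarithm of normalized head versus time. The function name, parameter names, and synthetic recovery data are assumptions for illustration; the analysis details for this study are in the SI section.

```python
import math


def hvorslev_k(times_s, normalized_heads, casing_radius_m, screen_radius_m,
               screen_length_m):
    """Estimate K [m/s] from slug test recovery data with the Hvorslev model.

    normalized_heads are H(t)/H0 values between 0 and 1 (exclusive of 0).
    """
    # Fit ln(H/H0) = -t / T0 through the origin by least squares
    num = sum(t * math.log(h) for t, h in zip(times_s, normalized_heads))
    den = sum(t * t for t in times_s)
    basic_time_lag = -den / num  # T0, time at which H/H0 falls to about 0.37
    shape_term = math.log(screen_length_m / screen_radius_m)
    return casing_radius_m**2 * shape_term / (2.0 * screen_length_m * basic_time_lag)


# Synthetic exponential recovery with an assumed time lag of 50 s (illustrative only)
t_obs = [5.0, 10.0, 20.0, 40.0, 80.0]
h_obs = [math.exp(-t / 50.0) for t in t_obs]
k_slug = hvorslev_k(t_obs, h_obs, casing_radius_m=0.025, screen_radius_m=0.05,
                    screen_length_m=1.0)
print(k_slug)
```

This form does not account for inertial effects, so oscillatory responses in highly permeable intervals would need a different model.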

Slug tests are suitable for materials that have moderate to low values of K, although high-K

materials can also be tested and analyzed. Moreover, the volume sampled by a slug test is

usually considered to be much smaller than that of a pumping test, so the estimated hydraulic

parameters are representative only of materials around the test interval and, depending on site

heterogeneity, usually not of materials between boreholes (Butler, 1997).

Cases 2a – 2c: HPT Surveys

Eleven HPT surveys were conducted by Sun et al. (2022) (Figure 1a) to characterize the

high-resolution variability of K to an approximate depth of 17 m with the HPT probe (Model

K6050; Geoprobe). Water was continuously injected during the advancement of the HPT probe

through a screen (1-cm in diameter) on the side of the probe and the corresponding water pressure,

injection flow rate (Q), as well as Electrical Conductivity (EC) were recorded electronically at a

1.5-cm vertical interval over time. Due to the highly heterogeneous nature of NCRS sediments,

the HPT probe was advanced at an average rate ranging from 1.4 to 2.2 cm/s for all 11 surveys

depending on varying sediment types. In this study, three different formulae [Case 2a: McCall and

Christy (2010); Case 2b: Borden et al. (2021); Case 2c: Zhao and Illman (2022b)] were utilized to

convert the collected data to K measurements with details provided in the SI section.

Case 3a: PEST Calibrated Geological Model

An effective way of capturing the spatial variation of hydraulic parameters is to develop

stratigraphic or zonation models, in which hydraulic parameters in each zone are treated as

homogeneous and their values are estimated based on pumping tests or ambient hydraulic head

data through trial-and-error or automated calibration methods (Doherty, 2015). At the NCRS, Zhao

and Illman (2018) built a zonation model based on the 19-layer geological model and jointly

calibrated it with 522 transient data points from 176 drawdown/buildup curves obtained through eight

pumping tests (PW1-1, PW1-4, PW1-6, PW1-7, PW2-3, PW3-3, PW4-3, and PW5-3) for K and

Ss estimates. The calibration was performed by coupling HGS (Aquanty, 2019) with the parameter
estimation code PEST (Doherty, 2005), while treating elements in each layer as homogeneous

and isotropic to simplify the analysis, which resulted in 19 pairs of K and Ss estimates. The model

was discretized into 31,713 rectangular finite elements of varying sizes with 34,816 nodes for

inverse modeling. From the central well cluster area to the model boundary, the element size

gradually increased, with blocks expanding from 0.5 m × 0.5 m × 0.5 m to 5 m × 5 m × 0.5 m.

The computational mesh is provided as Figure S2 in the SI section.

In the work of Zhao and Illman (2018), the unsaturated zone at the NCRS was not considered,

and the water table was designated as the upper boundary. The water table was modelled as a flat

surface since the change in water level was less than the height of the elements at the top. The

Catfish Creek till was treated as a hydraulic barrier (Alexander et al., 2011) and served as the lower

boundary. The top and bottom model boundaries were treated as impermeable boundaries, while

the remaining four boundaries were treated as constant head boundaries as in our previous inverse

models built for the site (Berg and Illman, 2011b; Zhao and Illman, 2018, 2022a).

Case 3b: Averaged THT Geological Model and Case 3c: Highly Parameterized

THT Model

The same datasets utilized to calibrate the geology-based groundwater flow model were also

utilized for THT analysis by Zhao and Illman (2018) to map the K and Ss heterogeneity at the

NCRS using VSAFT3 (Variably Saturated Flow and Transport 3-D Model) (Yeh et al., 1993),

which utilizes the Simultaneous Successive Linear Estimator (SimSLE) (Xiang et al., 2009) for

geostatistical inverse modeling. Settings of the numerical model (model discretization, initial and

boundary conditions) were the same as the geology-based zonation model described in the

previous section. Furthermore, results from the calibrated geology-based zonation model were

utilized as initial K and Ss guesses for the inversion of the THT model.
The estimated K and Ss values at 31,713 finite elements from the THT analysis were then

averaged for each layer by taking the geometric mean based on the geological model to compare

with estimates from the calibrated geology-based zonation model using PEST (Doherty, 2005) and

other estimates from this study. This resulted in 19 pairs of K and Ss estimates, one per layer, which we

refer to as Case 3b: Averaged THT Geological Model, while Case 3c: Highly Parameterized THT

Model utilizes all 31,713 K and Ss estimates.
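
The layer-wise averaging used for Case 3b can be sketched as follows. This is a minimal illustration and not the code used in the study; the function name, the element-to-layer array, and the sample values are hypothetical, and it assumes each finite element carries a single layer index from the geological model.

```python
import numpy as np

def layer_geometric_means(k_values, layer_ids):
    """Return {layer_id: geometric mean of K} for elements grouped by layer."""
    k = np.asarray(k_values, dtype=float)
    layers = np.asarray(layer_ids)
    if np.any(k <= 0):
        raise ValueError("K must be strictly positive to take a geometric mean")
    log_k = np.log(k)
    # Geometric mean = exp(arithmetic mean of ln K) within each layer.
    return {int(layer): float(np.exp(log_k[layers == layer].mean()))
            for layer in np.unique(layers)}

# Hypothetical element-wise K values (m/s) and the layer of each element.
k_elements = [1e-4, 1e-6, 1e-8, 1e-8]
element_layers = [1, 1, 2, 2]
kg_by_layer = layer_geometric_means(k_elements, element_layers)
```

The same helper could be applied to Ss. Averaging in log space keeps a few high-K elements from dominating the layer value, which is the usual motivation for a geometric rather than arithmetic mean.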

Qualitative Comparison of K estimates

Figure 1 shows that CMT1 is spatially close to HPT3, while CMT3 is close to both HPT6

and HPT6-2. Therefore, K estimates at CMT1 and CMT3 from Case 1a: GSA, Case 1b:

Permeameter Tests, Case 1c: Slug Tests, and various inverse modeling approaches (Cases 3a – 3c)

could be qualitatively and quantitatively compared with HPT results obtained at adjacent DP

locations.

Insert Figure 3 here

Figure 3 summarizes the vertical profiles of log10K estimates along CMT3 and HPT6 from

GSA, permeameter tests, slug tests, and various inverse modeling approaches along with site

stratigraphy at these locations. Similar figures for vertical profiles along CMT1 and HPT3 (Figure

S3a), as well as CMT3 and HPT6-2 (Figure S3b) are provided in the SI section.

Results show that K measurements are highly variable, ranging over approximately seven orders

of magnitude across the two CMT wells indicating the highly heterogeneous nature of K at the

site. Figure 3 reveals that the K variability from one layer to another (i.e., interlayer heterogeneity)

reflecting the alternating aquifer-aquitard system can be captured by most of the methods, while
only small-scale measurements from Case 1a: GSA, Case 1b: Permeameter Tests, Cases 2a – 2c:

HPT surveys, and Case 3c: Highly Parameterized THT Model reveal heterogeneity within

individual layers (i.e., intralayer heterogeneity).

In terms of conventional methods, point-scale measurements of K from Case 1a: GSA and

Case 1b: Permeameter Tests generally follow a similar trend. Case 1c: Slug Test results also follow

the trend, but the measured K values are generally larger than Case 1a: GSA and Case 1b:

Permeameter Test estimates, especially in highly permeable zones, potentially reflecting a scale

effect (Clauser, 1992; Rovey and Cherkauer, 1995; Butler and Healey, 1998; Vesselinov et al.,

2001; Illman, 2006). HPT results at three DP locations generally follow the trend of K from

permeameter tests of samples from the collocated CMT wells, while the K estimates are around 1

to 2 orders of magnitude larger than those estimated by permeameter tests, especially from 4 m to

8 m where local geology from collocated CMT wells is primarily low-permeability materials such

as clay and silt (Figure 3). Using various site-dependent formulae yields similar results at this

upper depth range. Significant differences are observed in the middle and lower portions of the

site.

Examination of Figure 3 at around 8 m – 10 m based on the core log reveals that local

geology consists primarily of highly permeable materials such as sand, thus K estimates from

permeameter tests, Case 1a: GSA and Case 1c: Slug Tests all yield relatively higher estimates of

K than for silt materials located above and below, while HPT estimates using the Case 2a: McCall and

Christy (2010) model only yield a fixed K estimate at the lower bound of 3.5 × 10⁻⁷ m/s. The

Case 2b: Borden et al. (2021) and Case 2c: Zhao and Illman (2022b) models both provide K

estimates that are higher than those generated by McCall and Christy (2010)’s relationship.
Similarly, at depths ranging from 9 m to 11 m in Figure 3, a transition zone from

sand to silt is observed in the core logs, and K measurements from Case 1a: GSA and Case 1b:

Permeameter Tests both reflect this variation. However, none of the K estimates from the three

HPT formulae detect this variation, while Case 2a: McCall and Christy (2010)’s model only yields

a fixed lower bound. The Catfish Creek till located at depths below 12 m for CMT1 and below 14

m for CMT3 is detected by a significant drop in K estimates from Case 1a: GSA and Case 1b:

Permeameter Tests. Case 2a: McCall and Christy (2010)’s model yields a fixed lower bound, while

Case 2b: Borden et al. (2021)’s model generates even higher estimates of K. On the other hand,

Case 2c: Zhao and Illman (2022b)’s model at HPT3 and HPT6-2 yields K estimates that are close

to Case 1a: GSA and Case 1b: Permeameter Tests, which is encouraging as this formula was

derived through fitting K estimates mostly in the range of 3.5 × 10⁻⁷ m/s to 6.9 × 10⁻⁴ m/s. Case 1c:

Slug Tests yield K estimates that are in general smaller than those estimated by HPT, but the

number of available data points is low.

The K estimates from various inverse modeling approaches (Cases 3a – 3c) are also plotted

for comparison (Figure 3). For Case 3a: PEST Calibrated Geological and Case 3b: Averaged THT

Geological Models, uniform K values are assigned along each of the 19 layers, while Case 3c:

Highly Parameterized THT model yields spatially variable K estimates along the depth of the

borehole.

From 0 m to 3 m, K estimates from Case 3c: Highly Parameterized THT Model are quite

smooth because there are no monitoring data available for inversion, thus the estimated K values

are nearly identical to the initial K estimate input to THT analysis. Beneath 3 m, it is evident that

THT results follow the general pattern of K variability reflecting the site stratigraphy including

small-scale interlayer heterogeneity. For example, the transition zones at 3 m and 8 m at CMT3
are not captured by the other methods. However, THT results near the bottom of CMT3 from 14

m to 16 m indicate an increase in K, which does not conform to stratigraphy at this location.

Statistical Analysis of K from Various Site Characterization Approaches

Descriptive Statistics of K from Various Approaches

Table 1 summarizes the descriptive statistics of K from various approaches at the NCRS. It

is worth mentioning that only 10 HPT surveys (HPT2 to HPT10 and HPT6-2) were utilized for this

study since no dissipation test was conducted at HPT1, so the corresponding K values may be

less reliable. The reported statistics include minimum, maximum, geometric mean of K (KG), range

of log10K, and variance of log10K (σ2log10K). Another version of the table based on the natural

logarithm of K (ln K) is provided as Table S1 in the SI section.
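
As an illustration of how the statistics listed above can be computed for one approach, a short sketch is given below. The function name and the sample values are hypothetical, and the use of the sample variance (`ddof=1`) is an assumption, since the text does not state whether the sample or population variance was reported.

```python
import numpy as np

def describe_k(k_values):
    """Summary statistics of K (m/s) in the style of a descriptive-statistics table."""
    k = np.asarray(k_values, dtype=float)
    log10_k = np.log10(k)
    return {
        "n": int(k.size),
        "min": float(k.min()),
        "max": float(k.max()),
        "KG": float(10 ** log10_k.mean()),           # geometric mean of K
        "range_log10K": float(log10_k.max() - log10_k.min()),
        "var_log10K": float(log10_k.var(ddof=1)),    # assumption: sample variance
    }

# Hypothetical K values spanning two orders of magnitude.
stats = describe_k([1e-7, 1e-6, 1e-5])
```

Because ln K = ln(10) × log10K, a variance expressed in terms of ln K equals the log10K variance multiplied by (ln 10)², or roughly 5.30, which is how the two versions of such a table relate.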

Insert Table 1 here

Examination of Table 1 shows that HPT surveys (Cases 2a – 2c) yield a significantly larger

number of K estimates due to the 1.5 cm profiling intervals along each DP location. Case 3c:

Highly Parameterized THT Model has the largest number of estimated K values due to the highly

parameterized nature of the geostatistical inversion approach.

The geometric means of K (KG) from HPT methods (Cases 2a – 2c) are higher than those

generated through traditional methods due to the technical limitation of HPT for low K materials.

In addition, KG increases from Case 1a: GSA to Case 1b: Permeameter Tests and to Case 1c: Slug

Tests due to a potential scale effect. Moreover, Case 2a: McCall and Christy (2010)’s model

exhibits the smallest range of log10K due to the use of fixed upper and lower K limits. Although

both Case 2b: Borden et al. (2021)’s and Case 2c: Zhao and Illman (2022b)’s models do not

fix the lower K limit, HPT measurements with Q less than 10 mL/min in low-K sediments were
considered to be inaccurate and excluded as suggested by Liu et al. (2012). Therefore, Case 2b:

Borden et al. (2021)’s model extends the range especially at the lower end. In contrast, Case 2c:

Zhao and Illman (2022b)’s model extends the range for both the higher and lower ends and yields

the largest range of log10K among the three formulae to interpret HPT data. Case 1c: Slug Tests

yield a relatively small range of K estimates, while Case 1a: GSA with three models, Case 1b:

Permeameter Tests, and the Case 3c: Highly Parameterized THT Model all yield larger ranges of

estimates.

In terms of the variance of log10K (σ2log10K), Case 1a: GSA yields the highest σ2log10K of 2.63,

perhaps because of the use of three empirical models to target various soil classes. Case 1b:

Permeameter Tests yield the second highest estimate of σ2log10K at 1.55, which is comparable to

the value for Case 1c: Slug Tests (1.47) despite the smallest number of available measurements (n

= 43). It is also noteworthy that the Case 3a: PEST Calibrated Geological Model, Case 3b:

Averaged THT Geological Model, and Case 3c: Highly Parameterized THT Model yield

comparable σ2log10K of 1.61, 1.46, and 1.47, respectively. In contrast, K estimates from the HPT

tend to result in smaller σ2log10K estimates with Case 2b: Borden et al. (2021) and Case 2c: Zhao

and Illman (2022b) models, yielding σ2log10K estimates of 0.38 and 0.86, respectively, while Case

2a: McCall and Christy (2010)’s model results in a σ2log10K of 1.28 that is somewhat lower but

closer to the other approaches.

Assignment of K for the 19 Geological Model Layers

The point scale K measurements from various approaches in Cases 1a – 1c and 2a – 2c were

then used to populate the 19-layer geological model by taking the KG of all data points located in

each layer. Measurements from similar sediment material were attributed to layers that have no
sample data available. For example, as only 43 K estimates in 11 out of 19 layers of the geological

model were available from Case 1c: Slug Tests, it was most difficult to populate the model. As a

result, KG from layers 4, 8, 16, and 18 (clay) was assigned to layers 1 and 12 (clay); KG from layers

2, 7, and 14 (silt) was assigned to layer 10 (silt); KG from layer 13 (sandy silt) was assigned to

layers 6 and 9 (sandy silt); KG from layer 11 (sand) was assigned to layer 3 (sand); KG from layer

17 (clay & silt) was assigned to layer 19 (clay & silt); and KG from layers 3 and 11 (sand) and 2

and 10 (silt) was assigned to layer 5 (sand & silt). Similar steps were also performed for the other

methods if there were layers that did not contain any K estimates and are described beneath Tables S2

– S7 in the SI section.
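
The gap-filling step described above can be sketched as a simple lookup. This is an illustrative sketch only: the function name and the KG values are hypothetical, and when several donor layers are listed it assumes the geometric mean of the donor-layer KG values is used, which the text does not specify (pooling the individual measurements is another possibility).

```python
import numpy as np

def fill_missing_layers(kg_by_layer, donors_for):
    """Assign K_G to layers without data from donor layers of similar material."""
    filled = dict(kg_by_layer)
    for target, donors in donors_for.items():
        donor_kg = np.array([kg_by_layer[d] for d in donors], dtype=float)
        # Assumption: combine donor-layer K_G values by their geometric mean.
        filled[target] = float(np.exp(np.log(donor_kg).mean()))
    return filled

# Hypothetical K_G values (m/s) for two layers that have slug test data.
kg_measured = {11: 1e-4, 13: 1e-5}
# Donor mapping for a subset of the slug test example:
# layer 11 (sand) -> layer 3; layer 13 (sandy silt) -> layers 6 and 9.
donors = {3: [11], 6: [13], 9: [13]}
kg_filled = fill_missing_layers(kg_measured, donors)
```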

Additionally, the maximum, upper quartile, median, KG, lower quartile, and minimum of

log10K values estimated from all investigated approaches (except for Case 3a: PEST Calibrated

Geological Model and Case 3b: Averaged THT Geological Model) were computed for each

geological layer and plotted as box-and-whisker plots in Figure 4, while their numerical values

were summarized in Tables S2 – S9 of the SI section. The lower and upper range of K estimates

from Case 2a: McCall and Christy (2010)’s model and the higher range of the Case 2c: Zhao and

Illman (2022b)’s model were indicated within the box plots in Figure 4.
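
For readers who wish to reproduce the per-layer summaries behind such box-and-whisker plots, a minimal sketch follows. The function name and sample values are hypothetical, and the quartiles rely on NumPy's default linear interpolation, which may differ slightly from the convention used to prepare the published figure and tables.

```python
import numpy as np

def box_summary(k_values):
    """Six-number summary of log10 K for one geological layer, plus the IQR."""
    log10_k = np.log10(np.asarray(k_values, dtype=float))
    q1, median, q3 = np.percentile(log10_k, [25, 50, 75])
    return {
        "min": float(log10_k.min()),
        "q1": float(q1),
        "median": float(median),
        "log10_KG": float(log10_k.mean()),  # log10 of the geometric mean
        "q3": float(q3),
        "max": float(log10_k.max()),
        "iqr": float(q3 - q1),
    }

# Hypothetical K values (m/s) from one layer.
summary = box_summary([1e-8, 1e-7, 1e-6, 1e-5, 1e-4])
```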

Insert Figure 4 here

Examination of Figure 4 reveals that most of the box plots are either positively skewed or

negatively skewed depending on the approach examined. In addition, the interquartile range (IQR)

of K estimates from Case 1a: GSA and Case 1b: Permeameter Tests over 19 layers is generally

larger than those from other methods, suggesting larger variability of K estimates for each layer.

The IQR from Case 2a: McCall and Christy (2010)’s model is larger than the Case 2b: Borden et

al. (2021) and Case 2c: Zhao and Illman (2022b) models. In addition, the KG from Case 2b: Borden
et al. (2021)’s model is less variable than Case 2a: McCall and Christy (2010)’s and Case 2c: Zhao

and Illman (2022b)’s models. The IQR for Case 3c: Highly Parameterized THT Model is

consistently smaller, which indicates less dispersion of data sets along each of the 19 layers despite

the large degree of variability in K estimates for each layer compared to other approaches.

Next, the profiles of log10KG of 19 geological layers estimated from all investigated

approaches were plotted in Figure 5. This figure reveals that it is very hard to accurately

characterize a heterogeneous site, such as the NCRS, as estimated log10KG values could range over

about four orders of magnitude within a single geological unit when using various site

characterization approaches. Overall, Case 1c: Slug Tests yielded higher K estimates compared to

Case 1a: GSA and Case 1b: Permeameter Tests. The K values from HPT surveys with three

different models (Cases 2a – 2c) yielded similar K estimates, while the estimates were generally

higher than those obtained from conventional methods (i.e., Case 1a: GSA and Case 1b:

Permeameter Tests). The K values estimated from Case 3a: PEST Calibrated Geological Model

were close to those generated from Case 3b: Averaged THT Geological Model.

Insert Figure 5 here

Evaluation of K from Various Subsurface Characterization Methods by

Predicting Independent Groundwater Flow Events

Description of Groundwater Model and Experimental Design

Because the spatial distribution of the true K field across the NCRS is not available, K

estimates from various approaches were assessed through the prediction of drawdowns from

pumping/injection tests that have not been used for K estimation. For this, we constructed an HGS

model for forward simulations of independent pumping/injection tests with K and Ss fields derived
from different methods. Other than hydraulic parameter fields, HGS settings were the same as the

numerical model utilized in the work of Zhao and Illman (2018) as described previously.

Seven pumping/injection tests (PW1-3, PW1-5, PW3-1, PW3-4, PW5-1, PW5-4, and PW5-

5) not used in inverse modeling were simulated with HGS and results were compared to field data

via scatterplots to evaluate the performances of models built with K estimates from different

approaches. Since most of the conventional and HPT methods were not capable of providing Ss

estimates, the forward model’s ability to predict drawdowns under steady-state conditions was the

first metric employed to evaluate the K estimates from various approaches.

Then, transient forward simulations were performed. For transient simulations, 19 Ss

estimates from Case 3a: PEST-calibrated 19-layer geological model were utilized in Cases 1a –

1c, Cases 2a - 2c, and Case 3a. For Case 3b, 31,713 Ss values estimated via THT analysis were

averaged for each of the 19 layers, while for Case 3c, 31,713 Ss values from the THT analysis were

utilized for simulating transient drawdown responses.

Comparison of K Distributions

The estimated K distributions from conventional (Cases 1a – 1c), HPT (Cases 2a – 2c), and

inverse modeling (Cases 3a – 3c) approaches utilized in the groundwater flow models are

presented as fence diagrams in Figure 6. In this figure, locations of CMT and PW wells as well as

HPT survey locations are indicated. As mentioned previously, the primary characteristic of the site

is an alternating aquifer-aquitard system, in which three discontinuous low-K units of clay to silt

are separated by two high-K units of sand to gravel.

Insert Figure 6 here


Examination of Figure 6 reveals that the generated K distribution from Case 1a: GSA is

unable to capture the two aquifers correctly. Instead, the aquitard clay layer 12 between the two

aquifers and the aquitard layer 1 at the top of the model domain have higher K values than

the sand and gravel aquifer units, which is inconsistent with geological data. Based on Table S2, both

aquitard layers are primarily composed of clay. Thus, K values from these two units (layers 1 and

12) are estimated using Puckett et al. (1985)’s model, while K values of the two aquifer layers are

calculated with Hazen (1911)’s model. Similar findings were reported by Alexander et al. (2011),

where both Puckett et al. (1985) and Hazen (1911) models were utilized to calculate K from 270 grain size

distributions and results showed that the mean K generated by the Puckett et al. (1985) model was

about two orders of magnitude larger than that estimated from the Hazen (1911) model. These findings

indicate that the Puckett et al. (1985) model may not be suitable to calculate K for clay materials

at a highly heterogeneous glaciofluvial deposited site even though equation (2) is only dependent

on clay content.

Results from Case 1b: Permeameter Tests reveal the existence of a double-aquifer system.

However, based on Table S3, relatively low K values are estimated for the sand and gravel layer

(i.e., layer 15). Lower K values are obtained for the aquitard layers above and below as well as in

between the aquifer system.

Results from Case 1c: Slug Tests capture the aquitard units below and above the aquifer

system. However, based on Tables S2 to S4, K values are relatively larger than those estimated by

Case 1a: GSA and Case 1b: Permeameter Tests. In addition, the double-aquifer system is not

reflected correctly. Based on Table S4, all the sandy-silt layers (6, 9, and 13) between and above

the aquifers have greater estimates of K than the two aquifer units (layers 11 and 15). The reason

is that the 43 slug test measurements only cover one sandy-silt layer. Specifically, there are only
four measurements for layer 13, while the estimated K values are relatively large and do not agree

with site geology.

The K distributions in Figure 6 from HPT using three different models (Cases 2a – 2c) are

generally biased towards higher K values. Specifically, results from Case 2a: McCall and Christy

(2010)’s model capture the lower aquitard. However, K estimates tend to be larger than those

generated from Case 1a: GSA and Case 1b: Permeameter Tests. In addition, only the lower aquifer

is revealed, while the clay layer 4 and silt layers 6, 7, and 10 located in the upper aquitard (based

on Table S5) have higher K estimates than the most permeable aquifer unit layer 15, which does

not conform to known geology. Results from Case 2b: Borden et al. (2021)’s model only capture

the lower aquitard, while K estimates are generally larger, so the layers above the lower

aquitard are hard to distinguish. The K distribution from Case 2c: Zhao and Illman (2022b)’s

model only captures the lower aquifer and the lowest aquitard, while the K values for units above

the lower aquifer are generally less variable and the aquitard layers have generally larger K

estimates. The less variable values from high-resolution HPT methods are mainly due to the

limited range of K estimates obtained from each of the three models (McCall and Christy, 2010;

Borden et al. 2021; and Zhao and Illman, 2022b).

Examination of three inverse modeling results (Cases 3a – 3c) reveals K variations more

accurately. The K values from Case 3a: PEST Calibrated Geological Model capture the expected

variation from one layer to the next. Specifically, the K estimate of the unit in between two aquifers

matches that expected for an aquitard. The K values for aquitards above and below the double

aquifer system are also estimated to be low. However, the K estimate for the upper aquitard is

relatively larger than the value estimated from permeameter tests, due to the sparse monitoring

data in the uppermost part of the model domain. Case 3b: Averaged THT Geological Model has a similar K
distribution compared to Case 3a: PEST Calibrated Geological Model (as shown in Figure 6), and

the K estimate for the lowest aquifer agrees more with those obtained by conventional methods

(i.e., Case 1a: GSA and Case 1b: Permeameter Tests). Case 3c: Highly Parameterized THT Model

yields a K field that exhibits both inter- and intra-layer heterogeneity. It is noteworthy that the

estimated K from Case 3c for the lower aquifer layer 15 is higher compared with Cases 3a and 3b.

Results from Forward Simulations of Pumping/Injection tests

The performance of each K distribution in Figure 6 obtained by various methods was then

evaluated by predicting independent pumping tests that were not used for model calibration using

HGS (Aquanty, 2019) with the computational mesh described earlier. As previously noted, a total

of 15 pumping tests (PW1-1, PW1-3, PW1-4, PW1-5, PW1-6, PW1-7, PW2-3, PW3-1, PW3-3,

PW3-4, PW4-3, PW5-1, PW5-3, PW5-4, and PW5-5) were conducted at the NCRS, while eight

tests were utilized by Zhao and Illman (2018) for Case 3a, 3b, and 3c model calibrations (PW1-1,

PW1-4, PW1-6, PW1-7, PW2-3, PW3-3, PW4-3, and PW5-3). Therefore, for this study, seven

tests not used in model calibration by Zhao and Illman (2018) were chosen to evaluate the K

estimates from various approaches.

Since most of the conventional methods (i.e., Case 1a: GSA and Case 1b: Permeameter Tests) and

the HPT surveys (Cases 2a – 2c) in their current form cannot provide Ss estimates, steady-state

simulation was the first metric to evaluate the K estimates from various approaches. Only late-time

pressure heads from ports that reach steady or quasi-steady state were chosen, which resulted in

153 head data. To better evaluate the correspondence between the simulated and observed

drawdown values, quantitative analyses were first performed by comparing the coefficient of

determination (R2), mean absolute error (L1) and mean square error (L2), which are provided as:
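$$R^2 = \left[ \frac{\frac{1}{n}\sum_{i=1}^{n}\left(X_i-\bar{X}\right)\left(\hat{X}_i-\bar{\hat{X}}\right)}{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_i-\bar{X}\right)^2 \times \frac{1}{n}\sum_{i=1}^{n}\left(\hat{X}_i-\bar{\hat{X}}\right)^2}} \right]^2 \qquad (1)$$

$$L_1 = \frac{1}{n}\sum_{i=1}^{n}\left|X_i-\hat{X}_i\right| \qquad (2)$$

$$L_2 = \frac{1}{n}\sum_{i=1}^{n}\left(X_i-\hat{X}_i\right)^2 \qquad (3)$$

where $n$ is the total number of data, $i$ indicates the data number, $X_i$ is the simulated drawdown, $\hat{X}_i$

is the observed drawdown, $\bar{X}$ is the mean of simulated drawdowns, and $\bar{\hat{X}}$ is the mean of observed

drawdowns.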
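
Equations (1) to (3) can be evaluated with a few lines of code. The sketch below is illustrative only; the function name and the drawdown values are hypothetical and are not data from the NCRS.

```python
import numpy as np

def fit_metrics(simulated, observed):
    """Return (R2, L1, L2) for paired simulated and observed drawdowns."""
    sim = np.asarray(simulated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    sim_dev = sim - sim.mean()
    obs_dev = obs - obs.mean()
    # Equation (1): squared correlation between simulated and observed values.
    r2 = np.mean(sim_dev * obs_dev) ** 2 / (np.mean(sim_dev**2) * np.mean(obs_dev**2))
    # Equation (2): mean absolute error.
    l1 = np.mean(np.abs(sim - obs))
    # Equation (3): mean square error.
    l2 = np.mean((sim - obs) ** 2)
    return float(r2), float(l1), float(l2)

# Hypothetical drawdowns (m): simulated values are 0.1 m below the observations.
r2, l1, l2 = fit_metrics([0.1, 0.2, 0.4], [0.2, 0.3, 0.5])
```

In this hypothetical case the simulated drawdowns differ from the observations by a constant offset, so R2 remains at 1 while L1 and L2 are nonzero. This illustrates why R2 is read together with the L1 and L2 norms rather than on its own.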

Statistics calculated from each method through seven simulations are summarized in Tables

S10 to S12 (in the SI section). Cells in Tables S10 to S12 are colour-coded to enhance the

comparison. Examination of Tables S10 to S12 reveals that Case 3c: Highly Parameterized THT

Model performs the best, yielding the smallest discrepancies between simulated and measured

drawdowns (i.e., smallest L1 and L2 norms) as well as the highest R2 values for most of the pumping

tests, followed by Case 3b: Averaged THT Geological Model and Case 3a: PEST Calibrated

Geological Model.

In terms of K estimates from HPT surveys (Cases 2a – 2c), discrepancies between simulated

and observed drawdown values are the smallest for Case 2c: Zhao and Illman (2022b)’s model

among the three HPT formulae. Three conventional methods, especially Case 1a: GSA and Case

1b: Permeameter Tests rank at the lower end.

Simulation results are also assessed by plotting scatterplots, as shown in Figure 7. In each

plot, a linear model fit to all data and corresponding slope and intercept of these fits, as well as R2

values are included. Meanwhile, a 1:1 line is also included in each subplot to indicate a perfect
match. The slope and intercept values obtained from the linear model fit for individual tests from

all methods are summarized in Table S13 for the interested reader.
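
The linear model fit reported in each scatterplot can be reproduced with an ordinary least-squares line. The sketch below is a minimal example with hypothetical drawdown values; it assumes observed drawdowns are placed on the horizontal axis and simulated drawdowns on the vertical axis, which should be checked against the figure.

```python
import numpy as np

def linear_fit(observed, simulated):
    """Least-squares slope and intercept of simulated versus observed drawdown."""
    slope, intercept = np.polyfit(observed, simulated, deg=1)
    return float(slope), float(intercept)

# Hypothetical drawdowns (m) in which the model overpredicts by a factor of two.
observed = [0.0, 0.1, 0.2, 0.3]
simulated = [0.05, 0.25, 0.45, 0.65]
slope, intercept = linear_fit(observed, simulated)
```

Under this axis convention, a slope near 1 and an intercept near 0 correspond to points falling on the 1:1 line, whereas a slope above 1 indicates that simulated drawdowns grow faster than observed ones.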

Insert Figure 7 here

To enhance our comparison and evaluation, transient simulations were also performed. Since

most of the selected conventional and HPT methods cannot yield Ss estimates, estimates for the 19

geological layers obtained from Case 3a: PEST Calibrated Geological Model were assigned to

Cases 1a to 1c and Cases 2a to 2c. Three points were selected from the early, intermediate, and

late times of each drawdown curve, which resulted in a total of 388 drawdown data. It is worth

noting that fewer drawdown data points were selected from injection tests performed at PW3-1 and

PW5-1 as the data from the tests were noisy and impacted by the Noordbergum effect (Verruijt,

1969; Rodrigues, 1983; Berg et al., 2011). Therefore, only late time data were selected from those

drawdown curves. As with the steady-state results, model performance metrics including

R², L1, L2, and the slope and intercept of the linear model fit are summarized in Tables S14 to S17, while

scatterplots of observed and simulated drawdowns are presented in Figure 8. In addition,

simulated drawdown curves for the pumping/injection tests at ports PW1-3, PW1-5, PW3-1, PW3-

4, PW5-1, PW5-4, and PW5-5 with various K estimates are compared against observed drawdowns

in Figures S4 to S10 in the SI section, respectively.
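
As a rough illustration of the L1 and L2 metrics, the sketch below assumes that the L1 norm is the mean absolute error and the L2 norm is the mean squared error between observed and simulated drawdowns. These definitions are an assumption made here for illustration, and the function names and sample values are hypothetical.

```python
def _check(observed, simulated):
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")


def l1_norm(observed, simulated):
    """Mean absolute error (assumed definition of the L1 norm)."""
    _check(observed, simulated)
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / len(observed)


def l2_norm(observed, simulated):
    """Mean squared error (assumed definition of the L2 norm)."""
    _check(observed, simulated)
    return sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed)


# Hypothetical drawdowns in metres at early, intermediate, and late times.
observed = [0.10, 0.20, 0.40]
simulated = [0.20, 0.20, 0.10]

print(l1_norm(observed, simulated))
print(l2_norm(observed, simulated))
```

Under these assumed definitions, smaller values of either norm indicate a closer match, and the L2 norm penalizes large individual misfits more heavily than the L1 norm.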

Insert Figure 8 here

Discussion

Which K Estimates Yield the Best Predictions of Steady-State and Transient Drawdowns?
Examination of Figures 7 and 8 reveals that model performance in the steady-state and transient

simulations is quite comparable. Specifically, groundwater models built with conventional

(Cases 1a - 1c) and HPT (Cases 2a - 2c) K estimates yield biased drawdown predictions for both

steady-state (Figure 7) and transient (Figure 8) simulation results. In contrast, inverse modeling

approaches based on Case 3a: PEST Calibrated Geological Model and Case 3b: Averaged THT

Geological Model both yield good predictions of drawdowns, while Case 3c: Highly

Parameterized THT Model produces excellent matches for steady-state simulation results (Figure

7). For transient simulation results (Figure 8), the performance of Cases 3a - 3c

is more comparable, although Case 3c still yields the best predictions.

In terms of conventional methods, K estimates from Case 1a: GSA and Case 1b:

Permeameter Tests overpredict drawdowns, while those estimated via Case 1c: Slug Tests

underpredict drawdowns under steady-state and transient conditions (Figures 7 and 8). According

to Table 1, as well as Figure 5, Case 1a: GSA and Case 1b: Permeameter Tests tend to provide

smaller KG estimates than Case 1c: Slug Tests. Specifically, Table 1 shows that Case 1a: GSA and

Case 1b: Permeameter Tests yield KG values of 1.19 × 10⁻⁷ m/s and 3.03 × 10⁻⁷ m/s, respectively.

The lower KG values in relation to other approaches (Table 1) may be due to sample loss from

highly permeable zones as observed by Alexander et al. (2011). Core samples were obtained

with a split spoon sampler that was driven in front of the drill head. Alexander et al. (2011) noted

that the sample recovery was on the order of 80% for all wells except for CMT3, which had a

lower recovery rate of 69%. Sample recovery was found to be good, but periodic gaps were noted

for depths corresponding with aquifers. Therefore, the coarse-grained portion of samples may have

been lost and not subjected to sieve analyses and permeameter tests. An additional factor relevant

to permeameter tests is the repacking of samples. Klute and Dirksen (1986) noted that the K of
repacked samples estimated in the laboratory can be artificially lower than that of intact

samples.

It is surprising to note that Case 1c: Slug Tests consistently underpredict drawdowns (Figures

7 and 8) as this method is widely used for various field investigations (Butler, 1997, 2005; Cardiff

et al., 2011). Table 1 shows that 43 slug tests yield a KG value of 2.65 × 10⁻⁶ m/s, which is

approximately one order of magnitude higher than those from Case 1a: GSA and Case 1b:

Permeameter Tests. This may be due to three potential commingling factors: 1) the relatively

sparse data points (n = 43) available at the site that could have led to preferential sampling from

higher K intervals; 2) the Hvorslev (1951) approach yielding slightly higher estimates of K (Xie,

2015) compared to the Bouwer and Rice (1976) and Kansas Geological Survey (KGS) models

(Hyder et al., 1994); and 3) the scale effect, in which slug tests sample larger volumes that may be

impacted by highly permeable zones not considered by other methods that sample smaller volumes

such as Case 1a: GSA and Case 1b: Permeameter Tests. Another potential factor may be the slug

test K estimate being representative of the filter pack. However, for this study, care was taken to

avoid fitting the Hvorslev (1951) model to the early portion of the head response curve.
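
The "order of magnitude" comparison above can be checked directly from the KG values quoted in the text. The sketch below is illustrative: the helper `orders_of_magnitude` is a hypothetical name, and the list of K samples used to demonstrate the geometric mean is made up, not site data.

```python
import math
from statistics import geometric_mean


def orders_of_magnitude(k_a, k_b):
    """Return log10(k_a / k_b): how many orders of magnitude k_a exceeds k_b."""
    return math.log10(k_a / k_b)


# KG values (m/s) as reported in the text for the conventional methods.
kg_gsa = 1.19e-7          # Case 1a: GSA
kg_permeameter = 3.03e-7  # Case 1b: Permeameter Tests
kg_slug = 2.65e-6         # Case 1c: Slug Tests

print(f"slug vs GSA: {orders_of_magnitude(kg_slug, kg_gsa):.2f} orders")
print(f"slug vs permeameter: {orders_of_magnitude(kg_slug, kg_permeameter):.2f} orders")

# Geometric mean of a hypothetical set of K samples (m/s), for illustration only.
hypothetical_k = [1e-8, 1e-7, 1e-6]
print(f"KG of hypothetical samples: {geometric_mean(hypothetical_k):.2e} m/s")
```

The geometric mean is the exponential of the mean of ln K, so it is the natural average for a quantity such as K that varies over several orders of magnitude.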

Figures 7 and 8 also reveal that K estimates from various HPT formulae (Cases 2a – 2c)

persistently yield biased low predictions of drawdowns under both steady-state and transient

conditions. Examination of Table 1 reveals that KG values from the three methods are 2.85 × 10⁻⁶ m/s,

5.78 × 10⁻⁶ m/s, and 3.84 × 10⁻⁶ m/s for Case 2a: McCall and Christy (2010), Case 2b: Borden et

al. (2021), and Case 2c: Zhao and Illman (2022b) models, respectively. The KG estimates from the

three HPT formulae (Cases 2a – 2c) are approximately one order of magnitude larger than those

from GSA and permeameter tests (Cases 1a and 1b) and comparable to that from slug tests (Case 1c). In terms of drawdown predictions, the use of three

different models yields results that are slightly different. Specifically, based on Figures 7 and 8,
R² values increase, while both L1 and L2 norms decrease from Cases 2a to 2c. As previously

mentioned, the Case 2c model of Zhao and Illman (2022b) is a site-specific relationship developed for

the NCRS. Therefore, building a site-specific model to interpret HPT data is helpful in terms of

site characterization. However, the improvement in drawdown predictions is not substantial

based on Figures 7 and 8 (see also Figures S4 to S10 in the SI section), as Cases 2a to 2c all

underpredict observed drawdowns. An obvious reason is the limited range of estimated K for the

three models used to interpret HPT data at the NCRS. It is interesting to note that while the KG

values estimated through the three approaches are quite similar, the range of log10K is quite

different for each approach (Table 1). Moreover, although the Case 2c model of Zhao and Illman (2022b)

extends the lower and upper K ranges compared to the other two models (Cases 2a and

2b), the resulting K estimates and corresponding forward simulations still yield biased predictions. This

is likely due to the highly heterogeneous nature of the glaciofluvial deposits at the NCRS. The

connectivity of these units is therefore an important consideration for building groundwater

models that predict drawdowns more accurately.
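
The effect of a limited K range can be illustrated with a toy calculation. The sketch below is not an implementation of any of the HPT models cited here. The bounds, the "true" K values, and the helper `clip_k` are all hypothetical, chosen only to show how truncating low-K values shifts the geometric mean.

```python
from statistics import geometric_mean


def clip_k(k, k_min, k_max):
    """Limit a K value (m/s) to the range a formula is able to report."""
    return min(max(k, k_min), k_max)


# Hypothetical "true" K values (m/s) spanning clay to gravel; not site data.
true_k = [1e-9, 1e-8, 1e-6, 1e-4]

# Hypothetical reporting bounds (m/s); not those of any HPT model cited here.
K_MIN, K_MAX = 1e-7, 1e-4

reported_k = [clip_k(k, K_MIN, K_MAX) for k in true_k]

kg_true = geometric_mean(true_k)
kg_reported = geometric_mean(reported_k)
print(f"KG true = {kg_true:.2e} m/s, KG reported = {kg_reported:.2e} m/s")
```

With these illustrative numbers, raising the low-K values to the lower bound increases KG. A shift in that direction would tend to reduce simulated drawdowns, consistent in sign with the underprediction noted above.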

In contrast to the conventional and HPT K estimates that yield biased drawdown predictions,

we find that inverse modeling approaches (Cases 3a – 3c) with various parameterizations all yield

more accurate drawdown predictions at the NCRS (Figures 7 and 8). Case 3a: PEST Calibrated

Geological Model and Case 3b: Averaged THT Geological Model both yield a good drawdown

match with the measured data, while Case 3c: Highly Parameterized THT Model with prior

geological information yields excellent forward simulation results under both steady and transient

conditions. The calibration of an HGS groundwater flow model based on geological zonation with

PEST (Case 3a) is also a form of THT analysis (Illman et al., 2015), but it differs from the high-

resolution approach (Case 3c) based on SimSLE in VSAFT3. The HGS/PEST calibration (Case
3a) fits all drawdown/buildup data in a least-squares sense and restricts its effective parameter

estimates to 19 geologic zones, while SimSLE in VSAFT3 does not have this constraint. For this

reason, VSAFT3 can adjust more parameters such that the calibrated drawdown-time curves honor

the observed ones during each test. Therefore, VSAFT3’s estimates (Case 3c) yield better

predictions of independent pumping tests.
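
The least-squares criterion described above can be written in the weighted form commonly used in PEST-type calibration. This is a sketch under assumptions: the weights, the log-transformation of parameters, and the choice of one K and one Ss value per zone are illustrative and are not taken from this study.

```latex
\Phi(\mathbf{p}) = \sum_{i=1}^{n} \left[ w_i \left( d_i^{\mathrm{obs}} - d_i^{\mathrm{sim}}(\mathbf{p}) \right) \right]^2,
\qquad
\mathbf{p} = \left( \log_{10} K_1, \ldots, \log_{10} K_{19},\; \log_{10} S_{s,1}, \ldots, \log_{10} S_{s,19} \right)
```

Here d_i^obs and d_i^sim are the observed and simulated drawdowns, w_i are observation weights, and n is the number of drawdown data. In the zonal case the parameter vector is short because every cell in a zone shares the same values. A highly parameterized approach lets K and Ss vary within zones, which gives the inversion many more degrees of freedom to honor the observed drawdown-time curves.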

Based on Tables S10 to S12 and Tables S14 to S16 (SI section), Case 3c: Highly

Parameterized THT Model consistently yields the best R² values and L1 and L2 norms under both steady-state

and transient conditions followed by Case 3a: PEST Calibrated Geological Model and Case 3b:

Averaged THT Geological Model. These results indicate that, even though the K fields (refer to

Figure 6, Cases 3a to 3c) reveal similar overall characteristics, local scale differences in K could

lead to noticeable differences between simulated drawdowns at various observation points. As a

result, Case 3c: Highly Parameterized THT Model that can accurately map both interlayer and

intralayer heterogeneities may be most suitable for high-resolution characterization at highly

heterogeneous sites such as the NCRS.

Should Groundwater Models Consider Variability in Ss?

For transient groundwater flow simulations, estimates of Ss are necessary. However, most

of the conventional and HPT methods do not yield these estimates. In addition, Ss is typically

considered to be much less variable than K and has thus received less attention. Here, the

importance of including homogeneous or heterogeneous Ss estimates for transient groundwater

flow simulation is analyzed by: (1) using an effective Ss from Zhao and Illman (2018), who treated

the multi-aquifer-aquitard system as homogeneous and isotropic; and (2) using estimated Ss

values from Case 3a: PEST Calibrated Geological Model of Zhao and Illman (2018).
To answer the question of whether groundwater models should consider variability in Ss or

not, additional transient simulations are performed with seven pumping tests (PW1-3, PW1-5,

PW3-1, PW3-4, PW5-1, PW5-4, and PW5-5) for Cases 1a to 1c and Cases 2a to 2c. The

corresponding L1 and L2 norms are summarized in Tables S18 and S19 (SI section), where the blue

color represents the results from homogeneous Ss, while the yellow color represents results from

heterogeneous Ss. The bold values of L1 and L2 norms in Tables S18 and S19 indicate the smaller

value for either the homogeneous or heterogeneous Ss case, identifying the case exhibiting less

discrepancy between simulated and observed drawdowns.

Examination of Tables S18 and S19 reveals that for virtually all cases, providing

heterogeneous Ss estimates for each of the 19 layers in the model yields better transient simulation

results than utilizing a single homogeneous Ss value for all 19 layers. Therefore, to achieve

more accurate transient groundwater flow simulation results, it may be advisable to devote more

effort to accurately capturing Ss heterogeneity at sites where the lithology changes significantly

throughout the simulation domain.
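
The case-by-case comparison of norms between the two Ss treatments amounts to picking the smaller value for each case and counting the outcomes. The sketch below is a minimal illustration. The function name `better_ss_case` is hypothetical and the norm values are made up, not taken from Tables S18 or S19.

```python
from collections import Counter


def better_ss_case(norm_homogeneous, norm_heterogeneous):
    """Return which Ss treatment gives the smaller (better) norm."""
    if norm_heterogeneous < norm_homogeneous:
        return "heterogeneous"
    if norm_homogeneous < norm_heterogeneous:
        return "homogeneous"
    return "tie"


# Hypothetical L1 norms (m) as (homogeneous Ss, heterogeneous Ss) pairs;
# these are not values from Tables S18 or S19.
l1_by_case = {
    "Case 1a": (0.30, 0.22),
    "Case 1b": (0.28, 0.25),
    "Case 2a": (0.12, 0.14),
}

tally = Counter(better_ss_case(hom, het) for hom, het in l1_by_case.values())
print(dict(tally))
```

The same tally can be repeated for the L2 norm and for each pumping test to see how consistently one treatment outperforms the other.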

Summary and Conclusions

The accurate characterization of subsurface heterogeneity in K and Ss is important in

building robust groundwater models for improved predictions of groundwater flow and solute

transport. There are several conventional approaches to estimate K including the use of empirical

and analytical formulae to interpret data from GSA, permeameter, slug and pumping tests. Over

the last two decades, several DP-based field tools such as DPIL and HPT have been developed to

characterize high-resolution spatial variations of K in heterogeneous unconsolidated formations.

The newer DP tools and interpretation methods have positioned DP surveys to become one of the
most efficient approaches for site characterization compared to conventional methods, although

information on spatial K variability and connectivity requires interpolation of K values at DP

locations. Inverse modeling methods, such as automatic calibration of geology-based groundwater

models and the more recently developed and tested HT, have proven effective in yielding

robust estimates of K and Ss heterogeneity between boreholes.

Various published studies have compared different methods of

estimating K, but there is a lack of consensus on which method yields the K estimates that are most useful

for groundwater flow models. In this study, we utilize a groundwater flow model, constructed with

19 geological layers representative of a multi-aquifer-aquitard system at the NCRS, to evaluate the

performance of three generations of site characterization approaches for K including: (1)

conventional techniques (Case 1a: GSA; Case 1b: Permeameter Tests; and Case 1c: Slug Tests);

(2) HPT survey data interpreted with three different models [Case 2a: McCall and Christy (2010);

Case 2b: Borden et al. (2021); and Case 2c: Zhao and Illman (2022b)]; and (3) three inverse

modeling approaches (Case 3a: PEST Calibrated Geological Model; Case 3b: Averaged THT

Geological Model; and Case 3c: Highly Parameterized THT Model) in terms of their ability to

predict drawdowns under both steady-state and transient conditions. This study leads to the

following major findings and conclusions:

1. Despite the time and effort required to conduct 270 GSA, 642 permeameter tests, and 43 slug tests,

conventional methods at the NCRS yielded biased K estimates that led to poor predictions

of drawdowns from pumping tests. Most empirical formulae applied with data from GSA

were developed for relatively permeable materials, which presents a challenge for their

application to highly heterogeneous settings consisting of low K media. More importantly,


low core sample recovery from highly permeable zones can lead to biased low K estimates,

which in turn could impact groundwater flow modeling results.

2. The development of DP techniques and the HPT has significantly advanced our capabilities

in high-resolution characterization of K along vertical profiles at DP locations in

unconsolidated media. While the approach yields rapid estimates of K at an unprecedentedly

high-resolution, the estimation of K from HPT survey data may require more attention than

previously thought. In this study, three separate approaches [i.e., Case 2a: McCall and

Christy (2010); Case 2b: Borden et al. (2021); Case 2c: Zhao and Illman (2022b)] were

utilized to estimate K. The K estimates obtained through the three different formulae were

each constrained through varying upper and lower bounds, which presented challenges in

characterizing low permeability materials such as silt and clay. Groundwater flow

simulations with K estimates derived from the three formulae yielded biased predictions of

drawdowns at the NCRS. Given HPT’s significant advantage in hydrogeologic

characterization of unconsolidated deposits, it is necessary to advance the logging

apparatus and corresponding interpretation methods to attain an extended range of

estimates for both higher and lower K geological media.

3. Inverse modeling of pumping test data with geology-based and highly parameterized

geostatistics-based HT models at the NCRS has shown that they yield robust estimates of

K and Ss that are useful for steady-state and transient groundwater flow simulations.

Specifically, the automatic calibration of a groundwater flow model yielded parameter

estimates that consistently led to accurate predictions of pumping tests not used in the

calibration effort. Drawdown predictions were found to improve dramatically by utilizing

a highly parameterized groundwater flow model with parameter estimates from THT that
captured the most salient features of interlayer and intralayer K heterogeneity. Additional

transient simulations, in which heterogeneous Ss values were considered, revealed

clearly improved drawdown predictions, suggesting the benefits of Ss heterogeneity

characterization at sites where large changes in lithology are found. While the accurate

prediction of drawdowns from pumping tests is promising, further studies are needed to

see whether these K distributions are useful for contaminant transport predictions.

4. Our research suggests that inverse modeling is a necessary step in building more robust

groundwater flow models, echoing suggestions by Poeter and Hill (1997) and Carrera et al.

(2005). HT additionally fuses information from multiple pumping tests and can integrate

other data such as from geological investigations (e.g., Zhao and Illman, 2018),

geophysical surveys (e.g., Soueid Ahmed et al., 2015), flowmeter surveys (Li et al., 2008;

Aliouache et al., 2021; Luo et al., 2023), tracer tests (e.g., Yeh and Zhu, 2007; Illman et

al., 2010; Doro et al., 2014), and high-resolution pressure data (Zhao and Illman, 2022a), as well

as K estimates from HPT surveys (Zhao et al., 2023), which can further improve parameter

estimates. However, HT should not be considered a panacea, as the parameter

estimates are highly dependent on model conceptualization and on the accuracy of the data fed into

models, including the forcing functions (i.e., initial and boundary conditions, source/sink terms)

applied to them. Data fusion as part of inverse modeling is encouraged for building more

robust groundwater models and obtaining better parameter estimates, but it should be done

with caution, always considering the information content of the data.

Acknowledgements

The HPT surveys conducted by Geoprobe Systems, GroundTech Solutions Ltd. and the University

of Waterloo (UW) at the NCRS were a result of discussions at the NovCare meeting held at the
University of Waterloo during the summer of 2019. We are very grateful to Wes McCall from

Geoprobe Systems and Jeff Bibbings from GroundTech Solutions Ltd. for visiting UW and

training our staff and students to conduct the HPT surveys at the NCRS. Walter A. Illman

acknowledges the partial support from the Discovery Grant awarded by the Natural Sciences and

Engineering Research Council of Canada (NSERC). Dongwei Sun acknowledges the support from

the Qinhuangdao Architecture Design Institute and Brayden McNeill from Aquanty Inc. who

provided guidance on building the initial HGS model for this study. Finally, we thank the

Executive Editor (Charles Andrews), Mike Fienen, and the two anonymous reviewers for

providing constructive comments that led to an improved manuscript.

Supporting Information

Supporting Information is generally not peer reviewed. Supporting Information can be found in an

online document that contains additional details on the methods used to estimate K and Tables S1 to

S19 as well as Figures S1 to S10 as referenced in the text above.

References

Alexander, M., S. J. Berg, and W. A. Illman. 2011. Field study of hydrogeologic characterization

methods in a heterogeneous aquifer. Ground Water 49, no. 3: 365–382.

Aliouache, M., X. Wang, P. Fischer, G. Massonnat, and H. Jourde. 2021. An inverse approach

integrating flowmeter and pumping test data for three-dimensional aquifer

characterization. Journal of Hydrology 603: 126939.

Aquanty, Inc. 2019. HydroGeoSphere: A three-dimensional numerical model describing fully

integrated subsurface and surface flow and solute transport. Waterloo, Ontario, Canada.
ARANZ Geo Limited. 2015. Leapfrog Hydro 2.2.3. 3D Geological Modeling Software.

Barr, D. W. 2001. Coefficient of permeability determined by measurable parameters. Ground

Water 39, no. 3: 356–361.

Beckie, R., and C. F. Harvey. 2002. What does a slug test measure: an investigation of

instrument response and the effects of heterogeneity. Water Resources Research 38, no.

12: 1290.

Berg, S. J., P. A. Hsieh, and W. A. Illman. 2011. Estimating hydraulic parameters when

poroelastic effects are significant. Ground Water 49, no. 6: 815–829.

Berg, S. J., and W. A. Illman. 2011a. Capturing aquifer heterogeneity: comparison of approaches

through controlled sandbox experiments. Water Resources Research 47, no. 9: W09514.

Berg, S. J., and W. A. Illman. 2011b. Three-dimensional transient hydraulic tomography in a

highly heterogeneous glaciofluvial aquifer-aquitard system. Water Resources Research

47, no. 10: W10507.

Bohling, G. C., X. Zhan, J. J. Butler, Jr., and L. Zheng. 2002. Steady shape analysis of

tomographic pumping tests for characterization of aquifer heterogeneities. Water

Resources Research 38, no. 12: 1324.

Bohling, G. C., J. J. Butler, Jr., X. Zhan, and M. D. Knoll. 2007. A field assessment of the value

of steady shape hydraulic tomography for characterization of aquifer heterogeneities.

Water Resources Research 43: W05430.

Borden, R. C., K. Y. Cha, and G. Liu. 2021. A physically based approach for estimating

hydraulic conductivity from HPT pressure and flowrate. Ground Water 59, no. 2: 266–

272.
Bouwer, H. and R. C. Rice. 1976. A slug test method for determining hydraulic conductivity of

unconfined aquifers with completely or partially penetrating wells. Water Resources

Research 12, no. 3: 423–428.

Brauchler, R., R. Hu, L. Hu, S. Jiménez, P. Bayer, P. Dietrich, and T. Ptak. 2013. Rapid field

application of hydraulic tomography for resolving aquifer heterogeneity in

unconsolidated sediments. Water Resources Research 49, no. 4: 2013–2024.

Butler, J. J., Jr. 2019. The Design, Performance, and Analysis of Slug Tests. 2nd ed. CRC Press,

Boca Raton, FL, 280 pp.

Butler, J. J., Jr. and J. M. Healey. 1998. Relationship between pumping test and slug-test

parameters: scale effect or artifact? Ground Water 36: 305–313.

Butler, J. J., Jr. 2005. Hydrogeological methods for estimation of spatial variations in hydraulic

conductivity. Hydrogeophysics, 23–58. Springer Netherlands, 527 pp.

Butler, J. J., Jr., E. J. Garnett, and J. M. Healey. 2003. Analysis of slug tests in formations of high

hydraulic conductivity. Ground Water 41, no. 5: 620–630.

Butler, J. J., Jr., P. Dietrich, V. Wittig, and T. Christy. 2007. Characterizing hydraulic conductivity

with the direct-push permeameter. Ground Water 45, no. 4: 409–419.

Cardiff, M., W. Barrash, M. Thoma, and B. Malama. 2011. Information content of slug tests for

estimating hydraulic properties in realistic, high-conductivity aquifer scenarios. Journal

of Hydrology 403, no. 1–2: 66–82.

Cardiff, M., W. Barrash, and P. K. Kitanidis. 2013. Hydraulic conductivity imaging from 3-D

transient hydraulic tomography at several pumping/observation densities. Water

Resources Research 49, no. 11: 7311–7326.


Carrera, J., A. Alcolea, A. Medina, J. Hidalgo, and L. J. Slooten. 2005. Inverse problem in

hydrogeology. Hydrogeology Journal 13: 206–222.

Castagna, M., M. W. Becker, and A. Bellin. 2011. Joint estimation of transmissivity and

storativity in a bedrock fracture. Water Resources Research 47, no. 9: W09504.

Chapuis, R. P., V. Dallaire, D. Marcotte, M. Chouteau, N. Acevedo, and F. Gagnon. 2005.

Evaluating the hydraulic conductivity at three different scales within an unconfined sand

aquifer at Lachenaie, Quebec. Canadian Geotechnical Journal 42, no. 4: 1212–1220.

Clauser, C. 1992. Permeability of crystalline rocks. Eos Transactions American Geophysical

Union 73, no. 21: 233–238.

Cooper, H. H., and C. E. Jacob. 1946. A generalized graphical method for evaluating formation

constants and summarizing well-field history. Eos Transactions American Geophysical

Union 27, no. 4: 526–534.

Devlin, J. F. 2015. HydrogeoSieveXL: an Excel-based tool to estimate hydraulic conductivity

from grain-size analysis. Hydrogeology Journal 23, no. 4: 837–844.

Dietrich, P., J. J. Butler, Jr. and K. Faiß. 2008. A rapid method for hydraulic profiling in

unconsolidated formations. Ground Water 46, no. 2: 323–328.

Doherty, J. 2005. PEST: Model-Independent Parameter Estimation User Manual. Watermark

Numerical Computing, Brisbane, Australia.

Doherty, J., and D. Welter. 2010. A short exploration of structural noise. Water Resources

Research 46: W05525.


Doherty, J. 2015. Calibration and Uncertainty Analysis for Complex Environmental Models,

PEST: complete theory and what it means for modelling the real world, Watermark

Numerical Computing, 237 pp.

Doro, K. O., O. A. Cirpka, and C. Leven. 2014. Tracer tomography: Design concepts and field

experiments using heat as a tracer. Groundwater 53, no. S1: 139–148.

Fischer, P., A. Jardani, and N. Lecoq. 2018. Hydraulic tomography of discrete networks of

conduits and fractures in a karstic aquifer by using a deterministic inversion algorithm.

Advances in Water Resources 112: 83–94.

Freeze, R. A., and J. A. Cherry. 1977. Groundwater. Prentice-Hall.

Geoprobe. 2015. Geoprobe® Hydraulic Profiling Tool (HPT) System Standard Operating

Procedure.

Hazen, A. 1911. Discussion: Dams on sand foundations. Transactions, American Society of Civil

Engineers 73, no. 11: 199.

Hinsby, K., P. L. Bjerg, L. J. Andersen, B. Skov, and E. V. Clausen. 1992. A mini slug test

method for determination of a local hydraulic conductivity of an unconfined sandy

aquifer. Journal of Hydrology 136, no. 1–4: 87–106.

Hsieh, P. A. and S. P. Neuman. 1985. Field determination of the three-dimensional hydraulic

conductivity tensor of anisotropic media. 1. Theory. Water Resources Research 21, no.

11: 1655–1665.

Hu, R., R. Brauchler, M. Herold, and P. Bayer. 2011. Hydraulic tomography analog outcrop

study: Combining travel time and steady shape inversion. Journal of Hydrology 409, no.

1–2: 350–362.
Huang, S.-Y., J.-C. Wen, T.-C. J. Yeh, W. Lu, H.-L. Juan, C.-M. Tseng, J.-H. Lee, and K.-C.

Chang. 2011. Robustness of joint interpretation of sequential pumping tests: Numerical

and field experiments. Water Resources Research 47, no. 10: W10530.

Hvorslev, M. J. 1951. Time Lag and Soil Permeability in Ground-Water Observations, Bull. No.

36. Vicksburg, Mississippi: Waterways Experiment Station, Corps of Engineers, U.S.

Army, 1–50.

Hyder, Z., J. J. Butler, Jr., C. D. McElwee, and W. Liu. 1994. Slug tests in partially penetrating

wells. Water Resources Research 30, no. 11: 2945–2957.

Illman, W. A. 2006. Strong field evidence of directional permeability scale effect in fractured

rock. Journal of Hydrology 319, no. 1–4: 227–236.

Illman, W. A., X. Liu, and A. J. Craig. 2007. Steady-state hydraulic tomography in a laboratory

aquifer with deterministic heterogeneity: Multi-method and multiscale validation of

hydraulic conductivity tomograms. Journal of Hydrology 341, no. 3–4: 222–234.

Illman, W. A., X. Liu, S. Takeuchi, T.-C. J. Yeh, K. Ando, and H. Saegusa. 2009. Hydraulic

tomography in fractured granite: Mizunami Underground Research site, Japan. Water

Resources Research 45: W01406.

Illman, W. A., S. J. Berg, X. Liu, and A. Massi. 2010. Hydraulic/partitioning tracer tomography

for DNAPL source zone characterization: Small-scale sandbox experiments.

Environmental Science & Technology 44, no. 22: 8609–8614.

Illman, W. A., S. J. Berg, and Z. Zhao. 2015. Should hydraulic tomography be interpreted using

geostatistical inverse modeling? A laboratory sandbox investigation. Water Resources

Research 51: 3219–3237.


Jiang, L., R. Sun, T.-C. J. Yeh, and X. Liang. 2021. Inverse modeling of different stimuli and

hydraulic tomography: A laboratory sandbox investigation. Journal of Hydrology 603:

127108.

Karrow, P. F. 1979. Quaternary geology of the University of Waterloo campus. Department of

Earth Sciences, University of Waterloo, Waterloo, ON.

Karrow, P. F. 1993. Quaternary geology, Stratford-Conestogo area. Ontario Ministry of

Northern Development and Mines, 283.

Klute, A., and C. Dirksen. 1986. Hydraulic conductivity and diffusivity: laboratory methods.

Methods of Soil Analysis, 687–734. John Wiley & Sons, Ltd.

Kozeny, J. 1953. Das Wasser im Boden. Grundwasserbewegung. Hydraulik, 380–445. Springer.

Krumbein, W. C., and G. D. Monk. 1943. Permeability as a function of the size parameters of

unconsolidated sand. Transactions of the AIME 151, no. 1: 153–163.

Li, W., A. Englert, O. A. Cirpka, and H. Vereecken. 2008. Three-dimensional geostatistical

inversion of flowmeter and pumping test data. Groundwater 46, no. 2: 193–201.

Liu, G., J. J. Butler, Jr., E. Reboulet, and S. Knobbe. 2012. Hydraulic conductivity profiling with

direct push methods. Grundwasser 17, no. 1: 19–29.

Liu, X., W. A. Illman, A. J. Craig, J. Zhu, and T.-C. J. Yeh. 2007. Laboratory sandbox validation

of transient hydraulic tomography. Water Resources Research 43, no. 5: W05404.

Luo, N., Z. Zhao, W. A. Illman, and S. J. Berg. 2017. Comparative study of transient hydraulic

tomography with varying parameterizations and zonations: Laboratory sandbox

investigation. Journal of Hydrology 554: 758–779.


Luo, N., W. A. Illman, and Y. Zha. 2022. Large-scale three-dimensional hydraulic tomography

analyses of long-term municipal wellfield operations. Journal of Hydrology 610: 127911.

Luo, N., Z. Zhao, W. A. Illman, Y. Zha, C.-M. W. Mok, and T.-C. J. Yeh. 2023. Three-

dimensional steady-state hydraulic tomography analysis with integration of cross-hole

flowmeter data at a highly heterogeneous site. Water Resources Research 59:

e2022WR034034.

McCall, W., and T. M. Christy. 2010. Development of a Hydraulic Conductivity-Estimate for the

Hydraulic Profiling Tool (HPT) Abstract and Presentation, The 2010 North American

Environmental Field Conference & Exposition. The Nielsen Environmental Field School,

Las Cruces, NM. January.

McCall, W., and T. M. Christy. 2020. The hydraulic profiling tool for hydrogeologic

investigation of unconsolidated formations. Groundwater Monitoring & Remediation 40,

no. 3: 89–103.

Neuman, S. P., G. R. Walter, H. W. Bentley, J. J. Ward, and D. D. Gonzales. 1984.

Determination of horizontal aquifer anisotropy with three wells. Ground Water 22, no. 1:

66-72.

Ning, Z., N. Luo, K. Inaba, T. Nakashima, T. Shimizu, and W. A. Illman. 2023. Three-

dimensional hydraulic tomography analyses to investigate commingling issues of

reproducibility, data density, and geological prior models. Journal of Hydrology 616:

128785.

Poeter, E. P., and M. C. Hill. 1997. Inverse models: A necessary next step in ground-water

modeling. Groundwater 35, no. 2: 250–260.


Poeter, E. P., and M. C. Hill. 1998. Documentation of UCODE, a computer code for universal

inverse modeling. USGS Water-Resources Investigations Report 98-4080. Reston,

Virginia, USGS.

Puckett, W. E., J. H. Dane, and B. F. Hajek. 1985. Physical and mineralogical data to determine

soil hydraulic properties. Soil Science Society of America Journal 49, no. 4: 831–836.

Rehfeldt, K. R., J. M. Boggs, and L. W. Gelhar. 1992. Field study of dispersion in a

heterogeneous aquifer: geostatistical analysis of hydraulic conductivity. Water Resources

Research 28, no. 12: 3309–3324.

Rodrigues, J. D. 1983. The Noordbergum effect and characterization of aquitards at the Rio

Maior mining project. Ground Water 21, no. 2: 200–207.

Rosas, J., O. Lopez, T. M. Missimer, K. M. Coulibaly, A. H. A. Dehwah, K. Sesler, L. R. Lujan,

and D. Mantilla. 2014. Determination of hydraulic conductivity from grain-size

distribution for different depositional environments. Ground Water 52, no. 3: 399–413.

Rovey, C. W., II, and D. S. Cherkauer. 1995. Scale dependency of hydraulic conductivity

measurements. Ground Water 33, no. 5: 769–780.

Sebol, L. A. 2000. Determination of groundwater age using CFCs in three shallow aquifers in

Southern Ontario. Ph.D. dissertation, Department of Earth and Environmental Sciences,

University of Waterloo, Waterloo, Ontario, Canada.

Soueid Ahmed, A., A. Jardani, A. Revil, and J. P. Dupont. 2014. Hydraulic conductivity field

characterization from the joint inversion of hydraulic heads and self-potential data. Water

Resources Research 50, no. 4: 3502-3522.


Straface, S., T.-C. J. Yeh, J. Zhu, S. Troisi, and C. H. Lee. 2007. Sequential aquifer tests at a well

field, Montalto Uffugo Scalo, Italy. Water Resources Research 43, no. 7: W07432.

Sudicky, E. A. 1988. Reply. Water Resources Research 24, no. 6: 895–896.

Sun, D., N. Luo, A. Vandenhoff, C. Wang, Z. Zhao, D. L. Rudolph, and W. A. Illman. 2022.

Evaluation of the hydraulic profiling tool (HPT) at a highly heterogeneous field site

underlain by glaciofluvial deposits. Draft Technical Report submitted to Geoprobe

Systems, 74 pp.

Theis, C. V. 1935. The relation between the lowering of the piezometric surface and the rate and

duration of discharge of a well using ground-water storage. Eos, Transactions American

Geophysical Union 16: 519–524.

Tiedeman, C. R., and W. Barrash. 2020. Hydraulic tomography: 3D hydraulic conductivity,

fracture network, and connectivity in mudstone. Ground Water 58, no. 2: 238–257.

Tong, X., W. A. Illman, S. J. Berg, and N. Luo. 2021. Hydraulic tomography analysis of

municipal-well operation data with geology-based groundwater models. Hydrogeology

Journal 29, no. 5: 1979–1997.

Verruijt, A. 1969. Elastic storage of aquifers. Flow through Porous Media, 1: 331–376.

Vesselinov, V. V., S. P. Neuman, and W. A. Illman. 2001. Three-dimensional numerical

inversion of pneumatic cross-hole tests in unsaturated fractured tuff 2. Equivalent

parameters, high-resolution stochastic imaging and scale effects. Water Resources

Research 37, no. 12: 3019–3041.

Vienken, T., and P. Dietrich. 2011. Field evaluation of methods for determining hydraulic

conductivity from grain size data. Journal of Hydrology 400, no. 1–2: 58–71.
Vukovic, M., and A. Soro. 1992. Determination of Hydraulic Conductivity of Porous Media

from Grain-Size Composition. Water Resources Publications, LLC, Highlands Ranch,

Colorado.

White, I. 1988. Comment on “A natural gradient experiment on solute transport in a sand

aquifer: Spatial variability of hydraulic conductivity and its role in the dispersion

process” by E. A. Sudicky. Water Resources Research 24, no. 6: 892-894.

Williamson, P. 2016. Examination of the electrical-hydraulic conductivity relationship at a

highly heterogeneous site. MSc report, University of Waterloo, 78 pp.

Wu, C.-M., T.-C. J. Yeh, J. Zhu, T. H. Lee, N.-S. Hsu, C.-H. Chen, and A. F. Sancho. 2005.

Traditional analysis of aquifer tests: Comparing apples to oranges? Water Resources

Research 41: W09402.

Xiang, J., T.-C. J. Yeh, C.-H. Lee, K.-C. Hsu, and J.-C. Wen. 2009. A simultaneous successive

linear estimator and a guide for hydraulic tomography analysis. Water Resources

Research 45, no. 2: W02432.

Xie, Q. 2015. Slug tests analysis with different analytical models at a highly heterogeneous field

site. B.Sc. thesis, Department of Earth and Environmental Sciences, University of

Waterloo, Waterloo, Ontario, Canada.

Yeh, T.-C. J., R. Srivastava, A. Guzman, and T. Harter. 1993. A numerical model for water flow

and chemical transport in variably saturated porous media. Ground Water 31, no. 4: 634–

644.
Yeh, T.-C. J., J. Mas-Pla, T. M. Williams, and J. F. McCarthy. 1995. Observation and three-

dimensional simulation of chloride plumes in a sandy aquifer under forced-gradient

conditions. Water Resources Research 31, no. 9: 2141-2157.

Yeh, T.-C. J., and S. Liu. 2000. Hydraulic tomography: development of a new aquifer test

method. Water Resources Research 36, no. 8: 2095–2105.

Yeh, T.-C. J., and J. Zhu. 2007. Hydraulic/partitioning tracer tomography for characterization of

dense nonaqueous phase liquid source zones. Water Resources Research 43: W06435.

Yeh, T.-C. J., D. Mao, Y. Zha, J.-C. Wen, L. Wan, K.-C. Hsu, and C.-H. Lee. 2015. Uniqueness,

scale, and resolution issues in groundwater model parameter identification. Water Science

and Engineering 8, no. 3: 175-194.

Zha, Y., T.-C. J. Yeh, W. A. Illman, T. Tanaka, P. Bruines, H. Onoe, H. Saegusa, D. Mao, S.

Takeuchi, and J.-C. Wen. 2016. An application of hydraulic tomography to a large-scale

fractured granite site, Mizunami, Japan. Groundwater 54, no. 6: 793–804.

Zha, Y., T.-C. J. Yeh, W. A. Illman, C. M. W. Mok, C.-H. M. Tso, and Y.-L. Wang. 2019.

Exploitation of pump-and-treat systems for characterization of hydraulic heterogeneity.

Journal of Hydrology 573: 324-340.

Zhao, Z., W. A. Illman, T.-C. J. Yeh, S. J. Berg, and D. Mao. 2015. Validation of hydraulic

tomography in an unconfined aquifer: A controlled sandbox study. Water Resources

Research 51: 4137–4155.

Zhao, Z., W. A. Illman, and S. J. Berg. 2016. On the importance of geological data for hydraulic

tomography analysis: Laboratory sandbox study. Journal of Hydrology 542: 156–171.


Zhao, Z., and W. A. Illman. 2017. On the importance of geological data for three-dimensional

steady-state hydraulic tomography analysis at a highly heterogeneous aquifer-aquitard

system. Journal of Hydrology 544: 640–657.

Zhao, Z., and W. A. Illman. 2018. Three-dimensional imaging of aquifer and aquitard

heterogeneity via transient hydraulic tomography at a highly heterogeneous field site.

Journal of Hydrology 559: 392–410.

Zhao, Z., and W. A. Illman. 2022a. Integrating hydraulic profiling tool pressure logs and

hydraulic tomography for improved high-resolution characterization of subsurface

heterogeneity. Journal of Hydrology 610: 127971.

Zhao, Z., and W. A. Illman. 2022b. Improved high-resolution characterization of hydraulic

conductivity through inverse modeling of HPT profiles and steady-state hydraulic

tomography: Field and synthetic studies. Journal of Hydrology 612: 128124.

Zhao, Z., S. J. Berg, W. A. Illman, and Y. Qi. 2022. Improving predictions of solute transport in

a laboratory sandbox aquifer through high-resolution characterization with hydraulic

tomography. Journal of Hydrology 615: 128673.

Zhao, Z., N. Luo, and W. A. Illman. 2023. Geostatistical analysis of high-resolution hydraulic

conductivity estimates from the hydraulic profiling tool and integration with hydraulic

tomography at a highly heterogeneous field site. Journal of Hydrology 617: 129060.

Zhu, J., and T.-C. J. Yeh. 2005. Characterization of aquifer heterogeneity using transient

hydraulic tomography. Water Resources Research 41, no. 7: 1–10.


Table 1. Descriptive statistics of K and log10K from various methods at the NCRS.

Method | n | Min. K (m/s) | Max. K (m/s) | K_G (m/s) | Range of log10 K | Variance of log10 K (σ²)
Case 1a: GSA (Three Models) | 270 | 3.07 × 10^-11 | 2.50 × 10^-3 | 1.19 × 10^-7 | 7.91 | 2.63
Case 1b: Permeameter Tests | 642 | 1.15 × 10^-10 | 4.63 × 10^-3 | 3.03 × 10^-7 | 7.60 | 1.55
Case 1c: Slug Tests | 43 | 1.21 × 10^-8 | 1.68 × 10^-4 | 2.65 × 10^-6 | 4.14 | 1.47
Case 2a: HPT (McCall and Christy, 2010) | 7,660 | 3.53 × 10^-7 | 2.65 × 10^-4 | 2.85 × 10^-6 | 2.88 | 1.28
Case 2b: HPT (Borden et al., 2021) | 7,660 | 1.13 × 10^-8 | 2.69 × 10^-4 | 5.78 × 10^-6 | 4.38 | 0.38
Case 2c: HPT (Zhao and Illman, 2022b) | 7,660 | 3.78 × 10^-10 | 6.90 × 10^-4 | 3.84 × 10^-6 | 6.26 | 0.85
Case 3a: PEST Calibrated Geological Model | 19 | 2.53 × 10^-9 | 1.07 × 10^-4 | 1.25 × 10^-6 | 4.63 | 1.61
Case 3b: Averaged THT Geological Model | 19 | 5.44 × 10^-9 | 1.29 × 10^-4 | 1.14 × 10^-6 | 4.37 | 1.46
Case 3c: Highly Parameterized THT Model | 31,713 | 4.20 × 10^-11 | 2.90 × 10^-3 | 5.79 × 10^-7 | 7.84 | 1.47
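
As a rough illustration of how the columns of Table 1 relate to one another, the sketch below computes the same kinds of summary statistics from a list of K values. It assumes that the KG column is the geometric mean of K and that the variance is the sample (n − 1) variance of log10 K; the table does not state which estimator was used. The function names `log10_range` and `describe_k` and the three sample values are hypothetical and are not site data.

```python
import math
import statistics


def log10_range(k_min, k_max):
    """Span of log10 K between the smallest and largest K values."""
    return math.log10(k_max) - math.log10(k_min)


def describe_k(k_values):
    """Summarize K samples (m/s) in the style of Table 1.

    Assumptions: the geometric mean corresponds to the KG column, and the
    variance is the sample (n - 1) variance of log10 K.
    """
    logs = [math.log10(k) for k in k_values]
    return {
        "n": len(k_values),
        "min_k": min(k_values),
        "max_k": max(k_values),
        "k_g": 10 ** statistics.fmean(logs),  # geometric mean via mean of logs
        "log10_range": max(logs) - min(logs),
        "log10_variance": statistics.variance(logs),
    }


# Consistency check using the Case 1a minimum and maximum K from Table 1.
case_1a_range = log10_range(3.07e-11, 2.50e-3)
print(f"Case 1a range of log10 K: {case_1a_range:.2f}")

# Hypothetical samples for illustration only.
print(describe_k([1e-6, 1e-5, 1e-4]))
```

Under these assumptions, the range computed from the Case 1a minimum and maximum agrees with the tabulated value of 7.91, which suggests that the range column is simply the difference between the log10 of the largest and smallest K values.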

Figure Captions

Figure 1. a) Schematic diagram in plan view showing the well configuration including the CMT

and PW well network and nine NC wells where geological data are obtained, as well as 11 HPT

profile locations. Gray dashed lines represent four geological cross sections A-A’, B-B’, C-C’ and
D-D’ as presented in Figure 2; b) 3D perspective view of wells and DP locations within the 15 m

× 15 m well cluster area shown as a blue dashed area in Figure 1a along with numbered well

screens and pumped ports, as well as high-resolution HPT profile locations.

Figure 2. Cross-sectional view of the 19-layer geological zonation model with CMT and PW

screened intervals shown in cross sections C-C’ and D-D’. Cross sections along A-A’ and B-B’

are available in Figure S1 of the Supporting Information (SI) section. The 19 layers represent 7

different material types as indicated in the stratigraphic index. The 7 material types were obtained

through examination of cores from 18 boreholes at the site. Specifically, the 19 layers indicated

on cross sections C-C’ and D-D’ are clay (1, 4, 8, 12, 16, 18), silt and clay (17, 19), silt (2, 7, 10,

14), sandy silt (6, 9, 13), silt and sand (5), sand (3, 11) and sand and gravel (15). On cross sections

C-C’ and D-D’, layer numbers are italicized and numbers along PW and CMT wells indicate port

numbers (e.g., PW1-1, PW1-2, and so on).

Figure 3. Vertical profiles of log10K estimates from various approaches with K in units of m/s

along CMT3 and HPT6 plotted against site stratigraphy.

Figure 4. Box-and-whisker plots of log10K estimates with K in units of m/s from various site

characterization methods for 19 layers of the geological model.

Figure 5. Log10KG estimates with KG in units of m/s from various site characterization

approaches for 19 layers of the geological model. Log10K values from Case 3c are also plotted,

but as the values are not provided in terms of layers, those values are plotted against Depth (m)

on the right axis based on the vertical profile of PW1 at the center of the simulation domain.
Figure 6. K distributions at the NCRS from various site characterization approaches. CMT and

PW well locations (red lines) along with their screened intervals (black colour) as well as HPT

survey locations (dashed pink lines) are shown on each subfigure.

Figure 7. Scatterplots of observed versus simulated drawdowns from various K characterization

approaches for model validation under steady state conditions.

Figure 8. Scatterplots of observed versus simulated drawdowns from various K characterization

approaches for model validation under transient conditions.

