The following flow chart summarizes the methodology developed for the large scale agrichemical transport in the Midwest rivers:
Figure 4.1 Methodology of the large scale modeling of agrichemical concentrations in the Midwest rivers.
Two constituents are selected for the study of the transport of agricultural chemicals in surface waters: nitrate plus nitrite and atrazine. The following factors influenced this selection:
It is assumed that the nitrate plus nitrite concentrations in Midwest streams are mainly derived from chemical fertilizers.
Nitrogen (N) in soils occurs as organic or inorganic N. The inorganic forms, include ammonium (NH4-), nitrite (NO2-), nitrate (NO3-), nitrous oxide (N2O), nitric oxide (NO), and elemental N (N2). The three most important forms, NH4-, NO2-,and NO3-, usually represent 2 to 5% of the total soil N. The source of NH4+ is from mineralization of organic N and from fertilizers. Some of NH4+ is converted to NO2-, which is toxic to plant roots by bacteria Nitrosomonas (2NH4- + 3O2 = 2NO2- + 2H2O + 4H+), and then oxidized to NO3- by Nitrobacter (2NO2- + O2 = 2NO3-). The NO3- anion is very mobile and subject to leaching losses (Tisdale et al., 1993).
Nitrate in streams is derived from many anthropogenic and natural resources including chemical fertilizers, animal wastes, domestic sewage, legumes, mineralization of vegetation, soil organic matter, and from atmosphere through electrical, combustion and industrial processes. NO3- is a very soluble and mobile anion. It can be transported from agricultural fields in both overland flow and subsurface flow, and by volatilization into the atmosphere. Ammonium is adsorbed by the soil colloids and moves very little until converted to NO3-
In contrast to the overland transport in which nitrate takes minutes or hours to get to a stream, downward vertical leaching and the subsequent underground travel is a long process which takes months or years. The soil system has a strong memory with respect to nitrate production and leaching. Jones and Burt (1993) presented a study in which 64% of annual nitrate concentration in streams was explained by a stepwise regression involving the year of measurement, and each of the previous two years. It may take nitrate years or even decades to appear in rivers as a base flow pollutant.
There are some losses of nitrate due to erosion, but for humid temperate climates, erosion is generally an insignificant process compared with leaching and runoff (OECD, 1986). Other authors indicate that the adsorption has no marked influence on the rate of NO3- movement (Keeney, 1983, Bailey and Swank 1983). The predominant losses of nitrate in an agricultural field are due to assimilation by row crops and by other terrestrial and aquatic plants.
Although, the best time of fertilizer application is at the time of peak N demand of the crop, it is seldom feasible. In north central United States most of the N application occurs late summer and fall. It is influenced by the following factors (Tisdale et al., 1993):
The time of the nitrogen fertilizers application as well as the decrease in temperature during late fall, winter, and early spring, cause high nitrate plus nitrite concentrations in surface waters. These concentrations decrease in late spring and summer when the plant demands for nutrients are high (Goolsby and Battaglin, 1993, Davis and Keller, 1983). In addition, a significant decrease in nitrate concentration in surface water may result from assimilation of nitrate by algae and by instream riparian macrophytes (Heathwaite, 1993, Moore, 1991) as well as nitrate may be converted by the denitrification bacteria and various chemical processes into free nitrogen and nitrogen oxides which escape into the atmosphere. Since these processes are stimulated by high temperatures and low flow rates, the highest loss of nitrate in lakes and rivers occurs during the summer. Lakes and reservoirs act as a "buffer," thus they are less responsive to seasonal changes then rivers are (OECD, 1986).
Denitrification and other processes of biochemical degradation of nitrate can be modeled as a first order reaction with an "overall" decay rate dependent on temperature and carbon content. Some losses may result from infiltration (river water seepage into groundwater).
Atrazine is a herbicide that controls broadleaf weeds in fields of corn and sorghum. It is one of the most widely used herbicides in the United states (Comfort and Roeth, 1996). The EPA set the drinking water health limit for atrazine at 3 µg/L (ppb). The conventional water treatment does not remove this herbicide. Recent studies conducted in 29 communities throughout the Midwest, Louisiana, and the Chesapeake Bay detected high concentrations of atrazine in tap water in months from May through July, some of them exceeding EPA Maximum Contamination Level (EWG, 1996).
Numerous laboratory tests as well as field studies have been performed to determine the behavior of this herbicide in different chemical and physical environments for almost half of the century. Some of the published parameters are as follows: Atrazine (2-chloro-4-ethylamino-6-isopropylamino-s-triazine) is a low solubility herbicide; its water solubility in typical temperature and pH varies from 30 to 35 ppm (mg/L). The volatility of this chemical is very low ( vapor pressure = 0.3*10-6 mm Hg = 0.00004 Pa). Published values of the octanol extrability coefficient (soil sorption coefficient) Koc are from 130 to 172. The octanol-water partition coefficient Kow equals 251 (e.g., Weber, 1972, Hamaker, 1975, Wauchope, 1978, Rao, et al., 1983, Weber, 1988, work cited by Plimmer, 1988).
There are two parameters that characterize the chemical decay in soil: half-life and chemical persistence. Kruger et al. (1993) reported half-life of atrazine in soil under unsaturated conditions ranged from 41 d to 231 d, whereas in saturated soil at the 90-120 cm depth, the half-life was 87 d. Similar values were cited by Goring et al. (1975). Wauchope (1978) determined that atrazine persistence in soil, that is the approximate time for 90% disappearance from soil, is 12 months and may vary from 6 to 18 months, that corresponds to half-life from 55 d to 165 d, depending on climate and soil.
Atrazine decays into many degradation products which can be more persistent and mobile than their parent compound. During a USGS 1991 Midcontinent survey of near-surface aquifers, deethylatrazine, an atrazine metabolite, was the most frequently detected compound followed by atrazine, and then deisopropylatrazine, another atrazine metabolite (Kolpin et al., 1991). The half life of the deisopropylatrazine is much longer in surface water than in soil (Goolsby et al., 1993 pp. 51-62, Comfort and Roeth, 1996). However, Kruger et. al. (1993) found that deisopropylatrazine may be even less persistent under saturated conditions then in saturated soil. Their estimations of the deisopropylatrazine half-life ranged from 32 d to 173 d in unsaturated soil at top 30 cm, and from 58 d to 173 d in saturated soil at the 90 to 120 cm depth.
Weber (1988) presented results of research, which he conducted with J. A. Best, on the effect of pH on the dissipation of atrazine applied to soil. No parent component volatilization was detected; 0.1%-0.2% was found in leachate, plants used two to four percent. Ninety percent of atrazine was retained in the soil layer.
The adsorption and movement of s-triazines in soil depends upon such factors as soil organic matter, clay minerals, pH, temperature, soil moisture, concentration and species of other ions in the system (Weber 1972, Weber 1988, Goring et al. 1975). In CREAMS model, the pesticide in runoff is partitioned between the solution phase and the sediment phase (Knisel et al., 1983).
Study by the USGS of the occurrence of herbicides in precipitation in the Midwest and Northeastern United States showed that significant amounts of atrazine were lost through volatilization and subsequently returned to the land through precipitation washout. The amount of atrazine in precipitation washout was equal to one half the loading found in the Mississippi River (Goolsby et al., 1993 pp. 75-86). Plimmer (1988) discussed reports which described the identification of atrazine in fog. These findings were apparently in contradiction to the findings about negligible volatilization published by Weber (1972, 1988). As was pointed out by Kenaga (1975), vapor pressure is useful where molecules are volatilizing from a liquid or solid mass of the same compound. When a pesticide is applied to heterogeneous surfaces such as natural water, soil, foliage, wood, or glass, volatility is variable because sorption varies. There is a possibility that atrazine enters the atmosphere adsorbed on particulate matter through dust blowing from the land surface. The process of atrazine volatility is not well understood.
The Mississippi - Missouri basin above Thebes, Illinois (drainage area 2.3*106 km² ) together with the Ohio River basin above Grand Chain, Illinois (drainage area 0.5*106 km²) constitute the primary region that is used for estimation of statistical model parameters. Its extend is determined by the USGS reconnaissance study of selected herbicides and nitrate in Midwestern United States (Scribner et al., 1993). This region is one of the most extensive agricultural areas in the country, producing over 80% of all US corn and soybeans (Oberle and Burkart, 1994).
Figure 4.2 The Mississippi-Missouri and Ohio River Basins.
Since processing data for an area that covers almost three million square kilometers requires a computer with adequate operational and storage memory, the final model is built, and verified on the selected subregion, i.e., the Iowa-Cedar River watershed in Iowa. There are two sites for which measurement of agricultural chemicals are available: Old Man's Creek near Iowa City and Cedar River at Palisades. Flow rate is recorded in as many as 36 USGS gauging stations. Diversity of such geographic features as lakes, reservoirs, wetlands, hills, plains, and a wide range of stream sizes constitutes a very good representation of the majority of the Midwest. Figure 4.3 shows the Iowa and Cedar Rivers, and the location of gauging stations.
Figure 4.3 The Iowa River with tributaries and the USGS gauging stations.
All maps utilized in this research are represented in Albers Conical Equal Area Projection, generally accepted for large maps of the USA. This projection has the valuable property of equal area representation, combined with a scale error that is practically the minimum attainable in any system covering such a large area in a single sheet.
The following parameters are applied (standard for USA): units = meters, first standard parallel = 29º30'00'', second standard parallel = 45º30'00'', latitude of projection's origin = 23º00'00'', false easting = 0.000 m, false northing = 0.000 m, longitude of central meridian = -96º00'00''. Using one common projection for the whole Midwest eliminates the problems associated with merging separately modeling regions into one unit.
The Albers projection is of the conical type, in which the meridians are straight lines meeting in common point beyond the limits of the map, and the parallels are concentric circles, the center of which is at the point of intersection of the meridians. The meridians and the parallels intersect at right angles and the arcs of longitude along any given parallel are of equal length. The spheroid is intersected by a cone at two parallels known as the standard parallels for the area to be represented. On the two standard parallels, arcs of longitude are represented in their true lengths, or in their exact scale. Between the standard parallels, the scale along the meridians is too large and beyond them too small (Deetz and Adams 1969).
There are two basic mechanisms that are responsible for the transport of dissolved and suspended solutes in surface waters: advection and diffusion/dispersion. These two processes are described by the advection-dispersion equation which is a fundamental equation for majority of the pollutant transport models. Equation (4.1) describes one-dimensional advection-dispersion in the reach with uniform cross-sectional area.
 (4.1)
	(4.1)
 where:
c = concentration of pollutant [g/m³]
t = time [s]
x = distance along the river [m]
 = mass flux due to the longitudinal
dispersion [g/m²s]
 = mass flux due to the longitudinal
dispersion [g/m²s]
Ex = longitudinal dispersion coefficient [m²/s]
-vc = mass flux due to advection [g/m²s]
Si = i-th source/sink of the constituent [g/m³s]
 i = 1...n, n = number of sources/sinks
Kj = decay rate due to the j-th process [1/s]
v = flow velocity [m/s]
The source/sink term, Si and reaction term, Kjc represent a wide range of features such as lateral flux, transient storage, biotic and abiotic retention, benthic flux, periphyton retention, sediment retention, reaeration, photosynthesis, and nitrification (U. of Mississippi, 1990, James and Elliot, 1993).
Dynamic models have been used mainly for pollution incidents such as spills and runoff discharges. Their applicability for large scale modeling of nitrate/atrazine transport is limited, mainly because:
Lagrangian type transport models, such as a Moving Segment Model (MSM), can be efficiently incorporated into GIS. In MSM (James and Elliot, 1993) the stream is subdivided into series of segments. Within each segment the variations of chemical concentration are calculated by summing, for example, hourly changes due to all the processes involved. The process within the block is described by the following equation:
 (4.2)
	(4.2)
where: Si is the i-th source/sink and Kj is the j-th reaction coefficient.
The segments (blocks) move downstream with travel time t which may be a function of such parameters as flow rate, cross-sectional area, friction coefficient, slope, and stream curvature. For a detailed representation of the constituent transport in surface water at least the following processes should be represented (O'Connor et al., 1983, Thomann and Mueller, 1987, O'Connor, 1988a,b, U. of Mississippi, 1990):
Figure 4.4 shows these major reaction mechanisms and transfer routes of chemicals and solids in both river water and the river bed.
Figure 4.4 Reactions and transfers in a natural water system.
The processes is surface waters can be described by the mass balance equations for the dissolved and particulate components (Eq. 4.3 and Eq 4.4). For a specific chemical, e.g. nitrate or atrazine, some processes must be included in the transport model whereas some processes are not significant and therefore can be neglected.
 (4.3)
	(4.3)
 (4.4)
	(4.4)
where:
Kv, Kf, Ks, Ku = the bulk transfer coefficients of volatilization,
dissolved
       exchange, settling and scour respectively;
cs, cp = dissolved and particulate concentrations in water,
respectively;
cb, cbp = dissolved and particulate concentrations in bed;
Kc, Kp = decay coefficients of dissolved and particulate
form;
K1 = K0rcm; K0 = adsorption coefficient,
rc = adsorptive capacity,
      m = concentration of solids;
K2 = desorption coefficient;
Kq = coefficient of dilution due to the groundwater inflow.
If no dispersion is assumed (Ex = 0), no sources and sinks exist (Si = 0), and the chemical is undergoing first order decay, i.e., the losses can be modeled by the first order reaction (Kj c = Kd c), the solution of the Eq. (4.1) is:
 (4.5)
	(4.5)
This equation can be easily incorporated into GIS to model the transport agrichemicals in surface water.
There is no complete and consistent data base of parameters required for comprehensive modeling the agrichemical runoff from the field and its transport in surface waters of the Midwest. The runoff and transport parameters have to be estimated from such data as: (1) the observed flow rate and the measured agrichemical concentrations in the variety of watersheds scattered over the upper Mississippi-Missouri and Ohio River basins and (2) the DEM from which a watershed morphometry can be calculated. Only a model based on the statistical analysis comes into consideration.
Commonly used by the USGS equation for event mean concentrations of eleven water quality parameters as well as the total storm event loads has the following form (Huber, 1993):
 (4.6)
	(4.6)
where:
c  = mean concentration;
X1 ... Xn = explanatory variables;
ß0 ... ßn = regression coefficients;
BCF = bias correction factor.
Cohn et al. (1992) used similar equation to describe the concentrations of several nutrient species entering Chesapeake Bay through its tributaries. Their equation (Eq. 4.7) includes periodic functions of a time variable to explain cyclic behavior:
 (4.7)
	(4.7)
where Q' is the normalized flow rate and T' is the normalized time variable (yr).
The following Section introduce a set of explanatory variables that have potential application in the large scale model of agrichemical runoff from the field and transport in the river network.
Agrichemical application. Since it is a common practice to put more chemical on the field than the amount that can be completely utilized by the vegetation and the chemical - microbiological processes, some of the nutrients and herbicides runoff from the field. Many studies proved that the more chemical is applied on field U, the higher chemical loads are carried by a river (e.g. Battaglin et al., 1993).
Land slope. The higher the slope of the land SL, the chemicals are more susceptible to the washout, therefore the greater losses of the chemical from the field can be expected. This complex process is mostly influenced by the gravitational forces. Since the vegetation has a smaller chance to uptake nutrient or herbicide when the land slope is higher (due to the shorter residence time), a greater portion of the mass applied on the field will reach the stream network. All models of erosion and sediment transport comprise the slope parameter.
Distance to a stream. The influence of the average distance LL between the point of herbicide or nutrient application and the location of the stream network on the amount of mass that enters the stream is not as obvious as in the case of previously discussed watershed slope. Intuitively it may be expected that since the longer the average travel path is, the larger the chemical losses are and thus the coefficient ß3, from equation ln(c) = ...ß3 * ln(LL), estimated by the regression analysis should be negative. But, since the predictor variable LL represents complex watershed features, the coefficient ß3 does not necessarily have to be negative. The average length of the land-flow path highly depends on the stream density, i.e., on the length of the streams per unit area of watershed. The more streams there are within a given watershed, the shorter the length of the land flow. The density of the stream network influences the amount of chemical that enters a unit length of the stream. Therefore, for watersheds characterized by the higher value of LL , the amount of chemical that enters a stream (mass per unit length) is higher than the amount reaching the stream network which is located in a watershed with smaller LL.
Climate. Both, the precipitation and the temperature have an impact on the vegetation growth. Moreover, the greater the water surplus (precipitation minus the potential evaporation), the greater the possibility of loss of agrichemical through leaching if the crop is not growing vigorously or through washout if the land is not protected by a plant cover. Denitrification depends on the amount of water in soil, whereas nitrification rate is highly correlated with the temperature (Tisdale et al., 1993).
The influence of the climate on the concentrations in surface waters can be illustrated by comparison of studies performed in different climatic regions. For example, the largest number of atrazine detections in Swedish stream waters was in July, August and September (Kreuger and Brink 1988), whereas in the Midwestern United State the major atrazine runoff occurs in May and June (Scribner et al., 1994). Swedish vegetation period is short (6-8 months) and cold (3º-17º C)
Since the region under scrutiny extends from about 37º N to 50º N (latitude) and from 79º W to 114º W (longitude) the spatial distribution of the average temperature and the average precipitation influences not only the spatial distribution of the chemical runoff from the field but it also causes the spatially different agrichemical application time. It is believed that difference in climate conditions within studied region can be represented by a spatially distributed adjustment coefficient as well as a time shift introduced into periodic functions that describe seasonal variation of agrichemical concentration in the surface water.
The losses of agrichemical in stream can be described by an exponential function of the travel time (the traditional approach) or by an exponential function of the travel distance (Smith. et al., 1993). Although it is possible to estimate the travel time from the observed flow time series, in this research the agrichemical decay has been related to the travel distance, i.e., the amount of chemical that enters the stream in point i (cell i in raster representation of river) decays as it travels downstream according to the following equation:
 (4.8)
	(4.8)
where:
Rk  = total mass runoff from k-th watershed;
Ri  = mass that enters the stream in point i (cell
i);
n = number of all cells that constitute the stream network
located within k-th    watershed;
kS = overall distance decay coefficient;
Lik  = length of the flow path from stream cell i
in which the chemical runoff from a   field enters the stream
network to the watershed outlet located in cell k    [102
km];
Rk  = total mass runoff from k-th watershed.
The travel distance Lik can be estimated from the DEM using GIS functions. The average regional distance decay coefficient kS may be calculated by a trial and error method which goal is to eliminate a regression coefficient ß4. Symbolically it can be described as follows:
 (4.9)
	(4.9)
The elimination process can be performed in the following steps:
 for each watershed k for which the concentrations were
measured;
for each watershed k for which the concentrations were
measured;
 ;
;
Since the method outlined above is very computation intensive, it hasn't been fully utilized in this research. The total mass runoff Rk from k-th watershed has been replaced by an explanatory variable Es defined by formula:
 (4.10)
	(4.10)
The equation (4.10) assumes that the unit mass of chemical enters the stream network and it is uniformly distributed over all stream cells.
Another variable that has potential to explain the transport of agrichemical in rivers is the stream slope SS.
The seasonal variations of the nitrate concentration as well as the atrazine concentration in surface waters are modeled by a periodic function:
 (4.11)
	(4.11)
where:
SW = seasonal component;
sin(2n¶m/12), and cos(2n¶m/12)  =  components of
the Fourier series (the cycle   corresponding to n = 1 has a 12-month
period, n = 2, 3, 4, 5, and 6 are    harmonics of period 12/m
months;
m  = month of a year (1 for January);
ßn,6, ßn,7, ...  =  regression coefficients.
Research performed by Walling and Webb (1984, work discussed by Jones and Burt, 1993) shows that in most streams the annual nitrate variations do not have a perfectly symmetrical sinusoidal form; the annual minimum and maximum occurred 4-6 weeks later and 2-3 weeks earlier then the timing suggested by a single harmonic. Furthermore, no clear seasonal variations of nitrate concentration exist in catchments where groundwater with a consistently high nitrate concentration mixes with quick flow of varying origin and concentration (Jones and Burt, 1993).
To make the presentation of methodology complete this Section presents two exemplary equations that describe the agrichemical concentration (Eq. 4.12) and load (Eq. 4.13) in the Midwest rivers:
 (4.12)
	(4.12)
and
 (4.13)
	(4.13)
where:
L  = constituent load (L = Qc) [g/s];;
Q  = flow rate at a given stream location [m³/s];
c  = constituent concentration [g/m³];
U  = annual chemical use in drainage area above the sampling
point [kg/yr];
UR  = annual chemical application rate [kg/km²/yr];
SL  = average slope of the land [dimensionless];
LL  = average length of the constituent travel path, from
the point of application to   the stream network within a given
watershed to the sampling point [km];
A  = drainage area [km²];
ES  = average of the exponent of negative flow distance
in streams
 =  ;
;
nc  = number of cells that constitute the stream network
within sampled    watershed;
Lik  = length of the flow path from the i-th stream
cell to the watershed 
  outlet k  [102 km]
SS  = average slope of the stream network [dimensionless];
sin(2n¶m/12), and cos(2n¶m/12)  =  components of
the Fourier series (the cycle   corresponding to n = 1 has a 12-month
period, n = 2, 3, 4, 5, and 6 are    harmonics of period 12/m
months
m  = month of a year (1 for January) 
ß0, ß1, ß2 ...  =  regression coefficients.
The right hand side of Eq. (4.12) and Eq. (4.13) represent three major components of the transport process: the runoff from the field, the losses in the stream and the seasonal element.
 describe changes in mass applied
U (or UR) on a field as it travels from the point
of application to a stream;
 describe changes in mass applied
U (or UR) on a field as it travels from the point
of application to a stream;
 reflect the losses of agrichemical
in the rivers; and
 reflect the losses of agrichemical
in the rivers; and
 incorporates into the model seasonal variation of the load
and the concentration respectively.
 incorporates into the model seasonal variation of the load
and the concentration respectively.The estimation of values of explanatory variables such as agrichemical application rate, total application, average stream slope, average land slope, exponent of the negative flow length, and average distance from the field to the closest stream is performed in two steps: (1) Grids of spatially distributed parameters are constructed for the Upper Mississippi - Missouri River and the Ohio River basins, and (2) for each cell that represents the sampled watershed outlet the value of the explanatory variable is extracted.
Here, the grid of spatially distributed parameters means a grid which each cell contains average or sum calculated for the total drainage area upstream to the given cell. Figure 4.5 shows selected cells that contain a value that characterize the upstream watershed.
Figure 4.5 Example for explanation of the grid of spatially distributed values of explanatory variables.
The regression equations find the following application in this study:
The GIS technology gives the opportunity to construct versatile "cascade" models. The idea of cascade modeling incorporated here into GIS has been extracted from the methodology used in forecasting the municipal water use (Maidment and Parzen, 1984, Mizgalewicz, 1991).The GIS cascade structure is two dimensional. It can be applied in spatial dimension as well as in a temporal one. In the time domain, a general model describes spatial distribution of average annual amounts of agricultural chemicals in rivers. The annual values are then broken into seasonal or monthly values. For example, the annual average concentrations c(y) for the Midwest can be estimated by the general function:
 (4.14)
	(4.14)
where: X is a vector of explanatory variables such as annual agrichemical application and watershed morphometry (area, land slope, stream slope, stream length, overland flow length) which practically do not change in time.
The Equation (4.14) can be further extended by adding a seasonal component, a monthly fractions S(m) to brake down the annual predictions of concentration c(y) into monthly values c(m):
 (4.15)
	(4.15)
Cascade modeling enables further extending of the Eq. (4.15) by adding such elements as trend and irregular component.
Cascade modeling in spatial domain implies that the modeling process is subdivided into several levels of resolution, i.e., the Upper Mississippi-Missouri and Ohio River can be subdivided into three basins: the Missouri River Basin above junction with the Mississippi River, the Upper Mississippi River Basin, and the Ohio River Basin. The transport of agricultural pollutants in the Upper Mississippi River is estimated by utilizing results of sub-models that describe transport in individual component basins such as the Des-Moines River, the Skunk River, the Iowa-Cedar River, and Wisconsin River Basins.
Additional spatial subdivision of the Midwest can be performed by introduction of climate zones. Each zone is defined, for instance, by specific range of temperatures and precipitation depths. Thus, the Eq. (4.15) would have the following form:
 (4.16)
	(4.16)
where: c(m,g) is the monthly agrichemical concentration in rivers modified for the climate zones; g(T,P,Z) is a function of T - zonal temperature, P - zonal precipitation depth, and Z - zone location.
The diagram shown in Figure 4.6 illustrates the example of spatio-temporal cascade modeling within GIS.
Figure 4.6 Example of the cascade modeling within the GIS.
Parameters of the mathematical description of the chemical applicationrunoff process and chemical losses in streams are stored in a GIS attribute table. Moreover, the equations are stored as objects in a database table. Storing equations as a database objects not only permits to fully implement the cascade modeling technique into the GIS but also it simplifies the structure of the model, allows one to test and compare unlimited number of equations, and permits changes to the mathematical description by simple record editing operations. The model prototype is constructed within ArcView, a GIS application (ESRI, 1995).
Elementary watersheds, i.e., modeling units that are considered lumped systems, constitute the smallest units into which the region is subdivided. Each unit is characterized by set of parameters such as area, agricultural chemical application, slope, depth of precipitation, water runoff, average elevation, and an equation that relates the mass of chemical applied and the mass of chemical runoff.
Since the GIS offers very convenient tools for data storage and manipulation, all attributes can be stored and extracted by models that operate on different spatial scales. The order of processing the individual watersheds is the researcher's choice. Moreover, each watershed does not need to be divided into modeling units of the same size. To make the modeling process efficient, such features as density of spatial information and diversity of terrain should influence assumed size of the elementary unit.
A watershed is explicitly defined by its outlet point. This hydrologic property is utilized here to subdivide the region under investigation into elementary drainage areas. Three types of watershed outlet locations are considered:
Figure 4.7 shows example of selected watershed outlets and corresponding modeling units:
Figure 4.7 Example of watershed outlets: (1) beginning of the river, (2) stream junction, and (3) gauging station.
The watershed structure, developed by Maidment (1993) for hydrologic modeling utilizes type(2) and (3) watershed outlets which are positioned at the stream junctions and at the gauging station locations. In this study this set of watershed outlets has been extended by adding the points in which the stream network, delineated from DEM, begins (type 1 outlet). The test have been performed to determine the influence of additional unit watersheds on the uniformity of region subdivision and the control on the unit area. The Iowa-Cedar River basin has been subdivided using two sets of outlets. Full set: Median = 25.8 km² and mean = 31.6 km², reduced set (type 1 outlets not included): median = 37.9 km² and mean = 46.7 km². By including the outlets of type 1, the median of unit watershed drainage areas is very close to the used threshold. Figure 4.8 compares the frequencies of modeling unit area.
Figure 4.8 Comparison of the frequency of modeling unit area subdivided using different sets of watershed outlets.
The following list summarizes advantages of the addition of type(1) outlets:
The threshold area of 25 km² has been selected after many tests with different values were performed. Smaller threshold area resulted in very dense stream network and thus large number of areas of size of 1-2 cells. After Arc/Info conversion of these small units from raster format (grid) into vector format many of them disappeared. In addition, smaller than 25 km² values are not justified by the data used in the research. For example, the agrichemical application rate is published with the county-size resolution. Stream network delineated from DEM, is slightly more dense that the one represented by the RF1 (1:500,000 digital map of rivers).
The larger threshold value than 25 km² resulted in very coarse subdivision of studied region and a low density stream network. Figure 4.9 presents the Iowa-Cedar River watershed divided into modeling units of different sizes.
Figure 4.9 Division of the Iowa-Cedar River into modeling units using different threshold drainage area: 25 km², 400 km², and 2500 km².
Network flow system is common function in vector GIS (Maidment 1993). Here the flow system has been extended: instead of arcs, unit watersheds (polygons) compose the flow system. The flow topology is described by two numbers: the modeling unit ID and the ID of the downstream unit, i.e., the next unit on the flow path. The ID = 0 of the next unit indicates that there are no more downstream units. Figure 4.10 shows an example of the description of modeling units flow topology.
Figure 4.10 An example of the flow topology of the unit watersheds; (x,y), x is the unit ID, y is the ID of downstream to x unit.
The flow direction indicator is the basic concept in the raster-based hydrologic modeling. For example, Arc/Info Grid denotes the next cell in the flow path by one of eight numbers: 1 represent flow into E (East) neighbor cell, 2 in SE cell, 4 in S cell, 8 in SW cell, 16 in W cell, 32 in NW cell, 64 in N cell, and 128 in NE cell. These numbers are called "flow direction". Since modeling units are not regular spatial shapes, as the cells are, it is not possible to create an uniform numbering system of the flow direction. Proposed here method describes the flow connectivity by specifying the ID of the next unit on the flow path.
Addition of the item that describes the flow direction into the unit watershed attribute table made possible to utilize most of the concepts of hydrologic modeling, as far used in the raster GIS, into the vector environment. Such functions as flow accumulation, basin delineation, flow length can be applied for any shapes and thus for unit drainage areas. The flow system described here may be adopted for all models in which the conditions in a modeling unit does not influence the conditions in upstream unit such as kinematic wave routing and constituent decay as it flows downstream.
In this study, the flow connectivity of the unit watersheds is used to calculate cumulative or average parameters of drainage area upstream of a given point. These values are applied to the regression equation to estimate agricultural chemical concentration in the runoff. The following list presents selected examples of application of the unit watershed flow system that have been programmed in Arc/View script language:
The unit watershed approach of modeling agrichemical transport can be considered as a semantic data model:
... semantic data modeling --- creating abstractions of geographic data layers which map one to one with their geographic representation but which are simplified in a functional description to the level needed for hydrologic modelling. (Maidment, 1993)
Figure 4.11 shows examples of the conceptual stream network. The links have been developed by connecting the cells immediately below each modeling unit outlet. Thus, Although the system is conceptual, its each node is located on the river represented in Grid format.
Figure 4.11 Flow system in the Iowa-Cedar River basin subdivided into modeling units of different sizes (threshold drainage area: 25 km², 400 km², and 2500 km²).
In this system, the outflow from the source watershed (first order watershed) is a point inflow into the stream. The outflow from the intermediate watershed can be represented as a lateral inflow applied on the stream length or can contribute directly to the outflow from the reach. Thus the conceptual stream network attribute table contains three items that describe the water or agrichemical mass flow conditions:
In addition, the arc attribute table can contain wide range of items that describe the links (length, slope, RF1-ID, agrichemical decay coefficient, travel time) as well as the parameters of the beginning node and the ending node (elevation, coordinates, and length from the watershed outlet). In this research only the conceptual stream network represented by unit watersheds is utilized.
To enhance the calculations performed by the GIS model, the following system of ordering modeling units has been utilized: All the most upstream units are assigned an order one. An interior unit has the order equal to the maximum order of the upstream units increased by one. Figure 4.12 compares this ordering method with the Strahler and the Shreve ordering systems (ESRI, 1992). The proposed ordering system allows to perform calculations in consecutive manner: initially all the first order units are processed, then the order is increased by one and all units of order two are evaluated. The routing is performed until the unit of the highest order is calculated.
Figure 4.12 Comparison of the stream ordering systems: (a) Strahler, (b) Shreve, and (c) utilized in the agrichemical transport model.
The grid that describes the celltocell flow (flowdirection) is crucial for all hydrologic analysis that is performed in a rasterized environment. Procedures such as stream and watershed boundary delineation, dividing a basin into modeling units, stream slope calculation, length of the flow path estimation, and connection of hydrologic units, are examples of operations that cannot be performed without the map of flow direction. Moreover, the accuracy of all derived information depends on the precision of the flow direction grid. Therefore, an effort has been made to develop a method to improve the map that represents the flow paths. The RF1digital 1:500 000 map of the U.S. rivers has been selected as a basis for the spatial framework of the entire flow system. The following explanations support the application of map of existing rivers to correct the flow system determined from DEM:
The process of the DEM adjustment is based on the converting the RF1 into grid form and then increasing the elevation of all DEM cells, that do not represent gridded RF1, by an arbitrary value (e.g., 20000 m). This operation forces the Grid GIS to create a map of flow direction that is compatible with the flow system represented by the vector map of rivers (RF1). The method of enhancing the flow system development has the following disadvantages:
If the river network does not fulfill the above mentioned requirements, some editing after converting into grid format is necessary.
Incorporating the RF1 into grid has an additional advantage. By assigning the reach ID from RF1 to the raster river representation, the attribute table of RF1 can be linked with the attribute table of derived grids. Thus such information as the average flow velocity, average flow rate or stream names that are in RF1 attribute table can be used for grid models and vice versa, the parameters estimated in grid such as reach slope, drainage area and flow length can be assigned to the streams in vector RF1. The RF1 - Grid link extends the grid-network procedure for hydrologic modeling developed by Maidment (1992).
The flow rate is essential to estimate the agrichemical concentration and load in all rivers of the area under investigation. In this study, the historical flow record is utilized instead of synthetic values. The high or low flow conditions are modeled by selecting a year from the past that had the high or low flow rate recorded. The flow measurements are available only in locations in which the USGS gauging stations are located. Therefore, a procedure that calculates the flow rate in ungauged stations has been developed. This section describes such a procedure.
To make the model of transport of agrichemicals capable of reconstructing historical conditions, a database of the recorded average monthly flow rate and a database of the observed average monthly precipitation depth must be constructed. Since the model is developed within the GIS, the data structure of the flow and precipitation time series must be incorporated into the geographic system. The advantages of this approach are as follows:
The GIS database of monthly flow rate and the database of the average monthly precipitation depth is developed in two steps: first a map of monitoring stations is created and then, for each map, the attribute table with the measurements is build.
The maps are created utilizing the latitude and longitude published by Hydrosphere (1993 a, b, 1994). Although there are 38 USGS stations in the Iowa-Cedar River basin only the stations with the complete flow record (28) has been used for analysis. Figure 4.13 shows the map of the USGS gauging stations selected for modeling the spatial distribution of recorded flow rate.
Figure 4.13 The Iowa-Cedar River basin: subwatersheds and selected USGS gauging stations (numbers represent station ID).
The map of the weather stations contains 86 stations that are located within the Iowa-Cedar River basin and within the 50-km buffer zone outside the basin. Figure 4.14 shows the map of the weather stations. All stations are utilized to redistribute the flow rate record.
Figure 4.14 Weather stations applied to analysis of the hydrologic conditions in the Iowa-Cedar River watershed.
A fragment of the point attribute table that contains average monthly flow rates is presented in Table 4.1.
Table 4.1 An example of PAT--point attribute table of the USGS gauging station coverage. Full table contains monthly flow record for 38 stations, for period from 1940 to 1992 in m³/s
GS GS_ID STATION M199001 M199002 M199003 M199004 M199005
1 1 5457000 1.339 1.419 13.111 15.036 18.519 2 3 5459500 0.292 0.357 3.228 2.274 6.513 3 4 5457700 2.444 3.596 22.201 23.390 31.630 4 5 5458000 0.165 0.294 5.239 3.143 4.955 5 6 5449000 0.004 0.020 0.697 0.413 0.445 6 9 5449500 0.154 0.204 1.481 1.141 2.917 7 10 5462000 1.444 1.855 15.150 12.601 26.618
The process of spatial redistribution of the measured monthly average flow requires the average precipitation depth for each modeling unit. GIS water quality models such as SWAT-GRASS utilizes the rainfall depth observed in the closest weather station to subbasin (Ramanarayanan et al., 1996, Krysanova et al., 1996). The commonly used in hydrology methods of spatial estimation of rainfall from rain gauges are the Thiessen polygon method (Chow et al., 1988) and the inverse distance-squared method (Smith, 1993). There are two other functions available in Arc/Info GIS: kriging and trend (fitting a polynomial regression surface). In this research the Arc/Info IDW function (inverse distance-squared method) has been applied. It creates a grid of spatially distributed values extracted from the attribute table of point coverage. The calculations are very fast, thus it was feasible to create maps of spatial distribution of monthly average precipitation depth for the Iowa-Cedar River basin for the period from 1950 to 1992. Figure 4.15 presents the map of precipitation depth estimated for June 1990.
Figure 4.15 Spatial distribution of precipitation in the Iowa-Cedar River watershed in June 1990.
There are about 38 USGS gauging stations in the Iowa-Cedar River basin (some of them are not in service). One or more gauging stations determine a drainage area (or partial drainage area), referred to as a gauging station zone or zone. The gauging station constitutes a point through which a known amount of water flows from one zone to an other zone. The runoff from each modeling unit depends on two factors: 1) the water balance calculated for the unit's respective zone and 2) the distribution of the precipitation depth over the gauging zone. Figure 4.16 illustrates an example of a gauging zone and modeling units.
Figure 4.16 Cedar River watershed above Charles City, Iowa: An example of gauging station zones and modeling units.
The process of spatial redistribution of the measured monthly average flow is performed in four steps. Each step is performed separately for a given month. Since the method only redistribute observed values using drainage area and precipitation as a weight, the effect of the snow accumulation and the evaporation on the spatial distribution of the flow is included in the flow record for the gauging zones. Within each zone neither the snow accumulation nor evaporation or groundwater transfer are taken into consideration. The precision of data used for the agrichemical transport such as county level chemical application do not justify construction of very detailed flow model that require at least information about spatially distributed temperature, snow depth, land use, aspect, solar radiation etc. The steps of the flow rate interpolation are as follow:
Estimation of average runoff coefficient. Using recorded outflow from the first order zones, i.e. watersheds determined by the most upstream gauging stations, an average runoff coefficient Cf is calculated according to the following equation:
 (4.17)
	(4.17)
where:
Cf(m) = average runoff coefficient for the m-th
month [dimensionless]
Qj (m) = average flow recorded  in j-th gauging
station during m-th month [m³/s]
Pj (m) = average monthly precipitation depth [mm/d]
Aj (m) = j-th watershed area [km²]
m  = month
j  = index of the first order watersheds
a   = units conversion factor.
First approximation of the flow in streams. The runoff from the modeling units is summed along the flow path, moving downstream from a first order stream toward the outlet of the basin. The mass balance for the i-th unit is described by the following formula:
 (4.18)
	(4.18)
where:
Q'i(m)  = estimated cumulative flow at the outlet point
of the i-th unit [m³/s]
Q'k(m) = estimated cumulative flow at the outlet point
of the k-th unit [m³/s]
k  = index of units in the immediate upstream vicinity
of the i-th unit
 Cf(m), Pi (m), Ai (m), = average runoff
coefficient, average monthly precipitation      depth in unit
i, and i-th unit area, respectively 
 To make calculations more efficient, when the flow path crosses
the border of the gauging station zone, i.e. a gauging station,
the calculated cumulative flow  is substituted by
the measured value Qj(m). This substitution of values ensures
that the observed inflow is used to calculate accumulated flow
in each zone, and the resulting error at the zone outlet is due
only to inaccuracy of water balance estimated within the zone
(errors do not propagate from zone to zone). Thus, the water balance
in unit downstream to the gauging station is:
 is substituted by
the measured value Qj(m). This substitution of values ensures
that the observed inflow is used to calculate accumulated flow
in each zone, and the resulting error at the zone outlet is due
only to inaccuracy of water balance estimated within the zone
(errors do not propagate from zone to zone). Thus, the water balance
in unit downstream to the gauging station is:
 (4.19)
	(4.19)
where:
Q'i(m), Cf(m), Pi (m), Ai (m),   = same as
for (Eq. 4.18) 
Qj(m) = observed inflow into modeling unit i, from
zone j. This value equals to
   the outflow from zone j [m³/s] 
Error evaluation. In this step, the difference between
estimated and observed flow,  , is calculated for
each zone j:
, is calculated for
each zone j:
 (4.20)
	(4.20)
where:
Qj(m) = observed outflow from zone j
Q'i(m) = estimated cumulative flow at the outlet point
of the i-th unit that 
      is also the outlet point of the j-th gauging station
zone.
Correction of the cumulative flow. The correction of the estimated cumulative flow is the crucial process of the flow distribution method. Two weighting coefficients k1 and k2 are applied. The coefficient k2 redistributes error according to the cumulative runoff calculated for each zone separately - the flow in rivers is created only by the estimated runoff CfP(m)A(m). There are no inflows from the upstream zones. Thus the coefficient k2 redistributes error according to the drainage area and the precipitation depth within each zone separately. It is calculated for each modeling unit as a proportion of the runoff Qxi(m) from the drainage area upstream of the given, i-th unit outlet (within the zone j) to the total runoff Qxj(m) from the zone j (at the beginning of the flow path it is very small, at the zone outlet it equals one):
 (4.21)
	(4.21)
To put more burden of the error correction on the major rivers rather, then small streams the coefficient k1 has been introduced. There is higher probability that the estimated error is due to the losses/gains of the river between two gauging stations than to the losses/gains in small streams located far from measurements points. In another words, further from the gauged reach a stream is located, the higher the uncertainty of the flow is and therefore, the more appropriate is to apply the average basin runoff coefficient to estimate stream flow rate.
The coefficient k1 is calculated using total cumulative flow, i.e., the approximated flow in the streams of the whole basin as a proportion of the estimated cumulative flow Q'i(m) at the outlet point of the i-th modeling unit to the cumulative flow Q'j(m) at the outlet of zone j in which the i-th unit is located:
 (4.22)
	(4.22)
The adjustment of estimated flow (Eq. 4.18 and Eq. 4.19) can be represented by the following formula:
 (4.23)
	(4.23)
where:
Q''i(m)  = corrected cumulative flow at the outlet point
of the i-th modeling unit;
Q'i(m)  = estimated cumulative flow at the outlet point
of the i-th modeling unit;
 = error, difference between observed and estimated
cumulative flow at the   outlet of zone j;
  = error, difference between observed and estimated
cumulative flow at the   outlet of zone j;
k1,k2 =  weighting coefficients.
This section discusses a model which estimates loads in rivers assuming the chemical losses in rivers are governed by the first order reaction, i.e., the agrichemical mass exponentially decays as it travels from one modeling unit to the downstream unit. The preliminary model has been developed to calculate the concentrations in rivers utilizing the results of CEEPS (Comprehensive Environmental Economic Policy Evaluations System) metamodel developed by the Iowa State University's Center for Agricultural and Rural Development (Bouzaher and Monale, 1993).
The CEEPS applies two models of chemical runoff from the field: PRZM (Mullins et al., 1993) for pesticides and EPIC for nutrients. The results, among others, are proportions of agrichemical mass applied that leaves the field. Unfortunately, the author of this dissertation was unable to obtain the spatial distribution of these export factors over the Iowa-Cedar river neither to estimate the rate of agrichemical losses in rivers necessary to develop and test the complete GIS model. Since it is not in the scope of the research to develop such a model, only its outline for the completeness of discussion, is included below.
The spatial model frame is based on the watershed divided into modeling units described in previous sections. The mass that enters the stream in unit watershed is calculated using export factors:
 (4.24)
	(4.24)
where:
Mi = agrichemical mass that enters surface water in the
i-th modeling unit;
Ui =  total agrichemical application in the i-th
unit; and
Ei =  export factor that depends, e.g., on agricultural
management practices.
As the agrichemical mass (load) travels along the flow path it decays. The decay process can be related to the travel time or to the travel distance. It is assumed here that the time decay coefficient or distance decay coefficient represent such processes as decay, sorption, desorption, volatilization, exchange between sediment and water column, settling, scour, and dilution. The equation (4.25) describes this process utilizing the travel time and the time decay coefficient:
 (4.25)
	(4.25)
where:
Mout,i  = agrichemical mass that leaves i-th modeling unit;
Mout,k  = input mass from the k-th upstream unit;
Mi   = agrichemical runoff from i-th unit;
Kt,i   = time loss coefficient; 
ti   = time that takes the constituent to travel through
i-th unit.
To describe the agrichemical losses by the exponential function of river reach length the Kt,i ti component of the equation (4.25) must be substituted by the distance loss coefficient KK,i and the travel distance Li.
 Since the database of the average monthly precipitation depth,
the monthly water runoff from the field as well as the average
monthly flow rate in river for all modeling units it is feasible
to apply concentrations (traditional approach: ) instead of loads.
The GIS model prototype is coded in ArcView script language. The
space in the polygon attribute table is reserved for all required
parameters.
 ) instead of loads.
The GIS model prototype is coded in ArcView script language. The
space in the polygon attribute table is reserved for all required
parameters.
Since no information about the river cross sections is available as yet, a special procedure has been developed to determine the distribution of time of travel and flow velocity. The visual analysis of daily flow record along the flow path has revealed that there is a time shift between flow time series. Figure 4.17 presents an example of the flow rate time series along the flow path from the gauging station "Winnebago River at Mason City, Iowa" to the gauging station "Iowa River at Wapello, Iowa".
Figure 4.17 Example of time series recorded in gauging stations located along the Cedar River. The data are for the water year 1990. The logarithmic scale is used to show all time series in one picture.
A cross-correlation coefficients can be applied to test the linear relationship between flow time series recorded in the gauging stations located on the same flow path. Six gauging stations along the Cedar River shown in Figure 4.18 have been selected for the preliminary analysis.
Figure 4.18 Flow path and location of gauging stations which have been used to illustrate the potential method for time of travel estimation.
The cross-correlation coefficients between the time series recorded in gauging station "Cedar River near Conesville, IA" and the time series recorded in remaining gauging stations have been calculated. Figure 4.19 shows the estimated coefficients. For example, the cross-correlation between the flow recorded in gauging station "Cedar River near Conesville" and the flow recorded in gauging station "Cedar River at Cedar Rapids" has the maximum value for a two-day lag. This suggests that it takes about two days for the water to flow from Cedar Rapids to Conesville.
Figure 4.19 Cross-correlation coefficients: the strength of the linear relationship between the flow rate in the Cedar River recorded near Conesville, IA, and the flow rate recorded at indicated locations. All gauging stations are on one flow path (shown in Figure 4.18).
The projection of the travel time over the ungauged regions is much more difficult that the projection of the flow velocity, except the case when the simple relation between the travel time and the travel distance is applied. Figures 4.20a and 4.20b show such relationship. Figure 4.20c presents the flow velocity as a function of the square root of the stream slope.
Figure 4.20 Analysis of travel time and flow velocity:
 a) cumulative travel time versus cumulative flow distance;
 b) travel time vs. flow distance;
 c) velocity vs. square root of stream slope.
The travel time shown in Figure 4.20 represents the lag time for which the cross-correlation coefficient has the maximum value. The slope for each stream reach has been calculated from the digital elevation model (DEM). The flow length has been calculated by the Arc/Info function flowlength.
The same exercise has been repeated for the Iowa River. Two river sections have been excluded from the analysis: one between Rowan and Marshalltown (maximum correlation between no-lagged series) and other between Marengo and Iowa City (the Lake Coralville caused that the maximum correlation coefficient has been estimated for nine-day lagged series). The following relationship between "travel time", the stream slope, and the stream length has been estimated for both, the Iowa River and the Cedar River:
 (4.26)
	(4.26)
where:
t = travel time [d] 
So = stream slope
L = stream length [km]
The Equation (4.26) gives a good approximation of the "travel time" in the Iowa River and the Cedar River. Figure 4.21 shows that the error of prediction for all stream reaches is below 0.5 day (the precision of determining a lag time of the maximum correlation is 0.5 day.)
Figure 4.21 Estimation of the "travel time" for major reaches of the Iowa River and the Cedar River, IA.
Figure 4.22 compares the flow velocity extracted from the RF1 database with the flow velocity estimated by the correlation analysis. There is no visible relation between these two data sets. The flow velocity estimated from the correlation of flow record is about three times higher then the one published with RF1. According to the RF1 velocities, it takes constituent almost a month to travel from Austin, MN to Wapello, IA. Further analysis is needed to clarify the reasons for this discrepancy.
Figure 4.22 Comparison of the velocity from the RF1 database
with the velocity estimated from a correlation analysis of the
flow record in the Iowa River and tributaries