Modeling and Analysis of Transportation Flows Created by E-commerce Transactions

 

Adam Ho, Erhan Kutanoglu, Michael Cole

Department of Industrial Engineering

University of Arkansas, Fayetteville, AR

 

Michael Bartolacci

Computer Science

Pennsylvania State University, Reading, PA

 

1. INTRODUCTION

1.1       Motivation

            Many studies and surveys show the signs of an exponentially growing Internet-based economy. The recent growth of business and trade realized over the Internet has drawn a lot of attention to electronic business, whether it is business-to-business or business-to-consumer. The increasing availability of e-commerce solutions provides firms with new potential for reaching new customers and business partners. Traditionally, the two most formidable barriers for this type of extended business have been distance and the lack of access to key sales and marketing areas. With the potential removal of such barriers in the new economy, United States electronic sales are projected to be $380 billion in 2002 (U.S. Department of Commerce News, 2001). Ultimately, e-commerce will change and make an impact on the United States economy in terms of sales, jobs, and business opportunities. The goal of this study is to identify the effects of e-commerce-based transactions on transportation freight movements between regions. We believe that e-commerce has diminished and will continue to diminish the barrier effect of distance in the U.S. economy.

The question that arises is "How will growing e-commerce affect the physical transportation network?" Similar to traditional commerce transactions, an e-commerce transaction may result in a transfer of goods. This physical exchange of goods relies heavily on the traditional transportation network. We hypothesize that the growing number of e-commerce transactions affects the distribution of loads on the transportation network, with potential changes in the usage of different modes such as air, rail, road, and inland water. In e-commerce, instead of shipping 100 computers in one truckload to a local store, 100 boxes, each with one computer, are shipped to a dispersed set of customers. For example, on a single Saturday in July 2000, 100 airplanes and 9,000 trucks delivered more than 250,000 copies of Harry Potter and the Goblet of Fire to Amazon.com customers all over the United States (Environmental News Network, 2000). Also, according to Bette Fishbein, a senior fellow at Inform, an environmental research organization in New York City, "It's unlikely e-commerce will save the planet as some have claimed. There might be some reductions in energy use, but a huge increase in packaging and shipping by air results in much more air pollution" (Environmental News Network, 2000).

A quick analysis of the U.S. Census Bureau's commodity flow surveys (1993 and 1997) indicates an increase in the average distance of each ton of products shipped (www.census.gov). This implies that over the years, an average load is shipped to a destination that is farther away from the origin. Although not all of such changes are due to e-commerce, we believe that the growth of e-commerce results in a diminishing effect of distance on transportation flows between distant regions. Our goal is to model the changes in the distribution of transportation flows given increasing amounts of e-commerce and the corresponding diminishing importance of distance.

For the purposes of planning by governmental agencies and transportation providers, surveys have already been undertaken through partnerships between the Census Bureau, the Department of Commerce and the Department of Transportation to collect data on the movement of goods (not necessarily e-commerce initiated). The data from this survey, referred to as the commodity flow survey (CFS), are used by public analysts and transportation providers to assess the demand for transportation facilities and services, energy use, and environmental concerns. We foresee that public analysts and transportation analysts can use knowledge of the changing pattern of flows due to e-commerce to allocate resources and plan for the future.

            Currently, e-commerce represents approximately 1% of the total U.S. economy (U.S. Department of Commerce News, 2001). E-commerce generated flows are only a fraction of the total flows. However, due to its expected exponential growth, e-commerce will represent a significant part of the economy in the near future. At this point, the Census Bureau claims that producing a separate series of data at micro levels for e-commerce can be very difficult and expensive (Atrostic, Gates and Jarmin, 2000). Therefore, in this research, we seek to model the flow of goods for e-commerce by using the readily available commodity flow survey data from the Census Bureau. We intend to focus on a handful of selected SCTG (Standard Classification of Transported Goods) codes. Some of these products are more e-commerce related in the sense that their flows are likely to be affected by e-commerce much earlier than others. A model developed for an e-commerce-oriented product will give us a better sense of the future directional distribution of other products as e-commerce grows further. We use a gravity model in this research because it can capture the major components that contribute to the system of transportation flows.

 

1.2       Report Outline

The rest of the report is organized as follows. In Chapter 2, we provide an extensive review of the relevant literature on different applications of gravity models, which are primarily used to estimate transportation flows between regions. In this research, the model has been further developed to estimate freight movements under the effect of e-commerce. We discuss data sources in Chapter 3. In Chapter 4, we discuss our modeling effort and how we calibrate the model to the base flow data obtained from the 1997 commodity flow survey. The base flow is the historical flow of goods exchanged between regions. We also present how we determine the parameters used to estimate future flows, and the process of assigning the flows to different transportation modes. In Chapter 5, we provide our first preliminary analysis using SCTG code '35' (electronic and electrical products, and office equipment) as our base flow condition. We also show the preliminary output of the gravity model and how we use flow data for two other products to eliminate the bias toward product code '35'. We show that the results can be used to validate the use of gravity modeling for quantifying the directional distribution of transportation flows under the diminishing effect of distance in e-commerce. We also present our findings on the expected usage of different modes in the year 2005. Finally, in Chapter 6, we summarize our main findings and provide some insights for future research.


2. LITERATURE REVIEW

In this chapter, we present a review of literature on gravity modeling, which is the modeling tool that we use in this research. We first give an overview of gravity modeling applications in trip and freight distribution as well as other economic applications. While we review the relevant literature, we also highlight several differences between previous research and our implementation for e-commerce flows. To our knowledge, no previous research effort has used gravity models to capture the effect of e-commerce on the United States transportation network. In that sense, this research can be viewed as the first attempt toward such a goal.

 

2.1       Introduction to Gravity Models

            Gravity modeling was first introduced into transportation modeling in the 1950’s. Gravity models belong to a class of models called synthetic models, and they are generally used for the rough approximation of actual movements (Hamburg, Kaiser and Lathrop, 1983). Gravity models are often used for estimating trip distribution in a transportation context. These models have also been modified and used to estimate freight flows between a set of production and consumption regions. The gravity model is particularly useful when there are sizable distances and cost differences between each pair of production and consumption regions. Such characteristics are present in the world of electronic commerce.

                         Gravity models were originally developed from an analogy with Newton’s gravitational law (Ortuzar and Willumsen, 1990). The simplest formulation of the gravity model is

                               $T_{ij} = a_{ij} \frac{O_i O_j}{d_{ij}^2}$                                                                (1)

where  Tij = the number of trips between origin i and destination j,

            Oi = population of origin region i,

            Oj = population of destination region j,

            dij = distance between origin i and destination j,

           aij = proportionality factor
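
As a rough illustration of the Newtonian form described above, the following Python sketch computes trips that are proportional to the product of the two populations and inversely proportional to a power of the distance between them. The function name, the sample populations and distances, the use of a single scalar a in place of pair-specific factors aij, and the default exponent of 2 (taken from the analogy with Newton's law) are assumptions made only for this example.

```python
def gravity_trips(populations, distances, a=1.0, exponent=2.0):
    """Return T[i][j] = a * O_i * O_j / d_ij**exponent, with zeros on the diagonal.

    A scalar `a` stands in for the pair-specific factors a_ij (an assumption).
    """
    n = len(populations)
    trips = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # skip intra-regional trips, where distance is zero
                trips[i][j] = a * populations[i] * populations[j] / distances[i][j] ** exponent
    return trips


# Hypothetical three-region example (illustrative numbers only).
populations = [100.0, 200.0, 50.0]
distances = [
    [0.0, 10.0, 20.0],
    [10.0, 0.0, 5.0],
    [20.0, 5.0, 0.0],
]
trips = gravity_trips(populations, distances)
print(trips[0][1], trips[1][2], trips[0][2])
```

In this toy case the nearby pair of regions 2 and 3 exchanges more trips than the larger but more distant pair of regions 1 and 2, which is the distance effect the model is meant to capture.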

            The initial gravity model, Equation (1), was later modified by modeling the effect of distance with a more generic function f (cij), which represents the disincentive to travel as distance, time or cost increases. The modified model thus became

                                               $T_{ij} = a_{ij} O_i O_j f(c_{ij})$                                                                (2)

            The deterrence function f(cij) (also called the impedance) is usually defined in terms of the distance between regions i and j. A potential problem with any gravity model application is that, in the flow matrix, the total consumption of, say, Regions 1, 2 and 3 may not equal the production of Region A, which generated the flows. Since the gravity model is a closed system, in which all flows that are created are consumed within the system, the sum of the consumption capacities should equal the sum of the production capacities. Therefore, an iterative process is employed to adjust Tij to achieve equality between production and consumption (Hamburg, Lathrop, and Kaiser, 1983). This procedure is discussed further in Section 4.3.


2.2       Gravity Modeling in Trip Distribution Problems

                The trip distribution problem deals with the assignment of traffic from given origin zones to given destination zones. This problem is built on the idea of the accessibility of one region from another, which creates inter-activity between regions. With reference to the traditional form of the gravity model shown in Equation (1), the population of origin region Oi is replaced with Pi, which represents the production capacity, and the population of destination region Oj is replaced with Cj, which represents the consumption capacity. The relative number of opportunities, such as work opportunities, can be used as an accessibility measure for a zone. In this research, accessibility can be viewed as the opportunity for online businesses to reach additional sellers and customers in more distant regions, which then creates additional transportation flows.

            The types of marginal constraints with which we shall be primarily concerned are of the forms

                                  $\sum_{j} T_{ij} = P_i$                                                          (3)

                                  $\sum_{i} T_{ij} = C_j$                                                          (4)

                Tij = Number of trips flowing from region i to region j

                           Pi = Number of trips originated from region i

                           Cj = Number of trips consumed by region j

       These marginal constraints address the gravity model problem discussed in Section 2.1: the flows leaving each region i must sum to its production capacity, and the flows arriving in each region j must sum to its consumption capacity. These constraints can also be represented as shown in Equations (5) and (6).

                                             ()                                                          (5) (6)

            Trip distribution models involving these types of marginal constraints are referred to as doubly-constrained distribution models (Erlander and Stewart, 1990). The gravity model that we develop in this research is also a doubly-constrained gravity model. A doubly-constrained gravity model can take different forms, which are governed by the impedance values. Impedance values are determined from a functional form called the deterrence function, and they set the level of inter-activity between two regions. Erlander and Stewart (1988) present several basic forms, which we review briefly in Section 2.2.1.
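
To make the idea of satisfying both sets of marginal constraints concrete, the sketch below alternately rescales the rows and columns of a seed flow matrix until the row sums are close to the production totals and the column sums match the consumption totals. This is a generic row-and-column balancing scheme (often called Furness balancing or iterative proportional fitting), offered only as an illustration; it is not necessarily the exact procedure used in this report. The function name, the sample numbers, and the stopping rule are assumptions; the default 10 percent tolerance mirrors the tolerance mentioned in Chapter 4.

```python
def balance_flows(seed, production, consumption, tolerance=0.10, max_iterations=100):
    """Rescale `seed` so row sums approach `production` and column sums `consumption`.

    Assumes all production totals are positive and that total production
    equals total consumption (a closed system).
    """
    flows = [row[:] for row in seed]  # work on a copy of the seed matrix
    n_rows, n_cols = len(production), len(consumption)
    for _ in range(max_iterations):
        # Row step: scale each origin's flows to match its production.
        for i in range(n_rows):
            row_total = sum(flows[i])
            if row_total > 0:
                scale = production[i] / row_total
                flows[i] = [value * scale for value in flows[i]]
        # Column step: scale each destination's flows to match its consumption.
        for j in range(n_cols):
            column_total = sum(flows[i][j] for i in range(n_rows))
            if column_total > 0:
                scale = consumption[j] / column_total
                for i in range(n_rows):
                    flows[i][j] *= scale
        # Columns now match; stop once every row is within the tolerance.
        worst_gap = max(
            abs(sum(flows[i]) - production[i]) / production[i] for i in range(n_rows)
        )
        if worst_gap <= tolerance:
            break
    return flows


# Hypothetical two-region example (illustrative numbers only).
seed = [[1.0, 2.0], [3.0, 4.0]]
production = [40.0, 60.0]
consumption = [30.0, 70.0]
balanced = balance_flows(seed, production, consumption)
```

A tighter tolerance can be passed when closer agreement with the production totals is needed; the loose default simply reflects the 10 percent criterion.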

            For urban area planning, the Bureau of Public Roads (Connor and Whitton, 1965) suggests that travel time is the most effective representation of impedance. The total travel time is usually the minimum total driving time over a path between zones (or regions) plus the terminal times at both ends of the trip. Travel times provide a realistic measure of the actual spatial separation between regions, since this separation is likely to influence automobile drivers in their decisions about where to work, shop, etc. In effect, the travel time factor measures the probability of making a trip during each time unit. Distance, travel cost, and many other measures of spatial separation have also been used to determine the impedance value.


2.2.1    Different Forms of Gravity Models

a)                  Doubly Constrained Gravity Model with Given Inter-Zonal Weights

            This type of gravity model assigns a set of inter-zonal weights to origin-destination pairs. These weights are usually viewed as constants, which can be interpreted as a priori weights. Erlander and Stewart (1988) define a gravity model with inter-zonal weights as follows: Given $W_{ij} \in (0,1)$, $(i,j) \in L$ (the set of all possible origin-destination pairs, or links), $T_{ij}$ is a solution of the doubly-constrained gravity model with given inter-zonal weights $W_{ij}$,

                               $T_{ij} = P_i C_j W_{ij}$,    $P_i > 0$, $C_j > 0$, $(i,j) \in L$                    (7)

            where Tij = Number of trips flowing from region i to region  j

                       Pi = Number of trips originated from region i

                       Cj = Number of trips consumed in region j

                      Wij = Inter-zonal weight between region i and region j

                        L = Set of origin-destination pairs

 

b)                  Doubly-Constrained Gravity Model with Exponential Deterrence Function

            According to Erlander and Stewart (1988), the exponential deterrence function is the most widely used deterrence function in trip distribution modeling. The exponential deterrence function specifies the inter-zonal weights in terms of a parameter $\gamma \ge 0$ and constants $c_{ij}$. Given $\gamma \ge 0$ and $c_{ij} \ge 0$, $(i,j) \in L$, a doubly-constrained gravity model with exponential deterrence function is as follows:

                             $T_{ij} = P_i C_j e^{-\gamma c_{ij}}$,    $P_i > 0$, $C_j > 0$, $(i,j) \in L$                            (8)

 

c)         Doubly-Constrained Gravity Model with Exponential Deterrence Function and Socio-economic Factor

            This form is a modification of the previous one with additional constants $K_{ij}$ that are interpreted as socio-economic factors. Socio-economic factors are included in trip distribution models in order to account for the trip-making potential of individuals, or the trip production potential of origins and the trip attraction potential of destinations (Kanafani, 1983). Given $\gamma \ge 0$, $c_{ij} \ge 0$, and $K_{ij} \in (0,1)$, $(i,j) \in L$, a doubly-constrained gravity model with exponential deterrence function $e^{-\gamma c_{ij}}$ and socio-economic factors $K_{ij}$ is as follows:

                                              $T_{ij} = P_i C_j [K_{ij} e^{-\gamma c_{ij}}]$,    $P_i > 0$, $C_j > 0$, $(i,j) \in L$                             (9)

               

2.3 Regression and Least Square Analysis

            The third form of the gravity model discussed in Section 2.2 is shown in Equation (10):

                               $T_{ij} = P_i C_j [K_{ij} e^{-\gamma c_{ij}}]$                                                                      (10)

This model is intrinsically linear: after a logarithmic transformation, it can be calibrated using simple linear regression to determine better $\gamma$ values (Kanafani, 1983). The calibration process helps to better estimate the impedance values that set the level of inter-activity between origin and destination pairs. Note that

                                 $\ln [ T_{ij} / (P_i C_j) ] = \ln(K_{ij}) - \gamma c_{ij}$                                                                                (11)
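
The log-linear form above can be fitted with ordinary least squares. The sketch below regresses ln[Tij / (Pi Cj)] on cij, so that the slope estimates the negative of the deterrence parameter and the intercept estimates ln(K). It assumes a single constant K for all origin-destination pairs (pair-specific factors Kij cannot be separated from the deterrence parameter by this regression alone), and the function name and sample data are hypothetical.

```python
import math


def fit_gamma(flows, production, consumption, costs):
    """Estimate (gamma, K) from ln[T_ij / (P_i C_j)] = ln(K) - gamma * c_ij.

    Assumes one constant K shared by all pairs; zero flows are skipped
    because their logarithm is undefined.
    """
    xs, ys = [], []
    for i, row in enumerate(flows):
        for j, flow in enumerate(row):
            if flow > 0:
                xs.append(costs[i][j])
                ys.append(math.log(flow / (production[i] * consumption[j])))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)


# Synthetic check: generate flows from known parameters, then recover them.
production = [100.0, 200.0]
consumption = [150.0, 150.0]
costs = [[5.0, 20.0], [30.0, 10.0]]
true_gamma, true_k = 0.05, 0.5
flows = [
    [production[i] * consumption[j] * true_k * math.exp(-true_gamma * costs[i][j])
     for j in range(2)]
    for i in range(2)
]
gamma_hat, k_hat = fit_gamma(flows, production, consumption, costs)
```

Because the synthetic flows follow the model exactly, the fit recovers the generating values; with survey data the residuals would indicate how well a single exponential deterrence function describes the flows.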

             According to Kanafani, in order to avoid possible distortion in the estimate of $\gamma$ when there are large cij values, a least squares function can be used. That is, one can minimize the sum of squared errors to fine-tune the value of $\gamma$. The sum of squared errors, or the least squares function, is defined as

                               $\sum_{i} \sum_{j} (T'_{ij} - T_{ij})^2$                                                                      (12)

             where   T’ij = Observed origin-destination flows of the base flow condition

                           Tij = Estimated origin-destination flows

            The values used as base flow conditions are obtained from the 1997 commodity flow survey. They are historical values measured in tons, which represent the flows of products from region i to region j.

            In this research, T’ij is obtained from the U.S. Census Bureau's commodity flow survey, and Tij is estimated using the model that we have developed. The least squares estimation procedure seeks the closest agreement between Tij and T’ij by minimizing the sum of squared errors. This is a method to improve and evaluate the performance of the newly developed model and to see how well the model is calibrated to the base flow condition (Kanafani, 1983). We employ a similar procedure in our research.
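
The least squares idea described above can be sketched as a simple search over candidate parameter values: for each candidate, estimate the flows, compute the sum of squared errors against the observed flows, and keep the candidate with the smallest error. The exponential deterrence form, the function names, the candidate grid, and the synthetic "observed" data are assumptions for illustration; the same pattern applies to any single calibration parameter.

```python
import math


def sum_squared_errors(observed, estimated):
    """Sum over all origin-destination pairs of (observed - estimated)**2."""
    return sum(
        (obs - est) ** 2
        for obs_row, est_row in zip(observed, estimated)
        for obs, est in zip(obs_row, est_row)
    )


def estimate_flows(production, consumption, costs, gamma, k=1.0):
    """Flows from T_ij = P_i * C_j * k * exp(-gamma * c_ij) (constant k assumed)."""
    return [
        [production[i] * consumption[j] * k * math.exp(-gamma * costs[i][j])
         for j in range(len(consumption))]
        for i in range(len(production))
    ]


def best_gamma(observed, production, consumption, costs, candidates):
    """Return the candidate gamma with the lowest sum of squared errors."""
    return min(
        candidates,
        key=lambda g: sum_squared_errors(
            observed, estimate_flows(production, consumption, costs, g)
        ),
    )


# Synthetic "observed" flows generated with gamma = 0.03 (illustrative only).
production = [100.0, 200.0]
consumption = [150.0, 150.0]
costs = [[5.0, 20.0], [30.0, 10.0]]
observed = estimate_flows(production, consumption, costs, 0.03)
chosen = best_gamma(observed, production, consumption, costs, [0.01, 0.02, 0.03, 0.04, 0.05])
```

A grid search is used here only because it is easy to follow; a finer grid or a one-dimensional optimizer would serve the same purpose.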

           

2.4       Relevant Applications of Gravity Models

            Carter (1993) states that gravity modeling is an accepted market analysis tool for determining the economic feasibility of retail stores. Retail gravity models were originally used to forecast the number of consumers shopping in a city. Carter (1993) uses them to evaluate the value of retail property depending on the demand for the products sold by stores. His research allocates the consumer dollars that will be spent for a type of product within a trade area based on a reasonable assumption about consumer behavior. The retail model assumes that, within a trade area, the probability that a consumer will shop at a particular store is directly proportional to some power of the size of the store and is inversely proportional to some power of the distance between the consumer and the store. Distance is considered to be a dominating factor when it comes to trading, even if a large trade area is considered. However, in our view, this will change as e-commerce grows over time.

            Retail gravity modeling is also used to quantify the economic viability of a proposed project. Bottum (1989) introduces additional parameters governing the retail gravity model. In the revised model, consumer behavior not only depends on the size of stores and distance, but also is a function of accessibility, physical barriers, driving time and income levels. This approach is feasible when a small trade area is considered.

            Gravity modeling is also used in the travel industry to analyze the foreign tourist market. For example, Webster (1993) uses gravity modeling to predict the flow of tourists between a pair of countries as a direct function of each country’s population and as an inverse function of the distance between them. Here distance serves as the main impedance contributor for tourism. However, later findings in Webster’s research showed that there is a lack of significance displayed by the distance variable relative to the number of trips. Travel time turned out to be the best impedance.

 

2.5      Gravity Modeling for Freight Flow Distribution

            Freight flow distribution can be defined as the movement of goods from several origins to several destinations. Modeling freight flows can be considered from multiple dimensions, such as volume, weight, and trips. Veras and Thorson (2000) consider the amount of freight measured in tons (or any comparable unit of weight) as a unit of measure for freight demand and supply. This allows commodity-based models such as gravity models to more accurately capture the fundamental economic mechanisms driving freight movements, which largely are determined by the freight attributes such as tonnage.

            In commodity flow surveys, data for both tonnage and dollar freight values are available. However, Veras and Thorson (2000) suggest avoiding shipment dollar values because they vary more widely from one product to another. For example, freight values may be as low as $9/ton for products such as gypsum and may exceed $500,000/ton for products such as computer chips. In addition, Veras and Thorson argue that using "trips traveled" may produce inaccurate results, since empty trips may represent 15 to 50 percent of the total trips and the goal is to estimate the actual freight being transported. Based on this, we use tonnage as the unit of measure of flow for our gravity model implementation.

 

2.6       Linear Programming for Freight Flow Distribution

            Hamburg, Lathrop and Kaiser (1983) use linear programming (LP) for estimating freight distribution. Their LP formulation of freight flow distribution can be expressed mathematically as

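                  Minimize       $\sum_{i} \sum_{j} c_{ij} T_{ij}$                              (13)

                  such that      $\sum_{j} T_{ij} = P_i$ for each production area i,

                                 $\sum_{i} T_{ij} = C_j$ for each consumption area j,

                                 $T_{ij} \ge 0$ for all i and j,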

 

  where Tij = Shipment from production area i to consumption area j,

             Pi = Production in Region i,

             Cj = Consumption in Region j,

             cij = Impedance value between Region i and Region j (normally distance or cost).

            There are pros and cons in using LP to solve freight distribution problems. The major attraction of LP is its underlying basis of economic rationality, which is to minimize the overall transportation cost. However, there is no rational central authority that could make all flow decisions between regions. In practice, each entity or region acts independently, which undermines the validity of the LP approach. Moreover, the attractiveness of LP is further reduced by inherent characteristics that limit its use in solving freight flow distribution problems (Hamburg, Lathrop and Kaiser, 1983). First, for a system comprised of n regions, a normal solution to the LP will produce no more than (2n-1) of the n(n-1) potential inter-regional flows, i.e., the optimal solution of the LP model will have at most 2n-1 positive flows. Second, LP does not allow freight flows in both directions along a link (from i to j and from j to i), which is called cross hauling. Very few commodity movements exist without some cross hauling. Lastly, in many cases, unit transport costs are not linear with distance or shipment size, as is assumed inherently in an LP formulation (Hamburg, Lathrop and Kaiser, 1983). Due to these limitations and the widespread use of gravity modeling for similar freight flow estimation problems, we use gravity modeling in this research.
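
The bound of 2n-1 positive flows follows from a standard counting argument for transportation-type linear programs. The sketch below assumes n production regions, n consumption regions, equality constraints on production and consumption, and a balanced (closed) system in which total production equals total consumption; these assumptions are stated here for the illustration only.

```latex
\begin{align*}
&\text{Constraints:}\quad \sum_{j=1}^{n} T_{ij} = P_i \;(i = 1,\dots,n),
  \qquad \sum_{i=1}^{n} T_{ij} = C_j \;(j = 1,\dots,n)
  \qquad \text{(a total of } 2n \text{ equations)} \\
&\text{Balance:}\quad \sum_{i=1}^{n} P_i = \sum_{j=1}^{n} C_j
  \;\Longrightarrow\; \text{the first } n \text{ equations and the last } n \text{ equations have the same sum,} \\
&\phantom{\text{Balance:}\quad} \text{so one equation is redundant and the constraint matrix has rank } 2n - 1. \\
&\text{Basic solution:}\quad \text{at most } 2n - 1 \text{ basic variables, hence at most } 2n - 1 \text{ positive flows } T_{ij}.
\end{align*}
```

For example, with n = 48 regions a basic optimal solution would contain at most 95 positive flows out of the 48 x 47 = 2,256 possible inter-regional pairs, which illustrates how sparse an LP flow pattern is compared with observed commodity movements.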


3. DATA COLLECTION

The unavailability of good data is perhaps the greatest challenge we face in this research. Our goal is to model the directional distribution of flows generated by e-commerce, but there is currently no data source that has a direct measure of such flows. Estimated e-commerce sales volume in the United States (as a whole) is available, but it is not broken down on a region-to-region basis. The Census Bureau has just begun to collect some survey data on the Internet economy (Atrostic, Gates and Jarmin, 2000).

            Since there is no readily available e-commerce data, we model the e-commerce flows based on the existing commodity flow survey data. Commodity flow surveys capture data on shipments originating from selected types of business establishments located in the fifty states and the District of Columbia. Businesses that participate in this program provide information on the total value of shipments, total weights, major commodity type, modes of transportation used, miles traveled, and the origin and destination of shipments. We estimate the flows due to e-commerce from the existing commodity flow survey.

            Two sets of commodity flow survey data are available, for 1993 and 1997. In 1993, there were virtually no significant e-commerce transactions. Therefore, we initially planned to compare the flows of a selected product code in 1993 to the flows in 1997. However, the 1993 survey data uses the detailed STCC (Standard Transportation Commodity Classification) coding system, whereas the 1997 commodity flow survey uses the more aggregate SCTG (Standard Classification of Transported Goods) coding system. That is, goods are grouped within fewer product codes in 1997. Therefore, a direct comparison between the 1993 and 1997 data is not possible. As a result, we used the 1997 commodity flow survey as our main data source.

                Another set of data that we have looked at is the distribution of Internet domains registered in the United States. As of June 2000, there were 13,260,000 active Web sites registered in the United States (U.S. Map New Stat, 2000). The data from this source indicate that the more populous states top the list for the largest number of domain name registrations. Though most web sites are inactive and do not conduct business online, surveys indicate that 80% of businesses that have registered a web address have done so to develop an on-line presence for an existing business (Network Solutions, 2000). In other words, these are companies with established business models and real products, the so-called "Click/Brick and Mortar" companies. These companies have nonetheless become the driving force behind the Internet economy, using the efficiencies and reach of the Internet to extend their traditional business models. Also note that many companies have distribution centers that initiate shipment flows throughout the United States even though they have registered their web site in another state. That is, a company's products may not come from the location where its domain is registered. The domain distribution data is not incorporated into the model, but it has provided us with better insight into the intensity of e-commerce across the states of the United States.


4. METHODOLOGY

4.1       Introduction

In this section, we first provide a brief overview of the formulation process of our gravity model, which captures the directional distribution of flows. We explain the formulation procedure in a step-by-step manner. In Section 4.2, we describe the reverse derivation procedure. We use reverse derivation to determine the historical impedance that produces the flow distribution of the base flow condition. The deterrence function formulation is also discussed in that section. In Section 4.3, we describe the iterative procedure we use to adjust the calculated commodity flows to within 10 percent of the originally specified values. In Section 4.4, we present the concept of an 'extreme case' in impedance values, and show how the growing e-commerce economy is moving the impedance values toward this extreme. In Section 4.5, we describe how to calculate the average mile statistic. The average mile is the average distance traveled by each ton of product. We project the increase in the average mile due to e-commerce so that the average exponent n can be estimated. In Section 4.6, we describe the process of determining the appropriate smoothing constant $\lambda$ (a value that sets the intermediate condition between the current and the future estimated condition). Finally, in Section 4.7, we describe how the distributed flows are assigned to different modes of transportation.
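
As a small illustration of the average mile statistic, the sketch below divides total ton-miles by total tons, which gives the average distance traveled by each ton. This ton-weighted reading of the definition, the function name, and the sample flows and distances are assumptions made for this example.

```python
def average_miles(flows_tons, distances_miles):
    """Average distance per ton: total ton-miles divided by total tons."""
    total_tons = sum(sum(row) for row in flows_tons)
    if total_tons == 0:
        raise ValueError("total tonnage must be positive")
    ton_miles = sum(
        tons * miles
        for flow_row, distance_row in zip(flows_tons, distances_miles)
        for tons, miles in zip(flow_row, distance_row)
    )
    return ton_miles / total_tons


# Hypothetical two-region example: diagonal entries are intra-regional flows.
flows_tons = [[10.0, 30.0], [20.0, 40.0]]
distances_miles = [[50.0, 500.0], [500.0, 100.0]]
avg = average_miles(flows_tons, distances_miles)
print(avg)  # 29,500 ton-miles over 100 tons
```

Shifting tonnage from the short intra-regional cells to the long inter-regional cells raises this statistic, which is the effect attributed to e-commerce in this report.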


We explain the steps of the procedure in more detail below:

1)      Determine the geographic regions for the model. We use the 48 contiguous states. Hawaii and Alaska are excluded due to a lack of flows and a large amount of missing data.

 

2)      Pick product Code '35' (electronic and electrical products, and office equipment) as the representative e-commerce product. The base flow condition of our model will be based on product flows of Code '35'.

 

3)      Determine the impedance of the base flow by a reverse derivation of

                  $F'_{ij} = \frac{T'_{ij}}{P'_i C'_j}$

for each state-to-state pair, where $T'_{ij}$ are the actual flows of the base condition obtained from the commodity flow survey for product code '35', and $P'_i$ and $C'_j$ are the corresponding production and consumption values (see Section 4.2.2). Note that all base (actual) conditions are marked with a prime ( ' ) sign.

 

4)      Develop a distance- and population-based deterrence function to represent the impedance in the model. This function depends on the distance between state i and state j, on Oi and Oj, the populations of state i and state j, respectively, and on the Rij's, proportionality factors of the total commodity flow (ranging from 1 to 5) that we will define below. We use population as a part of the deterrence function since it represents the extent of demand and economic activity. We also use distance, as it is still a major contributor to the movement of goods. This function will be further calibrated in step 6.

 

5)      An iterative procedure is performed to ensure that the production and consumption capacities that were initially specified are satisfied within 10 percent.

 

6)      Fine-tune the deterrence function so that the sum of squared errors

                  $\sum_{i} \sum_{j} (T'_{ij} - T_{ij})^2$

is minimized, in order to determine a better n value. Here T’ij is the base flow for each origin-destination pair obtained directly from the commodity flow survey for product code '35', and Tij is the estimated flow distributed by the new deterrence function at the last iteration for product code '35'. We choose the n value that gives the lowest sum of squared errors.

 

7)      Project the average miles for product code '35' in year 2005 from the base flow. This value is used as our benchmark to determine the smoothing constant $\lambda$.

 

8)      Repeat the whole process for product codes '30' and '6' to eliminate the bias toward product code '35'. The average of the n and $\lambda$ values determined from the three product types is used in the model to distribute the total projected flows for year 2005.

 

9)      Assign the flows to different transportation modes to estimate the impact of e-commerce on different transportation modes.

 

10)    Report the increase in total ton-miles and the percent share of different mode usage in 2005.

 

 

4.2         Deterrence Function Formulation

4.2.1    Geographic Regions Determination

            We first determine the boundaries of our study area.  Our initial idea was to formulate the gravity model based on the 9 geographic divisions used by the Census Bureau because data is available for these regions. However, some of these regions are too big. The gravity model is primarily ‘distance sensitive’, and large regional sizes do not accurately represent where products originate and arrive. We think the model would not perform well if such issues were not carefully considered. Therefore, we use a more granular regional structure and we have decided to use the 48 contiguous state boundaries. This gives us a 48 by 48 matrix with 2,304 origin-destination pairs.

 

4.2.2        Base Flow Impedance

            With the availability of base year condition (flow data for code '35'), a reverse derivation procedure is used to determine the (empirical) impedance values. We want to calibrate the deterrence function that we develop such that the sum of squared errors between the flows determined from the base and the newly developed deterrence function is minimized. This procedure will be discussed in Section 4.2.4. We determine the base impedance using                                         

                                                                                                        (19)

            where F'ij = Impedance value between origin i and destination j for the base flow condition

                  T'ij = Observed flow in tons from region i to region j for the base flow condition

                  P'i  = Production of region i for the base flow condition

                  C'j  = Consumption of region j for the base flow condition

 

4.2.3        Main Components of Deterrence Function

The deterrence function Fij reflects the impedance to product flow. Deterrence functions are typically assumed to decrease either linearly or exponentially with distance. However, that assumption applies primarily to trip distribution. For inter-regional commodity flows in the United States, the observed distribution results not only from the impedance due to distance but also from the level of economic activity. For example, we observe significant flows between large western states, such as California, and large eastern states, such as New York, although they are far from each other. Factors other than distance are therefore involved.

To model this pattern better, we assume that such strong trade is driven primarily by the population of the regions involved. Also, similar to the socio-economic factors in the literature (see Section 2.2.1(b)), we introduce a multiplication factor Kij to represent this activity (we discuss the details of determining the Kij factors in the following section). The resulting estimated flow Tij between regions i and j takes the form:

                                    (20)

where f(dij) is a function of distance.

4.2.4    Determination of the deterrence function, Fij  

The first segment of our deterrence function, f(dij), depends on distance. Specifically, we represent f(dij) as the inverse of distance raised to the power n, a parameter that we can adjust to improve the accuracy of the model. Assuming Kij is constant for each origin-destination pair, we follow a trial-and-error approach to find the exponent n. The following is the distance-based function for origin i and destination j:

                                    f(dij) = 1 / (dij)^n                                     (21)

The factor Kij is based on the 1997 statewide population estimates from the U.S. Census Bureau. The estimated population is the computed number of persons living in each state. It is calculated from a demographic component-of-change model that incorporates information on natural change (births and deaths) and net migration (net domestic migration and net movement from abroad) that has occurred in each state since the reference date of the 1990 Census.

For each origin-destination pair (i,j), Kij takes the following form

                                                                                      (22)

           where Oi = Population of origin state i,

          Oj = Population of destination state j,

          Rij = Proportionality factor of each origin-destination pair.

The inverse of the proportionality factor is introduced to differentiate the Kij factors of different origin-destination pairs according to the magnitude of the existing overall commodity flow between the states. X'ij is the total commodity flow from origin state i to destination state j under the base condition, which includes all product types listed in the SCTG codes.

Table 1 shows the breakpoints in the values of Rij.

Table 1. Breakpoints in determining the values of Rij

                Total Base Flow of Each Origin-Destination Pair (X'ij)        Rij

                                  X'ij < 500K tons                             5
                          500K ≤ X'ij < 1500K tons                             4
                         1500K ≤ X'ij < 2500K tons                             3
                         2500K ≤ X'ij < 4000K tons                             2
                                  X'ij ≥ 4000K tons                            1
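The breakpoints of Table 1 amount to a lookup from the total base flow X'ij to the factor Rij. The following Python sketch shows one way to express that lookup. The function and constant names are hypothetical and are not part of the original study, and flows are assumed to be given in thousands of tons.

```python
from bisect import bisect_right

# Breakpoints of the total base flow X'ij in thousands of tons (Table 1).
BREAKPOINTS_KTONS = [500, 1500, 2500, 4000]
# Rij for each interval, ordered from the smallest flows to the largest.
R_VALUES = [5, 4, 3, 2, 1]


def proportionality_factor(total_base_flow_ktons):
    """Return Rij for a total base flow X'ij given in thousands of tons."""
    # bisect_right places a flow equal to a breakpoint in the upper interval,
    # which matches the "less than or equal" lower bounds in Table 1.
    return R_VALUES[bisect_right(BREAKPOINTS_KTONS, total_base_flow_ktons)]


# Example: a pair with 1,200K tons of total base flow falls in the second row.
example_r = proportionality_factor(1200)
print(example_r)
```

Because Rij depends on the directed flow from i to j, calling the function with X'ij and with X'ji can return different values for the same pair of states.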

           

We currently set the Rij factor to an integer between 1 and 5, depending on the overall base flow. Because of the proportionality factor, Kij is not necessarily equal to Kji. Such differences can add validity to the model because they take into account the economic interaction between states. Preliminary results show that using this factor decreases the sum of squared errors between the base flows and the calculated flows at the 9th iteration. The least squares procedure is discussed later in this chapter. Finally, the deterrence function that we have developed, Fij, is as follows:

                                    Fij = Kij f(dij) = Kij / (dij)^n                         (23)

The next step is to calibrate our model by fine-tuning the exponent n in the deterrence function using least squares analysis. We search for the value of n that gives the smallest error. We employ Kanafani’s method (1983), which minimizes the sum of squared errors between the base flows and the calculated flows. The following is the sum of squared errors that we try to minimize by changing n:

                                    SSE = Σi Σj (T'ij - Tij)^2                               (24)

where T'ij = Base flow obtained directly from the commodity flow survey for product code ‘35’

            Tij = Estimated flow distributed by the new deterrence function at the 9th iteration for product code ‘35’

This calibration process helps us to determine the proper n-value for our deterrence function. 
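As an illustration of this calibration, the sketch below runs a trial-and-error search over candidate exponents and keeps the one with the smallest sum of squared errors. It is a minimal sketch under stated assumptions rather than the procedure used in the study: the three-region data are invented, Kij is set to 1, the deterrence function is assumed to be Kij divided by distance raised to the power n, and all function names are hypothetical. The "observed" flows are generated with n = 1.5, so the search is expected to return that value.

```python
def gravity_flows(F, P, C, iterations=9):
    """Balance flows against productions P and consumptions C for a fixed
    number of iterations, given deterrence values F[i][j]."""
    regions = range(len(P))
    # Initial attraction factors: consumption capacity times deterrence value.
    A = [[C[j] * F[i][j] for j in regions] for i in regions]
    T = A
    for _ in range(iterations):
        # Scale each row so that it sums to the production capacity.
        T = [[P[i] * A[i][j] / sum(A[i]) for j in regions] for i in regions]
        # Rescale each column toward the consumption capacity.
        inflow = [sum(T[i][j] for i in regions) for j in regions]
        A = [[T[i][j] * C[j] / inflow[j] for j in regions] for i in regions]
    return T


def sum_squared_errors(base, estimated):
    """Sum over all origin-destination pairs of (T'ij - Tij)^2."""
    return sum((b - e) ** 2
               for base_row, est_row in zip(base, estimated)
               for b, e in zip(base_row, est_row))


def calibrate_exponent(dist, K, P, C, base_flows, candidates):
    """Trial-and-error search: return the candidate n with the smallest SSE."""
    regions = range(len(P))
    best_n, best_sse = None, float("inf")
    for n in candidates:
        # Assumed deterrence form: Kij times inverse distance to the power n.
        F = [[K[i][j] / dist[i][j] ** n for j in regions] for i in regions]
        sse = sum_squared_errors(base_flows, gravity_flows(F, P, C))
        if sse < best_sse:
            best_n, best_sse = n, sse
    return best_n, best_sse


# Toy data (hypothetical): three regions, distances in miles, Kij = 1.
dist = [[50.0, 300.0, 900.0], [300.0, 60.0, 700.0], [900.0, 700.0, 80.0]]
K = [[1.0] * 3 for _ in range(3)]
P, C = [10.0, 20.0, 30.0], [15.0, 20.0, 25.0]
# Synthetic "observed" flows generated with n = 1.5.
true_F = [[K[i][j] / dist[i][j] ** 1.5 for j in range(3)] for i in range(3)]
base_flows = gravity_flows(true_F, P, C)

best_n, best_sse = calibrate_exponent(dist, K, P, C, base_flows,
                                      [0.5, 1.0, 1.5, 2.0, 2.5])
print(best_n, best_sse)
```

A coarse grid such as this one can be refined around the best candidate, which mirrors the trial-and-error fine-tuning described above.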

                                            

4.3              Iterative Procedure in Gravity Modeling

            Hamburg, Kaiser and Lathrop (1983) introduced three concepts as the basis of the iterative procedure in gravity modeling: attraction factors (Aij), accessibility indices (Ii), and production indices (Ui). An iterative procedure is employed in the gravity model to ensure that the production and consumption capacities are satisfied to within 5 to 10 percent of the estimated values. This iterative procedure is carried out for all rows and columns of the 48 by 48 matrix of the gravity model.

 

4.3.1    Attraction Factors, Accessibility Indices and Production Indices

            Knowing the deterrence function Fij, production capacities Pi, and consumption capacities Cj we determine the attraction factor (Aij) for every pair of regions i and j:

                                    Aij = Cj Fij                                             (25)

The accessibility index (Ii) for production region i is

                                    Ii = Σj Aij                                              (26)

The production index (Ui) for region i is

                                    Ui = Pi / Ii                                             (27)

The matrix of the model up to the first iteration takes the form of Table 2.

Table 2. A representative table for the first iteration of the gravity model

                                      Consumption region j                                  Accessibility    Production
Production region i (capacity Pi)     j1                   j2              ...   j(n)       Index, Ii        Index, Ui
i1                                    Aij = Cj1 Fi1,j1     Cj2 Fi1,j2
i2                                    Cj1 Fi2,j1           Cj2 Fi2,j2
:
i(n)
Total Commodity to Region j; ΣTij     ΣTij
% Deviation for Iteration 0           1-(ΣTij/Cj1)
Adjusting Factor for Iteration 1      Cj1/ΣTij
Total Commodity to Region j; ΣTij
% Deviation for Iteration 1
Adjusting Factor for Iteration 2

                       

            Table 3 illustrates the implementation of the formulation process in Table 2. We perform the iterative procedure up to the 2nd iteration. We present an example problem that involves inter-regional flows between 3 regions. The production, consumption and impedance values between the regions are given.

Production and Consumption Capacity

                Production      Consumption
Region A            10              15
Region B            20              20
Region C            30              25

Impedance Values Between Origin Region i and Destination Region j

                Region A        Region B        Region C
Region A            6               8               9
Region B            5               6               8
Region C            6               8              10


Table 3. A sample problem of inter-regional flows between 3 regions.

                                    Consumption
                                    Region A           Region B           Region C            Accessibility   Production
Production                          (capacity 15)      (capacity 20)      (capacity 25)       Index, Ii       Index, Ui

Region A (capacity 10)
  Impedance                         6                  8                  9
  Attraction Factor; Aij            6x15=90            8x20=160           9x25=225            475.00          0.02
  Flow @ Iteration 1                0.02(90)=1.8       0.02(160)=3.2      0.02(225)=4.5
  Attraction Factor; Aij            1.19x1.8=2.14      1.18x3.2=3.78      0.74x4.5=3.33       9.25            1.08
  Flow @ Iteration 2                1.08x2.14=2.31     1.08x3.78=4.08     1.08x3.33=3.60

Region B (capacity 20)
  Impedance                         5                  6                  8
  Attraction Factor; Aij            5x15=75            6x15=90            8x25=200            365.00          0.06
  Flow @ Iteration 1                0.06(75)=4.5       0.06(90)=5.4       0.06(200)=12
  Attraction Factor; Aij            1.19x4.5=5.36      1.18x5.4=6.37      0.74x12=8.88        20.61           0.97
  Flow @ Iteration 2                0.97x5.36=5.20     0.97x6.37=6.18     0.97x8.88=8.61

Region C (capacity 30)
  Impedance                         6                  8                  10
  Attraction Factor; Aij            6x15=90            8x15=120           10x25=250           460.00          0.07
  Flow @ Iteration 1                0.07(90)=6.3       0.07(120)=8.4      0.07(250)=17.5
  Attraction Factor; Aij            1.19x6.3=7.50      1.18x8.4=9.91      0.74x17.5=12.95     30.36           0.99
  Flow @ Iteration 2                0.99x7.5=7.43      0.99x9.91=9.81     0.99x12.95=12.82

Total Commodity to Region j; ΣTij   12.6               17                 34
% Deviation for Iteration 1         1-(12.6/15)=0.16   1-(17/20)=0.15     1-(34/25)=-0.36
Adjusting Factor for Iteration 2    (15/12.6)=1.19     (20/17)=1.18       (25/34)=0.74

Total Commodity to Region j; ΣTij   14.94              20.07              25.03
% Deviation for Iteration 2         1-(14.94/15)=0.004 1-(20.07/20)=-0.0035  1-(25.03/25)=-0.0012
Adjusting Factor for Iteration 3    (15/14.94)=1.00    (20/20.07)=1.00    (25/25.03)=1.00


In the first iteration, the commodity flow from region i to region j is

                                    Tij = Ui Aij                                             (28)

            This process automatically generates flows that satisfy the production capacity constraints of our doubly-constrained gravity model. The next step is to correct flows for the consumption capacity of each region. The adjusting factor for consumption in each region j is

                                    Adjusting factor for region j = Cj / Σi Tij              (29)

            The adjusting factor of each consumption region is multiplied by the attraction factor of the corresponding origin-destination pair to determine the new attraction factor for the next iteration of the procedure. We repeat the same procedure of calculating attraction factors, accessibility indices, production indices, and finally new adjusting factors until the percent deviation from the actual consumption value falls within 10% for every region. The resulting Tij in each cell of the matrix is the flow for that origin-destination pair.

We provide a summary of the iterative procedure as follows:

1)      Calculate the attraction factor of each origin-destination pair.

2)      Calculate the accessibility index for each origin/production region i (summation of all attraction factors).

3)      Calculate the production index for each production region i (divide the production capacity of each region by its accessibility index).

4)      Compute the initial flow for each origin-destination (production-consumption) pair by multiplying the production index of the origin with the attraction factor of the corresponding destination.

5)      Calculate the adjusting factor for the consumption capacity by dividing the consumption capacity of each destination region j by the sum of all flows coming into region j. If every adjusting factor is close to 1 (that is, the percent deviation is small), stop and report the latest calculated flows. Otherwise, multiply the adjusting factor for each consumption region by the corresponding attraction factors to determine the new attraction factors for the next iteration and return to Step 2.
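The five steps above can be sketched in Python as follows. The sketch takes the inputs of the three-region example (productions 10, 20, 30; consumptions 15, 20, 25; and the impedance values given earlier), applies the stated consumption capacity of each region in every row, and keeps unrounded indices, so its intermediate values need not match hand-rounded table entries digit for digit. The function name, the 10% tolerance argument, and the iteration cap are illustrative choices and not part of the original procedure.

```python
def gravity_iterations(P, C, F, tolerance=0.10, max_iter=50):
    """Return (flows, iterations used) for productions P, consumptions C,
    and deterrence (impedance) values F[i][j]."""
    regions = range(len(P))
    # Step 1: attraction factor Aij = Cj * Fij.
    A = [[C[j] * F[i][j] for j in regions] for i in regions]
    T = A
    for iteration in range(1, max_iter + 1):
        # Step 2: accessibility index Ii = sum over j of Aij.
        accessibility = [sum(A[i]) for i in regions]
        # Step 3: production index Ui = Pi / Ii.
        production_index = [P[i] / accessibility[i] for i in regions]
        # Step 4: flows Tij = Ui * Aij, so each row sums to Pi.
        T = [[production_index[i] * A[i][j] for j in regions] for i in regions]
        # Step 5: compare the total flow into each region j with Cj.
        inflow = [sum(T[i][j] for i in regions) for j in regions]
        if all(abs(1 - inflow[j] / C[j]) <= tolerance for j in regions):
            return T, iteration
        # Otherwise apply the adjusting factor Cj / (sum over i of Tij).
        A = [[(C[j] / inflow[j]) * A[i][j] for j in regions] for i in regions]
    return T, max_iter


# Inputs of the three-region example.
P = [10.0, 20.0, 30.0]
C = [15.0, 20.0, 25.0]
F = [[6.0, 8.0, 9.0], [5.0, 6.0, 8.0], [6.0, 8.0, 10.0]]

flows, used = gravity_iterations(P, C, F)
for row in flows:
    print([round(t, 2) for t in row])
print("iterations used:", used)
```

Rows satisfy the production capacities exactly after every pass, while the columns approach the consumption capacities as the adjusting factors approach 1.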

 

4.4       Extreme Impedance

 

E-commerce tools and technologies are increasingly bringing buyers and sellers, and suppliers and customers, closer together. The three basic spatial separation constraints (distance, time, and shipping cost) are beginning to affect buyers less. Consumers are no longer restricted to buying from local stores. Products can be delivered overnight or within a very short time. The difference in shipping costs among regions is getting smaller. For example, according to the shipping rates provided by UPS, shipping a 100-pound parcel from Scranton, Pennsylvania to Conway, New Hampshire costs approximately $46, whereas shipping the same parcel to San Francisco, California (five times the distance) costs only $54 (www.ups.com).

Based on these observations, we introduce the following extreme impedance function to generate multiple scenarios that simulate the diminishing effect of distance on the flows of goods. The extreme impedance function is a constant, i.e., every origin-destination pair has the same impedance value. We then have two deterrence functions: the extreme (constant) function and the original deterrence function in equation (23). We believe that as e-commerce continues to grow, the deterrence function will fall somewhere between these two functions and will move closer to the extreme case over time. Therefore, we use a smoothing method to develop an intermediate deterrence function. We estimate the intermediate deterrence function that lies between the current deterrence and the extreme deterrence using

    f = (1 - λ)(distance-based deterrence function) + λ(extreme impedance function)        (30)

where λ is a smoothing constant between 0 and 1. λ is equal to 1 at the extreme condition and equal to 0 at the traditional base condition.
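Equation (30) can be applied cell by cell to the matrix of deterrence values. The following Python sketch illustrates this with invented numbers. The function name, the small 2 by 2 matrix, and the constant chosen for the extreme case are hypothetical, since the value of the extreme constant is not specified here.

```python
def intermediate_deterrence(distance_based, extreme_value, smoothing):
    """Blend a distance-based deterrence matrix with a constant extreme
    impedance using the smoothing constant (lambda) of equation (30)."""
    if not 0.0 <= smoothing <= 1.0:
        raise ValueError("the smoothing constant must lie between 0 and 1")
    return [[(1 - smoothing) * f + smoothing * extreme_value for f in row]
            for row in distance_based]


# Hypothetical deterrence values for two regions and a hypothetical constant.
F_distance = [[6.0, 8.0], [5.0, 10.0]]
F_base = intermediate_deterrence(F_distance, 4.0, 0.0)     # traditional base condition
F_half = intermediate_deterrence(F_distance, 4.0, 0.5)     # halfway scenario
F_extreme = intermediate_deterrence(F_distance, 4.0, 1.0)  # distance has no effect
print(F_half)
```

Running the gravity model with each blended matrix in turn gives one flow scenario per value of the smoothing constant.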

 

4.5       Determining the Average Miles 

            One way to illustrate the future impact of e-commerce on inter-regional flows is to compare the average miles of the base flow condition to that of the inter-regional flows determined from the projected future production and consumption capacity. Average miles traveled by each ton of the current and future flows can be determined by using the following function.

                           Average Miles = (Σ ton-miles / Σ tons)                                                  (31)

According to the U.S. Census Bureau, a ton-mile is the shipment weight times the mileage of a shipment. Respondents to the commodity flow survey reported shipment weight in pounds; mileage was calculated as the distance between the shipment origin zip code and the destination zip code. Aggregated pound-miles were converted to ton-miles. The sum of ton-miles over all origin-destination pairs divided by the total tonnage shipped gives the average distance traveled by one ton of product across the 48 contiguous states. A significant increase in this average distance in the projected future flows is one indicator of the diminishing effect of distance on inter-regional transportation flows due to e-commerce.
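For a matrix of distributed flows, equation (31) reduces to a weighted average of the origin-destination distances, with tons as weights. The Python sketch below shows the computation on invented numbers. The function name and the two-region flows and distances are hypothetical.

```python
def average_miles(flows_tons, distances_miles):
    """Average miles per ton: total ton-miles divided by total tons."""
    ton_miles = sum(t * d
                    for flow_row, dist_row in zip(flows_tons, distances_miles)
                    for t, d in zip(flow_row, dist_row))
    tons = sum(sum(row) for row in flows_tons)
    return ton_miles / tons


# Hypothetical flows (tons) and distances (miles) between two regions.
flows = [[10.0, 5.0], [0.0, 5.0]]
distances = [[50.0, 300.0], [300.0, 40.0]]
avg = average_miles(flows, distances)
print(avg)
```

Shifting tonnage from the short intra-regional cells to the long inter-regional cells raises the result, which is the effect the comparison between base and projected flows is meant to capture.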

 

4.6      Determining the Appropriate λ Value for Smoothing

We now determine the appropriate λ value for smoothing between our distance-based deterrence function and the extreme impedance function. This process involves a benchmarking procedure in which an assumption is made about the expected increase in average miles due to the effect of e-commerce on future flows. We seek to determine the λ value for the year 2005.

The U.S. Census Bureau reports that online purchases accounted for 11 percent of all cost of materials at manufacturing plants in 1999. Also, 12 percent of all manufacturing shipments were for orders accepted online. E-commerce transactions were significant in the Machinery sector (12 percent), the Computer and Electronic Products sector (12 percent), and the Electrical Equipment sector (10 percent) (U.S. Census Bureau, 2001). A survey of 38,985 manufacturing plants conducted by the U.S. Census Bureau shows that 16 percent of reporting plants engaged in both e-shipments (shipping products to customers that made their purchases online) and e-purchases (making online purchases). The results of the survey are presented in Table 4.

        

 

Table 4. Status of E-commerce Engagement for Manufacturing Plants (U.S. Census Bureau, 2001)

                                              Status of E-Shipments
Status of E-Purchases       All Plants    Make E-Shipments    Do Not Make E-Shipments    Unknown
All Plants                    38,985           12,069                 26,462                454
Make E-Purchases              13,233            6,063                  7,061                109
Do Not Make E-Purchases       25,237            5,901                 19,203                133
Unknown                          515              105                    198                212

                                                                                                   

            From this latest report by the U.S. Census Bureau, we know that e-commerce is beginning to play a major role in the United States economy. Though this report does not cover the entire U.S. economy, it surveys the North American Industry Classification System (NAICS) industries that accounted for approximately 70 percent of economic activity measured in the 1997 Economic Census. We are therefore looking at a major portion of the U.S. economy that has gone online.

With this new information released by the U.S. Census Bureau, we conservatively assume that the share of e-commerce flows in 2005 will be 15%. For the remaining 85% of the products, we assume the flows are created under the traditional economy.

In estimating the average miles for 2005, we evaluate the 1993 and 1997 commodity flow surveys. Most products underwent an increase of about 5% in average miles over this four-year period (see Figure 1). Therefore, we assume that average miles will increase by 10% from 1997 to 2005, a span of two such four-year periods. We apply this increase to the 85% of products that flow under the traditional economy. For the 15% of products that flow through e-commerce, we assume that average miles will increase by 20% from 1997 to 2005. The resulting average miles computed from the two shares of the economy are reported in the next chapter.

Based on the estimated average miles traveled by each ton of product in 2005, the λ value is determined. A 20% increase in production and consumption capacities is assumed for all states. This assumption is based on an average increase in production capacity of 14% every 4 years. Since the projection is made from 1997 to 2005, we conservatively assume that the production and consumption capacities will grow by 20% over those 8 years. An interpolation process is employed to determine the λ value that reproduces the estimated average miles for 2005. Since λ may vary across product lines, it is determined separately for each product considered.
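The benchmarking and interpolation steps can be illustrated with a short Python sketch. It assumes that the two shares of the economy are combined as a share-weighted average of their growth factors, which gives 0.85(1.10) + 0.15(1.20) = 1.115, and that the smoothing constant is found by linear interpolation between two trial values whose average miles bracket the target. Both assumptions, the function names, and the trial numbers are hypothetical and are not results of the study.

```python
def target_average_miles(base_avg_miles, ecommerce_share=0.15,
                         traditional_growth=0.10, ecommerce_growth=0.20):
    """Benchmark average miles for 2005 as a share-weighted blend of the
    growth assumed for traditional and e-commerce flows."""
    blended = ((1 - ecommerce_share) * (1 + traditional_growth)
               + ecommerce_share * (1 + ecommerce_growth))
    return base_avg_miles * blended


def interpolate_smoothing(lam_low, miles_low, lam_high, miles_high, target):
    """Linear interpolation of the smoothing constant between two trial
    values whose resulting average miles bracket the target."""
    if miles_low == miles_high:
        raise ValueError("the two trial runs give the same average miles")
    if not min(miles_low, miles_high) <= target <= max(miles_low, miles_high):
        raise ValueError("the target is not bracketed by the trial runs")
    fraction = (target - miles_low) / (miles_high - miles_low)
    return lam_low + fraction * (lam_high - lam_low)


# Hypothetical base value of 1,000 average miles in 1997.
target = target_average_miles(1000.0)
# Hypothetical trial runs of the gravity model at two smoothing constants.
lam = interpolate_smoothing(0.2, 1100.0, 0.3, 1130.0, target)
print(round(target, 1), round(lam, 3))
```

In practice each trial value would come from a full run of the gravity model with the blended deterrence function, followed by the average-miles calculation of equation (31).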



              

Figure 1. Graph of Percent Change of Average Miles for Various Product Types from 1993 to 1997

4.7       Mode Assignment

We perform mode assignment on the total (all SCTG codes) production and consumption capacities for every state. Mode assignment mainly assigns the distributed flows to different modes of transportation. We assume the production and consumption capacities have increased by 20% from 1997 to 2005. Also, the assumption on the percent share of e-commerce (15%) remains the same.

            The result of the mode assignment in this procedure does not ultimately represent the percent share of mode usage in the United States in 2005. Rather, we perform this step to illustrate the impact of the diminishing effect of distance on the choice of mode. We recognize that shippers select modes for a variety of reasons other than distance. In fact, Gray (1980) is supported by other authors in concluding that there is no specific need to examine organizational mode selection in freight transport. Lambert (1993) found that “lowest rates” ranked only 40th among more than one hundred and fifty selection factors tested. Various subjective factors that affect mode choice, such as reliability, trust, and service level, play a much more important role (Pisharodi, 1991). Studies have reached contrasting conclusions, which makes mode assignment an even greater challenge (Gray, 1982). Therefore, the output of mode assignment in this research is not intended for forecasting, but only to illustrate the distribution of mode usage if distance is a significant influence on mode usage.

Before distributing the flows, we determine the n and λ values to be used. As discussed before, n and λ may vary across product lines. Therefore, the averages of the individual n and λ values for product codes ‘35’, ‘30’ and ‘6’ are used to distribute the flows. These three product groups represent three distinct types of products in the commodity flow survey, so averaging their individual n and λ values is a reasonable approach.

            Having determined the n and λ values, we distribute the projected production capacity to every origin-destination pair. Every origin-destination flow is considered separately for mode assignment. We perform the mode assignment at a very high level in this research: we assign the distributed flows to transportation modes based on the average distance between the origin and the destination. Table 5 gives the percent share of each mode in different distance categories. The values shown in the table are average percent shares across all states for all shipment types. A graph of flows versus modes used in 1997 and 2005 (Figure 13), shown in Chapter 5, is used to observe the change in mode usage due to e-commerce. The next chapter validates the approach we have taken and presents the results of this research.
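A distance-based assignment of this kind can be sketched in Python as follows. The percent shares are copied from the Table 5 distance categories up to 999 miles; categories of 1,000 miles and above are left out of this sketch. The function name, the shortened mode labels, and the renormalization of the rounded shares are illustrative assumptions rather than the procedure of the study.

```python
MODES = ("Truck", "Rail", "Water", "Air", "Pipeline", "Parcel", "Multiple")

# (upper distance bound in miles, percent shares in MODES order) from Table 5.
SHARES_BY_DISTANCE = [
    (50,   (80.90, 3.96, 2.79, 0.00, 6.82, 0.07, 5.47)),
    (100,  (80.75, 10.03, 5.01, 0.02, 2.36, 0.16, 1.68)),
    (250,  (58.76, 21.75, 8.34, 0.05, 5.57, 0.27, 5.25)),
    (500,  (46.05, 35.71, 7.27, 0.07, 4.46, 0.40, 6.04)),
    (750,  (35.79, 39.86, 15.74, 0.09, 2.12, 0.54, 5.86)),
    (1000, (27.03, 45.41, 13.55, 0.12, 3.63, 0.59, 9.67)),
]


def assign_modes(flow_tons, distance_miles):
    """Split one origin-destination flow across modes according to the
    percent shares of the distance category it falls in."""
    for upper_bound, shares in SHARES_BY_DISTANCE:
        if distance_miles < upper_bound:
            # The published shares are rounded, so renormalize them so that
            # the assigned tons add up to the original flow.
            total = sum(shares)
            return {mode: flow_tons * share / total
                    for mode, share in zip(MODES, shares)}
    raise ValueError("distance category not covered in this sketch")


# Hypothetical flows of 1,000 tons over a short and a long distance.
short_haul = assign_modes(1000.0, 30.0)
long_haul = assign_modes(1000.0, 800.0)
print(round(short_haul["Truck"], 1), round(long_haul["Rail"], 1))
```

Summing the per-mode tons over all origin-destination pairs gives the mode totals that can be compared between the base and projected flows.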


 

Table 5. Percent Usage of Different Transportation Modes Relative to Distance in 1997 U.S. Commodity Flow

                                        Less than 50 miles        50 to 99 miles            100 to 249 miles
Mode                                    Tons(000)   Percentage    Tons(000)   Percentage    Tons(000)   Percentage
All modes                               6444454                   1079841                   1311278
Single modes                            6086713                   1053694                   1238913
Truck                                   5212913     80.90%        866735      80.75%        770562      58.76%
Rail                                    254985      3.96%         107608      10.03%        285232      21.75%
Water                                   179449      2.79%         53806       5.01%         109425      8.34%
Air (includes truck and air)            0           0.00%         202         0.02%         683         0.05%
Pipeline                                439350      6.82%         25342       2.36%         73011       5.57%
Parcel, US Postal Service or courier    4307        0.07%         1704        0.16%         3546        0.27%
Multiple modes (truck & rail,           352315      5.47%         17981       1.68%         68818       5.25%
  truck & water, rail & water,
  other multiple modes)
Total                                   6443319     100%          1073378     100%          1311277     100%

 

 

 

 

 

 

 

Table 5. Continues.

                                        250 to 499 miles          500 to 749 miles          750 to 999 miles
Mode                                    Tons(000)   Percentage    Tons(000)   Percentage    Tons(000)   Percentage
All modes                               905504                    541782                    383327
Single modes                            844909                    501869                    343183
Truck                                   415852      46.05%        191915      35.79%        103369      27.03%
Rail                                    322529      35.71%        213720      39.86%        173661      45.41%
Water                                   65618       7.27%         84391       15.74%        51822       13.55%
Air (includes truck and air)            622         0.07%         485         0.09%         455         0.12%
Pipeline                                40288       4.46%         11359       2.12%         13876       3.63%
Parcel, US Postal Service or courier    3611        0.40%         2869        0.54%         2257        0.59%
Multiple modes (truck & rail,           54566       6.04%         31412       5.86%         36999       9.67%
  truck & water, rail & water,
  other multiple modes)

 

 

 

 

 

 

 

Total

903086

100.00%