Modeling and Analysis of Transportation Flows Created by E-commerce Transactions
Adam Ho, Erhan Kutanoglu, Michael Cole
Department of Industrial Engineering
Michael Bartolacci
Computer Science
1. INTRODUCTION
1.1 Motivation
Many studies and surveys show the
signs of an exponentially growing Internet-based economy. The recent growth of
business and trade realized over the Internet has drawn a lot of attention to
electronic business, whether it is business-to-business or
business-to-consumer. The increasing availability of e-commerce solutions
provides firms with new potential for reaching new customers and business
partners. Traditionally, the two most formidable barriers for this type of
extended business have been distance and the lack of access to key sales and
marketing areas. With the potential removal of such barriers in the new
economy,
The question that arises is "How will growing e-commerce affect the physical transportation
network?" Similar to traditional commerce transactions, an e-commerce
transaction may result in the transfer of goods. This physical exchange of goods
relies heavily on the traditional transportation network. We hypothesize that
the growing number of e-commerce transactions affects the distribution of loads
on the transportation network, with potential changes in the usage of different
modes such as air, rail, road, and inland water. In e-commerce, instead of
shipping 100 computers in one truckload to a local store, 100 boxes, each with
one computer, are shipped to a dispersed set of customers. For example, on a
single Saturday in July 2000, 100 airplanes and 9,000 trucks delivered more
than 250,000 copies of Harry Potter and
the Goblet of Fire to Amazon.com customers all over the
A quick
analysis of the U.S. Census Bureau's commodity flow surveys (1993 and 1997)
indicates an increase in the average distance of each ton of products shipped
(www.census.gov). This implies that over the years, an average load is shipped
to a destination that is farther away from the origin. Although not all such
changes are due to e-commerce, we believe that the growth of e-commerce results
in a diminishing effect of distance on transportation flows between distant
regions. Our goal is to model the changes in the distribution of transportation
flows given increasing amounts of e-commerce and the corresponding diminishing
importance of distance.
For the
purposes of planning by governmental agencies and transportation providers,
surveys have already been undertaken through partnerships between the Census
Bureau, the Department of Commerce and the Department of Transportation to
collect data on the movement of goods (not necessarily e-commerce initiated).
The data from this survey, referred to as the commodity flow survey (CFS), are used by
public analysts and transportation providers to assess the demand for
transportation facilities and services, energy use, and environmental concerns.
We foresee that public analysts and transportation analysts can use
knowledge of the changing pattern of flows due to e-commerce to allocate resources
and plan for the future.
Currently,
e-commerce represents approximately 1% of the total
1.2 Report Outline
The rest of the report is organized as follows. In Chapter 2, we provide an extensive review of the relevant literature on applications of gravity models, which are primarily used to estimate transportation flows between regions. In this research, the model has been developed further to estimate freight movements under the effect of e-commerce. We discuss data sources in Chapter 3. In Chapter 4, we discuss our modeling effort and how we calibrate the model to the base flow data obtained from the 1997 commodity flow survey. Base flow refers to the historical flows of goods exchanged between regions. We also present how we determine the parameters used to estimate future flows and the process of assigning the flows to different transportation modes. In Chapter 5, we provide our preliminary analysis using SCTG code ‘35’ (electronic and electrical products, and office equipment) as our base flow condition. We also show the preliminary output of the gravity model and how we use flow data for two other products to eliminate the bias toward product code ‘35’. We show that the results can be used to validate gravity modeling for quantifying the directional distribution of transportation flows under the diminishing effect of distance in e-commerce. We also present the expected usage of different modes in the year 2005. Finally, in Chapter 6, we summarize our main findings and provide some insights for future research.
2. LITERATURE REVIEW
In this
chapter, we present a review of literature on gravity modeling, which is the
modeling tool that we use in this research. We first give an overview of
gravity modeling applications in trip and freight distribution as well as other
economic applications. While we review the relevant literature, we also
highlight several differences between previous research and our implementation
for e-commerce flows. To our knowledge, no previous research effort has used
gravity models to capture the effect of e-commerce on the
2.1 Introduction to Gravity Models
Gravity modeling was first
introduced into transportation modeling in the 1950’s. Gravity models belong to
a class of models called synthetic models,
and they are generally used for the rough approximation of actual movements
(Hamburg, Kaiser and Lathrop, 1983). Gravity models are often used for
estimating trip distribution in a transportation context. These models have
also been modified and used to estimate freight flows between a set of
production and consumption regions. The gravity model is particularly useful
when there are sizable distances and cost differences between each pair of production
and consumption regions. Such characteristics are present in the world of
electronic commerce.
Gravity models were originally developed from
an analogy with
(1)
where Tij = the number of trips between origin i and destination j,
Oi = population of origin region i,
Oj = population of destination region j,
dij = distance between origin i and destination j,
aij = proportionality factor.
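For reference, the form usually cited for this kind of model follows Newton's law of gravitation. The sketch below uses the notation defined above and assumes the classical inverse-square dependence on distance; the exponent 2 is an assumption here, not a value taken from the text.

```latex
T_{ij} = a_{ij}\,\frac{O_i\,O_j}{d_{ij}^{2}}
```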
The initial gravity model, Equation (1), was later modified by modeling the effect of distance with a more generic function f(cij), which represents the disincentive to travel as distance, time or cost increases. The modified model thus became
$$T_{ij} = a_{ij}\, O_i\, O_j\, f(c_{ij}) \tag{2}$$
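As a small illustration of Equation (2), the following Python sketch computes flows for a two-region example. The function names, the inverse-power deterrence function, and all numbers are hypothetical and chosen only to show the mechanics; a single proportionality factor `a` is assumed for all pairs.

```python
def gravity_flows(origins, dests, cost, deterrence, a=1.0):
    """Unconstrained gravity model: T[i][j] = a * O_i * O_j * f(c_ij).

    A single proportionality factor `a` is assumed for every pair.
    """
    return [[a * o_i * o_j * deterrence(cost[i][j])
             for j, o_j in enumerate(dests)]
            for i, o_i in enumerate(origins)]


def inverse_power(c, n=2.0):
    """Hypothetical deterrence function f(c) = c**(-n)."""
    return c ** (-n)


# Illustrative populations and inter-regional costs (not survey data).
origins = [100.0, 200.0]
dests = [150.0, 50.0]
cost = [[10.0, 20.0],
        [20.0, 10.0]]

flows = gravity_flows(origins, dests, cost, inverse_power)
for row in flows:
    print(row)
```

Any decreasing function of cost can be passed in place of `inverse_power`, which is the point of the more generic f(cij) in Equation (2).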
2.2 Gravity Modeling in Trip Distribution Problems
The trip distribution problem deals with the assignment of traffic from given origin zones to given destination zones. This problem is built on the idea of the accessibility of one region from another, which creates inter-activity between regions. In the traditional form of the gravity model shown in Equation (1), the population of origin region Oi is replaced with Pi, which represents the production capacities, and the population of destination region Oj is replaced with Cj, which represents the consumption capacities. The relative number of opportunities, such as work opportunities, can be used as an accessibility measure for a zone. In this research, this can be viewed as the opportunity for online businesses to reach additional sellers and customers in more distant regions, which in turn creates additional transportation flows.
The types of marginal constraints with which we shall be primarily concerned are of the forms
$$\sum_{j} T_{ij} = P_i \quad \text{for all } i \tag{3}$$
$$\sum_{i} T_{ij} = C_j \quad \text{for all } j \tag{4}$$
where Tij = Number of trips flowing from region i to region j,
Pi = Number of trips originated from region i,
Cj = Number of trips consumed by region j.
These marginal constraints address a shortcoming of the gravity model discussed in Section 2.1: the total flows out of region i and into region j within a system should equal the corresponding production and consumption capacities. These constraints can also be represented as shown in Equations (5) and (6).
(5) (6)
Trip distribution models involving these types of marginal constraints are referred to as doubly-constrained distribution models (Erlander and Stewart, 1990). The gravity model that we develop in this research is also a doubly-constrained gravity model. A doubly-constrained gravity model can take different forms, which are governed by impedance values. Impedance values are determined by a functional form called the deterrence function, and they set the level of inter-activity between two regions. Erlander and Stewart (1988) present several basic forms, which we review briefly in Section 2.2.1.
For urban area planning, the Bureau of Public Roads (Connor and Whitton, 1965) suggests that travel time is the most effective representation of the impedance value. The total travel time is usually the minimum total driving time over a path between zones (or regions) plus the terminal times at both ends of the trip. Travel times provide a realistic measure of the actual spatial separation between regions, since they are likely to influence automobile drivers in their decisions about where to work, shop, etc. In effect, the travel time factor measures the probability of making a trip during each time unit. Distance, travel cost, and many other measures of spatial separation have also been used in the past to determine the impedance value.
2.2.1 Different Forms of Gravity Models
a) Doubly-Constrained Gravity Model with Given Inter-Zonal Weights
This type of gravity model assigns a set of inter-zonal weights to origin-destination pairs. These weights are usually viewed as constants, which can be interpreted as a priori weights. Erlander and Stewart (1988) define a gravity model with inter-zonal weights as follows:
Given Wij ∈ (0,1), (i, j) ∈ L (the set of all possible origin-destination pairs, or links), Tij is a solution of the doubly-constrained gravity model with given inter-zonal weights Wij if
$$T_{ij} = P_i\, C_j\, W_{ij}, \qquad P_i > 0,\; C_j > 0,\; (i, j) \in L \tag{7}$$
where Tij = Number of trips flowing from region i to region j,
Pi = Number of trips originated from region i,
Cj = Number of trips consumed in region j,
Wij = Inter-zonal weight between region i and region j,
L = Set of origin-destination pairs.
b) Doubly-Constrained Gravity Model with Exponential Deterrence Function
According to Erlander and Stewart (1988), the exponential deterrence function is the most widely used deterrence function in trip distribution modeling. The exponential deterrence function specifies the inter-zonal weights in terms of a parameter γ ≥ 0 and constants cij.
Given γ ≥ 0 and cij ≥ 0, (i, j) ∈ L, a doubly-constrained gravity model with exponential deterrence function is as follows:
$$T_{ij} = P_i\, C_j\, e^{-\gamma c_{ij}}, \qquad P_i > 0,\; C_j > 0,\; (i, j) \in L \tag{8}$$
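To make the doubly-constrained idea concrete, the sketch below fits flows of the exponential form in Equation (8) so that row totals match the production capacities and column totals match the consumption capacities. It assumes the standard iterative balancing approach (often called Furness or iterative proportional fitting), with balancing factors `A` and `B` standing in for the origin and destination terms. This is an illustrative assumption, not necessarily the procedure used in this research, and all numbers are made up.

```python
import math


def doubly_constrained(P, C, cost, gamma, tol=1e-9, max_iter=1000):
    """Return T[i][j] = A_i * B_j * exp(-gamma * c_ij), balanced so that
    row sums approach P and column sums approach C (requires sum(P) == sum(C))."""
    n, m = len(P), len(C)
    # Inter-zonal weights from the exponential deterrence function.
    W = [[math.exp(-gamma * cost[i][j]) for j in range(m)] for i in range(n)]
    A = [1.0] * n  # balancing factors for origins (assumed helper terms)
    B = [1.0] * m  # balancing factors for destinations
    for _ in range(max_iter):
        A = [P[i] / sum(B[j] * W[i][j] for j in range(m)) for i in range(n)]
        B_new = [C[j] / sum(A[i] * W[i][j] for i in range(n)) for j in range(m)]
        change = max(abs(b1 - b0) for b1, b0 in zip(B_new, B))
        B = B_new
        if change < tol:
            break
    return [[A[i] * B[j] * W[i][j] for j in range(m)] for i in range(n)]


# Illustrative capacities and costs (not survey data).
P = [400.0, 600.0]
C = [500.0, 500.0]
cost = [[5.0, 20.0],
        [20.0, 5.0]]

T = doubly_constrained(P, C, cost, gamma=0.1)
for row in T:
    print([round(t, 2) for t in row])
```

Larger values of `gamma` concentrate flow on low-cost pairs, while values near zero spread flow almost independently of cost.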
c) Doubly-Constrained Gravity Model with Exponential Deterrence Function and Socio-economic Factor
This new form is a modification of the previous one with additional constants Kij that are interpreted as socio-economic factors. Socio-economic factors are included in trip distribution models in order to account for the trip-making potential of individuals, or the trip production potential of origins and the trip attraction potential of destinations (Kanafani, 1983). Given γ ≥ 0, cij ≥ 0, and Kij ∈ (0,1), (i, j) ∈ L, a doubly-constrained gravity model with exponential deterrence function exp(-γcij) and socio-economic factors Kij is as follows:
$$T_{ij} = P_i\, C_j \left[ K_{ij}\, e^{-\gamma c_{ij}} \right], \qquad P_i > 0,\; C_j > 0,\; (i, j) \in L \tag{9}$$
2.3 Regression and
The third form of the gravity model discussed in Section 2.2 is as shown in Equation (10):
$$T_{ij} = P_i\, C_j \left[ K_{ij}\, e^{-\gamma c_{ij}} \right] \tag{10}$$
This model is nonlinear as written, but after a logarithmic transformation it becomes linear in γ, so we can calibrate it using simple linear regression to determine better γ values (Kanafani, 1983). The calibration process helps to better estimate the impedance values that properly set the inter-activity between origin and destination pairs. Note that
$$\ln\!\left[ \frac{T_{ij}}{P_i\, C_j} \right] = \ln(K_{ij}) - \gamma\, c_{ij} \tag{11}$$
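Equation (11) has the form of a straight line in cij, so γ can be read off as the negative of the fitted slope. The sketch below is a minimal illustration that assumes a single socio-economic factor K shared by all pairs (so that ln K is the intercept) and uses synthetic flows generated from known parameters. The function name and data are hypothetical.

```python
import math


def fit_gamma(T, P, C, cost):
    """Ordinary least squares on y = ln(T_ij / (P_i * C_j)) against x = c_ij.

    Returns (gamma, K), assuming one common K for every pair."""
    xs, ys = [], []
    for i in range(len(P)):
        for j in range(len(C)):
            if T[i][j] > 0:  # the logarithm is undefined for zero flows
                xs.append(cost[i][j])
                ys.append(math.log(T[i][j] / (P[i] * C[j])))
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return -slope, math.exp(intercept)


# Synthetic example: build flows from known parameters, then recover them.
P = [400.0, 600.0, 500.0]
C = [300.0, 700.0, 500.0]
cost = [[5.0, 20.0, 35.0],
        [20.0, 5.0, 15.0],
        [35.0, 15.0, 5.0]]
true_gamma, true_K = 0.05, 0.5
T = [[P[i] * C[j] * true_K * math.exp(-true_gamma * cost[i][j])
      for j in range(3)] for i in range(3)]

gamma_hat, K_hat = fit_gamma(T, P, C, cost)
print(gamma_hat, K_hat)
```

With real survey flows the points will not lie exactly on a line, and pairs with very large cij can pull the fitted slope, which is the distortion the least squares refinement is meant to address.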
According to Kanafani, a least squares function can be used to avoid possible distortion in the estimate of γ when there are large cij values. That is, one can minimize the sum of squared errors to fine-tune the value of γ. The sum of squared errors, or least squares function, is defined as
$$\mathrm{SSE} = \sum_{i}\sum_{j} \left( T'_{ij} - T_{ij} \right)^2 \tag{12}$$
where T'ij = Observed origin-destination flows of the base flow condition,
Tij = Estimated origin-destination flows.
The values used as base flow conditions are obtained from the 1997 commodity flow survey. They are historical values measured in tons, which represent the flows of products from region i to region j.
In this research, T'ij is obtained from the U.S. Census Bureau's commodity flow survey, and Tij is estimated using the model that we have developed. The least squares estimation procedure seeks the closest agreement between Tij and T'ij by minimizing the sum of squares. This is a method to improve and evaluate the performance of the newly developed model and to see how well the model is calibrated to the base flow condition (Kanafani, 1983). We employ a similar procedure in our research.
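The least squares idea can be sketched as a direct search: evaluate the sum of squared errors between observed and estimated flows for a range of candidate γ values and keep the best one. The example below is a hedged illustration with synthetic "observed" flows. The candidate grid, the fixed K, and all names are assumptions made for the example.

```python
import math


def sse(observed, estimated):
    """Sum of squared errors between two flow matrices."""
    return sum((o - e) ** 2
               for row_o, row_e in zip(observed, estimated)
               for o, e in zip(row_o, row_e))


def model(P, C, cost, gamma, K=1.0):
    """Estimated flows T_ij = P_i * C_j * K * exp(-gamma * c_ij)."""
    return [[P[i] * C[j] * K * math.exp(-gamma * cost[i][j])
             for j in range(len(C))] for i in range(len(P))]


# Illustrative inputs (not survey data).
P = [4.0, 6.0]
C = [5.0, 5.0]
cost = [[5.0, 20.0],
        [20.0, 5.0]]

# Synthetic observed flows generated with gamma = 0.07.
observed = model(P, C, cost, 0.07)

# Candidate gamma values 0.00, 0.01, ..., 0.20 (an arbitrary grid).
candidates = [k / 100 for k in range(0, 21)]
best_gamma = min(candidates, key=lambda g: sse(observed, model(P, C, cost, g)))
print(best_gamma)
```

A grid search is used only because it is easy to follow. Any one-dimensional minimizer could replace it once the error function is defined.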
2.4 Relevant Applications of Gravity Models
Carter
(1993) states that gravity modeling is an accepted market analysis tool for
determining the economic feasibility of retail stores. Retail gravity models
were originally used to forecast the number of consumers shopping in a city.
Carter (1993) uses them to evaluate the value of retail property depending on
the demand for the products sold by stores. His research allocates the consumer
dollars that will be spent for a type of product within a trade area based on a
reasonable assumption about consumer behavior. The retail model assumes that,
within a trade area, the probability that a consumer will shop at a particular
store is directly proportional to some power of the size of the store and is
inversely proportional to some power of the distance between the consumer and
the store. Distance is considered to be a dominating factor when it comes to
trading, even if a large trade area is considered. However, in our view, this
will change as e-commerce grows over time.
Retail
gravity modeling is also used to quantify the economic viability of a proposed
project. Bottum (1989) introduces additional parameters governing the retail
gravity model. In the revised model, consumer behavior not only depends on the
size of stores and distance, but also is a function of accessibility, physical
barriers, driving time and income levels. This approach is feasible when a
small trade area is considered.
Gravity
modeling is also used in the travel industry to analyze the foreign tourist
market. For example, Webster (1993) uses gravity modeling to predict the flow
of tourists between a pair of countries as a direct function of each country’s
population and as an inverse function of the distance between them. Here
distance serves as the main impedance
contributor for tourism. However, later findings in Webster’s research showed
that there is a lack of significance displayed by the distance variable
relative to the number of trips. Travel time turned out to be the best impedance.
2.5 Gravity Modeling for Freight Flow Distribution
Freight flow
distribution can be defined as the movement of goods from several origins to
several destinations. Modeling freight flows can be considered from multiple
dimensions, such as volume, weight, and trips. Veras and Thorson (2000)
consider the amount of freight measured in tons (or any comparable unit of
weight) as a unit of measure for freight demand and supply. This allows
commodity-based models such as gravity models to more accurately capture the
fundamental economic mechanisms driving freight movements, which largely are
determined by the freight attributes such as tonnage.
In commodity flow surveys, data for both tonnage and dollar freight values are available. However, Veras and Thorson (2000) suggest avoiding shipment dollar values, since they believe that shipment values ($) exhibit more variability from one product to another. For example, freight values may be as low as $9/ton for products such as gypsum and may well exceed $500,000/ton for products such as computer chips. In addition, Veras and Thorson note that using "trips traveled" may give inaccurate results, since empty trips may represent 15 to 50 percent of the total trips and the goal is to estimate the actual freight being transported. Based on this, we use tonnage as the unit of measure of flow for our gravity model implementation.
2.6 Linear Programming for Freight Flow Distribution
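Minimize
$$\sum_{i}\sum_{j} c_{ij}\, T_{ij} \tag{13}$$
such that
$$\sum_{j} T_{ij} = P_i \;\; \text{for all } i, \qquad \sum_{i} T_{ij} = C_j \;\; \text{for all } j, \qquad T_{ij} \ge 0,$$
where Tij = Shipment from production area i to consumption area j,
Pi = Production in Region i,
Cj = Consumption in Region j,
cij = Impedance value between Region i and Region j (normally distance or cost).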
There are pros and cons in using LP to solve freight distribution problems. The major attraction of LP is its underlying basis of economic rationality, which is to minimize overall transportation cost. However, there is no rational central authority that could make all flow decisions between regions. In a way, each entity or region acts independently, which undermines the validity of the LP approach. Moreover, the overall attractiveness is also reduced by inherent characteristics of LP, which create some limitations in solving freight flow distribution problems.
3. DATA COLLECTION
The unavailability of good data is perhaps the
greatest challenge we face in this research.
Our goal is to model the directional distribution of flows generated by
e-commerce, but there is currently no data source that has a direct measure of
such flows. Estimated e-commerce sales volume in the
Since there is no readily available
e-commerce data, we model the e-commerce flows based on the existing commodity
flow survey data. Commodity flow surveys capture data on shipments originating
from selected types of business establishments located in the fifty states and
the
Two sets of commodity flow survey data are available, 1993 and 1997. In 1993, there were virtually no significant e-commerce transactions. Therefore, we initially planned to compare the flows of a selected product code in 1993 to the flows in 1997. However, the 1993 survey data use the detailed STCC (Standard Transportation Commodity Classification) coding system, whereas the 1997 commodity flow survey uses the more aggregate SCTG (Standard Classification of Transported Goods) coding system. That is, goods are grouped within fewer product codes in 1997. Therefore, a direct comparison between the 1993 and 1997 data is not possible. As a result, we used the 1997 commodity flow survey as our main data source.
Another set of data that we have
looked at is the distribution of Internet domains registered in
4. METHODOLOGY
4.1 Introduction
In this section, we first provide a brief overview of the formulation process of our gravity model that captures the directional distribution of flows. We explain the formulation procedure in a step-by-step manner. In Section 4.2, we describe the reverse derivation procedure. We use reverse derivation to determine the historical impedance values that lead to the flow distribution of the base flow condition. The deterrence function formulation is also discussed in that section. In Section 4.3, we describe the iterative procedure we use to adjust the calculated commodity flows to within 10 percent of the originally specified values. In Section 4.4, we present the concept of an ‘extreme’ case in impedance values and show how the growing e-commerce economy is moving the impedance values toward this extreme. In Section 4.5, we describe how to calculate the average mile statistic. The average mile is the average distance traveled by each ton of product. We project the increase in average mile due to e-commerce so that the average exponent n can be estimated. In Section 4.6, we describe the process of determining the appropriate smoothing constant λ (a value that sets the intermediate condition between the current and the future estimated condition). Finally, in Section 4.7, we describe how the distributed flows are assigned to different modes of transportation.
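The average mile statistic mentioned for Section 4.5 can be read as a tonnage-weighted mean distance: total ton-miles divided by total tons. The short sketch below assumes that reading. The function name, the flows, and the distances (including the intra-regional distances on the diagonal) are hypothetical.

```python
def average_mile(flows, dist):
    """Average distance traveled per ton: sum(T_ij * d_ij) / sum(T_ij)."""
    ton_miles = sum(flows[i][j] * dist[i][j]
                    for i in range(len(flows))
                    for j in range(len(flows[i])))
    tons = sum(sum(row) for row in flows)
    return ton_miles / tons


# Illustrative distances in miles between two regions.
dist = [[50.0, 400.0],
        [400.0, 50.0]]

# Flows in tons: mostly local shipments versus evenly spread shipments.
local_flows = [[80.0, 20.0],
               [30.0, 70.0]]
spread_flows = [[50.0, 50.0],
                [50.0, 50.0]]

print(average_mile(local_flows, dist))   # 137.5
print(average_mile(spread_flows, dist))  # 225.0
```

Shifting tonnage toward distant pairs raises the statistic, which matches the expected pattern as the effect of distance diminishes.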
We explain the steps of the procedure in more detail below:
1) Determine the geographic regions for the model. We use the 48 contiguous states.