nextprevious
Next:Structure and Contribution of Previous:The Markovian Approach

Subexponential Distributions and Long-Range Dependent Traffic

Recent studies have revealed that several types of network traffic exhibit a multiple time-scale behavior [LTW94]. Multiple time-scale traffic is characterized by considerable fluctuations in the traffic rate, well above or under the average rate, over several time scales. Both VBR and data traffic have been shown to display this property, over half a dozen of time scales.

One may wonder about the origins of the multiple time-scale behavior of network traffic. Certainly, complex network mechanisms do account for the multiple-scale behavior of network traffic. For instance, unsuccessfully transmitted information may be retransmitted either at a fast time scale via Mac or Data Link layer mechanisms (such as CSMA/CD) or at a slower time scale via Transport mechanisms (such as TCP). But, as several studies point out [Jel96, CrB97], the main source of the multiple time-scale nature of the traffic may be the extreme variability in the size of the documents transfered over the network. Indeed, recent technological advances have enabled the development of storage devices with the capability of storing huge documents. The segmentation and packetization of such documents, when transmitted over the network, creates bursts of highly correlated traffic. The impact of multiple time-scale traffic is thus expected to increase with the growth of Web and multimedia traffic.

The use of Markov processes for modeling multiple time-scale traffic seems, a priori, problematic. Indeed, very large state-spaces are needed for capturing the complex behavior of such kind of traffic. These large state-spaces rely, in turn, on a large number of parameters whose meaning is unclear. Moreover, with regard to queueing analysis, one can not usually expect to obtain closed-form results (only numerical results). Therefore, a better alternative may be to resort to subexponential distributions and long-range dependent traffic models.

Subexponential (or long-tailed) distributions [GoK98] provide a basis for the parsimonious modeling of multiple time-scale traffic. A subexponential distribution has the property to decay slower than any exponential distribution, i.e., its cumulative distribution function F(t) satisfies

 equation558

Examples of subexponential distributions are Pareto, Weibull and lognormal [PKO76]. Experimental studies give strong supports to modeling network traffic with subexponential distributions. In [WTS95], traffic measurements from two Ethernet LAN's show that data generations of individual sources can be modeled by On/Off processes whose On and Off periods have subexponential distributions. It has also been found that subexponential distributions fit very well many Internet networks parameters such as files lengths, duration of connections and intervals between requests [PaF95]. In MPEG video, the number of frames within a scene exhibits a subexponential behavior [JeL96].

A very important class of subexponential distributions consists of the heavy-tailed (or power-tailed) distributions [GoK98, ReS98]. A random variable X has a heavy-tailed distribution if its complementary cumulative distribution function (ccdf) tex2html_wrap_inline630 satisfies

 equation561

where tex2html_wrap_inline596 and c are positive constants, and tex2html_wrap_inline636 means tex2html_wrap_inline638 . The most frequent situation is tex2html_wrap_inline640 for which the random variable X has a finite mean but an infinite variance. Occurrence of such a distribution in the activity and/or silence period of an On/Off process gives rise to long-range dependence, i.e., a non-summable autocorrelation function [BRS96]. A famous heavy-tailed distribution is the (translated) Pareto distribution for which

 equation564

The Pareto distribution provides parsimonious modeling since it depends on only two parameters.

The heavy tails and long range dependence phenomena have been detected both in video and Web traffic [BST95, CrB97]. Moreover, the aggregation of a large number of heavy-tailed On/Off processes gives rise, under appropriate rescaling, to a certain version of the Fractional Gaussian Noise (FGN) [WTS95]. The FGN is a long-range dependent Gaussian process exhibiting the self-similar property . The FGN and the corresponding cumulative process, called Fractional Brownian Motion (FBM) (see [Ber94], p.55, for the definition of fractional Brownian motion), have been shown to have good statistical fit with data measurements in Ethernet LAN segments [LTW94, Nor95].

Unfortunately, subexponential distributions do not lend themselves to easy queueing analysis since their Laplace transforms are not tractable, in most cases (for an exception, see [BoC98]). This explains why, so far, most of the queueing results involving subexponential distributions have only been obtained in asymptotic regimes (see [GJK99] and references therein). Although these asymptotic results may provide very useful insights, they suffer from some severe limitations. First, the region where they are applicable is often limited to very large (and sometimes unrealistic) buffer sizes. Second, they provide only partial information on system performances. Consider, for instance, the distribution of the waiting time W in a GI/G/1 queue with subexponentially distributed service time. Asymptotic results [Pak75] show that the tail of the waiting distribution tex2html_wrap_inline658 is insensitive to the inter-arrival time distribution (except for its mean), as tex2html_wrap_inline660 . In this thesis, we show that this fact is not necessarily true for small and moderate values of t.


nextprevious
Next:Structure and Contribution of Previous:The Markovian Approach
David Starobinski

Mon Nov 1 15:20:02 PST 1999