RELATED APPLICATION
[0001] This application claims priority to U.S. provisional application serial no.
titled "Generating Models for Directed Scale-Free Inter-Object Relationships", filed
on April 18, 2003, and hereby incorporated by reference.
TECHNICAL FIELD
[0002] The invention pertains to generating models for growth and distribution of directed
scale-free object relationships.
BACKGROUND
[0003] Many new processes for generating distributions of random graphs have been introduced
and analyzed, inspired by certain common features observed in many large-scale real-world
graphs such as the "web graph", whose vertices are web pages with a directed edge
for each hyperlink between two web pages. For an overview see the survey papers [2]
and [15] of the Appendix. Other graphs modeled are the "internet graph" [18], movie
actor [28] and scientific [25] collaboration graphs, cellular networks [21], and so
on.
[0004] In addition to the "small-world phenomenon" of logarithmic diameter investigated
originally in the context of other networks by Strogatz and Watts [28], one of the
main observations is that many of these large real-world graphs are "scale-free" (see
references [5, 7, 24] of the Appendix), in that the distribution of vertex degrees
follows a power law, rather than the Poisson distribution of the classical random
graph models G(n, p) and G(n, M) [16, 17, 19], see also [9]. Many new graph generators
have been suggested to try
to model such scale-free properties and other features, such as small diameter and
clustering, of real-world events, phenomena, and systems that exhibit dynamically
developing object relationships such as that presented by the World Wide Web (WWW).
Unfortunately, such existing generators produce models that are either completely
undirected or, at most, semi-, or uni-directional (i.e., either in-degrees or out-degrees
are treated, but not both simultaneously), and/or have a statically predetermined
degree distribution.
[0005] In light of this, existing techniques for generating graphs do not provide realistic
treatments of dynamically generated scale-free graphs with directed object relationships
(i.e., link(s) from one object to another) that develop in a way depending on both
links out-of and into an object. As such, conventional generation techniques do not
adequately represent specific or fully modeled simulations of scale-free, directed
object relationships that may exist in nature and/or other dynamic environments such
as the WWW.
[0006] In view of these limitations, systems and methods for generating models of directed
scale-free graphs or dynamic communities of relationships (e.g., network topologies)
are greatly desired. Such generators could be used, e.g., to generate sample directed
network topologies on which directed internet routing protocols are tested, or to
generate sample web graphs on which search algorithms are tested.
SUMMARY
[0007] Systems and methods for generating models of directed scale-free object relationships
are described. In one aspect, a sequence of random numbers is generated. Individual
ones of these random numbers are then selected over time to generate the directed
scale-free object relationships as a graph based on sequences of in-degrees and out-degrees.
BRIEF DESCRIPTION OF THE DRAWINGS
[0008] The following detailed description is given with reference to the accompanying figures.
In the figures, the left-most digit of a component reference number identifies the
particular figure in which the component first appears.
[0009] Fig. 1 is a block diagram of an exemplary computing environment within which systems
and methods for generating models of directed scale-free object relationships may
be implemented.
[0010] Fig. 2 is a block diagram that shows further exemplary aspects of system memory of
Fig. 1, including application programs and program data for generating models of
directed scale-free object relationships.
[0011] Fig. 3 shows an exemplary network of directed object relationships.
[0012] Fig. 4 shows an exemplary procedure to generate a model of directed scale-free object
relationships.
DETAILED DESCRIPTION
Overview
[0013] The following systems and methods generate models of directed scale-free object
relationships. This is accomplished through the simultaneous treatment of both in-degrees
and out-degrees (bidirectional) to provide a very natural model for generating graphs
with power law degree distributions. Depending on the characteristics of the entity
or the abstraction being modeled, power laws can be different for in-degrees and out-degrees.
Such modeling is consistent with power laws that have been observed, for example,
in nature and in technological communities (e.g., directed hyperlinks among web pages
on the WWW, connections among autonomous systems on the AS internet, connections among
routers on the internet, etc.).
Exemplary Operating Environment
[0014] Turning to the drawings, wherein like reference numerals refer to like elements,
the invention is illustrated as being implemented in a suitable computing environment.
Although not required, the invention is described in the general context of computer-executable
instructions, such as program modules, being executed by a personal computer. Program
modules generally include routines, programs, objects, components, data structures,
etc., that perform particular tasks or implement particular abstract data types.
[0015] Fig. 1 illustrates an example of a suitable computing environment 120 on which the
subsequently described systems, apparatuses and methods to generate directed scale-free
network topologies may be implemented. Exemplary computing environment 120 is only
one example of a suitable computing environment and is not intended to suggest any
limitation as to the scope of use or functionality of the systems and methods described
herein. Neither should computing environment 120 be interpreted as having any dependency
or requirement relating to any one or combination of components illustrated in computing
environment 120.
[0016] The methods and systems described herein are operational with numerous other general
purpose or special purpose computing system environments or configurations. Examples
of well known computing systems, environments, and/or configurations that may be suitable
include, but are not limited to, hand-held devices, symmetrical multi-processor (SMP)
systems, microprocessor based or programmable consumer electronics, network PCs, minicomputers,
mainframe computers, portable communication devices, and the like. The invention may
also be practiced in distributed computing environments where tasks are performed
by remote processing devices that are linked through a communications network. In
a distributed computing environment, program modules may be located in both local
and remote memory storage devices.
[0017] As shown in Fig. 1, computing environment 120 includes a general-purpose computing
device in the form of a computer 130. Computer 130 includes one or more processors
132, a system memory 134, and a bus 136 that couples various system components including
system memory 134 to processor 132. Bus 136 represents one or more of any of several
types of bus structures, including a memory bus or memory controller, a peripheral
bus, an accelerated graphics port, and a processor or local bus using any of a variety
of bus architectures. By way of example, and not limitation, such architectures include
Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced
ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral
Component Interconnects (PCI) bus also known as Mezzanine bus.
[0018] Computer 130 typically includes a variety of computer readable media. Such media
may be any available media that is accessible by computer 130, and it includes both
volatile and non-volatile media, removable and non-removable media. In Fig. 1, system
memory 134 includes computer readable media in the form of volatile memory, such as
random access memory (RAM) 140, and/or non-volatile memory, such as read only memory
(ROM) 138. A basic input/output system (BIOS) 142, containing the basic routines that
help to transfer information between elements within computer 130, such as during
start-up, is stored in ROM. RAM typically contains data and/or program modules that
are immediately accessible to and/or presently being operated on by processor(s) 132.
[0019] Computer 130 may further include other removable/non-removable, volatile/non-volatile
computer storage media. For example, Fig. 1 illustrates a hard disk drive 144 for
reading from and writing to a non-removable, non-volatile magnetic media (not shown
and typically called a "hard drive"), a magnetic disk drive 146 for reading from and
writing to a removable, non-volatile magnetic disk 148 (e.g., a "floppy disk"), and
an optical disk drive 150 for reading from or writing to a removable, non-volatile
optical disk 152 such as a CD-ROM/R/RW, DVD-ROM/R/RW/+R/RAM or other optical media.
Hard disk drive 144, magnetic disk drive 146 and optical disk drive 150 are each connected
to bus 136 by one or more interfaces 154.
[0020] The drives and associated computer-readable media provide nonvolatile storage of
computer readable instructions, data structures, program modules, and other data for
computer 130. Although the exemplary environment described herein employs a hard disk,
a removable magnetic disk 148 and a removable optical disk 152, it should be appreciated
by those skilled in the art that other types of computer readable media which can
store data that is accessible by a computer, such as magnetic cassettes, flash memory
cards, digital video disks, random access memories (RAMs), read only memories (ROM),
and the like, may also be used in the exemplary operating environment.
[0021] A number of program modules may be stored on the hard disk, magnetic disk 148, optical
disk 152, ROM 138, or RAM 140, including, e.g., an operating system (OS) 158 to provide
a runtime environment, one or more application programs 160, other program modules
162, and program data 164.
[0022] A user may provide commands and information into computer 130 through input devices
such as keyboard 166 and pointing device 168 (such as a "mouse"). Other input devices
(not shown) may include a microphone, joystick, game pad, satellite dish, serial port,
scanner, camera, etc. These and other input devices are connected to the processing
unit 132 through a user input interface 170 that is coupled to bus 136, but may be
connected by other interface and bus structures, such as a parallel port, game port,
or a universal serial bus (USB).
[0023] A monitor 172 or other type of display device is also connected to bus 136 via an
interface, such as a video adapter 174. In addition to monitor 172, personal computers
typically include other peripheral output devices (not shown), such as speakers and
printers, which may be connected through output peripheral interface 176.
[0024] Computer 130 may operate in a networked environment using logical connections to
one or more remote computers, such as a remote computer 178. Remote computer 178 may
include many or all of the elements and features described herein relative to computer
130. Logical connections shown in Fig. 1 are a local area network (LAN) 180 and a
general wide area network (WAN) 182. Such networking environments are commonplace
in offices, enterprise-wide computer networks, intranets, and the Internet.
[0025] When used in a LAN networking environment, computer 130 is connected to LAN 180 via
network interface or adapter 184. When used in a WAN networking environment, the computer
typically includes a modem 186 or other means for establishing communications over
WAN 182. Modem 186, which may be internal or external, may be connected to system
bus 136 via the user input interface 170 or other appropriate mechanism.
[0026] Depicted in Fig. 1 is a specific implementation of a WAN via the Internet. Here,
computer 130 employs modem 186 to establish communications with at least one remote
computer 178 via the Internet 188.
[0027] In a networked environment, program modules depicted relative to computer 130, or
portions thereof, may be stored in a remote memory storage device. Thus, e.g., as
depicted in Fig. 1, remote application programs 190 may reside on a memory device
of remote computer 178. It will be appreciated that the network connections shown
and described are exemplary and other means of establishing a communications link
between the computers may be used.
[0028] Fig. 2 is a block diagram that shows further exemplary aspects of system memory 134
of Fig. 1, including application programs 160 and program data 164. Application programs
160 include, for example, a Directed Scale-Free Object Relationship Network Generating
Module 202 to generate a Directed Scale-Free Graph 204 (hereinafter often referred
to as the "graph"). Each graph 204 represents vertices and edges between respective
vertices that have been added to the graph by the network generating module 202 during
discrete iterative operations that are performed over time t. Before turning to more
detailed aspects of the algorithms used to generate the graph 204, exemplary structure
and elements of a graph 204 are described in reference to graph 204(a).
[0029] Graph 204(a) is represented as a matrix, wherein each horizontal row i and vertical
column j of the matrix corresponds to a respective vertex, or node (i.e., node 1 through
node N). Thus, i = 1...N, and j = 1...N. (Hereinafter, the terms node and nodes are often
used interchangeably with the terms vertex and vertices.) To grow graph 204(a) from some
number of nodes to a greater number of nodes, the network generating module 202 adds a
node to the graph 204(a). This means that a row and a column representing the new node
are added to the graph 204(a). The (i,j) element E(i,j) of the graph 204(a) represents the
number of directed edges or connections from node i to node j, modeling, e.g., the number
of hyperlinks from web page i to web page j, or a directed transfer of E(i,j) objects or
characteristics from entity i to entity j (such as the transfer of money and goods between
a merchant and a buyer), and/or the like.
[0030] In the representation 204(a), we have adopted the convention that edge direction
is evaluated from the row-node to the column-node.
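The matrix representation described above can be sketched in a few lines of Python. This is only an illustration: the class name `DirectedMultigraph` and its methods are hypothetical, and the code indexes nodes from 0, whereas the text numbers nodes from 1.

```python
class DirectedMultigraph:
    """Edge-count matrix: E[i][j] is the number of directed edges from node i to node j."""

    def __init__(self):
        self.E = []  # square list of lists; starts with zero nodes

    def add_node(self):
        """Grow the matrix by one row and one column; return the new node's index."""
        for row in self.E:
            row.append(0)                       # new column for every existing row
        self.E.append([0] * (len(self.E) + 1))  # new row, already one column wider
        return len(self.E) - 1

    def add_edge(self, i, j):
        """Record one more directed edge from row-node i to column-node j."""
        self.E[i][j] += 1


g = DirectedMultigraph()
a = g.add_node()
b = g.add_node()
g.add_edge(a, b)
g.add_edge(a, b)  # a second edge in the same direction
```

Because direction runs from row to column, the two calls to `add_edge(a, b)` raise only `E[a][b]`; the reverse entry `E[b][a]` stays at zero.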
[0031] We now describe the edge E(i, j) values of graph 204(a) in view of network 300 of
Fig. 3, which shows the exemplary
network 300 of directed object relationships. In this exemplary network, objects 302-1,
302-2, and 302-3 have at least one edge 304 (i.e., one or more edges 304-1 through
304-N) to/from another object. For example, object 302-1 (Fig. 3) shows a looping
edge 304-1 that indicates that the object has a relationship to itself (for example,
a web page having a hyperlink to a point inside itself).
[0032] Referring to Fig. 2, such a looping edge is also represented in graph 204(a) at the
edge value that corresponds to the intersection between row-Node 1 and column-Node 1
(i.e., E(1,1) = 1). This indicates that Node 1 has a single relationship to itself. This
type of edge is called a "loop".
[0033] In this implementation, the module 202 may generate (self-)loops in the graph 204.
However, the generating module 202 can be configured not to generate loops to model
systems without self-loops.
[0034] In another example to represent edges 304 of Fig. 3 with a directed scale-free graph
204(a) of Fig. 2, note that object 302-1 of Fig. 3 has three (3) edges 304-2 through
304-4 to node 302-2. In particular, the intersection of row-Node 1 with column-Node 2
(i.e., E(1,2)) shows a value of 3, which is representative of the relationship between
object 302-1 of Fig. 3 to object 302-2. This type of edge is called a "multiple edge",
which in general refers to two or more edges from a particular object Node i to a
different object Node j. In this implementation, the module 202 may generate multiple
edges in the graph 204. However, in another implementation, the generating module 202
can be configured not to generate multiple edges, to model systems in which there are
only single edges.
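As an illustration of how loops and multiple edges show up in the matrix, the sketch below reads degrees off a small example. Only the entries E(1,1) = 1 and E(1,2) = 3 come from the text; every other entry is set to zero here as an assumption, and the helper names are hypothetical. Indices are 0-based in the code.

```python
# E(1,1) = 1 (a loop) and E(1,2) = 3 (a multiple edge) are taken from the text.
# All remaining entries are assumed to be 0 purely for illustration.
E = [
    [1, 3, 0],
    [0, 0, 0],
    [0, 0, 0],
]


def out_degree(E, i):
    """Out-degree of node i: the sum of row i."""
    return sum(E[i])


def in_degree(E, j):
    """In-degree of node j: the sum of column j."""
    return sum(row[j] for row in E)


def has_loop(E):
    """True if any node has an edge to itself (a non-zero diagonal entry)."""
    return any(E[i][i] > 0 for i in range(len(E)))


def has_multiple_edge(E):
    """True if two or more edges run from some node i to a different node j."""
    n = len(E)
    return any(E[i][j] > 1 for i in range(n) for j in range(n) if i != j)
```

With this matrix, the first node has out-degree 4 and in-degree 1, since its loop counts once in each direction. A generator configured to forbid loops or multiple edges could reject any update that would make `has_loop` or `has_multiple_edge` true.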
[0035] Although network 300 of Fig. 3, and graph 204(a) of Fig. 2 respectively represent/map
only 3 nodes/objects, it can be appreciated that the complexity and number of objects
represented/mapped by the exemplary network 300 and graph 204(a), are exemplary and
could represent/map any number of objects of any complexity.
[0036] We now describe the algorithms used by the generating module 202 to generate directed
scale-free object relationships in further detail.
Generating Directed Scale-Free Object Relationships
[0037] Referring to Fig. 2, the generating module 202 introduces random and probabilistic
aspects during graph 204 generation to simulate dynamically created objects (e.g.,
web pages, etc.) and relationships between them (e.g., hyperlinks, etc.) that are
often observed, for example, in technological (e.g., the web), cultural, natural,
and/or the like, environments. Such a random aspect is obtained via iterative generating
module 202 requests over time t for respective random number(s) 206 from the random
number generating module (RNG) 208. The RNG 208 can be a standalone module, or a service
provided by a computer program module such as the OS 158 (Fig. 1).
[0038] Some of the random numbers 206 will be required to lie between 0 (zero) and 1 (one).
For each of these random numbers 206, the network generating module 202 uses the random
number 206 to determine one of three possibilities, labeled (A), (B) and (C), depending
on whether the random number lies between 0 (zero) and α, α and α + β, or α + β and
α + β + γ, respectively. The parameters α, β and γ are non-negative real numbers that
when added together equal one (1), i.e., α + β + γ = 1. These parameters are stored as
respective portions of the configuration data 210. The parameters α, β and γ can be
selected/determined in different manners, for example, manually preconfigured by a
system administrator, programmatically configured in view of environmental measurements,
etc. This allows for considerable flexibility to customize the model generating process
to simulate structural and object relationships of various types of measured environments.
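The three-way choice described above can be sketched as follows. The function name `choose_rule` is hypothetical, and the use of half-open intervals to settle values that fall exactly on a boundary is an assumption; the text does not say how ties are resolved.

```python
import random


def choose_rule(u, alpha, beta, gamma):
    """Map a random number u in [0, 1) to possibility "A", "B" or "C".

    [0, alpha) -> "A", [alpha, alpha + beta) -> "B", the remainder -> "C".
    """
    if min(alpha, beta, gamma) < 0:
        raise ValueError("alpha, beta and gamma must be non-negative")
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("alpha + beta + gamma must equal 1")
    if u < alpha:
        return "A"
    if u < alpha + beta:
        return "B"
    return "C"


rng = random.Random(42)  # seeded only so that the sketch is repeatable
rule = choose_rule(rng.random(), alpha=0.1, beta=0.8, gamma=0.1)
```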
[0039] When the generating module 202 maps the random number 206 to the range [0, α], the
generating module 202 augments the graph 204 by adding a vertex and an edge from the
new vertex into an existing (old) vertex. When the generating module 202 maps the
random number 206 to the range [α, α + β], the generating module 202 augments the
graph 204 by connecting two old vertices (i.e., a vertex is not added, but one of
the E(i,j) values increases by one). When the generating module 202 maps the random
number 206 to the range [α + β, α + β + γ], the generating module 202 augments the graph
204 by connecting an old vertex to a newly generated vertex. Additionally, during graph
generation, the module 202 applies configurable constants δin and/or δout to introduce
in-degree and out-degree shifts to the graph.
[0040] The degree shift, δin or δout, is a non-negative parameter added to the in-degree
or out-degree of a vertex, respectively. The degree shift is added before applying any
other rules which are used to choose random vertices.
[0041] In light of the above, let G0 be any fixed initial directed graph 204, for example,
a single vertex (i.e., Node 1) without edges (i.e., E(1,1) = 0), and let t0 be the number
of edges of G0. The generating module 202 always adds one edge per iteration, and sets
G(t0) = G0, so at time t the graph G(t) has exactly t edges, and a random number n(t) of
vertices. For purposes of discussion, number(s) of edges and vertices, as well
as other intermediate parameters and calculations are represented by respective portions
of "other data" 212.
[0042] In the operation of the generating module 202, to choose a vertex v of G(t) according
to dout + δout means to choose v so that Pr(v = vi) is proportional to dout(vi) + δout,
i.e., so that

$$\Pr(v = v_i) = \frac{d_{\mathrm{out}}(v_i) + \delta_{\mathrm{out}}}{t + \delta_{\mathrm{out}}\, n(t)}.$$

To choose v according to din + δin means to choose v so that

$$\Pr(v = v_j) = \frac{d_{\mathrm{in}}(v_j) + \delta_{\mathrm{in}}}{t + \delta_{\mathrm{in}}\, n(t)}.$$

Here dout(vi) and din(vj) are the out-degree of vi and the in-degree of vj, respectively,
measured in the graph G(t).
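To make the normalization concrete, the following sketch computes the selection probabilities for every vertex from a list of degrees. The function name and the sample degrees are hypothetical; the only facts used are that the degrees sum to t and that there are n(t) vertices.

```python
def selection_probabilities(degrees, delta):
    """Return Pr(v = v_k) = (d(v_k) + delta) / (t + delta * n) for each vertex k.

    `degrees` holds either all in-degrees or all out-degrees of G(t), so its
    sum equals the number of edges t and its length equals n(t).
    """
    n = len(degrees)
    t = sum(degrees)
    denominator = t + delta * n
    if denominator <= 0:
        raise ValueError("no vertex can be chosen: t + delta * n(t) is zero")
    return [(d + delta) / denominator for d in degrees]


# Hypothetical example: three vertices with out-degrees 4, 0 and 0, delta_out = 1.
out_degrees = [4, 0, 0]
probs = selection_probabilities(out_degrees, delta=1.0)
# Here t = 4 and n = 3, so the denominator is 7 and the weights are 5, 1 and 1.
```

With a positive shift, vertices of degree zero keep a non-zero chance of being chosen; with a shift of zero they could never be selected.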
[0043] For t ≥ t0, the generating module 202 forms G(t+1) from G(t) according to the
following rules:
(A) With probability α (see configuration data values 210), add a new vertex v together with an edge from v to an existing vertex w, where w is chosen according to din + δin, so that Pr(w = wj) ∝ (din(wj) + δin). (For instance, in a web graph, add one (1) edge representing a hyperlink from vertex v to vertex w.) The inputs to this algorithm are n = n(t) vertices and t edges, and the outputs are n(t+1) = n(t) + 1 vertices and t + 1 edges. After adding the new vertex v = Node n+1, the particular existing vertex w that will receive the edge from the new vertex v is determined as follows. Recall that E(i,j) = Eij = number of edges from vertex i to vertex j, so that the in-degree of vertex j is the sum of column j:

$$d_{\mathrm{in}}(j) = \sum_{i=1}^{n} E_{ij}.$$

At this point, the generating module 202 requests an additional random number 206 between 0 and the sum of all numbers din(j) + δin in G(t):

$$\sum_{j=1}^{n} \bigl(d_{\mathrm{in}}(j) + \delta_{\mathrm{in}}\bigr) = t + n\,\delta_{\mathrm{in}}.$$

The range from 0 to t + nδin is divided into n slots with lengths din(j) + δin, j = 1, ..., n. The random number 206 will fall into a particular slot j. At this point, the generating module 202 sets E(n+1, j) = 1.
(B) With probability β (see configuration data values 210), add an edge from an existing vertex v to an existing vertex w, where v and w are chosen independently, v according to dout + δout, and w according to din + δin, so that Pr(v = vi, w = wj) ∝ (dout(vi) + δout)(din(wj) + δin). The inputs to this algorithm are n = n(t) vertices and t edges, and the outputs are n(t+1) = n(t) vertices and t + 1 edges. The generating module 202 selects the particular existing vertex v that will add an edge to vertex w by generating a random number 206 (rout) such that:

$$0 \le r_{\mathrm{out}} \le t + n\,\delta_{\mathrm{out}}.$$

This range is divided into slots, with the ith slot having length dout(i) + δout. The random number 206 falls into a particular slot i; the vertex v will be Node i. The generating module 202 determines the vertex w that will receive the edge by generating a random number 206 (rin) such that:

$$0 \le r_{\mathrm{in}} \le t + n\,\delta_{\mathrm{in}}.$$

This range is divided into slots, with the jth slot having length din(j) + δin. The random number 206 falls into a particular slot j; the vertex w will be Node j. At this point, the generating module 202 increments E(i,j) by 1.
(C) With probability γ (see configuration data values 210, which can be calculated as γ = 1 - α - β), add a new vertex v and an edge from an existing vertex w to v, where w is chosen according to dout + δout, so that Pr(w = wi) ∝ (dout(wi) + δout). The inputs to this algorithm are n = n(t) vertices and t edges, and the outputs are n(t+1) = n(t) + 1 vertices and t + 1 edges. After adding the new vertex v = Node n+1, the particular existing vertex w that will add an edge to the new vertex v is determined as follows: generate a random number (rout) 206 according to:

$$0 \le r_{\mathrm{out}} \le t + n\,\delta_{\mathrm{out}}.$$

This range is divided into slots, with the ith slot having length dout(i) + δout. The random number 206 falls into a particular slot i; the vertex w will be Node i. Thus, the generating module 202 sets E(i,n+1) = 1.
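Rules (A), (B) and (C) can be sketched as a single update on the edge-count matrix. This is a minimal illustration under stated assumptions, not the module's actual implementation: the function names are hypothetical, nodes are indexed from 0, γ is taken implicitly as 1 - α - β, and positive shifts are used in the example so that the single edgeless vertex of G0 can be selected.

```python
import random


def pick_slot(weights, rng):
    """Slot selection: return index k with probability weights[k] / sum(weights)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("all slots have zero length")
    r = rng.random() * total          # random number between 0 and the total length
    upper = 0.0
    for k, w in enumerate(weights):
        upper += w                    # right-hand end of slot k
        if r < upper:
            return k
    # Only reachable through floating-point rounding: fall back to the last real slot.
    return max(k for k, w in enumerate(weights) if w > 0)


def add_vertex(E):
    """Append one row and one column of zeros; return the new vertex index."""
    for row in E:
        row.append(0)
    E.append([0] * (len(E) + 1))
    return len(E) - 1


def step(E, alpha, beta, delta_in, delta_out, rng):
    """Add exactly one edge to E in place and return the rule that was applied."""
    n = len(E)
    in_weights = [sum(row[j] for row in E) + delta_in for j in range(n)]
    out_weights = [sum(E[i]) + delta_out for i in range(n)]
    u = rng.random()
    if u < alpha:                         # rule (A): new vertex -> old vertex
        w = pick_slot(in_weights, rng)
        v = add_vertex(E)
        E[v][w] = 1
        return "A"
    if u < alpha + beta:                  # rule (B): old vertex -> old vertex
        v = pick_slot(out_weights, rng)
        w = pick_slot(in_weights, rng)
        E[v][w] += 1
        return "B"
    w = pick_slot(out_weights, rng)       # rule (C): old vertex -> new vertex
    v = add_vertex(E)
    E[w][v] = 1
    return "C"


rng = random.Random(0)                    # seeded only for repeatability
E = [[0]]                                 # G0: one vertex, no edges, so t0 = 0
for _ in range(200):
    step(E, alpha=0.2, beta=0.6, delta_in=1.0, delta_out=1.0, rng=rng)
```

Recomputing the weights from the full matrix on every step keeps the sketch close to the text but costs O(n²) per edge; a practical generator would more likely maintain running in-degree and out-degree totals.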
[0044] Although the generating module 202 makes no additional assumptions about the parameters,
the behavior of the resulting graph is non-trivial only if certain settings of the
parameters are avoided. In particular, the following parameter values can be avoided
to exclude trivialities:
- α + γ = 0 (↔ the graph does not grow)
- δin + δout = 0 (↔ all vertices not in G0 have din = 0 or dout = 0)
- α δin + γ = 0 (↔ all vertices not in G0 have din = 0)
- γ = 1 (↔ all vertices not in G0 have din = 1)
- γ δout + α = 0 (↔ all vertices not in G0 have dout = 0)
- α = 1 (↔ all vertices not in G0 have dout = 1)
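The listed settings can be checked mechanically before generation starts. The sketch below is one hypothetical way to do so; the function name is an assumption, and the exact comparisons against 0 and 1 are only appropriate when the parameters are entered as exact values rather than computed with rounding error.

```python
def trivial_settings(alpha, beta, gamma, delta_in, delta_out):
    """Return a description for every degenerate condition the parameters satisfy."""
    checks = [
        (alpha + gamma == 0, "alpha + gamma = 0: the graph does not grow"),
        (delta_in + delta_out == 0,
         "delta_in + delta_out = 0: new vertices have d_in = 0 or d_out = 0"),
        (alpha * delta_in + gamma == 0,
         "alpha * delta_in + gamma = 0: new vertices have d_in = 0"),
        (gamma == 1, "gamma = 1: new vertices have d_in = 1"),
        (gamma * delta_out + alpha == 0,
         "gamma * delta_out + alpha = 0: new vertices have d_out = 0"),
        (alpha == 1, "alpha = 1: new vertices have d_out = 1"),
    ]
    return [message for triggered, message in checks if triggered]


problems = trivial_settings(alpha=0, beta=1, gamma=0, delta_in=1, delta_out=1)
```

An empty list means none of the six conditions applies; in the example, β = 1 triggers the first, third and fifth conditions at once.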
[0045] In one implementation, when graph 204 represents a web graph, δout is set to 0. The
motivation is that vertices added under rule (C) correspond to web pages which purely
provide content; such pages do not change, are born without out-links and remain without
out-links. In this implementation, vertices generated/added under rule (A) correspond to
usual pages, to which links may be added later. While mathematically it may seem natural
to take δin = 0 in addition to δout = 0, doing so would provide a model in which every
page not in G0 has either no in-links or no out-links, i.e., a trivial model.
[0046] A non-zero value of δin corresponds to insisting that a page is not considered part
of the web until something points to it, for example, a search engine. This allows the
generating module 202 to consider edges from search engines independently/separately
from the rest of the graph, since they are typically considered to be edges of a
different nature (for purposes of implementing a search algorithm, for example) than
other types of edges. For the same reason, δin does not need to be an integer. The
parameter δout is included to provide symmetry to the model with respect to reversing
the directions of edges (swapping α with γ and δin with δout), and to further adapt the
model to contexts other than that of the web graph.
[0047] In one implementation, taking β = γ = δout = 0 and α = δin = 1, the generating
module 202 includes a precise version of the special case of m = 1 of the Barabási-Albert
model [5], wherein m represents the number of new edges added for each new vertex. A more
general model than that so far described here, with additional parameters, can be
generated by adding m edges for each new vertex, or (as in [14]) by adding a random number
of new edges with a certain distribution for each new vertex. In implementing the
description here, the main effect of the Barabási-Albert parameter m, namely varying the
overall average degree, is achieved by varying β.
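The link between β and the average degree can be made explicit with a short calculation. It is offered as a sketch that follows from rules (A), (B) and (C) of [0043], not as a statement from the original text; it assumes β < 1 and neglects the fixed contribution of G0 once t is large.

```latex
% Every step adds exactly one edge; a vertex is added only under rules (A) and (C),
% i.e., with probability \alpha + \gamma = 1 - \beta per step.
\begin{align*}
\mathbb{E}\bigl[n(t)\bigr] &= n(t_0) + (\alpha + \gamma)\,(t - t_0), \\
\frac{t}{n(t)} &\longrightarrow \frac{1}{\alpha + \gamma} = \frac{1}{1 - \beta}
\qquad \text{as } t \to \infty .
\end{align*}
```

Since each edge contributes one unit of in-degree and one unit of out-degree, the average in-degree and the average out-degree both approach 1/(1 - β). For example, β = 0.5 gives about two edges per vertex, and β = 0.9 gives about ten.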
[0048] Another more general model than that so far described here, again with additional
parameters, can be generated to describe systems in which different vertices have
different fitnesses. For example, some web pages may be considered more fit or attractive
than others, and may get more connections per unit time even if their degrees are
not as high as those of less fit web pages. To model this, whenever the generating
module 202 creates a new vertex v, the random number generator 208 will generate two
random numbers λ(v) and µ(v) from some specified distributions Din and Dout, respectively,
independently of each other and of all earlier choices. Then steps (A), (B) and (C) of
[0043] will be modified as follows. In step (A), the existing vertex w will be chosen
according to λ(w)(din + δin), so that

$$\Pr(w = w_i) \propto \lambda(w_i)\bigl(d_{\mathrm{in}}(w_i) + \delta_{\mathrm{in}}\bigr).$$

In step (B), the existing vertex v will be chosen according to µ(v)(dout + δout), and the
existing vertex w will be chosen according to λ(w)(din + δin), so that

$$\Pr(v = v_i,\, w = w_j) \propto \mu(v_i)\,\lambda(w_j)\bigl(d_{\mathrm{out}}(v_i) + \delta_{\mathrm{out}}\bigr)\bigl(d_{\mathrm{in}}(w_j) + \delta_{\mathrm{in}}\bigr).$$

In step (C), the existing vertex w will be chosen according to µ(w)(dout + δout), so that

$$\Pr(w = w_i) \propto \mu(w_i)\bigl(d_{\mathrm{out}}(w_i) + \delta_{\mathrm{out}}\bigr).$$
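The fitness-weighted choice can be sketched by scaling each shifted degree by the vertex's fitness before selecting. The function names and the sample fitness values below are hypothetical; in the model the fitnesses would be drawn from the specified distributions when each vertex is created.

```python
import random


def fitness_weights(degrees, fitness, delta):
    """Weight of vertex k is fitness[k] * (degrees[k] + delta)."""
    return [f * (d + delta) for d, f in zip(degrees, fitness)]


def choose_vertex(weights, rng):
    """Pick an index with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


# Hypothetical example for step (A): two existing vertices.
in_degrees = [5, 1]        # the first vertex is better connected ...
lam = [1.0, 10.0]          # ... but the second one is assumed to be much fitter
weights = fitness_weights(in_degrees, lam, delta=1.0)

rng = random.Random(7)     # seeded only for repeatability
w = choose_vertex(weights, rng)
```

Here the weights are 6 and 20, so the fitter vertex is favored even though its in-degree is lower, which is the behavior the paragraph describes.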
An Exemplary Procedure
[0049] Fig. 4 shows an exemplary procedure 400 to generate directed scale-free object relationships.
For the purposes of discussion, these procedural operations are described in reference
to program module and data features of Figs. 1 and 2. At block 402, the generating
module 202 configures numerical probabilities α, β, γ, and configurable in-degree
and out-degree shift constants δin and δout. At block 404, the generating module 202
generates random numbers 206 to select successive steps (A), (B), or (C) over time to
generate the directed scale-free object relationships as a graph. Further random
selection of vertices to/from which directed edges are added uses preferential
attachment, i.e., selection according to in/out-degree respectively, as described in
(A), (B) and (C) of [0043].
Conclusion
[0050] The described systems and methods generate directed scale-free object relationships.
Although the systems and methods have been described in language specific to structural
features and methodological operations, the subject matter as defined in the appended
claims is not necessarily limited to the specific features or operations described.
Rather, the specific features and operations are disclosed as exemplary forms of implementing
the claimed subject matter. For instance, the described systems 100 and methods 400,
besides being applicable to generation of a directed scale-free model of the web (a
web graph) or some portion thereof, can also be used to generate customized models of
many other naturally occurring (man-made and otherwise) physical and abstract object
relationships.
References
[0051]
[1] W. Aiello, F. Chung and L. Lu, A random graph model for power law graphs, Experiment.
Math. 10 (2001), 53-66.
[2] R. Albert and A.-L. Barabási, Statistical mechanics of complex networks, arXiv:cond-mat/0106096
(2001).
[3] R. Albert, H. Jeong and A.-L. Barabási, Diameter of the world-wide web, Nature
401 (1999), 130-131.
[4] K. Azuma, Weighted sums of certain dependent variables, Tôhoku Math. J. 3 (1967), 357-367.
[5] A.-L. Barabási and R. Albert, Emergence of scaling in random networks, Science 286 (1999), 509-512.
[6] A.-L. Barabási, R. Albert and H. Jeong, Mean-field theory for scale-free random networks, Physica A 272 (1999), 173-187.
[7] A.-L. Barabási, R. Albert and H. Jeong, Scale-free characteristics of random networks:
the topology of the world-wide web, Physica A 281 (2000), 69-77.
[8] G. Bianconi and A.-L. Barabási, Competition and multiscaling in evolving networks,
cond-mat/0011029.
[9] B. Bollobás, Random Graphs, Second Edition, Cambridge studies in advanced mathematics, vol. 73, Cambridge University Press, Cambridge,
2001, xvi + 498 pp.
[10] B. Bollobás, Martingales, isoperimetric inequalities and random graphs. In Combinatorics (Eger, 1987), 113-139, Colloq. Math. Soc. János Bolyai, 52, North-Holland, Amsterdam,
1988.
[11] B. Bollobás and O.M. Riordan, The diameter of a scale-free random graph, submitted
for publication.
[12] B. Bollobás, O.M. Riordan, J. Spencer, and C. Tusnády, The degree sequence of
a scale-free random graph process, Random Structures and Algorithms 18 (2001), 279-290.
[13] A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins
and J. Wiener, Graph structure in the web, Proc 9th WWW Conf. 309-320 (2000).
[14] C. Cooper and A. Frieze, A general model of web graphs, preprint.
[15] S.N. Dorogovtsev and J.F.F. Mendes, Evolution of random networks, preprint.
[16] P. Erdős and A. Rényi, On random graphs. I, Publ. Math. Debrecen 6 (1959), 290-297.
[17] P. Erdős and A. Rényi, On the evolution of random graphs, Magyar Tud. Akad. Mat. Kutató Int. Közl. 5 (1960), 17-61.
[18] M. Faloutsos, P. Faloutsos and C. Faloutsos, On power-law relationships of the
internet topology, SIGCOMM 1999, Comput. Commun. Rev. 29 (1999), 251.
[19] E.N. Gilbert, Random graphs, Ann. Math. Statist. 30 (1959), 1141-1144.
[20] W. Hoeffding, Probability inequalities for sums of bounded random variables,
J. Amer. Statist. Assoc. 58 (1963), 13-30.
[21] H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai and A.-L. Barabási, The large-scale
organization of metabolic networks, Nature 407 (2000), 651-654.
[22] J. Kleinberg, R. Kumar, P. Raghavan, S. Rajagopalan and A. Tomkins, The web as
a graph: measurements, models, and methods, COCOON 1999.
[23] R. Kumar, P. Raghavan, S. Rajagopalan and A. Tomkins, Extracting large scale
knowledge bases from the web, VLDB 1999.
[24] R. Kumar, P. Raghavan, S. Rajagopalan, D. Sivakumar, A. Tomkins and E. Upfal,
Stochastic models for the web graph, FOCS 2000.
[25] M.E.J. Newman, The structure of scientific collaboration networks, Proc. Natl. Acad. Sci. USA 98 (2001), 404-409.
[26] M.E.J. Newman, S.H. Strogatz and D.J. Watts, Random graphs with arbitrary degree
distributions and their applications, Phys. Rev. E 64, 026118 (2001).
[27] D. Osthus and G. Buckley, Popularity based random graph models leading to a scale-free
degree distribution, preprint.
[28] D.J. Watts and S.H. Strogatz, Collective dynamics of 'small-world' networks,
Nature 393 (1998), 440-442.
Appendix A
Appendix B
1. A computer-readable medium comprising computer-program instructions executable by
a processor for modeling directed scale-free object relationships, the computer-program
instructions comprising instructions for:
generating a sequence of random numbers; and
successively selecting individual ones of the random numbers over time to generate
models of directed scale-free object relationships in a graph, with graph development
depending on both in-degrees and out-degrees.
2. A computer-readable medium as recited in claim 1, wherein the graph is a web graph
comprising nodes and directed edges between respective ones of the nodes, the nodes
corresponding to web pages and the directed edges corresponding to hyperlinks from
one web page to another web page.
3. A computer-readable medium as recited in claim 1, wherein the computer-program instructions
further comprise instructions for successively using the random numbers to update
the graph by:
(A) adding an edge between a new object and an old object;
(B) adding an edge between two old objects; or
(C) adding an edge from an old object to a new object,
according to configurable parameters α, β and γ.
4. A computer-readable medium as recited in claim 1, wherein the computer-program instructions
further comprise instructions for adding new edges to the graph as a function of directed
preferential attachment.
5. A computer-readable medium as recited in claim 1, wherein the computer-program instructions
further comprise instructions for generating the graph as a function of in-degree
and/or out-degree shifts δin and/or δout.
6. A computer-readable medium as recited in claim 1, wherein the computer-program instructions
further comprise instructions for modeling the graph as a function of a measured environmental
characteristic based on a set of configurable parameters α, β, γ, δin and δout.
7. A computer-readable medium as recited in claim 1, wherein an in-degree power law associated
with an object represented by the graph is different from an out-degree power law
associated with the object.
8. A computer-readable medium as recited in claim 1, wherein an in-degree power law associated
with an object represented by the graph is different from an out-degree power law
associated with the object such that for a generator with parameters α, β, γ, δin and δout,
a proportion of vertices with in-degree equal to din asymptotically scales as follows:

with

and a proportion of vertices with out-degree equal to dout asymptotically scales as

with
9. A computer-readable medium as recited in claim 3, wherein the computer-program instructions
further comprise instructions based on (A) for updating the graph by adding an edge
from a new object v to a random old object w chosen according to a probability
distribution with Pr(w = wj) ∝ (din(wj) + δin).
10. A computer-readable medium as recited in claim 3, wherein the computer-program instructions
further comprise instructions based on (B) for updating the graph by adding an edge from
a first existing object v of the graph to a second existing object w, and wherein objects
v and w are chosen according to a probability distribution with Pr(v = vi, w = wj) ∝ (dout(vi) + δout)(din(wj) + δin).
11. A computer-readable medium as recited in claim 3, wherein the computer-program instructions
further comprise instructions based on (C) for updating the graph by adding an edge
from a randomly chosen old object w to a new object v, where
w is chosen according to a probability distribution with Pr(w = wi) ∝ (dout(wi) + δout).
12. A computer-readable medium as recited in claim 3, wherein the computer-program instructions
based on (A) further comprise instructions for adding an edge E(i,j) from a new object vi
to an old object wj by:
dividing the interval [0, t + nδin] into n slots of width din(wj) + δin;
selecting a random number rin uniformly from the interval [0, t + nδin]; and
selecting the old object wj if the random number rin falls into a jth slot.
13. A computer-readable medium as recited in claim 3, wherein the computer-program instructions
based on (B) further comprise instructions for adding an edge E(i,j) from an old object vi
to an old object wj by:
dividing the interval [0, t + nδout] into n slots of width dout(vi) + δout;
selecting a random number rout uniformly from the interval [0, t + nδout];
selecting the old object vi if the random number rout falls into an ith slot;
dividing the interval [0, t + nδin] into n slots of width din(wj) + δin;
selecting a random number rin uniformly from the interval [0, t + nδin]; and
selecting the old object wj if the random number rin falls into a jth slot.
14. A computer-readable medium as recited in claim 3, wherein the computer-program instructions
based on (C) further comprise instructions for adding an edge E(i,j) from an old object wi
to a new object vj by:
dividing the interval [0, t + nδout] into n slots of width dout(wi) + δout;
selecting a random number rout uniformly from the interval [0, t + nδout]; and
selecting the old object wi if the random number rout falls into an ith slot.
15. A computer-readable medium as recited in claim 1, wherein the computer-program instructions
further comprise instructions for:
independently generating two random numbers λ(v) and µ(v) from specified distributions Din and Dout for a new vertex v of the graph; and
utilizing the random numbers to update vertices of the graph as follows:
(A) choosing an existing vertex w according to λ(w)(din + δin) such that Pr(w = wj) ∝ λ(wj)(din(wj) + δin);
(B) choosing an existing vertex v according to µ(v)(dout + δout) and a second existing vertex w according to λ(w)(din + δin), so that Pr(v = vi, w = wj) ∝ µ(vi)λ(wj)(dout(vi) + δout)(din(wj) + δin); or
(C) selecting an existing vertex w according to µ(w)(dout + δout) such that Pr(w = wi) ∝ µ(wi)(dout(wi) + δout).
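By way of illustration only, the slot-based selection recited in claims 12 to 14 might be sketched in Python as follows. The function name pick_by_slots and its arguments are hypothetical and do not appear in the specification. The sketch assumes that the supplied degrees sum to t, so that the n slots exactly cover the interval [0, t + nδ], and that the shift δ is non-negative.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def pick_by_slots(degrees, delta, rng):
    """Return an index j chosen with probability proportional to degrees[j] + delta.

    The interval [0, sum(degrees) + n*delta] is divided into n consecutive
    slots, slot j having width degrees[j] + delta. A uniform random number is
    drawn from the interval and the slot it falls into selects the object.
    """
    if not degrees:
        raise ValueError("there must be at least one existing object")
    # Right-hand endpoint of each slot.
    ends = list(accumulate(d + delta for d in degrees))
    total = ends[-1]
    if total <= 0:
        raise ValueError("at least one slot must have positive width")
    r = rng.random() * total  # uniform on [0, total)
    # First slot whose right endpoint lies strictly beyond r;
    # zero-width slots are skipped automatically.
    j = bisect_right(ends, r)
    return min(j, len(degrees) - 1)  # guard against floating-point round-off
```

For example, with in-degrees [1, 3] and δin = 1 the two slots have widths 2 and 4, so the second object is selected with probability 2/3.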
16. A method to generate models of directed scale-free object relationships, the method
comprising:
generating a sequence of random numbers; and
successively selecting individual ones of the random numbers over time to generate
models of directed scale-free object relationships in a graph, with the development
of the graph depending on both in-degrees and out-degrees.
17. A method as recited in claim 16, wherein the graph is a web graph comprising nodes
and directed edges between respective ones of the nodes, the nodes corresponding to
web pages and the directed edges corresponding to hyperlinks from one web page to
another web page.
18. A method as recited in claim 16, wherein the method further comprises successively
using the random numbers to update the graph by:
(A) adding an edge between a new object and an old object;
(B) adding an edge between two old objects; or
(C) adding an edge from an old object to a new object,
according to configurable parameters α, β and γ.
19. A method as recited in claim 16, wherein the method further comprises adding new edges
to the graph as a function of directed preferential attachment.
20. A method as recited in claim 16, wherein the method further comprises generating the
graph as a function of in-degree and/or out-degree shifts δin and/or δout.
21. A method as recited in claim 16, wherein the method further comprises modeling the
graph as a function of a measured environmental characteristic that is based on a
set of configurable parameters α, β, γ, δin and δout.
22. A method as recited in claim 16, wherein an in-degree power law associated with an
object represented by the graph is different from an out-degree power law associated
with the object.
23. A method as recited in claim 16, wherein an in-degree power law associated with an
object represented by the graph is different from an out-degree power law associated
with the object such that for a generator with parameters α, β, γ, δin and δout,
a proportion of vertices with in-degree equal to din asymptotically scales as follows:

with

and a proportion of vertices with out-degree equal to dout asymptotically scales as

with
24. A method as recited in claim 18, wherein (A) further comprises updating the graph
by adding an edge from a new object v to a random old object w chosen according to
a probability distribution with Pr(w = wj) ∝ (din(wj) + δin).
25. A method as recited in claim 18, wherein (B) further comprises updating the graph
by adding an edge from a first existing object v of the graph to a second existing object w,
where the objects v and w are chosen according to a probability distribution with Pr(v = vi, w = wj) ∝ (dout(vi) + δout)(din(wj) + δin).
26. A method as recited in claim 18, wherein (C) further comprises updating the graph
by adding an edge from a randomly chosen old object w to a new object v, where
w is chosen according to a probability distribution with Pr(w = wi) ∝ (dout(wi) + δout).
27. A method as recited in claim 24, wherein (A) further comprises adding an edge E(i,j)
from a new object vi to an old object wj by:
dividing the interval [0, t + nδin] into n slots of width din(wj) + δin;
selecting a random number rin uniformly from the interval [0, t + nδin]; and
selecting the old object wj if the random number rin falls into a jth slot.
28. A method as recited in claim 25, wherein (B) further comprises adding an edge E(i,j)
from an old object vi to a second old object wj by:
dividing the interval [0, t + nδout] into n slots of width dout(vi) + δout;
selecting a random number rout uniformly from the interval [0, t + nδout];
selecting the old object vi if the random number rout falls into an ith slot;
dividing the interval [0, t + nδin] into n slots of width din(wj) + δin;
selecting a random number rin uniformly from the interval [0, t + nδin]; and
selecting the old object wj if the random number rin falls into a jth slot.
29. A method as recited in claim 26, wherein (C) further comprises adding an edge E(i,j)
from an old object wi to a new object vj by:
dividing the interval [0, t + nδout] into n slots of width dout(wi) + δout;
selecting a random number rout uniformly from the interval [0, t + nδout]; and
selecting the old object wi if the random number rout falls into an ith slot.
30. A method as recited in claim 16, wherein the method further comprises:
independently generating two random numbers λ(v) and µ(v) from specified distributions Din and Dout for a new vertex v of the graph; and
utilizing the random numbers to update vertices of the graph as follows:
(A) choosing an existing vertex w according to λ(w)(din + δin) such that Pr(w = wj) ∝ λ(wj)(din(wj) + δin);
(B) choosing an existing vertex v according to µ(v)(dout + δout) and a second existing vertex w according to λ(w)(din + δin), so that Pr(v = vi, w = wj) ∝ µ(vi)λ(wj)(dout(vi) + δout)(din(wj) + δin); or
(C) selecting an existing vertex w according to µ(w)(dout + δout) such that Pr(w = wi) ∝ µ(wi)(dout(wi) + δout).
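As a non-limiting sketch of the method of claims 16 and 18, the following Python function grows a graph by one edge per step, choosing among updates (A), (B) and (C) with probabilities α, β and γ. Several points are assumptions made for the example and are not taken from the claims: α + β + γ = 1, the shifts are non-negative, the initial graph is a single vertex with a self-loop (so that every selection weight is well defined even when a shift is zero), and the names grow and _pick are hypothetical.

```python
import random


def _pick(degrees, delta, rng):
    """Index j chosen with probability proportional to degrees[j] + delta."""
    weights = [d + delta for d in degrees]
    return rng.choices(range(len(degrees)), weights=weights, k=1)[0]


def grow(steps, alpha, beta, gamma, delta_in, delta_out, seed=0):
    """Return (edges, din, dout) after `steps` edge additions."""
    if min(alpha, beta, gamma) < 0 or abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("alpha, beta and gamma must be non-negative and sum to 1")
    rng = random.Random(seed)
    # Assumed initial graph: vertex 0 carrying a single self-loop.
    din, dout, edges = [1], [1], [(0, 0)]
    for _ in range(steps):
        u = rng.random()
        if u < alpha:
            # (A) new vertex v, edge v -> old vertex w chosen by in-degree.
            w = _pick(din, delta_in, rng)
            din.append(0)
            dout.append(0)
            v = len(din) - 1
        elif u < alpha + beta:
            # (B) edge between two old vertices: v by out-degree, w by in-degree.
            v = _pick(dout, delta_out, rng)
            w = _pick(din, delta_in, rng)
        else:
            # (C) old vertex v chosen by out-degree, edge v -> new vertex w.
            v = _pick(dout, delta_out, rng)
            din.append(0)
            dout.append(0)
            w = len(din) - 1
        dout[v] += 1
        din[w] += 1
        edges.append((v, w))
    return edges, din, dout
```

Each selection here costs time linear in the number of vertices, which keeps the sketch short; a practical generator would likely maintain the slot boundaries incrementally.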
31. A computing device for generating models of directed scale-free object relationships,
the computing device comprising:
a processor; and
a memory coupled to the processor, the memory comprising computer-program instructions
executable by the processor for:
generating a sequence of random numbers; and
successively selecting individual ones of the random numbers over time to generate
models of directed scale-free object relationships in a graph, with the development
of the graph depending on both in-degrees and out-degrees.
32. A computing device as recited in claim 31, wherein the graph is a web graph comprising
nodes and directed edges between respective ones of the nodes, the nodes corresponding
to web pages and the directed edges corresponding to hyperlinks from one web page
to another web page.
33. A computing device as recited in claim 31, wherein the computer-program instructions
further comprise instructions for successively using the random numbers to update
the graph by:
(A) adding an edge between a new object and an old object;
(B) adding an edge between two old objects; or
(C) adding an edge from an old object to a new object,
according to configurable parameters α, β and γ.
34. A computing device as recited in claim 31, wherein the computer-program instructions
further comprise instructions for adding new edges to the graph as a function of directed
preferential attachment.
35. A computing device as recited in claim 31, wherein the computer-program instructions
further comprise instructions for generating the graph as a function of in-degree
and/or out-degree shifts δin and/or δout.
36. A computing device as recited in claim 31, wherein the computer-program instructions
further comprise instructions for modeling the graph as a function of a measured environmental
characteristic that is based on a set of configurable parameters α, β, γ, δin and δout.
37. A computing device as recited in claim 31, wherein an in-degree power law associated
with an object represented by the graph is different from an out-degree power law
associated with the object.
38. A computing device as recited in claim 31, wherein an in-degree power law associated
with an object represented by the graph is different from an out-degree power law
associated with the object such that for a generator with parameters α, β, γ, δin and δout,
a proportion of vertices with in-degree equal to din asymptotically scales as follows:

with

and a proportion of vertices with out-degree equal to dout asymptotically scales as

with
39. A computing device as recited in claim 33, wherein the computer-program instructions
further comprise instructions based on (A) for updating the graph by adding an edge
from a new object v to a random old object w chosen according to a probability
distribution with Pr(w = wj) ∝ (din(wj) + δin).
40. A computing device as recited in claim 33, wherein the computer-program instructions
further comprise instructions based on (B) for updating the graph by adding an edge from
a first existing object v of the graph to a second existing object w, where the objects
v and w are chosen according to a probability distribution with Pr(v = vi, w = wj) ∝ (dout(vi) + δout)(din(wj) + δin).
41. A computing device as recited in claim 33, wherein the computer-program instructions
further comprise instructions based on (C) for updating the graph by adding an edge
from a randomly chosen old object w to a new object v, where
w is chosen according to a probability distribution with Pr(w = wi) ∝ (dout(wi) + δout).
42. A computing device as recited in claims 33 and 39, wherein the computer-program instructions
based on (A) further comprise instructions for adding an edge E(i,j) from a new object vi
to an old object wj by:
dividing the interval [0, t + nδin] into n slots of width din(wj) + δin;
selecting a random number rin uniformly from the interval [0, t + nδin]; and
selecting the old object wj if the random number rin falls into a jth slot.
43. A computing device as recited in claims 33 and 40, wherein the computer-program instructions
based on (B) further comprise instructions for adding an edge E(i,j) from an old object vi
to a second old object wj by:
dividing the interval [0, t + nδout] into n slots of width dout(vi) + δout;
selecting a random number rout uniformly from the interval [0, t + nδout];
selecting the old object vi if the random number rout falls into an ith slot;
dividing the interval [0, t + nδin] into n slots of width din(wj) + δin;
selecting a random number rin uniformly from the interval [0, t + nδin]; and
selecting the old object wj if the random number rin falls into a jth slot.
44. A computing device as recited in claims 33 and 41, wherein the computer-program instructions
based on (C) further comprise instructions for adding an edge E(i,j) from an old object wi
to a new object vj by:
dividing the interval [0, t + nδout] into n slots of width dout(wi) + δout;
selecting a random number rout uniformly from the interval [0, t + nδout]; and
selecting the old object wi if the random number rout falls into an ith slot.
45. A computing device as recited in claim 31, wherein the computer-program instructions
further comprise instructions for:
independently generating two random numbers λ(v) and µ(v) from specified distributions Din and Dout for a new vertex v of the graph; and
utilizing the random numbers to update vertices of the graph as follows:
(A) choosing an existing vertex w according to λ(w)(din + δin) such that Pr(w = wj) ∝ λ(wj)(din(wj) + δin);
(B) choosing an existing vertex v according to µ(v)(dout + δout) and a second existing vertex w according to λ(w)(din + δin), so that Pr(v = vi, w = wj) ∝ µ(vi)λ(wj)(dout(vi) + δout)(din(wj) + δin); or
(C) selecting an existing vertex w according to µ(w)(dout + δout) such that Pr(w = wi) ∝ µ(wi)(dout(wi) + δout).
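The fitness-weighted selection recited in claims 15, 30 and 45 might be sketched as follows, purely as an illustration. The class FitnessGraph, its field and method names, and the use of caller-supplied sampling functions standing in for the distributions Din and Dout are all assumptions introduced for the example. The shifts and fitness values are assumed non-negative, with at least one positive selection weight.

```python
import random
from dataclasses import dataclass, field


@dataclass
class FitnessGraph:
    delta_in: float
    delta_out: float
    lam: list = field(default_factory=list)   # λ(v): in-fitness of each vertex
    mu: list = field(default_factory=list)    # µ(v): out-fitness of each vertex
    din: list = field(default_factory=list)   # in-degrees
    dout: list = field(default_factory=list)  # out-degrees

    def add_vertex(self, rng, draw_in, draw_out):
        """Create a vertex, independently drawing λ(v) and µ(v)."""
        self.lam.append(draw_in(rng))    # sample standing in for Din
        self.mu.append(draw_out(rng))    # sample standing in for Dout
        self.din.append(0)
        self.dout.append(0)
        return len(self.din) - 1

    def pick_target(self, rng):
        """Pr(w = wj) ∝ λ(wj)(din(wj) + δin)."""
        weights = [l * (d + self.delta_in) for l, d in zip(self.lam, self.din)]
        return rng.choices(range(len(weights)), weights=weights, k=1)[0]

    def pick_source(self, rng):
        """Pr(v = vi) ∝ µ(vi)(dout(vi) + δout)."""
        weights = [m * (d + self.delta_out) for m, d in zip(self.mu, self.dout)]
        return rng.choices(range(len(weights)), weights=weights, k=1)[0]

    def add_edge(self, v, w):
        """Record a directed edge v -> w by updating both degree counts."""
        self.dout[v] += 1
        self.din[w] += 1
```

For update (B), calling pick_source and pick_target independently yields the product distribution Pr(v = vi, w = wj) ∝ µ(vi)λ(wj)(dout(vi) + δout)(din(wj) + δin).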
46. A computing device for generating models of directed scale-free object relationships,
the computing device comprising:
means for generating a sequence of random numbers; and
means for successively selecting individual ones of the random numbers over time to
generate models of directed scale-free object relationships in a graph, with the development
of the graph depending on both in-degrees and out-degrees.
47. A computing device as recited in claim 46, wherein the graph is a web graph comprising
nodes and directed edges between respective ones of the nodes, the nodes corresponding
to web pages and the directed edges corresponding to hyperlinks from one web page
to another web page.
48. A computing device as recited in claim 46, and further comprising means for successively
using the random numbers to update the graph by:
(A) adding an edge between a new object and an old object;
(B) adding an edge between two old objects; or
(C) adding an edge from an old object to a new object,
according to configurable parameters α, β and γ.
49. A computing device as recited in claim 46, and further comprising means for adding
new edges to the graph as a function of directed preferential attachment.
50. A computing device as recited in claim 46, and further comprising means for generating
the graph as a function of in-degree and/or out-degree shifts δin and/or δout.
51. A computing device as recited in claim 46, and further comprising means for modeling
the graph as a function of a measured environmental characteristic that is based on
a set of configurable parameters α, β, γ, δin and δout.
52. A computing device as recited in claim 46, wherein an in-degree power law associated
with an object represented by the graph is different from an out-degree power law
associated with the object.
53. A computing device as recited in claim 46, wherein an in-degree power law associated
with an object represented by the graph is different from an out-degree power law
associated with the object such that for a generator with parameters α, β, γ, δin and δout,
a proportion of vertices with in-degree equal to din asymptotically scales as follows:

with

and a proportion of vertices with out-degree equal to dout asymptotically scales as

with
54. A computing device as recited in claim 48, and further comprising means based on (A)
for updating the graph by adding an edge from a new object v to a random old object
w chosen according to a probability distribution with Pr(w = wj) ∝ (din(wj) + δin).
55. A computing device as recited in claim 48, and further comprising means based on (B)
for updating the graph by adding an edge from a first existing object v of the graph
to a second existing object w, where the objects v and w are chosen according to a
probability distribution with Pr(v = vi, w = wj) ∝ (dout(vi) + δout)(din(wj) + δin).
56. A computing device as recited in claim 48, and further comprising means based on (C)
for updating the graph by adding an edge from a randomly chosen old object w to a new
object v, where w is chosen according to a probability distribution with Pr(w = wi) ∝ (dout(wi) + δout).
57. A computing device as recited in claim 48, and further comprising means based on (A)
for adding an edge E(i,j) from a new object vi to an old object wj by:
dividing the interval [0, t + nδin] into n slots of width din(wj) + δin;
selecting a random number rin uniformly from the interval [0, t + nδin]; and
selecting the old object wj if the random number rin falls into a jth slot.
58. A computing device as recited in claim 48, and further comprising means based on (B)
for adding an edge E(i,j) from an old object vi to a second old object wj by:
dividing the interval [0, t + nδout] into n slots of width dout(vi) + δout;
selecting a random number rout uniformly from the interval [0, t + nδout];
selecting the old object vi if the random number rout falls into an ith slot;
dividing the interval [0, t + nδin] into n slots of width din(wj) + δin;
selecting a random number rin uniformly from the interval [0, t + nδin]; and
selecting the old object wj if the random number rin falls into a jth slot.
59. A computing device as recited in claim 48, and further comprising means based on (C)
for adding an edge E(i,j) from an old object wi to a new object vj by:
dividing the interval [0, t + nδout] into n slots of width dout(wi) + δout;
selecting a random number rout uniformly from the interval [0, t + nδout]; and
selecting the old object wi if the random number rout falls into an ith slot.
60. A computing device as recited in claim 46, and further comprising means for:
independently generating two random numbers λ(v) and µ(v) from specified distributions Din and Dout for a new vertex v of the graph; and
utilizing the random numbers to update vertices of the graph as follows:
(A) choosing an existing vertex w according to λ(w)(din + δin) such that Pr(w = wj) ∝ λ(wj)(din(wj) + δin);
(B) choosing an existing vertex v according to µ(v)(dout + δout) and a second existing vertex w according to λ(w)(din + δin), so that Pr(v = vi, w = wj) ∝ µ(vi)λ(wj)(dout(vi) + δout)(din(wj) + δin); or
(C) selecting an existing vertex w according to µ(w)(dout + δout) such that Pr(w = wi) ∝ µ(wi)(dout(wi) + δout).
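The in-degree and out-degree power laws referred to in claims 8, 23, 38 and 53 can be motivated by a short heuristic calculation, given here as a sketch under stated assumptions rather than as a restatement of the claimed expressions. It assumes α + β + γ = 1, so that after t steps the graph has t edges and approximately n ≈ (α + γ)t vertices, and it uses the standard heuristic that a degree growing at rate c(d + δ)/t leads to a power law with exponent 1 + 1/c. The symbols c_in, c_out, X_IN and X_OUT are introduced here for illustration only.

```latex
\begin{align*}
% In-degree grows only in updates (A) and (B), which occur with probability alpha + beta.
\Pr[\,d_{in}(w) \text{ increases at step } t\,]
  &= (\alpha+\beta)\,\frac{d_{in}(w)+\delta_{in}}{t+n\,\delta_{in}}
   \approx \frac{c_{in}\,\bigl(d_{in}(w)+\delta_{in}\bigr)}{t},
  & c_{in} &= \frac{\alpha+\beta}{1+\delta_{in}(\alpha+\gamma)}, \\
% Out-degree grows only in updates (B) and (C), which occur with probability beta + gamma.
\Pr[\,d_{out}(v) \text{ increases at step } t\,]
  &= (\beta+\gamma)\,\frac{d_{out}(v)+\delta_{out}}{t+n\,\delta_{out}}
   \approx \frac{c_{out}\,\bigl(d_{out}(v)+\delta_{out}\bigr)}{t},
  & c_{out} &= \frac{\beta+\gamma}{1+\delta_{out}(\alpha+\gamma)}, \\
% Resulting exponents under the heuristic X = 1 + 1/c.
X_{IN} &= 1+\frac{1}{c_{in}} = 1+\frac{1+\delta_{in}(\alpha+\gamma)}{\alpha+\beta},
  & X_{OUT} &= 1+\frac{1}{c_{out}} = 1+\frac{1+\delta_{out}(\alpha+\gamma)}{\beta+\gamma}.
\end{align*}
```

Under these assumptions the proportion of vertices with in-degree din would scale as din^(-X_IN) and the proportion with out-degree dout as dout^(-X_OUT). The two exponents generally differ, since X_IN depends on α + β and δin while X_OUT depends on β + γ and δout.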