Statistical Methods for Social Networks
Tom Snijders

Phillip Bonacich - Paulette Lloyd, Department of Sociology
Eigenvector-like Measures of Centrality for Asymmetric Relations

Eigenvectors of adjacency matrices are useful as measures of centrality or of status. However, they are misapplied to asymmetric networks in which some positions are unchosen. For these networks an alternative measure of centrality is suggested that equals an eigenvector when eigenvectors can be used and provides meaningfully comparable results when they cannot.
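
As an illustration of the problem described above, the sketch below uses alpha-centrality, which solves x = alpha * A^T x + e. The abstract does not name its measure, so treating alpha-centrality as the measure meant here is an assumption, as are the function name, the value of alpha, and the four-node example network. In a chain where node 0 is unchosen, every eigenvalue of the adjacency matrix is zero, so the eigenvector gives no usable ranking, while the alternative still yields comparable scores.

```python
import numpy as np

def alpha_centrality(adj, alpha, exog=None):
    """Solve x = alpha * A^T x + e for x (hypothetical helper, not from the abstract).

    adj[i, j] = 1 means that i chooses j; e defaults to a vector of ones.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    e = np.ones(n) if exog is None else np.asarray(exog, dtype=float)
    return np.linalg.solve(np.eye(n) - alpha * adj.T, e)

# Hypothetical chain: 0 -> 1 -> 2 -> 3, so node 0 receives no choices.
chain = np.array([
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
])

# All eigenvalues of this matrix are zero, so eigenvector centrality is uninformative.
eigenvalues = np.linalg.eigvals(chain.T.astype(float))

# Status accumulates along the chain instead of collapsing to zero.
scores = alpha_centrality(chain, alpha=0.5)
```

With alpha = 0.5 the scores increase along the chain, which matches the intuition that being chosen by a chosen actor confers more status than being chosen by an unchosen one.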

John P. Boyd - University of California at Irvine
Probability Distributions for Popularity and Expansiveness: Social Process Versus Personal Attributes

In a directed graph, the relative frequencies of in-degrees and out-degrees are known as popularity and expansiveness, respectively. Unfortunately, some of the more familiar discrete distributions, such as the binomial, Poisson, and negative binomial, have to be rejected by the Ord criterion: the ratio of the second central moment to the first, and the ratio of the third to the second. However, a study of probability distributions that do fit these marginal distributions can shed light on the social process that formed the links. For example, one of the Pólya urn sampling schemes is to replace each ball sampled with c balls of a similar color, producing a negative hypergeometric distribution. When c is positive, both colors are contagious. In the context of friendship choices, this can lead to a runaway popularity effect. However, the contagion model has, by Gurland's theorem, a dual genesis as a mixture (or stopped-sum) of distributions. We suggest experimental ways to distinguish these two ways of generating the same distribution.
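
The contagion effect of the urn scheme can be seen in a small simulation. The sketch below assumes the common convention in which a drawn ball is returned together with c additional balls of the same color, so that c = 0 is plain sampling with replacement. The function name, urn composition, and run counts are illustrative choices, not taken from the abstract.

```python
import numpy as np

def polya_urn_counts(white, black, c, n_draws, n_runs, rng):
    """Number of white draws in each of n_runs independent Polya urn runs.

    After each draw the ball is returned with c extra balls of its color
    (assumed convention; c = 0 gives ordinary binomial sampling).
    """
    counts = np.empty(n_runs, dtype=int)
    for r in range(n_runs):
        w, b, hits = white, black, 0
        for _ in range(n_draws):
            if rng.random() < w / (w + b):
                w += c
                hits += 1
            else:
                b += c
        counts[r] = hits
    return counts

rng = np.random.default_rng(0)
contagious = polya_urn_counts(2, 2, c=1, n_draws=10, n_runs=5000, rng=rng)
independent = polya_urn_counts(2, 2, c=0, n_draws=10, n_runs=5000, rng=rng)

# Both schemes have the same mean, but contagion spreads the counts out.
variance_ratio = contagious.var() / independent.var()
```

The two schemes share a mean of 5 white draws, but the contagious one has a much larger variance, which is the kind of overdispersion that a binomial model of popularity cannot reproduce.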

Ove Frank - Department of Statistics, Stockholm University
Bayesian Approaches to Social Network Modeling


Bayesian statistics derives its models from exchangeability assumptions and other invariance principles that apply to the observed data. Social network data present special opportunities for Bayesian approaches. Both measurement models and sampling models for networks could be approached by Bayesian methods. Some examples are discussed in order to demonstrate the potential of this approach and illustrate some alternatives to Holland-Leinhardt models and other log-linear models for social networks. The discussion also provides a way to handle the difficulties involved in evaluating models for which the number of parameters increases with the number of nodes in the network.
Key words. Network sampling. Bayesian statistical modeling. Social network data.

Ove Frank(1) - Michael Capobianco(2) - (1) Department of Statistics, Stockholm University, (2) St. John's University
The Exploratory Statistical Analysis of Networks: Fixed Choice Scheme

O. Frank and M. F. Capobianco initiated the study of statistical inference in networks in 1969-70. More recently, Capobianco has devoted attention mainly to exploratory analysis. Here, rather than being interested in a specific property of the net, such as its size or connectedness, we consider only the possibility of learning something about its structure, e.g., is it clustered or widely separated?
We studied, among other more complicated problems, two "choice schemes", namely the Fixed ("name your three best friends") and the Variable ("name all your friends"). This paper deals only with the former. It was found that just 10 configurations are possible between any pair of sampled points, and that the distribution of these in the sample yields information about the structure of the population network.

Jan Hagberg - Stockholm University, Department of Statistics
Centrality Testing and the Distribution of the Degree Variance in Bernoulli Graphs

Exact and asymptotic distributions of the degree variance are investigated for Bernoulli graphs and uniform random graphs. In particular, the range of values of the degree variance and its maximum value are considered. We show that the degree variance is approximately gamma distributed with parameters obtained from the first two moments of the degree variance.
Since centrality of a graph can be interpreted as a measure of its heterogeneity in terms of vertex degrees, we can perform a centrality test with a critical value obtained from the gamma distribution.

Key words: Centrality Testing, Bernoulli Graphs, Degree Variance, Gamma Approximation, Uniform Random Graphs
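
A gamma-based centrality test of this kind can be sketched as follows. The abstract obtains the gamma parameters from the first two moments of the degree variance. This sketch instead estimates those two moments by simulating Bernoulli graphs, which is an assumption made for brevity, as are all function names, the graph size, and the edge probability.

```python
import numpy as np

def degree_variance(adj):
    """Population variance of the vertex degrees of an undirected graph."""
    return adj.sum(axis=1).var()

def sample_bernoulli_graph(n, p, rng):
    """Symmetric 0/1 adjacency matrix with independent edges of probability p."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(int)

def gamma_critical_value(n, p, level, n_sim, rng):
    """Upper critical value from a moment-matched gamma distribution.

    The two moments are estimated by simulation here (an assumption of
    this sketch), and the gamma quantile is approximated by sampling.
    """
    dv = np.array([degree_variance(sample_bernoulli_graph(n, p, rng))
                   for _ in range(n_sim)])
    mean, var = dv.mean(), dv.var()
    shape, scale = mean**2 / var, var / mean  # method of moments
    draws = rng.gamma(shape, scale, size=100_000)
    return np.quantile(draws, 1.0 - level)

rng = np.random.default_rng(0)
critical = gamma_critical_value(n=20, p=0.1, level=0.05, n_sim=2000, rng=rng)

# Hypothetical test case: a star on 20 vertices, the most centralized graph
# with the same density (19 of 190 possible edges) as the null model above.
star = np.zeros((20, 20), dtype=int)
star[0, 1:] = 1
star[1:, 0] = 1
star_dv = degree_variance(star)
reject_homogeneity = star_dv > critical
```

The star's degree variance lies far above the critical value, so the test rejects the Bernoulli null in favor of a centralized structure.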

Mark Huisman - Dept. of Statistics & Measurement theory / ICS, FPPSW, University of Groningen
Stochastic Actor-oriented Models for Networks of Changing Composition

Markov chains can be used for the modelling of complex longitudinal social network data. A probability model for the evolution of social networks is the stochastic actor-oriented model for network change proposed by Snijders (1996, 2001). The basic idea of the model is that actors evaluate their position in the network and strive for the "best" possible configuration of relations. The evaluation of the configuration is defined as a function of the actor's position in the network, and depends on parameters that are estimated from the data by a Markov chain Monte Carlo procedure.

This paper describes the problem of changing network composition due to actors leaving the network at some time point and new actors joining the network. The actor-oriented model of Snijders is extended to handle longitudinal data in which the composition of the network and its size change. For that purpose continuous-time Markov chain models are implemented as simulation models in which actors are allowed to leave or enter the network at fixed time points.

Tina Kogovsek - Anuska Ferligoj - Faculty of Social Sciences, University of Ljubljana, Slovenia
Estimating Reliability and Validity of Egocentered Network Measurements

In the paper the quality of data in terms of reliability and validity of egocentered network measurements is estimated by the multitrait-multimethod (MTMM) approach. This approach usually requires at least three repeated measurements (methods) of the same variable (trait) for model identification purposes. This poses a considerable burden on the respondent and increases the cost of the data collection. A split ballot MTMM design (Saris, 1999) was used, in which separate groups of respondents got different combinations of only two methods. The design can also be regarded as a planned missing data design and the procedures suggested by Allison (1987) are used for maximum likelihood estimation of the confirmatory factor analysis models for MTMM designs specified in Saris and Andrews (1991). The influence of factors, such as methods used and demographic or personal characteristics of respondents, that can affect the quality of data is estimated by the Multiple Classification Analysis. The procedures are applied to social support data collected in the city of Ljubljana (Slovenia) in the year 2000.

Johan Koskinen - Department of Statistics, Stockholm University
Aggregation of Perceived Social Networks

Measurement accuracy is an inherent problem in social network analysis. The issue of actors' accuracy in reporting their interactions with others was raised by Bernard, Killworth and Sailer (e.g. Bernard et al., 1980, Information accuracy in social network data IV: A comparison of clique-level structure in behavioral and cognitive network data, Social Networks, 2:191-218) and provoked extensive debate. Krackhardt (1987, Cognitive social structures, Social Networks, 9:109-134) later introduced the concept of Cognitive Social Structures and several methods for aggregating different actor reports on the network into a single graph, with the aid of which actor-actor congruence could be gauged. A statistical model for aggregating separate reports into a single consensus network, with the additional benefit of allowing estimates of actor accuracy to be obtained in the process, was proposed by Batchelder, Kumbasar and Boyd (1997, Consensus analysis of three-way social network data, Journal of Mathematical Sociology, 22:29-58). The purpose here is to investigate this approach to the problem in a Bayesian framework. The emphasis is put on the effects of the choices of different distributional assumptions on the ability of the models to capture our prior knowledge and yield estimates of actor "accuracy", the consensus/central graph, and various summary measures.
Keywords: Bayesian statistical modelling. Consensus analysis. Cognitive social structures (CSS). Measurement reliability.

Lynne Seymour - Department of Statistics, University of Georgia
Gibbs Regression and Some Tests for Goodness of Fit

We explore a model for social networks that may be viewed either as a conditional extension of logistic regression or as a Gibbs distribution on a complete graph (a model from particle physics). The model was developed for data from a mental health service system which includes a neighborhood structure on the clients in the system, and models client responses while assuming that the network bonds between clients always exist (but could perhaps be degenerate). Markov chain Monte Carlo methods are required for fitting the model. We will also present goodness of fit statistics for assessing the fit of this model.

Tom A.B. Snijders - Department of Statistics and Measurement Theory, University of Groningen, The Netherlands
Markov Chain Monte Carlo Estimation of the p* Model

The estimation method currently in common use for the p* model is a maximum quasi-likelihood procedure which is implemented as a logistic regression method. The statistical properties of this procedure, however, are questionable and not yet completely understood. Maximum likelihood estimation for the p* model is possible, however, and can be carried out by Markov chain Monte Carlo. Various implementations are possible in principle, and practical difficulties have to be solved to make the algorithm work well.
A method is proposed which uses a Robbins-Monro-type procedure for approximating the solution of the likelihood equations. The p* model is simulated as the asymptotic distribution of a particular specification of the network evolution model also used in the SIENA program. Examples are given for various triadic p* models.
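
The idea of a Robbins-Monro-type procedure can be sketched on the simplest possible case. For an exponential family model the likelihood equations state that the expected value of the network statistics equals their observed value, and the iteration moves the parameter against the difference between simulated and observed statistics with a decreasing step size. The sketch below is not the method of the abstract: it assumes a model whose only statistic is the number of edges, so that networks can be drawn exactly and no Markov chain simulation is needed. The function name, the step-size rule, and the scaling constant are hypothetical.

```python
import numpy as np

def robbins_monro_edges(n_dyads, observed_edges, n_steps, rng):
    """Stochastic approximation for the edge parameter of an edges-only model.

    Solves E_theta[edge count] = observed_edges using one simulated
    network per step and step sizes proportional to 1 / k.
    """
    theta = 0.0
    # Rough bound on the variance of the edge count, used to scale the
    # steps (hypothetical tuning choice for this sketch).
    scale = n_dyads * 0.25
    for k in range(1, n_steps + 1):
        p = 1.0 / (1.0 + np.exp(-theta))
        # Exact draw of the statistic; dyads are independent in this toy model.
        simulated_edges = rng.binomial(n_dyads, p)
        theta -= (simulated_edges - observed_edges) / (k * scale)
    return theta

rng = np.random.default_rng(1)
theta_hat = robbins_monro_edges(n_dyads=45, observed_edges=15, n_steps=5000, rng=rng)

# For this toy model the maximum likelihood estimate is known in closed form.
exact_mle = np.log(15 / (45 - 15))
```

In this toy case the iteration can be checked against the closed-form estimate, the log-odds of the observed density. For models with triadic statistics no closed form exists and each simulated network would have to come from a Markov chain, which is where the practical difficulties mentioned above arise.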

Christian Tallberg - Stockholm University, Department of Statistics
Bayesian Network Modeling of Block Structures

A Bayesian approach is taken to model block structures in social networks. In particular, a stochastic block model is considered comprising a block of central actors and a block of non-central actors. Prior probabilities are assigned to the different alternatives for choosing the central block, and posterior probabilities are derived for different possibilities for the central block. Furthermore, posterior probabilities are calculated for the order of the central block. A generalization is also considered where the number of blocks is allowed to be larger than two, and where centrality is extended to other structural properties governed by the edge probabilities within and between the blocks.
Keywords: Bayesian statistical modeling. Bernoulli block structures.

Christopher Wheat - Harvard University, Organizational Behavior Program
Confidence and Complexity in Blockmodel Selection

This paper explores how a Bayesian approach can be used to address the problem of blockmodel selection for social networks. The Minimum Description Length (MDL) principle is used to develop a prior probability distribution for the set of possible blockmodel structures for a given social network. The method presented here not only determines how actors should be assigned to a given partition of a network into blocks, but also provides a statistical basis for determining how many blocks the actors in a given network should be partitioned into.
Furthermore, this method provides a statistical basis for determining confidence intervals for blockmodel parameters. The method developed in this paper is predicated on the existence of a stochastic blockmodel, or a posterior probability distribution for the observation of a set of network ties given a particular blockmodel structure. The stochastic blockmodeling approach presented in this paper represents a generalized model, of which many of the existing stochastic blockmodeling approaches are special cases.
