konect logo
KONECT > Statistics


A network statistic is a numerical value that characterizes a network. Examples of network statistics are the number of nodes and the number of edges in a network, but also more complex measures such as the diameter and the clustering coefficient.

Statistics are the basis of most network analysis methods; they can be used to compare networks, classify networks, detect anomalies in networks and for many other tasks. Network statistics are also used to map a network's structure to a simple numerical space, in which many standard statistical methods can be applied. Thus, network statistics are essential for the analysis of almost all network types.

All statistics described in KONECT are real numbers.

The following statistics are computed for the networks in KONECT. Not all statistics are computed for all networks. Some statistics only apply to certain kinds of networks (e.g., the clustering coefficient only applies to unipartite networks). For very large networks, some statistics are not computed because the computations take too long.

Size\begin{align} n &= |V| \end{align}
Volume\begin{align} m = |E| = \frac 1 2 \| \mathbf A \|_{\mathrm F} ^2 \end{align}
Average degree\begin{align} d &= \frac 1 {|V|} \sum_{u\in V} d(u) = \frac {2m} n \end{align}
Fill\begin{align} p &= \left\{ \begin{array}{ll} 2m / [n(n+1)] & \text{when $G$ is undirected} \\ m / n^2 & \text{when $G$ is directed} \\ m / (n_1 n_2) & \text{when $G$ is bipartite} \end{array}\right. \end{align}
Maximum degree\begin{align} d_{\max} &= \max_{u\in V} d(u) \end{align}
Reciprocity\begin{align} y = \frac 1 m | \{ (u,v) \in E \mid (v,u) \in E \} | \end{align}
Negativity\begin{align} \zeta &= \frac { | \{ e \in E \mid w(e) < 0 \} | } {m} \end{align}
LCC\begin{align} N &= \max_{F \subseteq \mathcal C} |F| \end{align}
Wedge count\begin{align} s = \sum_{u \in V} {d(u) \choose 2} = \sum_{u \in V} \frac 1 2 d(u) (d(u) - 1) \end{align}
Claw count\begin{align} z = \sum_{u \in V} {d(u) \choose 3} = \sum_{u \in V} \frac 1 6 d(u) (d(u) - 1) (d(u) - 2) \end{align}
Triangle count\begin{align} t = | \{ \{u, v, w\} \mid u \sim v \sim w \sim u \} | \;/\; 6 \end{align}
Square count\begin{align} q = | \{ u, v, w, x \mid u \sim v \sim w \sim x \sim u \} | / 8 \end{align}
4-tour count\begin{align} T_4 &= 8q + 4s + 2m \end{align}
Power law exponent\begin{align} \gamma &= 1 + n \left( \sum_{u\in V} \ln \frac {d(u)} {d_{\min}} \right) ^{-1} \end{align}
Gini coefficient\begin{align} G &= \frac {2 \sum_{i=1}^{n} i d_i} {n \sum_{i-1}^{n} d_i} - \frac {n+1} {n} \end{align}
Relative edge distribution entropy\begin{align} H_{\mathrm{er}} &= \frac 1 {\ln |V|} \sum_{u \in V} - \frac{d(u)}{D} \ln \frac{d(u)}{D} \end{align}
Assortativity\begin{align} \rho \end{align}
Clustering coefficient\begin{align} c &= \frac {|\{ u, v, w \in V \mid u \sim v \sim w \sim u \}|} {|\{ u, v, w \in V \mid u \sim v \neq w \sim u \}|} = \frac {3t} s \end{align}
Diameter\begin{align} \delta &= \max_{u \in E} \epsilon(u) = \max_{u,v\in E} d(u,v) \end{align}
Spectral norm\begin{align} | \lambda_1[\mathbf A] | &= \left\| \mathbf A \right\|_2 \end{align}
Algebraic connectivity\begin{align} a &= \lambda_2[\mathbf L] \end{align}
Preferential attachment exponent\begin{align} \min_{\alpha,\beta} \sum_{u\in V} \left( \alpha + \beta \ln[1 + d_1(u)] - \ln[\lambda + d_2(u)] \right)^2 \end{align}