Title: | Statistical Methods for Graphs |
---|---|
Description: | Contains statistical methods to analyze graphs, such as graph parameter estimation, model selection based on the Graph Information Criterion, statistical tests to discriminate two or more populations of graphs, correlation between graphs, and clustering of graphs. References: Takahashi et al. (2012) <doi:10.1371/journal.pone.0049949>, Fujita et al. (2017) <doi:10.3389/fnins.2017.00066>, Fujita et al. (2017) <doi:10.1016/j.csda.2016.11.016>, Fujita et al. (2019) <doi:10.1093/comnet/cnz028>. |
Authors: | Grover E. Castro Guzman [aut], Diogo R. da Costa [aut], Taiane C. Ramos [aut], Suzana S. Santos [aut], Eduardo S. Lira [aut], Andre Fujita [aut, cre] |
Maintainer: | Andre Fujita <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.0.6 |
Built: | 2025-01-30 02:50:47 UTC |
Source: | https://github.com/cran/statGraph |
anogva
statistically tests whether two or more sets of graphs are generated
by the same random graph model. It is a generalization of the takahashi.test
function.
anogva(Graphs, labels, maxBoot = 1000, dist = "KL", ...)
anogva(Graphs, labels, maxBoot = 1000, dist = "KL", ...)
Graphs |
a list of undirected graphs.
If each graph has the attribute |
labels |
an array of integers indicating the labels of each graph. |
maxBoot |
integer indicating the number of bootstrap resamplings (default |
dist |
string indicating if you want to use the 'KL' (default), 'JS' , 'L1' or 'L2' distances. 'KL' means Kullback-Leibler divergence. 'JS' means Jensen-Shannon divergence. |
... |
Other relevant parameters for |
A list with class 'htest' containing the following components:
statistic: |
the statistic of the test. |
p.value: |
the p-value of the test. |
method: |
a string indicating the used method. |
data.name: |
a string with the data's name(s). |
Fujita, A., Vidal, M. C. and Takahashi, D. Y. (2017) A Statistical Method to Distinguish Functional Brain Networks. _Front. Neurosci._, *11*, 66. doi:10.3389/fnins.2017.00066.
Takahashi, D. Y., Sato, J. R., Ferreira, C. E. and Fujita A. (2012) Discriminating Different Classes of Biological Networks by Analyzing the Graph Spectra Distribution. _PLoS ONE_, *7*, e49949. doi:10.1371/journal.pone.0049949.
Silverman, B. W. (1986) _Density Estimation_. London: Chapman and Hall.
Sturges, H. A. The Choice of a Class Interval. _J. Am. Statist. Assoc._, *21*, 65-66.
Sheather, S. J. and Jones, M. C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. _Journal of the Royal Statistical Society series B_, 53, 683-690. http://www.jstor.org/stable/2345597.
set.seed(1) g1 <- g2 <- g3 <- list() for (i in 1:20) { g1[[i]] <- igraph::sample_gnp(50, 0.50) g2[[i]] <- igraph::sample_gnp(50, 0.50) g3[[i]] <- igraph::sample_gnp(50, 0.52) } G <- c(g1, g2, g3) label <- c(rep(1,20),rep(2,20),rep(3,20)) result <- anogva(G, label, maxBoot=50) result
set.seed(1) g1 <- g2 <- g3 <- list() for (i in 1:20) { g1[[i]] <- igraph::sample_gnp(50, 0.50) g2[[i]] <- igraph::sample_gnp(50, 0.50) g3[[i]] <- igraph::sample_gnp(50, 0.52) } G <- c(g1, g2, g3) label <- c(rep(1,20),rep(2,20),rep(3,20)) result <- anogva(G, label, maxBoot=50) result
GIC
returns the Kullback-Leibler divergence, L1 or L2 distance between an
undirected graph and a given graph model using the exact or degree-based spectral densities.
GIC(Graph, model, p = NULL, dist = "KL", ...)
GIC(Graph, model, p = NULL, dist = "KL", ...)
Graph |
the undirected graph (igraph object).
If |
model |
either a list, a string, or a function describing a graph model: A list that represents the spectral density of a model. It contains the components 'x' and 'y', which are numeric vectors of the same size. The 'x' component contains the points at which the density was computed and the 'y' component contains the observed density. A string that indicates one of the following models: 'ER' (Erdos-Renyi random
graph), 'GRG' (geometric random graph), 'KR' (k regular random graph), 'WS'
(Watts-Strogatz model), and 'BA' (Barabási-Albert model). When the argument
A function that returns a graph (igraph object)
generated by a graph model. It must contain two arguments: the first one
corresponds to the graph size and the second to the parameter of the model.
The model parameter will be provided by the argument |
p |
the model parameter. The user must provide a valid parameter if the
argument the radius used to construct the geometric graph in a unit square, for the 'GRG' model (geometric random graph); the degree the probability to reconnect a vertex, for the 'WS' model (Watts-Strogatz model); and the scaling exponent, for the 'BA' model (Barabási-Albert model). |
dist |
string indicating if you want to use the 'KL' (default), 'L1' or 'L2' distances. 'KL' means Kullback-Leibler divergence. |
... |
Other relevant parameters for |
A list with class 'statGraph' containing the following components:
method: |
a string indicating the used method. |
info: |
a string showing details about the method. |
data.name: |
a string with the data's name(s). |
value: |
a real number corresponding to the Kullback-Leibler divergence, L1, or L2 distance between the observed graph and the graph model. |
Takahashi, D. Y., Sato, J. R., Ferreira, C. E. and Fujita A. (2012) Discriminating Different Classes of Biological Networks by Analyzing the Graph Spectra Distribution. _PLoS ONE_, *7*, e49949. doi:10.1371/journal.pone.0049949.
Silverman, B. W. (1986) _Density Estimation_. London: Chapman and Hall.
Sturges, H. A. The Choice of a Class Interval. _J. Am. Statist. Assoc._, *21*, 65-66.
Sheather, S. J. and Jones, M. C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. _Journal of the Royal Statistical Society series B_, 53, 683-690. http://www.jstor.org/stable/2345597.
set.seed(1) G <- igraph::sample_gnp(n=50, p=0.5) # Using a string to indicate the graph model result1 <- GIC(G, 'ER', 0.5) result1 # Using a function to describe the graph model # Erdos-Renyi graph model <- function(n, p) { return (igraph::sample_gnp(n, p)) } result2 <- GIC(G, model, 0.5) result2
set.seed(1) G <- igraph::sample_gnp(n=50, p=0.5) # Using a string to indicate the graph model result1 <- GIC(G, 'ER', 0.5) result1 # Using a function to describe the graph model # Erdos-Renyi graph model <- function(n, p) { return (igraph::sample_gnp(n, p)) } result2 <- GIC(G, model, 0.5) result2
The function graph.acf
computes estimates of the autocorrelation
function for graphs.
graph.acf(Graphs, plot = TRUE)
graph.acf(Graphs, plot = TRUE)
Graphs |
a list of undirected graphs.
If each graph has the attribute |
plot |
logical. If |
An object of class acf.
Fujita, A., Takahashi, D. Y., Balardin, J. B., Vidal, M. C. and Sato, J. R. (2017) Correlation between graphs with an application to brain network analysis. _Computational Statistics & Data Analysis_ *109*, 76-92.
set.seed(1) G <- list() p <- array(0, 100) p[1:3] <- rnorm(3) for (t in 4:100) { p[t] <- 0.5*p[t-3] + rnorm(1) } ma <- max(p) mi <- min(p) p <- (p - mi)/(ma-mi) for (t in 1:100) { G[[t]] <- igraph::sample_gnp(100, p[t]) } graph.acf(G, plot=TRUE)
set.seed(1) G <- list() p <- array(0, 100) p[1:3] <- rnorm(3) for (t in 4:100) { p[t] <- 0.5*p[t-3] + rnorm(1) } ma <- max(p) mi <- min(p) p <- (p - mi)/(ma-mi) for (t in 1:100) { G[[t]] <- igraph::sample_gnp(100, p[t]) } graph.acf(G, plot=TRUE)
graph.cem
clusters graphs following an expectation-maximization algorithm based
on the Kullback-Leibler divergence between the spectral densities of the
graph and of the random graph model.
graph.cem(Graphs, model, k, max_iter = 10, ...)
graph.cem(Graphs, model, k, max_iter = 10, ...)
Graphs |
a list of undirected graphs.
If each graph has the attribute |
model |
a string that indicates one of the following random graph models: 'ER' (Erdos-Renyi random graph), 'GRG' (geometric random graph), 'KR' (k regular graph), 'WS' (Watts-Strogatz model), and 'BA' (Barabási-Albert model). |
k |
an integer specifying the number of clusters. |
max_iter |
the maximum number of expectation-maximization steps to execute. |
... |
Other relevant parameters for |
A list with class 'statGraph' containing the following components:
method: |
a string indicating the used method. |
info: |
a string showing details about the method. |
data.name: |
a string with the data's name(s). |
cluster: |
a vector of the same length of |
parameters: |
a vector containing the estimated parameters for the groups.
It has the length equals to |
Celeux, Gilles, and Gerard Govaert. 'Gaussian parsimonious clustering models.' Pattern recognition 28.5 (1995): 781-793.
Sheather, S. J. and Jones, M. C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. _Journal of the Royal Statistical Society series B_, 53, 683-690. http://www.jstor.org/stable/2345597.
set.seed(1) g <- list() for(i in 1:2){ g[[i]] <- igraph::sample_gnp(n=10, p=0.5) } for(i in 3:4){ g[[i]] <- igraph::sample_gnp(n=10, p=1) } res <- graph.cem(g, model='ER', k=2, max_iter=1,eps=0.1) res
set.seed(1) g <- list() for(i in 1:2){ g[[i]] <- igraph::sample_gnp(n=10, p=0.5) } for(i in 3:4){ g[[i]] <- igraph::sample_gnp(n=10, p=1) } res <- graph.cem(g, model='ER', k=2, max_iter=1,eps=0.1) res
graph.cor.test
tests for association between paired samples of graphs,
using Spearman's rho correlation coefficient.
graph.cor.test(Graphs1, Graphs2)
graph.cor.test(Graphs1, Graphs2)
Graphs1 |
a list of undirected graphs.
If each graph has the attribute |
Graphs2 |
a list of undirected graphs.
If each graph has the attribute |
A list with class 'htest' containing the following components:
statistic: |
the value of the test statistic. |
p.value: |
the p-value of the test. |
method: |
a string indicating the used method. |
data.name: |
a string with the data's name(s). |
estimates: |
the estimated measure of association 'rho'. |
Fujita, A., Takahashi, D. Y., Balardin, J. B., Vidal, M. C. and Sato, J. R. (2017) Correlation between graphs with an application to brain network analysis. _Computational Statistics & Data Analysis_ *109*, 76-92.
library(mvtnorm) set.seed(1) G1 <- G2 <- list() p <- mvtnorm::rmvnorm(50, mean=c(0,0), sigma=matrix(c(1, 0.5, 0.5, 1), 2, 2)) ma <- max(p) mi <- min(p) p[,1] <- (p[,1] - mi)/(ma - mi) p[,2] <- (p[,2] - mi)/(ma - mi) for (i in 1:50) { G1[[i]] <- igraph::sample_gnp(50, p[i,1]) G2[[i]] <- igraph::sample_gnp(50, p[i,2]) } graph.cor.test(G1, G2)
library(mvtnorm) set.seed(1) G1 <- G2 <- list() p <- mvtnorm::rmvnorm(50, mean=c(0,0), sigma=matrix(c(1, 0.5, 0.5, 1), 2, 2)) ma <- max(p) mi <- min(p) p[,1] <- (p[,1] - mi)/(ma - mi) p[,2] <- (p[,2] - mi)/(ma - mi) for (i in 1:50) { G1[[i]] <- igraph::sample_gnp(50, p[i,1]) G2[[i]] <- igraph::sample_gnp(50, p[i,2]) } graph.cor.test(G1, G2)
Given a list of graphs, graph.dist
builds a distance matrix
according to the Jensen-Shannon divergence, L2 norm, or L1 norm
between the spectral density of the graphs graphs.
graph.dist(Graphs, dist = "JS", ...)
graph.dist(Graphs, dist = "JS", ...)
Graphs |
a list of undirected graphs.
If each graph has the attribute |
dist |
string indicating if you want to use the 'JS' (default), 'L1' or 'L2' distances. 'JS' means Jensen-Shannon divergence. |
... |
Other relevant parameters for |
a distance matrix
set.seed(1) g <- list() for(i in 1:5){ g[[i]] <- igraph::sample_gnp(n=50, p=0.1) } for(i in 6:10){ g[[i]] <- igraph::sample_gnp(n=50, p=0.5) } for(i in 11:15){ g[[i]] <- igraph::sample_gnp(n=50, p=0.9) } graph.dist(g, dist = 'JS')
set.seed(1) g <- list() for(i in 1:5){ g[[i]] <- igraph::sample_gnp(n=50, p=0.1) } for(i in 6:10){ g[[i]] <- igraph::sample_gnp(n=50, p=0.5) } for(i in 11:15){ g[[i]] <- igraph::sample_gnp(n=50, p=0.9) } graph.dist(g, dist = 'JS')
graph.entropy
returns the spectral entropy of an undirected graph.
graph.entropy(Graph, ...)
graph.entropy(Graph, ...)
Graph |
the undirected graph (igraph object).
If |
... |
Other relevant parameters for |
A list with class 'statGraph' containing the following components:
method: |
a string indicating the used method. |
info: |
a string showing details about the method. |
data.name: |
a string with the data's name(s). |
entropy: |
a real number corresponding to the graph spectral entropy. |
Takahashi, D. Y., Sato, J. R., Ferreira, C. E. and Fujita A. (2012) Discriminating Different Classes of Biological Networks by Analyzing the Graph Spectra Distribution. _PLoS ONE_, *7*, e49949. doi:10.1371/journal.pone.0049949.
set.seed(1) G <- igraph::sample_gnp(n=100, p=0.5) entropy <- graph.entropy(Graph = G) entropy
set.seed(1) G <- igraph::sample_gnp(n=100, p=0.5) entropy <- graph.entropy(Graph = G) entropy
Given a list of graphs, graph.hclust
builds a hierarchy of clusters
according to the Jensen-Shannon divergence between graphs.
graph.hclust(Graphs, k = NULL, clus_method = "complete", dist = "JS", ...)
graph.hclust(Graphs, k = NULL, clus_method = "complete", dist = "JS", ...)
Graphs |
a list of undirected graphs.
If each graph has the attribute |
k |
the number of clusters. If NULL, it won't return the computed clustering. |
clus_method |
the agglomeration method to be used. This should be (an unambiguous abbreviation of) one of ”ward.D”, ”ward.D2”, ”single”, ”complete”, ”average” (= UPGMA), ”mcquitty” (= WPGMA), ”median” (= WPGMC) or ”centroid” (= UPGMC). |
dist |
string indicating if you want to use the 'JS' (default), 'L1' or 'L2' distances. 'JS' means Jensen-Shannon divergence. |
... |
Other relevant parameters for |
A list with class 'statGraph' containing the following components:
method: |
a string indicating the used method. |
info: |
a string showing details about the method. |
data.name: |
a string with the data's name(s). |
cluster: |
a vector of the same length of |
hclust: |
a 'hclust' object. |
Takahashi, D. Y., Sato, J. R., Ferreira, C. E. and Fujita A. (2012) Discriminating Different Classes of Biological Networks by Analyzing the Graph Spectra Distribution. _PLoS ONE_, *7*, e49949. doi:10.1371/journal.pone.0049949.
Silverman, B. W. (1986) _Density Estimation_. London: Chapman and Hall.
Sturges, H. A. The Choice of a Class Interval. _J. Am. Statist. Assoc._, *21*, 65-66.
Sheather, S. J. and Jones, M. C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. _Journal of the Royal Statistical Society series B_, 53, 683-690. http://www.jstor.org/stable/2345597.
set.seed(1) G <- list() for (i in 1:5) { G[[i]] <- igraph::sample_gnp(50, 0.5) } for (i in 6:10) { G[[i]] <- igraph::sample_smallworld(1, 50, 8, 0.2) } for (i in 11:15) { G[[i]] <- igraph::sample_pa(50, power = 1, directed = FALSE) } graph.hclust(G, 3)
set.seed(1) G <- list() for (i in 1:5) { G[[i]] <- igraph::sample_gnp(50, 0.5) } for (i in 6:10) { G[[i]] <- igraph::sample_smallworld(1, 50, 8, 0.2) } for (i in 11:15) { G[[i]] <- igraph::sample_pa(50, power = 1, directed = FALSE) } graph.hclust(G, 3)
graph.kmeans
clusters graphs following a k-means algorithm based on the
Jensen-Shannon divergence between the spectral densities of the graphs.
graph.kmeans(Graphs, k, nstart = 2, dist = "JS", ...)
graph.kmeans(Graphs, k, nstart = 2, dist = "JS", ...)
Graphs |
a list of undirected graphs.
If each graph has the attribute |
k |
an integer specifying the number of clusters. |
nstart |
the number of trials of k-means clusterizations. The algorithm returns the clusterization with the best silhouette. |
dist |
string indicating if you want to use the 'JS' (default), 'L1' or 'L2' distances. 'JS' means Jensen-Shannon divergence. |
... |
Other relevant parameters for |
A list with class 'statGraph' containing the following components:
method: |
a string indicating the used method. |
info: |
a string showing details about the method. |
data.name: |
a string with the data's name(s). |
cluster: |
a vector of the same length of |
centers: |
a list containing the centroids of each cluster. |
MacQueen, James. 'Some methods for classification and analysis of multivariate observations.' Proceedings of the fifth Berkeley symposium on mathematical statistics and probability. Vol. 1. No. 14. 1967.
Lloyd, Stuart. 'Least squares quantization in PCM.' IEEE transactions on information theory 28.2 (1982): 129-137.
set.seed(1) g <- list() for(i in 1:5){ g[[i]] <- igraph::sample_gnp(30, p=0.2) } for(i in 6:10){ g[[i]] <- igraph::sample_gnp(30, p=0.5) } res <- graph.kmeans(g, k=2, nstart=2) res
set.seed(1) g <- list() for(i in 1:5){ g[[i]] <- igraph::sample_gnp(30, p=0.2) } for(i in 6:10){ g[[i]] <- igraph::sample_gnp(30, p=0.5) } res <- graph.kmeans(g, k=2, nstart=2) res
graph.model.selection
selects the graph model that best approximates the
observed graph according to the Graph Information Criterion (GIC).
graph.model.selection(Graph, models = NULL, parameters = NULL, ...)
graph.model.selection(Graph, models = NULL, parameters = NULL, ...)
Graph |
the undirected graph (igraph object).
If |
models |
either a vector of strings, or a list of functions: A vector of strings containing some of the following models: 'ER' (Erdos-Renyi random graph), 'GRG' (geometric random graph), 'KR' (k regular random graph), 'WS' (Watts-Strogatz model), and 'BA' (Barabási-Albert model). A list of functions. Each function returns a graph (igraph object) generated by a graph model and has two arguments: the graph size and the model parameter, in this order. If the argument |
parameters |
list of numeric vectors or list of lists. If a list of numeric vectors is given,
then each vector contains the values that will be considered for the parameter estimation of the
corresponding model. If a list of lists is given, then each list contains |
... |
Other relevant parameters for |
A list with class 'statGraph' containing the following components:
method: |
a string indicating the used method. |
info: |
a string showing details about the method. |
model: |
the indice(s) or name(s) of the selected model(s), i. e. the model(s) that minimize(s) the Graph Information Criterion (GIC). |
estimates: |
a matrix in which each row corresponds to a model, the column 'param' corresponds to the parameter estimate, and the column 'GIC' corresponds to the Graph Information Criterion (GIC), i. e. the distance measure (Kullback-Leibler divergence by default) between the observed graph and the model. |
Takahashi, D. Y., Sato, J. R., Ferreira, C. E. and Fujita A. (2012) Discriminating Different Classes of Biological Networks by Analyzing the Graph Spectra Distribution. _PLoS ONE_, *7*, e49949. doi:10.1371/journal.pone.0049949.
Silverman, B. W. (1986) _Density Estimation_. London: Chapman and Hall.
Sturges, H. A. The Choice of a Class Interval. _J. Am. Statist. Assoc._, *21*, 65-66.
Sheather, S. J. and Jones, M. C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. _Journal of the Royal Statistical Society series B_, 53, 683-690. http://www.jstor.org/stable/2345597.
## Example using an igraph object as input data set.seed(1) G <- igraph::sample_gnp(n=30, p=0.5) # Using strings to indicate the graph models result1 <- graph.model.selection(G, models=c('ER', 'WS'), eps = 0.5) result1 ## Using functions to describe the graph models # Erdos-Renyi graph model1 <- function(n, p) { return(igraph::sample_gnp(n, p)) } # Watts-Strogatz small-world graph model2 <- function(n, pr, K=8) { return(igraph::sample_smallworld(1, n, K, pr)) } parameters <- list(seq(0.01, 0.99, 0.49), seq(0.01, 0.99, 0.49)) result2 <- graph.model.selection(G, list(model1, model2), parameters) result2
## Example using an igraph object as input data set.seed(1) G <- igraph::sample_gnp(n=30, p=0.5) # Using strings to indicate the graph models result1 <- graph.model.selection(G, models=c('ER', 'WS'), eps = 0.5) result1 ## Using functions to describe the graph models # Erdos-Renyi graph model1 <- function(n, p) { return(igraph::sample_gnp(n, p)) } # Watts-Strogatz small-world graph model2 <- function(n, pr, K=8) { return(igraph::sample_smallworld(1, n, K, pr)) } parameters <- list(seq(0.01, 0.99, 0.49), seq(0.01, 0.99, 0.49)) result2 <- graph.model.selection(G, list(model1, model2), parameters) result2
graph.mult.scaling
performs multidimensional scaling of graphs. It
takes the Jensen-Shannon divergence between graphs (JS) and uses the
cmdscale
function from the stats
package to obtain a set of points such
that the distances between the points are similar to JS.
graph.mult.scaling( Graphs, plot = TRUE, type = "n", dist = "JS", main = "", ... )
graph.mult.scaling( Graphs, plot = TRUE, type = "n", dist = "JS", main = "", ... )
Graphs |
a list of undirected graphs.
If each graph has the attribute |
plot |
logical. If |
type |
what type of plot should be drawn. The default value is |
dist |
string indicating if you want to use the 'JS' (default), 'L1' or 'L2' distances. 'JS' means Jensen-Shannon divergence. |
main |
title of the plot (default value is ”). |
... |
additional parameters for |
A list with class 'statGraph' containing the following components:
method: |
a string indicating the used method. |
info: |
a string showing details about the method. |
data.name: |
a string with the data's name(s). |
values: |
a matrix in which each column corresponds to a coordinate and each row corresponds to a graph (point). Then, each row gives the coordinates of the points chosen to represent the Jensen-Shannon divergence (by default), L1, or L2 distance between graphs. |
Takahashi, D. Y., Sato, J. R., Ferreira, C. E. and Fujita A. (2012) Discriminating Different Classes of Biological Networks by Analyzing the Graph Spectra Distribution. _PLoS ONE_, *7*, e49949. doi:10.1371/journal.pone.0049949.
Silverman, B. W. (1986) _Density Estimation_. London: Chapman and Hall.
Sturges, H. A. The Choice of a Class Interval. _J. Am. Statist. Assoc._, *21*, 65-66.
Sheather, S. J. and Jones, M. C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. _Journal of the Royal Statistical Society series B_, 53, 683-690. http://www.jstor.org/stable/2345597.
set.seed(1) G <- list() for (i in 1:5) { G[[i]] <- igraph::sample_gnp(50, 0.5) } for (i in 6:10) { G[[i]] <- igraph::sample_smallworld(1, 50, 8, 0.2) } for (i in 11:15) { G[[i]] <- igraph::sample_pa(50, power = 1, directed = FALSE) } graph.mult.scaling(G)
set.seed(1) G <- list() for (i in 1:5) { G[[i]] <- igraph::sample_gnp(50, 0.5) } for (i in 6:10) { G[[i]] <- igraph::sample_smallworld(1, 50, 8, 0.2) } for (i in 11:15) { G[[i]] <- igraph::sample_pa(50, power = 1, directed = FALSE) } graph.mult.scaling(G)
graph.param.estimator
estimates the parameter that best approximates
the model to the observed graph according to the Graph Information Criterion
(GIC).
graph.param.estimator( Graph, model, interval = NULL, eps = 0.01, search = "grid", ... )
graph.param.estimator( Graph, model, interval = NULL, eps = 0.01, search = "grid", ... )
Graph |
the undirected graph (igraph object).
If |
model |
either a string or a function: A string that indicates one of the following models: 'ER' (Erdos-Renyi random graph), 'GRG' (geometric random graph), 'KR' (k regular random graph), 'WS' (Watts-Strogatz model), and 'BA' (Barabási-Albert model). A function that returns a graph (represented by its adjacency matrix) generated by a graph model. It must contain two arguments: the first one corresponds to the graph size and the second to the parameter of the model. |
interval |
numeric vector containing the values that will be
considered for the parameter estimation, or a list containing 'lo' and 'hi'
that indicates the model's parameter search interval < 0 to 1 with step 0 to sqrt(2) with step 0 to 'n' with step 0 to 1 with step and 0 to 3 with step |
eps |
precision of the grid and ternary search (default is |
search |
string that indicates the search algorithm to find the parameter with the smallest GIC. If 'grid' (default) parameter is estimated using grid search, and only works when method is not 'fast'. If 'ternary' parameter is estimated using ternary search. |
... |
Other relevant parameters for |
A list with class 'statGraph' containing the following components:
method: |
a string indicating the used method. |
info: |
a string showing details about the method. |
data.name: |
a string with the data's name(s). |
param: |
the parameter estimate. For the 'ER', 'GRG', 'KR', 'WS', and 'BA'
models, the parameter corresponds to the probability to connect a pair of
vertices, the radius used to construct the geometric graph in a unit square,
the degree |
dist: |
the distance between the observed graph and the graph model with the estimated parameter. |
Takahashi, D. Y., Sato, J. R., Ferreira, C. E. and Fujita A. (2012) Discriminating Different Classes of Biological Networks by Analyzing the Graph Spectra Distribution. _PLoS ONE_, *7*, e49949. doi:10.1371/journal.pone.0049949.
Silverman, B. W. (1986) _Density Estimation_. London: Chapman and Hall.
Sturges, H. A. The Choice of a Class Interval. _J. Am. Statist. Assoc._, *21*, 65-66.
Sheather, S. J. and Jones, M. C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. _Journal of the Royal Statistical Society series B_, 53, 683-690. http://www.jstor.org/stable/2345597.
set.seed(1) G <- igraph::sample_gnp(n=50, p=0.5) # Using a string to indicate the graph model result1 <- graph.param.estimator(G, 'ER', eps=0.25) result1 # Using a function to describe the graph model # Erdos-Renyi graph set.seed(1) model <- function(n, p) { return(igraph::sample_gnp(n, p)) } result2 <- graph.param.estimator(G, model, seq(0.2, 0.8, 0.1)) result2
set.seed(1) G <- igraph::sample_gnp(n=50, p=0.5) # Using a string to indicate the graph model result1 <- graph.param.estimator(G, 'ER', eps=0.25) result1 # Using a function to describe the graph model # Erdos-Renyi graph set.seed(1) model <- function(n, p) { return(igraph::sample_gnp(n, p)) } result2 <- graph.param.estimator(G, model, seq(0.2, 0.8, 0.1)) result2
graph.spectral.density
returns the exact or degree-based spectral density
in the interval <from
,to
> by using npoints
discretization points.
graph.spectral.density(Graph, method = "diag", ...)
graph.spectral.density(Graph, method = "diag", ...)
Graph |
the undirected graph (igraph object).
If |
method |
String that specifies the method to obtain the spectral density. It can take two possible values 'diag' (Default) and 'fast'. If 'diag' is used then the exact spectral density is obtained, otherwise the degree-based spectral density is obtained. |
... |
Other relevant parameters to obtain the spectral density such as |
A list with class 'statGraph' containing the following components:
method: |
a string indicating the used method. |
info: |
a string showing details about the method. |
data.name: |
a string with the data's name(s). |
x: |
a vector corresponding to the x axis coordinates of the density function. |
y: |
a vector corresponding to the y axis coordinates of the density function. |
from: |
a real number corresponding to the smallest value of the x axis. |
to: |
a real number corresponding to the largest value of the x axis. |
#' Takahashi, D. Y., Sato, J. R., Ferreira, C. E. and Fujita A. (2012) Discriminating Different Classes of Biological Networks by Analyzing the Graph Spectra Distribution. _PLoS ONE_, *7*, e49949. doi:10.1371/journal.pone.0049949.
Silverman, B. W. (1986) _Density Estimation_. London: Chapman and Hall.
Sturges, H. A. The Choice of a Class Interval. _J. Am. Statist. Assoc._, *21*, 65-66.
Sheather, S. J. and Jones, M. C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. _Journal of the Royal Statistical Society series B_, 53, 683-690. http://www.jstor.org/stable/2345597.
Newman, M. E. J., Zhang, X., & Nadakuditi, R. R. (2019). Spectra of random networks with arbitrary degrees. Physical Review E, 99(4), 042309.
set.seed(1) G <- igraph::sample_smallworld(dim = 1, size = 50, nei = 2, p = 0.2) # Obtain the spectral density density <- graph.spectral.density(Graph = G) density
set.seed(1) G <- igraph::sample_smallworld(dim = 1, size = 50, nei = 2, p = 0.2) # Obtain the spectral density density <- graph.spectral.density(Graph = G) density
graph.takahashi.test
tests whether two sets of graphs were generated by the same
random graph model.
This bootstrap test is based on the Jensen-Shannon (JS) divergence between
graphs.
graph.takahashi.test(Graphs1, Graphs2, maxBoot = 1000, dist = "JS", ...)
graph.takahashi.test(Graphs1, Graphs2, maxBoot = 1000, dist = "JS", ...)
Graphs1 |
a list of undirected Graphs.
If each graph has the attribute |
Graphs2 |
a list of undirected Graphs.
If each graph has the attribute |
maxBoot |
integer indicating the number of bootstrap resamplings (default |
dist |
string indicating if you want to use the 'JS' (default) , 'L1' or 'L2' distances. 'JS' means Jensen-Shannon divergence. |
... |
Other relevant parameters for |
Given two lists of graphs, Graphs1
and Graphs2
, graph.takahashi.test
tests H0: 'JS
divergence between Graphs1
and Graphs2
is 0
' against H1: 'JS divergence between
Graphs1
and Graphs2
is larger than 0
'.
A list with class 'htest' containing the following components:
statistic: |
the value of the Jensen-Shannon divergence (default), L1 or L2 between 'Graphs1' and 'Graphs2'. |
p.value: |
the p-value of the test. |
method: |
a string indicating the used method. |
data.name: |
a string with the data's name(s). |
Takahashi, D. Y., Sato, J. R., Ferreira, C. E. and Fujita A. (2012) Discriminating Different Classes of Biological Networks by Analyzing the Graph Spectra Distribution. _PLoS ONE_, *7*, e49949. doi:10.1371/journal.pone.0049949.
Silverman, B. W. (1986) _Density Estimation_. London: Chapman and Hall.
Sturges, H. A. The Choice of a Class Interval. _J. Am. Statist. Assoc._, *21*, 65-66.
Sheather, S. J. and Jones, M. C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. _Journal of the Royal Statistical Society series B_, 53, 683-690. http://www.jstor.org/stable/2345597.
set.seed(1) G1 <- G2 <- list() for (i in 1:20) { G1[[i]] <- igraph::sample_gnp(n=50, p=0.500) } for (i in 1:20) { G2[[i]] <- igraph::sample_gnp(n=50, p=0.512) } result <- graph.takahashi.test(G1, G2, maxBoot=500) result
set.seed(1) G1 <- G2 <- list() for (i in 1:20) { G1[[i]] <- igraph::sample_gnp(n=50, p=0.500) } for (i in 1:20) { G2[[i]] <- igraph::sample_gnp(n=50, p=0.512) } result <- graph.takahashi.test(G1, G2, maxBoot=500) result
sp.anogva
statistically tests whether two or more graphs are
generated by the same model and set of parameters.
sp.anogva(Graphs, model, maxBoot = 100, ...)
sp.anogva(Graphs, model, maxBoot = 100, ...)
Graphs |
a list of undirected graphs.
If each graph has the attribute |
model |
A string that indicates one of the following models: 'ER' (Erdos-Renyi random graph model), 'GRG' (geometric random graph model), 'WS' (Watts-Strogatz random graph model), and 'BA' (Barabási-Albert random graph model). |
maxBoot |
integer indicating the number of bootstrap resamples (default
is |
... |
Other relevant parameters for |
A list with class 'htest' containing the following components:
statistic: |
the F statistic of the test. |
p.value: |
the p-value of the test. |
method: |
a string indicating the used method. |
data.name: |
a string with the data's name(s). |
estimates: |
a vector containing the estimated parameters for each graph. |
Andre Fujita, Eduardo Silva Lira, Suzana de Siqueira Santos, Silvia Yumi Bando, Gabriela Eleuterio Soares, Daniel Yasumasa Takahashi. A semi-parametric statistical test to compare complex networks, Journal of Complex Networks, cnz028, https://doi.org/10.1093/comnet/cnz028
Sheather, S. J. and Jones, M. C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. _Journal of the Royal Statistical Society series B_, 53, 683-690. http://www.jstor.org/stable/2345597.
set.seed(1) model <- 'ER' G <- list() # Under H0 G[[1]] <- igraph::sample_gnp(50, 0.5) G[[2]] <- igraph::sample_gnp(50, 0.5) G[[3]] <- igraph::sample_gnp(50, 0.5) result1 <- sp.anogva(G, model, maxBoot = 10,eps=0.1) result1 # Under H1 G[[1]] <- igraph::sample_gnp(50, 0.5) G[[2]] <- igraph::sample_gnp(50, 0.75) G[[3]] <- igraph::sample_gnp(50, 0.5) result2 <- sp.anogva(G, model, maxBoot = 10,eps=0.1) result2
set.seed(1) model <- 'ER' G <- list() # Under H0 G[[1]] <- igraph::sample_gnp(50, 0.5) G[[2]] <- igraph::sample_gnp(50, 0.5) G[[3]] <- igraph::sample_gnp(50, 0.5) result1 <- sp.anogva(G, model, maxBoot = 10,eps=0.1) result1 # Under H1 G[[1]] <- igraph::sample_gnp(50, 0.5) G[[2]] <- igraph::sample_gnp(50, 0.75) G[[3]] <- igraph::sample_gnp(50, 0.5) result2 <- sp.anogva(G, model, maxBoot = 10,eps=0.1) result2