Brain networks construction using Bayes FDR and average power function

Abstract

Brain functional connectivity is a widely investigated topic in neuroscience. In recent years, the study of brain connectivity has been largely aided by graph theory. The link between time series recorded at multiple locations in the brain and the construction of a graph is usually an adjacency matrix. The latter converts a measure of the connectivity between two time series, typically a correlation coefficient, into a binary choice on whether the two brain locations are functionally connected or not. As a result, the choice of a threshold τ over the correlation coefficient is key. In the present work, we propose a multiple testing approach to the choice of τ that uses the Bayes false discovery rate and a new estimator of the statistical power called average power function to balance the two types of statistical error. We show that the proposed average power function estimator behaves well both in case of independence and weak dependence of the tests and it is reliable under several simulated dependence conditions. Moreover, we propose a robust method for the choice of τ using the 5% and 95% percentiles of the average power function and False Discovery Rate bootstrap distributions, respectively, to improve stability. We applied our approach to functional magnetic resonance imaging and high density electroencephalogram data.

Keywords

Average Power Function; Bayes FDR; Functional MRI; High Density EEG; Multiple hypothesis testing

1 Introduction

Functional connectivity is defined as the temporal dependency within spatially remote neurophysiologic events.¹ In the past years, an increasing body of neuroimaging studies explored functional connectivity by measuring the level of co-activation of time series between brain regions.

Graph theory is increasingly used to study brain connectivity either through summary measures of the topological network organisation or by means of connection-level analyses.^2,3 In the graphical representation of a brain network, a node corresponds to a brain region while an edge corresponds to an interaction between two brain regions.⁴ Nonetheless, there still exists a lack of consensus on how to convert continuous connectivity measures into binary brain networks. Many authors advocated the use of an absolute threshold over the connectivity measure, some preferred the use of a threshold on the network sparsity (proportional thresholding) and other authors recommended multiple-thresholding approaches in which the connectivity matrix is binarised at different values (or different numbers of edges are retained) and the subsequent network properties are studied either as a function of the varying threshold or by integrating over all possible configurations.^3,5–10

In the present study, we focus on methods that derive brain functional networks by choosing an absolute threshold on the connectivity matrix. The absolute thresholding rationale is that of bringing out the true underlying network structure by removing all spurious noise-generated edges which are often low-connected pairs in the connectivity matrix.

The first attempts in this direction were made by preserving highly connected edges only, using an arbitrary threshold; however, the results were unreliable and potentially misleading as they were associated with an unknown amount of lost information.^6,10

In order to minimise the error generated by the binarisation process, a statistical approach has been proposed that takes into account the properties of connectivity measures and their distribution under a simple null hypothesis. The idea was to move the threshold from an arbitrary value on the connectivity measure to a generally acceptable value of the probability of an observed test statistic under H₀. Nevertheless, this probability must be tested for each edge of a network rendering the whole process massively affected by the multiple testing problem.^7,11 This is particularly relevant to neuroimaging studies where even high correlations may appear simply by chance.⁷

Several approaches were proposed in the neuroscientific literature to address the problem of multiple testing. First attempts were based on adjusting the statistical threshold through a function of the number of tests.⁷ However, the extraordinary development of neuroscientific technologies in the last decades has enabled high-resolution brain recording from hundreds to thousands of sites across the brain. This translated into a multitude of pair-wise tests whose Family Wise Error Rates (FWER) could not be controlled without a substantial loss of statistical power.

Developments were made by employing more complex statistical tools such as the false discovery rate (FDR).^12,13 By controlling the expected proportion of false positives among the rejected hypotheses, FDR is a less conservative approach which gained popularity in the neuroscientific community.^10,14

Important theoretical improvements have been attained in the study of large-scale inference problems in the last 20 years. Storey¹⁵ introduced the positive FDR (pFDR), which represents the expected proportion of false positives among the rejected hypotheses, conditioned on a positive number of findings. By fixing a threshold on the p-values and estimating the alpha rate over that threshold, it is possible to conservatively control the pFDR. Efron¹⁶ proposed the local Bayes FDR, which can be interpreted as the posterior probability of incurring a false discovery given the observed p-value. Such an approach has the advantages of requiring minimal modelling assumptions and of having a straightforward empirical Bayes estimator.

More recently, a growing awareness of the importance of statistical power in the interpretation of evidence has led some authors to explore new paradigms in which both type I and type II errors are combined in the decision process.¹⁷ In the field of brain networks, Sala et al.¹¹ proposed a method to derive graphs from pairwise correlation test statistics by controlling type I and type II error rates through the pFDR and the positive False Nondiscovery Rate (pFNR), together with a method for balancing the two errors.

The present study proposes a new Bayesian estimator of the Average Power Function (APF) which extends the so-called Average Power (AP)¹⁷ by making it dependent on the rejection region threshold. We couple the APF with the Bayes FDR¹⁶ for threshold selection in multiple testing procedures. The APF estimator is proven to be unbiased and to asymptotically approximate the actual value of the parameter both in the case of independent and stationary-associated p-values. In the case of brain connections analysis, such an asymptotic approximation is typically adequate because the number of tests involved is fixed but very large. Simulation results show that APF estimator has low bias and mean squared error (MSE) over its full range and also for several types and strengths of spatial dependence among tests.

Furthermore, as small variability in the number of erroneously rejected or accepted hypotheses (i.e. stability) is an important feature of a testing procedure,¹⁸ we propose a robust approach based on the use of tail probabilities instead of point estimates. For this purpose, we employ the 95th and 5th percentiles of FDR and APF, respectively, to identify a threshold on the p-values that guarantees both a small false discovery error and a reasonably high power with 95% probability. The proposed method is general and can be applied to various statistics; here we used Spearman's test statistics and its approximate distribution to test our approach with a Monte Carlo (MC) simulation study and a real data analysis of functional Magnetic Resonance Imaging (fMRI) and High Density Electroencephalogram (HD-EEG) data recorded from a healthy subject.

2 Multiple testing

In order to deal with multiple testing, we consider m pairs of hypotheses H₀ and H₁, with a priori probabilities defined by $π_{0} = P (H_{0})$ and $π_{1} = P (H_{1}) = 1 - π_{0}$ . Each pair is put through a hypothesis test that returns a p-value p_j for $j = 1, \dots, m$ which is assumed to be uniformly distributed under H₀.

Let us also consider the probability of false discoveries, called Bayes FDR¹⁶

FDR (γ) = P (H_{0} | p_{j} \leq γ) = \frac{P (p_{j} \leq γ | H_{0}) P (H_{0})}{P (p_{j} \leq γ)} = \frac{γ π_{0}}{F (γ)}

where γ and F represent a suitable threshold for the p-values and their cdf, respectively. More precisely, we represent the cdf of the p-values as a mixture of the cdf under H₀ and the cdf under H₁¹⁶ as

\begin{matrix} F (γ) = P (p_{j} \leq γ) = P (p_{j} \leq γ | H_{0}) P (H_{0}) + P (p_{j} \leq γ | H_{1}) P (H_{1}) = γ π_{0} + P (p_{j} \leq γ | H_{1}) π_{1} \end{matrix}

Let us define the AP as the probability of rejecting the null hypothesis when the alternative is true, which can be calculated through the integral of the power function weighted by the prior distribution.¹⁷ Here, we introduce the APF, in which the rejection region is made to vary with the threshold γ. We define the APF as

\begin{matrix} APF (γ) = P (p_{j} \leq γ | H_{1}) = \frac{P (H_{1} | p_{j} \leq γ) P (p_{j} \leq γ)}{P (H_{1})} \\ = \frac{[1 - FDR (γ)] F (γ)}{1 - π_{0}} = \frac{F (γ) - γ π_{0}}{1 - π_{0}} \end{matrix}

We propose the following estimates

\begin{matrix} \hat{FDR} (γ) = \frac{γ π_{0}}{\hat{F} (γ)}, \\ \hat{APF} (γ) = \frac{\hat{F} (γ) - γ π_{0}}{1 - π_{0}} \end{matrix}

for which the choice of the p-values' empirical cdf

\hat{F} (γ) = \frac{\# \{p_{j} \leq γ\}}{m}

leads to the expected values

\begin{matrix} E [\hat{FDR} (γ)] \geq \frac{γ π_{0}}{E [\hat{F} (γ)]} = FDR (γ), \\ E [\hat{APF} (γ)] = \frac{E [\hat{F} (γ)] - γ π_{0}}{1 - π_{0}} = APF (γ) \end{matrix}

due to Jensen's inequality and

E [\hat{F} (γ)] = F (γ) = γ π_{0} + APF (γ) π_{1}

It is worth noting that the FDR estimate is conservative while the APF estimate is unbiased.
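As an illustration, the two plug-in estimates above can be sketched in a few lines of Python for the case in which $π_{0}$ is treated as known. The function names (`ecdf`, `fdr_hat`, `apf_hat`) and the toy p-values are assumptions made for this example only; returning NaN when no p-value falls below γ is also a convention chosen here, not part of the method.

```python
import math
from bisect import bisect_right


def ecdf(pvalues, gamma):
    """Empirical cdf of the p-values at gamma: #{p_j <= gamma} / m."""
    ordered = sorted(pvalues)
    return bisect_right(ordered, gamma) / len(ordered)


def fdr_hat(pvalues, gamma, pi0):
    """Plug-in Bayes FDR estimate: gamma * pi0 / F_hat(gamma)."""
    f = ecdf(pvalues, gamma)
    if f == 0.0:
        return math.nan  # ratio undefined when nothing is rejected (convention assumed here)
    return gamma * pi0 / f


def apf_hat(pvalues, gamma, pi0):
    """Plug-in APF estimate: (F_hat(gamma) - gamma * pi0) / (1 - pi0)."""
    return (ecdf(pvalues, gamma) - gamma * pi0) / (1.0 - pi0)


# Toy p-values, invented for illustration only.
pvalues = [0.001, 0.004, 0.02, 0.03, 0.2, 0.4, 0.5, 0.6, 0.8, 0.9]
print(fdr_hat(pvalues, gamma=0.05, pi0=0.5))
print(apf_hat(pvalues, gamma=0.05, pi0=0.5))
```

With these toy values, four of the ten p-values fall below γ = 0.05, so the empirical cdf is 0.4 and both estimates follow directly from the formulas.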

Information regarding the a priori probability $π_{0}$ can be acquired empirically from the data. This approach defines a conservative estimate of the a priori probability $π_{0}$ as shown by Storey¹⁵

{\hat{π}}_{0} (λ) = \frac{\# \{p_{j} > λ\}}{m (1 - λ)} = \frac{1 - \hat{F} (λ)}{1 - λ}

whose expected value depends on λ and is defined as

E [{\hat{π}}_{0} (λ)] = \frac{1 - F (λ)}{1 - λ} = π_{0} + \frac{1 - APF (λ)}{1 - λ} π_{1} \geq π_{0}

which is obtained through

E [1 - \hat{F} (λ)] = 1 - F (λ) = (1 - λ) π_{0} + [1 - APF (λ)] π_{1}
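The estimator of $π_{0}$ above amounts to counting the p-values that exceed λ and rescaling by the width of the interval $(λ, 1]$. A minimal sketch follows; the function name `pi0_hat` and the toy p-values are assumptions made for illustration.

```python
def pi0_hat(pvalues, lam):
    """Conservative estimate of pi0: #{p_j > lam} / (m * (1 - lam))."""
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    m = len(pvalues)
    return sum(p > lam for p in pvalues) / (m * (1.0 - lam))


# Toy p-values, invented for illustration only.
pvalues = [0.001, 0.004, 0.02, 0.03, 0.2, 0.4, 0.5, 0.6, 0.8, 0.9]
print(pi0_hat(pvalues, lam=0.5))
```

Here three of the ten p-values are strictly larger than λ = 0.5, which gives 3 / (10 × 0.5) = 0.6. Note that the estimate is not constrained to lie below 1.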

Therefore, the empirical Bayes estimates of FDR and APF are

\begin{matrix} \hat{FDR} (γ) = \frac{γ {\hat{π}}_{0} (λ_{1})}{\hat{F} (γ)}, \\ \hat{APF} (γ) = \frac{\hat{F} (γ) - γ {\hat{π}}_{0} (λ_{2})}{1 - {\hat{π}}_{0} (λ_{2})} \end{matrix}

where $λ_{1}$ and $λ_{2}$ are two suitable values of the tuning parameter λ which can be selected in order to optimise the performance of the estimates. To derive the optimal value of λ for each estimate, we resample the m p-values with replacement B times, calculate the bootstrap versions of ${\hat{π}}_{0} (λ)$ over a range of λ values (e.g. from 0 to 1 with step 0.05) and minimise the bootstrap estimate of the corresponding mean squared error (MSE), defined as

\begin{matrix} \widehat{MSE}_{FDR} (λ_{1}) = \frac{1}{B} \sum_{b = 1}^{B} [{FDR}_{λ_{1}}^{b} (γ) - {FDR}_{λ_{1}}^{plg} (γ)]^{2}, \\ \widehat{MSE}_{APF} (λ_{2}) = \frac{1}{B} \sum_{b = 1}^{B} [{APF}_{λ_{2}}^{b} (γ) - {APF}_{λ_{2}}^{plg} (γ)]^{2} \end{matrix}

where plg denotes the plug-in estimate of the unknown parameters.¹⁹ The reader is referred to Sala et al.¹¹ and Section 9 of Storey¹⁵ for further details on the optimal choice of λ.
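The bootstrap selection of λ described above can be sketched as follows. This is one possible reading of the MSE formulas, not the authors' implementation: the plug-in estimate is assumed to be the estimator evaluated on the original sample at the same λ, and other plug-in choices are possible. All function names, the grid of λ values, the number of resamples and the synthetic p-values are assumptions made for the example.

```python
import math
import random


def ecdf(p, gamma):
    return sum(x <= gamma for x in p) / len(p)


def pi0_hat(p, lam):
    return sum(x > lam for x in p) / (len(p) * (1.0 - lam))


def fdr_hat(p, gamma, lam):
    f = ecdf(p, gamma)
    return gamma * pi0_hat(p, lam) / f if f > 0 else math.nan


def apf_hat(p, gamma, lam):
    pi0 = pi0_hat(p, lam)
    return (ecdf(p, gamma) - gamma * pi0) / (1.0 - pi0) if pi0 < 1 else math.nan


def choose_lambda(pvalues, gamma, estimator, lambdas, B=200, seed=0):
    """Return the (lambda, MSE) pair minimising the bootstrap MSE of `estimator`."""
    rng = random.Random(seed)
    m = len(pvalues)
    # Draw the B bootstrap samples once and reuse them for every lambda.
    boots = [rng.choices(pvalues, k=m) for _ in range(B)]
    best_lam, best_mse = None, math.inf
    for lam in lambdas:
        plug_in = estimator(pvalues, gamma, lam)  # assumed plug-in: same lambda, original sample
        values = [estimator(b, gamma, lam) for b in boots]
        if math.isnan(plug_in) or any(math.isnan(v) for v in values):
            continue  # the estimate is undefined for this lambda; skip it
        mse = sum((v - plug_in) ** 2 for v in values) / B
        if mse < best_mse:
            best_lam, best_mse = lam, mse
    return best_lam, best_mse


# Synthetic p-values: 150 uniform nulls and 350 small p-values (illustration only).
rng = random.Random(1)
pvals = [rng.random() for _ in range(150)] + [rng.random() * 0.01 for _ in range(350)]
grid = [i / 20 for i in range(1, 19)]  # 0.05, 0.10, ..., 0.90
print(choose_lambda(pvals, 0.01, fdr_hat, grid, B=100))
print(choose_lambda(pvals, 0.01, apf_hat, grid, B=100))
```

The grid deliberately excludes λ = 0, where the estimate of $π_{0}$ equals 1 and the APF estimate is undefined.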

The optimal values of λ for the two estimates allow us to construct the one-sided $(1 - α)$ -confidence intervals for the FDR and APF parameters by taking, respectively, the $(1 - α)$ -quantile of the ${\hat{FDR}}_{λ_{1}} (γ)$ bootstrap distribution as the upper confidence bound, and the α-quantile of the ${\hat{APF}}_{λ_{2}} (γ)$ bootstrap distribution as the lower confidence bound of the corresponding parameters.

The use of confidence intervals instead of point estimates allows us to obtain a more informative, robust and conservative procedure.

Since it is not sufficient to control the FDR alone, we propose a flexible approach to balance the two types of error rate. The trade-off can be made by first choosing the value of α for both the $(1 - α)$ -quantile of the bootstrap distribution of FDR and the α-quantile of the bootstrap distribution of APF, then evaluating these quantities over the whole range of γ, and finally identifying a suitable threshold γ such that, first, the FDR is low and, second, the APF is reasonably high (both at the $1 - α$ level). This flexibility in the choice of a threshold is in line with the recommendations set out in the recent statement of the American Statistical Association (ASA) on p-values.²⁰
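A rough sketch of the bootstrap bounds and of the resulting threshold screening is given below. It assumes that $λ_{1}$ and $λ_{2}$ have already been chosen. The quantile convention, the worst-case values used when an estimate is undefined, the function names and the synthetic p-values are all assumptions made for this example. The default limits of 10% for the FDR and 50% for the APF mirror those used later in the application.

```python
import math
import random


def ecdf(p, gamma):
    return sum(x <= gamma for x in p) / len(p)


def pi0_hat(p, lam):
    return sum(x > lam for x in p) / (len(p) * (1.0 - lam))


def quantile(values, q):
    """Empirical q-quantile (inverse-cdf convention; other conventions exist)."""
    ordered = sorted(values)
    k = min(len(ordered) - 1, max(0, math.ceil(q * len(ordered)) - 1))
    return ordered[k]


def bootstrap_bounds(pvalues, gamma, lam1, lam2, alpha=0.05, B=300, seed=0):
    """Upper bound for the FDR and lower bound for the APF at threshold gamma."""
    rng = random.Random(seed)
    m = len(pvalues)
    fdr, apf = [], []
    for _ in range(B):
        boot = rng.choices(pvalues, k=m)
        f = ecdf(boot, gamma)
        pi0_fdr, pi0_apf = pi0_hat(boot, lam1), pi0_hat(boot, lam2)
        # Undefined estimates are replaced by their worst case (assumption).
        fdr.append(gamma * pi0_fdr / f if f > 0 else 1.0)
        apf.append((f - gamma * pi0_apf) / (1.0 - pi0_apf) if pi0_apf < 1 else 0.0)
    return quantile(fdr, 1.0 - alpha), quantile(apf, alpha)


def admissible_thresholds(pvalues, gammas, lam1, lam2,
                          max_fdr=0.10, min_apf=0.50, **kwargs):
    """Keep the thresholds whose bounds satisfy both requirements."""
    kept = []
    for gamma in gammas:
        upper, lower = bootstrap_bounds(pvalues, gamma, lam1, lam2, **kwargs)
        if upper <= max_fdr and lower >= min_apf:
            kept.append((gamma, upper, lower))
    return kept


# Synthetic p-values: 300 uniform nulls and 700 small p-values (illustration only).
rng = random.Random(1)
pvals = [rng.random() for _ in range(300)] + [rng.random() * 0.01 for _ in range(700)]
for row in admissible_thresholds(pvals, [0.001, 0.01, 0.05], lam1=0.5, lam2=0.5, B=100):
    print(row)
```

The screening only lists the admissible thresholds; the final choice among them is left to the researcher, as in the text.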

2.1 Asymptotics

In order to prove that, for a large number m of p-values, the limiting behaviour of the proposed estimators is essentially the same, both under the assumption of independence of p-values and under a suitable assumption of weak dependence, we start by noting that ${\hat{π}}_{0}, \hat{FDR}$ and $\hat{APF}$ can be expressed as continuous functions of $\hat{F}$ and that continuous functions preserve almost sure convergences (according to the continuous mapping theorem).

More precisely, in case of stationary associated p-values, if

\begin{matrix} \sum_{i = 1}^{m} Cov (p_{i}, p_{m}) = o (m) \quad \text{for } m \to \infty \end{matrix}

then

\sup_{γ} | \hat{F} (γ) - F (γ) | \xrightarrow{a.s.} 0

by a result of Yu.²¹ Hence, through the continuous mapping theorem, we obtain

\begin{matrix} {\hat{π}}_{0} (λ) = \frac{1 - \hat{F} (λ)}{1 - λ} \xrightarrow{a.s.} \frac{1 - F (λ)}{1 - λ} = π_{0} [1 + \frac{1 - APF (λ)}{1 - λ} \frac{π_{1}}{π_{0}}] \geq π_{0}, \\ \hat{FDR} (γ) = \frac{γ {\hat{π}}_{0} (λ)}{\hat{F} (γ)} \xrightarrow{a.s.} \frac{γ π_{0}}{F (γ)} [1 + \frac{1 - APF (λ)}{1 - λ} \frac{π_{1}}{π_{0}}] \\ = FDR (γ) [1 + \frac{1 - APF (λ)}{1 - λ} \frac{π_{1}}{π_{0}}] \geq FDR (γ) \end{matrix}

and

\begin{matrix} \hat{APF} (γ) = \frac{\hat{F} (γ) - γ {\hat{π}}_{0} (λ)}{1 - {\hat{π}}_{0} (λ)} \xrightarrow{a.s.} \frac{F (γ) - γ π_{0} [1 + \frac{1 - APF (λ)}{1 - λ} \frac{π_{1}}{π_{0}}]}{1 - π_{0} [1 + \frac{1 - APF (λ)}{1 - λ} \frac{π_{1}}{π_{0}}]} \\ = \frac{APF (γ) - γ \frac{1 - APF (λ)}{1 - λ}}{1 - \frac{1 - APF (λ)}{1 - λ}} = \frac{1 - λ}{APF (λ) - λ} APF (γ) - γ \frac{1 - APF (λ)}{APF (λ) - λ} \end{matrix}

since

\hat{F} (γ) \xrightarrow{a.s.} F (γ)

and

APF (λ) = \frac{F (λ) - λ π_{0}}{1 - π_{0}}

In particular, when λ is chosen so that $APF (λ)$ is close to 1 (e.g. if λ is near to 1), the empirical Bayes estimates asymptotically approximate the actual values of the corresponding parameters.
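The limiting expressions derived in this section can be evaluated numerically to see how quickly they approach the true parameters as $APF (λ)$ tends to 1. The sketch below simply codes the two limits; the function names and the numerical inputs are assumptions chosen for illustration.

```python
def apf_limit(apf_gamma, apf_lambda, gamma, lam):
    """Almost-sure limit of the empirical Bayes APF estimate (requires APF(lam) > lam)."""
    return ((1.0 - lam) * apf_gamma - gamma * (1.0 - apf_lambda)) / (apf_lambda - lam)


def fdr_inflation(apf_lambda, lam, pi0):
    """Factor multiplying FDR(gamma) in the limit of the empirical Bayes FDR estimate."""
    return 1.0 + (1.0 - apf_lambda) / (1.0 - lam) * (1.0 - pi0) / pi0


# Illustrative inputs: APF(gamma) = 0.6, gamma = 0.05, lambda = 0.5, pi0 = 0.3.
for apf_lambda in (0.90, 0.95, 0.99, 1.0):
    print(apf_lambda,
          apf_limit(0.6, apf_lambda, gamma=0.05, lam=0.5),
          fdr_inflation(apf_lambda, lam=0.5, pi0=0.3))
```

When $APF (λ) = 1$ the first function returns $APF (γ)$ and the second returns 1, so both limits coincide with the true values.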

As a final point, we may observe that, in the case of independent and identically distributed p-values, the almost sure convergence of the empirical cdf follows directly from the strong law of large numbers and implies the almost sure convergence of the proposed estimators by means of the continuous mapping theorem.

2.2 The case of nonparametric independence testing

Let us consider the case of m pairs of hypotheses H₀: independence vs. H₁: dependence, with a priori probabilities defined by $π_{0} = P (H_{0})$ and $π_{1} = P (H_{1}) = 1 - π_{0}$ . We apply the approach introduced in Section 2 to this scenario, where dependence is measured through Spearman's test statistic, which is

t_{j} = \frac{r_{j}}{\sqrt{1 - r_{j}^{2}}} \sqrt{n - 2}

where n is the number of sampled points in the time series, r_j represents the Spearman's rank correlation coefficient and $t_{j}, j = 1, \dots, m$ are approximately distributed under the null hypothesis as a Student's t with $n - 2$ degrees of freedom. The corresponding m p-values are approximated by

p_{j} = 2 - 2 F_{n - 2} (| t_{j} |)

where $F_{n - 2}$ is the Student's t cumulative distribution function (cdf) with $n - 2$ degrees of freedom. A threshold γ on the p-values corresponds to a threshold τ on the test statistics.
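For a single pair of time series, the computation of r_j, t_j and p_j can be sketched with the Python standard library alone. This is an illustrative implementation, not the one used in the study: ties receive average ranks, and the Student's t cdf is obtained by numerical integration (Simpson's rule after the substitution $t = \sqrt{n - 2} \tan φ$), which is a choice made here to avoid external dependencies. All function names and the toy series are assumptions.

```python
import math


def ranks(x):
    """Average ranks: tied values share the mean of their positions."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r


def pearson(a, b):
    """Pearson correlation; raises ZeroDivisionError if a series is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


def t_cdf(t, df, steps=2000):
    """Student's t cdf at t >= 0 by Simpson's rule on phi = atan(t / sqrt(df))."""
    theta = math.atan(t / math.sqrt(df))
    const = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)
    h = theta / steps
    total = 1.0 + math.cos(theta) ** (df - 1)  # integrand at the two endpoints
    total += sum((4 if k % 2 else 2) * math.cos(k * h) ** (df - 1) for k in range(1, steps))
    return 0.5 + const * total * h / 3


def spearman_test(x, y):
    """Return (r, t, p) with t = r * sqrt(n - 2) / sqrt(1 - r^2) and p = 2 - 2 F(|t|)."""
    n = len(x)
    r = pearson(ranks(x), ranks(y))
    denom = 1.0 - r * r
    if denom <= 0.0:
        return r, math.copysign(math.inf, r), 0.0  # perfect monotone association
    t = r / math.sqrt(denom) * math.sqrt(n - 2)
    return r, t, 2.0 - 2.0 * t_cdf(abs(t), n - 2)


# Toy series, invented for illustration only.
print(spearman_test([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```

In practice a statistical library would normally be used for the t distribution; the numerical integration here only keeps the example self-contained.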

The idea of basing multiple testing procedures on correlation coefficients is common in the biological literature.²² Pearson and partial correlations are often employed by neuroscientists; however, we propose here the use of Spearman's statistics as a nonparametric alternative which could capture more general forms of dependence and further help the identification of co-activated brain areas. Nonetheless, the method can be employed with other measures of dependence; in particular, if we assume the normal distribution of the data, we could use Pearson correlations replacing the Student t distribution with the standard normal. Information regarding the a priori probability $π_{0}$ can be either acquired from previous studies or empirically from the data which is the approach we employed here in the application to fMRI and EEG data. We will always refer to the pair of hypotheses defined above in the following sections where our multiple testing procedure is applied to a simulation study and the construction of fMRI and HD-EEG brain networks.

3 Simulation study

We performed a Monte Carlo (MC) simulation study to assess the performance of the proposed APF estimator together with the FDR estimator. Multiple tests of the form $H_{0}: μ_{0} = 0; H_{1}: μ_{1} = 2$ were simulated on the Spearman's test statistic by using its asymptotic normal distribution with $σ^{2} = 1$ . We defined one hundred tests with $π_{0} = 0.3$ as the proportion of true null hypotheses (treated as known) and $π_{1} = 1 - π_{0}$ ; this corresponds to a graph with 100 nodes and 7000 edges. The tests were simulated B = 1000 times by drawing from a multivariate normal distribution with parameters $\underline{μ} = (0_{1}, \dots, 0_{30}, 2_{31}, \dots, 2_{100})$ and $Σ = I_{100}$ . For each test, the p-value at the b-th iteration is defined as $p_{i, b} = P \{N (0, 1) \geq z_{i}\}$ , where z_i is the i-th observed value of the vector $\underline{z}$ drawn from $N_{100} (\underline{μ}, Σ)$ .
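Under this design the true values of APF and FDR are available in closed form, since a one-sided p-value below γ is equivalent to z exceeding the $(1 - γ)$ normal quantile. The following sketch evaluates them and draws one replicate of independent p-values; the function names are assumptions made for illustration, and the default parameters are those stated above.

```python
import random
from statistics import NormalDist

Z = NormalDist()  # standard normal


def apf_true(gamma, mu1=2.0):
    """P(p <= gamma | H1) when z ~ N(mu1, 1) and p = P{N(0, 1) >= z}."""
    return 1.0 - Z.cdf(Z.inv_cdf(1.0 - gamma) - mu1)


def fdr_true(gamma, pi0=0.3, mu1=2.0):
    """Bayes FDR: gamma * pi0 / F(gamma), with F the mixture cdf of the p-values."""
    f = gamma * pi0 + apf_true(gamma, mu1) * (1.0 - pi0)
    return gamma * pi0 / f


def simulate_pvalues(rng, m0=30, m1=70, mu1=2.0):
    """One replicate of independent tests: m0 true nulls followed by m1 alternatives."""
    z = [rng.gauss(0.0, 1.0) for _ in range(m0)] + [rng.gauss(mu1, 1.0) for _ in range(m1)]
    return [1.0 - Z.cdf(v) for v in z]


pvals = simulate_pvalues(random.Random(0))
print(len(pvals), apf_true(0.05), fdr_true(0.05))
```

The dependent scenarios would require correlated draws instead of the independent ones used in `simulate_pvalues`.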

The multivariate normal distribution allows us to study the performance of the estimators when the independence between tests is violated; by modifying the correlation structure in Σ, we employed typical forms of spatial dependence, namely the first-order autoregressive structure $ρ^{| d |}$ ²³ and the Matérn class of covariance functions $C_{v} (d)$ for $v \in \{1 / 2, \infty\}$ and $ρ \in \{0.2, 0.4, 0.7\}$

\begin{matrix} C_{v} (d) = σ^{2} \frac{2^{1 - v}}{Γ (v)} (\sqrt{2 v} \frac{d}{ρ})^{v} K_{v} (\sqrt{2 v} \frac{d}{ρ}) \end{matrix}

where d is the absolute distance between two tests, Γ is the gamma function and K_v is the modified Bessel function of the second kind.²⁴ To assess the overall performance of the estimators, we computed the MC bias and MSE as follows

\begin{matrix} \widehat{Bias}_{APF} (γ) = \frac{1}{B} \sum_{b = 1}^{B} \hat{APF}^{* b} (γ) - APF (γ), \\ \widehat{MSE}_{APF} (γ) = \frac{1}{B} \sum_{b = 1}^{B} (\hat{APF}^{* b} (γ) - APF (γ))^{2} \end{matrix}

and similarly for the FDR. Monte Carlo Bias and MSE were reported for a sensible set of γ values in Table 1.
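For the independent case, the MC bias and MSE of the APF estimate can be reproduced in outline as below. This is a sketch only: it treats $π_{0}$ as known, uses independent draws, and relies on an arbitrary seed, so the figures it prints will not match Table 1 exactly. The function name `mc_bias_mse` is an assumption made for illustration.

```python
import random
from statistics import NormalDist

Z = NormalDist()  # standard normal


def mc_bias_mse(gamma, B=1000, m0=30, m1=70, mu1=2.0, seed=0):
    """Monte Carlo bias and MSE of the APF estimate under independent tests."""
    rng = random.Random(seed)
    pi0 = m0 / (m0 + m1)  # treated as known
    truth = 1.0 - Z.cdf(Z.inv_cdf(1.0 - gamma) - mu1)  # true APF(gamma)
    errors = []
    for _ in range(B):
        z = [rng.gauss(0.0, 1.0) for _ in range(m0)] + [rng.gauss(mu1, 1.0) for _ in range(m1)]
        p = [1.0 - Z.cdf(v) for v in z]
        f_hat = sum(x <= gamma for x in p) / len(p)  # empirical cdf at gamma
        estimate = (f_hat - gamma * pi0) / (1.0 - pi0)
        errors.append(estimate - truth)
    bias = sum(errors) / B
    mse = sum(e * e for e in errors) / B
    return bias, mse


for gamma in (0.01, 0.1, 0.2):
    print(gamma, mc_bias_mse(gamma, B=500))
```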

Table 1.

Monte Carlo Bias and MSE for a sensible range of γ values, different covariance functions and correlation intensities.

Monte Carlo Bias (MSE) $[\times 10^{- 5}]$
Cov function	Estimator	$γ = 0.0001$	$γ = 0.001$	$γ = 0.01$	$γ = 0.1$	$γ = 0.2$
Independence	FDR	31.4 (0.08)	34.6 (0.22)	33.7 (0.43)	28.2 (1.53)	50.5 (2.28)
Independence	APF	176.6 (52.6)	133.7 (177.2)	−106.6 (335.4)	−10.2 (327.6)	−276.0 (254.5)
$ρ^{d}; ρ = 0.2$	FDR	34.9 (0.09)	47.2 (0.43)	45.7 (0.64)	46.5 (1.92)	54.0 (2.82)
$ρ^{d}; ρ = 0.2$	APF	225.1 (61.0)	149.5 (218.6)	−139.5 (441.9)	−183.1 (408.2)	−251.6 (313.6)
$C_{1 / 2} (d); ρ = 0.4$	FDR	32.7 (0.08)	40.5 (0.31)	38.4 (0.52)	34.2 (1.67)	52.9 (2.49)
$C_{1 / 2} (d); ρ = 0.4$	APF	202.3 (55.3)	118.0 (194.9)	−132.3 (370.8)	−63.1 (356.5)	−277.4 (278.1)
$C_{\infty} (d); ρ = 0.4$	FDR	31.7 (0.08)	38.1 (0.30)	35.7 (0.47)	27.6 (1.60)	53.7 (2.32)
$C_{\infty} (d); ρ = 0.4$	APF	199.4 (55.0)	160.8 (187.5)	−108.1 (353.9)	18.3 (343.2)	−306.0 (259.7)
$ρ^{d}; ρ = 0.4$	FDR	40.6 (0.10)	62.9 (0.62)	61.7 (1.00)	63.2 (2.59)	53.4 (3.84)
$ρ^{d}; ρ = 0.4$	APF	283.7 (76.9)	120.9 (268.8)	−63.8 (613.4)	−271.7 (529.1)	−127.4 (423.8)
$C_{1 / 2} (d); ρ = 0.7$	FDR	35.8 (0.09)	50.2 (0.46)	47.6 (0.67)	51.6 (2.02)	55.1 (3.0)
$C_{1 / 2} (d); ρ = 0.7$	APF	246.6 (63.7)	132.3 (225.8)	−126.6 (468.5)	−235.9 (426.2)	−243.1 (333.1)
$C_{\infty} (d); ρ = 0.7$	FDR	37.2 (0.09)	53.5 (0.55)	45.4 (0.72)	52.5 (2.10)	46.2 (3.1)
$C_{\infty} (d); ρ = 0.7$	APF	236.6 (67.1)	190.9 (237.0)	−13.8 (472.2)	−225.9 (447.0)	−134.5 (342.5)
$ρ^{d}; ρ = 0.7$	FDR	60.8 (0.15)	160.0 (2.47)	151.7 (4.76)	118.3 (6.22)	122.0 (8.5)
$ρ^{d}; ρ = 0.7$	APF	386.6 (132.9)	169.5 (548.1)	−10.9 (1282)	−271.7 (1116)	−380.2 (849.6)

Note: d is the absolute distance between two nodes. Results are scaled by 10⁵. APF: average power function; FDR: false discovery rate.
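The dependence structures compared in Table 1 have simple closed forms for the two smoothness values considered in the simulation study: with the parametrisation of $C_{v} (d)$ given in Section 3, $v = 1 / 2$ reduces to the exponential covariance $σ^{2} \exp (- d / ρ)$ and $v \to \infty$ to the squared exponential covariance $σ^{2} \exp (- d^{2} / (2 ρ^{2}))$. The sketch below builds the corresponding matrices Σ. It assumes that the tests sit at consecutive integer positions, so that d is the difference between their indices; this layout and the function names are assumptions made for illustration.

```python
import math


def ar1(d, rho):
    """First-order autoregressive correlation rho^|d|."""
    return rho ** abs(d)


def matern_half(d, rho, sigma2=1.0):
    """Matern covariance with v = 1/2 (exponential form)."""
    return sigma2 * math.exp(-abs(d) / rho)


def matern_inf(d, rho, sigma2=1.0):
    """Limit of the Matern covariance for v -> infinity (squared exponential form)."""
    return sigma2 * math.exp(-d * d / (2.0 * rho * rho))


def covariance_matrix(m, cov, rho):
    """m x m matrix with entries cov(|i - j|, rho); tests assumed at integer positions."""
    return [[cov(abs(i - j), rho) for j in range(m)] for i in range(m)]


sigma = covariance_matrix(100, ar1, 0.7)
print(sigma[0][:4])
```

Drawing the correlated test statistics would additionally require a factorisation of Σ (for example a Cholesky decomposition), which is omitted here.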

4 Application

In this section we apply the combined FDR and APF approach to resting-state fMRI and HD-EEG correlation data recorded from the same healthy subject. The underlying properties of these data offer the opportunity to explore the behaviour of our method under different correlation scenarios; in fact, resting-state fMRI typically has lower correlations than resting-state HD-EEG which is affected by the well-known physical phenomenon of Volume Conduction (VC).²⁵

A 30-year-old healthy woman from the research team of the Scientific Institute Santa Maria Nascente of the Don Gnocchi Foundation (Milan, Italy) volunteered for the study. She underwent resting-state functional Magnetic Resonance Imaging (fMRI) and High Density electroencephalogram (HD-EEG) recordings. Each exam lasted 20 minutes and was recorded at the same hour of the day in a darkened room, with the subject lying in the supine position with eyes closed. She was instructed to stay alert and relaxed; no specific mental task was requested.

4.1 fMRI

The resting state fMRI was carried out at the Department of Radiology using a 1.5 T Siemens Magnetom Avanto (Erlangen, Germany) MRI scanner with an eight-channel head coil. BOLD EPI images were collected at rest for approximately eight minutes. High-resolution T1-weighted 3D scans were also collected to be used as anatomical references for fMRI data analysis. Standard pre-processing involved the following steps: motion and EPI distortion corrections, non-brain tissues removal, high-pass temporal filtering (cut-off 0.01 Hz) and artefacts removal using the FMRIB ICA-based Xnoiseifier (FIX) toolbox.²⁶

After the pre-processing, the resulting 4D dataset was aligned to the subject's high-resolution T1-weighted image, registered to MNI152 standard space and subsequently resampled to $2 \times 2 \times 2$ mm³ resolution. A total of 190 volumes were available for successive analyses. fMRI time series were then extracted as the average signal within each of 84 human functional Brodmann's Areas (BA) as regions of interest (ROIs) using the Resting-State fMRI Data Analysis Toolkit REST.²⁷

4.2 HD-EEG

The high density EEG (HD-EEG) was recorded in the Neurophysiology Lab using a BrainVision Recorder 1.20 (Brain Products GmbH, Germany) and a pre-cabled EEG recording cap equipped with 64 Ag/AgCl electrodes with FCz as the reference. Analog signals were digitised at a 500 Hz sampling rate and band-pass filtered from 0.1 to 100 Hz. Raw data were further notch filtered at 50 Hz and band-pass filtered (1–30 Hz) off-line. Before segmentation, both visual inspection and Independent Component Analysis (ICA) were used for semi-automated removal of ocular artefacts.²⁸ Data were then segmented into consecutive non-overlapping 2.5-second epochs, yielding 120 epochs available for successive analyses.

EEG time series for each ROI were obtained by first applying the standard procedures for the computation of mean spectral density. The cross-spectral matrix was used as input for sLORETA source analysis.²⁹ Source activities were combined into 84 regions of interest (ROIs). Each ROI centre was placed at the respective BA centroid and then the time series of the electric neuronal activity at the ROIs were extracted.

5 Results

Figure 1 and Table 1 report the results of the Monte Carlo simulation study. Figure 1 shows the difference between the true values of APF (blue) and their point estimates computed throughout the full γ range for different covariance functions and correlation intensities. The estimates are close to the true values of APF for the full γ range and in almost every scenario tested, the highest variability being observed with the most correlated spatial structure. Table 1 reports MC Bias and MSE of APF and FDR for a meaningful set of γ values and for different correlation patterns and intensities. All the estimates of FDR and APF show low bias and MSE. The APF bias turns out to be always conservative for γ values equivalent to the range of power most useful in applications (0.4 to 0.9). The MSE tends to grow, especially for the APF, as γ increases or the spatial correlation structure becomes stronger.

Figure 1.

True APF values (dark line) and 50 replications of their point estimates for different covariance functions and correlation intensities. Results are shown over the whole range of γ (x-axis).

5.1 fMRI and HD-EEG brain network construction

Figure 2 shows the empirical distributions of fMRI and HD-EEG correlations. As expected, both distributions are right-skewed due to the predominance of positive correlations in resting-state recordings,³⁰ with the HD-EEG data showing a far more heavy-tailed distribution due to the correlation inflation caused by the VC phenomenon. We computed the 95th percentile of the bootstrap distribution of FDR ( $95\text{th}\ \hat{FDR}^{* b}$ ) together with the 5th percentile of the bootstrap distribution of APF ( $5\text{th}\ \hat{APF}^{* b}$ ) to identify a suitable threshold τ for the construction of the fMRI and HD-EEG networks. Figure 3 shows the selected quantiles of FDR and APF over a range of τ values. Both $95\text{th}\ \hat{FDR}^{* b}$ and $5\text{th}\ \hat{APF}^{* b}$ decrease as the threshold τ increases. This allowed us to draw the trade-off between FDR and power for the two networks (Figure 4). Therefore, it was possible to balance FDR and power by considering the set of pairs $(95\text{th}\ \hat{FDR}^{* b} (γ), 5\text{th}\ \hat{APF}^{* b} (γ))$ and choosing a suitable pair. An example for both the fMRI and HD-EEG is reported in Table 2. In order to find a suitable trade-off for these data, we considered the standard experimental framework where priority is on controlling the type I error; however, we also added the power to the decision-making process. In particular, we chose τ so as to achieve an APF of at least 50% (with 95% probability) while keeping the FDR low (at most 10% with 95% probability).

Figure 2.

Empirical distributions of Spearman's r for the fMRI and HD-EEG data.

Figure 3.

95th bootstrap percentile of the FDR (upper) and 5th bootstrap percentile of the APF (lower) for a range of τ values. Bootstrap quantiles are reported for both the fMRI (line) and HD-EEG (dashed) data.

Figure 4.

Trade-off between $95\text{th}\ \hat{FDR}^{* b}$ and $5\text{th}\ \hat{APF}^{* b}$ for the fMRI (left) and HD-EEG (right, dot-dashed) networks.

Table 2.

Examples of thresholds γ on the p-values and τ on the Spearman's test statistics for the fMRI and HD-EEG networks.

Network	γ	τ	$95\text{th}\ \hat{FDR}^{* b}$	$5\text{th}\ \hat{APF}^{* b}$	Network density
fMRI	0.0331	0.154	0.103	0.503	0.22
HD-EEG (1)	0.0001	0.273	0.00003	0.512	0.46
HD-EEG (2)	0.0191	0.168	0.004	0.802	0.72

Note: The 95th bootstrap percentile of FDR and the 5th bootstrap percentile of APF are reported together with the resulting network density. Two examples of threshold are proposed for the HD-EEG network; both preserve a small FDR while returning different values of APF and network density. fMRI: Functional magnetic resonance imaging; HD-EEG: high density electroencephalogram; APF: average power function; FDR: false discovery rate.

The FDR–APF trade-off for the fMRI network (Figure 4, left) did not provide alternatives: to control both errors sensibly, we had to select a threshold τ returning estimates of FDR and APF not greater than 10% and of at least 50%, respectively, with 95% probability.

The resulting network featured nodes with high degree of connectivity in the pre-motor, temporal, parietal and occipital areas (Figure 5) as indicated by different graph measures of centrality such as degree, closeness and betweenness. These results are in line with other studies on functional brain networks.^31–33 It is worth observing that a threshold τ returning an FDR of at most 5% would not guarantee an APF of at least 50% with 95% probability (Figure 4, left).

Figure 5.

Functional brain hubs as extracted from the fMRI network obtained using the FDR-APF approach. The hubs (blue dots) and their edges (lines) are divided into central-temporal and parietal-occipital areas (left/right) and identified by the number of the corresponding Brodmann area on the circles. Hubs are defined using multiple graph measures of centrality.

On the other hand, without any a priori knowledge about the HD-EEG network, the FDR–APF trade-off allowed different reasonable choices of the threshold τ (Figure 4, right). For instance, the researcher could arguably favour a low upper bound for the FDR and settle for an APF of at least 50% with 95% probability (Table 2, HD-EEG(1)), although in this case it would be better to choose a less conservative FDR in order to gain a much more desirable lower bound for the power (Table 2, HD-EEG(2)). The reader should note the macroscopic impact these choices have on the properties of the subsequent brain networks; for instance, the resulting HD-EEG networks have density, i.e. number of links over all possible connections,³⁴ of 0.46 and 0.72, respectively (Table 2).
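The effect of the threshold on the resulting graph can be made concrete with a small sketch that binarises a correlation matrix and computes the network density as the number of links over all possible connections. Thresholding the absolute value of the correlation is an assumption made here, motivated by the two-sided p-values used in Section 2.2; the function names and the 3 × 3 matrix are invented for illustration.

```python
def adjacency(corr, tau):
    """Binary adjacency matrix: an edge wherever |r_ij| >= tau, with no self-loops."""
    n = len(corr)
    return [[1 if i != j and abs(corr[i][j]) >= tau else 0 for j in range(n)]
            for i in range(n)]


def density(adj):
    """Number of links divided by the n (n - 1) / 2 possible undirected connections."""
    n = len(adj)
    links = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    return links / (n * (n - 1) / 2)


# Toy correlation matrix, invented for illustration only.
corr = [[1.0, 0.30, 0.10],
        [0.30, 1.0, -0.20],
        [0.10, -0.20, 1.0]]
print(density(adjacency(corr, tau=0.154)))
print(density(adjacency(corr, tau=0.273)))
```

Raising τ can only remove edges, so the density is non-increasing in the threshold.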

6 Discussion

The present study addresses the problem of setting a threshold τ on the correlation coefficient between time series of brain activity recorded from several brain areas. An adjacency matrix is consequently defined for the construction of networks which are widely employed by the neuroscientific community for the study of the brain activity.

To this purpose, we proposed a new estimator of the APF and paired it with the Bayes FDR estimator¹⁶ to account for both type I error and statistical power in the choice of τ.

In the introduction, we mentioned other approaches to define functional brain networks. It is our opinion that many of them do not properly address the important problem of separating true connections from spurious ones. This is particularly important as brain functional measures are well known to be affected by several non-neural phenomena which can heavily mask the true connectivity structure.⁷ Therefore, methods such as proportional thresholding,^5,6 multiple thresholding and cost-integration,^3,8,10 and those working with fully connected graphs⁹ have the limitation of providing no direct control on the amount of error that affects their results.

Similarly, advanced methods in the neuroimaging literature for controlling the FWER, such as random field methods and permutation tests, do not provide researchers with complete information on both the type I error and the power associated with the choice of a threshold, while they come with several assumptions, high computational costs or a lack of generality.³⁵

On the contrary, functional brain networks obtained by balancing Bayes FDR and APF have the advantage of directly disclosing the level of false discovery error and power attached to each graph.

As pointed out in Sala et al.,¹¹ combining the two types of error helps the construction of reliable brain networks.

In neuroscience, as well as in many other applications, the assumption of independence between tests is unlikely to hold. Spatial dependence of multiple tests is of concern as it can affect the selection of the threshold τ. In this respect, we showed that the APF estimator substantially improves on the current literature¹¹ as we were able to prove its unbiasedness and its almost sure convergence to the actual value of the parameter, not only when assuming independence but also with weakly dependent p-values. Furthermore, we tested the behaviour of the APF and FDR estimators under different types and strengths of dependence, obtaining low Monte Carlo bias and MSE for both estimates even when moderately to highly dependent tests were considered.

Nevertheless, it is worth stressing that we explored standard forms of spatial dependence and thus our results might not be the same in other scenarios, particularly when a high level of dependence between tests is expected.

Another important point is the reproducibility of results in neuroscientific studies employing testing procedures.¹⁸ Here, we proposed the combined use of the 95% and 5% percentiles of the FDR and APF bootstrap distributions, respectively, to account for sampling variability in the choice of τ and hence return more robust results in terms of network stability than methods based on point estimates. This approach also allows a straightforward interpretation of the threshold as the 95% probability of achieving at least the power and at most the type I error controlled by the chosen pair of APF and FDR percentiles.
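The percentile-based choice described above can be illustrated with a minimal sketch. It assumes that bootstrap replicates of the FDR and APF estimates have already been computed on a grid of candidate thresholds; the estimators themselves are not reproduced here. The function name, the parameters `max_fdr` and `min_apf`, their default values, and the rule of keeping every threshold that satisfies both bounds are illustrative assumptions, not the procedure as implemented in this work.

```python
import numpy as np


def robust_threshold_candidates(taus, fdr_boot, apf_boot,
                                max_fdr=0.05, min_apf=0.80):
    """Select thresholds using bootstrap percentile bounds (illustrative).

    taus     : array of shape (T,), candidate thresholds tau
    fdr_boot : array of shape (B, T), FDR estimate for each of B bootstrap
               replicates at each threshold (assumed precomputed)
    apf_boot : array of shape (B, T), APF estimate for each replicate at
               each threshold (assumed precomputed)
    max_fdr, min_apf : hypothetical tolerances on the two error measures
    """
    taus = np.asarray(taus, dtype=float)
    fdr_boot = np.asarray(fdr_boot, dtype=float)
    apf_boot = np.asarray(apf_boot, dtype=float)

    # Upper bound on the false discovery error: 95th percentile per threshold.
    fdr_upper = np.percentile(fdr_boot, 95, axis=0)
    # Lower bound on the power: 5th percentile per threshold.
    apf_lower = np.percentile(apf_boot, 5, axis=0)

    # Assumed decision rule: keep thresholds meeting both bounds.
    keep = (fdr_upper <= max_fdr) & (apf_lower >= min_apf)
    return taus[keep], fdr_upper, apf_lower


if __name__ == "__main__":
    # Synthetic replicates, for illustration only (not data from the study).
    rng = np.random.default_rng(0)
    taus = np.linspace(0.1, 0.9, 9)
    fdr_boot = np.clip(0.3 - 0.35 * taus + rng.normal(0, 0.01, (500, 9)), 0, 1)
    apf_boot = np.clip(1.0 - 0.5 * taus + rng.normal(0, 0.01, (500, 9)), 0, 1)
    selected, _, _ = robust_threshold_candidates(taus, fdr_boot, apf_boot)
    print("Candidate thresholds:", selected)
```

When the returned set contains a single threshold, the choice is unambiguous; when it contains several, the two percentile curves can be inspected jointly to select among them.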

The application to the construction of fMRI and HD-EEG networks supported the added value of our method: when there is only one sensible choice for the threshold τ, as for the fMRI data, the FDR-APF pair informs researchers of both statistical errors they are willing to accept. Furthermore, by exploring features of the resulting network using graph measures, we were able to identify important nodes that are consistent with other works on functional networks and brain hubs.^31–33 On the other hand, when multiple choices of τ are possible, as for the HD-EEG example, the addition of the APF enables a more informed choice of the threshold than the FDR alone. Our method is not limited to fMRI and EEG networks; we believe the additional information on power helps researchers who employ multiple tests to strengthen their results.

It is worth noting that, in the case of HD-EEG, a highly dense brain network was expected as a result of the well-known phenomenon of volume conduction.²⁵ Nonetheless, there were several sensible choices of τ, some of which returned only moderately dense HD-EEG networks. In such an example, the combined use of FDR and APF proved to be a helpful tool for selecting the threshold that most effectively captures the correct density structure of the underlying phenomenon.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The study was supported by a grant of the Italian Ministry of Health, Ricerca Corrente funding program 2015–2016 [RC2015].

References

1. Friston. Functional and effective connectivity: a review. Brain Connect 2011; 1: 13–36.

2. Van den Heuvel, Hulshoff Pol. Exploring the brain network: a review on resting state fMRI functional connectivity. Eur Neuropsychopharmacol 2010; 20: 519–534.

3. Baggio, Abos, Segura, et al. Statistical inference in brain graphs using threshold-free network based statistics. Hum Brain Mapp 2018; 39: 2289–2302.

4. Rubinov, Sporns. Complex network measures of brain connectivity: uses and interpretations. NeuroImage 2010; 52: 1059–1069.

5. van den Heuvel, de Lange, Zalesky, et al. Proportional thresholding in resting-state fMRI functional connectivity networks and consequences for patient-control connectome studies: issues and recommendations. NeuroImage 2017; 152: 437–449.

6. Garrison, Scheinost, Finn, et al. The (in)stability of functional brain network measures across thresholds. NeuroImage 2015; 118: 651–661.

7. De Vico Fallani, Richiardi, Chavez, et al. Graph analysis of functional brain networks: practical issues in translational neuroscience. Philos Trans R Soc Lond B Biol Sci 2014; 369: 20130521.

8. Ginestet, Nichols, Bullmore, et al. Brain network analysis: separating cost from topology using cost integration. PLoS One 2011; 6: e21570.

9. Rubinov, Sporns. Weight-conserving characterization of complex functional brain networks. NeuroImage 2011; 56: 2068–2079.

10. Bullmore, Bassett. Brain graphs: graphical models of the human brain connectome. Annu Rev Clin Psychol 2011; 7: 113–140.

11. Sala, Quatto, Valsasina, et al. pFDR and pFNR for brain networks construction. Stat Med 2014; 33: 158–169.

12. Benjamini, Hochberg. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Series B 1995; 57: 289–300.

13. Genovese, Lazar, Nichols. Thresholding of statistical maps in functional neuroimaging using the false discovery rate. NeuroImage 2002; 15: 870–878.

14. Bennett, Wolford, Miller. The principled control of false positives in neuroimaging. Soc Cogn Affect Neurosci 2009; 4: 417–422.

15. Storey. A direct approach to false discovery rates. J R Stat Soc Series B Stat Methodol 2002; 64: 479–498.

16. Efron. Large-scale inference. Empirical Bayes methods for estimation, testing and prediction. Cambridge: Cambridge University Press, 2010.

17. Bayarri, Benjamin, Berger, et al. Rejection odds and rejection ratios: a proposal for statistical practice in testing hypotheses. J Math Psychol 2016; 72: 90–103.

18. Durnez, Roels, Moerkerke. Multiple testing in fMRI: an empirical case study on the balance between sensitivity, specificity, and stability. Biom J 2014; 56: 649–661.

19. Shao. The jackknife and bootstrap. New York: Springer, 1996.

20. Wasserstein, Lazar. The ASA's statement on p-values: context, process and purpose. Am Stat 2016; 70: 129–133.

21. A Glivenko-Cantelli lemma and weak convergence for empirical processes of associated sequences. Probab Theory Rel 1993; 95: 337–370.

22. Schäfer, Strimmer. An empirical Bayes approach to inferring large-scale gene association networks. Bioinformatics 2005; 21: 754–764.

23. Myers, Montgomery, Vining. Generalized linear models with applications in engineering and the sciences. New York, NY: Wiley, 2002.

24. Rasmussen CE, Williams CKI. Gaussian processes for machine learning. Cambridge, Massachusetts: The MIT Press, 2006.

25. Khadem, Hossein-Zadeh. Quantification of the effects of volume conduction on the EEG/MEG connectivity estimates: an index of sensitivity to brain interactions. Physiol Meas 2014; 35: 2149–2164.

26. Griffanti, Salimi-Khorshidi, Beckmann, et al. ICA-based artefact removal and accelerated fMRI acquisition for improved resting state network imaging. NeuroImage 2014; 95: 232–247.

27. Song, Dong, Long, et al. REST: a toolkit for resting-state functional magnetic resonance imaging data processing. PLoS One 2011; 6: e25031.

28. Jung, Makeig, Humphries, et al. Removing electroencephalographic artefacts by blind source separation. Psychophysiology 2000; 37: 163–178.

29. Pascual-Marqui, Michel CM, Lehmann. Low resolution electromagnetic tomography: a new method for localizing electrical activity in the brain. Int J Psychophysiol 1994; 18: 49–65.

30. Goelman, Gordon, Bonne. Maximizing negative correlations in resting-state functional connectivity MRI by time-lag. PLoS One 2014; 9: e111554.

31. Tomasi, Volkow. Functional connectivity hubs in the human brain. NeuroImage 2011; 57: 908–917.

32. Tomasi, Volkow. Association between functional connectivity hubs and brain networks. Cereb Cortex 2011; 21: 2003–2013.

33. van den Heuvel, Sporns. An anatomical substrate for integration among functional networks in human cortex. J Neurosci 2013; 33: 14489–14500.

34. Kolaczyk ED. Statistical analysis of network data. Methods and models. New York, NY: Springer Series in Statistics, 2009.

35. Nichols, Hayasaka. Controlling the familywise error rate in functional neuroimaging: a comparative review. Stat Meth Med Res 2003; 12: 419–446.