The impact of intensity variations in T1-hypointense lesions on clinical correlations in multiple sclerosis

Abstract

Background: The correlations between T1-hypointense lesion (‘black hole’) volume and clinical measures have varied widely across previous studies. The degree of hypointensity in black holes is associated with the severity of tissue damage, but the impact on the correlation with disability is unknown.

Objectives: To determine how variations in the intensity level used for lesion classification can impact clinical correlation, specifically with the Expanded Disability Status Scale (EDSS), and whether using a restricted range can improve correlation.

Methods: A highly automated image analysis procedure was applied to the scans of 24 multiple sclerosis (MS) patients with well-distributed EDSS scores to compute their black hole volumes at nine different levels of intensity relative to the reference intensities sampled in normal-appearing white matter (NAWM) and cerebrospinal fluid (CSF). Two methods of volume computation were used.

Results: The black hole volume–EDSS Spearman correlations ranged between 0.49–0.73 (first method) and 0.54–0.74 (second method). The strongest correlations were observed by only including the voxels with maximum intensities at 30–40% of the CSF to NAWM range.

Conclusions: Intensity variations can have a large impact on black hole–EDSS correlation. Restricting the measurement to a subset of the darkest voxels may yield stronger correlations.

Keywords

black hole volume computer-assisted image analysis Expanded Disability Status Scale (EDSS)intensity variations multiple sclerosis MRI Spearman rank correlation co-efficient

Introduction

Hypointense lesions on T1-weighted (T1w) magnetic resonance images (MRIs), or ‘black holes', are considered one of the most promising imaging endpoints in multiple sclerosis (MS) clinical trials.¹ Black holes are typically defined as white matter lesions with a hypointense appearance relative to normal-appearing white matter (NAWM) in a T1w image and that also appear hyperintense in the corresponding co-registered T2-weighted (T2w) image. About half of newly formed black holes are transient and will return to an isointense state, while others do not resolve and these, called persistent or chronic black holes, have been shown to reflect irreversible demyelination and axonal damage.^2,3 Black holes have been extensively studied to determine their pathological and clinical relevance,⁴ but despite the increasing knowledge on their pathological significance, their relationship to clinical features remains unclear. Specifically, the correlation between black hole volume and clinical measures, most notable the Expanded Disability Status Scale (EDSS),⁵ has been very inconsistent.^4,6 The Spearman correlation (r) values in previous cross-sectional studies have ranged from non-significant^7,8 to significant and moderately strong (e.g., r = 0.58, r = 0.65).^9,10 This inconsistency can be attributed to a number of factors. The discrepancy between the clinical presentation of MS and the disease burden visible on MRI, commonly referred to as the clinico-radiological paradox, has been well documented.¹¹ Other considerations are the size and clinical diversity of the patient group, with the strongest correlations driven primarily by studies with a combination of relapsing–remitting MS (RRMS) and secondary progressive MS (SPMS) patients.⁶ Additional factors that may affect the correlation in any given study include variations in lesion appearance due to pathological state, the MRI hardware and scanning sequence parameters, and methodological considerations for lesion classification.^2,12

A number of studies have shown that the image intensity within a black hole is related to the amount of tissue destruction, with the darkest areas reflecting the most profound, and therefore likely to be more permanent, damage. Histological evidence has shown that the degree of hypointensity within a black hole correlates to axonal density¹³ and can be used to partially distinguish between demyelinated lesions and those that have undergone subsequent remyelination.¹⁴ More recently, a tissue-specific imaging technique was used to demonstrate that chronic black holes with signal closer to cerebrospinal fluid (CSF) have lower magnetization transfer ratios, indicative of more advanced tissue damage, than those lesions with signal closer to white matter.¹⁵ It has been suggested that because of the heterogeneous nature of hypointense lesions, methods of black hole volume measurement that account for such intensity variations may produce stronger imaging surrogate biomarkers of MS progression.¹² Despite the mounting evidence that the intensity heterogeneity of black holes is pathologically meaningful, there have been very few systematic studies on how the variations in intensity may influence the correlation to clinical measures. Most previous studies on black hole volume have relied on binary segmentations that included all visibly hypointense regions within a corresponding hyperintense T2w lesion (e.g., Van Walderveen et al.,⁹ Parry et al.¹⁰). A notable exception is the study by O'Riordan et al.,⁷ who explored the effect of the degree of hypointensity on the correlation between black hole volume and EDSS by computing the lesion volume at several different intensity thresholds. They did not find a statistically significant correlation at any threshold, but the patient sample available was not optimal in that the EDSS scores were tightly clustered.

The goals of this cross-sectional study are to investigate the range of black hole volume versus EDSS correlation values that may be spanned by varying the degree of hypointensity for classifying the lesions, and to determine whether measuring the volume within a restricted intensity range can improve clinical correlation over the traditional approach of including all visibly hypointense regions. While lesions only reflect a limited aspect of MS pathology and black hole volume cannot be expected to correlate perfectly with any clinical measure, identifying the key sources of variation that affect this correlation would be useful to improve reproducibility across future studies and provide further insight into the utility of black hole volume as a surrogate biomarker. To this end, a highly automated image analysis procedure was applied to the scans of 24 patients with MS with well-distributed EDSS scores to compute their black hole volumes at nine different levels of intensity relative to the reference intensities sampled in NAWM and CSF. Two methods of volume computation, one that takes each voxel as a whole, and the other computing a fractional value based on relative intensity, were compared to explore a technique that could potentially make the optimal volume–EDSS correlation more robust to the intensity level used for lesion classification.

Materials and methods

The MRIs used in this study were acquired as part of an MS clinical trial. The scans of all patients from a single selected site who had visible black holes were included, resulting in a sample size of 24. Of these patients, 13 had RRMS status while 11 had evolved to SPMS. Table 1 summarizes the patients' age, gender, disease duration and EDSS. Even though the patients were all from one center, their EDSS scores spanned a wide range and were well distributed. The T2w, proton density-weighted (PDw) and gadolinium-enhanced T1w scans used in this study were acquired on a Philips Achieva 3T MRI scanner (Philips Healthcare, Andover, USA). The T2w and PDw scans were acquired with a dual-echo sequence using TE1 = 15.0 ms, TE2 = 75.0 ms and TR = 2700.0 ms. Contrast-enhanced T1w scans were used in order to avoid including the enhanced regions, indicative of active inflammation, in the volumetric calculations. The T1w scans were acquired with TE = 9.0 ms and TR = 616.7 ms. All of the images have dimensions 256 × 256 × 50 and voxel size 0.937 × 0.937 × 3.0 mm, with no interslice gap. Appropriate ethics approval was obtained from the institutional ethics review board for all data acquired in this study.

Table 1.

Age (years), disease duration (years), EDSS, gender and MS type statistics for the 24 patients in the current study (µ = mean, σ = standard deviation, q1–3 = quantiles of the distribution)

	µ	σ	q1	q2	q3	Range
Age	52.9	7.5	47.5	51.5	58.6	41.0–66.0
Disease Duration	24.7	5.1	20.0	24.0	29.4	17.8–34.3
EDSS	5.0	2.2	3.5	5.5	6.8	1.5–8.0
Gender	6 male, 18 female
MS Type	13 RRMS, 11 SPMS

EDSS, Expanded Disability Status Scale; RRMS, relapsing–remitting multiple sclerosis; SPMS, secondary progressive multiple sclerosis.

Lesion classification and volume computation

The lesion classification and volume computation pipeline consists of two main preprocessing steps, followed by T2w lesion segmentation, then finally extraction and volumetric quantification of the black holes. To minimize operator bias, we employ a largely automated processing procedure that only requires manual interaction for identifying the T2w lesions. The first preprocessing step is the correction of MR intensity inhomogeneity using a multi-scale version¹⁶ of the Non-parametric Non-uniform intensity Normalization (N3) method.¹⁷ In the second preprocessing step, all non-brain tissue is removed with the Brain Extraction Tool (BET).¹⁸ N3 and BET are both publicly available as free software. The rest of the segmentation procedure is diagrammed in Figure 1.

Figure 1.

Overview of the black hole segmentation process. The only step that requires manual interaction is the identification of the T2w lesions. CSF, cerebrospinal fluid; GM, grey matter; MRI, magnetic resonance imaging; NAWM, normal-appearing white matter; T1w, T1-weighted; T2w, T2-weighted.

For T2w lesion identification, two radiologists who are highly experienced in MS lesion identification place one or more seed points to mark the location and approximate extent of each lesion. The radiologists follow a set of guidelines, consisting of three main points, that is minimalistic and allows the seeding procedure to be efficient and intuitive:

Mark all distinct hyperintense white matter regions consistent with MS pathology, regardless of their size.

Place more than one point on a lesion if they feel that the additional points would help define the extent of the lesion.

Place at least one point near the center of each lesion.

Full details on the point placement and subsequent automatic segmentation of the T2w lesions are described in previous work,¹⁹ but briefly, the seed points are processed by a customized Parzen windows²⁰ classifier to estimate the intensity distribution of the lesions, and connected component and shape analyses are then used to reduce false positives. The T2w delineation component is the most complex of the entire black hole segmentation pipeline, and has the largest impact on overall accuracy and reproducibility. Fortunately, this component has been shown to agree closely and consistently to an established gold standard, and effectively reduces the variability between raters placing the seed points.¹⁹ The overall accuracy of the T2w segmentation process is approximately 80% as measured by the Dice overlap coefficient.²¹

To automatically extract the black holes from the T1w scans, each T2w scan is first registered to the corresponding T1w scan using a rigid transformation so that the T2w lesion mask can be overlaid to define a search space for the T1-hypointense regions. In some areas, most notably those with significant partial volume such as near the roof of the lateral ventricles, the CSF boundaries on the T1w scans can overlap slightly with those of the T2w lesion masks. The reasons for this may include overestimation of the T2w lesions, registration error and normal differences in CSF appearance on T1w versus T2w scans. To exclude CSF from contributing falsely to the black hole volume, a two-step CSF classification procedure is employed: (1) histogram-based thresholding²² is applied to the T1w image to form a conservative CSF mask that mostly excludes partial volume areas; (2) the CSF mask on each slice is then refined using a 2D geodesic active contour (GAC),²³ which is a boundary model that deforms according to local gradient information, in this case expanding the CSF segmentation to coincide with detectable boundaries on the T1w image. The GAC is allowed to grow into the T2w lesion mask, thereby pushing it out of the CSF, but only up to a maximum of three voxels, which was empirically determined to be the maximum extent of the conflict region between the CSF/T2w lesion masks. The three-voxel constraint limits overgrowth errors in areas with exceptionally low gradients.

To extract the black holes from the refined T2w lesion mask at a given level of intensity, relative to the intensities of NAWM and CSF which may vary from scan to scan, the following upper threshold (t_x ) is used to determine whether any given voxel x should be included:

where i _NAWM, _x is the NAWM intensity, taken as the mean of the T1w intensities of the 20 WM voxels that are closest to x on the same slice and also outside of the T2w lesion containing x, i _CSF,slice is the mean CSF intensity over the entire slice, and l is the key parameter that is varied to span the range between NAWM and CSF. This method of computing an individual threshold for every voxel is designed to maximize the use of local contrast information and also naturally excludes gadolinium-enhanced areas. Grey matter is distinguished from WM and excluded from the computation of i _NAWM, _x by applying a modified fuzzy clustering algorithm²⁴ to the T2w/PDw combined intensity space. Every voxel x that has intensity i_x ≤ t_x is counted as a black hole voxel, with volume v equal to the product of the three voxel dimensions.

To assess how closely the automatic segmentation reflects the extent of total visible black hole burden, the radiologists manually delineated the full visual extent of all focal hypointensities on the T1w scans, excluding the enhanced areas, to provide a comparison to the computed results. Each radiologist independently traced all 24 scans, and upon review of the final segmentations, it was determined that the union of the two sets of manual results would most completely represent the lesion regions and best represent the traditional approach of black hole analysis that does not model intensity variations.

After the segmentation process, the black hole volumes are computed. In addition to the method above that counts each black hole voxel as a whole, a secondary volume is computed for each voxel by using the equation:

which produces a weighted volume v_x that reflects the voxel's level of hypointensity (i.e., the closer the intensity is to CSF, the closer v_x is to v). The total black hole volume (vol _l ) for a scan at a given l and the corresponding intensity-weighted black hole volume (IWvol _l ) are then computed by summing v and v_x , respectively, for the included voxels over all slices. In this study we varied l from 0.10 to 0.90 in increments of 0.10, resulting in black holes extracted at nine different levels of intensity. Because l is used to compute the upper threshold t_x for any given voxel x, vol _l ₁ ≥vol _l ₂ and IWvol _l ₁ ≥IWvol _l ₂ for all values l1 ≥ l2. To account for normal variations in head size, the black hole volumes are normalized to the parenchymal volume of each subject to facilitate cross-sectional comparisons.

Once the volumes were computed, the Spearman rank correlations between vol _l and EDSS and between IWvol _l and EDSS were determined for each intensity level l. The similarity between the manual and automatic segmentations was quantified for each l using the Dice overlap coefficient. To provide greater context for the black hole results, we calculated a number of statistics relevant to T2w lesion volume and disease duration. Specifically, for each l we computed the percentage of the T2w lesion volume occupied by T1w lesions, the correlation between T2w lesion volume and black hole volume, and the correlation between T2w lesion volume and EDSS to determine whether the T1w lesions provide additional information to T2w lesion load and have a stronger relationship to clinical parameters as observed in previous studies. In addition, we computed the correlation between disease duration and EDSS and between disease duration and black hole volume. We also applied the Multiple Sclerosis Severity Score (MSSS),²⁵ which uses EDSS and disease duration to produce a new measure based on the distribution of disability in patients with comparable durations, and repeated our black hole correlation analysis with the MSSS.

Results

We compared the automatic and radiologist segmentations to determine the accuracy and consistency of the black hole computation method. The best overlap agreement between the automatic and union of the two manual segmentations occurs at l = 0.80 with an overall Dice similarity of 0.69, which is comparable to that between the segmentations performed by the two radiologists (0.71). In addition, the standard deviation in Dice similarity between the automatic and manual methods (0.11) is also similar to that between the radiologists (0.14).

Table 2 shows the statistics for the black hole volumes and EDSS correlations for the nine different intensity levels. From a minimum of 0.49 (p < 0.05) at l = 0.90, the correlation increases gradually as l decreases, to a peak of 0.73 (p < 0.0001) at l = 0.30. At l = 0.20, r remains approximately the same (0.72, p < 0.0001), but drops sharply to 0.54 (p < 0.01) at l = 0.10.

Table 2.

Black hole volume mean (μ _{vol_l}) and standard deviation (δ _{vol_l}) in mm³, percentage of T2w lesion volume occupied by T1w black holes (T1w/T2w %), and Spearman's correlation (r) between EDSS scores and vol _l normalized to parenchymal volume, computed at nine different intensity levels

l	μ _{vol_l}	δ _{vol_l}	T1w/T2w %	r
0.10	70.47	127.29	1.40	0.54**
0.20	145.95	225.69	2.37	0.72****
0.30	281.82	357.39	3.28	0.73****
0.40	540.60	638.22	5.02	0.72****
0.50	981.45	1110.99	7.69	0.71***
0.60	1839.12	2122.83	12.21	0.69***
0.70	3576.75	4264.59	20.73	0.62**
0.80	6520.02	7823.16	33.29	0.59**
0.90	10497.54	12464.85	51.69	0.49*

For the correlation r, *indicates p < 0.05, **indicates p < 0.01, ***indicates p < 0.001 and ****indicates p < 0.0001.

The fact that the automatic and manual methods agree most strongly at l = 0.80 means that this is the intensity level that most accurately targets all visibly hypointense regions. At this level, the EDSS correlation is 0.59 (p < 0.01). The mean volume computed at l = 0.30, the level with the highest correlation (0.73, p < 0.0001), is only 4.32% of that computed at l = 0.80, which means the maximum correlation for this data set occurs when only a very small, dark subset of the visible black hole voxels are included. Figure 2 illustrates the black holes extracted from a sample slice at l = 0.80 and l = 0.30.

Figure 2.

Examples of black holes extracted at two different intensity levels. Left: T1w image with visible black holes. Middle: black holes computed at l = 0.80 (in purple). Right: black holes computed at l = 0.30 (in purple). The lesions computed at l = 0.80 correspond most closely to expert manual segmentations, while the volumes computed at l = 0.30 have the highest correlation to Expanded Disability Status Scale over all scans in the current data set. (Online version in colour).

Table 3 shows the statistics for the intensity-weighted black hole volumes, computed using Equation 2, and the EDSS correlations for the nine different intensity levels. The maximum correlation obtained is 0.74 (p < 0.0001), which is very close to that from the non-intensity-weighted volumes, but at a slightly higher value of l (0.40 vs. 0.30).

Table 3.

Intensity-weighted black hole volume mean (μ _{lW_{vol_l}}) and standard deviation (σ _{lW_{vol_l}}) in mm³, and Spearman's correlation (r) between EDSS scores and IWvol _l normalized to parenchymal volume, computed at nine different intensity levels

l	μ _{lW_{vol_l}}	σ _{lW_{vol_l}}	r
0.10	55.74	94.17	0.54**
0.20	85.89	127.92	0.66***
0.30	133.71	173.34	0.72****
0.40	207.60	249.81	0.74****
0.50	315.51	354.36	0.73****
0.60	483.93	523.08	0.72****
0.70	761.01	811.65	0.68***
0.80	1213.14	1304.52	0.65***
0.90	1913.13	2085.42	0.62**

For the correlation r, **indicates p < 0.01, ***indicates p < 0.001 and ****indicates p < 0.0001.

Figure 3 is a graph of the volume to EDSS correlations with and without intensity weighting. The intensity-weighted correlation curve is flatter, more centered to the middle of the range, and more symmetrical, but the magnitudes of the two peaks are almost identical.

Figure 3.

Black hole (BH) volume and intensity-weighted black hole volume correlation to Expanded Disability Status Scale (EDSS) at nine different intensity levels. The intensity-weighted correlation curve is flatter, more centred to the middle of the range, and more symmetrical, but the magnitudes of the two peaks are almost identical.

The correlation between T2w lesion volume, normalized to parenchymal size, and EDSS is 0.431 (p < 0.05), which is lower than the T1w lesion to EDSS correlation at any given intensity level for either computation method. The correlation between T2w lesion load and black hole volume without intensity weighting is 0.97 (p < 0.00001) at l = 0.80, which is the intensity level producing the strongest agreement between the automatic and manual black hole segmentations. At l = 0.30, the level with the highest correlation between T1w lesion volume and EDSS, the correlation between T1w and T2w lesion loads drops to 0.70 (p < 0.001), and the black holes only occupy 3.28% of the T2w lesion volume (Table 2).

In this patient sample, there is no significant correlation between disease duration and EDSS or disease duration and black hole volume at any intensity level for either computation method. In addition, adjusting the EDSS for disease duration by using the MSSS results in very similar correlations to those produced by using the EDSS alone. However, it should be noted that four of the patients had disease durations of 31, 32, 33 and 34 years, which are longer than the valid range of 30 years for applying the MSSS, and the values had to be clamped.

Discussion

This study has yielded some intriguing results on the influence of intensity variations on the correlation between T1w hypointense lesion volume and clinical disability as measured by EDSS. The most striking observation is the large range of correlation values obtained by varying the intensity level l used for lesion classification (0.49–0.72 for vol _l , 0.54–0.74 for IWvol _l ). While the correlation values near the high end of the observed range are strong (0.7+), the well-distributed nature of the EDSS scores may be a dominant factor, and the range of correlations likely represents a more general result than the values themselves.

In addition to using a clinically diverse patient group, there may be other methodological reasons for the range in correlation values and the smooth nature of the curves shown in Figure 3. To reduce the concerns of scan-to-scan variations in lesion appearance, we used NAWM and CSF as intensity references that should capture some of the signal fluctuations. O'Riordan et al.⁷ also used NAWM and CSF to define black holes, but they worked with a much more restricted range to avoid overlapping intensities between black holes and CSF/NAWM, presumably because they did not have registered T2w lesion masks that could have been used to prevent CSF regions from being misclassified as black holes. In our case, the T2w lesion masks allowed us to explore the entire range from CSF to NAWM. However, to minimize the number of variables we made the simplifying assumption that the optimal value of l applies equally to all regions of the brain. Given that different regions can have different intensities for the same tissue type, a more refined analysis would include individual weights for each region.

It has also been suggested that the subjectivity of manual interaction may have impacted previous studies of black holes,² which tend to have much less distinct boundaries than T2w lesions. This provided motivation for our use of a largely automated method to reduce measurement noise. However, to reduce false positives stemming from the lack of specificity of conventional MRI, we did ask the radiologists to manually identify the T2w lesions. It would be of great interest to adapt some of the recent black hole classification methods^26,27 that do not require any manual processes except for the provision of expertly labeled data used to set the algorithmic parameters.

In this patient sample, the highest correlations values are associated with a small subset of the total black hole voxels (<5% for l = 0.30, <9% for l = 0.40) and have intensity values lower than or equal to 30–40% of the range from CSF to NAWM. The association of greater clinical significance with the darker regions seems to lend further support to previous studies that indicate such areas have more profound damage. The optimal value of l is likely to vary across data sets, and this single data set cannot be used to recommend a specific value. However, the differences in correlation within 20–60% of the CSF to NAWM range are small (≤0.06), and it seems promising that, given enough data, determining a useful range of values is a feasible goal, especially when using the intensity-weighted method of volume computation. Intensity weighting improves the ability to account for finer signal variations due to pathological state as well as partial volume, and makes the correlation curve more robust to the intensity level used for classification. We envision that if similar curves are observed across other data sets, the approach of targeting the darker regions would eventually replace the common strategy of including all visibly hypointense areas. This new volume measure would be used wherever total black hole burden is currently used, whether on its own or in a composite imaging score. As the correlation curve is fairly flat (Figure 3), especially when using intensity-weighted volumes, a conservative but still beneficial approach to choosing an intensity level could be to use a value close to 0.5, which may not yield the optimal correlation but the result should be close, and there would be little risk of encountering a sharp drop. However, as described below, further validation with advanced imaging, longitudinal data, larger samples and other disability measures will ultimately decide how the method would be best used in MS image analysis.

This work was motivated by histopathology studies that show the degree of T1w hypointensity is associated with the severity of tissue damage, and such evidence allows us to conjecture that the increased correlation to disability that we obtained by limiting the volume measure to the darker voxels is the result of restricting the region-of-interest to the more damaged areas, but the conventional MRIs used in this study lack the pathological specificity needed to confirm our theory. Future studies that use advanced MRI techniques, such as myelin water²⁸ and magnetization transfer^15,29 imaging, in combination with the current analysis method, would allow us to see the differences in composition and structural integrity at the various levels of hypointensity. For example, values of l at the higher end may primarily reflect inflammation and edema, while the lower values may indicate demyelination and axonal transaction.

This study only used data from a single time point, therefore only a cross-sectional analysis could be performed. Adjusting for disease duration did not significantly alter the results, but using longitudinal data may still prove valuable as demonstrated in previous studies with shorter time intervals (e.g., Van Walderveen et al.,⁹ Truyen et al.³⁰). For example, distinguishing between transient and chronic black holes, which are indicative of more permanent injury, may define more precisely the relationships between the intensity and duration of black holes. It remains to be seen whether intensity-based analysis has enough discriminative power to supplant longitudinal methods for identifying irreversible damage, but perhaps using the degree of hypointensity to further classify voxels within longitudinally identified persistent black holes would yield greater sensitivity to clinical parameters.

This study used a small patient sample relative to those used in most MS clinical trials. Larger groups of patients, perhaps segregated by disease course, with scans acquired from multiple sites, should be included in future studies. The method that we used for T2w segmentation has already been validated for large multi-center studies, and preliminary experiments have shown that the black hole extraction method can be used with little to no modification. The results could provide a practical estimate of the range of intensity levels that could be used to obtain more consistent and potentially stronger EDSS correlations in cross-sectional studies.

The measure of disability that we used for this study, namely the EDSS, despite being an extensively used clinical parameter in MS, has its own limitations. The non-linearity and heavy emphasis of the EDSS on ambulation have been well documented, and other measures that incorporate additional dimensions of disability, such as the Multiple Sclerosis Functional Composite (MSFC), are available and should be investigated.³¹

In conclusion, we have shown that a large range in correlation values between black hole volume and EDSS can be observed by varying the intensity level, relative to the reference intensities sampled in CSF and NAWM, used to classify the lesions. In this data set, the greatest similarity to expertly traced lesions was obtained by including all voxels that are only slightly hypointense (≤80% of the CSF to NAWM range), but the strongest correlations were found by limiting the black hole regions to a small subset of the darkest voxels (≤30–40% of the CSF to NAWM range). Using an intensity-weighted method of volume computation improved the robustness of the correlation to parameter selection.

Footnotes

Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Conflict of interest statement

The authors declare that they have no conflicts of interest.

References

Barkhof

Calabresi

Miller

Reingold

. Imaging outcomes for neuroprotection and repair in multiple sclerosis trials. Nat Rev Neurol 2009; 5: 256–266.

Neema

Stankiewicz

Arora

Guss

Bakshi

. MRI in multiple sclerosis: what's inside the toolbox? Neurotherapeutics 2007; 4: 602–617.

Van den Elskamp

Lembcke

Dattola

Beckmann

Pohl

Hong

. Persistent T1 hypointensity as an MRI marker for treatment efficacy in multiple sclerosis. Mult Scler 2008; 14: 764–769.

Sahraian

Radue

Haller

Kappos

. Black holes in multiple sclerosis: definition, evolution, and clinical correlations. Acta Neurol Scand 2010; 122: 1–8.

Kurtzke

. Rating neurologic impairment in multiple sclerosis: an Expanded Disability Status Scale (EDSS). Neurology 1983; 33: 1444–1452.

Naismith

Cross

. Multiple sclerosis and black holes: connecting the pixels. Arch Neurol 2005; 62: 1666–1668.

O'Riordan

Cain

Coles

Wang

Compston

Tofts

. T1 hypointense lesion load in secondary progressive multiple sclerosis: a comparison of pre versus post contrast loads and of manual versus semi automated threshold techniques for lesion segmentation. Mult Scler 1998; 4: 408–412.

Masek

Vaneckova

Krasensky

Danes

Havrdova

Hrebikova

. Secondary-progressive form of multiple sclerosis: MRI changes versus clinical status. Neuroendocrinol Lett 2008; 29: 461–466.

Van Walderveen

Truyen

van Oosten

Castelijns

Lycklama à Nijeholt

van Waesberghe

. Development of hypointense lesions on T1-weighted spin-echo magnetic resonance images in multiple sclerosis: relation to inﬂammatory activity. Arch Neurol 1999; 56: 345–351.

10.

Parry

Clare

Jenkinson

Smith

Palace

Matthews

. White matter and lesion T1 relaxation times increase in parallel and correlate with disability in multiple sclerosis. J Neurol 2002; 249: 1279–1286.

11.

Barkhof

. The clinico-radiological paradox in multiple sclerosis revisited. Curr Opin Neurol 2002; 15: 239–245.

12.

Zivadinov

Leist

. Clinical-magnetic resonance imaging correlations in multiple sclerosis. J Neuroimaging 2005; 15: 10S–21S.

13.

Van Walderveen

Kamphorst

Scheltens

van Waesberghe

Ravid

Valk

. Histopathologic correlate of hypointense lesions on T1-weighted spin-echo MRI in multiple sclerosis. Neurology 1998; 50: 1282–1288.

14.

Barkhof

Bruck

De Groot

CJA

Bergers

Hulshof

Geurts

. Remyelinated lesions in multiple sclerosis: magnetic resonance image appearance. Arch Neurol 2003; 60: 1073–1081.

15.

Riva

Ikonomidou

Ostuni

van Gelderen

Auh

Ohayon

. Tissue-speciﬁc imaging is a robust methodology to differentiate in vivo T1 black holes with advanced multiple sclerosis-induced damage. AJNR Am J Neuroradiol 2009; 30: 1394–1401.

16.

Jones

Wong

. Multi-scale application of the N3 method for intensity correction of MR images. In: Proceedings of SPIE (Medical Imaging 2002: Image Processing) 2002 1123–1129.

17.

Sled

Zijdenbos

Evans

. A nonparametric method for automatic correction of intensity nonuniformity in MRI data. IEEE Trans Med Imaging 1998; 17: 87–97.

18.

Smith

. Fast robust automated brain extraction. Hum Brain Map 2002; 17: 143–155.

19.

McAusland

Tam

Wong

Riddehough

. Optimizing the use of radiologist seed points for improved multiple sclerosis lesion segmentation. IEEE Trans Biomed Eng 2010; 57: 2689–2698.

20.

Parzen

. On estimation of a probability density function and mode. Ann Math Stat 1962; 33: 1065–1076.

21.

Dice

. Measures of the amount of ecologic association between species. Ecology 1945; 26: 297–302.

22.

Otsu

. A threshold selection method from gray level histograms. IEEE Trans Syst Man Cybern 1979; 9: 62–66.

23.

Caselles

Kimmel

Sapiro

. Geodesic active contours. Int J Comput Vision 1997; 22: 61–79.

24.

McAusland J, Wong E, Riddehough A and Li D. A modiﬁed fuzzy clustering method for modelling partial volume effects in MRI data. In: Proceedings of the International Society for Magnetic Resonance in Medicine (ISMRM) Scientiﬁc Meeting. 2004.

25.

Roxburgh

Seaman

Masterman

Hensiek

Sawcer

Vukusic

. Multiple Sclerosis Severity Score: using disability and disease duration to rate disease severity. Neurology 2005; 64: 1144–1151.

26.

Datta

Sajja

Wolinksky

Gupta

Narayana

. Segmentation and quantiﬁcation of black holes in multiple sclerosis. NeuroImage 2006; 29: 466–474.

27.

Warﬁeld

Tan

Wells

Meier

van Schijndel

. Automated segmentation of multiple sclerosis lesion subtypes with multi-channel MRI. NeuroImage 2006; 32: 1205–1215.

28.

Vavasour

DKB

Laule

Traboulsee

Moore

GRW

MacKay

. Multi-parametric MR assessment of T1 black holes in multiple sclerosis. J Neurol 2007; 254: 1653–1659.

29.

Levesque

Giacomini

Narayanan

Ribeiro

Sled

Arnold

. Quantitative magnetization transfer and myelin water imaging of the evolution of acute multiple sclerosis lesions. Magn Reson Med 2010; 63: 633–640.

30.

Truyen

Waesberghe

van Walderveen

van Oosten

Polman

Hommes

. Accumulation of hypointense lesions (“black holes”) on T1 spin-echo MRI correlates with disease progression in multiple sclerosis. Neurology 1996; 47: 1469–1476.

31.

Marrie

Goldman

. Validity of performance scales for disability assessment in multiple sclerosis. Mult Scler 2007; 13: 1176–1182.