ORIGINAL ARTICLE

Automated hippocampal volume measurement: agreement analysis between HIPS and volBrain software

Federico Biafore1,a; Jorge Docampo1,b; Germán Duca2,c

DOI: 10.1590/0100-3984.2025.0003
e20250003
Published: August 20, 2025

ABSTRACT

OBJECTIVE: To perform an agreement analysis between volBrain and HIPS software for measuring hippocampal volume and its associated asymmetry index.
MATERIALS AND METHODS: We evaluated volumetric T1-weighted magnetic resonance imaging scans from radiologically normal subjects (n = 50; age range, 25–75 years). Correlation and Bland-Altman plots were generated. The Pearson correlation coefficient (r) and the intraclass correlation coefficient of absolute agreement between volBrain and HIPS software were calculated for each measurement.
RESULTS: For each hippocampus and its combined volume, a very high correlation was found between the methods (r ≥ 0.96 for absolute values and r ≥ 0.93 for relative values), along with a systematic bias (primarily additive). Consistently, HIPS (with the Kulaga-Yoskovitz protocol) reported smaller volumes than did volBrain. The average difference ranged from 8.2% to 9.1% for absolute values and from 7.9% to 8.7% for relative values. The asymmetry index exhibited a strong correlation (r = 0.82) with no significant bias, although 14% of cases showed opposite signs. The average asymmetry index difference was 32.7%. The intraclass correlation coefficient of absolute agreement ranged from 0.61 to 0.83, reflecting moderate to good agreement overall.
CONCLUSION: Our results indicate that the two methods are not interchangeable for evaluating hippocampal volume and its associated asymmetry index.

Keywords: Hippocampus; Magnetic resonance imaging; Alzheimer disease; Image processing, computer-assisted; Reproducibility of results.

INTRODUCTION

Structural changes in the brain are often associated with various neurodegenerative or psychiatric conditions, as well as with normal aging. These changes can alter the properties of images acquired, such as intensity values on magnetic resonance imaging (MRI) and computed tomography, and may also lead to morphological variations, including changes in the volume of different brain tissues and structures(1–3). The high spatial resolution of three-dimensional (3D) images from clinical high-field (1.5-T and 3.0-T) MRI systems, combined with advances in segmentation and quantification algorithms, has driven rapid growth in this field of research, significantly impacting clinical practice, especially over the last decade. As a result, volumetric MRI analysis has demonstrated significant potential as a tool for the diagnosis and monitoring of various neurological diseases(4–9).

Hippocampal volumetry is widely used for studying and monitoring diseases such as temporal lobe epilepsy, Alzheimer’s disease, and schizophrenia. The value of these measurements as imaging biomarkers, along with related metrics like the asymmetry index, has been supported by extensive research(10–14). Although manual volumetric techniques performed by specialists are considered the gold standard, their time-consuming nature makes them impractical for routine clinical use, and achieving consistent reproducibility remains challenging. As a result, automated techniques based on probability atlases, which operate without user intervention, are increasingly used. These methods incorporate spatial information, in addition to signal intensity at the pixel or voxel level, to classify tissues and structures(1). These atlases are created from MRI studies of cohorts of healthy subjects that have been spatially aligned and intensity-normalized, ensuring a shared geometric and intensity domain across the dataset. Including a large number of subjects accounts for anatomical variability across individuals(1,5). From such atlases, segmentations and volumetric calculations can be performed automatically at different scales for a given case (once the images have been normalized to the atlas space), from macroscopic tissues to subcortical structures(1,5,7). Several free and commercially licensed software packages are currently available to perform this task(15–23). One of these is the volBrain platform (www.volbrain.org), which offers a suite of MRI volumetric analysis tools. It requires no software installation or operation by the user; the image to be analyzed is simply uploaded to the website. The platform handles image preprocessing and processing, delivering a report with the corresponding volumetric measurements in about 20 min.
Among the tools available on the platform, the volBrain module performs brain segmentation and volumetry at multiple scales from a 3D T1-weighted MRI scan, with a recommended 1-mm isotropic resolution in Neuroimaging Informatics Technology Initiative (NIfTI) format.

The volBrain report includes measurements of the total intracranial volume (TIV), total gray matter, white matter, cerebrospinal fluid, lateral ventricles, cerebellum (left and right), and volumes of subcortical structures such as the putamen, caudate, globus pallidus, thalamus, hippocampus, amygdala, and accumbens(19).

In contrast, the HIPS module is specifically designed for hippocampal volumetry, including the parcellation of the hippocampus into its substructures. This module can work with only one 3D T1-weighted (monomodal) MRI scan or by adding a 3D T2-weighted (bimodal) image(20).

The aim of this study was to perform an agreement analysis between the two modules for measuring the volume of the hippocampus and the corresponding asymmetry index using T1-weighted MRI scans from radiologically normal subjects.


MATERIALS AND METHODS

For this retrospective analysis, we used volumetric T1-weighted MRI scans from radiologically normal adults (n = 50; age range, 25–75 years) without a significant medical history. All images were acquired on 3.0-T scanners.

One of the image sets (group 1) was obtained from our institution and consists of images of patients with a history of headaches (n = 20; 10 males and 10 females; age range, 25–40 years), acquired with 3D fast spoiled gradient-echo sequences at a resolution of 1 × 1 × 1.2 mm³. All of the patients gave written informed consent before undergoing the imaging scans.

The second set of images (group 2) consists of images of cognitively normal adults (n = 30; 15 males and 15 females; age range, 43–75 years), sourced from the Open Access Series of Imaging Studies database(24). The images were acquired with 3D fast spoiled gradient-echo and 3D magnetization-prepared rapid gradient-echo sequences at resolutions of 1 × 1 × 1.2 mm³ and 1.2 × 1.05 × 1.05 mm³, respectively.

Both image sets were evaluated by a senior neuroradiologist and showed no abnormalities. Our multicentric population sample covered a wide age range, and the images were acquired in different 3D sequences commonly used in routine practice.

Image processing

The images were anonymized and converted to NIfTI format before being processed on the volBrain platform.

Preprocessing – Both modules perform an image preprocessing pipeline, which includes the application of a noise removal filter, correction for field inhomogeneity, normalization to Montreal Neurological Institute space, intensity normalization, and extraction of the intracranial cavity. The detailed description can be found in previous works(19,20).

volBrain – All segmentations, except for that of the hippocampi, are based on adaptations of probabilistic atlases and manually segmented libraries. In the specific case of the hippocampus, volBrain follows the Alzheimer’s Disease Neuroimaging Initiative harmonized protocol, which defines procedures to standardize hippocampal segmentation. The protocol was designed by the European Alzheimer’s Disease Consortium to establish consensus on hippocampal segmentation for clinical and research applications, while also serving to validate automated segmentation algorithms(19,25).

HIPS – We selected the Kulaga-Yoskovitz monomodal segmentation protocol, which divides the hippocampus into three substructures(20): CA1-3, CA4/DG, and the subiculum. The total hippocampal volume is obtained by summing the volumes of those substructures, as detailed in the corresponding report. In addition, both modules generate segmentation masks of the structures in NIfTI format, which can be merged with the MRI images for evaluation purposes.

The reports include the absolute and relative values (expressed as a percentage of the TIV) for the segmented structures, along with reference ranges of normal values. From those reports, we extracted the absolute and relative hippocampal volumes for each subject. The asymmetry index was calculated as follows:




where AI is the asymmetry index and where HR, HL, and HR + HL are the absolute values for the volumes of the right hippocampus, left hippocampus, and total hippocampus, respectively.
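
Because the formula itself is not reproduced above, the following is only a minimal sketch under stated assumptions. It assumes one common form that is consistent with the symbols just defined, AI = 100 × (HR − HL) / (HR + HL); the sign convention (right minus left) and the percentage scaling are assumptions, not taken from the source. The function names and the numeric inputs are hypothetical.

```python
def relative_volume(volume: float, tiv: float) -> float:
    """Express a structure's volume as a percentage of the total intracranial volume."""
    if tiv <= 0:
        raise ValueError("TIV must be positive")
    return 100.0 * volume / tiv


def asymmetry_index(hr: float, hl: float) -> float:
    """Asymmetry index under the ASSUMED form AI = 100 * (HR - HL) / (HR + HL).

    hr, hl: absolute right and left hippocampal volumes in the same units.
    A positive value means the right hippocampus is larger (assumed convention).
    """
    total = hr + hl
    if total <= 0:
        raise ValueError("total hippocampal volume must be positive")
    return 100.0 * (hr - hl) / total


# Illustrative values only, not study data.
example_ai = asymmetry_index(hr=4.0, hl=3.8)
example_rel = relative_volume(volume=7.5, tiv=1500.0)
```

Whatever the exact normalization, the index is a ratio of a small difference to a large sum, so modest segmentation differences in either hippocampus can change it noticeably or even flip its sign.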

Statistical analysis

Correlation and Bland-Altman scatter plots were generated for each measurement. The Pearson correlation coefficient (r) and absolute intraclass correlation coefficient (ICC) were computed to assess the degree of linear association and agreement between the two measurement methods(26,27), respectively. Because the absolute ICC considers any difference between measurements as discordance, it is a useful statistical tool to assess whether the two methods are interchangeable.
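
The distinction between association and agreement can be illustrated with a short sketch. The text does not state which ICC model was used, so the code below assumes the two-way, absolute-agreement, single-measurement form, often written ICC(A,1), for two methods; the function names and the toy volumes are hypothetical and are not study data.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between paired measurements."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def icc_absolute_agreement(x, y):
    """ICC(A,1) for two methods (assumed model: two-way, absolute agreement)."""
    n, k = len(x), 2
    values = list(x) + list(y)
    grand = mean(values)
    row_means = [(a + b) / 2 for a, b in zip(x, y)]   # one row per subject
    col_means = [mean(x), mean(y)]                    # one column per method
    ss_rows = k * sum((r - grand) ** 2 for r in row_means)
    ss_cols = n * sum((c - grand) ** 2 for c in col_means)
    ss_total = sum((v - grand) ** 2 for v in values)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                 # between-subject mean square
    msc = ss_cols / (k - 1)                 # between-method mean square
    mse = ss_err / ((n - 1) * (k - 1))      # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Toy data: the second method reads a constant 0.3 units lower than the first.
method_a = [4.0, 3.5, 4.2, 3.8, 3.1]
method_b = [v - 0.3 for v in method_a]
r = pearson_r(method_a, method_b)
icc = icc_absolute_agreement(method_a, method_b)
```

In this toy case the correlation is essentially perfect, whereas the absolute-agreement ICC is about 0.81, because the between-method mean square enters the denominator. A purely additive offset therefore lowers the ICC without affecting r.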

The Shapiro-Wilk normality test was conducted for each variable and its corresponding differences to ensure the proper application of statistical tools. The analysis was performed with the Excel-based XLSTAT package (Addinsoft, New York, NY, USA).


RESULTS

The mean values for the two methods and the agreement analysis parameters are presented in Tables 1 and 2, respectively. Figures 1 and 2 display the results of the correlation and Bland-Altman analyses for absolute values and the asymmetry index. The two methods showed very high linear correlations for the volumes of the right hippocampus, left hippocampus, and total hippocampus, for both absolute values (r ≥ 0.96) and relative values (r ≥ 0.93), as indicated in Table 2. As illustrated in Figures 1A, 1B, and 2A, respectively, the data for those three variables lie below the identity line (diagonal), indicating that the HIPS measurements are consistently lower than those from the volBrain module. The same pattern is evident in the Bland-Altman plots in Figures 1C and 1D. Likewise, Table 2 shows that the mean differences (biases) for absolute and relative values are less than zero, with their respective confidence intervals excluding zero (the line of equality between measurements). These results indicate a systematic bias between the two modules for those three variables. Using the mean of both methods as the best estimate of the true value, the bias can be expressed as a percentage of that mean for each estimator (bias%, Table 2). The difference between the two methods ranges, on average, from 8.2% to 9.1% for absolute values and from 7.9% to 8.7% for relative values. Figure 3 shows an example of segmentation differences between the two methods for the right hippocampus of the same subject.
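
The Bland-Altman quantities discussed above can be sketched as follows. This is a hedged illustration, not the XLSTAT procedure: the difference is taken as HIPS minus volBrain (consistent with the negative biases reported), the confidence interval uses a normal approximation with 1.96 rather than a t quantile, and the function name and input volumes are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(hips, volbrain):
    """Bias, approximate 95% CI of the bias, limits of agreement, and bias%."""
    diffs = [h - v for h, v in zip(hips, volbrain)]            # HIPS minus volBrain
    pair_means = [(h + v) / 2 for h, v in zip(hips, volbrain)]
    n = len(diffs)
    bias = mean(diffs)
    sd = stdev(diffs)                       # sample standard deviation of differences
    half_width = 1.96 * sd / n ** 0.5       # normal approximation (assumption)
    return {
        "bias": bias,
        "bias_ci": (bias - half_width, bias + half_width),
        "limits_of_agreement": (bias - 1.96 * sd, bias + 1.96 * sd),
        # Bias relative to the mean of both methods, in percent.
        "bias_pct": 100.0 * bias / mean(pair_means),
    }


# Illustrative volumes only, not study data.
result = bland_altman(hips=[3.6, 3.2, 3.9, 3.5], volbrain=[4.0, 3.5, 4.2, 3.8])
```

A bias whose confidence interval excludes zero, as in this toy example, is the pattern described above for the three volumetric variables.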

For the asymmetry index (Figure 2B), the correlation between the two methods was strong (r = 0.82), albeit slightly weaker than that for the other three measurements. No significant bias was observed in the data distribution. The graph also shows that in seven cases (14% of the sample), the asymmetry indices from the two methods had opposite signs. The Bland-Altman plot (Figure 2D) and Table 2 show that the bias (0.89) is close to zero, with the confidence interval encompassing zero. Expressed as bias%, the difference between the two methods is, on average, 32.7%.
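
Counting sign disagreements is straightforward; a minimal sketch follows. The function name is hypothetical, and the synthetic lists only reproduce the reported proportion (seven of 50 pairs), not the actual index values.

```python
def opposite_sign_fraction(ai_hips, ai_volbrain):
    """Fraction of subjects whose asymmetry indices have opposite signs."""
    if len(ai_hips) != len(ai_volbrain) or not ai_hips:
        raise ValueError("inputs must be non-empty and of equal length")
    # A negative product means one index is positive and the other negative.
    opposite = sum(1 for a, b in zip(ai_hips, ai_volbrain) if a * b < 0)
    return opposite / len(ai_hips)


# Synthetic example: 7 of 50 pairs disagree in sign.
ai_a = [1.0] * 50
ai_b = [-1.0] * 7 + [1.0] * 43
fraction = opposite_sign_fraction(ai_a, ai_b)
```

A sign flip means that the two modules disagree on which hippocampus is larger, which matters more for lateralization than the magnitude of the difference alone.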

The absolute ICC values for absolute and relative measurements ranged from 0.71 to 0.73 and from 0.61 to 0.64, respectively, whereas that for the asymmetry index was 0.81. According to the criteria established by Koo and Li(26), the reliability was classified as moderate for the right hippocampus, left hippocampus, and total hippocampus, whereas it was classified as good for the asymmetry index.
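
The classification can be written as a small lookup, using the thresholds given by Koo and Li(26): below 0.5 poor, 0.5 to 0.75 moderate, 0.75 to 0.9 good, and above 0.9 excellent. How values falling exactly on a boundary are assigned is an assumption of this sketch, and the function name is hypothetical.

```python
def classify_icc(icc: float) -> str:
    """Reliability category for an ICC point estimate (boundary handling assumed)."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


# ICC values reported in the text above.
categories = {value: classify_icc(value) for value in (0.61, 0.64, 0.71, 0.73, 0.81)}
```

Applied to the reported values, the volumetric measurements fall in the moderate band and the asymmetry index in the good band, matching the classification stated above.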


DISCUSSION

The agreement analysis between HIPS and volBrain revealed a primarily additive systematic bias when measuring the absolute and relative volumes of each hippocampus and their combined volume in a radiologically normal population over a wide age range. Although those measurements exhibited a very strong linear correlation, the absolute ICC indicated only a moderate level of absolute agreement. In general, the absolute ICC values were lower for the relative volumetric measurements than for the absolute ones, because of greater data dispersion of the TIV estimates in both methods.

Although the statistical analysis defines confidence limits for agreement between the two methods, the acceptance of those limits should ultimately be guided by biological and clinical criteria relevant to the specific application of these measurements. Available evidence indicates that the average annual rate of hippocampal volume loss is approximately −0.8% in normal aging, −2.6% in mild cognitive impairment, and −4.4% in Alzheimer’s disease(28). Given that the difference between the methods, based on the best estimates defined above, averages at least 7.9%, switching between HIPS and volBrain—particularly in a longitudinal study—could significantly impact the results and their interpretation.
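
A back-of-the-envelope comparison makes the scale of the problem concrete. The sketch below divides the smallest average inter-method difference (7.9%) by each annual loss rate quoted above; it assumes simple linear accumulation rather than compounding, and the variable names are hypothetical.

```python
# Annual hippocampal volume loss (percent per year), as quoted in the text.
annual_loss_pct = {
    "normal aging": 0.8,
    "mild cognitive impairment": 2.6,
    "Alzheimer's disease": 4.4,
}

# Smallest average difference between HIPS and volBrain reported in the text.
method_difference_pct = 7.9

# Years of volume loss that would produce a change of the same size
# (linear approximation, ignoring compounding).
years_equivalent = {
    group: method_difference_pct / rate for group, rate in annual_loss_pct.items()
}
```

Under this approximation, the inter-method difference is comparable to about 9.9 years of normal aging, 3.0 years of mild cognitive impairment, or 1.8 years of Alzheimer's disease.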

When calculating the asymmetry index, we observed no systematic bias. It also achieved the highest absolute ICC value among all variables analyzed, although it did not meet the threshold for excellent reliability based on the Koo and Li criteria(26). The average difference between the two methods was 32.7%. In addition, the indices showed an opposite sign in 14% of cases. These results also suggest that switching modules might have an impact on conclusions regarding the quantification of the asymmetry index. The overall results suggest that the two methods are not directly interchangeable for volumetric assessment of hippocampal structures and for determining the asymmetry index, unless the linear relationship between them is used to convert measurements from one method to the other.
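
One way such a conversion could be set up is sketched below. The text does not report the fitted coefficients, so the code fits an ordinary least-squares line to paired measurements; this choice is an assumption, and errors-in-variables approaches such as Deming regression are often preferred for method comparison. The function names and the paired volumes are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def convert(value, slope, intercept):
    """Map a measurement from one method onto the scale of the other."""
    return slope * value + intercept


# Illustrative paired volumes only, not study data.
hips = [3.6, 3.2, 3.9, 3.5]
volbrain = [4.0, 3.5, 4.2, 3.8]
slope, intercept = fit_line(hips, volbrain)
hips_on_volbrain_scale = [convert(v, slope, intercept) for v in hips]
```

Any such conversion would need to be validated on the target population and acquisition protocol before being relied upon, since it corrects only the systematic component of the disagreement.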

Selecting the most appropriate algorithm for MRI brain volumetry remains challenging because there is no gold standard for automated techniques. Variations in image processing techniques, segmentation methods, and anatomical definitions can result in substantial discrepancies among approaches. Previous studies have shown high variability in absolute hippocampal volume measurements across several automated methods, with differences ranging from 2.3% to 48.7%(29,30). For TIV-normalized volumes, differences greater than 24% can be derived from previously published data(13). Against that background, the differences between the two methods evaluated in this study can be classified as modest. However, they may still be greater than the changes caused by the condition being studied.

The tools examined in this study, provided on a single platform, enable rapid generation of detailed volumetric reports. However, it is essential to critically assess the accuracy of the segmentations generated by the chosen software. Similarly, maintaining consistency across acquisition, processing, and segmentation methods is crucial to prevent errors and biases that could compromise clinical decisions.

Several limitations of this study must be considered. First, the study sample included only subjects without pathological hippocampal alterations (e.g., severe atrophy), which may restrict the generalizability of the findings to clinical populations. In addition, the population sample was not stratified based on the MRI scanner model or manufacturer to assess significant differences among them, because such stratification would have resulted in subgroups with insufficient sample sizes for robust statistical analysis. Finally, the study did not assess whether differences in hippocampal segmentation between the two methods occur systematically in specific anatomical regions, leaving potential spatial biases unaddressed.


CONCLUSION

The agreement analysis performed suggests that the volBrain and HIPS modules cannot be considered interchangeable for the volumetric assessment of the hippocampi and the associated asymmetry index.


REFERENCES

1. Despotović I, Goossens B, Philips W. MRI segmentation of the human brain: challenges, methods, and applications. Comput Math Methods Med. 2015;2015:450341.

2. Scarpazza C, Ha M, Baecker L, et al. Translating research findings into clinical practice: a systematic and critical review of neuroimaging-based clinical tools for brain disorders. Transl Psychiatry. 2020;10:107.

3. Opfer R, Suppa P, Kepp T, et al. Atlas based brain volumetry: How to distinguish regional volume changes due to biological or physiological effects from inherent noise of the methodology. Magn Reson Imaging. 2016;34:455–61.

4. Fawzi A, Achuthan A, Belaton B. Brain image segmentation in recent years: a narrative review. Brain Sci. 2021;11:1055.

5. Hasan KM, Walimuni IS, Kramer LA, et al. Human brain atlas-based volumetry and relaxometry: application to healthy development and natural aging. Magn Reson Med. 2010;64:1382–9.

6. Giorgio A, De Stefano N. Clinical use of brain volumetry. J Magn Reson Imaging. 2013;37:1–14.

7. Mandal PK, Mahajan R, Dinov ID. Structural brain atlases: design, rationale, and applications in normal and pathological cohorts. J Alzheimers Dis. 2012;31 Suppl 3 (0 3):S169–88.

8. Sterling NW, Lewis MM, Du G, et al. Structural imaging and Parkinson’s disease: moving toward quantitative markers of disease progression. J Parkinsons Dis. 2016;6:557–67.

9. Raji CA, Meysami S, Porter VR, et al. Diagnostic utility of brain MRI volumetry in comparing traumatic brain injury, Alzheimer disease and behavioral variant frontotemporal dementia. BMC Neurol. 2024;24:337.

10. Park HY, Suh CH, Heo H, et al. Diagnostic performance of hippocampal volumetry in Alzheimer’s disease or mild cognitive impairment: a meta-analysis. Eur Radiol. 2022;32:6979–91.

11. Ruchinskas R, Nguyen T, Womack K, et al. Diagnostic utility of hippocampal volumetric data in a memory disorder clinic setting. Cogn Behav Neurol. 2022;35:66–75.

12. Wisse LEM, Chételat G, Daugherty AM, et al. Hippocampal subfield volumetry from structural isotropic 1-mm3 MRI scans: A note of caution. Hum Brain Mapp. 2021;42:539–50.

13. Princich JP, Donnelly-Kehoe PA, Deleglise A, et al. Diagnostic performance of MRI volumetry in epilepsy patients with hippocampal sclerosis supported through a random forest automatic classification algorithm. Front Neurol. 2021;12:613967.

14. Pardoe HR, Pell GS, Abbott DF, et al. Hippocampal volume assessment in temporal lobe epilepsy: How good is automated segmentation? Epilepsia. 2009;50:2586–92.

15. Fischl B. FreeSurfer. Neuroimage. 2012;62:774–81.

16. Ashburner J, Friston KJ. Unified segmentation. Neuroimage. 2005;26:839–51.

17. Patenaude B, Smith SM, Kennedy DN, et al. A Bayesian model of shape and appearance for subcortical brain segmentation. Neuroimage. 2011;56:907–22.

18. PNEURO Human Brain Analysis. Bruker BioSpin GmbH. [cited 2024 Nov 20]. Available from: https://store.bruker.com/products/pneuro-human-brain-analysis-remote.

19. Manjón JV, Coupé P. volBrain: an online MRI brain volumetry system. Front Neuroinform. 2016;10:30.

20. Romero JE, Coupé P, Manjón JV. HIPS: a new hippocampus subfield segmentation method. Neuroimage. 2017;163:286–95.

21. Ashburner J, Friston KJ. Voxel-based morphometry—the methods. Neuroimage. 2000;11(6 Pt 1):805–21.

22. Brewer JB. Fully-automated volumetric MRI with normative ranges: translation to clinical practice. Behav Neurol. 2009;21:21–8.

23. Quibim Platform. QP-Brain. [cited 2024 Nov 20]. Available from: https://quibim.com/products/qp-brain/.

24. Marcus DS, Wang TH, Parker J, et al. Open Access Series of Imaging Studies (OASIS): cross-sectional MRI data in young, middle aged, nondemented, and demented older adults. J Cogn Neurosci. 2007;19:1498–507.

25. Frisoni GB, Jack CR Jr, Bocchetta M, et al. The EADC-ADNI harmonized protocol for manual hippocampal segmentation on magnetic resonance: evidence of validity. Alzheimers Dement. 2015;11:111–25.

26. Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15:155–63.

27. Prieto L, Lamarca R, Casado A. Assessment of the reliability of clinical findings: the intraclass correlation coefficient. Med Clin (Barc). 1998;110:142–5.

28. Schuff N, Woerner N, Boreta L, et al. MRI of hippocampal volume loss in early Alzheimer’s disease in relation to ApoE genotype and biomarkers. Brain. 2009;132(Pt 4):1067–77.

29. Mangesius S, Haider L, Lenhart L, et al. Qualitative and quantitative comparison of hippocampal volumetric software applications: do all roads lead to Rome? Biomedicines. 2022;10:432.

30. Koussis P, Toulas P, Glotsos D, et al. Reliability of automated brain volumetric analysis: a test by comparing NeuroQuant and volBrain software. Brain Behav. 2023;13.



1. Fundación Científica del Sur, Lomas de Zamora, Provincia de Buenos Aires, Argentina
2. Tomografía Computada Buenos Aires Centro de Diagnóstico (TCba), Ciudad Autónoma de Buenos Aires, Argentina

a. https://orcid.org/0009-0004-7066-7529
b. https://orcid.org/0000-0002-5569-4420
c. https://orcid.org/0009-0007-6732-6485

Correspondence:
Federico Biafore
Fundación Científica del Sur. Hipólito Yrigoyen 8680.
CP. 1832, Lomas de Zamora
Provincia de Buenos Aires, Argentina.
Email: flbiafore@fcsur.com.ar.

How to cite this article:
Biafore F, Docampo J, Duca G. Automated hippocampal volume measurement: agreement analysis between HIPS and volBrain software. Radiol Bras. 2025;58:e20250003.

Received on January 5, 2025.
Accepted on May 5, 2025.
Published on August 20, 2025.


Creative Commons License
This work is licensed under an Attribution 4.0 International License (CC BY 4.0), effective June 9, 2022. Previously, the journal was licensed under a Creative Commons Attribution - Non-Commercial - Share Alike 4.0 International License.
