(Optional) Compare resampled data#

In this Notebook, we compare the resampled data (a) between different runs of FreeSurfer, and (b) between FreeSurfer and Python.

The resampled data are very close in these comparisons.

import os
import numpy as np
import nibabel as nib
from glob import glob
from scipy.stats import zscore
import tarfile
import neuroboros as nb
tmp_dir = '/dev/shm/forrest_tmp'
os.makedirs(tmp_dir, exist_ok=True)

First we compare the data resampled by fMRIPrep (d1) and by running mri_vol2surf directly (d2), both using the same version of FreeSurfer.

The correlation between the time series of the same vertex is very high (>0.9999). The small differences might be caused by different hardware/software (e.g., different nodes of the same HPC cluster), and/or different order of operations (float number operations are not fully commutative).

root = os.path.expanduser('~/lab/nb-data/forrest/20.2.7/resampled')
tar_fns = sorted(glob(os.path.expanduser('~/lab/nb-data-archive/forrest/20.2.7/fmriprep/*.tar.lzma')))

count = 0
for tar_fn in tar_fns:
    with tarfile.open(tar_fn) as tf:
        members = tf.getmembers()
        
        for mb in members:
            if not mb.path.endswith('gii') or 'func' not in mb.path:
                continue
            tmp_fn = os.path.join(tmp_dir, os.path.basename(mb.path))

            with tf.extractfile(mb) as f_in, open(tmp_fn, 'wb') as f_out:
                f_out.write(f_in.read())

            a, b = os.path.basename(mb.path).split('_space-fsaverage5_')
            lr = b.split('hemi-')[1][0].lower()
            fn = f'{root}/fsavg-ico32/{lr}-cerebrum/2step_freesurfer6/{a}.gii'

            d1 = np.stack([_.data for _ in nib.load(tmp_fn).darrays], axis=0)
            d2 = np.stack([_.data for _ in nib.load(fn).darrays], axis=0)

            r = np.mean(zscore(d1, axis=0) * zscore(d2, axis=0), axis=0)
            assert np.all(r[np.isfinite(r)]) > 0.9999
            count += 1

            os.remove(tmp_fn)
print(count)
518

In this part, we compare the resampled data based on FreeSurfer 7.2 (d1), FreeSurfer 6 (d2), and Python (d3 and d4).

Note that different versions of FreeSurfer use different ways to compute vertex normals from face normals. The two Python implementations (“normals-equal” and “normals-sine”) behave similarly to FreeSurfer 7.2 and 6, respectively. Therefore, we expect d1 and d3 to resemble each other, and d2 and d4 to resemble each other.

Different implementations (Python and FreeSurfer) of the same method also generate similar resampled data, as indicated by the high correlations of the corresponding time series.

fns = []
for space in ['fsavg-ico32', 'onavg-ico32']:
    fns += sorted(glob(f'{root}/{space}/*-cerebrum/2step_freesurfer7.2/*.gii'))

mats = []
count = 0
for fn1 in fns:
    d1 = np.stack([_.data for _ in nib.load(fn1).darrays], axis=0)
    fn2 = fn1.replace('2step_freesurfer7.2', '2step_freesurfer6')
    d2 = np.stack([_.data for _ in nib.load(fn2).darrays], axis=0)
    d3 = np.load(fn1.replace('2step_freesurfer7.2', '2step_normals-equal_nnfr')[:-4] + '.npy')
    d4 = np.load(fn1.replace('2step_freesurfer7.2', '2step_normals-sine_nnfr')[:-4] + '.npy')
    dd = [d1, d2, d3, d4]

    mat = np.zeros((4, 4))
    for i, a in enumerate(dd):
        for j, b in enumerate(dd):
            if i <= j:
                continue
            v = np.sum((a - b)**2)
            mat[i, j] = v
            mat[j, i] = v
    mats.append(mat)
    count += np.prod(a.shape)

    r13 = np.mean(zscore(d1, axis=0) * zscore(d3, axis=0), axis=0)
    r24 = np.mean(zscore(d2, axis=0) * zscore(d4, axis=0), axis=0)
    for r in [r13, r24]:
        mask = np.isnan(r)
        if np.any(mask):
            assert mask.mean() < 0.01
            r[mask] = 1.
        np.testing.assert_array_less(0.99, r)
np.sum(mats, axis=0) / count
array([[0.    , 0.0124, 0.0001, 0.0126],
       [0.0124, 0.    , 0.0126, 0.0001],
       [0.0001, 0.0126, 0.    , 0.0124],
       [0.0126, 0.0001, 0.0124, 0.    ]])