Downstream Analysis with Lesion Exclusion
The lesion mask and the anatomy reports produced by lit-postprocessing (see Usage Guide) can be used to exclude lesion-affected regions from downstream statistical analyses. This applies both to surface-based analyses (vertex-wise metrics such as cortical thickness) and to volumetric analyses (region-of-interest volumes from parcellation or segmentation).
The same principle underlies both approaches: identify which regions are affected by the lesion, set their values to NaN (not a number) or drop them from the table, and let your statistical test handle the missing data. This prevents lesion-driven measurement artefacts from inflating or deflating group statistics, reproducibility metrics, or any other comparison.
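This principle can be illustrated on a toy array, with `np.nanmean` standing in for any NaN-aware statistic (the values and the lesion location are made up for illustration):

```python
import numpy as np

# Toy example: cortical thickness for 4 subjects at 3 vertices.
# Subject 2's lesion covers vertex 1, so that value is set to NaN.
thickness = np.array([
    [2.5, 2.4, 2.6],
    [2.6, np.nan, 2.5],  # lesion-affected vertex excluded
    [2.4, 2.5, 2.7],
    [2.5, 2.6, 2.4],
])

# NaN-aware statistics exclude the masked entry per vertex:
# vertex 1's mean uses only the 3 unaffected subjects
vertex_means = np.nanmean(thickness, axis=0)

# A plain mean would instead propagate the NaN for vertex 1,
# which is why the lesion mask must be paired with NaN-aware tests
```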
Surface-Based Analysis
The workflow below uses cortical thickness as the example measure, but the same approach applies to any per-vertex surface metric (curvature, sulcal depth, myelination, etc.).
Overview
1. Run neuroLIT inpainting and postprocessing to obtain per-subject lesion surface labels
2. Project the cortical measure onto a common surface template (`fsaverage`) with `mris_preproc`
3. Set lesion vertices to `NaN` in the stacked surface data
4. Run your statistical analysis; NaN vertices are excluded vertex-wise
Step 1 — Obtain Lesion Surface Labels
Run the neuroLIT postprocessing pipeline to generate FreeSurfer .label files marking lesion vertices on the cortical surface for each subject:
lit-postprocessing \
--subject-id SUBJECT_ID \
--subjects-dir /path/to/subjects_dir
This produces `label/{lh,rh}.lesion.label` inside each subject directory. These label files are in the subject's native surface space; to combine them with an fsaverage-resampled measure, map them to `fsaverage` first (e.g. with FreeSurfer's `mri_label2label --trgsubject fsaverage`).
Step 2 — Extract Regional Statistics (Optional)
Standard FreeSurfer tools can extract parcellation-level statistics before or after lesion exclusion, depending on whether the postprocessing step has already updated the annotation files:
# Cortical thickness per parcellation (DKT atlas), per hemisphere
aparcstats2table --subjects $SUBJECTS \
--hemi lh --measure thickness \
--parc aparc.DKTatlas.mapped \
--tablefile lh.thickness.txt
aparcstats2table --subjects $SUBJECTS \
--hemi rh --measure thickness \
--parc aparc.DKTatlas.mapped \
--tablefile rh.thickness.txt
# Subcortical volumes
asegstats2table --subjects $SUBJECTS \
--meas volume \
--tablefile aseg.volume.txt \
--delimiter space \
--stats aseg.stats
Step 3 — Project to Common Surface Template
Project individual subject surface maps onto fsaverage using mris_preproc. A spatial smoothing kernel (e.g. 15 mm FWHM) is typically applied:
mris_preproc \
--s SUBJECT1 --s SUBJECT2 ... \
--meas thickness \
--fwhm 15 \
--target fsaverage \
--hemi lh \
--out lh.thickness.fwhm15.stack.mgh
mris_preproc ... --hemi rh --out rh.thickness.fwhm15.stack.mgh
The output .mgh files are vertex-wise stacks with one frame per subject (on-disk shape (n_vertices, 1, 1, n_subjects)).
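Because MGH files store one frame per subject, `get_fdata()` returns the stack vertex-major; a transpose yields the (n_subjects, n_vertices) orientation used in the next step. A sketch with a synthetic array standing in for the loaded file:

```python
import numpy as np

# Synthetic stand-in for nib.load(...).get_fdata() on a mris_preproc stack:
# MGH files store (n_vertices, 1, 1, n_subjects), one frame per subject
n_vertices, n_subjects = 1000, 5
raw = np.random.default_rng(0).random((n_vertices, 1, 1, n_subjects))

# Drop the singleton axes and transpose to (n_subjects, n_vertices)
stack = raw.squeeze().T
```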
Step 4 — Mask Lesion Vertices
Before running any statistical analysis, set lesion-affected vertices to NaN so they are excluded vertex-wise:
import nibabel as nib
import numpy as np
# Load stacked surface data
thickness_stack = nib.load("lh.thickness.fwhm15.stack.mgh").get_fdata().squeeze().T
# mris_preproc stores one frame per subject, so the raw array is
# (n_vertices, n_subjects); transpose to (n_subjects, n_vertices)
# Load per-subject lesion label files and mask affected vertices.
# lesion_label_paths: per-subject lesion labels in fsaverage space,
# in the same subject order as the stack
for i, lesion_label_path in enumerate(lesion_label_paths):
    lesion_vertices = nib.freesurfer.io.read_label(lesion_label_path)
    thickness_stack[i, lesion_vertices] = np.nan
# Restrict to cortex vertices only
cortex_vertices = nib.freesurfer.io.read_label("fsaverage/label/lh.cortex.label")
thickness_cortex = thickness_stack[:, cortex_vertices]
# NaN vertices are then excluded by NaN-aware functions
# (np.nanmean, scipy's nan_policy="omit", etc.)
The lh.cortex.label file is part of the standard fsaverage distribution and defines the non-medial-wall region.
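Keeping the cortex vertex indices around also makes it easy to scatter cortex-restricted results back onto the full fsaverage vertex set later, e.g. when saving an overlay for visualization. A minimal sketch with synthetic sizes and values:

```python
import numpy as np

# Synthetic stand-in for fsaverage (lh actually has ~164k vertices)
n_vertices = 100
cortex_vertices = np.array([3, 10, 42, 77])      # as read from lh.cortex.label
result_cortex = np.array([0.8, 0.9, 0.7, 0.95])  # one value per cortex vertex

# Scatter back to a full-surface map; the medial wall stays NaN
full_map = np.full(n_vertices, np.nan)
full_map[cortex_vertices] = result_cortex
```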
Step 5 — Run Your Statistical Analysis
With lesion vertices masked to NaN, any vertex-wise analysis that handles missing data will automatically exclude them. Examples:
Test-retest reproducibility (ICC)
Intraclass Correlation Coefficient (ICC type A-1, two-way mixed absolute agreement) at each vertex:
# thickness_stack1 / thickness_stack2: cortex-restricted stacks from two
# sessions, each of shape (n_subjects, n_cortex_vertices)
n_subjects, n_cortex_vertices = thickness_stack1.shape
k = 2  # number of sessions
icc_map = np.full(n_cortex_vertices, np.nan)
for v in range(n_cortex_vertices):
    col1 = thickness_stack1[:, v]
    col2 = thickness_stack2[:, v]
    valid = ~np.isnan(col1) & ~np.isnan(col2)
    if valid.sum() < 3:
        continue  # too few subjects left after lesion exclusion
    data = np.stack([col1[valid], col2[valid]], axis=1)  # (n, k)
    n = data.shape[0]
    grand_mean = data.mean()
    ss_total = ((data - grand_mean) ** 2).sum()
    ss_rows = k * ((data.mean(axis=1) - grand_mean) ** 2).sum()
    ss_cols = n * ((data.mean(axis=0) - grand_mean) ** 2).sum()
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    icc_map[v] = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
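The per-vertex loop simply applies the same ANOVA decomposition to each column, so it can be factored into a function and sanity-checked: with identical sessions and non-zero between-subject variance, ICC(A,1) is exactly 1, and a constant session offset lowers absolute agreement. A self-contained sketch:

```python
import numpy as np

def icc_a1(session1, session2):
    """ICC(A,1): two-way model, absolute agreement, single measurement."""
    data = np.stack([session1, session2], axis=1)  # (n_subjects, k)
    n, k = data.shape
    grand_mean = data.mean()
    ss_total = ((data - grand_mean) ** 2).sum()
    ss_rows = k * ((data.mean(axis=1) - grand_mean) ** 2).sum()
    ss_cols = n * ((data.mean(axis=0) - grand_mean) ** 2).sum()
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

s1 = np.array([2.1, 2.5, 2.9, 3.3])
icc_perfect = icc_a1(s1, s1)        # identical sessions -> 1.0
icc_offset = icc_a1(s1, s1 + 0.5)   # constant offset -> below 1.0
```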
Group differences (t-test)
from scipy.stats import ttest_ind
pval_map = np.full(n_cortex_vertices, np.nan)
for v in range(n_cortex_vertices):
    g1 = group1_stack[:, v]
    g2 = group2_stack[:, v]
    valid1, valid2 = g1[~np.isnan(g1)], g2[~np.isnan(g2)]
    if len(valid1) < 3 or len(valid2) < 3:
        continue
    _, pval_map[v] = ttest_ind(valid1, valid2)
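SciPy can also vectorize the same test across all vertices at once with `nan_policy="omit"`, which drops NaN entries column-wise and avoids the explicit loop (synthetic data below; note this variant does not enforce a minimum group size per vertex):

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
group1_stack = rng.normal(2.5, 0.1, size=(10, 50))  # (subjects, vertices)
group2_stack = rng.normal(2.4, 0.1, size=(12, 50))
group1_stack[0, 5] = np.nan  # a lesion-masked vertex for one subject

# One t-test per vertex; NaN entries are omitted column-wise
tstat, pval = ttest_ind(group1_stack, group2_stack, axis=0, nan_policy="omit")
```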
Step 6 — Visualize on the Cortical Surface (Optional)
Surface maps can be rendered using WhipperSnappy:
from whippersnappy.core import snap4
snap4(
    lhoverlaypath="lh.result.fwhm15.mgh",
    rhoverlaypath="rh.result.fwhm15.mgh",
    fthresh=0.8,
    fmax=1.0,
    sdir="/path/to/fsaverage",
    outpath="result_surface.png",
)
Alternatively, the maps can be loaded into standard neuroimaging viewers (FreeView, FSLeyes) for interactive inspection.
Volumetric Analysis
For region-of-interest (ROI) analyses based on segmentation volumes, the lesion anatomy reports generated by lit-postprocessing identify which parcellation regions are affected by the lesion. This makes it straightforward to exclude unreliable measurements before running group comparisons, correlations, or other statistics on volumetric data.
Anatomy Reports
For each subject, lit-postprocessing writes plain-text anatomy reports alongside the modified segmentation files:
- `stats/aparc.DKTatlas+aseg.lesion_report.txt` — cortical parcellation and subcortical structures (DKT atlas)
- `stats/aseg.lesion_report.txt` — FreeSurfer aseg segmentation
Each report classifies every brain region into one of three categories:
| Category | Meaning |
|---|---|
| Replaced | Region is fully covered by the lesion. The entire structure has been relabelled; its volume is no longer meaningful. |
| Reduced | Region is partially covered. The measured volume is smaller than the true anatomy; the degree of reduction depends on lesion overlap. |
| Adjacent | Region touches the lesion boundary but has no voxel overlap. Its volume is intact, but proximity may introduce inpainting artefacts at the boundary. |
The appropriate exclusion threshold depends on your analysis goals. A conservative approach excludes all three categories; a more permissive one retains adjacent regions and only drops replaced or reduced ones.
Excluding Affected Regions
The general workflow is:
1. Parse the per-subject anatomy report to obtain the set of affected region names
2. Build a subject × region table of volumes (e.g. from `asegstats2table`/`aparcstats2table` output)
3. For each subject, set the volumes of affected regions to `NaN`
4. Run your analysis column-wise, dropping or ignoring NaN entries
A minimal Python example:
import pandas as pd
import numpy as np
def parse_lesion_report(report_path, categories=("Replaced", "Reduced", "Adjacent")):
    """Return region names listed under the given categories in a lesion report."""
    affected = set()
    current_category = None
    with open(report_path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            # Section headers contain a category name; track only the requested
            # categories and reset when entering an unrequested section
            for cat in ("Replaced", "Reduced", "Adjacent"):
                if cat in line:
                    current_category = cat if cat in categories else None
                    break
            else:
                if current_category is not None:
                    # Lines under a section are "label_id region_name"
                    parts = line.split(None, 1)
                    if len(parts) == 2 and parts[0].isdigit():
                        affected.add(parts[1])
    return affected
# Load volumetric stats table (rows = subjects, columns = region volumes,
# plus a "group" column identifying each subject's group)
volumes = pd.read_csv("aseg_volumes.csv", index_col="subject")
# Mask affected regions per subject
for subject in volumes.index:
    report = f"subjects/{subject}/stats/aseg.lesion_report.txt"
    affected_regions = parse_lesion_report(report, categories=("Replaced", "Reduced", "Adjacent"))
    cols_to_mask = [c for c in volumes.columns if c in affected_regions]
    if cols_to_mask:
        volumes.loc[subject, cols_to_mask] = np.nan
# Each column now contains NaN for subjects where that region was lesion-affected.
# Standard pandas/scipy functions skip NaN by default.
group_means = volumes.groupby("group").mean()
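The masking step can be exercised end-to-end on a toy table (the subject IDs, region names, and `group` column below are illustrative, and a plain dict stands in for the parsed reports):

```python
import numpy as np
import pandas as pd

# Toy subject-by-region volume table with a group column
volumes = pd.DataFrame(
    {
        "group": ["patient", "patient", "control"],
        "Left-Hippocampus": [3500.0, 3600.0, 3700.0],
        "Left-Thalamus": [6400.0, 6500.0, 6600.0],
    },
    index=pd.Index(["sub-01", "sub-02", "sub-03"], name="subject"),
)

# Affected regions per subject, as parse_lesion_report would return them
affected = {"sub-01": {"Left-Hippocampus"}, "sub-02": set(), "sub-03": set()}

for subject, regions in affected.items():
    cols = [c for c in volumes.columns if c in regions]
    if cols:
        volumes.loc[subject, cols] = np.nan

# pandas skips NaN by default: sub-01 is dropped only for Left-Hippocampus,
# so the patient mean there uses sub-02 alone
group_means = volumes.groupby("group").mean()
```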
Choosing Which Categories to Exclude
A reasonable default is to exclude all three categories.

- **Replaced** regions should almost always be excluded — the volume reflects the lesion label, not the anatomy.
- **Reduced** regions are typically excluded as well, since the measured volume is artificially smaller than the true value, but can be retained with caution (manual review recommended).
- **Adjacent** regions may also warrant exclusion, since the structure boundary may be affected by the lesion (manual review recommended).
Dependencies
- FreeSurfer: `mris_preproc`, `asegstats2table`, `aparcstats2table`, fsaverage template
- Python: `nibabel`, `numpy`, `scipy`, `pandas`
- WhipperSnappy (optional, for surface visualization): `pip install whippersnappy`