Downstream Analysis with Lesion Exclusion

The lesion mask and the anatomy reports produced by lit-postprocessing (see Usage Guide) can be used to exclude lesion-affected regions from downstream statistical analyses. This applies both to surface-based analyses (vertex-wise metrics such as cortical thickness) and to volumetric analyses (region-of-interest volumes from parcellation or segmentation).

The same principle underlies both approaches: identify which regions are affected by the lesion, set their values to NaN (not a number) or drop them from the table, and let your statistical test handle the missing data. This prevents lesion-driven measurement artefacts from inflating or deflating group statistics, reproducibility metrics, or any other comparison.

Surface-Based Analysis

The workflow below uses cortical thickness as the example measure, but the same approach applies to any per-vertex surface metric (curvature, sulcal depth, myelination, etc.).

Overview

Run neuroLIT inpainting and postprocessing to obtain per-subject lesion surface labels
Project the cortical measure onto a common surface template (fsaverage) with mris_preproc
Set lesion vertices to NaN in the stacked surface data
Run your statistical analysis; NaN vertices are excluded vertex-wise

Step 1 — Obtain Lesion Surface Labels

Run the neuroLIT postprocessing pipeline to generate FreeSurfer .label files marking lesion vertices on the cortical surface for each subject:

lit-postprocessing \
    --subject-id SUBJECT_ID \
    --subjects-dir /path/to/subjects_dir

This produces label/{lh,rh}.lesion.label inside each subject directory. These label files are in the subject’s native surface space and can be projected to fsaverage together with the measure of interest.

Step 2 — Extract Regional Statistics (Optional)

Standard FreeSurfer tools can extract parcellation-level statistics before or after lesion exclusion, depending on whether the postprocessing step has already updated the annotation files:

# Cortical thickness per parcellation (DKT atlas), per hemisphere
aparcstats2table --subjects $SUBJECTS \
    --hemi lh --measure thickness \
    --parc aparc.DKTatlas.mapped \
    --tablefile lh.thickness.txt

aparcstats2table --subjects $SUBJECTS \
    --hemi rh --measure thickness \
    --parc aparc.DKTatlas.mapped \
    --tablefile rh.thickness.txt

# Subcortical volumes
asegstats2table --subjects $SUBJECTS \
    --meas volume \
    --tablefile aseg.volume.txt \
    --delimiter space \
    --stats aseg.stats

Step 3 — Project to Common Surface Template

Project individual subject surface maps onto fsaverage using mris_preproc. A spatial smoothing kernel (e.g. 15 mm FWHM) is typically applied:

mris_preproc \
    --s SUBJECT1 --s SUBJECT2 ... \
    --meas thickness \
    --fwhm 15 \
    --target fsaverage \
    --hemi lh \
    --out lh.thickness.fwhm15.stack.mgh

mris_preproc ... --hemi rh --out rh.thickness.fwhm15.stack.mgh

The output .mgh files are vertex-wise stacks with shape (n_subjects, n_vertices).

Step 4 — Mask Lesion Vertices

Before running any statistical analysis, set lesion-affected vertices to NaN so they are excluded vertex-wise:

import nibabel as nib
import numpy as np

# Load stacked surface data
thickness_stack = nib.load("lh.thickness.fwhm15.stack.mgh").get_fdata().squeeze()
# shape: (n_subjects, n_vertices)

# Load per-subject lesion label files and mask affected vertices
for i, lesion_label_path in enumerate(lesion_label_paths):
    lesion_vertices = nib.freesurfer.io.read_label(lesion_label_path)
    thickness_stack[i, lesion_vertices] = np.nan

# Restrict to cortex vertices only
cortex_vertices = nib.freesurfer.io.read_label("fsaverage/label/lh.cortex.label")
thickness_cortex = thickness_stack[:, cortex_vertices]
# NaN vertices are excluded automatically by numpy/scipy functions

The lh.cortex.label file is part of the standard fsaverage distribution and defines the non-medial-wall region.

Step 5 — Run Your Statistical Analysis

With lesion vertices masked to NaN, any vertex-wise analysis that handles missing data will automatically exclude them. Examples:

Test-retest reproducibility (ICC)

Intraclass Correlation Coefficient (ICC type A-1, two-way mixed absolute agreement) at each vertex:

n_subjects = thickness_stack1.shape[0]
k = 2  # number of sessions

icc_map = np.full(n_cortex_vertices, np.nan)
for v in range(n_cortex_vertices):
    col1 = thickness_stack1[:, v]
    col2 = thickness_stack2[:, v]
    valid = ~np.isnan(col1) & ~np.isnan(col2)
    if valid.sum() < 3:
        continue
    data = np.stack([col1[valid], col2[valid]], axis=1)
    n = data.shape[0]
    grand_mean = data.mean()
    ss_total = ((data - grand_mean) ** 2).sum()
    ss_rows = k * ((data.mean(axis=1) - grand_mean) ** 2).sum()
    ss_cols = n * ((data.mean(axis=0) - grand_mean) ** 2).sum()
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    icc_map[v] = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

Group differences (t-test)

from scipy.stats import ttest_ind

pval_map = np.full(n_cortex_vertices, np.nan)
for v in range(n_cortex_vertices):
    g1 = group1_stack[:, v]
    g2 = group2_stack[:, v]
    valid1, valid2 = g1[~np.isnan(g1)], g2[~np.isnan(g2)]
    if len(valid1) < 3 or len(valid2) < 3:
        continue
    _, pval_map[v] = ttest_ind(valid1, valid2)

Step 6 — Visualize on the Cortical Surface (Optional)

Surface maps can be rendered using WhipperSnappy:

from whippersnappy.core import snap4

snap4(
    lhoverlaypath="lh.result.fwhm15.mgh",
    rhoverlaypath="rh.result.fwhm15.mgh",
    fthresh=0.8,
    fmax=1.0,
    sdir="/path/to/fsaverage",
    outpath="result_surface.png"
)

Alternatively, the maps can be loaded into standard neuroimaging viewers (FreeView, FSLeyes) for interactive inspection.

Volumetric Analysis

For region-of-interest (ROI) analyses based on segmentation volumes, the lesion anatomy reports generated by lit-postprocessing identify which parcellation regions are affected by the lesion. This makes it straightforward to exclude unreliable measurements before running group comparisons, correlations, or other statistics on volumetric data.

Anatomy Reports

For each subject, lit-postprocessing writes plain-text anatomy reports alongside the modified segmentation files:

stats/aparc.DKTatlas+aseg.lesion_report.txt — cortical parcellation and subcortical structures (DKT atlas)
stats/aseg.lesion_report.txt — FreeSurfer aseg segmentation

Each report classifies every brain region into one of three categories:

Category	Meaning
Replaced	Region is fully covered by the lesion. The entire structure has been relabelled; its volume is no longer meaningful.
Reduced	Region is partially covered. The measured volume is smaller than the true anatomy; the degree of reduction depends on lesion overlap.
Adjacent	Region touches the lesion boundary but has no voxel overlap. Its volume is intact, but proximity may introduce inpainting artefacts at the boundary.

The appropriate exclusion threshold depends on your analysis goals. A conservative approach excludes all three categories; a more permissive one retains adjacent regions and only drops replaced or reduced ones.

Excluding Affected Regions

The general workflow is:

Parse the per-subject anatomy report to obtain the set of affected region names
Build a subject × region table of volumes (e.g. from asegstats2table / aparcstats2table output)
For each subject, set the volumes of affected regions to NaN
Run your analysis column-wise, dropping or ignoring NaN entries

A minimal Python example:

import pandas as pd
import numpy as np

def parse_lesion_report(report_path, categories=("Replaced", "Reduced", "Adjacent")):
    """Return region names listed under the given categories in a lesion report."""
    affected = set()
    current_category = None
    with open(report_path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            # Section headers contain the category name
            for cat in categories:
                if cat in line:
                    current_category = cat
                    break
            else:
                if current_category is not None:
                    # Lines under a section are "label_id  region_name"
                    parts = line.split(None, 1)
                    if len(parts) == 2 and parts[0].isdigit():
                        affected.add(parts[1])
    return affected

# Load volumetric stats table (rows = subjects, columns = regions)
volumes = pd.read_csv("aseg_volumes.csv", index_col="subject")

# Mask affected regions per subject
for subject in volumes.index:
    report = f"subjects/{subject}/stats/aseg.lesion_report.txt"
    affected_regions = parse_lesion_report(report, categories=("Replaced", "Reduced", "Adjacent"))
    cols_to_mask = [c for c in volumes.columns if c in affected_regions]
    volumes.loc[subject, cols_to_mask] = np.nan

# Each column now contains NaN for subjects where that region was lesion-affected.
# Standard pandas/scipy functions skip NaN by default.
group_means = volumes.groupby("group").mean()

Choosing Which Categories to Exclude

A reasonable default is to exclude all three categories. * Replaced regions should almost always be excluded — the volume reflects the lesion label, not the anatomy. * Reduced regions are typically excluded as well, since the measured volume is artificially smaller than the true value, but can be retained with caution (manual review recommended) * Adjacent regions may also warrant exclusion, since the structure boundary may be affected by the lesion (manual review recommended)

Dependencies

FreeSurfer: mris_preproc, asegstats2table, aparcstats2table, fsaverage template
Python: nibabel, numpy, scipy, pandas
WhipperSnappy (optional, for surface visualization): pip install whippersnappy