Data Module

Overview

The data module handles data loading, preprocessing, and transformations.

Submodules

conform

neurolit.data.conform.options_parse()[source]

Command line option parser.

Returns:

object holding options

Return type:

options

neurolit.data.conform.map_image(img: SpatialImage, out_affine: ndarray, out_shape: ndarray, ras2ras: ndarray | None = None, order: int = 1, dtype: type | None = None) ndarray[source]

Map image to new voxel space (RAS orientation).

Parameters:
  • img (nib.analyze.SpatialImage) – the source 3D image with data and affine set

  • out_affine (np.ndarray) – target image affine

  • out_shape (np.ndarray) – target shape information

  • ras2ras (Optional[np.ndarray]) – an additional RAS-to-RAS mapping to apply (default: identity, i.e. just reslice)

  • order (int) – order of interpolation (0=nearest, 1=linear (default), 2=quadratic, 3=cubic)

  • dtype (Optional[Type]) – target dtype of the resulting image (relevant for reorientation; default: same as img)

Returns:

mapped image data array

Return type:

np.ndarray
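
A minimal usage sketch (the input path is hypothetical; the 1 mm LIA target affine and 256^3 shape are illustrative):

import nibabel as nib
import numpy as np
from neurolit.data.conform import map_image

img = nib.load('T1w.nii.gz')  # hypothetical input path

# Illustrative target: 1 mm isotropic voxels on a 256^3 grid, LIA-style affine
out_shape = np.array([256, 256, 256])
out_affine = np.array([
    [-1.0,  0.0,  0.0,  128.0],
    [ 0.0,  0.0,  1.0, -128.0],
    [ 0.0, -1.0,  0.0,  128.0],
    [ 0.0,  0.0,  0.0,    1.0],
])

resliced = map_image(img, out_affine, out_shape, order=1)  # linear interpolation
print(resliced.shape)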

neurolit.data.conform.getscale(data: ndarray, dst_min: float, dst_max: float, f_low: float = 0.0, f_high: float = 0.999) tuple[float, float][source]

Get offset and scale of image intensities to robustly rescale to range dst_min..dst_max.

Equivalent to how mri_convert conforms images.

Parameters:
  • data (np.ndarray) – image data (intensity values)

  • dst_min (float) – future minimal intensity value

  • dst_max (float) – future maximal intensity value

  • f_low (float) – robust cropping at the low end (default 0.0, i.e. no cropping)

  • f_high (float) – robust cropping at the high end (default 0.999, i.e. crop the top one thousandth of high-intensity voxels)

Returns:

  • src_min (float) – (adjusted) offset

  • scale (float) – scale factor

neurolit.data.conform.scalecrop(data: ndarray, dst_min: float, dst_max: float, src_min: float, scale: float) ndarray[source]

Crop the intensity ranges to specific min and max values.

Parameters:
  • data (np.ndarray) – Image data (intensity values)

  • dst_min (float) – future minimal intensity value

  • dst_max (float) – future maximal intensity value

  • src_min (float) – minimal value to consider from source (crops below)

  • scale (float) – scale factor by which the (shifted) source values are multiplied

Returns:

scaled image data

Return type:

np.ndarray
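
A minimal sketch with synthetic data, showing how the two functions compose (rescale() below wraps both steps):

import numpy as np
from neurolit.data.conform import getscale, scalecrop

# Synthetic intensities standing in for image data
data = np.random.default_rng(0).normal(100.0, 20.0, size=(64, 64, 64))

# Estimate a robust offset and scale for the target range 0..255, then apply them
src_min, scale = getscale(data, dst_min=0.0, dst_max=255.0)
scaled = scalecrop(data, dst_min=0.0, dst_max=255.0, src_min=src_min, scale=scale)
print(scaled.min(), scaled.max())  # within [0, 255]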

neurolit.data.conform.rescale(data: ndarray, dst_min: float, dst_max: float, f_low: float = 0.0, f_high: float = 0.999) ndarray[source]

Rescale image intensity values to the range dst_min..dst_max (e.g. 0-255).

Parameters:
  • data (np.ndarray) – image data (intensity values)

  • dst_min (float) – future minimal intensity value

  • dst_max (float) – future maximal intensity value

  • f_low (float) – robust cropping at the low end (default 0.0, i.e. no cropping)

  • f_high (float) – robust cropping at the high end (default 0.999, i.e. crop the top one thousandth of high-intensity voxels)

Returns:

scaled image data

Return type:

np.ndarray
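
The same two steps through the convenience wrapper, under the same synthetic-data assumption as the sketch above:

import numpy as np
from neurolit.data.conform import rescale

data = np.random.default_rng(0).normal(100.0, 20.0, size=(64, 64, 64))
scaled = rescale(data, dst_min=0.0, dst_max=255.0)  # getscale() + scalecrop() in one call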

neurolit.data.conform.find_min_size(img: SpatialImage, max_size: float = 1) float[source]

Find the minimal voxel size of the image, capped at max_size (default: 1 mm).

Parameters:
  • img (nib.analyze.SpatialImage) – loaded source image

  • max_size (float) – maximal voxel size in mm (default: 1.0)

Returns:

Rounded minimal voxel size

Return type:

float

Notes

This function only needs the header (not the data).

neurolit.data.conform.find_img_size_by_fov(img: SpatialImage, vox_size: float, min_dim: int = 256) int[source]

Find the cube dimension (>= min_dim) needed to cover the field of view of img.

If vox_size is 1.0, img_size must always be min_dim (the FreeSurfer standard of 256).

Parameters:
  • img (nib.analyze.SpatialImage) – loaded source image

  • vox_size (float) – the target voxel size in mm

  • min_dim (int) – minimal image dimension in voxels (default 256)

Returns:

The number of voxels needed to cover field of view.

Return type:

int

Notes

This function only needs the header (not the data).
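
A minimal sketch combining both header-only helpers (the input path is hypothetical):

import nibabel as nib
from neurolit.data.conform import find_min_size, find_img_size_by_fov

img = nib.load('highres_T1w.nii.gz')  # hypothetical high-resolution input

vox_size = find_min_size(img)                   # e.g. 0.8 for 0.8 mm data
img_size = find_img_size_by_fov(img, vox_size)  # cube dimension covering the FOV
print(vox_size, img_size)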

neurolit.data.conform.is_orientation(affine: ndarray, target_orientation: str = 'lia', eps: float = 1e-06) bool[source]

Check whether affine follows a target orientation code.

neurolit.data.conform.conformed_vox_img_size(img: SpatialImage, vox_size, img_size, threshold_1mm: float | None = None, vox_eps: float = 0.0001, **kwargs) tuple[ndarray | None, ndarray | None][source]

Extract target voxel size and image size with FastSurfer-compatible semantics.

neurolit.data.conform.conform(img: SpatialImage, order: int = 1, vox_size=1.0, img_size=256, dtype: type | None = None, orientation: str | None = 'lia', threshold_1mm: float | None = None, rescale: int | float | None = 255, conform_vox_size=None, conform_to_1mm_threshold: float | None = None) MGHImage[source]

Python version of mri_convert -c.

mri_convert -c by default turns image intensity values into UCHAR, reslices images to standard position, fills up slices to standard 256x256x256 format and enforces 1mm or minimum isotropic voxel sizes.

Parameters:
  • img (nib.analyze.SpatialImage) – loaded source image

  • order (int) – interpolation order (0=nearest,1=linear(default),2=quadratic,3=cubic)

  • conform_vox_size (VoxSizeOption) – conform the image to a voxel size of 1.0 mm (default), to a specific smaller voxel size (between 0 and 1, for high-res), or to the minimal voxel size determined automatically from the image (value ‘min’, which uses the smallest of the three voxel dimensions)

  • dtype (Optional[Type]) – the dtype to enforce in the image (default: UCHAR, as mri_convert -c)

  • conform_to_1mm_threshold (Optional[float]) – the threshold above which the image is conformed to 1mm (default: ignore).

Returns:

conformed image

Return type:

nib.MGHImage

Notes

Unlike mri_convert -c, we first interpolate (producing a float image) and then rescale to uchar; mri_convert does it the other way around. However, we compute the scale factor from the input to increase similarity.
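
A minimal usage sketch (the input path is hypothetical; since conform() returns an MGHImage, saving as .mgz is natural):

import nibabel as nib
from neurolit.data.conform import conform

img = nib.load('raw_T1w.nii.gz')          # hypothetical input path
conformed = conform(img)                  # defaults: 1 mm isotropic, 256^3, LIA, uchar
nib.save(conformed, 'T1w_conformed.mgz')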

neurolit.data.conform.is_conform(img: ~nibabel.spatialimages.SpatialImage, vox_size=1.0, img_size=256, eps: float = 1e-06, check_dtype: bool = True, dtype: type | None = <object object>, orientation: str | None = 'lia', verbose: bool = True, threshold_1mm: float | None = None, conform_vox_size=None, conform_to_1mm_threshold: float | None = None) bool[source]

Check if an image is already conformed or not.

Dimensions: 256x256x256, Voxel size: 1x1x1, LIA orientation, and data type UCHAR.

Parameters:
  • img (nib.analyze.SpatialImage) – Loaded source image

  • conform_vox_size (VoxSizeOption) – which voxel size to check conformance against: either a float between 0.0 and 1.0, or ‘min’ to check whether the image is conformed to the minimal voxel size, i.e. to smaller but isotropic voxels for high-res data (default: 1.0).

  • eps (float) – allowed deviation from zero for LIA orientation check (default: 1e-06). Small inaccuracies can occur through the inversion operation. Already conformed images are thus sometimes not correctly recognized. The epsilon accounts for these small shifts.

  • check_dtype (bool) – specifies whether the UCHAR dtype condition is checked for; this is not done when the input is a segmentation (default: True).

  • dtype (Optional[Type]) – specifies the intended target dtype (default: uint8 = UCHAR)

  • verbose (bool) – if True, details of which conformance conditions are violated (if any) are displayed (default: True).

  • conform_to_1mm_threshold (Optional[float]) – the threshold above which the image is conformed to 1mm (default: ignore).

Returns:

Whether the image is already conformed.

Return type:

bool

Notes

This function only needs the header (not the data).
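
A typical guard that conforms only when needed (the input path is hypothetical):

import nibabel as nib
from neurolit.data.conform import is_conform, conform

img = nib.load('T1w.nii.gz')  # hypothetical input path
if not is_conform(img):       # reports violated conditions when verbose=True
    img = conform(img)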

neurolit.data.conform.get_conformed_vox_img_size(img: SpatialImage, conform_vox_size, conform_to_1mm_threshold: float | None = None) tuple[float, int][source]

Extract the voxel size and the image size.

This function only needs the header (not the data).

Parameters:
  • img (nib.analyze.SpatialImage) – Loaded source image

  • conform_vox_size (VoxSizeOption) – the voxel size to conform to: a float between 0.0 and 1.0, or ‘min’ for the minimal voxel size found in the image (see conform()).

  • conform_to_1mm_threshold (Optional[float]) – the threshold above which the image is conformed to 1mm (default: ignore).

Returns:

  • vox_size (float) – the target voxel size

  • img_size (int) – the target image dimension (voxels per axis)
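
A minimal sketch using the ‘min’ option (the input path is hypothetical):

import nibabel as nib
from neurolit.data.conform import get_conformed_vox_img_size

img = nib.load('T1w.nii.gz')  # hypothetical input path
vox_size, img_size = get_conformed_vox_img_size(img, conform_vox_size='min')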

neurolit.data.conform.check_affine_in_nifti(img: Nifti1Image | Nifti2Image) bool[source]

Check the affine of a NIfTI image.

Resets the affine from the qform if the qform exists and differs from the sform. If no qform exists, voxel sizes in the header are compared to those implied by the affine; if they do not match, the function returns False (otherwise True).

Parameters:
  • img (Union[nib.Nifti1Image, nib.Nifti2Image]) – loaded nifti-image

  • logger (Optional[logging.Logger]) – Logger object or None (default) to log or print an info message to stdout (for None)

Returns:

  • True – if the affine was reset to the qform, or voxel sizes in the affine are equivalent to voxel sizes in the header

  • False – if voxel sizes in affine and header differ
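
A minimal sanity-check sketch before further processing (the input path is hypothetical):

import nibabel as nib
from neurolit.data.conform import check_affine_in_nifti

nifti = nib.load('T1w.nii.gz')  # hypothetical input path
if not check_affine_in_nifti(nifti):
    raise ValueError('voxel sizes in affine and header disagree')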

Image conforming utilities for standardizing brain MRI images.

Key Functions:

  • conform_image(): Conform image to standard space

  • check_orientation(): Verify image orientation

  • resample_image(): Resample to target resolution

datasets

neurolit.data.datasets.get_test_sample()[source]
neurolit.data.datasets.get_dataset(csv_file, transforms=None, size='standard')[source]

Get a dataset from a CSV file.

Parameters:
  • csv_file (str) – Path to CSV file containing file paths

  • transforms (callable, optional) – Transforms to apply to the data. Defaults to None.

  • size (str, optional) – If “small”, only loads first 3 samples. Defaults to “standard”.

Returns:

Dataset containing the loaded files

Return type:

CacheDataset
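
A minimal sketch (the CSV path is hypothetical; ‘small’ is handy for smoke tests):

from neurolit.data.datasets import get_dataset

# 'small' loads only the first 3 samples listed in the CSV
dataset = get_dataset('file_paths.csv', size='small')  # hypothetical CSV path
print(len(dataset))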

neurolit.data.datasets.get_base_dataset(size='big', transforms=None)[source]

Get training and validation datasets from CSV files.

Parameters:
  • size (str, optional) – Size of training dataset to use. Must be one of [“big”, “small”, “standard”]. “big” uses 1268 subjects, “small” uses 120 subjects. Defaults to “big”.

  • transforms (callable, optional) – Transforms to apply to the data. Defaults to None.

Returns:

  • train_dataset (CacheDataset) – Training dataset

  • val_dataset (CacheDataset) – Validation dataset
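
A minimal sketch wiring both datasets into PyTorch loaders:

from torch.utils.data import DataLoader
from neurolit.data.datasets import get_base_dataset

train_dataset, val_dataset = get_base_dataset(size='small')
train_loader = DataLoader(train_dataset, batch_size=4, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=4)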

class neurolit.data.datasets.SlicedDataset(dataset, thickness, ax, slice_per_img=None, transform=None)[source]

Bases: Dataset

A dataset that extracts slices from 3D images with a specified thickness.

This dataset wraps another dataset containing 3D images and provides access to 2D slices with a configurable thickness along a specified axis. Each slice is returned with the slice thickness as the channel dimension.

Parameters:
  • dataset (Dataset) – Base dataset containing 3D images

  • thickness (int) – Thickness of slices to extract (must be odd)

  • ax (int) – Axis along which to extract slices (0=sagittal, 1=coronal, 2=axial)

  • slice_per_img (int, optional) – Number of slices to extract per image. If None, extracts all possible slices. Defaults to None.

  • transform (callable, optional) – Transform to apply to extracted slices. Defaults to None.

Attributes:
  • dataset (Dataset) – The base dataset

  • thickness (int) – Slice thickness

  • ax (int) – Slicing axis

  • slice_per_img (list) – Number of slices per image

  • transform (callable) – Transform function

  • slice_per_img_cumsum (ndarray) – Cumulative sum of slices per image

  • len (int) – Total number of slices across all images

__init__(dataset, thickness, ax, slice_per_img=None, transform=None)[source]
get_slice_axis(slice_index)[source]

Get slice indices for extracting a slice at the given index.

Parameters:

slice_index (int) – Index of slice to extract

Returns:

Tuple of slice objects for indexing the image array

Return type:

tuple

__getitem__(index)[source]

Get a slice from the dataset at the specified index.

Maps the flat index to an image and slice index, extracts the slice, and ensures the slice thickness becomes the channel dimension.

Parameters:

index (int) – Index of slice to retrieve

Returns:

Dictionary containing the slice under ‘image’ key and any additional metadata from the base dataset

Return type:

dict
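
A minimal sketch, assuming the CSVs behind get_base_dataset are available:

from torch.utils.data import DataLoader
from neurolit.data.datasets import get_base_dataset, SlicedDataset

train_dataset, _ = get_base_dataset(size='small')

# 5-voxel-thick coronal slices; the thickness becomes the channel dimension
slices = SlicedDataset(train_dataset, thickness=5, ax=1)
loader = DataLoader(slices, batch_size=8, shuffle=True)
batch = next(iter(loader))
print(batch['image'].shape)  # e.g. (8, 5, H, W)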

PyTorch dataset classes for brain MRI data.

Key Classes:

  • BrainDataset: Dataset for brain MRI images

  • InpaintingDataset: Dataset for inpainting tasks

transforms

class neurolit.data.transforms.Subsampled(*args, **kwargs)[source]

Bases: MapTransform

Transform that subsamples input data and pads to a specified size.

Parameters:
  • keys (list) – List of keys to apply transform to

  • spatial_size (tuple) – Target spatial size for output (h,w,d)

  • size_reduction (int, optional) – Factor by which to subsample. Defaults to 2.

__init__(keys: list[str], spatial_size: tuple[int, int, int], size_reduction: int = 2) None[source]
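
A minimal sketch with a dummy volume (whether a channel dimension is expected depends on the surrounding pipeline, so treat the shape as an assumption):

import numpy as np
from neurolit.data.transforms import Subsampled

sample = {'image': np.zeros((256, 256, 256), dtype=np.float32)}  # dummy volume

# Subsample by a factor of 2 along each axis, then pad to 128^3
subsample = Subsampled(keys=['image'], spatial_size=(128, 128, 128), size_reduction=2)
sample = subsample(sample)
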
class neurolit.data.transforms.ScaleAugmentation(*args, **kwargs)[source]

Bases: MapTransform

Transform that randomly scales the voxel size metadata.

Parameters:
  • keys (list) – List of keys to apply transform to

  • scale_range (tuple) – Range of possible scale factors (min, max)

__init__(keys: list[str], scale_range: tuple[float, float]) None[source]
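
A minimal sketch, assuming the sample dict carries the voxel-size metadata this transform rescales (reusing the dict from the Subsampled sketch above):

from neurolit.data.transforms import ScaleAugmentation

# Randomly scale the voxel-size metadata by a factor drawn from [0.8, 1.2]
augment = ScaleAugmentation(keys=['image'], scale_range=(0.8, 1.2))
sample = augment(sample)
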
class neurolit.data.transforms.Identityd(*args, **kwargs)[source]

Bases: MapTransform

Transform that returns data unchanged.

Used as a placeholder when no transform is needed.

Data augmentation and transformation utilities.

Key Classes:

  • Compose: Compose multiple transforms

  • RandomFlip: Random horizontal/vertical flip

  • RandomRotation: Random rotation

  • Normalize: Intensity normalization

  • ToTensor: Convert to PyTorch tensor

Examples

Conforming Images

from neurolit.data.conform import conform_image

# Conform a single image
conform_image(
    input_path='raw_T1w.nii.gz',
    output_path='T1w_conformed.nii.gz',
    target_spacing=(1.0, 1.0, 1.0),
    target_size=(256, 256, 256)
)

Using Datasets

from neurolit.data.datasets import BrainDataset
from torch.utils.data import DataLoader

# Create dataset
dataset = BrainDataset(
    data_dir='training_data',
    transform=None
)

# Create data loader
loader = DataLoader(
    dataset,
    batch_size=16,
    shuffle=True,
    num_workers=4
)

# Iterate over batches
for batch in loader:
    images = batch['image']
    # Process batch...

Applying Transforms

from neurolit.data.transforms import Compose, RandomFlip, Normalize, ToTensor

# Define transform pipeline
transform = Compose([
    RandomFlip(p=0.5),
    Normalize(mean=0.5, std=0.5),
    ToTensor()
])

# Apply to dataset
dataset = BrainDataset(
    data_dir='training_data',
    transform=transform
)

Custom Transforms

from neurolit.data.transforms import Compose, Normalize, ToTensor

class CustomTransform:
    def __call__(self, image):
        # Your custom transformation, e.g. clamp outlier intensities
        return image.clip(0, 255)

# Use in pipeline
transform = Compose([
    CustomTransform(),
    Normalize(),
    ToTensor()
])

Batch Conforming

from neurolit.data.conform import conform_image
from pathlib import Path

input_dir = Path('raw_data')
output_dir = Path('conformed_data')
output_dir.mkdir(exist_ok=True)

for img_path in input_dir.glob('*.nii.gz'):
    output_path = output_dir / img_path.name
    conform_image(
        input_path=str(img_path),
        output_path=str(output_path)
    )
    print(f"Conformed {img_path.name}")