Data Module
Overview
The data module handles data loading, preprocessing, and transformations.
Submodules
conform
- neurolit.data.conform.options_parse()[source]
Command line option parser.
- Returns:
object holding options
- Return type:
options
- neurolit.data.conform.map_image(img: SpatialImage, out_affine: ndarray, out_shape: ndarray, ras2ras: ndarray | None = None, order: int = 1, dtype: type | None = None) ndarray[source]
Map image to new voxel space (RAS orientation).
- Parameters:
img (nib.analyze.SpatialImage) – the src 3D image with data and affine set
out_affine (np.ndarray) – trg image affine
out_shape (np.ndarray) – the trg shape information
ras2ras (Optional[np.ndarray]) – an additional mapping that should be applied (default=id to just reslice)
order (int) – order of interpolation (0=nearest,1=linear(default),2=quadratic,3=cubic)
dtype (Optional[Type]) – target dtype of the resulting image (relevant for reorientation, default=same as img)
- Returns:
mapped image data array
- Return type:
np.ndarray
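The core of such a mapping is composing the source affine, the optional RAS-to-RAS transform, and the inverse of the target affine into a single voxel-to-voxel transform. A minimal NumPy sketch (the function name vox2vox is hypothetical; the actual implementation may differ):

```python
import numpy as np

def vox2vox(src_affine, out_affine, ras2ras=None):
    # Compose: source voxel -> source RAS -> (optional RAS map) -> target voxel.
    if ras2ras is None:
        ras2ras = np.eye(4)  # default: pure reslicing, no extra RAS transform
    return np.linalg.inv(out_affine) @ ras2ras @ src_affine

# With identical affines and no extra mapping, the transform is the identity
t = vox2vox(np.eye(4), np.eye(4))
```

Note that resamplers such as scipy.ndimage.affine_transform map output coordinates back to input coordinates, so they would typically be given the inverse of this transform.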
- neurolit.data.conform.getscale(data: ndarray, dst_min: float, dst_max: float, f_low: float = 0.0, f_high: float = 0.999) tuple[float, float][source]
Get offset and scale of image intensities to robustly rescale to range dst_min..dst_max.
Equivalent to how mri_convert conforms images.
- Parameters:
data (np.ndarray) – image data (intensity values)
dst_min (float) – future minimal intensity value
dst_max (float) – future maximal intensity value
f_low (float) – robust cropping at low end (0.0 no cropping, default)
f_high (float) – robust cropping at higher end (0.999 crop one thousandth of high intensity voxels, default)
- Returns:
src_min (float) – (adjusted) offset
scale (float) – scale factor
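The idea behind getscale can be sketched in plain Python: pick robust low/high intensity cut points by rank, then derive a linear offset and scale. This is a simplified illustration (function name hypothetical), not the exact mri_convert histogram-based algorithm:

```python
def robust_offset_scale(values, dst_min, dst_max, f_low=0.0, f_high=0.999):
    # Sort intensities and pick robust low/high cut points by rank.
    # NOTE: simplified sketch; the real getscale uses a histogram like mri_convert.
    s = sorted(values)
    n = len(s)
    lo = s[min(int(f_low * n), n - 1)]   # robust minimum (offset)
    hi = s[min(int(f_high * n), n - 1)]  # robust maximum
    if hi <= lo:
        return lo, 1.0  # degenerate intensity range: no scaling
    scale = (dst_max - dst_min) / (hi - lo)
    return lo, scale

offset, scale = robust_offset_scale(list(range(10)), 0, 255)
```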
- neurolit.data.conform.scalecrop(data: ndarray, dst_min: float, dst_max: float, src_min: float, scale: float) ndarray[source]
Crop the intensity ranges to specific min and max values.
- Parameters:
data (np.ndarray) – image data (intensity values)
dst_min (float) – future minimal intensity value
dst_max (float) – future maximal intensity value
src_min (float) – (adjusted) offset, as returned by getscale()
scale (float) – scale factor, as returned by getscale()
- Returns:
scaled image data
- Return type:
np.ndarray
- neurolit.data.conform.rescale(data: ndarray, dst_min: float, dst_max: float, f_low: float = 0.0, f_high: float = 0.999) ndarray[source]
Rescale image intensity values (0-255).
- Parameters:
data (np.ndarray) – image data (intensity values)
dst_min (float) – future minimal intensity value
dst_max (float) – future maximal intensity value
f_low (float) – robust cropping at low end (0.0 no cropping, default)
f_high (float) – robust cropping at higher end (0.999 crop one thousandth of high intensity voxels, default)
- Returns:
scaled image data
- Return type:
np.ndarray
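scalecrop amounts to an offset-and-scale followed by clamping into the destination range, and rescale simply chains getscale and scalecrop. A pure-Python sketch of the clamping step (illustrative only; the real function operates on np.ndarray):

```python
def scalecrop(values, dst_min, dst_max, src_min, scale):
    # Offset-and-scale each intensity, then clamp into [dst_min, dst_max].
    return [min(max(dst_min + scale * (v - src_min), dst_min), dst_max)
            for v in values]

# Values below src_min clamp to dst_min; values above the range clamp to dst_max
out = scalecrop([-5.0, 0.0, 10.0, 300.0], 0.0, 255.0, src_min=0.0, scale=1.0)
```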
- neurolit.data.conform.find_min_size(img: SpatialImage, max_size: float = 1) float[source]
Find minimal voxel size <= 1mm.
- Parameters:
img (nib.analyze.SpatialImage) – loaded source image
max_size (float) – maximal voxel size in mm (default: 1.0)
- Returns:
Rounded minimal voxel size
- Return type:
float
Notes
This function only needs the header (not the data).
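Since only the header is needed, the logic reduces to taking the smallest zoom from the header and capping it at max_size. A sketch under that assumption (rounding behavior omitted; the actual function may round the result):

```python
def find_min_size(zooms, max_size=1.0):
    # Smallest voxel edge from the header zooms, capped at max_size (in mm).
    return min(min(zooms), max_size)

find_min_size((0.8, 0.8, 1.2))  # smallest zoom wins when below the cap
```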
- neurolit.data.conform.find_img_size_by_fov(img: SpatialImage, vox_size: float, min_dim: int = 256) int[source]
Find the cube dimension (>= 256) to cover the field of view of img.
If vox_size is one, the img_size MUST always be min_dim (the FreeSurfer standard).
- Parameters:
img (nib.analyze.SpatialImage) – loaded source image
vox_size (float) – target voxel size in mm
min_dim (int) – minimum image dimension (default: 256)
- Returns:
The number of voxels needed to cover field of view.
- Return type:
int
Notes
This function only needs the header (not the data).
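The computation can be sketched as: take the largest spatial extent (shape times zoom, in mm), divide by the target voxel size, and never go below min_dim. This is an illustration of the stated semantics (the real implementation may round the dimension differently):

```python
import math

def img_size_by_fov(shape, zooms, vox_size, min_dim=256):
    if vox_size == 1.0:
        return min_dim  # FreeSurfer standard: at 1mm the size is always min_dim
    fov = max(s * z for s, z in zip(shape, zooms))  # largest extent in mm
    return max(min_dim, int(math.ceil(fov / vox_size)))
```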
- neurolit.data.conform.is_orientation(affine: ndarray, target_orientation: str = 'lia', eps: float = 1e-06) bool[source]
Check whether affine follows a target orientation code.
- neurolit.data.conform.conformed_vox_img_size(img: SpatialImage, vox_size, img_size, threshold_1mm: float | None = None, vox_eps: float = 0.0001, **kwargs) tuple[ndarray | None, ndarray | None][source]
Extract target voxel size and image size with FastSurfer-compatible semantics.
- neurolit.data.conform.conform(img: SpatialImage, order: int = 1, vox_size=1.0, img_size=256, dtype: type | None = None, orientation: str | None = 'lia', threshold_1mm: float | None = None, rescale: int | float | None = 255, conform_vox_size=None, conform_to_1mm_threshold: float | None = None) MGHImage[source]
Python version of mri_convert -c.
mri_convert -c by default turns image intensity values into UCHAR, reslices images to standard position, fills up slices to standard 256x256x256 format and enforces 1mm or minimum isotropic voxel sizes.
- Parameters:
img (nib.analyze.SpatialImage) – loaded source image
order (int) – interpolation order (0=nearest,1=linear(default),2=quadratic,3=cubic)
conform_vox_size (VoxSizeOption) – conform the image to voxel size 1.0 (default), to a specific smaller voxel size (between 0 and 1, for high-res), or automatically determine the minimum voxel size from the image (value ‘min’). This assumes the smallest of the three voxel sizes.
dtype (Optional[Type]) – the dtype to enforce in the image (default: UCHAR, as mri_convert -c)
conform_to_1mm_threshold (Optional[float]) – the threshold above which the image is conformed to 1mm (default: ignore).
- Returns:
conformed image
- Return type:
nib.MGHImage
Notes
Unlike mri_convert -c, we first interpolate (float image), and then rescale to uchar. mri_convert is doing it the other way around. However, we compute the scale factor from the input to increase similarity.
- neurolit.data.conform.is_conform(img: ~nibabel.spatialimages.SpatialImage, vox_size=1.0, img_size=256, eps: float = 1e-06, check_dtype: bool = True, dtype: type | None = <object object>, orientation: str | None = 'lia', verbose: bool = True, threshold_1mm: float | None = None, conform_vox_size=None, conform_to_1mm_threshold: float | None = None) bool[source]
Check if an image is already conformed or not.
Dimensions: 256x256x256, Voxel size: 1x1x1, LIA orientation, and data type UCHAR.
- Parameters:
img (nib.analyze.SpatialImage) – Loaded source image
conform_vox_size (VoxSizeOption) – which voxel size to conform to. Can either be a float between 0.0 and 1.0, or ‘min’ to check whether the image is conformed to the minimal voxel size, i.e. to smaller but isotropic voxel sizes for high-res (default: 1.0).
eps (float) – allowed deviation from zero for LIA orientation check (default: 1e-06). Small inaccuracies can occur through the inversion operation. Already conformed images are thus sometimes not correctly recognized. The epsilon accounts for these small shifts.
check_dtype (bool) – specifies whether the UCHAR dtype condition is checked for; this is not done when the input is a segmentation (default: True).
dtype (Optional[Type]) – specifies the intended target dtype (default: uint8 = UCHAR)
verbose (bool) – if True, details of which conformance conditions are violated (if any) are displayed (default: True).
conform_to_1mm_threshold (Optional[float]) – the threshold above which the image is conformed to 1mm (default: ignore).
- Returns:
Whether the image is already conformed.
- Return type:
bool
Notes
This function only needs the header (not the data).
- neurolit.data.conform.get_conformed_vox_img_size(img: SpatialImage, conform_vox_size, conform_to_1mm_threshold: float | None = None) tuple[float, int][source]
Extract the voxel size and the image size.
This function only needs the header (not the data).
- Parameters:
img (nib.analyze.SpatialImage) – Loaded source image
conform_vox_size (VoxSizeOption) – which voxel size to conform to (a float between 0.0 and 1.0, or ‘min’)
conform_to_1mm_threshold (Optional[float]) – the threshold above which the image is conformed to 1mm (default: ignore)
- Return type:
tuple[float, int]
- neurolit.data.conform.check_affine_in_nifti(img: Nifti1Image | Nifti2Image) bool[source]
Check the affine in nifti Image.
Sets affine with qform, if it exists and differs from sform. If qform does not exist, voxel sizes between header information and information in affine are compared. In case these do not match, the function returns False (otherwise True).
- Parameters:
img (Union[nib.Nifti1Image, nib.Nifti2Image]) – loaded nifti-image
logger (Optional[logging.Logger]) – Logger object or None (default) to log or print an info message to stdout (for None)
- Returns:
True – if the affine was reset to qform, or the voxel sizes in the affine are equivalent to the voxel sizes in the header
False – if the voxel sizes in the affine and header differ
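The voxel-size comparison described above can be sketched with NumPy: the voxel sizes implied by an affine are the column norms of its 3x3 rotation/scaling part, which are then compared against the header zooms (the helper name is hypothetical):

```python
import numpy as np

def affine_matches_header_zooms(affine, header_zooms, eps=1e-3):
    # Voxel sizes implied by the affine = norms of the columns of the 3x3 part.
    vox_from_affine = np.linalg.norm(affine[:3, :3], axis=0)
    return bool(np.allclose(vox_from_affine, header_zooms, atol=eps))

aff = np.diag([1.0, 1.0, 1.0, 1.0])
```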
Image conforming utilities for standardizing brain MRI images.
Key Functions:
conform_image(): Conform image to standard space
check_orientation(): Verify image orientation
resample_image(): Resample to target resolution
datasets
- neurolit.data.datasets.get_dataset(csv_file, transforms=None, size='standard')[source]
Get a dataset from a CSV file.
- Parameters:
csv_file (str) – path to the CSV file listing the data files
transforms (callable, optional) – transforms to apply to the data. Defaults to None.
size (str, optional) – size of dataset to use. Must be one of [“big”, “small”, “standard”]. Defaults to “standard”.
- Returns:
Dataset containing the loaded files
- Return type:
CacheDataset
- neurolit.data.datasets.get_base_dataset(size='big', transforms=None)[source]
Get training and validation datasets from CSV files.
- Parameters:
size (str, optional) – Size of training dataset to use. Must be one of [“big”, “small”, “standard”]. “big” uses 1268 subjects, “small” uses 120 subjects. Defaults to “big”.
transforms (callable, optional) – Transforms to apply to the data. Defaults to None.
- Returns:
train_dataset (CacheDataset) – Training dataset
val_dataset (CacheDataset) – Validation dataset
- class neurolit.data.datasets.SlicedDataset(dataset, thickness, ax, slice_per_img=None, transform=None)[source]
Bases: Dataset
A dataset that extracts slices from 3D images with a specified thickness.
This dataset wraps another dataset containing 3D images and provides access to 2D slices with a configurable thickness along a specified axis. Each slice is returned with the slice thickness as the channel dimension.
- Parameters:
dataset (Dataset) – Base dataset containing 3D images
thickness (int) – Thickness of slices to extract (must be odd)
ax (int) – Axis along which to extract slices (0=sagittal, 1=coronal, 2=axial)
slice_per_img (int, optional) – Number of slices to extract per image. If None, extracts all possible slices. Defaults to None.
transform (callable, optional) – Transform to apply to extracted slices. Defaults to None.
- dataset
The base dataset
- Type:
Dataset
- transform
Transform function
- Type:
callable
- slice_per_img_cumsum
Cumulative sum of slices per image
- Type:
ndarray
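The slice_per_img_cumsum attribute suggests how a flat slice index is mapped back to a (image, local slice) pair: a binary search over cumulative slice counts. A self-contained sketch of that lookup (locate_slice is a hypothetical name, not part of the class API):

```python
import bisect

def locate_slice(global_idx, slices_per_image):
    # Build cumulative slice counts, then binary-search to find which
    # image the flat index falls into and its local slice offset.
    cumsum, total = [], 0
    for n in slices_per_image:
        total += n
        cumsum.append(total)
    img_idx = bisect.bisect_right(cumsum, global_idx)
    prev = cumsum[img_idx - 1] if img_idx > 0 else 0
    return img_idx, global_idx - prev

locate_slice(5, [4, 4, 4])  # flat index 5 lands in the second image
```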
PyTorch dataset classes for brain MRI data.
Key Classes:
BrainDataset: Dataset for brain MRI images
InpaintingDataset: Dataset for inpainting tasks
transforms
- class neurolit.data.transforms.Subsampled(*args, **kwargs)[source]
Bases: MapTransform
Transform that subsamples input data and pads to a specified size.
- Parameters:
- class neurolit.data.transforms.ScaleAugmentation(*args, **kwargs)[source]
Bases: MapTransform
Transform that randomly scales the voxel size metadata.
- Parameters:
- class neurolit.data.transforms.Identityd(*args, **kwargs)[source]
Bases: MapTransform
Transform that returns data unchanged.
Used as a placeholder when no transform is needed.
Data augmentation and transformation utilities.
Key Classes:
Compose: Compose multiple transforms
RandomFlip: Random horizontal/vertical flip
RandomRotation: Random rotation
Normalize: Intensity normalization
ToTensor: Convert to PyTorch tensor
Examples
Conforming Images
from neurolit.data.conform import conform_image
# Conform a single image
conform_image(
    input_path='raw_T1w.nii.gz',
    output_path='T1w_conformed.nii.gz',
    target_spacing=(1.0, 1.0, 1.0),
    target_size=(256, 256, 256)
)
Using Datasets
from neurolit.data.datasets import BrainDataset
from torch.utils.data import DataLoader
# Create dataset
dataset = BrainDataset(
    data_dir='training_data',
    transform=None
)
# Create data loader
loader = DataLoader(
    dataset,
    batch_size=16,
    shuffle=True,
    num_workers=4
)
# Iterate over batches
for batch in loader:
    images = batch['image']
    # Process batch...
Applying Transforms
from neurolit.data.transforms import Compose, RandomFlip, Normalize, ToTensor
# Define transform pipeline
transform = Compose([
    RandomFlip(p=0.5),
    Normalize(mean=0.5, std=0.5),
    ToTensor()
])
# Apply to dataset
dataset = BrainDataset(
    data_dir='training_data',
    transform=transform
)
Custom Transforms
from neurolit.data.transforms import Compose, Normalize, ToTensor
class CustomTransform:
    def __call__(self, image):
        # Your custom transformation goes here
        return image
# Use in pipeline
transform = Compose([
    CustomTransform(),
    Normalize(),
    ToTensor()
])
Batch Conforming
from neurolit.data.conform import conform_image
from pathlib import Path
input_dir = Path('raw_data')
output_dir = Path('conformed_data')
output_dir.mkdir(exist_ok=True)
for img_path in input_dir.glob('*.nii.gz'):
    output_path = output_dir / img_path.name
    conform_image(
        input_path=str(img_path),
        output_path=str(output_path)
    )
    print(f"Conformed {img_path.name}")