lamindb.integrations.lightning

PyTorch Lightning integration for LaminDB.

class lamindb.integrations.lightning.Checkpoint(dirpath, *, features=None, monitor=None, verbose=False, save_last=None, save_top_k=1, save_weights_only=False, mode='min', auto_insert_metric_name=True, every_n_train_steps=None, train_time_interval=None, every_n_epochs=None, save_on_train_epoch_end=None, enable_version_counter=True, overwrite_versions=False)

ModelCheckpoint that annotates PyTorch Lightning checkpoints.

Extends Lightning’s ModelCheckpoint with artifact creation and feature annotation. Checkpoints are stored at semantic paths such as {dirpath}/epoch=0-val_loss=0.5.ckpt, and each checkpoint becomes a separate artifact. Query them with ln.Artifact.filter(key__startswith=callback.dirpath).

If available in the instance, the following features are automatically tracked: is_best_model, score, model_rank, logger_name, logger_version, max_epochs, max_steps, precision, accumulate_grad_batches, gradient_clip_val, monitor, save_weights_only, mode.

Additionally, model hyperparameters (from pl_module.hparams) and datamodule hyperparameters (from trainer.datamodule.hparams) are captured if corresponding features exist.
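
For a hyperparameter to be captured, a feature with a matching name must already be registered in the instance. A minimal sketch, assuming a float-valued hyperparameter; the name learning_rate is illustrative:

import lamindb as ln

# register a feature whose name matches a key in pl_module.hparams;
# the callback will then annotate each checkpoint with its value
ln.Feature(name="learning_rate", dtype=float).save()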

Parameters:
  • dirpath (str | Path) – Directory for checkpoints (reflected in cloud paths).

  • features (dict[Literal['run', 'artifact'], dict[str, Any]] | None, default: None) – Features to annotate runs and artifacts. Use the “run” key for run-level features (static metadata) and the “artifact” key for artifact-level features (values can be static, or None for auto-population from trainer metrics/attributes); see the sketch after this parameter list.

  • monitor (str | None, default: None) – Quantity to monitor for saving best checkpoint.

  • verbose (bool, default: False) – Verbosity mode.

  • save_last (bool | None, default: None) – Save a copy of the last checkpoint.

  • save_top_k (int, default: 1) – Number of best checkpoints to keep.

  • save_weights_only (bool, default: False) – Save only model weights (not optimizer state).

  • mode (Literal['min', 'max'], default: 'min') – One of “min” or “max” for monitor comparison.

  • auto_insert_metric_name (bool, default: True) – Include metric name in checkpoint filename.

  • every_n_train_steps (int | None, default: None) – Checkpoint every N training steps.

  • train_time_interval (timedelta | None, default: None) – Checkpoint at time intervals.

  • every_n_epochs (int | None, default: None) – Checkpoint every N epochs.

  • save_on_train_epoch_end (bool | None, default: None) – Run checkpointing at end of training epoch.

  • enable_version_counter (bool, default: True) – Append version to filename to avoid collisions.

  • overwrite_versions (bool, default: False) – Whether to overwrite existing checkpoints.
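
The shape of the features argument is easiest to see in code. A minimal sketch; the run-level feature name experiment is illustrative and, like all features used here, must already be registered in the instance:

from lamindb.integrations import lightning as ll

callback = ll.Checkpoint(
    dirpath="deployments/my_model/",
    monitor="val_loss",
    features={
        # run-level: static metadata recorded on the training run
        "run": {"experiment": "ablation-1"},
        # artifact-level: None requests auto-population from
        # trainer metrics/attributes at checkpoint time
        "artifact": {"score": None},
    },
)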

Examples

Using the API:

import lightning as pl

import lamindb as ln
from lamindb.integrations import lightning as ll

# Optional one-time setup to enable automated Lightning-specific feature tracking
ll.save_lightning_features()

callback = ll.Checkpoint(
    dirpath="deployments/my_model/",
    monitor="val_loss",
    save_top_k=3,
)

trainer = pl.Trainer(callbacks=[callback])
trainer.fit(model, dataloader)

# Query checkpoints
ln.Artifact.filter(key__startswith=callback.dirpath)
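
Once the Lightning features are registered, the checkpoint annotations become queryable. A sketch, assuming lamindb’s feature-based filter kwargs and the features flag of .df():

# view checkpoints together with their feature annotations
ln.Artifact.filter(key__startswith=callback.dirpath).df(features=True)

# retrieve the checkpoint annotated as the best model
best = ln.Artifact.filter(
    key__startswith=callback.dirpath, is_best_model=True
).one()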

Using the CLI:

# config.yaml
trainer:
  callbacks:
    - class_path: lamindb.integrations.lightning.Checkpoint
      init_args:
        dirpath: deployments/my_model/
        monitor: val_loss
        save_top_k: 3

# Run with:
# python main.py fit --config config.yaml
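
The main.py referenced above is a standard LightningCLI entry point; a minimal sketch, where MyModel and MyDataModule are placeholders for your own classes:

# main.py
from lightning.pytorch.cli import LightningCLI

from my_project import MyModel, MyDataModule  # placeholder imports

if __name__ == "__main__":
    LightningCLI(MyModel, MyDataModule)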

setup(trainer, pl_module, stage)

Validate user features and detect available auto-features.

Return type:

None

class lamindb.integrations.lightning.SaveConfigCallback(*args, **kwargs)

SaveConfigCallback that also saves the config to the LaminDB instance.

Use with LightningCLI to save the resolved configuration file alongside checkpoints.

Example:

from lightning.pytorch.cli import LightningCLI
from lamindb.integrations import lightning as ll

cli = LightningCLI(
    MyModel,
    MyDataModule,
    save_config_callback=ll.SaveConfigCallback,
)

setup(trainer, pl_module, stage)

Save resolved configuration file alongside checkpoints.

Return type:

None

lamindb.integrations.lightning.save_lightning_features()

Register the LaminDB features used by the Lightning integration’s Checkpoint.

Creates the following features if they do not already exist:

  • lamindb.lightning (feature type): Parent feature type for the Lightning features below.

  • is_best_model (bool): Whether this checkpoint is the best model.

  • score (float): The monitored metric score.

  • model_rank (int): Rank among all checkpoints (0 = best).

  • logger_name (str): Name from the first Lightning logger.

  • logger_version (str): Version from the first Lightning logger.

  • max_epochs (int): Maximum number of epochs.

  • max_steps (int): Maximum number of training steps.

  • precision (str): Training precision (e.g., “32”, “16-mixed”, “bf16”).

  • accumulate_grad_batches (int): Number of batches to accumulate gradients over.

  • gradient_clip_val (float): Gradient clipping value.

  • monitor (str): Metric name being monitored.

  • save_weights_only (bool): Whether only model weights are saved.

  • mode (str): Optimization mode (“min” or “max”).

Return type:

None

Example:

from lamindb.integrations import lightning as ll

ll.save_lightning_features()