OA-BreaCR

📌 Overview

OA-BreaCR is a longitudinal breast cancer risk prediction model that combines ordinal learning, temporal alignment of breast tissue changes, and probabilistic latent embeddings to improve both risk prediction and time-to-event estimation.

The model operates on two-timepoint mammograms (current and prior exams) and explicitly models the temporal evolution of breast tissue using attention-based registration and feature-level deformation.

It is designed to support both:

  • Breast cancer risk prediction
  • Time-to-event (ordinal) prediction of future cancer occurrence


🧠 Key Idea

OA-BreaCR is built around three core principles:

  • Ordinal learning for time-to-event prediction:
    The model learns structured risk over time using mean-variance ordinal loss and probabilistic ordinal embeddings (POE), enabling more informative time-to-cancer predictions.

  • Attention-based longitudinal alignment:
    Breast tissue changes between prior and current exams are modeled using attention maps and learned flow fields in the feature space, improving interpretability of temporal changes.

  • Feature-level fusion of longitudinal information:
    Current, prior, and difference features are combined to form a unified representation for final risk prediction.
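
The mean-variance ordinal loss named in the first principle can be sketched as follows. This is a minimal, framework-free illustration, assuming the time-to-event target is an integer bin index and that the loss takes the usual mean-variance form: a squared penalty on the expected bin plus a penalty on the spread of the predicted distribution. The function names and the two weights are illustrative assumptions, not values taken from OA-BreaCR.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_variance_loss(logits, target_bin, lambda_mean=0.2, lambda_var=0.05):
    """Return (loss, mean, variance) for one sample.

    lambda_mean and lambda_var are illustrative weights (assumed).
    """
    probs = softmax(logits)
    # Expected bin index under the predicted distribution.
    mean = sum(k * p for k, p in enumerate(probs))
    # Spread of the distribution around that expectation.
    variance = sum(p * (k - mean) ** 2 for k, p in enumerate(probs))
    mean_term = 0.5 * (mean - target_bin) ** 2
    loss = lambda_mean * mean_term + lambda_var * variance
    return loss, mean, variance


# A flat distribution is penalised more than one peaked on the target bin.
flat = mean_variance_loss([0.0, 0.0, 0.0], target_bin=1)
peaked = mean_variance_loss([-20.0, 20.0, -20.0], target_bin=1)
print(flat[0] > peaked[0])
```

A term of this kind is typically added to a standard classification loss over the same bins rather than used on its own.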


🏗️ Architecture

[Figure: OA-BreaCR architecture]


The model consists of four main components:

1. Feature Encoder

  • A shared CNN backbone (ResNet-18) extracts feature maps from:
    • Current mammogram
    • Prior mammogram
  • Input images are converted to 3-channel format if needed

2. Attention-Based Longitudinal Alignment

  • Attention pooling extracts spatial representations from both time points
  • A small flow network estimates a deformation field between exams
  • A spatial transformer aligns prior features to the current features
  • The model computes:
    • Aligned prior features
    • Absolute difference features (|current − aligned prior|)
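
The warp-then-subtract step above can be illustrated on a tiny single-channel feature map. This is a sketch under stated assumptions: the flow field stores a per-location (dy, dx) displacement in pixels, sampling is bilinear with border clamping, and plain nested lists stand in for feature tensors. The helper names are hypothetical and are not the model's own functions.

```python
import math


def bilinear_sample(feat, y, x):
    """Sample a 2D grid at fractional (y, x), clamping to the border."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = feat[y0][x0] * (1 - wx) + feat[y0][x1] * wx
    bottom = feat[y1][x0] * (1 - wx) + feat[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def warp(prior, flow):
    """Output at (i, j) is the prior sampled at (i + dy, j + dx)."""
    h, w = len(prior), len(prior[0])
    return [
        [bilinear_sample(prior, i + flow[i][j][0], j + flow[i][j][1])
         for j in range(w)]
        for i in range(h)
    ]


def abs_difference(current, aligned):
    """Element-wise |current - aligned prior|."""
    return [[abs(c - a) for c, a in zip(row_c, row_a)]
            for row_c, row_a in zip(current, aligned)]


# The same structure appears one column further right in the current exam.
prior = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
current = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
# Flow that looks one pixel to the left everywhere (dy=0, dx=-1).
flow = [[(0.0, -1.0) for _ in range(4)] for _ in range(2)]

aligned = warp(prior, flow)
diff = abs_difference(current, aligned)
print(diff)
```

With a correct flow the difference map is zero, so any remaining non-zero response reflects tissue change rather than misalignment.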

3. Ordinal Feature Modeling

  • Difference features are enriched with continuous positional encoding using time gap information
  • This allows the model to explicitly learn temporal progression patterns
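
One common way to encode a continuous time gap is a sinusoidal encoding evaluated at a real-valued position. The sketch below assumes that form and assumes the encoding is added element-wise to the difference feature vector; the exact formulation used by OA-BreaCR is not stated here, so the function names and the `max_period` constant are illustrative.

```python
import math


def continuous_positional_encoding(time_gap, dim, max_period=10000.0):
    """Sinusoidal encoding of a real-valued time gap (assumed form)."""
    encoding = []
    for i in range(0, dim, 2):
        # Lower indices oscillate faster, higher indices slower.
        freq = 1.0 / (max_period ** (i / dim))
        encoding.append(math.sin(time_gap * freq))
        encoding.append(math.cos(time_gap * freq))
    return encoding[:dim]


def add_time_encoding(diff_features, time_gap):
    """Add the time-gap encoding to a difference feature vector."""
    enc = continuous_positional_encoding(time_gap, len(diff_features))
    return [f + e for f, e in zip(diff_features, enc)]


# Identical difference features become distinguishable by their time gap.
one_year = add_time_encoding([0.0, 0.0, 0.0, 0.0], time_gap=1.0)
two_and_half_years = add_time_encoding([0.0, 0.0, 0.0, 0.0], time_gap=2.5)
print(one_year != two_and_half_years)
```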

4. Feature Fusion and Prediction

  • Three feature streams are combined:
    • Current features
    • Prior features
    • Time-aware difference features
  • A fusion MLP integrates all representations
  • Optional Probabilistic Ordinal Embedding (POE) introduces stochastic latent structure
  • Final risk is predicted using a linear classification head
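
The optional POE step can be pictured as sampling embeddings from a Gaussian whose mean is the fused embedding and whose spread is given by a predicted log-variance. The sketch below assumes that Gaussian reparameterization, which is a common choice for probabilistic embeddings but is not confirmed by this document; `sample_embeddings` is a hypothetical helper.

```python
import math
import random


def sample_embeddings(mean, log_var, n_samples, rng):
    """Draw n_samples embeddings as mean + std * noise (assumed form)."""
    std = [math.exp(0.5 * lv) for lv in log_var]
    return [
        [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mean, std)]
        for _ in range(n_samples)
    ]


rng = random.Random(0)
# Very negative log-variance: samples collapse onto the mean.
confident = sample_embeddings([1.0, -2.0], [-100.0, -100.0], 3, rng)
# Log-variance of zero (unit variance): samples spread around the mean.
uncertain = sample_embeddings([1.0, -2.0], [0.0, 0.0], 3, rng)
print(confident[0], uncertain[0])
```

Each sampled embedding would then pass through the classification head, giving several sets of logits per exam pair.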

🔄 Input / Output

Input

The model expects a batch dictionary with:

  • current_image: Current mammogram [B, 1, H, W]
  • previous_image: Prior mammogram [B, 1, H, W]
  • time_gap: Time between exams [B]
  • years_to_cancer: Target time-to-event for current exam
  • years_to_cancer_prior: Target time-to-event for prior exam
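  • years_to_last_followup: Censoring information for current exam
  • years_to_last_followup_prior: Censoring information for prior exam (used by the prior head in get_risk_heads)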
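
A quick check that a batch carries every field the model reads can catch data-loading mistakes early. The helper below is a hypothetical convenience, not part of the framework; it also checks years_to_last_followup_prior because the prior head in get_risk_heads refers to that key.

```python
# Keys read from the batch dictionary, as listed in this section.
REQUIRED_BATCH_KEYS = (
    "current_image",
    "previous_image",
    "time_gap",
    "years_to_cancer",
    "years_to_cancer_prior",
    "years_to_last_followup",
    "years_to_last_followup_prior",  # referenced by the prior head
)


def missing_batch_keys(batch):
    """Return the required keys that are absent from the batch."""
    return [key for key in REQUIRED_BATCH_KEYS if key not in batch]


# Placeholder values only; real batches hold image and target tensors.
example_batch = {key: None for key in REQUIRED_BATCH_KEYS}
incomplete_batch = {"current_image": None, "time_gap": None}

print(missing_batch_keys(example_batch))
print(missing_batch_keys(incomplete_batch))
```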

Output

The forward method returns a dictionary containing:

  • final: Primary fused risk logits
  • current: Risk prediction from current exam only
  • prior: Risk prediction from prior exam only
  • difference: Risk prediction from temporal difference features
  • flow_field: Learned deformation field for alignment
  • emb_final: Optional probabilistic ordinal embedding
  • log_var_final: Uncertainty estimate (if POE is enabled)
  • loss: Registration alignment loss (attention consistency)

These outputs are consumed by two helper methods:

  • get_risk_heads(outputs, batch)
    Uses the outputs from forward to construct structured dictionaries for each prediction head:

    • final: Uses final with years_to_cancer and years_to_last_followup
    • current: Uses current with years_to_cancer and years_to_last_followup
    • prior: Uses prior with years_to_cancer_prior and years_to_last_followup_prior
    • difference: Uses difference with years_to_cancer and years_to_last_followup
    Each head contains:

    • risk: Predicted logits
    • risk_label: Time-to-event target
    • years_lfu: Follow-up / censoring information
    • emb, log_var: Optional probabilistic embeddings (if POE is enabled)
    • weight: Contribution to the total training loss

This enables multi-head supervision, where the model jointly learns:

  • Fused longitudinal risk (primary task)
  • Current-only and prior-only predictions (auxiliary supervision)
  • Difference-based predictions capturing temporal change
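
The head structure described above can be sketched as a plain dictionary builder. This is an illustration, not the framework's get_risk_heads: the function names are hypothetical, the per-head weights are made-up values, and combining head losses as a weighted sum is an assumption about how the weight field is used.

```python
# Which batch targets each head is paired with, as listed above.
HEAD_TARGETS = {
    "final": ("years_to_cancer", "years_to_last_followup"),
    "current": ("years_to_cancer", "years_to_last_followup"),
    "prior": ("years_to_cancer_prior", "years_to_last_followup_prior"),
    "difference": ("years_to_cancer", "years_to_last_followup"),
}

# Illustrative weights (assumed): the fused head dominates.
HEAD_WEIGHTS = {"final": 1.0, "current": 0.5, "prior": 0.5, "difference": 0.5}


def build_risk_heads(outputs, batch):
    """Pair each head's logits with its targets and loss weight."""
    heads = {}
    for name, (label_key, lfu_key) in HEAD_TARGETS.items():
        heads[name] = {
            "risk": outputs[name],
            "risk_label": batch[label_key],
            "years_lfu": batch[lfu_key],
            "weight": HEAD_WEIGHTS[name],
        }
    # Probabilistic embeddings are only attached when POE produced them.
    if "emb_final" in outputs and "log_var_final" in outputs:
        heads["final"]["emb"] = outputs["emb_final"]
        heads["final"]["log_var"] = outputs["log_var_final"]
    return heads


def total_loss(heads, loss_fn):
    """Weighted sum of per-head losses (assumed aggregation)."""
    return sum(head["weight"] * loss_fn(head) for head in heads.values())


outputs = {
    "final": [0.1, 0.9], "current": [0.2, 0.8],
    "prior": [0.3, 0.7], "difference": [0.4, 0.6],
    "emb_final": [0.0, 0.0], "log_var_final": [-1.0, -1.0],
}
batch = {
    "years_to_cancer": 2, "years_to_cancer_prior": 3,
    "years_to_last_followup": 5, "years_to_last_followup_prior": 6,
}
heads = build_risk_heads(outputs, batch)
print(total_loss(heads, lambda head: 1.0))
```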

  • get_primary_risk_head(outputs)
    Uses the outputs from forward to return the final prediction for evaluation:

    • Uses the logits from the final fully connected layer (final)
    • If stochastic embeddings are used, predictions are averaged across samples
    • Softmax is applied to obtain a probability distribution over time horizons
    • The probabilities are converted into a cumulative risk score over a specified follow-up period

This represents the model's primary time-to-event risk estimate, incorporating both longitudinal information and ordinal structure.
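
The evaluation path described above (average across samples, softmax, cumulative risk) can be sketched in a few lines. The sketch follows the order of the listed steps and assumes that each bin corresponds to one time interval in increasing order, so cumulative risk within a horizon is the sum of the first bins. The function name and the bin layout are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def primary_risk_score(logit_samples, horizon_bins):
    """Cumulative risk over the first `horizon_bins` time bins.

    logit_samples holds one list of logits per stochastic sample;
    pass a single-element list when POE is disabled.
    """
    n_samples = len(logit_samples)
    n_bins = len(logit_samples[0])
    # Average predictions across samples, then normalise.
    mean_logits = [
        sum(sample[k] for sample in logit_samples) / n_samples
        for k in range(n_bins)
    ]
    probs = softmax(mean_logits)
    return sum(probs[:horizon_bins])


# Uniform logits over four bins: half the mass lies in the first two.
uniform_risk = primary_risk_score([[0.0, 0.0, 0.0, 0.0]], horizon_bins=2)
# Two stochastic samples that disagree on the first bin.
sampled_risk = primary_risk_score(
    [[2.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]], horizon_bins=1
)
print(uniform_risk, sampled_risk)
```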


🧩 Integration in This Framework

OA-BreaCR is implemented as a subclass of BaseRiskModel and:

  • Supports longitudinal two-timepoint mammography input
  • Integrates ordinal learning for time-to-event prediction
  • Includes attention-based feature alignment
  • Optionally uses probabilistic ordinal embeddings (POE) for uncertainty modeling
  • Produces multiple prediction heads for auxiliary supervision

⚙️ Key Components

  • CNN Encoder: Extracts deep visual features from mammograms
  • Attention Pooling Module: Produces spatial attention maps for alignment
  • Feature Registration Network: Estimates deformation fields between timepoints in the feature space
  • Spatial Transformer Block: Warps prior features into current space
  • Continuous Positional Encoding: Injects time-gap information into difference features
  • POE Latent Module (optional): Models uncertainty in ordinal embedding space

📉 Risk Prediction

OA-BreaCR performs ordinal risk prediction over time, explicitly modeling:

  • Time-to-cancer as an ordinal regression problem
  • Uncertainty via probabilistic embeddings (optional POE)
  • Longitudinal change through feature alignment

The final prediction is derived from:

  • final logits → softmax → cumulative risk score
  • Optional probabilistic aggregation when POE is enabled

Additionally, auxiliary heads (current, prior, difference) provide:

  • Regularization
  • Improved temporal representation learning
  • Better interpretability of longitudinal changes