from lets_plot import LetsPlot
LetsPlot.setup_html()
import numpy as np
from xaitimesynth import (
    TimeSeriesBuilder,
    gaussian_noise,
    gaussian_pulse,
    seasonal,
    plot_components,
)
from xaitimesynth.metrics import (
    auc_pr_score,
    auc_roc_score,
    relevance_mass_accuracy,
    relevance_rank_accuracy,
)
Package Overview¶
This guide explains the core concept, design, and key terminology of xaitimesynth.
Purpose¶
xaitimesynth generates synthetic time series data for evaluating explainable AI (XAI) methods, with a primary focus on feature attribution methods. Because the data is synthetic, the exact locations of the important features are known, enabling direct measurement of whether attribution methods correctly identify them.
In real-world time series classification, we rarely have ground truth about which time points or regions of a time series truly matter for a task. xaitimesynth solves this by generating data where features have known locations, tracking those locations internally, and providing metrics to compare attributions against the ground truth.
Core Concept¶
Every time series follows an additive composition model:
x = background + feature
- Background: The base signal pattern (noise, random walks, seasonal patterns, trends)
- Feature: The class-discriminating pattern placed in a specific time window
For a two-class problem:
- Class 0: background + feature A (e.g., a downward level shift)
- Class 1: background + feature B (e.g., a seasonal burst)
Since we know exactly where each feature is located, we can directly measure whether attribution methods highlight the right regions.
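The additive composition can be sketched directly in plain numpy. This is a hand-rolled illustration of the x = background + feature idea, not the library's internal code; the window bounds and pulse shape below are made up for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
n_timesteps = 100

# background: plain gaussian noise over the full series
background = rng.normal(scale=1.0, size=n_timesteps)

# feature: a gaussian pulse confined to a known window (here timesteps 30-59)
feature = np.zeros(n_timesteps)
t = np.arange(30, 60)
feature[t] = 3.0 * np.exp(-0.5 * ((t - 45) / 5.0) ** 2)

# additive composition: x = background + feature
x = background + feature
```

Because the feature is zero outside its window, the window itself is exact ground truth for where the class-discriminating information lives.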
Design¶
The builder uses chainable method calls that naturally compose: each call adds a layer that gets summed into the final signal. This mirrors the additive x = background + feature model directly in the API:
# define dataset structure once
base_builder = (
    TimeSeriesBuilder(n_timesteps=100, normalization="zscore")
    .for_class(0)
    .add_signal(gaussian_noise(sigma=1.0))
    .add_feature(gaussian_pulse(amplitude=3.0), random_location=True, length_pct=0.3)
    .for_class(1)
    .add_signal(gaussian_noise(sigma=1.0))
    .add_feature(
        seasonal(period=10, amplitude=3.0), random_location=True, length_pct=0.3
    )
)
# generate train and test sets with different seeds
train = base_builder.clone(n_samples=200, random_state=42).build()
test = base_builder.clone(n_samples=50, random_state=43).build()
# visualise instances from the created dataset (by default, the first observation from each class)
plot = plot_components(train)
plot.show()
# replace with your XAI method output; shape must be (n_samples, n_timesteps, n_dims)
np.random.seed(0)
attributions = np.random.rand(*test["X"].shape)
# evaluate random attributions against ground truth (all samples)
print("Metric scores for random attributions (all samples):")
print(f"{auc_pr_score(attributions, test, normalize=True):.3f} | AUC-PR")
print(f"{auc_roc_score(attributions, test):.3f} | AUC-ROC")
print(f"{relevance_mass_accuracy(attributions, test):.3f} | Relevance Mass Accuracy")
print(f"{relevance_rank_accuracy(attributions, test):.3f} | Relevance Rank Accuracy")
# evaluate for a single class only (e.g., class 1)
c1_indices = np.where(test["y"] == 1)[0].tolist()
c1_attributions = attributions[c1_indices]
print(f"\nMetric scores for class 1 only ({len(c1_indices)} samples):")
auc_pr_c1 = auc_pr_score(
c1_attributions, test, sample_indices=c1_indices, normalize=True
)
auc_roc_c1 = auc_roc_score(c1_attributions, test, sample_indices=c1_indices)
rma_c1 = relevance_mass_accuracy(c1_attributions, test, sample_indices=c1_indices)
rra_c1 = relevance_rank_accuracy(c1_attributions, test, sample_indices=c1_indices)
print(f"{auc_pr_c1:.3f} | AUC-PR")
print(f"{auc_roc_c1:.3f} | AUC-ROC")
print(f"{rma_c1:.3f} | Relevance Mass Accuracy")
print(f"{rra_c1:.3f} | Relevance Rank Accuracy")
Metric scores for random attributions (all samples):
0.072 | AUC-PR
0.504 | AUC-ROC
0.302 | Relevance Mass Accuracy
0.311 | Relevance Rank Accuracy

Metric scores for class 1 only (25 samples):
0.052 | AUC-PR
0.489 | AUC-ROC
0.295 | Relevance Mass Accuracy
0.292 | Relevance Rank Accuracy
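To make concrete what a metric like relevance mass accuracy measures, here is a hand-rolled sketch (not the library's implementation): the fraction of total absolute attribution mass that falls inside the ground-truth feature mask. The toy attribution values below are made up:

```python
import numpy as np

def relevance_mass(attribution, mask):
    # share of absolute attribution mass inside the ground-truth region
    a = np.abs(attribution)
    return a[mask.astype(bool)].sum() / a.sum()

attribution = np.array([0.1, 0.2, 0.9, 0.8, 0.1])
mask = np.array([0, 0, 1, 1, 0])  # feature occupies timesteps 2-3
score = relevance_mass(attribution, mask)  # close to 1 when mass concentrates in the mask
```

A perfectly focused attribution scores 1.0; uniformly random attributions score roughly the fraction of timesteps covered by the mask, which is why the random baseline above lands near 0.3 with length_pct=0.3.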
Internally, the package has three main layers for data generation: component functions (user-facing API in components.py), generator functions (array production in generators.py), and the registry (registry.py) that connects them. See Adding Generators for details.
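The registry layer can be pictured as a name-to-generator mapping. The following is a hypothetical sketch of that pattern, not xaitimesynth's actual registry.py; all names here are illustrative:

```python
import numpy as np

# hypothetical registry: maps a component name to its generator and type
REGISTRY = {}

def register(name, kind):
    """Decorator that files a generator function under a name and component type."""
    def decorator(fn):
        REGISTRY[name] = {"generator": fn, "type": kind}
        return fn
    return decorator

@register("gaussian_noise", kind="signal")
def gaussian_noise_generator(n_timesteps, sigma=1.0, rng=None):
    # produce the raw array for one series
    rng = rng or np.random.default_rng()
    return rng.normal(scale=sigma, size=n_timesteps)

# a component function could then look up its generator by name
entry = REGISTRY["gaussian_noise"]
series = entry["generator"](n_timesteps=100, sigma=0.5)
```

Keeping the user-facing component functions separate from the array-producing generators lets new generators be added without touching the builder.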
Terminology¶
Component Types (Registry Level)¶
Each component is registered with a type indicating its intended use:
| Type | Intended use | Examples |
|---|---|---|
| "signal" | Full-length background patterns | random_walk, gaussian_noise, seasonal |
| "feature" | Localized discriminative patterns | peak, trough, gaussian_pulse |
| "both" | Works well either way | constant, trend, manual |
This is purely for discoverability via list_signal_components() / list_feature_components(). It does not restrict usage — you can use any component with add_signal() or add_feature().
Signals vs Features (Builder Level)¶
| Method | Purpose | Stored in |
|---|---|---|
| add_signal() | Background patterns | components.background |
| add_feature() | Class-discriminating patterns with tracked locations | components.features |
All signals are combined additively into the per-class background. Features are tracked separately so that their locations serve as ground truth for XAI evaluation; each feature's location is recorded as a binary mask.
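A ground-truth feature mask of the kind described above is simply a binary array over timesteps. A minimal sketch (the window bounds here are illustrative, not produced by the library):

```python
import numpy as np

n_timesteps = 20
start, length = 5, 6  # hypothetical feature window

mask = np.zeros(n_timesteps, dtype=int)
mask[start:start + length] = 1  # 1 inside the feature window, 0 elsewhere
```

Metrics then only need to compare an attribution array against this mask, timestep by timestep.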
Key Terms¶
| Term | Meaning |
|---|---|
| n_timesteps | Length of each time series |
| n_samples | Number of time series to generate |
| n_dimensions | Number of channels (1 = univariate, >1 = multivariate) |
| Feature mask | Binary array: 1 where a feature is present, 0 elsewhere |
| Shared randomness | When True, the same random values are used across all dimensions |
| Shared location | When True together with random_location=True, the feature appears at the same position across dimensions |
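The effect of shared versus independent randomness across dimensions can be sketched with plain numpy (illustrative only, not the builder's internals):

```python
import numpy as np

rng = np.random.default_rng(42)
n_timesteps, n_dims = 50, 3

# shared randomness: a single draw broadcast to every channel
shared = np.broadcast_to(rng.normal(size=(n_timesteps, 1)), (n_timesteps, n_dims))

# independent randomness: a separate draw per channel
independent = rng.normal(size=(n_timesteps, n_dims))
```

With shared randomness every channel carries an identical noise trace, which is useful when the multivariate structure itself should not be a confound in an XAI evaluation.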