# Irregular earthquake streams can still be compared
A real irregular-data example using 2024 USGS earthquakes from California and Alaska, with explicit timestamps kept in the comparison instead of being flattened away.
The headline result: only weak pointwise agreement (Pearson r 0.12, Spearman rho 0.08, Kendall tau 0.05), with the lowest scores in spectral and derivative similarity, so timing or regime differences probably matter.
This is the minimal script a user would actually run: load the data, call EchoTime, and inspect the returned objects.
```python
from pathlib import Path

import pandas as pd

from echotime import compare_series, profile_dataset

data_path = Path(__file__).resolve().parents[1] / "data" / "real_usgs_earthquakes_ca_ak_2024.csv"
df = pd.read_csv(data_path)

# Convert ISO timestamps to float seconds since the epoch, keeping the real event times.
df["timestamp"] = pd.to_datetime(df["timestamp"], utc=True, format="mixed").astype("int64") / 1_000_000_000
df["event_type"] = df["magnitude"].map(lambda value: "M4+" if value >= 4.0 else "M2.5-4")

california = df.loc[df["region"] == "California"].sort_values("timestamp")
alaska = df.loc[df["region"] == "Alaska"].sort_values("timestamp")

report = compare_series(
    california["magnitude"],
    alaska["magnitude"],
    left_timestamps=california["timestamp"],
    right_timestamps=alaska["timestamp"],
    left_name="California earthquakes",
    right_name="Alaska earthquakes",
)

profile = profile_dataset(
    df.rename(columns={"region": "subject", "magnitude": "value"})[["timestamp", "value", "subject", "event_type"]],
    domain="earth_science",
)

print(report.to_summary_card_markdown())
print(profile.to_summary_card_markdown())
```
You should get a usable similarity verdict plus an event-stream profile explaining why irregularity and burstiness matter. The similarity card looks like this:
```markdown
# EchoTime similarity summary

**Compared:** California earthquakes vs Alaska earthquakes

## Headline

California earthquakes vs Alaska earthquakes: Pearson r 0.12, Spearman rho 0.08, Kendall tau 0.05. The weakest agreement appears in spectral similarity and derivative similarity, so timing or regime differences probably matter.

## Familiar statistics

| metric | value |
|---|---:|
| Pearson r | 0.121 |
| Spearman rho | 0.082 |
| Kendall tau | 0.050 |
| Best-lag Pearson r | 0.132 |
| Mutual info | 0.068 |
| First-difference r | 0.133 |

## Time-series-specific metrics

| plain-language label | score |
|---|---:|
| dtw similarity | 0.471 |
| spectral similarity | 0.349 |
| derivative similarity | 0.133 |
| shape similarity | 0.127 |

## Recommended next actions

- Plot both series after z-score normalization to show the shared shape without scale differences.
- Run rolling or windowed similarity if you expect the relationship to change over time.
- Use structural-profile similarity when scales, frequencies, or observation modes differ too much for raw-shape comparison.
```
Why this example matters:

- Pearson 0.12 / Spearman 0.08 / mutual information 0.07 is only part of the story; the event-timestamp handling matters just as much here.
- The comparison respects event timing instead of first forcing both regions onto a regular daily grid.
- The dataset profile adds event-stream context around burstiness, irregularity, and heterogeneity.
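The "rolling or windowed similarity" recommendation in the card can be sketched with plain NumPy. Everything here is illustrative: the synthetic magnitude-like series, the window size, and the variable names are assumptions, not part of EchoTime's API.

```python
import numpy as np

# Synthetic magnitude-like series standing in for the two regions;
# windowed similarity is easiest to show on aligned arrays.
rng = np.random.default_rng(0)
left = rng.normal(3.0, 0.6, 500)
right = 0.2 * left + rng.normal(3.2, 0.7, 500)

def zscore(x):
    # Z-score normalization, as the first recommendation suggests for plotting.
    return (x - x.mean()) / x.std()

# One Pearson r per non-overlapping window of 100 events; a drifting value
# across windows suggests the relationship changes over time.
window = 100
windowed_r = [
    float(np.corrcoef(zscore(left[i:i + window]), zscore(right[i:i + window]))[0, 1])
    for i in range(0, len(left) - window + 1, window)
]
```

If the per-window values scatter widely around the global r, a single headline correlation is hiding regime structure.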
This is the right pattern when your own data live in a long table with real timestamps:

- Rename long-table columns to `subject`, `timestamp`, `channel`, and `value` before calling `profile_dataset(df, domain='generic')`.
- If you want to compare two specific trajectories directly, pass both the values and their timestamps into `compare_series(...)`.
- Do not regularize away the gaps before the first pass; EchoTime is designed to read them.
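A minimal sketch of that column preparation using only pandas; the input column names `site`, `time_utc`, and `mag` are hypothetical stand-ins for whatever your table uses.

```python
import pandas as pd

# Hypothetical long table with one row per event.
raw = pd.DataFrame({
    "site": ["CA", "CA", "AK", "AK"],
    "time_utc": ["2024-01-01T00:00:00Z", "2024-01-03T07:30:00Z",
                 "2024-01-01T02:15:00Z", "2024-01-02T18:00:00Z"],
    "mag": [4.1, 2.9, 3.3, 5.0],
})

# Rename to the long-table column names the profiler expects.
long_df = raw.rename(columns={"site": "subject", "time_utc": "timestamp", "mag": "value"})

# Keep the real, irregular event times as float seconds; do not resample to a grid.
long_df["timestamp"] = pd.to_datetime(long_df["timestamp"], utc=True).astype("int64") / 1_000_000_000
long_df = long_df.sort_values(["subject", "timestamp"])
```

From here, `long_df` is in the shape the example above feeds to `profile_dataset`, with the gaps between events preserved.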
The overlay is shown in event order, but the actual comparison also uses the real timestamps from the USGS feed.
The radar keeps the irregular-event comparison interpretable without pretending one scalar is enough.
The profile makes burstiness, irregularity, and heterogeneity visible before you start modelling.
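What the profile means by burstiness and irregularity can be approximated from inter-event gaps alone. This sketch uses the gap coefficient of variation and the Goh-Barabasi burstiness index; these are assumed stand-ins for the kind of statistic the profile reports, not EchoTime's exact definitions.

```python
import numpy as np

# Irregular event times in seconds: a tight cluster, a long quiet gap, more events.
timestamps = np.array([0.0, 5.0, 6.0, 6.5, 40.0, 41.0, 90.0])
gaps = np.diff(timestamps)

# Coefficient of variation of gaps: ~0 for a regular grid, above 1 for bursty streams.
cv = float(gaps.std() / gaps.mean())

# Goh-Barabasi burstiness: -1 perfectly regular, 0 Poisson-like, toward +1 very bursty.
burstiness = float((gaps.std() - gaps.mean()) / (gaps.std() + gaps.mean()))
```

Regularizing such a stream onto a daily grid would collapse the cluster and pad the quiet stretch with fill values, which is exactly the information the profile is built to surface.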