# AI attention breakout analogs
A flagship real-data example asking which historical attention curve DeepSeek looked more like over its first breakout window: ChatGPT or Threads.
DeepSeek vs Threads: Pearson r 0.94, Spearman rho 1.00, Kendall tau 1.00. The levels line up much more than the day-to-day changes, so the relationship is easier to defend as a broad shape analogy than as a local-dynamics match. The weakest agreement appears in derivative similarity, so timing or regime differences probably matter.
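The headline's levels-versus-changes caveat is easy to reproduce on toy data: two unrelated cumulative curves correlate strongly at the level of totals simply because both rise, while their daily increments do not. A minimal sketch with hypothetical data (only numpy and pandas assumed, nothing EchoTime-specific):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Two *independent* breakouts: daily attention drawn separately for each.
a = pd.Series(rng.exponential(scale=1.0, size=60)).cumsum()
b = pd.Series(rng.exponential(scale=1.0, size=60)).cumsum()

level_r = a.corr(b)               # levels: inflated by the shared upward trend
diff_r = a.diff().corr(b.diff())  # increments: closer to the true (non-)relationship
print(round(level_r, 3), round(diff_r, 3))
```

The gap between the two numbers is what the card's separate "First-difference r" row is meant to expose.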
This is the minimal script a user would actually run: load the data, call EchoTime, and inspect the returned report objects.
```python
from pathlib import Path

import pandas as pd

from echotime import compare_series, rolling_similarity

# Daily cumulative attention curves for DeepSeek, ChatGPT, and Threads.
data_path = Path(__file__).resolve().parents[1] / "data" / "real_ai_attention_breakouts.csv"
df = pd.read_csv(data_path)

threads_report = compare_series(
    df["deepseek_cumulative"],
    df["threads_cumulative"],
    left_name="DeepSeek",
    right_name="Threads",
)
chatgpt_report = compare_series(
    df["deepseek_cumulative"],
    df["chatgpt_cumulative"],
    left_name="DeepSeek",
    right_name="ChatGPT",
)

# Windowed similarity shows whether the match persists over time.
windows = rolling_similarity(df["deepseek_cumulative"], df["threads_cumulative"], window=20, step=5)

print(threads_report.to_summary_card_markdown())
print(
    {
        "threads_similarity": round(threads_report.similarity_score, 3),
        "chatgpt_similarity": round(chatgpt_report.similarity_score, 3),
        "rolling_windows": len(windows),
    }
)
```
You should see one analog with consistently stronger Pearson, Spearman, and DTW evidence than the other, plus rolling windows that support the choice. The printed summary card for the Threads comparison looks like this:
# EchoTime similarity summary
**Compared:** DeepSeek vs Threads
## Headline
DeepSeek vs Threads: Pearson r 0.94, Spearman rho 1.00, Kendall tau 1.00. The levels line up much more than the day-to-day changes, so the relationship is easier to defend as a broad shape analogy than as a local-dynamics match. The weakest agreement appears in derivative similarity, so timing or regime differences probably matter.
## Familiar statistics
| metric | value |
|---|---:|
| Pearson r | 0.936 |
| Spearman rho | 1.000 |
| Kendall tau | 1.000 |
| Best-lag Pearson r | 0.936 |
| Mutual info | 0.641 |
| First-difference r | 0.199 |
## Time-series-specific metrics
| plain-language label | score |
|---|---:|
| dtw similarity | 0.608 |
| trend similarity | 0.540 |
| spectral similarity | 0.464 |
| shape similarity | 0.431 |
## Recommended next actions
- Plot both series after z-score normalization to show the shared shape without scale differences.
- Run rolling or windowed similarity if you expect the relationship to change over time.
- Use structural-profile similarity when scales, frequencies, or observation modes differ too much for raw-shape comparison.
- For cumulative or monotonic inputs, compare first differences or daily increments before making an analog claim.
Takeaways from this run:
- DeepSeek vs Threads: Pearson 0.94 / Spearman 1.00 / Mutual info 0.64; DeepSeek vs ChatGPT: Pearson 0.67 / Spearman 1.00 / Mutual info 0.47.
- The rolling view makes the analog choice inspectable instead of hiding disagreement behind one opaque number.
- This is a real analog-selection workflow on public data, not a hand-drawn breakout curve.
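The first recommended action, z-score normalization before plotting, takes one helper. A minimal sketch with toy curves (the series here are hypothetical; nothing EchoTime-specific assumed):

```python
import pandas as pd

def zscore(s: pd.Series) -> pd.Series:
    # Center to mean 0 and scale to unit variance so curves of very
    # different magnitudes overlay cleanly on one axis.
    return (s - s.mean()) / s.std(ddof=0)

left = pd.Series([1.0, 3.0, 6.0, 10.0, 15.0])       # toy cumulative curve
right = pd.Series([10.0, 28.0, 61.0, 99.0, 150.0])  # similar shape, ~10x scale

z_left, z_right = zscore(left), zscore(right)
print(z_left.round(2).tolist(), z_right.round(2).tolist())
```

After normalization, both series have mean 0 and unit variance, so any remaining visual gap is shape, not scale.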
This workflow generalizes to any "which historical analog is closest?" question:
- Replace the example curves with your own daily installs, signups, traffic, or attention series.
- Compare against several historical candidates, not just one, if you want a defensible analog story.
- Keep event dates such as releases or campaigns nearby even if the first pass only compares the numeric curves.
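Comparing against several candidates is just a loop over scores. A library-agnostic sketch using Pearson r on first differences as a stand-in metric (in a real run, compare_series would supply the full report; the series and names below are hypothetical):

```python
import pandas as pd

def score_against(target: pd.Series, candidate: pd.Series) -> float:
    # Stand-in metric: Pearson r on increments, so the shared upward
    # trend of cumulative curves does not inflate the score.
    return target.diff().corr(candidate.diff())

target = pd.Series([1, 2, 4, 8, 12, 13], dtype=float)  # your own cumulative series
candidates = {
    "analog_a": pd.Series([2, 4, 8, 16, 24, 26], dtype=float),  # same shape, doubled scale
    "analog_b": pd.Series([1, 5, 6, 7, 13, 20], dtype=float),   # different dynamics
}

scores = {name: score_against(target, s) for name, s in candidates.items()}
best = max(scores, key=scores.get)
print(best, {k: round(v, 3) for k, v in scores.items()})
```

Ranking every candidate with the same metric is what makes the analog story defensible rather than cherry-picked.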
- The cumulative attention curves stay close enough that the analog is visible before you read the coefficients and radar.
- The analog call is backed by multiple time-series metrics rather than a single scalar.
- The rolling component mean shows whether the match survives beyond the initial breakout surge.
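The rolling check can be approximated without EchoTime's rolling_similarity, using a rolling Pearson on increments. A sketch with synthetic curves (the window size and noise level are arbitrary choices for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
daily = pd.Series(rng.exponential(size=40))   # one breakout's daily attention
noise = rng.normal(scale=0.2, size=40)
a = daily.cumsum()
b = (daily * 1.5 + noise).cumsum()            # a noisy, rescaled analog

# Rolling Pearson on first differences: does the match persist past the surge?
rolling_r = a.diff().rolling(window=10).corr(b.diff())
print(rolling_r.dropna().round(2).tail(3).tolist())
```

A match that only holds in the first window is a surge artifact; a genuine analog keeps its rolling correlation up through later windows too.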