agentforecast · sunny · scientific · open source
Stable forecasting core

agentforecast public gallery

AgentForecast is a lightweight forecast-to-publish layer built on top of a stable forecasting core, with optional research and agent/tooling surfaces.

Zipeng Wu, The University of Birmingham (zxw365@student.bham.ac.uk)
white-background-first · research + pack surfaces · agent-friendly outputs
Product surfaces

One package, four explicit surfaces

The goal is not to blur packs, research, operations, and agent tooling into one contract. The homepage should make those boundaries obvious.

Pack Surface

One-shot forecast packs, backend comparison, charts, cards, markdown, and JSON artifacts.

forecast_dataframe, compare_backends_frame, forecast_csv

Research Surface

Explicit benchmark-first APIs with fit / predict / update, strict_mode, multi-horizon output, and auditable defaults.

OnlineForecaster, forecast_benchmark_dataframe

Operations Surface

Streaming watch flows, drift diagnostics, and low-latency rolling forecast outputs.

forecast_stream_dataframe, stream_ewm, river_*

Agent Surface

Tool server, MCP, structured JSON contracts, and environment diagnostics for automation.

doctor, tool_server, mcp_server
Install and doctor

Choose a calm first path

README and the public homepage should agree on install modes, backend expectations, and the first diagnostic command to run.

| Persona | Install command | Best for |
| --- | --- | --- |
| beginner / pack user | pip install agentforecast | one-shot forecast packs and backend auto-routing |
| research / benchmark | pip install "agentforecast[stats,ml]" | strict benchmark runs, comparison work, OnlineForecaster |
| streaming / operations | pip install "agentforecast[stream]" | River backends and streaming watch flows |
| full optional stack | pip install "agentforecast[all]" | widest adapter coverage |

Doctor first

Run environment diagnostics before guessing which backend or extra to trust.

python -m agentforecast.cli doctor

Current environment snapshot

Available workflows: pack=True, research=True, streaming=True.

Missing extras: river_linear, river_snarimax, river_holtwinters, neural_nhits, automl_autogluon, tabpfn_regression
Model selection

Real forecasting models, not toy demos

agentforecast is built around practical forecasting families that people actually use: baseline references, ETS and ARIMA, lag-feature regression, and online River-style models. Optional deep or AutoML adapters stay secondary instead of dominating the public surface.

What the package actually includes

The public homepage should answer model choice first. Auto routing stays grounded in these real, usable model families rather than novelty-only showcase algorithms.

Baseline and seasonal references

Real practical baselines for sanity checks and low-data series.

naive, seasonal_naive, moving_average, drift
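The four baseline families can be sketched in a few lines of plain Python. These are hypothetical illustrations of the ideas behind the model ids, not the agentforecast implementations:

```python
def naive(y, horizon):
    # Last-value carry: repeat the final observation for every step.
    return [y[-1]] * horizon

def seasonal_naive(y, horizon, season=12):
    # Repeat the value from one season earlier.
    return [y[-season + (h % season)] for h in range(horizon)]

def moving_average(y, horizon, window=3):
    # Forecast the mean of the most recent `window` observations.
    level = sum(y[-window:]) / window
    return [level] * horizon

def drift(y, horizon):
    # Extend the average per-step change between first and last points.
    slope = (y[-1] - y[0]) / (len(y) - 1)
    return [y[-1] + slope * (h + 1) for h in range(horizon)]
```

Cheap references like these are what make "is the fancy model actually better?" an answerable question on low-data series.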

Statistical workhorses

Classical models that remain strong in real business forecasting when data is limited and seasonality matters.

stats_ets, stats_arima, statsforecast_autoets, statsforecast_autoarima

Time series as regression

Lag-feature regressors for practical tabular forecasting with exogenous variables.

ml_ridge, ml_histgb, ml_xgboost, ml_lightgbm, ml_catboost, mlforecast_linear, mlforecast_xgboost
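The "time series as regression" idea is simple: turn a series into lagged feature rows so any tabular regressor can be trained on them. A minimal sketch, with a hypothetical helper name (this is not the agentforecast feature spec):

```python
def lag_feature_rows(y, lags=(1, 2, 3), horizon=1):
    """Build (features, target) rows from a series for a tabular regressor.

    For each target y[i + horizon - 1], the features are the `lags` most
    recent observations available at forecast time (lag 1 = y[i - 1]).
    """
    X, targets = [], []
    max_lag = max(lags)
    for i in range(max_lag, len(y) - horizon + 1):
        X.append([y[i - lag] for lag in lags])
        targets.append(y[i + horizon - 1])
    return X, targets
```

The resulting rows can be fed to ridge, boosting, or tree-ensemble regressors; exogenous columns would simply be appended to each feature row.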

Streaming and online models

Incremental models for rolling updates, low-latency refreshes, and drift-aware operation.

stream_ewm, stream_sgd, river_linear, river_snarimax, river_holtwinters
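The streaming family is defined by its update loop rather than by a fit call. A minimal sketch in the spirit of stream_ewm (hypothetical class, not the agentforecast backend):

```python
class EWMForecaster:
    """Online forecaster holding an exponentially weighted level."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha  # weight given to each new observation
        self.level = None

    def update(self, value):
        # Blend the new observation into the running level, one point at a time.
        if self.level is None:
            self.level = value
        else:
            self.level = self.alpha * value + (1 - self.alpha) * self.level
        return self.level

    def predict(self, horizon=1):
        # A flat level is the natural multi-step forecast for an EWM model.
        return [self.level] * horizon
```

Because update() touches one observation, refreshes stay cheap enough for low-latency rolling forecasts.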
Workflow

Point. Route. Publish.

The gallery builder is opinionated about clarity: clean stats, high-contrast cards, simple code blocks, and obvious links to the underlying artifacts.

1. Point

Use a dataset id, case id, local CSV, directory, or URL.

2. Route

Let backend auto-selection compare candidate methods or choose one explicitly.

3. Publish

Get charts, cards, markdown, JSON, and leaderboard artifacts.

4. Reuse

Push the static gallery to GitHub Pages or consume the JSON from an agent.
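The route step boils down to a holdout comparison that keeps its leaderboard. A hypothetical sketch of that logic (route_backend and its signature are assumptions, not the agentforecast API):

```python
def route_backend(y, candidates, holdout=4):
    """Score each candidate on a holdout tail and pick the lowest MAE.

    `candidates` maps a backend name to a function
    (train_series, horizon) -> list of forecasts.
    Returns the winner plus the full leaderboard, so model choice
    stays visible instead of hidden.
    """
    train, test = y[:-holdout], y[-holdout:]
    leaderboard = {}
    for name, forecast in candidates.items():
        preds = forecast(train, holdout)
        leaderboard[name] = sum(abs(p - a) for p, a in zip(preds, test)) / holdout
    best = min(leaderboard, key=leaderboard.get)
    return best, leaderboard
```

On a perfectly trending series, a drift candidate beats last-value carry, and the leaderboard records by how much.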

Forecast packs

Public runs, ready to scan

Every card is built from the same product contract: selected backend, transparent metrics, copied artifacts, and a clean path into the case detail page.

Model surface

What is inside this package

Before anyone looks at external benchmarks, they should be able to scan the backend families, concrete model ids, install tiers, serious Python entry points, and the capability matrix that agentforecast exposes.

Backend families and model ids

This is the forecasting surface behind the gallery, grouped the way a human evaluator would usually reason about model choice.

baseline — 4/4 available in this build

Baseline references

Fast reference models for last-value carry, recent averages, drift, and simple seasonal repeats.

  • Model ids: naive, seasonal_naive, moving_average, drift
  • Providers: agentforecast
  • Install extras: base
Tags: fast, long_horizon, low_data
classical — 4/4 available in this build

Classical forecasting

ARIMA and ETS style models for interpretable seasonality, smoother long-horizon structure, and low-data runs.

  • Model ids: stats_arima, stats_ets, statsforecast_autoarima, statsforecast_autoets
  • Providers: statsmodels, StatsForecast
  • Install extras: stats
Tags: accurate, long_horizon, low_data, scalable
tabular — 7/7 available in this build

Time series as regression

Lag-feature regression with Ridge, boosting, tree ensembles, and optional MLForecast-style adapters.

  • Model ids: ml_ridge, ml_histgb, ml_xgboost, ml_lightgbm, ml_catboost, mlforecast_linear, mlforecast_xgboost
  • Providers: scikit-learn, XGBoost, LightGBM, CatBoost, MLForecast
  • Install extras: ml
Tags: accurate, exogenous, fast, low_data, tabular
streaming — 2/5 available in this build

Streaming and online learning

Incremental forecasters and online regressors for rolling updates, drift monitoring, and low-latency refreshes.

  • Model ids: stream_ewm, stream_sgd, river_linear, river_snarimax, river_holtwinters
  • Providers: agentforecast, scikit-learn, River
  • Install extras: base, ml, stream
Tags: accurate, fast, long_horizon, low_data, streaming
deep — 0/1 available in this build

Deep forecasting adapters

Optional neural backends for longer-horizon experiments when the lean surface is not enough.

  • Model ids: neural_nhits
  • Providers: NeuralForecast
  • Install extras: deep
Tags: deep, long_horizon
automl — 0/1 available in this build

AutoML adapters

Optional automation paths for larger search spaces and managed forecasting workflows.

  • Model ids: automl_autogluon
  • Providers: AutoGluon
  • Install extras: automl
Tags: accurate, automl
tabpfn — 0/1 available in this build

Experimental tabular priors

Optional TabPFN-style regressors for compact tabular forecasting experiments.

  • Model ids: tabpfn_regression
  • Providers: TabPFN
  • Install extras: tabpfn
Tags: experimental, tabular

Backend capability matrix

Trust tiers and capability flags should be visible instead of hidden behind README prose or runtime surprises.

| Backend | Extra | Tier | Streaming | Exogenous | Conformal | Direct | Recursive |
| --- | --- | --- | --- | --- | --- | --- | --- |
| naive | base | stable | no | no | no | no | yes |
| seasonal_naive | base | stable | no | no | no | no | yes |
| moving_average | base | stable | no | no | no | no | yes |
| drift | base | stable | no | no | no | no | yes |
| stats_arima | stats | stable | no | no | no | no | yes |
| stats_ets | stats | stable | no | no | no | no | yes |
| statsforecast_autoarima | stats | experimental | no | no | no | no | yes |
| statsforecast_autoets | stats | experimental | no | no | no | no | yes |
| ml_ridge | ml | stable | no | yes | no | yes | yes |
| ml_histgb | ml | beta | no | yes | no | yes | yes |
| ml_xgboost | ml | beta | no | yes | no | yes | yes |
| ml_lightgbm | ml | beta | no | yes | no | yes | yes |
| ml_catboost | ml | beta | no | yes | no | yes | yes |
| mlforecast_linear | ml | beta | no | yes | no | no | yes |
| mlforecast_xgboost | ml | beta | no | yes | no | no | yes |
| stream_ewm | base | stable | yes | no | no | no | yes |
| stream_sgd | ml | experimental | yes | no | no | no | yes |
| river_linear | stream | beta | yes | yes | yes | yes | yes |
| river_snarimax | stream | beta | yes | yes | yes | no | yes |
| river_holtwinters | stream | beta | yes | yes | yes | no | yes |
| neural_nhits | deep | experimental | no | no | no | no | yes |
| automl_autogluon | automl | experimental | no | no | no | no | yes |
| tabpfn_regression | tabpfn | experimental | no | yes | no | yes | yes |

Python API surface

These are the serious importable functions and objects a human reader can actually call, without having to start from a CLI command.

research · Python API

OnlineForecaster

The stable low-level forecasting contract for fit / predict / update benchmark workflows.

  • Call: OnlineForecaster(backend='river_linear', horizons=[1,3,6], strict_mode=True)
  • Use when: You need a serious forecasting core instead of a one-shot pack helper.
  • Returns: A stateful object with fit(), predict(), update(), and auditable state().
Tags: python, research, stateful
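The fit / predict / update contract can be shown with a toy drift-based forecaster. This is a hypothetical sketch of the contract's shape, not the real OnlineForecaster:

```python
class MiniOnlineForecaster:
    """Stateful forecaster sketch: fit from history, predict several
    horizons, fold in new observations online, expose auditable state."""

    def __init__(self, horizons=(1, 3, 6)):
        self.horizons = horizons
        self.history = []

    def fit(self, series):
        self.history = list(series)
        return self

    def update(self, value):
        # Online step: absorb one observation instead of refitting.
        self.history.append(value)
        return self

    def predict(self):
        # Drift-style multi-horizon forecast, keyed by horizon.
        last, first = self.history[-1], self.history[0]
        slope = (last - first) / max(len(self.history) - 1, 1)
        return {h: last + slope * h for h in self.horizons}

    def state(self):
        # Auditable snapshot of what the forecaster currently knows.
        return {"n_obs": len(self.history), "horizons": list(self.horizons)}
```

Keeping predict() keyed by horizon is what makes multi-horizon benchmark runs easy to score row by row.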
benchmark · Python API

forecast_benchmark_dataframe

One-shot low-level benchmark forecasting with horizons=[...] and recursive or direct mode.

  • Call: forecast_benchmark_dataframe(df, backend='ml_ridge', horizons=[1,3,6], mode='direct')
  • Use when: You want benchmark-safe forecasting without pack/report generation.
  • Returns: Forecast rows, prediction dict, feature spec, cleanup, and diagnostics.
Tags: python, benchmark, multi-horizon
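The recursive vs direct distinction behind mode= can be sketched without any estimator machinery (hypothetical helpers, not the agentforecast internals):

```python
def recursive_forecast(y, one_step_model, horizons):
    """Recursive mode: feed each one-step prediction back into the
    series to reach later horizons with a single model."""
    series = list(y)
    out = {}
    for h in range(1, max(horizons) + 1):
        pred = one_step_model(series)
        series.append(pred)  # the prediction becomes a pseudo-observation
        if h in horizons:
            out[h] = pred
    return out

def direct_forecast(y, per_horizon_models, horizons):
    """Direct mode: a separately trained model per horizon, so errors
    do not compound through fed-back predictions."""
    return {h: per_horizon_models[h](y) for h in horizons}
```

Recursive mode needs one model but compounds its own errors; direct mode trades extra training for independent per-horizon predictions.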
doctor · Python API

diagnose_environment

Report which backends are importable, which extras are missing, and which workflows the environment supports.

  • Call: diagnose_environment()
  • Use when: You want a calm answer to what is installed and what command to run next.
  • Returns: Profiles, install hints, available backends, and capability metadata.
Tags: python, doctor, environment
data · Python API

validate_series_frame / clean_series_frame

Separate strict validation from forgiving cleanup instead of hiding data mutation behind one helper.

  • Call: validate_series_frame(df, strict_mode=True) or clean_series_frame(df)
  • Use when: You need to inspect timestamps, duplicates, cadence, and cleanup behavior explicitly.
  • Returns: Validation report or cleaned PreparedSeries.
Tags: python, validation, cleanup
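The strict-validation vs forgiving-cleanup split can be illustrated on bare (timestamp, value) pairs. These are hypothetical toy helpers, not validate_series_frame / clean_series_frame themselves:

```python
def validate_series(rows, strict_mode=True):
    """Report problems without mutating. `rows` are (timestamp, value) pairs."""
    stamps = [t for t, _ in rows]
    issues = []
    if len(set(stamps)) != len(stamps):
        issues.append("duplicate timestamps")
    if any(b <= a for a, b in zip(stamps, stamps[1:])):
        issues.append("non-increasing order")
    if strict_mode and issues:
        # Strict mode refuses silently-dirty input instead of fixing it.
        raise ValueError("; ".join(issues))
    return issues

def clean_series(rows):
    """Forgiving cleanup: sort by timestamp, keep the last value per stamp."""
    deduped = {}
    for t, v in sorted(rows):
        deduped[t] = v
    return sorted(deduped.items())
```

Keeping the two behaviors in separate functions is the point: a benchmark run can demand clean input, while a pack run can repair it visibly.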
pack · Python API

forecast_dataframe / build_hosted_site

The pack-oriented path for public forecast artifacts and static gallery publishing.

  • Call: forecast_dataframe(df, name='series', horizon=12, backend='auto')
  • Use when: You want publishable charts, cards, markdown, JSON, and a browsable gallery.
  • Returns: RunResult artifacts or a generated static site.
Tags: python, static-site, pages

Package capabilities

These are the workflow-level features that matter in practice beyond the estimator list itself.

Auto routing

Compare candidate backends automatically and keep the leaderboard visible instead of hiding model choice.

Time series as regression

Expose lag points, spaced delays, rolling windows, calendar fields, and optional tsfresh descriptors.

Prediction intervals

Emit conformal-style lower_* / upper_* bands with coverage and width diagnostics.

Exogenous signals

Carry external regressors through lag-feature backends and adapters that support exogenous inputs.

Streaming drift watch

Publish rolling diagnostics and drift alert cards alongside streaming forecast packs.

Publishable artifacts

Export plots, cards, CSV, markdown, and JSON for both human readers and agents.
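The conformal-style bands mentioned above can be sketched with a split-conformal quantile of absolute holdout residuals. This is a hypothetical illustration of the idea, not the agentforecast interval code:

```python
import math

def conformal_band(residuals, point_forecasts, coverage=0.9):
    """Widen each point forecast by the empirical residual quantile.

    `residuals` come from a calibration holdout; the band half-width q
    is the conformal quantile of |residual| with the usual (n + 1)
    finite-sample correction. Band width (2 * q) doubles as the
    width diagnostic.
    """
    scores = sorted(abs(r) for r in residuals)
    k = min(len(scores) - 1, math.ceil((len(scores) + 1) * coverage) - 1)
    q = scores[k]
    return [(p - q, p, p + q) for p in point_forecasts]
```

Checking how often held-out actuals fall inside these (lower, point, upper) triples gives the coverage diagnostic to report alongside width.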

External benchmark hubs

Internal demo evidence stays visible, but broader public benchmark hubs are still the right place to compare against the field.

| Name | Kind | Link |
| --- | --- | --- |
| Monash / ForecastingData repository | dataset hub | https://forecastingdata.org/ |
| Monash baseline evaluation results | baseline results | https://forecastingdata.org/RMSSE.html |
| M4 competition | competition | https://www.unic.ac.cy/iff/research/forecasting/m-competitions/m4/ |
| M5 competition | competition | https://www.unic.ac.cy/iff/research/forecasting/m-competitions/m5/ |
| OpenTS-Bench / OpenTS | leaderboard hub | https://decisionintelligence.github.io/OpenTS/ |
| TFB repository | benchmark codebase | https://github.com/decisionintelligence/TFB |
| GIFT-Eval repository | benchmark | https://github.com/SalesforceAIResearch/gift-eval |
| GIFT-Eval public leaderboard | leaderboard | https://huggingface.co/spaces/Salesforce/GIFT-Eval |
| ForecastBench baseline leaderboard | leaderboard | https://www.forecastbench.org/baseline/ |
| ForecastBench tournament leaderboard | leaderboard | https://www.forecastbench.org/tournament/ |