Pack Surface
One-shot forecast packs, backend comparison, charts, cards, markdown, and JSON artifacts.
AgentForecast is a lightweight forecast-to-publish layer built on top of a stable forecasting core, with optional research and agent/tooling surfaces.
The goal is not to blur packs, research, operations, and agent tooling into one contract. The homepage should make those boundaries obvious.
- **Pack:** one-shot forecast packs, backend comparison, charts, cards, markdown, and JSON artifacts.
- **Research:** explicit benchmark-first APIs with fit / predict / update, `strict_mode`, multi-horizon output, and auditable defaults.
- **Streaming:** streaming watch flows, drift diagnostics, and low-latency rolling forecast outputs.
- **Agent:** tool server, MCP, structured JSON contracts, and environment diagnostics for automation.
README and the public homepage should agree on install modes, backend expectations, and the first diagnostic command to run.
| Persona | Install command | Best for |
|---|---|---|
| beginner / pack user | `pip install agentforecast` | one-shot forecast packs and backend auto-routing |
| research / benchmark | `pip install "agentforecast[stats,ml]"` | strict benchmark runs, comparison work, `OnlineForecaster` |
| streaming / operations | `pip install "agentforecast[stream]"` | River backends and streaming watch flows |
| full optional stack | `pip install "agentforecast[all]"` | widest adapter coverage |
Run environment diagnostics before guessing which backend or extra to trust.
Available workflows: `pack=True`, `research=True`, `streaming=True`.
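The diagnostics idea is easy to sketch outside the library: probe which optional packages are importable before trusting an extra. The extra-to-package mapping below is an assumption inferred from the install table, not agentforecast's actual logic:

```python
from importlib.util import find_spec

# Map each optional extra to third-party packages it would plausibly pull in.
# These package names are illustrative guesses based on the install table.
EXTRA_PACKAGES = {
    "stats": ["statsmodels", "statsforecast"],
    "ml": ["sklearn", "xgboost", "lightgbm"],
    "stream": ["river"],
}

def diagnose_extras():
    """Report which optional packages are importable, without importing them."""
    report = {}
    for extra, packages in EXTRA_PACKAGES.items():
        missing = [p for p in packages if find_spec(p) is None]
        report[extra] = {"available": not missing, "missing": missing}
    return report

for extra, status in diagnose_extras().items():
    print(f"{extra}: available={status['available']} missing={status['missing']}")
```

Checking `find_spec` instead of importing keeps the diagnostic cheap and side-effect free.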
agentforecast is built around practical forecasting families that people actually use: baseline references, ETS and ARIMA, lag-feature regression, and online River-style models. Optional deep or AutoML adapters stay secondary instead of dominating the public surface.
The public homepage should answer the model-choice question first. Auto routing stays grounded in these real, usable model families rather than novelty-only showcase algorithms.
- **Baselines:** practical reference models for sanity checks and low-data series.
- **Classical (ETS / ARIMA):** models that remain strong in real business forecasting when data is limited and seasonality matters.
- **Lag-feature regression:** practical tabular forecasting with exogenous variables.
- **Online / streaming:** incremental models for rolling updates, low-latency refreshes, and drift-aware operation.
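The baseline family is small enough to state exactly. A self-contained sketch of the four reference forecasters named above (an illustration of the ideas, not agentforecast's implementation):

```python
def naive(history, horizon):
    """Last-value carry: repeat the most recent observation."""
    return [history[-1]] * horizon

def seasonal_naive(history, horizon, season_length):
    """Simple seasonal repeat: replay the last full season."""
    season = history[-season_length:]
    return [season[i % season_length] for i in range(horizon)]

def moving_average(history, horizon, window=3):
    """Forecast the mean of the most recent window at every step."""
    mean = sum(history[-window:]) / window
    return [mean] * horizon

def drift(history, horizon):
    """Extend the straight line between the first and last observation."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(horizon)]

y = [10, 12, 14, 13, 15, 17]
print(naive(y, 2))                            # [17, 17]
print(seasonal_naive(y, 3, season_length=2))  # [15, 17, 15]
print(drift(y, 2))
```

Baselines like these are the sanity floor: anything heavier has to beat them on held-out data before it earns a place on a card.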
The gallery builder is opinionated about clarity: clean stats, high-contrast cards, simple code blocks, and obvious links to the underlying artifacts.
1. Use a dataset id, case id, local CSV, directory, or URL.
2. Let backend auto-selection compare candidate methods or choose one explicitly.
3. Get charts, cards, markdown, JSON, and leaderboard artifacts.
4. Push the static gallery to GitHub Pages or consume the JSON from an agent.
Every card is built from the same product contract: selected backend, transparent metrics, copied artifacts, and a clean path into the case detail page.

- `airline_passengers`: `stats_ets` won the backend comparison and produced a publishable pack.
- `daily_min_temperatures`: `stream_ewm` won the backend comparison and produced a publishable pack.
- `monthly_car_sales`: `stats_ets` won the backend comparison and produced a publishable pack.
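The `stream_ewm`-style winner above is conceptually tiny: an exponentially weighted mean that folds in one observation at a time. A self-contained sketch (the class name and API here are illustrative, not agentforecast's):

```python
class EWMForecaster:
    """Exponentially weighted mean with incremental, one-pass updates."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha   # weight on the newest observation
        self.level = None    # running smoothed level

    def update(self, y):
        """Fold one new observation into the running level."""
        if self.level is None:
            self.level = float(y)
        else:
            self.level = self.alpha * y + (1 - self.alpha) * self.level
        return self

    def predict(self, horizon=1):
        """Flat forecast at the current level for every horizon step."""
        return [self.level] * horizon

model = EWMForecaster(alpha=0.5)
for y in [10, 12, 11, 13]:
    model.update(y)
print(model.predict(3))  # [12.0, 12.0, 12.0]
```

Because each update is O(1) and stateless beyond one float, this family suits rolling refreshes and drift-aware watch flows.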
Before anyone looks at external benchmarks, they should be able to scan the backend families, concrete model ids, install tiers, serious Python entry points, and capability matrix that agentforecast exposes.
This is the forecasting surface behind the gallery, grouped the way a human evaluator would usually reason about model choice.
- **Baselines:** fast reference models for last-value carry, recent averages, drift, and simple seasonal repeats. Backends: `naive`, `seasonal_naive`, `moving_average`, `drift`.
- **Classical statistics:** ARIMA and ETS style models for interpretable seasonality, smoother long-horizon structure, and low-data runs. Backends: `stats_arima`, `stats_ets`, `statsforecast_autoarima`, `statsforecast_autoets`.
- **Lag-feature ML:** lag-feature regression with Ridge, boosting, tree ensembles, and optional MLForecast-style adapters. Backends: `ml_ridge`, `ml_histgb`, `ml_xgboost`, `ml_lightgbm`, `ml_catboost`, `mlforecast_linear`, `mlforecast_xgboost`.
- **Streaming / online:** incremental forecasters and online regressors for rolling updates, drift monitoring, and low-latency refreshes. Backends: `stream_ewm`, `stream_sgd`, `river_linear`, `river_snarimax`, `river_holtwinters`.
- **Neural:** optional neural backends for longer-horizon experiments when the lean surface is not enough. Backend: `neural_nhits`.
- **AutoML:** optional automation paths for larger search spaces and managed forecasting workflows. Backend: `automl_autogluon`.
- **TabPFN:** optional TabPFN-style regressors for compact tabular forecasting experiments. Backend: `tabpfn_regression`.

Trust tiers and capability flags should be visible instead of hidden behind README prose or runtime surprises.
| Backend | Extra | Tier | Streaming | Exogenous | Conformal | Direct | Recursive |
|---|---|---|---|---|---|---|---|
| naive | base | stable | no | no | no | no | yes |
| seasonal_naive | base | stable | no | no | no | no | yes |
| moving_average | base | stable | no | no | no | no | yes |
| drift | base | stable | no | no | no | no | yes |
| stats_arima | stats | stable | no | no | no | no | yes |
| stats_ets | stats | stable | no | no | no | no | yes |
| statsforecast_autoarima | stats | experimental | no | no | no | no | yes |
| statsforecast_autoets | stats | experimental | no | no | no | no | yes |
| ml_ridge | ml | stable | no | yes | no | yes | yes |
| ml_histgb | ml | beta | no | yes | no | yes | yes |
| ml_xgboost | ml | beta | no | yes | no | yes | yes |
| ml_lightgbm | ml | beta | no | yes | no | yes | yes |
| ml_catboost | ml | beta | no | yes | no | yes | yes |
| mlforecast_linear | ml | beta | no | yes | no | no | yes |
| mlforecast_xgboost | ml | beta | no | yes | no | no | yes |
| stream_ewm | base | stable | yes | no | no | no | yes |
| stream_sgd | ml | experimental | yes | no | no | no | yes |
| river_linear | stream | beta | yes | yes | yes | yes | yes |
| river_snarimax | stream | beta | yes | yes | yes | no | yes |
| river_holtwinters | stream | beta | yes | yes | yes | no | yes |
| neural_nhits | deep | experimental | no | no | no | no | yes |
| automl_autogluon | automl | experimental | no | no | no | no | yes |
| tabpfn_regression | tabpfn | experimental | no | yes | no | yes | yes |
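The Direct and Recursive columns describe the two standard multi-step strategies: recursive fits one model and feeds each prediction back in as the next input, while direct fits a separate model per horizon. A self-contained sketch with a lag-1 least-squares model (an illustration of the two modes, not library code):

```python
def fit_lag1(series):
    """Ordinary least squares on (y[t-1] -> y[t]) pairs."""
    x, y = series[:-1], series[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - beta * mx, beta  # (alpha, beta)

def recursive_forecast(series, horizons):
    """One model; each prediction becomes the next step's input."""
    alpha, beta = fit_lag1(series)
    out, last = {}, series[-1]
    for h in range(1, max(horizons) + 1):
        last = alpha + beta * last
        if h in horizons:
            out[h] = last
    return out

def direct_forecast(series, horizons):
    """One model per horizon: regress y[t+h] directly on y[t]."""
    out = {}
    for h in horizons:
        x, y = series[:-h], series[h:]
        mx, my = sum(x) / len(x), sum(y) / len(y)
        beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
        out[h] = (my - beta * mx) + beta * series[-1]
    return out

y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(recursive_forecast(y, [1, 3]))  # {1: 7.0, 3: 9.0}
print(direct_forecast(y, [1, 3]))     # {1: 7.0, 3: 9.0}
```

On a perfectly linear series the two modes agree; on noisy data recursive compounds one-step error across horizons while direct spends more data fitting each horizon separately, which is why the capability matrix tracks them as distinct flags.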
These are the serious importable functions and objects a human reader can actually call, without having to start from a CLI command.
- `OnlineForecaster`: the stable low-level forecasting contract for fit / predict / update benchmark workflows, e.g. `OnlineForecaster(backend='river_linear', horizons=[1,3,6], strict_mode=True)`.
- `forecast_benchmark_dataframe`: one-shot low-level benchmark forecasting with `horizons=[...]` and recursive or direct mode, e.g. `forecast_benchmark_dataframe(df, backend='ml_ridge', horizons=[1,3,6], mode='direct')`.
- `diagnose_environment`: report which backends are importable, which extras are missing, and which workflows the environment supports, e.g. `diagnose_environment()`.
- `validate_series_frame` / `clean_series_frame`: separate strict validation from forgiving cleanup instead of hiding data mutation behind one helper, e.g. `validate_series_frame(df, strict_mode=True)` or `clean_series_frame(df)`.
- `forecast_dataframe` / `build_hosted_site`: the pack-oriented path for public forecast artifacts and static gallery publishing, e.g. `forecast_dataframe(df, name='series', horizon=12, backend='auto')`.

These are the workflow-level features that matter in practice beyond the estimator list itself.
- **Backend comparison:** compare candidate backends automatically and keep the leaderboard visible instead of hiding model choice.
- **Feature engineering:** expose lag points, spaced delays, rolling windows, calendar fields, and optional tsfresh descriptors.
- **Prediction intervals:** emit conformal-style `lower_*` / `upper_*` bands with coverage and width diagnostics.
- **Exogenous inputs:** carry external regressors through lag-feature backends and adapters that support exogenous inputs.
- **Streaming diagnostics:** publish rolling diagnostics and drift alert cards alongside streaming forecast packs.
- **Artifact export:** export plots, cards, CSV, markdown, and JSON for both human readers and agents.
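Conformal-style bands can be sketched with the basic split-conformal recipe: widen each point forecast by an empirical quantile of absolute errors from a held-out calibration slice. This illustrates the idea only; it is an assumption, not necessarily how agentforecast computes its `lower_*` / `upper_*` columns:

```python
import math

def conformal_band(calibration_errors, point_forecasts, alpha=0.1):
    """Split-conformal interval: widen each point forecast by the
    (1 - alpha) empirical quantile of absolute calibration errors."""
    scores = sorted(abs(e) for e in calibration_errors)
    n = len(scores)
    # Finite-sample-corrected quantile rank, clipped to the sample.
    k = min(n - 1, math.ceil((n + 1) * (1 - alpha)) - 1)
    q = scores[k]
    return [(f - q, f + q) for f in point_forecasts]

# Residuals from a calibration window, then bands around two forecasts.
errors = [0.5, -1.0, 0.2, 0.8, -0.4, 1.2, -0.7, 0.3, 0.9, -0.6]
print(conformal_band(errors, [10.0, 11.0], alpha=0.2))
```

Coverage and width diagnostics then reduce to counting how often actuals land inside the band and averaging `upper - lower`.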
Internal demo evidence stays visible, but broader public benchmark hubs are still the right place to compare against the field.
| Name | Kind | Link |
|---|---|---|
| Monash / ForecastingData repository | dataset hub | https://forecastingdata.org/ |
| Monash baseline evaluation results | baseline results | https://forecastingdata.org/RMSSE.html |
| M4 competition | competition | https://www.unic.ac.cy/iff/research/forecasting/m-competitions/m4/ |
| M5 competition | competition | https://www.unic.ac.cy/iff/research/forecasting/m-competitions/m5/ |
| OpenTS-Bench / OpenTS | leaderboard hub | https://decisionintelligence.github.io/OpenTS/ |
| TFB repository | benchmark codebase | https://github.com/decisionintelligence/TFB |
| GIFT-Eval repository | benchmark | https://github.com/SalesforceAIResearch/gift-eval |
| GIFT-Eval public leaderboard | leaderboard | https://huggingface.co/spaces/Salesforce/GIFT-Eval |
| ForecastBench baseline leaderboard | leaderboard | https://www.forecastbench.org/baseline/ |
| ForecastBench tournament leaderboard | leaderboard | https://www.forecastbench.org/tournament/ |