Bring your own data
If you already have a CSV, start here instead of the demo gallery.
tscfbench now has two direct Python entry points for real data:
- run_panel_data: one treated unit plus comparison units over time
- run_impact_data: one treated series plus one or more control series
CLI wrappers still exist as run-csv-panel and run-csv-impact, but the docs lead with Python because this page is for users, not agents.
1. Panel data: one treated unit plus donor pool
Use this when you have many units over time and exactly one treated unit.
Expected CSV shape:
city,date,traffic_index
Harbor City,2024-03-01,101.2
Harbor City,2024-03-02,100.7
North City,2024-03-01,98.4
North City,2024-03-02,98.9
...
Required columns:
- one unit column such as city or region
- one time column such as date or year
- one outcome column such as traffic_index or employment_index
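Before running anything, it can help to confirm that the table really has this long shape (one row per unit and time). The helper below is a hypothetical pre-flight check written with plain pandas; check_panel_frame is not part of tscfbench, and the specific checks (at least two units, no duplicate unit/time rows) are assumptions about what a sensible panel looks like rather than documented requirements.

```python
import pandas as pd


def check_panel_frame(df, unit_col, time_col, y_col, treated_unit):
    """Return a list of problems found in a long-format panel table.

    Hypothetical helper for illustration; not part of tscfbench.
    """
    missing = [c for c in (unit_col, time_col, y_col) if c not in df.columns]
    if missing:
        # Without the key columns, the remaining checks cannot run.
        return [f"missing columns: {missing}"]

    problems = []
    if treated_unit not in set(df[unit_col]):
        problems.append(
            f"treated unit {treated_unit!r} not found in column {unit_col!r}"
        )
    if df[unit_col].nunique() < 2:
        problems.append("no comparison units besides the treated unit")
    if df.duplicated(subset=[unit_col, time_col]).any():
        problems.append("duplicate (unit, time) rows")
    return problems


# Toy rows copied from the CSV shape above.
panel = pd.DataFrame(
    {
        "city": ["Harbor City", "Harbor City", "North City", "North City"],
        "date": ["2024-03-01", "2024-03-02", "2024-03-01", "2024-03-02"],
        "traffic_index": [101.2, 100.7, 98.4, 98.9],
    }
)
panel_problems = check_panel_frame(
    panel, "city", "date", "traffic_index", "Harbor City"
)
print(panel_problems)  # an empty list means none of the checks failed
```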
Run it in Python:
import pandas as pd
from tscfbench import run_panel_data

df = pd.read_csv("my_panel.csv")
result = run_panel_data(
    df,
    unit_col="city",
    time_col="date",
    y_col="traffic_index",
    treated_unit="Harbor City",
    intervention_t="2024-03-06",
    output_dir="my_panel_run",
)
result["summary"]
CLI equivalent:
python -m tscfbench run-csv-panel my_panel.csv --unit-col city --time-col date --y-col traffic_index --treated-unit "Harbor City" --intervention-t 2024-03-06 --output my_panel_run
That writes:
- panel_prediction_frame.csv
- panel_metrics.json
- panel_report.md
- treated-vs-counterfactual charts
- point-effect and cumulative-impact charts
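After a run, a quick way to confirm the named files landed is to look for them in the output directory. This is a minimal sketch using only the standard library; missing_outputs is a hypothetical helper, and it assumes the three files sit directly inside the output directory, which this page does not state explicitly. Chart file names are not listed here, so they are not checked.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_PANEL_FILES = (
    "panel_prediction_frame.csv",
    "panel_metrics.json",
    "panel_report.md",
)


def missing_outputs(output_dir, expected=EXPECTED_PANEL_FILES):
    """Return the expected file names that are absent from output_dir.

    Hypothetical helper; assumes files are written at the top level.
    """
    root = Path(output_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Prints an empty list when all three files are present.
    print(missing_outputs("my_panel_run"))
```

The same function can be pointed at an impact run by passing the impact file names as the expected argument.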
2. Impact data: one treated series plus controls
Use this when you have one main outcome series and one or more control series in the same table.
Expected CSV shape:
date,signups,peer_signups,search_interest
2024-04-01,120,116,54
2024-04-02,123,117,53
2024-04-03,121,115,55
...
Required columns:
- one time column
- one outcome column
- one or more control columns
Run it in Python:
import pandas as pd
from tscfbench import run_impact_data

df = pd.read_csv("my_impact.csv")
result = run_impact_data(
    df,
    time_col="date",
    y_col="signups",
    x_cols=["peer_signups", "search_interest"],
    intervention_t="2024-04-23",
    output_dir="my_impact_run",
)
result["summary"]
CLI equivalent:
python -m tscfbench run-csv-impact my_impact.csv --time-col date --y-col signups --x-cols peer_signups search_interest --intervention-t 2024-04-23 --output my_impact_run
That writes:
- impact_prediction_frame.csv
- impact_metrics.json
- impact_report.md
- treated-vs-counterfactual charts
- point-effect and cumulative-impact charts
3. How to choose between the two
- Use run_panel_data when your data is naturally unit x time
- Use run_impact_data when your data is one treated series with control columns already aligned by time
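The two shapes hold the same kind of information in different layouts. As an illustration of the difference, the sketch below uses pandas to pivot a small unit x time table into one row per date with one column per unit, which is the layout described for impact data. The column names come from the panel example above; whether treating the other units as control columns suits your analysis is a judgment call this page does not make for you.

```python
import pandas as pd

# Long unit x time table, using the column names from the panel example.
panel = pd.DataFrame(
    {
        "city": ["Harbor City", "Harbor City", "North City", "North City"],
        "date": ["2024-03-01", "2024-03-02", "2024-03-01", "2024-03-02"],
        "traffic_index": [101.2, 100.7, 98.4, 98.9],
    }
)

# One row per date, one column per city.
wide = panel.pivot(
    index="date", columns="city", values="traffic_index"
).reset_index()
wide.columns.name = None  # drop the leftover "city" label on the column axis

treated = "Harbor City"
# Every remaining unit column becomes a candidate control column.
control_cols = [c for c in wide.columns if c not in ("date", treated)]

print(wide)
print(control_cols)
```

In this layout the treated unit's column would play the role of the outcome column and control_cols the role of the control columns.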
4. Time column and intervention format
- date columns can be normal date strings such as 2024-07-14
- integer-like time columns such as year also work
- intervention_t should match one value in your time column
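A mismatch between intervention_t and the time column is easy to catch before a run. The sketch below is a hypothetical helper (check_intervention is not part of tscfbench) that compares values as strings and reports how many distinct time points fall before the intervention and how many fall at or after it. It assumes date strings are in YYYY-MM-DD form, so that sorting them as text is also chronological; how tscfbench itself treats the intervention period is not stated on this page.

```python
import pandas as pd


def check_intervention(df, time_col, intervention_t):
    """Return (periods before, periods at or after) the intervention.

    Hypothetical helper; raises ValueError when intervention_t does not
    match any value in the time column.
    """
    # Distinct time points in sorted order, indexed 0..n-1.
    times = df[time_col].drop_duplicates().sort_values().reset_index(drop=True)
    hits = times[times.astype(str) == str(intervention_t)]
    if hits.empty:
        raise ValueError(
            f"{intervention_t!r} does not appear in column {time_col!r}"
        )
    n_before = int(hits.index[0])  # position equals the count of earlier periods
    return n_before, len(times) - n_before


# Date strings, as in the impact example.
impact = pd.DataFrame(
    {
        "date": ["2024-04-01", "2024-04-02", "2024-04-03"],
        "signups": [120, 123, 121],
    }
)
n_before, n_from = check_intervention(impact, "date", "2024-04-02")
print(n_before, n_from)

# Integer-like time column.
years = pd.DataFrame({"year": [2003, 2004, 2005, 2006]})
print(check_intervention(years, "year", 2005))
```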
5. If you still prefer CLI
Use run-csv-panel or run-csv-impact when you want copy-paste terminal commands, CI jobs, or shell scripts. They are thin wrappers around the same workflow.
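For CI jobs driven from Python, the command can be assembled as an argument list instead of a hand-quoted string. The sketch below uses only the flags shown in the run-csv-panel example on this page; panel_cli_args is a hypothetical helper, and the block only builds and prints the command rather than running it.

```python
import shlex
import sys


def panel_cli_args(csv_path, unit_col, time_col, y_col, treated_unit,
                   intervention_t, output):
    """Build the argument list for the run-csv-panel command shown above.

    Hypothetical helper; the flags are copied from the CLI example.
    """
    return [
        sys.executable, "-m", "tscfbench", "run-csv-panel", str(csv_path),
        "--unit-col", unit_col,
        "--time-col", time_col,
        "--y-col", y_col,
        "--treated-unit", treated_unit,
        "--intervention-t", str(intervention_t),
        "--output", str(output),
    ]


args = panel_cli_args(
    "my_panel.csv", "city", "date", "traffic_index",
    "Harbor City", "2024-03-06", "my_panel_run",
)

# shlex.join adds shell quoting where needed, e.g. around "Harbor City".
command_line = shlex.join(["python", *args[1:]])
print(command_line)

# To execute it (requires tscfbench to be installed):
#   import subprocess
#   subprocess.run(args, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems with values that contain spaces, such as the treated unit name.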