Reproduce FTR’s Shippers Conditions Index: Methodology, Data Sources, and Code

2026-03-02

Open, reproducible method to build a Shippers Conditions Index proxy from public freight rates, capacity, and diesel data — with code.

Why reproducibility of the Shippers Conditions Index (SCI) matters for engineers and analysts in 2026

Data teams, analysts, and platform owners increasingly need fast, citable metrics to evaluate supply-chain risk and model cost scenarios. Yet FTR’s Shippers Conditions Index (SCI) — a compact signal that FTR uses to summarize market conditions for shippers — is a proprietary product. For engineering and analytics teams building dashboards, procurement models, or automated alerts, a transparent, reproducible SCI proxy built from public signals solves a key pain point: reliable, explainable inputs you can audit and version-control.

Executive summary — what I reproduce and why it’s useful

This article documents a fully reproducible approach to building an SCI-like composite index from public indicators of freight rates, trucking capacity (driver and tonnage proxies), and diesel prices. I explain the choice of public data sources, provide a step-by-step methodology, and include runnable Python snippets you can copy into a Jupyter notebook. The result is not FTR’s canonical SCI, but a transparent, defensible proxy you can host, test, and iterate on in 2026.

Why 2026 makes this replication timely

FTR and others signaled a shift in late 2025 and early 2026: freight markets have shown tightening capacity and episodic rate strength even as diesel prices fell to multi-year lows. For teams modeling budgets, risk, or SLAs in 2026, that combination — higher rates + tighter capacity + lower diesel — creates ambiguous signals. A reproducible index lets you quantify the net effect and run scenario analysis for procurement or pricing systems.

"We have been forecasting a freight market shift in 2026 that would be mildly unfavorable for shippers... Van spot rates in trucking were notably stronger than seasonal expectations in December." — FTR (January 2026)

Design principles for a reproducible SCI proxy

When you replicate a composite index, follow explicit rules so others can audit and adapt your work. I recommend these principles:

  • Public, discoverable data: Prefer government and market APIs so readers can fetch identical series programmatically.
  • Transparent transforms: Use documented transforms (log, pct-change, rolling z-score) with fixed parameters.
  • Reproducible code: Provide a single Jupyter notebook with environment requirements (requirements.txt or environment.yml).
  • Baseline and normalization: Normalize each indicator to a fixed historical baseline (e.g., 2015–2019) so that pandemic-era distortions do not set the reference level.
  • Sensitivity tests: Report how index changes with alternate weights and baseline windows.

Below are practical, widely-accessible indicators you can use as proxies that map to FTR’s inputs (rates, capacity, diesel):

  • Freight rates (spot and contract proxies)
    • DAT Trendlines (DAT) or Truckstop: best if you have an account and API. They provide van, reefer, and flatbed spot rates.
    • Public proxy: BLS Producer Price Index (PPI) for Truck Transportation and CASS Freight Shipments/Expenditures when available in published reports. These are price-level proxies when spot-rate APIs are not accessible.
    • Container freight: Freightos Baltic Index (FBX) is public for ocean container rates.
  • Capacity (tightness) proxies
    • BLS employment for Truck Transportation (CES) as a driver-supply proxy (monthly).
    • American Trucking Associations (ATA) For-Hire Truck Tonnage Index — widely quoted and available via monthly press releases. Use as demand-weighted tonnage proxy.
    • Federal Highway Administration (FHWA) vehicle counts or BTS vehicle-miles data for heavy trucks as geographic capacity proxies.
  • Fuel / diesel prices
    • U.S. EIA Weekly Diesel Retail Price (API available at eia.gov). This is the canonical public fuel series.

Index construction: step-by-step methodology

This section lays out concrete steps. I provide both the conceptual transform and code-ready pseudo-implementation.

Step 1 — Define the baseline and frequency

Choose a frequency that matches your strongest data source (monthly or weekly). I recommend monthly for stability and because many public series (BLS, ATA) are monthly. Choose a historic baseline to compute z-scores; a robust choice in 2026 is January 2015–December 2019 (pre-pandemic normal). The baseline removes extreme pandemic-era volatility from the normalization step.

Step 2 — Pull and align series

Fetch each series and align to the chosen frequency. Example pipeline:

  1. DAT/Truckstop: weekly van spot rate series; resample to month using month-end or monthly average.
  2. BLS CES: trucking employment — monthly level series.
  3. ATA tonnage: monthly index — align by month.
  4. EIA retail diesel: weekly price — resample to monthly average.
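The alignment step can be sketched as follows. This is a minimal illustration on synthetic numbers (not real market data); the helper name `to_monthly` and the sample values are assumptions made for the example. Keying every series by calendar month avoids a common pitfall: month-start dates (typical of BLS exports) and month-end or weekly dates do not match row for row if you join on raw timestamps.

```python
import numpy as np
import pandas as pd


def to_monthly(series: pd.Series, how: str = "mean") -> pd.Series:
    """Collapse a dated series to one value per calendar month (PeriodIndex)."""
    grouped = series.groupby(series.index.to_period("M"))
    return grouped.mean() if how == "mean" else grouped.last()


# Synthetic stand-ins: 12 weekly diesel prices and 3 month-start employment levels
weekly_idx = pd.date_range("2019-01-07", periods=12, freq="W-MON")
weekly_diesel = pd.Series(np.linspace(3.00, 3.11, 12), index=weekly_idx)

monthly_idx = pd.date_range("2019-01-01", periods=3, freq="MS")
monthly_emp = pd.Series([1500.0, 1502.0, 1505.0], index=monthly_idx)

# Both columns now share the same monthly index, so no NaN rows appear
aligned = pd.DataFrame({
    "diesel": to_monthly(weekly_diesel),            # weekly -> monthly average
    "emp": to_monthly(monthly_emp, how="last"),     # already monthly
})
print(aligned)
```

Grouping by `to_period("M")` is used here instead of `resample` so the sketch does not depend on the month-end offset alias, which changed name across pandas releases.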

Step 3 — Transform each indicator into a shipper-favorability direction

FTR’s SCI is negative when conditions are worse for shippers (higher rates, tighter capacity, higher fuel). To produce the same sign orientation, orient each normalized series (the z-scores from Step 4) so that higher values are more favorable for shippers:

  • Rates: invert the normalized rate series (because higher rates are worse). For example, take negative of z-score of rates.
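  • Capacity: more available capacity (e.g., higher trucking employment) is favorable, so use the z-score directly. Tonnage measures demand rather than supply, so if you include it, invert it or express it relative to employment; rising tonnage against flat capacity tightens the market.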
  • Diesel: higher diesel is worse — invert diesel z-score.
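One way to keep these sign conventions explicit and auditable is a small lookup table. The sketch below is illustrative only; the names `ORIENTATION` and `orient` are hypothetical, and the z-score values are made up.

```python
# +1 keeps the sign, -1 flips it, so that higher always means
# "more favorable for shippers" after orientation.
ORIENTATION = {"rates": -1, "capacity": +1, "diesel": -1}


def orient(z_scores: dict) -> dict:
    """Apply the sign convention to a mapping of component name -> z-score."""
    unknown = set(z_scores) - set(ORIENTATION)
    if unknown:
        raise KeyError(f"no orientation defined for: {sorted(unknown)}")
    return {name: ORIENTATION[name] * z for name, z in z_scores.items()}


# Made-up z-scores for a single month
oriented = orient({"rates": 1.5, "capacity": -0.5, "diesel": 2.0})
print(oriented)
```

Because the multiplication works elementwise, the same function also accepts pandas Series as values.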

Step 4 — Normalize (baseline z-scores)

Compute z-scores using the baseline mean and standard deviation. Optionally use a robust z-score (median and MAD) or a rolling window (24 months) to adapt to structural shifts. For reproducibility, record the exact baseline window and method.
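The three normalization options mentioned above could look like this. It is a sketch on a synthetic linear series; the function names are assumptions for illustration. The constant 1.4826 rescales the MAD so it is comparable to a standard deviation when the data are roughly normal.

```python
import numpy as np
import pandas as pd


def baseline_z(series: pd.Series, start: str, end: str) -> pd.Series:
    """Z-score against the mean and std of a fixed baseline window."""
    base = series.loc[start:end]
    return (series - base.mean()) / base.std()


def robust_z(series: pd.Series, start: str, end: str) -> pd.Series:
    """Z-score against the baseline median and scaled MAD."""
    base = series.loc[start:end]
    median = base.median()
    mad = (base - median).abs().median()
    return (series - median) / (1.4826 * mad)


def rolling_z(series: pd.Series, window: int = 24) -> pd.Series:
    """Z-score against a trailing window that includes the current month."""
    roll = series.rolling(window, min_periods=window)
    return (series - roll.mean()) / roll.std()


# Synthetic monthly series, Jan 2015 to Dec 2020 (not real data)
s = pd.Series(np.arange(72, dtype=float),
              index=pd.date_range("2015-01-01", periods=72, freq="MS"))

z_base = baseline_z(s, "2015-01", "2019-12")
z_robust = robust_z(s, "2015-01", "2019-12")
z_roll = rolling_z(s, window=24)
```

The rolling variant returns NaN until a full window is available, which is worth recording alongside the window length.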

Step 5 — Combine with weights and compute SCI proxy

Combine the transformed z-scores into a single index. Without FTR’s published weights, a defensible starting point is equal weights. You should also run sensitivity analysis with alternative weights (e.g., 40% rates, 40% capacity, 20% diesel).

Formula (monthly):

SCI_proxy = w1 * z_capacity + w2 * (-z_rates) + w3 * (-z_diesel)

where w1+w2+w3 = 1. Negative values indicate conditions worse for shippers.
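As a quick check of the formula, here is a minimal sketch with a guard on the weight constraint. The function name `sci_proxy` and the input values are made up for illustration.

```python
import math


def sci_proxy(z_capacity, z_rates, z_diesel, weights=(1 / 3, 1 / 3, 1 / 3)):
    """SCI_proxy = w1 * z_capacity + w2 * (-z_rates) + w3 * (-z_diesel)."""
    w1, w2, w3 = weights
    if not math.isclose(w1 + w2 + w3, 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1")
    return w1 * z_capacity + w2 * (-z_rates) + w3 * (-z_diesel)


# Made-up month: capacity slightly tight, rates elevated, diesel cheap
example = sci_proxy(z_capacity=-0.5, z_rates=1.0, z_diesel=-1.0,
                    weights=(0.4, 0.4, 0.2))
print(round(example, 3))  # -0.4
```

Here tight capacity and high rates outweigh the relief from cheap diesel, so the proxy is negative (worse for shippers).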

Runnable Python snippets (copy into a Jupyter notebook)

Below are compact code blocks you can paste into a notebook. Replace API calls or CSV load steps with your data sources.

# notebook requirements: pandas, numpy, requests, matplotlib, scipy
# (this snippet itself only needs pandas)
import pandas as pd

# Example: load CSV exports (replace with API calls where available)
rates = pd.read_csv('dat_van_spot_monthly.csv', parse_dates=['date'], index_col='date')
trucking_emp = pd.read_csv('bls_trucking_employment.csv', parse_dates=['date'], index_col='date')
diesel = pd.read_csv('eia_weekly_diesel.csv', parse_dates=['date'], index_col='date')

# Key every series by calendar month so that month-start (BLS), month-end,
# and weekly (EIA) dates land on the same index instead of producing NaN rows
def by_month(series):
    return series.groupby(series.index.to_period('M'))

# Align series into a single dataframe; drop months missing any component
df = pd.DataFrame({
    'rates': by_month(rates['van_spot_rate']).mean(),
    'emp': by_month(trucking_emp['employment_count']).last(),
    'diesel': by_month(diesel['price']).mean(),  # weekly -> monthly mean
}).dropna()

# Baseline: Jan 2015 - Dec 2019
baseline = df.loc['2015-01':'2019-12']

# Z-score function using baseline mean and std
def zscore_baseline(series, baseline_series):
    mu = baseline_series.mean()
    sigma = baseline_series.std()
    return (series - mu) / sigma

z_rates = zscore_baseline(df['rates'], baseline['rates'])
z_emp = zscore_baseline(df['emp'], baseline['emp'])
z_diesel = zscore_baseline(df['diesel'], baseline['diesel'])

# Convert direction so higher = better for shippers
z_rates_inv = -z_rates
z_diesel_inv = -z_diesel

# Combine (equal weights; try 0.4 / 0.4 / 0.2 as a sensitivity check)
weights = {'emp': 1 / 3, 'rates': 1 / 3, 'diesel': 1 / 3}
df['sci_proxy'] = (weights['emp'] * z_emp +
                   weights['rates'] * z_rates_inv +
                   weights['diesel'] * z_diesel_inv)

# Standardize the composite for easy interpretation. This uses the full-sample
# mean and std, so 0 is the sample average and past values shift as data is added.
df['sci_proxy_std'] = (df['sci_proxy'] - df['sci_proxy'].mean()) / df['sci_proxy'].std()

print(df[['sci_proxy_std']].tail())

Notes on data acquisition and APIs

  • EIA (diesel): Use the EIA API (register for a free key at eia.gov). Query weekly retail diesel and resample to monthly. The EIA provides consistent national and regional diesel time series through 2026.
  • BLS (employment): Use the BLS public API or download CSVs for the CES series covering Truck Transportation. BLS publishes monthly and revises recent months, so record the retrieval date with each pull for reproducibility.
  • DAT / Truckstop (rates): If your organization has access, fetch spot rate series programmatically. If you lack a subscription, use PPI or Cass excerpts as proxies.
  • ATA (tonnage): Download monthly press release tables and scrape or manually import the tonnage index for capacity/demand signaling.
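For the BLS step, a parser along these lines may help. It assumes the JSON layout returned by the BLS Public Data API (a `status` field and `Results.series[].data[]` rows with `year`, `period`, and `value`); verify this against the current BLS documentation before relying on it. The series ID is a placeholder and the sample values are made up so the sketch runs offline.

```python
# Offline sample shaped like a BLS timeseries response (values are made up;
# "SERIES_ID_PLACEHOLDER" stands in for the real CES series ID you look up).
SAMPLE_RESPONSE = {
    "status": "REQUEST_SUCCEEDED",
    "Results": {
        "series": [{
            "seriesID": "SERIES_ID_PLACEHOLDER",
            "data": [
                {"year": "2019", "period": "M12", "value": "1530.1"},
                {"year": "2019", "period": "M11", "value": "1529.4"},
                {"year": "2019", "period": "M13", "value": "1520.0"},
            ],
        }]
    },
}


def parse_bls_monthly(payload: dict) -> dict:
    """Return {'YYYY-MM': value} in chronological order, skipping non-monthly rows."""
    if payload.get("status") != "REQUEST_SUCCEEDED":
        raise ValueError(f"BLS request failed: {payload.get('status')}")
    out = {}
    for row in payload["Results"]["series"][0]["data"]:
        period = row["period"]
        # Keep M01..M12 only; M13 denotes an annual average in some BLS series
        if not period.startswith("M") or period == "M13":
            continue
        out[f"{row['year']}-{period[1:]}"] = float(row["value"])
    return dict(sorted(out.items()))


employment = parse_bls_monthly(SAMPLE_RESPONSE)
print(employment)
```

Storing the raw payload next to the parsed values keeps later BLS revisions traceable.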

Validation and sanity checks

After building the index, validate it against known market statements and external reports for late 2025 and early 2026:

  • Check that December 2025 shows a decline in shipper favorability if your rate input shows spot-rate strength and capacity inputs show tightening.
  • Confirm diesel's effect in early 2026 by comparing the composite with and without diesel in the mix; diesel near four-year lows should raise the composite slightly (make conditions more favorable), all else equal.
  • Run correlation checks vs public FTR commentary dates: your proxy should move in the same direction as FTR’s SCI around major freight shocks.
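The with-and-without-diesel check can be sketched as below. The oriented z-scores are illustrative numbers, not observed values, and the helper `composite` is a hypothetical name; it renormalizes whatever weights it is given so that dropping a component still yields weights summing to 1.

```python
import pandas as pd


def composite(oriented: pd.DataFrame, weights: dict) -> pd.Series:
    """Weighted sum of already-oriented z-scores, with weights renormalized."""
    w = pd.Series(weights, dtype=float)
    w = w / w.sum()
    return (oriented[w.index] * w).sum(axis=1)


# Illustrative oriented z-scores (higher = better for shippers), two months
oriented = pd.DataFrame(
    {"capacity": [-0.8, -1.0], "rates": [-1.2, -0.9], "diesel": [0.3, 1.1]},
    index=pd.PeriodIndex(["2025-12", "2026-01"], freq="M"),
)

with_diesel = composite(oriented, {"capacity": 1, "rates": 1, "diesel": 1})
without_diesel = composite(oriented, {"capacity": 1, "rates": 1})

# Positive values mean diesel is pulling the composite up (more favorable)
diesel_effect = with_diesel - without_diesel
print(diesel_effect)
```

If cheap diesel is present in your inputs, `diesel_effect` should be positive for those months; a negative value suggests a sign or alignment error upstream.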

Sensitivity analysis: weights, baseline, and frequency

To understand robustness:

  • Recompute with alternate weights, such as (0.5 rates, 0.3 capacity, 0.2 diesel) versus equal weights (1/3 each), to see the impact on trend and volatility.
  • Switch baseline windows (2010–2019, 2017–2021) and report changes in standardized scores; this exposes structural shifts post-2019.
  • Rebuild with weekly frequency if you have high-frequency spot-rate and diesel data; this increases noise but can detect short-lived shocks.
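A weight-sensitivity table can be produced with a few lines. The sketch below uses random numbers as stand-ins for oriented z-scores, so only the mechanics (not the values) carry over; the scenario names are arbitrary labels.

```python
import numpy as np
import pandas as pd

# Random stand-ins for oriented z-scores (seeded so the run is repeatable)
rng = np.random.default_rng(0)
idx = pd.period_range("2015-01", periods=60, freq="M")
oriented = pd.DataFrame(rng.standard_normal((60, 3)), index=idx,
                        columns=["rates", "capacity", "diesel"])

WEIGHT_SETS = {
    "equal": {"rates": 1 / 3, "capacity": 1 / 3, "diesel": 1 / 3},
    "rates_heavy": {"rates": 0.5, "capacity": 0.3, "diesel": 0.2},
    "40_40_20": {"rates": 0.4, "capacity": 0.4, "diesel": 0.2},
}

# One composite column per weighting scheme
variants = pd.DataFrame({
    name: (oriented * pd.Series(w)).sum(axis=1)
    for name, w in WEIGHT_SETS.items()
})

# Volatility of each variant and its correlation with the equal-weight version
summary = pd.DataFrame({
    "std": variants.std(),
    "corr_with_equal": variants.corrwith(variants["equal"]),
})
print(summary.round(3))
```

The same loop extends to alternate baseline windows by recomputing the z-scores inside it.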

Advanced strategies and improvements for production systems

For engineering teams deploying this in production or including it in ML pipelines, consider these enhancements:

  • Automated data validation: Implement schema and range checks for each series (e.g., diesel price must be > 0 and < 10 USD/gal).
  • Data-driven weighting: Use regression to derive weights that best explain an outcome variable, such as freight spend or delay incidence, or use PCA to weight components by their loading on the dominant common factor.
  • Version control and provenance: Store raw API responses in an immutable data lake and record the exact baseline and code version used to compute each index snapshot.
  • Explainability: Keep per-component z-scores and contribution breakdowns in your time-series output so downstream users can see which driver (rates, capacity, diesel) is dominating change.
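The range check from the first bullet might be implemented as follows. The rule table and function name are hypothetical; the diesel bounds are the ones suggested above.

```python
# Exclusive (low, high) bounds per series; the diesel bounds come from the text
RANGE_RULES = {"diesel_usd_per_gal": (0.0, 10.0)}


def out_of_range(name: str, values: list) -> list:
    """Return (position, value) pairs that are missing or outside the bounds."""
    low, high = RANGE_RULES[name]
    return [
        (i, v) for i, v in enumerate(values)
        if v is None or not (low < v < high)  # NaN also fails this comparison
    ]


# Made-up feed with a zero, an implausible spike, and a missing value
problems = out_of_range("diesel_usd_per_gal", [3.5, 0.0, 12.0, None, 4.1])
print(problems)  # [(1, 0.0), (2, 12.0), (3, None)]
```

Failing the ETL run when `problems` is non-empty is stricter than silently dropping rows, and it keeps the index from being computed on bad inputs.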

Reproducible notebook and code packaging

To make this shareable inside your team, package as:

  1. A Jupyter notebook with these code blocks (and parameter cells at the top for API keys, baseline, and weights).
  2. requirements.txt listing pinned package versions (e.g., pandas==2.1.4, numpy==1.26.0, plus pinned versions of requests, matplotlib, and scipy).
  3. A small data folder with example CSV exports and a script fetch_data.py that either downloads public series or points to where secure credentials are needed.
  4. A README that documents the baseline window, transform choices, and how to run the notebook to reproduce any figure in the notebook.

Practical takeaways for teams (actionable)

  • Implement this proxy if you need an auditable shippers-condition signal and do not have access to FTR subscriptions.
  • Use equal weights to start; document that your organization tested 3 alternate weightings and baseline windows — keep results in the notebook for governance.
  • Automate weekly ETL, but compute monthly SCI snapshots for reports if you rely on BLS/ATA monthly series.
  • For procurement: add an alert when sci_proxy_std drops below -1.0. By construction this is one standard deviation below the sample mean, a reasonable marker for a materially worse-than-normal environment for shippers; backtest the threshold on your own history before relying on it.
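A simple way to implement the alert is to fire only when the standardized index crosses the threshold, rather than on every month spent below it. This is a sketch with made-up values; `crossings_below` is a hypothetical helper.

```python
def crossings_below(values, threshold=-1.0):
    """Return positions where the series moves from >= threshold to < threshold."""
    hits = []
    previous = None
    for i, value in enumerate(values):
        if value < threshold and (previous is None or previous >= threshold):
            hits.append(i)
        previous = value
    return hits


# Made-up standardized index values for six consecutive months
sci_std = [0.2, -0.6, -1.1, -1.4, -0.9, -1.2]
alerts = crossings_below(sci_std)
print(alerts)  # [2, 5]
```

Alerting on crossings avoids repeat notifications during a sustained downturn; report the level separately if you also need to know how long conditions stayed below the threshold.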

Limitations and responsible use

Important clarifications:

  • This approach constructs a proxy for FTR’s SCI. FTR’s internal inputs, transforms, and weights are proprietary and may differ.
  • Proxy accuracy depends on data quality and the availability of high-frequency spot-rate data. When spot-rate APIs are inaccessible, PPI and Cass proxies increase measurement error.
  • Indices are sensitive to baseline choice. For governance, explicitly record baseline windows and rationale.

As of early 2026, market observations to embed in your modeling and commentary:

  • Spot van rates showed seasonal strength in late 2025; use a high-frequency rate input if you want to capture short-lived tightening.
  • Diesel prices dropped to near four-year lows in early 2026; include diesel to reflect fuel-cost-driven cost relief for shippers.
  • Preliminary employment and tonnage indicators suggest trucking capacity may be lower than official lagged numbers imply; consider leading indicators (e.g., FMCSA carrier authority counts, job postings) if you need an earlier signal.

How to audit and cite your SCI proxy in reports

For public-facing reports or internal governance, include these metadata items with every chart and dashboard:

  • Exact series names and source URLs (or API endpoints) and the API call timestamp.
  • Baseline dates and normalization method (z-score vs. robust z-score).
  • Weights and sensitivity-test results.
  • Notebook commit hash or link to the releases page with the code version that produced the values.
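These metadata items can travel with each published snapshot as a small JSON record. The field names and placeholder values below are assumptions made for the sketch, not a standard schema.

```python
import json
from datetime import datetime, timezone


def build_metadata(series, baseline, normalization, weights, code_version):
    """Assemble the audit record that accompanies one index snapshot."""
    return {
        "series": series,
        "retrieved_at_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "baseline": {"start": baseline[0], "end": baseline[1]},
        "normalization": normalization,
        "weights": weights,
        "code_version": code_version,
    }


metadata = build_metadata(
    series=[
        {"name": "diesel", "source": "<EIA endpoint you queried>"},
        {"name": "employment", "source": "<BLS series you queried>"},
        {"name": "rates", "source": "<rate provider export>"},
    ],
    baseline=("2015-01", "2019-12"),
    normalization="z-score",
    weights={"capacity": 1 / 3, "rates": 1 / 3, "diesel": 1 / 3},
    code_version="<notebook commit hash>",
)

metadata_json = json.dumps(metadata, indent=2, sort_keys=True)
print(metadata_json)
```

Writing this file next to the CSV or Parquet output lets a reviewer match any chart to the exact inputs and parameters behind it.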

Final checklist before production deployment

  1. Confirm automated ETL runs without stale credentials.
  2. Run full-sample sanity checks and inspect extreme values for data errors.
  3. Document governance: who reviews baseline changes, and what triggers a rebaseline (e.g., structural breaks).
  4. Expose per-component contributions on dashboards so analysts can quickly see if a move is driven by rates, capacity, or fuel.
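For item 4, per-component contributions fall out of the weighted sum directly. The following sketch uses made-up oriented z-scores and generic month labels to show how a dashboard could flag the dominant driver of the latest move.

```python
import pandas as pd

weights = pd.Series({"rates": 1 / 3, "capacity": 1 / 3, "diesel": 1 / 3})

# Made-up oriented z-scores (higher = better for shippers) for two months
oriented = pd.DataFrame(
    {"rates": [-0.3, -1.5], "capacity": [-0.2, -0.4], "diesel": [0.1, 0.3]},
    index=["month_1", "month_2"],
)

# Each column is that component's share of the composite; rows sum to the index
contributions = oriented * weights
sci = contributions.sum(axis=1)

# Month-over-month change per component, and the largest mover in absolute terms
latest_change = contributions.diff().iloc[-1]
dominant_driver = latest_change.abs().idxmax()
print(latest_change.round(3).to_dict(), dominant_driver)
```

In this example the drop in the index is driven almost entirely by rates, which is the kind of attribution the dashboard should surface.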

Conclusion and call-to-action

Reproducing an SCI-like index with public inputs gives engineering and analytics teams an auditable, adaptable signal for 2026’s uncertain freight landscape. The approach above balances realism (use of DAT/Truckstop when available) with practicality (EIA and BLS public series), provides a clearly documented normalization and composition strategy, and supplies runnable code snippets you can drop into a notebook.

Next steps: Clone or create a notebook using the code snippets in this piece, run the index on your historical dataset, and publish a versioned dataset with per-component contributions. If you want a starting point, copy the code to a new Jupyter notebook in your org, add your API keys, and run the ETL for 2015–2026.

Want a ready-made GitHub template and CI script for automatic recomputation and artifact publishing (CSV/Parquet + JSON metadata)? We’re preparing a fully documented repo and a CI/CD workflow for reproducible shipping-condition indexes; subscribe or reach out via the contact below for early access.

Contact / Feedback

If you reproduce this index and publish results, include the notebook link and metadata. Send feedback or requests for alternative proxies (container rates, cross-border freight) to the editorial data desk at statistics.news.
