Streaming Engagement Dataset: Peak Viewers, MAUs, and Session Lengths for JioStar

2026-03-11

Download a vetted JioHotstar engagement dataset template (99M peak viewers, 450M MAUs) for benchmarking and retention modeling.

Stop hunting for unreliable streaming figures: get a ready-to-use JioHotstar engagement dataset for benchmarking and retention modeling

Finding trustworthy, analysis-ready streaming telemetry is time-consuming: you spend hours verifying figures, cleaning session data, and guessing sessionization rules before you can run retention models. This article delivers a documented, cleaned dataset template built around the widely reported figures — 99 million peak viewers for a marquee cricket final and ~450 million MAUs — and shows how to transform it into actionable benchmarks and retention models in 2026.

Executive summary: what you get and why it matters

This guide provides:

A documented, synthetic (template) engagement dataset aligned with public reporting on JioStar / JioHotstar (99M peak viewers; ~450M MAUs)
Clear data cleaning and sessionization rules to turn raw events into sessions
Pre-built analysis recipes: DAU/MAU, cohort retention, survival analysis, and churn estimates
Visualization templates (Plotly/Plotly Express) for interactive charts you can embed in dashboards
Downloadable CSV/JSON templates you can copy and paste into your environment

All examples reference trends from late 2025 to early 2026: sports-driven spikes, AI personalization improving session engagement, and stricter privacy/consent practices that affect telemetry capture.

Context: reported numbers and source

In January 2026, trade press reported that the newly consolidated JioStar (the merger of Viacom18 and Star India under Reliance and Disney holdings) posted strong quarterly revenue and engagement numbers, including a record 99M digital viewers for the ICC Women’s Cricket World Cup final and a platform average of roughly 450M MAUs during a high-engagement quarter.

Variety, Jan 16, 2026: “JioHotstar reports 99 million digital viewers for historic cricket match as platform averages 450 million monthly users.”

Important: the dataset template below is a constructed, cleaned dataset intended for benchmarking and modeling. It is not operational telemetry from JioHotstar. Use it to prototype analyses, calibrate retention models, and validate pipelines.

Design goals and assumptions

Goal: Provide a concise, reproducible dataset to benchmark streaming engagement metrics (MAU, DAU, peak viewers, session lengths) and run retention models.
Transparency: Each column is documented; derivations are explicit so you can reproduce metrics from your raw events.
Assumptions: Sessionization uses a 30-minute inactivity threshold; MAU reflects unique active users in the calendar month; reported 450M MAUs is taken as the high-water monthly figure for Jan 2026; 99M is treated as concurrent/unique match viewers depending on the analysis need (we provide both interpretations).

Schema: fields and definitions

Use this schema as your canonical engagement table. Each row is an aggregate at the month × region × device level; finer daily or per-event granularity can be added by extending the template.

month — YYYY-MM (calendar month)
region — geographic region (IN, APAC, ROW)
device_type — Mobile, Desktop, ConnectedTV
maus — unique monthly active users (count)
dau_avg — average daily active users for the month
peak_viewers_concurrent — maximum concurrent viewers observed (for live events)
peak_viewers_unique_event — unique viewers who watched at least a threshold duration of an event (event_threshold, e.g., 300 seconds; useful for one-off matches)
total_sessions — session count in the month
total_watch_seconds — total watch time in seconds across all sessions
avg_session_seconds — mean session length in seconds
median_session_seconds — median session length
session_p90_seconds — 90th percentile session length
new_users — users who first activated in month
returning_users — users active this month and prior month
paying_subscribers — active paid subscribers
ad_impressions — total served impressions
event_flag — boolean (1/0) if month contains a major live event
notes — free text for adjustments/assumptions

Derivation rules (key formulas)

DAU/MAU: dau_avg / maus (ratio useful for stickiness benchmarking)
Avg session: total_watch_seconds / total_sessions
Peak viewers (unique_event): count of unique user_ids with watch_seconds >= event_threshold (e.g., 300s) during event window
Sessionization rule: group user events into a session when gaps between events <= 30 minutes. Close session after 30 minutes of inactivity.
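The sessionization rule above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a production pipeline: the `sessionize` function, the `(user_id, timestamp)` event shape, and the sample events are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Assumption: events arrive as (user_id, timestamp) pairs with UTC timestamps.
def sessionize(events, gap=timedelta(minutes=30)):
    """Return (user_id, session_index, timestamp) rows.

    A new session starts when the time since the user's previous event
    exceeds `gap`; gaps <= `gap` stay in the same session.
    """
    out = []
    last_seen = {}    # user_id -> timestamp of the previous event
    session_no = {}   # user_id -> current 0-based session index
    for user_id, ts in sorted(events, key=lambda e: (e[0], e[1])):
        prev = last_seen.get(user_id)
        if prev is None:
            session_no[user_id] = 0
        elif ts - prev > gap:
            session_no[user_id] += 1
        last_seen[user_id] = ts
        out.append((user_id, session_no[user_id], ts))
    return out

t0 = datetime(2026, 1, 10, 14, 0)
events = [
    ("u1", t0),
    ("u1", t0 + timedelta(minutes=30)),  # exactly 30 min later: same session
    ("u1", t0 + timedelta(minutes=61)),  # 31 min after the previous event: new session
    ("u2", t0 + timedelta(minutes=5)),
]
sessions = sessionize(events)
```

Passing `gap=timedelta(minutes=10)` gives the tighter threshold some teams prefer for live-event analysis.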

Sample CSV template (copy, save as CSV)

Below is a ready-to-use monthly template across six months including the Jan 2026 spike. Copy to a file named jiostar_engagement_template.csv.

month,region,device_type,maus,dau_avg,peak_viewers_concurrent,peak_viewers_unique_event,total_sessions,total_watch_seconds,avg_session_seconds,median_session_seconds,session_p90_seconds,new_users,returning_users,paying_subscribers,ad_impressions,event_flag,notes
2025-09,IN,Mobile,420000000,95000000,18000000,15000000,1200000000,1200000000000,1000,720,3600,10000000,290000000,30000000,8000000000,0,Pre-event baseline
2025-10,IN,Mobile,430000000,100000000,20000000,16500000,1250000000,1300000000000,1040,740,3800,12000000,300000000,32000000,8500000000,0,Cricket league season
2025-11,IN,ConnectedTV,435000000,105000000,25000000,18000000,1300000000,1500000000000,1154,800,4200,14000000,305000000,34000000,9000000000,0,Streaming promotions
2025-12,IN,Mobile,445000000,110000000,28000000,20000000,1380000000,1600000000000,1159,820,4300,16000000,315000000,36000000,9500000000,0,Holiday usage bump
2026-01,IN,Mobile,450000000,125000000,99000000,99000000,1600000000,2200000000000,1375,900,5200,25000000,330000000,38000000,12000000000,1,ICC Women's Cricket World Cup final
2026-02,IN,ConnectedTV,440000000,108000000,22000000,19000000,1350000000,1450000000000,1074,760,4000,11000000,310000000,35000000,9200000000,0,Post-event normalization

Tip: all counts are written as plain integers so the file parses without preprocessing. avg_session_seconds equals total_watch_seconds / total_sessions, rounded to the nearest second.
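Before building on the template, it can help to load it and confirm that the stored averages agree with the derivation rules. The sketch below inlines two template rows with a subset of columns; the helper names `load_rows` and `avg_session_mismatches` and the one-second tolerance are assumptions for illustration.

```python
import csv
import io

# Two rows from the template, inlined as plain integers (subset of columns).
CSV_TEXT = """month,region,device_type,maus,dau_avg,total_sessions,total_watch_seconds,avg_session_seconds
2025-09,IN,Mobile,420000000,95000000,1200000000,1200000000000,1000
2026-01,IN,Mobile,450000000,125000000,1600000000,2200000000000,1375
"""

NUMERIC = ["maus", "dau_avg", "total_sessions",
           "total_watch_seconds", "avg_session_seconds"]

def load_rows(text):
    """Parse CSV text into dicts with the numeric columns converted to int."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        for col in NUMERIC:
            # int() also accepts digit-group underscores such as "95_000_000"
            rec[col] = int(rec[col])
        rows.append(rec)
    return rows

def avg_session_mismatches(rows, tolerance=1):
    """Months whose stored average is off from the derived one by more than `tolerance` seconds."""
    bad = []
    for r in rows:
        derived = r["total_watch_seconds"] / r["total_sessions"]
        if abs(derived - r["avg_session_seconds"]) > tolerance:
            bad.append(r["month"])
    return bad

rows = load_rows(CSV_TEXT)
mismatches = avg_session_mismatches(rows)  # empty list when every row is consistent
```

To check the full file, read jiostar_engagement_template.csv from disk and pass its text to `load_rows` after extending `NUMERIC` to the remaining count columns.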

Cleaning and preprocessing checklist

PII and hashing: Hash user identifiers consistently; do not store raw PII. Use a keyed, salted hash and document any salt rotation in your pipeline, because identifiers hashed under different salts no longer join across the rotation boundary.
Timezone normalization: Convert all timestamps to UTC before sessionization; store original timezone if needed for local-peak analysis.
Sessionization: Apply a 30-minute inactivity gap to create session IDs. For live-event analysis you may use a tighter 10-minute gap.
Bot and CDN noise: Filter known bot user agents and high-frequency API tokens; exclude sessions with unrealistically high event rates.
Ad-blocker bias: Tag sessions where ad_impressions are blocked — these affect ad-revenue estimates.
Sampling and correction: If using sampled logs (e.g., 1% Firehose), maintain the sample rate and apply inverse-probability weighting to estimators.
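The inverse-probability weighting mentioned in the last item can be illustrated with a small sketch. It assumes each sampled record carries the rate at which it was kept; the `weighted_total` helper and the sample values are hypothetical.

```python
# Assumption: each sampled record is (value, sample_rate), where sample_rate
# is the probability with which that record was kept in the sampled log.
def weighted_total(records):
    """Inverse-probability (Horvitz-Thompson style) estimate of the population total."""
    return sum(value / rate for value, rate in records)

# Hypothetical watch-second values: two from a 1% stream, one from a 10% stream.
sampled = [(600, 0.01), (1200, 0.01), (300, 0.10)]
estimated_watch_seconds = weighted_total(sampled)
```

Each record stands in for 1 / sample_rate records in the full log, which is why the sample rate has to be stored next to the data rather than in a config file that may change later.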

Actionable analysis recipes

1) DAU/MAU and stickiness

Compute monthly stickiness:

stickiness = dau_avg / maus

Benchmark: as a rough working range, streaming platforms show stickiness between 0.12 and 0.30. Sports-heavy months can push stickiness above 0.25, as in the template's Jan 2026 spike (125M DAU / 450M MAU ≈ 0.28).
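As a worked example, the stickiness ratio can be computed for all six template months. The values are copied from the sample rows; the 0.25 cut-off used to flag event-like months is an illustrative assumption.

```python
# (month, dau_avg, maus) copied from the sample template rows
template = [
    ("2025-09", 95_000_000, 420_000_000),
    ("2025-10", 100_000_000, 430_000_000),
    ("2025-11", 105_000_000, 435_000_000),
    ("2025-12", 110_000_000, 445_000_000),
    ("2026-01", 125_000_000, 450_000_000),
    ("2026-02", 108_000_000, 440_000_000),
]

# stickiness = dau_avg / maus, rounded to three decimals for display
stickiness = {month: round(dau / mau, 3) for month, dau, mau in template}

# Assumption: 0.25 is used here as an illustrative event-month threshold.
event_like = [month for month, s in stickiness.items() if s > 0.25]

for month, s in stickiness.items():
    print(month, s)
```

In the template only 2026-01 crosses the threshold; the other months sit between roughly 0.23 and 0.25.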

2) Cohort retention table (7-day, 30-day, 90-day)

Group users by signup month (or week) and count how many are active at each day offset; dividing each count by the cohort size gives the 7-day, 30-day, and 90-day retention rates. Example SQL (Postgres), assuming users(user_id, signup_at) and events(user_id, activity_at):

-- cohort: signup_month
WITH base AS (
  SELECT user_id, signup_at, DATE_TRUNC('month', signup_at) AS cohort_month
  FROM users
)
SELECT
  b.cohort_month,
  -- date - date yields an integer number of days in Postgres
  (e.activity_at::date - b.signup_at::date) AS days_since_signup,
  COUNT(DISTINCT e.user_id) AS active_users
FROM base b
JOIN events e ON e.user_id = b.user_id
GROUP BY b.cohort_month, days_since_signup;
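For prototyping without a database, the same cohort logic can be sketched in plain Python. This version computes rolling retention (the share of a cohort with any activity N or more days after signup), which is one of several common retention definitions; the function name, input shapes, and sample users are assumptions for illustration.

```python
from collections import defaultdict
from datetime import date

def rolling_retention(signups, activity, horizons=(7, 30, 90)):
    """signups: {user_id: signup_date}; activity: iterable of (user_id, activity_date).

    Returns {cohort_month: {horizon: share of cohort active >= horizon days after signup}}.
    """
    last_offset = {}  # user_id -> largest observed days-since-signup
    for user_id, day in activity:
        if user_id in signups:
            offset = (day - signups[user_id]).days
            last_offset[user_id] = max(offset, last_offset.get(user_id, 0))

    cohorts = defaultdict(list)  # "YYYY-MM" -> user_ids who signed up that month
    for user_id, signed in signups.items():
        cohorts[signed.strftime("%Y-%m")].append(user_id)

    return {
        month: {
            h: sum(last_offset.get(u, 0) >= h for u in users) / len(users)
            for h in horizons
        }
        for month, users in cohorts.items()
    }

# Hypothetical users and activity dates.
signups = {"a": date(2025, 12, 1), "b": date(2025, 12, 15), "c": date(2026, 1, 3)}
activity = [
    ("a", date(2025, 12, 9)),   # day 8
    ("a", date(2026, 1, 5)),    # day 35
    ("b", date(2025, 12, 20)),  # day 5
    ("c", date(2026, 1, 12)),   # day 9
]
retention = rolling_retention(signups, activity)
```

Here the December 2025 cohort has 50% rolling retention at both 7 and 30 days, because only user "a" returns after the first week. Recent cohorts have not yet had 90 days to be observed, so their longer horizons should be masked rather than read as zero.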

3) Survival analysis for churn

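Use Kaplan–Meier curves to estimate the survival (retention) function, and Cox proportional-hazards models to test features (device_type, event exposure, subscription status). Example Python snippet using lifelines:

from lifelines import KaplanMeierFitter
# df: one row per user; days_active = observed lifetime in days,
# churned = 1 if the user churned, 0 if still active (right-censored)
kmf = KaplanMeierFitter()
kmf.fit(durations=df['days_active'], event_observed=df['churned'])
kmf.plot_survival_function()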
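To see what the Kaplan–Meier fit computes, here is a dependency-free sketch of the product-limit estimator. It is meant for intuition and for sanity-checking small samples, not as a replacement for a library implementation; the function name and the five sample users are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each time with at least one churn event.

    S(t) is the product over event times t_i <= t of (1 - d_i / n_i), where
    d_i is the number of churns at t_i and n_i the number of users still at risk.
    """
    pairs = sorted(zip(durations, observed))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        churns = removed = 0
        # Collect every user whose observation ends at time t.
        while i < len(pairs) and pairs[i][0] == t:
            churns += pairs[i][1]
            removed += 1
            i += 1
        if churns:
            survival *= 1 - churns / at_risk
            curve.append((t, survival))
        # Churned and censored users both leave the risk set after t.
        at_risk -= removed
    return curve

# Hypothetical users: days observed, and 1 = churned, 0 = still active (censored).
days_active = [5, 10, 10, 20, 30]
churned = [1, 1, 0, 1, 0]
curve = kaplan_meier(days_active, churned)
```

For this sample the curve steps down to about 0.8 at day 5, 0.6 at day 10, and 0.3 at day 20. The user censored at day 10 still counts as at risk for the churn at day 10, and the user censored at day 30 never lowers the curve.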

4) Session length distribution and segmentation

Plot session-length histograms (log scale) and compute medians and p90. Use segments: content_type (live, VOD), device_type, region. Live sports sessions will dramatically shift the right tail.
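The per-segment summary statistics can be sketched with the standard library alone. The session lengths below are hypothetical, and the p90 uses the nearest-rank method, which is one of several percentile conventions and may differ slightly from what your warehouse computes.

```python
import statistics

def p90_nearest_rank(values):
    """90th percentile by the nearest-rank method."""
    ordered = sorted(values)
    rank = (9 * len(ordered) + 9) // 10  # integer ceil(0.9 * n)
    return ordered[rank - 1]

# Hypothetical session lengths in seconds, keyed by content_type.
session_lengths = {
    "vod": [120, 300, 480, 600, 720, 900, 1100, 1500, 2400, 3600],
    "live": [300, 900, 1800, 2700, 3600, 4200, 4800, 5400, 6000, 7200],
}

summary = {
    segment: {
        "median": statistics.median(lengths),
        "p90": p90_nearest_rank(lengths),
    }
    for segment, lengths in session_lengths.items()
}
```

In this made-up sample the live segment has a median of 3,900 seconds against 810 for VOD, the kind of right-tail shift that a pooled histogram would hide.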

Visualization templates (interactive)

Suggested interactive charts to include in dashboards (Plotly/Observable):

MAU/DAU Time Series — stacked by device_type; annotate major events (Jan 2026 final)
Concurrent Peak Viewer Timeline — zoomable line for concurrent viewers during event windows
Cohort Retention Heatmap — months on y-axis, days/weeks on x-axis, color = retention%
Session Length Violin Plot — per content_type to compare VOD vs live
Sankey Flow — acquisition channel -> first session length bucket -> paying conversion

Example Plotly Express snippet for a retention heatmap:

import plotly.express as px
# retention_matrix: 2D array or DataFrame, rows = cohort months, columns = days since signup
fig = px.imshow(retention_matrix, labels=dict(x='Days', y='Cohort Month', color='Retention %'))
fig.update_layout(title='Cohort Retention Heatmap')
fig.show()

Benchmarks and how to interpret the JioHotstar numbers

Using the template and the public figures, here are practical benchmarks:

MAU baseline: 450M in Jan 2026 — normalized months should be compared by region and device. Use per-capita engagement (watch minutes per MAU) to control for market size.
Peak viewers: 99M concurrent/unique for a marquee match — treat as an upper bound for live event capacity planning and CDN scaling.
Stickiness: target DAU/MAU > 0.20 for sports platforms outside event months; event months can reach 0.25–0.30.
Session length: median session ~12–18 minutes is common; live sports push the median and the p90 upward (30–90+ minutes).

Advanced retention strategies for 2026

Recent trends (late 2025–early 2026) to factor into modeling:

AI-driven personalization: Personalized feeds and recommendation models can increase session length and rewatch probability; incorporate predicted engagement scores as covariates in survival models.
Live-first monetization: Sports drives best-in-class retention; prioritize event-exposure flags and engagement uplift attribution.
Privacy-first telemetry: With stricter consent regimes, expect sparser cross-device identifiers — model missingness with data augmentation and probabilistic matching.

Limitations and caveats

This template is a constructed dataset for benchmarking and modeling — not an export of internal JioStar telemetry.
Public reports may conflate concurrent viewers and unique event viewers; choose the interpretation that fits your analysis context and document it.
Identity ambiguity: platform-reported MAUs may count one person several times across multiple accounts, or several people once on a shared device; adjust models accordingly.
Regional differences: urban mobile users have different session patterns than ConnectedTV users — segment before benchmarking.

How to use this dataset to make decisions

Ingest the CSV template and map to your schemas. Recompute key metrics using the derivation rules above.
Run cohort retention analyses to identify the high-value acquisition channels and content types.
Use survival/Cox models to estimate the effect of personalization or promotions on churn hazard.
Set operational KPIs: target DAU/MAU, median session length, ad fill rate, and event-specific AVP (average viewership per event).
Stress-test infrastructure using the 99M peak estimate: ensure CDN and origin capacity planning assumes 50–100M concurrent connections for marquee events.
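For the capacity stress test in the last step, a back-of-the-envelope egress estimate is a useful starting point. The per-stream bitrates below are illustrative assumptions, not reported JioHotstar values, and the `egress_tbps` helper is hypothetical.

```python
# Assumption: average per-stream bitrates are illustrative, not reported values.
def egress_tbps(concurrent_viewers, avg_bitrate_mbps):
    """Aggregate egress in terabits per second (1 Tbps = 1,000,000 Mbps)."""
    return concurrent_viewers * avg_bitrate_mbps / 1_000_000

scenarios = {}
for viewers in (50_000_000, 99_000_000):
    for mbps in (1.5, 3.0):
        scenarios[(viewers, mbps)] = egress_tbps(viewers, mbps)
        print(f"{viewers:,} viewers at {mbps} Mbps -> {scenarios[(viewers, mbps)]:.1f} Tbps")

peak = egress_tbps(99_000_000, 3.0)
```

Under these assumptions the 99M scenario at 3 Mbps works out to about 297 Tbps of aggregate delivery, and 50M viewers at 1.5 Mbps to about 75 Tbps. Substitute your own bitrate ladder and device mix before using the numbers in a CDN conversation.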

Appendix: JSON template

Small JSON record example to seed APIs:

{
  "month": "2026-01",
  "region": "IN",
  "device_type": "Mobile",
  "maus": 450000000,
  "dau_avg": 125000000,
  "peak_viewers_concurrent": 99000000,
  "peak_viewers_unique_event": 99000000,
  "total_sessions": 1600000000,
  "total_watch_seconds": 2200000000000,
  "avg_session_seconds": 1375,
  "median_session_seconds": 900,
  "session_p90_seconds": 5200,
  "new_users": 25000000,
  "returning_users": 330000000,
  "paying_subscribers": 38000000,
  "ad_impressions": 12000000000,
  "event_flag": 1,
  "notes": "ICC Women's Cricket World Cup final drove the spike"
}

Practical takeaways

Use the template to prototype retention models and to size partnership discussions (e.g., ad buys and CDN contracts) without waiting for production telemetry.
Document assumptions — especially how you interpret 99M (concurrent vs unique viewers), and how MAU is computed.
Segment aggressively — device and content type will explain the majority of variance in session length and conversion.
Plan capacity against the 99M benchmark for live events and test your sampling and aggregation code on the included dataset before event day.

Call to action

Download the CSV/JSON templates above, paste them into your analytics environment, and run the cohort and survival recipes this week. If you want a customized dataset (region-specific, daily granularity, or per-event telemetry) — contact our data team for a templated pipeline and retention-modeling workshop tailored to streaming platforms.
