Streaming Engagement Dataset: Peak Viewers, MAUs, and Session Lengths for JioStar
Download a vetted JioHotstar engagement dataset template (99M peak viewers, 450M MAUs) for benchmarking and retention modeling.
Stop hunting unreliable streaming figures: get a ready-to-use JioHotstar engagement dataset template for benchmarking and retention modeling
Finding trustworthy, analysis-ready streaming telemetry is time-consuming: you spend hours verifying figures, cleaning session data, and guessing sessionization rules before you can run retention models. This article delivers a documented, cleaned dataset template built around the widely reported figures — 99 million peak viewers for a marquee cricket final and ~450 million MAUs — and shows how to transform it into actionable benchmarks and retention models in 2026.
Executive summary: what you get and why it matters
This guide provides:
- A documented, synthetic (template) engagement dataset aligned with public reporting on JioStar / JioHotstar (99M peak viewers; ~450M MAUs)
- Clear data cleaning and sessionization rules to turn raw events into sessions
- Pre-built analysis recipes: DAU/MAU, cohort retention, survival analysis, and churn estimates
- Visualization templates (Plotly/Plotly Express) for interactive charts you can embed in dashboards
- Downloadable CSV/JSON templates you can copy and paste into your environment
All examples reference trends from late 2025 to early 2026: sports-driven spikes, AI personalization that improves session engagement, and stricter privacy and consent practices that affect telemetry capture.
Context: reported numbers and source
In January 2026, trade press reported that the newly consolidated JioStar (formed by the merger of Viacom18 and Star India under Reliance and Disney ownership) posted strong quarterly revenue and engagement numbers, including a record 99M digital viewers for the ICC Women’s Cricket World Cup final and a platform average of roughly 450M MAUs during a high-engagement quarter.
Variety, Jan 16, 2026: “JioHotstar reports 99 million digital viewers for historic cricket match as platform averages 450 million monthly users.”
Important: the dataset template below is a constructed, cleaned dataset intended for benchmarking and modeling. It is not operational telemetry from JioHotstar. Use it to prototype analyses, calibrate retention models, and validate pipelines.
Design goals and assumptions
- Goal: Provide a concise, reproducible dataset to benchmark streaming engagement metrics (MAU, DAU, peak viewers, session lengths) and run retention models.
- Transparency: Each column is documented; derivations are explicit so you can reproduce metrics from your raw events.
- Assumptions: Sessionization uses a 30-minute inactivity threshold; MAU reflects unique active users in the calendar month; reported 450M MAUs is taken as the high-water monthly figure for Jan 2026; 99M is treated as concurrent/unique match viewers depending on the analysis need (we provide both interpretations).
Schema: fields and definitions
Use this schema as your canonical engagement table. Each row represents an aggregation at the month x region x device level (daily or per-event granularity is available by extending the template).
- month — YYYY-MM (calendar month)
- region — geographic region (IN, APAC, ROW)
- device_type — Mobile, Desktop, ConnectedTV
- maus — unique monthly active users (count)
- dau_avg — average daily active users for the month
- peak_viewers_concurrent — maximum concurrent viewers observed (for live events)
- peak_viewers_unique_event — unique viewers who watched at least X minutes of an event (useful for one-off matches)
- total_sessions — session count in the month
- total_watch_seconds — total watch time in seconds across all sessions
- avg_session_seconds — mean session length in seconds
- median_session_seconds — median session length
- session_p90_seconds — 90th percentile session length
- new_users — users who first activated in month
- returning_users — users active this month and prior month
- paying_subscribers — active paid subscribers
- ad_impressions — total served impressions
- event_flag — boolean (1/0) if month contains a major live event
- notes — free text for adjustments/assumptions
Derivation rules (key formulas)
- DAU/MAU: dau_avg / maus (ratio useful for stickiness benchmarking)
- Avg session: total_watch_seconds / total_sessions
- Peak viewers (unique_event): count of unique user_ids with watch_seconds >= event_threshold (e.g., 300s) during event window
- Sessionization rule: group user events into a session when gaps between events <= 30 minutes. Close session after 30 minutes of inactivity.
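The sessionization rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming events arrive as (user_id, timestamp) pairs; the function name `sessionize` and the sample events are hypothetical, and a gap of exactly 30 minutes is treated as still belonging to the same session.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # inactivity threshold from the rule above


def sessionize(events, gap=SESSION_GAP):
    """Assign a per-user session index to each (user_id, timestamp) event.

    Returns (user_id, timestamp, session_index) tuples; the index restarts
    at 0 for every user and increments after more than `gap` of inactivity.
    """
    out = []
    last_seen = {}   # user_id -> timestamp of that user's previous event
    session_no = {}  # user_id -> current session index
    for user_id, ts in sorted(events):
        prev = last_seen.get(user_id)
        if prev is None:
            session_no[user_id] = 0
        elif ts - prev > gap:
            # more than `gap` of inactivity closes the previous session
            session_no[user_id] += 1
        last_seen[user_id] = ts
        out.append((user_id, ts, session_no[user_id]))
    return out


t0 = datetime(2026, 1, 10, 14, 0)
events = [
    ("u1", t0),
    ("u1", t0 + timedelta(minutes=10)),
    ("u1", t0 + timedelta(minutes=40)),  # 30-minute gap: same session
    ("u1", t0 + timedelta(minutes=75)),  # 35-minute gap: new session
    ("u2", t0 + timedelta(minutes=5)),
]
sessions = sessionize(events)
```

Passing `gap=timedelta(minutes=10)` gives the tighter threshold sometimes used for live-event analysis.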
Sample CSV template (copy, save as CSV)
Below is a ready-to-use monthly template across six months including the Jan 2026 spike. Copy to a file named jiostar_engagement_template.csv.
month,region,device_type,maus,dau_avg,peak_viewers_concurrent,peak_viewers_unique_event,total_sessions,total_watch_seconds,avg_session_seconds,median_session_seconds,session_p90_seconds,new_users,returning_users,paying_subscribers,ad_impressions,event_flag,notes
2025-09,IN,Mobile,420000000,95_000_000,18_000_000,15_000_000,1_200_000_000,1_200_000_000_000,1000,720,3600,10_000_000,290_000_000,30_000_000,8_000_000_000,0,Pre-event baseline
2025-10,IN,Mobile,430000000,100_000_000,20_000_000,16_500_000,1_250_000_000,1_300_000_000_000,1040,740,3800,12_000_000,300_000_000,32_000_000,8_500_000_000,0,Cricket league season
2025-11,IN,ConnectedTV,435000000,105_000_000,25_000_000,18_000_000,1_300_000_000,1_500_000_000_000,1150,800,4200,14_000_000,305_000_000,34_000_000,9_000_000_000,0,Streaming promotions
2025-12,IN,Mobile,445000000,110_000_000,28_000_000,20_000_000,1_380_000_000,1_600_000_000_000,1160,820,4300,16_000_000,315_000_000,36_000_000,9_500_000_000,0,Holiday usage bump
2026-01,IN,Mobile,450000000,125_000_000,99_000_000,99_000_000,1_600_000_000,2_200_000_000_000,1375,900,5200,25_000_000,330_000_000,38_000_000,12_000_000_000,1,ICC Women's Cricket World Cup final
2026-02,IN,ConnectedTV,440000000,108_000_000,22_000_000,19_000_000,1_350_000_000,1_450_000_000_000,1074,760,4000,11_000_000,310_000_000,35_000_000,9_200_000_000,0,Post-event normalization
Tip: underscores are used above for readability in large numbers. When you paste into a CSV, remove the underscores or parse them programmatically.
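One way to parse the underscores programmatically is to strip them while loading. The sketch below uses only the Python standard library and, to stay short, a subset of the template's columns; the helper name `load_rows` and the `TEXT_COLUMNS` set are illustrative assumptions.

```python
import csv
import io

# Two rows and a subset of columns copied from the template, underscores left in place.
RAW = """month,region,device_type,maus,dau_avg,total_sessions,total_watch_seconds
2025-09,IN,Mobile,420000000,95_000_000,1_200_000_000,1_200_000_000_000
2026-01,IN,Mobile,450000000,125_000_000,1_600_000_000,2_200_000_000_000
"""

# Columns that should stay as text; everything else is parsed as an integer.
TEXT_COLUMNS = {"month", "region", "device_type", "notes"}


def load_rows(handle):
    """Read template rows, stripping readability underscores from numeric fields."""
    rows = []
    for record in csv.DictReader(handle):
        rows.append({
            key: value if key in TEXT_COLUMNS else int(value.replace("_", ""))
            for key, value in record.items()
        })
    return rows


rows = load_rows(io.StringIO(RAW))
```

To load the full file, replace the `io.StringIO(RAW)` argument with an open file handle for jiostar_engagement_template.csv.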
Cleaning and preprocessing checklist
- PII and hashing: Hash user identifiers consistently; do not store raw PII. Use a stable hashing algorithm with salt rotation documented in your pipeline.
- Timezone normalization: Convert all timestamps to UTC before sessionization; store original timezone if needed for local-peak analysis.
- Sessionization: Apply a 30-minute inactivity gap to create session IDs. For live-event analysis you may use a tighter 10-minute gap.
- Bot and CDN noise: Filter known bot user agents and high-frequency API tokens; exclude sessions with unrealistically high event rates.
- Ad-blocker bias: Tag sessions where ad_impressions are blocked — these affect ad-revenue estimates.
- Sampling and correction: If using sampled logs (e.g., 1% Firehose), maintain the sample rate and apply inverse-probability weighting to estimators.
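Two of the checklist items lend themselves to a short sketch: keyed hashing of user identifiers and inverse-probability weighting for sampled logs. The function names, the example salt, and the sample values below are all hypothetical; a real pipeline would load the salt from a secrets store and record which salt version produced each hash.

```python
import hashlib
import hmac


def pseudonymize(user_id: str, salt: bytes) -> str:
    """Return a keyed SHA-256 hash (hex digest) of a user identifier."""
    return hmac.new(salt, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def weighted_total(sampled_values, sample_rate: float) -> float:
    """Estimate a population total by weighting each sampled value by 1 / sample_rate."""
    if not 0 < sample_rate <= 1:
        raise ValueError("sample_rate must be in (0, 1]")
    return sum(sampled_values) / sample_rate


salt = b"example-salt-2026-01"  # hypothetical value, for illustration only
token = pseudonymize("user-123", salt)

# Watch time (seconds) for three sessions drawn from a 1% sample.
estimated_watch_seconds = weighted_total([900, 1200, 600], sample_rate=0.01)
```

The same identifier hashed with the same salt always yields the same token, which is what keeps sessionization and cohort joins stable between salt rotations.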
Actionable analysis recipes
1) DAU/MAU and stickiness
Compute monthly stickiness:
stickiness = dau_avg / maus
Benchmark: streaming platforms typically show stickiness between 0.12–0.30. Sports-heavy months can push stickiness above 0.25, as in the template's Jan 2026 row (125M DAU / 450M MAU ≈ 0.28).
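As a quick check, the stickiness ratio can be computed for every month in the template. The values below are copied from the template's dau_avg and maus columns; the variable names are illustrative.

```python
# (month, dau_avg, maus) copied from the template rows
ROWS = [
    ("2025-09", 95_000_000, 420_000_000),
    ("2025-10", 100_000_000, 430_000_000),
    ("2025-11", 105_000_000, 435_000_000),
    ("2025-12", 110_000_000, 445_000_000),
    ("2026-01", 125_000_000, 450_000_000),
    ("2026-02", 108_000_000, 440_000_000),
]


def stickiness(dau_avg: int, maus: int) -> float:
    """DAU/MAU ratio: the share of monthly users active on an average day."""
    return dau_avg / maus


table = {month: round(stickiness(dau, mau), 3) for month, dau, mau in ROWS}
peak_month = max(table, key=table.get)

for month, value in table.items():
    print(month, value)
```

In this synthetic data the event month stands out at about 0.278, against roughly 0.23 to 0.25 in the surrounding months.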
2) Cohort retention table (7-day, 30-day, 90-day)
Group users by signup week or month, then compute retention rates at each interval. Example SQL (Postgres):
-- cohort: signup_month
WITH base AS (
  SELECT user_id, signup_at, DATE_TRUNC('month', signup_at) AS cohort_month
  FROM users
)
SELECT
  base.cohort_month,
  -- date minus date yields an integer number of days in Postgres
  (e.activity_at::date - base.signup_at::date) AS days_since_signup,
  COUNT(DISTINCT base.user_id) AS active_users
FROM base
JOIN events e ON e.user_id = base.user_id
GROUP BY base.cohort_month, days_since_signup;
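The query returns active-user counts; turning them into 7-, 30-, and 90-day retention rates requires dividing by cohort size. The Python sketch below shows one way to do that, assuming a user counts as retained at day N if they have any activity N or more days after signup (one of several common definitions). The function name and sample data are hypothetical.

```python
from collections import defaultdict
from datetime import date


def retention_by_cohort(signups, activity, horizons=(7, 30, 90)):
    """Compute retention per signup-month cohort.

    signups: {user_id: signup_date}
    activity: iterable of (user_id, activity_date)
    Returns {cohort_month: {horizon: share of cohort active >= horizon days after signup}}.
    """
    max_offset = {}  # user_id -> latest activity, in days since signup
    for user_id, day in activity:
        if user_id in signups:
            offset = (day - signups[user_id]).days
            max_offset[user_id] = max(offset, max_offset.get(user_id, 0))

    cohorts = defaultdict(list)
    for user_id, signup in signups.items():
        cohorts[signup.strftime("%Y-%m")].append(user_id)

    return {
        month: {
            h: sum(max_offset.get(u, 0) >= h for u in users) / len(users)
            for h in horizons
        }
        for month, users in cohorts.items()
    }


signups = {"a": date(2025, 12, 1), "b": date(2025, 12, 15), "c": date(2026, 1, 3)}
activity = [
    ("a", date(2025, 12, 10)),
    ("a", date(2026, 1, 5)),
    ("b", date(2025, 12, 18)),
    ("c", date(2026, 1, 12)),
]
retention = retention_by_cohort(signups, activity)
```

Recent cohorts have not yet had 90 days to be observed, so their long-horizon rates read as zero here; in practice, exclude horizons a cohort cannot have reached.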
3) Survival analysis for churn
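Use Kaplan–Meier curves to estimate the retention (survival) curve, and Cox proportional-hazards models to test how features such as device_type, event exposure, and subscription status affect churn hazard. Example Python snippet using lifelines:
from lifelines import KaplanMeierFitter

# df: one row per user, with days_active (duration) and churned (1 = churned, 0 = still active)
kmf = KaplanMeierFitter()
kmf.fit(durations=df['days_active'], event_observed=df['churned'])
kmf.plot_survival_function()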
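To make the estimator less of a black box, here is a dependency-free sketch of the Kaplan–Meier product-limit calculation. It is an illustration of the technique, not a replacement for lifelines; the function name and toy data are hypothetical, and users who have not churned are treated as censored at their last observed day.

```python
def kaplan_meier(durations, events):
    """Return [(t, S(t))] for each distinct time with at least one churn event.

    durations: days observed per user
    events: 1 if the user churned at that time, 0 if censored (still active)
    """
    pairs = sorted(zip(durations, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        churned_here = leaving = 0
        # gather everyone whose observation ends at time t
        while i < len(pairs) and pairs[i][0] == t:
            churned_here += pairs[i][1]
            leaving += 1
            i += 1
        if churned_here:
            survival *= 1 - churned_here / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # churned and censored users both leave the risk set
    return curve


durations = [5, 10, 10, 20, 30, 30]
churned = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(durations, churned)
```

Each step multiplies the running survival probability by the share of at-risk users who did not churn at that time, which is why censored users still contribute information up to the point they drop out of observation.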
4) Session length distribution and segmentation
Plot session-length histograms (log scale) and compute medians and p90. Use segments: content_type (live, VOD), device_type, region. Live sports sessions will dramatically shift the right tail.
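The median and p90 per segment can be computed before any plotting. The sketch below uses a nearest-rank percentile, which is one of several percentile conventions and can differ slightly from interpolating methods; the function names and session lengths are made up for illustration.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(sessions):
    """sessions: iterable of (content_type, session_seconds) pairs."""
    by_type = defaultdict(list)
    for content_type, seconds in sessions:
        by_type[content_type].append(seconds)
    return {
        content_type: {"median": percentile(v, 50), "p90": percentile(v, 90)}
        for content_type, v in by_type.items()
    }


# Hypothetical session lengths in seconds
vod = [120, 300, 600, 720, 900, 1200, 1500, 1800, 2400, 3600]
live = [600, 1800, 3600, 5400, 7200]
sessions = [("vod", s) for s in vod] + [("live", s) for s in live]
summary = summarize(sessions)
```

Swapping the segment key from content type to device_type or region gives the other cuts mentioned above.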
Visualization templates (interactive)
Suggested interactive charts to include in dashboards (Plotly/Observable):
- MAU/DAU Time Series — stacked by device_type; annotate major events (Jan 2026 final)
- Concurrent Peak Viewer Timeline — zoomable line for concurrent viewers during event windows
- Cohort Retention Heatmap — months on y-axis, days/weeks on x-axis, color = retention%
- Session Length Violin Plot — per content_type to compare VOD vs live
- Sankey Flow — acquisition channel -> first session length bucket -> paying conversion
Example Plotly Express snippet for a retention heatmap:
import plotly.express as px

# retention_matrix: 2D array or DataFrame (rows = cohort months, columns = days since signup)
fig = px.imshow(retention_matrix, labels=dict(x='Days', y='Cohort Month', color='Retention %'))
fig.update_layout(title='Cohort Retention Heatmap')
fig.show()
Benchmarks and how to interpret the JioHotstar numbers
Using the template and the public figures, here are practical benchmarks:
- MAU baseline: 450M in Jan 2026 — normalized months should be compared by region and device. Use per-capita engagement (watch minutes per MAU) to control for market size.
- Peak viewers: 99M concurrent/unique for a marquee match — treat as an upper bound for live event capacity planning and CDN scaling.
- Stickiness (a retention proxy): target DAU/MAU > 0.20 for sports platforms outside event months; event months can reach 0.25–0.30 or higher.
- Session length: median session ~12–18 minutes is common; live sports push the median and the p90 upward (30–90+ minutes).
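The per-capita engagement measure mentioned in the MAU bullet is simple arithmetic on two template columns. The sketch below compares the baseline and event months; the inputs come from the template, while the function and variable names are illustrative.

```python
def watch_minutes_per_mau(total_watch_seconds: int, maus: int) -> float:
    """Monthly watch time per active user, in minutes."""
    return total_watch_seconds / 60 / maus


# total_watch_seconds and maus copied from the template rows
baseline = watch_minutes_per_mau(1_200_000_000_000, 420_000_000)  # 2025-09
event = watch_minutes_per_mau(2_200_000_000_000, 450_000_000)     # 2026-01
uplift = event / baseline

print(round(baseline, 1), round(event, 1), round(uplift, 2))
```

Because the measure divides by MAU, it stays comparable between markets or months with very different audience sizes.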
Advanced retention strategies for 2026
Recent trends (late 2025–early 2026) to factor into modeling:
- AI-driven personalization: Personalized feeds and recommendation models have been shown to increase session length and rewatch probability — incorporate predicted engagement scores as covariates in survival models.
- Live-first monetization: Sports drives best-in-class retention; prioritize event-exposure flags and engagement uplift attribution.
- Privacy-first telemetry: With stricter consent regimes, expect sparser cross-device identifiers — model missingness with data augmentation and probabilistic matching.
Limitations and caveats
- This template is a constructed dataset for benchmarking and modeling — not an export of internal JioStar telemetry.
- Public reports may conflate concurrent viewers and unique event viewers; choose the interpretation that fits your analysis context and document it.
- Identity resolution: platform-reported MAUs may count one person under multiple accounts, or several people behind one shared device; adjust models accordingly.
- Regional differences: urban mobile users have different session patterns than ConnectedTV users — segment before benchmarking.
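The difference between concurrent and unique event viewers is easiest to see on a toy example. The sketch below assumes viewing spans are available as (user_id, start_second, end_second) tuples measured from the start of the event, counts concurrency per stream, and applies a 300-second threshold for unique viewers. Function names and data are hypothetical.

```python
def peak_concurrent(intervals):
    """Maximum number of simultaneously open viewing spans (sweep over start/end points)."""
    points = []
    for _, start, end in intervals:
        points.append((start, 1))
        points.append((end, -1))
    # At the same instant, process ends (-1) before starts (+1) so
    # back-to-back spans are not counted as overlapping.
    points.sort()
    current = peak = 0
    for _, delta in points:
        current += delta
        peak = max(peak, current)
    return peak


def unique_event_viewers(intervals, threshold_s=300):
    """Number of distinct users whose total watch time meets the threshold."""
    watched = {}
    for user_id, start, end in intervals:
        watched[user_id] = watched.get(user_id, 0) + (end - start)
    return sum(1 for total in watched.values() if total >= threshold_s)


intervals = [
    ("a", 0, 3600),
    ("b", 600, 1200),
    ("c", 1000, 1100),   # 100 s only: below the unique-viewer threshold
    ("d", 3600, 7200),   # starts exactly when "a" ends
    ("e", 4000, 4500),
    ("b", 5000, 6000),   # "b" returns later in the event
]
concurrent = peak_concurrent(intervals)
unique = unique_event_viewers(intervals)
```

Here five people touch the stream, four qualify as unique event viewers, and no more than three are ever watching at once, which is why the interpretation of a headline figure matters for capacity planning.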
How to use this dataset to make decisions
- Ingest the CSV template and map to your schemas. Recompute key metrics using the derivation rules above.
- Run cohort retention analyses to identify the high-value acquisition channels and content types.
- Use survival/Cox models to estimate the effect of personalization or promotions on churn hazard.
- Set operational KPIs: target DAU/MAU, median session length, ad fill rate, and event-specific AVP (average viewership per event).
- Stress-test infrastructure using the 99M peak estimate: ensure CDN and origin capacity planning assumes 50–100M concurrent connections for marquee events.
Appendix: JSON template
Small JSON record example to seed APIs:
{
  "month": "2026-01",
  "region": "IN",
  "device_type": "Mobile",
  "maus": 450000000,
  "dau_avg": 125000000,
  "peak_viewers_concurrent": 99000000,
  "total_sessions": 1600000000,
  "total_watch_seconds": 2200000000000,
  "avg_session_seconds": 1375,
  "median_session_seconds": 900,
  "session_p90_seconds": 5200,
  "new_users": 25000000,
  "returning_users": 330000000,
  "paying_subscribers": 38000000,
  "ad_impressions": 12000000000,
  "event_flag": 1,
  "notes": "ICC Women's Cricket World Cup final drove the spike"
}
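Before seeding an API with records like this, it can help to confirm that the stored fields agree with the derivation rules. The following sketch runs a few such checks on a subset of the record's fields; the function name, the 1% tolerance, and the choice of checks are assumptions.

```python
import json

RECORD = json.loads("""
{
  "month": "2026-01",
  "maus": 450000000,
  "dau_avg": 125000000,
  "total_sessions": 1600000000,
  "total_watch_seconds": 2200000000000,
  "avg_session_seconds": 1375,
  "median_session_seconds": 900,
  "session_p90_seconds": 5200,
  "new_users": 25000000,
  "returning_users": 330000000,
  "event_flag": 1
}
""")


def check_record(rec, tolerance=0.01):
    """Return a list of consistency problems; an empty list means all checks passed."""
    problems = []
    derived_avg = rec["total_watch_seconds"] / rec["total_sessions"]
    if abs(derived_avg - rec["avg_session_seconds"]) > tolerance * rec["avg_session_seconds"]:
        problems.append("avg_session_seconds does not match total_watch_seconds / total_sessions")
    if rec["dau_avg"] > rec["maus"]:
        problems.append("dau_avg exceeds maus")
    if rec["median_session_seconds"] > rec["session_p90_seconds"]:
        problems.append("median exceeds p90")
    if rec["new_users"] + rec["returning_users"] > rec["maus"]:
        problems.append("new_users + returning_users exceeds maus")
    if rec["event_flag"] not in (0, 1):
        problems.append("event_flag must be 0 or 1")
    return problems


problems = check_record(RECORD)
broken = check_record({**RECORD, "avg_session_seconds": 2000})
```

The 1% tolerance allows for the rounding used in some template rows, where the stored average is rounded to the nearest ten seconds.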
Practical takeaways
- Use the template to prototype retention models and partnerships (e.g., ad buys and CDN contracts) without waiting for production telemetry.
- Document assumptions — especially how you interpret 99M (concurrent vs unique viewers), and how MAU is computed.
- Segment aggressively — device and content type will explain the majority of variance in session length and conversion.
- Plan capacity against the 99M benchmark for live events and test your sampling and aggregation code on the included dataset before event day.
Call to action
Copy the CSV/JSON templates above into your analytics environment and run the cohort and survival recipes this week. If you want a customized dataset (region-specific, daily granularity, or per-event telemetry), contact our data team for a templated pipeline and retention-modeling workshop tailored to streaming platforms.