Reddit’s notification engine uses machine learning to decide which of millions of new posts should become alerts for each user. A five-step pipeline picks and sends the most relevant updates, from budgeting how many alerts each person gets to guarding against overload. It matches posts to users in just 50 milliseconds and learns what each person likes from their clicks, upvotes, and comments together. This keeps people active longer and makes them less likely to turn off notifications. The same technology now powers other parts of Reddit, such as recommendations and ad targeting, making new features faster to build.
How does Reddit’s notification engine use machine learning to deliver relevant real-time updates to users?
Reddit’s intelligent notification engine combines causal modeling, two-tower retrieval networks, and multi-task deep learning in a five-stage pipeline. This system rapidly selects, personalizes, and delivers push notifications from over 7 million daily posts, optimizing user engagement and minimizing fatigue across tens of millions of users.
Reddit processes more than 7 million new posts every day and decides in real time which of them should become a push notification for any of tens of millions of active users. To keep that torrent meaningful, engineers have built a notification platform that combines causal modeling, two-tower retrieval networks, and multi-task deep learning into one asynchronous, queue-backed pipeline.
From raw posts to a single push – the 5-stage journey
| Stage | Purpose | Key technique |
|---|---|---|
| **Budgeting** | Decide how many pushes a user should get today without fatigue | Causal inference on engagement curves |
| **Retrieval** | Find <0.1 % of daily posts likely to interest each user | Two-tower vector similarity search |
| **Ranking** | Score those few hundred candidates on expected click/upvote/comment | Multi-task deep network |
| **Reranking** | Add freshness, diversity, business rules | Lightweight, near-real-time reorder |
| **Delivery** | Send reliably across devices and time zones | Queue-backed async workers |
The entire loop completes in under two seconds, often faster than the time it takes for the original post to appear on the author’s screen.
Smart budgeting – beating the fatigue curve
Traditional systems cap daily pushes at a flat number. Reddit instead learns individual fatigue thresholds by running uplift modeling on past engagement data. The model asks, “If we send exactly this notification today, will tomorrow’s overall engagement rise or fall?” Experiments show this approach lifts long-term active days per user by 12 % while cutting opt-outs by 38 %.
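Reddit’s exact uplift formulation isn’t public, but the idea can be sketched as a simple T-learner over logged send/hold-out data. Everything below – the features `X`, the `treated` flag, and the synthetic outcomes – is illustrative:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic stand-in for logged engagement data (hypothetical features):
# X = per-user features, treated = 1 if a push was sent that day,
# y = next-day engagement (e.g., sessions opened).
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 8))
treated = rng.integers(0, 2, size=10_000)
y = X[:, 0] * 0.3 + treated * (0.5 - X[:, 1]) + rng.normal(size=10_000)

# T-learner: fit one outcome model per treatment arm, then take the
# difference of predictions as the estimated uplift of sending.
model_sent = GradientBoostingRegressor().fit(X[treated == 1], y[treated == 1])
model_held = GradientBoostingRegressor().fit(X[treated == 0], y[treated == 0])

def predicted_lift(x: np.ndarray) -> float:
    """Estimated change in tomorrow's engagement if one more push is sent."""
    x = x.reshape(1, -1)
    return model_sent.predict(x)[0] - model_held.predict(x)[0]

# Send only while the marginal push still has positive predicted lift.
send = predicted_lift(X[0]) > 0.0
```

The key property is that the budget is a prediction about *incremental* engagement, not a fixed cap, so it naturally differs per user and per day.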
Two-tower retrieval – 50 ms to scan 7 million posts
- User tower encodes the subscriber’s recent interests (subreddits, saved posts, upvotes).
- Content tower encodes each new post (text embeddings, subreddit, age).
A single inner product between the two 128-dimensional vectors scores a user–post pair, and a vector similarity search over the indexed post embeddings returns the top-500 candidates for that user – all inside a 50 ms SLA.
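A minimal sketch of the two-tower pattern in PyTorch. Only the 128-dimensional embedding and the top-500 cut come from the description above; the layer sizes, feature widths, and toy corpus are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

EMBED_DIM = 128  # matches the 128-dimensional vectors described above

class Tower(nn.Module):
    """Maps a raw feature vector to an L2-normalized embedding, so the
    inner product between towers behaves like cosine similarity."""
    def __init__(self, in_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, EMBED_DIM))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.normalize(self.net(x), dim=-1)

user_tower = Tower(in_dim=64)     # hypothetical user-feature width
content_tower = Tower(in_dim=96)  # hypothetical post-feature width

# Post embeddings are computed once per post and indexed; at request time
# only the user tower runs, then one inner product per indexed post.
post_index = content_tower(torch.randn(10_000, 96))   # toy post corpus
user_vec = user_tower(torch.randn(1, 64))             # one user's features
scores = user_vec @ post_index.T                      # inner-product scores
top500 = torch.topk(scores, k=500, dim=-1).indices    # candidates for ranking
```

Normalizing both towers keeps scores comparable across users, and decoupling the towers is what makes the 50 ms budget feasible: the expensive post encoding happens offline.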
Multi-task learning – one network, three signals
Instead of training separate models for click, upvote, and comment, engineers use a shared-bottom architecture that predicts all three at once. Shared layers learn generic user taste; task-specific heads learn nuanced signal weights. A/B tests show +9 % click-through and +6 % comment rate versus single-task baselines.
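The shared-bottom idea reduces to a few lines in PyTorch; the layer widths and sigmoid heads below are assumptions, only the one-trunk-three-heads structure comes from the text:

```python
import torch
import torch.nn as nn

class SharedBottomRanker(nn.Module):
    """Shared layers learn generic user taste; one small head per task
    (click, upvote, comment) learns its own signal weights."""
    def __init__(self, in_dim: int = 256, hidden: int = 128):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                    nn.Linear(hidden, hidden), nn.ReLU())
        self.heads = nn.ModuleDict({
            task: nn.Linear(hidden, 1)
            for task in ("click", "upvote", "comment")
        })

    def forward(self, x: torch.Tensor) -> dict[str, torch.Tensor]:
        h = self.shared(x)  # one trunk feeds every task head
        return {task: torch.sigmoid(head(h)).squeeze(-1)
                for task, head in self.heads.items()}

model = SharedBottomRanker()
feats = torch.randn(500, 256)   # the ~500 retrieved candidates
probs = model(feats)            # per-task probabilities for each candidate
# Training would sum one binary cross-entropy loss per head; serving can
# blend the three probabilities into a single ranking score.
```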
Queue-based delivery – handling 10 M+ pushes per month
Notifications are published to Kafka topics; downstream workers fan out to Apple’s APNs, Firebase and email relays. The queue design absorbs traffic bursts – for example, when a major AMA starts – without dropping a single push.
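The producer side of such a design might look like the following sketch using the kafka-python client; the topic name, payload shape, and broker address are all assumptions, not Reddit’s actual schema:

```python
import json
from kafka import KafkaProducer  # kafka-python client (one common choice)

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Hypothetical event shape for one ranked notification.
notification = {"user_id": "t2_abc", "post_id": "t3_xyz", "channel": "apns"}
producer.send("push-notifications", value=notification)  # enqueue for workers
producer.flush()  # block until the broker has acknowledged the batch
```

Separate consumer workers subscribe to the topic and fan out to APNs, Firebase, or email relays, so a burst only lengthens the queue rather than overloading the senders.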
Organizational ripple effects
Teams that built the notification engine packaged its retrieval and ranking modules into an internal ML platform service. Today, the same libraries power:
- Recommended communities on the home feed
- Suggested comments in the composer
- Promoted post targeting for advertisers
That reuse shrank the average time-to-prototype for new ML features from 4 weeks to 5 days.
2025 feature snapshot
- Dynamic reranking now factors in “trending momentum” – posts gaining upvotes faster than their historical subreddit median surface earlier (see the sketch after this list).
- Reddit Insights, launched this year, lets moderators view predicted engagement heat-maps before pinning a post – a direct descendant of the notification ranking model.
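The momentum formula isn’t published; one plausible reading of “gaining upvotes faster than the historical subreddit median” is a simple rate ratio, where the window length and both argument names are assumptions:

```python
def trending_momentum(upvotes_last_10min: int,
                      subreddit_median_per_10min: float) -> float:
    """Hypothetical momentum score: a post's recent upvote rate relative to
    its subreddit's historical median over the same window. Values above 1.0
    mean the post is gaining upvotes faster than usual and should surface
    earlier in the reranking pass."""
    return upvotes_last_10min / max(subreddit_median_per_10min, 1e-9)

print(trending_momentum(42, subreddit_median_per_10min=12.0))  # -> 3.5
```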
Sources:
- ByteByteGo deep-dive on Reddit’s notification architecture
- Hungry Minds Dev – 10 M+ monthly pushes
How does Reddit decide which posts become push notifications?
Reddit transforms millions of daily posts into a curated stream of push alerts through a five-stage pipeline:
- Smart budgeting – causal models set a personal daily limit per user to avoid fatigue
- Two-tower retrieval – vector similarity search narrows candidates from millions to hundreds in <50 ms
- Multi-task deep learning – one neural network predicts click, upvote, comment likelihood simultaneously
- Dynamic reranking – business rules inject diversity and freshness before the final list is locked (sketched below)
- Queue-based delivery – asynchronous workers guarantee real-time, reliable dispatch to mobile clients
The whole cycle completes in under two seconds for every new post.
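Reddit hasn’t detailed its reranking rules; the sketch below illustrates the freshness-and-diversity idea with a made-up half-life decay and per-subreddit cap:

```python
from collections import Counter

def rerank(candidates, max_per_subreddit=2, half_life_s=3600.0):
    """Illustrative reranking pass: decay each model score by post age
    (freshness) and cap posts per subreddit (diversity). `candidates`
    is a list of (score, age_seconds, subreddit) tuples."""
    decayed = sorted(
        ((score * 0.5 ** (age_s / half_life_s), sub)
         for score, age_s, sub in candidates),
        reverse=True,
    )
    seen, final = Counter(), []
    for score, sub in decayed:
        if seen[sub] < max_per_subreddit:  # business rule: diversity cap
            final.append((score, sub))
            seen[sub] += 1
    return final

print(rerank([(0.9, 7200, "r/aww"),     # strong score but two hours old
              (0.8, 60, "r/aww"),       # fresh, wins after decay
              (0.7, 60, "r/science")]))
```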
What makes the system scale to tens of millions of users?
Three architectural choices keep latency low while usage grows:
- Queue-based async layer decouples ingestion, scoring, and delivery, so traffic spikes never stall the pipeline
- Shared ML components let the notification engine reuse Reddit’s existing ranking and retrieval micro-services, doubling development speed and cutting infra costs
- Edge caching + CDN deliver static assets for rich notifications globally in <200 ms, independent of user location
Even during peak events like major AMA sessions, no horizontal scaling of core services is required – the queues simply absorb the burst.
How does Reddit prevent notification fatigue?
Unlike simple rate-limiting, Reddit applies causal modeling:
- Each user gets a dynamic daily quota predicted from past engagement signals
- The model estimates incremental lift – will one more alert today increase or decrease tomorrow’s opens?
- If lift turns negative, the post is silently dropped, cutting irrelevant notifications by 34 % after launch
The same logic runs continuously; heavy users may receive up to 7 alerts on busy days, while light users never see more than 1–2.
Why combine causal modeling with deep learning?
The causal layer answers “Should we send?” while deep learning answers “To whom, and when?” (a sketch of the combined flow follows this list):
- Causal layer isolates true treatment effects, avoiding the classic pitfall of recommending popular but low-impact content
- Neural ranker handles the high-cardinality user×post space, learning subtle patterns like “user upvotes cat memes at 22:00 on Tuesdays”
- Together, they lift click-through rate by 18 % and cut spam reports by 27 %
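Reduced to control flow, the division of labor is a gate followed by a score; both functions below are invented stubs standing in for the budgeting and ranking models:

```python
def predicted_lift(user_id: str, post_id: str) -> float:
    """Stub for the causal layer: estimated change in tomorrow's engagement
    if this push is sent (a real system would call the budgeting model)."""
    return 0.02

def relevance_score(user_id: str, post_id: str) -> float:
    """Stub for the multi-task ranker's blended click/upvote/comment score."""
    return 0.7

def decide(user_id: str, post_id: str) -> float | None:
    # Causal gate answers "should we send at all?"
    if predicted_lift(user_id, post_id) <= 0.0:
        return None  # negative lift: silently drop the candidate
    # Neural ranker answers "how relevant is this post to this user?"
    return relevance_score(user_id, post_id)

print(decide("t2_abc", "t3_xyz"))  # -> 0.7
```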
What lessons transfer to other product teams?
Reddit packaged its pipeline into reusable libraries shipped to three other recommender teams:
- Feature stores and shared embeddings cut model-training time from weeks to days
- A/B framework baked into each stage lets teams test new budget curves or ranking tweaks without touching production configs
- Documentation templates (system diagrams + metric dashboards) reduced onboarding time for new ML engineers from 10 days to 3
As a result, the same causal-budgeting module now also limits email digests and in-app highlights, maintaining a cohesive user experience across all touchpoints.