The Latency Problem That Separates Winners From Everyone Else
You're a CTO at a mid-market sportsbook. Your platform has 500K monthly active users. Business is healthy. Then, during the Super Bowl, your odds update latency spikes from 200ms to 4 seconds.
This seemingly technical problem creates an immediate business problem:
- Users see stale odds, making seemingly smart bets that are actually terrible
- Your risk management system sees orders 3-4 seconds late, losing the ability to adjust odds quickly
- Your traders can't react to market-wide changes (for example, a player injury), so your odds diverge from competitor books
- Users become frustrated and switch to competitors with faster odds
Within 2 hours, you've lost $2M in handle. The Super Bowl is a once-per-year event. You don't get a do-over.
This is why real-time odds infrastructure isn't a technical implementation detail. It's a competitive moat. And it's why FairPlay's infrastructure processes 125M price changes daily with <200ms p95 latency while most competitors run at 2-5 second latency.
This guide walks you through what separates fast odds infrastructure from slow odds infrastructure.
Understanding Odds Latency
First, let's define latency precisely, because most industry vendors are vague about it.
Odds latency has multiple stages:
[Bookmaker changes odds in their system]
↓ (Latency: varies, usually 0-50ms)
[Bookmaker publishes odds to their API/feed]
↓ (Latency: 50-200ms, depends on geography)
[FairPlay receives and aggregates odds from 50+ sources]
↓ (Latency: 5-20ms for aggregation)
[FairPlay publishes to its platform/API]
↓ (Latency: 50-200ms, depends on geography)
[Your infrastructure receives odds update]
↓ (Latency: 1-5ms)
[Your backend processes update]
↓ (Latency: 5-20ms)
[Your frontend renders update]
↓ (Latency: 100-300ms, depends on browser)
[User sees updated odds on screen]
Total: roughly 210-800ms when the stage ranges above are summed
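A latency budget like this is easy to keep honest in code. The sketch below sums the per-stage ranges from the diagram above; the `Stage` type, the stage names, and `Budget` are illustrative assumptions, not part of any FairPlay API.

```go
package main

// Stage is one hop in the odds pipeline with its best and worst latency in ms.
type Stage struct {
	Name         string
	MinMs, MaxMs int
}

// pipeline mirrors the stage ranges listed in the diagram (names are illustrative).
var pipeline = []Stage{
	{"bookmaker internal", 0, 50},
	{"bookmaker to aggregator", 50, 200},
	{"aggregation", 5, 20},
	{"aggregator to operator", 50, 200},
	{"operator ingest", 1, 5},
	{"backend processing", 5, 20},
	{"frontend render", 100, 300},
}

// Budget returns the summed best-case and worst-case latency in ms.
func Budget(stages []Stage) (minMs, maxMs int) {
	for _, s := range stages {
		minMs += s.MinMs
		maxMs += s.MaxMs
	}
	return minMs, maxMs
}
```

Calling `Budget(pipeline)` gives the end-to-end range, and sorting the stages by `MaxMs` shows which hops are worth optimising first.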
Most of this latency is physics and geography. A bookmaker API in London to a user in Singapore has inherent ~150ms latency due to the speed of light and network routing.
What separates slow from fast operators is how much additional latency they add on top of physics.
Slow approach: Polling every 5 seconds. This means the odds a user sees can already be up to 5 seconds old when they arrive, before any pipeline delay is added. During fast-moving markets, this is game-breakingly slow.
Fast approach: Streaming odds in real-time from aggregated sources. This means users see odds within 200-300ms of the bookmaker changing them. Still delayed by physics, but acceptable.
Exceptional approach (FairPlay's model): Multi-source streaming with edge computing. By aggregating 50+ sources and computing odds at the edge (in multiple geographic regions), we can serve odds from the nearest datacenter, reducing geographic latency significantly.
The Architecture of Real-Time Odds at Scale
Let's examine the architecture that enables real-time odds.
Layer 1: The Odds Ingestion Engine
This is the hardest part. You need to simultaneously consume odds from 50+ different sportsbooks, each with:
- Different API formats (REST, WebSocket, proprietary)
- Different update frequencies (some every 100ms, some every 10 seconds)
- Different reliability (some feeds have issues 5% of the time)
- Different business relationships (some require paying per update, some provide unlimited)
The architecture:
[Bet365] \
[Betfair] |
[FanDuel] |
[DraftKings] } → [Ingestion Layer] → [Deduplication] → [Aggregation]
[PointsBet] |
[... 45+ more] /
Each sportsbook connection runs in parallel:
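// Pseudo-code showing the pattern (APIClient, OddsUpdate and the helpers are assumed)
type BookmakerConnector struct {
	name       string
	apiClient  *APIClient
	interval   time.Duration     // how often to poll this bookmaker
	updateChan chan<- OddsUpdate // shared channel read by the ingestion loop
	errorChan  chan<- error
}

// Connect polls one bookmaker until ctx is cancelled.
func (bc *BookmakerConnector) Connect(ctx context.Context) {
	ticker := time.NewTicker(bc.interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			// Get odds from this bookmaker's API (assumed to return updates and an error)
			updates, err := bc.apiClient.FetchOdds(ctx)
			if err != nil {
				select {
				case bc.errorChan <- err:
				case <-ctx.Done():
					return
				}
				continue
			}
			for _, update := range updates {
				select {
				case bc.updateChan <- update:
				case <-ctx.Done():
					return
				}
			}
		}
	}
}

// Main ingestion loop: start every connector in parallel, then consume their output
func ingestOdds(ctx context.Context, connectors []*BookmakerConnector,
	updates <-chan OddsUpdate, errs <-chan error) {
	for _, c := range connectors {
		go c.Connect(ctx)
	}
	for {
		select {
		case <-ctx.Done():
			return
		case update := <-updates:
			processAndPublish(update)
		case err := <-errs:
			logAndAlert(err)
		}
	}
}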
Key challenge: deduplication
Multiple bookmakers might send you the same odds because they agree on the price. You need to recognize this and avoid processing the same odds twice.
You can't just use odds == previous_odds because:
- You might get the same odds at slightly different times
- Floating point comparison is unreliable
- You might get updates from different sources
Solution: version-based deduplication
type OddsVersion struct {
	EventID   string
	MarketID  string
	Timestamp time.Time
	Hash      string // Hash of all odds
	Source    string // Which bookmaker
}

func shouldProcess(update OddsUpdate, previousVersions []OddsVersion) bool {
	// Only process if we haven't seen this exact odds hash in the last 100ms
	for _, prev := range previousVersions {
		if prev.Hash == update.Hash &&
			time.Since(prev.Timestamp) < 100*time.Millisecond {
			return false // Duplicate, skip
		}
	}
	return true
}
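The `Hash` field only works for deduplication if identical prices always produce identical hashes, regardless of source or field order. One possible way to build it is sketched below. `OddsHash`, the `|` separator, and the three-decimal rounding are assumptions chosen for illustration; pick a precision that matches your feeds.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"math"
	"sort"
	"strings"
)

// OddsHash returns a stable hash for one market's prices.
// Prices are rounded to 3 decimal places and converted to integers first,
// so 1.95 and 1.9500000001 hash identically and no float comparison is needed.
// The bookmaker name is deliberately left out so equal prices from
// different sources collapse to the same hash.
func OddsHash(eventID, marketID string, prices map[string]float64) string {
	names := make([]string, 0, len(prices))
	for name := range prices {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random; sort for determinism

	var b strings.Builder
	fmt.Fprintf(&b, "%s|%s", eventID, marketID)
	for _, name := range names {
		milli := int64(math.Round(prices[name] * 1000))
		fmt.Fprintf(&b, "|%s=%d", name, milli)
	}
	sum := sha256.Sum256([]byte(b.String()))
	return hex.EncodeToString(sum[:])
}
```

SHA-256 is more than this job needs; a faster non-cryptographic hash would also do, as long as every ingestion node uses the same one.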
Layer 2: The Aggregation Engine
Raw odds from 50 bookmakers are too much data to pass downstream as-is. You need to aggregate them into a canonical format.
[50 raw bookmaker odds streams]
↓
[Normalize formats: decimal vs fractional vs moneyline]
↓
[Validate data: are these odds physically possible?]
↓
[Apply business rules: do we trust this bookmaker? is this market suspended?]
↓
[Compute best odds: what's the consensus across all bookmakers?]
↓
[Publish aggregated odds stream]
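The normalization step in the pipeline above comes down to two standard conversions into decimal odds. A minimal sketch follows; the function names and `ErrInvalidOdds` are hypothetical, and real feeds will need extra parsing around them.

```go
package main

import "errors"

// ErrInvalidOdds is returned when an input cannot be a real price.
var ErrInvalidOdds = errors.New("invalid odds")

// FractionalToDecimal converts fractional odds such as 5/2 to decimal (3.5).
func FractionalToDecimal(num, den float64) (float64, error) {
	if num <= 0 || den <= 0 {
		return 0, ErrInvalidOdds
	}
	return 1 + num/den, nil
}

// MoneylineToDecimal converts American odds such as +150 or -200 to decimal.
func MoneylineToDecimal(ml float64) (float64, error) {
	switch {
	case ml >= 100:
		return 1 + ml/100, nil // underdog: profit of ml per 100 staked
	case ml <= -100:
		return 1 + 100/-ml, nil // favourite: stake -ml to win 100
	default:
		return 0, ErrInvalidOdds // values between -100 and +100 are not valid moneylines
	}
}
```

For example, +150 becomes 2.5, -200 becomes 1.5, and 5/2 becomes 3.5. Converting everything to decimal once, at ingestion, keeps the validation and best-odds steps free of format checks.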
Validation layer:
Some odds data is corrupted or invalid. You need to detect this:
func validateOdds(market MarketUpdate) error {
	// Rule 1: implied probabilities across a market's outcomes must sum to ~100%.
	// A single book's prices usually sum above 100% because of its margin,
	// so tune these bounds to your feeds.
	totalProb := 0.0
	for _, outcome := range market.Outcomes {
		if outcome.Odds <= 1.0 {
			// Also guards the division below against zero or negative odds
			return fmt.Errorf("invalid decimal odds %f for outcome %s", outcome.Odds, outcome.Name)
		}
		totalProb += 1.0 / outcome.Odds
	}
	if totalProb < 0.95 || totalProb > 1.05 {
		return fmt.Errorf("invalid odds: sum of implied probs = %f", totalProb)
	}
	// Rule 2: Odds shouldn't move >30% in a single update
	for _, outcome := range market.Outcomes {
		if outcome.PreviousOdds > 0 {
			change := math.Abs(outcome.Odds-outcome.PreviousOdds) / outcome.PreviousOdds
			if change > 0.3 {
				return fmt.Errorf("suspicious odds jump: %f to %f", outcome.PreviousOdds, outcome.Odds)
			}
		}
	}
	return nil
}
Best odds computation:
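// Odds maps an outcome name to its decimal price (assumed representation).
type Odds map[string]float64

func computeBestOdds(market MarketUpdate, bookmakerPreferences map[string]int) Odds {
	bestOdds := make(Odds) // must be initialised; writing to a nil map panics

	// If publisher has preference (e.g., "always use bet365 for this market"), respect it
	if pref, exists := bookmakerPreferences[market.ID]; exists {
		for _, bm := range market.Bookmakers {
			if bm.ID == pref {
				bestOdds[bm.Outcome] = bm.Odds
			}
		}
		if len(bestOdds) > 0 {
			return bestOdds
		}
		// Preferred bookmaker has no prices for this market: fall through
	}

	// Otherwise: highest odds for each outcome (best value for users)
	for _, outcome := range market.Outcomes {
		maxOdds := 0.0
		for _, bm := range market.Bookmakers {
			if bm.Outcome == outcome.Name && bm.Odds > maxOdds {
				maxOdds = bm.Odds
			}
		}
		if maxOdds > 0 {
			bestOdds[outcome.Name] = maxOdds
		}
	}
	return bestOdds
}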
Layer 3: The Distribution Network
Once you have aggregated odds, you need to distribute them to:
- Your own sportsbooks and publishers
- API clients (other operators buying your odds)
- Mobile apps
- Third-party integrations
This distribution network is where much of the avoidable latency is added, so you need multiple distribution channels:
Channel 1: WebSocket (for real-time clients)
- Maintains persistent connections to clients
- Pushes updates as they happen
- Latency: 50-200ms
- Cost: Higher (persistent connections use memory)
Channel 2: REST API (for polling clients)
- Clients request current odds periodically
- Latency: 5-30 seconds (depending on poll frequency)
- Cost: Lower, but less real-time
Channel 3: Message Queue (for backend systems)
- Kafka topics for each sport/league
- Backend systems subscribe and get real-time updates
- Latency: 100-300ms
- Cost: Medium
[Aggregated Odds]
├─→ [WebSocket Servers] → [Connected clients] (50-200ms)
├─→ [Redis Cache] → [REST API] (5-30s polling)
├─→ [Kafka Topics] → [Backend subscribers] (100-300ms)
└─→ [Historical Database] → [Reporting/analytics] (eventual consistency)
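The fan-out at the top of this diagram has one rule that matters more than the rest: a slow client must never stall everyone else. The sketch below shows that rule with plain channels. `Hub` and its methods are hypothetical names, and a real WebSocket gateway would wrap each subscriber channel in a connection writer.

```go
package main

import "sync"

// Hub fans one stream of updates out to many subscribers.
type Hub struct {
	mu   sync.RWMutex
	subs map[chan string]struct{}
}

// NewHub returns an empty hub.
func NewHub() *Hub { return &Hub{subs: make(map[chan string]struct{})} }

// Subscribe registers a subscriber with a bounded buffer.
func (h *Hub) Subscribe(buffer int) chan string {
	ch := make(chan string, buffer)
	h.mu.Lock()
	h.subs[ch] = struct{}{}
	h.mu.Unlock()
	return ch
}

// Unsubscribe removes a subscriber and closes its channel.
func (h *Hub) Unsubscribe(ch chan string) {
	h.mu.Lock()
	if _, ok := h.subs[ch]; ok {
		delete(h.subs, ch)
		close(ch)
	}
	h.mu.Unlock()
}

// Publish delivers update to every subscriber without blocking.
// It returns how many subscribers were skipped because their buffer was full.
func (h *Hub) Publish(update string) (dropped int) {
	h.mu.RLock()
	defer h.mu.RUnlock()
	for ch := range h.subs {
		select {
		case ch <- update:
		default:
			dropped++ // slow consumer: skip rather than stall everyone else
		}
	}
	return dropped
}
```

Dropping is a reasonable default for odds because the next update supersedes the missed one. The returned count is a natural metric to alert on.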
Layer 4: Edge Computing
This is how FairPlay serves 125M daily updates at low latency. Instead of centralizing all odds processing, we compute odds at the edge, in multiple geographic locations.
User in Singapore
↓
[Nearest FairPlay edge node: Singapore datacenter]
├─→ Gets odds from Asian bookmakers (low latency)
├─→ Gets odds from European bookmakers (via cache, updated frequently)
├─→ Computes best odds aggregation
└─→ Sends to user (low latency due to geography)
User in London
↓
[Nearest FairPlay edge node: London datacenter]
├─→ Gets odds from European bookmakers (low latency)
├─→ Gets odds from Asian bookmakers (via cache)
├─→ Computes best odds aggregation
└─→ Sends to user (low latency due to geography)
This cuts geographic latency from 150-250ms (when everything routes through one central location) to 50-100ms: 30-50ms within the local datacenter plus 20-50ms to the user.
Reliability Patterns for 99.99% Uptime
Real-time odds are only valuable if they're available. Here's how to build for 99.99% uptime (about 53 minutes of downtime per year).
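Availability targets are easier to reason about as a downtime budget. The helper below is a small sketch that converts a target into minutes per year; the function name is an assumption.

```go
package main

// DowntimeMinutesPerYear returns the downtime that an availability target
// allows in a 365-day year. availability is a fraction, e.g. 0.9999.
func DowntimeMinutesPerYear(availability float64) float64 {
	const minutesPerYear = 365 * 24 * 60 // 525,600
	return (1 - availability) * minutesPerYear
}
```

Running it for each SLA you plan to offer, and comparing the result with your real failover and deployment times, shows quickly whether a target is achievable.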
Redundancy Pattern 1: Multiple Sources
Never depend on a single bookmaker feed.
If Bet365's feed goes offline for 30 minutes, your system detects this automatically and switches to alternative sources. Users don't notice because they still get odds from 49 other bookmakers.
func (m *MarketHandler) selectOddsSources(market MarketUpdate) []OddsSource {
	var sources []OddsSource
	// Primary sources (fast, reliable)
	primarySources := []string{"bet365", "betfair", "fanduel"}
	for _, sourceID := range primarySources {
		if isHealthy(sourceID) {
			sources = append(sources, getSource(sourceID))
		}
	}
	// Secondary sources (slower but reliable)
	if len(sources) < 2 {
		for _, sourceID := range []string{"pointsbet", "draftkings", "caesars"} {
			if isHealthy(sourceID) {
				sources = append(sources, getSource(sourceID))
			}
		}
	}
	// Tertiary sources (very slow but last resort)
	if len(sources) < 1 {
		sources = append(sources, getAllHealthySources()...)
	}
	return sources
}
Redundancy Pattern 2: Geographic Distribution
Deploy in multiple data centers across continents.
If your primary data center goes offline, traffic automatically routes to secondary data centers. This requires:
- Data replication: Odds data is replicated across all data centers in real-time
- Database replication: Your events, markets, and configuration are replicated
- State synchronization: Any user-specific state (open bets, account balance) is synchronized
[Primary DC: US East]
↓ (replication)
[Secondary DC: US West]
↓ (replication)
[Tertiary DC: Europe]
↓ (replication)
[Quaternary DC: Asia-Pacific]
If Primary goes down:
- All clients automatically route to Secondary
- Failover time: <10 seconds
- Zero data loss, provided replication is synchronous (asynchronous replication can lose the last few in-flight updates)
Redundancy Pattern 3: Circuit Breaker
If a particular bookmaker's feed is consistently failing, stop trying to use it temporarily.
// CircuitBreaker is not safe for concurrent use as written;
// guard Call with a sync.Mutex if several goroutines share one breaker.
type CircuitBreaker struct {
	failures     int
	lastFailTime time.Time
	state        string // "closed", "open", "half-open"
	threshold    int
	timeout      time.Duration
}

func (cb *CircuitBreaker) Call(fn func() error) error {
	if cb.state == "open" {
		if time.Since(cb.lastFailTime) > cb.timeout {
			cb.state = "half-open" // Try again
		} else {
			return fmt.Errorf("circuit breaker open")
		}
	}
	err := fn()
	if err != nil {
		cb.failures++
		if cb.failures >= cb.threshold {
			// A failed half-open probe lands here too and reopens the breaker
			cb.state = "open"
			cb.lastFailTime = time.Now()
		}
		return err
	}
	cb.failures = 0 // Reset on success
	cb.state = "closed"
	return nil
}
Redundancy Pattern 4: Graceful Degradation
If your primary distribution channel (WebSocket) is overloaded, automatically switch clients to REST API polling.
This means users get slightly stale odds (5-10 seconds) but the system doesn't collapse.
Scaling to 125M Price Changes Per Day
To process 125M price changes daily, you need to think about scale at every layer.
Throughput Calculation
125M price changes per day = 1,450 updates per second (125,000,000 / 86,400 seconds).
But this isn't distributed evenly. During major events:
- Major soccer match: 500-1,000 updates per second
- Super Bowl: 5,000+ updates per second
- During peak hours (evenings in Europe): 10,000+ updates per second
Your infrastructure needs headroom for peaks of roughly 10x the average rate.
So an average of about 1,450 updates per second means planning for a peak of about 15,000 updates per second.
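The same arithmetic as a reusable sketch, so capacity targets can be recomputed when the daily volume or the headroom factor changes. Both function names are assumptions.

```go
package main

// AverageRate returns updates per second for a daily total.
func AverageRate(perDay float64) float64 {
	const secondsPerDay = 24 * 60 * 60 // 86,400
	return perDay / secondsPerDay
}

// PeakCapacity applies a headroom multiplier to the average rate.
func PeakCapacity(perDay, headroom float64) float64 {
	return AverageRate(perDay) * headroom
}
```

`AverageRate(125_000_000)` comes to a little under 1,450 per second, and a headroom of 10 puts the target just under 15,000.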
Architecture for 15K updates/sec
[Bookmaker feeds]
↓
[Kafka cluster with 10 partitions]
(Kafka can handle 100K+ msgs/sec easily)
↓
[Stream processors (Flink/Storm)]
(100 parallel processes)
(Each handles 150 updates/sec)
↓
[Aggregation layer]
(Store in Redis for cache, TimescaleDB for historical)
↓
[Distribution to clients]
(WebSocket gateway with load balancing)
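Partitioning only preserves ordering if every update for an event lands on the same partition. The sketch below shows the idea with an FNV-1a hash of the event ID. `PartitionFor` is a hypothetical helper; real Kafka clients ship their own partitioners, so this illustrates the principle rather than any client's exact algorithm.

```go
package main

import "hash/fnv"

// PartitionFor maps an event ID to one of n partitions (n must be > 0).
// The same event ID always maps to the same partition, so updates for one
// event are consumed in the order they were produced.
func PartitionFor(eventID string, n uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(eventID)) // Write on a hash.Hash never returns an error
	return h.Sum32() % n
}
```

One design point follows from this: within a single consumer group, the partition count caps how many consumers can read a topic in parallel, so size it against the number of stream processors you plan to run.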
Database Strategy
You need two databases:
1. Hot Database (Redis)
- Current odds for all active markets
- 100-500 MB total (all current odds for all events)
- Accessed: 15,000 times/sec (read/write)
- Latency: <10ms
- TTL: 1 hour (market closes, odds expire)
2. Cold Database (TimescaleDB or ClickHouse)
- Historical price changes for audit and analytics
- 1-5 TB per month (125M * 30 days)
- Accessed: <100 times/sec (queries from risk management)
- Latency: 100-500ms acceptable
- Retention: 7 years (compliance requirement)
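// Pseudo-code: redis.Client and timescaledb.Client stand in for your own client wrappers
func storeOddsUpdate(update OddsUpdate) error {
	// Write to hot cache (immediate); surface the error so the caller can retry
	hot := redis.Client()
	key := fmt.Sprintf("odds:%s:%s", update.EventID, update.MarketID)
	if err := hot.Set(key, update.JSON(), 1*time.Hour); err != nil {
		return fmt.Errorf("hot cache write failed: %w", err)
	}
	// Write to cold storage (async, eventual consistency)
	go func() {
		db := timescaledb.Client()
		if err := db.InsertOdds(update); err != nil {
			logAndAlert(err) // don't drop audit data silently
		}
	}()
	return nil
}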
Monitoring at Scale
With 15K updates per second, you need automated monitoring:
[Each update carries metadata]
↓
[Aggregated metrics every second]
↓
[Time-series database (Prometheus)]
↓
[Alerting rules]
├─→ If p99 latency > 500ms, page on-call
├─→ If update loss rate > 0.1%, page on-call
├─→ If Kafka lag > 1 second, page on-call
└─→ If any bookmaker feed offline for 5 min, page on-call
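The first alerting rule depends on computing p99 from a window of latency samples. A minimal sketch using the nearest-rank method follows. `Percentile` and `ShouldPage` are hypothetical helpers; in a Prometheus setup this is normally done with histograms and a query rather than application code.

```go
package main

import (
	"math"
	"sort"
)

// Percentile returns the p-th percentile (0 < p <= 100) of samples using the
// nearest-rank method. It returns 0 for an empty slice.
func Percentile(samples []float64, p float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...) // copy so the caller's slice is untouched
	sort.Float64s(sorted)
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// ShouldPage applies the "p99 latency > 500ms" rule to one window of samples.
func ShouldPage(latenciesMs []float64) bool {
	return Percentile(latenciesMs, 99) > 500
}
```

With only a handful of samples p99 is effectively the maximum, so evaluate the rule over windows large enough that a single slow update does not page anyone.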
Advanced Optimisation Techniques
Queue-Based Distribution
Instead of direct point-to-point connections, use message queues:
// Kafka-based distribution (sketch using the Sarama SyncProducer API)
type OddsKafkaPublisher struct {
	producer sarama.SyncProducer
	topic    string
}

func (kp *OddsKafkaPublisher) Publish(odds OddsUpdate) error {
	message := &sarama.ProducerMessage{
		Topic: kp.topic,
		Key:   sarama.StringEncoder(odds.EventID), // same event -> same partition, so order is kept
		Value: sarama.ByteEncoder(odds.ToJSON()),
	}
	partition, offset, err := kp.producer.SendMessage(message)
	if err != nil {
		return err
	}
	// Can now scale to 1000+ subscribers without connection overhead
	metrics.Track("odds_published", map[string]interface{}{
		"partition": partition,
		"offset":    offset,
		"latency":   time.Since(odds.Timestamp),
	})
	return nil
}
Kafka advantages:
- Decouple publishers from subscribers
- Replay messages for recovery
- Scale to many independent subscribers (each consumer group reads at its own pace)
- Per-partition ordering guarantee
- High throughput (1M+ messages/sec)
Bloom Filters for Market Detection
Detect which markets have changes without processing all odds:
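// Bloom filter tracks which markets changed
// (sketch assuming github.com/bits-and-blooms/bloom/v3)
type MarketChangeDetector struct {
	mu      sync.Mutex
	bloom   *bloom.BloomFilter
	changed int // markets recorded since the last reset
	reset   *time.Ticker
}

// DidChange can return a false positive but never a false negative.
func (mcd *MarketChangeDetector) DidChange(market string) bool {
	mcd.mu.Lock()
	defer mcd.mu.Unlock()
	return mcd.bloom.Test([]byte(market))
}

func (mcd *MarketChangeDetector) RecordChange(market string) {
	mcd.mu.Lock()
	defer mcd.mu.Unlock()
	// TestAndAdd reports whether the key was (probably) already present
	if !mcd.bloom.TestAndAdd([]byte(market)) {
		mcd.changed++
	}
}

// Reset every 1 second to detect new changes
func (mcd *MarketChangeDetector) ResetPeriodically() {
	for range mcd.reset.C {
		mcd.mu.Lock()
		// A Bloom filter cannot count its members exactly, so report the separate counter
		metrics.Gauge("markets_changed", float64(mcd.changed))
		mcd.changed = 0
		mcd.bloom.ClearAll()
		mcd.mu.Unlock()
	}
}
Benefit: O(1) lookup instead of scanning all odds. The trade-off is occasional false positives, so treat a hit as "probably changed" and confirm against the cache before doing expensive work.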
Content Delivery Network (CDN) Integration
Cache static odds on CDN edge:
[Origin: Aggregated Odds API]
↓
[CDN Cache Layer (Cloudflare, Akamai)]
├─→ US Region: <50ms to user
├─→ EU Region: <50ms to user
└─→ APAC Region: <50ms to user
CDN benefits:
- Geographic distribution (reduce latency)
- Automatic failover
- DDoS protection
- Bandwidth reduction
CDN challenges:
- Cache invalidation (odds update every 100ms)
- Consistency across regions
- Cost (CDN bandwidth is expensive at scale)
Solution: Use smart cache keys
Cache key: `odds:{sport}:{league}:{market}:{ttl}`
Where ttl is the cache time-to-live for the market's state:
- Pre-match: 30 second TTL (okay if slightly stale)
- In-play: 1 second TTL (must be fresh)
- Post-match: 300 second TTL (settled, don't change)
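The TTL tiers above translate directly into response headers at the origin. A minimal sketch, assuming a three-state `MarketState` type and a hypothetical `CacheControl` helper:

```go
package main

import (
	"fmt"
	"time"
)

// MarketState is the lifecycle stage of a market.
type MarketState int

const (
	PreMatch MarketState = iota
	InPlay
	PostMatch
)

// CacheTTL returns the TTL tier for each market state.
func CacheTTL(s MarketState) time.Duration {
	switch s {
	case InPlay:
		return 1 * time.Second // must be fresh
	case PostMatch:
		return 300 * time.Second // settled, no longer changes
	default:
		return 30 * time.Second // pre-match: slightly stale is acceptable
	}
}

// CacheControl renders the TTL as an HTTP Cache-Control header value.
func CacheControl(s MarketState) string {
	return fmt.Sprintf("public, max-age=%d", int(CacheTTL(s).Seconds()))
}
```

Setting the header per response lets the CDN expire entries on its own, which avoids issuing explicit purges at in-play update rates.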
Cost Optimisation Strategies
Real-time odds infrastructure is expensive. Here's how to optimise:
Strategy 1: Graduated Redundancy
Don't use 50 sources everywhere. Use different source counts by event tier:
Major event (e.g., Super Bowl):
- 50 sources (maximum redundancy)
- Cost: Premium
Major league regular season (EPL, NBA):
- 20 sources (good redundancy)
- Cost: Standard
Secondary leagues (second division):
- 5-10 sources (basic redundancy)
- Cost: Economy
Niche sports (esports, cricket):
- 2-3 sources (minimal redundancy)
- Cost: Minimal
Strategy 2: Tiered SLAs
Offer different SLA tiers at different costs:
Tier 1 (Premium): $50K/month
- 99.99% uptime SLA
- <200ms p99 latency
- 50+ source redundancy
- Dedicated support
Tier 2 (Standard): $20K/month
- 99.9% uptime SLA
- <500ms p99 latency
- 10-20 source redundancy
- Community support
Tier 3 (Basic): $5K/month
- 99% uptime SLA (about 3.7 days of downtime/year)
- <2 second p99 latency
- 3-5 source redundancy
- Email support only
This lets companies start cheap and upgrade as they grow.
Strategy 3: Compression
Compress odds updates to reduce bandwidth:
// Original: {"eventId": "event_123", "market": "winner", "decimal": 1.95}
// Size: 61 bytes
// Compressed:
// Event IDs: pre-shared dictionary
// Market IDs: pre-shared dictionary
// Decimal odds: float32 instead of string
// Result: 18 bytes on the wire (about 70% reduction)
type CompressedOddsUpdate struct {
	EventID   uint32  // 4 bytes
	MarketID  uint16  // 2 bytes
	Decimal   float32 // 4 bytes
	Timestamp int64   // 8 bytes
}
// Total: 18 bytes when packed field by field vs 61 bytes original
// (the in-memory Go struct is typically 24 bytes on 64-bit platforms because of alignment padding)
// Bandwidth savings: about 70% per update at scale = $500K+/year savings
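Getting to 18 bytes means writing the fields out one by one rather than dumping the struct from memory. The sketch below does that with `encoding/binary`. The big-endian field order, `wireSize`, `Marshal`, and `Unmarshal` are assumptions for illustration, not a published wire format.

```go
package main

import (
	"encoding/binary"
	"errors"
	"math"
)

// wireSize is the packed size: 4 + 2 + 4 + 8 bytes.
const wireSize = 18

// CompressedOddsUpdate mirrors the struct above.
type CompressedOddsUpdate struct {
	EventID   uint32
	MarketID  uint16
	Decimal   float32
	Timestamp int64
}

// Marshal packs the update into exactly wireSize bytes, big-endian.
func (u CompressedOddsUpdate) Marshal() []byte {
	buf := make([]byte, wireSize)
	binary.BigEndian.PutUint32(buf[0:4], u.EventID)
	binary.BigEndian.PutUint16(buf[4:6], u.MarketID)
	binary.BigEndian.PutUint32(buf[6:10], math.Float32bits(u.Decimal))
	binary.BigEndian.PutUint64(buf[10:18], uint64(u.Timestamp))
	return buf
}

// Unmarshal reverses Marshal and rejects payloads of the wrong length.
func Unmarshal(buf []byte) (CompressedOddsUpdate, error) {
	if len(buf) != wireSize {
		return CompressedOddsUpdate{}, errors.New("unexpected payload length")
	}
	return CompressedOddsUpdate{
		EventID:   binary.BigEndian.Uint32(buf[0:4]),
		MarketID:  binary.BigEndian.Uint16(buf[4:6]),
		Decimal:   math.Float32frombits(binary.BigEndian.Uint32(buf[6:10])),
		Timestamp: int64(binary.BigEndian.Uint64(buf[10:18])),
	}, nil
}
```

A fixed layout like this is compact but brittle: adding a field breaks old clients, so a version byte or a schema-based format is worth considering before rollout.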
Incident Response Playbook
When latency spikes, you need a playbook:
Incident: P99 latency > 1000ms
Steps:
1. (0s) Alert triggered, on-call engineer paged
2. (60s) Engineer checks dashboard
- Is it all events or specific events?
- Is it all providers or specific providers?
- Is it upstream (our providers) or downstream (our system)?
3. (120s) If upstream:
- Failover to alternative providers
- Monitor p99 latency
- If recovered, declare incident resolved
- If not recovered, call provider support
4. (120s) If downstream:
- Check CPU/memory usage
- Check database latency
- Restart service if necessary
- Check if there's a recent deployment that caused it
5. (300s) If not resolved:
- Escalate to manager
- Consider going into degraded mode (fewer updates, less data)
- Communicate with customers
6. (After recovery):
- Root cause analysis
- Preventive measures (code change, capacity increase, etc.)
- Publish postmortem
Moving Forward: Building Your Real-Time Odds Infrastructure
If you're evaluating your current real-time odds infrastructure, ask yourself:
- What's your p99 latency? If it's >500ms, there's room for improvement.
- How many source bookmakers do you use? If <5, you have single-point-of-failure risk.
- How many geographic data centers? If <2, one outage takes you offline.
- What's your uptime track record? Are you hitting 99.9% or falling short?
FairPlay's infrastructure answers all of these questions:
- p99 latency: <200ms
- Source bookmakers: 50+
- Data centers: 4 (NA, EU, APAC, South America)
- Uptime: 99.99%
- Daily scale: 125M price changes
This isn't a technology flex. It's the infrastructure necessary to operate a competitive sportsbook in 2026.
Cost-Benefit Analysis
Building real-time odds infrastructure internally:
| Component | Annual Cost | Difficulty |
|---|---|---|
| Engineering (10 people) | $2M | Very high |
| Infrastructure (4 data centers) | $1.5M | High |
| Monitoring and ops | $500K | Medium |
| Licensing and partnerships | $1M | Medium |
| Total | $5M+ | Requires expertise |
FairPlay partnership:
| Option | Annual Cost | ROI |
|---|---|---|
| Tier 1 ($50K/month) | $600K | 8x cheaper than building |
| Tier 2 ($20K/month) | $240K | 20x cheaper than building |
| Tier 3 ($5K/month) | $60K | 80x cheaper than building |
The math is clear: buy, don't build (unless you're already a $100M+ operator).
Next Steps
- Benchmark your current infrastructure: Measure p50, p95, p99 latency across different sports/regions
- Identify bottlenecks: Is it ingestion, processing, database, or distribution?
- Plan improvements: Can you improve with infrastructure changes, or do you need more data sources?
- ROI calculation: Compare cost of improving vs. partnering with FairPlay
- Consider hybrid: Start with partnership, build internal systems for unique needs
The winner in sports betting isn't the operator with the most sports. It's the operator with the fastest, most reliable odds.
Let's make sure you're building the right infrastructure.