Architecture Maps

Lichess Internals

Interactive architecture map of Lichess, the free, open-source chess server. It covers Lila (Scala/Play core), real-time WebSockets via lila-ws, MongoDB storage, Redis caching, Stockfish analysis via fishnet, Elasticsearch, the tournament engine, puzzle generation, OAuth/API, and CDN/asset serving.

Open Source (AGPL-3.0) · Since 2010 · Scala / Play Framework · ~5M Monthly Active Users · 150M+ Games Played
01

System Overview

Lichess is a free, open-source, community-driven chess server founded by Thibault Duplessis in 2010. It runs on a surprisingly modest infrastructure, serving millions of daily games with real-time play, analysis, tournaments, puzzles, and a comprehensive API — all without ads or premium tiers.

~200K Concurrent Users
5M+ Monthly Players
~10 Physical Servers
150B+ Moves Recorded
Architecture diagram (interactive in the original). Legend: Lila Core, Real-Time / WS, Data Storage, Chess Engine, API / Auth, Infrastructure, Search, Social / Community
02

Lila — The Scala Core

Lila (short for "li[chess in sca]la") is the monolithic backend application built with Scala and the Play Framework. It handles HTTP routing, game logic, user management, tournament scheduling, and coordinates all other services.

Module Architecture

Lila follows a modular monolith pattern. The codebase is organized into ~40 SBT modules, each encapsulating a domain — game, round, tournament, puzzle, user, forum, blog, etc. Modules communicate through Scala traits and an internal event bus, keeping concerns separated while avoiding the overhead of a microservice mesh.

Play Framework

Asynchronous HTTP server built on Pekko (previously Akka). Handles all web requests, template rendering (Scalatags, not Twirl), and reverse routing. Runs on the JVM with Netty as the underlying transport.

Core

Game Module

Central domain model. A Game object tracks board state, clock, players, status, opening, and variant. Move history is stored in a compact binary encoding (roughly 5 bits per move on average), and PGN is generated on demand for export.

Core

Round Module

Manages the lifecycle of an active game round: accepting moves, enforcing clock rules, detecting checkmate/stalemate/draw, handling takebacks, and triggering game-over events. One actor per active game.

Core

Scalatags Templates

Server-side HTML rendering uses Scalatags — a Scala DSL for type-safe HTML generation. No template engine; the HTML is pure Scala code, compiled and type-checked at build time.

Core

Event Bus

Internal pub/sub system for decoupling modules. When a game finishes, the event bus notifies the tournament module, the puzzle generator, the rating system, and the activity tracker — all without direct coupling.

Core
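
The decoupling described for the event bus can be sketched as a tiny in-process publish/subscribe registry. This is an illustration only: the `EventBus` class, the `finishGame` topic name, and the event fields are assumptions made for the example, not Lila's actual API.

```python
from collections import defaultdict
from typing import Any, Callable


class EventBus:
    """Minimal in-process pub/sub bus (illustrative, not Lila's implementation)."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, event: Any) -> None:
        # The publisher never references its subscribers directly.
        for handler in list(self._handlers[topic]):
            handler(event)


bus = EventBus()
log: list[str] = []

# Two independent "modules" react to the same event (hypothetical names).
bus.subscribe("finishGame", lambda g: log.append(f"tournament saw {g['id']}"))
bus.subscribe("finishGame", lambda g: log.append(f"ratings saw {g['id']}"))

bus.publish("finishGame", {"id": "abcd1234", "winner": "white"})
```

The round logic only publishes; adding another consumer (for example an activity tracker) is one more `subscribe` call and needs no change to the publisher.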

Scheduler / CRON

Background task scheduling for leaderboard recalculation, tournament pairing, cleanup of stale games, rating deflation adjustments, and periodic cache warming. Uses Pekko Scheduler internally.

Core
Design Philosophy

Lichess deliberately avoids microservices. Thibault has stated that a monolith is simpler to deploy, debug, and reason about. The ~10 supporting services (lila-ws, fishnet, search, etc.) exist only because they have genuinely different runtime requirements — not for organizational scaling.

03

Real-Time Layer — lila-ws

lila-ws is a standalone Scala service that manages all WebSocket connections. It was split out of the Lila monolith so that socket traffic runs in its own process, separate from the main application, handling 100K+ simultaneous WebSocket connections per server.

WebSocket Connection Flow
Browser (JS client) → nginx (TLS termination) → lila-ws (Scala / Netty) → Redis (pub/sub bridge) → Lila (Scala / Play)

Connection Management

Each WebSocket connection is backed by its own lightweight actor. lila-ws tracks which room (game, tournament, lobby) each connection belongs to and broadcasts state updates only to the relevant subscribers.

Real-Time

Redis Pub/Sub Bridge

Lila and lila-ws communicate through Redis pub/sub channels. When a player makes a move, Lila publishes to Redis; lila-ws picks it up and pushes to the opponent's WebSocket. Latency is typically under 5ms.

Real-Time

Room System

Connections are organized into rooms: game rooms (2 players + spectators), tournament rooms, lobby, TV channel, and user activity rooms. Each room type has specific message routing rules.

Real-Time
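
Room-based routing can be sketched as a registry that maps room ids to connection ids and fans a message out to the members of one room. The class name, the room id format, and the message fields here are hypothetical; the sketch queues messages in memory instead of writing to real sockets.

```python
from collections import defaultdict


class RoomRegistry:
    """Tracks room membership and fans messages out (in-memory sketch)."""

    def __init__(self):
        self._members = defaultdict(set)   # room id -> connection ids
        self.outbox = defaultdict(list)    # connection id -> queued messages

    def join(self, room, conn):
        self._members[room].add(conn)

    def leave(self, room, conn):
        self._members[room].discard(conn)
        if not self._members[room]:
            del self._members[room]        # drop empty rooms

    def broadcast(self, room, message, exclude=None):
        # Send to everyone in the room except the optional sender.
        recipients = self._members.get(room, set()) - {exclude}
        for conn in recipients:
            self.outbox[conn].append(message)
        return len(recipients)


registry = RoomRegistry()
for conn in ("white", "black", "spectator1"):
    registry.join("game:abcd1234", conn)

move = {"t": "move", "uci": "e2e4"}
delivered = registry.broadcast("game:abcd1234", move, exclude="white")
```

Different room types would plug their own routing rules into `broadcast`, for example sending crowd counts to spectators only.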

Client Protocol

JSON messages over WebSocket. The protocol covers move submission (in UCI coordinate form, e.g. e2e4), clock sync, chat messages, draw/takeback offers, and spectator crowd counts. A ping/pong keepalive runs every 3 seconds.

Real-Time
Impact of the Split

Moving WebSocket handling into lila-ws took every long-lived connection out of the main Lila process. Socket load and application logic now scale, deploy, and fail independently, and the two sides meet only through Redis.

04

Data Layer

Lichess uses MongoDB as its primary data store and Redis for caching, pub/sub, and rate limiting. The choice of MongoDB over a relational database reflects the document-oriented nature of chess games and the schema flexibility needed for rapid feature development.

MongoDB Collections

The MongoDB cluster stores all persistent data: games (with compact binary move encoding), users, puzzles, tournaments, teams, forums, studies, and more. Games are by far the largest collection, at several billion documents.

Collection | Purpose | Scale
game5 | All completed and in-progress games with binary-encoded moves | ~4B+ documents
user4 | User profiles, ratings (Glicko-2), perfs, preferences | ~15M+ documents
puzzle2 | Tactical puzzles with rating, themes, moves | ~4M+ documents
tournament2 | Tournament metadata, standings, pairings | ~2M+ documents
study | Analysis boards, shared studies, chapters | ~10M+ documents
team | Team membership, forums, leaders | ~500K+ documents

Redis Roles

Redis serves multiple roles: pub/sub bridge to lila-ws, session storage, rate limiting, live game clocks, lobby seek pool, round socket state, and short-lived caches for leaderboards and tournament standings.

Data

Binary Move Encoding

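Games store moves as a compact binary blob rather than PGN text, at roughly 5 bits per move on average. A fixed from-square plus to-square layout would already need 12 bits, so a figure this low depends on variable-length coding of each move relative to the legal moves in the position. This reduces storage by ~80% compared to plain SAN and allows the entire move history to fit in a single BSON field.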

Data
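
To make the idea of packing moves into bits concrete, here is a fixed-width baseline, not Lichess's actual format: 6 bits for the origin square, 6 for the destination, and 3 for an optional promotion piece, stored in two bytes per move. All function names are assumptions for the example. Getting below this size requires variable-length coding, which the sketch does not attempt.

```python
PROMOTIONS = {None: 0, "n": 1, "b": 2, "r": 3, "q": 4}
PIECES = {code: piece for piece, code in PROMOTIONS.items()}


def square_index(name):
    """'a1' -> 0 ... 'h8' -> 63."""
    return (int(name[1]) - 1) * 8 + (ord(name[0]) - ord("a"))


def square_name(index):
    return "abcdefgh"[index % 8] + str(index // 8 + 1)


def pack_move(uci):
    """Pack a UCI move such as 'e2e4' or 'a7a8q' into a 15-bit integer."""
    origin, dest = square_index(uci[0:2]), square_index(uci[2:4])
    promo = PROMOTIONS[uci[4] if len(uci) == 5 else None]
    return (origin << 9) | (dest << 3) | promo   # 6 + 6 + 3 bits


def unpack_move(value):
    origin, dest, promo = value >> 9, (value >> 3) & 0x3F, value & 0x7
    return square_name(origin) + square_name(dest) + (PIECES[promo] or "")


def encode_game(moves):
    """Concatenate packed moves into one blob, two bytes per move."""
    return b"".join(pack_move(m).to_bytes(2, "big") for m in moves)


def decode_game(blob):
    return [unpack_move(int.from_bytes(blob[i:i + 2], "big"))
            for i in range(0, len(blob), 2)]


moves = ["e2e4", "e7e5", "g1f3", "a7a8q"]
blob = encode_game(moves)
```

Even this naive layout stores a move in 2 bytes where the UCI text needs 4 or 5, and the whole history stays a single opaque binary value.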

Glicko-2 Ratings

Ratings use the Glicko-2 system with separate ratings per variant and time control. Rating calculations happen in Scala, stored per user in the perfs subdocument. Rating deviation decays for inactive players.

Data
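
The growth of rating deviation during inactivity follows from the Glicko-2 update step for a player who plays no games in a rating period: on the internal scale, the new deviation is the square root of the old deviation squared plus the volatility squared. The sketch below applies that step repeatedly. How Lichess defines a rating period and where it caps the deviation are not stated in the text, so `periods` and `max_rd` are assumptions.

```python
import math

GLICKO2_SCALE = 173.7178  # converts between rating points and the Glicko-2 scale


def rd_after_inactivity(rd, volatility, periods, max_rd=350.0):
    """Rating deviation after `periods` rating periods without games.

    Applies phi' = sqrt(phi^2 + sigma^2) once per period; the rating itself
    and the volatility stay unchanged. `max_rd` is an assumed ceiling.
    """
    phi = rd / GLICKO2_SCALE
    for _ in range(periods):
        phi = math.sqrt(phi * phi + volatility * volatility)
    return min(phi * GLICKO2_SCALE, max_rd)


# A settled player (RD 50, volatility 0.06) becomes less certain over time.
for n in (0, 10, 100):
    print(n, round(rd_after_inactivity(50.0, 0.06, n), 1))
```

A larger deviation makes the next few results move the rating faster, which is why returning players re-calibrate quickly.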
05

Stockfish Integration

Lichess integrates Stockfish — the strongest open-source chess engine — for server-side analysis, game review, and puzzle validation. Client-side analysis uses Stockfish compiled to WebAssembly (WASM) running in a Web Worker for zero server load.

Server-Side Analysis

Deep analysis for game reviews, opening explorer data, and puzzle generation runs on dedicated server hardware. Requests are queued and distributed across analysis workers via the fishnet protocol.

Engine

Client-Side WASM

The analysis board runs Stockfish.js (compiled via Emscripten) in a Web Worker. Users get real-time engine evaluation locally, with no server round-trip. Multi-threaded WASM builds rely on SharedArrayBuffer, which modern browsers support.

Engine

NNUE Evaluation

Modern Stockfish uses NNUE (efficiently updatable neural network) evaluation. The NNUE net file (~40MB) is bundled with the WASM build. It provides grandmaster-level evaluation at depth 20+ in seconds on consumer hardware.

Engine

UCI Protocol

Communication with Stockfish uses the Universal Chess Interface (UCI). Commands like "position fen ... moves ..." and "go depth 24" are sent; the engine returns "bestmove" and multi-PV evaluation lines.

Engine
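
As a rough sketch of the UCI exchange, the helpers below build a "position" command and pull the commonly used fields out of "info" and "bestmove" lines. The command and field names come from the UCI protocol; the helper names and the sample engine output are made up for the example, and the parser ignores fields it does not know.

```python
def position_command(fen=None, moves=()):
    """Build a UCI 'position' command; fen=None means the start position."""
    cmd = "position startpos" if fen is None else f"position fen {fen}"
    if moves:
        cmd += " moves " + " ".join(moves)
    return cmd


def parse_info(line):
    """Extract depth, multipv, score and pv from a UCI 'info' line."""
    tokens = line.split()
    if not tokens or tokens[0] != "info":
        return None
    out, i = {}, 1
    while i < len(tokens):
        token = tokens[i]
        if token in ("depth", "multipv", "nodes") and i + 1 < len(tokens):
            out[token] = int(tokens[i + 1])
            i += 2
        elif token == "score" and i + 2 < len(tokens):
            # ("cp", centipawns) or ("mate", moves to mate)
            out["score"] = (tokens[i + 1], int(tokens[i + 2]))
            i += 3
        elif token == "pv":
            out["pv"] = tokens[i + 1:]   # the rest of the line is the variation
            break
        else:
            i += 1
    return out


def parse_bestmove(line):
    tokens = line.split()
    return tokens[1] if len(tokens) >= 2 and tokens[0] == "bestmove" else None


sample = ("info depth 24 seldepth 32 multipv 1 score cp 31 "
          "nodes 1234567 nps 900000 time 1371 pv e2e4 e7e5 g1f3")
info = parse_info(sample)
```

With multi-PV enabled the engine emits one such line per candidate move, distinguished by the `multipv` field.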
06

Fishnet — Distributed Analysis

Fishnet is Lichess's distributed volunteer computing network. Community members donate CPU time to run Stockfish analysis on their machines, providing free deep analysis for every game played on the platform.

Fishnet Architecture
Lila (analysis request) → Fishnet API (work queue) → Volunteer Client (Rust + Stockfish) → Fishnet API (result) → Lila (store analysis)

Client Architecture

The fishnet client is a Rust binary that polls the fishnet API for work, runs Stockfish locally, and reports results. Clients authenticate with API keys and can be configured for analysis depth, move generation, or both.

Engine

Work Types

Two main work types: analysis (deep evaluation of every position in a game, ~30 seconds per game) and move (finding the best move for bot accounts, ~1 second). Analysis has lower priority but is the bulk of work.

Engine
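
The priority relationship between the two work types can be sketched with a heap: "move" jobs always leave the queue before "analysis" jobs, and jobs of the same type leave in submission order. The class and method names are hypothetical and say nothing about how the real fishnet queue is stored.

```python
import heapq
import itertools

PRIORITY = {"move": 0, "analysis": 1}   # lower number = served first


class WorkQueue:
    """Priority queue of fishnet-style jobs (illustrative sketch)."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order per type

    def submit(self, kind, job_id):
        heapq.heappush(self._heap, (PRIORITY[kind], next(self._order), kind, job_id))

    def acquire(self):
        """Return the next (kind, job_id) for a polling client, or None."""
        if not self._heap:
            return None
        _, _, kind, job_id = heapq.heappop(self._heap)
        return kind, job_id


queue = WorkQueue()
queue.submit("analysis", "game-1")
queue.submit("analysis", "game-2")
queue.submit("move", "bot-move-1")

served = [queue.acquire() for _ in range(4)]
```

The latency-sensitive move request jumps ahead of the two queued analyses, which matches the roughly one second versus thirty seconds split described above.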

Volunteer Network

Hundreds of volunteers worldwide contribute computing power. The system dynamically distributes work based on client capability (thread count, hash size) and current queue depth. No centralized cluster needed.

Engine
Crowd-Sourced Infrastructure

Fishnet is one of the most elegant parts of Lichess's architecture. Instead of paying for hundreds of analysis servers, the community donates CPU cycles. This aligns perfectly with Lichess's non-profit, community-first philosophy and provides analysis capacity that would cost tens of thousands of dollars per month commercially.

07

Tournament System

Lichess runs multiple tournament formats simultaneously: Arena (continuous Swiss-like), Swiss, Simul (simultaneous exhibition), and custom team battles. The tournament scheduler creates Arena events around the clock automatically.

Arena Tournaments

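The flagship format. Players join and leave freely during the tournament window. Pairing uses a Swiss-like algorithm. Scoring rewards risk: a player can "berserk" (halve their clock time in exchange for one extra point on a win), and two consecutive wins start a streak during which points are doubled. The leaderboard updates in real time via WebSocket.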

Core
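
Arena scoring can be worked through in a few lines. The sketch assumes the standard Arena rules: 2 points for a win, 1 for a draw, 0 for a loss; after two consecutive wins the following games count double until a game is not won; a berserked win earns one extra point. Special cases such as minimum move counts or restrictions on short draws are left out, and the function name is an assumption.

```python
BASE_POINTS = {"win": 2, "draw": 1, "loss": 0}


def arena_scores(results):
    """Score a sequence of (outcome, berserked) games for one player."""
    scores = []
    consecutive_wins = 0
    for outcome, berserked in results:
        on_streak = consecutive_wins >= 2
        points = BASE_POINTS[outcome] * (2 if on_streak else 1)
        if outcome == "win" and berserked:
            points += 1                      # berserk bonus applies to wins only
        scores.append(points)
        consecutive_wins = consecutive_wins + 1 if outcome == "win" else 0
    return scores


games = [
    ("win", False),    # 2
    ("win", False),    # 2, streak starts after this game
    ("win", False),    # 4, doubled
    ("draw", False),   # 2, doubled draw, then the streak ends
    ("win", True),     # 3, normal win plus berserk bonus
    ("loss", True),    # 0, berserking a loss gains nothing
]
per_game = arena_scores(games)
total = sum(per_game)
```

The streak state is the only thing carried between games, which is what lets the leaderboard be updated incrementally as each game finishes.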

Swiss Tournaments

Traditional Swiss-system events with fixed rounds. Pairing uses the Dutch FIDE system. Players must be present for each round. Used for serious OTB-style competitions and titled events.

Core

Pairing Engine

Arena pairing considers rating, recent opponents, color history, and wait time. The algorithm runs every few seconds, matching available players instantly. Swiss pairing is computed per round using a dedicated solver.

Core

Automatic Scheduling

A CRON-like scheduler creates hourly, daily, and weekly Arena events across all time controls and variants. The Lichess calendar is always populated with upcoming events, ensuring players always find a tournament to join.

Core
08

Puzzle Generation

Lichess hosts over 4 million tactical puzzles, all generated from real games. The pipeline identifies critical moments where one side has a forced winning sequence, validates with Stockfish, and assigns difficulty ratings using Glicko-2.

Puzzle Generation Pipeline
Game Archive (MongoDB) → Position Scanner (find imbalances) → Stockfish Validation (verify forced win) → Theme Tagger (fork, pin, etc.) → puzzle2 (MongoDB)

Candidate Selection

The scanner looks for positions where Stockfish evaluation swings dramatically (e.g., from +0.3 to +3.0) — indicating a critical tactical moment. Only positions with a single clear best line survive filtering.

Engine
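
The two filters described here, a sharp evaluation swing and a single clear best line, can be sketched as follows. The thresholds (`min_swing`, `max_before`, `margin`) and the function names are assumptions chosen to match the +0.3 to +3.0 example, not the real generator's parameters.

```python
def find_candidates(evals_cp, min_swing=250, max_before=100):
    """Return ply indices where a roughly balanced position swings sharply.

    evals_cp[i] is the engine evaluation in centipawns (White's point of
    view) after ply i.
    """
    candidates = []
    for i in range(1, len(evals_cp)):
        before, after = evals_cp[i - 1], evals_cp[i]
        if abs(before) <= max_before and abs(after - before) >= min_swing:
            candidates.append(i)
    return candidates


def has_single_best_line(multipv_scores, margin=150):
    """True if the best line beats the second-best by at least `margin` cp.

    multipv_scores are sorted best-first from the side to move's view.
    """
    if len(multipv_scores) < 2:
        return True
    return multipv_scores[0] - multipv_scores[1] >= margin


evals = [20, 30, 300, 310, 290]          # +0.3 -> +3.0 between ply 1 and ply 2
swings = find_candidates(evals)
unique = has_single_best_line([310, 40, 10])
```

Positions that pass the swing test but offer two comparably good continuations are dropped, since a puzzle needs one unambiguous solution.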

Theme Classification

Puzzles are automatically tagged with tactical themes: fork, pin, skewer, discovered attack, deflection, back-rank mate, etc. The classifier analyzes the solution moves to identify the underlying tactic. Users can train specific themes.

Core

Puzzle Rating

Each puzzle has its own Glicko-2 rating, adjusted as users solve or fail it. New puzzles start at a provisional rating based on the game's player ratings and calibrate quickly through user attempts.

Core
09

API & OAuth

Lichess provides one of the most comprehensive free chess APIs available. It uses OAuth 2.0 for authentication, supports both JSON and NDJSON streaming responses, and serves over 1 billion API requests per month.

REST API

Comprehensive endpoints for games, users, tournaments, puzzles, teams, studies, broadcasts, and more. Most endpoints work without authentication; write operations and private data require OAuth tokens.

API

Streaming API

NDJSON streaming endpoints for real-time data: game events, board state changes, tournament updates. The Board API allows external applications to play games on Lichess — used by physical boards, bots, and custom UIs.

API
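
Consuming an NDJSON stream amounts to decoding one JSON document per line and skipping empty lines, which streaming servers commonly send as keep-alives. The helper below works on any iterable of byte lines, so it can be fed from an HTTP client's line iterator. The sample payloads are illustrative and are not copied from the Lichess API reference.

```python
import json
from typing import Iterable, Iterator


def iter_ndjson(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield one decoded object per non-empty line of an NDJSON stream."""
    for raw in lines:
        line = raw.strip()
        if not line:          # blank line: treat as keep-alive and skip it
            continue
        yield json.loads(line)


# Simulated stream chunks (hypothetical event shapes).
stream = [
    b'{"type": "gameStart", "game": {"id": "abcd1234"}}\n',
    b"\n",
    b'{"type": "gameState", "moves": "e2e4 e7e5"}\n',
]
events = list(iter_ndjson(stream))
```

Because each line is a complete document, a client can act on every event as it arrives instead of waiting for the response to end.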

OAuth 2.0 (PKCE)

Standard OAuth 2.0 with PKCE flow for third-party apps. Scopes include game play, preferences, puzzle activity, team management, and engine analysis. Personal API tokens available for server-to-server use.

API
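
The PKCE part of the flow is defined by RFC 7636: the client keeps a random code verifier secret and sends the base64url-encoded SHA-256 hash of it as the code challenge, with `code_challenge_method=S256`. The sketch below generates such a pair; the function names are assumptions, and the authorization and token requests themselves are not shown.

```python
import base64
import hashlib
import secrets


def challenge_for(verifier: str) -> str:
    """S256 code challenge: base64url(sha256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) for one authorization attempt."""
    # 64 random bytes give an 86-character verifier, inside the 43..128 range.
    verifier = secrets.token_urlsafe(64)
    return verifier, challenge_for(verifier)


verifier, challenge = make_pkce_pair()
```

The challenge goes into the authorization request and the verifier into the later token request, so an intercepted authorization code is useless without the verifier.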

Bot API

Dedicated endpoints for chess bots. Bot accounts use the Bot API, which mirrors the Board API: they receive game challenges, stream board positions, and submit moves. Used by hundreds of community bots running various engines.

API

Rate Limiting

Redis-backed rate limiting per IP and per user. Different limits for authenticated vs. anonymous requests. Streaming endpoints have separate connection limits. Abuse detection feeds into anti-cheat and ToS enforcement.

API
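
Per-key rate limiting is often implemented as a token bucket. The sketch below keeps the buckets in a local dictionary to stay self-contained, whereas the text describes Redis-backed state; the class name, the key format, and the limits are assumptions. The clock is passed in explicitly so the behaviour is easy to follow.

```python
class TokenBucketLimiter:
    """Token-bucket limiter keyed by IP or user id (in-memory sketch)."""

    def __init__(self, capacity: float, refill_per_sec: float) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._state: dict[str, tuple[float, float]] = {}  # key -> (tokens, last_seen)

    def allow(self, key: str, now: float, cost: float = 1.0) -> bool:
        tokens, last = self._state.get(key, (self.capacity, now))
        # Refill in proportion to elapsed time, never above capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        allowed = tokens >= cost
        if allowed:
            tokens -= cost
        self._state[key] = (tokens, now)
        return allowed


limiter = TokenBucketLimiter(capacity=2, refill_per_sec=1.0)
burst = [limiter.allow("ip:203.0.113.7", now=0.0) for _ in range(3)]
later = limiter.allow("ip:203.0.113.7", now=1.0)
```

Separate limiter instances with different capacities would cover the authenticated versus anonymous split, and `cost` lets expensive endpoints consume more than one token per call.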

Opening Explorer

The opening explorer API aggregates move statistics from millions of games (masters, Lichess, player-specific). Data is precomputed and served from dedicated infrastructure. Supports standard, Antichess, and other variants.

API
10

Infrastructure & Deployment

Lichess runs on a remarkably small fleet of bare-metal servers (not cloud), hosted at OVH in France. The entire operation is run by a tiny team with minimal infrastructure overhead, demonstrating that massive scale doesn't require massive infrastructure.

Bare Metal Servers

No cloud provider. Lichess rents dedicated physical servers from OVH, giving full control over hardware and avoiding cloud markup. This is a deliberate cost-saving choice — cloud would cost 10-50x more at this scale.

Infra

nginx + Reverse Proxy

nginx handles TLS termination, static asset serving, WebSocket upgrades, and reverse proxying to Lila and lila-ws. The configuration manages routing between HTTP and WebSocket traffic on the same domain.

Infra

CDN / Asset Serving

Static assets (JS bundles, CSS, piece SVGs, board themes, sounds) are served from a CDN. The frontend is built with TypeScript (compiled with esbuild) and uses SCSS. Chess piece rendering uses inline SVG for crisp scaling.

Infra

Elasticsearch

Powers full-text search across games (by player, opening, date range), forum posts, studies, and user profiles. The lila-search service bridges Lila to Elasticsearch with game indexing and query translation.

Search

Monitoring

Prometheus metrics + Grafana dashboards track server health, game throughput, WebSocket connection counts, MongoDB query performance, and fishnet work queue depth. Critical for maintaining the lean infrastructure.

Infra

Deployment

Lila deploys via SBT build to a fat JAR, then rsync to production servers. No Kubernetes, no Docker in production. The deploy script compiles Scala, builds TypeScript, and restarts the JVM process. Downtime is typically under 30 seconds.

Infra
Anti-Cheat System

Lichess runs sophisticated cheat detection that compares player moves against Stockfish's top choices, analyzing move accuracy patterns, centipawn loss distributions, timing anomalies, and behavioral signals. Flagged accounts are reviewed by a moderation team. The system catches thousands of cheaters weekly while maintaining a low false-positive rate.
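
Two of the signals mentioned, centipawn loss and agreement with the engine's top choice, reduce to simple arithmetic once each move has been evaluated. The sketch below is only an illustration of those two statistics under assumed inputs; it is not Lichess's detection method, and the function name and data layout are hypothetical.

```python
from statistics import mean


def accuracy_signals(moves):
    """Summarise move quality for one player in one game.

    moves: list of (best_cp, played_cp, is_top_choice), with evaluations in
    centipawns from the mover's point of view.
    """
    if not moves:
        return {"acpl": 0.0, "top_choice_rate": 0.0}
    # Loss per move is how much worse the played move is than the best one.
    losses = [max(0, best - played) for best, played, _ in moves]
    top = sum(1 for _, _, is_top in moves if is_top)
    return {"acpl": mean(losses), "top_choice_rate": top / len(moves)}


sample = [(30, 30, True), (45, 10, False), (120, 120, True), (80, -20, False)]
signals = accuracy_signals(sample)
```

Numbers like these are only inputs to a review; on their own they cannot separate a strong human from an assisted one, which is why flagged accounts go to moderators.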

Service Map

Summary of all services and their technology stack:

Service | Language | Role
lila | Scala (Play Framework) | Core application server: HTTP, game logic, all features
lila-ws | Scala (Netty) | WebSocket server: real-time game play, chat, spectating
fishnet | Rust + Stockfish (C++) | Distributed analysis client: volunteer computing
lila-search | Scala | Elasticsearch bridge: game and forum search
lila-fishnet | Scala | Fishnet API server: work queue management
lila-gif | Rust | GIF/image rendering for game thumbnails and sharing
scalachess | Scala | Chess rules library: move validation, all variants
chessground | TypeScript | Interactive chessboard UI component
pgn-viewer | TypeScript | Embeddable PGN viewer for blogs and forums
lila-tablebase | Rust | Syzygy tablebase probing for endgame positions
