Interactive architecture map of Bluesky and the AT Protocol (Authenticated Transfer Protocol) — a federated social networking platform built on decentralized identity, portable data repositories, open algorithms, and composable moderation. The protocol puts users in control of their identity and social graph.
Bluesky is a decentralized social network built on the AT Protocol (atproto). Unlike traditional platforms, the architecture separates identity, data hosting, indexing, feed curation, and moderation into independent, replaceable services. Users own their data and identity, and can migrate between providers without losing their social graph.
AT Protocol decouples identity from any single server. Users have a persistent DID (Decentralized Identifier) that survives server migrations, and a human-readable handle (domain name) for discovery. This two-layer identity system is the foundation of data portability.
The primary DID method for Bluesky. A "placeholder" method designed for the transition to full decentralization. DIDs are short (e.g., did:plc:abc123), managed by a central PLC directory server that logs signed rotation operations. Users can rotate their signing keys and update their PDS endpoint without changing their DID.
An alternative DID method that uses DNS and HTTPS. The DID document is hosted at a well-known URL on the user's domain. Useful for organizations and self-hosters who want full control of their identity without relying on the PLC directory. Less resilient to domain loss.
Handles are domain names (e.g., alice.bsky.social or alice.com). They resolve to DIDs via DNS TXT records (_atproto.<handle>) or HTTPS well-known endpoints. Handles are mutable cosmetic labels — the DID is the true persistent identity. Setting a custom domain as a handle doubles as verification of domain ownership.
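The two resolution paths above can be sketched as follows. The DNS record name, well-known path, and bidirectional check against the DID document's alsoKnownAs field follow the protocol's handle-resolution scheme; the functions themselves are illustrative, not a full resolver.

```python
def handle_resolution_targets(handle: str) -> dict:
    """Return the two places a client looks to resolve a handle to a DID."""
    return {
        # DNS path: a TXT record named _atproto.<handle> containing "did=<did>"
        "dns_txt_name": f"_atproto.{handle}",
        # HTTPS fallback: the DID served as plain text at this well-known path
        "https_url": f"https://{handle}/.well-known/atproto-did",
    }

def verify_bidirectional(claimed_handle: str, did_doc_aka: list[str]) -> bool:
    """A handle is only valid if the DID document claims it back:
    the alsoKnownAs field must list the matching at://<handle> URI."""
    return f"at://{claimed_handle}" in did_doc_aka
```

The bidirectional check matters: anyone can publish a TXT record pointing at someone else's DID, so clients must confirm the DID document agrees before trusting the handle.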
A centralized audit log for did:plc operations. Stores the history of all DID document updates (key rotations, PDS migrations, handle changes). Designed to eventually be replaced by or mirrored to a more decentralized system. Critical infrastructure for the network.
Traditional social platforms tie your identity to your account on their server. If you leave, you lose your identity, followers, and content. With AT Protocol, your DID is independent of any server. You can migrate your entire account — posts, follows, identity — to a different PDS, and your followers never need to update anything. The DID stays the same; only the PDS endpoint in the DID document changes.
The PDS is a user's home server — it hosts their data repository, manages authentication, and serves as the origin point for all their content. Users can self-host a PDS or use a hosting provider. Bluesky Social PBC operates the largest PDS cluster (bsky.social), but the protocol is designed for any number of independent PDS instances.
Each PDS stores one or more user repositories. A repository is a signed, content-addressed Merkle Search Tree (MST) containing all of a user's records (posts, likes, follows, profile). The PDS serves the repo over XRPC and syncs changes to the network via the firehose.
The PDS handles user auth via OAuth 2.0 (DPoP-bound tokens). It supports the AT Protocol OAuth profile with PKCE, pushed authorization requests, and URL-based client metadata documents for client identification. Session tokens are short-lived JWTs; refresh tokens enable persistent sessions.
Every PDS emits a real-time event stream (WebSocket) of repository commits. Each event contains the repo DID, the commit operation (create/update/delete), the affected records, and a signed commit object. Relays subscribe to these streams to aggregate the network.
Media files (images, videos) are stored as blobs alongside the repository. Blobs are content-addressed by CID (Content Identifier). The PDS serves blobs via HTTP and tracks which records reference which blobs. Blob lifecycle is tied to the records that reference them.
The PDS exposes AT Protocol APIs via XRPC (Cross-Server RPC) — essentially HTTP endpoints defined by Lexicon schemas. Handles both authenticated user operations (creating posts, following) and unauthenticated reads (repo sync, public profile). Endpoints follow the com.atproto.* and app.bsky.* namespaces.
Users can migrate between PDS instances without losing data or followers. The migration process: export the signed repo from the old PDS, import to the new PDS, update the DID document to point to the new PDS. Since the repo is signed by the user's key (not the server), any PDS can verify and host it.
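The migration sequence above can be laid out as an ordered plan. The endpoint names are the actual com.atproto.* sync/server/repo methods, but all network I/O, authentication, and error handling are elided; this sketch only captures the order of operations.

```python
def migration_plan(did: str, old_pds: str, new_pds: str) -> list[str]:
    """Ordered steps to move an account between PDS hosts (sketch only)."""
    return [
        # 1. Export the full signed repo as a CAR file from the old host
        f"GET {old_pds}/xrpc/com.atproto.sync.getRepo?did={did}",
        # 2. Create the account on the new host and import the CAR
        f"POST {new_pds}/xrpc/com.atproto.server.createAccount",
        f"POST {new_pds}/xrpc/com.atproto.repo.importRepo",
        # 3. Update the DID document's service endpoint to the new PDS
        f"UPDATE did-document: pds-endpoint -> {new_pds}",
        # 4. Activate the account on the new host, deactivate on the old
        f"POST {new_pds}/xrpc/com.atproto.server.activateAccount",
        f"POST {old_pds}/xrpc/com.atproto.server.deactivateAccount",
    ]
```

Because the repo is signed by the user's key, step 2 lets the new PDS verify every record it imports without trusting the old host.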
The official PDS distribution is a TypeScript/Node.js application with SQLite for metadata and local disk for blob storage. It is designed to run on minimal hardware — a single VPS can comfortably host hundreds of accounts. Self-hosters point their domain at the PDS and configure DID resolution, then connect to the network by having a Relay subscribe to their firehose.
Every user on AT Protocol has a data repository — a signed, content-addressed data structure containing all their records. The repo uses a Merkle Search Tree (MST) for efficient verification and sync, and CBOR/CID-based encoding for compact, deterministic serialization.
The MST is a B-tree-like structure where each node's position is determined by the leading zeros of the SHA-256 hash of its key. This creates a deterministic, balanced tree that enables efficient diff-based sync: two repos can exchange only the tree nodes that differ, rather than the entire dataset. Each commit object contains the root CID of the MST, signed by the user's key.
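The key-depth rule can be made concrete. The reference implementation counts leading zeros of the SHA-256 hash in 2-bit chunks (the tree has fanout 4), which is what this sketch reproduces; most keys land at layer 0, and each additional layer is roughly 4x rarer.

```python
import hashlib

def mst_layer(key: bytes) -> int:
    """Depth of a key in the MST: the number of leading 2-bit zero
    chunks in the SHA-256 hash of the key (fanout-4 tree)."""
    digest = hashlib.sha256(key).digest()
    zeros = 0
    for byte in digest:
        # count how many of this byte's four 2-bit chunks are zero
        if byte < 64:
            zeros += 1
        if byte < 16:
            zeros += 1
        if byte < 4:
            zeros += 1
        if byte == 0:
            zeros += 1
        else:
            break  # a non-zero byte ends the run of leading zeros
    return zeros
```

Because the layer depends only on the key, any two implementations build bit-identical trees from the same records, which is what makes diff-based sync and signature verification possible.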
Records are typed by Lexicon NSID (e.g., app.bsky.feed.post, app.bsky.graph.follow). Each record is a CBOR-encoded object stored at a path like collection/rkey. The collection is the Lexicon ID; the rkey is a TID (timestamp-based ID) or self-describing key.
Every change to a repo produces a new commit object containing the repo DID, a revision identifier (rev), the new MST root CID, the previous commit CID, and a signature from the user's signing key. This creates an auditable, tamper-evident chain of all changes.
Records use DAG-CBOR, a deterministic subset of CBOR (Concise Binary Object Representation), for serialization. Every object is addressable by its CID (Content Identifier) — a hash of its CBOR bytes. This content-addressing enables deduplication, integrity verification, and efficient sync across the network.
Repositories can be exported as CAR (Content Addressable aRchive) files — a standard IPLD format that bundles all blocks (MST nodes, records, commit objects) into a single binary file. This is the format used for account migration and full repo backup. Any conforming implementation can read and verify a CAR export.
The Relay (also called Big Graph Service or BGS) is the network aggregation layer. It subscribes to the firehose of every known PDS, validates and merges the event streams, and re-broadcasts a unified firehose that downstream consumers (App Views, Feed Generators, Labelers) can subscribe to. It is the backbone of AT Protocol's federated data flow.
The Relay maintains WebSocket connections to every known PDS in the network. It receives commit events, validates signatures and repo integrity, deduplicates, and merges them into a single ordered firehose. At scale, this stream carries tens of thousands of events per second.
The Relay discovers new PDS instances through multiple channels: manual registration, DID document resolution, and crawling references in the data. When a new PDS is found, the Relay subscribes to its firehose and begins backfilling historical data by syncing full repos.
The Relay validates that incoming commits are properly signed by the repo owner's key (resolved via the DID document). It verifies MST integrity, checks sequence numbers for gaps, and rejects malformed or unauthorized data. This prevents PDS operators from forging content on behalf of their users.
Downstream consumers connect to the Relay's firehose with a cursor (sequence number). If disconnected, they can resume from their last cursor without missing events. The Relay buffers a window of recent events to support this catch-up mechanism, enabling reliable at-least-once delivery.
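The cursor/window mechanism can be sketched with a bounded in-memory buffer. Class and method names here are illustrative; a real relay persists a much larger window, and a consumer whose cursor has aged out must fall back to re-syncing full repos.

```python
from collections import deque

class FirehoseBuffer:
    """Relay-side replay window: recent events keyed by sequence number."""

    def __init__(self, window: int):
        self.events = deque(maxlen=window)  # oldest events fall off the back
        self.seq = 0

    def publish(self, payload: str) -> int:
        self.seq += 1
        self.events.append((self.seq, payload))
        return self.seq

    def replay(self, cursor: int) -> list[tuple[int, str]]:
        """Return every buffered event after `cursor`, or fail if the
        cursor predates the window (consumer must do a full re-sync)."""
        if self.events and cursor < self.events[0][0] - 1:
            raise ValueError("cursor too old: re-sync required")
        return [e for e in self.events if e[0] > cursor]
```

Delivery is at-least-once: a consumer that crashes after processing event N but before persisting cursor N will see N again on reconnect, so downstream indexing must be idempotent.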
AT Protocol uses a "big world" federation model, unlike ActivityPub's server-to-server approach. Instead of every server talking to every other server, PDS instances push data to Relays, and consumers pull from Relays. This hub-and-spoke pattern reduces the N-squared connection problem and enables global-scale indexing. Multiple independent Relays can coexist, each aggregating the full network or a subset.
The App View is the indexing and API layer that transforms raw repository data into the rich, queryable APIs that client applications consume. It subscribes to the Relay firehose, builds materialized views (timelines, thread trees, notification lists, search indices), and serves the app.bsky.* Lexicon endpoints.
The App View subscribes to the Relay's unified firehose and processes every event in real time. It parses records by Lexicon type, resolves references (reply parents, quote posts, embeds), hydrates user profiles, and updates its materialized views. Handles the full app.bsky.* record namespace.
Builds per-user timelines by indexing follow relationships and chronologically ordering posts from followed accounts. The "Following" feed is computed server-side by the App View. Custom algorithmic feeds are delegated to external Feed Generators via the Feed Generator protocol.
Resolves reply trees into threaded conversations. Posts reference their parent and root via AT URIs (at://did/collection/rkey). The App View materializes these references into navigable thread structures, handling deleted posts, blocked users, and deeply nested replies.
Full-text search for posts and user profiles, trending topic detection, and suggested follows. The search infrastructure indexes record content, user metadata, and social signals. Powers the Explore/Discover features in client apps.
Generates notifications by indexing events that reference a user: likes on their posts, replies, follows, mentions, quote posts, and reposts. Notifications are computed from the firehose events and served via the app.bsky.notification.* endpoints.
The App View integrates labels from Labeler services into API responses. When serving content, it attaches relevant labels (content warnings, flags, categories) so clients can implement their own moderation display logic based on user preferences.
Feed Generators are one of AT Protocol's most distinctive features: algorithmic feeds are external services that anyone can build and publish. Instead of a single company controlling what you see, users choose from an open marketplace of feed algorithms — from simple chronological filters to sophisticated ML-powered recommendation engines.
A Feed Generator is an XRPC service that implements the app.bsky.feed.getFeedSkeleton endpoint. It receives a request with the user's DID and a cursor, and returns an ordered list of post AT URIs (the "skeleton"). The App View then hydrates these URIs into full post objects with profiles, embeds, and labels.
Feed Generators typically subscribe to the Relay firehose to build their own index of posts. They filter and score content based on their algorithm (topic, language, engagement, ML signals) and store a ranked index. When a skeleton is requested, they query this index and return matching post URIs.
Feed Generators publish a generator record (app.bsky.feed.generator) in their creator's repository with metadata: display name, description, avatar, and the service endpoint URL. Users "pin" feeds to their sidebar by saving references to these generator records. The App View resolves the endpoint and proxies skeleton requests.
Anyone can build and host a Feed Generator. Popular examples include topic feeds (science, art, sports), language-specific feeds, "mutuals only" feeds, and community-curated feeds. This separation of content ranking from content hosting is a core architectural principle of AT Protocol.
AT Protocol's moderation system is composable and multi-layered. Rather than a single moderation authority, the protocol defines a labeling system where independent Labeler services can flag content, and users choose which labelers to subscribe to. This enables community-driven, transparent, and customizable moderation.
Labelers are independent services that subscribe to the firehose, analyze content, and emit labels (e.g., "nudity", "spam", "misleading"). Labels are signed by the labeler's DID and published via the com.atproto.label.* endpoints. Users subscribe to labelers they trust, and clients apply labels according to user preferences.
Ozone is the open-source moderation dashboard built by Bluesky. It provides a web interface for reviewing reports, applying labels, managing appeals, and coordinating moderation teams. Any labeler operator can run an Ozone instance. It connects to the labeler backend and the firehose for real-time review queues.
Users can subscribe to multiple labelers simultaneously. Labels from different sources stack: Bluesky's official labeler, community-run topical labelers, and personal block/mute lists all combine. Clients resolve conflicts using priority rules and user preferences (hide, warn, or show).
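The stacking rule above can be sketched as "strongest preference wins" across trusted labelers. The severity ordering and preference names are illustrative; real clients have richer semantics (adult-content gates, self-labels, hard-coded takedowns).

```python
SEVERITY = {"show": 0, "warn": 1, "hide": 2}  # weakest to strongest action

def resolve_action(labels: list[dict], prefs: dict, trusted: set) -> str:
    """labels: [{"src": labeler_did, "val": "spam"}, ...]
    prefs:  per-label user settings, e.g. {"spam": "hide", "nudity": "warn"}
    trusted: DIDs of labelers the user subscribes to."""
    action = "show"
    for label in labels:
        if label["src"] not in trusted:
            continue  # ignore labelers the user has not subscribed to
        pref = prefs.get(label["val"], "show")
        if SEVERITY[pref] > SEVERITY[action]:
            action = pref  # escalate to the strongest applicable action
    return action
```

Note that the decision runs entirely client-side: the same post can be hidden for one user, warned for another, and shown untouched to a third.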
Content removal operates at multiple levels: PDS operators can remove content from their hosting, the App View can filter content from API responses, and labelers can flag content for client-side hiding. This layered approach means no single entity has absolute power, but each layer can act independently based on its policies.
AT Protocol explicitly rejects the idea that moderation must be all-or-nothing. By separating content hosting (PDS), content indexing (App View), content ranking (Feed Generators), and content labeling (Labelers), the protocol creates checks and balances. A PDS cannot prevent you from being seen on other PDS instances. A labeler cannot delete your data. An App View cannot prevent other App Views from indexing you. Users choose their own trust boundaries.
The AT Protocol defines the wire format, schema system, and RPC conventions that enable all services to interoperate. Lexicon provides type-safe API schemas, XRPC maps them to HTTP, and the data model ensures records are portable and verifiable across the network.
Lexicon is AT Protocol's schema definition language. Each API endpoint and record type is defined by a Lexicon document (JSON) with a reverse-DNS NSID (e.g., app.bsky.feed.post). Lexicons define field types, constraints, and references. They enable code generation, validation, and forward-compatible schema evolution.
XRPC maps Lexicon methods to HTTP endpoints. Queries map to GET requests, procedures to POST. The endpoint path is derived from the NSID (e.g., /xrpc/app.bsky.feed.getTimeline). XRPC also defines subscription methods for WebSocket streams (used for the firehose). Content-type is application/json for most queries and procedures; repo sync endpoints use binary formats (CAR files and DAG-CBOR frames).
Records are referenced by AT URIs: at://did:plc:abc/app.bsky.feed.post/3jui7k. The scheme encodes the authority (DID), collection (Lexicon NSID), and record key (rkey). AT URIs are the universal pointer format for cross-referencing records (replies, quotes, embeds) across the network.
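A minimal parser for the at://authority/collection/rkey shape looks like this. Real implementations also validate DID and NSID syntax and accept handles in the authority position; this sketch only splits the components.

```python
def parse_at_uri(uri: str) -> dict:
    """Split an AT URI into authority, collection, and record key."""
    if not uri.startswith("at://"):
        raise ValueError("not an AT URI")
    parts = uri[len("at://"):].split("/")
    return {
        "authority": parts[0],                               # DID (or handle)
        "collection": parts[1] if len(parts) > 1 else None,  # Lexicon NSID
        "rkey": parts[2] if len(parts) > 2 else None,        # record key
    }
```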
NSIDs use reverse-DNS notation to namespace all Lexicon types. The app.bsky.* namespace is Bluesky's social application; com.atproto.* is the core protocol; third parties can define their own (e.g., blue.zio.*, community.lexicon.*). This enables extensibility without coordination.
Records within a collection are identified by their rkey. Most use TIDs (Timestamp IDs) — base32-encoded microsecond timestamps that sort chronologically and avoid collisions. Some collections use semantic keys (e.g., "self" for profile records, the followed DID for follow records).
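TID construction can be sketched from the layout described in the protocol docs: a 64-bit value packing a 53-bit microsecond timestamp and a 10-bit clock identifier, written as 13 characters of a sortable base32 alphabet. This is a simplified illustration, not a drop-in replacement for a real TID generator.

```python
B32_SORTABLE = "234567abcdefghijklmnopqrstuvwxyz"  # ASCII-ascending alphabet

def make_tid(micros: int, clock_id: int) -> str:
    """Pack timestamp + clock ID into a 13-char sortable base32 string."""
    value = ((micros & ((1 << 53) - 1)) << 10) | (clock_id & 0x3FF)
    chars = []
    for shift in range(60, -1, -5):  # 13 five-bit chunks, high to low
        chars.append(B32_SORTABLE[(value >> shift) & 0x1F])
    return "".join(chars)
```

Because the alphabet is in ascending ASCII order and the encoding is big-endian, lexicographic order of TIDs matches chronological order, which is what keeps MST keys roughly time-sorted.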
Services authenticate to each other using signed JWTs with the service's DID as the issuer. The App View authenticates to PDS instances when proxying user requests. Feed Generators verify the App View's identity. This chain of signed tokens ensures each hop in the request path is authenticated and authorized.
| Record Type | Lexicon NSID | Description |
|---|---|---|
| Post | app.bsky.feed.post | A text post with optional embeds (images, links, quote posts), facets (mentions, links, tags), and reply references |
| Like | app.bsky.feed.like | A like on a post, referencing the subject post's AT URI and CID |
| Repost | app.bsky.feed.repost | A repost/boost of another user's post |
| Follow | app.bsky.graph.follow | A follow relationship, stored in the follower's repo with the followed DID as the subject |
| Profile | app.bsky.actor.profile | User display name, bio, avatar, and banner image (rkey is always "self") |
| Block | app.bsky.graph.block | A block record that prevents mutual interaction (bidirectional enforcement) |
| List | app.bsky.graph.list | A curated list of users (mute list, moderation list, or curation list) |
| Feed Generator | app.bsky.feed.generator | Declaration record for a custom feed algorithm with service endpoint |
| Labeler | app.bsky.labeler.service | Declaration record for a labeler service with supported label definitions |