Discover the Brilliance of Zanzibar: Google’s Authorization Solution
Ever wondered how Google ensures that billions of users can access their emails, YouTube videos, or shared Google Drive files seamlessly and securely? Behind the scenes, a powerful system works tirelessly to decide who can access what, no matter where they are in the world or how complex the rules might be. That system is Google Zanzibar—a globally distributed authorization framework that quietly powers the access control behind services like Google Calendar, Maps, Photos, Drive, and more.
Imagine managing trillions of access rules, answering millions of requests every second, all while ensuring decisions are correct, lightning-fast, and reliable. Sounds like magic? It’s not—it’s engineering at scale. In this blog, we’ll uncover the secrets of Zanzibar’s architecture, its data model, and how it achieves the holy grail of consistency, scalability, and low latency. Whether you’re a curious student, a software engineer, or a tech leader, by the end of this deep dive, you’ll have a clear understanding of how one of Google’s most impressive systems operates—and why it’s a benchmark for modern access control.
Ready to explore how Google keeps the gears of authorization running smoothly? Let’s dive in.
What is Zanzibar?
Zanzibar is Google’s global authorization system designed to store and evaluate access control lists (ACLs) for a wide range of services, including Calendar, Cloud, Drive, Maps, Photos, and YouTube. It offers a uniform data model and configuration language to express diverse access control policies.
Key Features of Zanzibar
Correctness: Ensures consistent access control decisions.
Flexibility: Supports a rich set of access control policies.
Low Latency: Responds quickly to authorization checks.
High Availability: Reliably handles requests.
Large Scale: Manages trillions of ACLs and millions of requests per second.
Data Model and Configuration
Defining the Tuple Data Model
Zanzibar uses relation tuples to represent ACLs. These tuples can include groups and nested group memberships. The tuple format is as follows:
⟨tuple⟩ ::= ⟨object⟩‘#’⟨relation⟩‘@’⟨user⟩
⟨object⟩ ::= ⟨namespace⟩‘:’⟨object id⟩
⟨user⟩ ::= ⟨user id⟩ | ⟨userset⟩
⟨userset⟩ ::= ⟨object⟩‘#’⟨relation⟩
Example Tuples:
doc:readme#owner@10 # User 10 is an owner of doc:readme
group:eng#member@11 # User 11 is a member of group:eng
doc:readme#viewer@group:eng#member # Members of group:eng are viewers of doc:readme
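To make the grammar concrete, here is a minimal Python sketch that splits a tuple string into its parts. The names RelationTuple and parse_tuple are hypothetical, chosen only for this illustration, and the sketch assumes the trailing "# ..." comments shown in the examples have already been stripped.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RelationTuple:
    namespace: str
    object_id: str
    relation: str
    # Exactly one of the two fields below is set.
    user_id: Optional[str] = None                 # a plain user id
    userset: Optional[Tuple[str, str, str]] = None  # (namespace, object id, relation)


def parse_tuple(text: str) -> RelationTuple:
    """Parse '<namespace>:<object id>#<relation>@<user>' into its parts."""
    left, user = text.split("@", 1)
    obj, relation = left.split("#", 1)
    namespace, object_id = obj.split(":", 1)
    if "#" in user:
        # The user is itself a userset: <object>#<relation>
        u_obj, u_relation = user.split("#", 1)
        u_namespace, u_object_id = u_obj.split(":", 1)
        return RelationTuple(namespace, object_id, relation,
                             userset=(u_namespace, u_object_id, u_relation))
    return RelationTuple(namespace, object_id, relation, user_id=user)
```

Parsing "doc:readme#viewer@group:eng#member" with this sketch yields a tuple whose userset field is ("group", "eng", "member"), which is how a group membership becomes the "user" side of another tuple.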
Namespace Configuration
Before clients can store relation tuples in Zanzibar, they must configure their namespaces. A namespace configuration specifies its relations as well as its storage parameters. Each relation has a name and a relation config. Storage parameters include sharding settings and an encoding for object IDs.
Userset Rewrite Rules
Userset rewrite rules allow clients to define relationships between different relations in a namespace. These rules enable the creation of complex access control policies by specifying how one relation can be derived from another. In the example below, the namespace is "doc", and it defines three relations: "owner", "editor", and "viewer".
Configuration Breakdown
name: "doc"
relation { name: "owner" }
relation {
  name: "editor"
  userset_rewrite {
    union {
      child { _this {} }
      child { computed_userset { relation: "owner" } }
    }
  }
}
relation {
  name: "viewer"
  userset_rewrite {
    union {
      child { _this {} }
      child { computed_userset { relation: "editor" } }
    }
  }
}
Relations Defined
owner: This is a basic relation that directly specifies the owners of a document.
editor: This relation is defined with a userset rewrite rule that combines the current editors (_this) with the owners (computed_userset { relation: "owner" }).
viewer: This relation is defined with a userset rewrite rule that combines the current viewers (_this) with the editors (computed_userset { relation: "editor" }).
Userset Rewrite Rules
editor Relation
relation {
  name: "editor"
  userset_rewrite {
    union {
      child { _this {} }
      child { computed_userset { relation: "owner" } }
    }
  }
}
_this: Refers to the users who are assigned the editor relation through stored relation tuples.
computed_userset { relation: "owner" }: Computes the set of users who have the owner relation on the same object.
The union operation combines these two sets, meaning that anyone who is an owner of a document is also considered an editor.
viewer Relation
relation {
  name: "viewer"
  userset_rewrite {
    union {
      child { _this {} }
      child { computed_userset { relation: "editor" } }
    }
  }
}
_this: Refers to the users who are assigned the viewer relation through stored relation tuples.
computed_userset { relation: "editor" }: Computes the set of users who have the editor relation on the same object.
The union operation combines these two sets, meaning that anyone who is an editor of a document is also considered a viewer.
Practical Example
Let’s say we have the following relation tuples:
doc:example#owner@alice # Alice is the owner of the example document
doc:example#editor@bob # Bob is an editor of the example document
doc:example#viewer@charlie # Charlie is a viewer of the example document
Based on the userset rewrite rules:
Editors: The set of editors includes both bob (explicitly defined) and alice (implicitly included because she is an owner).
Viewers: The set of viewers includes charlie (explicitly defined), bob (implicitly included because he is an editor), and alice (implicitly included because she is an owner).
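The effective sets above can be reproduced with a small Python sketch. It is a toy model, not Zanzibar's implementation: TUPLES, INHERITS, and effective_users are hypothetical names, and the union rewrite is reduced to a list of relations whose users are also included.

```python
from typing import Dict, List, Set, Tuple

# Stored relation tuples from the example: (object, relation) -> direct users.
TUPLES: Dict[Tuple[str, str], Set[str]] = {
    ("doc:example", "owner"): {"alice"},
    ("doc:example", "editor"): {"bob"},
    ("doc:example", "viewer"): {"charlie"},
}

# Union rewrite rules for namespace "doc": each relation lists the relations
# whose users it also includes (the computed_userset children).
INHERITS: Dict[str, List[str]] = {
    "owner": [],
    "editor": ["owner"],
    "viewer": ["editor"],
}


def effective_users(obj: str, relation: str) -> Set[str]:
    """Union of directly assigned users (_this) and users of inherited relations."""
    users = set(TUPLES.get((obj, relation), set()))
    for parent in INHERITS[relation]:
        users |= effective_users(obj, parent)
    return users


print(sorted(effective_users("doc:example", "editor")))  # ['alice', 'bob']
print(sorted(effective_users("doc:example", "viewer")))  # ['alice', 'bob', 'charlie']
```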
Check Evaluation
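Zanzibar evaluates ACL checks by converting check requests to boolean expressions. For example, in the absence of userset rewrite rules, checking whether a user U has a relation to an object can be expressed as: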
CHECK(U, ⟨object#relation⟩) =
∃ tuple ⟨object#relation@U⟩
∨ ∃ tuple ⟨object#relation@U′⟩, where
U′ = ⟨object′#relation′⟩ s.t. CHECK(U,U′).
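The recursive definition above translates almost line for line into Python. The sketch below is an in-memory illustration under simplifying assumptions: tuples live in a plain list, a user string containing "#" is treated as a userset, and the seen set (an addition not present in the formula) guards against cyclic group definitions.

```python
# Stored tuples as (object, relation, user). A user containing '#' is a
# userset of the form object#relation.
TUPLES = [
    ("doc:readme", "owner", "10"),
    ("group:eng", "member", "11"),
    ("doc:readme", "viewer", "group:eng#member"),
]


def check(user, obj, relation, seen=None):
    """CHECK(U, <object#relation>) without userset rewrite rules."""
    seen = set() if seen is None else seen
    if (obj, relation) in seen:  # already explored: avoid infinite recursion
        return False
    seen.add((obj, relation))
    for t_obj, t_rel, t_user in TUPLES:
        if (t_obj, t_rel) != (obj, relation):
            continue
        if t_user == user:
            return True  # direct tuple <object#relation@U>
        if "#" in t_user:
            u_obj, u_rel = t_user.split("#", 1)
            if check(user, u_obj, u_rel, seen):
                return True  # indirect via U' = <object'#relation'>
    return False
```

With these tuples, user 11 passes the viewer check on doc:readme through membership in group:eng, while user 10 does not, because no rewrite rule links owner to viewer in this simplified setting.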
Example Scenario
Consider a document named example with the following relation tuples:
doc:example#owner@alice # Alice is the owner of the example document
doc:example#editor@bob # Bob is an editor of the example document
doc:example#viewer@charlie # Charlie is a viewer of the example document
And the namespace configuration for doc is:
name: "doc"
relation { name: "owner" }
relation {
  name: "editor"
  userset_rewrite {
    union {
      child { _this {} }
      child { computed_userset { relation: "owner" } }
    }
  }
}
relation {
  name: "viewer"
  userset_rewrite {
    union {
      child { _this {} }
      child { computed_userset { relation: "editor" } }
    }
  }
}
Check Evaluation: Does david have viewer access to doc:example?
To evaluate this, Zanzibar will follow these steps:
Direct Check: First, Zanzibar checks if david is directly listed as a viewer of doc:example. There is no direct tuple for david as a viewer.
Inherited Check: Next, Zanzibar checks if david is an editor of doc:example because viewers include editors. There is no direct tuple for david as an editor.
Further Inherited Check: Finally, Zanzibar checks if david is an owner of doc:example because editors include owners. There is no direct tuple for david as an owner.
Since david is not directly or indirectly listed as a viewer, editor, or owner of doc:example, the check evaluation concludes that david does not have viewer access to doc:example.
Check Evaluation: Does bob have viewer access to doc:example?
To evaluate this, Zanzibar will follow these steps:
Direct Check: First, Zanzibar checks if bob is directly listed as a viewer of doc:example. There is no direct tuple for bob as a viewer.
Inherited Check: Next, Zanzibar checks if bob is an editor of doc:example because viewers include editors. There is a direct tuple for bob as an editor.
Since bob is an editor of doc:example, and viewers include editors, the check evaluation concludes that bob has viewer access to doc:example.
Check Evaluation: Does alice have viewer access to doc:example?
To evaluate this, Zanzibar will follow these steps:
Direct Check: First, Zanzibar checks if alice is directly listed as a viewer of doc:example. There is no direct tuple for alice as a viewer.
Inherited Check: Next, Zanzibar checks if alice is an editor of doc:example because viewers include editors. There is no direct tuple for alice as an editor.
Further Inherited Check: Finally, Zanzibar checks if alice is an owner of doc:example because editors include owners. There is a direct tuple for alice as an owner.
Since alice is an owner of doc:example, and editors include owners, and viewers include editors, the check evaluation concludes that alice has viewer access to doc:example.
Summary
Zanzibar’s check evaluation process involves:
Checking for direct relations.
Checking for inherited relations based on userset rewrite rules.
Combining these checks to determine if a user has the requested relation to an object.
This approach allows Zanzibar to handle complex access control policies efficiently and consistently.
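The three walkthroughs can be replayed with a short Python sketch that returns the chain of relations granting access, or None when access is denied. The names TUPLES, INHERITS, and check_path are hypothetical, and the sketch evaluates the steps one after another for readability; a production system would not necessarily evaluate them sequentially.

```python
TUPLES = {
    ("doc:example", "owner", "alice"),
    ("doc:example", "editor", "bob"),
    ("doc:example", "viewer", "charlie"),
}

# computed_userset children of each relation's union rewrite in namespace "doc".
INHERITS = {"owner": [], "editor": ["owner"], "viewer": ["editor"]}


def check_path(user, obj, relation):
    """Return the chain of relations that grants access, or None if denied."""
    if (obj, relation, user) in TUPLES:      # direct check (_this)
        return [relation]
    for parent in INHERITS[relation]:        # inherited check (computed_userset)
        path = check_path(user, obj, parent)
        if path is not None:
            return [relation] + path
    return None


for name in ("david", "bob", "alice"):
    print(name, check_path(name, "doc:example", "viewer"))
# david None
# bob ['viewer', 'editor']
# alice ['viewer', 'editor', 'owner']
```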
API
Zanzibar provides several APIs:
Check: Determines if a user has a specific relation to an object.
Read: Retrieves relation tuples.
Write: Modifies relation tuples.
Watch: Monitors changes to relation tuples.
Expand: Returns the effective userset for an object-relation pair.
Architecture
Zanzibar’s architecture includes:
aclservers: Handle Check, Read, Expand, and Write requests.
watchservers: Respond to Watch requests.
Spanner: A globally distributed database system that stores ACLs and metadata.
Leopard: An indexing system for efficient set computations.
Performance Optimizations
Distributed Caching
Zanzibar employs a distributed cache to store results of recent checks and reads. This helps in handling hot spots and reducing latency. Cache entries are distributed across Zanzibar servers using consistent hashing.
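As an illustration of the consistent hashing idea, the following Python sketch maps cache keys to servers on a hash ring. It is a generic textbook version, not Zanzibar's actual scheme: the HashRing class, the use of SHA-256, and the number of virtual nodes are all assumptions made for this example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Any stable hash works; SHA-256 is used here only for illustration.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, servers, vnodes=64):
        # Each server gets several points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def owner(self, cache_key: str) -> str:
        """Return the server responsible for this cache key."""
        idx = bisect.bisect_right(self._points, _hash(cache_key)) % len(self._ring)
        return self._ring[idx][1]
```

The useful property is that removing a server only reassigns the keys that server owned; every other key keeps its owner, so most cached results stay where they are.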
Request Hedging
To mitigate tail latency, Zanzibar uses request hedging: the same request is sent to more than one server, and the first response that arrives is used. To avoid multiplying load, the hedged request is deferred until the initial request is known to be slow, which accommodates slow tasks without significantly increasing load.
Performance Isolation
Zanzibar implements performance isolation to ensure that misbehaving clients or unexpected usage patterns do not affect other clients. This is achieved through:
CPU Capacity Allocation: Each client has a global limit on maximum CPU usage per second.
Memory Usage Limits: Each server limits the total number of outstanding RPCs to control memory usage.
Spanner Server Limits: Limits the maximum number of concurrent reads per (object, client) and per client on each Spanner server.
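One of these limits, the cap on outstanding RPCs, can be sketched with a semaphore. This is a simplified single-process illustration; the class name OutstandingRpcLimiter and its methods are hypothetical, and rejecting (rather than queueing) excess requests is an assumption of this sketch.

```python
import threading


class OutstandingRpcLimiter:
    """Caps the number of requests a server works on at the same time."""

    def __init__(self, max_outstanding: int):
        self._slots = threading.BoundedSemaphore(max_outstanding)

    def try_acquire(self) -> bool:
        # Non-blocking: returns False when the server is already at its limit,
        # so the caller can reject the request instead of buffering it.
        return self._slots.acquire(blocking=False)

    def release(self) -> None:
        # Called when a request finishes, freeing one slot.
        self._slots.release()
```

A server would call try_acquire() before handling a request and release() when the request completes, which keeps the memory held by in-flight requests bounded.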
Real-Life Example
Imagine a photo-sharing service where users can share photos with specific groups. Using Zanzibar, the service can efficiently manage ACLs to ensure that only authorized users can view or edit photos. For example:
photo:vacation#owner@alice # Alice is the owner of the vacation photo
group:family#member@bob # Bob is a member of the family group
photo:vacation#viewer@group:family#member # Family members can view the vacation photo
Code Sample
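Here’s a simplified example of how you might use a Zanzibar-style API to check if a user has permission to view a photo. Zanzibar itself is an internal Google system, so zanzibar_client and ZanzibarClient below are hypothetical names used for illustration:
# Hypothetical client library; no such public package is implied.
from zanzibar_client import ZanzibarClient

client = ZanzibarClient()

# Check if Bob can view the vacation photo
result = client.check("photo:vacation", "viewer", "bob")
print("Bob can view the photo:", result)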
Deep Dive into Zanzibar’s Components
Aclservers
Aclservers are the primary servers in Zanzibar’s architecture. They handle various types of requests:
Check Requests: Determine if a user has a specific relation to an object.
Read Requests: Retrieve relation tuples.
Expand Requests: Return the effective userset for an object-relation pair.
Write Requests: Modify relation tuples.
Watchservers
Watchservers are specialized servers that respond to Watch requests. They tail the changelog and serve a stream of namespace changes to clients in near real-time.
Spanner
Spanner is a globally distributed database system used by Zanzibar to store ACLs and their metadata. It provides external consistency and snapshot reads with bounded staleness, ensuring that ACL evaluations respect the causal ordering of updates.
Leopard
Leopard is an indexing system used to optimize operations on large and deeply nested sets. It reads periodic snapshots of ACL data and watches for changes between snapshots. Leopard performs transformations on the data, such as denormalization, and responds to requests from aclservers.
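To show why a denormalized index helps with nested groups, here is a small Python sketch in the spirit of the two set types the Zanzibar paper describes for Leopard, GROUP2GROUP and MEMBER2GROUP. The index contents are made up for illustration, and including each group in its own GROUP2GROUP set is an assumption of this sketch so that direct membership is covered by the same intersection.

```python
# Hypothetical index contents for illustration.
# GROUP2GROUP[g]: g together with every group nested (directly or indirectly) inside g.
GROUP2GROUP = {
    "group:all": {"group:all", "group:eng", "group:backend"},
    "group:eng": {"group:eng", "group:backend"},
    "group:backend": {"group:backend"},
}
# MEMBER2GROUP[u]: groups in which u is a direct member.
MEMBER2GROUP = {
    "alice": {"group:backend"},
    "bob": {"group:all"},
}


def is_member(user: str, group: str) -> bool:
    """A user belongs to a group if the two sets share at least one group."""
    return not MEMBER2GROUP.get(user, set()).isdisjoint(GROUP2GROUP.get(group, set()))
```

Here alice is a member of group:all even though she was only added to group:backend, and the answer comes from a single set intersection instead of a walk through every level of nesting.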
Learnings from Zanzibar’s System Design
Unified Data Model: Zanzibar’s use of a unified data model for ACLs and groups simplifies the representation and management of access control policies.
External Consistency: By leveraging Spanner’s TrueTime mechanism, Zanzibar ensures that ACL evaluations respect the causal ordering of updates, preventing issues like the “new enemy” problem.
Scalability: Zanzibar’s architecture, with its distributed servers and caching mechanisms, allows it to scale to trillions of ACLs and millions of requests per second.
Low Latency: Techniques like distributed caching, request hedging, and performance isolation help Zanzibar achieve low latency, even under high load.
Flexibility: The system’s configuration language and userset rewrite rules provide the flexibility needed to accommodate a wide range of access control policies.
Conclusion
Zanzibar is a powerful authorization system that ensures consistent, scalable, and performant access control across Google’s services. By understanding its data model, consistency model, and architecture, developers can build robust authorization mechanisms in their own applications. Whether you’re a college student learning about system design or a senior engineer looking to implement a global authorization system, Zanzibar offers valuable insights and techniques.
Have you worked with authorization systems before? How do you ensure consistency and scalability in your projects? Share your experiences and thoughts in the comments below!
References
Pang, R., Cáceres, R., Burrows, M., et al. (2019). Zanzibar: Google’s Consistent, Global Authorization System. USENIX Annual Technical Conference (USENIX ATC ’19).
Google Cloud Identity and Access Management. https://cloud.google.com/iam/