I Used DynamoDB for 2 Years. Here’s Why I Recommend SQL.
After two years using DynamoDB, I’ve concluded that the best database for a million users is often the wrong choice for your first ten.
I’ve spent a significant amount of time leveraging the power of DynamoDB, particularly with the single-table design pattern. This approach, which uses generic keys like PK (Partition Key) and SK (Sort Key) along with Global Secondary Indexes (GSIs), offers incredible performance at scale by eliminating the need for joins, a common bottleneck in traditional SQL databases.
However, my journey with DynamoDB has also illuminated some critical drawbacks, especially for early-stage products where agility and adaptability are paramount.
The Challenge of Predefined Access Patterns in Early-Stage Products
DynamoDB’s greatest strength is its ability to provide fast, predictable performance. This is intrinsically tied to a deep, upfront understanding of your application’s data access patterns. To design an efficient DynamoDB table, particularly a single table, you must know precisely how you will read and write data. This knowledge dictates the choice of your partition and sort keys, as well as the structure of your GSIs.
For a mature product with established features and user behaviors, defining these access patterns is a manageable task. However, for a startup or an early-stage product, the landscape is one of constant evolution. It’s nearly impossible to predict all the ways you’ll need to query your data in the future. This is where DynamoDB’s rigidity can become a significant roadblock.
Concrete Example: An E-commerce Platform
Imagine you’re building a new e-commerce platform. Initially, you might define the following access patterns for your products entity:
- Get a product by its productID.
- Get all products in a specific category.
- Get all products by a certain brand.
With these patterns in mind, you could design a single table with a PK of PRODUCT#{productID} and an SK that also holds the productID for direct lookups. To handle the other access patterns, you might create a GSI, say GSI1, with a GSI1PK of CATEGORY#{category} and a GSI1SK of BRAND#{brand}. That lets you fetch everything in a category and narrow to a brand within it (a brand-only lookup across all categories would strictly need its own index, but let’s keep the example simple). This setup would cover the initial requirements.
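As a rough sketch of what those lookups might look like in application code (boto3, with a hypothetical table name and category/brand values), the GSI1 patterns become key-condition queries:

```python
import boto3
from boto3.dynamodb.conditions import Key

# Hypothetical table name matching the single-table design above.
table = boto3.resource("dynamodb").Table("ecommerce")

# Access pattern: all products in a category.
by_category = table.query(
    IndexName="GSI1",
    KeyConditionExpression=Key("GSI1PK").eq("CATEGORY#electronics"),
)

# Access pattern: products of one brand within that category,
# narrowing the result set with a sort-key condition.
by_brand = table.query(
    IndexName="GSI1",
    KeyConditionExpression=Key("GSI1PK").eq("CATEGORY#electronics")
    & Key("GSI1SK").begins_with("BRAND#acme"),
)
```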
Now, let’s say after a few months and acquiring some users, you discover a crucial new feature requirement: the ability to filter products by price range within a specific category. Your current GSI1 doesn’t support this efficiently. You can query by category, but you’d have to fetch all products in that category and then filter by price on the client side or in your application logic. This is inefficient and can become costly in terms of read capacity units as your product catalog grows.
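Concretely, the workaround with the existing index looks something like this (same hypothetical setup as above, with an assumed Price attribute): you read the entire category and discard most of it in application code, paying read capacity for every discarded item.

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("ecommerce")  # same hypothetical table as above

# Read every product in the category...
resp = table.query(
    IndexName="GSI1",
    KeyConditionExpression=Key("GSI1PK").eq("CATEGORY#electronics"),
)

# ...then throw away everything outside the price range in application code.
# Every discarded item still consumed read capacity.
in_range = [item for item in resp["Items"] if 50 <= item["Price"] <= 100]
```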
To address this new access pattern, you would need to do two things (both are sketched in code after this list):
- Create a new GSI: you could add a GSI2 with a GSI2PK of CATEGORY#{category} and a GSI2SK that represents the product’s price. This would allow for efficient querying of products by price within a category.
- Backfill data for the new GSI: after creating the new GSI, you would need to run a script to update all existing product items with the necessary attributes to populate the new index.
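A minimal sketch of what that migration involves, under the same assumptions as before (hypothetical table name, an existing Price attribute, and an on-demand table so no provisioned throughput needs to be specified for the new index):

```python
import boto3
from boto3.dynamodb.conditions import Key

client = boto3.client("dynamodb")
table = boto3.resource("dynamodb").Table("ecommerce")

# Step 1: create the new index (DynamoDB builds it asynchronously).
client.update_table(
    TableName="ecommerce",
    AttributeDefinitions=[
        {"AttributeName": "GSI2PK", "AttributeType": "S"},
        {"AttributeName": "GSI2SK", "AttributeType": "N"},
    ],
    GlobalSecondaryIndexUpdates=[
        {
            "Create": {
                "IndexName": "GSI2",
                "KeySchema": [
                    {"AttributeName": "GSI2PK", "KeyType": "HASH"},
                    {"AttributeName": "GSI2SK", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        }
    ],
)

# Step 2: backfill the new key attributes onto existing product items.
# (A real script would paginate the scan and throttle the writes.)
for item in table.scan()["Items"]:
    if not item["PK"].startswith("PRODUCT#"):
        continue
    table.update_item(
        Key={"PK": item["PK"], "SK": item["SK"]},
        UpdateExpression="SET GSI2PK = :pk, GSI2SK = :price",
        ExpressionAttributeValues={
            ":pk": item["GSI1PK"],   # reuse CATEGORY#{category}
            ":price": item["Price"],
        },
    )

# Afterwards, the new access pattern is a single key-condition query:
resp = table.query(
    IndexName="GSI2",
    KeyConditionExpression=Key("GSI2PK").eq("CATEGORY#electronics")
    & Key("GSI2SK").between(50, 100),
)
```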
In a relational database, this new requirement would be a simple matter of adding an index to the price and category columns in your products table. The flexibility of SQL allows for ad-hoc queries, making it much more forgiving when access patterns evolve. With DynamoDB, what would be a minor change in a SQL database becomes a more involved data migration task. This friction can significantly slow down development velocity at a stage when speed is critical.
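For comparison, here is roughly what that change looks like on the relational side, sketched as a hypothetical Django model (the raw-SQL equivalent is a single CREATE INDEX statement followed by a migration):

```python
from django.db import models

class Product(models.Model):
    # Hypothetical model for the same products entity.
    product_id = models.UUIDField(primary_key=True)
    category = models.CharField(max_length=100)
    brand = models.CharField(max_length=100)
    price = models.DecimalField(max_digits=10, decimal_places=2)

    class Meta:
        indexes = [
            # The "new access pattern" is one composite index and a migration.
            models.Index(fields=["category", "price"], name="product_cat_price_idx"),
        ]

# Any ad-hoc query works immediately, indexed or not:
# Product.objects.filter(category="electronics", price__range=(50, 100))
```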
The Burden of Schema Enforcement and Boilerplate Code
DynamoDB is often described as “schemaless,” which offers flexibility in storing diverse data within a single table. However, this “flexibility” comes at a cost: the responsibility of maintaining data consistency and a coherent schema shifts from the database to your application code.
While you don’t need to define a rigid schema upfront at the database level (other than the primary key attributes), your application still needs to work with predictable data structures. This means you have to implement robust validation and data transformation logic to ensure that the data being written to and read from DynamoDB conforms to the expected format of your Data Transfer Objects (DTOs).
Concrete Example: User Profile Management
Consider a user profile in your application. In a SQL database, you would have a users table with clearly defined columns and data types (e.g., username as VARCHAR, email as VARCHAR, is_premium_member as BOOLEAN). The database itself enforces these constraints.
In DynamoDB, a user item might look like this:
```json
{
  "PK": "USER#123",
  "SK": "PROFILE",
  "Username": "john_doe",
  "Email": "john.doe@example.com",
  "IsPremium": true
}
```
Now, imagine a new developer on your team accidentally introduces a change where the IsPremium attribute is saved as a string "true" instead of a boolean true. DynamoDB will happily accept this, as it doesn’t enforce a schema on non-key attributes.
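To make that concrete, a minimal sketch (boto3, hypothetical table name): both of these writes succeed, and nothing on the DynamoDB side flags the second one.

```python
import boto3

# DynamoDB only validates the key attributes (PK and SK here).
table = boto3.resource("dynamodb").Table("app-data")

table.put_item(Item={"PK": "USER#123", "SK": "PROFILE", "IsPremium": True})    # boolean, as intended
table.put_item(Item={"PK": "USER#124", "SK": "PROFILE", "IsPremium": "true"})  # string, silently accepted
```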
This seemingly minor inconsistency can lead to subtle and hard-to-debug issues in your application. For instance, a part of your code that expects a boolean for IsPremium might fail unexpectedly. To prevent such errors, you need to write extra “boilerplate” code in your application layer:
- Validation Logic: Before writing any data to DynamoDB, you need code that validates the data against a predefined schema (e.g., using a library like Zod or Pydantic).
- Data Transformation: When reading data, you may need to transform it to ensure it matches your application’s internal data models, handling cases where attributes might be missing or in an unexpected format (a minimal sketch of both steps follows this list).
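As an illustration, here is one way that boilerplate might look with Pydantic (v2 assumed; the model and function names are hypothetical), enforcing in application code what a SQL column definition would enforce for free:

```python
from pydantic import BaseModel

class UserProfile(BaseModel):
    # Hypothetical DTO mirroring the profile item shown earlier.
    Username: str
    Email: str
    IsPremium: bool

def save_profile(table, user_id: str, data: dict) -> None:
    # Validation step: a malformed payload raises before it reaches DynamoDB.
    profile = UserProfile(**data)
    table.put_item(
        Item={"PK": f"USER#{user_id}", "SK": "PROFILE", **profile.model_dump()}
    )

def load_profile(item: dict) -> UserProfile:
    # Transformation step: a stray string "true" is coerced back to a real
    # boolean, and genuinely bad data fails loudly with a ValidationError
    # instead of deep inside business logic.
    return UserProfile(**item)
```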
This additional layer of code adds complexity and can increase the surface area for bugs. In contrast, a SQL database handles much of this schema enforcement automatically, allowing developers to focus more on business logic.
Conclusion: The Right Tool for the Right Job
DynamoDB, with its single-table design, is an incredibly powerful tool for applications that have well-understood access patterns and require massive scalability. Its performance characteristics are hard to beat in those scenarios.
However, for early-stage products, where the path forward is uncertain and the ability to pivot and iterate quickly is paramount, the rigidity of DynamoDB’s access pattern requirements and the overhead of application-side schema enforcement can be significant hindrances.
A traditional SQL database, with its flexible querying capabilities and built-in schema enforcement, is often a more pragmatic choice in the initial phases of a product’s lifecycle. The focus should be on rapid feature development and discovering the product’s market fit. Once the product matures, usage grows, and performance bottlenecks in the SQL database start to appear, migrating the well-understood, high-traffic parts of your application to a finely tuned DynamoDB single-table design can be a very effective scaling strategy.
💡 Need a Developer Who Gets It Done?
If this post helped solve your problem, imagine what we could build together! I'm a full-stack developer with expertise in Python, Django, TypeScript, and modern web technologies. I specialize in turning complex ideas into clean, efficient solutions.