Exiting the Vietnam of Programming: Our Journey in Dropping the ORM (in Golang)

Alan Illing
Nov 25, 2021 · 18 min read

We recently decided to drop our Object Relational Mapper (ORM) at Bridge Financial Technology, and have been loving the decision. Dropping your ORM takes effort and commitment, but the payoff is real. I’ll cover:

  • Background: characterizing the so-called Vietnam Problem in software development, understanding and classifying mappers.
  • Our Journey: why and how we transitioned and lessons learned.
  • Problems Solved and the Gains We’re Reaping: improvements to performance, scalability, and maintainability; more cloud-native code; and a data-centric, transparent culture.

Background and Reflections on the Vietnam Problem

While at DjangoCon around 2010 someone told me that using an ORM is like the U.S. going to war in Vietnam. That comment was in reference to Ted Neward coining the term and formalizing his comparison in his 2006 blog post titled The Vietnam of Computer Science.

The comparison resonated with me but I never considered abandoning the ORM. Drawing the completely wrong lessons from the analogy, I spent several years looking for the perfect ORM, and either through my own decision making or someone else’s I’ve been exposed to a veritable grand tour: Django and SQLAlchemy in Python, Active Record, LINQ, Hibernate, Entity Framework, and most recently, Gorm in our Golang-powered backend at Bridge Financial Technology.

When we find pain in software development it’s important to lean into it. Make it hurt until you can’t stand it anymore. The more acute the pain, the better you’ll be able to describe and identify where it comes from. Our pain caused me to reevaluate the so-called Vietnam Problem and question whether ORMs make sense at all. I won’t go into a full analysis of the mismatch between the object-oriented and relational worlds. For that I recommend reading Ted Neward’s timeless post. But I will summarize the important takeaways.

Vietnam: The Object-Relational Mismatch

I begin by oversimplifying the Vietnam war the way most historians see it today. The Vietnamese were fighting a civil war to unify their country. The United States was engaged in a proxy war with the Soviet Union and communism itself to prevent it from spreading. That is, the Vietnamese saw it as a north/south issue while the U.S. saw it as an east/west issue.

And the whole thing wasn’t worth it. Vietnam is a united communist country today (the Vietcong got what it wanted) without witnessing the spread of communist doctrine throughout Asia beyond China, which became a communist country in 1949 (the U.S. got what it wanted). And yet it spanned nearly 15 years, 3 presidential administrations and over a million casualties (across the U.S., North and South Vietnamese, allies on both sides and military and civilian personnel). The better strategy, ostensibly at any point, would have been to disengage entirely.

Shifting to software development: objects and relations are likewise fundamentally different things. They come from different places with different goals, and neither is wrong or bad. Of course technologies can be stack-ranked within these worlds and you’ll get more mileage using a stack that your team knows, respects and trusts. For us, we believe Postgres is the best-in-class implementation of a relational database (better than say, MySQL or SQL Server). Likewise we like and admire Go, particularly that it doesn’t implement object oriented design goals through inheritance. If you struggle with either your database or programming language I recommend fixing that problem first before solving problems at the intersection; at least that was our trajectory, and we’re happy with the results. To flesh out the analogy, below is an overview of these “sides”: objects, relations, the differences and the problems encountered when they interact with each other.

Object Orientation Principles

An object oriented system aims to provide:

  • Identity management: making a separation between the equivalence of state and the objects themselves. Object equivalence occurs when the values of two objects are the same, while identity equivalence occurs when two objects are the same, which is to say, they both point to the same location in memory.
  • State management: the ability to correlate several primitives into a larger bundle that represents something about the world or problem.
  • Behaviors: a collection of operations to manipulate said state.
  • Encapsulation: the ability to define a simplified, exported surface area of the object to the rest of the system thereby hiding unnecessary details.
  • Polymorphism: the ability to treat different objects homogeneously when they react in similar ways, despite being different things.

Most programming languages achieve these principles with an inheritance model. But it’s worth noting that many highly successful languages are indeed object oriented without inheritance: notably, Go, Erlang and Rust.
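
To make that concrete, here’s a minimal sketch (the shape types are purely illustrative) of how Go gets polymorphism and encapsulation from interfaces and composition rather than an inheritance hierarchy:

import "math"

// Shape is satisfied by any type with an Area method; no base class required.
type Shape interface {
	Area() float64
}

type Circle struct{ Radius float64 }

func (c Circle) Area() float64 { return math.Pi * c.Radius * c.Radius }

type Rect struct{ Width, Height float64 }

func (r Rect) Area() float64 { return r.Width * r.Height }

// TotalArea treats different concrete types homogeneously via the interface.
func TotalArea(shapes []Shape) float64 {
	var sum float64
	for _, s := range shapes {
		sum += s.Area()
	}
	return sum
}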

Relational Principles

A relational storage engine seeks to normalize data and document facts about the world. SQL provides operations for interacting with that data, built up around set theory to ensure mathematical correctness and to guarantee certain properties when data is mutated; namely, ACID compliance. SQL operations executed within a transaction are atomic, consistent, isolated and durable. Normalization is typically achieved through proper design, the gold standard being third normal form or the stricter Boyce-Codd normal form.

(Some) Differences

These are completely different systems. Some of their differences include:

  • Objects make sense with aggregated state, while relations seek to fragment it into multiple tables.
  • Data mutability in the object world is about guarding against accidental overwrites in a concurrent system, and is protected through encapsulation. Mutability in the relational world requires a transaction to guarantee correctness when state is altered. There’s less concern about correctness when mutating object state until the data is persisted.
  • A collection of objects exists to enable behaviors across those objects, while a collection of relations exists to establish facts about the world.

Conceptual Problems

Below is a list of problems that stem from these differences.

The mapping problem. It’s difficult to map tables to objects using any mapper because of the following problems:

  • An object relationship will be represented by composing one object with another. However, related objects rely on an implicit JOIN operation in SQL, otherwise related objects won’t be initialized with data.
  • A many-to-many relationship in a relational engine involves three tables: the two sources of data and a third join table. In the object world the same relationship only requires two objects, each holding a list of the other (see the sketch after this list).
  • If your programming language supports inheritance, it may be tempting to model objects in that way, but there’s no IS-A type of relation within databases.
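
To illustrate the many-to-many point, here’s a rough sketch (the actor/film names are hypothetical): on the relational side the DDL needs three tables, actor, film, and a film_actor join table holding (actor_id, film_id) pairs; on the object side the join table simply disappears.

// Object side: each type just composes a slice of the other.
type Actor struct {
	ID    int64
	Name  string
	Films []Film // no direct analogue of the film_actor join table
}

type Film struct {
	ID     int64
	Title  string
	Actors []Actor
}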

Who or what owns the schema definition? The programming language, and as such the application developers, or the DDL of the database, and as such the DBAs? Even if you don’t have a distinction between these roles in your organization you’ll still have this problem, known as slaving the “O” to the “R” or vice-versa.

Where does metadata go? Most fields will naturally have a 1:1 correspondence. For example, integers, booleans, etc. have well-known types in both the database and your programming language. But what about enumerations? These are likely strings or ints with a limited number of valid options, and both the database and the programming language can enforce the constraint. So where do you place such metadata? The application? The database? Both?
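
As a concrete (and hypothetical) example of the metadata question: an account status enum can end up defined twice, once in application code and once in the DDL, and keeping the two in sync is nobody’s job in particular.

// Application-side definition of the enum.
type AccountStatus string

const (
	AccountStatusActive AccountStatus = "active"
	AccountStatusClosed AccountStatus = "closed"
)

// Database-side definition of the same constraint (hypothetical DDL):
//
//   ALTER TABLE accounts
//     ADD CONSTRAINT accounts_status_check
//     CHECK (status IN ('active', 'closed'));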

All of these problems need a resolution by declaring one side (objects or relations) as the definitive authority. And that leads to a classification of ORMs: what is its opinion on that authority? Objects or relations?

Understanding and Classifying ORMs

In reality you can’t simply “exit” and disengage from the problem. The title of this post is somewhat marketing-motivated. After all, you’re not going to persist objects to flat files and call it a “database” and you’re not going to build applications in SQL. These things must meet at some point. But I argue that the common approach many ORMs have taken is limited and largely motivated by convenience, usually at the expense of one side.

Broadly, there are two types of ORMs: code-first and database-first.

  • In the code-first approach you define object model definitions as classes or types that will map to database entities (one or more tables). This approach generates SQL on the fly and makes heavy use of reflection. An example in the Go community is gorm (a minimal sketch follows this list).
  • The database-first approach typically relies on generating object definition code from your data definition language (DDL). A Golang example is SQLBoiler.
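
Here’s a minimal code-first sketch using gorm (the Account type is hypothetical): the Go struct is the source of truth and the table is derived from it. A database-first tool inverts this: you write the CREATE TABLE statement and a struct like the one below is generated for you.

import (
	"time"

	"gorm.io/gorm"
)

// Account is the authority; gorm derives the accounts table from it.
type Account struct {
	ID        uint `gorm:"primaryKey"`
	Name      string
	CreatedAt time.Time
}

func migrate(db *gorm.DB) error {
	// Creates or alters the table to match the struct definition.
	return db.AutoMigrate(&Account{})
}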

You’ll probably have an intuitive and gut reaction to one of these. And that view can be stubbornly-held. Personally, I always prioritized objects over relations and went with code-first ORMs. But ultimately I switched my thinking to prioritizing the database. That decision precipitated dropping the ORM all together by generating the code used to interact with the database.

Our Journey to Drop the ORM

The above points are all theoretical. Our journey begins with the practical pain and problems we started feeling from our ORM, Gorm.

Problem 1: Wrapped API

An early facepalm moment came from updating some records in the database. The SQL for doing this is:

UPDATE <table>
SET <values>
WHERE <conditions>

But the Gorm API changes the order in which values and conditions are accepted.

// intended
db.Model(<table>).Where(<conditions>).Updates(<values>)
// updates the whole table, ignoring conditions
db.Model(<table>).Updates(<values>).Where(<conditions>)

We learned the hard way that getting this wrong has disastrous consequences. The ORM’s API is chainable but it’s not entirely lazy. Certain statements are finalizing, including Updates. If you reverse the order, the update applies without the Where conditions taking effect, meaning you’ve updated every row in the model.
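
One cheap guard we could have used (a hedged sketch; the Account model and names are illustrative) is to check RowsAffected on the returned *gorm.DB, which turns a silent full-table update into a loud error:

import (
	"fmt"

	"gorm.io/gorm"
)

func renameAccount(db *gorm.DB, accountID uint, newName string) error {
	result := db.Model(&Account{}).
		Where("id = ?", accountID). // conditions must come before Updates
		Updates(map[string]interface{}{"name": newName})
	if result.Error != nil {
		return result.Error
	}
	// If the Where clause silently didn't apply, this will be the row count
	// of the whole table rather than 1.
	if result.RowsAffected != 1 {
		return fmt.Errorf("expected 1 row updated, got %d", result.RowsAffected)
	}
	return nil
}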

Now you can argue that SQL has this backwards and the ORM is simply correcting what the API should really be. And you can argue our team should have known better, or that the API could have been better documented. Whatever the argument, the result is the same: we had a bad day. And the broader point is this: an ORM is effectively a wrapper on SQL.

Early in my career an engineering manager taught me to be skeptical of wrappers, as they only add layers of cognitive load to the workload. I pushed back, arguing that, taken to its logical conclusion, he’d have all code written in assembly. Neither extreme is correct: abstraction is a balancing act. But the further my career progresses, the more I place the ORM in the unnecessary-wrapper camp. Why are we dealing with a middleware that reverses the super-stable and well-known ANSI SQL API that virtually all developers are trained on? Every developer knows how to update records in a table using SQL (or could easily Google it) and can expect an error when they get it wrong. Not every developer, in fact very few developers, know Gorm’s abstraction. And the same is true (among new hires) for whatever ORM you’re using; you’ll constantly be training people on a mission-critical section of the stack.

Problem 2: Performance and Excessive Memory Consumption

Our backend runs on a serverless stack (AWS Fargate) with limited memory capacity. Over time we kept having to increase the memory capacity of our instances, eventually reaching the maximum and then watching our containers die. As the data volume grew we saw the number of containers grow in somewhat linear fashion. One would hope the backend would instead scale sublinearly.

The ORM was a natural culprit: it’s easy to see that many ORMs will make use of object introspection to build SQL statements, hydrate results, or both. Gorm’s memory footprint is extreme although sadly not uncommon. Bridge started with Python on the backend and we used Django’s ORM to interact with the database which had similar problems.

We didn’t realize the extent of the problem until we ultimately removed it from our stack to give us a comparison point. The details are included in the next section, but as a preview: we increased execution performance by about 2x and reduced our memory footprint by nearly 10x.

Problem 3: Understanding our I/O Profile

Over time we noticed ourselves consuming database logging tools to understand our own use cases, and recognized it as a mismatch. We had set up RDS Performance Insights on AWS and pganalyze to identify bottlenecks in the database. These tools proved their value early on and we ended up using them to learn about the ways in which we interacted with the database. Do we over-fetch columns? Are we running unindexed queries?

Of course these questions have known, definitive answers. The fact that we need an outside tool to shed light on the matter is an obvious structural flaw in the code. To me the underlying problem is that the ORM made it too easy to interact with the database. The code wasn’t centralized or modularized into a middleware layer in the codebase. Instead it was spaghettified throughout. Understanding our database interactivity required extensive code audits and reviews of code that had more to do with business logic than reads and writes.

Alternatives

The alternatives to having an ORM seem fairly limited: use a low-level database driver, build SQL queries at runtime and map the results back into objects yourself. Of course an ORM does all of these things in an automated fashion, so going this route would be a huge sacrifice in maintainability. Our team concluded (fairly easily and without much deliberation) that whatever the benefits here, the costs are simply too high to consider it.

There is a third route, however: employing code generators to automate these steps. We bucketed projects in the Go community along two lines:

  • Code-generating SQL at runtime (example: squirrel)
  • Generating application code at compile time (examples: jet, sqlc)

Generating SQL code is an interesting idea, and requires less tooling and commitment than generating application code. However we felt it would be a lateral move in the maintainability of our code. SQL generation will require string interpolation, which means auditing the code as database migrations are applied, a labor-intensive and exhausting process we wanted to end.
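
For reference, here’s roughly what runtime SQL generation with squirrel looks like (the table and column names are hypothetical). Note that the schema only appears as plain strings, which is exactly what makes a post-migration audit a grep exercise.

import sq "github.com/Masterminds/squirrel"

// buildActiveAccountsQuery assembles SQL at runtime from string fragments.
func buildActiveAccountsQuery() (string, []interface{}, error) {
	return sq.Select("id", "name").
		From("accounts").
		Where(sq.Eq{"status": "active"}).
		PlaceholderFormat(sq.Dollar). // Postgres-style $1 placeholders
		ToSql()
	// yields: SELECT id, name FROM accounts WHERE status = $1
}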

Baby in the Bathwater?

We thought long and hard about whether we were throwing out the baby with the bathwater. Perhaps the problem isn’t ORMs per se, but only the code-first subset. In the Go community sqlboiler is an interesting project that generates model definitions from your DDL.

We decided against using this project, for the following reasons:

  • There is such a thing as too much code generation. Generated code requires flexible configuration to control the output, which is a fine line to walk. On one hand you don’t want to swap code for configuration and drop huge yaml or toml files into your codebase that bring their own maintenance issues. At the other extreme, if there’s something you want to control or customize about the generated code that isn’t exposed in configuration, you’re out of luck.
  • Sqlboiler is largely inspired by Active Record which we feel overly abstracts the database. We sought to embrace the database because, culturally, we’re a data-centric organization and wanted our database to be more transparent within our application and API.

Selecting a Code Generator

We looked closely at two code generators, jet and sqlc, ultimately selecting sqlc. With jet you write SQL within your application as a DSL. But because it generates code it goes a step beyond what a runtime SQL generator like squirrel offers. Models and fields are first-class, referenceable types rather than interpolated strings, which avoids the need to grep through code in an audit process when you want to make changes.

Even more appealing, it offers a way to aggregate, or de-normalize, data from the database. Whereas an ORM’s goal is to make relationship traversal easy, jet’s goal is to provide a package of data in a complete and well-typed struct that clearly advertises what’s available within it. Here’s an example:

stmt := SELECT(
	Actor.ActorID, Actor.FirstName, Actor.LastName, Actor.LastUpdate,
	Film.AllColumns,
	Language.AllColumns,
	Category.AllColumns,
).FROM(
	Actor.
		INNER_JOIN(FilmActor, Actor.ActorID.EQ(FilmActor.ActorID)).
		INNER_JOIN(Film, Film.FilmID.EQ(FilmActor.FilmID)).
		INNER_JOIN(Language, Language.LanguageID.EQ(Film.LanguageID)).
		INNER_JOIN(FilmCategory, FilmCategory.FilmID.EQ(Film.FilmID)).
		INNER_JOIN(Category, Category.CategoryID.EQ(FilmCategory.CategoryID)),
).WHERE(
	Language.Name.EQ(String("English")).
		AND(Category.Name.NOT_EQ(String("Action"))).
		AND(Film.Length.GT(Int(180))),
).ORDER_BY(
	Actor.ActorID.ASC(),
	Film.FilmID.ASC(),
)

var dest []struct {
	model.Actor

	Films []struct {
		model.Film
		Language   model.Language
		Categories []model.Category
	}
}

// Execute the query and store the result
err = stmt.Query(db, &dest)

There’s a lot of data aggregation going on here. The application-side model being built up is an actor with all the films they’ve been in, the language each film is featured in and the categories it’s classified under.

We were initially attracted to this design, but after experimenting a bit it didn’t quite feel right. In this example the query is driving the data model within the application, rather than the other way around and we feared this approach would lead to a lot of throw-away aggregate models. Our objective was to promote reusable models with lots of business logic and mutability captured in methods on their types.

Moreover, our preference was to move SQL out of application code entirely. The problem here is that any developer can simply query the database however they wish. While this is an initial productivity win, it comes at the cost of long-term maintainability of both the code and runtime performance. What if a developer queries the database sub-optimally, without using an index? That risk grows as the data model becomes larger and more complex, since the DSL is a step removed from the SQL. And while the DSL is appreciated, we still felt it was wrapper-like in the end.

The Answer: sqlc

We decided to use sqlc, a configurable, opt-in SQL compiler. The approach resonated with us: we like that it doesn’t generate things you don’t ask for, and the resulting generated code can be tailored to the types and tags we’ve defined. It makes the code feel like ours while providing an obvious path to migrate our current implementation. I’ll elaborate on the details of how we made sqlc work for us in a future post.
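
As a preview, here’s a minimal sketch of the workflow (the accounts query, package path and names are hypothetical): queries live in plain .sql files annotated for sqlc, which compiles them into typed Go methods, so application code calls an ordinary function instead of building SQL.

import (
	"context"

	"example.com/project/internal/db" // package emitted by sqlc (hypothetical path)
)

// queries/accounts.sql (input to sqlc) might contain:
//
//   -- name: GetAccount :one
//   SELECT * FROM accounts WHERE id = $1;
//
// From that, sqlc generates a Queries type with a strongly typed method, roughly:
//
//   func (q *Queries) GetAccount(ctx context.Context, id int64) (Account, error)
//
// Application code then uses it like any other Go function:
func loadAccount(ctx context.Context, q *db.Queries, id int64) (db.Account, error) {
	return q.GetAccount(ctx, id)
}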

Benefits to Dropping the ORM

This project was a huge undertaking, requiring commitment not only from our developers, but from our product team and the company at large. We’ve had feature freezes, headaches around running our ORM side-by-side with generated code, and had to plan our migrations and deployment path carefully. All of this in the context of a small (but growing) company with limited resources. With all this cost, the benefits had better be significant, and indeed they are. Among them: better performance and scalability in our backend runtime, better maintainability of our codebase, less dependency on database logs to understand our data I/O profile, a more cloud-native implementation, and a backend data model that’s transparent to all our developers regardless of their day-to-day proximity to the database in the stack.

Performance and Scale

If your ORM is dynamic, without using generated code or consuming generic types or interfaces, then it’s probably doing some level of reflection behind the scenes. In our case, Gorm makes heavy use of reflection since Go doesn’t support generics, and Gorm defines few interfaces beyond requiring you to declare the table name that corresponds to an application model. So we expected large benefits here, but when we started benchmarking our system we were still pleasantly surprised.

Performance is about achieving low runtime execution. We benchmarked the results by identifying various workloads that are typical in our backend, either because the API is executing them or due to an offline or batched process that’s causing heavy I/O against the database. In the graph below we have use-cases on the horizontal axis; blue indicates our sqlc-driven data interaction layer and the red indicates our current latency with the Gorm ORM. Lower is better.

Runtime performance is 52% faster without the ORM

Across our workloads we’re enjoying approximately a 2x speedup in execution performance. Happily, this number tends to be even higher when the workloads fetch more data.

Scalability is about consuming the least amount of memory possible, which is particularly important for us since we run all our workloads on a serverless backend (AWS Fargate), meaning we’re better suited to scaling out than up. The less memory we use on each instance, the fewer instances need to come online to achieve a result, which translates to lower cost and better overall utilization of the fleet. Put differently, if you need half the number of instances you’re currently using (which are budgeted), you should be able to process double the amount of data without having to talk to your CFO.

Memory consumption is 78% less without the ORM

We’ve reduced our memory consumption on average by 78%. Now you could argue that perhaps Gorm is doing something excessively inefficient here, and other ORMs may fare better, but fundamentally most mappers need type introspection, which leads to a bad memory profile.

Both of these improvements are driven by the smaller number of allocations needed per operation, which we’ve benchmarked at roughly another 80% drop:

Allocations per operation are 80% lower without the ORM

Code Maintainability

I consider all the code that interacts with your data layer as middleware. Of course if you’re using an ORM you might not have this middleware explicitly boxed up into a package or set of functions, which I would argue makes you worse off: the middleware is still there but it’s not isolated. Instead, the database interactivity is spaghettified throughout the codebase.

When we want to retrieve, update, create or delete data we invoke functions that do that for us:

q.GetAccounts(ctx, ids)

// more complex queries take a generated <X>Params type
q.GetAccountsPage(ctx, db.GetAccountsPageParams{...})

And our endpoints don’t even do this; they abstract the details by invoking an interface:

result, err := fetch.Page(ctx, fetch.PageParams{
	Fetcher: accounts.Fetcher{},
	...
})

Understanding our Data I/O Profile

When you have an ORM you’re inviting all software developers in your organization to hit the database in potentially unaccounted-for ways. Despite best efforts to train your team on what indexes are available or to set up DBA roles, ultimately you’ll have database interaction code that can’t be accounted for without a code review. Which inevitably leads one to turn to database logs and monitoring solutions to understand how the database is being accessed. These tools are a welcome addition to any process for reviewing runtime performance and achieving SLAs, but if you’re using them to understand how your database is accessed, it’s already too late.

We still use tools like RDS Performance Insights and pganalyze but we’re no longer reliant on them for understanding the general profile, or worry whether we’re consuming an index or not. This work has been shifted to our centralized repository that acts as a middleware for all our database I/O, which we simply call the data repo.

It’s not without process, but now it’s a managed process. When an application developer needs a new query she opens a PR in the data repo, which comes with a code review where people can ask whether an index or transaction is being employed. True, such code review standards should be applied to all repositories, but database I/O is going to be just one bullet point in a downstream application. Our data repo is focused on one thing and one thing only: managed database interaction. Moreover, it’s easy to conduct code audits after the fact. The DDL and SQL queries are all side by side, making it easy to know whether a query is properly consuming an index or not.

More Cloud Native Implementation

Your mileage may vary, but the two ORMs we used (Gorm and Django) both wrapped the database connection, causing two problems. First, in both cases, the wrapped object exposed less functionality than what was available in the underlying driver. And as databases and drivers update to address specific needs this can become quite frustrating.

Second, particularly in the case of Django, it moved us away from a cloud-native design. One aspect in particular that we struggled with is accessing data from within our Lambda functions. Function-as-a-service platforms such as Lambda want you to define your database connection as a global variable so that it survives when the execution environment is frozen and reused between invocations. This was basically impossible in Django. Although we had less trouble with this problem in Gorm, we had other problems around obtaining the connection pooling characteristics we wanted, even on long-lived compute layers in the cloud.

Ultimately, being able to implement a cloud-native design comes down to the database driver you’ve selected, and you need to make sure your ORM supports that driver. We’re lucky to use Postgres, and luckier still that the Go community has a specialized driver solely for it: jackc/pgx. Being able to use this driver directly, without the ORM, has given us more flexibility in cloud-native design and the ability to take advantage of Postgres-specific features that are often left out of drivers that prioritize broad, cross-database support.
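
As an illustration of the Lambda-friendly pattern mentioned above, here’s a minimal sketch (assuming pgx v4 and a DATABASE_URL environment variable) of holding the pgx connection pool in a package-level variable so a warm execution environment reuses it:

import (
	"context"
	"os"

	"github.com/jackc/pgx/v4/pgxpool"
)

// pool lives at package scope so warm Lambda invocations (and long-lived
// Fargate tasks) reuse existing connections instead of reconnecting.
var pool *pgxpool.Pool

func init() {
	var err error
	pool, err = pgxpool.Connect(context.Background(), os.Getenv("DATABASE_URL"))
	if err != nil {
		panic(err)
	}
}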

Data Model Transparency

Finally, and perhaps greatest of all, dropping our ORM has changed our engineering culture to be more data-centric by promoting the transparency of our data model. Bridge is a data processing company. We do the unsexy work of normalizing and enriching financial data for registered investment advisors, enterprises and other fintech platforms.

We value integrity, accuracy and consistency in our data to deliver on these goals, and we can’t do that unless everyone feels they understand the data model. Many ORMs are philosophically built around hiding or abstracting away the database from the development process, which ultimately leads your team to pay close attention to “the O” and deprioritize “the R”. And “the O” is locked up in individual repositories that any single person may or may not have knowledge of. But everyone can get behind the structural arrangement of a database: the organization into schemas, DDL, E/R diagrams, etc.

For us, our database is more than just a container of information that we read from and write into. It’s the expression of our ideas; how we simplify and model the complexity of industry challenges. Removing the ORM has put all those details front and center in people’s minds, and created more ownership over both the data model and Postgres, with less of a throw-it-over-the-fence mentality. And that’s perhaps the best gain of all.

Bridge Financial Technology is an exciting young startup and an amazing place to work. Please reach out on LinkedIn if you’d like to join our team!


CTO at Bridge Financial Technology, where I’m working to empower financial advisors and their clients with automated and data-driven software tools.