Rewriting the Past

How we attempted a full rewrite, why it failed and what we're doing differently now

February 01, 2019

By the year 2014, Aloompa had already been around for six years. What had started as a festival app for Bonnaroo had grown into a platform that powered hundreds of music festivals, comic cons, rodeos and corporate events. Keeping pace with the number of customers we were acquiring was made difficult by many legacy aspects of our codebase. A new generation of Aloompa employees was brought in to write the next version of our platform. I was one of them.

In the beginning, there was a feeling of excitement around the office. We were getting a new lease on life. We could break away from the shackles of our old codebase and write something better and more maintainable. If we had known then what we know now, our excitement would have turned to dread.

Why we love a rewrite ❤️

Before I get into what happened in our case, I would like to go over some ways we justified a total rewrite of our system. There are many reasons developers love throwing out old code and starting fresh, but in the case of a total rewrite, I think the main points are that a rewrite is easier to conceptualize than incremental changes, that we have a strong bias for the present and that we love to work with shiny new technology.

A rewrite is easier to conceptualize

A tendency among developers is to look at old code and think that it is all trash. Starting over often seems easier. The up-front commitment is minimal. There is nothing more soul-filling than starting a greenfield project.

What often happens follows the trajectory of all software products: things are easy and manageable at first, and then as we add more and more code, they become less and less manageable. If we are doing proper testing, we start to find weird edge cases that illuminate why the legacy codebase was written the way it was. By the end of the exercise, we have a new codebase that is not as battle-tested as the legacy codebase and is quickly accruing much of the same weird edge-case handling code we had in the beginning.

Present-bias

The best way I can describe developers’ bias for new code over old code is by using the analogy of a high-school diary. Imagine if you as an adult couldn’t keep a new diary, but had to keep coming back to your old diary from high school with a pen and white-out to update it to reflect the way you currently feel about the world. The person you are now is not the person you used to be, and so every time you open your diary to make changes, you are filled with embarrassment and regret. Now imagine that you worked on a team that collectively scrutinized and updated your diary. That is what it feels like to work on a legacy codebase.

There are numerous lines that have lost their context but you are hesitant to remove them because nobody knows what they are supposed to do. There are sometimes poor past architectural decisions that infect every facet of the codebase.

Shiny new technology

Over the past ten years, technology has advanced significantly. What was once accomplished with ASP.NET, jQuery, Objective-C and Java might be a better fit for Node microservices, GraphQL, React, Swift and Kotlin. It’s not just that technology innovations are shiny and new; they also provide quantifiable value. They can improve performance and developer ergonomics and even make hiring easier.

Why it failed 💔

There is hardly ever just one reason that something fails. Looking back, it’s easy to see why our rewrite didn’t work. These key insights have been valuable to us as we continually evaluate our direction.

We didn’t understand the full scope of the product

After six years, the team working on the product was a different set of people than the team that initially wrote it. Many of us were brought in with the initial goal of doing the rewrite and maintaining the next version. Many of us didn’t fully understand the more complex parts of our domain yet. Many of us didn’t know the myriad of things our product should necessarily support.

Managing old and new platforms

As we were building our new platform, we couldn’t stop supporting our current customers. We continued to ship apps on the first version of our platform while building the second version in a vacuum. Our customers weren’t getting the benefits of our new work and the team was split doing client work and building the new version.

Timing releases of multiple platforms

Our rewrite touched multiple teams using multiple technologies. This meant that to release our next version, we needed iOS, Android, the backend and the frontend JavaScript to be in sync. Inevitably the teams were forced to wait on the slowest team in order to release new features. This problem isn’t unique to a rewrite, but because of the scope of the project, the disparity leapt from days to months.

Not enough testing

Developers are notorious for testing the happy path. We were no exception. Some of our apps may contain hundreds of events, but as we were writing our next version, we were using test data with a handful of records. Our API was RESTful, but since many of our events occur in low-connectivity environments, we wanted to send down as much related data as possible with each request to minimize network traffic. This resulted in huge requests that only got worse when the number of event records grew to a few hundred. Requests began running for minutes and sometimes timing out.

Our new CMS and our apps became unresponsive and unusable. Our client-side data layer was deeply tied to the structure of the data coming back, so migrating to slimmer endpoints was no simple task. If we had taken more time to test real-world scenarios, this all might have been caught earlier on.
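To make the scaling problem concrete, here is a minimal Python sketch of the kind of check we could have run early on. It is not our actual API: the record shapes, field names, bio length and page size are all assumptions invented for illustration. It builds synthetic events and compares the size of an "embed everything" response with a paginated response that references related records by ID.

```python
import json


def make_events(count, performers_per_event=5):
    """Build synthetic events; every field name here is hypothetical."""
    return [
        {
            "id": event_id,
            "title": f"Event {event_id}",
            "performers": [
                {"id": event_id * 100 + n, "name": f"Performer {n}", "bio": "x" * 500}
                for n in range(performers_per_event)
            ],
        }
        for event_id in range(count)
    ]


def embedded_payload(events):
    """One response carrying every event with all related records inlined."""
    return json.dumps(events)


def slim_page(events, page, page_size=50):
    """One page of events that references performers by ID only."""
    start = page * page_size
    return json.dumps(
        [
            {
                "id": e["id"],
                "title": e["title"],
                "performer_ids": [p["id"] for p in e["performers"]],
            }
            for e in events[start:start + page_size]
        ]
    )


def payload_report(count):
    """Return (embedded_bytes, slim_first_page_bytes) for `count` events."""
    events = make_events(count)
    return len(embedded_payload(events)), len(slim_page(events, page=0))


if __name__ == "__main__":
    # Compare a "handful of records" with a realistic few hundred.
    for count in (5, 300):
        embedded, slim = payload_report(count)
        print(f"{count} events: embedded={embedded} bytes, first slim page={slim} bytes")
```

Slimmer pages mean more round trips, which is a real cost in low-connectivity environments, so the right page size would have to be measured rather than guessed. The point of the sketch is only that running the comparison with a few hundred records, instead of five, surfaces the problem before customers do.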

“Burning the ships”

Arguably the worst choice we made in the entire process was to use an entirely different database for our V2 release.

Our main events database was on SQL Server and we wanted to use Postgres because of its improved performance, cheaper hosting and ability to contain JSON fields.

Switching databases meant that when we realized how deep a hole we had gotten into, getting out of the hole was not an easy process. There was no backwards migration path. When we realized how broken everything was, even our founders were working nights and weekends manually migrating our customers’ data back to the old system. This was a stressful time for everyone. Innumerable consolation pizzas were being delivered as we worked insane hours to try to simultaneously repair our new version and migrate our customers back to V1.
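One way to avoid burning the ships is to write the backward migration alongside the forward one and check that every record survives the round trip. The Python sketch below shows the idea. It is not what we ran: the V1 and V2 record shapes and all of the field names are hypothetical and do not come from our real schema.

```python
def v1_to_v2(row):
    """Forward migration: flat V1 row -> V2 record with a nested JSON-style field.

    Both shapes are hypothetical.
    """
    return {
        "id": row["event_id"],
        "title": row["event_name"],
        "details": {"stage": row["stage"], "starts_at": row["start_time"]},
    }


def v2_to_v1(record):
    """Backward migration: the inverse of v1_to_v2 for the fields it knows about."""
    return {
        "event_id": record["id"],
        "event_name": record["title"],
        "stage": record["details"]["stage"],
        "start_time": record["details"]["starts_at"],
    }


def find_lossy_rows(rows):
    """Return the V1 rows that do not survive a forward-then-backward round trip."""
    return [row for row in rows if v2_to_v1(v1_to_v2(row)) != row]


if __name__ == "__main__":
    sample = [
        {
            "event_id": 1,
            "event_name": "Opening Set",
            "stage": "Main",
            "start_time": "2014-06-12T18:00",
        },
    ]
    # An empty list means every sample row can be migrated back unchanged.
    print(find_lossy_rows(sample))
```

A check like this, run against a copy of production data before any customer is moved, would flag columns that the new schema silently drops while going back is still cheap.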

Halting real innovation

The most devastating cost of this process was the slowing down of innovation. We were focusing most of our developer resources on problems that had already been solved instead of on building new features for our customers. This isn’t to say that we didn’t make anything new over that time period. We built a web embed platform to embed your Aloompa-powered lineup on your website, built a “LiveStory” feature that passively builds a timeline of the events you attended and released many upgrades to our core FestApp offering. But compared to the rapid pace of innovation that we have hit over the past two years, these seem like a drop in the bucket. The fact that our time was mostly devoted to trying to release a V2 meant that we were necessarily pushing off more meaningful innovations.

Course correction

Around 2016, we evaluated our current position. At this point, we had released an early first version, rolled it back and then invested significant time trying to do our rewrite the “right” way by releasing a smaller, better-tested version we dubbed “V3”. We were moving slowly and a dark cloud had begun to settle over the whole rewrite.

We evaluated what we really needed. We made a conscious choice to stop actively working on a next version and instead put energy into automation, innovation and improvement of our legacy codebase. This ended up signaling a pivotal change for the company and for how we approach our product going forward.

We were able to salvage some aspects of our work over the past couple of years, but for the most part, all we maintained was a huge lesson learned about how to approach legacy code in the future.

Better solutions

Refactor

One solution is refactoring an existing codebase. Instead of rewriting everything, identify an issue that you feel the code currently has and dedicate a little time to making it better. The key here is to handle one problem at a time. When working on a feature, take a little extra time to improve the existing code in that feature.

Create new features

Always build new features in as much isolation as possible. This allows you to feel the “new code” buzz that you might get by starting a new repository but on a much smaller scale. Sometimes a feature will be an eventual replacement for an existing feature, but approached from a different perspective. Remember, the goal here is not to write an existing feature better (that’s better suited for refactoring), but to make a different feature entirely.

Create new products

In some cases, it makes sense to create a totally new product. Sometimes, this is a next generation of a current offering, but we always build an MVP (minimum viable product) so we can start using it in the real world as early as possible. The key here is that it is a new product, so it should always meet a need that is not satisfied by current product offerings. The idea here is to do the smallest amount of work possible to get your software in your clients’ hands so that you can start gathering feedback from real users and iterating.

One of our biggest mistakes was trying to rewrite four different technology stacks at the same time. A better approach might have been to focus on making a better CMS MVP first, then iterating until it became more feature-rich than the original CMS. At that point, we might focus on the mobile apps, releasing a very stripped-down version of the app and following the same practice of only adding features as we see the market need.

How Aloompa has moved forward

Since recovering from our failed rewrite, Aloompa has gotten a new lease on life. We have been rapidly innovating on new features and even building products that reach new verticals we never would have thought possible. There is a noticeable positive shift in the energy of the company as we’re focusing all of our attention on automation, innovation and our customers.

I can’t stress enough that the time we sank into a full rewrite is not completely lost. There were elements of the rewrite we have been able to integrate into our product, but the even bigger impact is in the lessons we have learned. Having survived such a large misstep, we have embraced an agile approach to product development. Over the past two years, this has allowed us to move more quickly and innovate in ways that would have been outside the realm of possibility in the midst of a rewrite.

The core apps themselves have seen a myriad of updates over the past couple of years. We’ve refactored our maps, created a universal search feature, and added a Spotify audio player just to name a few. Additionally, we’ve innovated on several other products. Here are a handful of places we’ve been investing our energy:

Web Embeds

We have developed a suite of web embeds that are driven by the performers and events in our system. These include lineup and schedule embeds as well as embeds for locations and places of interest. They can be included on any page by inserting a small snippet of code.

Presence

Using location tracking technology like beacons and geofencing, we are able to compile data from our users in order to improve the event experience by sending location-aware targeted push messages and generating analytics and insights from the user behavior.

LiveStory

During the course of an event, we are getting really valuable user data using beacon and geofence technology. Using that data, users are able to passively generate a timeline of the events that they attend that they can enjoy even long after the experience is over.

LiveChat

We built a chat feature to allow event-goers to communicate during the course of an event within the app. This has already seen multiple iterations and improvements and might be considered the first really big success in transforming our process into something more iterative and agile.

LiveOrder

We partnered with Square to create a product that allows you to order food, beverages and even merchandise at an event and pick your order up without waiting in line. This has already begun to dramatically improve the experience of event attendees and has allowed the event venues to streamline their process.

Conclusion

In a perfect world, we never would have attempted such a massive rewrite, but it has given us wisdom in approaching problems that we previously lacked. We have been through the fire and we are never going back.

Tyson Cadenhead

Tyson is the Chief Technology Officer at Aloompa.

He has a passion for Functional Programming, GraphQL, the Serverless architecture and React.

When he's not writing code or working with his team, Tyson enjoys playing guitar, growing vegetables and spending time with his family.

Tyson primarily works remotely to help support the needs of his oldest son who has level 3 autism.