I’m NOT a gamer.
Nor have I played computer games as a technology professional. OK, maybe Solitaire and Minesweeper on Windows ’95.
But I had the amazing opportunity to be a part of Grand Theft Auto V release back in 2013. And the not-as-popular game L.A. Noire in 2011.
My colleague and I were the “babysitters” of the backend database as GTA V did a global phased release of the game.
There wasn’t much for us to do during the launch other than be available on standby “in case” something went wrong.
I didn’t have much to do either. I spent my time looking at the monitoring tools to spot anomalies in performance and stability. I did write a custom script to read the SQL Server transaction log backups that analyze database activity across all the platforms in the GTA V ecosystem.
It was amazing to observe how the teams worked in preparation for the launch. And I have nothing but respect and admiration for those people.
I’ve had the opportunity to meet the infrastructure team from Take-Two Interactive in NYC. They built the entire infrastructure from the ground up. The database servers, web servers, app servers…all of it. And I could sense how much they valued their work. Quality excellence was their ethos.
Then, there was the development team in San Diego. They practiced DevOps even before DevOps was cool. They built tools that track data throughout the journey of the player. What was even more admirable was how calmly they managed their work. Imagine the overwhelming pressure of being a part of the launch team.
And what I love about this is the fact that they run a really tight ship. When they had to fix a bug in the code, you could visualize the impact in the entire platform ecosystem.
Update a stored procedure? You’ll notice a lag in how the player moved between scenes.
Update the API that keeps track of the scores? You would see a spike in the storage IOPs and throughput.
They immediately rolled back any change they made that negatively affected the entire platform.
And while they didn’t give me a copy of GTA V as a souvenir, the bottle of wine was an excellent gift. That’s on the assumption that I can bring it with me on my flight back home or that I drink wine at all. OK, fine, I’m a bit disappointed, to say the least.
I’m tempted to talk about the importance of how to give proper gifts. But I’d rather focus on what I took away from that experience that applies to deploying high availability solutions. And the fact that I didn’t get a copy of GTA V nor took the bottle of wine back home 🙂
1) A well-defined system is mandatory in high-stakes environment. Not all business function is mission-critical. Same thing with IT environments.
You can get away with doing whatever you want in a low-stakes environment. Change configuration settings, deploy code fixes, even take an ad-hoc backup.
But when that environment becomes mission-critical, you need guard rails to keep the system functioning – and prevent it from crashing. Even better if you can improve it over time.
The GTA V development team had a well-defined system. They made sure anything and everything they did improved the platform without causing it to malfunction. Because even a lag in the gamer experience is considered a malfunction.
What’s ironic?
The issues and outages I’ve had to deal with in my customers’ environments were caused by not having a well-defined system in place. Doing a root case analysis would reveal a process breakdown or a configuration drift caused by an unexpected change in the system. They could have avoided these issues if there was a well-defined system.
2) You need a rockstar team (pun intended). A team of rockstars from Rockstar Games and Take-Two Interactive built GTA V.
There was no “junior” member on both teams. Not because they didn’t want to develop new talent nor mentor interns. In fact, I’ve met people on both teams who were fresh out of university. But they were not on the GTA V teams.
That’s because they understood how critical this mission is.
This reminded me of my experience training with the army scout ranger regiment. Neophytes will undergo intense training for a long period of time. But when it comes to important assignments involving life-and-death situations? Only the specialists got deployed.
I’ve worked with so many clients who deployed mission-critical systems. They either built it themselves or hired somebody to do it. Yet, nobody on the team knows how to manage it.
A few of the database administrators and engineers who took my intensive training classes before shared the same sentiments. They now need to get up-to-speed on how to take care of a mission-critical system they knew nothing about.
That’s like having an intern perform a heart surgery on you. No way.
You need a rockstar team the minute you deploy a mission-critical system.
3) You need an insurance policy. My first visit to the NYC and San Diego offices a few weeks before the launch led me to believe that the teams got it covered. The platforms were stable. They immediately fixed identified bugs. Ironed out every wrinkle. They’ve got everything under control.
I was surprised when my colleague and I were told to “hop on a plane and pay Rockstar a visit” the week before the launch. He would go to the NYC office while I to the San Diego office.
Our job: be available throughout the launch as standby engineers for the database platform. It’s a fancy way of saying we’re “database babysitters”.
I was hesitant to go since I was also running a major platform migration project for a global financial company. It’s hard to manage several projects at the same time, let alone a project as critical as GTA V.
I felt the anxiety in the room when I got to the San Diego office. And it’s expected…this was the biggest launch they’ve ever done in the history of the GTA series.
I talked to one of the managers, curious about my role in the launch. From what I can tell, they didn’t need me nor my colleague.
The manager took his time to thank me for showing up on very short notice. He explained that while it may seem obvious that everything was under control, the board advised to bring us in. What he said next changed the way I looked at mission-critical systems forever. In fact, it influenced the way I run my business today.
He said, “We’ve invested so much on GTA V for the past few years. And we’re not about to screw this up during the launch. We want to make sure we’re protecting our investment.”
He continued, “We may be the experts in our platforms. But we’re still humans. We have blind spots. And we cannot afford to be victims of our blind spots during the launch. Bringing you guys in means having a second pair of hands and eyes to make sure we guarantee a successful launch.“
He and the board didn’t see the entire GTA V platform ecosystem as a cost. Not a bunch of servers and apps weaved together to provide a great gaming experience for their customers. They saw it as an investment. And they treated everything like an investor would protect their investment.
It didn’t come as a surprise when, in less than 24 hours, GTA V generated more than US$815million in worldwide sales and US$1billion in its first three days. It is still the second best-selling video game of all time. And I still went home without a copy of GTA V nor a bottle of wine 🙂 I did have a lot of fun hanging out with the San Diego team.
Treating mission-critical systems as an investment rather than a cost is what allows companies like Rockstar Games and Take-Two Interactive to reap the rewards.
Sadly, that’s not the case for most customers I worked with.
I get brought in to fix emergency issues and resolve outages for mission-critical systems. And when the dust settles and root cause analysis done, almost all of them don’t have these three things.
And I cannot blame them.
Companies get caught up in achieving exponential growth beyond what they can manage. Creating well-defined systems, having a rockstar team, and having an insurance policy aren’t in the list of priorities.
That’s why everyone is shocked when apps evolve to become mission-critical.
Remember that Microsoft Access database project that you were tinkering with? Guess what? It’s now your mission-critical ERP system.
When your apps and platforms are starting to become mission-critical, it’s time to evaluate whether or not you have all three.
Because the last thing you want is for the business to be crippled when the unexpected happens.
If you need help with either building a mission-critical system or having all these three things, let’s talk. Schedule a call with me using my calendar link.
Schedule a Call
