This Sneaky Issue Was Silently Ruining Our Database

It's one o'clock in the morning. My phone starts going crazy.

I groan and look over. Sure enough, it's another PagerDuty call. I am on-call this week, and it's the third time in as many days that I have been woken up for the same reason.

Hundreds of our customers have complained in unison, for three days in a row, that they have purchased event tickets from us and arrived at the venue only to find that their ticket has already been redeemed. They are being denied entry, and they are understandably very angry.

Initial investigation showed some very odd behaviour.

Customers were purchasing tickets, then a matter of minutes later, their tickets would be redeemed. This was far in advance of the event itself, so it was clearly unexpected. And it wasn't spread evenly across our customer base; the issue was heavily focused within some select groups of university students.

We already knew that our redemption system had a fundamental flaw: anyone could scan the QR code in the ticket order confirmation email themselves, just as the door staff would at the venue, and doing so would redeem the ticket. Unfortunately, the bigger fix for this was still a few weeks away.

Even so, why would large numbers of customers within specific university circles redeem or accidentally redeem their own tickets before the event?!

None of our early theories made sense, and the issue kept happening.

We went digging. Eventually, we found the issue.

The automated virus scanner on specific university .edu email systems was hitting all links in our confirmation emails to check them for viruses, including programmatically scanning all QR codes.

Therefore this one process, completely invisible to my team, was automatically redeeming our customers' event tickets before the customer even received the email.

Never, ever update persisted data on page-load.

Because the ticket QR code pointed to a webpage that redeemed the ticket immediately on page-load, anything visiting that link, whether human or machine, would inadvertently update our database to mark the ticket as redeemed.
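
To make the failure mode concrete, here is a minimal Python sketch of the anti-pattern. It is an illustration only, not our production code: the in-memory `tickets` dict, the `redeem_on_load` function, and the `/redeem/<ticket id>` URL shape are all hypothetical stand-ins.

```python
# Hypothetical in-memory stand-in for the tickets table.
tickets = {"T-1001": {"redeemed": False}}


def redeem_on_load(method: str, path: str) -> str:
    """Anti-pattern: loading the QR code URL redeems the ticket as a side effect."""
    ticket_id = path.rsplit("/", 1)[-1]
    ticket = tickets.get(ticket_id)
    if ticket is None:
        return "404 Not Found"
    if method == "GET":
        # The write happens simply because the page was requested.
        ticket["redeemed"] = True
        return "200 OK: ticket redeemed"
    return "405 Method Not Allowed"


# An email scanner follows the link before the customer has even opened the email.
scanner_response = redeem_on_load("GET", "/redeem/T-1001")
print(scanner_response)
print(tickets["T-1001"]["redeemed"])
```

The handler cannot tell a scanner's request from a door scan, so the first visitor of any kind uses up the ticket.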

There are thousands of unseen automated visitors to our websites that we don't think about most of the time. Here are some examples:

  • Browsers prefetching resources. This is a big one. Let's say you visit https://myblog.com, and there are prominent links on that page to /blog-post-1 and /blog-post-2. Your browser may already be prefetching those pages in the background. If those pages mutate data on page-load, for example to track visits, just being on the homepage is enough to trigger it, even though you might never click on the links.
  • Google crawlers. Another big one. We recently saw a large number of errors that came from Google crawlers visiting pages then triggering unauthenticated calls to APIs.
  • Email scanners. Very common in schools, governments, large corporations, and more.
  • Internet archive machines

The simple lesson here is to avoid triggering side effects automatically on page-load wherever possible. Besides the issues that we experienced, unexpected side effects like this are bad UX. If the action is "Redeem", make it happen when you click a button that says "Redeem".
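
As a rough sketch of what that can look like, the Python below keeps the page-load read-only and only redeems on an explicit form submission. Everything here is assumed for illustration (the `handle` function, the in-memory `tickets` dict, the `/redeem/<ticket id>` path, and the status codes chosen); it is not the fix we actually shipped.

```python
from html import escape

# Hypothetical in-memory stand-in for the tickets table.
tickets = {"T-1001": {"redeemed": False}}


def handle(method: str, path: str) -> tuple[int, str]:
    """Return (status, body). Only an explicit POST changes stored data."""
    ticket_id = path.rsplit("/", 1)[-1]
    ticket = tickets.get(ticket_id)
    if ticket is None:
        return 404, "Unknown ticket"
    if method in ("GET", "HEAD"):
        # Page-load is read-only: render a button instead of redeeming.
        action = escape(path, quote=True)
        return 200, (
            f'<form method="post" action="{action}">'
            '<button type="submit">Redeem</button></form>'
        )
    if method == "POST":
        if ticket["redeemed"]:
            return 409, "Ticket already redeemed"
        ticket["redeemed"] = True
        return 200, "Ticket redeemed"
    return 405, "Method not allowed"


# A scanner or prefetcher loading the link changes nothing.
status, _ = handle("GET", "/redeem/T-1001")
print(status, tickets["T-1001"]["redeemed"])

# Only the button press, which submits a POST, redeems the ticket.
status, body = handle("POST", "/redeem/T-1001")
print(status, body)
```

This follows the HTTP convention that GET is a "safe" method that should not change state on the server. Automated visitors generally follow links with GET requests, so they would see the button but never press it.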
