Tuesday, October 13, 2009

The One That Got Away: True story of not taking backups

"You mean to tell me that we have absolutely no backups of paris whatsoever?" I will never forget those words. I had been in charge of backups for only two months, and I just knew my career was over. We had moved an Oracle application from one server to another about six weeks earlier, and there was one crucial part of the move that I missed. I knew very little about database backups in those days, and I didn't realize that I needed to shut down an Oracle database before backing it up. On the old server, that shutdown was handled by a cron job that I never knew existed. I discovered all of this after a disk on the new server went south.

"Just give us the last full backup," they said. I started looking through my logs. That's when I started seeing the errors. "No problem," I thought, "I'll just use an older backup." The older logs didn't look any better. Frantic, I looked at log after log until I came to one that looked as if it were OK. It was just over six weeks old. When I went to grab that volume, I realized that we had a six-week rotation cycle, and we had overwritten that volume two days before.

That was it! At that moment, I knew that I'd be looking for another job. This was our purchasing database, and this data loss would amount to approximately two months of lost purchase orders for a multibillion-dollar company.

So I told my boss the news. That's when I heard, "You mean to tell me that we have absolutely no backups of paris whatsoever?" Isn't it amazing that I haven't forgotten its name? I don't remember any other system names from that place, but I remember this one. I felt so small that I could have fit inside a 4mm tape box. Fortunately, a system administrator worked what, at the time, I could only describe as magic. The dead disk was resurrected, and the data was recovered straight from the disk itself. We lost only a few days' worth of data. Our department had to send a memo to the entire company saying that any purchase order entered in the last two days had to be reentered. I should have framed a copy of that memo to remind me what can happen if you don't take this job seriously enough. I didn't need to, though; its image is permanently etched in my brain.
