In my business, we help companies recover critical data under very tight deadlines. Usually, this data is far from fresh and, more often than not, stored in some long-forgotten archive. By the time we are called in, the information is anywhere from one to ten or more years old. This presents a multitude of challenges. In a corporate setting, how many people have a working ten-year-old tape drive sitting around? How about the software used to archive the data in the first place? Luckily, through time and experience, we have built up the expertise to handle these situations with little trouble.
But we have one advantage: the data we normally deal with comes from an era that has largely passed. It was that time in computing when massive amounts of information were created, used, and stored on-site, and corporations built vast server farms to house it all. That was simply how things were done back then. The closest we came to outsourced services was hosting the corporate website on our ISP's web server, letting someone else take care of DNS, or, if we were small enough, letting our upstream ISP handle email on their servers. And it was good. We knew that if we ever needed our data, we could call whoever was providing the service and get it back. Sometimes our ISP was just the local telco or CLEC; other times we dealt with larger companies, many of which have since passed into the annals of technology lore. The point is that we could conceptualize the physical location of our data. We knew that if push came to shove, we could hop in the car and demand our data back. It was improbable it would ever come to that, but it was the warm blanket and cup of milk that let us sleep easy at night.
Somewhere along the line, we got used to that whole idea. Why spend the money to build out huge storage arrays when there are providers who will store data for us? It's only this little piece of content, and it doesn't matter if anything happens to it, so what's the harm? Why maintain a huge infrastructure for email when this provider will take care of everything for a dollar a mailbox? They handle the anti-spam and anti-virus and do the backups; we could save thousands. And so it crept along. We slowly moved more and more services and data outside our facilities and grew more comfortable with the idea.
We also lost track of where that data was actually located. When we were small and naive, we accepted that our data was physically stored in one place. But as we grew more sophisticated, we first requested and then demanded more redundancy and risk mitigation from our providers, and our data began to spread through the digital world. A single email message might now reside in twenty or more physical locations across the country or around the world. Do we really know exactly where?
And that poses the problem. We believe our data is safe, but how do we get it back? We treat it as an accepted truth that the data will always be there exactly when we need it, yet we have all experienced times when that was not the case. In September of 2009, Google's Gmail service was unavailable for a substantial period of time. As recently as March of this year, another major outage affected Gmail and other Google apps. It may seem like I'm picking on Google, but the fact is that they are a major player in this arena.
So the question still remains: if the service goes dead, where is the data? How can we get it back? How can we demand it back in the event of a legal action? The first two questions I've mostly seen answered via the ostrich strategy; organizations don't want to think about them because of the catastrophic implications of such an event. The last question I've seen addressed in a multitude of forums, on both the civil side and the law enforcement side. The answer is not as clear-cut as anyone would like, but most situations do find some form of resolution.
As we move more and more into the "cloud" and lose track of exactly where our data exists, we as an industry need to address these issues head on. We need to work out methods of tracking what's ours and how we can get it back; one small step in that direction is sketched below. Would we accept this level of access from our banks? Why, then, do we accept it from organizations that oftentimes hold assets of ours more valuable than our cash accounts: our data and intellectual property?
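To make that concrete, here is a minimal sketch of one such method: keeping an independent, locally stored copy of cloud-hosted mail by pulling it down over IMAP on a schedule. The hostname, account, and folder below are placeholder assumptions for illustration, and the script deliberately omits incremental sync, retries, and encryption; it is meant to show that "getting it back" can be a routine, automated habit rather than a crisis-time scramble.

# A minimal sketch: pull cloud-hosted mail over IMAP into a local archive.
# The host, account, and folder are assumed placeholders; substitute your
# provider's values. Not a complete archiving tool (no incremental sync,
# retries, or encryption).

import email
import imaplib
import os

IMAP_HOST = "imap.example.com"                   # assumed provider hostname
USERNAME = "user@example.com"                    # assumed account
PASSWORD = os.environ.get("IMAP_PASSWORD", "")   # never hard-code secrets
ARCHIVE_DIR = "mail_archive"

def archive_mailbox(folder: str = "INBOX") -> None:
    """Download every message in `folder` and save it as a raw .eml file."""
    os.makedirs(ARCHIVE_DIR, exist_ok=True)
    with imaplib.IMAP4_SSL(IMAP_HOST) as conn:
        conn.login(USERNAME, PASSWORD)
        conn.select(folder, readonly=True)       # read-only: never alter the source
        status, data = conn.search(None, "ALL")
        if status != "OK":
            raise RuntimeError(f"IMAP search failed: {status}")
        for num in data[0].split():
            status, msg_data = conn.fetch(num, "(RFC822)")
            if status != "OK":
                continue                         # skip messages we cannot fetch
            raw = msg_data[0][1]
            msg = email.message_from_bytes(raw)
            # Message-ID makes a stable, collision-resistant filename.
            msg_id = (msg.get("Message-ID") or num.decode()).strip("<>")
            safe_name = "".join(c if c.isalnum() else "_" for c in msg_id)
            with open(os.path.join(ARCHIVE_DIR, safe_name + ".eml"), "wb") as f:
                f.write(raw)

if __name__ == "__main__":
    archive_mailbox()

Run nightly from a scheduler, a script like this leaves you with a provider-independent archive you can hand to counsel or a forensic examiner without waiting on anyone else's outage timeline.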
Bradley J. Bartram is the Vice President of Information Technology and CTO for DIGITS LLC, one of the premier providers of forensic services in New York and the surrounding states. Brad has been employed in various capacities in Information Technology since 1996 and currently holds certifications as a Certified Electronic Evidence Collection Specialist (CEECS) and Certified Forensic Computer Examiner (CFCE). He blogs about information security, digital forensics, and eDiscovery matters.