Mapping and collecting data stored in an IaaS environment

Sure, I know where my data is… it’s right here on my “S” drive

In part one of this three-part series, we discussed some of the basic concepts behind Infrastructure-as-a-Service (IaaS) and the advantages organizations gain when they abandon the traditional model of data storage and move their IT infrastructure to IaaS providers. However, once the data is in the hands of an IaaS provider, is it really where you think it is? The answer is a resounding “maybe.”

If you take a few minutes to peruse the service-level agreements for Amazon’s various service offerings, you will find that Amazon operates its hosting services out of data centers in eight distinct regions across the globe: U.S. East (located in Northern Virginia), U.S. West (located in Oregon), U.S. West (located in Northern California), EU (located in Ireland), Asia Pacific (located in Singapore), Asia Pacific (located in Tokyo), South America (located in São Paulo) and a separate environment for the U.S. government. When you sign up for Amazon’s services, you would therefore naturally assume that your data will remain housed at one particular location within that particular region. As it turns out, that both is and is not the case.

To see how data can be stored in one location yet not always be in that location, consider Amazon’s U.S. East region. A quick Google search will show you that in that region, which Amazon defines as being located in “Northern Virginia,” the company actually operates data centers in Ashburn, Va.; Miami, Fla.; and Newark, N.J. As with any other large data provider, Amazon presumably operates its data centers with some level of redundancy. I say “presumably” because Amazon is, with good reason, tight-lipped about its exact method of operation. It is that concept of redundancy that allows the data to be both in one particular location and not in that location.

To drill down even further in this example, say your organization, located in Amazon’s U.S. East region, gets all of its IT services through Amazon, including storage space for individual users on file servers. One of these users, John Doe from human resources, has a spreadsheet that contains confidential information about the number and types of complaints filed against certain employees. To John Doe, that spreadsheet lives on what he sees as the “S” drive on his computer. On the back end, however, the primary copy of that spreadsheet could be stored on a hard drive in the Ashburn, Va. data center. It is also entirely possible that a complete copy, or parts of a copy, could be stored on other hard drives in the Ashburn, Va. data center, or on hard drives in the Miami and/or Newark data centers.

Therefore, when it comes to mapping where your data is located in an IaaS setting, you must always take the concept of redundancy into consideration. You can map out, with a high degree of certitude, where individual copies of files are located within your organization’s IaaS environment; however, you should always remember that there may be a redundant copy of those files somewhere else within the IaaS provider’s systems.
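As a rough illustration of that mapping idea, the sketch below records, for each file, the copy you can actually locate alongside every site in the region that might hold a redundant copy. It is only a sketch under stated assumptions: the region label `us-east`, the site list, the file path and the names `REGION_SITES`, `MappedFile` and `map_file` are all hypothetical and are not part of any Amazon tool or API.

```python
from dataclasses import dataclass, field

# Hypothetical site list, taken from the U.S. East example in this article.
REGION_SITES = {
    "us-east": ["Ashburn, Va.", "Miami, Fla.", "Newark, N.J."],
}


@dataclass
class MappedFile:
    logical_path: str   # what the user sees, e.g. a path on the "S" drive
    region: str         # provider region the organization uses
    known_site: str     # where the copy you mapped is believed to reside
    possible_sites: list = field(default_factory=list)  # may hold redundant copies


def map_file(logical_path, region, known_site):
    """Record the known copy and flag every site that may hold a redundant one."""
    # A redundant copy could sit in the same data center as the known copy,
    # so the known site stays in the list of possible sites.
    sites = list(REGION_SITES.get(region, []))
    return MappedFile(logical_path, region, known_site, sites)


# Hypothetical entry for John Doe's spreadsheet.
entry = map_file("S:\\HR\\complaints.xlsx", "us-east", "Ashburn, Va.")
print(entry.known_site, "| possible redundant copies at:", entry.possible_sites)
```

Keeping the known location and the possible locations in separate fields makes the uncertainty explicit: the map states what you can verify and what you can only presume.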

So does this mean that collecting data in a sound manner from IaaS environments is a moot point because you can never be certain that you are collecting everything? The short answer is no, and we will discuss why in the third and final installment of this series.

Contributing Author

Jonathan Fowler

Jonathan Fowler, EnCE, ACE, is the Director of Forensics at First Advantage Litigation Consulting. He can be reached at jon.fowler@fadv.com.
