Mapping and collecting data stored in an IaaS environment

Sure, I know where my data is…it’s right here on my “S” drive

In part one of this three-part series, we discussed some of the basic concepts behind, and the advantages gained, when organizations abandon the traditional model of data storage and move their IT infrastructure to Infrastructure-as-a-Service (IaaS) providers. However, once the data is in the hands of an IaaS provider, is it really where you think it is? The answer is a resounding “maybe.”

If you take a few minutes to peruse the service-level agreements for Amazon’s various service offerings, you will find that Amazon operates its data hosting services out of data centers in eight distinct regions across the globe: U.S. East (located in Northern Virginia), U.S. West (located in Oregon), U.S. West (located in Northern California), EU (located in Ireland), Asia Pacific (located in Singapore), Asia Pacific (located in Tokyo), South America (located in Sao Paulo) and a separate environment for the U.S. government. Therefore, when you sign up for Amazon’s services, you would naturally assume that your data will remain housed at one particular location within the region you chose. As it turns out, that both is and is not the case.

As an example of how data can be stored in one location, but not always be in that location, consider Amazon’s U.S. East region. A quick Google search will show you that in that region, which Amazon defines as being located in “Northern Virginia,” the company actually operates data centers in Ashburn, Va.; Miami, Fla.; and Newark, N.J. As with any other large data provider, Amazon presumably operates its data centers with some level of redundancy. I say “presumably” because Amazon is, with good reason, tight-lipped about its exact method of operation. It is that concept of redundancy that allows the data to both be in one particular location and also not in that location.

To drill down even further in this example, say your organization, located in Amazon’s U.S. East region, gets all of its IT services through Amazon, including storage space for individual users on file servers. One of these users, John Doe from human resources, has a spreadsheet that contains confidential information concerning the number and types of complaints filed against certain employees. To John Doe, that spreadsheet is found on what he sees as the “S” drive on his computer. On the back end, however, the primary copy of that spreadsheet could be stored on a hard drive in the Ashburn, Va. data center. It is also entirely possible that a complete copy, or parts of a copy, could be stored on other hard drives in the Ashburn, Va. data center, or on hard drives in the Miami and/or Newark data centers.

Therefore, when it comes to mapping where your data is located in an IaaS setting, you must always take the concept of redundancy into consideration. You can map out, with a high degree of certitude, where individual copies of files are located within your organization’s IaaS environment; however, you should always remember that there may be a redundant copy of those files somewhere within the IaaS provider’s systems.
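For readers who maintain a data map, the idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DataMapEntry` class, the `REGION_SITES` table and the file path are hypothetical names invented for this example, and the site list simply repeats the U.S. East locations mentioned above rather than anything Amazon publishes through an API. The point is that each mapped file carries both its known primary location and every other site where a redundant copy might reside.

```python
from dataclasses import dataclass
from typing import List

# Illustrative assumption: sites per region, copied from the article's
# U.S. East example. A real provider may not disclose this list.
REGION_SITES = {
    "U.S. East": ["Ashburn, Va.", "Miami, Fla.", "Newark, N.J."],
}


@dataclass
class DataMapEntry:
    """One file in the organization's data map (hypothetical structure)."""
    user_path: str      # where the user sees the file, e.g. the "S" drive
    custodian: str      # the person responsible for the file
    region: str         # the provider region the organization signed up for
    primary_site: str   # where the primary copy is believed to be stored

    def possible_locations(self) -> List[str]:
        """Primary site first, then every other site in the region where a
        full or partial redundant copy might exist."""
        sites = REGION_SITES.get(self.region, [])
        others = [s for s in sites if s != self.primary_site]
        return [self.primary_site] + others


entry = DataMapEntry(
    user_path=r"S:\HR\complaints.xlsx",  # hypothetical path
    custodian="John Doe",
    region="U.S. East",
    primary_site="Ashburn, Va.",
)

for site in entry.possible_locations():
    print(site)
```

Note that the primary site stays on the list of possible locations, because a redundant copy may sit on a different hard drive in the same data center.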

So does this mean that the whole concept of collecting data in a sound manner from IaaS environments is a moot point because you can never be certain that you are collecting everything? The short answer is no, and we will discuss why in the third and final installment of this series.

About the Author
Jonathan Fowler

Jonathan Fowler, EnCE, ACE, is the Director of Forensics at First Advantage Litigation Consulting. He can be reached at jon.fowler@fadv.com.
