Executing Search Warrants in the Cloud
By John M. Cauthen
Traditionally, evidence has taken one of two forms: a physical item, like a document, or testimony from a person who witnessed an event. Overall, collecting this evidence was no more difficult than collecting a shell casing at a homicide scene. Since the advent of computers, all this has changed. A great deal of evidence is now in digital format.
Law enforcement has been keeping up with changing technology, and officers are becoming skilled at finding and collecting the digital evidence on computers and cell phones. The legal system, however, has had difficulty keeping up with law enforcement in recognizing new technologies.1
Today both law enforcement and the legal system face a new challenge—digital evidence distributed in the cloud.2 Technology requires investigators to change their methods from traditional passive searches to a new model focused more on live recovery.3
Digital evidence is powerful and compelling. Some examples include GPS data showing when and where kidnapping victims have travelled; voice calls, e-mails, and chat messages indicating criminal intent; and financial records revealing the proceeds of fraud. Because of the power of digital evidence, investigators go to great lengths to obtain and analyze it.
Investigators are familiar with recognizing items that contain potential digital evidence, such as computers, USB drives, and cell phones. In these cases they can identify a specific physical location where the data is stored, go to the location, seize the data, and analyze it back in the laboratory.
In the 1980s in response to new technologies, such as e-mail and Web-based servers where the electronic data were not in a known physical location, federal law was adapted to allow law enforcement searches with the host company performing the search and sending the results to the investigator. Today the growing complexity of cloud computing may be overcoming even this adaptation.
Special Agent Cauthen is assigned to the FBI’s Sacramento, California, office.
A TWOFOLD PROBLEM
Executing law enforcement searches in a cloud-computing environment presents a twofold problem. First, little, if any, data pertaining to a computer user is found in a single geographic location. Second, and more important, even when the data is recovered, it may not be convertible to a format that a human reader can understand.
Today most search warrants for the seizure of digital evidence reference a particular location.4 This is understandable because many lawyers and judges do not have a strong understanding of how digital evidence works. They often compare the hard disk drive of a computer to a filing cabinet. Filing cabinets contain files and folders, just as computers do, so the comparison is easily understood. Filing cabinets also are located in a specific place, so search warrants for data often are crafted like ones for information in a filing cabinet. However, unlike with a filing cabinet, the investigator might not know the specific location of the digital data before the search begins.
In large businesses and government enterprises, computer users often are connected to a network via a computer that simply functions as a terminal. The computer may contain a hard drive with an operating system, but the majority of files are maintained on another computer located elsewhere. From the user’s vantage point, the files appear to be stored on the computer. The user can see the files on the screen, launch applications, and download and store data. However, behind the scenes the data actually is stored on servers located somewhere else on the network. E-mails are stored on an e-mail exchange server, documents and pictures are saved on another server, and applications and software are saved on yet another. If the investigator merely searches the user’s computer, very little data will be found because all the important records actually are stored on other network computers.
Cloud computing is a similar concept, except that files are stored somewhere on the Internet instead of on the corporate network. Generally, the user rents cloud services from a provider who maintains the software and data storage facilities, which could be at a nearby data center, spread out over multiple data centers, or stored in foreign countries. The problem is that finding where this data is physically stored can be very difficult—even the user might not know where it is. If the user is connected with limited control, as is common with tablets and cell phones, that person might not even have the ability to determine where the data is physically located. Likewise, due to service-level agreements, the service provider might have physical access to the data but not have the ability to search or recover it as data often is encrypted with a key possessed only by the user.
If the investigator is seeking files stored in the cloud, merely seizing and searching a computer or tablet used to connect to the cloud may not provide the needed data. At best, it might show that a connection to the data existed in the past, similar to searching a building or office where a file cabinet used to be.
The laws governing search warrants may be challenged by the increased complexity of cloud computing. Currently, two types of warrants are used for criminal searches. The first is the traditional search warrant under FED. R. CRIM. P. 41, which covers a search of a particular location. The second is the search warrant under 18 U.S.C. §2703, under which the court may issue a warrant for records held by cloud providers in another district.5 Traditionally, investigators use warrants under §2703 for searches of e-mail accounts: the provider performs the search and supplies all of the subject’s e-mails, which the investigator then examines for content authorized within the scope of the warrant.
For data overseas, investigators may use legal process according to the laws in the host country. In one situation, while serving a search warrant for data on a computer in the United States, investigators were able to see that there was a direct link to data located in a foreign country. They took the opportunity to download the data from the computer located overseas. In this instance the court found that the data could be used against the subject at trial. However, investigators should be aware that executing an international search without permission of the host country could cause other problems.6
Even in circumstances where the investigator manages to identify the location of the data in the cloud and seizes it with appropriate legal authority, it still may not be usable. This occurs due to the increased use of virtualization, encryption, and relational databases.
Virtualization is the concept of using a single computer to run multiple operating systems. Individuals can use virtual machines on their personal computers, but they are more commonly seen in large networks and, particularly, in cloud computing. From an investigator’s perspective this can be a problem, whether or not the virtual machine is in the cloud, because the data in a virtual machine is physically stored in a way that makes it viewable only when the virtual machine is turned on.
In cloud computing the entity maintaining physical control over the computers containing the virtual machines most likely does not have the passwords needed to operate these virtual machines. Passwords often are required to log in to a virtual machine or decrypt files needed to view data. Investigators might be able to locate the physical server hosting the data, but they cannot view the data because it is contained in a virtual machine that requires a password.
Encryption is becoming more prevalent due to high-visibility data breaches. Data is commonly encrypted both when it is stored and when it is moving across a network. Service-level agreements between the cloud provider and the user often contain guarantees of encryption and prevent the provider or anyone other than the user from accessing unencrypted data. Cloud computing demands high levels of encryption because the users insist that only they, and no one else, can see their data. This means that the investigator may be able to locate the physical server hosting the data and seize the data but be unable to view it because it is encrypted.
Relational databases are a method of storing data in files based on relationships between pieces of data. The relational database concept is difficult to communicate to judges and juries. As an analogy, a coffee vending machine contains all the ingredients for coffee: water, coffee, powdered milk, sugar, and specialty ingredients. The user can press a button to make a mocha latte, and the vending machine automatically selects the correct ingredients to create this type of coffee. However, if the user merely conducted a search inside the machine for all mocha lattes, there would be none, only ingredients. Relational databases work the same way, except that instead of ingredients there are data fields, and the database software plays the role of the machine, putting the parts together to respond to a user’s query. Relational databases can exist in the cloud or over large networks. The data fields are not necessarily stored on a single computer; however, the database software permits the user to input data and make queries from a single computer. This software determines where the data is stored and how it is retrieved in a form viewable to the user. Without the correct software, the data would appear to be a series of meaningless data fields.
A worst-case example highlights these problems. In a healthcare fraud investigation, the investigator is seeking data regarding a doctor who is fraudulently prescribing medications. The investigator knows that the hospital keeps important records on a computer database. In this database one file might contain patients’ names, another could list their addresses, a different one would record prescribed medications, and the final file may be information about the patients’ doctors. Software allows medical personnel to input data and make queries. The investigator might need a list of all patients of a particular doctor who have been prescribed a certain medication. If the investigator simply goes to the doctor’s office with a search warrant for all the computers on the premises, the results may not be as expected.
The computer in the examination room used by the doctor to enter data and read medical records may not store any patient data. This computer may simply be a terminal used to access the data, which may be stored in various locations. For example, patient names and addresses might be on a computer located out of state in a central computer room running virtualization, while drug records may be maintained by an outside vendor and linked to the patient by a coded value. At the same time, the doctor’s records could be kept on a computer in the medical company’s human resources department. All of these records can be tied together by a relational database application maintained onsite. Most likely, all of the data will be encrypted; even if any single computer is seized, searched, and decrypted, the data still will be a meaningless list of unrelated records. The data makes sense only if the entire set is viewed at the same time using the correct software and passwords. Due to the nature of cloud computing, the use of encryption, and the operation of relational databases spread out over various locations, the data cannot merely be seized and examined back in the laboratory.
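The hospital example can be sketched in a few lines of Python using the standard sqlite3 module. This is only an illustration under stated assumptions: the table names, column names, patient codes, and rows are all invented, and a single in-memory database stands in for files that, in the scenario described, would sit on separate computers in different locations. It shows that one file viewed alone holds only coded values, while the query the investigator needs requires all of the pieces plus the software that joins them.

```python
import sqlite3

# Hypothetical schema and data, invented purely for illustration.
# One in-memory database stands in for files kept on separate systems.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient_names (patient_code TEXT PRIMARY KEY, name TEXT);
CREATE TABLE patient_addresses (patient_code TEXT, address TEXT);
CREATE TABLE doctors (doctor_id INTEGER PRIMARY KEY, doctor_name TEXT);
CREATE TABLE prescriptions (patient_code TEXT, doctor_id INTEGER, medication TEXT);

INSERT INTO patient_names VALUES ('P-001', 'Patient A'), ('P-002', 'Patient B');
INSERT INTO patient_addresses VALUES ('P-001', '1 Example St'), ('P-002', '2 Example Ave');
INSERT INTO doctors VALUES (10, 'Dr. X'), (11, 'Dr. Y');
INSERT INTO prescriptions VALUES
    ('P-001', 10, 'Drug Q'),
    ('P-002', 11, 'Drug Q'),
    ('P-002', 10, 'Drug R');
""")


def prescription_file_alone():
    """Rows as they would appear if only the prescription file were seized."""
    return conn.execute("SELECT * FROM prescriptions").fetchall()


def patients_of_doctor_on_medication(doctor_name, medication):
    """All patients of one doctor who were prescribed one medication."""
    query = """
        SELECT n.name, a.address
        FROM prescriptions AS p
        JOIN doctors AS d ON d.doctor_id = p.doctor_id
        JOIN patient_names AS n ON n.patient_code = p.patient_code
        JOIN patient_addresses AS a ON a.patient_code = p.patient_code
        WHERE d.doctor_name = ? AND p.medication = ?
        ORDER BY n.name
    """
    return conn.execute(query, (doctor_name, medication)).fetchall()


# The isolated file contains only codes and identifiers, no names.
print(prescription_file_alone())

# The joined query returns the readable answer.
print(patients_of_doctor_on_medication("Dr. X", "Drug Q"))
```

The first result contains only patient codes and doctor identifiers; the patient's name and address appear only once the query ties all four tables together.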
The most common option is to serve a search warrant on the cloud provider under §2703 as already is done for e-mail providers. This can work with a simple situation, like an e-mail provider, and is beneficial for providing transactional records, such as payments. However, cloud providers often do not have complete access to the customer data stored on their systems and may not be able to provide data in a complete or usable format.
An alternative is for the investigator to search in the same manner as the user would—with the computer turned on and connected to the data. In this example, the investigator needs access to the subject’s computer with the relational database software and connection to the cloud. The investigator could consider combining two search warrants—one on the computer owner for the location being searched under Rule 41 and one on the cloud provider under §2703 for the content to which the computer is connected.
With this approach, the investigator will need to understand how to operate database software and make queries. These queries must comply with the search warrant. The investigator must conduct the search carefully as actions taken on a live system will change the data on the computer. Using this method, it may be possible to obtain a single search warrant combining the provisions of Rule 41 and §2703; however, it should be noted that there is no case law yet on implementing this strategy.
Cloud computing adds complexity and numerous considerations to the execution of a search warrant. The fact that a subject’s data is stored in numerous locations, may be encrypted, and could require specialized software to view increases the difficulty. The future may require law enforcement to use the subject’s own computer and software to execute search warrants. This will require specialized knowledge and training. Although a challenge for the investigator, it already is a reality for successful large-scale searches involving companies, hospitals, and government entities.
Where is the Data?
|On a Network in a Single District|On a Network in Multiple Districts|On a Network with Data Stored Internationally|Unknown Where the Data is Stored (Cloud)|
|Search under Rule 41; consider noting in affidavit the possibility of other locations|Multiple search warrants for each district with data or §2703 Warrant served on service provider|Use legal process required in country hosting the data, or consider accessing data remotely with a search warrant under Rule 41|Search under Rule 41 for subject computers, and concurrently search under §2703 served on service provider|
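The table above can also be read as a simple lookup from where the data is believed to reside to the search option listed for that case. The Python sketch below is only a restatement of the table for illustration; the enum and function names are hypothetical, and the option text is copied from the table, not expanded into legal guidance.

```python
from enum import Enum


class DataLocation(Enum):
    """Hypothetical labels for the four columns of the table."""
    SINGLE_DISTRICT = "on a network in a single district"
    MULTIPLE_DISTRICTS = "on a network in multiple districts"
    INTERNATIONAL = "on a network with data stored internationally"
    UNKNOWN_CLOUD = "unknown where the data is stored (cloud)"


# Option text taken from the table; nothing is added beyond it.
SEARCH_OPTIONS = {
    DataLocation.SINGLE_DISTRICT: (
        "Search under Rule 41; consider noting in affidavit "
        "the possibility of other locations"
    ),
    DataLocation.MULTIPLE_DISTRICTS: (
        "Multiple search warrants for each district with data "
        "or §2703 Warrant served on service provider"
    ),
    DataLocation.INTERNATIONAL: (
        "Use legal process required in country hosting the data, or consider "
        "accessing data remotely with a search warrant under Rule 41"
    ),
    DataLocation.UNKNOWN_CLOUD: (
        "Search under Rule 41 for subject computers, and concurrently "
        "search under §2703 served on service provider"
    ),
}


def search_option(location):
    """Return the option the table lists for the given data location."""
    return SEARCH_OPTIONS[location]


print(search_option(DataLocation.UNKNOWN_CLOUD))
```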
How is the Data Stored?
|Encryption:|Seize the encrypted files and decrypt them using a password or key and the appropriate decryption software.| |
|Virtualization:|Seize the virtual image file and open it with the correct password.|Log into the virtual machine and seize the data while the virtual machine is turned on and in an unencrypted state.|
|Relational Database:|Seize all the files containing records. Obtain a copy of the database software and rebuild the database.|Log into the database while it is live and employ the application used to create and manage the database as a search tool. Download the data using the method allowed by the application, either in the form of printouts or data files.|
For additional information Special Agent Cauthen may be reached at firstname.lastname@example.org.