Data Categorization and the Use of Data Labels
Putting in place effective and appropriate controls for information systems requires an understanding of the nature of the information. In this regard, sensitive or otherwise valuable data should be categorized to support data security. By identifying data according to sensitivity, one can implement various strategies to better protect such data. Unfortunately, understanding what other cloud data may require protection may not always be clear.
Data that a user chooses to store in the cloud may not require protection if it is not sensitive or if it can easily be recovered. But generally, protecting data is a universal requirement regardless of its value, if for no other reason than failing to do so leads to all manner of complexity, consequence, and mischief.
In identifying and categorizing data, what we face is a multifaceted problem. Besides identifying classes of information that are sensitive or otherwise have value and labeling such information according to its characteristics, we need to protect such data, usually by means such as file permissions, encryption, or more sophisticated container approaches. We also need identity-based access controls to support organizational access policies.
Procedures are also necessary for security across phases of the data life cycle, for instance, to limit exposure of such data when we create copies or backups. Also, we need mechanisms to detect when the valuable resource is accessed in ways that warrant concern.
Data or information labeling is one information security technique that has been used to great success for classified information such as the hierarchical categories of Unclassified, Confidential, Secret, Top Secret, and Compartmented. Labeling also supports non-classified and non-hierarchical categories such as Finance, Business Strategy, and Human Resources. The objective of information identification and categorization is to put in place an information-centric framework for controls and data handling.
SELinux and Trusted Solaris are two example operating systems that support information categorization and access enforcement for U.S. Department of Defense style mandatory access controls (MAC). Briefly, this amounts to sophisticated access enforcement by the OS and network controls.
At the heart of MACbased security are two concepts. First, every file, discrete piece of data or network connection is marked to bound its security level with a label that the OS uses to enforce access. Second, every subject (user or process acting on behalf of a user) has a set of permissions including clearances and roles. The OS mediates all operations that subjects perform against data enforcing complex logical security operations. Although this may sound complex, and while such enforcement technology must be implemented with correctness and completeness, the concept is quite simple and the benefits enable a simplification of what otherwise would be highly complex and prone to error alternative implementations.
The Ostrich Approach (or How I Learned to Hide My Head in the Sand)
In contrast to identifying sensitive data, there are many consequences when you uniformly treat all data as being equal in sensitivity or value. Without any data sensitivity oriented controls, a relatively small percentage of sensitive data is mixed in with far more nonsensitive data and is accessible to anyone with overall access. Failing to identify sensitive data complicates incident resolution and can be problematic when compromised data includes data subject to regulatory controls.
There is one misguided school of thought about this, and it can be described as the notion of hiding valuables in plain sight and hoping for the best. This is a strategy that is doomed even at the level of an individual computer used by multiple parties.
By example, one might think that credit card data can be discretely squirreled away in a file and almost impossible to locate via a search if the file system has enough files. However, such data follows defined regular patterns both in terms of the number of digits and key digits of the number. Searching for well-known strings is trivial with a computer, and because of this, several pieces of spyware do exactly that by first identifying strings such as a credit card number or a social security number and then extracting enough characters around these prizes to obtain expiration date, associated names, and along with other personal data.
Over Use of Classification
A second problem with sensitive information is a common inclination to classify or label everything as sensitive or for instance Secret. But over classification can lead to a reduction in care in handling actually sensitive data. What we need is a balance in managing sensitive information and sound strategies for protecting the data.