I am often told to relax or switch to decaf when I make a big deal about the words we use in our industry. I can usually walk away from most of the confusion, but the one I cannot let go is the difference between data, information, and knowledge. This is a personal thing; I have come to know these terms through other domains like Library and Information Science (LIS) and the Knowledge Management industry.
What is the big deal? It is not a big deal until it is. It's like saying, "This book is purely fictional except for those parts which are not." In your line of work, you may not need a more granular descriptor at this point, but for what I have been doing these past six years, I need all the help I can get describing these intangible notions that have no physical properties. Defining information as “stuff” is just not helpful.
We in information security and risk management have much to learn from other domains like Library and Information Science (LIS). I have been a very good student and I’m right in the middle of a great book on the Philosophy of Information; oh my god it is awesome. You may be thinking, “Philosophy of Information, give me a break. I have a real job and some real problems to solve.” Great, and when you are done solving the problem at hand, think about other domains, like the field of law: while the majority of the field is made up of practitioners who serve the market, there is a small minority concerned with the intellectual underpinnings of the system. That minority is made up of legal theorists and philosophers, including the U.S. Supreme Court justices and the like. What I am saying is that what you may view as an unnatural imbalance in the community of experts is very natural and works quite well in other domains that face similar problems.
So let’s get back to this exploration of the difference between the terms data, information, and knowledge. Even with my years of concentration on this subject matter, I have only scratched the surface but intend to be up to my neck in the Philosophy of Information as it stands today in other fields.
Data
Data are described as a set whose members are distinct from one another but lack context beyond just their presence or absence. For example: 20 IP packets, 300 vulnerabilities, and 600 attacks. Value is created at this level by the sheer ability to capture the phenomenon, nothing more and nothing less. Through some function X, data are transformed into information.
Information
We have come to understand information as an emergent form, present when data are presented in context and an informational connection is made between the observer and that which is observed. Data from multiple domains are related and presented as a single form: information. Included in this synthesis are temporal factors that change the resolution of the presentation. Using the same examples as above: "The first 20 packets from a TCP flow established between machine A and machine B", "300 distinct vulnerabilities affecting our web-services over the past 5 years", "600 attacks originating from our servers".
Knowledge or Intelligence
A form of yet another higher order is knowledge or intelligence. I have found these two terms interchangeable, with the public sector favoring the term intelligence and the private sector the term knowledge. Following the structure so far, knowledge, then, is data in context in context: the observer understands the information in a context that is broader than what is presented at the time of observation. An example would be "Last night at 0100 hours, our sensors recorded 600 attacks originating from our extranet servers with a destination of company X, but the first 20 packets from a TCP flow established between machine A on our end and machine B at company X showed that none of the attacks were exploiting the 300 distinct vulnerabilities affecting our web-services over the past 5 years."
As you can see, the value at each logical level is different depending on the processes you are involved in. The skill is to be able to jump around this cognitive model; with every movement, you the observer grow your knowledge at a rate that is beyond the sum of what is being presented.
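One way to picture the three levels is as nested types. What follows is only a rough sketch in Python, not a formal model: the names Datum, Information, Knowledge, contextualize, and synthesize are hypothetical, and contextualize merely stands in for the "function X" mentioned above. Real context is, of course, far richer than a string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datum:
    """A bare observation: a count of something, with no further context."""
    count: int
    kind: str


@dataclass(frozen=True)
class Information:
    """A datum placed in context for an observer."""
    datum: Datum
    context: str

    def describe(self) -> str:
        return f"{self.datum.count} {self.datum.kind} {self.context}"


@dataclass(frozen=True)
class Knowledge:
    """Pieces of information related to one another in a broader context."""
    parts: tuple
    finding: str


def contextualize(datum: Datum, context: str) -> Information:
    """Hypothetical stand-in for 'function X' (data -> information)."""
    return Information(datum, context)


def synthesize(parts: list, finding: str) -> Knowledge:
    """Hypothetical second step (information -> knowledge)."""
    return Knowledge(tuple(parts), finding)


# The three data examples from the post, still without context.
data = [Datum(20, "IP packets"), Datum(300, "vulnerabilities"), Datum(600, "attacks")]

# The same data, each placed in context.
contexts = [
    "from a TCP flow between machine A and machine B",
    "affecting our web-services over the past 5 years",
    "originating from our servers",
]
information = [contextualize(d, c) for d, c in zip(data, contexts)]

# Information related within a broader context.
knowledge = synthesize(
    information,
    "none of the attacks exploited the known vulnerabilities",
)

for item in information:
    print(item.describe())
print(knowledge.finding)
```

Notice that the numbers 20, 300, and 600 never change from one level to the next. What changes is how much surrounds them, and that is where the value at each level comes from.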
The form knowledge has some very peculiar properties that are worth mentioning. As we move further and further away from an economy based on rival goods, these properties will no longer be in the background and will be central to our discourse.
[This collection noted by N. Wiener, A. Toffler, J. Piaget, and others; comments by TK]
Knowledge is inherently non-rival
If I give it to you, I still have it. Compare this with a rival good: in that transaction, I sell you item A, which I then no longer have, and you pay me item B, which you then no longer have.
Knowledge is intangible
We can’t apply the domain of physics to it, but that does not mean we cannot manipulate it.
Knowledge is non-linear
As we begin to develop more and more of an informational understanding of nature itself, we can see that non-linear patterns are much more common than linear patterns. Even in business, tiny insights can yield huge outputs.
Knowledge is relational
An observer attains meaning only when knowledge is held in some ratio to other knowledge.
Knowledge mates with other knowledge
This growth is exponential because the more there is, the more synthesis and analysis can be performed, and the more new knowledge is created, which is then fed back into the system.
Knowledge is observer centric
There is a hermeneutic principle that knowledge follows: the hearer, not the speaker, determines the meaning of an utterance. Piaget was quoted as saying, “He who organizes his experiences organizes the world.”
Knowledge is explicit or implicit, expressed or not expressed, shared or tacit.
It is at the very edge of our human knowing.
All of this research was done in the 1950s, and much of it has still not been applied because our community still suffers from what my buddy David Mann calls “Physics-envy”. The sooner we let go of the paradigms and language of the industrial age, the better. It really does not matter whether you agree or not; it has already begun. Everything around us, our media, our social networks, our bodies, is transcending to a data/information/knowledge representation. I have a few ideas on how to go about managing risk and certainty that may or may not work out, but I can tell you that the methods we are using today are in their sunset years.
--TK