What is "dark data" and what dangers does it entail?

Every day, our data is stored on the Internet and partly used for targeted advertising, but only a fraction of it is actually processed. The rest, the so-called dark data, is forgotten. Companies like Google have awakened some of this dark data from its slumber. But it can also fall into the wrong hands, warns scientist Iris Lorscheid from the University of Applied Sciences Europe in Hamburg: "We're potentially getting a 'digital label' attached to us that we can't get rid of so easily."

In the digital age, data is the fuel for many companies and drives entire industries - especially in online marketing. This maxim has long been undisputed. Iris Lorscheid, Head of Digital Business and Data Science at the University of Applied Sciences Europe in Hamburg, has set herself the goal of educating people about dark data. In other words, about data that exists about us but cannot (yet) be evaluated. In the following interview, she describes the opportunities and risks of this data:

Ms. Lorscheid, data is stored on the net every day, but only a fraction of it is really processed. The rest is called dark data. How high is the share of dark data on the World Wide Web? Can this be quantified?

With our stored data, we have long since left the billions behind and arrived in the zettabyte universe: a zettabyte is a 1 followed by 21 zeros. It is not difficult to imagine that this far exceeds our resources and current technological possibilities for analysis. The discrepancy between the data we use and the data we collect and manage is growing. According to an IBM study, more than 80 percent of all data was dark in 2015, and that share was expected to rise to 93 percent by 2020.

What exactly is dark data?

The term describes the fact that we collect significantly more data than we analyze. One also speaks of sleeping data. It simply sits there, and it remains completely open what we can do with it or what purpose it could serve. And this is where it gets exciting. From a business perspective, the question arises: How much does it cost us not to derive value from this data, and what potential lies in it? From a social perspective, the question is: What analysis options will we have one day, and what consequences will follow from these analyses?

Let's take a concrete example: What dark data could an Apple Watch collect about me?

An Apple Watch and similar fitness trackers give me feedback on my activities, so I'm happy when I reach my training goals or take an extra walk. The primary purpose is therefore to improve my personal health and well-being. But there are a number of other ways that this exercise and activity data can be used. In purely theoretical terms, it would be possible to reconstruct a person's entire everyday life and analyze their behavior in certain situations.

Why exactly does this dark data emerge at all? Why don't providers collect only the data they really need?

Most dark data is created by the large amount of information fed in by the many sensors in our world - from our online behavior, social media activities, smartphones, and increasingly from the Internet of Things. We have the technical means to collect data, so we do - this is often the first maxim in companies. Even if the collection is costly and the growing server farms are an ecological burden, no one wants to miss the boat on digital transformation, and that means collecting data first.

Are there companies that use dark data sensibly?

A very positive example that we all know is Google Traffic as part of Google Maps. Google initially only had GPS data from smartphones available as raw data, and found a way to turn this initially dark data into accurate and up-to-date traffic information on Google Maps. The GPS data is not collected for this purpose, but Google derives valuable information from data that is sent anyway. In various companies, prediction models based on dark data are already helping to optimize processes. Shell, for example, can make the expensive maintenance and servicing of oil platforms much more efficient on the basis of inventory analyses. This already extends into agriculture, where livestock farms can use their dark data to analyze their own situation.

This blog post was translated from the following article. Continue reading to learn all about dark data, the danger of its misuse, its advantages and more:
