Earlier, in industrial and post-industrial societies, gas and oil were seen as the things of greatest value. Now, in the age of the information society, data has replaced petrol as the economy’s driving force. The reason is that, with the help of Big Data, companies can significantly improve production efficiency and business economics.
That’s why today you often hear data compared with gas, oil, gold, and other treasures of humanity.
Historical background: The first person who compared data to oil was Clive Humby. But the context was quite different and not connected with monetization. He meant that data is useless until it is refined and put to work. Thus, to become like gas or oil, data has to be collected, analyzed, interpreted, and used.
Why is data compared to oil?
Data today is a vital business asset. The most prominent companies in the world, like Microsoft, Amazon, Apple, and Facebook, hold amounts of data that are hard to even imagine.
When people use their products or services, these corporate giants collect information about every aspect of their behaviour and decision making. Then they adjust their products and marketing strategies according to the data they have received and analyzed. That makes them even more prosperous and more powerful.
Thus, companies that base their development strategy on large volumes of data get 5% greater revenue than companies that do not.
The parallel with oil holds here too: it brings value only after it is refined and used.
Open data issue
Open data is everything you can reach without a password: social media, blogs, forums, dating apps, etc. The term ‘big data’ is now so widely used that it is hard to tell what it actually refers to. So when I say big data, I mean exactly this: more than a billion rows or more than a petabyte.
We’ll start by looking at how people picture big data. Most think of it as a very large amount of something. But in reality, all big data technologies are built on diverse data, and a giant picture is assembled from these small pieces of collected data.
For instance, most people never check the background of the photos they post on social media. Yet the background accounts for 60–70% of all the insights that can be obtained about a person: whether the apartment has been renovated, which hints at the level of income; what can be seen from the window; and all kinds of landmarks that help determine geolocation. From these small photos, a clever algorithm can assemble a complete picture of what is around the person.
So if you are taking pictures for social networks, check the background.
The ethics of data collection: cases
What other sources make it easy to collect information about you? The collector can be a person, or it can be an algorithm that targets ads at you. Of course, the most popular sources are social networks, blogs, forums, and e-commerce platforms.
The source that attracted the most interest in 2016-2017 was Tinder. The point is that dating apps like it show the distance to a person. Anyone who is more or less familiar with mathematics knows that, given the distance to a person from several points, simple methods can determine that person’s location. Of course, services and social networks never show the real location. To protect the user, they indicate only an approximate distance, so a single reading does not reveal where exactly the person is. But the average lonely person updates Tinder about 18 times a day. By collecting these updates, you can understand how the person moved and where they were.
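To make the "simple methods" concrete, here is a minimal sketch of trilateration: recovering a point from its distances to three known points. It is an illustration under simplifying assumptions, not a description of how any particular app or tool works. The function name `trilaterate`, the flat x/y coordinates (instead of latitude and longitude), and the sample numbers are all hypothetical.

```python
from math import hypot


def trilaterate(p1, d1, p2, d2, p3, d3):
    """Estimate (x, y) from distances d1..d3 to known points p1..p3.

    Subtracting the first circle equation from the other two removes the
    squared unknowns and leaves a 2x2 linear system, solved here with
    Cramer's rule. Coordinates are assumed to lie on a flat plane.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3

    # 2(xi - x1)x + 2(yi - y1)y = d1^2 - di^2 + xi^2 - x1^2 + yi^2 - y1^2
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2

    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        # Three points on one line cannot pin down a unique location.
        raise ValueError("observation points are collinear")

    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y


# Hypothetical example: the true position is (3, 4), in arbitrary units.
observers = [(0, 0), (10, 0), (0, 10)]
distances = [hypot(3 - ox, 4 - oy) for ox, oy in observers]
estimate = trilaterate(observers[0], distances[0],
                       observers[1], distances[1],
                       observers[2], distances[2])
print(estimate)  # close to (3, 4)
```

With exact distances the estimate lands on the true point. When distances are rounded, as the apps do, a single estimate is only approximate, which is why repeated readings over time matter in the scenario described above.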
Geolocation is the first item on the list of things that can be learned about a person. Under personal data law, it is treated like your home address: it is personal information that no one else should know. In social networks, for example, it is not publicly accessible. But if we take all your posts and look at where you were, then for the majority of users, really for 99%, 80% of geotags fall into two clusters: home and work. As a rule, the cluster closer to the city center is work, and the other one is home. There are exceptions, of course, but they are at the level of statistical error.
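The two-cluster observation can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a plain two-centroid k-means (a technique chosen here, not one named in the text), treats geotags as flat x/y points, ignores the outlying 20% of geotags, and takes the city center as a given coordinate. The names `two_clusters` and `label_home_work` and all sample values are hypothetical.

```python
from math import dist


def two_clusters(points, iterations=20):
    """Split points into two groups with a basic 2-means loop.

    Returns (centroids, groups). The starting centroids are the first
    point and the point farthest from it, which keeps the run deterministic.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")

    first = points[0]
    second = max(points, key=lambda p: dist(p, first))
    centroids = [tuple(first), tuple(second)]
    groups = ([], [])

    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        groups = ([], [])
        for p in points:
            nearest = 0 if dist(p, centroids[0]) <= dist(p, centroids[1]) else 1
            groups[nearest].append(p)

        # Move each centroid to the mean of its group (keep it if empty).
        updated = [
            tuple(sum(axis) / len(group) for axis in zip(*group)) if group
            else centroids[i]
            for i, group in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated

    return centroids, groups


def label_home_work(centroids, city_center):
    """Apply the article's rule of thumb: the cluster nearer the center is work."""
    work, home = sorted(centroids, key=lambda c: dist(c, city_center))
    return {"work": work, "home": home}


# Hypothetical geotags: three near (10, 10) and three near (1, 1).
geotags = [(10, 10), (10.1, 9.9), (9.9, 10.1), (1, 1), (1.1, 0.9), (0.9, 1.1)]
centroids, _ = two_clusters(geotags)
print(label_home_work(centroids, city_center=(0, 0)))
```

Real geotags would need distances computed on the Earth's surface and some handling of outliers. The sketch only shows why a handful of tagged posts is enough for the two dominant locations to stand out.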
Another example: a couple of years ago, the free distribution of the Windows 10 operating system was launched. Those who rushed to install the OS on their personal computers were a little disappointed in the end.
A number of “controversial points” were found in the agreement. The first was that Microsoft collects and stores browsing history, website passwords, and data about the user’s access points. This is done through the connection to a Microsoft account. The option can be disabled, but in Windows 10 it is enabled by default. As a result, the user can become a victim of fraudsters and, in effect, voluntarily hands data over to American courts and intelligence services: under US law, IT companies are required to disclose this data at the authorities’ request.
Few people like the fact that Windows 10 lets third-party advertisers track users’ online behavior. Perhaps the most questions are raised by one clause of the license agreement, which states the following:
In general, everyone is used to the fact that apps collect data about users. Most of the time, the data collected by Internet companies is used to make apps work properly. For example, when you order a taxi or build a route to a destination, the program asks for your location in order to process the request correctly and provide relevant information.
Of course, data and information can’t replace gas or oil. However, their value to business and the economy is fully comparable.