Big Data and non-obvious connections

// 01.12.2022

Another significant achievement of our time is Big Data. The term arose when widespread computerization produced gigantic arrays (petabytes) of homogeneous information collected over long periods of time.

The key is the digital format. Huge amounts of information had been accumulated for decades before, but all of it existed only on paper, so it was virtually impossible to summarize it, analyze it, or find connections in it.

Once databases came into mass use, structured information and the ability to single out key attributes in "key-value" form made it possible to run arbitrary selections against data sets gathered over, for example, several years.

Banks, insurance companies, airlines, retailers, mobile operators, and logistics companies, all of which generate huge amounts of homogeneous information every second, turned out to be the custodians of such arrays.

Big Data analysis makes it possible to identify seasonality and the consumer preferences of different focus groups, to build and calculate various probabilistic forecasts, and, most importantly, to find non-obvious connections.
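As a rough illustration of how seasonality might be surfaced, here is a minimal Python sketch. The sales records and the seasonal_index helper are invented for this example and are not taken from any real data set; the idea is simply to compare each month's average sale with the overall average.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sales records: (date of sale, amount). Invented for illustration.
sales = [
    (date(2021, 1, 15), 100.0),
    (date(2021, 7, 10), 300.0),
    (date(2021, 12, 20), 500.0),
    (date(2022, 1, 12), 120.0),
    (date(2022, 7, 8), 280.0),
    (date(2022, 12, 22), 540.0),
]


def seasonal_index(records):
    """Return {month: average sale in that month / overall average sale}."""
    by_month = defaultdict(list)
    for day, amount in records:
        by_month[day.month].append(amount)
    overall = sum(amount for _, amount in records) / len(records)
    return {
        month: (sum(values) / len(values)) / overall
        for month, values in sorted(by_month.items())
    }


index = seasonal_index(sales)
# The month with the highest index is the seasonal peak in this toy data.
peak_month = max(index, key=index.get)
```

An index above 1 means the month is stronger than average, below 1 weaker. On real data the same idea would need many more years of records before the pattern could be trusted.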

Non-obvious connections are dependencies that cannot be found by simple analysis. To find them, we need to look for dependencies between attributes that are not directly correlated with each other, or whose correlation we cannot see because of a lack of information. For example: 32-year-old sales managers named Alexander tend to drive blue Volvo cars, or prefer to fly to Bali in November.
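One common way to look for such dependencies is to measure how much more often two attributes occur together than they would if they were independent (the "lift" used in association analysis). The sketch below is only an illustration under stated assumptions: the customer records, the tag format, and the pair_lift helper are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical customer records as sets of "attribute=value" tags.
records = [
    {"job=sales", "car=volvo", "trip=bali"},
    {"job=sales", "car=volvo", "trip=bali"},
    {"job=sales", "car=volvo"},
    {"job=engineer", "car=toyota"},
    {"job=engineer", "car=toyota", "trip=bali"},
    {"job=teacher", "car=toyota"},
    {"job=teacher", "car=volvo"},
    {"job=engineer", "car=toyota"},
]


def pair_lift(rows, min_support=0.2):
    """Return {(tag_a, tag_b): lift} for pairs seen in at least min_support of rows.

    lift = P(a and b) / (P(a) * P(b)); values well above 1 suggest a dependency.
    """
    n = len(rows)
    single = Counter(tag for row in rows for tag in row)
    pairs = Counter(
        pair for row in rows for pair in combinations(sorted(row), 2)
    )
    result = {}
    for (a, b), count in pairs.items():
        support = count / n
        if support < min_support:
            continue  # too rare to say anything about
        result[(a, b)] = support / ((single[a] / n) * (single[b] / n))
    return result


lifts = pair_lift(records)
# Pairs sorted from the strongest apparent dependency to the weakest.
ranked = sorted(lifts.items(), key=lambda item: item[1], reverse=True)
```

The min_support threshold is a design choice: without it, pairs that appear only once or twice produce spectacular but meaningless lift values, which is exactly the "meaningless garbage" side of such samples.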

For some, such an analytical sample is invaluable working material; for others, it is meaningless garbage. Big Data is essentially a huge pile of sand that grows by the second. In and of itself, the pile is not useful. To get intelligible and useful results, we must clearly understand what we want to learn, what result we want, and how we will use it. Only then can we determine how to build a sieve that catches just the patterns and data we need.

All this sounds simple enough, but in reality building such algorithms for Big Data processing is not easy. It takes time, money, and expertise, and even then the conclusions may not be obvious and will need to be tested in practice.

Thus, Big Data in itself is not a breakthrough but an unploughed field of assorted data. Only with a competent approach does it yield material for business decisions, material that was previously simply unavailable to the corporate sector.
