Big Data: a modern and problematic method of storing and analyzing data
Digital technologies are present in all areas of human life. The amount of data recorded worldwide grows every second, which means that storage conditions and the capacity to hold this data must evolve at the same rate.
Experts in the field of IT have expressed the view that the expansion of Big Data and the acceleration of its growth rate have become an objective reality. Every second, sources such as social networks, news sites, file-sharing services and others generate huge amounts of content.
According to the IDC Digital Universe study, over the next five years the volume of data on the planet will grow to 40 zettabytes; that is, by 2020 there will be about 5,200 GB of data for every person living on Earth.
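As a rough sanity check, the per-person figure can be turned back into the population it implies. The sketch below assumes decimal units (1 ZB = 10^21 bytes, 1 GB = 10^9 bytes); the function name and the unit convention are assumptions made for illustration, not details taken from the study.

```python
# Decimal (SI) units are an assumption; the article does not state the convention.
BYTES_PER_GB = 10**9
BYTES_PER_ZB = 10**21

def implied_population(total_zb, gb_per_person):
    """Return the number of people implied by a total volume and a per-person share."""
    total_gb = total_zb * BYTES_PER_ZB // BYTES_PER_GB
    return total_gb / gb_per_person

people = implied_population(40, 5200)
print(f"Implied population: {people / 1e9:.2f} billion")
```

Under these assumptions the two figures are consistent with each other: 40 zettabytes shared out at 5,200 GB per person corresponds to roughly 7.7 billion people.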
It is known that most of this information is generated not by people but by machines that interact with each other: monitoring equipment, sensors, surveillance systems, operating systems, personal devices, smartphones, intelligent systems and so on. Together they set a furious pace of data growth, which creates the need for more production servers (physical and virtual) and, as a result, for expanding existing data centers and building new ones.
In fact, big data is a rather conditional and relative concept. The most common definition is a set of information whose volume exceeds the hard drive of a personal device and which cannot be handled by the classical tools used for smaller volumes.
Generally speaking, big data processing technology can be summarized in three main areas:
- Storing and transferring incoming information, measured in gigabytes, terabytes and zettabytes, for later handling and application.
- Structuring fragmented content: texts, photos, video, audio and all other kinds of data.
- Analyzing Big Data: applying different methods of processing unstructured information and producing various analytical reports.
In fact, the use of Big Data covers all areas of work with huge amounts of very disparate information that is constantly updated and scattered across various sources. The goal is simple: maximum efficiency, the introduction of new products and greater competitiveness.
The problems of Big Data
The systemic problems of Big Data fall into three main groups: volume, speed of processing and lack of structure. These are the three Vs: Volume, Velocity and Variety.
- Storing large volumes of data requires special conditions; it is a question of space and capacity. Speed is not only about the slowdown caused by old processing methods; it is also a question of interactivity: the faster the process, the greater the efficiency and the more productive the results.
- The problem of heterogeneity and lack of structure arises because the data comes from different sources, in different formats and of varying quality. To combine the data and process it effectively, it must first be brought into a suitable form, and analytical tools are also required.
- There is the problem of the upper limit of data volume. This limit is difficult to predict, so it is also difficult to predict which technologies and how much financial investment further development will require. However, for certain volumes of data (terabytes, for example), processing tools already exist and are being actively developed.
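The heterogeneity point in the list above, bringing data from different sources into a suitable form, can be illustrated with a minimal sketch. Everything here is hypothetical: the two source formats, the field names (`user`, `ts`, `timestamp`) and the helper functions are invented for illustration and do not describe any particular Big Data tool.

```python
import csv
import io
import json

def from_json(text):
    """Parse a JSON event such as {"user": "ann", "ts": 1700000000}."""
    raw = json.loads(text)
    return {"user": raw["user"], "timestamp": int(raw["ts"])}

def from_csv(text):
    """Parse a CSV line with two columns: timestamp, user name."""
    row = next(csv.reader(io.StringIO(text)))
    return {"user": row[1].strip(), "timestamp": int(row[0])}

# Map each hypothetical source format to the parser that understands it.
PARSERS = {"json": from_json, "csv": from_csv}

def normalize(records):
    """Convert (format, payload) pairs into dicts that share one schema."""
    return [PARSERS[fmt](payload) for fmt, payload in records]

mixed = [
    ("json", '{"user": "ann", "ts": 1700000000}'),
    ("csv", "1700000060, bob"),
]
unified = normalize(mixed)
print(unified)
```

Once every record has the same fields, the same analytical tools can be applied to all of them regardless of where they came from. In practice this step also has to deal with missing or low-quality values, which the sketch leaves out.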
There is also a lack of clear principles for working with such volumes of data, and the heterogeneity of the data flows only aggravates the situation.
Selecting data for processing and choosing an analysis algorithm may also be a problem, because there is no clear understanding of which data should be collected and stored and which can be ignored.
Another problem of Big Data is ethical: how does collecting data (especially without the user's knowledge) differ from violating privacy? Search engines record users' clicks on the Internet; they know your IP address, geolocation, interests, online purchases, personal data, email messages and so on, which makes it possible, for example, to display contextual ads based on user behavior. That is, by default, Big Data collects all of this information, which is then stored on the sites' data servers.
Big data analysis has long been used successfully in marketing to determine target audiences, interests, demand and consumer activity. Thus, Big Data is a tool for predicting a company's future marketing.
Today, at the peak of high technology and huge flows of information, companies have more opportunities to achieve superior business performance through Big Data, but the technology requires new methods if it is to be used efficiently and securely.