APPLIED DATA ANALYTICS

Introduction

Decision-making algorithms play a significant role in implementing digitalization strategies in many fields, particularly in business analytics. Big Data analytics is an ecosystem of technologies that allows the collection, storage, and exploitation of large volumes of data that are generated at different speeds and contain varied information, both structured and unstructured (blogs, social networks, videos, images, etc.).

This setup provides a very flexible platform that operates as a unified repository of information, reduces costs, and serves as the basis for business solutions to a wide range of requirements (analysis, event correlation, exploitation, transformation, business intelligence [BI], and client 360). It makes it possible to exploit terabytes (TB) of information with thousands of operations per second and in real time.

Data scientists are the advanced analytics experts who give value to data. Through their work, companies can face new challenges, predict future situations, bet on the best alternatives, provide better services to customers, and maximize profits.

More and more Big Data projects are being developed around the three “Vs” (Volume, Variety, and Velocity), but do we know how the General Data Protection Regulation (GDPR) affects these systems?

Data analytics techniques extract relevant information from data and try to identify features that serve different purposes, for example, modelling the data by extracting statistical characteristics. Once the behavior of a parameter has been modelled, we can predict its value in a given situation.
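As a minimal sketch of this idea, assuming a single parameter that depends roughly linearly on one variable, the following Python fragment computes basic statistical characteristics of a small sample, fits a least-squares line, and uses it to predict a new value. The data (hour of day against system load) and the names `fit_line` and `predict` are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Evaluate the fitted line at a new point x."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical observations: hour of day vs. requests per second.
hours = [1, 2, 3, 4, 5]
load = [12.0, 14.0, 16.0, 18.0, 20.0]

# Statistical characteristics of the observed parameter.
summary = {"mean": mean(load), "stdev": stdev(load)}

# Model the behavior, then predict the value in a new situation (hour 6).
model = fit_line(hours, load)
forecast = predict(model, 6)
```

A real analysis would rarely be this clean: the choice of model depends on the data, and a linear fit is used here only because it is the simplest case to follow.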

The exploitation of large amounts of data, including personal data, through a set of technologies, systems, and algorithms is booming, because it makes it possible to extract valuable information from an enormous volume of data for retrospective and prospective studies, commercial projections, and the establishment of profiles and usage patterns (for statistical, scientific, and commercial purposes). The results of these analyses can have a direct impact on people, which is why they are increasingly a matter of concern and why regulation is necessary to safeguard privacy in Big Data models.

Big Data

Big Data is an emerging trend that is attracting the attention of scientists, industry, organizations, governments, and individuals. Their motivations differ, but the basis is the same: the capability to acquire and store heterogeneous data that can be processed in different ways to extract meaningful information.

Big Data refers to datasets that are not only big but also high in variety and velocity, which makes them difficult to handle using traditional tools and techniques (Elgendy and Elragal, 2014).

Since the beginning of the digital society, or information society, the availability of data, either captured in real time or stored in large databases, has kept increasing. Researchers in both the social and natural sciences, and decision makers in general, are eager to handle large amounts of data in order to refine their results or make them more accurate (e.g., in medical diagnosis, human behavior, artificial intelligence [AI], system modelling, etc.).

The term Big Data refers to more than just large amounts of data: their heterogeneous nature and the availability of data coming from different sources, captured at different rates or speeds, make them difficult to manage (Angierski and Kuehn, 2013). Consequently, innovative techniques are needed to achieve better performance. Current research trends point to novel methods and techniques for managing such huge amounts of data, which grow continuously and rapidly due to endless data generation. The data must be extracted from raw sources, preprocessed, managed and stored in databases, and finally processed to support decisions within the framework of their final usage (Srinivasa and Mehta, 2014).

Those novel techniques should provide decision makers with valuable tools to extract relevant information that remains hidden from traditional approaches, especially where the data is highly volatile and sophisticated algorithms must be fast and agile. Big Data analytics can be the appropriate toolkit for providing this additional value to Big Data (Pyne et al., 2016).

As the Big Data problem is understood in greater depth, the definition of Big Data becomes more precise and the need for more appropriate data analysis tools grows. However, the definition depends on the final application, on the characteristics and properties of the data, and on their nature and origin, so no single standard definition applies in all cases; it varies with the problem to be solved. One example is the definition proposed by the authors of a work on quality in Big Data (Emmanuel and Stanier, 2016).

The three “Vs” of Big Data reflect the challenge that big companies face when it comes to giving data a value in order to make better decisions, improve operations, and reduce risks. It is therefore necessary to be able to navigate easily through information held within the company’s systems as well as data that arrives from outside.

Big Data projects generally go through the following phases:

  • Data collection (which may involve buying and selling information).
  • The verification and validation of the data.
  • Storage (both initial and resulting data).
  • The analysis and exploitation of the results.
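
The four phases above can be sketched as a small pipeline. This is only an illustration under simplifying assumptions: the records are hard-coded, a plain Python list stands in for a database, and the function names (`collect`, `validate`, `store`, `analyze`) are hypothetical rather than taken from any particular Big Data framework.

```python
from collections import Counter


def collect():
    """Phase 1: gather raw records (a hard-coded, hypothetical sample)."""
    return [
        {"user": "a", "event": "login"},
        {"user": "b", "event": "purchase"},
        {"user": "", "event": "login"},  # missing user, rejected in phase 2
        {"user": "a", "event": "purchase"},
    ]


def validate(records):
    """Phase 2: keep only records whose required fields are non-empty."""
    return [r for r in records if r.get("user") and r.get("event")]


def store(records, storage):
    """Phase 3: persist records; a list stands in for a real data store."""
    storage.extend(records)
    return storage


def analyze(storage):
    """Phase 4: a trivial exploitation step, counting events by type."""
    return Counter(r["event"] for r in storage)


storage = []
store(validate(collect()), storage)
event_counts = analyze(storage)
```

In a real project each phase would be a separate system (ingestion, data quality checks, distributed storage, analytics), but the order of the steps and the hand-off between them follow the same shape.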

Big Data-based techniques are applied in many end applications, for different purposes. Among them, we can highlight the following:

  • Understanding and segmenting customers: Marketing and sales are perhaps the areas where Big Data is most widely applied today. The data is used to better understand customers, their behaviors, and their preferences. Companies are keen to supplement their traditional data sources with data from social networks, navigation logs, text analysis, and sensors to get a complete picture of their clients. The main objective in most cases is to create predictive models.
  • Understanding and optimizing business processes: Big Data is increasingly being used to optimize business processes in companies. In the retail sector, businesses are optimizing their stock based on predictions generated from social network data, web search trends, and weather forecasts. One process that is being transformed in particular is the supply chain, including the optimization of delivery routes.
  • Quantification and optimization of personal performance: Big Data is not only for companies, public institutions, or large organizations. We can all benefit from the data generated by wearable devices such as smart watches or bracelets.
  • Improving public health: Another area where mass data is used collectively is the decoding of genetic material. The more users are involved, the greater the benefits, whether to learn more about our ancestors, to find out which diet is most suitable for our genotype, or to discover how or why certain genes that can lead to chronic diseases are activated. The processing capacity of Big Data analysis platforms allows us to decode entire chains of DNA in a matter of minutes and will allow us to find new treatments and better understand diseases, their triggers, and their propagation patterns.
  • Improving sports performance: Most elite athletes are already adopting high-volume data analysis techniques. In tennis, the SlamTracker tool (based on IBM SPSS predictive analysis technology) has long been used in the most prestigious tournaments in the world (Wimbledon, Roland Garros, and the Australian Open).
  • Improving science and research: Scientific research is being transformed by the new possibilities offered by Big Data.
  • Optimizing the performance of machines and devices: Big Data analysis is helping machines and devices to become more intelligent and autonomous.
  • Improving security and law enforcement: Big Data analysis is being used intensively to improve security and law enforcement. Leaked documents indicated that the National Security Agency (NSA) had been monitoring citizens' communications on a massive scale, with the stated objective of protecting against terrorist attacks.
  • Improving and optimizing cities: Big Data is also being used to improve aspects of our cities and countries. The technology makes it possible to optimize traffic flows based on real-time data from traffic, social networks, and weather.
  • Financial trading: The last area of use that we are going to review, although no smaller in volume or importance, is the application of Big Data in capital markets. High-Frequency Trading (HFT) is the activity that makes the greatest use of Big Data.

In our case, we are going to apply data analysis techniques to determine the cyber risk that a given company faces from possible external attacks on its computer systems and databases (Bartolini et al., 2017a). Until now, this evaluation was carried out through a questionnaire and consumed human resources and time. The objective is to systematize the evaluation by reducing the influence that the subjective judgment of the person in charge has on some critical factors, which can lead to inaccuracies in the evaluation result.

 