Big Data

Overview

What is Big Data? It is a collection of data in databases so large and complex that they defy previous management and processing methods. Big Data is characterized by large volume, high velocity, and wide variety in data formats. Databases have grown exponentially in recent years, and the differing data formats (e.g., numbers, text, pictures, videos, voice) require enormous amounts of computer storage, server speed, and specialized analytical software to access, process, and interpret the data. The world moved from discussing databases in terms of kilobytes in the 1990s, where a kilobyte represents approximately half a page of text, to terms of terabytes a decade later, where a terabyte is equivalent to about 75 million text pages. Today's information technology (IT) systems create multiple terabytes in a matter of hours. Table 10.1 lists database sizes and provides examples. Large databases require data conditioning, transformations, and new statistical methods because most standard methods of organizing and analyzing data are inadequate for them. There are also challenges in storing, searching, transferring, and visualizing such large and diverse databases.

Big Data is driving a global transformation of the ways we learn, work, and produce goods and services. First, there is the Internet of Things (IoT), composed of interconnected devices and sensors that report status, predict performance, and control the connected devices. Currently there are more than twenty billion such connections. They control global production and services across supply chains, and they also offer opportunities to improve efficiency while meeting customer expectations.

TABLE 10.1

Database Size Comparison

Name       Binary Value   Decimal Value   Typical Year   Example
Kilobyte   2¹⁰            10³             1980           half a page of text
Megabyte   2²⁰            10⁶             1990
Gigabyte   2³⁰            10⁹             1995
Terabyte   2⁴⁰            10¹²            2000
Petabyte   2⁵⁰            10¹⁵            2005
Exabyte    2⁶⁰            10¹⁸            2009           150 exabytes is several times the size of all books ever written
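As a quick check on Table 10.1, a short Python sketch (not from the source; plain arithmetic over the exponents listed in the table) reproduces the binary and decimal values of each unit and shows how far apart the two interpretations drift as the exponents grow:

# Reproduce Table 10.1: binary vs. decimal interpretation of each unit.
UNITS = ["Kilobyte", "Megabyte", "Gigabyte", "Terabyte", "Petabyte", "Exabyte"]

for n, name in enumerate(UNITS, start=1):
    binary = 2 ** (10 * n)       # e.g., terabyte = 2^40 bytes
    decimal = 10 ** (3 * n)      # e.g., terabyte = 10^12 bytes
    gap = binary / decimal - 1   # how much larger the binary value is
    print(f"{name:<9} 2^{10 * n:<3} = {binary:>24,}   "
          f"10^{3 * n:<3} = {decimal:>24,}   (+{gap:.1%})")

The gap is only 2.4% at the kilobyte scale but grows to over 15% at the exabyte scale, which is why the distinction matters when sizing Big Data storage.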

Second, there is virtualization in the design of almost anything today. Virtualization enables physical objects to be created from models and algorithms, and a model can be tested in virtual environments to identify design flaws and correct them prior to production. Service system models can also be simulated to analyze how they respond to changes in incoming demand and to lost capacity when systems fail. Data virtualization promotes the use of Big Data because the data can be organized and presented in easily consumable formats that provide insight into relationships and status for decision making. It also provides a single source of trusted truth.
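To make the service-system idea concrete, here is a minimal simulation sketch, not the author's method: a bare-bones single-server queue with invented arrival and service rates, showing how a virtual model can be stress-tested against rising demand before any real capacity decision is made:

import random

def average_wait(arrival_rate, service_rate, horizon=50_000, seed=1):
    """Single-server queue: average customer wait under a given demand level."""
    random.seed(seed)
    clock = free_at = waited = 0.0
    served = 0
    while clock < horizon:
        clock += random.expovariate(arrival_rate)    # next customer arrives
        start = max(clock, free_at)                  # wait if the server is busy
        free_at = start + random.expovariate(service_rate)
        waited += start - clock
        served += 1
    return waited / served

# Stress-test the virtual model: response to growing incoming demand.
for demand in (0.5, 0.8, 0.95):                      # arrivals per second
    print(f"demand {demand:.2f}/s -> average wait {average_wait(demand, 1.0):6.2f} s")

Even this toy model reveals the nonlinear behavior a designer is looking for: average waits grow slowly at moderate demand, then climb sharply as demand approaches capacity.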

Analytics are enhanced through access to cloud platforms that enable disparate, very large databases to be shared and organized with new methods. Artificial intelligence and machine learning are also applied to provide new insights into data relationships, and user-friendly interfaces allow easy querying of the large databases. Big Data methods combined with cloud access also help identify and repurpose dark data, that is, data that is accumulated but not used. Dark data typically goes unused because of its size, its unstructured format, or other factors that make it inaccessible. Although some obstacles remain, most barriers to making efficient use of dark data have been overcome, enabling organizations to put the data to use and often to monetize it. The key to using previously dark data is sorting out the useful data. The balance may need to be retained because of laws or regulatory requirements even though it provides no insights for operational management; keeping such old, unusable data is called cold storage, and it is not placed in the cloud but rather stored on physical media.
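As an illustration of that sorting step, here is a small hypothetical sketch (the catalog entries, field names, and three-year staleness threshold are all invented for the example) that triages a data catalog into active data, dark-data repurposing candidates, and cold-storage holdings:

from datetime import datetime, timedelta

# Hypothetical catalog entries: (dataset, last accessed, under regulatory hold?)
CATALOG = [
    ("sensor_logs_line7",   datetime(2025, 5, 20), False),
    ("supplier_invoices",   datetime(2016, 3, 1),  True),
    ("clickstream_archive", datetime(2015, 1, 3),  False),
]

def triage(catalog, today, stale_after=timedelta(days=3 * 365)):
    """Split a catalog into active data, dark-data candidates, and cold storage."""
    active, repurpose, cold = [], [], []
    for name, last_used, on_hold in catalog:
        if today - last_used < stale_after:
            active.append(name)      # still feeding day-to-day analytics
        elif on_hold:
            cold.append(name)        # legally retained, kept on physical media
        else:
            repurpose.append(name)   # dark data worth mining or monetizing
    return active, repurpose, cold

print(triage(CATALOG, today=datetime(2025, 6, 1)))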

 