From the Inside January/February 2014

Big Data is still a hot topic, even two years after the term was first introduced. Yet people still seem unsure what exactly they should be doing about it.

A few years ago I talked about what the term "Big Data" was supposed to mean. For those who didn't catch that From the Inside, here it is again.

Big Data refers to the complexity, amount, and management of a large quantity of data. If you look at the definition on Wikipedia, you will see it talks about "terabytes and petabytes" of data, but it also describes data that is "beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time."

Big data is a term applied to data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time. Big data sizes are a constantly moving target currently ranging from a few dozen terabytes to many petabytes of data in a single data set.

Reference: http://en.wikipedia.org/wiki/Big_data

Let's look at the concepts of "tolerable time" and "software tools." At first glance you may say, "Well, I don't fall into that category. I can manage all my data in my programs within a tolerable time."

Tolerable time is extremely relative. It depends on the who, when, and why that an enterprise has to deal with on a daily basis. Are you generating reports? Are you generating lookups and presentations for immediate action, or for batch processing? When does the user expect this information to be presented to them: now, or in 30 minutes?

The "Big" in Big Data is relative as well. I've seen articles on the Internet arguing about what sizes constitute Big Data. Some define it by the amount of data being stored and the size of the database.

Let's look at the classic example of sensor data. This data is usually highly structured and not very big per transaction (50 bytes), but there can be a large number of transactions. Even so, 1,000,000 transactions would be only about 48 MB. That is not terabytes and petabytes. It would take 21,990,232,555 transactions to reach 1 terabyte (60,247,212 transactions a day for 1 year). While this may happen in your environment, in a normal business environment it is not likely.
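
If you want to run the same numbers for your own data, the arithmetic above can be sketched in a few lines of Python. This is only a rough sketch: the function names are made up for illustration, and it assumes binary units (1 MB = 1024^2 bytes, 1 TB = 1024^4 bytes), which is what the figures above imply. Change the bytes-per-transaction value to match your own records.

```python
# Rough storage arithmetic for fixed-size transactions.
# Assumption: binary units, 1 MB = 1024**2 bytes and 1 TB = 1024**4 bytes.
# All names here are hypothetical helpers, not part of any library.

BYTES_PER_TRANSACTION = 50  # sensor-style record size used in the example
MB = 1024 ** 2
TB = 1024 ** 4


def size_in_mb(transactions, bytes_each=BYTES_PER_TRANSACTION):
    """Storage used by a number of transactions, in megabytes."""
    return transactions * bytes_each / MB


def transactions_to_fill(total_bytes, bytes_each=BYTES_PER_TRANSACTION):
    """Whole transactions that fit in the given number of bytes."""
    return total_bytes // bytes_each


def per_day(transactions, days=365):
    """Whole transactions per day needed to reach a total in `days` days."""
    return transactions // days


if __name__ == "__main__":
    needed = transactions_to_fill(TB)
    print(f"1,000,000 transactions: {size_in_mb(1_000_000):.1f} MB")
    print(f"Transactions to reach 1 TB: {needed:,}")
    print(f"Per day over one year: {per_day(needed):,}")
```

Plug in a larger record size, say a few kilobytes per transaction, and you will see how quickly the "how many transactions" number drops. That is part of why size alone makes a poor definition.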

I don't necessarily agree that Big Data represents either of these things.

Big Data is not about the size of your database but the complexity of the data you need to work with. The idea of Big Data was coined to address the problem of working with data that have complex relationships, and how to find trends and patterns to solve problems using large volumes and a variety of data.

We all have large amounts of data in our systems that can provide useful information if we just know how to apply the Big Data concepts. The sheer volume and complexity of the data can make it hard for us to decide how to approach the problem, let alone know what questions to ask.

At the 2014 Spectrum Conference, April 7-10 in Phoenix, we will talk in more depth about Big Data: what questions to ask, and how to get the information from the data you already have.

Nathan Rector

Nathan Rector, President of International Spectrum, has been in the MultiValue marketplace as a consultant, author, and presenter since 1992. As a consultant, Nathan specialized in integrating MultiValue applications with other devices, and in bringing non-MultiValue data, structures, and applications into existing MultiValue databases. During that time, Nathan worked with PDAs, mobile devices, handheld scanners, POS, and other manufacturing and distribution interfaces.

In 2006, Nathan purchased International Spectrum Magazine and Conference and has been working with the MultiValue Community to expand its reach into current technologies and markets. During this time he has been providing mentorship training to people converting Console Applications (Green Screen/Text Driven) to GUI (Graphical User Interfaces), Mobile, and Web. He has also been working with developers new to the MultiValue Marketplace to train them in how MultiValue works and acts, as well as how it differs from the traditional Relational Database Model (SQL).
