I regularly attend conferences and meetup groups focused on data and business intelligence (BI) to keep up my industry knowledge, network with my peers, and just because I enjoy talking about data. These are virtual now because of Covid. Over the last few years, I’ve noticed that while the topics are fun and engaging, they are more and more heavily focused on big data and data science. There’s a tendency to treat little data, analytics, dashboards, reporting, etc. as ‘traditional’ and therefore of less interest to attendees.
The definition of ‘big data’ has been evolving over the last 20 years, but the most common definitions focus on the four V’s:
- Volume – big data sets are usually too large to be easily accessible and actionable. They are typically measured in terabytes and petabytes.
- Variety – big data often deals with unstructured data and may be in various forms including text, images, audio, video, etc.
- Velocity – big data is usually generated at speed. Every click, text, sensor readout, etc. may be captured.
- Veracity – big data varies in quality and reliability. It may be incomplete or inconsistent, and it may change frequently.
Examples of big data include social media content, log files, sensor results, clicks on a website, text messages, phone calls – virtually any kind of data that fits one or more of the four V’s. There’s no magic line where data becomes big data; it is just a label.
Data that isn’t ‘big’ is usually just called data, and most companies still spend most of their data efforts on this little data and its more traditional applications. What is little data? It’s not exactly a standard term, and some people call it small data rather than little data. I personally love the Wikipedia definition of small data:
Small data is data that is ‘small’ enough for human comprehension. It is data in a volume and format that makes it accessible, informative and actionable…
The term “big data” is about machines and “small data” is about people. This is to say that eyewitness observations or five pieces of related data could be small data. Small data is what we used to think of as data. The only way to comprehend big data is to reduce the data into small, visually-appealing objects representing various aspects of large data sets…
Examples of little data would be transactions, employee records, customer records, sales interactions, calls to the call center, patient visits, summarized views of big datasets, etc. These may be in the millions of records, but they are of a size and complexity that is much easier to translate into action by the average consumer than big data.
But ‘easier’ doesn’t mean ‘easy’ — most companies still haven’t mastered big data or little data. The 2020 TDWI Teams, Skills and Budgets Report shows that 73% of respondents ranked themselves as beginner or intermediate on their data management implementations.
Why Is Little Data Important?
Applications of big data tend to lean toward complex data mining for predictions, recommendations, pattern finding, etc. Examples include personalizing a user’s experience on a website, fraud detection, traffic management, transportation logistics, management of TV streaming, etc.
These are hugely important to many companies, but just about EVERY company needs little data. Little data is what drives your business. Little data manages payroll, inventory reports, sales data, hiring, call center volumes and issues, operational efficiency, profitability, marketing campaigns — the opportunities are endless. Just about every application of data that we’ve highlighted in Datagami uses little data.
TDWI’s skills report also shares how different analytics aspects are used in a company. Little data applications are more prominent and more heavily used by respondents than big data solutions. Many respondents hadn’t even implemented several big-data areas of focus, such as streaming analytics (43% hadn’t implemented it), artificial intelligence (35% hadn’t implemented it), and social media analytics (34% hadn’t implemented it). And of those who had, the majority were just beginning. Yet 78% of the respondents considered their data implementations a success.
Little data is also where most data teams spend their time and effort. Roles such as business analyst, data analyst, analytics specialist, report developer, and ETL specialist far outnumber data scientists and big data engineers.
Why Is Big Data Getting More Attention?
Given the prevalence and benefits of little data, why is attention skewed toward big data solutions? I have a few ideas:
Cloud Computing: The growing popularity of cloud-based systems may be one factor driving the industry’s hyper-attention to big data. Big data is big, and it can be expensive to store and manage. Little data is often a minor blip on a company’s cloud costs, so it is no surprise that cloud vendors target the higher spending that comes with big data.
Fascination with the Technology: Big data is a technologically challenging problem to solve, and it offers the opportunity to learn cutting-edge technologies. With little data, the complexity is often in the architecture and solutions rather than the technology. I’ve noticed that engineers tend to focus on the technology OR on the data and its application, and it is rare to find an engineer who is interested in both. Engineers who are enamored of the technology often gravitate to newer tools and opportunities to implement them rather than to the purpose of the solution. Engineers who focus on data, in my experience, tend to have a tighter grasp of the business applications of the data and what their users need.
Education: I think our educational systems are focusing more on the latest technologies and are leaning more toward data science and big data as a result. With little data being treated as more ‘conventional’, graduates may be coming out with expectations that they will get to work on these technologies. It puts huge pressure on companies to find ways to offer these candidates the opportunity to work in these areas, even when they are hiring for roles that are focused on little data.
Money: I hate to sound cynical, but it can be in a vendor’s best interest to convince us that we need new technologies and new systems. Often we do, but we frequently need only a fraction of that technology to meet our customers’ needs.
I don’t mean this to sound like I’m against big data. I love it, and I love its applications. Most of the companies I have worked for have utilized both little data and big data successfully. There’s a time and place and need for both and they complement each other well. I am merely saying that we as an industry shouldn’t let the technology overshadow the business value of what we are doing and cause little data to be under-valued.