UPDATED 09:27 EDT / MARCH 22 2018

BIG DATA

Wikibon trip report from Big Data Silicon Valley: Big data lives!

Here’s what my analyst colleague Neil Raden said to me at our recent annual Big Data Silicon Valley community gathering in San Jose:

“I spent the day on the floor of Strata Data Conference, and I heard the term ‘big data’ maybe three times.”

What’s going on? Have so many failed so often at big data that they dare not speak its name?

Well, yes, but whatever you call it, “big data” is still hot. Let me explain.

The basis of digital business transformation is the use of data as an asset. Any business that aspires to digitally transform – and you should short any business that says it doesn’t – must focus on how to get more work and more value out of its data.

Three strategic business capabilities are required to achieve that goal. First, businesses must be much more purposeful in capturing data from their business activities. Second, businesses must be better at turning captured data into specific data assets, in the form of models, insights and software. And third, businesses must advance their ability to turn data into market actions that better automate operations and differentiate customer experience.

At Wikibon, we don’t care what it’s called, but – fundamentally – big data is the second strategic capability I listed: Big data is the basis for turning data into data assets that create business value.

Our Big Data SV 2018 guests on theCUBE might have used different words, but across the board they articulated the same message: Digital business competition increasingly hinges on a business’s ability to create and apply data assets, and that’s what leading big data and analytics, or BDA, technologists are working on.

Five trends

Having said that, based on our conversations at Big Data SV, the “whatever we call the big data market” is bifurcating: Infrastructure is converging (and earning the term “big data”), and application focus increasingly is centering on machine learning and artificial intelligence. Five trends are emerging:

  • Cloud reshapes the big data landscape. Yeah, I know: Cloud is part of every conversation in tech these days. But at Big Data SV 2018, it was different. First, the biggest users of big data technology are cloud service providers, which is starting to bias tool evolution. Second, cloud service providers are starting to express their big data technology as services, which is undermining certain promises of open source. And third, it just works, which accelerates time-to-value and increases focus on the application side of big data. Big data users? Going to the cloud. Tool providers? Going to the cloud. Apps houses? Going to the cloud. Alpha-dog data scientists? Going to cloud companies. Near term, cloud is on the critical path for most big data evolution.
  • Networks of data become crucial. The coevolution of big data and the cloud puts a real emphasis on data movement, which puts a real constraint on a number of trends. First, even though cloud utilization is certain, physics, regulation, IP protection and cost will ensure that cloud is not the default for all data – at least not in successful enterprises. Rather, big data and analytics pros and architects need to start thinking in terms of “networks of data,” explicitly understanding and designing relationships among data, data sources and data sinks. Indeed, it’s not far-fetched to imagine a network model in which data, rather than devices (the internet) or pages (the web), is the primary citizen. In this model, recognizing usage patterns, optimally placing data and staging data will be key services. From this simple thought come two additional questions: (1) Does this catalyze real enterprise use of blockchain? And (2) what does that mean for the future of database management system technology?
  • Hardware is still important. This notion of networks of data ensures that not all data will go to the cloud (even as the cloud grows in importance to big data). Enterprises will still have to handle large amounts of data on-premises, perhaps because the data is too expensive to move, takes too long to move or is limited by regulations. A new class of memory-oriented cluster is emerging, thanks in large measure to NVMe-oF, flash storage, hyperconverged technologies and new composable application development tools. All the piece parts are here today and will mature rapidly during 2018. UniGrid is the term Wikibon has coined to describe future systems organized to run data-oriented workloads. It combines simpler elements, especially storage and network resources, into highly scalable clusters capable of binding data resources together with very low overhead, on the order of 5μs. How will future automation platforms work? Apps that require real-time-like interaction between operational and analytic systems will run on emerging UniGrid true private cloud systems.
  • Data scientists do more data science. ML and AI tools are maturing rapidly, facilitated by the facts that (1) many AI and ML algorithms have been around for ages and don’t require a lot of invention; and (2) the infrastructure technologies for getting data into place for running model training and inferencing tools are working pretty well (in part, thanks to the cloud). That means real data scientists can focus more time and attention on doing real data science and leave more of the operational aspects of big data practices to legions of data analysts who really are best at managing infrastructure and wrangling data. Having said that, the best and brightest data scientists continue, again, to beat paths to the cloud companies, but better tools mean better diffusion of big data and analytics knowledge into more enterprises.
  • AI technology is diffused into applications. Finally, we heard repeatedly that BDA pros are expecting SaaS and enterprise application companies to lead the industry in creating enterprise BDA applications. Hadoop no longer is the center of the BDA ecosystem. Nor is Spark or any other single technology. We’re moving to an application focus – especially a services-oriented app focus – and that will shape the BDA market for years to come.

Final thoughts

Wikibon just completed its annual Worldwide BDA study and here are a few numbers we discussed:

  • By 2022, BDA will be a $70 billion market, growing to $103 billion by 2027.
  • The BDA market remains very fragmented; fully 68 percent of the market is outside the circle of the top 10 BDA vendors.
  • Splunk is the biggest BDA software vendor. Why? Because of its application focus (see above point about AI tech diffusing in apps).
  • Finally, 2017 was the first year that software revenues exceeded hardware revenues in BDA. The bifurcation is happening.
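
For readers who want to sanity-check the first forecast above, here is a back-of-the-envelope sketch of the annual growth rate implied by $70 billion in 2022 and $103 billion in 2027. It is not part of the Wikibon study: it assumes smooth compounding between the two forecast points, and the function name `implied_cagr` is made up for illustration.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by two data points.

    Assumes growth compounds evenly each year, which is a simplification;
    the study itself may model year-by-year growth differently.
    """
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("values and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Forecast figures quoted in the article, in billions of dollars.
market_2022 = 70.0
market_2027 = 103.0

rate = implied_cagr(market_2022, market_2027, 2027 - 2022)
print(f"Implied annual growth, 2022 to 2027: {rate:.1%}")
```

Under that assumption the two figures work out to roughly 8 percent growth a year, the pace of a market that is maturing rather than one that is fading.
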
Image: geralt/Pixabay
