The Role of DataOps in Building a Data-Driven Organization
Founded in 1995 by Pierre Omidyar, then a 28-year-old software developer, eBay grew steadily through the early years of the Internet and went public in 1998. From 2005 to 2010, eBay’s total revenue reached nearly $44.61 billion, roughly $1.03 million per hour, with over 200 million items listed on the site on any given day.
The stakeholders at eBay recognized quite early that their existing system was unable to manage the huge amount of data being generated. Their big data was getting bigger day by day.
While struggling to process all of its customer journey data, the company started looking for ways to store and manipulate it. Eventually, it arrived at an approach that is trending today under the name DataOps, built on three elements:
- Democratization of Data (Technology): eBay built a custom data warehouse called Singularity, which was able to run ad hoc queries in 32 seconds. It also started using data programmatically, which brought in the power of machine learning.
- Personalization (Process): The company shifted its focus to behavioral data to drive customer engagement via its Customer Relationship Management (CRM) system.
- Data-Driven Culture (People): Analytics teams started to fine-tune their operational performance by reflecting, at regular intervals, on feedback from customers, on their own observations, and on operational statistics.
By bringing the DataOps strategy into its business model, eBay learned to capture and analyze all of its data to drive the company’s daily decision-making, and it succeeded in building a self-service big data infrastructure to support that data and analytics.
The ‘Data Challenge’ in Today’s Digital Era
Just like eBay, all modern enterprises need to deliver the right data to a growing customer base that wants to consume it as quickly as possible. Yet many of these organizations fail to deliver a seamless customer experience despite investing heavily in data science applications.
The reason for failure is that their traditional data pipelines are breaking, and data quality is suffering.
The DataOps concept emerged as a reaction to these challenges, which most contemporary organizations face when collecting and analyzing large amounts of data in step with the needs of their operations teams.
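To make the idea of a pipeline that guards its own data quality more concrete, here is a minimal sketch of an automated quality gate. It is only an illustration: the function names (`quality_issues`, `quality_gate`), the field names, and the sample records are hypothetical and do not come from any particular DataOps tool.

```python
from typing import Any


def quality_issues(rows: list[dict[str, Any]], required: list[str]) -> list[str]:
    """Return a human-readable list of problems found in a batch of records."""
    issues = []
    for i, row in enumerate(rows):
        for field in required:
            # A field counts as missing if it is absent, None, or an empty string.
            if row.get(field) in (None, ""):
                issues.append(f"row {i}: missing {field}")
    return issues


def quality_gate(rows: list[dict[str, Any]], required: list[str]) -> list[dict[str, Any]]:
    """Pass a clean batch through unchanged; stop the pipeline on a dirty one."""
    issues = quality_issues(rows, required)
    if issues:
        raise ValueError("; ".join(issues))
    return rows


# Hypothetical customer-journey events; the second record is deliberately bad.
batch = [
    {"customer_id": 1, "event": "view"},
    {"customer_id": None, "event": "purchase"},
]
print(quality_issues(batch, ["customer_id", "event"]))
# prints ['row 1: missing customer_id']
```

Placing a gate like this between pipeline stages means a bad batch fails loudly at the point where it enters, rather than silently degrading the reports built on top of it.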
The Roots of the DataOps Approach
Once upon a time in the software development world, corporate developers and operations professionals worked separately, in heavily armored silos. Developers wrote application code and threw it over to the operations team, who made sure the applications worked once users actually had them in their hands.
This worked for a while in a monolithic world where the enterprise dictated the pace. If the users wanted something, they had to wait.
The Internet changed all that.
Suddenly, there was competition left, right, and center. Users were spoilt for choice, and the speed of innovation and progress increased dramatically. Initially, the same silo-oriented systems and processes were followed. However, the need to roll out fresh code faster and get updates into production sooner, together with paying customers’ shrinking tolerance for mistakes, led to a profound change in how things were done.
Organizations started to embrace a new practice, DevOps, to improve coordination between developers and operations teams. DevOps redefined and optimized the way organizations develop and deliver products, thereby increasing quality, productivity, and customer satisfaction.
Like DevOps, the DataOps approach promises to streamline a process, in this case data management, to maximize the value of colossal enterprise data stores. The principles behind DevOps also apply to data management with DataOps. The DataOps approach is arguably even more critical: not every organization develops software, but every organization uses data.
So, What Is the DataOps Approach?
Andy Palmer, co-founder and CEO of data management software developer Tamr, popularized the term DataOps in 2015, describing it as the convergence of several aspects of data management:
DataOps is a process-oriented methodology that focuses on collaborative data management practice with an aim to improve cross-functional communication, integration, and automation of data flows between various stakeholders.
The goal of DataOps is to deliver value faster by creating predictable delivery and change management of data, data models, and related artifacts.
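As one illustration of what change management for data models can look like in practice, the sketch below compares two versions of a table schema and reports changes that would break downstream consumers. The function name `breaking_changes`, the `{column: type}` representation, and the sample schemas are assumptions made for this example, not part of any standard.

```python
def breaking_changes(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Compare two versions of a data model given as {column: type} mappings."""
    problems = []
    for column, old_type in old.items():
        if column not in new:
            problems.append(f"removed column: {column}")
        elif new[column] != old_type:
            problems.append(f"type change: {column} {old_type} -> {new[column]}")
    # Columns that exist only in `new` are additions and are treated as safe here.
    return problems


# Hypothetical versions of an orders table.
v1 = {"order_id": "int", "amount": "float", "placed_at": "timestamp"}
v2 = {"order_id": "int", "amount": "str", "currency": "str"}

print(breaking_changes(v1, v2))
# prints ['type change: amount float -> str', 'removed column: placed_at']
```

Running a check like this before a new model version is released makes delivery more predictable, because consumers learn about breaking changes before their reports do.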
A Successful DataOps Practice
A report released by 451 Research states that most enterprises are planning to invest significantly in DataOps in the coming year: 86% of respondents plan to increase spending, investment, or development effort on DataOps, and 92% agreed that improved DataOps would have a positive impact on their organization’s success.
However, to deliver data that meets the needs of businesses, it is vital for organizations to focus on three important pillars of DataOps: People, Process, and Technology.
- People: Define rules for an abstracted semantic layer. Ensure everyone is “speaking the same language” and agrees upon what the data (and metadata) is and is not.
- Process: Design processes for growth and extensibility. The data flow model must be designed to accommodate the volume and variety of data. Ensure that enabling technologies are priced affordably enough to scale with enterprise data growth.
- Technology: Automate as many stages of the data flow as possible, including BI, data science, and analytics.
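The People and Technology pillars can be pictured together in a small sketch: a shared dictionary of metric definitions stands in for the semantic layer, and a simple runner chains pipeline stages automatically. Everything here (`SEMANTIC_LAYER`, `run_pipeline`, the stage functions, and the sample orders) is hypothetical and meant only to illustrate the idea, not to represent a specific product.

```python
from typing import Any, Callable

Order = dict[str, Any]

# One agreed definition per business term, so BI and data science
# compute "paid_revenue" the same way (a stand-in for a semantic layer).
SEMANTIC_LAYER: dict[str, Callable[[list[Order]], float]] = {
    "paid_revenue": lambda orders: sum(
        o["amount"] for o in orders if o["status"] == "paid"
    ),
    "order_count": lambda orders: float(len(orders)),
}


def run_pipeline(data: Any, stages: list[Callable[[Any], Any]]) -> Any:
    """Feed the output of each stage into the next, with no manual hand-offs."""
    for stage in stages:
        data = stage(data)
    return data


def drop_cancelled(orders: list[Order]) -> list[Order]:
    """Cleaning stage: remove cancelled orders."""
    return [o for o in orders if o["status"] != "cancelled"]


def build_report(orders: list[Order]) -> dict[str, float]:
    """Reporting stage: evaluate every shared metric definition."""
    return {name: fn(orders) for name, fn in SEMANTIC_LAYER.items()}


orders = [
    {"amount": 20.0, "status": "paid"},
    {"amount": 5.0, "status": "cancelled"},
    {"amount": 10.0, "status": "pending"},
]
report = run_pipeline(orders, [drop_cancelled, build_report])
print(report)
# prints {'paid_revenue': 20.0, 'order_count': 2.0}
```

Because every consumer reads its metrics from the same definitions, a change to what "paid revenue" means is made once and picked up everywhere the pipeline runs.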
Apart from the principles mentioned above, there are a few other vital principles discussed in The DataOps Manifesto.
DataOps Approach: Use Cases
In 2018, 98.6% of firms aspired to a data-driven culture, a 15.3% increase from 2017. Organizations have accordingly taken an interest in DataOps and started to apply its principles in their business models. The following figure shows the result of a survey conducted by Eckerson Group: apart from application development, the three other DataOps use cases are data science, data warehousing, and dashboards and reporting.
Dashboards and Reports
The number of reports and dashboards grows rapidly as organizations become more data-driven. However, creating and managing many dashboards and reports is a complex and difficult task. Organizations have started to adopt agile methods to better understand what the business expects.
DataOps completes this approach by linking these ideas with the rest of the development processes and thereby creating an end-to-end responsibility for all stakeholders.
Data Science
The goal of applying data science to the business model is to use statistical methods and algorithms to explore data sets and draw out valuable insights.
However, getting the most out of data science requires bringing clarity to the discipline by establishing defined structures and processes. Bringing DataOps into the business model bridges this gap.
Data Warehousing
Data warehousing is one of the most important pipelines in data and analytics. A data warehouse is a central repository for enterprise data. Data warehousing faces many challenges, such as increasing heterogeneity and complexity and rapidly changing business requirements.
These issues point to the need for a new approach that can handle complex data landscapes and adapt data warehousing to today’s business reality. Applying the DataOps approach to your business model helps reduce complexity and improve the manageability of complex solutions.
Remember, data does not belong only to IT, data scientists, or analysts. It belongs to everyone involved in your business. Thus, one of the first steps toward success in today’s digital era is to become a data-driven organization followed by a data-informed one.
The DataOps approach is a missing, yet trending, piece of the puzzle. Once it is connected to your organization’s data strategy, it streamlines data and analytics structures to meet increasingly demanding and complex business requirements, enabling you to become a data-driven organization.