The Anatomy of A Data Product Data Products Are Building Blocks
The Anatomy of a Data Product. Data Products are building blocks of… | by Eric Broda | Towards Data Science
Data Products are the foundational building block of an enterprise Data Mesh. But what
exactly is a Data Product, how do they work, how can they be identified, and how can they
be built quickly?
https://towardsdatascience.com/the-anatomy-of-a-data-product-d3140f068311 1/17
23.08.2023, 13:26 The Anatomy of a Data Product. Data Products are building blocks of… | by Eric Broda | Towards Data Science
How are data products designed, and how do they work such that they make data easy
to find, consume, share, and govern?
What capabilities, APIs, and lifecycle need to be established to make Data Products
easy to build, deploy, secure, and manage?
Simply put, if you can answer these questions, then, first, you will be able to explain why
Data Products are foundational to your Data Mesh journey, and second, you will
understand the capabilities necessary to accelerate the adoption and buildout of Data
Products in your enterprise Data Mesh.
Before you start, this article assumes that you have a high-level understanding of Data
Mesh. If you need some background information on Data Mesh, a number of
great articles are available here (patterns), here (architecture), here (principles) and here
(lessons learned). For interested readers, a full set of Data Mesh patterns is available here
and here.
Let’s unpack this a bit starting with “product thinking”. I like some insights found in a
recent article in Harvard Business Review: First, a product marshals an organization’s
production capabilities to “deliver and capture value”. Second, there is an “end customer
who purchases and uses that product”. Lastly, a Product has an owner and team that
creates a long-term plan to ensure that “products can be continuously improved to make
them more successful” delivered by a group that focusses on “outcomes instead of
outputs”.
To paraphrase, product thinking means ensuring that your product meets a specific
business need and delivers some tangible value, has a long-term time horizon, and has a
clear and empowered owner who acts in the interest of both the enterprise and the
customer.
Unfortunately, defining “data domain” is not as simple, since this term tends to be quite
ambiguous in large enterprises. For the Chief Data Officer, governance, regulation, and
privacy are central concerns, leading to coarse-grained domains: all customers instead
of current customers or Canadian customers, for example.
Similarly, the data architect may consider customers to be a subset of the “party” domain,
which includes current clients as well as prospects. And the application developer may view
a customer as a unique identifier linking that customer’s accounts and transactions.
For the purposes of this article, I define a data domain as a set of identifiable, real, related
data that is managed consistently, and which has some measure of quality and accuracy.
So, now let’s combine these ideas and create a practical definition of a Data Product. A
Data Product has/is:
Published metadata that enables discovery and self-serve while making data
understandable
Bounded: Data products store any type of data that has a clearly defined boundary and
owner; while analytic data is a primary use case, both operational and engagement
data can also be managed within a data product.
Self-Aware: Automatically capturing changes and information about itself; All data
product changes can be captured and distributed as “events” within the data product,
to other data products, or to interested parties across the enterprise.
Discoverable: Each data product contains its own “Registry” that publishes its data
product metadata, ownership information, policies, and any additional enabling
behaviours; The data product registry is the “one-stop-shop” for developers, data
scientists, and data analysts to find, consume, share, and govern data managed by a
specific data product. It also is the entry point to behaviours specific to that data
product enabling sophisticated interactions allowing users to request access to data, or
“owners” to create new data products.
Secure: Data products ensure that all data is secure both at-rest and in-motion; Our
objective is to ensure that all data products operate in a “Zero-Trust”
container/environment.
Historical and Temporal: Changes to data state, or exceptions raised while using the data
product, are captured and managed in an immutable log to support federated governance
and diagnosis of security issues, and (when data state changes are aggregated) to provide
data lineage.
Shareable: A Data product has “ports” that allow data managed by the data product to
be ingested or consumed. Information and events (for example, a data change or an
API call) can be communicated using bulk pipelines or in near real-time inside the
data product domain, between data products, as well as across the organization using a
robust, reliable, and resilient backbone.
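To make the "Self-Aware" and "Historical and Temporal" attributes more concrete, here is a minimal sketch in Python. It assumes an in-memory, append-only event log; the class names (`ChangeEvent`, `DataProduct`), the event kinds, and the port names are hypothetical and chosen only for illustration. A real data product would more likely publish these events to a durable log or streaming backbone.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Tuple


@dataclass(frozen=True)
class ChangeEvent:
    """One immutable record of something that happened in the data product."""
    kind: str      # hypothetical event kinds: "data_changed", "api_called"
    subject: str   # what was touched, e.g. a dataset name
    source: str    # hypothetical port or system the event came from
    at: datetime


@dataclass
class DataProduct:
    name: str
    owner: str
    _log: List[ChangeEvent] = field(default_factory=list)

    def record(self, kind: str, subject: str, source: str) -> ChangeEvent:
        """Append an event; existing events are never modified or removed."""
        event = ChangeEvent(kind, subject, source, datetime.now(timezone.utc))
        self._log.append(event)
        return event

    @property
    def log(self) -> Tuple[ChangeEvent, ...]:
        """Read-only view of the log, so callers cannot rewrite history."""
        return tuple(self._log)

    def lineage(self, subject: str) -> List[str]:
        """Aggregate data-state changes for one subject into an ordered source list."""
        return [
            e.source
            for e in self._log
            if e.kind == "data_changed" and e.subject == subject
        ]


product = DataProduct(name="commercial-lending-clients-uk", owner="uk-lending-team")
product.record("data_changed", "clients", "crm-ingest-port")
product.record("api_called", "clients", "consumer-api")
product.record("data_changed", "clients", "kyc-ingest-port")
print(product.lineage("clients"))  # ['crm-ingest-port', 'kyc-ingest-port']
```

Only data-state changes feed the lineage view, while API calls stay in the same log for governance and security diagnosis.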
However, in large enterprises, interoperable interfaces must meet several expectations (in
some cases, mandatory requirements):
Formal Security: This is tricky. Each tool may offer a different security approach and,
worse, some may not have a robust or complete security model. Still, this does not
negate the need to secure your producer and consumer interfaces; it just makes it
harder to do.
While the producer and consumer interfaces are important, we should not overlook the
crucial nature of interfaces that enable discovery, observability, and manageability. In fact,
most of these interfaces are implemented as APIs which means that you can take
advantage of the capabilities offered by OpenAPI specifications:
Formal Security: OpenAPI specifications provide a robust, well understood, and well
documented approach to defining security schemes whose “scopes” map
directly to roles; with a little bit of due diligence, these scopes can be implemented
using OAuth2 (a common security approach) and connected to an enterprise’s
identity book of record.
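As an illustration of how scopes can map to roles, the sketch below holds a fragment of an OpenAPI 3 document as a Python dictionary and checks whether a caller's granted scopes satisfy an operation's security requirement. The scheme name `enterpriseOAuth`, the token URL, the `/data` path, and the scope names are assumptions made for this example. The helper checks scopes only and does not validate tokens.

```python
# Hypothetical OpenAPI 3 fragment for a data product API (names are illustrative).
SPEC = {
    "openapi": "3.0.3",
    "info": {"title": "Example Data Product API", "version": "0.1.0"},
    "components": {
        "securitySchemes": {
            "enterpriseOAuth": {
                "type": "oauth2",
                "flows": {
                    "clientCredentials": {
                        "tokenUrl": "https://idp.example.com/oauth2/token",
                        "scopes": {
                            "data:read": "Consume data (consumer role)",
                            "data:write": "Ingest data (producer role)",
                        },
                    }
                },
            }
        }
    },
    "paths": {
        "/data": {
            "get": {
                "security": [{"enterpriseOAuth": ["data:read"]}],
                "responses": {"200": {"description": "OK"}},
            },
            "post": {
                "security": [{"enterpriseOAuth": ["data:write"]}],
                "responses": {"201": {"description": "Created"}},
            },
        }
    },
}


def is_allowed(spec, path, method, granted_scopes):
    """Return True if the granted scopes satisfy the operation's security."""
    operation = spec["paths"][path][method]
    # Operation-level security overrides the document-level default.
    requirements = operation.get("security", spec.get("security", []))
    if not requirements:
        return True  # nothing declared for this operation
    granted = set(granted_scopes)
    # Requirements in the list are alternatives; within one requirement,
    # every listed scheme must have all of its scopes granted.
    return any(
        all(set(scopes) <= granted for scopes in requirement.values())
        for requirement in requirements
    )


print(is_allowed(SPEC, "/data", "get", ["data:read"]))   # True
print(is_allowed(SPEC, "/data", "post", ["data:read"]))  # False
```

In practice the granted scopes would come from a validated OAuth2 access token issued by the enterprise identity provider.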
With these basic attributes in place, a data product can begin to be used in the enterprise.
And if designed well, then the data product can now make data easy to find, consume,
share, and govern. And as data is more easily and frequently consumed and shared,
newfound agility and speed result. And with this agility and speed come true business
value:
Faster and better insights that are key to creating an outstanding customer
experience or quickly addressing changing market needs.
Improved time-to-market, especially for end consumer products heavily reliant upon
data.
But how can data products be delivered quickly, consistently, and securely? That is where
the “Data Product Factory” comes in.
Easy to build, by providing templates that simplify the building of a Data Product;
these templates generate microservices/APIs with built-in discoverability (the
“/discover” endpoint) and observability (the “/observe”, “/usage”, “/logs”, and “/alerts”
endpoints).
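A factory template of this kind can be sketched as a function that stamps out the same built-in endpoints for every new data product. The version below is a simplified, in-process stand-in for a generated microservice: the function name `build_data_product_api`, the response fields, and the metadata values are hypothetical, and only the endpoint paths come from the description above.

```python
ENDPOINTS = ("/discover", "/observe", "/usage", "/logs", "/alerts")


def build_data_product_api(name, owner, description):
    """Return a callable that serves the built-in endpoints for one data product."""
    usage = {path: 0 for path in ENDPOINTS}  # per-endpoint call counts
    logs = []                                # request log
    alerts = []                              # problems worth surfacing

    handlers = {
        "/discover": lambda: {
            "name": name,
            "owner": owner,
            "description": description,
            "endpoints": list(ENDPOINTS),
        },
        "/observe": lambda: {"status": "up", "open_alerts": len(alerts)},
        "/usage": lambda: dict(usage),
        "/logs": lambda: list(logs),
        "/alerts": lambda: list(alerts),
    }

    def call(path):
        if path not in handlers:
            alerts.append(f"unknown endpoint requested: {path}")
            raise KeyError(path)
        usage[path] += 1
        logs.append(f"GET {path}")
        return handlers[path]()

    return call


api = build_data_product_api(
    name="commercial-lending-clients-uk",
    owner="uk-lending-team",
    description="Commercial lending clients in UK",
)
print(api("/discover")["endpoints"])
print(api("/usage"))
```

Because every product built from the template exposes the same paths, consumers and monitoring tools can treat all data products uniformly.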
So, clearly Data Products make a lot of sense! But how do we identify them? Fortunately,
there are a lot of hints that help us find Data Products in an enterprise:
Conway’s Law: Applying Conway’s Law (to paraphrase, your systems and data will
follow your organization structure) to Data Products means that ownership migrates to
groups aligned closely to organizational units (lines of business, etc.) that have deep
knowledge of the data as well as direct accountability for delivering results with the
data, and hence decision and funding rights for it.
CDO Data Domains: A data domain map (enterprise or group) identifies business
entities that are of significant value to the enterprise. These entities provide “hints”
that may identify data product candidates. However, note that in many cases enterprise
domains may need to be sub-divided into finer-grained domains to map to Data
Products.
But there is one lesson learned that I would be remiss not to share: Granularity matters!
So-called enterprise data domains (for example, “enterprise client”) are too coarsely
grained to suit a data product, making it quite difficult to define data boundaries and
owners. Rather, finer-grained data boundaries map much better to “owners” and hence
to data products (for example, “commercial lending clients in UK”).
With this simple observation, we can now delegate several simple yet specific
responsibilities to the Enterprise Data Mesh.
Chief Prognosticator of Data Mesh concept: The Data Mesh is first and foremost a
concept (a marketing message, an executive imperative, a demarcation for an
enterprise data journey) whose primary purpose is to describe and communicate the
organizational construct and logical architecture abstraction that binds data
products into an ecosystem.
The Agent of Data Product Discoverability: Data Mesh is the owner of the “Enterprise
Data Product Registry”, which makes Data Products easy to find, consume, share, and
govern.
Keeper of Data Product Protocols: Data Mesh establishes the protocols by which data
can be shared inside a data product, between data products, and with the broader
organization. As a result, it becomes a key consumer and/or stakeholder in an
enterprise’s common communications, pipeline, and/or event streaming backbone.
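The discoverability responsibility can be pictured as a small enterprise-level catalogue that collects the metadata each data product publishes. The sketch below is an assumption-laden illustration: the class `EnterpriseRegistry`, its methods, and the required metadata fields are hypothetical and not a description of any specific product.

```python
class EnterpriseRegistry:
    """In-memory stand-in for an enterprise data product registry."""

    REQUIRED_FIELDS = {"name", "owner", "description"}

    def __init__(self):
        self._entries = {}

    def register(self, metadata):
        """Add or update a data product's published metadata."""
        missing = self.REQUIRED_FIELDS - metadata.keys()
        if missing:
            raise ValueError(f"metadata is missing fields: {sorted(missing)}")
        self._entries[metadata["name"]] = dict(metadata)

    def find(self, keyword):
        """Return names of products whose name or description mentions the keyword."""
        needle = keyword.lower()
        return sorted(
            name
            for name, meta in self._entries.items()
            if needle in name.lower() or needle in meta["description"].lower()
        )

    def owner_of(self, name):
        """Look up the accountable owner for a registered data product."""
        return self._entries[name]["owner"]


registry = EnterpriseRegistry()
registry.register({
    "name": "commercial-lending-clients-uk",
    "owner": "uk-lending-team",
    "description": "Commercial lending clients in UK",
})
registry.register({
    "name": "retail-deposits-ca",
    "owner": "ca-retail-team",
    "description": "Retail deposit accounts in Canada",
})
print(registry.find("lending"))  # ['commercial-lending-clients-uk']
```

Requiring an owner at registration time reflects the point made earlier that each data product needs a clear boundary and owner.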
Concluding Thoughts
In this article I discussed how data products work such that they make data easy to find,
consume, share, and govern. And I also introduced the “Data Product Factory” that makes
Data Products easy to build, deploy, secure, and manage.
I am hopeful that, with the insights from this article, first, you will be able to explain
why Data Products are foundational to your Data Mesh journey, and second, you will
understand the capabilities necessary to accelerate the adoption and buildout of Data
Products in your enterprise Data Mesh.
***
All images in this document except where otherwise noted have been created by Eric Broda (the
author of this article). All icons used in the images are stock PowerPoint icons and/or are free
from copyrights.
The opinions expressed in this article are mine alone and do not necessarily reflect the views of my
clients.
I write at the intersection of Data Mesh, Data-as-a-Product, APIs, Event Management, and the digital ecosystem.