Part 2 - Memo

To: The Director

From: Ashish Kumar


Date: 11/28/2020

Subject: Importance of Data Curation in our company

As the head of the Data Curation team, I welcome you to our company as our new Director. We are happy
to have you and ready to work with you in achieving the company’s mission. Any changes you wish to
introduce to improve our processes are welcome. Do not hesitate to let me know whenever you need
my help.

With this memo I want to bring to your attention the ongoing Data Curation work and the importance of
data curation to our company’s data analytics products and other services. As you know, our business
analytics products depend on the large volume of data that we collect from various sources every day.
Data curation at our company involves the annotation, publication, and presentation of data from these
sources to make it reusable and available over a longer period. In the world of big data, data curation is
extremely important to ensure that the data science team uses quality data for its analysis. Because we
process a high volume of data from various sources through a complex ETL (extract, transform, load)
system into our database, it is important that we prioritize data quality.

As mentioned above, we process data from various channels, and we maintain provenance to track which
source each piece of data came from, along with the details of that source. The benefits of data provenance
are numerous, including maintaining the quality and suitability of data. Understanding the origins of data
helps us to label it with its quality and to identify the right set of data for analysis. Data provenance gives
visibility and greatly simplifies tracing an error back to its root cause in the data analytics process.
Provenance helps our systems use the right set of data for the right problems. Through the methods and
processes we have put in place, the data created is attributable and immutable. We have also put in
place methods by which data is transparent and can be audited at any time. The data provenance practices
in our company make sure that the analytics and intelligence our company uses are trustworthy and
effective.

Along with provenance, we also maintain metadata for all the streams of data used by our data science
team. The metadata is made available to the team through a data catalog so that they can find, research,
and use the available data sets in the most efficient manner.

The volume of data that we store from various sources is growing exponentially, so preservation is also
an important aspect for our company. We have active data management initiatives to preserve relevant data.
Storing metadata alongside the data makes it easier to preserve the data, make it searchable, and keep it
available for future use.

We have also put in place a data policy covering the procedures for data ownership, data management,
data sharing, and data archival. Different teams take care of data from the various streams and are the
owners of their corresponding data. Making data available to other teams only after curation ensures
that the teams using the data get it from a single source and that the data is ready to use. Together with
the data curation activities mentioned above, the data policy is very important to ensure that there is no
misuse of data.

Our data curation team follows robust practices that help data scientists in their analysis and research
and make the company’s data analytics products more useful.

I hope this memo has addressed any concerns that you might have about our data curation services.

We very much look forward to working with you.

Regards,
Ashish Kumar
