Unit 2: Preparing Datasets: Week 1: Data Modeling


Week 1: Data Modeling

Unit 2: Preparing Datasets


Preparing datasets
Overview

Data First Only

Uses ELT import workflow

Measures Only

Primarily used for ad-hoc analytics

© 2021 SAP SE or an SAP affiliate company. All rights reserved. ǀ PUBLIC 2


Preparing datasets
When would you want to create a dataset?

You have data on hand that you want to analyze (data first).

You have a predictive scenario


▪ Predictive scenarios require datasets

You are not certain what view you might need:
▪ Flexibility to quickly tweak the wrangling
▪ Iterative wrangling and analysis
▪ Extract, Load, Transform (ELT) workflow

You are not doing any planning.

The import job is a one-off (or you can overwrite the entire dataset).
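
The ELT idea mentioned above can be sketched in a few lines of Python. This is only an illustration of the load-first, transform-later ordering, not how SAP Analytics Cloud implements it; the sample data and the names `load_raw` and `wrangle` are made up for this example.

```python
import csv
import io

# Hypothetical raw file content, including messy values.
RAW_CSV = """region,revenue
 north ,100
South,250
north,n/a
"""

def load_raw(text):
    """Extract + Load: keep every row exactly as delivered, as strings."""
    return list(csv.DictReader(io.StringIO(text)))

def wrangle(rows):
    """Transform: applied after loading, so it can be tweaked and re-run."""
    cleaned = []
    for row in rows:
        revenue = row["revenue"].strip()
        cleaned.append({
            "region": row["region"].strip().title(),
            # Values that are not whole numbers become None instead of failing.
            "revenue": int(revenue) if revenue.isdigit() else None,
        })
    return cleaned

raw_rows = load_raw(RAW_CSV)   # loaded once, untouched
dataset = wrangle(raw_rows)    # re-run whenever the wrangling changes
```

Because the raw rows are kept unchanged, the wrangling step can be adjusted and repeated without importing the data again, which is what makes the iterative style practical.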



Preparing datasets
Getting started

Dataset creation and management is its own module.

Accessible from the navigation menu, at the far left.

In the Dataset module, you can:


▪ View/edit existing datasets
▪ Create new datasets from uploaded Microsoft Excel or CSV files
▪ Create new datasets from a data connection

Remember!

Datasets are always created via data first modeling!



Preparing datasets
Creating a new dataset from a file

To create a dataset from uploaded data, select From a CSV or Excel File.

You can select a file from the local filesystem or a file server.

The data import job will read from this file.



Preparing datasets
Creating a new dataset from a data source

To create a dataset from a data connection, select From a Data Source.

This will bring up the data source selector:

▪ If you have not already configured the desired type of connection, you can do so now
▪ If you have configured a connection, you will be able to reuse it

Once you have selected a data source, your import job will draw from it.



Preparing datasets
Creating a new dataset from within a story (embedded dataset)

It is possible to create a dataset directly from within a story.

This is done by
▪ Electing to add data during initial story creation
-or-
▪ Selecting the Data tab and then selecting Add New Data
▪ If no data has been added to the story yet, it will proceed to the data selection dialog

Selecting either the file or data source option will result in a new dataset being created via data first modeling.



Preparing datasets
Creating a new dataset from within a story (embedded dataset)

The Smart Wrangling view is displayed in a spreadsheet-style format.

▪ Menu bar at the top
▪ Navigation panel on the right side toggles between the transformation log and details
– If an individual column has been selected, the details and transformation log for that column will be shown
– If no column is currently selected, the details and transformation log for the dataset as a whole will be displayed



Preparing datasets
Data types

Columns of the dataset are strongly typed.

Data types are
▪ Inferred when the dataset is uploaded from a file; users can override the inference
▪ Defined by the source in other cases

List of types:
▪ Dimensions: date, integer, number, string, time, date and time
▪ Measures: integer, number

Date and time
▪ Format is automatically detected
▪ Custom formats can be defined by specifying a template
– Example: DD-MON-YYYY
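
To make the idea of type inference concrete, here is a minimal Python sketch. It is an assumption-laden illustration rather than the product's actual inference logic: the function names are invented, only the DD-MON-YYYY template from the slide is hard-coded, and English month abbreviations are assumed.

```python
from datetime import date

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

def parse_dd_mon_yyyy(text):
    """Parse a value such as 01-JAN-2021; raises ValueError if it does not fit."""
    day, mon, year = text.split("-")
    return date(int(year), MONTHS.index(mon.upper()) + 1, int(day))

def infer_type(values):
    """Return the narrowest type that every non-empty value fits."""
    present = [v.strip() for v in values if v.strip()]
    if not present:
        return "string"
    candidates = (("integer", int), ("number", float), ("date", parse_dd_mon_yyyy))
    for name, parse in candidates:
        try:
            for value in present:
                parse(value)
            return name
        except ValueError:
            continue  # at least one value does not fit; try the next type
    return "string"

def column_type(values, override=None):
    """A user-supplied override wins over the inferred type."""
    return override if override is not None else infer_type(values)
```

For instance, `column_type(["1", "2"])` infers `integer`, while `column_type(["1", "2"], override="string")` keeps the column as text, mirroring the option to override the inference.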



Preparing datasets
Transforms

All existing transformations in current wrangling are available with agile analytics (on datasets)
▪ Concatenate, split, extract, replace, change
▪ Filter is a new function available as a transformation
– Users can now specify the values to be filtered, even if they are not part of the sample
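
The transformations listed above can be mimicked on a small table in plain Python. This sketch only illustrates what each operation does to the data; it is not the wrangling expression language, and the column names, the sample rows, and the reading of "change" as a data type change are assumptions made for the example.

```python
import re

# Hypothetical sample rows.
rows = [
    {"first": "Ada", "last": "Lovelace", "code": "UK-1815", "status": "actv"},
    {"first": "Alan", "last": "Turing", "code": "UK-1912", "status": "inactive"},
    {"first": "Grace", "last": "Hopper", "code": "US-1906", "status": "actv"},
]

def wrangle(row):
    out = dict(row)
    # Concatenate: build one column from two.
    out["full_name"] = f"{row['first']} {row['last']}"
    # Split: break one column into two at the delimiter.
    out["country"], year_text = row["code"].split("-", 1)
    # Extract: pull the first run of four digits out of the code.
    match = re.search(r"\d{4}", row["code"])
    out["year_extracted"] = match.group(0) if match else None
    # Replace: substitute one text fragment with another.
    out["status"] = row["status"].replace("actv", "active")
    # Change (assumed to mean a type change): text to integer.
    out["year"] = int(year_text)
    return out

wrangled = [wrangle(r) for r in rows]

# Filter: the values are specified explicitly, so a value that never
# occurs in the sample ("DE" here) can still be listed.
EXCLUDED_COUNTRIES = {"US", "DE"}
kept = [r for r in wrangled if r["country"] not in EXCLUDED_COUNTRIES]
```

Listing "DE" in the filter has no effect on this sample, but it would take effect on the full dataset if such rows exist there, which is the point of allowing values outside the sample.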



Preparing datasets
Level hierarchies

For a drilling experience in charts across dimensions, users can create hierarchies in wrangling

Functionality
▪ Levels are defined by dimensions
▪ Hierarchies are declared objects in models
▪ Columns can be ordered together in the grid
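
As a rough illustration of what a level hierarchy enables, the following Python sketch totals a measure at different drill depths. The hierarchy, the sample rows, and the `roll_up` helper are hypothetical and only meant to show how levels built from dimensions drive drilling.

```python
from collections import defaultdict

# Levels are defined by dimensions, ordered from coarse to fine.
HIERARCHY = ["country", "region", "city"]

rows = [
    {"country": "DE", "region": "Bavaria", "city": "Munich", "sales": 120},
    {"country": "DE", "region": "Bavaria", "city": "Nuremberg", "sales": 80},
    {"country": "DE", "region": "Hesse", "city": "Frankfurt", "sales": 100},
    {"country": "FR", "region": "Ile-de-France", "city": "Paris", "sales": 200},
]

def roll_up(rows, depth, measure="sales"):
    """Total the measure for each member at the given drill depth."""
    levels = HIERARCHY[:depth]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[level] for level in levels)
        totals[key] += row[measure]
    return dict(totals)

by_country = roll_up(rows, 1)   # top level
by_region = roll_up(rows, 2)    # one drill step down
```

Drilling down in a chart corresponds to increasing `depth` by one: the same rows are regrouped by one more level of the hierarchy.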



Preparing datasets
Validation of the data

The details pane will alert users when:
▪ Data in a column does not fit the corresponding data type
▪ An ID has multiple descriptions

Full validation happens upon user request
▪ Once toggled, all subsequent actions will be validated on the full dataset

At consumption time
▪ Cells with issues will be cleared (rather than the entire row, as happens with models)
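
The two checks and the cell-level clearing described above can be sketched as follows. This is an illustrative approximation in Python, not the product's validation engine; the function names and the sample rows are assumptions.

```python
from collections import defaultdict

def clear_invalid_cells(rows, column, parse):
    """Clear only the cells that do not fit the type; the rows are kept."""
    cleaned, issues = [], 0
    for row in rows:
        new_row = dict(row)
        try:
            new_row[column] = parse(row[column])
        except (TypeError, ValueError):
            new_row[column] = None   # cell cleared, row kept
            issues += 1
        cleaned.append(new_row)
    return cleaned, issues

def ids_with_multiple_descriptions(rows, id_col, desc_col):
    """Report every ID that appears with more than one description."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[id_col]].add(row[desc_col])
    return {key: sorted(descs) for key, descs in seen.items() if len(descs) > 1}

# Hypothetical sample rows.
rows = [
    {"id": "P1", "name": "Pen", "qty": "10"},
    {"id": "P2", "name": "Pad", "qty": "ten"},
    {"id": "P1", "name": "Pencil", "qty": "4"},
]

cleaned, issues = clear_invalid_cells(rows, "qty", int)
conflicts = ids_with_multiple_descriptions(rows, "id", "name")
```

Note that `cleaned` still has three rows: the row with the invalid quantity keeps its ID and name and only loses the offending cell.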



Preparing datasets
Custom transformations

The wrangling expression language is a scripting language for data transformations.

Focus on power and efficiency
▪ Getting started examples
▪ Type-ahead functionality for keywords, functions, dataset objects
▪ Built-in documentation
▪ Clear error handling, pinpointing where the issue happened



Preparing datasets
Lifecycle

When saving a dataset, the user can select which folder to save it in
▪ This determines its visibility and accessibility to others

A saved dataset is reusable in any story created by a user with access

Datasets created inside stories will only ever exist inside that story
▪ They are not reusable

Datasets can always be re-wrangled and modified
▪ Warning! This holds even if they have already been used

Datasets can’t have additional data loaded
▪ Consequence of their ELT approach
▪ Can be manually refreshed by overwriting the entire dataset
▪ Customers with SAP Data Intelligence (DI) can push DI dataflows into datasets



Preparing datasets
Demo



Thank you.
Contact information:

open@sap.com
Follow all of SAP

www.sap.com/contactsap

© 2021 SAP SE or an SAP affiliate company. All rights reserved.


No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of
SAP SE or an SAP affiliate company.
The information contained herein may be changed without prior notice. Some software products marketed by SAP SE and its
distributors contain proprietary software components of other software vendors. National product specifications may vary.
These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or
warranty of any kind, and SAP or its affiliated companies shall not be liable for errors or omissions with respect to the materials.
The only warranties for SAP or SAP affiliate company products and services are those that are set forth in the express warranty
statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional
warranty.
In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or
any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation,
and SAP SE’s or its affiliated companies’ strategy and possible future developments, products, and/or platforms, directions, and
functionality are all subject to change and may be changed by SAP SE or its affiliated companies at any time for any reason
without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or
functionality. All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ
materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, and they
should not be relied upon in making purchasing decisions.
SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered
trademarks of SAP SE (or an SAP affiliate company) in Germany and other countries. All other product and service names
mentioned are the trademarks of their respective companies.
See www.sap.com/trademark for additional trademark information and notices.
