
PUBLIC

openSAP
Introduction to SAP Datasphere

Week 1 Unit 1

00:00:06 Hello, and welcome to our openSAP course about SAP Datasphere.
00:00:11 My name is Klaus-Peter Sauer and I work as a Senior Director in Product Management for
SAP Datasphere.
00:00:17 I'm really excited to guide you through the first few units of this course.
00:00:26 So what is the course about and what can you expect? This course is an introduction course to
SAP Datasphere.
00:00:34 You will learn how to leverage the features and functions of Datasphere,
00:00:38 starting with a simple analytical requirement. During the course, we will gradually extend the
complexity
00:00:45 of the practical example, as we add additional tasks
00:00:49 and requirements as they come in. You can also use hands-on exercises
00:00:56 to get your own experience. So the focus of this course is on data modeling,
00:01:02 but we will also touch many other aspects of the solution over the three weeks of the course.
00:01:09 At the end, you will have a very good understanding of how to use the solution for real use cases
00:01:15 to get your job done. So as mentioned, in this three-week course,
00:01:22 we introduce you to the system in the first week so you can start with your first data models.
00:01:29 We will show you how to extend them as the requirements change and grow.

00:01:37 You'll also learn how to integrate data from remote sources, how to use the data flow and other
data integration features
00:01:45 like the data integration monitor, also how to share data and apply data access controls.
00:01:54 In week two, you'll learn more about advanced modeling topics,
00:01:57 so using the analytic model and the Business Builder. We'll also introduce you to the Data
Marketplace
00:02:05 and the intelligent lookup functionality. More topics in week two include the repository,
00:02:12 the data impact and lineage analysis, as well as the new catalog component.
00:02:18 We are rounding it off with the command line interface, where you get to know how you can
use it
00:02:23 and what kind of features and functions are there for you. In week three, you will learn about
administration
00:02:32 and configuration topics, but also more options on the data integration side.
00:02:39 We will show you also integration into analytics and planning, as well as the SAP BW bridge,
00:02:46 and how you can leverage your existing BW system in hybrid deployments.
00:02:54 So let's have a look at your task during the course. Your task is simple - you start building the
analytics
00:03:01 for the Best Run Bikes company. So you have tenants for SAP Datasphere,
00:03:07 as well as for SAP Analytics Cloud to create your data models, to load and manage the data,
00:03:14 as well as building analytical reports on top. The focus of this course is, of course, on SAP
Datasphere.
00:03:23 There are different courses available for SAP Analytics Cloud,
00:03:27 in case you're interested in learning more about that product. So in this unit, you will get an
overview of the scenario,
00:03:35 the system setup, about your own dedicated space
00:03:39 where you actually work with the data. So before we dive into the tasks,
00:03:47 you need to get access to a system first. This is mandatory if you want to do the hands-on
exercises
00:03:54 of this course. So if you do not want to do any hands-on exercises,
00:04:01 you can basically skip this task as well. So let me show you this in the browser, how you get
there.
00:04:11 So in your browser, you navigate to sap.com/datasphere, then you get to this page,
00:04:20 and here you simply click the Experience SAP Datasphere button. Now, a registration form
comes up
00:04:30 where you basically fill in your details and take it from here.
00:04:36 After that, you will get an email to activate your account after successful registration,
00:04:43 and your login details to the system will be emailed to you. Then you have a 30-day access
00:04:49 to a guided experience trial system. If you already have an existing SAP account,
00:04:56 you can also use the button for logging in here on the right-hand side,
00:05:00 and then you don't need to fill in the form and just proceed using your existing account.
00:05:08 In case you already have access to a guided experience trial system
00:05:12 and your system access has expired, you can simply get a new one with the same process.
00:05:21 So when you actually log in to the system, you get to the Home screen.
00:05:25 So the main menu on the left provides you direct access to all essential functions of
Datasphere,
00:05:32 but let me show you that in the system. So in the system, the first thing you find
00:05:40 on the left-hand side is the menu bar and the menu entries. So at the bottom, you find the more administrative topics,
00:05:51 such as Space Management, where you define and manage the different spaces in the system.
00:05:58 So these entries in the menu may differ, depending on the authorizations you might have in
the system,
00:06:05 I have administrator rights, so I can see everything.
00:06:09 That's why I also see the System Monitor, where you actually monitor the overall
system.
00:06:17 The Content Network, the next item here, is where you can deploy business content from SAP,

00:06:24 but also from partners. And also, we offer sample packages,
00:06:29 which can be deployed. In the Security area, you can manage your users,
00:06:36 so create new users, delete users, assign roles, and also have access to the activities log
00:06:43 of the different users. The Transport allows you to export models,
00:06:52 which can be imported then in other Datasphere tenants. The Data Sharing Cockpit is part of
the Data Marketplace
00:07:03 and the place for data providers to define their profiles, as well as the data products, licenses,
contexts, and so on.
00:07:17 The System entry brings you to the Administration and Configuration areas.
00:07:27 Above the Administration section, you find the different applications.

00:07:32 So the applications for data modeling with the Data Builder, as well as for business modeling
for the Business Builder.
00:07:42 The Data Marketplace in this part of the menu is on the consumption side.
00:07:48 So this is where you find the landing page, where you can browse the different data products,

00:07:55 manage the licenses you might have acquired, and so on. Data Access Controls basically define
00:08:09 row-level security, so which rows of the data your users can access can be managed and set up
00:08:18 with the data access controls. The Data Integration Monitor can be used
00:08:23 to monitor your data loads, as well as your schedules. And finally, here, the Connections area
that allows you
00:08:32 to set up connections, also manage these source system connections
00:08:36 of your different spaces. On top of the applications,
00:08:42 you find the area for the metadata and that's where you find the repository,
00:08:49 as well as our new cataloging solution. The middle of the Home screen
00:08:57 contains the most recent news and blogs, which you also find on the Communities page
00:09:03 I already talked about. You also have some quick links here in the middle,
00:09:09 and the recent files are shown here. On the top bar, there are also a few menu items.
00:09:20 The first one you see here is the notifications. This is empty as we haven't done a lot here
00:09:26 in this tenant so far, but you will see that during the course,
00:09:30 where the Successful or Failed notices for your deployments will be shown here
00:09:36 in the notification area. You can also give us feedback about the solution,
00:09:42 and the next icon here is more towards our support colleagues,
00:09:47 where an administrator of the system can create a support user for our colleagues
00:09:52 or download a certain log. An important one is the question mark icon here
00:09:59 because this is getting you to the in-app help, where you find some question marks in the area,

00:10:07 giving you more information about the different sections here.
00:10:11 And we also have sometimes a video embedded in the help, which gets you more information
00:10:17 about the different features and functions. And we also have a What's New section available
here,
00:10:25 where you basically get information about the latest and greatest additions we have
00:10:31 for the different versions of the system. Since we are on a biweekly release cycle,
00:10:36 this will frequently change, so every two weeks you will find a new entry
00:10:41 in this area here. So let's close the help bar.
00:10:47 The next button, with this icon here, is basically getting you to your profile settings,
00:10:55 where you can find the settings for language and those kinds of things,
00:10:59 and also, the Sign Out button if you want to log out of the system.
00:11:04 And last but not least, the rightmost icon here on top gets you to the so-called product
switch.
00:11:13 So we are very tightly integrated with our solution, with SAP Datasphere and SAP Analytics
Cloud,
00:11:20 so that's why we have this kind of embedded application switch in here,
00:11:26 and if you click on the Analytics side of this, you will get to the connected SAP Analytics Cloud
tenant,
00:11:33 but we will also show you that later in one of the first exercises.

00:11:40 So in SAP Datasphere, everything starts with a space. Spaces are virtual work environments
for your artifacts.
00:11:49 It means all your data models live in a space, and depending on your access rights
00:11:54 to one or multiple spaces, you can access the models or not.
00:12:00 Of course, you can also share objects using Cross Space Sharing, and you will also learn
more about that in a later unit.
00:12:10 So spaces help you to isolate objects and assign resources, like space quota and workload
settings and others.
00:12:20 Spaces also hold system connections, as well as space-wide time dimension settings
00:12:26 or settings for currency conversion. So a system can hold many spaces,
00:12:34 for example, one for Finance, for Sales, or for HR data. Project- or line-of-business-specific spaces are also possible.
00:12:44 This really depends on your needs. So each can have different connections,
00:12:50 but let's have a look at the system, how to do that. So when you get to the Space overview,
00:12:59 the different spaces will be shown to you, like in this example here.
00:13:03 So let's dig into the details of a particular space and take our openSAP example.
00:13:13 So when you get into the Space Management section on top, the first thing you find is the description
00:13:21 and the technical name of the space, if it was deployed or not, and who actually set it up.
00:13:28 You can also, if you're an administrator, enable the space quota - so that's basically
00:13:34 where you assign disk storage to a particular space, and also the compute part.
00:13:44 And let's disable this for this space. And in the next area, you find some workload settings.
00:13:53 You can choose between custom and default settings; we're using the default settings here.
00:14:00 Then in the Members section, you can basically add and remove colleagues
00:14:06 you want to have access to this particular space, so I already added a few colleagues here for
my area.
00:14:15 Then in the Database Access area, you can create database users
00:14:21 and create a so-called Open SQL schema, but we will let you know about the Open SQL
schema
00:14:28 in one of the later units. The Connections area, that's where you get
00:14:35 to the Connections screen. We will also show you those details
00:14:40 in one of the later units, where you can basically define or manage your connections to
different source systems.
00:14:49 The Time Data section, that's where you create the time dimension.
00:14:54 Time dimension is often used for analytics and reporting purposes because you want to have
the data structured
00:15:01 in years, months, quarters, or weeks. And with this time dimension,
00:15:07 we generate a generic time dimension, but we will also show you that
00:15:12 in one of the next units already. At the bottom here, you find the Auditing settings,
00:15:18 so you can enable or disable audit settings for particular spaces, and then the different audit
logs will be recorded
00:15:28 for read or change operations, depending on your settings. As the course goes on, you might
wonder
00:15:37 where to find more information outside of the course material.
00:15:42 So this link collection here gets you to the SAP Community page for Datasphere,
00:15:46 where you find more information about getting started, best practices, business content, the
BW Bridge, and more.

00:15:56 You will also find there the latest blogs about the solution, or compose your own blogs there, if
you're interested.
00:16:04 The online documentation is also very helpful to get more details about specific features and
functions.
00:16:13 There are also developer tutorials available, as well as the learning journey for SAP
Datasphere.
00:16:23 So to sum up, now you have a good overview of the course and its structure.
00:16:29 I explained the scenario briefly, and during the other units, you will expand the tasks.
00:16:36 I have shown you how to register to get access to a guided experience trial system
00:16:41 to get really hands-on for the exercises of the course. I also explained what you find on the
Home screen,
00:16:49 what the Space Management is all about. Finally, I showed you some good resources for
information,
00:16:57 where you'll find more helpful information during the course. So that's it for unit one.
00:17:03 Thank you, and good luck with the quiz.

Week 1 Unit 2

00:00:05 Hello and welcome to unit two of our openSAP course about SAP Datasphere.
00:00:12 My name is Klaus-Peter Sauer, and I work as a senior director
00:00:15 in product management for SAP Datasphere. In this unit,
00:00:19 you will learn how to create your first data model and a simple report.
00:00:27 So now that you have access to a system and while IT is setting up the system connectivity
00:00:33 to the backend systems, you can already start to create the first report
00:00:38 based on the sales order sample file, which you were provided with by a colleague.
00:00:44 So the report should show the revenue by sales organization. To achieve that, you need to
upload the file to the system,
00:00:54 build a simple model, and then the data model is then consumed
00:01:00 by SAP Analytics Cloud in a story. I will show you how to do that.
00:01:09 First, let me give you an overview of the Data Builder before we actually upload a file.
00:01:15 So the Data Builder is the central place for data modeling in SAP Datasphere.
00:01:21 You will find various editors there. The Table Editor to create and maintain local tables,
00:01:29 edit remote tables, define semantic usage, manage columns, fields, attributes, measures, and
hierarchies, and more.
00:01:40 The Graphical View Editor allows you to combine local and remote datasets into one data
model,
00:01:46 including filters, associations, joins, unions, and other operators you might want to use.
00:01:55 In the SQL View Editor, you can pretty much do the same using SQL language.
00:02:02 The Entity Relationship Editor allows you to create entity relationship models,
00:02:07 so arranging your local tables, and views, and their relations to each other.
00:02:15 The Analytic Model Editor is used to define multi-dimensional models
00:02:20 for analytical purposes. So you can basically model cube-like structures
00:02:25 with facts and dimensions there. The Data Flow Editor,
00:02:32 that lets you model data flows into SAP Datasphere, including complex transformations.
00:02:41 The Intelligent Lookup Editor allows you to merge data from two entities, even if there is no
joining column,
00:02:50 which actually brings them together. The Replication Flow Editor is relevant
00:02:57 if you want to copy multiple datasets from the same source to the same target
00:03:02 in a fast and easy way and do not require complex transformations.
00:03:08 You can group multiple tasks into a task chain and then run them manually or periodically
00:03:15 through a schedule which you can create. So you will learn more about all of these different
editors
00:03:21 as the course goes on. So in addition to the editors,
00:03:29 there are more functions available in the Data Builder overview.
00:03:33 You can import CSV files, which we will do in a minute, but also import objects
00:03:39 based on CSN or JSON metadata files, import entities from systems like S/4HANA Cloud
00:03:46 or the SAP BW bridge. The remote table import allows the creation of remote tables
00:03:53 from connected systems. The overview is also the place
00:03:58 where you can execute mass operations, meaning for data sharing, for deploying,
00:04:04 deleting multiple models. You can also look at the impact
00:04:08 and lineage analysis for artifacts from the Data Builder overview.
00:04:16 So to import your file and to build your first data model,

00:04:22 you need to follow a few steps to start from the Data Builder overview.
00:04:28 But let me show you that in the system, how you actually get there.
00:04:33 In the system, you navigate to the Data Builder and use the CSV upload function.
00:04:42 So you select the source file from your computer. In this case, the SalesOrders.csv.
00:04:51 You use the default settings as you see them on the screen and click Upload.
00:04:58 We provide the file with the course materials, of course. This takes a while and gets you to the
screen
00:05:06 where you could do some further transformations on the dataset.
00:05:10 We keep everything as is and click Deploy. So now you give it a name,
00:05:18 like SalesOrders CSV, and deploy it.
00:05:29 So deployment means we are physically creating the table on the database.
00:05:34 After successful deployment of your table, you see the table in the Data Builder overview.
00:05:42 Let's click on it. And so we get to the Table Editor.
00:05:50 There, we need to give the SALESORDERID the setting of a key field,
00:05:58 and then deploy the table. So we can now also look at the data in the preview
00:06:08 after the table is deployed. So that shows up on the bottom of your page,
00:06:14 takes a second, and then you can directly preview the data we have uploaded.
00:06:24 So the Table Editor allows you to define tables and their semantics in SAP Datasphere.
00:06:31 There are different usage types available, for example, for relational datasets, meaning flat
tables,
00:06:37 this is also the default setting we just used. There are other specific types
00:06:43 for multi-language text descriptions or hierarchy datasets. Analytical datasets offer fact data
00:06:52 where you can associate dimensional data to create multi-dimensional models.
00:07:00 In the Table Editor, you can define or modify primary keys. We just did that in the example.
00:07:07 Define compound keys, set default values, decide on the column visibility, associations, field
names,
00:07:16 data types, descriptions, time dependency, and also the semantic settings.
00:07:24 You can also delete and upload new data from files, as well as preview the data and share it
with other spaces.
00:07:33 You will learn more about it as the course goes on. So now that the file was imported
00:07:42 and the local table is deployed, you can use it for further modeling.
00:07:48 Let's create a simple view to be consumed in SAP Analytics Cloud for a simple report.
00:07:55 And also here, let me show you how you do that. In the Data Builder overview,
00:08:01 we click on the create New Graphical View button. An empty canvas now shows up,
00:08:09 with a pane on the left for your repository and source artifacts.
00:08:15 We open the tables and see our just uploaded SalesOrders CSV table.
00:08:24 Let's drag it to the canvas. And now a Properties section shows up
00:08:31 to the right of your screen. Let's click on the output node, the view.
00:08:40 Give it a name, like Sales Orders View. As we want to use it for reporting,
00:08:49 let's select the semantic usage for an analytical dataset. You also need to turn on Expose for Consumption
00:08:58 so that it's visible to external tools. Now we get an error message
00:09:03 that we haven't defined any measures yet. So let's take them here, the GROSSAMOUNT,
00:09:11 the NETAMOUNT, the TAXAMOUNT, we highlight them
00:09:15 and just simply drag and drop them to the Measures section, and the error is gone.
00:09:23 So now let's save and deploy this view. So now it's deployed and we can continue.

00:09:46 So looking at the Graphical View Editor in general, it offers the same semantic usage types as
for the tables.
00:09:54 We have just defined an analytical dataset, where the distinction
00:09:59 between measures and attributes is required. So the editor also allows for complex modeling
00:10:07 using different operators like joins, unions, aggregations, projections, currency conversions,
and others.
00:10:17 You can also rename and remove columns, add calculations, add filters, associations, and
more.
00:10:27 You can also model hierarchies with parent-child or level-based relationships.
00:10:34 Modeling multi-language text descriptions is another option that may be useful with
international businesses
00:10:41 that require analytics in local languages, like German, Spanish, or others.
00:10:48 Depending on the login language of a user, the text fields will then show
00:10:53 the respective local language translations if, of course, maintained in the data.
00:11:01 You can apply data access controls, input parameters, persisted views,
00:11:07 and also share these models with other spaces. The data preview is also possible here at
each node,
00:11:16 so you can check if your operator is working correctly. So let's create a simple report to check
on your data model.
00:11:29 We use the application switch on the top right-hand side to switch to SAP Analytics Cloud,
00:11:35 but let me show you that in the system. Okay, now you can use the application switcher
00:11:42 to get to SAP Analytics Cloud. In Analytics Cloud, we go to Stories,
00:11:52 create a new story using the optimized language. Let's take a chart.
00:12:04 So now, we are asked to connect to our data model in Datasphere.
00:12:09 So let's use the SAPDWC connection, select the space and the Sales_Order_View
00:12:17 which we have just created. So now in the chart, we have to select a measure.
00:12:29 We just defined three, the GROSSAMOUNT, NETAMOUNT, and TAXAMOUNT.
00:12:33 So let's take the NETAMOUNT. And we want a report based on the sales organizations.
00:12:41 So we select that. And now we basically created our first little report.
00:12:49 Let's save it. Give it the name, like My First Story.
00:13:06 Save it. And we're actually done for the first part of our exercise.
00:13:13 So in general, there are different options to consume data from SAP Datasphere.
00:13:19 SAP Analytics Cloud is offering a direct live connection. That is what we just saw in the demo.

00:13:26 You can connect multiple Analytics Cloud systems to multiple Datasphere tenants.
00:13:35 Another option is the Microsoft Office Integration, with the add-in for Office 365
00:13:41 for online or desktop versions. The older SAP Analysis for Office is also supported.
00:13:51 Other tools can use SQL or OData interfaces to connect to data models which are
exposed for consumption.
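As an illustration of the SQL option only (not part of the course exercises): if a database user with an Open SQL schema has been created in Space Management, a client could connect with the standard SAP HANA Python driver. Host, port, user, password, and the schema and view names below are placeholders you would replace with your own details.

    from hdbcli import dbapi  # SAP HANA client library for Python

    # Placeholder connection details taken from your Datasphere database user.
    conn = dbapi.connect(
        address="<your-datasphere-host>",
        port=443,
        user="<SPACE#DB_USER>",
        password="<password>",
        encrypt=True,
    )
    cursor = conn.cursor()
    # Query a view that was exposed for consumption (names are illustrative).
    cursor.execute('SELECT TOP 10 * FROM "OPENSAP"."Sales_Orders_View"')
    for row in cursor.fetchall():
        print(row)
    cursor.close()
    conn.close()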
00:14:04 So in this unit, you learned about the Data Builder, about the different editors and options
00:14:11 on the overview screen. You also learned to upload a flat file,
00:14:17 use the Table Editor with some of its features. We also introduced the Graphical View Editor
00:14:23 and some of its basic features. And then you learned how to create a simple story
00:14:29 in SAP Analytics Cloud, and also how to get there using the application switcher.
00:14:36 So that's it for unit two. Thank you and good luck with the quiz.

Week 1 Unit 3

00:00:06 Hello and welcome to unit three of our openSAP course about SAP Datasphere.
00:00:12 My name is Klaus-Peter Sauer and I work as a senior director in product management for SAP
Datasphere.
00:00:18 In this unit, you will learn how to connect to a remote source
00:00:22 and also how to add a time dimension. So in unit two you created your first data model and a
simple story.
00:00:33 In the meantime, your IT colleague has set up the system connection to the backend system.
00:00:39 So you don't need to start from scratch. You can simply replace your local table with a remote
table,
00:00:45 and I'll show you how to do that. After the quick success with your first story,
00:00:53 you notice that the data is summed up over multiple years. If you simply add the date field
00:01:00 it will show the data for each day with booked values. What you really want to achieve
00:01:06 is a more generic time dimension so that the users can view the data on yearly, monthly, or
quarterly levels,
00:01:15 and this unit will show you how to achieve that. So replacing a source table or view
00:01:26 with another one is a very useful function, especially in cases where you start with sample data
first
00:01:33 to get your model, your calculations right, before you use the actual production data.
00:01:40 But let me show you that in the system first how you actually do that.
00:01:45 So let's go to the system connections and check if the HANA connection provided by IT
00:01:54 is actually working, by using the Validate button here. So this now checks
00:01:59 if the different connection options are there and the toaster message on the bottom just
showed us,
00:02:05 okay, this connection is ready for all three different data integration options.
00:02:13 Now move back to the data builder and go to our existing view.
00:02:23 So now, clicking on the view, the system is doing some simple checks
00:02:27 and checking if the underlying structures have changed or were updated by someone else.
00:02:34 That has not happened here so we directly have everything ready.
00:02:39 So instead of the Repository tab, we now move to the Source tab, where we find the different
source system connections
00:02:48 we have in this space and we see the HANA connection,
00:02:52 which we just checked in the connections overview. And now we need to drill down to our area
for the demo
00:03:05 where we find the different tables we have accessible in that remote system. So we basically
take now the sales orders,
00:03:19 drag them onto the canvas, and you now see three options to the right,
00:03:25 union, join, or replace. In our case, we want to replace the flat file with this remote source
00:03:32 so we just hit it here on the Replace button. We click on Import and Deploy.
00:03:43 So same process here, that the table is actually finally deployed on the database.
00:03:50 It asks us about the mapping, since the structures of both of these tables are identical
00:03:56 we can simply hit the Replace button, and now this remote table is actually deployed.
00:04:07 So for the view, also no change here. We can simply just deploy the simple update we had
00:04:20 and then we are good to go and we could refresh our report. So now let's have a look at the
connections in general.

00:04:31 The system offers many options for system connections. We have just used an SAP HANA
connection in our example.
00:04:41 In the Connections area you can set up new connections, edit, delete, and validate existing
connections.
00:04:49 You can also pause and restart connections that are used for real-time replications.
00:04:56 You can also use the open connectors and connect to supported third-party data sources.
00:05:05 So here you see the currently supported connection tiles. You see that we offer a lot of SAP
sources
00:05:13 but also generic connectors to databases, specific hyperscaler systems, cloud storages,
00:05:20 partner solutions, and more. So replacing the local table
00:05:28 with a remote table means that we virtually access data that is not persisted in SAP
Datasphere.
00:05:36 This is called virtual access or data federation. So virtual tables behave similarly to local tables

00:05:44 but the data is only accessed if you query on the dataset, like a preview or in your SAP
Analytics Cloud story.
00:05:54 The data is transferred through the network each time a query is executed. So this also affects
the source system,
00:06:02 where the data is pulled on every access. That is why the data transfer can be restricted
00:06:11 using central filters and selected columns only. So in addition, you can switch seamlessly
00:06:21 between remote access and data replication or snapshots,
00:06:27 without the need to actually change your data model. You can partition these data loads
00:06:33 and schedule the snapshots regularly. As mentioned before, the real-time replication can be
paused and resumed,
00:06:42 and, of course, stopped or canceled. So going into the space management, as I showed you
earlier,
00:06:53 there is an area about connections, and if we click on that button here,
00:06:59 we get to the same place where we've just been earlier with directly using the menu entry to
check
00:07:07 if our HANA connection is valid. So looking at the screen, I can also create new connections.
00:07:18 And on each of those tiles you always find an information icon,
00:07:25 which shows us what kind of data integration options are supported with this particular
connection tile.
00:07:34 So let's have a look at the HANA connection. This supports data flows, remote tables,
00:07:40 and replication flows. And if we want to create a new one
00:07:44 to set a new connection up, we have basically the option to select
00:07:50 if it's a HANA Cloud or an on-premise-based system, depending on what kind of source
system it is
00:07:58 I need to give different connection details, usually host and port, user credentials,
00:08:06 and then if I have, in this case, an on-prem system, I need to pick a middleware component,
00:08:14 so a DP Agent or the Cloud Connector. And if I go with a DP Agent, I have to pick a specific one.
00:08:23 And then all of these different connection options are actually possible. I don't want to create a
new one
00:08:31 since IT has already done that for us. So just to show you how simple it is
00:08:37 if, of course, you have the connection details to the source systems to set up a new connection

00:08:43 and there are different ones that you can pick from. So we exchanged the local table with a
remote table earlier.

00:08:53 You can also refresh your story in SAP Analytics Cloud and see that it still works,
00:09:00 and now shows the data based on the remote connection. The story shows the data summed over all years,
00:09:08 but you want to show it split by year. So as mentioned in the introduction,

00:09:15 simply adding the date field to the graphic would not do the job,
00:09:20 as it would show all dates where you have booked values. So this is where the time dimension
now comes in.
00:09:29 Let me also show you that in the system, how to set up the time dimension
00:09:33 and how to use it in your data model. So let's go into the space management again
00:09:39 and navigate down to the time data area. Let's create these time tables,
00:09:47 and you can select from which year to which year you want to create it. So let's use year 2000
until 2050.
00:09:57 Hit the Create button. And now the actual time tables are generated for you.
00:10:07 So from those year values you can pick whatever fits your needs here,
00:10:14 and now the time table is created. If we go to the Data Builder,
00:10:20 we actually see a lot more tables and views being created for you, which you can now use in
your different data models.
00:10:30 So let's open our sales order view. Again, the system is checking if there are any changes.
00:10:43 And on the right-hand side, in the Properties pane, we scroll down to the Associations area.
00:10:51 We pick Association. Now all the possible objects
00:10:55 for associations are being shown here. Let's limit this to dimensions,
00:11:02 and we see, okay, I could associate the day, month, quarter or year dimension.
00:11:08 Let's use the day so we have the most granular level. Now we get an error message
00:11:19 because we have an association but there's no mapping between the different tables.
00:11:25 So that's what we need to do here. So we basically take the created at date,
00:11:34 drag it to the date field so that we have created a join criteria between the two.
00:11:42 And that's it, let's move back. Deploy the table,
00:11:54 or the view, sorry. And on top, here on the right-hand side, you can always check
00:12:04 what kind of status your view has at the moment. It's not deployed.
00:12:11 Now the deployment process is done, shown by a little toaster message,
00:12:16 shown by the Deployed status here, and you can also look it up in the notifications area,
00:12:24 which shows that a few seconds ago this view was successfully deployed. So now we updated the model,
00:12:34 but we also need to adjust the report, where we want to use the time dimension.
00:12:38 So let's move over to SAP Analytics Cloud, where we open our first simple story we have
created.
00:12:50 We want to edit the story, select our chart here,
00:13:04 and now we can simply add a time dimension, the created at date, which we used.
00:13:10 You also see that there's a little hierarchy icon, which means the time hierarchy is already
active.
00:13:17 So let's select it and select here on the hierarchy,
00:13:23 level two, which gives us the year. And since the horizontal view is not so nice,
00:13:32 let's use the vertical view and move around the date and the sales organization
00:13:42 so now we have everything grouped by year and sales organization, and we can save our
story,
00:13:51 and we have achieved our requirement. So the time dimension in general provides a common
setup

00:14:00 of time data for your space. So you don't need to create time data
00:14:07 for each and every data model that you are building. You can use it in multiple models,
00:14:13 use the predefined hierarchies, and also SAP Analytics Cloud understands these
00:14:19 to drill down into the time dimension. Associations, which we also used with the time
dimension here,
00:14:30 are also a powerful feature of SAP Datasphere. You can create them in various places like the
view editor,
00:14:38 which we just used, but also in the table or the entity relationship editor.
00:14:46 An association basically creates a join that is only executed at runtime,
00:14:52 so, when the association is actually queried. You can associate tables and views as dimensions,

00:15:01 texts, hierarchies, and others. So if we look at this unit, what we have learned
00:15:11 is about the connections and remote tables. I showed you how to replace a local table
00:15:17 with a remote table while your report and the model stay stable.
00:15:24 You also learned about the time dimension and how you can use it in your models.
00:15:30 And last but not least, you also learned about the concept of associations
00:15:35 and how to use them in your data models. So that's it for unit three.
00:15:41 So thank you, and good luck with the quiz.

Week 1 Unit 4

00:00:06 Hello and welcome to unit four of our openSAP course about SAP Datasphere.
00:00:12 My name is Klaus-Peter Sauer and I work as a senior director in product management for SAP
Datasphere.
00:00:18 In this unit, you will learn how to enhance your model with joins and dimensions.
00:00:27 So your task in this unit is to enhance your simple story with the top five sales partners per
region and year.
00:00:36 Looking at the data model, the sales partners show only the ID,
00:00:42 but you want the partner company name to be visible in the story.
00:00:47 Also, you want to show the information about the best-selling products.
00:00:52 To achieve this, you need to learn about dimension tables, joins, and other important features.

00:01:03 So first you need to add the table for the business partners to the system,
00:01:08 but let me show you in the system how that works. So in the Data Builder,
00:01:15 navigate to the Import Remote Tables function. So that's what you get here.
00:01:22 Select the existing HANA connection, and we can click on the next step.
00:01:31 Open our schema here and select the BusinessPartners table.
00:01:39 Click on the next step. Here we could adjust the name and the technical name,
00:01:46 but we leave it as is and directly hit the Import and Deploy button.
00:01:55 Then we close the dialog, and now you see the BusinessPartners remote table imported
00:02:03 in your Data Builder overview. Now the table is deployed,
00:02:11 let's enter into the table definition. Takes a second here.
00:02:18 And what you see in addition to your local tables is this section about remote,
00:02:26 where you find the connection name, the remote table name, and the access method,
00:02:32 which is in this case, of course, Remote. So what we need to do in order to prepare the
dataset
00:02:40 is change the semantic usage here to Dimension. Now we get in the attributes table different
settings.
00:02:52 So we can now say that the company name, for example, is a text field.
00:02:57 There are multiple semantic types available, and for the Partner ID,
00:03:04 we set the label column to "Company name". So then we deploy the change.
00:03:20 The save is complete, but the deployment process is still pending.
00:03:27 We see that in a second here when the changes are deployed. What we can also do here with
remote tables
00:03:37 is, as I mentioned earlier, load snapshots. So I can simply show you how to do that
00:03:44 using the Load New Snapshot command here. Simply execute it.
00:03:51 We also see that the process is completed. If we refresh this section,
00:03:58 we see that the update happened just now, and the refresh frequency is none
00:04:07 because we didn't schedule anything here in this area. So now we've imported the table,
00:04:13 but we need to bring it together with our already existing data model. So let's open our Sales
Orders View.
00:04:32 Scrolling down here to the Associations area, we've seen that before, with the time dimension.

00:04:38 Let's add another association. And we see here the BusinessPartners table,
00:04:45 the remote table we just imported. We select this one.

00:04:52 And now, unlike when we assigned the time dimension, we don't need to manually map
00:05:02 the identifying columns. This has automatically been done by the system
00:05:07 because the IDs and the field names are identical with Partner ID on the BusinessPartners
table
00:05:14 and Partner ID on the SalesOrders table. So we can go one step back and directly deploy the
changes.
00:05:30 So let's check if the deploy part has been executed, still on its way.
00:05:44 So now it is deployed. So we can switch over to SAP Analytics Cloud,
00:05:50 open the story we created earlier. Let's edit the story,
00:06:08 and now we can add a different dimension with the business partner.
00:06:14 We see that as Partner ID here. Let's bring it into the picture.
00:06:23 And oops, make it bigger.
00:06:31 So we see now all the partner names automatically identified here.
00:06:36 Sometimes it can happen that it displays the ID only. So that's what you have here as the
Display As setting
00:06:44 with Description, ID, or both. You can select that.
00:06:47 Since we want to show the names only, it's perfect as it is. And now as we want to have the
top five,
00:06:55 let's go to the Rank function, select Partner ID, Top 5, and now we have the top five sales partners by year and region.
00:07:12 Let's save the report to keep our changes. And that's it for this part of the demo.
00:07:24 So we have used a lot of functions in the last demo, so let's take a step back and look at them.

00:07:32 We used remote tables a few times to achieve virtual access. That leaves the data in the
source system
00:07:39 and it's accessed only when needed. So there's no upfront data movement
00:07:44 and various sources are supported for remote access. The data can also be persisted using
snapshots
00:07:52 or real-time replication. We use the snapshot feature with the BusinessPartners table.
00:07:59 And next to remote tables, you can also take snapshots of views
00:08:03 to materialize your transformations, which you have built into the view.
00:08:11 You could also schedule these snapshots regularly and refresh them, for example on a daily
basis.
00:08:20 Multiple of these runs could also be orchestrated using task chains.
00:08:29 So we have used the semantic usage already a few times. So let's have a closer look at what
those options are about.
00:08:39 The analytical dataset is used for multidimensional analysis. That is where you need at least
one
00:08:46 or more measures that can be analyzed. Relational datasets are just flat representations
00:08:54 and contain columns with basically no specific analytical purpose. Dimensions indicate that
your entity contains attributes
00:09:05 that are used to analyze and categorize measures defined in other entities, like Product Master
Data.
00:09:15 Hierarchies are used to show your data with parent-child relationships for members in a
dimension.
00:09:22 And finally, texts indicate that your entities contain strings
00:09:27 with language identifiers to translate those text attributes.
00:09:34 Let's have a closer look at dimensions. So dimensions are needed

00:09:39 for multidimensional analysis of data. You typically have a table with measures
00:09:45 that you want to analyze in different dimensions, like geography, for example regions,
countries, states, cities.
00:09:57 Or you might have products in there with categories and product details,
00:10:02 or business partners, which we just used. And the time dimension is another example
00:10:07 of what we already used for multidimensional analysis. As mentioned before, you are also able

00:10:18 to schedule specific data loads like snapshots. It allows you to load the data on a regular basis

00:10:26 and get the latest updates. The screen here shows the settings of a schedule.
00:10:32 So successful loads have the status Available, as we have seen in the demo before.
00:10:42 So now let's look at the second part of the task, where you want to get to the best-selling
products.
00:10:50 The SalesOrders table does not contain any product information. So to achieve that,
00:10:56 we now need to join the SalesOrderItems table. So let me also show in a demo how to achieve
that.
00:11:07 So go to the Sales Orders View and join the SalesOrderItems table with the SalesOrders.
00:11:14 So we select our HANA connection, our schema,
00:11:25 and scrolling down, we find the SalesOrderItems table. And similar as we did it with the
replace function,
00:11:33 we simply drag and drop it onto the SalesOrders table which is there because joining is the
default setting.
00:11:40 We don't need to specifically select that. The table is new, so let's import and deploy it directly.

00:11:53 We also get to the join operator here directly since SALESORDERID is available in both
tables.
00:12:05 We can also expand this if you want to have a better view of these settings.
00:12:12 So inner join, that's what we want, that's the default setting.
00:12:15 We could select cardinality, but we leave that for the moment, and everything is here as the
standard setting.
00:12:27 One thing we can also adjust in the View Properties: the SalesOrderItems table
contains an additional measure. We could bring that into the measures area as well.
00:12:45 Oops, in this case it's the QUANTITY. You can also see depicted here in the picture
00:12:51 which table this field comes from. So QUANTITY, we can simply say Change to Measure.
00:12:58 So instead of dragging and dropping the information, we can also bring it in there with this
function
00:13:06 and deploy the table. So the deployment is still in process.
00:13:30 The deployment's done, so let's move over to SAP Analytics Cloud.
00:13:36 Let's open our story again. Let's edit the story
00:13:52 and let's bring a new graphic in there. Select the measure, for example "Gross amount".
00:14:13 As a dimension, we select the PRODUCTID and we also add the "Created at date".
00:14:31 Let's use the Horizontal setting, move this around,
00:14:38 and use the year as we did before, Level 2, that is,
00:14:44 and make this a little bigger. And as we want to have the top five products per year,
00:14:53 let's use the Rank function again, Product ID, Top 5. And here we go, top five products per
year.
00:15:06 So that's basically it about the best-selling products story and the end of this second part of the
demo.

00:15:17 So now coming back to the theory part, there are multiple options for joins and unions.
00:15:24 So Datasphere supports several types of joins, like cross joins, full joins, left joins,
00:15:31 right joins, or inner joins. The inner join is the default value.
00:15:36 So that's what we have used also in the example. You can also define the cardinality on both
tables
00:15:43 to improve the performance of the join execution. Using the Distinct Values checkbox would
mean
00:15:51 that you return only unique values. So for datasets that should contain only unique values,
00:15:59 that's a very valuable feature which can be used in those scenarios.
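Not part of the exercises, but as a rough analogy for readers who think in code, the join behavior described above can be sketched in pandas; the tables and column names below are made up for illustration, and the actual joins are of course executed by Datasphere, not in Python.

    import pandas as pd

    # Two tiny illustrative tables.
    orders = pd.DataFrame({"SALESORDERID": [1, 2, 3], "NETAMOUNT": [100.0, 250.0, 80.0]})
    items = pd.DataFrame({"SALESORDERID": [1, 1, 2], "QUANTITY": [5, 2, 7]})

    # Inner join (the default in the view editor): only rows with matching keys survive.
    inner = orders.merge(items, on="SALESORDERID", how="inner")

    # A left join would keep all orders and fill the missing item columns with NaN.
    left = orders.merge(items, on="SALESORDERID", how="left")

    # The Distinct Values option corresponds to keeping only unique rows in the result.
    distinct = inner.drop_duplicates()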
00:16:08 So in this unit, you have learned a lot about remote tables and how to import them
00:16:14 from the Data Builder overview, as well as from the graphical view editor.
00:16:23 You also better understand now the semantic usage types of tables and views, as well as the
semantics for fields.
00:16:35 We used dimensions to show you how to display texts instead of ID values in your story
00:16:43 by defining texts and IDs in your data model. You also learned about snapshots and
scheduling,
00:16:52 as well as joins and unions. So that's it for unit four.
00:17:01 Thank you and good luck with the quiz.

Week 1 Unit 5

00:00:05 Hello, and welcome to week one, unit five of our openSAP course, Introduction to SAP
Datasphere.
00:00:13 My name is Tim Huse from SAP Analytics and Insight Consulting.
00:00:18 In this unit, you will learn how to create data flows, task chains, and SQL views with SAP
Datasphere.
00:00:25 Let's dive in. We start by looking at your task for this unit.
00:00:30 The story that has already been created needs to be extended so that the top sales person per
region
00:00:36 can also be displayed. The required data is stored in two remote tables in other systems,
00:00:42 which can be imported into SAP Datasphere using the data flow. For this purpose, the data is
joined
00:00:49 and persisted in a local table in your space. In order to load the data in a systematic process,

00:00:55 you will learn how to create a task chain to build a sequence of loading tasks.
00:01:01 The data flow is an artifact in SAP Datasphere that can be used to integrate
00:01:06 and transform data from a plethora of data sources. The data flow provides an intuitive
00:01:11 graphical modeling experience to meet extract, transform, and load requirements.
00:01:16 In a data flow, data from SAP and non-SAP data sources can be loaded and combined,
00:01:22 such as tables and ABAP CDS views. Standard transformations,
00:01:26 such as aggregation and filtering, can be used, as well as scripting for advanced requirements.

00:01:33 Data can be replicated via a filter-based delta, and also only specific columns
00:01:37 from these source tables can be replicated to reduce the data transfer.
00:01:42 Data flows can be started automatically via a planned schedule, memory can be allocated
dynamically,
00:01:48 and data flows can be restarted automatically via an auto restart option in case of errors.
00:01:55 Now let's take a closer look at the script operator, as opposed to standard transformations.
00:02:01 As already mentioned, the data flow offers standard transformations.
00:02:05 Thereby data can be combined in a no-code environment with aggregations, joins, filters,
00:02:11 unions, as well as the addition of new tables. Tables, ABAP CDS views, OData services, and
remote files,
00:02:19 such as JSON or Parquet, can be selected as data sources. For the target table
00:02:26 in which the entries are persisted, you can specify whether data is appended at the end,
00:02:30 whether the table is truncated before the run, or whether existing entries are deleted
00:02:34 based on match conditions during the data flow run. In contrast to the standard
transformations,
00:02:41 the script operator can be used if there are more advanced requirements.
00:02:45 For example, the extraction of text fragments. The script editor is integrated in the Data Flow
Modeler
00:02:52 and supports the standard Python 3 scripting language. The script operator can provide data
manipulation
00:02:59 and vector operations, and supports the well-known modules NumPy and Pandas
00:03:04 without the need to import them explicitly. Now let's create your own data flow in a demo.
00:03:10 So in this demo, we want to build a data flow to retrieve data about employees

00:03:16 and the addresses of these employees that are stored within a HANA table.
00:03:19 Some columns from these sources are not needed and we also have some sensitive data,
00:03:23 phone numbers and email addresses, that we want to mask in our data model.
00:03:29 Okay, we start within Data Builder and go to the tab Flows.
00:03:38 There, we click on create a New Data Flow. We are going to rename the data flow
00:03:43 to DataFlow_EmployeesWithAddresses. Now we have the possibility to insert tables and views

00:03:53 from your repository or from other sources. As we want to use tables from a HANA source
system,
00:03:58 we click on Sources, and then we choose our SAP HANA connection.
00:04:06 Now we have to open the correct schema, which is DWC_DEMO.
00:04:16 And there you can see the Employees table, as well as the Addresses table.
00:04:21 You can just drag it to the panel. And then we can click on the join operator
00:04:34 and connect both source tables with the join operator. And if you click on the join operator,
00:04:46 you can already see that the system will propose you the join condition.
00:04:49 In our case, that's already fine. Nothing to do from our side.
00:04:56 But there are some columns that we don't need, like Street and Postal Code,
00:04:59 so we need to add a projection node and remove all the unnecessary columns.
00:05:07 Those columns are PHONENUMBER, EMAILADDRESS,
00:05:21 the STREET, POSTALCODE,
00:05:31 BUILDING, as well as ADDRESSTYPE.
00:05:39 For the masked columns, we are creating so-called calculated columns,
00:05:43 therefore we click on the plus button. The first one is PHONENUMBER.
00:05:49 We will rename the column to PHONENUMBER and set it as a string with 14 characters.
00:06:00 We want to apply a simple masking function that displays a placeholder
00:06:04 for the first digits of the phone number and then displays the last four digits
00:06:08 of the original column PHONENUMBER. So we are building our expression.
00:06:15 We can search for all available functions here as well, for all columns and operators.
00:06:19 We need CONCAT so I'm searching for that. Now I will insert a placeholder string,
00:06:25 which will be the first part of the masked number. Next, I'm searching for RIGHT,
00:06:37 which is a function to return the rightmost characters of a string.
00:06:48 I could also use the autocompletion to find the function.
00:06:55 And here, I'm using the name of the original column, which is also PHONENUMBER,
00:07:02 and a four as the second argument as we want to show the last four digits.
00:07:13 Now we can validate the expression, and that's it.
00:07:19 One more to go as we also want to mask the EMAILADDRESS.
00:07:26 This time, we need a more complex expression as we want to display the first four
00:07:30 as well as the last four characters of the email address and mask the middle part.
00:07:35 Therefore, we create another calculated column, we rename it to EMAILADDRESS,
00:07:39 and set it as a string with 12 characters. I just paste the statement,
00:07:45 which is using the CONCAT and RIGHT function again, as well as the LEFT function.
00:07:50 And let's see if the validation is successful. Yes, it is.
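For reference only: the demo builds these expressions in the calculated-column editor using CONCAT, RIGHT, and LEFT, but the same masking logic can be sketched in pandas terms as below. The column names match the demo; the placeholder characters and the exact expression syntax are assumptions.

    import pandas as pd

    df = pd.DataFrame({
        "PHONENUMBER": ["+49-555-0100778"],
        "EMAILADDRESS": ["jane.doe@mail.com"],
    })

    # Phone: a ten-character placeholder plus the last four digits (14 characters in total),
    # mirroring CONCAT('**********', RIGHT(PHONENUMBER, 4)).
    df["PHONENUMBER"] = "**********" + df["PHONENUMBER"].str[-4:]

    # Email: first four characters, a four-character mask, and the last four characters
    # (12 characters in total), mirroring a nested CONCAT with LEFT and RIGHT.
    df["EMAILADDRESS"] = df["EMAILADDRESS"].str[:4] + "****" + df["EMAILADDRESS"].str[-4:]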
00:07:57 So let's continue. Finally, we need the target table,
00:08:01 where we want to persist the transformed data records. So we click on the last node in our flow

00:08:06 and press the table icon, and it will automatically add another node

00:08:09 for the target table. It will also already add all necessary columns.
00:08:14 We will rename this target table to EmployeesWithAddresses.
00:08:18 Now this local table is not deployed yet, so we will go to the top right
00:08:21 in the Properties panel and click Create and Deploy Table.
00:08:29 And now we also can define the mode of the target table. We will use TRUNCATE.
00:08:39 So each time the data flow runs, all the existing data in the target table
00:08:43 is truncated before the run. Now we are good to go
00:08:47 and can deploy the data flow by pressing the cloud icon, which will automatically save it
as well.
00:08:57 Now that the deployment is finalized, we can click the run button,
00:09:02 which looks like a play button, to start the data flow once. You can check the run status on the
right side
00:09:11 and you could also refresh it. Okay, it is finished,
00:09:18 so we can open the target table and look at the data preview.
00:09:40 Okay, cool. So the data is here.
00:09:42 We cannot see the masked columns. This is because by default,
00:09:46 only 20 columns are displayed in the data preview, so we need to activate the
PHONENUMBER
00:09:50 and EMAILADDRESS as well. Great, the sensitive data is masked now.
00:10:10 Now that's it for this demo. Now let's have a look at an example
00:10:16 of how to utilize the script operator in data flows. No worries, this part is not included in your
exercises.
00:10:22 In this data flow, we have a table with first names and last names
00:10:26 and we want to generate email addresses for that. You can see we have three records on the
source table.
00:10:37 So as a second node, we have the script operator. You can see that we have added another
column,
00:10:45 the column Email as a string 100, which will be filled during this operation.
00:10:51 And in the Properties panel, you can also see the Python script.
00:11:02 Okay, so we can click here to maximize the script for you. So we have a function transform
here,
00:11:08 and the data of the input node will arrive as a Python dataframe here.
00:11:12 So we are just creating a new column for the dataframe, which we call Email,
00:11:16 as we did it in the configuration of the data flow. And we assign data to the column.
00:11:21 The data you will return within this transform function is sent to the next node of the data flow.
00:11:28 In our case, this is the target table. And as we already deployed the artifacts
00:11:33 and ran the data flow previously, you can see in the data preview
00:11:36 that the emails have been generated. So that's it for data flows now.
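To make the verbal description above more tangible, here is a minimal sketch of what such a script operator body could look like. Only the transform(data) pattern with a pandas DataFrame is taken from the unit; the source column names FIRSTNAME and LASTNAME and the email domain are assumptions.

    # Pandas and NumPy are available in the script operator without explicit imports.
    def transform(data):
        # "data" arrives from the input node as a pandas DataFrame.
        # Build the new Email column (defined as string(100) in the operator settings).
        data["Email"] = (
            data["FIRSTNAME"].str.lower() + "." + data["LASTNAME"].str.lower() + "@example.com"
        )
        # Whatever is returned here is passed on to the next node, the target table.
        return data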
00:11:42 Next, we will take a look at task chains. Task chains offer the possibility to group multiple tasks

00:11:49 and to start them manually or to start them periodically via a schedule.
00:11:54 A task chain can include flow runs, replication of remote tables, and persistence of views.
00:12:01 The tasks of a task chain are always processed serially. In the settings,
00:12:06 you can specify that a customizable email notification is sent to a specified group of people
00:12:11 in the event of success and/or error of a task run. On the right-hand side, a sample task chain
is shown.
00:12:19 Here the remote table Master Data Attributes is replicated first,

00:12:23 then the data flow number seven is processed, and finally, the view Sales Orders is persisted.

00:12:31 Task chains can be started manually once or periodically via schedule.
00:12:36 On the left-hand side, you can see how a task chain can be started directly
00:12:39 in the task chain modeler by clicking on the play button.
00:12:42 On the right-hand side, you can see that task chains
00:12:45 can be scheduled and started manually in the so-called Data Integration Monitor.
00:12:50 The previous runs and the current statistics, such as the duration, and the start and end of the
last run,
00:12:55 are also displayed here. In the Data Flow Monitor tab,
00:12:59 data flows can also be started and scheduled analogously. Now let's continue with a short
demo
00:13:04 on creating a task chain. So we start in the Data Builder again
00:13:09 and we navigate to the Task Chains tab. We will create a new task chain,
00:13:13 which we rename simply as TaskChain. Now you can see on the left side
00:13:18 that we can insert remote tables, views, as well as data flows.
00:13:23 We want to start by replicating the BusinessPartners table, so we will just drag it to the middle.

00:13:36 And as a second step, we want to execute the previously created data flow.
00:13:40 So we also drag and drop it to the editor area. You can see it is already assigned
00:13:51 as a second step of the task chain. You can see in the Properties panel
00:13:58 that there are two objects in the task chain, and you can set an email notification
00:14:02 in case of a successful or failed task chain. But we don't need this now.
00:14:05 We are good to go. We can deploy the task chain,
00:14:08 which will also save the artifact. And it's finished now.
00:14:11 However, we don't need to start it now, but feel free to try it out on your own.
00:14:16 So that's it for this demo. Okay, we start again in the Data Builder,
00:14:22 and let's say we have identified a Time Dimension - Day view as a view with bad performance,
00:14:27 which we want to materialize. So we click on that view.
00:14:31 And you can see here that this is an SQL view with SQL coding.
00:14:36 In the Properties panel on the right, you can see that there is a Persistency area.
00:14:41 Currently, there is no persistency and the access is virtual.
00:14:46 Now if you click on the database icon, you can manually create a snapshot of the data in the
view.
00:14:52 And then you can see the last updated time. However, if you click on the calendar icon,
00:15:01 you can schedule the view persistency periodically and utilize either simple schedules or cron
expressions.
00:15:19 We will not do this now. That's it for this demo.
00:15:22 Besides graphical views with drag and drop support, there is also the possibility to use SQL
views
00:15:28 in SAP Datasphere to develop views with SQL scripting capabilities.
00:15:34 Here, a distinction is made between two languages that can be utilized.
00:15:38 For standard SQL queries, which are based on a select statement,
00:15:42 SQL can be selected. For more complex structures,
00:15:45 for example, if statements and loops, SQLScript can be selected to develop a table function.
00:15:51 SQLScript is SAP's own extension of SQL. SAP Datasphere supports a subset of the SQL syntax
00:15:59 supported by SAP HANA Cloud. This includes operators,
00:16:03 predicates, expressions, and functions. Details are described in the documentation.
00:16:11 By persisting the view data, you can improve the performance while working with views.
00:16:16 You can enable view persistency for graphical and SQL views
00:16:19 to materialize the output results. By default, a view is run every time it is accessed.
00:16:25 And if the view is complex or a large amount of data is processed,
00:16:29 or the remote source is slow, this may impact the performance of other views
00:16:33 or dashboards built on top of it. You can improve performance by persisting the view data
00:16:39 and you can schedule regular updates to keep the data fresh. Similar to remote snapshots, the
result set is persisted.
00:16:47 Only the required data needs to be persisted, instead of a one-to-one replication of the remote source.
00:16:53 The Data Integration Monitor can be used to monitor the view persistence,
00:16:57 to automatically trigger it via schedule, and to analyze the view.
00:17:01 View persistence supports partitioning and partition-wise refresh of data.
00:17:06 Furthermore, the view persistence can be included in the task chain
00:17:09 and started via that task chain. Now let's have a demo
00:17:13 on how to enable this view persistence. The next demo is about associating
00:17:17 the employee data that we just incorporated in the previous exercises.
00:17:21 Therefore, we create a new graphical view in the Data Builder.
00:17:24 We drag the local table EmployeesWithAddresses to the canvas,
00:17:31 and then we rename the view to Employees. We set the semantic usage to Dimension,
00:17:51 as Employees is a dimension that we want to associate with our dataset.
00:17:58 We set the EMPLOYEEID as the primary key of the view. Now we can deploy the Employees
view
00:18:05 by pressing the cloud icon. The next thing we can already start while it's deploying
00:18:19 is to open the Sales Order View because we want to associate the Employees dimension to it.

00:18:44 So we go to the Associations area and click on the plus to create a new association.
00:19:05 We want to connect the EMPLOYEEID field of our Employees dimension
00:19:09 to the field "Created by" in our Sales Order View. We do so by dragging and dropping the
"Created by" field
00:19:23 on the EMPLOYEEID field. Now we can hit the deploy button
00:19:35 to redeploy the analytic dataset and switch to the SAP Analytics Cloud.
00:19:41 And there, we can open our previously created story. We will insert a new chart
00:19:54 where we want to display the sales per person. Therefore, we will use the measure "Gross
amount",
00:20:08 and we will display the last name of the "Created by" association as Dimension.
00:20:25 In order to rank the results, we can restrict them to the top 10 employees
00:20:29 with the highest gross amount. Et voilà - we have new data in our dashboard.
00:20:38 That's it for now. Let's summarize what you've learned in this unit.
00:20:43 You've learned what a data flow in SAP Datasphere is and how to use the Data Flow Editor.
00:20:49 You learned how to fulfill more complex transformation requirements
00:20:52 using the script operator in data flows. You learned how to create and schedule a task chain.
00:20:59 And finally, you've learned how to persist a view. That's it for this unit on data flows,
00:21:04 task chains, and SQL views. In the next unit, unit six,
00:21:08 I will show you how to share data, use access controls,

00:21:12 and create entity relationship models in SAP Datasphere. Thank you, and good luck with the
upcoming quiz.

Week 1 Unit 6

00:00:05 Hello and welcome to week one, unit six of our openSAP course, Introduction to SAP
Datasphere.
00:00:12 My name is Tim Huse, from SAP Analytics and Insight Consulting.
00:00:17 In this unit, you will learn how to share data, create hierarchies, create access controls,
00:00:22 and create entity relationship models with SAP Datasphere. Let's dive in.
00:00:29 We start by looking at your task for this unit. You have created a story that shows a good view
of sales,
00:00:35 business partners, products, and employees. You now need to add the product information to the sales order view
00:00:42 and then share it with the sales organization for their reporting. You will use the products view
from the master data space
00:00:48 and share the resulting sales order view to the sales org space for the consumption of the
view.
00:00:54 But before releasing this report, we need to ensure that users see data based on their
authorization,
00:00:59 which is region specific. Therefore, you will create and apply a data access control.
00:01:07 Let's start with hierarchies. A hierarchy is a systematic way of organizing members
00:01:12 of a dimension into a logical tree structure in order to support drill down and drill up
00:01:17 on the dimension in business intelligence clients, such as SAP Analytics Cloud.
00:01:23 An example would be to display the sales per country and then drill down to display the sales
00:01:28 for the respective regions of the selected country, and then to drill down to the respective city
00:01:33 of the selected region. This will be a so-called drill down
00:01:36 to the dimension location with the levels country, region, and city.
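As a side note, drilling down can be thought of as aggregating the same measure at successively finer hierarchy levels; the following pandas sketch uses invented sample data purely to illustrate that idea, it is not part of the course system.

```python
# Illustration only: "drilling down" a location hierarchy means aggregating the same
# measure at finer and finer levels. Data and column names are invented for this sketch.
import pandas as pd

sales = pd.DataFrame({
    "COUNTRY": ["DE", "DE", "DE", "FR"],
    "REGION":  ["Bavaria", "Bavaria", "Hesse", "Ile-de-France"],
    "CITY":    ["Munich", "Nuremberg", "Frankfurt", "Paris"],
    "SALES":   [100, 50, 80, 120],
})

for levels in (["COUNTRY"], ["COUNTRY", "REGION"], ["COUNTRY", "REGION", "CITY"]):
    print(sales.groupby(levels)["SALES"].sum(), end="\n\n")   # country -> region -> city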
00:01:42 There are three ways to create a hierarchy for your model. You can create level-based
hierarchies,
00:01:48 parent-child hierarchies, or you may use external hierarchies for your dimension.
00:01:53 A level-based hierarchy is non-recursive and has a fixed number of levels.
00:01:58 This can be, for example, a time hierarchy like year, quarter, month, day. The example I just
gave with country and region is also a level-based hierarchy.
00:02:09 A parent-child hierarchy is recursive, can have different numbers of levels
00:02:13 and is defined by specifying a column for parents and a column for children in the dimension.

00:02:20 For example, a departmental hierarchy could be modeled with the parent department ID and
department ID columns.
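To make the "recursive, variable depth" point concrete, here is a small, hypothetical sketch of such a parent-child table and how the level of each node can be derived from it; the column names simply mirror the departmental example and are not an SAP Datasphere API.

```python
# Hypothetical parent-child table mirroring the departmental example above.
# The level of each node is found by walking up the parent column until the root.
import pandas as pd

departments = pd.DataFrame({
    "DEPT_ID":        ["D1", "D2", "D3", "D4"],
    "PARENT_DEPT_ID": [None, "D1", "D1", "D3"],
}).set_index("DEPT_ID")

def level_of(dept_id: str) -> int:
    parent = departments.loc[dept_id, "PARENT_DEPT_ID"]
    return 1 if parent is None else 1 + level_of(parent)   # recursion allows any depth

for dept in departments.index:
    print(dept, "is on level", level_of(dept))              # D1 -> 1, D2/D3 -> 2, D4 -> 3
```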
00:02:27 An external hierarchy is a parent-child hierarchy, where the information of the hierarchy is
contained in a separate view,
00:02:35 which needs to be associated with the dimension. At this point, let's take a look at how to
create such a hierarchy in a demo.
00:02:43 In this demo, we are creating a Products dimension for product master data that has been shared with us from the master data space.
00:02:51 We want to create a hierarchy for this dimension. So, we create a new view.
00:02:56 On the left side, we can see a folder called Shared Objects. Here we can find the MD_Products view
00:03:01 that has been shared with us. We drag it to the canvas and rename the view to Products.
00:03:14 And we set the semantic usage type to Dimension. Now, let's preview the data.
00:03:27 Looks good so far. You can see for every product we have a product category

00:03:31 and they are non-recursive. So, we can create a level-based hierarchy
00:03:35 for product category and product. In the Properties panel on the right,
00:03:42 we click on the Hierarchy icon and then we click plus to add a new level-based hierarchy.
00:03:47 We can keep the default name here. Level one will be product category ID
00:03:52 and level two will be the product ID. We have added a hierarchy.
00:03:59 Now, we can deploy the new dimension view. The next thing we have to do is to associate
00:04:04 this dimension with our sales order view. We therefore go back to the Data Builder
00:04:10 and select our sales order view. We go to the Associations area
00:04:31 and add an association to the products view. And now we can deploy the change by pressing
the cloud icon.
00:04:51 Now, let's switch to the SAP Analytics Cloud story. We can change the top selling products
chart
00:05:01 to display the hierarchy or to display product names instead of the product ID now.
00:05:07 So, if we click on the dimension product ID, we can already see that a hierarchy has been
identified
00:05:11 and you can choose a level here. So, one would be root level,
00:05:15 two is the product category level, and three is the product level.
00:05:19 By clicking on one product category in the chart, you can just drill down to see all the products

00:05:24 within this category. We can also display the product name
00:05:33 instead of the product ID now. And we can rank the output to only show the top 10 best-selling
products.
00:05:47 Also, have a look at the sales organization in your story. You should see data for EMEA, APJ,
as well as America.
00:05:55 This is important for the next exercise, as we want to restrict the data that you see here.
00:06:00 Now, let's take a look at data access controls. These allow a more granular view on data at the
row level.
00:06:07 What does that mean? A user may only see the rows of a data set
00:06:11 that he is authorized to see. Whether he is authorized depends on the previously defined criteria,
00:06:18 which are defined in a data access control. Data access controls are applied to artifacts
00:06:24 in the data layer and cannot be overruled. For example, if a view is built on top of a view
00:06:29 that is restricted by a data access control, this restriction also applies to the view above it.
00:06:36 Data access controls, once created, can be applied to various artifacts within the data layer.
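To picture the effect, a data access control behaves roughly like a row-level filter joined against a permissions table on the current user. The pandas sketch below is a conceptual illustration with invented column names and data; it is not how Datasphere enforces the control internally.

```python
# Conceptual illustration of row-level security: only rows whose sales organization
# appears in the current user's permission entries remain visible. Invented data.
import pandas as pd

sales_orders = pd.DataFrame({
    "SALESORGANIZATION": ["EMEA", "APJ", "AMERICAS"],
    "GROSSAMOUNT":       [100, 200, 300],
})
permissions = pd.DataFrame({
    "USERNAME":      ["alice", "alice"],
    "ALLOWED_VALUE": ["EMEA", "APJ"],
})

def rows_visible_to(user: str) -> pd.DataFrame:
    allowed = permissions.loc[permissions["USERNAME"] == user, "ALLOWED_VALUE"]
    return sales_orders[sales_orders["SALESORGANIZATION"].isin(allowed)]

print(rows_visible_to("alice"))   # the AMERICAS row is filtered out
```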
00:06:42 In the following demo, we will look at data access controls in action.
00:06:45 We will restrict the sales order view so that a user can only see the sales organizations that he is authorized for.
00:06:52 So, let's start. We start in the data builder and go to the tables
00:06:58 as we want to define the privilege for our data access control within a new local table.
00:07:04 We will call this table DACDefinition, and it will have two columns,
00:07:09 username as well as allowed values. With this table, we can define which user
00:07:17 will see which sales organization data. We can keep the default data types here
00:07:22 and deploy the table. Now, let's utilize the data editor
00:07:40 within the table to insert new data to the table. The data editor is on the top right side.
00:07:51 Now, you can add two rows with your username and EMEA, as well as APJ as allowed values.

00:08:41 Now, we can save these data entries and jump to the data access controls.

00:08:47 The data access controls have their own panel that you can find in the left navigation bar,
below the data builder.
00:08:52 We rename the new DAC to Region_DAC. We add a permission entity,
00:09:00 which will be our newly created DACDefinition table. We will only use the allowed values column as the criterion
00:09:07 for the access control, while the username column will be used
00:09:11 as the identifier column. Now, we can deploy the access control
00:09:16 by pressing the cloud icon, which will automatically save the artifact as well.
00:09:27 The data access control is created now, but we need to associate it to our sales order view
00:09:32 in order to apply it to the view. Therefore, we go back to the data builder
00:09:38 and jump into the sales order view. There is a special area for data access controls
00:09:44 and we can link it here. We will map the allowed values column
00:09:53 to the sales organization column of our sales order view. Afterwards, we will deploy the
change to the view.
00:10:07 Now finally, we can open the data preview in order to check that we now only see the regions EMEA and APJ.
00:10:15 You can check this in your Analytics Cloud story afterwards as well. Yes, so we can see EMEA
and also APJ.
00:10:39 Yes, looks good. So, that's it for this demo.
00:10:44 It is possible to share a table or view of the data layer to another space to allow members in
that space to use it as a source for their objects.
00:10:52 A space is a secure area, and its artifacts are not seen in other spaces unless you choose to
share them
00:10:58 with the other space members. When you share an entity to another space,
00:11:03 users in that space can use it as a source for their own views and other objects.
00:11:08 In the example on the slide, you can see how the products and sales tables originally reside in
a space
00:11:14 that is assigned to the IT department. This is because the data comes
00:11:18 from an external source system that is governed by IT. The products table is shared with the
sales space
00:11:25 and the view with sales data is also shared with the sales space,
00:11:29 so that a sales-by-product view can be built on top in the sales space. The artifacts from the IT space are marked as shared.
00:11:38 We will now take a look at this concept in a demo. We start in the data builder in our openSAP
space.
00:11:44 As an example, we want to share our sales order view to a sales space. So we click on the
Sales Order View and hit the share icon.
00:11:53 Then we can search for any space in the tenant and share it with read privileges.
00:12:02 The view is now shared. You can see it because there is a small share icon
00:12:07 behind the name of the view. If you click on that icon, you can see all spaces the view is
shared with.
00:12:19 Now, I want to show you how to consume a view that is shared with your space.
00:12:23 We have seen this in a previous exercise already. If you create a new view,
00:12:27 you can see a folder in the left panel, Shared Objects. Here, you can access all tables and
views that have been shared with you.
00:12:34 In the example, the view MD_Products has been shared with my space
00:12:38 from the space OPENSAP_MASTERD_REF. With this, we can close this demo on cross-space sharing

00:12:45 and go to the entity relationship models. Entity relationship models can be developed in SAP
Datasphere.
00:12:52 These so-called ER models provide a diagram that shows the data entities, tables, as well as
views of an environment
00:13:00 in relation to one another. You can use an ER model to better understand the subset of the
entities in your space
00:13:06 and to communicate this information to other stakeholders. With the ER model, physical or
remote database models
00:13:13 can be designed and afterwards also deployed. Furthermore, existing tables and views from
the data layer
00:13:19 can be reused in the ER model, which means that the model is capable of reverse
engineering.
00:13:25 New entities can be added on the fly in the entity relationship modeler.
00:13:30 Data can be previewed in real time in the editor. The integrated impact and lineage analysis
can be employed
00:13:38 to visually analyze how the artifacts of the data model depend on each other. Conveniently,
the source file of the ER model
00:13:45 can be exported and imported from SAP Datasphere. And now, let's take a look at the ER
model in a short demo.
00:13:53 In this short demo, I want to show you what an entity relationship model can look like in SAP
Datasphere.
00:13:59 You can see there are several tables as well as views in the model.
00:14:03 You can visualize the data types and the entities and also see the relation between these
entities.
00:14:09 You can also see that entities can have relations to themselves.
00:14:14 In this example, each employee has a manager, who is an employee himself.
00:14:23 You can click on the arrows to see more information on the relationship. By clicking on an
entity,
00:14:34 you can choose the create table icon in order to create a relation to a new table.
00:14:40 Then you could just define the columns of the new table, rename it, and even deploy the table
00:14:46 within the entity relationship modeler. We will not do this now.
00:14:52 The ER model can be easily imported and exported as a CSN file.
00:14:56 This means you can easily share your modeling thoughts with people that work in other
spaces or tenants.
00:15:03 That's it for this demo. Let's summarize what you've learned in this unit.
00:15:08 You've learned how to create hierarchies within dimensions. You've learned how to use data
access controls.
00:15:14 Furthermore, you've learned how to use cross-space sharing in SAP Datasphere. And finally,
you got to know
00:15:21 the entity relationship modeler. That's it for this unit on sharing, data access controls, and ER models.
00:15:29 In the next unit, unit seven, you will learn how to utilize the data integration monitor in SAP
Datasphere.
00:15:35 Thank you, and good luck with the upcoming quiz.

Week 1 Unit 7

00:00:05 Hello, and welcome to week one, unit seven of our openSAP course, Introduction to SAP
Datasphere.
00:00:13 I'm Amogh Kulkarni, from the SAP Datasphere Product Management.
00:00:18 In this unit, you will learn how to use the different monitoring tools in the tenant
00:00:24 to understand your tenant health, as well as monitor your data integration tasks.
00:00:29 So let's start. This is a section called Know Your SAP Datasphere,
00:00:35 and the theme for this unit is integration and monitoring. The different topics covered in this
theme are -
00:00:41 the Data Integration Monitor, the System Monitor, configuring your tenant for optimal
monitoring,
00:00:47 working with database analysis users, and finally, the navigation to the SAP HANA Cloud
cockpit.
00:00:59 Data Integration Monitor is a central place in SAP Datasphere where you would monitor data replication for remote tables,
00:01:06 monitor data flows and task chain executions as well. You would add and monitor view
persistency,
00:01:14 as well as monitor queries that are sent from your Datasphere tenant
00:01:18 to your remote connected sources. Data Integration Monitor consists of five different monitors

00:01:25 separated by tabs. Let's go through the specific task
00:01:29 that is achieved by each of these monitors. The first one is the Remote Table Monitor.
00:01:38 The Remote Table Monitor lets you replicate data for the remote tables that have been
deployed
00:01:43 in the context of your space. So whenever you're modeling,
00:01:48 and if you deploy remote tables you would be able to control the data replication tasks
00:01:53 for your remote tables through the Remote Table Monitor
00:01:57 in the Data Integration Monitor. This replication can either be a snapshot-based replication
00:02:04 or you can also set up a real-time replication via the change-data-capturing, that is CDC.
00:02:11 That's the first one. The second one is the View Persistency Monitor.
00:02:16 So whenever you are creating views within your SAP Datasphere, you can also decide
whether you want to persist the data
00:02:23 of these views in your space. You would do such an activity to ensure
00:02:27 that they perform better when you are modeling views in your SAP Datasphere and you're
consuming them.
00:02:35 The next one is the Flow Monitor. So whether you have data flows,
00:02:40 or whether you have replication flows in your SAP Datasphere that you have deployed in your
space,
00:02:46 Flow Monitor would be the place where you would monitor the execution of the flows,
00:02:50 not just the present ones but also the past ones. Additionally, you would also be able to run
00:02:58 and schedule your flows directly from the Flow Monitor. So, all the flows that are deployed in
your space
00:03:05 would then be available here for monitoring, and you would come here
00:03:08 to look at the execution of all your flows. The next one is the Task Chain Monitor.
00:03:17 Oftentimes you want to string together multiple tasks and run them, maybe as a chain, either sequentially or in parallel.
00:03:25 You would find such task chains for your monitoring on the screen.

00:03:30 And the last, the fifth monitor, that is the Remote Query Monitor.
00:03:34 It is a bit different from all the monitors that we have seen so far.
00:03:38 The Remote Query Monitor has two additional monitoring possibilities.
00:03:42 The first one is the remote table statistics. This lets you create statistics that improve query execution plans
00:03:48 and thus the performance when the data is read from remote tables.
00:03:53 One thing to note here is that if the data access is set to Replicated,
00:03:58 which means that you've already persisted the data of your remote tables,
00:04:03 then the statistics creation is disabled. This is only applicable to federated datasets.
00:04:10 The three types of statistics that you could create are record count, then simple,
00:04:15 which creates basic statistics for columns such as min, max, null count, and distinct count,
00:04:23 and the last type is the histogram, which shows the distribution per column.
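As a rough illustration of what the three statistics types capture, the snippet below computes the corresponding values locally with pandas on a toy column; in Datasphere the statistics are of course created for the remote table itself to support the optimizer, not computed like this.

```python
# Illustration of the three statistics types on a toy column (computed locally with pandas).
import pandas as pd

col = pd.Series([10, 20, 20, None, 40], name="AMOUNT")

record_count = len(col)                       # "record count"
simple_stats = {                              # "simple": basic per-column statistics
    "min": col.min(),
    "max": col.max(),
    "null_count": int(col.isna().sum()),
    "distinct_count": col.nunique(),
}
histogram = col.value_counts(bins=2)          # "histogram": the distribution per column

print(record_count, simple_stats, histogram, sep="\n")
```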
00:04:29 Remote table statistics also lets you delete existing statistics that you have created.
00:04:34 So whenever you want to create or delete, this is the place where you come
00:04:38 for remote table statistics. The second capability within the Remote Query Monitor
00:04:44 is tracking remote queries. It lets you track the queries that are executed
00:04:49 towards your remote connected sources from your space. So every time that you're querying a
remote source,
00:04:56 you would see that your statements are recorded in the Remote Query Monitor.
00:05:00 Additionally, this is the place where you would find statistics for your queries,
00:05:05 like the runtime, the number of rows that you have fetched, and the status of the query,
00:05:09 whether it is running or whether it has been closed. This is the place to monitor your remote
queries
00:05:16 between your SAP Datasphere and your remote sources. Additionally, what you would also
see here
00:05:23 is the actual statement that has been executed towards your remote source.
00:05:27 So you can also find out what statement is sent to the remote source.
00:05:32 So in a nutshell, every time that you want to track the activity between your SAP Datasphere
tenant
00:05:39 and your remote source, you would use the Remote Query Monitor
00:05:43 under the Data Integration Monitor. These are the five different monitors
00:05:48 that are available within the Data Integration Monitor, that let you effectively monitor the
execution
00:05:55 of all the different activities in your tenant. The different monitors
00:06:02 within the Data Integration Monitor need different privileges for you to be able
00:06:07 to effectively monitor and execute different tasks, whether it is the remote connection privilege,

00:06:14 or it is the data integration privilege. A user that is monitoring and executing tasks
00:06:21 needs to have the correct privileges to be able to perform these actions.
00:06:27 So, sort the privileges out before using the Data Integration Monitor.
00:06:33 In a nutshell, the Data Integration Monitor is a group of monitors that work in the context of a
space,
00:06:40 meaning one has to select a space before accessing any of the monitors,
00:06:46 and this is really important. For this demo, I'm logged in as a data integrator
00:06:53 and a space administrator, so I have both the roles for this user.
00:06:58 From the left navigation pane, we move to the Data Integration Monitor.
00:07:04 Now because I am part of only one space, it did not ask me for space selection,

00:07:09 but ideally, it will first ask you to select the space, and once you do, you can look at all the
different monitors
00:07:16 in the context of the space. The first one in the list is the Remote Table Monitor.
00:07:21 Here you can either replicate using snapshots or you could enable real-time access for a
remote table,
00:07:30 using the buttons here. Or you can also navigate to the additional details page,
00:07:38 and there you can not only see the current execution or the latest execution, but the previous
ones as well.
00:07:46 Also, it is possible then to load a new snapshot, remove existing snapshots,
00:07:52 or enable or disable real-time access. Additionally, you can also schedule the replication of a remote table,
00:08:03 using a simple schedule, where you specify the recurrence, the time,
00:08:08 the start and the end date, or using a cron expression,
00:08:14 where you specify the cron expression, and then the start and the end date,
00:08:18 which then tells you when the next runs will be according to the cron expression that you have provided.
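For readers unfamiliar with cron syntax, the snippet below shows what such an expression encodes, using the third-party croniter package purely to preview run times locally; it is unrelated to Datasphere's own scheduler.

```python
# Previewing a cron expression locally with the third-party croniter package.
# "0 6 * * 1-5" means: at 06:00 on every weekday.
from datetime import datetime
from croniter import croniter

schedule = croniter("0 6 * * 1-5", datetime(2024, 1, 1))
for _ in range(3):
    print(schedule.get_next(datetime))   # the next three run times after the start date
```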
00:08:25 Every time that you have to work with your remote tables and their replication,
00:08:30 you will come directly to the Data Integration Monitor, and then use either the details screen
here
00:08:38 or the Remote Table Monitor to select a remote table
00:08:42 and specify the replication strategy. The next one is the view persistency.
00:08:50 Here you can perform a similar process for view persistencies. If you don't see a view in the
list,
00:08:59 you can select a view via the add view button here, and then perform the exact same process
00:09:08 that we did for remote tables. The third one is the data flow
00:09:13 and the replication Flow Monitor, where you have the list of all the data flows in your space,
00:09:21 either to run them immediately or similarly schedule them for a future execution.
00:09:29 The last one on this page, but not shown as a list,
00:09:34 is the Task Chain Monitor, where you find all the information that you need
00:09:39 about a task chain, whether you want to execute it,
00:09:42 whether you want to schedule it, or whether you want to see
00:09:45 whether the previous runs failed or completed, and how they went -
00:09:49 this is visible in a three-pane display where you select the execution,
00:09:53 then the step in the execution that you want to view more details about.
00:09:59 And on the rightmost side, then you find the execution details
00:10:02 for that particular task in that particular execution run. This is how you will then monitor
00:10:11 all your data integration tasks. Finally, we have the Remote Query Monitor.
00:10:17 I'm treating it differently because it also works a bit differently from the others.
00:10:21 Here you can see all the remote queries that have been fired
00:10:28 from your SAP Datasphere towards your remote datasets. You can find the most basic information about their execution,
00:10:35 but also look at the SQL statement that was executed. And then the second part is the remote table statistics.
00:10:45 Here you can select one of your remote tables and create or delete statistics.
00:10:51 Creation of statistics is a two-step process, where it'll first ask you what type of statistic
00:10:59 you want to create, and then the relevant statistics are created.

00:11:03 Make sure that you create statistics for your remote table because they improve the
performance
00:11:08 of your query execution plan. Now we move to the System Monitor.
00:11:16 This is the main monitoring tool in SAP Datasphere. And we will look at the difference
00:11:24 between the Data Integration Monitor and the System Monitor. Administrators in SAP Datasphere

00:11:30 would use the System Monitor to take a look at the performance of the tenant
00:11:34 and monitor the storage, the tasks, out-of-memory issues, and other problems across all the spaces.
00:11:41 This is the central monitoring tool in SAP Datasphere. It is a tenant-wide monitor,
00:11:48 which means that only administrators at this point can have access to the System Monitor.
00:11:55 The role required to access the System Monitor is thus the SAP Datasphere administrator.
00:12:02 This is the role that you would need to use the System Monitor.
00:12:06 The System Monitor provides you with a top-level view of all the activities happening across
the tenant,
00:12:12 and it also has a navigation path to the Data Integration Monitor that we've seen previously.
00:12:18 The System Monitor is divided into two parts. The first one is the dashboard,
00:12:23 so that's the landing page that gives you all the information
00:12:27 about the storage distribution, the disk, the memory utilization by spaces,
00:12:32 but also an aggregated count for failed tasks and out-of-memory issues in the tenant.
00:12:37 So this is the dashboard that you would like to visit if there are any issues in your tenant.
00:12:43 The available information is further delineated into the top five spaces that are contributing
00:12:48 to the out-of-memory errors, or the number of failed tasks
00:12:51 in the last seven days, in the last 24 hours, as well as in the last 48 hours.
00:12:55 So this is really useful. The dashboard is thus a culmination of all the information
00:13:02 that is available for your monitoring, and this is the place
00:13:05 where you would start your monitoring journey. The second tab in the System Monitor
00:13:10 provides you with information about tasks that have been executing in the tenant.
00:13:16 Just note that only an administrator has access to the System Monitor.
00:13:20 I'm stating it again. Here you will find out about all the executions
00:13:26 of all the tasks with their execution statistics, like duration, memory consumption,
00:13:31 the status of the tasks, the number of records that they have fetched.
00:13:36 Further, it is also possible to look at the statements that have been executed
00:13:41 owing to these tasks and are classified as expensive statements.
00:13:45 So one task can have one or multiple statements. The System Monitor lets you drill down
00:13:52 from a top-level KPI to the actual task and to the resulting individual expensive statements as well.
00:14:02 For this demo, I'm logged in as an administrator. From the left navigation pane,
00:14:08 because I'm an administrator, I now see the System Monitor. I navigate to the System Monitor,

00:14:14 where it shows me the dashboard with all the most important KPIs
00:14:20 that tell you the health and status of your SAP Datasphere tenant,
00:14:25 be it the space consumption or the memory consumption. Moving on, it'll also show you the number
00:14:32 of failed tasks in your tenant in the last seven days, or out-of-memory errors, run duration, memory consumption,
00:14:42 as well as some statistics on the MDS requests. This is the page where you will come
00:14:49 if you want to find out more about the current status of your tenant.

00:14:55 Here you get it in an aggregated manner, but if you move to the logs,
00:15:00 you can dive into individual executions and the different tasks that have been executed.
00:15:07 Now, here you will find all the different executions that have run in the tenant across all the spaces.
00:15:17 Remember, I'm an administrator, and that's why I see all the spaces
00:15:21 and all the tasks that have been executed. I can directly move to the Data Integration Monitor
00:15:27 if I click on the activity, or I can also go to the modeling screen
00:15:31 if I click on the object name. I scroll to the right.
00:15:36 It also provides me with the statements that have been executed,
00:15:41 but because this statement was not classified as an expensive statement,
00:15:46 to which we'll come later on, you won't see any statements here.
00:15:51 But if you clear the filter out, you'll see all the other statements
00:15:54 that have been executed in the tenant. And if there are any tasks that are related to it,
00:16:01 so scheduled tasks or manually executed tasks, you will see them with a task log ID.
00:16:11 This is the page where you would then come when you want to look at all the different executions,
00:16:17 either in a sorted manner or filtered by time.
00:16:22 So you can also select the start time and the end time to give it a time window,
00:16:28 and then find out what has run in the tenant in that given time duration,
00:16:36 so that you can filter all the tasks that have executed
00:16:41 and only look at the ones that you want to look at. We will now look at the monitoring configuration,
00:16:51 which is available under System, followed by Configuration - so System, then Configuration.
00:16:58 There are two configurations that are interesting here. The first one is the monitoring view
enablement,
00:17:05 which lets you select up to two spaces that get access to the monitoring views
00:17:10 in the Data Builder. By virtue of these monitoring views,
00:17:14 you'll be able to consume System Monitoring views from the underlying SAP HANA Cloud
instance.
00:17:20 So whenever you want to look at how your instance is behaving, you can use these monitoring views.
00:17:26 One of the spaces is set as the SAP monitoring content space, where you would deploy the
delivered monitoring content from SAP,
00:17:34 so that is the standard business content for monitoring. Whereas the second space is a user-defined space
00:17:42 that gets access to the underlying monitoring views. So the first one is the technical content
space from SAP,
00:17:48 and the second one is the custom user space. When enabled, a modeler in the space can
consume and model
00:17:57 on top of the System Monitoring views to derive additional insights.
00:18:01 You will do that if you want to fetch more insights than what the System Monitor shows you.
00:18:07 The second configuration is expensive statements tracing, and this is the configuration that
controls the statements
00:18:14 that are classified as expensive statements and are available for your monitoring
00:18:18 in the System Monitor under the Statements tab. This is what I was referring to as the
expensive statements.
00:18:25 So to classify a query or a statement as an expensive statement,
00:18:30 you will have to specify thresholds that let the system filter statements

00:18:34 that are costlier than the configured values. The thresholds and the statements thus act together.
00:18:41 As soon as a statement consumes either more CPU time, or memory, or duration
00:18:46 than the threshold value that you have configured, it is automatically classified as an
expensive statement,
00:18:53 and this statement is available for monitoring in the System Monitoring dashboard.
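The classification rule can be summarized in a few lines: a statement counts as expensive as soon as it exceeds any one of the configured thresholds. The threshold names and values in this sketch are invented for illustration and are not actual tenant settings.

```python
# Sketch of the classification rule: exceed ANY configured threshold and the statement
# is treated as expensive. Names and values are illustrative, not actual settings.
thresholds = {"cpu_time_ms": 1_000, "memory_mb": 512, "duration_ms": 5_000}

def is_expensive(stats: dict) -> bool:
    return any(stats.get(key, 0) > limit for key, limit in thresholds.items())

print(is_expensive({"cpu_time_ms": 300, "memory_mb": 900, "duration_ms": 100}))  # True, memory
print(is_expensive({"cpu_time_ms": 300, "memory_mb": 100, "duration_ms": 100}))  # False
```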
00:18:58 So you see how these two things work together. Now, a database analysis user
00:19:07 is an SAP HANA Cloud database user with wide-ranging read privileges,
00:19:12 over the underlying runtime HANA Cloud instance. Optionally, this user can also be granted
space schema privileges,
00:19:21 which means that this user can also read the data that is stored in the space schema,
00:19:27 but this is clearly optional. You can decide not to grant these privileges.
00:19:33 You would create such a user to support monitoring, analyzing, tracing, and debugging
00:19:39 of your SAP Datasphere runtime HANA Cloud instance. These database analysis users can
be configured
00:19:46 with an expiration date, or without one. You can choose when you're creating a user.
00:19:52 This user could either be leveraged to consume the underlying HANA Cloud monitoring views
00:19:57 in the database explorer, similar to what you will do in the graphical modeler,
00:20:02 or to access the SAP HANA Cloud cockpit. Now, the database analysis users that we have
looked at
00:20:13 on the previous slide, these could be used to access the HANA Cloud cockpit.
00:20:19 This is a tool for administering and monitoring the underlying cloud runtime database.
00:20:25 So whenever you want to look at how your underlying instance is behaving, use the HANA Cloud cockpit
00:20:32 as a complementary tool to the System Monitor. The HANA Cloud cockpit lets you monitor
alerts,
00:20:38 resource usages, and performance of your SAP HANA Cloud instance.
00:20:43 This can be done by means of the performance monitor,
00:20:46 as well as the workload analyzer. You will use the tools that fit your requirements
00:20:51 when you are monitoring your HANA Cloud instance. This will also be the place where you will
analyze
00:20:57 trace and diagnosis files, as well as query the system
00:21:00 using SAP HANA database explorer. Now, because the database analysis user
00:21:07 has read privileges over the underlying instance, with this user we will not be able to make
00:21:12 any configuration changes in the underlying SAP HANA Cloud instance.
00:21:16 This remains outside the scope of the database analysis user.
00:21:23 The database analysis user, as the name states, is merely for analyzing the SAP HANA Cloud
instance
00:21:29 and not for making any configuration changes. In this demo, I am now going
00:21:37 to use the Configuration, under System. And here I'm going to go to the Monitoring view.
00:21:47 As mentioned in the slides, you can select one of your spaces in the tenant
00:21:54 to be a candidate for the monitoring view, and this space now gets access,
00:21:59 and all the modelers in the space also get access to consume the underlying monitoring views.
00:22:05 The second part which is of interest is the expensive statements tracing,
00:22:10 where you can find out how to configure it, or what the current configurations are.
00:22:15 At the moment, the thresholds are set only for memory and duration, which means that every statement
00:22:23 that either exceeds the memory threshold or takes more time to complete
00:22:29 than the duration mentioned here will be classified as an expensive statement,
00:22:33 and it will be visible in the System Monitor under the Statements tab.
00:22:37 Make sure that you set this optimally, because if you set it too low,
00:22:41 then you are actually tracing everything and then everything is treated as an expensive
statement.
00:22:47 If you set the thresholds too high, then you are not capturing anything,
00:22:51 and then you're losing out on the insights that you get out of expensive statements.
00:22:56 So make sure that you always set this optimally. The optimal threshold values can be found
00:23:03 by some trial and error - filter some statements out,
00:23:08 check the query execution, and then come back and adjust the configuration.
00:23:14 For this demo, now we are going to go again into the System and Configuration,
00:23:21 where we find a tab that says Database Access, and under Database Access, we have
Database Analysis Users.
00:23:29 Here you can find out all the users that have been created, but also create new users.
00:23:36 Creation of a user lets you specify the name, and then whether the user has access to the space schema.
00:23:44 Additionally, you can also specify the expiration date for this user, so if you want the user to
expire in a certain number of days,
00:23:51 up to five, or you never want it to expire,
00:23:55 you can specify that and create. Once you create,
00:23:59 and we will quickly create one for our demo... Once you do so, you have all the details that you
need
00:24:08 to either connect with it using an external database tool, or you can use this user to open the SAP HANA Cloud cockpit.
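As an example of the "external database tool" route, a database analysis user could be used from SAP's hdbcli Python driver to read one of the HANA monitoring views; host, credentials, and the chosen view below are placeholders, and what you may actually query depends on the privileges granted to that user.

```python
# Hedged sketch: connecting with a database analysis user via SAP's hdbcli driver and
# reading a HANA monitoring view. Host, port, and credentials are placeholders.
from hdbcli import dbapi

conn = dbapi.connect(
    address="<your-hana-cloud-host>",
    port=443,
    user="<analysis-user>",
    password="<password>",
    encrypt=True,
)
cursor = conn.cursor()
cursor.execute(
    "SELECT HOST, SERVICE_NAME, TOTAL_MEMORY_USED_SIZE FROM SYS.M_SERVICE_MEMORY"
)
for row in cursor.fetchall():
    print(row)
cursor.close()
conn.close()
```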
00:24:20 Opening this would then jump from your SAP Datasphere onto your SAP HANA Cloud cockpit,

00:24:28 to let you monitor, analyze, and troubleshoot everything that you need to do
00:24:31 with your underlying SAP HANA Cloud instance. Let's summarize what we have learned in this
unit.
00:24:40 We started with the Data Integration Monitor, to look at all the different monitoring capabilities
00:24:44 that it offers. Then we went to the System Monitor,
00:24:48 which offers a dashboard with KPIs, and also lets you drill down
00:24:52 into the individual task executions and statements. We also learned about expensive statements tracing,
00:24:59 the thresholds and the configuration, the database analysis user,
00:25:03 and the navigation to SAP HANA Cloud cockpit. I hope you can now use all the tools that you
need
00:25:10 to monitor and understand the health of your SAP Datasphere tenant.
00:25:15 That's it for this unit on Data Integration Monitor. Thank you, and good luck with the upcoming
quiz.

© 2023 SAP SE or an SAP affiliate company. All rights reserved.
See Legal Notice on www.sap.com/legal-notice for use terms,
disclaimers, disclosures, or restrictions related to SAP Materials
for general audiences.
