01 - IBM Watsonx - Data Exploring Watsonx - Data
IBM watsonx.data
Hands-on Lab
Lab Guide by:
Kelly Schlamb
Principal, Learning Content Development | Data & AI
kschlamb@ca.ibm.com
Danny Arnold
Principal, Learning Content Development | Data & AI
darnold@us.ibm.com
Presenter:
Farah Auni Hisham
Technical Enablement Specialist | Data & AI
farah.hisham@ibm.com
Ecosystem Technical Enablement | Data & AI
watsonx.data
Hands-on Lab Agenda
Part 1
Accessing watsonx.data
Public Service Announcement:
Using different folder names, catalog names, etc. may affect the subsequent steps.
Troubleshooting or amending a query can sometimes take longer than restarting the hands-on lab steps.
1.1 Environment Setup
2. Refer to the ‘Published services’ section to access the SSH command, Presto console, MinIO console and watsonx.data UI.
• No active reservation? Follow the steps in Appendix 1 Lab Reservation.
• *Use your own commands and URLs from the ‘Published services’ section on your instance page.
• Copy your list of Published services somewhere like Notepad in case you get logged out of Techzone!
SSH command*:
• ssh -p <5 digits> watsonx@<server>.techzone-services.com
Are you sure you want to continue connecting?
• Yes
Password [Password input will be invisible]
• watsonx.data
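As a rough illustration, the SSH command above can be assembled from the two values shown in your Published services list. This is a minimal Python sketch and not part of the lab itself; the function name `build_ssh_command` and the port and server values in the usage line are hypothetical placeholders, so substitute your own.

```python
import shlex


def build_ssh_command(port: int, server: str, user: str = "watsonx") -> str:
    """Assemble the lab's SSH command from the Published services values."""
    # The lab guide shows a 5-digit port, so reject anything outside that range.
    if not 10000 <= port <= 65535:
        raise ValueError(f"expected a 5-digit TCP port, got {port}")
    host = f"{server}.techzone-services.com"
    parts = ["ssh", "-p", str(port), f"{user}@{host}"]
    # Quote each part so the result is safe to paste into a shell.
    return " ".join(shlex.quote(part) for part in parts)


if __name__ == "__main__":
    # Hypothetical port and server name; use the ones from your reservation.
    print(build_ssh_command(12345, "example-server"))
```

Keeping the port check in the helper catches the most common copy-and-paste slip (a truncated port number) before the command is ever run.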
1.2 Infrastructure Components
Diagram layers: Query engines; Governance and metadata (metadata store, access control management); Data format; Storage; Infrastructure
Query engines
• Run workloads against data in watsonx.data.
• Supports multiple engines (this lab uses Presto as the engine).
Catalogs
• Manage table schemas and metadata for the data residing in watsonx.data.
Storage
• External buckets and databases can be registered and used in watsonx.data.
• watsonx.data can be deployed as standalone hybrid-cloud software installed on Red Hat OpenShift (on-premises or in the cloud).
watsonx.data is currently available in a Standard Edition, with an Enterprise Edition planned for the future.
For the developer and partner community, IBM also offers an entry-level Developer Edition, which can be used to get
familiar with the watsonx.data console and environment. The Developer Edition has the same code base as the Standard
Edition, but some features are restricted, and it is not intended for production use.
This lab utilizes a pre-installed Developer Edition virtual machine (VM) environment that can be easily provisioned from
IBM Technology Zone (TechZone).
Query engines
• presto-01: A Presto query engine used to interact with data in the data lakehouse.
Catalogs
• iceberg_data: An Iceberg catalog, residing within watsonx.data’s embedded Hive Metastore (HMS).
• hive_data: A Hive catalog, also residing within the embedded HMS.
• wxd_system_data: A Hive catalog associated with the wxd-system bucket.
Storage
• iceberg-bucket: A bucket in the embedded MinIO object store. The table data stored here is associated with the iceberg_data catalog.
• hive-bucket: A bucket in the embedded MinIO object store. The table data stored here is associated with the hive_data catalog.
• wxd-system: A bucket used to hold diagnostic data such as query history and query event-related information for the Presto engine.
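The relationships listed above can be summarized in a small lookup table. The following Python sketch only restates the component names from this slide; the `CATALOGS` structure and the `bucket_for` helper are illustrative names chosen here, not part of watsonx.data.

```python
from typing import NamedTuple


class Catalog(NamedTuple):
    kind: str    # table format managed by the catalog
    bucket: str  # MinIO bucket associated with the catalog


# Catalog-to-bucket associations as described for the lab environment.
CATALOGS = {
    "iceberg_data": Catalog(kind="Iceberg", bucket="iceberg-bucket"),
    "hive_data": Catalog(kind="Hive", bucket="hive-bucket"),
    "wxd_system_data": Catalog(kind="Hive", bucket="wxd-system"),
}


def bucket_for(catalog_name: str) -> str:
    """Return the bucket associated with a catalog, or raise KeyError."""
    return CATALOGS[catalog_name].bucket


if __name__ == "__main__":
    for name, catalog in CATALOGS.items():
        print(f"{name} ({catalog.kind}) -> {catalog.bucket}")
```

Keeping this mapping in mind helps when browsing the MinIO console: a table created in a given catalog should show up under that catalog's bucket.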
1.3 Key User Interface
• Home Screen
• Infrastructure Manager
• Data Manager
• Query Workspace
• Query History
• Access Control
2. Create table from file
• Table name: new_cars from file: https://ibm.ent.box.com/v/data-cars-json
• You can quickly generate the path for selected infrastructure items.
• When you are instructed to copy and paste a SQL statement into the SQL worksheet, clear any statement(s) you previously ran before running the new statement.
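For orientation, Presto addresses a table by a fully qualified name of the form catalog.schema.table, and a generated path for a table is assumed here to take that form. The sketch below builds such a name in Python and wraps it in a simple preview query; the helper names, the schema `my_schema`, and the choice of the iceberg_data catalog are hypothetical, and the exact text the UI generates may differ.

```python
def quote_identifier(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def table_path(catalog: str, schema: str, table: str) -> str:
    """Build a fully qualified catalog.schema.table path for Presto."""
    return ".".join(quote_identifier(part) for part in (catalog, schema, table))


def preview_query(catalog: str, schema: str, table: str, limit: int = 10) -> str:
    """Return a SELECT statement that previews the first rows of a table."""
    return f"SELECT * FROM {table_path(catalog, schema, table)} LIMIT {int(limit)}"


if __name__ == "__main__":
    # "my_schema" and the catalog choice are hypothetical; new_cars is the
    # table name used in this lab.
    print(preview_query("iceberg_data", "my_schema", "new_cars"))
```

Quoting every part of the path avoids surprises with names that contain hyphens or mixed case, at the cost of a slightly noisier statement.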
• This is not the usual practice when adding users; we just want to set up test data quickly. If you are preparing for a client demo, you can do this step beforehand. With an organization's IdM, SSO is possible, as with other IBM enterprise products; however, this is the lab version, so it is not covered here.
2. Managing Policies Access (add new policies)
watsonx.data: an open, hybrid, and governed fit-for-purpose data store optimized to scale all data, analytics and AI workloads
The first step before starting the watsonx hands-on exercises is to reserve your lab environment.
For this workshop, there are 2 methods:
1) Reserving your own lab environment via IBM Technology Zone, commonly known as Techzone.
• Reserving your own lab via Techzone is the recommended option, as it lets you extend your reservation up to 2 times.
• You can also access the environment for at least 2 days after your reservation start time.
• To reserve your own environment, follow Appendix 1 Lab Reservation.
• Learning how to reserve your own environment in Techzone is essential for demos or even PoX.
• If you already have an active reservation of the watsonx.data image in Techzone, you may skip this part and proceed to 1.1 Environment Setup.
watsonx.data Lab Reservation
link below:
• https://techzone.ibm.com/collection/ibm-watsonxdata-developer-base-image/environments
3. Go to the Environment tab.
4. Select IBM watsonx.data Development Lab (please do not select the POC version for the purpose of this lab!).
5. Select Reserve.
6. Choose to reserve now and fill in the reservation form.
• Purpose: Practice / Self-Education
• Purpose description: Practice watsonx lab
• Preferred Geography: itz-watsonx – AMERICAS…
• VPN Access: Disable
7. Tick the box to agree to the IBM Techzone T&C and policies, then click Submit.
8. When your reservation is ready, you will receive an email notification. Click View My Reservation.
9. Click the reservation tile to see the published services.
Appendix 2 Access Pre-reserved Workshop
Appendix 3 Restart container – Only to troubleshoot
Appendix 4 Removing the Db2 container’s password 90-day limit
If a ‘password expired’ error appears in the Db2 connection, it is because the Db2 container has a 90-day limit on the password.
Use the following method to remove the limit.