Integrating Data Lakes With Salesforce - Lake Hydration and Visualization With Tableau
Introduction
Data is among the most valuable assets businesses have, but gaining insights from
that data can be difficult. Part one of this two-part series covered setting up an AWS
account and creating an Amazon Simple Storage Service (S3) bucket to store data
from Nonprofit Cloud in a sample data lake integration use case.
https://medium.com/salesforce-architects/integrating-data-lakes-with-salesforce-lake-hydration-and-visualization-with-tableau-c359842fd27a 1/31
07/06/2024, 22:49 Integrating Data Lakes With Salesforce: Lake Hydration and Visualization with Tableau | by Salesforce Architects | Salesforce Architects | …
This post covers how to sync data from Salesforce to S3 and visualize it using
Tableau.
On the AppFlow dashboard click Create Flow and then give your flow a name.
Consider choosing the name of the Salesforce object you are syncing (for example,
Contacts or Accounts). Each Salesforce object you sync will have a separate flow. The
flow name will also be the name of the folder AppFlow creates in your S3 Bucket.
Optionally, give the flow a description so it is easy to recognize from a long list of
flows. Leave all other settings at their defaults. When finished, click “Next”.
As you configure the flow, select Salesforce as the data source, select Amazon S3 as
the destination, enter the bucket you created in the previous step, and set the data
format preference as Parquet (under additional settings). Leave all other settings at
their defaults.
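If you later want to script this step instead of clicking through the console, the same source and destination choices can be expressed as the request sections that the AppFlow CreateFlow API expects. The following is a minimal sketch, not the console's own output: the helper name `build_flow_endpoints`, the connection name `sandbox-connection`, and the bucket name `my-data-lake-bucket` are hypothetical placeholders, and the dictionaries are only built here, not sent to AWS.

```python
def build_flow_endpoints(connection_name, salesforce_object, bucket_name):
    """Build the source and destination sections of an AppFlow flow definition.

    The key names follow the AppFlow CreateFlow request shape as I understand it;
    verify them against the current API reference before relying on them.
    """
    source = {
        "connectorType": "Salesforce",
        "connectorProfileName": connection_name,
        "sourceConnectorProperties": {"Salesforce": {"object": salesforce_object}},
    }
    destination = {
        "connectorType": "S3",
        "destinationConnectorProperties": {
            "S3": {
                "bucketName": bucket_name,
                # Parquet is the data format preference chosen in this walkthrough.
                "s3OutputFormatConfig": {"fileType": "PARQUET"},
            }
        },
    }
    # CreateFlow takes a single source and a list of destinations.
    return source, [destination]


# Hypothetical names, for illustration only.
source_cfg, destination_cfgs = build_flow_endpoints(
    "sandbox-connection", "Contact", "my-data-lake-bucket"
)
```

These two values would be passed as `sourceFlowConfig` and `destinationFlowConfigList` when creating the flow with an SDK such as boto3.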
Click Choose Salesforce Connection to create a new connection and then select the
Salesforce environment type. It is best practice to test all integrations in a Sandbox
environment before connecting to Production. Give the connection an easily
identifiable name as you will be able to reuse this connection for additional flows on
other Salesforce objects. For the Flow Trigger, you can choose to run the flow on
demand or on a schedule. When finished, click Continue.
Use the credentials for the dedicated API user you created in Step 1.2, Create a
Salesforce Dedicated API User. If you did not create an API user, then log in using a
user with the necessary permissions as described in part one.
Click Allow to give AppFlow access to Salesforce data. This will create a Connected
App and policies in Salesforce to enable data to flow between the AWS and
Salesforce clouds. You can review and adjust the Connected App settings within
Salesforce.
Now that you have successfully connected, you can finish configuring your flow.
Remember that you will create a flow for each Salesforce object you want to sync to
S3. This example creates a flow for the Contact object.
Scroll down the configuration page and select a Flow Trigger. When starting out it is
best to run your flows on demand. Once you have tested the flow and are
comfortable with the results, you can consider setting up a schedule. When finished,
click “Next”.
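For readers automating flow creation, the trigger choice maps to a small configuration section. This sketch assumes the AppFlow API's `triggerConfig` shape; the helper name `build_trigger` is hypothetical, and the schedule expression `rate(5minutes)` only mirrors the example format shown in the AppFlow API reference, so check the expression syntax before using a schedule of your own.

```python
def build_trigger(schedule_expression=None):
    """Return an AppFlow trigger section: on demand by default, scheduled if asked."""
    if schedule_expression is None:
        # Recommended starting point in this walkthrough: run the flow manually.
        return {"triggerType": "OnDemand"}
    return {
        "triggerType": "Scheduled",
        "triggerProperties": {
            "Scheduled": {"scheduleExpression": schedule_expression}
        },
    }


on_demand = build_trigger()
# Assumed expression format; confirm it against the AppFlow documentation.
scheduled = build_trigger("rate(5minutes)")
```

Starting with the on-demand form and switching to the scheduled form later matches the advice above: test first, then automate.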
Now you can use the drop down to search for and select fields from the Salesforce
object. You can also select all the fields from the Salesforce object.
Click the “Map Fields Directly” button to copy over the field labels and API names
automatically. Optionally, you can transform the source data before writing to S3 with
formulas and other data modification operations. You can also choose to use
validations to specify an action when unexpected data and formats are found. When
finished, click “Next”.
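Direct field mapping can also be described in code. The sketch below builds one projection task that selects the fields, followed by one map task per field that keeps the same name in S3. It assumes the task structure of the AppFlow API (`taskType`, `sourceFields`, `connectorOperator`, `destinationField`); the helper name `build_map_tasks` is hypothetical, and the console may add further task properties (such as data types) that are omitted here.

```python
def build_map_tasks(field_names):
    """Build AppFlow tasks that copy the given Salesforce fields unchanged."""
    fields = list(field_names)
    # First task: select which source fields take part in the flow.
    tasks = [
        {
            "taskType": "Filter",
            "sourceFields": fields,
            "connectorOperator": {"Salesforce": "PROJECTION"},
        }
    ]
    # Then one direct mapping per field, source name == destination name.
    for name in fields:
        tasks.append(
            {
                "taskType": "Map",
                "sourceFields": [name],
                "destinationField": name,
                "connectorOperator": {"Salesforce": "NO_OP"},
            }
        )
    return tasks


# Standard Contact field API names, chosen only as an example.
contact_tasks = build_map_tasks(["Id", "LastName", "MailingState"])
```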
You can choose to filter your data. For example, you might create a flow in which only
contacts with a mailing address in California are selected. When finished, click “Next”.
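The California example can be sketched as a filter task in the same style. This assumes the AppFlow API's filter task shape with the Salesforce `EQUAL_TO` operator; the helper name `build_equals_filter` is hypothetical, and whether your org stores the state as `CA` or `California` in `MailingState` is an assumption you should check against your own data.

```python
def build_equals_filter(field_name, value):
    """Build an AppFlow filter task that keeps records where field == value."""
    return {
        "taskType": "Filter",
        "sourceFields": [field_name],
        "connectorOperator": {"Salesforce": "EQUAL_TO"},
        # Task properties are passed as strings in the AppFlow API.
        "taskProperties": {"DATA_TYPE": "string", "VALUE": value},
    }


# Assumes the state is stored as a two-letter code in the org.
california_only = build_equals_filter("MailingState", "CA")
```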
On the final screen, review your selections and go back to make any final updates.
When finished, click Create Flow. After the flow is created, you will be redirected to
the flow’s landing page and a success banner will appear at the top.
To run this flow on demand, click Run Flow. A status bar will appear at the top alerting
you to progress. Successful results will appear at the top of the page in green.
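The Run Flow button has a programmatic equivalent. The sketch below uses the boto3 AppFlow client's `start_flow` call; the helper name `run_flow` is hypothetical, boto3 and AWS credentials are assumed to be configured, and the optional `client` parameter exists only so the function can be exercised without calling AWS.

```python
def run_flow(flow_name, client=None):
    """Start an on-demand AppFlow run and return its execution ID, if any."""
    if client is None:
        # Imported lazily so the sketch can be read and tested without boto3.
        import boto3

        client = boto3.client("appflow")
    response = client.start_flow(flowName=flow_name)
    # On-demand runs are expected to return an executionId in the response.
    return response.get("executionId")
```

Calling `run_flow("Contact")` would start the flow named Contact, assuming you named the flow after the object as suggested earlier.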
Congratulations, you have now set up an Amazon S3 Data Lake, connected it to your
Salesforce Nonprofit Cloud, and hydrated the data lake with data from the Salesforce
Contact object. You can repeat the instructions in Step 3 to create flows and
schedules for additional Salesforce objects. You can also create additional buckets
and connect your lake and other sources using AppFlow (for supported applications)
or another extract-transform-load (ETL) tool like MuleSoft or Talend.
On the AWS Glue Console, on the left side menu under Data catalog > Databases,
click Add Database. Type the database name and click Create to create a database in
Glue.
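The same database can be created with the boto3 Glue client. This is a minimal sketch: the helper name `create_glue_database` is hypothetical, boto3 and AWS credentials are assumed, and the `client` parameter is there only to allow a dry run without AWS access.

```python
def create_glue_database(name, client=None):
    """Create a database in the AWS Glue Data Catalog and return its name."""
    if client is None:
        import boto3  # lazy import; only needed for a real call

        client = boto3.client("glue")
    client.create_database(DatabaseInput={"Name": name})
    return name
```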
Enter a meaningful name for the data crawler and click Next.
For the Crawler Source Type select Data Stores, and for the Repeat option select
Crawl All Folders.
Click the folder icon to the right of Include Path and select the S3 bucket path that
holds the Parquet files with the Salesforce Contact object’s data.
When asked if you want to add another data store, select No and click Next.
On the Choose An IAM Role screen, select Create An IAM Role and enter a name for
the role. AWS Glue will automatically create an IAM Role that will have the requisite
permissions to access Salesforce data in the S3 bucket you specified in the previous
step.
Set the schedule to run the crawler on demand. If you intend to make frequent
updates to your Salesforce schema that will affect the data being synced to S3,
consider running the crawler on a regular schedule instead.
Choose the database created in the earlier steps to configure the crawler’s output.
This step will ensure the crawler creates a data catalog table in the database you
created earlier.
On the final screen, validate all the information entered and click Finish to create the
crawler.
After the crawler is created you should see it on the screen. Select your new crawler
and click Run Crawler.
Upon a successful run, a status message will appear at the top of the page in a green
box.
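The crawler steps above can be condensed into two boto3 Glue calls, `create_crawler` and `start_crawler`. In this sketch the helper names, the crawler name, the role name, and the S3 path are all hypothetical. Unlike the console wizard, the API does not create the IAM role for you, so the role is assumed to exist already. Leaving out the `Schedule` parameter means the crawler runs only on demand.

```python
def build_crawler_request(name, role, database, s3_path):
    """Build the arguments for Glue's create_crawler (on-demand, one S3 target)."""
    return {
        "Name": name,
        "Role": role,  # existing IAM role that can read the bucket
        "DatabaseName": database,  # where the catalog table is written
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_run_crawler(request, client=None):
    """Create the crawler described by request, then start it once."""
    if client is None:
        import boto3  # lazy import; only needed for a real call

        client = boto3.client("glue")
    client.create_crawler(**request)
    client.start_crawler(Name=request["Name"])


# Hypothetical values, for illustration only.
crawler_request = build_crawler_request(
    "salesforce-contact-crawler",
    "glue-salesforce-crawler-role",
    "salesforce-schema",
    "s3://my-data-lake-bucket/Contact/",
)
```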
Now navigate to the Athena service from the AWS Management Console. Select
AWSDataCatalog as the Data Source. Select Salesforce-Schema (the name you used
in the Glue step) as the database. Click the three blue vertical dots (located in the
top-right of the Tables section), and pick Preview table. You can now confirm that
your S3 data source is connected to Athena by reviewing the tables and fields in the
left column.
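The Preview table action runs a short SELECT with a row limit. If you want to run an equivalent check from code, the request for the boto3 Athena client's `start_query_execution` call can be sketched as follows. The helper name `build_preview_query`, the table name `contact`, and the results path are hypothetical; the database name is assumed to appear in lowercase in the catalog, and both names are quoted because a name such as `salesforce-schema` contains a hyphen.

```python
def build_preview_query(database, table, output_location, limit=10):
    """Build arguments for Athena's start_query_execution that preview a table."""
    query = f'SELECT * FROM "{database}"."{table}" LIMIT {int(limit)}'
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


# Hypothetical table name and results path.
preview_request = build_preview_query(
    "salesforce-schema", "contact", "s3://my-data-lake-bucket/athena-results/"
)
# With boto3 configured, the query could then be submitted with:
#   boto3.client("athena").start_query_execution(**preview_request)
```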
4.3. Create an IAM User to set up the connection between Athena and
Tableau
From the AWS Console, search for the IAM service and click IAM.
From the IAM dashboard, click Users in the left pane and then click Add User.
Enter a user name and select Programmatic Access. Click on Next: Permissions.
Click Attach Existing Policies Directly. Then use the Filter Policies search bar to find
and attach the following policies:
AmazonAthenaFullAccess
AmazonS3FullAccess
After creating the user, you should see a success message. Download the CSV file that
contains the access key and secret key information and store it in a safe location.
You will need this information to set up the connection in Tableau.
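Attaching the two managed policies named above can also be done with the boto3 IAM client's `attach_user_policy` call. This sketch assumes the user already exists; the helper name `attach_policies` and the user name in the usage note are hypothetical. The ARNs are the standard ones for AWS managed policies with those names.

```python
MANAGED_POLICY_ARNS = [
    "arn:aws:iam::aws:policy/AmazonAthenaFullAccess",
    "arn:aws:iam::aws:policy/AmazonS3FullAccess",
]


def attach_policies(user_name, client=None):
    """Attach the managed policies used in this walkthrough to an IAM user."""
    if client is None:
        import boto3  # lazy import; only needed for a real call

        client = boto3.client("iam")
    for arn in MANAGED_POLICY_ARNS:
        client.attach_user_policy(UserName=user_name, PolicyArn=arn)
    return list(MANAGED_POLICY_ARNS)
```

For example, `attach_policies("tableau-athena-user")` would attach both policies to a user with that (hypothetical) name.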
6. Restart Tableau
Once these steps are completed, you can add a new Amazon Athena connection and
begin configuring it.
In Step 4.3, Create an IAM User to set up the connection between Athena and
Tableau, you downloaded the IAM user’s security credentials in a CSV file. The file
contains the access key and secret key needed to set up the connection between
Athena and Tableau. Open the CSV file in a text editor and keep the keys handy.
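If you would rather not copy the keys by hand, a few lines of Python can pull them out of the file. This sketch assumes the CSV has a header row with columns named `Access key ID` and `Secret access key`; check the header of your own file, since the exact columns can differ. The function name `parse_credentials` and the sample values are hypothetical.

```python
import csv
import io


def parse_credentials(csv_text):
    """Return (access_key_id, secret_access_key) from the first data row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    row = next(reader)
    # Column names are an assumption about the downloaded file's header.
    return row["Access key ID"], row["Secret access key"]


# Made-up sample content; never commit or print real keys.
sample = "Access key ID,Secret access key\nAKIAEXAMPLEKEY,exampleSecretValue\n"
access_key_id, secret_access_key = parse_credentials(sample)

# For a real file (hypothetical path), something like:
#   with open("credentials.csv", newline="", encoding="utf-8-sig") as f:
#       access_key_id, secret_access_key = parse_credentials(f.read())
```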
Next, identify the AWS region in which you are running Athena. You can do this by
clicking the region dropdown menu in the top right. Use the following syntax,
replacing “us-east-1” with the region your organization is running in:
athena.us-east-1.amazonaws.com
Copy this text string to your scratch document for use later on in the setup process.
Lastly, find the Athena Query Results location via Settings in the Athena console. If
this value is blank, you will need to select a bucket path using the folder icon. You can
choose the same bucket you created in Step 2.1, Setup Amazon S3.
With these values ready you are now prepared to open Tableau Desktop and make
the connection to Athena.
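The values collected so far can be gathered in one place before you open Tableau. The sketch below derives the server name from the region using the syntax described above and checks that the staging directory looks like an S3 path. The class and function names are hypothetical, and the sample bucket path and keys are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AthenaConnection:
    """Values to type into Tableau's Amazon Athena connection dialog."""

    server: str
    s3_staging_directory: str
    access_key_id: str
    secret_access_key: str


def athena_endpoint(region):
    """Build the Athena server name for a region, e.g. us-east-1."""
    return f"athena.{region}.amazonaws.com"


def build_connection(region, query_results_path, access_key_id, secret_access_key):
    """Collect the connection values, using the query results path for staging."""
    if not query_results_path.startswith("s3://"):
        raise ValueError("The Athena Query Results location should be an s3:// path")
    return AthenaConnection(
        server=athena_endpoint(region),
        s3_staging_directory=query_results_path,
        access_key_id=access_key_id,
        secret_access_key=secret_access_key,
    )


# Placeholder values, for illustration only.
connection = build_connection(
    "us-east-1",
    "s3://my-data-lake-bucket/athena-results/",
    "AKIAEXAMPLEKEY",
    "exampleSecretValue",
)
```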
Use the values you just collected to fill in the login screen. Note that you will use the
Athena Query Results path for the S3 Staging Directory.
Your connection is now complete and you can see available tables for use in the S3
bucket(s) defined.
Conclusion
This two-part series walked through setting up a data lake in S3, hydrating the data
lake with Salesforce data using AppFlow, and then using Glue and Athena to connect
that data to Tableau so you can create visualizations and dashboards.
The flow showcased in this series is the foundation for a data lake. From this starting
point, it’s possible to read Salesforce data exactly as it was on any given day, based on
a schedule aligned with your organization’s needs. Over time your organization can
use the concepts covered in this tutorial to ingest data from additional sources
including spreadsheets, on-premises servers, and SaaS applications to build out a
comprehensive data management strategy. Before getting started, seek out the help
of your database administrator as many of these steps will require an administrator
profile and access.
With the power of Salesforce and AWS, any nonprofit organization, small business, or
enterprise can start their data lake journey and harness the power of data across
their organization with minimal technical resources.
Additional Resources
Trailhead: Amazon AppFlow
Building Secure and Private Data Flows Between AWS and Salesforce Using
Amazon AppFlow
Building AWS Data Lake visualizations with Amazon Athena and Tableau
his local community with a passion for animal welfare, the arts, and workforce
development. You can connect with Tim on LinkedIn.
Akshay Saxena is a Senior Solutions Architect with Amazon Web Services (AWS)
supporting nonprofit organizations. He enjoys helping customers solve their
technology problems by leveraging the power of AWS. His areas of interest are data
lakes, media & entertainment, and cloud-based contact center solutions.
Alex Dinnouti is a Technical Program Manager on the Amazon Web Services (AWS)
for nonprofits team. Before joining AWS, he was the head of information technology
solutions at Conservation International. He has a master’s degree in software
engineering and his retirement dream is working with nonprofit mission impact
open-source projects.