Ingest Salesforce Data Into Amazon S3 Data Lake
In this blog, you will learn how to ingest Salesforce data using the Bulk API (optimized
to process large sets of data) and store it in an Amazon Simple Storage Service (Amazon
S3) data lake using StreamSets Data Collector, a fast data ingestion engine. The
primary AWS service used in our data pipeline is Amazon S3, which provides
cost-effective storage and archival to underpin the data lake.
Consider the use case where a data engineer is tasked with archiving all Salesforce
contacts, along with some of their account information, in Amazon S3. To
demonstrate one approach to connecting Salesforce and AWS, I have created a data
pipeline designed to move data seamlessly, securely, and in real time
between Salesforce and Amazon S3.
Salesforce origin
https://medium.com/@iamontheinet/ingest-salesforce-data-into-amazon-s3-data-lake-27dc16563180 1/13
07/06/2024, 22:49 Ingest Salesforce Data Into Amazon S3 Data Lake | by Dash Desai | Medium
You can configure the Salesforce origin to read existing data using the Bulk or
SOAP API and provide the SOQL query, offset field, and optional initial offset to
use. When using the Bulk API, you can enable PK Chunking to efficiently
process very large volumes of data.
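As a rough illustration of how the SOQL query, offset field, and initial offset fit together, here is a minimal Python sketch. It is not StreamSets code. The field list, the helper names `build_query` and `next_offset`, and the all-zero initial offset are assumptions made for this example; the actual query configured in the pipeline is not shown in this post.

```python
from string import Template

# Hypothetical query shape: contacts plus one account field, filtered and
# ordered by the offset field so each read resumes where the last one stopped.
SOQL_TEMPLATE = Template(
    "SELECT Id, FirstName, LastName, Email, Account.Name "
    "FROM Contact WHERE $offset_field > '$offset' ORDER BY $offset_field"
)


def build_query(offset_field: str = "Id", offset: str = "000000000000000") -> str:
    """Fill the offset field and the current offset value into the query."""
    return SOQL_TEMPLATE.substitute(offset_field=offset_field, offset=offset)


def next_offset(records, offset_field: str = "Id", current: str = "000000000000000") -> str:
    """Return the offset to use for the next read.

    The query is ordered by the offset field, so the last record in the
    batch carries the highest value. An empty batch keeps the current offset.
    """
    return records[-1][offset_field] if records else current


if __name__ == "__main__":
    print(build_query())
```

Ordering by the same field used in the `WHERE` clause is what makes the read resumable: each batch starts strictly after the last value already processed.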
This processor is configured to mask PII (contact’s email address) before storing
the data in Amazon S3.
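The masking step can be pictured with a small Python sketch. This is not the processor's implementation. The functions `mask_email` and `mask_record`, the `Email` field name, and the choice to keep the domain while replacing the local part with `x` characters are all assumptions for illustration; the post does not say which mask pattern the pipeline uses.

```python
def mask_email(email: str, mask_char: str = "x") -> str:
    """Replace the local part of an email address with mask characters.

    Assumption for this sketch: the domain is left readable. Values without
    an "@" are masked entirely.
    """
    local, sep, domain = email.partition("@")
    if not sep:
        return mask_char * len(email)
    return mask_char * len(local) + sep + domain


def mask_record(record: dict, field: str = "Email") -> dict:
    """Return a copy of the record with the given field masked, if present."""
    masked = dict(record)
    if masked.get(field):
        masked[field] = mask_email(masked[field])
    return masked


if __name__ == "__main__":
    contact = {"Id": "003A", "LastName": "Doe", "Email": "jane@example.com"}
    print(mask_record(contact))
```

Masking before the write means the unmasked address never reaches the object stored in Amazon S3.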
This enables writing data in a compact, binary (Avro) format for cost-effective
storage in Amazon S3.
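To make the Avro output more concrete, the sketch below builds a possible Avro record schema for a masked contact as a Python dictionary and serializes it to JSON, which is how Avro schemas are declared. The record name `Contact` and the field names are assumptions for this example; the schema actually used by the pipeline is not shown in this post.

```python
import json

# Hypothetical Avro schema for an archived contact. Optional fields use a
# union with "null" and a null default, a common Avro convention.
CONTACT_SCHEMA = {
    "type": "record",
    "name": "Contact",
    "fields": [
        {"name": "Id", "type": "string"},
        {"name": "FirstName", "type": ["null", "string"], "default": None},
        {"name": "LastName", "type": "string"},
        {"name": "Email", "type": ["null", "string"], "default": None},
        {"name": "AccountName", "type": ["null", "string"], "default": None},
    ],
}


def schema_json(schema: dict = CONTACT_SCHEMA) -> str:
    """Serialize the schema to the JSON text an Avro writer would be given."""
    return json.dumps(schema, indent=2)


if __name__ == "__main__":
    print(schema_json())
```

Because field names live in the schema rather than in every record, Avro files tend to be smaller than the equivalent JSON or CSV.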
Pipeline Run
After the pipeline runs successfully, you should see output similar to that
shown below. Notice the highlighted AWS encryption and data format of the object
stored on Amazon S3.
The contents of the S3 object stored in Avro format should look something like
this.
In this post, you learned the value companies can realize by leveraging and
integrating data between AWS and Salesforce using StreamSets Data Collector.
Closer integration between AWS and Salesforce opens up plenty of opportunities for
enterprises to develop new and unique ways of accessing, analyzing, and storing
their data.