GCP Overview
Project Vector at Google is the first large data management deployment on Google Cloud
Platform (GCP) and also one of the largest Salesforce implementations for Deloitte.
Applications:-
Informatica MDM, Dell Boomi, GCP and Salesforce.
gsutil:-
Command-line tool used to move files to and from Google Cloud Storage.
Note:- If you have a large number of files to move or copy, consider the gsutil -m option, which runs the operation in parallel (multi-threaded/multi-processing).
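As a minimal sketch of the note above, the following Python helper assembles a parallel gsutil copy command. The helper name, the local path and the bucket name are hypothetical and only for illustration; the sketch assumes gsutil is installed and authenticated if the command is actually run.

```python
import shlex


def build_gsutil_copy_command(source, destination, parallel=True, recursive=True):
    """Return the argument list for a gsutil copy (hypothetical helper)."""
    command = ["gsutil"]
    if parallel:
        # -m is a top-level gsutil option, so it goes before the subcommand.
        command.append("-m")
    command.append("cp")
    if recursive:
        # -r copies a directory tree rather than a single object.
        command.append("-r")
    command += [source, destination]
    return command


# Hypothetical local directory and bucket, for illustration only.
command = build_gsutil_copy_command("./exports", "gs://example-bucket/exports")
print(shlex.join(command))
# To run it for real: subprocess.run(command, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems when paths contain spaces.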
Google Dataprep:-
Google Cloud Dataprep is a managed cloud service for quick data exploration and transformation. Dataprep makes it easy to clean and transform large
datasets for analysis.
Google Storage :-
Can store data of any type and any size.
Points to note (BigQuery):-
Batch inserts are free, but streaming inserts incur extra charges, currently $0.05 per GB sent.
BigQuery query costs depend on the amount of data scanned, not the amount of data returned.
Best suited for ELT.
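Select only the columns you need.
String comparison in the WHERE clause: avoid LOWER() and UPPER() for case-insensitive filters; use a regular-expression match instead (REGEXP_CONTAINS() in standard SQL, REGEXP_MATCH() in legacy SQL).
ORDER BY is one of the most expensive clauses, so think twice before using it.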
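The points above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the $0.05 per GB rate is the figure quoted in these notes and may be out of date, the table and column names are hypothetical, and the query assumes BigQuery standard SQL, where the regular-expression predicate is REGEXP_CONTAINS and the (?i) prefix makes the match case-insensitive.

```python
import re

# Rate quoted in these notes; check current pricing before relying on it.
STREAMING_RATE_USD_PER_GB = 0.05


def streaming_insert_cost(gigabytes, rate_per_gb=STREAMING_RATE_USD_PER_GB):
    """Estimate the streaming-insert charge for a given volume in GB."""
    return gigabytes * rate_per_gb


def build_filtered_query(table, columns, filter_column, pattern):
    """Build a query that names only the needed columns, filters with a
    case-insensitive regex instead of LOWER()/UPPER(), and has no ORDER BY.

    Illustration only: values are interpolated directly into the string,
    so real code should validate them or use query parameters.
    """
    column_list = ", ".join(columns)
    return (
        f"SELECT {column_list}\n"
        f"FROM `{table}`\n"
        f"WHERE REGEXP_CONTAINS({filter_column}, r'(?i){pattern}')"
    )


# Hypothetical table and columns, for illustration only.
query = build_filtered_query(
    "example-project.sales.accounts", ["account_id", "name"], "name", "^acme"
)
print(query)

# The same (?i) pattern checked locally with Python's re module.
print(re.search("(?i)^acme", "ACME Corp") is not None)

# Estimated charge for streaming 200 GB at the quoted rate.
print(round(streaming_insert_cost(200), 2))
```

Because cost follows the data scanned, naming two columns instead of using SELECT * reduces the bytes read in a columnar store, whatever the number of rows returned.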