
Module 7
Ingesting New Datasets into Google BigQuery

In this module we will:

• Query from External Data Sources
• Avoid Data Ingesting Pitfalls
• Ingest New Data into Permanent Tables
• Discuss Streaming Inserts

© 2017 Google Inc. All rights reserved. Google and the Google logo are trademarks of Google Inc. All other
company and product names may be trademarks of the respective companies with which they are associated.

Ingest data permanently into BigQuery from a variety of formats

[Diagram: Cloud Storage, Google Drive, Cloud Dataflow, Cloud Dataprep, Cloud Bigtable, BigQuery Query Engine, BigQuery Managed Storage]

Formats: CSV, JSON, Avro … and more formats

Data ingested into BigQuery is stored in permanent tables and this storage is scalable and fully-managed
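As a rough illustration of what loading a file into a permanent table involves, the sketch below builds the JSON body of a BigQuery load job using only the Python standard library. It does not send anything to BigQuery. The project, dataset, table, and bucket names are hypothetical placeholders, and the helper `build_load_job` is invented for this example; the field names follow the load-job configuration of the BigQuery REST API.

```python
import json

# Formats named on the slide, mapped to the identifiers a load job uses.
SOURCE_FORMATS = {"csv": "CSV", "json": "NEWLINE_DELIMITED_JSON", "avro": "AVRO"}


def build_load_job(project, dataset, table, uri, file_format):
    """Return a load-job body that copies one file into a permanent table.

    Hypothetical helper for illustration; all names passed in are placeholders.
    """
    fmt = file_format.lower()
    load = {
        "sourceUris": [uri],
        "sourceFormat": SOURCE_FORMATS[fmt],
        "destinationTable": {
            "projectId": project,
            "datasetId": dataset,
            "tableId": table,
        },
        # Append to the table if it already exists.
        "writeDisposition": "WRITE_APPEND",
    }
    if fmt in ("csv", "json"):
        # Avro files carry their own schema; for CSV and JSON this sketch
        # asks BigQuery to detect the schema instead of supplying one.
        load["autodetect"] = True
    return {"configuration": {"load": load}}


csv_job = build_load_job("my-project", "my_dataset", "my_table",
                         "gs://my-bucket/data.csv", "csv")
avro_job = build_load_job("my-project", "my_dataset", "my_table",
                          "gs://my-bucket/data.avro", "avro")
print(json.dumps(csv_job, indent=2))
```

Once a job like this completes, the data lives in BigQuery managed storage and no longer depends on the source file.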

BigQuery can query external data sources in GCS and Drive directly

[Diagram: Cloud Storage, Google Drive, Cloud Dataflow, Cloud Dataprep, Cloud Bigtable, BigQuery Query Engine, BigQuery Managed Storage]

Formats: CSV, JSON, Avro

You can query external data sources directly from BigQuery, which bypasses managed storage
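To make the contrast with a load job concrete, the sketch below builds a table definition whose data stays in Cloud Storage rather than being copied into managed storage. It uses only the Python standard library and sends nothing to BigQuery. The helper `build_external_table` and all resource names are hypothetical; the `externalDataConfiguration` field names follow the table resource of the BigQuery REST API.

```python
import json


def build_external_table(project, dataset, table, uris, source_format="CSV"):
    """Return a table resource that points at files instead of storing rows.

    Hypothetical helper for illustration; all names passed in are placeholders.
    """
    external = {
        "sourceUris": list(uris),
        "sourceFormat": source_format,
    }
    if source_format in ("CSV", "NEWLINE_DELIMITED_JSON"):
        # Ask BigQuery to infer the schema from the files (an assumption
        # made here for brevity; an explicit schema could be given instead).
        external["autodetect"] = True
    return {
        "tableReference": {
            "projectId": project,
            "datasetId": dataset,
            "tableId": table,
        },
        # No load step: queries read the files at the URIs every time.
        "externalDataConfiguration": external,
    }


ext_table = build_external_table("my-project", "my_dataset", "ext_sales",
                                 ["gs://my-bucket/sales/*.csv"])
print(json.dumps(ext_table, indent=2))
```

Because each query reads the files as they are at that moment, results can change if the files are modified while the query runs.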

Pitfalls: Querying from External Data Sources Directly

Limitations
● Strong Performance Disadvantages
● Data Consistency not Guaranteed
● Can’t Use Table Wildcards (cool feature we will introduce shortly)


Streaming records into BigQuery through the API

[Diagram: Streaming Record Inserts, BigQuery Managed Storage, BigQuery Query Engine]

Streaming data allows you to query data without waiting for a full batch load
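A minimal sketch of what a streaming insert looks like on the wire: the code below builds the URL and JSON body for the `tabledata.insertAll` method of the BigQuery REST API, using only the Python standard library. It does not authenticate or send the request. The helper `build_insert_all`, the table name `events`, and the sample records are hypothetical.

```python
import json
import uuid

# URL pattern of the insertAll method; the placeholders are filled in below.
INSERT_ALL_URL = (
    "https://bigquery.googleapis.com/bigquery/v2/projects/{project}"
    "/datasets/{dataset}/tables/{table}/insertAll"
)


def build_insert_all(records):
    """Wrap plain dicts in the row envelope that insertAll expects.

    Hypothetical helper for illustration.
    """
    rows = []
    for record in records:
        rows.append({
            # insertId lets BigQuery de-duplicate retried rows
            # on a best-effort basis.
            "insertId": str(uuid.uuid4()),
            "json": record,
        })
    return {"kind": "bigquery#tableDataInsertAllRequest", "rows": rows}


url = INSERT_ALL_URL.format(project="my-project", dataset="my_dataset",
                            table="events")
body = build_insert_all([
    {"user": "a", "clicks": 3},
    {"user": "b", "clicks": 5},
])
print(url)
print(json.dumps(body, indent=2))
```

Each request carries individual records rather than a whole file, which is why streamed rows can be queried without waiting for a batch load to finish.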


Summary: Ingest new datasets into BigQuery managed storage

GCP offers you the ability to:

• Store data permanently in BigQuery, where it is fully managed (performance, backups, redundancy).
• Load new data from a variety of formats.
• Set up streaming ingestion into BigQuery through APIs.

Lab 6
Ingesting and Querying New Datasets

Ingesting and Querying New Datasets

In this lab, you will ingest new data sources into Google BigQuery and learn how to query external data sources directly.
