What is BigQuery?
In our world of ‘Big Data’ it can be time-consuming and expensive to query massive datasets without the right infrastructure. Google BigQuery solves this problem by enabling super-fast, SQL-like queries against append-only tables, using the processing power of Google’s infrastructure.
What do you need to do?
- Move your data into BigQuery – This is what we will do in this post.
- Let Google BigQuery handle the hard work.
- Query your big data with a smile, in a cost-effective way.
How to upload data to BigQuery?
There are two main approaches: stream your data, or upload it directly from Google Cloud Storage. Let’s have a look at the steps to leverage Google Cloud Storage in order to upload data into BigQuery.
The main steps you need to follow:
- Prepare your data. At this stage, you need to analyze your data and decide which format suits it best (both JSON and CSV are supported).
- In our example, we will work with CSV files: we will upload them to Google Cloud Storage, and a BigQuery load job will then pull the data into BigQuery automatically.
- Run a ‘sanity’ check to see that the new data is in good shape (optional step).
- Upload your data to a project with a good name (the default project names are not very clear in most cases).
- Consider breaking up your data (e.g. monthly tables instead of a single big one), because it will make it easier to update, query and maintain the data source in the future.
- Have an example dataset with data that reflects the popular use cases. This is a great way to give developers an option to ‘play’ with the data and see its value.
- Think of some good, bold examples. A few sample queries are crucial to get people started on a dataset.
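To make the steps above a bit more concrete, here is a minimal Python sketch of the preparation work: a ‘sanity’ check on a CSV file, splitting rows into monthly tables, and building the request body for a BigQuery load job that reads from Google Cloud Storage. The column names, the sample rows, the `events_YYYYMM` naming scheme and all of the function names are assumptions made up for this illustration; only the field names inside the load job body follow the BigQuery REST API. The sketch does not send anything to Google, and it assumes the destination table already exists (otherwise the job would also need a schema).

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; a real export would be read from a file.
SAMPLE_CSV = """event_date,user_id,amount
2014-01-15,u1,9.99
2014-01-20,u2,4.50
2014-02-03,u1,12.00
"""

# Assumed layout of the CSV file (illustration only).
EXPECTED_COLUMNS = ["event_date", "user_id", "amount"]


def sanity_check(csv_text):
    """Return a list of problems found in the CSV text (empty if none)."""
    problems = []
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header != EXPECTED_COLUMNS:
        return [f"unexpected header: {header}"]
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(EXPECTED_COLUMNS):
            problems.append(f"line {line_no}: wrong number of fields")
            continue
        try:
            float(row[2])
        except ValueError:
            problems.append(f"line {line_no}: amount is not a number")
    return problems


def monthly_table(base_name, event_date):
    """Map a YYYY-MM-DD date to a per-month table name."""
    year, month, _day = event_date.split("-")
    return f"{base_name}_{year}{month}"


def split_by_month(csv_text, base_name="events"):
    """Group the CSV rows by the monthly table they belong to."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[monthly_table(base_name, row["event_date"])].append(row)
    return dict(groups)


def load_job_body(project_id, dataset_id, table_id, gcs_uri):
    """Build the request body for a load job from Google Cloud Storage.

    No schema is included: this assumes the destination table exists.
    """
    return {
        "configuration": {
            "load": {
                "sourceUris": [gcs_uri],
                "sourceFormat": "CSV",
                "skipLeadingRows": 1,  # skip the header row
                "writeDisposition": "WRITE_APPEND",
                "destinationTable": {
                    "projectId": project_id,
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
            }
        }
    }


if __name__ == "__main__":
    print(sanity_check(SAMPLE_CSV))
    for table, rows in split_by_month(SAMPLE_CSV).items():
        # Hypothetical project, dataset and bucket names.
        uri = f"gs://my-bucket/{table}.csv"
        print(load_job_body("my-project", "my_dataset", table, uri))
```

Each monthly group would be written to its own CSV file, uploaded to Google Cloud Storage, and then loaded with one job per table. Keeping the check and the split as plain functions makes it easy to run them locally before any data leaves your machine.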