1. How to handle duplicate column names on left join
A SELECT * EXCEPT statement specifies the names of one or more columns to exclude from the result set. All matching column names are omitted from the output.
SELECT
t1.* EXCEPT (name),
t2.* EXCEPT (name,account_id),
t1.name AS deployment_name,
t2.name AS account_name
FROM
table1 t1
LEFT JOIN
table2 t2
ON t1.name = t2.name
2. Extract date from timestamp
Returns a value that corresponds to the specified part
from a supplied timestamp_expression
. This function supports an optional timezone
parameter. See Time zone definitions for information on how to specify a time zone.
EXTRACT(part FROM timestamp_expression [AT TIME ZONE timezone])
SELECT
EXTRACT(DATE FROM started) AS deployment_start_date
FROM
table
3. How to calculate duration between two timestamp
Returns the number of whole specified date_part
intervals between two TIMESTAMP
objects (timestamp_expression_a
- timestamp_expression_b
). If the first TIMESTAMP
is earlier than the second one, the output is negative. Throws an error if the computation overflows the result type, such as if the difference in microseconds between the two TIMESTAMP
objects would overflow an INT64
value.
TIMESTAMP_DIFF(timestamp_expression_a, timestamp_expression_b, date_part)
SELECT
TIMESTAMP_DIFF( t1.stopped, t1.started, SECOND ) AS deployment_time
FROM
table t1
4. Regular expression and multi-condition in BQ
Short-circuiting in conditional expressions can be exploited for error handling or performance tuning.
Note: BigQuery provides regular expression support using the re2 library; see that documentation for its regular expression syntax.
SELECT
CASE
WHEN REGEXP_CONTAINS(t1.app_image,r"dkr.ecr.") THEN "central ECR"
WHEN REGEXP_CONTAINS(t1.app_image,r"registry.cowbell.realestate.com.au") THEN "cowbell"
ELSE "others"
END
AS registry
FROM
table t1
5. How to get data from last one year dynamically
Truncates a timestamp to the granularity of date_part
.
TIMESTAMP_TRUNC(timestamp_expression, date_part[, timezone])
SELECT
t1.*
FROM
table t1
WHERE
s1.started
BETWEEN
TIMESTAMP_TRUNC(TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 365 * 24 HOUR),DAY)
AND
TIMESTAMP_TRUNC(CURRENT_TIMESTAMP(), DAY)
Reference
bigquery-how-to-automatically-handle-duplicate-column-names-on-left-join