Remove duplicate values between arrays in BigQuery
Suppose I have the following arrays:
    SELECT ['A', 'B', 'C', 'A', 'A', 'A'] AS origin_array
    UNION ALL
    SELECT ['A', 'A', 'B'] AS secondary_array
I want to remove all duplicate values between the arrays (not within an array), so that the final result is:
    SELECT ['C', 'A', 'A'] AS result_array
Any idea how to do this?
Below is an example for BigQuery Standard SQL:
    #standardSQL
    CREATE TEMP FUNCTION DEDUP_ARRAYS(arr1 ANY TYPE, arr2 ANY TYPE) AS ((ARRAY(
      SELECT item
      FROM (
        -- Number repeated values within each array: the n-th 'A' gets pos = n
        SELECT item, ROW_NUMBER() OVER(PARTITION BY item) pos
        FROM UNNEST(arr1) item
        UNION ALL
        SELECT item, ROW_NUMBER() OVER(PARTITION BY item) pos
        FROM UNNEST(arr2) item
      )
      -- Keep only (item, pos) pairs that occur in exactly one of the arrays
      GROUP BY item, pos
      HAVING COUNT(1) = 1
    )));
    WITH `project.dataset.table` AS (
      SELECT ['A', 'B', 'C', 'A', 'A', 'A'] AS origin_array,
             ['A', 'A', 'B'] AS secondary_array
    )
    SELECT DEDUP_ARRAYS(origin_array, secondary_array) AS result_array
    FROM `project.dataset.table`
The result is:
    Row  result_array
    1    A
         A
         C
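To see why this works, it may help to look at the intermediate (item, pos) pairs before the HAVING filter is applied. The query below is only an illustrative sketch, not part of the original answer: the CTE name `pairs` and the column name `occurrences` are introduced here, and the array literals are copied from the question.

```sql
#standardSQL
WITH pairs AS (
  -- The n-th occurrence of a value within an array gets pos = n
  SELECT item, ROW_NUMBER() OVER(PARTITION BY item) AS pos
  FROM UNNEST(['A', 'B', 'C', 'A', 'A', 'A']) AS item
  UNION ALL
  SELECT item, ROW_NUMBER() OVER(PARTITION BY item) AS pos
  FROM UNNEST(['A', 'A', 'B']) AS item
)
-- occurrences = 2 means the pair exists in both arrays and cancels out
SELECT item, pos, COUNT(1) AS occurrences
FROM pairs
GROUP BY item, pos
ORDER BY item, pos
```

For the sample data, the pairs (A, 1), (A, 2) and (B, 1) appear in both arrays and so have occurrences = 2, while (A, 3), (A, 4) and (C, 1) appear only once. Keeping just the pairs with a count of 1 leaves A, A and C, which matches the result above. Note that ARRAY(SELECT ...) without an ORDER BY does not guarantee element order, so the elements may come back in a different order than in the question's expected output.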
If you only type