BigQuery: How to avoid the "Resources exceeded during query execution." error

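I would like to know how to avoid the "Resources exceeded during query execution" error. Most other questions about it involve JOIN EACH or GROUP EACH BY, but I am already not using those. The query works if I include a WHERE clause on the date or on ABS(HASH(userId)), but I would like to have the whole dataset and then filter it further in Tableau.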

The query works if I remove t4, but I want that last column, and I expect to create more columns from the event_parameters field for later queries.

The job ID is rhi-localytics-db:job_6MaesvuMK6mP6irmAnrcM9R3cx8, in case that helps. Thanks.

SELECT
    t1.userId AS userId,
    t1.event_time AS event_time,
    t1.Diamond_Balance AS Diamond_Balance,
    t2.Diamond_Change AS Diamond_Change,
    t3.Gold_Balance AS Gold_Balance,
    t4.Gold_Change AS Gold_Change
FROM (
    SELECT
        userId,
        event_time,
        INTEGER(event_parameters.Value) AS Diamond_Balance,
    FROM
        FLATTEN([game_data], event_parameters)
    WHERE
        event_name LIKE 'Currency'
        AND event_parameters.Name = 'Diamond_Balance'
        -- and date(event_time) > '2015-09-11'
        -- AND ABS(HASH(userId) % 5)  = 0
    GROUP BY
        userId,
        event_time,
        Diamond_Balance ) AS t1
INNER JOIN (
    SELECT
        userId,
        event_time,
        INTEGER(event_parameters.Value) AS Diamond_Change,
    FROM
        FLATTEN([game_data], event_parameters)
    WHERE
        event_name LIKE 'Currency'
        AND event_parameters.Name = 'Diamond_Change'
        AND INTEGER(event_parameters.Value ) < 14000
        AND INTEGER(event_parameters.Value ) > -14000
        -- and date(event_time) > '2015-09-11'
        -- AND ABS(HASH(userId) % 5)  = 0

    GROUP BY
        userId,
        event_time,
        Diamond_Change ) AS t2
ON
    t1.userId = t2.userId
    AND t1.event_time = t2.event_time
INNER JOIN (
    SELECT
        userId,
        event_time,
        event_parameters.Value AS Gold_Balance,
    FROM
        FLATTEN([game_data], event_parameters)
    WHERE
        event_name LIKE 'Currency'
        AND event_parameters.Name = 'Gold_Balance'
        -- and date(event_time) > '2015-09-11'
        -- AND ABS(HASH(userId) % 5)  = 0

    GROUP BY
        userId,
        event_time,
        Gold_Balance ) AS t3
ON
    t1.userId = t3.userId
    AND t1.event_time = t3.event_time
INNER JOIN (
    SELECT
        userId,
        event_time,
        INTEGER(event_parameters.Value) AS Gold_Change,
    FROM
        FLATTEN([game_data], event_parameters)
    WHERE
        event_name LIKE 'Currency'
        AND event_parameters.Name = 'Gold_Change'
        -- and date(event_time) > '2015-09-11'
        -- AND ABS(HASH(userId) % 5)  = 0
    GROUP BY
        userId,
        event_time,
        Gold_Change ) AS t4
ON
    t1.userId = t4.userId
    AND t1.event_time = t4.event_time

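General advice on resources-exceeded errors can be found here:
https://stackoverflow.com/a/16579558/1375400

Note that adding EACH is generally a way to fix resources-exceeded errors, not the cause of them. (Though there are some cases where it can work the other way!)

Additionally, EACH is no longer meaningful on GROUP BY, and will soon be irrelevant on JOIN as well.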


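I think you should be able to do all of this logic in one simple scan of the table.
No joins at all!
Something like the query below. It is just an idea, but there is some chance it will work :)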

SELECT
    userId,
    event_time,
    MAX(CASE WHEN event_parameters.Name = 'Diamond_Balance'
            THEN INTEGER(event_parameters.Value) END) AS Diamond_Balance,
    MAX(CASE WHEN event_parameters.Name = 'Diamond_Change' AND INTEGER(event_parameters.Value ) BETWEEN -14000 AND 14000
            THEN INTEGER(event_parameters.Value) END) AS Diamond_Change,
    MAX(CASE WHEN event_parameters.Name = 'Gold_Balance'
            THEN INTEGER(event_parameters.Value) END) AS Gold_Balance,
    MAX(CASE WHEN event_parameters.Name = 'Gold_Change'
            THEN INTEGER(event_parameters.Value) END) AS Gold_Change
FROM
    FLATTEN([game_data], event_parameters)
WHERE
    event_name LIKE 'Currency'
GROUP BY
    userId,
    event_time
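
For readers on BigQuery standard SQL, the same single-scan idea might look roughly like the sketch below. It is not part of the original answer. It assumes that event_parameters is a repeated record with string fields Name and Value, and the dataset name mydataset is a placeholder. The HAVING clause is an optional addition that approximates the INNER JOIN behaviour of the question's query by dropping rows where any of the four values is missing.

```sql
#standardSQL
-- Sketch only: `mydataset.game_data` is a hypothetical table path, and the
-- schema of event_parameters (ARRAY<STRUCT<Name STRING, Value STRING>>) is assumed.
SELECT
    userId,
    event_time,
    -- Each MAX(IF(...)) keeps the value of one named parameter and ignores the rest.
    MAX(IF(p.Name = 'Diamond_Balance', SAFE_CAST(p.Value AS INT64), NULL)) AS Diamond_Balance,
    -- Inclusive bounds, as in the answer above (the question used strict < and >).
    MAX(IF(p.Name = 'Diamond_Change'
           AND SAFE_CAST(p.Value AS INT64) BETWEEN -14000 AND 14000,
           SAFE_CAST(p.Value AS INT64), NULL)) AS Diamond_Change,
    MAX(IF(p.Name = 'Gold_Balance', SAFE_CAST(p.Value AS INT64), NULL)) AS Gold_Balance,
    MAX(IF(p.Name = 'Gold_Change', SAFE_CAST(p.Value AS INT64), NULL)) AS Gold_Change
FROM
    `mydataset.game_data`,
    UNNEST(event_parameters) AS p   -- plays the role of FLATTEN in legacy SQL
WHERE
    event_name = 'Currency'         -- LIKE without wildcards is an exact match
GROUP BY
    userId,
    event_time
-- Optional: mimic the INNER JOINs by requiring all four values to be present.
HAVING
    Diamond_Balance IS NOT NULL
    AND Diamond_Change IS NOT NULL
    AND Gold_Balance IS NOT NULL
    AND Gold_Change IS NOT NULL
```

One difference from the joined query is worth keeping in mind: if a user has several distinct values for the same parameter at the same event_time, the joins return one row per combination, while MAX collapses them into a single row.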