Is there a way to see how much data a query shuffled?


As the title says: I mainly want to see how enabling or disabling the bucket shuffle optimization affects the amount of data shuffled.

2 Answers

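fe.audit.log contains the ShuffleSendBytes and ShuffleSendRows fields; their values are printed once the query finishes.
While the query is still running, the same values can be read from a system table: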

mysql [information_schema]>desc backend_active_tasks;
+---------------------------+--------------+------+-------+---------+-------+
| Field                     | Type         | Null | Key   | Default | Extra |
+---------------------------+--------------+------+-------+---------+-------+
| BE_ID                     | BIGINT       | Yes  | false | NULL    |       |
| FE_HOST                   | VARCHAR(256) | Yes  | false | NULL    |       |
| QUERY_ID                  | VARCHAR(256) | Yes  | false | NULL    |       |
| TASK_TIME_MS              | BIGINT       | Yes  | false | NULL    |       |
| TASK_CPU_TIME_MS          | BIGINT       | Yes  | false | NULL    |       |
| SCAN_ROWS                 | BIGINT       | Yes  | false | NULL    |       |
| SCAN_BYTES                | BIGINT       | Yes  | false | NULL    |       |
| BE_PEAK_MEMORY_BYTES      | BIGINT       | Yes  | false | NULL    |       |
| CURRENT_USED_MEMORY_BYTES | BIGINT       | Yes  | false | NULL    |       |
| SHUFFLE_SEND_BYTES        | BIGINT       | Yes  | false | NULL    |       |
| SHUFFLE_SEND_ROWS         | BIGINT       | Yes  | false | NULL    |       |
+---------------------------+--------------+------+-------+---------+-------+
11 rows in set (0.03 sec)
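
To compare shuffle volume per query, the table can be aggregated by QUERY_ID while the query is running. This is only a sketch: it uses the column names from the desc output above and assumes the table has one row per BE per query, so the per-BE values need to be summed.

```sql
-- Sum the shuffle counters over all BEs for each running query.
-- Assumption: one row per (BE_ID, QUERY_ID); only active queries are listed.
SELECT
    QUERY_ID,
    SUM(SHUFFLE_SEND_BYTES) AS total_shuffle_bytes,
    SUM(SHUFFLE_SEND_ROWS)  AS total_shuffle_rows
FROM information_schema.backend_active_tasks
GROUP BY QUERY_ID
ORDER BY total_shuffle_bytes DESC;
```

Running the same query once with the bucket shuffle optimization on and once with it off (probably via the enable_bucket_shuffle_join session variable, but check the name for your version), then comparing the two totals or the ShuffleSendBytes values in fe.audit.log, should show the difference.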