建表DDL:
CREATE TABLE IF NOT EXISTS test1
(
id BIGINT NOT NULL COMMENT '交易ID,同一交易对递增',
time_utc0 DATETIME(3) NOT NULL COMMENT '成交时间(UTC)',
symbol VARCHAR(100) NOT NULL COMMENT '交易对,如BTC/USDT',
price DECIMAL(38, 18) NOT NULL COMMENT '成交价格',
size DECIMAL(38, 18) NOT NULL COMMENT '成交数量',
side TINYINT NOT NULL COMMENT '0=buy, 1=sell'
) UNIQUE KEY(`id`, `time_utc0`, `symbol`)
AUTO PARTITION BY RANGE(DATE_TRUNC(`time_utc0`, 'year')) ()
DISTRIBUTED BY HASH(`time_utc0`) BUCKETS 10
PROPERTIES(
"replication_num"="3",
"compaction_policy" = "time_series",
"enable_unique_key_merge_on_write" = "true"
);
查询语句:
WITH cte2 AS (
WITH cte1 AS (
SELECT * FROM test1 WHERE symbol = 'BTC/USDT'
)
SELECT date_floor(time_utc0, INTERVAL 1 minute) AS time_utc0, price, size
FROM cte1
)
SELECT time_utc0, sum(price * size), sum(size), sum(price * size)/sum(size) AS vwap, avg(price), min(price)
FROM cte2
GROUP BY time_utc0
ORDER BY time_utc0 DESC;
问题在于, vwap聚合完之后, 数据和预期有很大出入, vwap算法应该接近平均值, 然后我建了一张一样的表, 将DECIMAL类型改为DOUBLE后, 数值接近正常, 后又调小了小数的精度为DECIMAL(38,12), 也接近正常, 所以这是啥问题? 以下是几张参考图, 按顺序是DECIMAL(38, 18), DOUBLE, DECIMAL(38,12), 主要看第四个字段.
这里提供一些测试数据:
https://www.lanzouw.com/iTvDr1y3u0mb