【已解决】DECIMAL涉及聚合运算时, 出现误差很大

Viewed 84

建表DDL:

CREATE TABLE IF NOT EXISTS test1
(
    id        BIGINT          NOT NULL COMMENT '交易ID,同一交易对递增',
    time_utc0 DATETIME(3)     NOT NULL COMMENT '成交时间(UTC)',
    symbol    VARCHAR(100)    NOT NULL COMMENT '交易对,如BTC/USDT',
    price     DECIMAL(38, 18) NOT NULL COMMENT '成交价格',
    size      DECIMAL(38, 18) NOT NULL COMMENT '成交数量',
    side      TINYINT         NOT NULL COMMENT '0=buy, 1=sell'
) UNIQUE KEY(`id`, `time_utc0`, `symbol`)
AUTO PARTITION BY RANGE(DATE_TRUNC(`time_utc0`, 'year')) ()
DISTRIBUTED BY HASH(`time_utc0`) BUCKETS 10
PROPERTIES(
    "replication_num"="3",
    "compaction_policy" = "time_series",
    "enable_unique_key_merge_on_write" = "true"
);

查询语句:

WITH cte2 AS (
WITH cte1 AS (
SELECT * FROM test1 WHERE symbol = 'BTC/USDT'
)
SELECT date_floor(time_utc0, INTERVAL 1 minute) AS time_utc0, price, size
FROM cte1
)
SELECT time_utc0, sum(price * size), sum(size), sum(price * size)/sum(size) AS vwap, avg(price), min(price)
FROM cte2
GROUP BY time_utc0
ORDER BY time_utc0 DESC;

问题在于, vwap聚合完之后, 数据和预期有很大出入, vwap算法应该接近平均值, 然后我建了一张一样的表, 将DECIMAL类型改为DOUBLE后, 数值接近正常, 后又调小了小数的精度为DECIMAL(38,12), 也接近正常, 所以这是啥问题? 以下是几张参考图, 按顺序是DECIMAL(38, 18), DOUBLE, DECIMAL(38,12), 主要看第四个字段.

e7e24a176d6fc172e928c90ca8d45eb.png

46d164facba741ec1f7f35d6dfa3b10.png

330585576d69df9f92dc173758ebad2.png

这里提供一些测试数据:
https://www.lanzouw.com/iTvDr1y3u0mb

2 Answers

已fix:https://github.com/apache/doris/pull/34825