Doris 2.1 performance test: does VARIANT storage really take less space than STRING or JSON storage?


The figure below shows the official Doris conclusion.
Source: https://doris.apache.org/zh-CN/docs/sql-manual/sql-types/Data-Types/VARIANT?_highlight=variant

image.png

The test uses one hour of GitHub events data, 605M in size.
Table definition with STRING columns:
CREATE TABLE IF NOT EXISTS github_events_string (
    id BIGINT NOT NULL,
    type VARCHAR(30) NULL,
    actor STRING NULL,
    repo STRING NULL,
    payload STRING NULL,
    public BOOLEAN NULL,
    created_at DATETIME NULL,
    INDEX idx_payload (payload) USING INVERTED PROPERTIES("parser" = "english") COMMENT 'inverted index for payload'
)
DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 10
PROPERTIES("replication_num" = "1");

Table definition with VARIANT columns:
CREATE TABLE IF NOT EXISTS github_events_variant (
    id BIGINT NOT NULL,
    type VARCHAR(30) NULL,
    actor VARIANT NULL,
    repo VARIANT NULL,
    payload VARIANT NULL,
    public BOOLEAN NULL,
    created_at DATETIME NULL,
    INDEX idx_payload (payload) USING INVERTED PROPERTIES("parser" = "english") COMMENT 'inverted index for payload'
)
DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 10
PROPERTIES("replication_num" = "1");

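Test results
After actually loading the data with Stream Load, Doris reports 321M for the VARIANT table and 156M for the STRING table, so the VARIANT table is roughly twice as large.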
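
For anyone reproducing the comparison, one way to read the on-disk sizes is Doris's SHOW DATA statement. This is a minimal sketch: the database name test_db and the table name github_events_variant are assumptions for illustration, not taken from the post, and the reported size only settles once background compaction has finished.

```sql
-- Assumed database name; replace with the one the tables were created in.
USE test_db;

-- Size and replica count for every table in the current database.
SHOW DATA;

-- Per-table breakdown; the table names here are assumed.
SHOW DATA FROM github_events_string;
SHOW DATA FROM github_events_variant;
```

Comparing the two numbers right after the load and again after compaction helps rule out temporary rowsets as the cause of the difference.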

1 Answer

GitHub events is not a suitable dataset for this comparison, because github_events is an extremely sparse table.
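
The sparsity can be checked directly on the loaded data. The sketch below assumes the VARIANT table is named github_events_variant and uses payload['action'] only as an illustrative subpath; the session variable describe_extend_variant_column is the one the Doris VARIANT documentation describes for expanding subcolumns in DESC output, so verify it against the version in use.

```sql
-- List the subcolumns Doris extracted from the VARIANT columns
-- (session variable as described in the Doris VARIANT docs).
SET describe_extend_variant_column = true;
DESC github_events_variant;

-- Share of rows in which one illustrative subpath is actually present.
-- payload['action'] is only an example path; try others from the DESC output.
SELECT
    COUNT(*) AS total_rows,
    COUNT(CAST(payload['action'] AS TEXT)) AS rows_with_action
FROM github_events_variant;
```

If most subpaths are present in only a small fraction of rows, the table is sparse in the sense the answer describes.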