【已解决】doris2.1,主键表在副本数为3的情况下执行数据查询时返回的结果数据不一致

Viewed 62

建表语句

CREATE TABLE `test` (
  `pmis_code` VARCHAR(255) NULL COMMENT 'ID',
  `end_time` DATETIME NULL COMMENT '结束时间',
  `start_time` DATETIME NULL COMMENT '开始时间',
  `p_day` VARCHAR(32) NULL COMMENT '日期',
  `data_value` INT NULL COMMENT '状态',
  `point_name` VARCHAR(255) NULL COMMENT '点位名称',
  `source_code` VARCHAR(255) NULL COMMENT '来源',
  `site_code` VARCHAR(255) NULL COMMENT '站点编码',
  `site_name` VARCHAR(255) NULL COMMENT '站点名称',
  `load_time` DATETIME NULL COMMENT '加载时间',
  INDEX id_end_time (`end_time`) USING INVERTED,
  INDEX id_pmis_code (`pmis_code`) USING INVERTED
) ENGINE=OLAP
UNIQUE KEY(`pmis_code`, `end_time`)
COMMENT 'test'
DISTRIBUTED BY HASH(`pmis_code`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 3",
"min_load_replica_num" = "-1",
"is_being_synced" = "false",
"storage_medium" = "hdd",
"storage_format" = "V2",
"inverted_index_storage_format" = "V1",
"enable_unique_key_merge_on_write" = "true",
"light_schema_change" = "true",
"disable_auto_compaction" = "false",
"enable_single_replica_compaction" = "false",
"group_commit_interval_ms" = "10000",
"group_commit_data_bytes" = "134217728",
"enable_mow_light_delete" = "false"
);

表中数据总量 407771

查询语句

SELECT COUNT(1) AS num 
FROM (
SELECT 
	pmis_code 
	,start_time 
	,end_time 
	,data_value
	,LEAD(start_time ,1,'2099-12-31 23:59:59') OVER(PARTITION BY pmis_code ORDER BY start_time) AS time_n
	,load_time
FROM test
ORDER BY start_time DESC 
) a WHERE a.end_time <> time_n;

查询结果

连续 5次查询 的结果分别是 28,30,28,26,28,而且查询期间没有其他程序向表中写入和删除数据

副本数修改

新建另一张表test_1,修改副本数为1,连续5次查询结果都为 0

CREATE TABLE `test_1` (
  `pmis_code` VARCHAR(255) NULL COMMENT 'ID',
  `end_time` DATETIME NULL COMMENT '结束时间',
  `start_time` DATETIME NULL COMMENT '开始时间',
  `p_day` VARCHAR(32) NULL COMMENT '日期',
  `data_value` INT NULL COMMENT '状态',
  `point_name` VARCHAR(255) NULL COMMENT '点位名称',
  `source_code` VARCHAR(255) NULL COMMENT '来源',
  `site_code` VARCHAR(255) NULL COMMENT '站点编码',
  `site_name` VARCHAR(255) NULL COMMENT '站点名称',
  `load_time` DATETIME NULL COMMENT '加载时间',
  INDEX id_end_time (`end_time`) USING INVERTED,
  INDEX id_pmis_code (`pmis_code`) USING INVERTED
) ENGINE=OLAP
UNIQUE KEY(`pmis_code`, `end_time`)
COMMENT 'test'
DISTRIBUTED BY HASH(`pmis_code`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"min_load_replica_num" = "-1",
"is_being_synced" = "false",
"storage_medium" = "hdd",
"storage_format" = "V2",
"inverted_index_storage_format" = "V1",
"enable_unique_key_merge_on_write" = "true",
"light_schema_change" = "true",
"disable_auto_compaction" = "false",
"enable_single_replica_compaction" = "false",
"group_commit_interval_ms" = "10000",
"group_commit_data_bytes" = "134217728",
"enable_mow_light_delete" = "false"
);




SELECT COUNT(1) AS num 
FROM (
SELECT 
	pmis_code 
	,start_time 
	,end_time 
	,data_value
	,LEAD(start_time ,1,'2099-12-31 23:59:59') OVER(PARTITION BY pmis_code ORDER BY start_time) AS time_n
	,load_time
FROM test_1
ORDER BY start_time DESC 
) a WHERE a.end_time <> time_n;

表模型修改

test_2,表模型修改为明细表,连续5次查询结果都为0

CREATE TABLE `test_2` (
  `pmis_code` VARCHAR(255) NULL,
  `end_time` DATETIME NULL,
  `start_time` DATETIME NULL,
  `p_day` VARCHAR(32) NULL,
  `data_value` INT NULL,
  `point_name` VARCHAR(255) NULL,
  `source_code` VARCHAR(255) NULL,
  `site_code` VARCHAR(255) NULL,
  `site_name` VARCHAR(255) NULL,
  `load_time` DATETIME NULL
) ENGINE=OLAP
DUPLICATE KEY(`pmis_code`)
DISTRIBUTED BY HASH(`pmis_code`) BUCKETS 10
PROPERTIES (
"replication_allocation" = "tag.location.default: 3",
"min_load_replica_num" = "-1",
"is_being_synced" = "false",
"storage_medium" = "hdd",
"storage_format" = "V2",
"inverted_index_storage_format" = "V1",
"light_schema_change" = "true",
"disable_auto_compaction" = "false",
"enable_single_replica_compaction" = "false",
"group_commit_interval_ms" = "10000",
"group_commit_data_bytes" = "134217728"
);



SELECT COUNT(1) AS num 
FROM (
SELECT 
	pmis_code 
	,start_time 
	,end_time 
	,data_value
	,LEAD(start_time ,1,'2099-12-31 23:59:59') OVER(PARTITION BY pmis_code ORDER BY start_time) AS time_n
	,load_time
FROM test_2
ORDER BY start_time DESC 
) a WHERE a.end_time <> time_n;

问题

同样的数据,保存到3副本的主键模型表中,查询到的结果不一致,结果返回存在异常

1 Answers
  1. 确定下具体的版本号:show backends; show frontends; 看下的。
  2. 如果是2.1.8 之前的版本,可以先干关闭local shuffle试试的:set experimental_enable_local_shuffle = false;

如果还是稳定复现的话,麻烦加我主页微信,我们一起排查下的