Doris returns inconsistent results on every query


Same content as Issue: https://github.com/apache/doris/issues/29778

Table creation statement:

-- origin_events.origin_aiot_narwal_bury_robot_stats_mem_event definition

CREATE TABLE `origin_aiot_narwal_bury_robot_stats_mem_event` (
  `eventId` VARCHAR(30) NULL,
  `robotId` VARCHAR(100) NULL,
  `time` VARCHAR(50) NULL,
  `dt` DATE NULL,
  `productId` VARCHAR(100) NULL,
  `robotVer` VARCHAR(100) NULL,
  `env` VARCHAR(10) NULL,
  `rTime` VARCHAR(50) NULL,
  `kafkaTime` VARCHAR(50) NULL,
  `groupName` VARCHAR(100) NULL,
  `message` TEXT NULL
) ENGINE=OLAP
UNIQUE KEY(`eventId`, `robotId`, `time`, `dt`)
COMMENT '-mem'
PARTITION BY RANGE(`dt`)
(PARTITION p20240821 VALUES [('2024-08-21'), ('2024-08-22')),
PARTITION p20240822 VALUES [('2024-08-22'), ('2024-08-23')),
PARTITION p20240823 VALUES [('2024-08-23'), ('2024-08-24')),
PARTITION p20240824 VALUES [('2024-08-24'), ('2024-08-25')),
PARTITION p20240825 VALUES [('2024-08-25'), ('2024-08-26')),
PARTITION p20240826 VALUES [('2024-08-26'), ('2024-08-27')),
PARTITION p20240827 VALUES [('2024-08-27'), ('2024-08-28')),
PARTITION p20240828 VALUES [('2024-08-28'), ('2024-08-29')),
PARTITION p20240829 VALUES [('2024-08-29'), ('2024-08-30')),
PARTITION p20240830 VALUES [('2024-08-30'), ('2024-08-31')),
PARTITION p20240831 VALUES [('2024-08-31'), ('2024-09-01')))
DISTRIBUTED BY HASH(`robotId`) BUCKETS 32
PROPERTIES (
"replication_allocation" = "tag.location.default: 2",
"min_load_replica_num" = "2",
"is_being_synced" = "false",
"dynamic_partition.enable" = "true",
"dynamic_partition.time_unit" = "DAY",
"dynamic_partition.time_zone" = "Asia/Shanghai",
"dynamic_partition.start" = "-7",
"dynamic_partition.end" = "3",
"dynamic_partition.prefix" = "p",
"dynamic_partition.replication_allocation" = "tag.location.default: 3",
"dynamic_partition.buckets" = "32",
"dynamic_partition.create_history_partition" = "true",
"dynamic_partition.history_partition_num" = "-1",
"dynamic_partition.hot_partition_num" = "0",
"dynamic_partition.reserved_history_periods" = "NULL",
"dynamic_partition.storage_policy" = "",
"storage_medium" = "hdd",
"storage_format" = "V2",
"inverted_index_storage_format" = "V1",
"enable_unique_key_merge_on_write" = "true",
"light_schema_change" = "true",
"compaction_policy" = "time_series",
"time_series_compaction_goal_size_mbytes" = "1024",
"time_series_compaction_file_count_threshold" = "2000",
"time_series_compaction_time_threshold_seconds" = "3600",
"time_series_compaction_empty_rowsets_threshold" = "5",
"time_series_compaction_level_threshold" = "1",
"disable_auto_compaction" = "false",
"enable_single_replica_compaction" = "true",
"group_commit_interval_ms" = "10000",
"group_commit_data_bytes" = "134217728"
);

Query:
SELECT count(*)
FROM origin_aiot_narwal_bury_robot_stats_mem_event
where dt='2024-08-26';

The query returns a different count every time it runs.

4 Answers

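1. Run set use_fix_replica = 1 (use only replica 1) and check whether the query results become consistent.
2. Common causes include insufficient file descriptors (fd), network communication errors, or high cluster load, any of which can break data synchronization between replicas.
3. If you know the tablet id, follow this replica troubleshooting guide: Data Replica Troubleshooting Guide (数据副本问题排查指南). If you do not, run ADMIN SHOW REPLICA STATUS FROM {table name} and compare the versions across replicas of the same tablet; a mismatch indicates a synchronization problem.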
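
A minimal sketch of checks 1 and 3 against the table from the question. It assumes, per my reading of the Doris documentation, that use_fix_replica picks a replica by its position in the sorted replica list (starting at 0) and that -1 is the default, meaning no pinning. The partition name p20240826 is taken from the DDL in the question. If the counts differ between pinned replicas, the replicas themselves disagree.

```sql
-- Check 1: pin the session to one replica at a time and compare the counts.
SET use_fix_replica = 0;
SELECT count(*)
FROM origin_aiot_narwal_bury_robot_stats_mem_event
WHERE dt = '2024-08-26';

SET use_fix_replica = 1;
SELECT count(*)
FROM origin_aiot_narwal_bury_robot_stats_mem_event
WHERE dt = '2024-08-26';

-- Restore the default (no replica pinning) when done.
SET use_fix_replica = -1;

-- Check 3: list replica state for the affected partition only.
-- Compare the Version column across rows that share a TabletId.
ADMIN SHOW REPLICA STATUS
FROM origin_aiot_narwal_bury_robot_stats_mem_event
PARTITION (p20240826);

-- Optionally show only replicas that Doris itself reports as unhealthy.
ADMIN SHOW REPLICA STATUS
FROM origin_aiot_narwal_bury_robot_stats_mem_event
WHERE STATUS != "OK";
```

Running the pinned queries several times each helps separate two cases: a stable but different count per replica points at replica divergence, while a count that still changes on a single pinned replica points elsewhere.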

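Option "1" is not an elegant fix; it defeats the purpose of configuring multiple replicas.
"2" does not apply; I have none of those problems.
"3": the issue described in the referenced article was fixed in 2.0, and I am using 2.1.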

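I ran into the same problem.
Doris version 2.1.5

1. Using set use_fix_replica = 1 did not resolve the problem.
2. The cluster and the network are both healthy.
3. I also ran ADMIN SHOW REPLICA STATUS FROM {table name}, and the versions are all consistent. Even so, the query result alternates between one number and another.

Could someone please explain how to resolve this? Thanks.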

Same problem here.
Doris version 2.1.5