Some replicas fail to cool down during hot/cold tiering


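Doris version: 2.1.6
Problem description:
While testing hot/cold tiering, one replica never cools down; see the last row of the screenshot below.
img_v3_02he_9e34028b-bdbb-4682-9cd6-ce916e1adcfg.jpg
The BE log shows the following errors: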

W20241203 01:03:52.953462 96261 status.h:413] meet error status: [INTERNAL_ERROR]Read hdfs file failed. (BE: 172.16.24.145) namenode:/user/serv-mpp-prd/, path:data/80516524/80516525.0.meta, err: (255), Unknown error 255), reason: IOException: Blocklist for /data/80516524/80516525.0.meta has changed!
	0#  doris::io::HdfsFileReader::read_at_impl(unsigned long, doris::Slice, unsigned long*, doris::io::IOContext const*) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
	1#  doris::io::FileReader::read_at(unsigned long, doris::Slice, unsigned long*, doris::io::IOContext const*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:491
	2#  doris::Tablet::_read_cooldown_meta(std::shared_ptr<doris::io::RemoteFileSystem> const&, doris::TabletMetaPB*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:491
	3#  doris::Tablet::_follow_cooldowned_data() at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:491
	4#  doris::Tablet::cooldown(std::shared_ptr<doris::Rowset>) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:491
	5#  std::_Function_handler<void (), doris::StorageEngine::_cooldown_tasks_producer_callback()::$_1>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701
	6#  doris::WorkThreadPool<true>::work_thread(int) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/atomic_base.h:646
	7#  execute_native_thread_routine at /data/gcc-11.1.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:85
	8#  start_thread
	9#  __clone
W20241203 01:03:52.953506 96261 file_reader.cpp:36] [INTERNAL_ERROR]Read hdfs file failed. (BE: 172.16.24.145) namenode:/user/serv-mpp-prd/, path:data/80516524/80516525.0.meta, err: (255), Unknown error 255), reason: IOException: Blocklist for /data/80516524/80516525.0.meta has changed!

	0#  doris::io::HdfsFileReader::read_at_impl(unsigned long, doris::Slice, unsigned long*, doris::io::IOContext const*) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
	1#  doris::io::FileReader::read_at(unsigned long, doris::Slice, unsigned long*, doris::io::IOContext const*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:491
	2#  doris::Tablet::_read_cooldown_meta(std::shared_ptr<doris::io::RemoteFileSystem> const&, doris::TabletMetaPB*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:491
	3#  doris::Tablet::_follow_cooldowned_data() at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:491
	4#  doris::Tablet::cooldown(std::shared_ptr<doris::Rowset>) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:491
	5#  std::_Function_handler<void (), doris::StorageEngine::_cooldown_tasks_producer_callback()::$_1>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701
	6#  doris::WorkThreadPool<true>::work_thread(int) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/atomic_base.h:646
	7#  execute_native_thread_routine at /data/gcc-11.1.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:85
	8#  start_thread
	9#  __clone
W20241203 01:03:52.953545 96261 olap_server.cpp:1176] failed to cooldown, tablet: 80516524 err: [INTERNAL_ERROR]cannot read cooldown meta
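
To check whether other tablets hit the same failure, the "failed to cooldown" lines can be pulled out of the BE log. This is a minimal sketch that assumes the line format matches the one quoted above; the function name `failed_cooldowns` and the regular expression are illustrative and not part of Doris.

```python
import re

# Pattern modeled on the single "failed to cooldown" line quoted above.
# Assumption: other occurrences in the BE log follow the same layout.
COOLDOWN_FAIL = re.compile(r"failed to cooldown, tablet: (\d+) err: (.*)$")


def failed_cooldowns(lines):
    """Return {tablet_id: [error messages]} for every matching log line."""
    failures = {}
    for line in lines:
        match = COOLDOWN_FAIL.search(line)
        if match:
            tablet_id = int(match.group(1))
            failures.setdefault(tablet_id, []).append(match.group(2).strip())
    return failures


sample = [
    "W20241203 01:03:52.953545 96261 olap_server.cpp:1176] failed to cooldown, "
    "tablet: 80516524 err: [INTERNAL_ERROR]cannot read cooldown meta",
    "I20241203 01:03:53.000000 96261 some_file.cpp:1] unrelated line",
]
print(failed_cooldowns(sample))
```

In practice the `lines` argument could be an open file handle on the BE warning log, so the whole file is scanned line by line.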

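Querying the compaction status returns "_input_rowset_size() is 1". Triggering the compaction manually still reports the same error.
image.png
How can this be resolved? Is compaction interfering with the cooldown?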

1 Answer

It looks like the BE itself corrupted the data, leaving only an empty tablet directory. Can this data still be queried?