Data restore fails on version 2.1.5

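  1. Version: 2.1.5
  2. Deployment method: K8S
  3. Database origin: most of the data was restored from a backup of another database; a few tables were created afterwards
  4. BE log: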
RuntimeLogger W20240904 09:06:41.216909  1377 task_worker_pool.cpp:1125] failed to make snapshot|signature=172344|tablet_id=17573|version=2|error=[E-242]fail to create hard link. from=/opt/apache-doris/be/storage/data/48/17573/830909669/0200000000000bcbbc4fbd9b613dd40a227a2d701459d6b5_0.dat, to=/opt/apache-doris/be/storage/snapshot/20240904090641.745572.86400/17573/830909669/0200000000000bcbbc4fbd9b613dd40a227a2d701459d6b5_0.dat, errno=2

        0#  doris::BetaRowset::link_files_to(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, doris::RowsetId, unsigned long, std::set<long, std::less<long>, std::allocator<long> >*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:377
        1#  doris::SnapshotManager::_create_snapshot_files(std::shared_ptr<doris::Tablet> const&, doris::TSnapshotRequest const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:377
        2#  doris::SnapshotManager::make_snapshot(doris::TSnapshotRequest const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:377
        3#  doris::make_snapshot_callback(doris::StorageEngine&, doris::TAgentTaskRequest const&) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:481
        4#  std::_Function_handler<void (), doris::TaskWorkerPool::submit_task(doris::TAgentTaskRequest const&)::$_0::operator()<doris::TAgentTaskRequest const&>(doris::TAgentTaskRequest const&) const::{lambda()#1}>::_M_invoke(std::_Any_data const&) at /home/zcp/repo_center/doris_release/doris/be/src/agent/task_worker_pool.cpp:440
        5#  doris::ThreadPool::dispatch_thread() at /home/zcp/repo_center/doris_release/doris/be/src/util/threadpool.cpp:0
        6#  doris::Thread::supervise_thread(void*) at /var/local/ldb-toolchain/bin/../usr/include/pthread.h:562
        7#  ?
        8#  __clone
RuntimeLogger I20240904 09:06:41.217204  1377 task_worker_pool.cpp:1102] get snapshot task. signature=172361
RuntimeLogger I20240904 09:06:41.217237  1377 snapshot_manager.cpp:372] receive a make snapshot request, request detail is TSnapshotRequest {
  01: tablet_id (i64) = 17594,
  02: schema_hash (i32) = 821121698,
  03: version (i64) = 2,
  05: timeout (i64) = 86400,
  07: list_files (bool) = true,
  09: preferred_snapshot_version (i32) = 4,
  10: is_copy_tablet_task (bool) = false,
}, snapshot_version is 4
RuntimeLogger W20240904 09:06:41.217406  1380 local_file_system.cpp:253] [NOT_FOUND]failed to create hard link from /opt/apache-doris/be/storage/data/52/17588/821121698/0200000000000bd1bc4fbd9b613dd40a227a2d701459d6b5_0.dat to /opt/apache-doris/be/storage/snapshot/20240904090641.745576.86400/17588/821121698/0200000000000bd1bc4fbd9b613dd40a227a2d701459d6b5_0.dat
RuntimeLogger W20240904 09:06:41.217442  1380 status.h:412] meet error status: [E-242]fail to create hard link. from=/opt/apache-doris/be/storage/data/52/17588/821121698/0200000000000bd1bc4fbd9b613dd40a227a2d701459d6b5_0.dat, to=/opt/apache-doris/be/storage/snapshot/20240904090641.745576.86400/17588/821121698/0200000000000bd1bc4fbd9b613dd40a227a2d701459d6b5_0.dat, errno=2

        0#  doris::BetaRowset::link_files_to(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, doris::RowsetId, unsigned long, std::set<long, std::less<long>, std::allocator<long> >*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:377
        1#  doris::SnapshotManager::_create_snapshot_files(std::shared_ptr<doris::Tablet> const&, doris::TSnapshotRequest const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:377
        2#  doris::SnapshotManager::make_snapshot(doris::TSnapshotRequest const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:377
        3#  doris::make_snapshot_callback(doris::StorageEngine&, doris::TAgentTaskRequest const&) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:481
        4#  std::_Function_handler<void (), doris::TaskWorkerPool::submit_task(doris::TAgentTaskRequest const&)::$_0::operator()<doris::TAgentTaskRequest const&>(doris::TAgentTaskRequest const&) const::{lambda()#1}>::_M_invoke(std::_Any_data const&) at /home/zcp/repo_center/doris_release/doris/be/src/agent/task_worker_pool.cpp:440
        5#  doris::ThreadPool::dispatch_thread() at /home/zcp/repo_center/doris_release/doris/be/src/util/threadpool.cpp:0
        6#  doris::Thread::supervise_thread(void*) at /var/local/ldb-toolchain/bin/../usr/include/pthread.h:562
        7#  ?
        8#  __clone
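
A note on reading the error above: on Linux, `errno=2` is `ENOENT` ("No such file or directory"), which agrees with the `[NOT_FOUND]` tag in the `local_file_system.cpp` line. A hard link can fail with `ENOENT` either because the source file is missing or because the destination's parent directory is missing, so the log line alone does not say which side is absent. The sketch below reproduces both cases with Python's `os.link` in a temporary directory. The file names are made up for illustration and it does not touch any Doris path.

```python
import errno
import os
import tempfile


def try_hard_link(src: str, dst: str) -> int:
    """Try to hard-link src to dst; return 0 on success, else the errno."""
    try:
        os.link(src, dst)
        return 0
    except OSError as exc:
        return exc.errno


with tempfile.TemporaryDirectory() as root:
    # Hypothetical stand-in for a rowset segment file.
    existing = os.path.join(root, "segment_0.dat")
    with open(existing, "wb") as f:
        f.write(b"data")

    # Case 1: source exists and destination directory exists.
    ok = try_hard_link(existing, os.path.join(root, "linked_0.dat"))

    # Case 2: source file is missing.
    missing_src = try_hard_link(
        os.path.join(root, "gone_0.dat"), os.path.join(root, "x.dat")
    )

    # Case 3: source exists but the destination directory does not.
    missing_dir = try_hard_link(
        existing, os.path.join(root, "no_such_dir", "x.dat")
    )

print(ok, missing_src, missing_dir, errno.ENOENT)
```

Cases 2 and 3 both report `ENOENT`, so it is worth checking on the BE whether the `from=` file under `storage/data` still exists and whether the `to=` directory under `storage/snapshot` was created.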
6 Answers

I'm running into the same problem. Could someone help work out what causes it and how to fix it?

I've hit this issue too and would like an answer as well.

Could you also post the steps you ran? And does fe.log show any errors while the restore is running?

Backup steps:

BACKUP SNAPSHOT doris_db.snapshot_20240903
TO minio_repo
PROPERTIES ("type" = "full");

RESTORE SNAPSHOT doris_db.snapshot_20240903
FROM minio_repo
PROPERTIES ("backup_timestamp" = "2024-09-03-18-11-02", "replication_num" = "2");

FE log:

RuntimeLogger 2024-09-04 15:11:15,765 INFO (backupHandler|17) [RestoreJob.run():392] run restore job: RESTORE repo id: 11004, label: snapshot_20240903, job id: 197808, db id: 15594, db name: doris_db, status: [OK], timeout: 86400000, backup ts: 2024-09-03-18-11-02, state: SNAPSHOTING
RuntimeLogger 2024-09-04 15:11:15,766 INFO (backupHandler|17) [RestoreJob.waitingAllSnapshotsFinished():1357] waiting 293 replicas to make snapshot: [{197821=53295, 197828=53286, 197835=53298, 197834=53283, 197832=16427, 197842=16421, 197845=16436, 197855=53292, 197854=53289, 197853=53280, 197852=53277, 197883=16469, 197886=16484, 197888=16487, 197893=16478, 197918=16541, 197916=16529, 197922=16547, 197921=16538, 197931=16565, 197933=16577, 197932=16568, 197939=16559, 197943=16586, 197944=16595, 197950=16604, 197948=16592, 197956=16583, 197965=16616, 198008=16697, 198019=16694, 198047=16760, 198044=16745, 198049=16754, 198053=16814, 198059=16781, 198058=16772, 198061=16793, 198060=16784, 198067=16775, 198065=16787, 198064=16778, 198070=16766, 198068=16790, 198072=16802, 198078=16817, 198076=16805, 198086=16847, 198091=16829, 198089=16835, 198093=16841, 198111=16853, 198119=16886, 198117=49671, 198123=16883, 198121=49659, 198120=49650, 198127=49653, 198126=16862, 198124=16898, 198130=49668, 198129=49665, 198128=49656, 198152=16916, 198163=16895, 198162=49662, 198167=16940, 198168=16949, 198179=16934, 198176=16946, 198180=16943, 198190=16973, 198189=16964, 198188=16961, 198198=16982, 198197=16967, 198206=17009, 198204=16997, 198210=17003, 198215=53868, 198214=53865, 198218=53902, 198217=53880, 198216=53877, 198225=53874, 198224=53859, 198228=53862, 198245=53899, 198251=53871, 198255=53908, 198254=53905, 198253=53896, 198252=53893, 198264=17042, 198270=17060, 198268=17048, 198277=17078, 198280=17090, 198287=17069, 198291=17054, 198289=17081, 198300=17105, 198310=135926, 198314=17132, 198316=17144, 198323=17138, 198327=135920, 198325=135938, 198324=135923, 198330=135941, 198329=135932, 198328=135929, 198333=17126, 198352=17150, 198359=135935, 198357=17174, 198394=17258, 198403=17249, 198402=17240, 198411=17246, 198423=17303, 198427=17300, 198428=17309, 198435=17291, 198442=17333, 198441=17324, 198451=17339, 198453=17318, 198467=17390, 198475=54229, 198474=54220, 
198477=54241, 198476=54232, 198482=54235, 198481=54226, 198486=54223, 198494=17384, 198498=17378, 198501=54251, 198511=54260, 198510=54257, 198509=54248, 198508=17375, 198516=54263, 198527=54238, 198525=17417, 198524=17408, 198535=54254, 198534=17414, 198533=17399, 198551=17462, 198566=17468, 198564=17456, 198583=17498, 198589=17480, 198595=17495, 198599=17486, 198598=17471, 198605=17528, 198611=17507, 198619=17552, 198618=17549, 198617=17540, 198631=17534, 198632=17582, 198639=17561, 198642=17576, 198641=17573, 198640=17564, 198647=17567, 198645=17579, 198644=17570, 198654=17600, 198652=17588, 198658=17594, 198662=17606, 198671=17633, 198670=17624, 198678=17615, 198693=17630, 198696=17675, 198707=17681, 198706=17672, 198710=17687, 198731=50507, 198730=50498, 198728=50510, 198735=50504, 198734=50501, 198737=50516, 198736=50513, 198743=50495, 198747=17768, 198744=17753, 198754=17762, 198767=17780, 198769=17792, 198774=17783, 198772=17795, 198783=17825, 198782=17816, 198781=17813, 198780=17804, 198787=17819, 198786=17810, 198789=17807, 198798=17849, 198797=17840, 198806=17846, 198805=17831, 198809=132954, 198821=132951, 198865=154823, 198875=154832, 198874=154829, 198873=154820, 198872=154817, 198879=154814, 198878=154826, 198895=15626, 198899=15632, 198901=15644, 198900=15641, 198911=15695, 198913=15674, 198912=15659, 198923=15665, 198922=15656, 198927=52539, 198926=52530, 198935=52548, 198934=52545, 198933=52536, 198932=52533, 198941=52527, 198946=15698, 198945=15683, 198954=15689, 198959=52542, 198961=15719, 198998=15761, 199000=15773, 199006=15767, 199004=15779, 199031=15833, 199061=15890, 199060=15875, 199070=15881, 199069=15872, 199072=15899, 199083=15905, 199086=15920, 199085=15917, 199091=15950, 199089=15911, 199099=15929, 199102=15944, 199101=15941, 199100=15932, 199107=15935, 199105=15947, 199104=15938, 199157=16058, 199167=16049, 199166=16040, 199175=16082, 199176=16046, 199183=16073, 199182=16064, 199184=16076, 199189=16103, 199195=16112, 199193=16100, 
199192=16097, 199203=16094, 199205=16124, 199204=16121, 199243=16193, 199241=16181, 199240=16172, 199247=16187, 199252=16238, 199275=16232, 199274=16229, 199277=16244, 199301=16298, 199307=16280, 199315=16286, 199313=16295, 199355=16364, 199359=16370, 199356=16373, 199360=16379}]. RESTORE repo id: 11004, label: snapshot_20240903, job id: 197808, db id: 15594, db name: doris_db, status: [OK], timeout: 86400000, backup ts: 2024-09-03-18-11-02, state: SNAPSHOTING
RuntimeLogger 2024-09-04 15:11:15,919 INFO (InsertOverwriteDropDirtyPartitions|61) [InsertOverwriteManager.runAfterCatalogReady():260] start clean insert overwrite temp partitions
RuntimeLogger 2024-09-04 15:11:16,643 INFO (thrift-server-pool-9|5343) [ReportHandler.handleReport():206] receive report from be 10003. type: TASK, current queue size: 1
RuntimeLogger 2024-09-04 15:11:16,643 INFO (report-thread|199) [ReportHandler.taskReport():582] finished to handle task report from backend 10003, diff task num: 123. cost: 0 ms
RuntimeLogger 2024-09-04 15:11:16,652 WARN (thrift-server-pool-2|247) [MasterImpl.finishTask():94] finish task reports bad. request: TFinishTaskRequest(backend:TBackend(host:doriscluster-be-1.doriscluster-be-internal.default.svc.cluster.local, be_port:9060, http_port:8040, brpc_port:8060, id:10003), task_type:MAKE_SNAPSHOT, signature:198735, task_status:TStatus(status_code:INTERNAL_ERROR, error_msgs:[(doriscluster-be-1.doriscluster-be-internal.default.svc.cluster.local)[E-242]fail to create hard link. from=/opt/apache-doris/be/storage/data/13/50504/1117081215/020000000000003c9b42c4a0ee6aec70217f9762226e1ca3_0.dat, to=/opt/apache-doris/be/storage/snapshot/20240904151116.547932.86400/50504/1117081215/020000000000003c9b42c4a0ee6aec70217f9762226e1ca3_0.dat, errno=2]), snapshot_path:, snapshot_files:[])
RuntimeLogger 2024-09-04 15:11:16,652 WARN (thrift-server-pool-0|245) [MasterImpl.finishTask():94] finish task reports bad. request: TFinishTaskRequest(backend:TBackend(host:doriscluster-be-1.doriscluster-be-internal.default.svc.cluster.local, be_port:9060, http_port:8040, brpc_port:8060, id:10003), task_type:MAKE_SNAPSHOT, signature:198731, task_status:TStatus(status_code:INTERNAL_ERROR, error_msgs:[(doriscluster-be-1.doriscluster-be-internal.default.svc.cluster.local)[E-242]fail to create hard link. from=/opt/apache-doris/be/storage/data/14/50507/1117081215/02000000000000399b42c4a0ee6aec70217f9762226e1ca3_0.dat, to=/opt/apache-doris/be/storage/snapshot/20240904151116.547930.86400/50507/1117081215/02000000000000399b42c4a0ee6aec70217f9762226e1ca3_0.dat, errno=2]), snapshot_path:, snapshot_files:[])
RuntimeLogger 2024-09-04 15:11:16,653 WARN (thrift-server-pool-1|246) [MasterImpl.finishTask():94] finish task reports bad. request: TFinishTaskRequest(backend:TBackend(host:doriscluster-be-1.doriscluster-be-internal.default.svc.cluster.local, be_port:9060, http_port:8040, brpc_port:8060, id:10003), task_type:MAKE_SNAPSHOT, signature:198728, task_status:TStatus(status_code:INTERNAL_ERROR, error_msgs:[(doriscluster-be-1.doriscluster-be-internal.default.svc.cluster.local)[E-242]fail to create hard link. from=/opt/apache-doris/be/storage/data/15/50510/1117081215/020000000000003a9b42c4a0ee6aec70217f9762226e1ca3_0.dat, to=/opt/apache-doris/be/storage/snapshot/20240904151116.547933.86400/50510/1117081215/020000000000003a9b42c4a0ee6aec70217f9762226e1ca3_0.dat, errno=2]), snapshot_path:, snapshot_files:[])
RuntimeLogger 2024-09-04 15:11:16,654 WARN (thrift-server-pool-10|5344) [MasterImpl.finishTask():94] finish task reports bad. request: TFinishTaskRequest(backend:TBackend(host:doriscluster-be-1.doriscluster-be-internal.default.svc.cluster.local, be_port:9060, http_port:8040, brpc_port:8060, id:10003), task_type:MAKE_SNAPSHOT, signature:198730, task_status:TStatus(status_code:INTERNAL_ERROR, error_msgs:[(doriscluster-be-1.doriscluster-be-internal.default.svc.cluster.local)[E-242]fail to create hard link. from=/opt/apache-doris/be/storage/data/11/50498/1117081215/020000000000003d9b42c4a0ee6aec70217f9762226e1ca3_0.dat, to=/opt/apache-doris/be/storage/snapshot/20240904151116.547931.86400/50498/1117081215/020000000000003d9b42c4a0ee6aec70217f9762226e1ca3_0.dat, errno=2]), snapshot_path:, snapshot_files:[])
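
To narrow this down, it may help to pull the failing paths out of the logs and check them on the BE pod. The following is a minimal sketch, not an official Doris tool: the regular expression and the `LinkFailure` type are my own, and the path layout `.../data/<shard>/<tablet_id>/<schema_hash>/<file>` is inferred from the log lines in this thread rather than taken from documentation.

```python
import re
from typing import NamedTuple, Optional

# Matches the "from=..., to=..., errno=N" tail of the E-242 messages above.
LINK_ERROR = re.compile(
    r"from=(?P<src>\S+), to=(?P<dst>\S+), errno=(?P<errno>\d+)"
)


class LinkFailure(NamedTuple):
    tablet_id: int
    src: str
    dst: str
    errno: int


def parse_link_failure(line: str) -> Optional[LinkFailure]:
    """Extract the hard-link source, destination, and errno from a log line."""
    m = LINK_ERROR.search(line)
    if m is None:
        return None
    # Assumed layout: .../data/<shard>/<tablet_id>/<schema_hash>/<file>
    tablet_id = int(m.group("src").split("/")[-3])
    return LinkFailure(
        tablet_id, m.group("src"), m.group("dst"), int(m.group("errno"))
    )


# Tail of one of the FE WARN lines above.
sample = (
    "[E-242]fail to create hard link. "
    "from=/opt/apache-doris/be/storage/data/13/50504/1117081215/"
    "020000000000003c9b42c4a0ee6aec70217f9762226e1ca3_0.dat, "
    "to=/opt/apache-doris/be/storage/snapshot/20240904151116.547932.86400/"
    "50504/1117081215/020000000000003c9b42c4a0ee6aec70217f9762226e1ca3_0.dat, "
    "errno=2]), snapshot_path:, snapshot_files:[])"
)
failure = parse_link_failure(sample)
print(failure)
```

Running `os.path.exists(failure.src)` inside the BE pod for each parsed line would then show whether the rowset files that the snapshot task tries to link are actually present on disk.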

Is this still unresolved? I'm running into the same problem on my side.

Not yet. We'll have to wait; it should already be under investigation.