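When I use CCR to back up a database of 567.260 GB with more than 400,000 tablets, the master FE crashes. As far as I can tell, it always happens during the backup phase. The backup repository is `__keep_on_local__`. Is the FE crashing because the data volume is too large?
The FE exception log is as follows: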
2024-11-07 19:51:44,445 ERROR (backupHandler|38) [BDBJEJournal.write():183] catch an exception when writing to database. sleep and retry. journal id 12464543
com.sleepycat.je.rep.InsufficientAcksException: (JE 18.3.12) Transaction: -15971464 VLSN: 28,436,577, initiated at: 19:51:27. Insufficient acks for policy:SIMPLE_MAJORITY. Need replica acks: 1. Missing replica acks: 1. Timeout: 10000ms. FeederState=fe_5445e3ca_a329_47cb_bb7c_75d02e2eb6d8(6)[MASTER]
Current feeds:
fe_cbe9f1a6_c9df_4106_a0f5_55778ec8b0b3: feederVLSN=28,436,578 replicaTxnEndVLSN=28,436,575
fe_385c9686_c49e_4233_b4d3_a7da1581021d: feederVLSN=28,436,578 replicaTxnEndVLSN=28,436,575
at com.sleepycat.je.rep.impl.node.DurabilityQuorum.ensureSufficientAcks(DurabilityQuorum.java:205) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.rep.stream.FeederTxns.awaitReplicaAcks(FeederTxns.java:188) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHookInternal(RepImpl.java:1444) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHook(RepImpl.java:1403) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.rep.txn.MasterTxn.postLogCommitHook(MasterTxn.java:228) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.txn.Txn.commit(Txn.java:778) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.txn.Txn.commit(Txn.java:631) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.txn.Txn.operationEnd(Txn.java:1773) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.Database.put(Database.java:1638) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]