[Missing key information] Doris 2.1.3 FE crashes

While using Doris v2.1.3 with connections over Flight SQL, the FE crashes from time to time. The FE log is as follows:
2024-06-26 15:31:13,419 WARN (connect-scheduler-check-timer-0|107) [ResultReceiver.updateCancelReason():203] Query 42e06c1392db4b0a-b148c6753e8264db already has cancel reason: INTERNAL_ERROR, new reason TIMEOUT will be ignored
2024-06-26 15:31:13,419 WARN (connect-scheduler-check-timer-0|107) [Coordinator$BackendExecState.cancelFragmentInstance():3115] cancelRemoteFragments initiated=true done=false backend: 12547, fragment instance id=42e06c1392db4b0a-b148c6753e8264e6, reason: TIMEOUT
2024-06-26 15:31:13,419 WARN (connect-scheduler-check-timer-0|107) [Coordinator$BackendExecState.cancelFragmentInstance():3115] cancelRemoteFragments initiated=true done=false backend: 12547, fragment instance id=42e06c1392db4b0a-b148c6753e8264dc, reason: TIMEOUT
2024-06-26 15:31:13,419 WARN (connect-scheduler-check-timer-0|107) [Coordinator$BackendExecState.cancelFragmentInstance():3115] cancelRemoteFragments initiated=true done=false backend: 12567, fragment instance id=42e06c1392db4b0a-b148c6753e8264dd, reason: TIMEOUT
2024-06-26 15:31:13,419 WARN (connect-scheduler-check-timer-0|107) [Coordinator$BackendExecState.cancelFragmentInstance():3115] cancelRemoteFragments initiated=true done=false backend: 12568, fragment instance id=42e06c1392db4b0a-b148c6753e8264de, reason: TIMEOUT
2024-06-26 15:31:13,419 WARN (connect-scheduler-check-timer-0|107) [Coordinator$BackendExecState.cancelFragmentInstance():3115] cancelRemoteFragments initiated=true done=false backend: 12546, fragment instance id=42e06c1392db4b0a-b148c6753e8264df, reason: TIMEOUT
2024-06-26 15:31:13,419 WARN (connect-scheduler-check-timer-0|107) [Coordinator$BackendExecState.cancelFragmentInstance():3115] cancelRemoteFragments initiated=true done=false backend: 12565, fragment instance id=42e06c1392db4b0a-b148c6753e8264e0, reason: TIMEOUT
2024-06-26 15:31:13,420 WARN (connect-scheduler-check-timer-0|107) [Coordinator$BackendExecState.cancelFragmentInstance():3115] cancelRemoteFragments initiated=true done=false backend: 12474, fragment instance id=42e06c1392db4b0a-b148c6753e8264e1, reason: TIMEOUT
2024-06-26 15:31:13,420 WARN (connect-scheduler-check-timer-0|107) [Coordinator$BackendExecState.cancelFragmentInstance():3115] cancelRemoteFragments initiated=true done=false backend: 12559, fragment instance id=42e06c1392db4b0a-b148c6753e8264e2, reason: TIMEOUT
2024-06-26 15:31:13,420 WARN (connect-scheduler-check-timer-0|107) [Coordinator$BackendExecState.cancelFragmentInstance():3115] cancelRemoteFragments initiated=true done=false backend: 12545, fragment instance id=42e06c1392db4b0a-b148c6753e8264e3, reason: TIMEOUT
2024-06-26 15:31:13,420 WARN (connect-scheduler-check-timer-0|107) [Coordinator$BackendExecState.cancelFragmentInstance():3115] cancelRemoteFragments initiated=true done=false backend: 12571, fragment instance id=42e06c1392db4b0a-b148c6753e8264e4, reason: TIMEOUT
2024-06-26 15:31:13,420 WARN (connect-scheduler-check-timer-0|107) [Coordinator$BackendExecState.cancelFragmentInstance():3115] cancelRemoteFragments initiated=true done=false backend: 12566, fragment instance id=42e06c1392db4b0a-b148c6753e8264e5, reason: TIMEOUT
2024-06-26 15:31:13,420 WARN (connect-scheduler-check-timer-0|107) [ConnectContext.checkTimeout():922] kill wait timeout connection, remote: 0.0.0.0:0, wait timeout: 28800
2024-06-26 15:31:13,420 WARN (connect-scheduler-check-timer-0|107) [ConnectContext.killByTimeout():888] kill query from 0.0.0.0:0, kill mysql connection: true reason time out
2024-06-26 15:31:13,420 WARN (connect-scheduler-check-timer-0|107) [Coordinator.cancel():1473] Query f6d549eba4f441c8-831734699f019a6e already in abnormal status Status [errorCode=CANCELLED, errorMsg=cancelled], but received cancel again,so that send cancel to BE again
java.lang.Exception: null
at org.apache.doris.qe.Coordinator.cancel(Coordinator.java:1475) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.StmtExecutor.cancel(StmtExecutor.java:1408) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.ConnectContext.killByTimeout(ConnectContext.java:900) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.ConnectContext.checkTimeout(ConnectContext.java:944) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.ConnectScheduler$TimeoutChecker.run(ConnectScheduler.java:74) ~[doris-fe.jar:1.2-SNAPSHOT]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_181]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[?:1.8.0_181]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_181]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]

2 Answers

Please provide more of the fe.log and fe.out logs. From what is shown here, this does not appear to be related to Arrow Flight.

Please run show frontends; and post the output so we can check the exact build of 2.1.3 you are running.