Doris cluster started on Docker: the master FE node goes down and reports "FE type: UNKNOWN"

Doris environment: test

Doris version: 2.0.8

Operating system: CentOS 7.9

Doris cluster: 2 machines, 2 FE + 2 BE

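Problem description:
I am running a Doris cluster on Docker. Executing show frontends fails with the error "Lost connection to MySQL server during query", and shortly afterwards the master FE container exits on its own. The logs show the following: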

2024-08-21 03:38:21,312 WARN (UNKNOWN fe_f672848a_2678_46ac_acd4_eac41f1b3352(-1)|1) [Env.notifyNewFETypeTransfer():2421] notify new FE type transfer: UNKNOWN
2024-08-21 03:38:21,337 INFO (stateListener|80) [Env$4.runOneCycle():2444] begin to transfer FE type from INIT to UNKNOWN
2024-08-21 03:38:21,338 INFO (stateListener|80) [Env$4.runOneCycle():2531] finished to transfer FE type to UNKNOWN
2024-08-21 03:38:21,438 INFO (UNKNOWN fe_f672848a_2678_46ac_acd4_eac41f1b3352(-1)|1) [Env.waitForReady():956] wait catalog to be ready. FE type: UNKNOWN. is ready: false, counter: 1
2024-08-21 03:38:23,441 INFO (UNKNOWN fe_f672848a_2678_46ac_acd4_eac41f1b3352(-1)|1) [Env.waitForReady():956] wait catalog to be ready. FE type: UNKNOWN. is ready: false, counter: 21
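
To check quickly whether an FE is still stuck in this state, the waitForReady lines above can be scanned from fe.log. The following is only a minimal sketch: the regular expression is derived solely from the two log lines quoted here (other Doris versions may word the message differently), and the function name last_wait_state is hypothetical, not part of Doris.

```python
import re

# Pattern based only on the waitForReady lines quoted in this post;
# this is an assumption about the log format, not a Doris-defined contract.
WAIT_RE = re.compile(
    r"wait catalog to be ready\. FE type: (?P<fe_type>\w+)\. "
    r"is ready: (?P<ready>true|false), counter: (?P<counter>\d+)"
)


def last_wait_state(log_lines):
    """Return (fe_type, is_ready, counter) from the last matching line, or None."""
    state = None
    for line in log_lines:
        match = WAIT_RE.search(line)
        if match:
            state = (
                match.group("fe_type"),
                match.group("ready") == "true",
                int(match.group("counter")),
            )
    return state


# The two waitForReady lines from the log excerpt above.
sample = [
    "2024-08-21 03:38:21,438 INFO (UNKNOWN fe_f672848a_2678_46ac_acd4_eac41f1b3352(-1)|1) "
    "[Env.waitForReady():956] wait catalog to be ready. FE type: UNKNOWN. is ready: false, counter: 1",
    "2024-08-21 03:38:23,441 INFO (UNKNOWN fe_f672848a_2678_46ac_acd4_eac41f1b3352(-1)|1) "
    "[Env.waitForReady():956] wait catalog to be ready. FE type: UNKNOWN. is ready: false, counter: 21",
]

print(last_wait_state(sample))
```

If the FE type stays UNKNOWN and the counter keeps growing, the FE has not obtained a role (for example MASTER or FOLLOWER), which matches the symptom described here.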

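I followed the metadata operations and maintenance section of the official documentation to repair it and switched the master FE to the other node, but after starting the cluster the same thing happened again. How should I fix this?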

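Also, if the master FE node is the one with the latest metadata, how should I repair it? The recovery documentation says to run sh bin/start_fe.sh --metadata_failure_recovery --daemon, but running that command requires stopping the Doris service first, and stopping the service inside the container causes the container to shut down.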

1 Answer

This has been fixed by this PR: https://github.com/apache/doris/pull/37335

You can first refer to this document to build a Doris image