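Has anyone run into the BE restarting in a loop after upgrading a DDC (DorisDisaggregatedCluster) from 3.0.3 to 3.0.4?
The cluster is deployed on K8s. FE and MS both start fine, and after switching back to 3.0.3 the BE starts normally.
The error is as follows: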
RuntimeLogger W20250407 12:15:06.792353 684 status.h:424] meet error status: [RUNTIME_ERROR]Could not create thread. (error 11) Resource temporarily unavailable
0# doris::Thread::start_thread(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<void ()> const&, unsigned long, scoped_refptr<doris::Thread>*) at /home/zcp/repo_center/doris_release/doris/be/src/util/thread.cpp:445
1# doris::ThreadPool::create_thread() at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:244
2# doris::ThreadPool::init() at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:502
3# doris::Status doris::ThreadPoolBuilder::build<doris::ThreadPool>(std::unique_ptr<doris::ThreadPool, std::default_delete<doris::ThreadPool> >*) const at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:502
4# doris::EvHttpServer::start() at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:360
5# doris::HttpService::start() at /home/zcp/repo_center/doris_release/doris/be/src/service/http_service.cpp:0
6# main at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:389
7# ?
8# __libc_start_main
9# _start
*** Query id: 0-0 ***
*** is nereids: 0 ***
*** tablet id: 0 ***
*** Aborted at 1743999306 (unix time) try "date -d @1743999306" if you are using GNU date ***
*** Current BE git commitID: 39f9074cec ***
*** SIGSEGV address not mapped to object (@0x8) received by PID 684 (TID 684 OR 0x7fb2013c1a00) from PID 8; stack trace: ***
RuntimeLogger I20250407 12:15:06.815495 1466 wal_manager.cpp:485] Scheduled(every 10s) WAL info: [/opt/apache-doris/be/storage/wal: limit 33838706688 Bytes, used 0 Bytes, estimated wal bytes 0 Bytes, available 33838706688 Bytes.];
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/r
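For reference, "error 11" in the first warning is EAGAIN on Linux, which thread creation returns when a thread/process limit or a resource shortage is hit. Below is a minimal sketch for printing the limits that usually matter; it assumes Python 3 is available inside the BE container (not confirmed for this image) and that the cgroup files sit at the common v1/v2 paths listed in the code. The function names are illustrative, not part of Doris or the operator.

```python
import resource
from pathlib import Path

# Candidate locations of the cgroup pids controller files (cgroup v2 first, then v1).
# These paths are assumptions about the container layout and may differ per runtime.
CGROUP_FILES = {
    "cgroup pids.max": ["/sys/fs/cgroup/pids.max", "/sys/fs/cgroup/pids/pids.max"],
    "cgroup pids.current": ["/sys/fs/cgroup/pids.current", "/sys/fs/cgroup/pids/pids.current"],
    "kernel threads-max": ["/proc/sys/kernel/threads-max"],
    "kernel pid_max": ["/proc/sys/kernel/pid_max"],
}


def parse_limit(text):
    """Return the limit as an int, or None when the kernel reports no limit."""
    value = text.strip()
    if value in ("max", "unlimited"):
        return None
    return int(value)


def read_first_existing(paths):
    """Return the content of the first readable file in paths, or None."""
    for path in paths:
        try:
            return Path(path).read_text()
        except OSError:
            continue
    return None


def collect_thread_limits():
    """Gather the per-user, per-cgroup and kernel-wide thread limits."""
    report = {}
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    report["RLIMIT_NPROC soft"] = None if soft == resource.RLIM_INFINITY else soft
    report["RLIMIT_NPROC hard"] = None if hard == resource.RLIM_INFINITY else hard
    for name, paths in CGROUP_FILES.items():
        raw = read_first_existing(paths)
        report[name] = "not found" if raw is None else parse_limit(raw)
    return report


if __name__ == "__main__":
    for name, value in collect_thread_limits().items():
        # None means "no limit"; "not found" means the file is absent here.
        print(f"{name}: {value}")
```

Running this in the BE pod under both 3.0.3 and 3.0.4 and comparing `cgroup pids.current` against `cgroup pids.max` would show whether the new version simply needs more threads than the container is allowed to create.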
The DDC configuration is as follows:
apiVersion: v1
kind: ConfigMap
metadata:
  name: be-configmap
  namespace: hd-dev-doris-v1
  labels:
    app.kubernetes.io/component: be
data:
  be.conf: |
    # For jdk 17, this JAVA_OPTS will be used as default JVM options
    JAVA_OPTS_FOR_JDK_17="-Xmx1024m -DlogPath=$LOG_DIR/jni.log -Xlog:gc*:$LOG_DIR/be.gc.log.$CUR_DATE:time,uptime:filecount=10,filesize=50M -Djavax.security.auth.useSubjectCredsOnly=false -Dsun.security.krb5.debug=true -Dsun.java.command=DorisBE -XX:-CriticalJNINatives -XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED --add-opens=java.management/sun.management=ALL-UNNAMED"
    file_cache_path = [{"path":"/mnt/disk1/doris_cloud/file_cache","total_size":207374182400,"query_limit":207374182400}]
---
apiVersion: disaggregated.cluster.doris.com/v1
kind: DorisDisaggregatedCluster
metadata:
  name: dev-disaggregated-cluster
  namespace: hd-dev-doris-v1
spec:
  metaService:
    image: apache/doris:ms-3.0.3
    envVars:
      - name: TZ
        value: Asia/Shanghai
    requests:
      cpu: 4
      memory: 4Gi
    limits:
      cpu: 4
      memory: 4Gi
    fdb:
      configMapNamespaceName:
        name: fdb-dev-cluster-config
        namespace: hd-dev-doris-v1
  feSpec:
    replicas: 2
    image: apache/doris:fe-3.0.3
    envVars:
      - name: TZ
        value: Asia/Shanghai
    requests:
      cpu: 6
      memory: 80Gi
    limits:
      cpu: 6
      memory: 80Gi
    service:
      type: NodePort
      portMaps:
        - nodePort: 30830
          targetPort: 8030
        - nodePort: 30920
          targetPort: 9020
        - nodePort: 30930
          targetPort: 9030
        - nodePort: 30910
          targetPort: 9010
    persistentVolume:
      persistentVolumeClaimSpec:
        storageClassName: netapp-iscsi
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
  computeGroups:
    - uniqueId: cg1
      replicas: 3
      image: apache/doris:be-3.0.3
      envVars:
        - name: TZ
          value: Asia/Shanghai
      requests:
        cpu: 8
        memory: 80Gi
      limits:
        cpu: 8
        memory: 80Gi
      persistentVolume:
        # logNotStore: true
        persistentVolumeClaimSpec:
          storageClassName: ocs-storagecluster-ceph-rbd
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 200Gi