2.1.5: timeout when waiting for send fragments rpc, query timeout:14400, left timeout for this operation:30


Version: 2.1.5

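Situation:
Data center 1: 1 FE and 3 BEs deployed
Data center 2: 3 BEs and 1 FE added
Network latency between data center 1 and data center 2 is about 3 ms. We frequently hit this error: [timeout when waiting for send fragments rpc, query timeout:14400, left timeout for this operation:30, host: 188.107.*. at]. Following https://ask.selectdb.com/questions/D1F4/she-qu-wen-ti-timeout-when-waiting-for-send-fragments-rpc-yi-chang/E1G4, I added the configuration suggested there and also decommissioned all 3 BEs in data center 1. The cluster now consists of the FE in data center 1 plus the 3 BEs and 1 FE in data center 2, but the error keeps occurring. How should this be handled? The cluster is already in production.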

FE configuration

CUR_DATE=`date +%Y%m%d-%H%M%S`

LOG_DIR = ${DORIS_HOME}/log

JAVA_HOME=/usr/local/jdk1.8.0_381

JAVA_OPTS="-Djavax.security.auth.useSubjectCredsOnly=false -Xss64m -Xms32g -Xmx64g -XX:+UseMembar -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=7 -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:-CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:SoftRefLRUPolicyMSPerMB=0 -Xloggc:$DORIS_HOME/log/fe.gc.log.$CUR_DATE"

JAVA_OPTS_FOR_JDK_9="-Djavax.security.auth.useSubjectCredsOnly=false -Xss4m -Xmx8192m -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xlog:gc*:$LOG_DIR/fe.gc.log.$CUR_DATE:time -Dlog4j2.formatMsgNoLookups=true"

JAVA_OPTS_FOR_JDK_17="-Djavax.security.auth.useSubjectCredsOnly=false -XX:+UseZGC -Xmx8192m -Xms8192m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$LOG_DIR/ -Xlog:gc*:$LOG_DIR/fe.gc.log.$CUR_DATE:time"

http_port = 8030
rpc_port = 9020
query_port = 9030
edit_log_port = 9010
arrow_flight_sql_port = -1

priority_networks = 188.104..0/24;188.107..16/29

sys_log_level = INFO
sys_log_mode = NORMAL
max_allowed_packet = 104857600
remote_fragment_exec_timeout_ms = 30000
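
The `30` reported in the error message lines up with `remote_fragment_exec_timeout_ms = 30000` above (30000 ms is 30 s). Below is a minimal sketch of that arithmetic. It assumes the FE caps the wait for the send-fragments RPC at the smaller of the remaining query time and this setting. The helper names `parse_conf` and `rpc_wait_seconds` are hypothetical and not part of Doris.

```python
def parse_conf(text):
    """Parse simple `key = value` lines, skipping blank lines and # comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        conf[key.strip()] = value.strip()
    return conf


def rpc_wait_seconds(conf, query_timeout_s, elapsed_s=0):
    """Assumed rule: wait = min(remaining query time, remote_fragment_exec_timeout_ms)."""
    left_ms = (query_timeout_s - elapsed_s) * 1000
    cap_ms = int(conf["remote_fragment_exec_timeout_ms"])
    return min(left_ms, cap_ms) // 1000


# Two of the FE settings quoted in the post.
fe_conf = parse_conf("""
max_allowed_packet = 104857600
remote_fragment_exec_timeout_ms = 30000
""")

# query timeout:14400 comes from the error message.
print(rpc_wait_seconds(fe_conf, query_timeout_s=14400))
```

If that assumption holds, the error means a BE did not acknowledge the fragment within 30 s, which is far longer than a 3 ms round trip would explain on its own.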

BE configuration

CUR_DATE=`date +%Y%m%d-%H%M%S`

LOG_DIR="${DORIS_HOME}/log/"

JAVA_OPTS="-Xmx1024m -DlogPath=$LOG_DIR/jni.log -Xloggc:$DORIS_HOME/log/be.gc.log.$CUR_DATE -Djavax.security.auth.useSubjectCredsOnly=false -Dsun.security.krb5.debug=true -Dsun.java.command=DorisBE -XX:-CriticalJNINatives"

JAVA_OPTS_FOR_JDK_9="-Xmx1024m -DlogPath=$DORIS_HOME/log/jni.log -Xlog:gc:$LOG_DIR/be.gc.log.$CUR_DATE -Djavax.security.auth.useSubjectCredsOnly=false -Dsun.security.krb5.debug=true -Dsun.java.command=DorisBE -XX:-CriticalJNINatives"

JAVA_OPTS_FOR_JDK_17="-Xmx1024m -DlogPath=$LOG_DIR/jni.log -Xlog:gc:$LOG_DIR/be.gc.log.$CUR_DATE -Djavax.security.auth.useSubjectCredsOnly=false -Dsun.security.krb5.debug=true -Dsun.java.command=DorisBE -XX:-CriticalJNINatives --add-opens=java.base/java.net=ALL-UNNAMED"

JAVA_HOME=/usr/local/jdk1.8.0_381

JEMALLOC_CONF="percpu_arena:percpu,background_thread:true,metadata_thp:auto,muzzy_decay_ms:15000,dirty_decay_ms:15000,oversize_threshold:0,prof:false,lg_prof_interval:32,lg_prof_sample:19,prof_gdump:false,prof_accum:false,prof_leak:false,prof_final:false"
JEMALLOC_PROF_PRFIX=""

be_port = 9060
webserver_port = 8040
heartbeat_service_port = 9050
brpc_port = 8060
arrow_flight_sql_port = -1

enable_https = false
ssl_certificate_path = "$DORIS_HOME/conf/cert.pem"
ssl_private_key_path = "$DORIS_HOME/conf/key.pem"

priority_networks = 188.104..0/24;188.107..16/29

sys_log_level = INFO

aws_log_level=0
AWS_EC2_METADATA_DISABLED=true
mem_limit = 90%
fragment_pool_thread_num_max = 2048
fragment_pool_queue_size = 4096
brpc_num_threads = 256

1 Answer

What is the network situation between your two data centers? What is the firewall's timeout setting?
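
One way to gather the numbers asked for here is to time TCP connects from the FE host to each BE's `brpc_port` (8060 in the BE configuration above). Below is a rough sketch using only the Python standard library. The function name `connect_latency_ms` is a placeholder, and the probe measures connection setup only; it does not reveal the firewall's idle-connection timeout.

```python
import socket
import time


def connect_latency_ms(host, port, attempts=5, timeout_s=3.0):
    """Return per-attempt TCP connect times in ms (None where an attempt failed)."""
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # Open and immediately close a TCP connection to the target port.
            with socket.create_connection((host, port), timeout=timeout_s):
                results.append((time.monotonic() - start) * 1000.0)
        except OSError:
            # Covers refused connections, unreachable hosts, and timeouts.
            results.append(None)
    return results
```

For example, calling `connect_latency_ms("<be-host>", 8060)` from the FE in data center 1 against each BE in data center 2 shows whether connects are consistently near the expected 3 ms or whether some attempts stall or fail.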