[Solved] Doris Manager 24.0.0: agents cannot connect


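Hello,
I have three machines, each running one FE and one BE, and the Doris services themselves are working fine. Doris Manager is deployed on the machine whose IP ends in 14, and currently only the agent on machine 14 connects successfully. The agents on the other two machines, 15 and 16, have been started, but the cluster takeover page keeps showing an abnormal state.
Port 8972 is mutually reachable among machines 14, 15, and 16. Machines 15 and 16 have only the agent deployed, not Doris Manager.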
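For readers who want to repeat the reachability check on port 8972, here is a minimal sketch. The helper name `port_open` is hypothetical, and the hosts are passed on the command line because the post does not give the full addresses of machines 15 and 16. A successful TCP connection only shows that the port is reachable; it does not prove that the agent's heartbeat API is answering correctly.

```python
import socket
import sys

AGENT_PORT = 8972  # agent port mentioned in the post


def port_open(host, port=AGENT_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical usage: python check_port.py <host1> <host2> ...
    for host in sys.argv[1:]:
        state = "reachable" if port_open(host) else "unreachable"
        print(f"{host}:{AGENT_PORT} {state}")
```

Running this from the Doris Manager machine against each agent host, and again in the other direction, covers the "mutually reachable" claim.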

webserver log:

2024-04-22 11:25:44.384 [SimpleAsyncTaskExecutor-422] INFO  com.selectdb.enterprise.manager.service.impl.ResourceNodeServiceImpl - update agent node 1 info
2024-04-22 11:25:49.374 [SimpleAsyncTaskExecutor-423] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job start, running job size 0.
2024-04-22 11:25:49.375 [SimpleAsyncTaskExecutor-423] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job end.
2024-04-22 11:25:54.374 [SimpleAsyncTaskExecutor-424] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job start, running job size 0.
2024-04-22 11:25:54.375 [SimpleAsyncTaskExecutor-424] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job end.
2024-04-22 11:25:59.374 [SimpleAsyncTaskExecutor-425] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job start, running job size 0.
2024-04-22 11:25:59.375 [SimpleAsyncTaskExecutor-425] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job end.
2024-04-22 11:26:04.374 [SimpleAsyncTaskExecutor-426] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job start, running job size 0.
2024-04-22 11:26:04.375 [SimpleAsyncTaskExecutor-426] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job end.
2024-04-22 11:26:09.374 [SimpleAsyncTaskExecutor-427] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job start, running job size 0.
2024-04-22 11:26:09.375 [SimpleAsyncTaskExecutor-427] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job end.
2024-04-22 11:26:14.374 [SimpleAsyncTaskExecutor-428] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job start, running job size 0.
2024-04-22 11:26:14.375 [SimpleAsyncTaskExecutor-428] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job end.
2024-04-22 11:26:14.379 [SimpleAsyncTaskExecutor-429] ERROR com.selectdb.enterprise.manager.service.impl.ResourceNodeServiceImpl - heartbeat start, resource node size 1.
2024-04-22 11:26:14.379 [SimpleAsyncTaskExecutor-429] INFO  com.selectdb.enterprise.manager.common.utils.StringUtil - getServerLocalIps: [10.143.22.14]
2024-04-22 11:26:14.381 [SimpleAsyncTaskExecutor-429] INFO  com.selectdb.enterprise.manager.service.component.agent.AgentHttpClient - connect to agent 1 heartbeat api, url:http://10.143.22.14:8972/heartbeat
2024-04-22 11:26:14.381 [SimpleAsyncTaskExecutor-429] INFO  com.selectdb.enterprise.manager.service.component.agent.AgentHttpClient - heartbeat, serverIps: [10.143.22.14]
2024-04-22 11:26:14.381 [SimpleAsyncTaskExecutor-429] INFO  com.selectdb.enterprise.manager.common.pool.HttpClientPoolManager - Post body is:{"agent_id":"agent-fbkf39kiw18c2n0kml","manager_id":"manager-4275xsvn5403omji93","manager_version":"24.0.0","nodes":[],"resource":{"ips":["10.143.22.14"],"package_md5_sum":"92c002cab881446b3406d123d4ed66fd","package_name":"manager-agent-24.0.0-x64-bin.tar.gz","port":8000}}
2024-04-22 11:26:14.384 [SimpleAsyncTaskExecutor-429] INFO  com.selectdb.enterprise.manager.common.pool.HttpClientPoolManager - execute request result:{
    "code": 0,
    "data": {
        "status": "RUNNING",
        "host_info": {
            "disk_usage": 0.04892629397638683,
            "disk_free": 280.21009063720703,
            "data_size": 14.414909362792969,
            "disk_total": 294.625,
            "disks": [
                {
                    "disk_name": "/dev/dm-0",
                    "size": 29.9375,
                    "free": 29.675918579101562,
                    "usage": 0.008737583996607515
                },
                {
                    "disk_name": "/dev/dm-2",
                    "size": 14.9375,
                    "free": 12.835426330566406,
                    "usage": 0.1407245971168933
                },
                {
                    "disk_name": "/dev/sda1",
                    "size": 0.9375,
                    "free": 0.5572166442871094,
                    "usage": 0.40563557942708334
                },
                {
                    "disk_name": "/dev/dm-5",
                    "size": 14.9375,
                    "free": 14.626663208007812,
                    "usage": 0.02080915762290795
                },
                {
                    "disk_name": "/dev/dm-3",
                    "size": 19.9375,
                    "free": 19.766536712646484,
                    "usage": 0.008574961121179467
                },
                {
                    "disk_name": "/dev/dm-4",
                    "size": 213.9375,
                    "free": 202.74832916259766,
                    "usage": 0.052301119894372625
                }
            ],
            "io_util": 0,
            "host_name": "BITDLNIU01401",
            "cpu_usage": 0.007519,
            "memory_usage": 0.160798,
            "network_transport_rate": 0.000629425048828125,
            "network_receive_rate": 0.002899169921875,
            "average_load_list": [
                0,
                0.05,
                0.05
            ],
            "cpu": "Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz 4 Core(s) 4 Thread(s)",
            "memory": "15.35 GB",
            "network_interface": "UNKNOWN"
        }
    },
    "message": "OK"
}
2024-04-22 11:26:14.384 [SimpleAsyncTaskExecutor-429] INFO  com.selectdb.enterprise.manager.service.impl.ResourceNodeServiceImpl - update agent node 1 info
2024-04-22 11:26:19.374 [SimpleAsyncTaskExecutor-430] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job start, running job size 0.
2024-04-22 11:26:19.375 [SimpleAsyncTaskExecutor-430] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job end.
2024-04-22 11:26:24.374 [SimpleAsyncTaskExecutor-431] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job start, running job size 0.
2024-04-22 11:26:24.374 [SimpleAsyncTaskExecutor-431] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job end.
2024-04-22 11:26:29.374 [SimpleAsyncTaskExecutor-432] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job start, running job size 0.
2024-04-22 11:26:29.375 [SimpleAsyncTaskExecutor-432] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job end.
2024-04-22 11:26:34.374 [SimpleAsyncTaskExecutor-433] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job start, running job size 0.
2024-04-22 11:26:34.375 [SimpleAsyncTaskExecutor-433] INFO  com.selectdb.enterprise.manager.service.component.agentjob.JobManager - schedule agent job end.
1 Answer

Solved. I had scp'd the directory directly to each node. What needs to be scp'd here is the original tar package.
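The heartbeat request in the log above carries a `package_name` and a `package_md5_sum`. One way to confirm that the tar package copied to a node is the original one is to compare its MD5 with that value. The sketch below does only that. The thread does not confirm that the manager performs this comparison, so treat it as a sanity check; the function names and the script name are hypothetical.

```python
import hashlib
import sys

# Copied from the "package_md5_sum" field of the heartbeat request in the log.
EXPECTED_MD5 = "92c002cab881446b3406d123d4ed66fd"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def package_matches(path, expected_md5=EXPECTED_MD5):
    """True if the file at `path` has the expected MD5 (case-insensitive)."""
    return md5_of_file(path) == expected_md5.lower()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage:
    #   python check_package.py /path/to/manager-agent-24.0.0-x64-bin.tar.gz
    tarball = sys.argv[1]
    print(tarball, "OK" if package_matches(tarball) else "MISMATCH")
```

Run it on each agent node against the copied tar package. A mismatch means the file differs from the package the manager advertises in its heartbeat.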