Spark Load ETL stage: Doris FE fails to get the YARN application status, apparently because of a protobuf incompatibility. Looking for a solution

Background
After upgrading Doris from 1.1 to 1.2, once a Spark Load job moves from the PENDING stage to the ETL stage, the FE can no longer get the job's status from YARN.

The error is as follows:

org/apache/hadoop/yarn/proto/YarnProtos$ApplicationReportProto$Builder.setClientToAmToken(Lorg/apache/hadoop/security/proto/SecurityProtos$TokenProto$Builder;)Lorg/apache/hadoop/yarn/proto/YarnProtos$ApplicationReportProto$Builder; @30: invokevirtual
  Reason:
    Type 'org/apache/hadoop/security/proto/SecurityProtos$TokenProto' (current frame, stack[1]) is not assignable to 'com/google/protobuf/GeneratedMessage'
  Current Frame:
    bci: @30
    flags: { }
    locals: { 'org/apache/hadoop/yarn/proto/YarnProtos$ApplicationReportProto$Builder', 'org/apache/hadoop/security/proto/SecurityProtos$TokenProto$Builder' }
    stack: { 'com/google/protobuf/SingleFieldBuilder', 'org/apache/hadoop/security/proto/SecurityProtos$TokenProto' }
  Bytecode:
    0x0000000: 2ab4 013f c700 122a 2bb6 0398 b500 cc2a
    0x0000010: b602 14a7 000f 2ab4 013f 2bb6 0398 b603
    0x0000020: 3a57 2a59 b401 3b10 4080 b501 3b2a b0  
  Stackmap Table:
    same_frame(@22)
    same_frame(@34)

	at org.apache.hadoop.yarn.proto.YarnProtos$ApplicationReportProto.newBuilder(YarnProtos.java:23303) ~[hive-shade-3-1.0.2-SNAPSHOT.jar:1.0.2]
	at org.apache.hadoop.yarn.api.records.impl.pb.ApplicationReportPBImpl.<init>(ApplicationReportPBImpl.java:72) ~[hive-shade-3-1.0.2-SNAPSHOT.jar:1.0.2]
	at org.apache.doris.load.loadv2.YarnApplicationReport.<init>(YarnApplicationReport.java:78) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.load.loadv2.SparkEtlJobHandler.getEtlJobStatus(SparkEtlJobHandler.java:233) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.load.loadv2.SparkLoadJob.updateEtlStatus(SparkLoadJob.java:298) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.load.loadv2.LoadManager.lambda$processEtlStateJobs$8(LoadManager.java:414) ~[doris-fe.jar:1.2-SNAPSHOT]
	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184) ~[?:1.8.0_161]
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175) ~[?:1.8.0_161]
	at java.util.concurrent.ConcurrentHashMap$ValueSpliterator.forEachRemaining(ConcurrentHashMap.java:3566) ~[?:1.8.0_161]
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) ~[?:1.8.0_161]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[?:1.8.0_161]
	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151) ~[?:1.8.0_161]
	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174) ~[?:1.8.0_161]
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_161]
	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418) ~[?:1.8.0_161]
	at org.apache.doris.load.loadv2.LoadManager.processEtlStateJobs(LoadManager.java:412) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.load.loadv2.LoadEtlChecker.runAfterCatalogReady(LoadEtlChecker.java:43) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.common.util.Daemon.run(Daemon.java:116) ~[doris-fe.jar:1.2-SNAPSHOT]

What I have tried so far

  • Pinned protobuf-java to 2.5.0 in hive-shade. The problem persists after pinning.
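
One way to narrow this down further: the verifier complains that `SecurityProtos$TokenProto` is not assignable to `com.google.protobuf.GeneratedMessage`. A possible explanation (an assumption, not confirmed here) is that `YarnProtos` and `SecurityProtos` are being loaded from different jars built against different protobuf versions, since code generated by protobuf 3.x extends `GeneratedMessageV3` rather than `GeneratedMessage`. The sketch below is a small diagnostic, not part of Doris or Hadoop; the class name `ClassOrigin` and its helper methods are made up for illustration. It prints which jar each class is loaded from and its superclass chain.

```java
import java.security.CodeSource;

public class ClassOrigin {

    /** Returns the jar or directory a class was loaded from, or a marker string. */
    static String locate(String className) {
        try {
            // initialize=false: load the class without running static initializers
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "<bootstrap or unknown>" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "<not loadable: " + e + ">";
        }
    }

    /** Returns "A -> B -> java.lang.Object" for the superclass chain of a class. */
    static String superclassChain(String className) {
        try {
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            StringBuilder sb = new StringBuilder(c.getName());
            for (Class<?> s = c.getSuperclass(); s != null; s = s.getSuperclass()) {
                sb.append(" -> ").append(s.getName());
            }
            return sb.toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "<not loadable: " + e + ">";
        }
    }

    public static void main(String[] args) {
        // Class names taken from the stack trace above.
        String[] names = {
            "com.google.protobuf.GeneratedMessage",
            "org.apache.hadoop.security.proto.SecurityProtos$TokenProto",
            "org.apache.hadoop.yarn.proto.YarnProtos$ApplicationReportProto"
        };
        for (String name : names) {
            System.out.println(name);
            System.out.println("  loaded from: " + locate(name));
            System.out.println("  superclasses: " + superclassChain(name));
        }
    }
}
```

Run it with the same classpath the FE uses (for example the jars under the FE `lib` directory; the exact path depends on the deployment). If `TokenProto` comes from a different jar than `ApplicationReportProto`, or its superclass chain does not include `com.google.protobuf.GeneratedMessage`, that would point to a duplicated or mismatched Hadoop proto jar rather than to the protobuf-java version alone.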
1 Answers