Flink源码阅读之任务提交流程
Flink 源码阅读之任务提交基本流程
org.apache.flink.runtime.blob.BlobServer#run java.net.ServerSocket#accept
BlobServer.run
涉及的类:
org.apache.flink.client.cli.CliFrontend org.apache.flink.client.ClientUtils org.apache.flink.streaming.examples.wordcount.WordCount
org.apache.flink.runtime.blob.BlobServer#run
org.apache.flink.runtime.taskexecutor.TaskExecutor#submitTask
org.apache.flink.runtime.deployment.TaskDeploymentDescriptor
流程准备
以远程调试模式运行 org.apache.flink.client.cli.CliFrontend
org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint
org.apache.flink.runtime.taskexecutor.TaskManagerRunner
使用 ./flink 提交 WordCount.jar, 查看整个任务的流转过程
org.apache.flink.client.program.PackagedProgram#callMainMethod java.lang.reflect.Method#invoke
org.apache.flink.streaming.api.environment.StreamContextEnvironment#execute
org.apache.flink.streaming.api.graph.StreamGraphGenerator#generate
org.apache.flink.streaming.api.graph.StreamGraphGenerator#transform
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment#executeAsync(org.apache.flink.streaming.api.graph.StreamGraph)
org.apache.flink.core.execution.JobListener#onJobSubmitted
org.apache.flink.client.program.rest.RestClusterClient#submitJob
JM
org.apache.flink.runtime.dispatcher.Dispatcher#submitJob log.info(“Received JobGraph submission {} ({}).”, jobGraph.getJobID(), jobGraph.getName());
org.apache.flink.runtime.dispatcher.Dispatcher#runJob
org.apache.flink.runtime.dispatcher.Dispatcher#createJobManagerRunner
flink-rest-server-netty-worker-thread org.apache.flink.runtime.rest.handler.AbstractRestHandler#handleRequest org.apache.flink.runtime.rest.handler.AbstractRestHandler#respondToRequest
org.apache.flink.runtime.rpc.akka.AkkaRpcActor#handleRpcMessage
JM org.apache.flink.runtime.blob.BlobServer#run java.net.ServerSocket#accept
org.apache.flink.streaming.api.environment.StreamContextEnvironment#execute
Job 发到 JM 后
org.apache.flink.runtime.dispatcher.Dispatcher#runJob
org.apache.flink.runtime.jobmaster.JobMaster#offerSlots
Scheduler 调度 生成 StreamGraph org.apache.flink.runtime.executiongraph.Execution
发给 TM RemoteRpcInvocation(submitTask(TaskDeploymentDescriptor, JobMasterId, Time))
org.apache.flink.runtime.taskexecutor.TaskExecutor#submitTask
submitTask 方法 new 一个 Task
TM 去 Blobserver 拿jar https://juejin.im/post/5e80ae51518825736d278248
org.apache.flink.runtime.taskmanager.Task#startTaskThread
InputGateDeploymentDescriptor 是什么
What‘s more
通过基本的执行流程我们可以衍生出很多精进的知识去了解.
- 如何生成 Job 链
- RPC 具体实现, 都是干嘛的
- Checkpoint机制
- Task 容错
- 与Kafka 的对接等等