Core structure of the Flink Master

On the Master side, a Flink cluster is built around three main components: Dispatcher, ResourceManager, and JobManager.

  • Dispatcher: accepts jobs submitted by users and starts a new JobManager service for each newly submitted job.
  • ResourceManager: handles resource management for the entire cluster. A Flink cluster has only one ResourceManager, and all resource-related coordination goes through it.
  • JobManager: manages the execution of a specific job. Since multiple jobs may run at the same time in one Flink cluster, each job has its own JobManager service.

Flink basic architecture

What happens when a job is submitted

When a user submits a job, the user code is first converted into a JobGraph.

From there, the submission path depends on the deployment mode:

  • In Standalone session mode (or the corresponding Session style in YARN), the Client can connect directly to the Dispatcher and submit the job.
  • In Per-Job mode, the Client first requests resources from a resource management system such as YARN to start an ApplicationMaster, and then submits the job to the Dispatcher inside that ApplicationMaster.

Once the job reaches the Dispatcher, the Dispatcher first starts a JobManager service. After that, the JobManager requests resources from the ResourceManager so that the actual tasks of the job can be launched.

When the ResourceManager selects an available Slot (a core concept in Flink architecture), it notifies the corresponding TaskManager to assign that Slot to the designated JobManager.

Overall startup flow of the Master

During initialization of the Flink cluster’s Master node, the cluster startup begins by calling the runClusterEntrypoint() method of ClusterEntrypoint.

The overall process is shown below:

Overall Flink Master startup flow