ไธญ

Amazon EMR

Overview

Amazon EMR task type, for creating EMR clusters on AWS and running computing tasks. Using aws-java-sdk in the background code, to transfer JSON parameters to RunJobFlowRequest object and submit to AWS.

Parameter

  • Node name: The node name in a workflow definition is unique.
  • Run flag: Identifies whether this node schedules normally, if it does not need to execute, select the prohibition execution.
  • Descriptive information: Describe the function of the node.
  • Task priority: When the number of worker threads is insufficient, execute in the order of priority from high to low, and tasks with the same priority will execute in a first-in first-out order.
  • Worker grouping: Assign tasks to the machines of the worker group to execute. If Default is selected, randomly select a worker machine for execution.
  • Times of failed retry attempts: The number of times the task failed to resubmit. You can select from drop-down or fill-in a number.
  • Failed retry interval: The time interval for resubmitting the task after a failed task. You can select from drop-down or fill-in a number.
  • Timeout alarm: Check the timeout alarm and timeout failure. When the task runs exceed the "timeout", an alarm email will send and the task execution will fail.
  • JSON: JSON corresponding to the RunJobFlowRequest object, for details refer to API_RunJobFlow_Examples.

JSON example

{
  "Name": "SparkPi",
  "ReleaseLabel": "emr-5.34.0",
  "Applications": [
    {
      "Name": "Spark"
    }
  ],
  "Instances": {
    "InstanceGroups": [
      {
        "Name": "Primary node",
        "InstanceRole": "MASTER",
        "InstanceType": "m4.xlarge",
        "InstanceCount": 1
      }
    ],
    "KeepJobFlowAliveWhenNoSteps": false,
    "TerminationProtected": false
  },
  "Steps": [
    {
      "Name": "calculate_pi",
      "ActionOnFailure": "CONTINUE",
      "HadoopJarStep": {
        "Jar": "command-runner.jar",
        "Args": [
          "/usr/lib/spark/bin/run-example",
          "SparkPi",
          "15"
        ]
      }
    }
  ],
  "JobFlowRole": "EMR_EC2_DefaultRole",
  "ServiceRole": "EMR_DefaultRole"
}