Map Reduce

A Map Reduce task type’s example and dive into information of PyDolphinScheduler.

Example

"""A example workflow for task mr."""

from pydolphinscheduler.core.engine import ProgramType
from pydolphinscheduler.core.process_definition import ProcessDefinition
from pydolphinscheduler.tasks.map_reduce import MR

with ProcessDefinition(name="task_map_reduce_example", tenant="tenant_exists") as pd:
    task = MR(
        name="task_mr",
        main_class="wordcount",
        main_package="hadoop-mapreduce-examples-3.3.1.jar",
        program_type=ProgramType.JAVA,
        main_args="/dolphinscheduler/tenant_exists/resources/file.txt /output/ds",
    )
    pd.run()

Dive Into

Task MR.

class pydolphinscheduler.tasks.map_reduce.MR(name: str, main_class: str, main_package: str, program_type: ProgramType | None = 'SCALA', app_name: str | None = None, main_args: str | None = None, others: str | None = None, *args, **kwargs)[source]

Bases: Engine

Task mr object, declare behavior for mr task to dolphinscheduler.

_downstream_task_codes: Set[int]
_task_custom_attr: set = {'app_name', 'main_args', 'others'}
_task_relation: Set[TaskRelation]
_upstream_task_codes: Set[int]