

An Operator is the building block of an Airflow DAG. It determines what will be executed when the DAG runs. Operators can be considered templates or blueprints that contain the logic for a predefined task, and we can define them declaratively inside an Airflow DAG. When an operator is instantiated along with its required parameters, it is known as a task. In other words, an Operator defines one task in your data pipeline. Operators are generally used to provide integration with some other service, like MySqlOperator, JdbcOperator, DockerOperator, etc.

Airflow also provides operators for many common tasks, including:

- BashOperator – for executing a simple bash command.
- PythonOperator – calls an arbitrary Python function.
- SimpleHttpOperator – sends an HTTP request.
- MySqlOperator, SqliteOperator, PostgresOperator, MsSqlOperator, OracleOperator, JdbcOperator, etc. – execute a SQL command.
- Sensor – waits for a certain time, for a condition to be satisfied.

The airflow/contrib/ directory contains many more operators built by the community. These operators might not be as fully baked or well tested as those in the main distribution, but they allow users to more easily add new functionality to the platform.
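To make this concrete, here is a minimal sketch of a DAG that instantiates two of these operators as tasks. The DAG id, task ids, and commands are hypothetical, and the import paths assume Airflow 2.x:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator


def greet():
    print("Hello from PythonOperator")


with DAG(
    dag_id="operator_demo",              # hypothetical DAG id
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
) as dag:
    say_hello = BashOperator(
        task_id="say_hello",
        bash_command="echo 'Hello from BashOperator'",
    )
    greet_task = PythonOperator(
        task_id="greet",
        python_callable=greet,
    )
    # run the bash task first, then the python task
    say_hello >> greet_task
```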

A few important characteristics of operators:

- An operator describes a single task in a workflow. If you have two tasks in the same operator instance, consider separating them into two different instances of that same operator.
- An operator is usually (but not always) atomic, i.e. it can work alone and does not need to communicate or share resources with any other operator. Operators can, however, communicate using a feature of Airflow called XCom (pronounced "cross-communication"), as shown in the sketch after this list.
- An operator should be idempotent, i.e. it should produce the same result regardless of the number of times it is run.
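A minimal sketch of XCom in action, assuming Airflow 2.x, where context variables such as ti are injected into the Python callable by matching parameter names; the DAG id, task ids, key, and value are hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def push_value(ti):
    # push a value into XCom under an explicit key
    ti.xcom_push(key="row_count", value=42)


def pull_value(ti):
    # pull the value pushed by the upstream task
    row_count = ti.xcom_pull(task_ids="push_task", key="row_count")
    print(f"received row_count={row_count}")


with DAG(
    dag_id="xcom_demo",                  # hypothetical DAG id
    start_date=datetime(2023, 1, 1),
    schedule_interval=None,
) as dag:
    push_task = PythonOperator(task_id="push_task", python_callable=push_value)
    pull_task = PythonOperator(task_id="pull_task", python_callable=pull_value)
    push_task >> pull_task
```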

Based on what they do, operators fall into three broad categories:

- Action Operators: an action operator is an operator that executes something. For example, the PythonOperator executes a Python function, the BashOperator executes a bash command, etc.
- Transfer Operators: these operators are used to transfer data from a source to a destination.
- Sensors: sensors are used to wait for something to happen before proceeding to the next task. More specifically, a sensor waits for a condition to be met, checking at a specific interval, before succeeding. If the condition isn't met, the sensor waits for another interval and then checks again. Examples include the HdfsSensor and FileSensor; see the sketch after this list.
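A minimal sketch of a sensor, assuming Airflow 2.x's FileSensor; the DAG id, file path, connection id, interval, and timeout are hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.sensors.filesystem import FileSensor

with DAG(
    dag_id="sensor_demo",                     # hypothetical DAG id
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
) as dag:
    # wait up to one hour for a file to land, re-checking every 60 seconds
    wait_for_file = FileSensor(
        task_id="wait_for_input",
        filepath="/data/incoming/input.csv",  # hypothetical path
        fs_conn_id="fs_default",              # filesystem connection to use
        poke_interval=60,                     # seconds between checks
        timeout=60 * 60,                      # fail after one hour of waiting
    )
```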

Apart from these, you should also know about the BaseOperator. As the name suggests, this operator acts as the base (or parent) for each and every operator in Airflow; all types of operators derive their functionality from the BaseOperator through inheritance. Its most important parameters include:

- task_id (string) – a unique, meaningful id for the task.
- owner (string) – the owner of the task; using the unix username is recommended.
- retries (int) – the number of retries that should be performed before failing the task.
- retry_delay (timedelta) – delay between retries.
- retry_exponential_backoff (bool) – allow progressively longer waits between retries by using an exponential backoff algorithm on the retry delay (the delay will be converted into seconds).
- max_retry_delay (timedelta) – maximum delay interval between retries.
- start_date (datetime) – the start_date for the task; determines the execution_date for the first task instance. The best practice is to have the start_date rounded to your DAG's schedule_interval.
- end_date (datetime) – if specified, the scheduler won't go beyond this date.
- depends_on_past (bool) – when set to true, task instances will run sequentially, relying on the previous task instance's schedule to succeed. The task instance for the start_date is allowed to run.
- wait_for_downstream (bool) – when set to true, an instance of task X will wait for tasks immediately downstream of the previous instance of task X to finish successfully before it runs.
- queue (str) – which queue to target when running this job. Not all executors implement queue management, but the CeleryExecutor does support targeting specific queues.
- dag (DAG) – a reference to the DAG the task is attached to (if any).
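These arguments can be passed directly to each operator, or once per DAG via default_args, which every task in the DAG then inherits. A minimal sketch with hypothetical values, again assuming Airflow 2.x import paths:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

# common BaseOperator arguments, defined once and inherited by every task
default_args = {
    "owner": "etl_user",                    # hypothetical unix username
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
    "retry_exponential_backoff": True,
    "max_retry_delay": timedelta(minutes=30),
    "depends_on_past": False,
    "queue": "default",                     # honored by executors such as CeleryExecutor
}

with DAG(
    dag_id="base_operator_demo",            # hypothetical DAG id
    start_date=datetime(2023, 1, 1),        # rounded to the schedule interval
    schedule_interval="@daily",
    default_args=default_args,
) as dag:
    extract = BashOperator(
        task_id="extract",
        bash_command="echo extracting",
    )
```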
