Tasks refer to the individual data tasks that form a job. Onesecondbefore divides the world into three parts: from, do and to.
The `task` part of the configuration contains task-related settings. It can be configured for all task types.
task:
  type: from_google_analytics
  start_date: yesterday -3 days
  end_date: today
property | type | required | description |
---|---|---|---|
type | enumerator | yes | Type of task, for example from_google_analytics. Must be one of the supported task types. Do-type tasks are discussed in more detail in the Do section, To-type tasks in the To section. |
id | string | yes | Unique name for the task. Defaults to the filename without extension. |
trigger_date | string | read-only | Timestamp when the job (not the task) was triggered in the local timezone. Useful in deduplicating tables and in SQL templates. |
run_id | string | read-only | Unique id per run. Every time a task runs, it receives a unique 8-character alphanumeric string. |
tmp_dir | string | read-only | Temporary folder on the worker machine where data is stored during the task's lifetime. Once the task is done, the worker and all data on it are irreversibly deleted. |
start_date | relative or absolute date or date & time | yes | Start date of the period that will be selected in the datasource. Can be filled with an absolute or relative date. Read more about relative date and time here. |
end_date | string, date or date & time | yes | End date of the period that will be selected in the datasource. Can be filled with an absolute or relative date. Read more about relative date and time here. |
loop_by | enumerator (year, month, week, day, hour, minute, second, file, list) | no | Loops the task depending on the enumerator value. If year, month, week, day, hour, minute or second, the loop adds an equal time frame to the start_date until the end_date is reached. If list, the loop cycles through the values in loop_list. If file, the loop cycles through each file on a data source. This is especially useful when downloading large data files in many chunks. |
loop_list | array | no | Contains a list of values to loop through. |
loop_value | string or int | read-only | Contains the actual value of the loop when loop_by is used. Automatically set by Workflows. |
loop_index | int | read-only | Number of the current loop iteration, starting at 0. Use in combination with loop_by. |
loop_start_date | date or datetime | read-only | Set to task.start_date. Only available when loop_by=hour or loop_by=day. When used, task.start_date will be overwritten with the timeframe of the current loop. |
loop_end_date | date or datetime | read-only | Set to task.end_date. Only available when loop_by=hour or loop_by=day. When used, task.end_date will be overwritten with the timeframe of the current loop. |
resource_size | enumerator (0, 1, 2, 4, 8, 16, 32, 64, 128, 256) | no | Resource size to use for the task. Default is 0. The number corresponds to the number of CPUs (0 means 0.25 CPU). The memory of the instance is 8 x resource_size GiB. E.g. a resource_size of 16 means 16 CPUs with 8 x 16 = 128 GiB of memory. |
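
The loop and resource properties in the table above can be combined in one task. The fragment below is a minimal sketch that uses only property names from the table. The task name and the list values are hypothetical, and it is an assumption that these settings can be combined in exactly this way for this task type.

```yaml
task:
  id: ga_daily_export            # hypothetical task name
  type: from_google_analytics
  start_date: yesterday -3 days
  end_date: today
  loop_by: day                   # adds one day to start_date per loop until end_date is reached
  resource_size: 1               # 1 CPU with 8 x 1 = 8 GiB of memory
  # Alternative (assumed usage): loop over explicit values instead of a time frame
  # loop_by: list
  # loop_list: [nl, be, de]      # hypothetical values
```

During each loop, the read-only properties loop_value and loop_index describe the current iteration, and with loop_by set to day or hour, loop_start_date and loop_end_date are available as well.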