https://lucid.app/documents/view/62259ab5-50b5-4354-84e6-f2d31ca26dc1

https://lucid.app/documents/view/cb82c514-f094-411b-8b2f-a0de07528600

Playbook

A Playbook Definition contains all the information needed to define the investigation logic. Its most important part is the steps property, a collection of step configurations, each of which groups related task configurations.

Playbook Proto

| Proto Field | Type | Description | Note |
|---|---|---|---|
| name | string | Name of the playbook | |
| description | string | Description of the playbook | Optional |
| global_variable_set | JSON | A set of key-value pairs defining variables that can be used in steps | Optional. All variable names must be prefixed with the $ symbol. |
| steps | Array of Object(s) | An array of step configurations | |
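
As a rough illustration of the fields above, a playbook definition could be written as the following dictionary. This is a minimal sketch: the field names come from the table, while all values and the helper `invalid_variable_names` are hypothetical and only show how the $ prefix rule might be checked.

```python
import json

# Hypothetical playbook definition; field names follow the Playbook Proto
# table, all values are invented for illustration.
playbook = {
    "name": "High API latency investigation",
    "description": "Checks latency and error metrics for the API service",
    "global_variable_set": {"$service": "checkout-api", "$region": "us-east-1"},
    "steps": [],  # step configurations would go here
}


def invalid_variable_names(definition: dict) -> list[str]:
    """Return global variable names that lack the required $ prefix."""
    variables = definition.get("global_variable_set", {})
    return [key for key in variables if not key.startswith("$")]


print(json.dumps(playbook, indent=2))
print(invalid_variable_names(playbook))  # []
```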

Step Definition

A Step Definition contains related tasks along with the metadata required to interpret their results. The platform supports the following interpretation layer(s):

  1. OpenAI Vision Interpreter: The execution results are fed as images to OpenAI's vision API to detect abnormal behaviour such as spikes or abrupt flat lines. The interpretations of all results are then used to find correlations between tasks and to generate the step summary.

Step Definition Proto

| Field | Type | Description | Note |
|---|---|---|---|
| name | string | Name of the step | |
| description | string | Description of the step | Optional |
| notes | markdown | Investigation notes for the step | Optional |
| external_links (metadata) | Array of Key-Value pair(s) | Related external links | Optional. Part of metadata property |
| interpreter_type | string | Interpretation engine to be used for generating step investigation summary | Optional |
| tasks | Array of Object(s) | An array of task configurations | |
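
The step fields can be modelled as a small data class to make the required and optional parts explicit. This is only a sketch: it treats every field not marked Optional as required, and the shape of each external link (a `name` and a `url` key) is an assumption, as are all example values.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Step:
    # Fields not marked Optional in the table are treated as required.
    name: str
    tasks: list[dict]
    # Optional fields.
    description: Optional[str] = None
    notes: Optional[str] = None  # markdown text
    interpreter_type: Optional[str] = None
    # external_links is part of the metadata property; the link shape is assumed.
    metadata: dict = field(default_factory=dict)


# Hypothetical step; the task configuration is abbreviated to a name only.
step = Step(
    name="Check API latency",
    tasks=[{"name": "p99 latency"}],
    notes="Compare against the **previous deploy**.",
    metadata={
        "external_links": [{"name": "Runbook", "url": "https://example.com/runbook"}]
    },
)
print(step)
```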

Task

A Task Definition defines a unit of execution supported by the platform. Task definitions are written as Google Protocol Buffers (protobuf) messages. All available source task definitions can be found here.

Every task definition must include a task_type, an enum field that lists the task configuration choices available for a given source.

Playbook Metric Task Proto

| Field | Type | Description | Note |
|---|---|---|---|
| source | string | Task source | |
| name | string | Task name | |
| description | string | Task description | Optional |
| notes | string | Task notes | Optional |
| task_connector_sources | Array of Object(s) | A list of connectors on which the task will be executed | |
| One of [documentation, cloudwatch, grafana, new_relic, datadog, clickhouse, postgres, eks, sql_database_connection, api, bash, grafana_mimir] | JSON | Task configuration | |
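
The "one of" row means a task carries exactly one source-specific configuration. A minimal sketch of checking that is shown below; the key names come from the table, while the helper `task_config_key`, the `source` value, the connector entry, and the contents of the `bash` config are hypothetical.

```python
# Config keys listed in the table; exactly one is expected per task.
TASK_CONFIG_KEYS = {
    "documentation", "cloudwatch", "grafana", "new_relic", "datadog",
    "clickhouse", "postgres", "eks", "sql_database_connection", "api",
    "bash", "grafana_mimir",
}


def task_config_key(task: dict) -> str:
    """Return the single source-config key set on the task, or raise ValueError."""
    present = sorted(task.keys() & TASK_CONFIG_KEYS)
    if len(present) != 1:
        raise ValueError(f"expected exactly one task config, found {present}")
    return present[0]


# Hypothetical task; the source value and the bash config contents are invented.
task = {
    "source": "BASH",
    "name": "Disk usage",
    "task_connector_sources": [{"id": 1}],
    "bash": {"command": "df -h"},
}
print(task_config_key(task))  # bash
```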

Workflows

A Workflow contains all the information needed to automate the investigation logic. A workflow is built from four components:

  1. Entry Points: Rules for triggering the workflow
  2. Playbooks: Playbooks that are run during each execution of the workflow
  3. Schedule: Rule controlling when and how often the workflow is executed
  4. Actions: Rules defining what to do with the output of each workflow execution

Workflow Proto

| Proto Field | Type | Description | Note |
|---|---|---|---|
| name | string | Name of the workflow | |
| description | string | Description of the workflow | Optional |
| schedule | JSON | Workflow schedule | |
| playbooks | Array of Object(s) | An array of playbooks | |
| entry_points | Array of Object(s) | An array of entry points | |
| actions | Array of Object(s) | An array of actions | |
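
Putting the four components together, a workflow definition might look like the dictionary below. This is a sketch under assumptions: only the top-level field names come from the table, and the nested shapes, the enum-like strings, and the helper `missing_workflow_fields` are hypothetical.

```python
# Top-level fields from the table; description is the only one marked Optional.
REQUIRED_WORKFLOW_FIELDS = ("name", "schedule", "playbooks", "entry_points", "actions")


def missing_workflow_fields(workflow: dict) -> list[str]:
    """Return the required top-level fields absent from a workflow definition."""
    return [name for name in REQUIRED_WORKFLOW_FIELDS if name not in workflow]


# Hypothetical workflow; nested shapes are assumptions, not the real schema.
workflow = {
    "name": "Investigate latency alerts",
    "schedule": {"type": "ONE_OFF"},
    "playbooks": [{"name": "High API latency investigation"}],
    "entry_points": [{"type": "ALERT", "alert_config": {}}],
    "actions": [{"type": "NOTIFY_ON_SLACK"}],
}
print(missing_workflow_fields(workflow))  # []
```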

Workflow Entry Point

Workflow entry points are rules attached to a workflow that define the events or scenarios in which it should be triggered.

The platform currently supports the following types of entry points:

  1. API: Each workflow comes with a generated URI which can be used to trigger the workflow
  2. Alert: The platform listens to the following alert source(s) and applies rules to each incoming alert to find and trigger all relevant workflows for that alert
    1. Slack

Workflow Entry Point Proto

| Proto Field | Type | Description | Note |
|---|---|---|---|
| type | string | One of [API, ALERT] | |
| One of [api_config, alert_config] | string | Description of the workflow | |
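
An entry point therefore pairs a type with exactly one config field. The sketch below assumes that API pairs with api_config and ALERT with alert_config, which the table suggests but does not state; the payload shape of each config is not documented here, so empty placeholders are used, and `entry_point_is_consistent` is a hypothetical helper.

```python
# Assumed mapping from entry point type to its one-of config field.
CONFIG_FIELD_BY_TYPE = {"API": "api_config", "ALERT": "alert_config"}


def entry_point_is_consistent(entry_point: dict) -> bool:
    """Check that exactly one config field is set and that it matches the type."""
    expected = CONFIG_FIELD_BY_TYPE.get(entry_point.get("type"))
    if expected is None:
        return False  # unknown or missing type
    present = [name for name in CONFIG_FIELD_BY_TYPE.values() if name in entry_point]
    return present == [expected]


# Hypothetical entry points; config contents are placeholders.
api_entry = {"type": "API", "api_config": {}}
alert_entry = {"type": "ALERT", "alert_config": {}}
mismatched = {"type": "ALERT", "api_config": {}}

print(entry_point_is_consistent(api_entry))    # True
print(entry_point_is_consistent(alert_entry))  # True
print(entry_point_is_consistent(mismatched))   # False
```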

Workflow Schedule

A workflow schedule defines a rule that controls the execution behaviour of a workflow. The platform currently supports the following schedules:

  1. One off: Single Execution
  2. Interval: Repeated execution after a fixed delay
  3. Cron: Execution at times determined by a cron expression
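
The difference between the one-off and interval schedules can be sketched as a function that lists planned execution times. Everything here is an assumption made for illustration: the `type` values, the `interval_seconds` and `duration_seconds` field names, and the function `planned_runs` are hypothetical, and cron schedules are left out because they need a cron expression parser.

```python
from datetime import datetime, timedelta


def planned_runs(schedule: dict, start: datetime) -> list[datetime]:
    """Return planned execution times for one-off and interval schedules."""
    if schedule["type"] == "ONE_OFF":
        return [start]  # single execution
    if schedule["type"] == "INTERVAL":
        if schedule["interval_seconds"] <= 0:
            raise ValueError("interval_seconds must be positive")
        delay = timedelta(seconds=schedule["interval_seconds"])
        end = start + timedelta(seconds=schedule["duration_seconds"])
        runs = []
        current = start
        while current <= end:  # repeat after a fixed delay until the window ends
            runs.append(current)
            current += delay
        return runs
    raise NotImplementedError("cron schedules need a cron expression parser")


start = datetime(2024, 1, 1, 12, 0)
runs = planned_runs(
    {"type": "INTERVAL", "interval_seconds": 300, "duration_seconds": 900}, start
)
print([run.strftime("%H:%M") for run in runs])  # ['12:00', '12:05', '12:10', '12:15']
```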

Workflow Action

Workflow actions are rules that define what should be done with the output of each execution. The platform currently supports the following actions:

  1. Notify on Slack: Send a notification to Slack with detailed graph views of the evaluated metrics and an interpretation of the results
  2. Call WebHook: Call a webhook