# Configuring dashboards and alerts in Yandex Monitoring

In this tutorial, you will learn how to track [trail](../concepts/trail.md) status using [dashboards](../../monitoring/concepts/visualization/dashboard.md) and [Yandex Monitoring](../../monitoring/index.md) [alerts](../../monitoring/concepts/alerting.md#alert).

This guide assumes that you have already deployed your infrastructure:

* Created Yandex Cloud resources to collect security events for.
* Created a [trail](../concepts/trail.md) in Audit Trails to collect events.
* Configured the [target](../concepts/trail.md#target) to store and manage events: a [bucket](../../storage/concepts/bucket.md), [datastream](../../data-streams/concepts/glossary.md#stream-concepts), or [log group](../../logging/concepts/log-group.md).

To start tracking the status of trails:

* [Set up alerts](#setup-alerts).
* [Set up the dashboard](#setup-dashboard).

If you no longer need the resources, [delete them](#clear-out).

## Set up alerts {#setup-alerts}

### Create a notification channel {#create-channel}

To get notifications about a triggered alert:

1. In the [management console](https://console.yandex.cloud), select the folder where you want to create a notification channel.
1. [Go](../../console/operations/select-service.md#select-service) to **Monitoring**.
1. In the left-hand panel, select **Notification channels**.
1. In the top-right corner, click **Create channel**.
1. Specify the channel settings:
    * In the **Name** field, specify `alerts-channel`.
    * In the **Method** field, specify the notification method.
    * In the **Recipients** field, list notification recipients.
1. Click **Create**. 

The channel will appear in the list.

### Add alerts {#add-alerts}

You can set up one or more alerts.

For more information about how to [create alerts](../../monitoring/operations/alert/create-alert.md) and about [alert parameters](../../monitoring/concepts/alerting.md#alert-parameters), see the Yandex Monitoring documentation.

#### Deactivating a trail {#deactivating-trail}

The alert will send a notification that the trail is being deactivated.

1. In the [management console](https://console.yandex.cloud), select the folder where you want to create an alert.
1. [Go](../../console/operations/select-service.md#select-service) to **Monitoring**.
1. In the left-hand panel, select **Alerts**.
1. In the top-right corner, click **Create alert**.
1. In the **Name** field, specify `deactivating-trail-alert`. 
1. Under **Metrics**, click ![image](../../_assets/console-icons/plus.svg) to the right of the folder name and specify:
    1. `service = Audit Trails`. 
    1. `name = trail.status`.
    1. `status != ACTIVE`.
    1. `trail = <trail_name>`.
1. Under **Alert parameters**, specify:
    1. **Condition**: `Not equals to`.
    1. **Alarm**: `0`.
1. Under **Notification channels**, click **Add channel** and select the [previously created](#create-channel) notification channel.
1. Click **Create alert**. 

The alert is created.

#### Stopping delivery of audit logs to destination object {#stopping-logs}

The alert will send a notification that the trail has stopped uploading audit logs to its destination object, for example, due to a lack of free space in the bucket.

The value of the **Evaluation window** parameter depends on the specific trail: the type and number of resources in the trail's audit log collection scope determine how often audit logs are uploaded to the destination object.

1. In the [management console](https://console.yandex.cloud), select the folder where you want to create an alert.
1. [Go](../../console/operations/select-service.md#select-service) to **Monitoring**.
1. In the left-hand panel, select **Alerts**.
1. In the top-right corner, click **Create alert**.
1. In the **Name** field, specify `stopping-logs-alert`. 
1. Under **Metrics**, click ![image](../../_assets/console-icons/plus.svg) to the right of the folder name and specify:
    1. `service = Audit Trails`. 
    1. `name = trail.delivered_events_count`.
    1. `trail = <trail_name>`.
1. Under **Alert parameters**, specify:
    1. **Condition**: `Equals to`.
    1. **Alarm**: `0`.
    1. **Evaluation window**: `<trail_value>`.
1. Under **Notification channels**, click **Add channel** and select the [previously created](#create-channel) notification channel.
1. Click **Create alert**. 

The alert is created.
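
The effect of the **Evaluation window** on this alert can be illustrated with a minimal Python sketch. It is not the Monitoring implementation: the sample data, the function name `no_delivery_in_window`, and the rule that every sample in the window must be zero are assumptions made for illustration only.

```python
from datetime import datetime, timedelta


def no_delivery_in_window(samples, now, window):
    """Return True if every sample inside the evaluation window equals zero.

    samples: list of (timestamp, value) pairs for trail.delivered_events_count.
    An empty window is not treated as an alarm; this is an assumption.
    """
    in_window = [value for ts, value in samples if now - window < ts <= now]
    return bool(in_window) and all(value == 0 for value in in_window)


# Hypothetical samples: the trail delivered events 20 minutes ago, then nothing.
now = datetime(2024, 1, 1, 12, 0)
samples = [
    (now - timedelta(minutes=20), 14),
    (now - timedelta(minutes=10), 0),
    (now - timedelta(minutes=5), 0),
    (now, 0),
]

# A 15-minute window sees only zeros, so the alert would fire.
short_window_alarm = no_delivery_in_window(samples, now, timedelta(minutes=15))
# A 30-minute window still contains the last delivery, so it would not.
long_window_alarm = no_delivery_in_window(samples, now, timedelta(minutes=30))
```

If a trail normally uploads logs infrequently, a window shorter than the usual interval between uploads would trigger false alarms, which is why the window has to be chosen per trail.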

#### Modifying the number of trails {#number-trails}

The alert will send a notification that the number of trails in a cloud has changed.

1. In the [management console](https://console.yandex.cloud), select the folder where you want to create an alert.
1. [Go](../../console/operations/select-service.md#select-service) to **Monitoring**.
1. In the left-hand panel, select **Alerts**.
1. In the top-right corner, click **Create alert**.
1. In the **Name** field, specify `number-trails-alert`. 
1. Under **Metrics**, click ![image](../../_assets/console-icons/plus.svg) to the right of the folder name and specify:
    1. `service = Audit Trails`. 
    1. `name = quota.trails_count.usage`.
1. Under **Alert parameters**, specify:
    1. **Condition**: `Not equals to`.
    1. **Alarm**: `<number_of_trails>`.
1. Under **Notification channels**, click **Add channel** and select the [previously created](#create-channel) notification channel.
1. Click **Create alert**.

#### Nearing cloud trail quota {#trail-quota}

The alert will send a notification that the number of trails in the cloud has exceeded 80% of the quota.
   
1. In the [management console](https://console.yandex.cloud), select the folder where you want to create an alert.
1. [Go](../../console/operations/select-service.md#select-service) to **Monitoring**.
1. In the left-hand panel, select **Alerts**.
1. In the top-right corner, click **Create alert**.
1. In the **Name** field, specify `trail-quota-alert`. 
1. Under **Metrics**, click ![image](../../_assets/console-icons/plus.svg) to the right of the folder name and specify:
    1. `service = Audit Trails`. 
    1. `name = quota.trails_count.usage`.
1. Under **Alert parameters**, specify:
    1. **Condition**: `Greater than`.
    1. **Alarm**: `<number_equal_to_80%_of_quota>`.
1. Under **Notification channels**, click **Add channel** and select the [previously created](#create-channel) notification channel.
1. Click **Create alert**.
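
The **Alarm** value for this alert can be calculated from the quota. The sketch below uses integer arithmetic to avoid rounding errors; the function names and the quota values are hypothetical, so substitute the actual quota of your cloud.

```python
def quota_alarm_threshold(quota: int, percent: int = 80) -> int:
    """Largest trail count that is still within `percent` of the quota."""
    return quota * percent // 100


def is_over_threshold(usage: int, quota: int, percent: int = 80) -> bool:
    """Mirror the `Greater than` condition of the alert."""
    return usage > quota_alarm_threshold(quota, percent)


# Hypothetical quota of 10 trails per cloud: the Alarm value would be 8.
threshold = quota_alarm_threshold(10)
fires_at_8 = is_over_threshold(8, 10)  # exactly 80%: no alarm
fires_at_9 = is_over_threshold(9, 10)  # 90%: alarm
```

Because the condition is `Greater than`, the alert triggers only once usage exceeds the calculated value, which matches "over 80% of the quota".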

#### Unauthorized access attempts {#unauthorized-access}

The alert will send a notification that an unauthorized request has been sent to one of the trail resources.

1. In the [management console](https://console.yandex.cloud), select the folder where you want to create an alert.
1. [Go](../../console/operations/select-service.md#select-service) to **Monitoring**.
1. In the left-hand panel, select **Alerts**.
1. In the top-right corner, click **Create alert**.
1. In the **Name** field, specify `unauthorized-access-alert`. 
1. Under **Metrics**, click ![image](../../_assets/console-icons/plus.svg) to the right of the folder name and specify:
    1. `service = Audit Trails`. 
    1. `name = trail.unauthorized_events_count`.
1. Under **Alert parameters**, specify:
    1. **Condition**: `Greater than`.
    1. **Alarm**: `0`.
1. Under **Notification channels**, click **Add channel** and select the [previously created](#create-channel) notification channel.
1. Click **Create alert**.
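
As a quick recap, the conditions of the alerts with fixed thresholds can be expressed as simple comparisons. This Python sketch only models the logic described in this section; it does not call any Monitoring API, and the `is_alarm` helper is a hypothetical name. The `number-trails-alert` and `trail-quota-alert` alerts are omitted because their thresholds depend on your cloud.

```python
import operator

# Condition names as they appear in the alert parameters.
CONDITIONS = {
    "Not equals to": operator.ne,
    "Equals to": operator.eq,
    "Greater than": operator.gt,
}

# Alert name -> (condition, alarm threshold), taken from the steps above.
ALERTS = {
    "deactivating-trail-alert": ("Not equals to", 0),
    "stopping-logs-alert": ("Equals to", 0),
    "unauthorized-access-alert": ("Greater than", 0),
}


def is_alarm(alert_name: str, value: float) -> bool:
    """Return True if the metric value satisfies the alert condition."""
    condition, threshold = ALERTS[alert_name]
    return CONDITIONS[condition](value, threshold)


# Hypothetical metric values.
unauthorized = is_alarm("unauthorized-access-alert", 3)  # requests were detected
delivering = is_alarm("stopping-logs-alert", 25)  # events are still delivered
```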

## Set up a dashboard {#setup-dashboard}

For ready-to-use trail status monitoring charts, open the [management console](https://console.yandex.cloud) → **Audit Trails** → ![image](../../_assets/console-icons/display-pulse.svg) **Monitoring**.

To monitor not just the trail status but the trail storage status as well, create a complex dashboard in Monitoring.

### Create a dashboard {#create-dashboard}

{% list tabs group=instructions %}

- Monitoring UI {#console}

  1. In the [management console](https://console.yandex.cloud), select the folder the trails are in.
  1. [Go](../../console/operations/select-service.md#select-service) to **Monitoring**.
  1. Navigate to the **Dashboards** tab.
  1. Click **Create**.
  1. Click **Save** at the top right.
  1. In the window that opens, enter a name for the dashboard, e.g., `missing-events`, and click **Save**.

{% endlist %}

### Create a chart for missed event monitoring {#create-missing-events-chart}

You can monitor missed events using [Audit Trails metrics](../concepts/user-metrics.md):

* `trail.processed_events_count`: Rate at which the events are accepted for processing.
* `trail.delivered_events_count`: Event delivery rate to the destination object.

Copy a ready-to-use chart for missed event monitoring to your dashboard:

1. Open the trail dashboard in Monitoring:

    {% list tabs group=instructions %}

    - Management console {#console}

      1. Open the [management console](https://console.yandex.cloud).
      1. [Go](../../console/operations/select-service.md#select-service) to **Audit Trails**.
      1. In the left-hand panel, select ![image](../../_assets/console-icons/route.svg) **Trails**.
      1. Select the trail you need.
      1. Go to the ![image](../../_assets/console-icons/display-pulse.svg) **Monitoring** panel for the selected trail.
      1. Click **Open in Monium** at the top right.

          This will take you to the Monitoring interface.

    {% endlist %}

1. Copy the chart:

    {% list tabs group=instructions %}

    - Monitoring UI {#console}

      1. Find the **Processed versus delivered events** chart.
      1. To the right of the chart name, click ![horizontal-ellipsis](../../_assets/horizontal-ellipsis.svg) → **Copy to another dashboard**.
      1. Specify a name, e.g., `Processed versus delivered events — <trail_name>`.
      1. Select a cloud and folder, then specify the dashboard [you created earlier](#create-dashboard).
      1. Click **Copy and edit**.

          This will open your dashboard with a new chart.

    {% endlist %}

If you need to, follow the same steps to add charts from other trails to your dashboard.

### Review the chart for missed event monitoring {#investigate-missing-events-chart}

Note that **Delivered events** may lag behind **Processed events**. Short-term lags are normal and are quickly compensated for. If you observe a persistent delivery lag of one hour or longer, check the trail status and [diagnostics logs](../concepts/diagnostics.md).
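
The difference between a short-term lag and a persistent one can be sketched in Python. The example assumes that each data point holds the number of events processed and delivered in one interval and tracks the undelivered backlog; the function name, the 15-minute step, and the sample counts are hypothetical.

```python
from datetime import datetime, timedelta


def longest_backlog(points):
    """Return the longest continuous period with undelivered events.

    points: list of (timestamp, processed, delivered), sorted by timestamp.
    """
    backlog = 0
    backlog_start = None
    longest = timedelta(0)
    for ts, processed, delivered in points:
        # Delivery above the processing rate reduces the backlog, never below zero.
        backlog = max(0, backlog + processed - delivered)
        if backlog > 0:
            if backlog_start is None:
                backlog_start = ts
            longest = max(longest, ts - backlog_start)
        else:
            backlog_start = None
    return longest


start = datetime(2024, 1, 1, 0, 0)
step = timedelta(minutes=15)
# A short dip that is compensated, followed by a sustained delivery shortfall.
counts = [(100, 100), (100, 60), (100, 140)] + [(100, 50)] * 5
points = [(start + i * step, p, d) for i, (p, d) in enumerate(counts)]

lag = longest_backlog(points)
is_persistent = lag >= timedelta(hours=1)
```

In this sample, the first dip is compensated within one interval, while the second shortfall lasts a full hour and would warrant checking the trail status and diagnostics logs.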

#### Get the trail status {#get-trail-status}

{% list tabs group=instructions %}

- Management console {#console}

  1. Open the [management console](https://console.yandex.cloud).
  1. [Go](../../console/operations/select-service.md#select-service) to **Audit Trails**.
  1. In the left-hand panel, select ![image](../../_assets/console-icons/route.svg) **Trails**.
  1. Select the trail. The **Trail** page will display detailed information about the trail.

{% endlist %}

The `Active` status during a persistent lag between **Delivered events** and **Processed events** means that the trail is operating normally and the delivery delays have another cause. In this case, check the destination object status and logs:

* [Getting bucket information and statistics](../../storage/operations/buckets/get-info.md)
* [Getting information about a log group](../../logging/operations/index.md#log-group-info)
* [Step-by-step guides for Data Streams](../../data-streams/operations/index.md)

The `Error` status indicates a trail performance error. In this case, review the trail diagnostics log.

#### Review the trail diagnostics log {#get-diagnostics-log}

{% list tabs group=instructions %}

- Management console {#console}

  1. Open the [management console](https://console.yandex.cloud).
  1. [Go](../../console/operations/select-service.md#select-service) to **Audit Trails**.
  1. In the left-hand panel, select ![image](../../_assets/console-icons/route.svg) **Trails**.
  1. Select the trail.
  1. Navigate to the ![image](../../_assets/console-icons/receipt.svg) **Diagnostic log** panel and review the log.
  1. To resolve the errors you find, refer to the [troubleshooting guide](../operations/error.md).

{% endlist %}

### Create a chart for the destination object {#create-destination-chart}

Apart from the chart for missed event monitoring, you can add a chart for the destination object:

* **Object Storage**

    Events can be missed if there is not enough space for storing logs, e.g., if a trail sends logs to a bucket of limited size. To monitor the available bucket space, create a chart for the `space_usage` [metric](../../monitoring/metrics-ref/storage-ref.md) and add the `max-size` metric as a threshold.

    The `max-size` metric is not available if the maximum bucket size is not specified. In this case, you need to track usage of the per-cloud storage space [quota](../../storage/concepts/limits.md#storage-quotas) yourself.

* **Cloud Logging**

    [Add](../../logging/tutorials/log-group-record-monitoring.md) a chart for the `group.service.ingested_records_per_second` [metric](../../monitoring/metrics-ref/logging-ref.md) to the dashboard to display the actual rate of log ingestion into the log group. Comparing this value with the `Maximum write speed` [quota](../../logging/concepts/limits.md#logging-quotas) helps determine whether the log stream is hitting its limit. The additional `group.service.ingest_requests_per_second` chart filtered by `ERROR` status enables detecting write errors promptly.


* **Data Streams**: See [these Yandex Managed Service for YDB guides](../../ydb/index.md):
  * [Yandex Monitoring metric reference](../../ydb/metrics.md)
  * [Quotas and limits in Managed Service for YDB](../../ydb/concepts/limits.md)

  To learn more about resolving data write issues, see [Troubleshooting slow writes in Yandex Data Streams](../../data-streams/tutorials/slow-writes-debug.md).

  {% note tip %}

  If events are written to Data Streams at more than 1 MB/s, enable compression. This reduces the volume of transmitted data, lowers the risk of [throttling](https://en.wikipedia.org/wiki/Dynamic_frequency_scaling) for individual YDS segments, and improves stream throughput.
  
  The compression setting is available when [creating](../operations/create-trail.md) or [modifying](../operations/manage-trail.md) a trail via the CLI, Terraform, or API. Available compression methods: `GZIP` ([GNU Zip](https://wikipedia.org/wiki/Gzip)) or `ZSTD` ([Zstandard](https://wikipedia.org/wiki/Zstandard)). By default, no compression is applied (`RAW`).
  
  To read data via the native YDS protocol, additionally enable compression on the YDS reader. The HTTP Kinesis and Apache Kafka® protocols are not supported yet.

  {% endnote %}
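
To get a rough idea of how much compression can reduce the volume of event data, you can run a short Python sketch with the standard `gzip` module. The events below are synthetic and their field names are hypothetical; real audit events and the compression performed by the trail itself may give a different ratio.

```python
import gzip
import json

# Synthetic, repetitive JSON events; field names are made up for illustration.
events = [
    {
        "event_id": f"evt-{i:05d}",
        "event_type": "hypothetical.sample.Event",
        "status": "DONE",
        "folder": "example-folder",
    }
    for i in range(1000)
]

raw = "\n".join(json.dumps(event) for event in events).encode("utf-8")
compressed = gzip.compress(raw)
ratio = len(compressed) / len(raw)

# A reader has to decompress the payload to get the original data back.
restored = gzip.decompress(compressed)

print(f"raw: {len(raw)} bytes, gzip: {len(compressed)} bytes, ratio: {ratio:.2f}")
```

Structured logs with repeating field names usually compress well, which is why compression helps high-volume streams stay under write limits.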
  



## How to delete the resources you created {#clear-out}

* [Delete the alerts](../../monitoring/operations/alert/delete-alert.md)
* [Delete the dashboard](../../monitoring/operations/dashboard/delete-dashboard.md)