====== SAP Cloud ALM Health Monitoring ======

The **SAP Cloud ALM Health Monitoring** monitor retrieves health metrics from your SAP Cloud ALM tenant and publishes them into the monitoring system. It can also evaluate thresholds and raise alarms based on a selected measure per row.

===== Prerequisites =====
Network connectivity from the collector to your SAP Cloud ALM tenant API endpoint (e.g. ''https://your-tenant.api.xxx.hana.ondemand.com'').

==== Cloud ALM Connector (required) ====
The monitor requires a **Web Service** connector with authentication type **CLOUD_ALM**. See [[..:sap_cloud_alm|SAP Cloud ALM Connector]].

This connector is based on a service key of the **SAP Cloud ALM API** service instance in the BTP subaccount that contains your Cloud ALM entitlement.

The following OAuth scopes **must** be included in the instance parameters (in the ''authorities'' list):

  * ''$XSMASTERAPPNAME.calm-api.hm.read''
  * ''$XSMASTERAPPNAME.calm-api.metrics.read''

=== Where to add the scopes ===

If you already created a service key for the ALM connector, you do **not** need to delete the instance. Update the existing instance instead:

  - In the BTP Cockpit go to your subaccount → **Services** → **Instances and Subscriptions**.
  - Locate your **SAP Cloud ALM API** instance.
  - Click the **⋯** (three dots) menu next to the instance.
  - Select **Update** to edit the instance.
  - Click **Next**.
  - Add the two scopes above to the ''authorities'' array in the parameters JSON and save.
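
The last step can be sketched in Python as a small helper that merges the two scopes into an existing parameters document. This is only an illustration: the ''xs-security'' wrapper key and the placeholder scope in the usage example are assumptions, so compare the result with the parameters JSON actually shown in your BTP Cockpit before saving.

```python
import copy
import json

# The two scopes required by the monitor (taken from the list above).
REQUIRED_SCOPES = [
    "$XSMASTERAPPNAME.calm-api.hm.read",
    "$XSMASTERAPPNAME.calm-api.metrics.read",
]


def add_required_scopes(parameters: dict) -> dict:
    """Return a copy of the instance parameters with the required scopes added.

    Assumption: the 'authorities' array lives under an 'xs-security' key.
    Scopes that are already present are kept and not duplicated.
    """
    updated = copy.deepcopy(parameters)
    security = updated.setdefault("xs-security", {})
    authorities = security.setdefault("authorities", [])
    for scope in REQUIRED_SCOPES:
        if scope not in authorities:
            authorities.append(scope)
    return updated


if __name__ == "__main__":
    # Hypothetical existing parameters; the first scope is a placeholder.
    existing = {
        "xs-security": {
            "authorities": ["$XSMASTERAPPNAME.placeholder-existing-scope"]
        }
    }
    print(json.dumps(add_required_scopes(existing), indent=2))
```

Running the helper twice yields the same document, so it is safe to apply to parameters that already contain one or both scopes.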

===== API Endpoints used =====
^ Endpoint ^ Purpose ^
| POST /oauth/token | Authentication (BTP UAA) |
| GET /api/calm-metrics/v1/metrics?provider=hm | Retrieve health monitoring metrics |
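
For a manual connectivity check outside the collector, the two calls can be sketched as follows. This is a minimal sketch, not the monitor's implementation: it assumes an OAuth client-credentials grant with HTTP Basic authentication against the UAA URL from the service key, and the function names, host names, and credentials are hypothetical. The functions only build the requests; sending them is left to ''urllib.request.urlopen''.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(uaa_url: str, client_id: str, client_secret: str):
    """Build the POST /oauth/token request (assumed client-credentials grant)."""
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode()
    credentials = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        uaa_url.rstrip("/") + "/oauth/token",
        data=body,
        headers={
            "Authorization": "Basic " + credentials,
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def build_metrics_request(api_url: str, access_token: str):
    """Build the GET request for health monitoring metrics (provider=hm)."""
    query = urllib.parse.urlencode({"provider": "hm"})
    return urllib.request.Request(
        api_url.rstrip("/") + "/api/calm-metrics/v1/metrics?" + query,
        headers={"Authorization": "Bearer " + access_token},
        method="GET",
    )


if __name__ == "__main__":
    # Hypothetical endpoint and token; nothing is sent over the network here.
    request = build_metrics_request("https://tenant.example.invalid", "TOKEN")
    print(request.get_method(), request.full_url)
```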

===== Key Features =====
==== Metrics Collection ====
The monitor can collect and publish the following measures for Cloud ALM health metrics:

  * **usage** – Percentage utilization
  * **value** – Current measured value
  * **limit** – Configured or maximum limit
  * **okStatus** – OK status indicator
  * **warningStatus** – Warning status indicator
  * **criticalStatus** – Critical status indicator

Only **usage** is published with the **PERCENT** unit. The other measures are published without a unit.

==== Alarm Capabilities ====
  * Per-row thresholds using the standard ''G2W:80 W2M:90'' syntax (see [[..:commonsettings#multi_thresholds_syntax|Multi Threshold Syntax]])
  * Choose what to alarm on: **NONE**, **USAGE**, **VALUE**, **OK_STATUS**, **WARNING_STATUS**, or **CRITICAL_STATUS**
  * Alarm evaluation is performed only on the selected measure for each row
  * **Attributes Filter** can further restrict which datapoints are evaluated
  * **Exclusive** controls whether a datapoint matched by a row is consumed by that row, so that later rows do not evaluate it again
  * Wildcard support (''*'') for **Metric** and **Service name**
  * Optional alarm tag for grouping or routing
  * Automatic clear is supported by the monitoring framework when the alarm condition no longer matches
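
To show the shape of a threshold string such as ''G2W:80 W2M:90'', the sketch below splits it into name/value pairs. It covers only the form shown on this page; the complete syntax is defined on the linked Multi Threshold Syntax page, and the function name ''parse_thresholds'' is hypothetical.

```python
def parse_thresholds(spec: str) -> dict:
    """Split a string like 'G2W:80 W2M:90' into {'G2W': 80.0, 'W2M': 90.0}.

    Assumption: tokens are whitespace-separated NAME:NUMBER pairs.
    Raises ValueError for tokens that do not have that shape.
    """
    thresholds = {}
    for token in spec.split():
        name, separator, raw_value = token.partition(":")
        if not separator or not name or not raw_value:
            raise ValueError(f"malformed threshold token: {token!r}")
        thresholds[name] = float(raw_value)  # float() rejects non-numeric text
    return thresholds


if __name__ == "__main__":
    print(parse_thresholds("G2W:80 W2M:90"))
```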

==== Metric Publishing Behavior ====
The monitor supports both global and row-level metric publishing:

  * **Send all metrics = true**
    * publishes all supported datapoints for all returned metrics
    * metric names keep the original Cloud ALM measure in parentheses
    * datapoints are published unless they belong to a completely empty value group

  * **Send all metrics = false**
    * publishes only the measure selected in **Alarm on**
    * only rows with **Send metric** enabled will publish performance data

When **Send all metrics** is enabled, the monitor suppresses some datapoint groups that contain only zero values for the same metric/tag combination. This avoids storing empty technical noise that does not add useful monitoring information.

The suppressed groups are:

  * **usage / limit** when both values are ''0''
  * **okStatus / warningStatus / criticalStatus** when all three values are ''0''
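
The suppression rule can be sketched as a filter over the measures of one metric/tag combination. This is an illustration under stated assumptions, not the monitor's code: the function name is hypothetical, and the sketch assumes a group is dropped only when all of its members are present and equal to zero.

```python
USAGE_GROUP = ("usage", "limit")
STATUS_GROUP = ("okStatus", "warningStatus", "criticalStatus")


def measures_to_publish(measures: dict) -> dict:
    """Drop the usage/limit pair or the status triplet when entirely zero.

    'measures' maps measure names to values for one metric/tag combination.
    Measures outside the two groups (for example 'value') are never dropped.
    """
    suppressed = set()
    for group in (USAGE_GROUP, STATUS_GROUP):
        all_present = all(name in measures for name in group)
        if all_present and all(measures[name] == 0 for name in group):
            suppressed.update(group)
    return {name: v for name, v in measures.items() if name not in suppressed}


if __name__ == "__main__":
    sample = {"usage": 0, "limit": 0, "value": 5,
              "okStatus": 1, "warningStatus": 0, "criticalStatus": 0}
    # usage/limit are both zero and get dropped; the status triplet is kept
    # because okStatus is non-zero.
    print(measures_to_publish(sample))
```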

When row-level publishing is used, the monitor publishes the measure selected in **Alarm on** for matching rows.

==== Advanced Discovery ====
  * **Load ALM Health Metrics** button: Automatically discovers metric names and service names currently reported by Cloud ALM and populates the configuration table
  * The loaded metrics also expose the available datapoint attributes, so you can use them to build an **Attributes Filter** for each row

===== Configuration =====

==== Monitor Configuration ====
=== Method 1: Load ALM Health Metrics (Recommended) ===
  - Open the monitor configuration.
  - Click the **Load ALM Health Metrics** button.
  - The table will be populated with **METRIC** and **SERVICE_NAME** from your live Cloud ALM tenant.
  - Activate the rows you want, define thresholds, choose **Alarm on**, and save.

=== Method 2: Wildcard / Manual Mode ===
  * Set **Metric** = ''*'' and/or **Service name** = ''*'' to monitor everything or a broad subset
  * If **Send all metrics** is enabled, all collected datapoints are published
  * If **Send all metrics** is disabled, only rows with **Send metric** enabled will publish performance data

=== Settings Reference ===
^ Field ^ Description ^ Default ^
| Active | Enable/disable this configuration row | ''true'' |
| Service name | Filter by service name (''*'' = any) | ''*'' |
| Metric | Filter by metric name (''*'' = any) | ''*'' |
| Attributes Filter | Optional filter applied to datapoint attributes before alarm evaluation / row matching | (empty) |
| Alarm on | Which measure to evaluate for alarms (NONE / USAGE / VALUE / OK_STATUS / WARNING_STATUS / CRITICAL_STATUS) | ''USAGE'' |
| Thresholds | Multi-level thresholds (G2W / W2M / etc.) | ''G2W:80 W2M:90'' |
| Alarm tag | Optional tag added to the alarm message | (empty) |
| Exclusive | If enabled, matching datapoints of this row can be consumed and not reused by later rows | ''true'' |
| Alarm | Enable/disable alarm creation for this row | ''true'' |
| Send metric | Publish the selected measure for this row when **Send all metrics** is disabled | ''true'' |
| Send all metrics | Publish all returned metrics and measures globally | ''true'' |

==== Attributes Filter ====

The **Attributes Filter** is an optional rule used to narrow down which datapoints a row can match.

When you use **Load ALM Health Metrics**, the discovered metrics also include the available datapoint attributes. This helps you see which attribute names are available for filtering.

The filter checks the datapoint's attributes before the row is used for:

  * **Alarm evaluation**
  * **Metric publishing** when the row is processed in row-based mode

If the filter is empty or malformed, the row is not restricted by attributes.

===== Collected Metrics =====
All metrics are stored with the base key:

''promonitor.cloud_alm.hm.*''

Supported metric keys include:

^ Metric Key ^ Unit ^ Description ^ Tags ^
| promonitor.cloud_alm.hm.''<metric>''.usage | % | Current utilization percentage | service.name, sap.service.name, service.namespace, service.instance.id, sap.service.display_name, plus datapoint-specific tags |
| promonitor.cloud_alm.hm.''<metric>''.value | - | Current measured value | same as above |
| promonitor.cloud_alm.hm.''<metric>''.limit | - | Configured limit | same as above |
| promonitor.cloud_alm.hm.''<metric>''.okstatus | - | OK status datapoint | same as above |
| promonitor.cloud_alm.hm.''<metric>''.warningstatus | - | Warning status datapoint | same as above |
| promonitor.cloud_alm.hm.''<metric>''.criticalstatus | - | Critical status datapoint | same as above |

The displayed metric name in the UI is:

''<metric> (usage)'', ''<metric> (value)'', ''<metric> (limit)'', ''<metric> (okStatus)'', ''<metric> (warningStatus)'', or ''<metric> (criticalStatus)''
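
The naming scheme above can be reproduced with two small helpers, which is handy when building queries or dashboards. The helper names and the example metric name are hypothetical; the key layout and the lower-cased measure suffix follow the table above.

```python
BASE_KEY = "promonitor.cloud_alm.hm"
MEASURES = ("usage", "value", "limit",
            "okStatus", "warningStatus", "criticalStatus")


def metric_key(metric: str, measure: str) -> str:
    """Build the stored key; the measure suffix is lower-cased as in the table."""
    if measure not in MEASURES:
        raise ValueError(f"unknown measure: {measure!r}")
    return f"{BASE_KEY}.{metric}.{measure.lower()}"


def display_name(metric: str, measure: str) -> str:
    """Build the UI name, which keeps the original measure spelling."""
    if measure not in MEASURES:
        raise ValueError(f"unknown measure: {measure!r}")
    return f"{metric} ({measure})"


if __name__ == "__main__":
    # 'example_metric' is a placeholder, not a real Cloud ALM metric name.
    print(metric_key("example_metric", "okStatus"))
    print(display_name("example_metric", "okStatus"))
```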

===== Alarm Evaluation Notes =====
  * Alarm evaluation is performed only on the measure selected in **Alarm on**
  * If **Alarm on = NONE**, no alarm is generated for that row
  * The monitor checks datapoints with the selected measure name and evaluates the configured thresholds
  * If **Exclusive** is enabled, a matching datapoint can be consumed so that later rows do not evaluate it again
  * **Attributes Filter** can restrict evaluation to datapoints with matching attributes
  * A row raises at most **one alarm per matching datapoint**; this also holds when several datapoints match the same row
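
The interaction between wildcards, **Alarm on**, and **Exclusive** can be sketched as follows. This is a simplified model under several assumptions: rows are processed top to bottom, ''*'' behaves like a shell-style wildcard (''fnmatchcase'' also treats ''?'' and ''[...]'' as special), and the **Attributes Filter** and threshold evaluation are left out. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class Row:
    service_name: str = "*"
    metric: str = "*"
    alarm_on: str = "USAGE"
    exclusive: bool = True
    active: bool = True


@dataclass(frozen=True)
class Datapoint:
    service_name: str
    metric: str
    measure: str  # e.g. "USAGE", "VALUE", "OK_STATUS"
    value: float


def assign_datapoints(rows, datapoints):
    """Return (row_index, datapoint) pairs that would be evaluated."""
    consumed = set()  # indexes of datapoints taken by an exclusive row
    pairs = []
    for row_index, row in enumerate(rows):
        if not row.active or row.alarm_on == "NONE":
            continue
        for dp_index, dp in enumerate(datapoints):
            if dp_index in consumed or dp.measure != row.alarm_on:
                continue
            if not fnmatchcase(dp.service_name, row.service_name):
                continue
            if not fnmatchcase(dp.metric, row.metric):
                continue
            pairs.append((row_index, dp))
            if row.exclusive:
                consumed.add(dp_index)
    return pairs


if __name__ == "__main__":
    rows = [Row(service_name="svc-a"), Row()]  # specific row first, then catch-all
    datapoints = [Datapoint("svc-a", "m1", "USAGE", 95.0),
                  Datapoint("svc-b", "m1", "USAGE", 10.0)]
    for row_index, dp in assign_datapoints(rows, datapoints):
        print(row_index, dp.service_name, dp.value)
```

In this model, placing specific rows above a catch-all ''*'' row and keeping **Exclusive** enabled prevents the catch-all row from evaluating the same datapoint a second time.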

===== Troubleshooting =====
  * **"No metrics returned"** or empty table after Load: Check that the connector uses **CLOUD_ALM** authentication and that the service key from the BTP **SAP Cloud ALM API** instance is valid. Also verify the tenant has Health Monitoring data.
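  * **HTTP 401/403**: Token or authorization issue. Regenerate the service key in the BTP instance and confirm that both required scopes are in the ''authorities'' list.
  * **Metrics show 0 but data exists in Cloud ALM**: Some metrics may legitimately report 0 for one or more measures. When **Send all metrics** is enabled, fully-zero groups (**usage / limit**, or the three status measures) are also suppressed from publishing.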
  * **Alarms not triggering**:
    * Verify the row is **Active**
    * Verify **Alarm** is enabled
    * Verify **Alarm on** matches the intended measure
    * Verify **Metric** and **Service name** match the incoming Cloud ALM metric
    * Verify **Attributes Filter** is correct, if used
    * Verify threshold syntax is correct ([[products:promonitor:latest:monitorsguide:commonsettings|Multi thresholds syntax]])
  * **Expected status alarm not raised**: Some Cloud ALM metrics expose multiple datapoints for the same measure. Alarms are raised only for datapoints that carry the measure selected in **Alarm on** and that pass the row's **Service name**, **Metric**, and **Attributes Filter** checks.
  * **No metrics stored when data exists**: If **Send all metrics** is disabled, make sure **Send metric** is enabled on the relevant row.
  * **Stale data after adding new services in Cloud ALM**: Click **Load ALM Health Metrics** again or wait for the next scheduled run.