ALM Exception Monitoring

Retrieves exception count metrics from SAP Cloud ALM Exception Monitoring (EXM), publishes them as time-series metrics, and evaluates alarms for each configured row.
  • API: GET /api/calm-metrics/v1/metrics?provider=exm

Prerequisites

Cloud ALM Connector (required)

Requires a Web Service connector with authentication type CLOUD_ALM.

SAP Cloud ALM Connector

Based on a service key from the SAP Cloud ALM API service instance in the BTP subaccount.

Required OAuth scopes in the authorities list of the instance parameters:

  • $XSMASTERAPPNAME.calm-api.exm.read
  • $XSMASTERAPPNAME.calm-api.metrics.read

API Endpoints

Endpoint | Purpose
POST /oauth/token | Authentication (BTP UAA)
POST /api/calm-analytics/v1/analytics/providers/filters | Fetch EXM service IDs (data collection)
GET /api/calm-metrics/v1/metrics?provider=exm | Retrieve EXM metrics

Key Features

  • Publishes six metric types per EXM datapoint: exm.counter, exm.counter_available, exm.counter_disruption, exm.counter_degradation, exm.counter_maintenance, exm.counter_unknown
  • Time window is a 5-minute aligned UTC window with a 1-minute delay: to = floor(now - 1min, 5min), from = to - 5min (format yyyyMMddHHmmss UTC)
  • Paginated fetching of up to 5000 records per page, using the x-total-count response header
  • All six measures are converted from gauge to count (non-monotonic delta sum) before storage
  • Per-row thresholds in the standard G2W:80 W2M:90 syntax (see Multi Threshold Syntax)
  • Attributes Filter narrows which datapoints a row evaluates
  • Exclusive controls whether a matched datapoint is consumed or passed to later rows
  • Glob support for Metric and Service name fields (* = all)
  • Optional alarm tag for grouping or routing
  • Auto-clear when alarm condition no longer matches
  • Load Services button: auto-discovers EXM service IDs and names from the live tenant
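
The paginated fetch listed above can be sketched roughly as follows. This is only an illustration: fetch_page is a hypothetical callable standing in for the HTTP request, and the way the real API expresses offset and page size is not described in this guide. The only facts taken from the guide are the 5000-record page size and the use of the x-total-count header as the total.

```python
from typing import Callable, List, Tuple

PAGE_SIZE = 5000  # page size stated in this guide


def fetch_all(fetch_page: Callable[[int, int], Tuple[List[dict], int]]) -> List[dict]:
    """Collect every record across pages.

    fetch_page(offset, limit) is a hypothetical helper that returns
    (records, total), where total is the value of the x-total-count header.
    """
    records: List[dict] = []
    offset = 0
    while True:
        page, total = fetch_page(offset, PAGE_SIZE)
        records.extend(page)
        offset += len(page)
        # Stop once everything is read, or if the server returns an empty page.
        if not page or offset >= total:
            return records
```

Stopping on an empty page as well as on the total guards against looping forever if the reported total is larger than what the server actually returns.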

Data Collection

Data collection populates Service name and Service ID from the Cloud ALM EXM service registry.

Always run data collection before adding rows. A service name not returned by data collection may not match what the metrics API returns.

Data collection runs one call:

  1. POST /api/calm-analytics/v1/analytics/providers/filters with body {"providerName":"EXM_DATAPROVIDER","providerVersion":"v1"}. The response is the EXM service filter list. The monitor finds the entry with key = serviceId and extracts each service UUID and its label (e.g. S4H.100).

Click Load Services to run data collection and populate the surveillance table.

Configuration

Method 1: Load Services

  1. Open monitor configuration
  2. Click Load Services
  3. Table populates with Service name and Service ID from the live tenant
  4. Enable rows, set thresholds, save

Method 2: Manual / Wildcard

  1. Set Metric = * and/or Service name = * to match all
  2. Use only service names returned by data collection

Settings Reference

Field | Type | Default | Description
Active | Boolean | true | Enable or disable this row
Service name | String | * | Glob matched against EXM service label, e.g. S4H.100. * = all. Populated by data collection
Service ID | String | (empty) | UUID of service. Auto-populated by data collection
Metric | String | * | Metric type to match. Exact values or *. See Metric types
Attributes Filter | String | (empty) | Narrows datapoints. Format: key:value,key2:value2. Empty = no restriction. See Attributes Filter
Thresholds | String | G2W:80 W2M:90 | Alarm thresholds. See Multi Threshold Syntax
Alarm tag | String | (empty) | Optional tag appended to alarm message
Exclusive | Boolean | true | If true, the datapoint is consumed by this row and not re-evaluated by later rows
Alarm | Boolean | true | Enable alarm evaluation for this row
Metric | Boolean | true | Publish metric datapoints for this row

Metric types

Value | Description
exm.counter | Total exception count
exm.counter_available | Exceptions while system status was Available
exm.counter_disruption | Exceptions during Disruption
exm.counter_degradation | Exceptions during Degradation
exm.counter_maintenance | Exceptions during Maintenance
exm.counter_unknown | Exceptions with unknown status
* | All of the above
Note: metric names in the Metric field use the snake_case form shown above (e.g. exm.counter_disruption, not exm.counterDisruption).

Attributes Filter

Narrows which datapoints a row matches. All clauses must match (AND). Matching is case insensitive. Malformed clauses are silently ignored.

Attribute | Description | Example
categoryName | Exception category name | ABAP_SHORT_DUMP
serviceType | Service type identifier | S4
useCase | Use case identifier | EXM

Example filter value:

categoryName:ABAP_SHORT_DUMP,serviceType:S4
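
To make the matching rules concrete, here is a minimal sketch of how such a filter could be parsed and applied. It is not the monitor's actual implementation. The function names are hypothetical, and two details are assumptions: that both keys and values are compared case-insensitively, and that a clause with a missing colon, key, or value counts as malformed.

```python
from typing import Dict


def parse_filter(text: str) -> Dict[str, str]:
    """Parse 'key:value,key2:value2' into a dict, skipping malformed clauses."""
    clauses: Dict[str, str] = {}
    for raw in text.split(","):
        key, sep, value = raw.partition(":")
        key, value = key.strip(), value.strip()
        if not sep or not key or not value:
            continue  # assumption: such clauses are the "malformed" ones that are ignored
        clauses[key.lower()] = value.lower()
    return clauses


def filter_matches(text: str, attributes: Dict[str, str]) -> bool:
    """Return True when every clause matches the datapoint attributes (AND)."""
    wanted = parse_filter(text)
    have = {k.lower(): str(v).lower() for k, v in attributes.items()}
    # An empty filter yields no clauses, so all() is True: no restriction.
    return all(have.get(k) == v for k, v in wanted.items())
```

For example, filter_matches("categoryName:ABAP_SHORT_DUMP,serviceType:S4", {"categoryName": "abap_short_dump", "serviceType": "S4"}) returns True under these assumptions.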

Filter Evaluation Order

  1. No rows configured: nothing is published. Add at least one active row to collect data.
  2. Service name: glob matched against EXM service label. S4H* matches S4H.100. * matches all.
  3. Metric: exact match or *. Partial globs do NOT work.
  4. Attributes Filter: all clauses must match. Empty = match all datapoints.
  5. Exclusive = true: datapoint consumed by first matching row. Later rows skip it.
  6. Metric = false: row evaluates alarms but publishes no metric datapoints.
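
The order above can be illustrated with a short sketch of row selection for a single datapoint. It is a simplification under stated assumptions: Row and matching_rows are hypothetical names, the Attributes Filter step is left out for brevity, glob matching is assumed to be case-sensitive, and Python's fnmatchcase is used as a stand-in for the monitor's glob matcher (it also interprets ? and [], which this guide does not mention).

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List


@dataclass
class Row:
    service: str = "*"      # glob against the EXM service label
    metric: str = "*"       # exact metric name or "*"
    exclusive: bool = True
    active: bool = True


def matching_rows(rows: List[Row], service: str, metric: str) -> List[int]:
    """Return the indexes of the rows that evaluate this datapoint, in order."""
    hits: List[int] = []
    for idx, row in enumerate(rows):
        if not row.active:
            continue
        if not fnmatchcase(service, row.service):
            continue
        # Metric is an exact match or "*"; partial globs never match.
        if row.metric != "*" and row.metric != metric:
            continue
        hits.append(idx)
        if row.exclusive:
            break  # datapoint consumed; later rows skip it
    return hits
```

With a first row for S4H.100 and a second row for *, a datapoint for S4H.100 yields [0] and a datapoint for any other service yields [1], which is why specific rows belong before general ones.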

Collected Metrics

Base key: promonitor.cloud_alm.exm.*

Metric key | Unit | Description
promonitor.cloud_alm.exm.exm.counter | count | Total exception count
promonitor.cloud_alm.exm.exm.counter_available | count | Exceptions during Available status
promonitor.cloud_alm.exm.exm.counter_disruption | count | Exceptions during Disruption
promonitor.cloud_alm.exm.exm.counter_degradation | count | Exceptions during Degradation
promonitor.cloud_alm.exm.exm.counter_maintenance | count | Exceptions during Maintenance
promonitor.cloud_alm.exm.exm.counter_unknown | count | Exceptions with unknown status

Tags published with each datapoint: service.name, sap.service.name, service.namespace, service.instance.id, sap.service.display_name, plus the datapoint-level tags serviceId, serviceName, categoryName, serviceType, useCase.

All metrics are stored as non-monotonic delta counts (converted from gauge before storage).

Alarm Evaluation

Alarms use a suppression key to deduplicate. Format:

  • With category: {monitorId}_{connectorId}_alm_exm_{service}_{metric}_{categoryName}_{rowIdx}
  • Without category: {monitorId}_{connectorId}_alm_exm_{service}_{metric}_{rowIdx}

One alarm per unique key. A new value overwrites the previous alarm state for the same key.
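
As an illustration of the two key formats, the following sketch assembles a suppression key from its parts. The function name and argument names are hypothetical. Whether {service} holds the service name or the service ID is not stated here, so the example simply passes a label.

```python
from typing import Optional


def suppression_key(monitor_id, connector_id, service: str, metric: str,
                    row_idx: int, category_name: Optional[str] = None) -> str:
    """Build the deduplication key in the format documented above."""
    parts = [monitor_id, connector_id, "alm_exm", service, metric]
    if category_name:
        parts.append(category_name)  # only present in the "with category" form
    parts.append(row_idx)
    return "_".join(str(p) for p in parts)
```

Because the row index is part of the key, the same datapoint evaluated by two non-exclusive rows produces two distinct alarms rather than overwriting one.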

Time Window

The time window is computed at collection time using 5-minute alignment with a 1-minute delay:

adjustedNow = UTC.now() - 1 minute
minuteFloor = floor(adjustedNow.minute / 5) * 5
to   = adjustedNow with minute=minuteFloor, second=0, nanosecond=0
from = to - 5 minutes

Example at 14:37 UTC: from=20260601143000 to=20260601143500

This ensures the query always covers a complete 5-minute window that the EXM backend has already finalized.
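
The calculation can be reproduced with a few lines of Python, which is handy for checking which window a given collection time maps to. This is a sketch, not the monitor's own code. The function name exm_window is hypothetical, and the input is assumed to be a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y%m%d%H%M%S"  # yyyyMMddHHmmss


def exm_window(now: datetime):
    """Return (from, to) as UTC strings for the last complete 5-minute slot."""
    # Assumes a timezone-aware datetime; a naive one would be read as local time.
    adjusted = now.astimezone(timezone.utc) - timedelta(minutes=1)
    # Floor the minute to a multiple of 5 and drop seconds and sub-seconds.
    to = adjusted.replace(minute=adjusted.minute - adjusted.minute % 5,
                          second=0, microsecond=0)
    start = to - timedelta(minutes=5)
    return start.strftime(FMT), to.strftime(FMT)


print(exm_window(datetime(2026, 6, 1, 14, 37, tzinfo=timezone.utc)))
# ('20260601143000', '20260601143500')
```

Subtracting the one-minute delay before flooring matters at slot boundaries: a run at 14:00:30 UTC queries 13:50 to 13:55, not 13:55 to 14:00.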

Examples

1. Publish all metrics for all services

Set Service name = * and Metric = *. An empty table sends nothing.

2. Alarm on total exceptions above threshold

Active | Service name | Metric | Attributes Filter | Thresholds | Exclusive | Alarm | Metric
true | * | exm.counter | (empty) | G2W:1 W2M:10 | true | true | true

Alarms as soon as 1 exception appears. G2W:1 means any non-zero value triggers warning.

3. Alarm on short dumps for one service

Active | Service name | Metric | Attributes Filter | Thresholds | Exclusive | Alarm | Metric
true | S4H.100 | exm.counter | categoryName:ABAP_SHORT_DUMP | G2W:1 W2M:5 | true | true | true

Only datapoints for S4H.100 and category ABAP_SHORT_DUMP are evaluated.

4. Alarm on disruption exceptions only

Active | Service name | Metric | Attributes Filter | Thresholds | Exclusive | Alarm | Metric
true | * | exm.counter_disruption | (empty) | G2W:1 W2M:5 | true | true | true

Alarms when any exception occurs during a Disruption period.

5. Alarm without publishing metrics

Active | Service name | Metric | Attributes Filter | Thresholds | Alarm | Metric
true | * | exm.counter | (empty) | G2W:1 W2M:10 | true | false

With Metric = false, alarms still fire but no datapoints are written to the time-series store.

6. Monitor disruption and degradation with different tags

Active | Service name | Metric | Attributes Filter | Thresholds | Alarm tag | Exclusive | Alarm | Metric
true | * | exm.counter_disruption | (empty) | G2W:1 W2M:5 | EXM_DISRUPT | true | true | true
true | * | exm.counter_degradation | (empty) | G2W:1 W2M:5 | EXM_DEGRADE | true | true | true

Each metric type gets its own alarm tag. Exclusive rows prevent cross-matching.

7. Two services with different thresholds

Active | Service name | Metric | Attributes Filter | Thresholds | Exclusive | Alarm | Metric
true | S4H.100 | exm.counter | (empty) | G2W:1 W2M:5 | true | true | true
true | * | exm.counter | (empty) | G2W:5 W2M:20 | true | true | true

Row 1 consumes S4H.100 datapoints (Exclusive = true). Row 2 applies a looser threshold to all other services.

8. Suppress a category from alarming

Active | Service name | Metric | Attributes Filter | Thresholds | Exclusive | Alarm | Metric
true | * | exm.counter | categoryName:INFORMATIONAL | G2W:80 W2M:90 | true | false | false
true | * | exm.counter | (empty) | G2W:1 W2M:10 | true | true | true
true * exm.counter (empty) G2W:1 W2M:10 true true true

Row 1 consumes informational exceptions (Exclusive = true) without alarming or publishing metrics for them (Alarm = false, Metric = false). Row 2 alarms on all others.

Troubleshooting

Symptom | Check
Empty table after Load Services | Connector uses CLOUD_ALM auth. Service key is valid. Tenant has active EXM data.
HTTP 401 or 403 | Regenerate the service key in the BTP instance. Verify that the OAuth scopes include calm-api.exm.read and calm-api.metrics.read.
Metrics show 0 but Cloud ALM has data | Some measures legitimately report 0 when no exceptions occurred in the 5-minute window.
Alarms not triggering | Row is Active. Alarm is enabled. Metric and Service name match incoming data. Threshold syntax is correct.
Duplicate alarms for same category | Add an Attributes Filter with categoryName: to target a specific category per row.
No metrics stored when data exists | Enable Metric on the relevant row.
Stale data after adding new services | Click Load Services again or wait for the next scheduled run.

Limitations

  • Metric field does not support partial glob patterns. *disruption does NOT match exm.counter_disruption. Use exact values or *.
  • Service ID is auto-populated by data collection. Manual entry is possible if the UUID is known.
  • Malformed Attributes Filter clauses are silently ignored. Validate format before saving.
  • The time window is always the last complete 5-minute slot that ended at least 1 minute before collection time. Data newer than the end of that slot is not included in the run.
  • Exclusive = true means first matching row wins. Order rows from most specific to most general.
  • Thresholds apply to the raw numeric count value of the selected metric.
products/promonitor/latest/monitorsguide/sap_cloud_alm/exception_monitoring_monitor.txt · Last modified: by luis