====== Monitoring ======
The endpoints described below return information about the monitored systems, as well as the generated metrics and alarms.

**Note:**
These endpoints are only available if the ''Generic event server'' plugin is active.

===== How to use the API =====

==== API call ====
Once the plugin is configured and active, alarms, metrics and the monitored infrastructure are available through the API.

  1. Start by discovering the monitored landscapes.
  2. Poll the alarm and metric queues regularly. We recommend polling them every minute.
  3. Refresh the landscape metadata once per hour.
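
The schedule above can be sketched as a small loop. This is only an illustration: ''run_cycle'', ''run_forever'' and the two callbacks are hypothetical names, and the actual HTTP calls are left to the caller.

```python
import time
from typing import Callable, Optional

POLL_INTERVAL_S = 60        # recommended poll period for the queues
LANDSCAPE_REFRESH_S = 3600  # recommended refresh period for the landscape


def run_cycle(now: float,
              last_refresh: Optional[float],
              refresh_landscape: Callable[[], None],
              poll_queues: Callable[[], None]) -> float:
    """Run one cycle and return the time of the last landscape refresh."""
    # Discover the landscape first, then refresh it at most once per hour.
    if last_refresh is None or now - last_refresh >= LANDSCAPE_REFRESH_S:
        refresh_landscape()
        last_refresh = now
    # The alarm and metric queues are polled on every cycle.
    poll_queues()
    return last_refresh


def run_forever(refresh_landscape: Callable[[], None],
                poll_queues: Callable[[], None]) -> None:
    """Repeat the cycle every minute (hypothetical driver, never returns)."""
    last_refresh = None
    while True:
        last_refresh = run_cycle(time.monotonic(), last_refresh,
                                 refresh_landscape, poll_queues)
        time.sleep(POLL_INTERVAL_S)
```

Keeping the time as a parameter of ''run_cycle'' makes the schedule easy to test without waiting.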
  
==== Alarms and metrics correlation ====
  * Generated alarms and metrics will correlate with discovered groups, systems and instances.
  * For each alarm and metric, three parameters can be used to correlate it to a component of the landscape:
    * The ''groupUUID'' parameter will match the UUID of a discovered group.
    * The ''sid'' parameter will match the ''sid'' of a system.
    * The ''instance'' parameter will match the ''name'' of an instance.
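
As an illustration of this correlation, the sketch below looks up the group, system and instance for one alarm or metric. It assumes the landscape response has already been decoded into a list of group dictionaries; the function name ''correlate'' is hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple

Component = Optional[Dict[str, Any]]


def correlate(landscape: List[Dict[str, Any]],
              event: Dict[str, Any]) -> Tuple[Component, Component, Component]:
    """Return the (group, system, instance) matching an alarm or a metric.

    Any element that cannot be matched is returned as None.
    """
    # groupUUID matches the uuid of a discovered group.
    group = next((g for g in landscape
                  if g.get("uuid") == event.get("groupUUID")), None)
    if group is None:
        return None, None, None

    # sid matches the sid of a system inside that group.
    system = next((s for s in group.get("systems", [])
                   if s.get("sid") == event.get("sid")), None)
    if system is None:
        return group, None, None

    # instance is not always set, so it is only matched when present.
    name = event.get("instance")
    if name is None:
        return group, system, None
    instance = next((i for i in system.get("instances", [])
                     if i.get("name") == name), None)
    return group, system, instance
```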

====== Landscape ======
This endpoint returns information about monitored systems, such as their properties, instances and types.

^ Landscape ^^^
| Description| Retrieve information about monitored systems ||
| Method | **GET** ||
| URL| /api/v1/monitoring/landscapes ||
| **Parameters** |||
| None|||
| **Example** |||
| **GET** ///api/v1/monitoring/landscapes// |||

===== Usage =====
  * The monitored landscape architecture is not expected to change often.
  * It is therefore usually not necessary to poll it more than **once every hour**.

===== Response =====
  * The service returns a JSON table of monitored **groups**, **systems** and **instances**
  * Groups will contain systems, which will contain instances

__Group structure format:__
^ Parameter ^ Description ^ Type ^
| name | name of the group | String |
| uuid | unique identifier of the group | String |
| systems | the systems belonging to the group | Table |

__System structure format:__
^ Parameter ^ Description ^ Type ^
| sid | SID of the system as defined in the configuration | String |
| realSid | SID of the system as discovered | String |
| type | Type of the system | String |
| uuid | unique identifier of the system | String |
| description | description of the system | String |
| instances | the instances belonging to the system| Table |
| properties | properties of the system depending on the context | Table |

__Instance structure format:__
^ Parameter ^ Description ^ Type ^
| name | name of the instance | String |
| type | Type of the instance | String |
| host | hostname | String |
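
The nesting of groups, systems and instances can be walked as shown in the sketch below. The sample values (''Production'', ''PRD'' and so on) are invented for illustration, and the response is assumed to decode into a list of groups; only the field names come from the tables above.

```python
from typing import Any, Dict, Iterator, List, Tuple

# Invented sample, shaped after the group, system and instance tables.
SAMPLE_LANDSCAPE: List[Dict[str, Any]] = [
    {
        "name": "Production",
        "uuid": "group-uuid-1",
        "systems": [
            {
                "sid": "PRD",
                "realSid": "PRD",
                "type": "example-type",
                "uuid": "system-uuid-1",
                "description": "Example system",
                "properties": {},
                "instances": [
                    {"name": "inst_00", "type": "example", "host": "host-a"},
                    {"name": "inst_01", "type": "example", "host": "host-b"},
                ],
            }
        ],
    }
]


def iter_instances(
        landscape: List[Dict[str, Any]]) -> Iterator[Tuple[str, str, str, str]]:
    """Yield (group name, sid, instance name, host) for every instance."""
    for group in landscape:
        for system in group.get("systems", []):
            for instance in system.get("instances", []):
                yield (group["name"], system["sid"],
                       instance["name"], instance["host"])


rows = list(iter_instances(SAMPLE_LANDSCAPE))
```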

====== Alarms ======
===== Alarms queue size =====
^ Get number of alarms to collect ^^^
| Description| Returns the size of the alarm queue ||
| Method | **GET** ||
| URL| /api/v1/monitoring/alarms/size||
| **Response** | An integer value ||
| **Example** |||
| **GET** ///api/v1/monitoring/alarms/size// |||

===== Alarms queue =====
This endpoint returns a chunk of generated alarms.
Calling it with the **POST** method removes the returned alarms from the queue; **this is the recommended way.**\\
Calling it with the **GET** method only lists the alarms and leaves them in the queue. **This is intended for testing only, as alarms may fill up the queue rather quickly.**

^ Collect alarms ^^^
| Description| Collects the alarms and removes them from the queue ||
| Method | **POST** ||
| URL| /api/v1/monitoring/alarms||
| **Parameters** |||
| maxchunksize | The max number of alarms to return (default 100) | optional |
| **Example** |||
| **POST** ///api/v1/monitoring/alarms?maxchunksize=1000// |||
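
A minimal sketch of how such a call could be prepared with the Python standard library. The base URL is a placeholder, the function name is hypothetical, and authentication is not covered here, so any required headers must be added separately.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://monitoring.example.com"  # placeholder, not a real server


def build_alarm_collect_request(base_url: str,
                                maxchunksize: Optional[int] = None) -> Request:
    """Build (without sending) the POST request that collects alarms."""
    url = base_url.rstrip("/") + "/api/v1/monitoring/alarms"
    # maxchunksize is optional; the server default applies when it is omitted.
    if maxchunksize is not None:
        url += "?" + urlencode({"maxchunksize": maxchunksize})
    return Request(url, method="POST")


request = build_alarm_collect_request(BASE_URL, maxchunksize=1000)
```

The request can then be sent with ''urllib.request.urlopen'' and the body decoded with ''json.loads''.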

=== Usage ===
  * We advise polling the queue often, with small chunks.
  * We recommend a poll period of **60 sec**.
  * If the number of returned alarms equals the maximum chunk size, more alarms may be waiting in the queue and another call can be triggered immediately.
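
The last point can be sketched as a loop that keeps collecting until a chunk comes back smaller than the requested size. The ''collect'' callable stands for the **POST** call and is supplied by the caller; ''drain_queue'' and the ''max_calls'' safety cap are assumptions of this example, not part of the API.

```python
from typing import Any, Callable, Dict, List


def drain_queue(collect: Callable[[int], List[Dict[str, Any]]],
                chunk_size: int = 100,
                max_calls: int = 50) -> List[Dict[str, Any]]:
    """Collect chunks until one is smaller than chunk_size.

    collect(chunk_size) must perform the POST call and return the decoded
    list of items. max_calls bounds the work done in one poll period.
    """
    items: List[Dict[str, Any]] = []
    for _ in range(max_calls):
        chunk = collect(chunk_size)
        items.extend(chunk)
        # A short (or empty) chunk means the queue has been emptied.
        if len(chunk) < chunk_size:
            break
    return items
```

Anything left after ''max_calls'' calls stays in the queue and is picked up on the next poll.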

=== Response ===
  * The service returns a JSON table of **alarms**

__Alarm structure format:__
^ Parameter ^ Description ^ Type ^ Always set ^
| id | The identifier of a unique alarm/problem **(1)** | String | Yes |
| module | The monitored module | String | Yes |
| metric | The monitored metric | String | No |
| source | The source being monitored | String | Yes |
| sid | the SID of the system being monitored | String | Yes |
| groupName | the name of the group containing the system | String | Yes |
| groupUUID | the unique identifier of the group | String | Yes |
| connectorId | the id of the connector used to connect to the system | Number | Yes |
| message | the alarm message | String | Yes |
| severity | the severity of the alarm | String | Yes |
| severityId | the id of the severity | Number | Yes |
| toClear | Set to true if the alarm must be cleared **(2)** | Boolean | Yes |
| clearable | Set to true if the alarm can ever be cleared **(3)** | Boolean | Yes |
| instance | The instance for which the alarm occurred, if relevant | String | No |
| client | The ABAP client for which the alarm occurred, if relevant | String | No |
| user | The user for which the alarm occurred, if relevant | String | No |
| component | A component name for which the alarm occurred, if relevant | String | No |
| host | The host on which the alarm occurred, if relevant | String | No |


  * (1): Same problem on same resource gets same id.
  * (2): If set to true, the severity will represent the last generated one before the alarm was cleared.
  * (3): Some problems are events that cannot be "undone", so the alert will always stay.
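
One possible way for a consumer to use ''id'', ''toClear'' and ''clearable'' is sketched below. It is an interpretation of the notes above rather than a prescribed algorithm, and ''apply_alarm'' is a hypothetical name.

```python
from typing import Any, Dict, List


def apply_alarm(open_alarms: Dict[str, Dict[str, Any]],
                events: List[Dict[str, Any]],
                alarm: Dict[str, Any]) -> None:
    """Update the consumer-side state with one collected alarm."""
    if alarm["toClear"]:
        # The problem is over: drop it from the open problems, if known.
        open_alarms.pop(alarm["id"], None)
    elif alarm["clearable"]:
        # Same problem on the same resource keeps the same id,
        # so a repeated alarm replaces the previous entry.
        open_alarms[alarm["id"]] = alarm
    else:
        # Not clearable: treated here as a one-off event that stays.
        events.append(alarm)
```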

**Note:** Undocumented parameters are not to be used.

====== Metrics ======
===== Metric queue size =====
Returns the number of metrics waiting to be collected from the queue.

^ Get size of the metric queue ^^^
| Description| Returns the size of the metric queue ||
| Method | **GET** ||
| URL| /api/v1/monitoring/metrics/size||
| **Response** | An integer value ||
| **Example** |||
| **GET** ///api/v1/monitoring/metrics/size// |||

===== Metrics queue =====
This endpoint returns a chunk of generated metrics. Collecting metrics removes them from the queue.

^ Collect metrics ^^^
| Description| Collects the metrics and removes them from the queue ||
| Method | **POST** ||
| URL| /api/v1/monitoring/metrics||
| **Parameters** |||
| maxchunksize | The max number of metrics to return (default 100) | optional |
| **Example** |||
| **POST** ///api/v1/monitoring/metrics?maxchunksize=1000// |||


===== Usage =====
  * We advise polling the queue often, with small chunks.
  * We recommend a poll period of **60 sec**.
  * If the number of returned metrics equals the maximum chunk size, more metrics may be waiting in the queue and another call can be triggered immediately.

===== Response =====
  * The service returns a JSON table of **metrics**

__Metric structure format:__
^ Parameter ^ Description ^ Type ^ Always set ^
| module | The monitored module | String | Yes |
| metric | The monitored metric | String | No |
| source | The source being monitored | String | Yes |
| sid | the SID of the system being monitored | String | Yes |
| groupName | the name of the group containing the system | String | Yes |
| groupUUID | the unique identifier of the group | String | Yes |
| connectorId | the id of the connector used to connect to the system | Number | Yes |
| value | The value of the metric | Number/Boolean | Yes |
| unit | The unit of the metric | String | Yes |
| unitShort | The short representation of the unit | String | Yes |
| target | The target resource for this metric **(1)** | String | Yes |
| hasMax | If true, indicates that the metric cannot exceed ''sampleMax'' value | Boolean | Yes |
| sampleMax | The maximum value reachable by the metric **(2)** | Number | No |
| instance | The instance for which the metric is generated | String | No |
| client | The ABAP client for which the metric is generated | String | No |
| user | The user for which the metric is generated | String | No |
| component | A component name for which the metric is generated | String | No |
| host | The host on which the metric is generated | String | No |

  * (1): Represents the resource being measured, for example ''Disk C:'' or ''User X''.
  * (2): For percentages, ''sampleMax'' is 100. Use it only if **hasMax** is true.
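
As a small illustration of ''hasMax'' and ''sampleMax'', the sketch below derives the fraction of the maximum that a metric has reached. The helper name ''fraction_of_max'' is hypothetical; Boolean values and metrics without a maximum yield ''None''.

```python
from typing import Any, Dict, Optional


def fraction_of_max(metric: Dict[str, Any]) -> Optional[float]:
    """Return value / sampleMax, or None when no maximum applies."""
    # sampleMax is only meaningful when hasMax is true.
    if not metric.get("hasMax"):
        return None
    value = metric.get("value")
    sample_max = metric.get("sampleMax")
    # value may be a Boolean, for which a ratio makes no sense.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    # Guard against a missing or zero maximum.
    if not sample_max:
        return None
    return value / sample_max
```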

**Note:** Undocumented parameters are not to be used.