Check redfish-logservices
Overview
Checks the event log entries exposed under the LogServices of a Redfish-compatible server (the System Event Log, SEL) via the Redfish API. Alerts based on the severity of the log entries.
Important Notes:
- Tested on Dell iDRAC and the DMTF simulator.
- A check usually completes within a few seconds, but a slow or retried request can take longer. The bundled Director basket allows a 60-second runtime timeout.
- The check works over both HTTP and HTTPS and uses GET requests only.
- No additional Python Redfish modules need to be installed.
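The effect of retries on runtime can be illustrated with a minimal sketch. It is not the plugin's actual code: `get_with_retries` and `worst_case_seconds` are hypothetical names, and it assumes that the timeout applies to each attempt separately and that there is no pause between attempts. The defaults of 8 seconds and 3 extra attempts are the ones listed in the help output.

```python
import time


def get_with_retries(request, retries=3, delay=0.0):
    """Call `request()` once, plus up to `retries` extra attempts on failure.

    `request` is a hypothetical zero-argument callable that performs one GET.
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            return request()
        except OSError as error:
            # urllib.error.URLError and socket timeouts derive from OSError
            last_error = error
            if attempt < retries:
                time.sleep(delay)
    raise last_error


def worst_case_seconds(timeout=8, retries=3):
    """Upper bound for one request if every attempt runs into the timeout."""
    return timeout * (retries + 1)
```

Under these assumptions a single request that times out on every attempt takes up to `worst_case_seconds()` = 32 seconds, and a check issues several requests in sequence.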
Data Collection:
- Reads the service root to detect the vendor, then queries the `Managers` collection (or `Systems` on Supermicro) to locate the log service
- Reads the SEL log entries and evaluates each entry's severity
- Uses HTTP Basic authentication if `--username` and `--password` are provided
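The lookup described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the plugin's implementation: `basic_auth_header`, `walk_log_entries` and the injected `fetch` callable are hypothetical names, the property names (`Managers`, `Members`, `LogServices`, `Entries`, `@odata.id`) are the ones defined by the Redfish schema, and the entries are assumed to be embedded in the `Entries` collection. Vendor detection is reduced to a `collection` parameter, and the sketch does not pick out the SEL among several log services.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of the HTTP Basic 'Authorization' header."""
    token = base64.b64encode(f'{username}:{password}'.encode('utf-8'))
    return 'Basic ' + token.decode('ascii')


def walk_log_entries(fetch, collection='Managers'):
    """Yield (member_path, entry) for every log entry below `collection`.

    `fetch` is a hypothetical callable that takes a Redfish path and
    returns the decoded JSON document (GET requests only).
    """
    root = fetch('/redfish/v1/')
    members = fetch(root[collection]['@odata.id'])['Members']
    for member in members:
        member_path = member['@odata.id']
        resource = fetch(member_path)
        services = fetch(resource['LogServices']['@odata.id'])['Members']
        for service in services:
            log = fetch(service['@odata.id'])
            # Assumption: the collection embeds the full entries
            entries = fetch(log['Entries']['@odata.id'])['Members']
            for entry in entries:
                yield member_path, entry
```

Passing `collection='Systems'` would mirror the Supermicro case. Injecting `fetch` keeps the walk independent of the HTTP library and makes it easy to run against canned responses.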
Fact Sheet
| Fact | Value |
|---|---|
| Check Plugin Download | https://github.com/Linuxfabrik/monitoring-plugins/tree/main/check-plugins/redfish-logservices |
| Nagios/Icinga Check Name | check_redfish_logservices |
| Check Interval Recommendation | Every 5 minutes |
| Can be called without parameters | Yes |
| Runs on | Cross-platform |
| Compiled for Windows | No |
Help
usage: redfish-logservices [-h] [-V] [--always-ok]
[--cache-expire CACHE_EXPIRE] [--ignore IGNORE]
[--insecure] [--log-type {sel,mel,both}]
[--match MATCH] [--max-age MAX_AGE] [--no-proxy]
[--password PASSWORD] [--retries RETRIES]
[--test TEST] [--timeout TIMEOUT] [--url URL]
[--username USERNAME]
Checks the event log entries exposed under the LogServices of a Redfish-
compatible server via the Redfish API and alerts based on the severity of the
log entries. By default it reads the System Event Log (SEL); `--log-type`
selects the management controller log (MEL) or both. Entries can be filtered
by regular expression (--match, --ignore), and entries older than --max-age
days can be aged out so a long-since resolved event does not keep the check in
a non-OK state forever.
options:
-h, --help show this help message and exit
-V, --version show program's version number and exit
--always-ok Always returns OK.
--cache-expire CACHE_EXPIRE
The amount of time after which the credential/data
cache expires, in minutes. Default: 15
--ignore IGNORE Ignore SEL entries whose message matches this Python
regular expression. Case-sensitive by default; use
`(?i)` for case-insensitive matching. Can be specified
multiple times. Example: `--ignore="Log area
reset/cleared"`.
--insecure This option explicitly allows insecure SSL
connections.
--log-type {sel,mel,both}
Which log to read: `sel` (System Event Log, default),
`mel` (management controller event log) or `both`.
Default: sel
--match MATCH Only consider SEL entries whose message matches this
Python regular expression. Case-sensitive by default;
use `(?i)` for case-insensitive matching. Can be
specified multiple times. Example:
`--match="(?i)temperature"`.
--max-age MAX_AGE Age out SEL entries older than this many days: they
are no longer alerted on, only counted in the summary.
A controller keeps an entry until the log is cleared,
so a long-since resolved event would otherwise keep
the check in a non-OK state forever. Default: 0 (0
disables aging).
--no-proxy Do not use a proxy.
--password PASSWORD Redfish API password.
--retries RETRIES Number of extra attempts if a request to the Redfish
API fails, before the check gives up. Helps against an
occasionally slow or flaky management controller.
Default: 3
--test TEST For unit tests. Needs "path-to-stdout-file,path-to-
stderr-file,expected-retc".
--timeout TIMEOUT Network timeout in seconds. Default: 8 (seconds)
--url URL Redfish API URL. Default: https://localhost:5000
--username USERNAME Redfish API username.
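How `--match`, `--ignore` and `--max-age` interact can be sketched as below. This is a hedged illustration, not the plugin's code: `filter_entries` is a hypothetical name, the use of `re.search`, the "any pattern matches" rule for repeated options, and the order of the filters are assumptions, and the `Created` timestamp is assumed to carry a `Z` suffix or a UTC offset.

```python
import re
from datetime import datetime, timedelta, timezone


def filter_entries(entries, match=(), ignore=(), max_age=0, now=None):
    """Split log entries into (alerting, aged_out) lists."""
    now = now or datetime.now(timezone.utc)
    alerting, aged_out = [], []
    for entry in entries:
        message = entry.get('Message', '')
        # --match: keep only entries matching at least one pattern
        if match and not any(re.search(p, message) for p in match):
            continue
        # --ignore: drop entries matching any pattern
        if any(re.search(p, message) for p in ignore):
            continue
        created = datetime.fromisoformat(entry['Created'].replace('Z', '+00:00'))
        # --max-age: 0 disables aging; older entries are only counted
        if max_age > 0 and now - created > timedelta(days=max_age):
            aged_out.append(entry)
        else:
            alerting.append(entry)
    return alerting, aged_out
```

With `max_age=30`, an entry from 2012 would land in `aged_out` and only be counted, while a recent entry would still be evaluated for its severity.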
Usage Examples
./redfish-logservices --url=https://bmc --username=redfish-monitoring --password='linuxfabrik'
Output:
Checked SEL on 1 member. There are critical errors.
/redfish/v1/Managers/BMC
* 2012-03-07T14:44:00Z: System May be Melting [CRITICAL]
States
- OK if no log entry has a severity above OK.
- WARN if a log entry has severity "Warning".
- CRIT if a log entry has severity "Critical".
- `--always-ok` suppresses all alerts and always returns OK.
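The state rules above amount to a "worst severity wins" reduction, sketched here with the conventional Nagios/Icinga return codes (0 = OK, 1 = WARN, 2 = CRIT). The names `SEVERITY_TO_STATE` and `overall_state` are hypothetical, and treating an unknown severity as OK is an assumption of this sketch.

```python
STATE_OK, STATE_WARN, STATE_CRIT = 0, 1, 2

# Redfish `Severity` values mapped to Nagios/Icinga states
SEVERITY_TO_STATE = {
    'OK': STATE_OK,
    'Warning': STATE_WARN,
    'Critical': STATE_CRIT,
}


def overall_state(severities, always_ok=False):
    """Return the worst state among all entries, or OK if `always_ok` is set."""
    if always_ok:
        return STATE_OK
    states = (SEVERITY_TO_STATE.get(s, STATE_OK) for s in severities)
    return max(states, default=STATE_OK)
```

For example, `overall_state(['OK', 'Warning', 'Critical'])` yields `2` (CRIT), and an empty log yields `0` (OK).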
Perfdata / Metrics
This plugin does not provide any performance data.
For Maintainers
You don't need a physical server with a real BMC (the management controller that serves the Redfish API, e.g. HPE iLO or Dell iDRAC) to develop or test this plugin. The official DMTF Redfish mockup server serves a static, read-only Redfish tree (including the manager log service) over plain HTTP, which is exactly what this GET-only plugin needs.
Run the mockup server and point the plugin at it, from the repository root:
podman run \
--detach --rm \
--name lfmp-redfish-mock \
--publish 5000:8000 \
docker.io/dmtf/redfish-mockup-server:latest
sleep 3
check-plugins/redfish-logservices/redfish-logservices --url=http://127.0.0.1:5000 --no-proxy
podman stop lfmp-redfish-mock
Use http://127.0.0.1:5000 rather than http://localhost:5000, because localhost may resolve to IPv6 (::1) while the published container port is bound to IPv4.
The fixtures under `unit-test/stdout/` are the raw Redfish responses the plugin walks, one set per scenario, named `<scenario>-root` (the service root), `<scenario>-managers` (the `Managers` collection) and `<scenario>-sel` (the log entries). To simulate an alert, copy a healthy set and add an entry with a `Severity` of `Critical` or `Warning` to the `-sel` file. Run the offline test suite with `./run` from the `unit-test` directory.
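Adding an alerting entry to a copied fixture could be scripted along these lines. This is a sketch under assumptions: it presumes the `-sel` fixture is a JSON document shaped like a Redfish log entry collection with a `Members` array, and `add_alert_entry` as well as the default message and timestamp are hypothetical. Check an existing fixture for the exact fields the plugin reads before relying on it.

```python
import copy


def add_alert_entry(sel, severity='Critical', message='Simulated failure',
                    created='2012-03-07T14:44:00Z'):
    """Return a copy of a parsed `-sel` fixture with one extra log entry."""
    if severity not in ('Critical', 'Warning'):
        raise ValueError('severity must be Critical or Warning')
    result = copy.deepcopy(sel)  # leave the healthy fixture untouched
    members = result.setdefault('Members', [])
    members.append({'Severity': severity, 'Message': message, 'Created': created})
    result['Members@odata.count'] = len(members)
    return result
```

The fixture would be loaded with `json.load()`, passed through the function, and written back to the new scenario's `-sel` file with `json.dump()`.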
Credits, License
- Authors: Linuxfabrik GmbH, Zurich
- License: The Unlicense, see LICENSE file.