Monitoring Services in kdevops

Overview

kdevops provides a flexible monitoring framework that allows you to collect system metrics and statistics during workflow execution. This is particularly useful for:

Performance analysis during testing
Debugging kernel behavior
Understanding system resource usage patterns
Validating new kernel features with custom metrics

The monitoring framework runs services in the background during workflow execution and automatically collects results afterward.

Configuration

Enabling Monitoring

Monitoring services are configured through the kdevops menuconfig system:

make menuconfig
# Navigate to: Monitors
# Enable: "Enable monitoring services during workflow execution"

Available Monitors

Folio Migration Statistics (Developmental)

This monitor tracks page/folio migration statistics in the Linux kernel. It's marked as "developmental" because it requires kernel patches that are not yet upstream.

Requirements:

Kernel with folio migration debugfs stats patch applied
Debugfs mounted at /sys/kernel/debug
Statistics file present at /sys/kernel/debug/mm/migrate/stats
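The two filesystem requirements above can be checked programmatically before a run. The following is a minimal Python sketch, not part of kdevops: the helper names `debugfs_mounted` and `check_prerequisites` are hypothetical, and it assumes the mount table is supplied in /proc/mounts format. It cannot tell whether the kernel patch is applied except indirectly, through the presence of the stats file.

```python
import os
from pathlib import Path

# Path of the stats file relative to the debugfs root, per the requirements above.
STATS_RELATIVE_PATH = "mm/migrate/stats"


def debugfs_mounted(mounts_text, mount_point="/sys/kernel/debug"):
    """Return True if mounts_text (/proc/mounts format) lists debugfs at mount_point."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, filesystem type, options, ...
        if len(fields) >= 3 and fields[1] == mount_point and fields[2] == "debugfs":
            return True
    return False


def check_prerequisites(mounts_text, debugfs_root="/sys/kernel/debug"):
    """Return a dict reporting which filesystem requirements are met."""
    stats_file = Path(debugfs_root) / STATS_RELATIVE_PATH
    return {
        "debugfs_mounted": debugfs_mounted(mounts_text, debugfs_root),
        # os.path.isfile returns False (rather than raising) on permission errors.
        "stats_file_exists": os.path.isfile(stats_file),
    }
```

On a target node this could be called as `check_prerequisites(Path("/proc/mounts").read_text())`. debugfs is typically readable only by root, so an unprivileged run may report the stats file as missing even when it exists.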

Configuration:

make menuconfig
# Navigate to: Monitors
# Enable: "Enable monitoring services during workflow execution"
# Enable: "Enable developmental statistics (not yet upstream)"
# Enable: "Monitor folio migration statistics"
# Set: "Folio migration monitoring interval" (default: 60 seconds)

Integration with Workflows

Currently Supported Workflows

fstests: Filesystem testing framework
blktests: Block layer testing framework
sysbench: Database performance testing framework

How Workflows Integrate Monitoring

Workflows integrate monitoring by including the monitoring role at appropriate points. Here's the pattern used in fstests:

# Start monitoring before tests
- name: Start monitoring services
  include_role:
    name: monitoring
    tasks_from: monitor_run
  when:
    - kdevops_run_fstests|bool
    - enable_monitoring|default(false)|bool
  tags: [ 'oscheck', 'fstests', 'run_tests', 'monitoring', 'monitor_run' ]

# ... workflow tasks run here ...

# Stop monitoring and collect data after tests
- name: Stop monitoring services and collect data
  include_role:
    name: monitoring
    tasks_from: monitor_collect
  when:
    - kdevops_run_fstests|bool
    - enable_monitoring|default(false)|bool
  tags: [ 'oscheck', 'fstests', 'run_tests', 'monitoring', 'monitor_collect' ]

Adding Monitoring to Your Workflow

To add monitoring support to a new workflow:

Identify the execution boundaries: Determine where your workflow starts and completes its main work.
Include the monitoring role: Add the monitoring role calls before and after your main tasks:

# In your workflow's main task file (e.g., playbooks/roles/YOUR_WORKFLOW/tasks/main.yml)

# Set custom monitoring results path (optional)
- name: Set monitoring results path for this workflow
  set_fact:
    monitoring_results_base_path: "{{ topdir_path }}/workflows/YOUR_WORKFLOW/results/monitoring"
  when:
    - enable_monitoring|default(false)|bool

# Start monitoring
- name: Start monitoring services
  include_role:
    name: monitoring
    tasks_from: monitor_run
  when:
    - your_workflow_condition|bool
    - enable_monitoring|default(false)|bool
  tags: [ 'your_workflow', 'monitoring', 'monitor_run' ]

# Your workflow tasks here...

# Stop monitoring
- name: Stop monitoring services and collect data
  include_role:
    name: monitoring
    tasks_from: monitor_collect
  when:
    - your_workflow_condition|bool
    - enable_monitoring|default(false)|bool
  tags: [ 'your_workflow', 'monitoring', 'monitor_collect' ]

Test the integration: Run your workflow with monitoring enabled to verify data collection.

Output and Results

Result Location

Monitoring results are stored in workflow-specific directories:

fstests: workflows/fstests/results/monitoring/
Other workflows: workflows/YOUR_WORKFLOW/results/monitoring/

Workflows can customize the results path by setting the monitoring_results_base_path variable in their playbook.

Result Files

For folio migration monitoring:

<hostname>_folio_migration_stats.txt: Raw statistics with timestamps
<hostname>_folio_migration_plot.png: Visualization plot (if generation succeeds)
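Because the file names follow a fixed `<hostname>_...` pattern, it is easy to see which hosts produced data and whether their plots were generated. The sketch below is illustrative only: `summarize_results` is a hypothetical helper, and it assumes the result files sit directly in the monitoring results directory rather than in per-host subdirectories.

```python
from pathlib import Path

# File name suffixes taken from the result file names listed above.
STATS_SUFFIX = "_folio_migration_stats.txt"
PLOT_SUFFIX = "_folio_migration_plot.png"


def summarize_results(results_dir):
    """Map each hostname that has a stats file to whether its plot also exists."""
    results = Path(results_dir)
    summary = {}
    for stats_path in sorted(results.glob("*" + STATS_SUFFIX)):
        # Strip the suffix to recover the hostname prefix.
        hostname = stats_path.name[: -len(STATS_SUFFIX)]
        summary[hostname] = (results / (hostname + PLOT_SUFFIX)).is_file()
    return summary
```

For example, `summarize_results("workflows/fstests/results/monitoring")` returns a dictionary keyed by hostname; a `False` value indicates a host whose plot generation did not succeed.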

Example Output

Raw statistics file format:

2024-01-15 10:30:00
success: 12345
fail: 67
total: 12412

2024-01-15 10:31:00
success: 12456
fail: 68
total: 12524
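
The raw format above lends itself to simple post-processing. The following is a minimal Python sketch, assuming every sample is a `YYYY-MM-DD HH:MM:SS` line followed by `name: value` counter lines exactly as in the example, and that the counters are cumulative. `parse_samples` and `sample_deltas` are hypothetical helper names, not part of kdevops.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed from the example output


def parse_samples(text):
    """Parse the raw stats text into a list of (timestamp, counters) tuples."""
    samples = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        try:
            stamp = datetime.strptime(line, TIMESTAMP_FORMAT)
        except ValueError:
            stamp = None
        if stamp is not None:
            # A timestamp line starts a new sample.
            samples.append((stamp, {}))
            continue
        if not samples:
            continue  # ignore counter lines that precede the first timestamp
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            samples[-1][1][key.strip()] = int(value)
    return samples


def sample_deltas(samples):
    """Return (elapsed_seconds, counter_deltas) for each pair of consecutive samples."""
    deltas = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        changed = {name: c1[name] - c0[name] for name in c1 if name in c0}
        deltas.append(((t1 - t0).total_seconds(), changed))
    return deltas
```

Applied to the two samples shown above, `sample_deltas` reports 111 successful and 1 failed migration over the 60 seconds between them.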

Running Workflows with Monitoring

Example: fstests with Folio Migration Monitoring

Configure monitoring:

make menuconfig
# Enable monitoring options as described above
make

Provision systems:

make bringup

Run tests with monitoring:

# Run on both baseline and dev groups
make fstests-tests TESTS=generic/003

# Or run on specific group
make fstests-baseline TESTS=generic/003

Check results:

ls -la workflows/fstests/results/monitoring/

Advanced Usage

Custom Monitoring Intervals

You can override the monitoring interval at runtime:

make fstests-tests EXTRA_VARS="monitor_folio_migration_interval=30"

Selective Monitoring

You can enable/disable specific monitors at runtime:

# Enable only folio migration monitoring
make fstests-tests EXTRA_VARS="enable_monitoring=true monitor_folio_migration=true"

Troubleshooting

Monitor Not Starting

Check kernel support:

ansible all -m shell -a "ls -la /sys/kernel/debug/mm/migrate/stats"

Verify debugfs is mounted:

ansible all -m shell -a "mount | grep debugfs"

Check monitoring process:

# The [m] bracket keeps grep from matching its own command line
ansible all -m shell -a "ps aux | grep '[m]onitoring'"

No Data Collected

Verify monitoring was enabled:

# -i matches both lowercase variable names and uppercase CONFIG_ symbols
grep -iE "enable_monitoring|monitor_" .config

Check ansible output for monitoring tasks:

ANSIBLE_VERBOSITY=2 make fstests-tests | grep -A5 -B5 monitoring

Look for error messages:

ansible all -m shell -a "cat /root/monitoring/folio_migration.log"

Adding New Monitors

To add a new monitor to the framework:

Add Kconfig option in kconfigs/monitors/Kconfig:

config MONITOR_YOUR_METRIC
	bool "Monitor your metric description"
	output yaml
	default n
	help
	  Detailed description of what this monitors...

Extend monitoring role:
- Add collection logic in playbooks/roles/monitoring/tasks/monitor_run.yml
- Add termination and data collection in playbooks/roles/monitoring/tasks/monitor_collect.yml

Add visualization (optional):
- Place scripts in playbooks/roles/monitoring/files/
- Call them from monitor_collect.yml

Update documentation: Add your monitor to this documentation file.

Performance Considerations

Monitoring overhead: Each monitor adds some system overhead. Shorter sampling intervals give finer-grained data but sample more often, so weigh data granularity against performance impact.
Storage requirements: Long-running tests with frequent monitoring can generate large data files.
Concurrent monitors: Running multiple monitors simultaneously increases overhead.

Future Enhancements

Planned monitoring additions:

Memory pressure statistics
CPU utilization tracking
I/O statistics collection
Network traffic monitoring
Custom perf event monitoring
Integration with Grafana/Prometheus for real-time visualization
