1085 implement online entitlement validation for cpicrio images before deployment #1093
Open
luigimolinaro wants to merge 51 commits into main
added 30 commits
February 25, 2026 16:48
…l support
- Renamed global_config parameter from 'entitlement_validation' to 'check_images' for better clarity (entitlement has different meaning in IBM context)
- Added support for sample_size: all to check all images instead of sampling. When set to 'all', validation checks every image returned by cpd-cli
- Updated all documentation:
  * entitlement-validation.md - Updated configuration examples and troubleshooting
  * entitlement-validation-readme.md - Updated implementation summary
  * cpd-objects.md - Added link to entitlement validation docs
  * global-config-entitlement-validation.yaml - Updated sample configuration
- Updated Ansible roles:
  * main.yml - Changed to use check_images parameter
  * get-sample-images.yml - Added conditional logic for sample_size: all
  * vars/main.yml - Added documentation for sample_size: all option
Configuration examples:
- Default: check_images.sample_size: 3
- Check all: check_images.sample_size: all
- Disable: check_images.enabled: false
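The sample_size behaviour described above can be illustrated with a minimal Python sketch. The function name select_images is hypothetical, and taking the first N images is an assumption; the actual Ansible role may pick its sample differently.

```python
from typing import List, Union


def select_images(images: List[str], sample_size: Union[int, str] = 3) -> List[str]:
    """Return the images to validate for a given check_images.sample_size.

    Hypothetical helper: the real logic lives in get-sample-images.yml.
    """
    if isinstance(sample_size, str):
        if sample_size.strip().lower() == "all":
            # sample_size: all means every image is checked
            return list(images)
        sample_size = int(sample_size)  # raises ValueError on anything else
    if sample_size < 1:
        raise ValueError("sample_size must be a positive integer or 'all'")
    # Assumption: the sample is simply the first N images
    return list(images[:sample_size])
```

With the default of 3, a list of four images yields the first three; with "all" the full list comes back unchanged.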
- Renamed directory: validate-entitlement-images → validate-check-images
- Renamed docs: entitlement-validation.md → check-images.md
- Renamed docs: entitlement-validation-readme.md → check-images-readme.md
- Renamed config: global-config-entitlement-validation.yaml → global-config-check-images.yaml
- Updated all references in:
  * playbooks/playbook-env-apply-10-validate.yml
  * docs/src/30-reference/configuration/check-images-readme.md
  * docs/src/30-reference/configuration/cpd-objects.md
This aligns naming with the actual functionality: checking image access rather than validating entitlement (which has different meaning in IBM context)
- Renamed: automation-roles/10-validation/validate-check-images → check-images - Updated references in: * playbooks/playbook-env-apply-10-validate.yml * docs/src/30-reference/configuration/check-images-readme.md The simpler name 'check-images' is more intuitive and directly reflects the functionality without redundant 'validate' prefix.
- Add --check-images CLI option to cp-deploy.sh with support for true/false/all values
- Rename internal variables from _entitlement_validation_* to _check_images_*
- Update help text to document new --check-images option
- Add command-line override logic in Ansible role
- Update documentation with CLI usage examples
- Update comments to clarify image access validation vs entitlement validation
- Support --check-images (default sample_size=3)
- Support --check-images=N (custom sample size)
- Support --check-images=all (check all images)
- Support --check-images=false (disable validation)
- Update help text to clarify all options
- Update documentation with all usage examples
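The four accepted forms of --check-images can be summarised in a short Python sketch. This is not the parsing code from cp-deploy.sh; parse_check_images is a hypothetical helper that only mirrors the mapping listed above.

```python
from typing import Optional, Tuple, Union


def parse_check_images(value: Optional[str]) -> Tuple[bool, Union[int, str, None]]:
    """Map the value given to --check-images onto (enabled, sample_size).

    value is None when the flag is passed without '=...'.
    """
    if value is None:
        return True, 3  # bare --check-images uses the default sample size
    text = value.strip().lower()
    if text in ("", "true"):
        return True, 3
    if text == "false":
        return False, None  # validation disabled
    if text == "all":
        return True, "all"  # check every image
    if text.isdigit() and int(text) > 0:
        return True, int(text)  # custom sample size
    raise ValueError(f"unsupported --check-images value: {value!r}")
```

Rejecting unknown values with an error is a design choice of this sketch; the real script may handle them differently.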
- Change main title from 'Entitlement Image Validation' to 'Image Access Validation (check_images)' - Update all references to use 'image access validation' instead of 'entitlement validation' - Ensure consistency between code, CLI options, and documentation - Keep 'entitlement key' terminology where it refers to IBM's actual key concept
- Group options by purpose: Configuration, Authentication, Deployment control, Air-gapped, Output/debugging, and Other - Add section headers to improve readability and navigation - Maintain all existing options and their descriptions - Improve user experience when viewing help output
- Indent category headers (Configuration, Authentication, etc.) to show they are sub-sections of OPTIONS - Improve readability by creating clear visual hierarchy - Make it easier to distinguish between category headers and individual options
- Display exit code and number of images found - Show stderr output when command fails - Help troubleshoot issues with image fetching from cpd-cli
- Add cpd-cli-download role before check-images in playbook-env-apply-10-validate.yml - Ensures cpd-cli is available when validating image access - Only downloads when CP4D components are configured - Fixes issue where cpd-cli was not available during validation phase
- Display all_config.cp4d to verify configuration is loaded - Show _cp4d_requested status to understand why block may be skipped - Help troubleshoot why no images are being fetched
- Extract only cartridge names instead of full cartridge objects - Use map(attribute='name') to get just the name field from each cartridge - Fixes issue where cpd-cli was receiving invalid component format - Now correctly passes: cpfs,cpd_platform,cp-foundation,lite,wkc instead of full objects
- Display cartridge names after extraction to verify map(attribute='name') works - Help troubleshoot why component list still shows full objects
- Exclude cartridges with state: removed
- Exclude invalid components: lite, cp-foundation, scheduler
- Include cartridges without state defined (default to be installed)
- Fixes cpd-cli error: Component 'lite' is not valid
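The filtering rules above, combined with the earlier name extraction, might look like the following in Python. The role itself uses Jinja2 filters such as map(attribute='name'); component_names and the sample data are hypothetical.

```python
from typing import Dict, Iterable, List

# Components this commit names as invalid for cpd-cli
EXCLUDED = {"lite", "cp-foundation", "scheduler"}


def component_names(cartridges: Iterable[Dict[str, str]]) -> List[str]:
    """Return the cartridge names that would be passed on to cpd-cli."""
    names = []
    for cartridge in cartridges:
        # A cartridge without a state is treated as one to be installed
        if cartridge.get("state", "installed") == "removed":
            continue
        name = cartridge.get("name")
        if not name or name in EXCLUDED:
            continue
        names.append(name)
    return names
```

Joining the result with commas gives the component string form that the earlier commits describe.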
- cpd-cli list-images should not include cpfs, scheduler, ibm-cert-manager, etc. - Only pass valid cartridge names to cpd-cli - Fixes validation to work with cpd-cli requirements
- cpd-cli writes images to CSV file, not stdout
- Read from: cpd-cli-workspace/olm-utils-workspace/work/offline/{version}/list_images.csv
- Parse CSV format with awk -F','
- Fixes issue where 0 images were found despite successful cpd-cli execution
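As a rough illustration of reading image references from such a CSV file, here is a Python sketch. The column layout of list_images.csv is not given in this PR, so the sketch assumes nothing about it and simply keeps any field that starts with one of the registry prefixes mentioned elsewhere in these commits; extract_images is a hypothetical name.

```python
import csv
import io
from typing import List, Tuple


def extract_images(csv_text: str,
                   prefixes: Tuple[str, ...] = ("cp.icr.io/", "icr.io/")) -> List[str]:
    """Collect unique image references from comma-separated text.

    Assumption: an image reference is any field beginning with a known
    registry prefix; the real file may need column-based parsing instead.
    """
    seen = set()
    images = []
    for row in csv.reader(io.StringIO(csv_text)):
        for field in row:
            value = field.strip()
            if value.startswith(prefixes) and value not in seen:
                seen.add(value)
                images.append(value)
    return images
```

The commit itself uses awk -F','; the csv module is swapped in here because it also copes with quoted fields.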
… CSV file validation
- Changed cartridge filtering to select only cartridges with state: installed
- This excludes cp-foundation, lite, scheduler and other non-installed components
- Added stat check to verify CSV file exists before reading
- Added detailed debug output showing CSV file path, existence and size
- Improved error handling with explicit exit code check
- Added sample output display (first 3 images) for verification
This fixes the issue where cpd-cli was being called with invalid components (cp-foundation, lite) that don't have installable images, resulting in empty image lists.
- Removed unnecessary CSV file check (cpd-cli outputs to stdout, not CSV)
- Parse cpd-cli output directly with grep/awk pipeline
- Filter only cartridges with state=installed (mantaflow, wkc)
- Simplified debug output showing exit code and image count
- Fixed error handling for skipped tasks
This approach is simpler and works correctly with cpd-cli's actual output format.
- Split cpd-cli execution into two steps: raw output capture and parsing - Display first 50 lines of raw cpd-cli output for debugging - This will help identify why grep is not finding any images
…eded)
BREAKING CHANGE: Removed cpd-cli dependency for image validation
The deployer container doesn't have podman/docker installed, which cpd-cli requires to inspect images. Instead, we now use a simpler approach:
- Generate sample image names based on cartridge names
- Validate these samples using skopeo (already available in container)
- No external dependencies required
This approach:
- Works in the deployer container without modifications
- Is faster (no cpd-cli overhead)
- Still validates entitlement key access effectively
- Uses 2-3 sample images per cartridge for quick validation
Example: For 'mantaflow' cartridge, validates:
- cp.icr.io/cp/cpd/mantaflow-operator:latest
- cp.icr.io/cp/cpd/mantaflow-ui:latest
Instead of hardcoded samples, now:
- Uses skopeo list-tags to discover available images for each cartridge
- Tries common naming patterns (ibm-cpd-{cartridge}, ibm-{cartridge}, {cartridge})
- Automatically finds real, existing images in the registry
- No maintenance needed - always uses current images
- Works without podman/docker (skopeo only)
Example: For 'wkc' cartridge, discovers:
- icr.io/cpopen/ibm-cpd-wkc-operator-catalog:v5.2
- icr.io/cpopen/ibm-cpd-wkc-operator-catalog:v5.2.2
etc.
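The naming patterns listed in this commit can be sketched in Python as follows. The registry icr.io/cpopen and the -operator-catalog suffix are taken from the 'wkc' example; treating them as defaults is an assumption, and both function names are hypothetical. The sketch only builds the skopeo command line and does not run it.

```python
from typing import List


def candidate_repositories(cartridge: str,
                           registry: str = "icr.io/cpopen",
                           suffix: str = "-operator-catalog") -> List[str]:
    """Build repository names to probe, in the order the commit lists them."""
    patterns = ("ibm-cpd-{c}", "ibm-{c}", "{c}")
    return [f"{registry}/{p.format(c=cartridge)}{suffix}" for p in patterns]


def list_tags_command(repository: str) -> List[str]:
    """Argument vector for 'skopeo list-tags' against a docker:// reference."""
    return ["skopeo", "list-tags", f"docker://{repository}"]
```

A caller would try each candidate in turn and keep the first repository for which skopeo returns tags.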
… list too long' error
The entitlement key (JWT token) is very long when base64-encoded, causing command line length to exceed system ARG_MAX limits when passed directly as a parameter to skopeo.
Solution: Pass the entitlement key via environment variable instead of command line argument in both get-sample-images.yml and validate-images.yml.
Changes:
- get-sample-images.yml: Use ENTITLEMENT_KEY env var in skopeo list-tags
- validate-images.yml: Use ENTITLEMENT_KEY env var in skopeo inspect
This resolves the '[Errno 7] Argument list too long' error while maintaining security with no_log directive.
…Argument list too long' error
The lint-config role was failing with '[Errno 7] Argument list too long' when passing large configuration data (including IBM entitlement key) as command-line arguments to the pre-execution-processor.py script.
Changes:
- Modified pre-execution-processor.py to accept configuration data via environment variables (GENERATOR_ATTRIBUTES, GENERATOR_FULL_CONFIG, GENERATOR_VARIABLES) as an alternative to command-line arguments
- Updated pre-process-object-list-element.yaml to pass data via environment variables instead of -a, -f, -v command-line arguments
- Added no_log directive to prevent sensitive data from appearing in logs
- Maintains backward compatibility: script still accepts command-line arguments if environment variables are not set
This fix is similar to the one applied to the check-images role and resolves the system ARG_MAX limit issue when dealing with large configuration objects.
…G_MAX limit
The previous approach using environment variables still hit the ARG_MAX limit because Ansible passes environment variables as part of the shell command invocation.
New approach:
- Create temporary files to store the base64-encoded configuration data
- Pass file paths to the Python script
- Script reads data from files using 'cat' command
- Clean up temporary files after execution
This completely avoids the command-line argument length limit.
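The temporary-file approach can be demonstrated with a self-contained Python sketch. It is not the code from pre-process-object-list-element.yaml: run_with_config_file is a hypothetical name, the child process is an inline Python one-liner standing in for pre-execution-processor.py, and the child opens the file directly rather than using 'cat'.

```python
import base64
import json
import os
import subprocess
import sys
import tempfile


def run_with_config_file(config: dict) -> str:
    """Hand a large payload to a child process through a temporary file."""
    encoded = base64.b64encode(json.dumps(config).encode("utf-8"))
    fd, path = tempfile.mkstemp(prefix="generator-", suffix=".b64")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(encoded)
        # Stand-in child: decodes the file and prints the length of one value
        child = (
            "import base64, json, sys; "
            "data = json.loads(base64.b64decode(open(sys.argv[1], 'rb').read())); "
            "print(len(data['key']))"
        )
        result = subprocess.run(
            [sys.executable, "-c", child, path],  # only the short path is an argument
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
    finally:
        os.remove(path)  # clean up even if the child fails
```

Only the file path crosses the process boundary, so the payload size no longer counts against the argument or environment limits.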
The temporary file write operations were cluttering the execution recap. Using no_log: true to: - Reduce noise in the final recap - Protect sensitive configuration data - Improve readability of execution summary This also consolidates the task names to 'Write configuration data to temporary files' for better grouping in the recap.
Reduces recap clutter by combining 3 separate write tasks into one loop. Now appears only once per preprocessor execution instead of 3 times. Benefits: - Cleaner execution recap (3 tasks -> 1 task) - More maintainable code - Same functionality with less repetition
Changes:
- Remove debug task showing cartridge names (not needed in production)
- Add no_log to 'Build images list' task to hide verbose loop output
- Add loop_control with label to show only cartridge name instead of full item
Result: Much cleaner output, only essential information is shown
added 15 commits
March 6, 2026 12:06
Changes: - Add no_log to hide verbose variable details - Add loop_control with label to show only variable names - Fixes deprecation warning visibility Result: Much cleaner output, only variable names shown instead of full content
- Added check for CP4D configurations presence - Added check for CPD_CHECK_IMAGES environment variable - Ensures check-images role runs only when --check-images flag is used
… key - Check for CP_ENTITLEMENT_KEY environment variable first - Fall back to vault if environment variable not set - Enables easier testing with invalid keys for validation
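The lookup order described here (environment variable first, vault second) is sketched below in Python. resolve_entitlement_key and the vault_lookup callback are hypothetical; the real role performs the vault lookup through the deployer's own vault tasks.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_entitlement_key(vault_lookup: Callable[[], Optional[str]],
                            environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the entitlement key, preferring CP_ENTITLEMENT_KEY over the vault."""
    key = environ.get("CP_ENTITLEMENT_KEY", "").strip()
    if key:
        return key
    # Fall back to the vault only when the variable is unset or empty
    return vault_lookup()
```

Passing the environment as a parameter makes it easy to test with a deliberately invalid key, which is the use case the commit mentions.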
…tead of command-line arguments - Changed pre-execution-processor.py invocation to pass large configuration data via environment variables (GENERATOR_ATTRIBUTES, GENERATOR_FULL_CONFIG, GENERATOR_VARIABLES) - Removed command-line arguments -a, -f, -v that were causing ARG_MAX limit errors - Added no_log: true to prevent sensitive data exposure in logs - The Python script already supports both methods (command-line and environment variables)
- Changed to pass file paths via environment variables (GENERATOR_*_FILE) - Python script now reads files directly instead of receiving large base64 content - This avoids both command-line and environment variable size limits - Maintains backward compatibility with old methods
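A possible shape for the backward-compatible input handling is sketched below. The variable names GENERATOR_ATTRIBUTES_FILE and similar are an assumed expansion of the GENERATOR_*_FILE pattern, load_payload is a hypothetical helper, and the precedence order (file, then inline variable, then command-line value) is a guess at what "backward compatibility with old methods" means here.

```python
import os
from typing import Mapping, Optional


def load_payload(name: str,
                 cli_value: Optional[str] = None,
                 environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the base64 payload for ATTRIBUTES, FULL_CONFIG or VARIABLES."""
    file_path = environ.get(f"GENERATOR_{name}_FILE")
    if file_path:
        # Newest method: the variable holds a path, the content is on disk
        with open(file_path, "r", encoding="utf-8") as handle:
            return handle.read().strip()
    inline = environ.get(f"GENERATOR_{name}")
    if inline:
        # Older method: the payload itself is in the environment
        return inline
    # Oldest method: value from -a, -f or -v
    return cli_value
```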
When the user explicitly uses --check-images flag, the validation should fail on errors. This ensures that invalid entitlement keys are caught during validation phase.
The public registry icr.io/cpopen does not require authentication, so validation was passing even with invalid entitlement keys. Changed to use cp.icr.io/cp/cpd which requires proper authentication, ensuring invalid keys are detected.
…alog discovery The operator catalogs don't exist in cp.icr.io registry, so we validate access directly using cartridge image names (e.g., cp.icr.io/cp/cpd/wkc:latest). This is simpler and more reliable for entitlement validation.
- Root cause: Code was constructing invalid image names like 'cp.icr.io/cp/cpd/zen:latest' which don't exist (cartridge names ≠ image names, :latest tag doesn't exist)
- Solution: Implement dynamic image discovery using 'cpd-cli manage list-images' to fetch actual image references with correct tags/digests
- Changes:
  * get-sample-images.yml: Added cpd-cli integration with proper error handling
  * validate-images.yml: Added skip logic when no images found
  * FIX-SUMMARY.md: Comprehensive documentation of fix and testing
- Benefits:
  * Always uses current, version-specific images from IBM
  * Gracefully handles missing cpd-cli with warning
  * Aligns implementation with existing documentation
- Fixed condition order: check cpd-cli availability before failing - Added debug output to show cpd-cli execution details (rc, components, version, error output) - Improved error message to mention checking cpd-cli output - Now properly skips validation when cpd-cli is not available instead of failing
- Split cpd-cli execution into raw output capture and filtering - Display first 50 lines of raw cpd-cli output for debugging - Show stderr and return code - This will help identify why cpd-cli is not returning images
PROBLEM: The --check-images feature was failing with 'No images found for validation' because it relied on 'cpd-cli manage list-images' which requires container runtime (podman/docker).
SOLUTION: Read CASE image CSV files directly from the cpd-cli workspace.
CHANGES:
- automation-roles/10-validation/check-images/README.md (NEW)
- automation-roles/10-validation/check-images/tasks/get-sample-images.yml
- automation-roles/10-validation/check-images/tasks/main.yml
- automation-roles/99-generic/cpd-cli/cpd-cli-download/tasks/cpd-cli-download.yml
- docs/src/30-reference/configuration/check-images.md
- sample-configurations/sample-dynamic/config-samples/global-config-check-images.yaml
- .gitignore
- Add .continue/ to .gitignore to prevent tracking IDE-specific files - Remove existing .continue directory from git tracking - Clean up repository from IDE configuration files
- Remove hardcoded '-EE' suffix from cpd-cli download queries
- Use flexible pattern to match any edition (EE, SE, etc.)
- Pattern now matches 'cpd-cli-linux-*' or 'cpd-cli-{arch}-*' ending with .tar.gz
- Ensures compatibility with all Cloud Pak for Data editions
This fixes an issue where customers using Standard Edition (SE) or other
editions could not download the cpd-cli tool.
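The edition-agnostic matching can be illustrated with shell-style wildcards in Python. is_cpd_cli_asset is a hypothetical helper, the default architecture and the file names used below are made up for illustration, and the role itself may express the pattern differently.

```python
from fnmatch import fnmatchcase


def is_cpd_cli_asset(filename: str, arch: str = "ppc64le") -> bool:
    """True if the file name matches either pattern described in the commit."""
    patterns = (
        "cpd-cli-linux-*.tar.gz",    # any edition on Linux
        f"cpd-cli-{arch}-*.tar.gz",  # any edition for a specific architecture
    )
    # fnmatchcase keeps the comparison case-sensitive on every platform
    return any(fnmatchcase(filename, pattern) for pattern in patterns)
```

Because the edition sits inside the wildcard, EE and SE archives match equally.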
- Add version tracking for downloaded CPD CLI binaries
- Check existing version against latest available version from GitHub
- Re-download CPD CLI only when version changes or file is missing
- Store version information in .version file alongside tar.gz
- Apply same logic to both standard and GitHub PAT download methods
- Maintain -EE edition pattern for Enterprise Edition downloads
This ensures the deployer can handle Cloud Pak upgrades by detecting when a new CPD CLI version is needed and downloading it automatically.
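The re-download decision can be sketched in Python as below. The exact name of the .version file is not stated in this PR; the sketch assumes it is the archive name with ".version" appended, and both function names are hypothetical.

```python
from pathlib import Path


def _version_file(archive: Path) -> Path:
    # Assumption: version marker is '<archive name>.version' in the same directory
    return archive.with_name(archive.name + ".version")


def needs_download(archive: Path, latest_version: str) -> bool:
    """True when the archive is missing or was recorded with another version."""
    marker = _version_file(archive)
    if not archive.is_file() or not marker.is_file():
        return True
    return marker.read_text(encoding="utf-8").strip() != latest_version.strip()


def record_version(archive: Path, version: str) -> None:
    """Store the version of the archive that was just downloaded."""
    _version_file(archive).write_text(version.strip() + "\n", encoding="utf-8")
```

Treating a missing marker as "download again" errs on the side of fetching a fresh archive when the state is unknown.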
- Add .ansible* pattern to ignore all Ansible cache directories - Add *.retry pattern to ignore Ansible retry files - Replace .ansible.continue/ with broader .ansible* pattern
- Group related patterns with descriptive comments - Organize by category: temporary files, IDE, OS, Python, Ansible, etc. - Improve readability and maintainability
You can test with:
./cp-deploy.sh env apply --check-only --accept-all-licenses --check-images --cpd-develop