Skip to content

ama-logs deployment and test in dev clusters#1625

Open
zanejohnson-azure wants to merge 16 commits intoci_prodfrom
zane/ci-agent-auto-deploy
Open

ama-logs deployment and test in dev clusters#1625
zanejohnson-azure wants to merge 16 commits intoci_prodfrom
zane/ci-agent-auto-deploy

Conversation

@zanejohnson-azure
Copy link
Contributor

@zanejohnson-azure zanejohnson-azure commented Mar 17, 2026

intro:
test infra that uses helm chart to deploy ama-logs new images from built to aks clusters, then run e2e scripts to test the new images.

key new file:
test-ci-image-in-aks-cluster.yml: script to deploy ama-logs through helm chart, and trigger e2e scripts

changes to existing e2e test scripts:
change 15m windows to 5m to allow test not to wait long. after helm chart deployment and before trigger e2e test, we need wait for a period of time before querying the logs to avoid using logs from prev helm deployment. If we use 15m, we need wait until 20m for safety, by changing to 5m, we can just wait for 1m before running e2e test.

run a few tests using 5m query window and 10m waiting time. Log exists and test passed.

test:
see #20260320.12 in ci build pipeline

@zanejohnson-azure zanejohnson-azure requested a review from a team as a code owner March 17, 2026 22:27
@zanejohnson-azure zanejohnson-azure force-pushed the zane/ci-agent-auto-deploy branch from 69e2df7 to f404bad Compare March 19, 2026 22:14
linuxImageTagUnderTest: $[stageDependencies.stage.common.outputs['setup.linuxImagetag']]
windowsImageTagUnderTest: $[stageDependencies.stage.common.outputs['setup.windowsImageTag']]
jobs:
# TODO: remomve the two temp cluster and add more clusters from test automation framework when the tests are stable

Check notice

Code scanning / devskim

A "TODO" or similar was left in source code, possibly indicating incomplete functionality Note

Suspicious comment
# clusterName: 'my-cluster'
# resourceGroup: 'my-rg'
# region: 'eastus'
# subscriptionId: '9b96ebbd-c57a-42d1-bbe9-b69296e4c7fb'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do not want to expose sub id, so remove it.

type: string
displayName: 'Log Analytics Workspace ID'
- name: imageTag
- name: amalogsLinuxImage
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separate win and linux image as build pipeline outputs these two vars separately.

note: due to this change, the ci-aks-prod-release.yaml is also changed to accommodate this new.

- name: amalogsWindowsImage
type: string
displayName: 'Image tag suffix (e.g., win-3.1.32)'
- name: imageRepository
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add imageRepository, because in build pipeline, we need use cidev, and in prod release pipeline, we need use ciprod

- name: cloudEnvironment
type: string
default: 'azurepubliccloud'
- name: kubernetesVersion
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused and useless, removed.

- name: releaseName
type: string
default: 'azuremonitor-containers'
- name: helmVersion
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove version pin, always use latest verison.

# Function to run Trivy scan and handle output
run_trivy_scan() {
trivy image --exit-code 1 --ignore-unfixed --no-progress --severity HIGH,CRITICAL,MEDIUM "${{ variables.repoImageName }}:$(linuxImagetag)" > trivy_output.log 2>&1
#trivy image --exit-code 1 --ignore-unfixed --no-progress --severity HIGH,CRITICAL,MEDIUM "${{ variables.repoImageName }}:$(linuxImagetag)" > trivy_output.log 2>&1
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

temporarily remove trivy failure and let test can run. helm deployment and tests depend on successful image build.

DisableRemediation: false
AcceptableOutdatedSignatureInHours: 72

- stage: Deploy_and_Test_Images_In_Dev_Clusters
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two steps for each cluster:

  1. helm chart deployment using image built from last stage
  2. install testkube and trigger e2e test.

workspaceKey: "<your_workspace_key>"

# Image configuration
imageRepository: "/azuremonitor/containerinsights/ciprod"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add this allow the helm chart can be used for both build pipeline and prod release pipelines.

content:
git:
uri: https://github.com/microsoft/Docker-Provider/
revision: ci_prod
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

temporary change to allow e2e tests use my new changes by using 5m. before merging, will change to ci_prod.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant