Fix NPE during public IP listing when a removed network or VPC ID is informed for associatenetworkid parameter#12372
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## 4.20 #12372 +/- ##
============================================
- Coverage 16.26% 16.25% -0.02%
+ Complexity 13428 13420 -8
============================================
Files 5660 5662 +2
Lines 499963 500166 +203
Branches 60708 60733 +25
============================================
- Hits 81330 81309 -21
- Misses 409559 409771 +212
- Partials 9074 9086 +12
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@blueorangutan package |
|
@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress. |
There was a problem hiding this comment.
Pull request overview
This PR fixes a NullPointerException that occurs when listing public IP addresses with a removed network or VPC ID as a parameter. When a User account attempts to list IPs using a removed resource ID, the database query returns null, causing an NPE during access checks.
Key Changes
- Added null checks before accessing network and VPC objects when validating access permissions
- Search criteria parameters are now conditionally set only when the associated resource exists
- Added VpcVO import to support the VPC null check logic
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16258 |
|
@erikbocks , would this also apply to version 20 and 22? (not just main?) |
|
@erikbocks as this is a bug fix, could you target 4.20? |
3e34da6 to
0d2f34a
Compare
|
@DaanHoogland @winterhazel, sure. The PR was re-targeted to the |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (associatedNetworkId != null) { | ||
| _accountMgr.checkAccess(caller, null, false, networkDao.findById(associatedNetworkId)); | ||
| sc.setParameters("associatedNetworkIdEq", associatedNetworkId); | ||
| NetworkVO associatedNetwork = networkDao.findById(associatedNetworkId); | ||
|
|
||
| if (associatedNetwork != null) { | ||
| _accountMgr.checkAccess(caller, null, false, associatedNetwork); | ||
| sc.setParameters("associatedNetworkIdEq", associatedNetworkId); | ||
| } | ||
| } | ||
|
|
||
| if (vpcId != null) { | ||
| _accountMgr.checkAccess(caller, null, false, _vpcDao.findById(vpcId)); | ||
| sc.setParameters("vpcId", vpcId); | ||
| VpcVO vpc = _vpcDao.findById(vpcId); | ||
|
|
||
| if (vpc != null) { | ||
| _accountMgr.checkAccess(caller, null, false, vpc); | ||
| sc.setParameters("vpcId", vpcId); | ||
| } | ||
| } |
There was a problem hiding this comment.
The fix addresses the NPE on lines 2584-2598, but there's a similar vulnerability later in the same method. Lines 2614-2621 also retrieve the network using findById and directly access its properties (getDataCenterId() on line 2616 and getAccountId() on line 2621) without null checks. This code path is executed when listing free IP addresses (when isAllocatedOrReserved is false and vlanType is not DirectAttached). If a removed network ID is provided, this will still result in a NullPointerException.
The same null check pattern should be applied here to prevent this additional NPE scenario.
| if (associatedNetworkId != null) { | ||
| _accountMgr.checkAccess(caller, null, false, networkDao.findById(associatedNetworkId)); | ||
| sc.setParameters("associatedNetworkIdEq", associatedNetworkId); | ||
| NetworkVO associatedNetwork = networkDao.findById(associatedNetworkId); | ||
|
|
||
| if (associatedNetwork != null) { | ||
| _accountMgr.checkAccess(caller, null, false, associatedNetwork); | ||
| sc.setParameters("associatedNetworkIdEq", associatedNetworkId); | ||
| } | ||
| } | ||
|
|
||
| if (vpcId != null) { | ||
| _accountMgr.checkAccess(caller, null, false, _vpcDao.findById(vpcId)); | ||
| sc.setParameters("vpcId", vpcId); | ||
| VpcVO vpc = _vpcDao.findById(vpcId); | ||
|
|
||
| if (vpc != null) { | ||
| _accountMgr.checkAccess(caller, null, false, vpc); | ||
| sc.setParameters("vpcId", vpcId); | ||
| } | ||
| } |
There was a problem hiding this comment.
The fix changes the behavior of the listPublicIpAddresses API when given a removed network or VPC ID. Consider adding test cases to verify this new behavior, specifically:
- Test that no NullPointerException is thrown when a User account calls listPublicIpAddresses with a removed network ID
- Test that no NullPointerException is thrown when a User account calls listPublicIpAddresses with a removed VPC ID
- Test the expected return value when a removed entity ID is provided (currently returns empty list)
The existing tests in ManagementServerImplTest.java test the setParameters method but don't cover the access control check scenario that was causing the NPE.
|
@blueorangutan package |
|
@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16709 |
|
@blueorangutan test |
|
@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
|
[SF] Trillian test result (tid-15384)
|
|
@blueorangutan package |
|
@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16841 |
|
@blueorangutan test |
|
@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
|
[SF] Trillian test result (tid-15480)
|
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
RosiKyu
left a comment
There was a problem hiding this comment.
LGTM
| # | Test Case | Method | Status |
|---|---|---|---|
| TC1 | List public IPs with removed network ID as User - no NPE | CLI (cmk as User) | PASS |
| TC2 | List public IPs with removed VPC ID as User - no NPE | CLI (cmk as User) | PASS |
| TC3 | List public IPs normally - no regression | CLI (cmk as User) | PASS |
Result: 3/3 PASS
Pre-fix behavior: listPublicIpAddresses with a removed network/VPC ID as a User account caused NullPointerException .
Post-fix behavior: Returns empty result gracefully. Zero NPEs in the management server log.
The null check fix correctly prevents the NPE when listing public IPs with removed network or VPC IDs. No regressions observed in normal listing flow.
Detailed Test Execution Report
TC1: List public IPs with removed network ID as User - no NPE
Objective Verify that calling listPublicIpAddresses with a removed (deleted) network ID as a User account does not cause a NullPointerException (HTTP 530), and instead returns an empty result.
Test Steps
- As RootAdmin, create User account
npetest - Create isolated network
npe-test-netundernpetest - Deploy VM to activate the network (triggers source NAT IP allocation)
- Destroy VM with expunge, then delete the network
- Register API keys for
npetest, configure cmk profile - As
npetest, call:list publicipaddresses associatednetworkid=cdd57d75-3c31-4249-b9ce-2b50d18cb928
Expected Result: Empty result returned gracefully - no HTTP 530 / NPE error.
Actual Result: Empty result returned. No error, no NPE.
Test Evidence:
(npetest) 🐱 > list publicipaddresses associatednetworkid=cdd57d75-3c31-4249-b9ce-2b50d18cb928
(npetest) 🐱 >
# No NPE in management server log:
[root@ref-trl-11136-k-Mol9-rositsa-kyuchukova-mgmt1 ~]# grep -i "NullPointerException" /var/log/cloudstack/management/management-server.log
[root@ref-trl-11136-k-Mol9-rositsa-kyuchukova-mgmt1 ~]#
TC2: List public IPs with removed VPC ID as User - no NPE
Objective Verify that calling listPublicIpAddresses with a removed (deleted) VPC ID as a User account does not cause a NullPointerException (HTTP 530), and instead returns an empty result.
Test Steps
- Using same setup from TC1
- VPC
npe-test-vpcwas created and then deleted - As
npetest, call:list publicipaddresses vpcid=a8480239-ce4a-4b7f-aeb4-693cf85ff2b6
Expected Result: Empty result returned gracefully - no HTTP 530 / NPE error.
Actual Result: Empty result returned. No error, no NPE.
Test Evidence:
(npetest) 🐱 > list publicipaddresses vpcid=a8480239-ce4a-4b7f-aeb4-693cf85ff2b6
(npetest) 🐱 >
# No NPE in management server log:
[root@ref-trl-11136-k-Mol9-rositsa-kyuchukova-mgmt1 ~]# grep -i "NullPointerException" /var/log/cloudstack/management/management-server.log
[root@ref-trl-11136-k-Mol9-rositsa-kyuchukova-mgmt1 ~]#
TC3: List public IPs normally - no regression
Objective Verify that listPublicIpAddresses without parameters still works correctly after the fix.
Test Steps
- As
npetest, call:list publicipaddresses
Expected Result: Empty result (user has no remaining public IPs after network deletion) — no error.
Actual Result:
Empty result returned normally. No error.
Test Evidence:
(npetest) 🐱 > list publicipaddresses
(npetest) 🐱 >
Description
When listing public IPs associated to a network, the database query only considers records that are not marked as removed. If the ID of an removed resource is informed, a
nullvalue is returned in the database query. If the caller is aUsertype account, during the access checks the code tries to access one of the value's properties, resulting in anNullPointerException.A validation was added to this flow to prevent the property access when the resource is
null, and the listing behavior was standardized. The same validation was added to handle removed VPCs, as the same behavior was presented for them.Types of changes
Feature/Enhancement Scale or Bug Severity
Bug Severity
Screenshots (if appropriate):
Behavior before the change
Behavior after the change
How Has This Been Tested?
First, I created an isolated network, informing a public IP address to be assigned as the network's source NAT. After that, the network was removed. Then, I authenticated with an
Usertype account in CloudMonkey and called thelistPublicIpAddressesAPI, informing the removed network's UUID as theassociatednetworkidparameter. As expected, the exception was thrown.Exception stack trace in Management Server's logs
After applying the code with my changes to my local environment, the same steps were executed, and I validated that the behavior was fixed. The same steps were reproduced for removed VPCs.
The code was also compiled with tests on Maven.