Describe the bug
Summary
Multiple bugs were found and fixed in the nspin=4 LCAO code path. All fixes are verified by two independent criteria:
- Self-consistency: All nspin=4 test cases converge and produce stable results.
- Rotational invariance: The total energy of a non-collinear DFT+U calculation must be independent of the initial magnetization direction (x/y/z), since the lattice and atomic positions have cubic symmetry.
Bug List and Fixes
Bug 1: Hamiltonian Pauli-to-spinor conversion (gint_common.cpp)
The coefficients for the σ_y channel in the Pauli-to-spinor conversion were incorrect:
// Before (WRONG):
std::vector<int> clx_j = {0, 1, -1, 0}; // σ_y channel: +1 for up-down, -1 for down-up
// After (CORRECT):
std::vector<int> clx_j = {0, -1, 1, 0}; // σ_y channel: -1 for up-down, +1 for down-up
Derivation: Since σ_y = [[0,−i],[i,0]], the spinor Hamiltonian components are:
- H_{up,down} = B_x − iB_y (coefficient on σ_y is −i, so imag coefficient is −1)
- H_{down,up} = B_x + iB_y (coefficient on σ_y is +i, so imag coefficient is +1)
Additionally, the lower-triangle fill was missing conjugation:
// Before (WRONG): only transposed, no conjugation
lower_mat->get_value(icol, irow) = upper_mat->get_value(irow, icol);
// After (CORRECT): conjugate transpose to ensure H(-R) = H(R)†
lower_mat->get_value(icol, irow) = std::conj(upper_mat->get_value(irow, icol));
Bug 2: DFT+U potential Pauli-to-spinor conversion (dftu_lcao.cpp)
The transfer_vu function had the same σ_y sign error:
// Before (WRONG):
vu[index[1]] = 0.5 * (vu_tmp[index[1]] + i * vu_tmp[index[2]]);
vu[index[2]] = 0.5 * (vu_tmp[index[1]] - i * vu_tmp[index[2]]);
// After (CORRECT):
vu[index[1]] = 0.5 * (vu_tmp[index[1]] - i * vu_tmp[index[2]]);
vu[index[2]] = 0.5 * (vu_tmp[index[1]] + i * vu_tmp[index[2]]);
Bug 3: DFT+U force — incorrect VU scaling and force factor (dftu_force_stress.hpp)
Two errors in the nspin=4 DFT+U force code:
-
VU /= 2.0 was incorrect: The occupation matrix occ and potential VU are both in Pauli basis; there is no need to divide VU by 2. Removed the division.
-
force *= 2.0 was incorrect for nspin=4: The factor 2 compensates for spin degeneracy in nspin=1, but in Pauli basis (nspin=4) all spin channels are already included. Added nspin != 4 guard.
Bug 4: DeltaSpin force — incorrect force factor (dspin_force_stress.hpp)
Same force *= 2.0 error as Bug 3. Added nspin != 4 guard.
Bug 5: DM Fourier transform ρ_y sign error (density_matrix.cpp)
The func_xyz_to_updown function converts spinor DM to Pauli basis. The ρ_y formula was wrong in both template specializations:
Real version (TR=double):
// Before (WRONG): gives -2*rho_y instead of 2*rho_y
target_DMR_mat[icol + step_trace[2]] = -tmp[1].imag() + tmp[2].imag();
// After (CORRECT): rho_y = Im(rho_updown - rho_downup) = tmp[1].imag() - tmp[2].imag()
target_DMR_mat[icol + step_trace[2]] = tmp[1].imag() - tmp[2].imag();
Complex version (TR=complex):
// Before (WRONG): gives 2i*rho_y instead of 2*rho_y
target_DMR_mat[icol + step_trace[2]] = i * (tmp[1].imag() - tmp[2].imag());
// After (CORRECT): rho_y = -i*(rho_updown - rho_downup)
target_DMR_mat[icol + step_trace[2]] = -i * (tmp[1] - tmp[2]);
Derivation: From ρ = ρ₀I + ρ_x σ_x + ρ_y σ_y + ρ_z σ_z with σ_y = [[0,−i],[i,0]]:
- ρ_updown = ρ_x + iρ_y → tmp[1].imag() = ρ_y
- ρ_downup = ρ_x − iρ_y → tmp[2].imag() = −ρ_y
- Therefore ρ_y = (tmp[1].imag() − tmp[2].imag()) / 2
Bug 6: ELPA missing blacs_context in Constructor 1 (elpa_new.cpp)
Constructor 1 of ELPA_Solver was missing elpa_set_integer("blacs_context", cblacs_ctxt), while Constructor 2 (otherParameter) already had it. This caused ELPA's internal MPI operations (e.g. MPI_Bcast in complex Cholesky/invert_triangular) to fail with INVALID DATATYPE when using complex eigensolves.
Verification
Test 1: All nspin=4 test cases pass
| Test case |
ks_solver |
ΔE vs ref |
Status |
| scf_angle_spin4 |
genelpa (default) |
0.00e+00 |
PASS |
| scf_u_spin4 |
scalapack_gvx |
0.00e+00 |
PASS |
| scf_out_dos_spin4 |
scalapack_gvx |
2.05e-11 |
PASS |
| scf_out_hsr_spin4 |
genelpa (default) |
7.79e-12 |
PASS |
| scf_out_mul_spin4 |
scalapack_gvx |
2.41e-09 |
PASS |
Test 2: Rotational invariance of DFT+U total energy
Setup: Fe BCC (2 atoms), DFT+U with U=5.0 eV on d-orbitals, initial magnetization |M|=1.0 Bohr_mag along x, y, z directions respectively. The cubic lattice guarantees that the total energy must be direction-independent.
| Mag direction |
Total Energy (eV) |
ΔE vs z (eV) |
| z |
-6789.4505628781635 |
0.0 |
| y |
-6789.4505628781571 |
6.4e-12 |
| x |
-6789.4505810042838 |
1.8e-05 |
- Before fix: |E_y − E_z| ≈ 4.09 eV (broken rotational invariance — ρ_y sign was wrong)
- After fix: |E_y − E_z| ≈ 6.4e-12 eV (machine precision ✓); |E_x − E_z| ≈ 1.8e-05 eV (SCF convergence tolerance ✓)
Test 3: ELPA nspin=4 crash resolved
Before the blacs_context fix, all nspin=4 calculations using ks_solver=genelpa crashed with:
Fatal error in internal_Bcast: Invalid datatype
MPI_Bcast(buffer=..., count=3, INVALID DATATYPE, ...)
This occurred inside ELPA's elpa_cholesky / elpa_invert_triangular when processing complex matrices. After the fix, genelpa works correctly for nspin=4.
Modified Files
| File |
Change |
source/source_lcao/module_gint/gint_common.cpp |
Fix σ_y coefficient clx_j and add std::conj() for lower triangle |
source/source_lcao/module_operator_lcao/dftu_lcao.cpp |
Fix transfer_vu σ_y sign |
source/source_lcao/module_operator_lcao/dftu_force_stress.hpp |
Remove VU/=2.0, guard force*=2.0 with nspin!=4 |
source/source_lcao/module_operator_lcao/dspin_force_stress.hpp |
Guard force*=2.0 with nspin!=4 |
source/source_estate/module_dm/density_matrix.cpp |
Fix ρ_y sign in func_xyz_to_updown (both TR specializations) |
source/source_hsolver/module_genelpa/elpa_new.cpp |
Add missing blacs_context to Constructor 1 |
tests/03_NAO_multik/scf_angle_spin4/result.ref |
Update reference energy |
tests/03_NAO_multik/scf_u_spin4/result.ref |
Update reference energy/force/stress |
Expected behavior
No response
To Reproduce
No response
Environment
No response
Additional Context
No response
Task list for Issue attackers (only for developers)
Describe the bug
Summary
Multiple bugs were found and fixed in the nspin=4 LCAO code path. All fixes are verified by two independent criteria:
Bug List and Fixes
Bug 1: Hamiltonian Pauli-to-spinor conversion (
gint_common.cpp)The coefficients for the σ_y channel in the Pauli-to-spinor conversion were incorrect:
Derivation: Since σ_y = [[0,−i],[i,0]], the spinor Hamiltonian components are:
Additionally, the lower-triangle fill was missing conjugation:
Bug 2: DFT+U potential Pauli-to-spinor conversion (
dftu_lcao.cpp)The
transfer_vufunction had the same σ_y sign error:Bug 3: DFT+U force — incorrect VU scaling and force factor (
dftu_force_stress.hpp)Two errors in the nspin=4 DFT+U force code:
VU /= 2.0was incorrect: The occupation matrixoccand potentialVUare both in Pauli basis; there is no need to divide VU by 2. Removed the division.force *= 2.0was incorrect for nspin=4: The factor 2 compensates for spin degeneracy in nspin=1, but in Pauli basis (nspin=4) all spin channels are already included. Addednspin != 4guard.Bug 4: DeltaSpin force — incorrect force factor (
dspin_force_stress.hpp)Same
force *= 2.0error as Bug 3. Addednspin != 4guard.Bug 5: DM Fourier transform ρ_y sign error (
density_matrix.cpp)The
func_xyz_to_updownfunction converts spinor DM to Pauli basis. The ρ_y formula was wrong in both template specializations:Real version (TR=double):
Complex version (TR=complex):
Derivation: From ρ = ρ₀I + ρ_x σ_x + ρ_y σ_y + ρ_z σ_z with σ_y = [[0,−i],[i,0]]:
Bug 6: ELPA missing
blacs_contextin Constructor 1 (elpa_new.cpp)Constructor 1 of
ELPA_Solverwas missingelpa_set_integer("blacs_context", cblacs_ctxt), while Constructor 2 (otherParameter) already had it. This caused ELPA's internal MPI operations (e.g.MPI_Bcastin complex Cholesky/invert_triangular) to fail withINVALID DATATYPEwhen using complex eigensolves.Verification
Test 1: All nspin=4 test cases pass
Test 2: Rotational invariance of DFT+U total energy
Setup: Fe BCC (2 atoms), DFT+U with U=5.0 eV on d-orbitals, initial magnetization |M|=1.0 Bohr_mag along x, y, z directions respectively. The cubic lattice guarantees that the total energy must be direction-independent.
Test 3: ELPA nspin=4 crash resolved
Before the
blacs_contextfix, all nspin=4 calculations usingks_solver=genelpacrashed with:This occurred inside ELPA's
elpa_cholesky/elpa_invert_triangularwhen processing complex matrices. After the fix, genelpa works correctly for nspin=4.Modified Files
source/source_lcao/module_gint/gint_common.cppclx_jand addstd::conj()for lower trianglesource/source_lcao/module_operator_lcao/dftu_lcao.cpptransfer_vuσ_y signsource/source_lcao/module_operator_lcao/dftu_force_stress.hppVU/=2.0, guardforce*=2.0withnspin!=4source/source_lcao/module_operator_lcao/dspin_force_stress.hppforce*=2.0withnspin!=4source/source_estate/module_dm/density_matrix.cppfunc_xyz_to_updown(both TR specializations)source/source_hsolver/module_genelpa/elpa_new.cppblacs_contextto Constructor 1tests/03_NAO_multik/scf_angle_spin4/result.reftests/03_NAO_multik/scf_u_spin4/result.refExpected behavior
No response
To Reproduce
No response
Environment
No response
Additional Context
No response
Task list for Issue attackers (only for developers)