[BUG] Overflow in _C_make_dataobj due to c_int type #2777

@kelv1n9

Description

When running large 3D grids (e.g., 1295^3 points), Devito fails because of an integer overflow rooted in the C structure definition in devito/types/dense.py.
The problem originates from the use of ctypes.c_int (a 32-bit signed integer), which overflows when the array size exceeds 2^31-1 elements, even if index-mode=int64 and linearize=True are enabled.

This produces an incorrect memory size and a crash during GPU memory allocation.
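
The arithmetic behind the overflow can be checked in isolation. The following is a minimal sketch, not Devito code: it only shows that the element count of a 1295^3 grid does not fit in a signed 32-bit integer and what value it wraps to. The variable names are illustrative.

```python
import ctypes

n = 1295
elements = n ** 3            # 2,171,747,375 grid points
int32_max = 2 ** 31 - 1      # 2,147,483,647

# ctypes integer types perform no overflow checking, so the value
# silently wraps around when stored in a 32-bit signed integer.
wrapped = ctypes.c_int32(elements).value

print(elements > int32_max)  # the count exceeds the int32 range
print(wrapped)               # negative after wraparound
```

Padding and halo regions, which this sketch ignores, increase the real element count further.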

File

devito/types/dense.py

Affected Section

_C_ctype = POINTER(type(_C_structname, (Structure,),
                            {'_fields_': [(_C_field_data, c_restrict_void_p),
                                          (_C_field_size, POINTER(c_int)),
                                          (_C_field_nbytes, c_ulong),
                                          (_C_field_nopad_size, POINTER(c_ulong)),
                                          (_C_field_domain_size, POINTER(c_ulong)),
                                          (_C_field_halo_size, POINTER(c_int)),
                                          (_C_field_halo_ofs, POINTER(c_int)),
                                          (_C_field_owned_ofs, POINTER(c_int)),
                                          (_C_field_dmap, c_void_p)]}))

Proposed Fix

Replace c_int with c_long in the _C_ctype struct definition to safely handle arrays larger than 2^31-1 elements:

from ctypes import c_long

_C_ctype = POINTER(type(_C_structname, (Structure,),
                            {'_fields_': [(_C_field_data, c_restrict_void_p),
                                          (_C_field_size, POINTER(c_long)),
                                          (_C_field_nbytes, c_ulong),
                                          (_C_field_nopad_size, POINTER(c_ulong)),
                                          (_C_field_domain_size, POINTER(c_ulong)),
                                          (_C_field_halo_size, POINTER(c_long)),
                                          (_C_field_halo_ofs, POINTER(c_long)),
                                          (_C_field_owned_ofs, POINTER(c_long)),
                                          (_C_field_dmap, c_void_p)]}))

After this modification, all large-domain runs succeed without overflow.
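
One caveat is that the width of c_long is platform-dependent: it is 64-bit on Linux (the environment reported here) but 32-bit on Windows, whereas ctypes.c_int64 is 64-bit everywhere. The sketch below prints the widths on the current machine and uses a reduced, hypothetical stand-in for the struct (DataObjSketch and its single field are not Devito names) to show a 64-bit size array holding the grid shape.

```python
from ctypes import POINTER, Structure, c_int, c_long, c_int64, sizeof

# Report the width in bytes of each candidate type on this platform.
for label, ctype in (("c_int", c_int), ("c_long", c_long), ("c_int64", c_int64)):
    print(label, sizeof(ctype))


class DataObjSketch(Structure):
    # Hypothetical, reduced stand-in for the struct built in dense.py.
    _fields_ = [("size", POINTER(c_int64))]


# Per-dimension sizes stored as 64-bit integers.
shape = (c_int64 * 3)(1295, 1295, 1295)
obj = DataObjSketch()
obj.size = shape  # an array of the same element type can be assigned to a pointer field

# Reading through the pointer yields Python ints, so the product is exact.
total = obj.size[0] * obj.size[1] * obj.size[2]
print(total)
```

Whether the generated C code also performs the size product in 64-bit arithmetic depends on the matching C-side declaration, which this sketch does not cover.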

Steps to Reproduce:
1. Run a large 3D simulation (more than 2^31-1 elements, e.g., a 1295^3 grid).
2. Observe the runtime error (a CUDA memory allocation failure, triggered by an invalid size).
3. Inspect dense.py and note the use of c_int for the size fields.

Observed Behavior

Out of memory allocating 18446744065653020036 bytes of device memory
Failing in Thread:1
Accelerator Fatal Error: call to cuMemAlloc returned error 2 (CUDA_ERROR_OUT_OF_MEMORY)

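The reported allocation size is clearly invalid (≈ 1.8×10^19 bytes, just below 2^64). This is consistent with a negative size, produced by signed 32-bit overflow, being reinterpreted as an unsigned 64-bit value.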
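
The reported byte count can be decoded by hand. This is a minimal sketch, assuming the number is a negative 64-bit value printed as unsigned; the issue does not show where in the generated code that conversion happens.

```python
import ctypes

reported = 18446744065653020036  # byte count from the error message

# Reinterpret the unsigned 64-bit pattern as a signed 64-bit integer.
as_signed = ctypes.c_int64(reported).value

print(reported < 2 ** 64)  # fits in 64 bits
print(as_signed)           # a negative number of bytes
```

A negative result supports the reading that the size went negative before being passed to the allocator.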

Expected Behavior

Correct allocation for 24.5 GB domain (~2.17×10^9 elements), no overflow, normal execution.

Environment
• Devito version: 4.8.20
• Backend: OpenACC / NVHPC 25.1
• MPI: Enabled
• Hardware: NVIDIA H100 (80 GB HBM)
• OS: HPC cluster environment - Linux
• Python: 3.10 (NVHPC env)

Fix verified on GPU backends - replacing c_int with c_long solves the issue fully.
