bulkcopy panics with mid > len when loading a lot of data #513

@taehoon-song

Describe the bug

Loading data (300k rows, 90 columns) into a table with bulkcopy hangs for a bit and then panics. Please see the error message below.

Based on some testing, there seems to be a limit on how much data bulkcopy can send (or how long it can run) before it panics. For example, loading the same rows restricted to the first 50 columns never fails.
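To put rough numbers on the two cases, here is a back-of-the-envelope sketch. It assumes every value is a 4-byte int (as in the repro script below; the column types of the original 90-column data are not stated here) and it ignores all protocol overhead, so the figures are only a lower bound on what is actually sent.

```python
# Assumption: 4 bytes per value (SQL Server int); protocol overhead is ignored.
BYTES_PER_INT = 4


def raw_payload_bytes(rows: int, cols: int, bytes_per_value: int = BYTES_PER_INT) -> int:
    """Return the raw size of rows x cols fixed-width values, in bytes."""
    return rows * cols * bytes_per_value


failing = raw_payload_bytes(300_000, 90)  # the case that panics
passing = raw_payload_bytes(300_000, 50)  # the case that never fails
print(f"90 columns: {failing / 1e6:.0f} MB, 50 columns: {passing / 1e6:.0f} MB")
```

Under those assumptions, the threshold (if it is size-based at all) would sit somewhere between the two values printed.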

Exception message: thread '<unnamed>' panicked at C:\Users\cloudtest\.cargo\registry\src\pkgs.dev.azure.com-d56355263f74e859\tokio-1.49.0\src\io\util\write_all.rs:45:57:
mid > len
Stack trace:
   0:     0x7ff8385d1292 - PyInit_mssql_py_core
   1:     0x7ff8385e922b - PyInit_mssql_py_core
   2:     0x7ff8385cd807 - PyInit_mssql_py_core
   3:     0x7ff8385d10d5 - PyInit_mssql_py_core
   4:     0x7ff8385d351e - PyInit_mssql_py_core
   5:     0x7ff8385d3294 - PyInit_mssql_py_core
   6:     0x7ff8385d3fdb - PyInit_mssql_py_core
   7:     0x7ff8385d3e32 - PyInit_mssql_py_core
   8:     0x7ff8385d197f - PyInit_mssql_py_core
   9:     0x7ff8385d3a7e - PyInit_mssql_py_core
  10:     0x7ff8386040a1 - PyInit_mssql_py_core
  11:     0x7ff8384e43c9 - PyInit_mssql_py_core
  12:     0x7ff8384b9013 - PyInit_mssql_py_core
  13:     0x7ff8384bbc92 - PyInit_mssql_py_core
  14:     0x7ff83843ee82 - PyInit_mssql_py_core
  15:     0x7ff8384ee277 - PyInit_mssql_py_core
  16:     0x7ff8384f1ef3 - PyInit_mssql_py_core
  17:     0x7ff8381848d8 - <unknown>
  18:     0x7ff83818d99b - <unknown>
  19:     0x7ff8381baf7f - <unknown>
  20:     0x7ff83817233b - <unknown>
  21:     0x7ff83834601b - PyInit_mssql_py_core
  22:     0x7ff8382f4c10 - <unknown>
  23:     0x7ff8382f7944 - <unknown>
  24:     0x7ff8382ebd88 - <unknown>
  25:     0x7ff8382f5654 - <unknown>
  26:     0x7ff83bcacabd - PyType_Modified
  27:     0x7ff83bc4512d - PyObject_Vectorcall
  28:     0x7ff83bc45099 - PyObject_Vectorcall
  29:     0x7ff83bc465bf - PyEval_EvalFrameDefault
  30:     0x7ff83bc445bc - PyFunction_Vectorcall
  31:     0x7ff83bc6ba9c - PyLong_New
  32:     0x7ff83bc8c223 - PyObject_Call
  33:     0x7ff83bc8c117 - PyObject_Call
  34:     0x7ff83bc4a16f - PyEval_EvalFrameDefault
  35:     0x7ff83bc445bc - PyFunction_Vectorcall
  36:     0x7ff83bc83afb - PyObject_FastCallDictTstate
  37:     0x7ff83bd65a0f - PyObject_Call_Prepend
  38:     0x7ff83bd6593a - PyCell_Set
  39:     0x7ff83bc456b6 - PyObject_Vectorcall
  40:     0x7ff83bc45099 - PyObject_Vectorcall
  41:     0x7ff83bc465bf - PyEval_EvalFrameDefault
  42:     0x7ff83bc445bc - PyFunction_Vectorcall
  43:     0x7ff83bc83afb - PyObject_FastCallDictTstate
  44:     0x7ff83bd65a0f - PyObject_Call_Prepend
  45:     0x7ff83bd6593a - PyCell_Set
  46:     0x7ff83bc8c15e - PyObject_Call
  47:     0x7ff83bc4a16f - PyEval_EvalFrameDefault
  48:     0x7ff83bc445bc - PyFunction_Vectorcall
  49:     0x7ff83bc83afb - PyObject_FastCallDictTstate
  50:     0x7ff83bd65a0f - PyObject_Call_Prepend
  51:     0x7ff83bd6593a - PyCell_Set
  52:     0x7ff83bc456b6 - PyObject_Vectorcall
  53:     0x7ff83bc45099 - PyObject_Vectorcall
  54:     0x7ff83bc465bf - PyEval_EvalFrameDefault
  55:     0x7ff83bc445bc - PyFunction_Vectorcall
  56:     0x7ff83bc83afb - PyObject_FastCallDictTstate
  57:     0x7ff83bd65a0f - PyObject_Call_Prepend
  58:     0x7ff83bd6593a - PyCell_Set
  59:     0x7ff83bc456b6 - PyObject_Vectorcall
  60:     0x7ff83bc45099 - PyObject_Vectorcall
  61:     0x7ff83bc465bf - PyEval_EvalFrameDefault
  62:     0x7ff83bc445bc - PyFunction_Vectorcall
  63:     0x7ff83bc83afb - PyObject_FastCallDictTstate
  64:     0x7ff83bd65a0f - PyObject_Call_Prepend
  65:     0x7ff83bd6593a - PyCell_Set
  66:     0x7ff83bc456b6 - PyObject_Vectorcall
  67:     0x7ff83bc45099 - PyObject_Vectorcall
  68:     0x7ff83bc465bf - PyEval_EvalFrameDefault
  69:     0x7ff83bc832bc - PyLong_FormatWriter
  70:     0x7ff83bc55caa - PyEval_EvalCode
  71:     0x7ff83bc55a9f - PyEval_GetBuiltin
  72:     0x7ff83bc55944 - PyEval_GetBuiltin
  73:     0x7ff83bc8d6f7 - PySet_Add
  74:     0x7ff83bc4512d - PyObject_Vectorcall
  75:     0x7ff83bc45099 - PyObject_Vectorcall
  76:     0x7ff83bc465bf - PyEval_EvalFrameDefault
  77:     0x7ff83bc445bc - PyFunction_Vectorcall
  78:     0x7ff83bc8c1cd - PyObject_Call
  79:     0x7ff83bc8c117 - PyObject_Call
  80:     0x7ff83bc20e18 - PyArg_Parse
  81:     0x7ff83bc2164d - Py_RunMain
  82:     0x7ff83bc2145c - Py_RunMain
  83:     0x7ff83bc20ee7 - Py_Main
  84:     0x7ff735ca1230 - <unknown>
  85:     0x7ff9346ae8d7 - <unknown>
  86:     0x7ff93602c53c - RtlUserThreadStart

To reproduce

import os
import mssql_python as mssql

connection_url = os.environ["CONN_URL"]
conn = mssql.connect(connection_url)
cur = conn.cursor()

# Create table for testing
cols = 200
rows = 400_000
table_name = "bcp_test_limit"
col_stmt = ",".join([f"[col_{i}] [int]" for i in range(cols)])
stmt = f"CREATE TABLE {table_name} ({col_stmt})"
conn.execute(stmt)
conn.commit()

# Generate data
data = [tuple(range(cols)) for _ in range(rows)]

# Try loading data
cur.bulkcopy(table_name=table_name, data=data)
conn.commit()
conn.close()
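
To narrow down whether the panic depends on how much data goes into a single bulkcopy call, the load can be split into smaller batches. This is only a sketch: `chunked` and `bulkcopy_in_chunks` are hypothetical helpers, not part of mssql_python, and they use only the `table_name` and `data` arguments shown in the repro above. I have not verified that batching avoids the panic.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def chunked(rows: Iterable[Tuple], size: int) -> Iterator[List[Tuple]]:
    """Yield successive lists of at most `size` rows."""
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def bulkcopy_in_chunks(cur, table_name: str, rows: Iterable[Tuple], size: int) -> int:
    """Call cur.bulkcopy once per batch and return the number of batches sent."""
    sent = 0
    for batch in chunked(rows, size):
        # Same call shape as the repro; only the amount of data per call changes.
        cur.bulkcopy(table_name=table_name, data=batch)
        sent += 1
    return sent
```

With the repro script, that would be `bulkcopy_in_chunks(cur, table_name, data, 50_000)` in place of the single `cur.bulkcopy(...)` call. Varying the batch size should show whether the failure tracks the size of a single call or the total amount sent over the connection.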

Expected behavior

bulkcopy should not panic.

Further technical details

Python version: 3.12.11
SQL Server version: Azure SQL 2022
Operating system: Windows 11

Labels

triage needed
