Commit 7b1ad4f

Author: peng.li24
docs: update README to reflect actual project structure and 961 tests
1 parent cfb1848 commit 7b1ad4f

1 file changed

Lines changed: 45 additions & 39 deletions

README.md

@@ -4,7 +4,7 @@
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![C++17](https://img.shields.io/badge/C%2B%2B-17-blue.svg)](https://en.cppreference.com/w/cpp/17)
 [![CMake](https://img.shields.io/badge/CMake-%3E%3D3.16-green.svg)](https://cmake.org/)
-[![Tests](https://img.shields.io/badge/tests-900%20bit--exact-brightgreen.svg)](tests/test_all.py)
+[![Tests](https://img.shields.io/badge/tests-961%20bit--exact-brightgreen.svg)](tests/test_all.py)
 [![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](CONTRIBUTING.md)
 
 ## Background
@@ -17,7 +17,7 @@ We created `numpycpp` to keep NumPy's familiar usage patterns while letting C++
 
 `numpycpp` is a **header-only C++ library** implementing numpy's core API (`numpy.*`, `numpy.linalg.*`, `numpy.einsum`) with **bit-level precision alignment**. Raw pointer + size interface. Zero external dependencies — pure C++17 standard library.
 
-All APIs are tested against Python numpy under strict bit-level comparison: every IEEE 754 float bit must match exactly (900 tests, float64 + float32, including NaN passthrough, signed-zero, ±∞, domain-error cases, and advanced indexing).
+All APIs are tested against Python numpy under strict bit-level comparison: every IEEE 754 float bit must match exactly (961 tests, float64 + float32, including NaN passthrough, signed-zero, ±∞, domain-error cases, and advanced indexing).
 
 **Bit-exact math** is achieved by resolving numpy's own math functions from `_multiarray_umath.so` at runtime. The SVML bridge auto-detects your CPU and selects the same path numpy uses: AVX‑512 SVML (`__svml_exp8`) when available, or scalar `npy_exp`/`npy_log`/etc. otherwise. AVX‑512 intrinsics are isolated behind `__attribute__((target))` — the binary is safe on any x86_64 CPU (no SIGILL). Every transcendental function produces the exact same IEEE 754 bits as numpy on **all architectures**.
 
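Reviewer note on the hunk above: the "strict bit-level comparison" the README describes is stricter than `==` on floats. NaN never compares equal to itself, and `0.0 == -0.0` is true, so a bit-exact check has to compare the raw IEEE 754 bit patterns instead. The following is a minimal C++ sketch of that idea, not code from this repository: `bits_of`, `bit_equal`, and `bit_exact_examples` are hypothetical names chosen for illustration, and the project's real checks live in `tests/test_all.py` according to the README.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Assumption of this sketch: the platform uses IEEE 754 binary32/binary64.
static_assert(std::numeric_limits<double>::is_iec559, "IEEE 754 double required");
static_assert(std::numeric_limits<float>::is_iec559, "IEEE 754 float required");

// Hypothetical helpers (not part of numpycpp): copy the object representation
// into an integer of the same width. memcpy avoids aliasing problems.
inline std::uint64_t bits_of(double x) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "double must be 64 bits");
    std::uint64_t u;
    std::memcpy(&u, &x, sizeof(u));
    return u;
}

inline std::uint32_t bits_of(float x) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
    std::uint32_t u;
    std::memcpy(&u, &x, sizeof(u));
    return u;
}

// Two values are bit-equal only if every bit matches: no atol, no rtol.
template <typename T>
bool bit_equal(T a, T b) {
    return bits_of(a) == bits_of(b);
}

// Illustrates the cases where bit equality and operator== disagree.
inline bool bit_exact_examples() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const bool nan_ok  = !(nan == nan) && bit_equal(nan, nan);      // NaN passthrough
    const bool zero_ok = (0.0 == -0.0) && !bit_equal(0.0, -0.0);    // signed zero
    const bool ulp_ok  = !bit_equal(1.0, std::nextafter(1.0, 2.0)); // 1 ULP apart
    return nan_ok && zero_ok && ulp_ok;
}
```

A caller could check an array element by element with `bit_equal(cpp_result[i], expected[i])`, where `expected` holds reference values exported from NumPy; both array names are placeholders here.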
@@ -32,23 +32,22 @@ All APIs are tested against Python numpy under strict bit-level comparison: ever
 **Public headers** — include the umbrella or individual modules:
 
 ```cpp
-#include "numpy/numpy.h" // ← single entry point (recommended)
+#include <numpycpp/numpy.h> // ← single entry point (recommended)
 
 // or include only what you need:
-#include "numpy/init.h" // zeros_like, ones_like, full
-#include "numpy/elementwise.h" // sqrt, exp, sin, astype, …
-#include "numpy/reduce.h" // sum, mean, std, var, cumsum, …
-#include "numpy/manipulation.h" // transpose, take, slice, putmask, …
-#include "numpy/io.h" // isin, interp, unwrap, …
-#include "numpy/linalg.h" // dot, norm, matmul, einsum
+#include <numpycpp/init.h> // zeros_like, ones_like, full, arange, …
+#include <numpycpp/elementwise.h> // sqrt, exp, sin, astype, …
+#include <numpycpp/reduce.h> // sum, mean, std, var, cumsum, …
+#include <numpycpp/manipulation.h> // transpose, take, slice, putmask, …
+#include <numpycpp/io.h> // isin, interp, unwrap, …
+#include <numpycpp/linalg.h> // dot, norm, matmul, einsum
 ```
 
-> `numpy/detail/` headers are **internal** — automatically pulled in by the
-> public headers. Do not include them directly; a compile-time `#error` fires
-> if you try.
+> `numpycpp/detail/` headers are **internal** — automatically pulled in by the
+> public headers. Do not include them directly.
 >
-> Legacy single-file headers `numpy/core.h` and `numpy/einsum.h` are kept as
-> backward-compatible shims that simply `#include "numpy/numpy.h"`.
+> **pybind11 users** — include `<numpycpp/numpy_py.h>` instead to get the full
+> set of pybind11 wrapper functions (`numpy::sum(py::array_t<T>)` etc.).
 
 ```cpp
 std::vector<double> data = {1.0, 4.0, 9.0};
@@ -118,8 +117,7 @@ Add `-Ipath/to/numpycpp` to your compiler flags and include the headers directly
 
 The test suite verifies **bit-level precision alignment** between every C++ function and Python numpy.
 No tolerance, no `atol`/`rtol` — raw IEEE 754 bits must match exactly.
-900 tests: float64 + float32, including NaN passthrough, signed-zero, ±∞, domain errors, advanced indexing, and AVX-512 boundary sizes.
-In std mode ~399 precision-independent tests run (structural, reduction, manipulation, io, comparison, astype, advanced indexing).
+961 tests: float64 + float32, including NaN passthrough, signed-zero, ±∞, domain errors, advanced indexing, and AVX-512 boundary sizes.
 
 ```bash
 # build
@@ -157,7 +155,7 @@ cmake -DNUMPYCPP_STD_ONLY=ON .. # std / performance-first backend
 #### Compiler flags — bitexact backend (`NUMPYCPP_STD_ONLY=OFF`)
 
 The minimum set was determined empirically: each flag was removed in isolation
-and the full 900-test suite was re-run. Only flags whose removal caused at
+and the full 961-test suite was re-run. Only flags whose removal caused at
 least one test failure are marked **required**.
 
 ```cmake
@@ -251,41 +249,49 @@ Two backends, same API — choose with `cmake -DNUMPYCPP_STD_ONLY=ON/OFF`.
 
 ```
 numpycpp/
-├── numpy/ # native C++ headers
-│ ├── numpy.h # [PUBLIC] umbrella — #includes everything below
-│ ├── init.h # [PUBLIC] zeros_like, ones_like, full
+├── numpycpp/ # header-only library (all public + internal headers)
+│ ├── numpy.h # [PUBLIC] umbrella — includes all core modules below
+│ ├── numpy_py.h # [PUBLIC] umbrella — includes all pybind11 wrappers below
+│ ├── init.h # [PUBLIC] zeros_like, ones_like, full, arange, linspace, …
+│ ├── init_py.h # [PUBLIC] pybind11 wrappers for init.h
 │ ├── elementwise.h # [PUBLIC] sqrt/exp/sin/…, comparison, logical, astype
+│ ├── elementwise_py.h # [PUBLIC] pybind11 wrappers for elementwise.h
 │ ├── reduce.h # [PUBLIC] sum/mean/std/var/cumsum, axis reductions
+│ ├── reduce_py.h # [PUBLIC] pybind11 wrappers for reduce.h
 │ ├── manipulation.h # [PUBLIC] transpose/take/slice/put/putmask/argsort/…
-│ ├── io.h # [PUBLIC] isin, interp, unwrap, safe_divide
+│ ├── manipulation_py.h # [PUBLIC] pybind11 wrappers for manipulation.h
+│ ├── io.h # [PUBLIC] isin, interp, unwrap, safe_divide, …
+│ ├── io_py.h # [PUBLIC] pybind11 wrappers for io.h
 │ ├── linalg.h # [PUBLIC] dot, norm, matmul, einsum
-│ ├── core.h # [SHIM] backward-compat → #include "numpy.h"
-│ ├── einsum.h # [SHIM] backward-compat → #include "numpy.h"
-│ └── detail/ # [INTERNAL] do not include directly — #error guard
+│ ├── linalg_py.h # [PUBLIC] pybind11 wrappers for linalg.h
+│ └── detail/ # [INTERNAL] do not include directly
 │ ├── macros.h # NUMPY_UNROLL4, NUMPY_SMALL_STACK
-│ ├── math_backend.h # selector: STD_ONLY → std_math_backend, else svml_bridge
 │ ├── svml_bridge.h # bitexact: SVML / npy_* scalar math (dlsym)
 │ ├── std_math_backend.h # std: pure <cmath> std::exp/log/sin/… (no deps)
-│ ├── npy_math_float.h # bitexact: npy_* float32 wrappers
-│ ├── linalg_backend.h # selector: STD_ONLY → std_linalg_backend, else blas_bridge
 │ ├── blas_bridge.h # bitexact: OpenBLAS ILP64 cblas wrappers (dlsym)
 │ ├── std_linalg_backend.h# std: pure C++ loop dot/gemm (no deps)
-│ └── avx512_loops.h # bitexact: AVX-512 vectorised exp/sin/cos loops
-├── pycpp/ # pybind11 wrappers (optional)
-│ ├── pycpp.h # [PUBLIC] umbrella — #includes everything below
-│ ├── init_py.h # [PUBLIC] zeros_like, ones_like, full
-│ ├── elementwise_py.h # [PUBLIC] sqrt/exp/sin/…, comparison, logical, astype
-│ ├── reduce_py.h # [PUBLIC] sum/mean/std/var/cumsum
-│ ├── manipulation_py.h # [PUBLIC] transpose/take/slice/put/putmask/…
-│ ├── io_py.h # [PUBLIC] isin, interp, unwrap, asarray, …
-│ ├── linalg_py.h # [PUBLIC] dot, norm, matmul, einsum
-│ ├── core_py.h # [SHIM] backward-compat → #include "pycpp.h"
-│ └── einsum_py.h # [SHIM] backward-compat → #include "pycpp.h"
+│ ├── avx512_loops.h # bitexact: AVX-512 vectorised exp/sin/cos loops
+│ └── npy_math_float.h # bitexact: npy_* float32 wrappers
+├── bench/ # performance benchmarks
+│ ├── CMakeLists.txt
+│ ├── bench_core.cpp # C++ benchmark driver
+│ ├── bench.py # pybind11-based benchmark runner
+│ └── bench_numpy.py # pure-numpy baseline
 ├── tests/ # bit-level precision tests + test module
 │ ├── module.cpp # pybind11 module for testing
-│ ├── test_all.py # single entry — all APIs, 900 tests, float64+float32
+│ ├── test_all.py # single entry — all APIs, 961 tests, float64+float32
 │ ├── conftest.py # silent-mode output suppression
+│ ├── make_csv.py # ULP precision CSV generator
+│ ├── diagnose_numpy.py # numpy internal diagnostic tool
+│ ├── ulp_precision.csv # per-function ULP comparison data
 │ └── CMakeLists.txt # test-module build
+├── example/ # minimal usage examples
+│ ├── CMakeLists.txt
+│ └── main.cpp
+├── cmake/
+│ └── preinst # DEB pre-install script (clean old headers)
+├── issue/ # issue tracking & root-cause analysis
+│ └── 001-mean_pairwise_sum_vs_sequential.md
 ├── CMakeLists.txt # build & .deb packaging
 └── README.md
 ```
