performance-tests/README.md
2025-11-18 12:55:09 +01:00

# Performance Tests — CPU vs WebAssembly vs Node.js vs OpenCL vs Browser Pthread
This project benchmarks the same arithmetic workload, an element-wise addition of two arrays, across multiple execution models:
```
c[i] = a[i] + b[i]
```
Array size: ~15 million `float` elements.
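
As a rough illustration of the workload, here is a minimal single-threaded sketch in plain JavaScript. It is not the repository's code: the function names are hypothetical, the exact element count is assumed to be 15,000,000, and the input pattern (`a[i] = i`, `b[i] = 2 * i`) is an assumption chosen only because it reproduces the sample outputs `c[0] = 0` and `c[1] = 3` quoted in the results section.

```javascript
// Hypothetical input pattern: a[i] = i, b[i] = 2 * i, so c[0] = 0 and c[1] = 3.
function makeInputs(n) {
  const a = new Float32Array(n);
  const b = new Float32Array(n);
  for (let i = 0; i < n; i++) {
    a[i] = i;
    b[i] = 2 * i;
  }
  return { a, b };
}

// The benchmarked kernel: c[i] = a[i] + b[i].
function addArrays(a, b, c) {
  for (let i = 0; i < c.length; i++) {
    c[i] = a[i] + b[i];
  }
}

// Assumed element count; the README only says "~15 million".
const elementCount = 15_000_000;
const inputs = makeInputs(elementCount);
const result = new Float32Array(elementCount);

// Time only the compute phase, not allocation or initialisation.
const start = performance.now();
addArrays(inputs.a, inputs.b, result);
const elapsedMs = performance.now() - start;

console.log(`calc time: ${elapsedMs.toFixed(2)} ms`);
console.log(`c[0] = ${result[0]}, c[1] = ${result[1]}`);
```

Typed arrays keep the data as 32-bit floats, which matches the `float` element type of the C versions more closely than ordinary JavaScript arrays would.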
Compared implementations:
• Native C — single core
• Native C — Pthreads
• Native C — OpenMP
• Node.js — JavaScript
• Node.js — WebAssembly (single core)
• Node.js — WebAssembly + Pthreads (multi-core)
• Browser — WebAssembly + Pthreads (multi-core)
• OpenCL — CPU and GPU
---
## Build & Run (All Tests)
Run every benchmark automatically (native, Node.js, WebAssembly, OpenCL):
```
bash compile.sh
```
The script:
1. Compiles all binaries (.c → ./binaries/*)
2. Builds WebAssembly versions using Emscripten
3. Executes each performance test in sequence
4. Prints timing + result samples
Requirements:
• gcc / clang
• Node.js
• Emscripten (for WASM builds)
• OpenCL dev libs (optional, for OpenCL tests)
---
## Run Browser WebAssembly + Pthreads Version
```
cd wasm_pthread_fast/web
node server.js
```
Then open:
```
http://localhost:1234
```
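This runs the multithreaded WASM benchmark inside the browser. Browsers expose `SharedArrayBuffer`, which WebAssembly threads depend on, only to cross-origin isolated pages, so load the page through the local server rather than opening the HTML file directly from disk.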
---
## Example Results
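Every implementation validates its output against the expected values (`c[0] = 0`, `c[1] = 3`, …).
Times are in milliseconds. Rows marked `(calc)` report the compute phase only and the OpenCL CPU row reports total time; the remaining rows do not say which, so compare across those groups with care.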
| Method | Platform | Cores | Total / Calc Time (ms) | Status |
| ------------------------- | -------- | ----- | ---------------------- | ---------------- |
| Native C | CPU | 1 | 210.63 | OK |
| Node.js | CPU | 1 | 215.15 | OK |
| Wasm Node.js | CPU | 1 | 219.81 | OK |
| OpenMP | CPU | multi | 140.58 | OK |
| C Pthreads | CPU | multi | 21.98 (calc) | **Fastest CPU** |
| Wasm + Pthreads (Node.js) | CPU | multi | 23.08 (calc) | **Very fast** |
| Wasm + Pthreads (Browser) | CPU | multi | 35.21 (calc) | **Fast** |
| OpenCL CPU only | CPU | many? | 162.36 total | OK |
| OpenCL GPU | GPU | many | Crash | Driver dependent |
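
To put the table in relative terms, the short sketch below computes each multi-core result's speedup over the single-core native C time. The figures are copied from the table. The helper name `speedup` is made up for this example, and because the table mixes calc-only and unspecified timings, the ratios are indicative at best.

```javascript
// Single-core native C time from the table (ms).
const baselineMs = 210.63;

// Multi-core rows from the table (ms).
const multiCoreResults = [
  { method: "OpenMP", ms: 140.58 },
  { method: "C Pthreads", ms: 21.98 },
  { method: "Wasm + Pthreads (Node.js)", ms: 23.08 },
  { method: "Wasm + Pthreads (Browser)", ms: 35.21 },
];

// Speedup is the baseline time divided by the measured time.
function speedup(baseline, measured) {
  return baseline / measured;
}

for (const row of multiCoreResults) {
  const factor = speedup(baselineMs, row.ms).toFixed(2);
  console.log(`${row.method}: ${factor}x vs single-core native C`);
}
```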
---
## Folder Overview
| Path | Description |
| --------------------- | ------------------------------ |
| add_single_core.c | Single-threaded C baseline |
| pthread_add.c | Multi-core with Pthreads |
| openmp_add.c | Multi-core with OpenMP |
| opencl_add_cpu.c | CPU via OpenCL runtime |
| opencl_add_gpu.c | GPU compute attempt |
| wasm_add.c | WebAssembly (single-core) |
| wasm_add_pthread.c | WebAssembly (multi-core) |
| wasm_node.js | Node test for single-core WASM |
| wasm_pthread_fast/ | Multi-threaded WASM version |
| wasm_pthread_fast/web | Browser runner + local server |
| compile.sh | Complete build + test pipeline |
---
## Findings
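• Multi-core execution is far faster than single-threaded execution for this workload: C Pthreads reports 21.98 ms (calc) against 210.63 ms for single-core native C. OpenMP, at 140.58 ms, gains much less.
• Node.js with WebAssembly threads comes close to native Pthreads (23.08 ms vs 21.98 ms).
• WebAssembly threads in the browser are also fast (35.21 ms), though about 1.5x slower than the same benchmark under Node.js.
• This workload gains nothing from the GPU because memory transfer is the bottleneck. A GPU should pull ahead once the computation per element is higher.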
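
The memory-bound nature of the workload can be made concrete with some back-of-the-envelope arithmetic. The sketch below assumes exactly 15,000,000 elements and 4-byte floats (the README only says "~15 million"), and counts one read of `a`, one read of `b`, and one write of `c` per element. The function names are hypothetical.

```javascript
const BYTES_PER_FLOAT = 4;

// Two arrays read plus one array written per element.
function trafficBytes(elementCount) {
  return 3 * BYTES_PER_FLOAT * elementCount;
}

// Effective throughput in GB/s (1 GB = 1e9 bytes) for a given run time.
function effectiveGBps(bytes, elapsedMs) {
  return bytes / (elapsedMs / 1000) / 1e9;
}

const assumedElements = 15_000_000; // assumption, see lead-in
const bytesMoved = trafficBytes(assumedElements);

console.log(`bytes moved per run: ${bytesMoved}`);
console.log(`bytes per addition: ${bytesMoved / assumedElements}`);
// 21.98 ms is the C Pthreads calc time from the results table.
console.log(`C Pthreads throughput: ${effectiveGBps(bytesMoved, 21.98).toFixed(2)} GB/s`);
```

With 12 bytes of memory traffic for every single addition, the arithmetic units mostly wait on memory. A discrete GPU additionally has to copy the inputs to the device and the result back, which is why a kernel this light is unlikely to pay for the transfer.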
---
## Future Expansion
• Higher compute complexity test kernels
• Multi-run average statistics
• Visual charts comparing performance gaps
• GPU-friendly workloads showing real acceleration crossover
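
As a starting point for the multi-run statistics item, the sketch below times a function several times and summarises the samples. It is a suggestion under assumptions rather than part of the project: `timeRuns` and `summarize` are hypothetical names, and the single warm-up run and the small demo array are arbitrary choices.

```javascript
// Run fn a few times and collect per-run wall-clock times in milliseconds.
function timeRuns(fn, runs, warmupRuns = 1) {
  for (let i = 0; i < warmupRuns; i++) fn(); // let the JIT settle before measuring
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  return samples;
}

// Reduce a list of samples to min, median, mean, and max.
function summarize(samples) {
  if (samples.length === 0) throw new Error("need at least one sample");
  const sorted = [...samples].sort((x, y) => x - y);
  const mid = Math.floor(sorted.length / 2);
  const median =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const mean = sorted.reduce((sum, v) => sum + v, 0) / sorted.length;
  return {
    runs: sorted.length,
    min: sorted[0],
    median,
    mean,
    max: sorted[sorted.length - 1],
  };
}

// Demo on a small array so the example finishes quickly.
const demoInput = new Float32Array(1_000_000).fill(1);
const demoOutput = new Float32Array(demoInput.length);
const demoSamples = timeRuns(() => {
  for (let i = 0; i < demoInput.length; i++) {
    demoOutput[i] = demoInput[i] + demoInput[i];
  }
}, 5);

console.log(summarize(demoSamples));
```

Reporting the median alongside the mean makes a single slow run, for example one interrupted by the operating system, less likely to distort the comparison.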