# Performance Tests: CPU vs WebAssembly vs Node.js vs OpenCL vs Browser Pthread

This project benchmarks performance of the same arithmetic workload across multiple execution models:

```
c[i] = a[i] + b[i]
```

Array size: ~15 million `float` elements.
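
As a rough illustration of the workload, here is a minimal Node.js sketch of the same element-wise addition with a simple timer. It is not the repository's benchmark code. The input values (`a[i] = i`, `b[i] = 2 * i`) are an assumption chosen to reproduce the sample outputs this README reports (`c[0] = 0`, `c[1] = 3`), and the array is kept small so the sketch runs instantly.

```javascript
// Small N so the sketch runs instantly; the README's benchmark uses ~15 million.
const N = 1_000_000;

// Assumed inputs: a[i] = i and b[i] = 2 * i, which yield c[0] = 0 and c[1] = 3.
const a = new Float32Array(N);
const b = new Float32Array(N);
const c = new Float32Array(N);
for (let i = 0; i < N; i++) {
  a[i] = i;
  b[i] = 2 * i;
}

// The benchmarked kernel: element-wise addition into a preallocated output.
function addArrays(a, b, c) {
  for (let i = 0; i < c.length; i++) {
    c[i] = a[i] + b[i];
  }
}

const start = performance.now();
addArrays(a, b, c);
const elapsedMs = performance.now() - start;

console.log(`c[0] = ${c[0]}, c[1] = ${c[1]}, time = ${elapsedMs.toFixed(2)} ms`);
```

Each element needs two loads, one add, and one store, so the kernel is likely limited by memory bandwidth rather than arithmetic.
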

Compared implementations:

- Native C: single core
- Native C: Pthreads
- Native C: OpenMP
- Node.js: JavaScript
- Node.js: WebAssembly (single core)
- Node.js: WebAssembly + Pthreads (multi-core)
- Browser: WebAssembly + Pthreads (multi-core)
- OpenCL: CPU and GPU

---

## Build & Run (All Tests)

Run every benchmark automatically (native, Node.js, WebAssembly, OpenCL):

```
bash compile.sh
```

The script:

1. Compiles all binaries (`.c` → `./binaries/*`)
2. Builds WebAssembly versions using Emscripten
3. Executes each performance test in sequence
4. Prints timing + result samples

Requirements:

- gcc / clang
- Node.js
- Emscripten (for WASM builds)
- OpenCL dev libs (optional, for OpenCL tests)

---

## Run Browser WebAssembly + Pthreads Version

```
cd wasm_pthread_fast/web
node server.js
```

Then open:

```
http://localhost:1234
```

This runs the multithreaded WASM benchmark inside the browser with correct SharedArrayBuffer support.
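
Browsers only expose `SharedArrayBuffer`, which WebAssembly threads depend on, to pages that are cross-origin isolated. The sketch below shows one way a small Node.js server can send the two response headers that enable this. The repository's actual `server.js` is not shown here, so `withIsolation` and `placeholderHandler` are hypothetical names used only for illustration.

```javascript
// Sketch only: this is not the repository's server.js.
const http = require("node:http");

// Headers that make a page cross-origin isolated, which browsers require
// before they expose SharedArrayBuffer to scripts.
const ISOLATION_HEADERS = {
  "Cross-Origin-Opener-Policy": "same-origin",
  "Cross-Origin-Embedder-Policy": "require-corp",
};

// Wrap any request handler so every response carries the isolation headers.
function withIsolation(handler) {
  return (req, res) => {
    for (const [name, value] of Object.entries(ISOLATION_HEADERS)) {
      res.setHeader(name, value);
    }
    handler(req, res);
  };
}

// Hypothetical handler; a real server would send the .html, .js and .wasm files.
function placeholderHandler(req, res) {
  res.setHeader("Content-Type", "text/plain");
  res.end("benchmark page goes here\n");
}

// Created but not started; call server.listen(1234) to serve on the port used above.
const server = http.createServer(withIsolation(placeholderHandler));
```

With headers like these in place, `crossOriginIsolated` should report `true` in the page's console, which is a quick way to check the setup before running the benchmark.
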

---

## Example Results

All implementations that ran to completion validate correct output values (`c[0] = 0`, `c[1] = 3`, …); the OpenCL GPU run crashed.

Times in milliseconds:

| Method                    | Platform | Cores | Total / Calc Time (ms) | Status           |
| ------------------------- | -------- | ----- | ---------------------- | ---------------- |
| Native C                  | CPU      | 1     | 210.63                 | OK               |
| Node.js                   | CPU      | 1     | 215.15                 | OK               |
| Wasm Node.js              | CPU      | 1     | 219.81                 | OK               |
| OpenMP                    | CPU      | multi | 140.58                 | OK               |
| C Pthreads                | CPU      | multi | 21.98 (calc)           | **Fastest CPU**  |
| Wasm + Pthreads (Node.js) | CPU      | multi | 23.08 (calc)           | **Very fast**    |
| Wasm + Pthreads (Browser) | CPU      | multi | 35.21 (calc)           | **Fast**         |
| OpenCL CPU only           | CPU      | many? | 162.36 (total)         | OK               |
| OpenCL GPU                | GPU      | many  | Crash                  | Driver dependent |

---

## Folder Overview

| Path                    | Description                    |
| ----------------------- | ------------------------------ |
| `add_single_core.c`     | Single-threaded C baseline     |
| `pthread_add.c`         | Multi-core with Pthreads       |
| `openmp_add.c`          | Multi-core with OpenMP         |
| `opencl_add_cpu.c`      | CPU via OpenCL runtime         |
| `opencl_add_gpu.c`      | GPU compute attempt            |
| `wasm_add.c`            | WebAssembly (single-core)      |
| `wasm_add_pthread.c`    | WebAssembly (multi-core)       |
| `wasm_node.js`          | Node test for single-core WASM |
| `wasm_pthread_fast/`    | Multi-threaded WASM version    |
| `wasm_pthread_fast/web` | Browser runner + local server  |
| `compile.sh`            | Complete build + test pipeline |

---

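## Findings

- Multi-core execution with Pthreads is far faster than the single-threaded versions: 21.98 ms for C Pthreads versus 210.63 ms for single-core native C. OpenMP, at 140.58 ms, gains much less in this run. The Pthreads figures are marked as calculation time, so compare them with the unmarked rows with some care.
- Node.js with WebAssembly threads approaches native performance: 23.08 ms versus 21.98 ms for C Pthreads, about 5% slower.
- Browser WASM threading is also fast at 35.21 ms, though that is about 1.5x the Node.js figure.
- The GPU is not expected to benefit this workload because memory transfers dominate. The OpenCL GPU run crashed, so no GPU timing was recorded.
  - A GPU should win when the computation per element is higher.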
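
To put the findings in relative terms, the following sketch computes each method's speedup over the single-core native C baseline, using the timings from the results table. The table marks some rows as calculation-only and one as total, so these ratios are indicative rather than a strict like-for-like comparison.

```javascript
// Timings in milliseconds, copied from the results table (the crashed GPU run is omitted).
const timings = [
  { method: "Native C", ms: 210.63 },
  { method: "Node.js", ms: 215.15 },
  { method: "Wasm Node.js", ms: 219.81 },
  { method: "OpenMP", ms: 140.58 },
  { method: "C Pthreads", ms: 21.98 },
  { method: "Wasm + Pthreads (Node.js)", ms: 23.08 },
  { method: "Wasm + Pthreads (Browser)", ms: 35.21 },
  { method: "OpenCL CPU only", ms: 162.36 },
];

// Use the single-core native C run as the reference point.
const baselineMs = timings.find((t) => t.method === "Native C").ms;

// speedup > 1 means faster than the baseline, < 1 means slower.
const speedups = timings.map((t) => ({
  method: t.method,
  speedup: baselineMs / t.ms,
}));

for (const s of speedups) {
  console.log(`${s.method.padEnd(26)} ${s.speedup.toFixed(2)}x`);
}
```

On these numbers, C Pthreads comes out at roughly 9.6x the baseline, while the single-threaded Node.js and WebAssembly runs land just under 1x.
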

---

## Future Expansion

- Higher compute complexity test kernels
- Multi-run average statistics
- Visual charts comparing performance gaps
- GPU-friendly workloads showing real acceleration crossover
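
The multi-run statistics item could start from something like the sketch below, which times a function over several runs and reports mean, minimum, maximum, and sample standard deviation. The helper names `benchmark` and `summarize`, the run counts, and the small example workload are all hypothetical and are not part of the repository.

```javascript
// Summarize a list of timing samples (milliseconds).
function summarize(samples) {
  const n = samples.length;
  const mean = samples.reduce((sum, x) => sum + x, 0) / n;
  // Sample variance (divides by n - 1); falls back to 0 spread for a single run.
  const variance =
    samples.reduce((sum, x) => sum + (x - mean) ** 2, 0) / (n > 1 ? n - 1 : 1);
  return {
    runs: n,
    mean,
    min: Math.min(...samples),
    max: Math.max(...samples),
    stddev: Math.sqrt(variance),
  };
}

// Hypothetical helper: run `fn` a few untimed warm-up rounds, then time `runs` rounds.
function benchmark(fn, runs = 10, warmup = 2) {
  for (let i = 0; i < warmup; i++) fn();
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  return summarize(samples);
}

// Example workload: a small element-wise addition (sizes chosen for a quick run).
const x = Float32Array.from({ length: 100_000 }, (_, i) => i);
const y = Float32Array.from({ length: 100_000 }, (_, i) => 2 * i);
const out = new Float32Array(x.length);

const stats = benchmark(() => {
  for (let i = 0; i < out.length; i++) out[i] = x[i] + y[i];
});
console.log(stats);
```

Reporting the minimum alongside the mean helps separate the kernel's best-case cost from noise such as JIT warm-up or other processes on the machine.
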