OpenSWMM Engine 6.0.0-alpha.1
Data-oriented, plugin-extensible SWMM Engine (6.0.0-alpha.1)
SIMD.hpp File Reference

SIMD abstraction layer for vectorized arithmetic operations. More...

#include <cstddef>
#include <cmath>
#include <algorithm>
#include <cassert>
Include dependency graph for SIMD.hpp (graph omitted in this text rendering).


Namespaces

namespace  openswmm
 
namespace  openswmm::simd
 

Macros

#define OPENSWMM_RESTRICT   __restrict__
 
#define OPENSWMM_SIMD_SCALAR   1
 
#define OPENSWMM_SIMD_WIDTH   1
 Scalar fallback.
 

Functions

void openswmm::simd::add (const double *OPENSWMM_RESTRICT a, const double *OPENSWMM_RESTRICT b, double *OPENSWMM_RESTRICT dst, std::size_t n) noexcept
 Element-wise addition: dst[i] = a[i] + b[i].
 
void openswmm::simd::multiply (const double *OPENSWMM_RESTRICT a, const double *OPENSWMM_RESTRICT b, double *OPENSWMM_RESTRICT dst, std::size_t n) noexcept
 Element-wise multiplication: dst[i] = a[i] * b[i].
 
double openswmm::simd::min (const double *a, std::size_t n) noexcept
 Find the minimum value in an array.
 
double openswmm::simd::max (const double *a, std::size_t n) noexcept
 Find the maximum value in an array.
 
void openswmm::simd::clamp (double *a, double lo, double hi, std::size_t n) noexcept
 Clamp all elements of an array to [lo, hi].
 
double openswmm::simd::dot (const double *OPENSWMM_RESTRICT a, const double *OPENSWMM_RESTRICT b, std::size_t n) noexcept
 Dot product of two arrays: sum(a[i] * b[i]).
 
constexpr std::size_t openswmm::simd::lane_width () noexcept
 Returns the SIMD lane width (doubles per register on this platform).
 

Detailed Description

SIMD abstraction layer for vectorized arithmetic operations.

Provides a platform-neutral interface over:

  • x86_64: AVX2 (256-bit, 4 doubles per register)
  • arm64: NEON (128-bit, 2 doubles per register)
  • Other: Scalar fallback (auto-vectorized by the compiler)

The goal is to express numerical loops in a way that the compiler can vectorize, with explicit SIMD intrinsics as hints for the most performance-critical paths.

Usage philosophy

Do NOT pepper the solver code with raw intrinsics. Instead:

  1. Write the algorithm using the SIMD helpers in this file.
  2. Let the compiler auto-vectorize as much as possible.
  3. Profile; only use explicit intrinsics for hot loops that the compiler fails to vectorize.
Note
The scalar fallback is always available. If SIMD intrinsics are unavailable at compile time, the scalar path is used automatically.
See also
tests/unit/test_simd_math.cpp
tests/benchmarks/bench_hydraulics.cpp
Author
Caleb Buahin caleb.buahin@gmail.com
License
MIT License

Macro Definition Documentation

◆ OPENSWMM_RESTRICT

#define OPENSWMM_RESTRICT   __restrict__

◆ OPENSWMM_SIMD_SCALAR

#define OPENSWMM_SIMD_SCALAR   1

◆ OPENSWMM_SIMD_WIDTH

#define OPENSWMM_SIMD_WIDTH   1

Scalar fallback.