odin-lang/Odin Issue #4480: #simd[1]T breaks when crossing a proc boundary on linux_amd64
2024-11-12 19:39:01 Barinzaya
Context
I am building for, and running on, `linux_amd64` with the default microarch `x86-64-v2` and no additional target features.
- Operating System & Odin Version:
  - Odin: dev-2024-11-nightly
  - OS: EndeavourOS, Linux 6.11.6-arch1-1
  - CPU: AMD Ryzen 9 9950X 16-Core Processor
  - RAM: 61888 MiB
  - Backend: LLVM 18.1.6
Expected Behavior
Using a `#simd` array of size 1, though suboptimal compared to passing a scalar value, should allow values to be passed into procs as arguments and out of procs as return values without altering their values.
Current Behavior
Attempting to operate on a `#simd[1]T` that has been passed into a proc as an argument, or returned from a proc as a return value, seems to treat it as if it has an "undefined" value. This seems to be 0 for integer types, and NaN for floating-point values.
In my testing, this seems to only happen if 2 other conditions are met:
- The element type of the vector is 4 bytes or smaller: `i8`/`i16`/`i32`/`f32` all seem to exhibit this behavior; `i64` and `f64` do not.
- Optimizations are set to higher than `minimal`. `-o:size`, `-o:speed`, and `-o:aggressive` all seem to exhibit this behavior, whereas `-o:none` and `-o:minimal` do not.
Failure Information (for bugs)
Steps to Reproduce
Run the following code with `-o:size`, `-o:speed`, or `-o:aggressive`:
package mre

import "core:fmt"

T :: i32

simd1arg :: proc (v: #simd[1]T) {
	fmt.println(v * auto_cast 2)
}

simd1ret :: proc () -> #simd[1]T {
	return { 3 }
}

simd1thru :: proc (v: #simd[1]T) -> #simd[1]T {
	return v * auto_cast 2
}

main :: proc () {
	local := #simd[1]T { 1 }
	fmt.println(local * auto_cast 2)
	simd1arg({ 2 })
	fmt.println(simd1ret() * auto_cast 2)
	fmt.println(simd1thru({ 4 }))
}
It should print `<2> <4> <6> <8>`. With `-o:none` or `-o:minimal`, or if `T` is changed to `f64`/`i64`, it does.
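To check the pass-through case without reading printed output, the reproduction can be turned into a self-checking program. This is only a sketch and was not part of the original test. The package name `mre_check` and the non-zero exit code are illustrative choices. It assumes that pulling lane 0 out with `simd.extract` from `core:simd` does not itself hide the miscompile, which I have not verified.

```odin
package mre_check

import "core:fmt"
import "core:os"
import "core:simd"

T :: i32

// Same pass-through proc as in the reproduction above.
simd1thru :: proc (v: #simd[1]T) -> #simd[1]T {
	return v * auto_cast 2
}

main :: proc () {
	expected: T = 8

	// Take the single lane out as a scalar so it can be compared directly.
	got := simd.extract(simd1thru({ 4 }), 0)

	if got != expected {
		fmt.eprintln("mismatch: expected", expected, "got", got)
		os.exit(1)
	}
	fmt.println("ok")
}
```

Building this once per optimization level (`-o:none`, `-o:minimal`, `-o:size`, `-o:speed`, `-o:aggressive`) and checking the exit code would show which levels are affected.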
Failure Logs
With `T :: i32`:
<2> <0> <0> <0>
With `T :: f32`:
<2>
Comments (1)
2025-06-02 13:08:17 Feoramund
I cannot replicate this as of the latest commit. I get the following for both `i32` and `f32`, with any of the optimization flags:
<2> <4> <6> <8>