Summary: Recently I encountered a weird issue regarding LTO and -ffast-math where I got inconsistent results for my "pow" (in cmath) calls depending ...
I'm working with statistical functions on a lot of float data. I want it to run faster, but -Ofast disables NaN (fno-finite-math-only flag), which is not ...
Can the code below be modified such that it works correctly even when compiled by GCC with fast-math enabled? Note: I have it in a header file and ...
I've read this article and do-denormal-flags-like-denormals-are-zero-daz-affect-comparisons-for-equality and I understand the usage and difference bet ...
I do not understand whether CUDA supports function overloading. I want to explain my problem using the following two functions, which I want to ...
I'm writing some CUDA code, and I want it to behave differently based on whether or not --use_fast_math was set. And I want to make that deci ...
Intro: Kahan summation / compensated summation is a technique that addresses compilers' inability to respect the associative property of numbers. Truncat ...
I want to write cross-platform C/C++ which has reproducible behaviour across different environments. I understand that gcc's ffast-math enables vario ...
If I have 2 denormal floating point numbers with different bit patterns and compare them for equality, can the result be affected by the Denormals-Are ...
Consider the following program: If I compile with Apple Clang 7.0.2 with and without -ffast-math, I get the expected output 0 0 0 0: However aft ...
This is my naive implementation of dot product: And this is using the C++ library: I ran some benchmark(code is here https://github.com/ijklr/ss ...
Suppose I have and I want to compile one instantiation with -ffast-math (--use-fast-math for nvcc), and the other instantiation without it. This ...
I'm experimenting with writing a couple kernels using GCCs builtin simd support. I've got this code benchmarking an AVX dot product kernel: Strange ...
As I read on Intel's website: Intel compiler uses /fp-model fast=1 as defaults. This optimization favors speed over standards compliance. You may ...
I've noticed an interesting phenomenon: flags passed to the compiler/linker affect the running code in ways I cannot understand. I have a library t ...
This is the critical section of the program that causes the problem, and the program is completely sequential. exist_ is a private bool class member, and dbl_num_ ...
Does anyone know why GCC/Clang will not optimise function test1 in the code sample below to use just the RCPPS instruction when using the fast- ...
I'm trying to benchmark some Rust code, but I can't figure out how to set the "ffast-math" option. % rustc -C opt-level=3 -C llvm-args='-enable-unsaf ...
I'm using the -Ofast gcc option in my program because of latency requirements. I wrote a simple test program: I've tried to run it with default flags and with ...
I would like to know if any code in C or C++ using floating point arithmetic would produce bit exact results in any x86 based architecture, regardless ...