Neuron - Handling NaNs
LLMs work with floating point numbers and are susceptible to saturation or loss-of-precision issues. These typically surface as NaN (Not a Number) errors.
The Neuron compiler offers the following options to overcome NaN issues.
Compiler flag:
--enable-saturate-infinity
Ref: A computation that can generate +/- infinity is at a high risk of generating Not-a-Number (NaN) values when the infinity value is used in subsequent computations. This option helps avoid this by converting +Inf/-Inf values to MAX/MIN_FLOAT before operations that could produce NaN values for +Inf/-Inf inputs on the target architecture. While this option helps to avoid NaN values, there is a potential performance degradation that occurs during model execution when this conversion is enabled.
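To make the idea concrete, here is a minimal pure-Python sketch of what "saturating" infinities means. It is only an illustration of the concept, not how the compiler implements the flag: the `saturate` helper and the use of `sys.float_info.max` as a stand-in for MAX_FLOAT are assumptions made for this example.

```python
import math
import sys

# Largest finite Python float, used here as a stand-in for MAX_FLOAT (assumption).
FLOAT_MAX = sys.float_info.max

def saturate(x):
    """Hypothetical helper: map +/-inf to +/-FLOAT_MAX, leave other values unchanged."""
    if math.isinf(x):
        return math.copysign(FLOAT_MAX, x)
    return x

pos_inf = float("inf")

# Without saturation: inf - inf and 0 * inf are undefined, so the result is NaN.
raw_difference = pos_inf - pos_inf
raw_product = 0.0 * pos_inf

# With saturation applied to the inputs first, the same operations stay finite.
saturated_difference = saturate(pos_inf) - saturate(pos_inf)
saturated_product = 0.0 * saturate(pos_inf)

print(math.isnan(raw_difference), math.isnan(raw_product))
print(saturated_difference, saturated_product)
```

The saturated results are finite but not necessarily meaningful, so this is best seen as a way to keep a NaN from propagating rather than a fix for the overflow that produced the infinity.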
Compiler flag:
--enable-mixed-precision-accumulation
Ref: This option performs intermediate calculations of reduction operators (such as the dot or reduce operators) in FP32 regardless of the operation’s defined datatype.
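The effect of the accumulator's precision can be sketched without any hardware. The example below simulates BF16 rounding in pure Python and sums the same inputs twice: once rounding the running sum to BF16 after every addition, and once accumulating in higher precision and rounding only at the end. The `round_to_bf16` helper is hypothetical, and Python's 64-bit float stands in for the FP32 accumulator, so treat this as an illustration of the idea rather than the compiler's behavior.

```python
import struct

def round_to_bf16(x):
    """Round a finite float to the nearest bfloat16 value (round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # rounding bias on the 16 dropped bits
    bits &= 0xFFFF0000                   # keep sign, exponent, 7 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

values = [1.0] * 1000  # every element is exactly representable in BF16

# Accumulate in BF16: the running sum is rounded after every addition.
bf16_sum = 0.0
for v in values:
    bf16_sum = round_to_bf16(bf16_sum + v)

# Accumulate in higher precision, then round to BF16 once at the end.
wide_sum = 0.0
for v in values:
    wide_sum += v
mixed_sum = round_to_bf16(wide_sum)

print(bf16_sum, mixed_sum)
```

The BF16 accumulator stalls at 256.0, because 257 is not representable with 8 bits of significand and rounds back to 256, while the wider accumulator reaches 1000.0.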
NaNs due to saturation
To triage an intermediate BF16 tensor, np.isnan does not work directly: NumPy natively supports only float16, float32, and float64, and bfloat16 is not a standard NumPy dtype. The data typically looks like raw 16-bit patterns, which here decode to 1.0, -1.0, 2.0, +Inf, -Inf, and NaN:
bf16_hex = [0x3f80, 0xbf80, 0x4000, 0x7f80, 0xff80, 0x7fc0]
We can either try to use torch.isnan or convert to fp32 as below.
Convert to fp32 for printing/checking
import struct

def bf16_to_float32(bits):
    # Pad with 16 zero bits to match the float32 layout
    f32_bits = bits << 16
    return struct.unpack('f', struct.pack('I', f32_bits))[0]

converted = [bf16_to_float32(b) for b in bf16_hex]
print(converted)
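As an alternative to converting first, the BF16 bit pattern can be inspected directly: a value is Inf or NaN exactly when its 8 exponent bits are all ones, and it is NaN when the 7 mantissa bits are also nonzero. The sketch below applies that rule to the same sample data; `classify_bf16` is a hypothetical helper written for this post, not part of any Neuron or NumPy API.

```python
bf16_hex = [0x3f80, 0xbf80, 0x4000, 0x7f80, 0xff80, 0x7fc0]

def classify_bf16(bits):
    """Classify a raw 16-bit bfloat16 pattern as finite, +inf, -inf, or nan."""
    sign = bits >> 15                # 1 sign bit
    exponent = (bits >> 7) & 0xFF    # 8 exponent bits
    mantissa = bits & 0x7F           # 7 mantissa bits
    if exponent != 0xFF:
        return "finite"
    if mantissa != 0:
        return "nan"
    return "-inf" if sign else "+inf"

labels = [classify_bf16(b) for b in bf16_hex]
nan_indices = [i for i, label in enumerate(labels) if label == "nan"]

print(labels)
print(nan_indices)
```

Reporting the indices of Inf entries as well as NaN entries can help when triaging, since a saturated value often appears a few operations before the first NaN.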
TODO: add the conversion method available in Neuron.
If you found this useful, please cite this post using
Senthilkumar Gopal. (Jan 2024). Neuron - Handling NaNs. sengopal.me. https://sengopal.me/posts/neuron-handling-nans
or
@article{gopal2024neuronhandlingnans,
  title = {Neuron - Handling NaNs},
  author = {Senthilkumar Gopal},
  journal = {sengopal.me},
  year = {2024},
  month = {Jan},
  url = {https://sengopal.me/posts/neuron-handling-nans}
}