Senthilkumar Gopal

Musings of a machine learning researcher, engineer and leader

Neuron - Handling NaNs

LLMs work with floating-point numbers and are susceptible to saturation and loss-of-precision issues. These typically surface as NaN (Not a Number) errors.

The Neuron compiler offers the following options to overcome the NaN issue.

Compiler flag: --enable-saturate-infinity

Ref: A computation that can generate +/- infinity is at a high risk of generating Not-a-Number (NaN) values when the infinity value is used in subsequent computations. This option helps avoid this by converting +Inf/-Inf values to MAX/MIN_FLOAT before operations that could produce NaN values for +Inf/-Inf inputs on the target architecture. While this option helps to avoid NaN values, there is a potential performance degradation that occurs during model execution when this conversion is enabled.
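To make the failure mode concrete, here is a minimal plain-Python sketch of why an infinity turns into NaN downstream, and how clamping it to the largest finite value avoids that. It only illustrates the idea; it is not the compiler's implementation. The `saturate` helper is a hypothetical name, and `sys.float_info.max` stands in for the target's MAX_FLOAT.

```python
import math
import sys

# Stand-in for the target architecture's MAX_FLOAT (assumption: Python's float64 max).
FLOAT_MAX = sys.float_info.max

def saturate(x):
    """Clamp +/-inf to the largest finite magnitude; leave other values alone."""
    if math.isinf(x):
        return math.copysign(FLOAT_MAX, x)
    return x

pos_inf = float("inf")

# Infinity used in later arithmetic produces NaN.
inf_minus_inf = pos_inf - pos_inf
zero_times_inf = 0.0 * pos_inf

# After saturation the same expressions stay finite.
sat_minus_sat = saturate(pos_inf) - saturate(pos_inf)
zero_times_sat = 0.0 * saturate(pos_inf)

print(inf_minus_inf, zero_times_inf)   # nan nan
print(sat_minus_sat, zero_times_sat)   # 0.0 0.0
```

The clamped result is finite but still not numerically meaningful, so saturation is a way to keep a run alive while triaging rather than a fix for the underlying overflow.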

Compiler flag: --enable-mixed-precision-accumulation

Ref: Performs intermediate calculations of reduction operators (such as the dot or reduce operators) in FP32 regardless of the operation’s defined datatype.
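The effect of the accumulator's precision can be sketched in plain Python by emulating BF16 rounding in software. This is an illustration under stated assumptions, not how Neuron implements the flag: `to_bf16`, `to_f32`, and `accumulate` are hypothetical helpers, and `to_bf16` handles finite inputs only.

```python
import struct

def to_f32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack('<f', struct.pack('<f', x))[0]

def to_bf16(x):
    """Round to the nearest BF16 value (ties to even). Simplified: finite inputs only."""
    bits = struct.unpack('<I', struct.pack('<f', x))[0]
    bias = 0x7FFF + ((bits >> 16) & 1)      # round-to-nearest-even bias
    bits = (bits + bias) & 0xFFFF0000       # keep only the upper 16 bits
    return struct.unpack('<f', struct.pack('<I', bits))[0]

def accumulate(values, start, round_fn):
    """Sum values, rounding the accumulator with round_fn after every add."""
    acc = round_fn(start)
    for v in values:
        acc = round_fn(acc + v)
    return acc

values = [to_bf16(0.001)] * 1000             # BF16 inputs, each about 0.001

bf16_sum = accumulate(values, 1.0, to_bf16)  # accumulator kept in BF16
fp32_sum = accumulate(values, 1.0, to_f32)   # accumulator kept in FP32

print(bf16_sum)  # 1.0: each addend is below half a BF16 ulp at 1.0 and is rounded away
print(fp32_sum)  # close to 2.0
```

With the BF16 accumulator the sum never moves off 1.0, while the FP32 accumulator keeps every contribution, which is the kind of precision loss that FP32 accumulation is meant to avoid.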

NaNs due to saturation

To triage an intermediate BF16 tensor, np.isnan does not work directly, because bfloat16 is not a standard NumPy dtype (NumPy's built-in floating-point types include float16, float32, and float64). The raw data typically looks like a list of 16-bit words, for example bf16_hex = [0x3f80, 0xbf80, 0x4000, 0x7f80, 0xff80, 0x7fc0]. We can either use torch.isnan on a BF16 tensor or convert the values to fp32 as below.
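The raw words can also be classified without any conversion. BF16 has 1 sign bit, 8 exponent bits, and 7 mantissa bits; a value is Inf when the exponent bits are all ones and the mantissa is zero, and NaN when the exponent bits are all ones and the mantissa is non-zero. A minimal sketch, with `is_bf16_nan` and `is_bf16_inf` as hypothetical helper names:

```python
# Sample BF16 words from the text: 1.0, -1.0, 2.0, +inf, -inf, NaN
bf16_hex = [0x3f80, 0xbf80, 0x4000, 0x7f80, 0xff80, 0x7fc0]

EXP_MASK = 0x7F80   # the 8 exponent bits
MAN_MASK = 0x007F   # the 7 mantissa bits

def is_bf16_nan(bits):
    """Exponent all ones and a non-zero mantissa."""
    return (bits & EXP_MASK) == EXP_MASK and (bits & MAN_MASK) != 0

def is_bf16_inf(bits):
    """Exponent all ones and a zero mantissa (either sign)."""
    return (bits & EXP_MASK) == EXP_MASK and (bits & MAN_MASK) == 0

nan_flags = [is_bf16_nan(b) for b in bf16_hex]
inf_flags = [is_bf16_inf(b) for b in bf16_hex]

print(nan_flags)  # [False, False, False, False, False, True]
print(inf_flags)  # [False, False, False, True, True, False]
```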

Convert to fp32 for printing/checking

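import struct

# Sample BF16 bit patterns: 1.0, -1.0, 2.0, +inf, -inf, NaN
bf16_hex = [0x3f80, 0xbf80, 0x4000, 0x7f80, 0xff80, 0x7fc0]

def bf16_to_float32(bits):
    # BF16 is the upper 16 bits of a float32, so pad with 16 zero bits
    f32_bits = (bits & 0xFFFF) << 16
    # Explicit little-endian 4-byte formats keep the round trip platform independent
    return struct.unpack('<f', struct.pack('<I', f32_bits))[0]

converted = [bf16_to_float32(b) for b in bf16_hex]
print(converted)  # [1.0, -1.0, 2.0, inf, -inf, nan]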
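Once the values are ordinary floats, the standard `math` module is enough to locate the offending elements. The sketch below is self-contained and repeats the conversion in compact form; `find_non_finite` is a hypothetical helper, not part of any Neuron tooling.

```python
import math
import struct

def bf16_to_float32(bits):
    """Interpret a 16-bit BF16 word as a float by padding it to 32 bits."""
    return struct.unpack('<f', struct.pack('<I', (bits & 0xFFFF) << 16))[0]

def find_non_finite(bf16_words):
    """Return (index, value) pairs for every Inf or NaN element."""
    values = (bf16_to_float32(w) for w in bf16_words)
    return [(i, v) for i, v in enumerate(values) if not math.isfinite(v)]

bf16_hex = [0x3f80, 0xbf80, 0x4000, 0x7f80, 0xff80, 0x7fc0]

bad = find_non_finite(bf16_hex)
for index, value in bad:
    kind = "NaN" if math.isnan(value) else "Inf"
    print(f"element {index}: {kind} ({value})")
```

Reporting indices rather than a single boolean makes it easier to see whether Inf values appear before the first NaN, which hints at saturation as the cause.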

TODO: add the conversion method available in Neuron.


If you found this useful, please cite this post using

Senthilkumar Gopal. (Jan 2024). Neuron - Handling NaNs. sengopal.me. https://sengopal.me/posts/neuron-handling-nans

or

@article{gopal2024neuronhandlingnans,
  title   = {Neuron - Handling NaNs},
  author  = {Senthilkumar Gopal},
  journal = {sengopal.me},
  year    = {2024},
  month   = {Jan},
  url     = {https://sengopal.me/posts/neuron-handling-nans}
}