Documentation


Efficiently Updatable Neural Network

NNUE is designed to compute the forward propagation of a neural network efficiently. The concept is based on the fact that a single move involves at most three piece differences on the board. This allows the network's state to be updated using only those differences, rather than performing a full computation from the input vector.

The "accumulator" is a positional state that stores the transformed features. It "accumulates" all the differences applied since the initial state (an empty board).

Network Architecture

Each subnetwork consists of one feature transformer and three dense layers, two of which are followed by clipped ReLU activations. A total of 8 subnetworks make up the entire network (either Big or Small), with each subnetwork (bucket) selected based on the number of pieces on the board. Refer to the sections below for details on each layer: Feature Transformer, Affine Layer, Clipped ReLU Layer, and Squared Clipped ReLU Layer. Refer to the nnue-pytorch NNUE documentation to view visualized structures of each architecture.
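
As an illustration of the bucket choice, here is a minimal sketch; the mapping below (piece counts 1-32 onto buckets 0-7) reflects how recent Stockfish versions derive the bucket index, but treat the exact formula as an assumption:

// Map the total piece count (1..32, kings included) onto buckets 0..7:
// 1-4 pieces -> bucket 0, 5-8 -> bucket 1, ..., 29-32 -> bucket 7.
int select_bucket(int pieceCount) {
    return (pieceCount - 1) / 4;
}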

SmallNet

SmallNet is a replacement for the former HCE (handcrafted evaluation), often referred to as "classical evaluation". It has the same architecture as the Big network, except that L1 is smaller. When one side is winning or losing by a significant material margin, a less precise evaluation is assumed to be sufficient for such positions.

Determining Network Types

Stockfish first evaluates a position with a pure material score to determine which network type to use in NNUE.

In evaluate.cpp:

// Returns a static, purely materialistic evaluation of the position
// from the point of view of the given color.
int Eval::simple_eval(const Position& pos, Color c) {
    return PawnValue * (pos.count<PAWN>(c) - pos.count<PAWN>(~c))
         + (pos.non_pawn_material(c) - pos.non_pawn_material(~c));
}

// Use SmallNet only when the material imbalance is large.
bool Eval::use_smallnet(const Position& pos) {
    int simpleEval = simple_eval(pos, pos.side_to_move());
    return std::abs(simpleEval) > 962;
}
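
For example, simple_eval returns 0 for the starting position, so BigNet is used there; only when one side is ahead by roughly a rook's worth of material or more (beyond 962 in Stockfish's internal units) does SmallNet take over.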

When a position has been evaluated with SmallNet, however, if the evaluation and its PSQT component have opposite signs, or if the score itself is small, the position is re-evaluated with BigNet.

    // Re-evaluate the position when higher eval accuracy is worth the time spent
    if (smallNet && (nnue * psqt < 0 || std::abs(nnue) < 227))
    {
        std::tie(psqt, positional) = networks.big.evaluate(pos, &caches.big);
        nnue                       = (125 * psqt + 131 * positional) / 128;
        smallNet                   = false;
    }
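
In this snippet, nnue * psqt < 0 detects a disagreement in sign between the overall evaluation and its PSQT component, while std::abs(nnue) < 227 catches near-balanced scores; in both cases the extra accuracy of BigNet is considered worth the time spent.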

Version Update History

  • SFNNv9 (current): Increased L1 size to 3072.
  • SFNNv8: Increased L1 size to 2560.
  • SFNNv7: Increased L1 size to 2048.
  • SFNNv6: Increased L1 size to 1536.
  • SFNNv5: Introduced squared clipped ReLU layer, output mixed within L1.
  • SFNNv4:
    • Reduced the input size by half and doubled the output size of L1.
    • Split the L1 activation output.
  • SFNNv3 (tentative): Changed feature set from HalfKAv2 to HalfKAv2_hm.
  • SFNNv2 (tentative):
    • Changed feature set from HalfKP to HalfKAv2.
    • Increased L1 size to 512x2.
  • Introduction of NNUE.
Feature Transformer

Accumulator Update Strategy

A position is sometimes not evaluated in NNUE when a matching TT (transposition table) entry is found. Stockfish searches for the last computed accumulator and determines which is faster: updating from it or constructing the accumulator from the cached entry. The decision is based on the update cost and the refresh cost, which are feature-specific values. Check the Feature Sets section for more information.

Normal Position Evaluation

When doing an incremental update, Stockfish updates one or two accumulators: if the previous state has already been computed, only the current position needs to be updated. Otherwise, the accumulators of both the current position and the position following the last computed one are updated.

Figure 1. Update accumulators

When the update cost is greater than the refresh cost, or when the king has moved, the accumulator is instead refreshed from the corresponding cache entry.
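
A hypothetical sketch of that decision (simplified stand-in types, not the actual Stockfish code): walk back through the state stack accumulating per-move update costs, and give up in favor of a cache refresh once they exceed the refresh cost:

struct StateInfo {
    StateInfo* previous;    // state before the last move
    bool       computed;    // accumulator already computed for this state?
    int        updateCost;  // cost of updating across the move into this state
};

// Returns the closest ancestor with a computed accumulator, or nullptr
// when refreshing from the cache entry is the cheaper option.
StateInfo* find_update_source(StateInfo* st, int refreshCost) {
    int cost = 0;
    while (st && !st->computed) {
        cost += st->updateCost;
        if (cost > refreshCost)
            return nullptr;  // refresh from cache instead
        st = st->previous;
    }
    return st;  // may be nullptr if no computed ancestor exists
}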

Common Parent Position Hint

If it is expected that some child nodes will be evaluated later, Stockfish preemptively computes the accumulator of the parent position, optimizing the subsequent difference computations for its children. This is done in move-excluded searches (singular extensions), at TT-hit PV nodes, and after failed ProbCut searches.

Figure 2. Hint common parent position

Incremental Update

WIP

Refresh Cache and Update

WIP

Feature Sets

Ordered from latest to oldest.

HalfKAv2_hm

WIP

Layers

Affine Layer

Sparse Input Optimization

Clipped ReLU Layer

Squared Clipped ReLU Layer

