NNUE is designed to compute forward propagation through a neural network efficiently. The concept is based on the fact that a single move involves at most three piece differences on the board. This allows the network's first-layer state to be updated using only those piece differences, rather than performing a full computation from the input vector.
The "accumulator" is a positional state that stores transformed features. It "accumulates" all the differentials from the initial state (empty board).
Each subnetwork consists of one feature transformer and three dense layers, two of which are followed by clipped ReLU activations. A total of 8 subnetworks make up the entire network (either Big or Small), with each subnetwork (bucket) selected based on the number of pieces on the board. Refer to the sections below for details on each layer:
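As a sketch of the bucket selection, assuming the 8-bucket scheme above and at most 32 pieces on the board (function name illustrative):

```cpp
// Map a piece count of 1..32 onto bucket indices 0..7:
// each bucket covers a range of four piece counts.
int select_bucket(int piece_count) {
    return (piece_count - 1) / 4;
}
```

With this mapping, a full starting position (32 pieces) selects bucket 7, while a sparse endgame falls into bucket 0.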
SmallNet is a replacement for the former HCE (handcrafted evaluation), often referred to as "classical evaluation". It has the same architecture as the Big network, except that L1 is smaller. When one side is winning or losing by a significant material margin, a less precise evaluation is assumed to be sufficient for such positions.
Stockfish first evaluates a position using pure material scores alone to determine which network type to use in NNUE.
In evaluate.cpp:
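The snippet itself is not reproduced here; the sketch below follows the shape of the material-only selection, with SmallNetThreshold standing in for the tuned constant:

```cpp
// Pure material score from c's perspective: pawn-count difference
// plus non-pawn-material difference.
Value simple_eval(const Position& pos, Color c) {
    return PawnValue * (pos.count<PAWN>(c) - pos.count<PAWN>(~c))
         + (pos.non_pawn_material(c) - pos.non_pawn_material(~c));
}

// Positions with a large material imbalance go to the small network.
// SmallNetThreshold is a placeholder for the tuned constant.
bool use_smallnet(const Position& pos) {
    return std::abs(simple_eval(pos, pos.side_to_move())) > SmallNetThreshold;
}
```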
However, when a position has been evaluated with SmallNet, BigNet is used subsequently if the evaluation and the PSQT term have opposing signs or if the score itself is small.
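Schematically, the fallback amounts to something like the following sketch; the exact condition and constants in evaluate.cpp change between releases, and ReEvalThreshold is a placeholder:

```cpp
auto [psqt, positional] = networks.small.evaluate(pos, &caches.small);
Value nnue = psqt + positional;

// Distrust the cheap network when its two output terms disagree in sign
// or the score is close to zero; re-evaluate with the big network.
if (smallNet && (psqt * positional < 0 || std::abs(nnue) < ReEvalThreshold)) {
    std::tie(psqt, positional) = networks.big.evaluate(pos, &caches.big);
    nnue     = psqt + positional;
    smallNet = false;
}
```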
When performing an incremental update, Stockfish updates one or two accumulators: if the previous state has already been computed, only the current position needs to be updated; otherwise, the accumulators of both the current position and the position following the last computed one are updated.
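A sketch of that logic, reusing the Accumulator patching from the earlier sketch; StateInfo and the stored per-move deltas are simplified stand-ins for the engine's search stack:

```cpp
struct StateInfo {
    StateInfo*       previous;        // parent position on the search stack
    Accumulator      accumulator;
    bool             computed = false;
    std::vector<int> added, removed;  // feature deltas of the move into this state
};

// Make st->accumulator valid by patching forward from the nearest computed
// ancestor. In the scenario described above, at most two positions (the
// current one and its parent) need updating.
void update_accumulator(StateInfo* st) {
    if (st->computed)
        return;
    if (!st->previous->computed)
        update_accumulator(st->previous);          // compute the parent first
    st->accumulator = st->previous->accumulator;   // start from the parent
    st->accumulator.update(st->added, st->removed);
    st->computed = true;
}
```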
When the update cost is greater than the refresh cost, or when the king has moved, the accumulator is instead rebuilt from the corresponding cache entry.
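A refresh might look like the following sketch; the CacheEntry layout, Board type, and diff_pieces() helper are hypothetical stand-ins (the real cache is indexed per king square and perspective):

```cpp
struct CacheEntry {
    Accumulator accumulator;  // accumulator for the last cached placement
    Board       board;        // the piece placement it corresponds to
};

// "Refresh": start from the cached accumulator and patch in only the
// differences between the cached placement and the current one; this is
// cheaper than rebuilding the accumulator from an empty board.
void refresh_from_cache(Accumulator& acc, CacheEntry& entry, const Board& now) {
    auto [added, removed] = diff_pieces(entry.board, now);  // hypothetical helper
    entry.accumulator.update(added, removed);
    entry.board = now;
    acc = entry.accumulator;
}
```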
If some child nodes are expected to be evaluated later, Stockfish preemptively computes the accumulator of the parent position, making subsequent incremental updates cheaper. This is done in move-excluded searches (singular extensions), at TT-hit PV nodes, and after a failed ProbCut search.
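At the time of writing, search triggers this through a hint call into the NNUE code; the surrounding condition below is illustrative:

```cpp
// Before a sub-search whose children are likely to be evaluated
// (singular-extension search, TT-hit PV node, failed ProbCut),
// precompute this node's accumulator once so that each child only
// needs a cheap incremental update.
if (childrenLikelyEvaluated)  // illustrative condition
    Eval::NNUE::hint_common_parent_position(pos, networks, refreshTable);
```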
WIP
WIP
Ordered from latest to oldest.
WIP
Refer to the architecture diagrams to view the visualized structure of each architecture.
SFNNv9 (current): Increased L1 size to 3072.
SFNNv8: Increased L1 size to 2560.
SFNNv7: Increased L1 size to 2048.
SFNNv6: Increased L1 size to 1536.
SFNNv5: Introduced squared clipped ReLU layer, output mixed within L1.
SFNNv4: Reduced the input size by half and doubled the output size of L1. Split the L1 activation output.
SFNNv3 (tentative): Changed feature set from HalfKAv2 to HalfKAv2_hm.
SFNNv2 (tentative): Changed feature set from HalfKP to HalfKAv2. Increased L1 size to 512x2.
Introduction of NNUE:
A position is sometimes not evaluated with NNUE when a matching TT (transposition table) entry is found. When an evaluation is needed, Stockfish searches for the last computed accumulator and determines which is faster: updating incrementally from it, or constructing the accumulator from the cached entry. The decision is based on the update cost and the refresh cost, which are feature-set-specific values. See the section on accumulator updates for more information.
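A sketch of that decision; the cost constants and the delta count are hypothetical stand-ins for the feature-set-specific values:

```cpp
// Choose between patching forward from the last computed accumulator and
// rebuilding from the cache entry. A king move changes every king-relative
// feature for that perspective, so it always forces a refresh.
bool prefer_incremental_update(int num_deltas, bool king_moved) {
    if (king_moved)
        return false;
    return num_deltas * kUpdateCostPerDelta < kRefreshCost;  // hypothetical constants
}
```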