Efficiently Updatable Neural Network
NNUE is designed to compute forward propagation in neural networks efficiently. The concept is based on the fact that a single move changes at most a few pieces on the board. This allows the network state to be updated incrementally from those piece differences, rather than recomputed in full from the input vector.
The "accumulator" is a piece of positional state that stores the transformed features. It "accumulates" all of the differences applied since the initial state (an empty board).
Network Architecture
Each subnetwork consists of one feature transformer and three dense layers, two of which are followed by clipped ReLU activations. A total of 8 subnetworks make up the entire network (either Big or Small), with each subnetwork (bucket) selected based on the number of pieces on the board, as sketched below. Refer to the sections below for details on each layer.
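As an illustrative sketch (the exact expression here is an assumption, not a quotation from the source): with at most 32 pieces on the board and 8 buckets, each bucket covers a range of 4 piece counts:

// Illustrative bucket selection: piece counts 1..32 map onto buckets 0..7.
// The count includes both kings, so it is always at least 2.
int select_bucket(int pieceCount) {
    return (pieceCount - 1) / 4;  // 1-4 -> 0, 5-8 -> 1, ..., 29-32 -> 7
}

For example, the starting position (32 pieces) falls into bucket 7, while late endgames use the low-numbered buckets.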
SmallNet
SmallNet is a replacement for the former HCE (handcrafted evaluation), often referred to as "classical evaluation". It has the same network architecture as the Big network, except that its L1 size is smaller. When one side is winning or losing by a significant material margin, a less precise evaluation is assumed to be sufficient for such positions.
Determining Network Types
Stockfish first evaluates a position with a pure material score to determine which network type to use in NNUE.
In evaluate.cpp:
int Eval::simple_eval(const Position& pos, Color c) {
    return PawnValue * (pos.count<PAWN>(c) - pos.count<PAWN>(~c))
         + (pos.non_pawn_material(c) - pos.non_pawn_material(~c));
}

bool Eval::use_smallnet(const Position& pos) {
    int simpleEval = simple_eval(pos, pos.side_to_move());
    return std::abs(simpleEval) > 962;
}
Even when a position has been evaluated with SmallNet, it is subsequently re-evaluated with BigNet if the evaluation and its PSQT term have opposing signs, or if the score itself is small.
// Re-evaluate the position when higher eval accuracy is worth the time spent
if (smallNet && (nnue * psqt < 0 || std::abs(nnue) < 227))
{
    std::tie(psqt, positional) = networks.big.evaluate(pos, &caches.big);
    nnue     = (125 * psqt + 131 * positional) / 128;
    smallNet = false;
}
Version Update History
Refer to the nnue-pytorch NNUE documentation for visualizations of each architecture.
SFNNv9 (current): Increased L1 size to 3072. commit
SFNNv8: Increased L1 size to 2560. commit
SFNNv7: Increased L1 size to 2048. commit
SFNNv6: Increased L1 size to 1536. commit
SFNNv5: Introduced squared clipped ReLU layer, output mixed within L1. commit
SFNNv4: commit
Reduced the input size by half and doubled the output size of L1.
Split the L1 activation output.
SFNNv3 (tentative): Changed feature set from HalfKAv2 to HalfKAv2_hm. commit
SFNNv2 (tentative): commit
Changed feature set from HalfKP to HalfKAv2.
Increased L1 size to 512x2.
Introduction of NNUE: commit
Feature Transformer
Accumulator Update Strategy
A position is sometimes not evaluated with NNUE when a matching TT (transposition table) entry is found. Stockfish therefore searches for the last computed accumulator and determines which is faster: updating from it, or constructing the accumulator from the cached entry. The decision is based on the update cost and the refresh cost, which are feature-set-specific values. See the Feature Sets section for more information.
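In outline, the decision might look like the following sketch; all names here (State, updateCost, prefer_incremental) are hypothetical stand-ins rather than Stockfish's actual identifiers:

// Each state node carries its accumulator status and the feature-specific
// cost of updating across the move that led to it.
struct State {
    bool   computed;    // accumulator already filled in?
    int    updateCost;  // cost of updating across this move
    State* previous;
};

// Walk back to the last computed accumulator, summing update costs;
// fall back to a refresh from the cached entry if that would be cheaper.
bool prefer_incremental(State* st, int refreshCost) {
    int cost = 0;
    for (; st && !st->computed; st = st->previous) {
        cost += st->updateCost;
        if (cost > refreshCost)
            return false;  // refreshing from the cache is cheaper
    }
    return st != nullptr;  // found a computed ancestor within budget
}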
Normal Position Evaluation
When performing an incremental update, Stockfish updates one or two accumulators: if the previous position's accumulator has already been computed, only the current position needs an update; otherwise, the accumulators of both the current position and the position following the last computed one are updated (see the sketch below).
When the update cost is greater than the refresh cost, or when the king has moved, the accumulator is instead rebuilt from the corresponding cache entry.
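A minimal sketch of this one-or-two-update branch, again with hypothetical names (apply_update stands in for the routine that adds and subtracts the changed feature columns, as shown earlier):

struct State {
    bool   computed;  // accumulator already filled in?
    State* previous;
};

void apply_update(State* from, State* to);  // declaration only in this sketch

void update_accumulators(State* current) {
    State* parent = current->previous;
    if (parent->computed)
        apply_update(parent, current);           // one update suffices
    else {
        // bring the parent up to date first, then the current position:
        apply_update(parent->previous, parent);  // assumes this one is computed
        apply_update(parent, current);           // two updates in total
    }
}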
Common Parent Position Hint
If it is expected that some child nodes will be evaluated later, Stockfish preemptively computes the accumulator of the parent position, so that the subsequent piece-difference updates are cheap. This is done in move-excluded searches (singular extensions), at TT-hit PV nodes, and after a failed ProbCut search.
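As a sketch, with a hypothetical helper name, the hint simply computes the parent's accumulator ahead of time:

struct Position;                                    // opaque in this sketch
void hint_common_parent_position(const Position&);  // hypothetical helper

// Before iterating over child moves that will likely reach evaluation,
// warm up the parent's accumulator so each child needs only a cheap
// incremental update on top of it.
void before_child_search(const Position& pos, bool childEvalsExpected) {
    if (childEvalsExpected)
        hint_common_parent_position(pos);
}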
Incremental Update
WIP
Refresh Cache and Update
WIP
Feature Sets
Ordered from latest to oldest.
HalfKAv2_hm
WIP
Layers
Affine Layer
Sparse Input Optimization
Clipped ReLU Layer
Square Clipped ReLU Layer