We don't know.
It's not as simple as resolution (you can
see the individual dots peripherally if there's no masking grid), or
adaptation (which is never as fast as 'instantaneous'). It's more likely
related to some kind of competitive pattern-completion process that doesn't match the peripheral resolution, i.e. crowding. But that said, we just don't know the answer.
Possible contributors to the mechanism of Hermann grid-type illusions like this one (some suggested in replies below):
1) powerful lateral inhibition (but what about White's illusion? also, what kind of lateral inhibition exactly, and where in the brain?)
2) feature mis-integration (but how, neurally? why are low-contrast lines integrated at the cost of high-contrast spots?)
3) adaptation (but how so fast? if it is adaptation, why is there no oscillation or timescale like in motion-induced blindness or binocular rivalry?)
4) filling-in (but how and what's so special about this type of display? how does pattern filling-in work anyways?)
5) crowding/inappropriate integration (but crowding doesn't usually cause blindness to features)