Progressive Encoding for Neural Optimization

^{1} Tel Aviv University, ^{2} ETH Zurich, Switzerland

Multilayer perceptrons (MLPs) are known to struggle with learning high-frequency functions, and in particular functions spanning a wide band of frequencies. We present a spatially adaptive progressive encoding (SAPE) scheme for the input signals of MLP networks, which enables them to better fit a wide range of frequencies without sacrificing training stability or requiring any domain-specific preprocessing. SAPE gradually unmasks signal components of increasing frequency as a function of both time and space. The progressive exposure of frequencies is monitored by a feedback loop throughout the neural optimization process, allowing changes to propagate at different rates among local spatial portions of the signal space. We demonstrate the advantage of SAPE on a variety of domains and applications, including regression of low-dimensional signals and images, representation learning of occupancy networks, and a geometric task of mesh transfer between 3D shapes.
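The core idea of progressive unmasking can be sketched as follows. This is a minimal, hypothetical re-implementation in NumPy: the function name and the linear unmasking schedule are our assumptions, and the per-position spatial feedback of the full method is collapsed to a single global mask for brevity.

```python
import numpy as np

def progressive_encoding(x, t, n_freqs=8):
    """Sketch of progressive positional encoding (our simplification of SAPE).

    x : (N,) input coordinates in [0, 1]
    t : scalar training progress in [0, 1]
    Band k is softly unmasked once t * n_freqs exceeds k; in the full
    method the mask also varies per spatial position via a feedback loop.
    """
    feats, masks = [], []
    for k in range(n_freqs):
        alpha = np.clip(t * n_freqs - k, 0.0, 1.0)  # soft, time-dependent mask
        w = (2.0 ** k) * np.pi                      # frequency of band k
        feats.append(alpha * np.sin(w * x))
        feats.append(alpha * np.cos(w * x))
        masks.append(alpha)
    return np.stack(feats, axis=-1), np.array(masks)
```

At the start of training only the lowest bands contribute, so the network first fits a smooth approximation; later bands fade in gradually rather than switching on abruptly, which keeps the optimization stable.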

With SAPE, multilayer perceptrons can faithfully represent implicit 1D signals of varying frequency.

In the example below, the network attempts to learn a 1D function, shown as the **black** curve.
The training samples are shown in **red**.

SAPE is able to represent a wide range of natural images without tuning the positional encoding frequency scale.

By uniformly sampling 25% of the original pixels in the image as a training set, SAPE is still able to reconstruct small details of the original signal.
Note that SAPE's performance is capped by the sampling rate (e.g., details smaller than the sampling rate are not guaranteed to be captured).
Below we show animations comparing the optimization progress of each algorithm. For SAPE, you may hover over the animation to toggle a heatmap tracking the maximal frequency unmasked per position (**low** to **high**).
Further below we compare the results after convergence.

SAPE is also useful for learning the representation of 3D occupancy implicit functions.

In the examples below, points were sampled uniformly in space and near the shape surface.

Each point is then assigned a binary label indicating whether it falls within the interior of the surface volume.
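A minimal sketch of this sampling scheme, using a toy sphere as the shape; the function names, the jitter magnitude, and the bounding cube are our assumptions, not the authors' code:

```python
import numpy as np

def occupancy_training_set(surface_pts, inside_fn, n_uniform=1024, noise=0.02, seed=0):
    """Build (points, labels) for occupancy training: uniform + near-surface samples."""
    rng = np.random.default_rng(seed)
    uniform = rng.uniform(-1.0, 1.0, size=(n_uniform, 3))           # uniform in the bounding cube
    near = surface_pts + rng.normal(0.0, noise, surface_pts.shape)  # jittered surface samples
    pts = np.concatenate([uniform, near], axis=0)
    labels = inside_fn(pts).astype(np.float32)                      # 1 = interior, 0 = exterior
    return pts, labels

# toy shape: a sphere of radius 0.5 centered at the origin
inside_sphere = lambda p: np.linalg.norm(p, axis=-1) < 0.5
```

The near-surface samples concentrate supervision where the occupancy function changes sign, which is where fine geometric detail lives.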

Note that due to memory constraints, the result presented is a mesh extracted from the neural implicit function using Marching Cubes at a finite resolution.
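Concretely, the implicit function is evaluated on a finite grid before mesh extraction, so detail below one voxel is lost regardless of what the network has learned. A sketch (the resolution and function names are our illustration):

```python
import numpy as np

def evaluate_on_grid(occupancy_fn, res=64):
    """Sample a neural implicit on a res^3 grid; detail below one voxel is lost."""
    lin = np.linspace(-1.0, 1.0, res)
    X, Y, Z = np.meshgrid(lin, lin, lin, indexing="ij")
    pts = np.stack([X, Y, Z], axis=-1).reshape(-1, 3)
    return occupancy_fn(pts).reshape(res, res, res)

# the resulting volume would then be meshed, e.g. with
#   verts, faces, normals, values = skimage.measure.marching_cubes(vol, level=0.5)
```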

Finally, we demonstrate how SAPE can regularize a deformation process.
In the following task, for all shapes, SAPE is first pretrained to output the coordinates of a unit circle.
The network is then optimized to trace the boundary of a target shape by learning the *offset* from the circle boundary to the shape contour, per position.

The progressive nature of SAPE allows it to capture the global shape first, during the early training steps when the *spectral bias* dominates and
the optimization is stable. As higher frequencies are revealed, SAPE fits the finer details of the target shape.
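The two-stage setup can be written down as a small sketch; the function names and the parametrization of the contour by angle are our assumptions:

```python
import numpy as np

def circle(theta):
    """Points on the unit circle: the pretrained starting contour (stage 1)."""
    return np.stack([np.cos(theta), np.sin(theta)], axis=-1)

def deformed_contour(theta, offset_fn):
    """Stage 2: a learned per-position offset is added to the circle so that
    the sum traces the target shape's contour."""
    return circle(theta) + offset_fn(theta)
```

With a zero offset the contour is exactly the pretrained circle, so the optimization starts from a smooth, stable initialization and only has to learn the residual deformation.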

- Rahaman et al. (2019) observed that deep ReLU networks are biased towards low frequency functions, and identified this phenomenon as "Spectral Bias".
- Tancik et al. (2020) established the groundwork for applying Fourier Feature mappings to MLPs. They provided extensive analysis of results through the lens of NTK theory.
- Sitzmann et al. (2020) proposed the sinusoidal representation networks (SIREN). Unlike other works which focus on positional encoding mapping of the network input, they use sine functions as non-linear activations for all layers of the network.
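For reference, the Gaussian Fourier feature mapping of Tancik et al. (2020) takes the form γ(x) = [cos(2πBx), sin(2πBx)], with the rows of B drawn from N(0, σ²). A minimal NumPy sketch; σ = 10 is a commonly used scale, not a prescribed value:

```python
import numpy as np

def fourier_features(x, B):
    """Map coordinates x of shape (N, d) through random frequencies B of shape (m, d)."""
    proj = 2.0 * np.pi * x @ B.T
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)

rng = np.random.default_rng(0)
B = rng.normal(0.0, 10.0, size=(128, 2))  # sigma controls the encoded frequency band
```

The choice of σ trades underfitting against noisy overfitting per signal, which is the frequency-scale tuning that SAPE's progressive unmasking aims to avoid.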

- Park et al. (2021) reconstruct photorealistic non-rigid deforming scenes from photos or videos. They also use coarse-to-fine positional encoding.
- Lin et al. (2021) extend Neural Radiance Fields (NeRF), for training without accurate camera poses. They too, apply coarse-to-fine registration on coordinate based scene-representations.
- Mehta et al. (2021) generalize SIREN with a dual-MLP architecture, where an auxiliary network maps input latent codes to parameters that modulate the periodic activations of the synthesis network.

@article{hertz2021sape,
  title={SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization},
  author={Hertz, Amir and Perel, Or and Giryes, Raja and Sorkine-Hornung, Olga and Cohen-Or, Daniel},
  journal={arXiv preprint arXiv:2104.09125},
  year={2021}
}