RETHINKING THE VALUE OF NETWORK PRUNING Paper Reading Notes

RETHINKING THE VALUE OF NETWORK PRUNING

The paper describes a typical three-stage network pruning pipeline: train a large, over-parameterized model, prune it according to some criterion, and fine-tune the pruned model to recover accuracy.
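To make the three stages concrete, here is a minimal sketch on a toy linear model in plain Python. It is not the paper's code: the helper names `train` and `magnitude_mask`, the synthetic data, and the magnitude-based criterion are all assumptions chosen only for illustration.

```python
import random

def train(weights, mask, data, lr=0.05, epochs=200):
    """Stochastic gradient descent on squared error; masked weights stay at zero."""
    w = [wi * mi for wi, mi in zip(weights, mask)]
    for _ in range(epochs):
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            w = [(wi - lr * err * xi) * mi for wi, xi, mi in zip(w, x, mask)]
    return w

def magnitude_mask(weights, keep):
    """Return a 0/1 mask that keeps the `keep` largest-magnitude weights."""
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(order[:keep])
    return [1 if i in kept else 0 for i in range(len(weights))]

# Hypothetical toy data: only two of the four inputs actually matter.
rng = random.Random(0)
true_w = [2.0, 0.0, -3.0, 0.0]
data = []
for _ in range(50):
    x = [rng.uniform(-1, 1) for _ in true_w]
    data.append((x, sum(t * xi for t, xi in zip(true_w, x))))

dense = train([0.0] * 4, [1] * 4, data)         # stage 1: train the large model
mask = magnitude_mask(dense, keep=2)            # stage 2: prune
pruned = train(dense, mask, data, epochs=50)    # stage 3: fine-tune inherited weights
```

Training from scratch, the alternative the paper compares against, would correspond to calling `train` with fresh initial weights and the same `mask` instead of passing `dense`.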

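Figure: difference between predefined and automatically discovered target architectures, using channel pruning as an example. With a predefined target, the user specifies how many channels to prune in each layer; with an automatically discovered target, the pruning algorithm decides the per-layer widths.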
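The distinction can be sketched with per-channel importance scores. The functions below are hypothetical illustrations, not the paper's implementation: `predefined_widths` removes the same fraction of channels from every layer, while `automatic_widths` ranks all channels globally, so the per-layer widths are whatever survives the cut. The scores are made-up numbers.

```python
def predefined_widths(channel_scores, prune_ratio):
    """Predefined target: prune the same fraction of channels in every layer."""
    return [len(layer) - int(len(layer) * prune_ratio) for layer in channel_scores]

def automatic_widths(channel_scores, prune_ratio):
    """Automatic target: prune the globally lowest-scoring channels,
    letting the resulting width of each layer be decided by the ranking."""
    ranked = sorted(
        (score, layer_idx)
        for layer_idx, layer in enumerate(channel_scores)
        for score in layer
    )
    n_prune = int(len(ranked) * prune_ratio)
    widths = [len(layer) for layer in channel_scores]
    for _, layer_idx in ranked[:n_prune]:
        widths[layer_idx] -= 1
    return widths

# Hypothetical importance scores for three layers with four channels each.
scores = [
    [0.9, 0.8, 0.7, 0.6],
    [0.5, 0.1, 0.2, 0.05],
    [0.4, 0.3, 0.95, 0.15],
]

print(predefined_widths(scores, 0.5))  # [2, 2, 2]
print(automatic_widths(scores, 0.5))   # [4, 1, 1]
```

Both calls remove half of the channels overall, but only the global ranking produces a non-uniform architecture.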

The authors show that, for structured pruning, the weights inherited from the large model are no better than random initialization; the pruned architecture itself is what brings the efficiency benefits.

They also run several experiments on the Lottery Ticket Hypothesis, which show that random initialization is enough for the pruned model to reach competitive performance. Using the winning ticket as initialization only helps when the learning rate is small (0.01), but such a small learning rate yields lower accuracy than the widely used larger learning rate (0.1).
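The two initializations being compared can be sketched as follows. This is an illustrative assumption of how one might set them up, not the paper's code: both share the same pruning mask, and they differ only in whether the surviving weights are reset to their original initial values (the winning ticket) or freshly re-sampled. The function names, the mask, and the Gaussian scale are hypothetical.

```python
import random

def winning_ticket_init(original_init, mask):
    """Surviving weights are reset to the values they had before training."""
    return [w * m for w, m in zip(original_init, mask)]

def random_reinit(mask, rng, scale=0.1):
    """Surviving weights are re-sampled; only the sparsity pattern is kept."""
    return [rng.gauss(0.0, scale) * m for m in mask]

rng = random.Random(0)
original_init = [rng.gauss(0.0, 0.1) for _ in range(6)]  # weights before training
mask = [1, 0, 1, 1, 0, 0]  # hypothetical mask obtained by pruning the trained model

ticket = winning_ticket_init(original_init, mask)
scratch = random_reinit(mask, rng)
```

Each initialization would then be trained with the same schedule, once at the small learning rate (0.01) and once at the larger one (0.1), to reproduce the kind of comparison described above.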