Measure, Manifold, Learning, and Optimization: A Theory Of Neural
Networks [pdf]

Shuai Li

Pre-print

Overall, the results in the paper identify key principles underlying NNs. They form the initial skeleton of a fully fledged theory of NNs; the main effort in the years to come will be to fill in the flesh and blood of that skeleton.

The theory gives S-System, **a measure-theoretic definition of NNs**; endows the intermediate feature spaces of NNs with a stochastic manifold structure through information geometry; proposes a learning framework that unifies supervised and unsupervised learning under the same objective function; and proves that, **under practical conditions**, for **large nonlinear deep NNs** with a class of losses including the hinge loss, **all local minima are global minima** with zero loss. It also completes, or more precisely clarifies, the analogy between NNs and the Renormalization Group.
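As a minimal illustration of the "zero loss" global minima the result refers to, the sketch below evaluates the standard hinge loss, which is exactly zero once every sample is classified with margin at least 1 (the toy predictor scores here are assumptions for illustration, not from the paper):

```python
def hinge_loss(y, score):
    # Standard hinge loss: max(0, 1 - y * score), zero once the
    # margin y * score reaches 1.
    return max(0.0, 1.0 - y * score)

# Hypothetical (label, score) pairs where every margin is >= 1,
# so the total hinge loss is exactly zero -- the kind of
# zero-loss global minimum the paper's result describes.
samples = [(+1, 2.5), (-1, -1.0), (+1, 1.0)]
total = sum(hinge_loss(y, s) for y, s in samples)
print(total)  # -> 0.0
```

Unlike the squared or cross-entropy loss, the hinge loss can be driven exactly to zero at finite parameter values, which is one reason such losses are convenient for loss-landscape results of this kind.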