On a Learnability Question Associated to Neural Networks with Continuous Activations

Abstract

This paper deals with the learnability of concept classes defined by neural networks, showing the hardness of PAC learning (in the complexity, not merely information-theoretic, sense) for networks with a particular class of activation functions. The obstruction lies not with the VC dimension, which is known to grow slowly; instead, the result follows from the fact that the loading problem is NP-complete. (The complexity scales badly with input dimension: the loading problem is polynomial-time if the input dimension is constant.) Similar and well-known theorems had already been proved by Megiddo and by Blum and Rivest for binary threshold networks. It turns out that the general problem for continuous sigmoidal-type functions, as used in practical applications involving steepest descent, is not NP-hard; there are sigmoidals for which the problem is in fact trivial, so it is an open question to determine what properties of the activation function cause difficulties. Ours is the first result on the hardness of loading networks which do not consist of binary neurons; we employ a piecewise-linear activation function that has been used in the neural network literature. Our theoretical results lend further justification to the use of incremental (architecture-changing) techniques for training networks.

Research supported in part by US Air Force Grant AFOSR. Research supported in part by NSF Grant CCR. A part of the results reported here will also appear in V. P. Roychowdhury, K. Y. Siu, and A. Orlitsky, eds., Theoretical Advances in Neural Computation and Learning, Kluwer Academic Publishers.

INTRODUCTION

We show (modulo RP ≠ NP) the hardness of PAC learning for concept classes defined by certain analog neural networks; this is a consequence of the fact that the loading problem for a typical such architecture (hypothesis class) is proved to be NP-complete. Our results are similar in spirit to those due to Blum and Rivest, except that we focus on continuous (as opposed to binary) node functions. Continuous sigmoidal-type functions are used in all practical applications involving steepest descent, and for these the loading problem is not in general NP-hard; there are sigmoidals for which the problem is in fact trivial, so it is an open question to determine what properties of the activation function cause difficulties. Ours is the first result on the hardness of loading networks which do not consist of binary neurons; we employ a piecewise-linear activation function that has been used in the neural network literature.
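For concreteness, one standard piecewise-linear ("saturated-linear") activation from the neural network literature is shown below; the specific activation used in this paper's construction is defined later in the text, so this particular form is given only as an illustrative assumption:

\[
\pi(x) \;=\;
\begin{cases}
0 & \text{if } x < 0,\\
x & \text{if } 0 \le x \le 1,\\
1 & \text{if } x > 1.
\end{cases}
\]

This contrasts with the binary threshold activation $\mathcal{H}(x) = 1$ if $x \ge 0$, $0$ otherwise, to which the earlier hardness results of Megiddo and of Blum and Rivest apply.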
