Avraam Chatzimichailidis (Fraunhofer ITWM, Kaiserslautern)
hosted by Seminar Series on Scientific Computing
"Second-Order Methods for Neural Networks/Bridging the Gap between Neural Network Pruning and Neural Architecture Search"
Optimization in deep learning is still dominated by first-order gradient methods such as stochastic gradient descent. Second-order optimization exploits curvature information of the objective function, which can reduce both the number of training iterations until convergence and the number of hyperparameters. However, despite their strong theoretical properties, second-order methods are far less prevalent in deep learning because of their prohibitive computational and memory costs.
In the first part of the presentation we will discuss how to efficiently calculate the Hessian-vector product of a neural network and how to construct new optimizers that can stabilize the training of neural networks. We will see how second-order methods can help us gain deeper insight into the training process. Specifically, an example will demonstrate how an optimizer that uses second-order information can stabilize the training of generative adversarial networks.
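As a rough illustration of why a Hessian-vector product can be cheap, the sketch below approximates it with a central difference of gradients, so the full Hessian is never formed. This is a generic matrix-free approach and not necessarily the method presented in the talk; the toy function f(x, y) = x^2 y + y^3, its hand-written gradient `grad_f`, and the helper name `hvp_fd` are assumptions made purely for this example.

```python
def grad_f(p):
    # Hand-written gradient of the toy function f(x, y) = x^2 * y + y^3
    # (hypothetical stand-in for a network's loss gradient).
    x, y = p
    return [2.0 * x * y, x * x + 3.0 * y * y]


def hvp_fd(grad, p, v, eps=1e-4):
    # Approximate H(p) @ v by a central difference of the gradient:
    #   H v ~= (grad(p + eps * v) - grad(p - eps * v)) / (2 * eps)
    # This needs two gradient evaluations and never builds the Hessian.
    p_plus = [pi + eps * vi for pi, vi in zip(p, v)]
    p_minus = [pi - eps * vi for pi, vi in zip(p, v)]
    g_plus = grad(p_plus)
    g_minus = grad(p_minus)
    return [(a - b) / (2.0 * eps) for a, b in zip(g_plus, g_minus)]


# Hessian of f at (1, 2) is [[4, 2], [2, 12]]; with v = (1, -1) the exact
# product is (2, -10).
hv = hvp_fd(grad_f, [1.0, 2.0], [1.0, -1.0])
print(hv)
```

In practice, automatic differentiation frameworks can compute the same product exactly by differentiating the gradient along the direction v, which avoids the step-size choice that a finite difference requires.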
The second part of this talk will deal with the area of neural architecture search (NAS). The aim of NAS is to find architectures that have superior performance on a given dataset, without having to rely on handcrafted neural networks. The formulation of the one-shot NAS problem as a differentiable optimization problem reduces the time complexity compared to evolutionary or reinforcement learning algorithms. We observe that, in its essence, one-shot NAS is a pruning process. Network pruning is a technique to compress neural networks and reduce their computational complexity without a significant performance drop.
I will talk about how combining neural network pruning with group sparsity can bridge the gap between the areas of network pruning and neural architecture search by casting the one-shot NAS optimization as a single-level optimization problem.
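To give a feel for how group sparsity can act as pruning, the sketch below applies the standard proximal step of a group-lasso penalty (block soft-thresholding): each group of weights is shrunk as a unit, and groups whose norm falls below the threshold are set to zero entirely. This is a textbook construction and an assumption about what "group sparsity" may look like in code, not the speaker's algorithm; the function name `group_soft_threshold`, the grouping, and the value of `lam` are hypothetical.

```python
import math


def group_soft_threshold(groups, lam):
    # Proximal operator of lam * sum_g ||w_g||_2 (group lasso).
    # Each group is scaled by max(0, 1 - lam / ||w_g||), so a group with
    # norm <= lam is zeroed as a whole, i.e. pruned.
    out = []
    for g in groups:
        norm = math.sqrt(sum(w * w for w in g))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0.0 else 0.0
        out.append([scale * w for w in g])
    return out


# Two hypothetical weight groups, e.g. two candidate operations in a cell.
groups = [[3.0, 4.0], [0.1, -0.2]]
pruned = group_soft_threshold(groups, lam=1.0)
print(pruned)
```

Here the first group (norm 5) is only shrunk, while the second group (norm below 1) is removed entirely, which mirrors the idea of discarding a whole candidate operation rather than individual weights.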
Time: Thursday, 08.07.2021, 15:00