A First Principles Theory of Generalization
New research from the University of California, Berkeley sheds light on how to quantify what neural networks know.
I recently started an AI-focused educational newsletter that already has over 150,000 subscribers. TheSequence is a no-BS (meaning no hype, no news, etc.) ML-oriented newsletter that takes 5 minutes to read. The goal is to keep you up to date with machine learning projects, research papers, and concepts. Please give it a try by subscribing below:
Most machine learning (ML) papers we see these days focus on advancing new techniques and methods in areas such as natural language processing or computer vision. ML research is advancing at a frantic pace despite the lack of a fundamental theory of machine intelligence. Some of the main questions in ML, such as how neural networks learn or how to quantify knowledge generalization, remain unanswered. From time to time, we come across papers that challenge our fundamental understanding of ML theory with new ideas. This is the case with “Neural Tangent Kernel Eigenvalues Accurately Predict Generalization”, a groundbreaking paper just published by Berkeley AI Research (BAIR) that proposes nothing less than a new theory of generalization.
Understanding generalization remains one of the biggest mysteries in modern ML. In their paper, the BAIR researchers tackle a variation of the fundamental question of generalization, posed as follows:
Can one efficiently predict, from first principles, how well a given network architecture will generalize when learning a given function, provided a given number of training examples?
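To make the central object in the paper's title concrete, here is a minimal numpy sketch of the empirical neural tangent kernel (NTK) of a tiny two-layer ReLU network and its eigenvalue spectrum. All sizes and parameter values here are illustrative assumptions, not the paper's actual setup: the entry K[i, j] is the inner product of the network's parameter gradients at inputs x_i and x_j, and the eigenvalues of this Gram matrix are the quantities the paper relates to generalization.

```python
import numpy as np

# Toy sketch: empirical NTK of a tiny two-layer ReLU network.
# f(x) = v . relu(W x); parameters are W (h x d) and v (h,).
rng = np.random.default_rng(0)
d, h, n = 3, 16, 8                       # input dim, hidden width, #samples (illustrative)
W = rng.normal(size=(h, d)) / np.sqrt(d)
v = rng.normal(size=h) / np.sqrt(h)
X = rng.normal(size=(n, d))

def param_gradient(x):
    """Gradient of f(x) with respect to all parameters, flattened."""
    pre = W @ x                          # pre-activations, shape (h,)
    act = np.maximum(pre, 0.0)           # relu
    mask = (pre > 0).astype(float)       # relu derivative
    dW = np.outer(v * mask, x)           # df/dW, shape (h, d)
    dv = act                             # df/dv, shape (h,)
    return np.concatenate([dW.ravel(), dv])

# Jacobian of the outputs w.r.t. parameters, one row per sample.
J = np.stack([param_gradient(x) for x in X])   # (n, h*d + h)
K = J @ J.T                                    # empirical NTK Gram matrix, (n, n)
eigvals = np.linalg.eigvalsh(K)[::-1]          # eigenvalues, largest first
print(eigvals)                                 # nonnegative: K is positive semidefinite
```

As the width h grows, this empirical kernel converges to the deterministic infinite-width NTK; the paper's theory works with the eigenvalues of that limiting kernel.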
To answer this question, the BAIR team relied on two recent breakthroughs in deep learning:
1) Infinite-Width Networks
One of the most interesting theoretical developments in deep learning in recent years is the theory…