State Key Laboratory for Novel Software Technology
Abstract:
Deep learning has enjoyed huge empirical success in recent years.
Although training a deep neural network is a highly non-convex optimization
problem, simple (stochastic) gradient methods are able to produce good
solutions that minimize the training error and, more surprisingly, generalize well to out-of-sample data,
even when the number of parameters is significantly larger than the amount of
training data. It is known that changing the optimization algorithm, even
without changing the model, changes the implicit bias and hence the
generalization properties. What is the bias introduced by the optimization
algorithms for neural networks? What ensures generalization in neural
networks? In this talk, we attempt to
answer the above questions by proving new generalization bounds and
investigating the implicit bias of various gradient methods.
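As a simple illustration of implicit bias (a standard textbook example, not drawn from the talk itself): on an underdetermined least-squares problem, gradient descent initialized at zero converges to the minimum L2-norm solution that fits the data, even though infinitely many zero-training-error solutions exist. A minimal NumPy sketch, with illustrative problem sizes and step size chosen only for this example:

```python
import numpy as np

# Implicit bias of gradient descent on underdetermined least squares:
# with more parameters (d) than data points (n) and zero initialization,
# plain GD on ||Xw - y||^2 converges to the minimum-norm interpolating
# solution pinv(X) @ y. Sizes, step size, and iteration count are
# illustrative assumptions.
rng = np.random.default_rng(0)
n, d = 20, 100                      # fewer samples than parameters
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

w = np.zeros(d)                     # zero initialization keeps iterates in the row space of X
lr = 1e-3
for _ in range(10_000):             # gradient step on 0.5 * ||Xw - y||^2
    w -= lr * X.T @ (X @ w - y)

w_min_norm = np.linalg.pinv(X) @ y  # minimum L2-norm solution

print("training error:", np.linalg.norm(X @ w - y))                      # ~0
print("distance to min-norm solution:", np.linalg.norm(w - w_min_norm))  # ~0
```

The same model class admits many interpolating solutions; the optimizer's dynamics single one out, which is the sense in which the algorithm, not the model, introduces the bias.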
Speaker Bio:
Jian Li is currently an associate professor at Institute for
Interdisciplinary Information Sciences (IIIS, previously ITCS), Tsinghua
University, headed by Prof. Andrew Yao. He received his BSc degree from Sun Yat-sen
(Zhongshan) University, China, his MSc degree in computer science from Fudan
University, China, and his PhD degree from the University of Maryland, USA. His major
research interests lie in algorithm design and analysis, machine learning,
databases and finance. He co-authored several research papers that have been
published in major computer science conferences and journals. He received the
best paper awards at VLDB 2009 and ESA 2010. He is also a recipient of the
"221 Basic Research Plan for Young Faculties" at Tsinghua University,
the "new century excellent talents award" by Ministry of Education of
China, and the National Science Fund for Excellent Young Scholars.
Time: October 29, 10:30-11:30
Venue: Room 230, Computer Science and Technology Building