State Key Laboratory for Novel Software Technology, Nanjing University
Abstract:
Big data concerns growing data sets with a huge number of samples, very
high-dimensional feature vectors, and complex and diverse structures. Many
traditional techniques are inadequate to extract knowledge and insights from
these data sets due to their ever-greater volume and complex structures. An
encouraging discovery of the early empirical studies on big data was the
recognition that many massive real-world data sets can be well interpreted by a
small number of features and/or samples. For example, given a certain visual
stimulus, the fraction of active neurons at that instant is small. This
observation has inspired an emerging research area known as sparse learning, which
has achieved great success in learning from large and complex data by
uncovering a small set of most explanatory features and/or samples. Typical
examples include selecting features that are the most indicative of users’
preferences for recommender systems, identifying brain regions that are
predictive of brain disorder based on fMRI data, and extracting semantic
information from raw images for object recognition. Despite this great
success, training sparse learning methods on large and complex data can be
very time-consuming because of their nonsmooth and highly complex
regularization terms. To address this, we propose a suite of
novel techniques, called screening, that quickly identify redundant features
and/or samples, which can then be removed from the training phase without losing
useful information of interest. Success with these screening techniques
is expected to dramatically scale up sparse learning methods for large and
complex data in terms of efficiency and memory usage, by several orders of
magnitude. This will significantly expand the use of sparse learning methods to
much bigger data sets that were previously out of reach, with direct impact
on many fields where sparse learning is critical, e.g., social media mining,
brain data analytics, and imaging genetics.
Speaker bio:
Jie Wang is a Professor at the University of Science and Technology of China (USTC). He
received his BS in electronic information science and technology from USTC in
2005 and his PhD in computational science from Florida State University in
2011. He then conducted postdoctoral work at Arizona State
University, followed by the University of Michigan. Before joining USTC, Dr.
Wang held a position as a research assistant professor at the University of Michigan
from 2015. He has broad interests in artificial intelligence, machine learning,
data mining, natural language processing, image processing, and large-scale
optimization. He has published many papers in top machine learning and data
mining journals and conferences such as JMLR, TPAMI, NIPS, ICML, and KDD. He is
the PI of a research project funded by the National Science Fund for Excellent
Young Scholars.
Time: May 16, 10:00-11:00
Venue: Room 230, Computer Science and Technology Building