I received an invitation to attend the Workshop on Artificial Intelligence in the Era of Big Data, organized by the Department of Computer Science of City University of Hong Kong, on 7 June 2018. The workshop aimed to provide a platform for academics and students to learn about the latest developments in Artificial Intelligence and Data Science through keynote talks given by high-calibre scholars. The workshop also offered a forum for participants to exchange knowledge and to share their views and insights on development trends.
Before the workshop started, I met Dr. Ken Yau (Instructor I, SEEM Dept., CityU) and we took a photo in front of the banner. I was honoured to meet Prof. Jun Wang (Chair Professor of Computational Intelligence, CS Dept., CityU) and took a photo with him as a memento.
Prof. Jun Wang (王鈞教授) won the Wu Wenjun Artificial Intelligence Science and Technology Achievement Award 2016 (吳文俊人工智能科學技術獎). I also met Dr. Louis Liu (Instructor I, SEEM Dept., CityU) in the hall.
At the beginning, Prof. Alex Jen (Provost; Chair Professor of Chemistry and Materials Science, CityU) gave the opening remarks. He said the workshop was a platform for exchanging new knowledge in AI, and that CityU would focus on Data Science and establish a School of Data Science soon.
Then Prof. Hong Yan (Dean, College of Science and Engineering; Chair Professor of Computer Engineering, CityU) gave an opening speech. He reviewed his own journey of learning AI and noted that neural networks and AI depend on data: the more data available, the better the performance obtained. He expected the workshop to introduce new, even crazy, ideas.
A group photo was taken. (From left: Prof. Kay Chen Tan, Prof. Haibo He, Prof. Jie Lu, Prof. Sam Tak Wu Kwong, Prof. Alex Jen, Prof. Zidong Wang, Prof. Yaochu Jin, Prof. Hong Yan, Prof. Jun Wang and Prof. …)
The first keynote speaker was Prof. Haibo He (Editor-in-Chief, IEEE Transactions on Neural Networks and Learning Systems; Robert Haas Endowed Chair Professor, Department of Electrical, Computer, and Biomedical Engineering, University of Rhode Island, USA), whose talk was entitled “Big Data: An Imbalanced Learning Perspective”. Firstly, Prof. He introduced the University of Rhode Island and its new College of Engineering building, whose construction was expected to be completed in July 2019. He then outlined the central challenge of Big Data: how to transform big data into useful information and knowledge to support decision-making.
Prof. He explained that Big Data includes statistical data, behavioural data and causal data. Such data are collected and then analyzed so as to extract knowledge for decision support. Big Data is commonly characterized along several “V” dimensions, including Volume, Velocity and Variety.
Prof. He then introduced imbalanced learning as an emerging and critical challenge of the Big Data era. He said that imbalance is ubiquitous and prevalent in many real-world applications such as disease screening, intrusion detection, fraud detection and spam filtering. He illustrated between-class and within-class imbalance to explain the concept, and noted that imbalanced data can be characterized along several dimensions: “between-class vs. within-class”, “intrinsic vs. extrinsic” and “relative vs. absolute”.
After that, Prof. He surveyed sampling methods, cost-sensitive methods and kernel-based methods. He also explained how to evaluate the performance of imbalanced learning algorithms through singular assessment metrics, receiver operating characteristic (ROC) curves, precision-recall (PR) curves, cost curves and assessment metrics for multiclass imbalanced learning. He then revisited the Synthetic Minority Oversampling Technique (SMOTE), which generates new minority-class samples using feature-space similarities, as illustrated in a diagram from his slides.
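To make the SMOTE idea concrete, here is a minimal sketch of its core step; this is my own illustration rather than Prof. He's code, and the data, function name and parameters are invented for the example. A synthetic minority sample is created by interpolating between a minority point and one of its k nearest minority-class neighbours.

```python
import numpy as np

def smote_sample(X_min, k=5, n_new=10, seed=0):
    """Sketch of SMOTE's core idea: interpolate between a minority sample
    and one of its k nearest minority-class neighbours in feature space."""
    rng = np.random.default_rng(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))                    # pick a minority sample
        dists = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(dists)[1:k + 1]         # its k nearest minority neighbours
        j = rng.choice(neighbours)                      # choose one neighbour at random
        gap = rng.random()                              # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)

# Hypothetical minority class with only four samples in a 2-D feature space
X_minority = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]])
print(smote_sample(X_minority, k=2, n_new=5).shape)    # (5, 2)
```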
Finally, Prof. He discussed deep learning for imbalanced data, based on his 2017 paper. The key idea was to train a Generative Adversarial Network (GAN) and use the generator network to generate new data (a minimal sketch of this idea follows the list below). Lastly, he discussed four opportunities and challenges:
i) Understanding the fundamental principles,
ii) A uniform benchmark platform,
iii) Standardized evaluation practices, and
iv) Emergent topics on imbalanced learning.
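The sketch below is my own minimal illustration of that GAN idea, not Prof. He's implementation; the toy minority data, network sizes and training settings are assumptions chosen only to keep the example small. A small generator and discriminator (written here in PyTorch) are trained on minority-class samples, and the trained generator is then sampled to produce new synthetic minority data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n_features, noise_dim, batch = 2, 8, 64

# Toy minority-class samples (stand-in for the real, scarce minority data)
X_min = torch.randn(batch, n_features) * 0.1 + torch.tensor([1.0, 1.0])

G = nn.Sequential(nn.Linear(noise_dim, 32), nn.ReLU(), nn.Linear(32, n_features))
D = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(500):
    # Discriminator step: real minority samples vs. generated fakes
    fake = G(torch.randn(batch, noise_dim)).detach()
    d_loss = bce(D(X_min), torch.ones(batch, 1)) + bce(D(fake), torch.zeros(batch, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: try to fool the discriminator
    g_loss = bce(D(G(torch.randn(batch, noise_dim))), torch.ones(batch, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

# Sample the trained generator to obtain new synthetic minority data
new_minority = G(torch.randn(100, noise_dim)).detach()
print(new_minority.shape)  # torch.Size([100, 2])
```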
The second keynote speaker was Prof. Yaochu Jin (Editor-in-Chief, IEEE Transactions on Cognitive and Developmental Systems; Editor-in-Chief, Complex & Intelligent Systems; Professor in Computational Intelligence, Department of Computer Science, University of Surrey, UK), whose presentation was entitled “Data-driven Surrogate-assisted Evolutionary Optimization of Expensive Optimization Problems”. He also briefly introduced the University of Surrey, whose research park is one of four successful research parks in the UK.
Prof. Jin first introduced optimization and problem formulation. He said that optimization is a mathematical discipline concerned with finding the minima and maxima of functions (objectives) subject to constraints; a problem formulation comprises the objective function, the decision variables and the constraints. He also discussed a typical industrial design flow from concept and design through to verification.
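In standard notation (my own addition, consistent with that description rather than a formula from the talk), such a constrained optimization problem can be written as

$$\min_{x \in \mathbb{R}^n} f(x) \quad \text{subject to} \quad g_i(x) \le 0,\; i = 1, \dots, m,$$

where $x$ is the vector of decision variables, $f$ is the objective function and the $g_i$ are the constraints.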
Then data-driven evolutionary optimization was discussed. He explained that evolutionary algorithms and other meta-heuristic search methods are a class of population-based, guided stochastic search heuristics inspired by biological evolution and the swarm behaviour of social animals. Data-driven evolutionary optimization moves from evolutionary computation towards machine learning, with support from Data Science: when the real objective is expensive to evaluate, a surrogate model learned from data stands in for it during the search.
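As a rough illustration of that idea, here is a minimal surrogate-assisted evolutionary loop; it is my own sketch, not Prof. Jin's algorithm, and the test function, Gaussian-process surrogate and parameter values are assumptions for the example. Most candidate solutions are screened on a cheap surrogate fitted to previously evaluated data, and only the most promising few are evaluated on the expensive objective and added back to the training data.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

def expensive_objective(x):
    """Stand-in for a costly simulation; here just a quadratic to be minimized."""
    return np.sum((x - 0.3) ** 2, axis=-1)

dim, pop_size, n_real_evals = 5, 30, 3
X = rng.uniform(-1, 1, size=(20, dim))              # initial expensive evaluations
y = expensive_objective(X)

for generation in range(10):
    surrogate = GaussianProcessRegressor().fit(X, y)            # cheap model of f
    parents = X[np.argsort(y)[:pop_size]]                       # best solutions so far
    offspring = (parents[rng.integers(len(parents), size=pop_size)]
                 + rng.normal(0.0, 0.1, size=(pop_size, dim)))  # mutate parents
    # Screen offspring on the surrogate; spend real evaluations only on the best few
    promising = offspring[np.argsort(surrogate.predict(offspring))[:n_real_evals]]
    X = np.vstack([X, promising])
    y = np.append(y, expensive_objective(promising))

print("best objective found:", y.min())
```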
After that, Prof. Jin pointed out some challenges in data, machine learning and optimization.
Challenges in data: data resources, big data and small data.
Challenges in machine learning: data clustering/dimension reduction and sensitivity analysis, active learning, semi-supervised learning, transfer learning, dynamic and incremental learning, and ensemble learning.
Challenges in optimization: problem formulation, large-scale optimization, multi-/many-objective optimization, optimization in the presence of uncertainties, computational complexity, and extremely highly constrained problems.
After the lunch break, I took some photos with the keynote speakers and with professors from the CS Dept., CityU.
Prof. Haibo He and I
Prof. Yaochu Jin and I
Prof. Jie Lu, Dr. Jacky Wai Keung (Assistant Professor, CS Dept., CityU) and I
The third keynote speaker was Prof. Jie Lu (Editor-in-Chief, Knowledge-Based Systems; Editor-in-Chief, International Journal of Computational Intelligence Systems; Director of the Centre for Artificial Intelligence; Associate Dean (Research Excellence) and Distinguished Professor, Faculty of Engineering and IT, University of Technology Sydney, Australia), whose presentation was entitled “Concept Drift”. She introduced concept drift's definition, challenges, detection, understanding and adaptation.
At the beginning, she introduced machine learning as a computational process of discovering patterns (knowledge) from historical information. Its overall goal is to extract useful information from data and transform it into an understandable structure for further use.
Then Prof. Lu described two scenarios of machine learning: “static data” and “stream data”. With stream data (handled by a dynamic model), unpredictable variation of the data distribution may occur; this is the concept drift problem.
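The following is a minimal sketch contrasting the two scenarios; it is my own illustration rather than material from the talk, and the simulated data generator is an assumption. A static model is fitted once on a fixed dataset, while a dynamic model is updated incrementally as new chunks arrive from a stream whose distribution drifts over time.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

def make_chunk(shift, n=200):
    """Simulated data whose distribution (and decision boundary) moves over time."""
    X = rng.normal(shift, 1.0, size=(n, 2))
    y = (X[:, 0] + X[:, 1] > 2 * shift).astype(int)
    return X, y

# Static data: the model is fitted once and never updated afterwards
X0, y0 = make_chunk(shift=0.0)
static_model = SGDClassifier().fit(X0, y0)

# Stream data: a dynamic model is updated chunk by chunk as data arrive;
# when the distribution shifts, the static model's assumption no longer holds
dynamic_model = SGDClassifier()
for t in range(10):
    X_t, y_t = make_chunk(shift=0.3 * t)        # the stream drifts over time
    dynamic_model.partial_fit(X_t, y_t, classes=[0, 1])
```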
After that, she outlined the process of learning under concept drift in the following steps:
i) Training and learning
ii) Making predictions
iii) Drift detection
iv) Drift understanding
v) Drift adaptation
There were two strategies for handling concept drift. The first, the “Lazy Strategy”, updates the models only once a concept drift has been detected. The second, the “Active Strategy”, updates the models constantly, without a standalone drift detection step. Lastly, Prof. Lu said that machine learning rests on the assumption that training data and testing data follow the same pattern. If this no longer holds for stream data, we need to learn under concept drift so as to stop making mistakes before they turn into a disaster.
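As a rough illustration of the first (lazy) strategy, here is a minimal window-based drift detector; it is my own sketch, not Prof. Lu's method, and is only a crude stand-in for established detectors such as DDM or ADWIN. It compares the model's recent error rate with its error rate on an earlier reference window and flags drift, so that retraining can be triggered, when the gap exceeds a threshold.

```python
from collections import deque

class SimpleDriftDetector:
    """Flag drift when the recent error rate exceeds the reference error rate
    by more than `threshold`."""
    def __init__(self, window=100, threshold=0.15):
        self.reference = deque(maxlen=window)   # errors on older data
        self.recent = deque(maxlen=window)      # errors on the newest data
        self.threshold = threshold

    def add_error(self, error):                 # error: 1 if misprediction, else 0
        if len(self.recent) == self.recent.maxlen:
            self.reference.append(self.recent.popleft())
        self.recent.append(error)

    def drift_detected(self):
        if len(self.reference) < self.reference.maxlen:
            return False                        # not enough history yet
        ref_rate = sum(self.reference) / len(self.reference)
        rec_rate = sum(self.recent) / len(self.recent)
        return rec_rate - ref_rate > self.threshold

# Usage (lazy strategy), with hypothetical `model`, `stream` and retraining step:
# detector = SimpleDriftDetector()
# for x, y in stream:
#     detector.add_error(int(model.predict([x])[0] != y))
#     if detector.drift_detected():
#         model = retrain_on_recent_data()
```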
The last keynote speaker was Prof. Zidong Wang (Editor-in-Chief, Neurocomputing; Professor of Dynamical Systems and Computing, Department of Computer Science, Brunel University London, UK), whose presentation was entitled “Reproducibility in Big Data Analysis: A Bad Data Perspective”. Firstly, Prof. Wang mentioned the currently hottest term, “ABC”: A for AI, B for Big Data and C for Cloud Computing.
Then Prof. Wang outlined six common types of data analysis methods: classification, clustering, regression, optimization, association analysis and outlier/novelty detection. He also discussed the reproducibility of research: over 70% of researchers have been unable to reproduce another scientist's experiments, while more than half have failed to reproduce their own experiments. Poor data analysis is one of the top three reasons, the other two being selective reporting and pressure to publish (Nature, 2016).
After that, Prof. Wang argued that a sixth V of Big Data is often missing: Volatility (波動性), which arises from bad data or from complexity. The complexity in big data includes uncertainty, randomness, nonlinearity, time delays, disturbances, faults, missing information, high dimensionality, fragility, mode switching, etc.
Lastly, Prof. Wang used big data from bioinformatics, namely gene expression image analysis, as an example to illustrate data quality issues. He concluded that “big is not always better and fast computing is not the unique solution; the volatility of data analysis should not be ignored, and sometimes bad data is good for research.”
A panel discussion entitled “Opportunities and Challenges of Artificial Intelligence” was arranged at the end.
Prof. Jun Wang (City University
of Hong Kong) was the moderator.
Panel members:
Prof. Haibo He (University of Rhode Island, USA)
Prof. Kay Chen Tan (City University of Hong Kong)
Prof. Yaochu Jin (University of Surrey, UK)
Prof. Jie Lu (University of Technology Sydney, Australia)
Prof. Zidong Wang (Brunel University London, UK)
Firstly, Prof. Jun Wang outlined the path from Big Data to AI through the steps: Big Data → Information → Knowledge → Wisdom ←→ Intelligence.
Then Prof. He remarked that AI is everywhere and posed four questions for discussion:
i) Weak AI (narrow AI) or strong AI (artificial general intelligence)?
ii) Deep learning: how deep is deep enough?
iii) Big data or small data?
iv) Decision-making under uncertainty
Prof. Yaochu Jin then raised some further questions:
i) Big Data AI vs. Small Data AI
ii) Application AI vs. Sustainable AI (deep learning is not yet a conceptual breakthrough)
iii) Powerful AI vs. Secure AI
Prof. Jie Lu also shared her views on the opportunities and challenges of AI. She identified three kinds of opportunities: technique support, industry support and social support. She also discussed three technical challenges: labelled data, black-box models and working with imperfect AI. A further set of social challenges relates to privacy, humanity, responsibility and jobs.
After that, Prof. Kay Chen Tan shared his view on the evolution of AI from Artificial Narrow Intelligence (ANI) to Artificial General Intelligence (AGI) and then to Artificial Super Intelligence (ASI).
Lastly, Prof. Zidong Wang outlined some challenges of big data analysis:
i) Analysis techniques fall behind
ii) Reproducibility crisis
iii) The missing 6th V (volatility)
iv) Lack of understanding of bad data
v) Multi-objective data analysis
At the end, Prof. Sam Kwong Tak Wu (Head and Professor, CS Dept., CityU) presented souvenirs to the speakers and a group photo was taken.
After the workshop, I got acquainted with Prof. Sam Kwong.
I also got acquainted with Prof. Kay Chen Tan. Hopefully, we will cooperate with HKSTP in the near future.
References:
Computer Science Dept., CityU - http://www.cs.cityu.edu.hk/
Workshop page - http://eicw.cs.cityu.edu.hk/
吳文俊人工智能科學技術獎 (Wu Wenjun Artificial Intelligence Science and Technology Award) - http://www.wuwenjunkejijiang.cn/wj/index.aspx