Thursday, June 7, 2018

CityU Workshop on AI in the Era of Big Data

I received an invitation to attend the Workshop on Artificial Intelligence in the Era of Big Data, organized by the Department of Computer Science of City University of Hong Kong, on 7th June 2018. The workshop aimed to provide a platform for academics and students to learn about the latest developments in Artificial Intelligence and Data Science through keynote talks given by high-calibre scholars. The workshop also offered a forum for participants to exchange knowledge and to share their views and insights on development trends.  Before the workshop started, I met Dr. Ken Yau (Instructor I, SEEM Dept., CityU) and we took a photo in front of the banner.


I was honoured to meet Prof. Jun Wang (Chair Professor of Computational Intelligence, CS Dept., CityU) and took a photo with him as a memento.




I also met Dr. Louis Liu (Instructor I, SEEM Dept., CityU) in the hall.


At the beginning, Prof. Alex Jen (Provost; Chair Professor of Chemistry and Materials Science, CityU) gave the opening remarks.  He said the workshop was a platform for exchanging new knowledge in AI, and that CityU would focus on Data Science and would soon establish a School of Data Science.


Then Prof. Hong Yan (Dean, College of Science and Engineering; Chair Professor of Computer Engineering, CityU) gave the opening speech.  He reviewed his own journey of learning AI over the years and noted that neural networks and AI depend on data: the more data available, the better the performance obtained.  He hoped the workshop would introduce new ideas, even crazy ones.


A group photo was taken.

(From left: Prof. Kay Chen Tan, Prof. Haibo He, Prof. Jie Lu, Prof. Sam Tak Wu Kwong, Prof. Alex Jen, Prof. Zidong Wang, Prof. Yaochu Jin, Prof. Hong Yan, Prof. Jun Wang and Prof.)

The first keynote speaker was Prof. Haibo He (Editor-in-Chief, IEEE Transactions on Neural Networks and Learning Systems; Robert Haas Endowed Chair Professor, Department of Electrical, Computer, and Biomedical Engineering, University of Rhode Island, USA), whose talk was entitled “Big Data: An Imbalanced Learning Perspective”. Firstly, Prof. He introduced the University of Rhode Island and its new College of Engineering building, expected to be completed in July 2019.  He then outlined the central challenge of Big Data: how to transform big data into useful information and knowledge to support decision-making.


Prof. He explained that Big Data includes statistical data, behavioural data and causal data.  Such data are collected and analysed so as to extract knowledge for decision support.  Big Data is often characterised by its “n-V” dimensions, including Volume, Velocity and Variety.


Prof. He then introduced Imbalanced Learning, an emerging and critical challenge for the Big Data era.  He said that imbalance is ubiquitous and prevalent in many real-world applications such as disease screening, intrusion detection, fraud detection and spam filtering.  He demonstrated between-class and within-class imbalanced data to illustrate the concept, and noted that imbalanced data can be characterised along several dimensions: between-class vs. within-class, intrinsic vs. extrinsic, and relative vs. absolute.


After that, Prof. He outlined sampling methods, cost-sensitive methods and kernel-based methods.  He discussed how to evaluate the performance of imbalanced learning algorithms through singular assessment metrics, receiver operating characteristic (ROC) curves, precision-recall (PR) curves, cost curves and assessment metrics for multiclass imbalanced learning.  He also revisited the Synthetic Minority Oversampling Technique (SMOTE), which generates synthetic samples using feature-space similarities, as shown in the diagram from his slides.
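To make the SMOTE idea concrete, here is a minimal Python sketch of the classic interpolation step (my own illustration with toy data, not the revisited version Prof. He presented): each synthetic sample lies on the line segment between a minority-class point and one of its k nearest minority-class neighbours in feature space.

# Minimal sketch of the SMOTE interpolation idea (toy example, my own code)
import numpy as np

def smote_sketch(X_min, n_new, k=5, rng=None):
    """Generate n_new synthetic samples from minority-class matrix X_min."""
    rng = np.random.default_rng(rng)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # distances from the chosen sample to all other minority samples
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]      # skip the point itself
        j = rng.choice(neighbours)
        lam = rng.random()                       # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(synthetic)

# Example: oversample a 20-point minority class with 30 synthetic points
X_min = np.random.randn(20, 2)
X_new = smote_sketch(X_min, n_new=30)
print(X_new.shape)                               # (30, 2)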


Finally, Prof. He discussed deep learning for imbalanced data, based on his 2017 paper.  The key idea is to train a Generative Adversarial Network (GAN) and use the generator network to produce new minority-class samples (a small sketch of this idea follows the list below).  Lastly, he discussed four opportunities and challenges:
i) Understanding the fundamental principles,
ii) A uniform benchmark platform,
iii) Standardized evaluation practices, and
iv) Emergent topics on imbalanced learning.
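As mentioned above, the key idea of the GAN-based approach is to learn the minority-class distribution and then sample new points from the generator.  Below is a small, self-contained Python/PyTorch sketch of that principle on a toy two-dimensional minority class with tiny fully connected networks; it is only an illustration of the idea, not the architecture from Prof. He's 2017 paper.

# Toy GAN that learns a 2-D minority-class distribution (illustrative only)
import torch
import torch.nn as nn

dim, z_dim = 2, 8
G = nn.Sequential(nn.Linear(z_dim, 32), nn.ReLU(), nn.Linear(32, dim))
D = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

X_min = torch.randn(200, dim) + 3.0          # stand-in for real minority samples

for step in range(500):
    # discriminator: real minority samples vs. generated samples
    z = torch.randn(64, z_dim)
    fake = G(z).detach()
    real = X_min[torch.randint(len(X_min), (64,))]
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # generator: try to fool the discriminator
    z = torch.randn(64, z_dim)
    loss_g = bce(D(G(z)), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

# After training, the generator supplies extra minority-class samples
X_new = G(torch.randn(100, z_dim)).detach()
print(X_new.shape)                           # torch.Size([100, 2])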


The second keynote speaker was Prof. Yaochu Jin (Editor-in-Chief, IEEE Transactions on Cognitive and Developmental Systems; Editor-in-Chief, Complex & Intelligent Systems; Professor in Computational Intelligence, Department of Computer Science, University of Surrey, UK), and his presentation was entitled “Data-driven Surrogate-assisted Evolutionary Optimization of Expensive Optimization Problems”.  He also briefly introduced the University of Surrey, which hosts one of the four successful research parks in the UK.


Prof. Jin first introduced optimization and problem formulation.  He said that optimization is a mathematical discipline concerned with finding the minima and maxima of functions (objectives) subject to constraints; a problem is formulated in terms of an objective function, decision variables and constraints.  He also discussed the typical industrial design flow from concept and design to verification.
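As a concrete (and entirely hypothetical) illustration of these three ingredients, the following Python snippet defines an objective function, two decision variables and one inequality constraint, and solves the problem with SciPy; it is just a toy example, not one discussed in the talk.

# Toy constrained optimization problem (my own example)
import numpy as np
from scipy.optimize import minimize

def objective(x):                   # objective: a simple quadratic to minimise
    return (x[0] - 1.0) ** 2 + (x[1] - 2.5) ** 2

constraints = [                     # constraint: x0 + x1 <= 3, written as g(x) >= 0
    {"type": "ineq", "fun": lambda x: 3.0 - x[0] - x[1]},
]
bounds = [(0, None), (0, None)]     # decision variables x0, x1 >= 0

result = minimize(objective, x0=np.array([0.0, 0.0]),
                  bounds=bounds, constraints=constraints)
print(result.x)                     # optimum near [0.75, 2.25]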


Data-Driven Evolutionary Optimization was then discussed.  He explained that evolutionary algorithms and other meta-heuristic search methods are a class of population-based, guided stochastic search heuristics inspired by biological evolution and the swarm behaviour of social animals.  Data-Driven Evolutionary Optimization moves from evolutionary computation towards machine learning, supported by Data Science.
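The following Python sketch shows, under strong simplifications of my own, how a data-driven surrogate can assist an evolutionary loop when the true objective is expensive: a Gaussian-process model is fitted to the evaluated samples, offspring are ranked by the surrogate, and only the most promising candidate is re-evaluated with the real objective.  This is a toy illustration, not Prof. Jin's algorithm.

# Surrogate-assisted evolutionary loop (toy sketch)
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def expensive_objective(x):                     # pretend this takes hours to run
    return np.sum((x - 0.3) ** 2, axis=-1)

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(20, 2))            # initial (costly) evaluations
y = expensive_objective(X)

for gen in range(10):
    surrogate = GaussianProcessRegressor().fit(X, y)   # data-driven model
    # generate offspring by mutating the current best solution
    parent = X[np.argmin(y)]
    offspring = parent + rng.normal(0, 0.2, size=(50, 2))
    # rank offspring with the cheap surrogate instead of the real objective
    promising = offspring[np.argmin(surrogate.predict(offspring))]
    # spend one real (expensive) evaluation on the most promising candidate
    X = np.vstack([X, promising])
    y = np.append(y, expensive_objective(promising))

print(X[np.argmin(y)])    # should approach the true optimum [0.3, 0.3]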


After that, Prof. Jin pointed out some challenges in data, machine learning and optimization.
Challenges in data: data resources, big data and small data.
Challenges in machine learning: data clustering/dimension reduction and sensitivity analysis, active learning, semi-supervised learning, transfer learning, dynamic and incremental learning, as well as ensemble learning.
Challenges in optimization: problem formulation, large-scale optimization, multi-/many-objective optimization, optimization in the presence of uncertainties, computational complexity and extremely highly constrained problems.


After the lunch break, I took some photos with the keynote speakers and professors from the CS Dept., CityU.
Prof. Haibo He and I


Prof. Yaochu Jin and I


Prof. Jie Lu, Dr. Jacky Wai Keung (Assistant Professor, CS Dept., CityU) and I


The third keynote speaker was Prof. Jie Lu (Editor-in-Chief, Knowledge-Based Systems; Editor-in-Chief, International Journal of Computational Intelligence Systems; Director of the Centre for Artificial Intelligence; Associate Dean (Research Excellence) and Distinguished Professor, Faculty of Engineering and IT, University of Technology Sydney, Australia), and her presentation topic was “Concept Drift”.  She covered the definition of concept drift, its challenges, and drift detection, understanding and adaptation.


At the beginning, she introduced machine learning as a computational process of discovering patterns (knowledge) from historical data.  Its overall goal is to extract useful information from data and transform it into an understandable structure for further use.


Then Prof. Lu described two scenarios of machine learning: “static data” and “stream data”.  With stream data (using a dynamic model), unpredictable variation in the data distribution may occur; this is the concept drift problem.


After that, she described the process of learning under concept drift in the following steps:
i) Training and learning
ii) Making predictions
iii) Drift detection
iv) Drift understanding
v) Drift adaptation
There are two strategies for handling concept drift.  The first, the “lazy strategy”, updates the models only once a concept drift is detected.  The second, the “active strategy”, updates the models constantly, without a standalone drift-detection step.  Lastly, Prof. Lu noted that machine learning assumes training data and testing data follow the same pattern; when this no longer holds for stream data, we need to learn under concept drift to stop making mistakes before they turn into a disaster.
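To make the lazy strategy concrete, here is a small Python sketch (my own simplified error-rate detector on synthetic stream data, not Prof. Lu's method): the model keeps predicting on the stream and is retrained only when the recent error rate rises well above the error rate seen at training time.

# Lazy-strategy concept drift handling on a toy data stream (illustrative only)
import numpy as np
from collections import deque
from sklearn.linear_model import SGDClassifier

def make_stream(n, shift):
    """Toy stream: the decision boundary moves when `shift` changes."""
    X = np.random.randn(n, 2)
    y = (X[:, 0] + shift * X[:, 1] > 0).astype(int)
    return X, y

model = SGDClassifier()
X0, y0 = make_stream(200, shift=0.0)
model.fit(X0, y0)
baseline = 1 - model.score(X0, y0)          # error rate at training time

window = deque(maxlen=100)                  # recent prediction errors
stream = [make_stream(1, shift=0.0) for _ in range(500)] + \
         [make_stream(1, shift=2.0) for _ in range(500)]   # drift half-way

for i, (x, y) in enumerate(stream):
    window.append(int(model.predict(x)[0] != y[0]))
    if len(window) == window.maxlen and np.mean(window) > baseline + 0.25:
        print(f"drift detected at step {i}; retraining")
        Xr, yr = make_stream(200, shift=2.0)    # in practice: recently buffered data
        model.fit(Xr, yr)
        window.clear()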


The last keynote speaker was Prof. Zidong Wang (Editor-in-Chief, Neurocomputing; Professor of Dynamical Systems and Computing, Department of Computer Science, Brunel University London, UK), and his presentation was “Reproducibility in Big Data Analysis: A Bad Data Perspective”.  Firstly, Prof. Wang mentioned the hottest term “ABC”: A for AI, B for Big Data and C for Cloud Computing.


Prof. Wang then described six common types of data analysis methods: classification, clustering, regression, optimization, association analysis and outlier/novelty detection.  He also discussed the reproducibility of research: over 70% of researchers have been unable to reproduce another scientist's experiments, while more than half have failed to reproduce their own.  Poor data analysis is one of the top three reasons, the other two being selective reporting and pressure to publish (Nature, 2016).


After that, Prof. Wang argued that a sixth V of Big Data is often missing: Volatility (波動性), which is caused by bad data or by complexity.  Complexity in big data includes uncertainty, randomness, nonlinearity, time delay, disturbance, fault, missing information, high dimension, fragility, mode switching, etc.


Lastly, Prof. Wang used big data from bioinformatics, namely gene expression image analysis, as an example to illustrate data quality.  He concluded that big is not always better and fast computing is not the only solution; the volatility of data analysis should not be ignored, and sometimes bad data is good for research.


A panel discussion entitled “Opportunities and Challenges of Artificial Intelligence” was arranged at the end.
Prof. Jun Wang (City University of Hong Kong) was the moderator. 
Panel members:
Prof. Haibo He (University of Rhode Island, USA)
Prof. Kay Chen Tan (City University of Hong Kong)
Prof. Yaochu Jin (University of Surrey, UK)
Prof. Jie Lu (University of Technology Sydney, Australia)
Prof. Zidong Wang (Brunel University London, UK)

Firstly, Prof. Jun Wang outlined the path from Big Data to AI through the steps: Big Data → Information → Knowledge → Wisdom ←→ Intelligence.


Then Prof. He remarked that AI is everywhere and posed four questions for discussion:
i) Weak AI (narrow AI) or strong AI (artificial general intelligence)?
ii) Deep learning: how deep is deep enough?
iii) Big data or small data?
iv) Decision-making with uncertainty

Prof. Yaochu Jin then raised some questions as follows:
i) Big Data AI vs. Small Data AI
ii) Application AI vs. Sustainable AI (deep learning is not yet a conceptual breakthrough)
iii) Powerful AI vs. Secure AI

Prof. Jie Lu also shared her views on the opportunities and challenges in AI.  She identified three areas of opportunity: technique support, industry support and social support.  She also discussed three technical challenges: labelled data, the black-box problem and working with imperfect AI.  A further set of social challenges relates to privacy, humanity, responsibility and jobs.

After that Prof. Kay Chen Tan shared his view on AI Evolution from Artificial Narrow Intelligence (ANI) to Artificial General Intelligence (AGI) and then to Artificial Super Intelligence (ASI).

Lastly, Prof. Zidong Wang summarised some challenges in big data analysis:
i) Analysis techniques fall behind
ii) Reproducibility crisis
iii) The missing 6th V (volatility)
iv) Lack of understanding of bad data
v) Multi-objective data analysis

At the end, Prof. Sam Tak Wu Kwong (Head & Professor, CS Dept., CityU) presented souvenirs to each speaker and a group photo was taken.


After the workshop, I had a chance to meet Prof. Sam Kwong.


I also met Prof. Kay Chen Tan. Hopefully, we will cooperate with HKSTP in the near future.


Reference:
Computer Science Dept., CityU - http://www.cs.cityu.edu.hk/
Wu Wenjun Artificial Intelligence Science and Technology Award (吳文俊人工智能科學技術獎) - http://www.wuwenjunkejijiang.cn/wj/index.aspx
