The summit “Computational Linguistics Summit in the Era of Large Language Models cum International Symposium on Collaborative Innovations between The Hong Kong Polytechnic University and The China Computer Federation” was organized by the COMP Dept., PolyU on 22nd & 23rd Aug 2024. The first speaker of Day 2 was Dr. Xingshan Zeng (曾幸山) (Huawei Noah’s Ark Lab), and his topic was entitled “Advancing LLM Evaluation: Comprehensive Evaluation on Long-Context, Multi-Turn, and Instruction-Following”.
Firstly, Dr. Zeng briefed Large Language Model (LLM) benchmark evaluation, and his diagram showed where evaluation fits in the LLM development process. He then introduced a long-context evaluation benchmark named “M4LE”, which covers four dimensions: multi-ability, multi-range, multi-task and multi-domain. The overall concept of multi-turn abilities covered recollection, expansion, refinement and follow-up.
Finally, Dr. Zeng summarized the existing LLM evaluation systems and introduced the M4LE, MT-Eval and FollowBench evaluation methods as enhancements.
The second speaker was Dr. Zhongqing Wang (王中卿) (Soochow University), and his presentation was entitled “Metaphor and Synesthesia Analysis via Computational Linguistic Methods”. His talk covered textual metaphor and synesthesia analysis.
A metaphor is a figure of speech that directly
compares one thing to another, emphasizing the similarities between two
different concepts without using the words "like" or "as".
He then introduced different metaphor detection methods, such as those based on Bidirectional Long Short-Term Memory (BiLSTM) and Bidirectional Encoder Representations from Transformers (BERT).
Lastly, Dr. Wang discussed the distribution of sensory words for synesthesia detection.
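In the spirit of the sensory-word analysis described above, a minimal sketch of counting the distribution of sensory modalities in a text might look as follows. The tiny lexicon here is a made-up placeholder, not the dictionary used in Dr. Wang’s work:

```python
# Toy sketch: count how often words from each sensory modality appear.
# SENSORY_LEXICON is an invented placeholder for a real sensory-word lexicon.

from collections import Counter

SENSORY_LEXICON = {
    "bright": "sight", "dark": "sight",
    "loud": "sound", "quiet": "sound",
    "soft": "touch", "rough": "touch",
    "sweet": "taste", "bitter": "taste",
    "fragrant": "smell",
}


def sensory_distribution(text: str) -> Counter:
    """Map each known sensory word to its modality and count occurrences."""
    words = text.lower().split()
    return Counter(SENSORY_LEXICON[w] for w in words if w in SENSORY_LEXICON)


# "sweet voice" is itself a synesthetic pairing (taste word modifying sound).
dist = sensory_distribution("a sweet voice in the dark quiet night")
print(dist)  # Counter({'taste': 1, 'sight': 1, 'sound': 1})
```

A real system would of course use a curated lexicon and handle morphology, but the modality distribution it produces is the kind of signal the talk described for synesthesia detection.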
The third speaker was Prof. Haofen Wang (王昊奮) (Tongji University) and
his topic was “Knowledge Retrieval Augmentation: Paradigm and Key Technologies”.
Firstly, he briefed retrieval-augmented generation (RAG) and its development history.
Modular RAG is an advanced approach that breaks down
the RAG process into distinct, interchangeable modules, allowing for more
flexibility and customization in how information is retrieved and generated.
Prof. Wang discussed its opportunities.
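The modular idea above can be sketched in a few lines: retrieval and generation sit behind small interfaces and are injected into the pipeline, so either can be swapped without touching the rest. All class names and the toy corpus below are illustrative assumptions, not from the talk:

```python
# Minimal sketch of modular RAG: swappable retriever and generator modules.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Document:
    doc_id: str
    text: str


class KeywordRetriever:
    """Toy retriever: ranks documents by keyword overlap with the query."""

    def __init__(self, corpus: List[Document]):
        self.corpus = corpus

    def retrieve(self, query: str, top_k: int = 2) -> List[Document]:
        q_terms = set(query.lower().split())
        scored = sorted(
            self.corpus,
            key=lambda d: len(q_terms & set(d.text.lower().split())),
            reverse=True,
        )
        return scored[:top_k]


def template_generator(query: str, context: List[Document]) -> str:
    """Stand-in for an LLM call: stitches retrieved context into an answer."""
    snippets = "; ".join(d.text for d in context)
    return f"Q: {query} | grounded on: {snippets}"


class RAGPipeline:
    """Modules are injected, so a dense retriever or a real LLM generator
    could replace the toy ones without changing this class."""

    def __init__(self, retriever: KeywordRetriever, generator: Callable):
        self.retriever = retriever
        self.generator = generator

    def answer(self, query: str) -> str:
        docs = self.retriever.retrieve(query)
        return self.generator(query, docs)


corpus = [
    Document("d1", "RAG combines retrieval with generation"),
    Document("d2", "Knowledge bases improve traceability"),
]
pipeline = RAGPipeline(KeywordRetriever(corpus), template_generator)
print(pipeline.answer("what does RAG combine"))
```

The flexibility Prof. Wang described comes from exactly this kind of decoupling: each module (query rewriting, retrieval, reranking, generation) can be upgraded independently.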
He also discussed a knowledge-guided approach that uses a knowledge base to steer RAG retrieval and generation, enhancing traceability and reliability.
At the end, he summarized the RAG ecosystem, its prospects, paradigms and evaluation.
The fourth speaker was Prof. Yue Zhang (張岳) (Westlake University), and his presentation was entitled “LLM-generated Text Detection”. One of the motivations was that AI writing is hard to detect. His study aimed to distinguish AI-generated content (AIGC) from human writing.
After that, he briefed some existing approaches, including trained detectors and zero-shot detectors (likelihood-based and DetectGPT).
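The two zero-shot criteria mentioned above can be illustrated with a toy sketch. Here `log_prob` stands in for a real language model’s scoring function, and every number is fabricated purely to show the decision rules, not real model output:

```python
# Toy illustration of zero-shot LLM-text detection criteria.
# `toy_log_prob` and all scores below are invented for demonstration.

import statistics
from typing import Callable, List


def likelihood_score(log_prob: Callable[[str], float], text: str) -> float:
    """Likelihood baseline: average per-token log-probability.
    Machine-generated text tends to score higher under the model."""
    tokens = text.split()
    return log_prob(text) / max(len(tokens), 1)


def detectgpt_score(log_prob: Callable[[str], float], text: str,
                    perturbations: List[str]) -> float:
    """DetectGPT-style curvature: original log-prob minus the mean
    log-prob of perturbed rewrites. Machine text tends to sit near a
    local maximum of log-probability, so this gap is typically larger."""
    perturbed_mean = statistics.mean(log_prob(p) for p in perturbations)
    return log_prob(text) - perturbed_mean


# Fabricated scores: the candidate text and two perturbed rewrites.
fake_scores = {
    "the model writes this": -10.0,
    "the model writes that": -14.0,
    "the model wrote this": -13.5,
}
toy_log_prob = lambda t: fake_scores.get(t, -20.0)

gap = detectgpt_score(
    toy_log_prob,
    "the model writes this",
    ["the model writes that", "the model wrote this"],
)
print(f"curvature gap: {gap:.2f}")  # prints "curvature gap: 3.75"
```

A large positive gap suggests machine-generated text; in practice the perturbations come from a mask-filling model rather than hand-written variants.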
Finally, he summarized key features of Machine-generated
Text Detection (MAGE).
The last speaker was Dr. Derek F. Wong (黃輝) (University of Macau), and his topic was entitled “Prefix Text as a Yarn – Eliciting Non-English Alignment in Foundation Language Model”.
In the beginning, Dr. Wong briefed the global machine translation market and identified its significance to Macau and the Portuguese-speaking countries.
Then he briefed some challenges, including morphologically rich languages and syntactic differences. After that, Dr. Wong briefed his research on Neural Machine Translation (NMT). A summary of machine translation development was discussed. Dr. Wong concluded that supervised fine-tuning (SFT) based alignment may be “superficial”.