|
Past Seminars and Events
|
| December 17, 2025 |
-
Title: Genetics of Human Longevity: Regional Association Signals and Cross-cohort Replication
Time: 02:30pm
Venue: Room 301, Run Run Shaw Building
Speaker(s): Prof. Heping Zhang
Remark(s): Abstract
Understanding the genetic basis of human longevity remains a fundamental challenge in aging research. Although previous genome-wide association studies (GWAS) have identified a few loci, most prominently APOE, their replication has often been inconsistent, and most analyses have relied primarily on single-variant testing. To better understand the genetics of human longevity, we examined genetic determinants of age at death in the UK Biobank (UKB) and conducted independent replication in the All of Us (AoU) cohort. After standard quality control and covariate adjustment, we performed single-variant GWAS and complemented these with the Regional Association Score (RAS) framework to capture cumulative regional effects. To gain functional insight, we carried out transcriptome-wide association studies (TWAS) using RNA sequencing data from Mayo Pilot brain tissue. The UKB discovery and AoU replication analyses revealed robust associations across chromosome 19 encompassing APOE, APOC1, and NECTIN2 (PVRL2), reaffirming this locus as the major genetic contributor to human lifespan. In addition, suggestive and potentially novel associations, such as those involving TRIM10 on chromosome 6, highlight new avenues for investigation. Together, these results underscore the enduring importance of the APOE region in longevity, demonstrate the value of regional association approaches alongside conventional GWAS, and provide new leads for elucidating the molecular mechanisms of human aging. This is joint work with Yiran Jiang and Yue Hu.

|
| December 16, 2025 |
-
Title: AI-Assisted System Security: from Offense to Defense
Time: 10:30am
Venue: CB 308
Speaker(s): Prof. Yan Chen
Remark(s): Abstract
The emergence of Large Language Models (LLMs) has significantly transformed many fields. In this talk, we will present our recent research on leveraging LLMs for both offensive and defensive cybersecurity applications—specifically, automated penetration testing, as well as automated vulnerability discovery and patching. Evaluations show that our systems achieve state-of-the-art performance among open-source solutions in these areas, including in the recent DARPA AI Cyber Challenge (AIxCC) competition. We believe these advancements will play a critical role in combating major cyber threats, such as Advanced Persistent Threats (APTs).
About the speaker
Yan Chen received his Ph.D. in Computer Science from the University of California at Berkeley in 2003, after which he joined Northwestern University, USA, where he became a Full Professor in 2014. His research interests are in security and measurement for networking systems. He won the DOE Early CAREER Award in 2005, the DOD (Air Force Office of Scientific Research) Young Investigator Award in 2007, and the Most Influential Paper Award of ACM ASPLOS in 2018. In 2024, he co-led the team that was selected as one of the seven finalists in the DARPA AI Cyber Challenge (AIxCC), securing a total of $3 million in funding. According to Google Scholar, his papers have been cited over 17,000 times, and his h-index is 63. He is a Fellow of IEEE.

|
| December 12, 2025 |
-
Title: AI Planning for Data Exploration
Time: 04:00pm
Venue: HW312, Haking Wong Building, HKU
Speaker(s): Prof. Sihem Amer-Yahia
Remark(s): Abstract
Data Exploration is an incremental process that helps users express what they want through a conversation with the data. Reinforcement Learning (RL) is one of the most notable approaches to automating data exploration, and several solutions have been proposed. With the advent of Large Language Models and their ability to reason sequentially, it has become legitimate to ask the question: would LLMs and, more generally, AI planning outperform a customized RL policy in data exploration? More specifically, would LLMs help circumvent retraining for new tasks and strike a balance between specificity and generality? This talk will attempt to answer this question by reviewing RL training and policy reusability for data exploration.
About the speaker
Sihem Amer-Yahia is a Silver Medal CNRS Research Director and Deputy Director of the Lab of Informatics of Grenoble. She works on exploratory data analysis and algorithmic upskilling. Prior to that, she was Principal Scientist at QCRI, Senior Scientist at Yahoo! Research, and Member of Technical Staff at AT&T Labs. Sihem served as PC chair for SIGMOD 2023 and as the coordinator of the Diversity, Equity and Inclusion initiative for the database community. In 2024, she received the IEEE TCDE Impact Award, the SIGMOD Contributions Award, and the VLDB Women in Database Award.

-
Title: Updating Quantum Beliefs
Time: 10:30am
Venue: CB 308
Speaker(s): Prof. Valerio Scarani
Remark(s): Abstract
In the classical world, upon receiving new evidence, beliefs are (well… should be) updated according to Bayes’ rule. The quantum analog of this rule is not straightforward and has been the subject of several discussions. Recently, we have found the quantum Bayes’ rule that emerges from a “minimum change principle”, that is, the idea that one should update one’s beliefs in the least disruptive way. In many cases, it coincides with the Petz map, one of the most popular previously proposed candidates; in some situations, it is a map that nobody had anticipated. I shall also discuss how the quantum belief should not be restricted to the state, but comprises any information we may have about its preparation. Based on this observation, we have been able to unify various proposals for the task of “smoothing” in a single framework.
About the speaker
Valerio Scarani is Principal Investigator (currently serving as Deputy Director) at the Centre for Quantum Technologies, and Professor at the Department of Physics, National University of Singapore. His research is in theoretical quantum information science, with works in quantum cryptography, Bell nonlocality and other topics.

|
| December 11, 2025 |
-
Title: Data Science and AI for Remote Sensing
Time: 03:00pm
Venue: CB 308
Speaker(s): Departmental Seminar by Prof. Peng Gong, The University of Hong Kong
Remark(s): Abstract
Satellite remote sensing is an exponentially expanding field, with data volumes at the petabyte (PB) level. In the past 40 years, it has evolved from partially covering the Earth's surface at 100–30 meter resolution to covering every corner of the Earth at 30 cm resolution. The repeat frequency is improving, and we may eventually reach a point where Earth surface data are collected continuously at submeter resolution. Data acquisition is racing ahead, while data processing and information extraction lag several blocks behind. How should we fill the gap? Environmental scholars learning from the computer science community is clearly not enough. We need better pattern recognition and machine learning technologies to make better use of the explosion of Earth observation data. Will computer scientists, particularly data scientists and AI researchers, join forces to tackle these problems? I propose the concept of iEarth, calling on data scientists and AI researchers to join hands with environmental scientists to tackle today's grand environmental challenges facing human society: food insecurity, disaster early warning and prevention, water and energy shortages, global health, climate change, etc.
About the speaker
Professor Peng Gong is the Vice-President and Pro-Vice-Chancellor (Academic Development) at The University of Hong Kong, where he also serves as Chair Professor of Global Sustainability in the Departments of Geography and Earth & Planetary Sciences (since 2021). He holds a BS and MS from Nanjing University and a PhD from the University of Waterloo. His academic career spans York University, the University of Calgary, and UC Berkeley, where he became a full professor in 2001. He later founded Tsinghua University’s Department of Earth System Science (2016) and served as Dean of Science (2017).
In addition to being a Foreign Member of the Academy of Europe, he was the Founding Editor-in-Chief of Geographic Information Sciences (now Annals of GIS). He advises Future Earth as well as the Earth Commission, and co-chairs the Lancet Climate Change and Health Commission and Countdown 2030. An interdisciplinary leader, he co-founded the Center for Assessment and Monitoring of Forest and Environmental Resources at UC Berkeley and established key Chinese institutions, including the first Earth System Science Institute in China at Nanjing University.
His research spans urbanization and health, environmental change monitoring, and infectious disease modelling. He received research awards from the American Society for Photogrammetry and Remote Sensing, the Association of American Geographers and the Joint Board Council of Science China and Science Bulletin. Over 30 of his former PhD students now hold faculty positions at top universities worldwide.

|
| December 09, 2025 |
-
Title: Genetic and Epigenetic Landscape of Self-Identified Hispanics in All of Us
Time: 11:30am
Venue: CB 308
Speaker(s): Dr. Fritz Sedlazeck
Remark(s): Abstract
Hispanic populations in the United States are highly admixed and genetically diverse, yet remain underrepresented in genomic studies. To address this, we present the first large-scale long-read sequencing analysis of 1,490 self-reported Hispanic individuals from the All of Us Research Program, capturing small variants, structural variants, tandem repeats (TRs), and CpG methylation. We characterize global and local ancestry across the cohort, enabling ancestry-aware analysis of genetic and epigenetic features. Over 10.3 million previously unknown autosomal variants are identified, including medically relevant alleles stratified by local ancestry and pathogenic risk, revealing 402 carriers with potential risk for subsequent generations. We discover 135 individuals with TR alleles exceeding established pathogenic ranges, and conduct the first genome-wide TR-mQTL analysis, identifying 3,329 TR alleles associated with methylation. Allele-specific methylation (ASM) is resolved at >12,000 loci per genome, and 24 novel recurrent ASM loci are identified. This includes ancestry-specific regulatory activity, such as activation of paralogous genes driven by ancestry-enriched variants and epigenetic markers. These findings establish a foundational resource for biomedical research and highlight the critical role of ancestry-aware analyses in understanding gene regulation, disease risk, and personalized medicine.
About the speaker
Dr. Fritz Sedlazeck is an Associate Professor at the Human Genome Sequencing Center at Baylor College of Medicine and an Adjunct Associate Professor at Rice University. His research focuses on algorithmic developments and high-performance computing for genomic and genetic applications. Specifically, he studies ways to improve the characterization of complex genomic alterations between individuals’ genomes based on large genomic sequencing data and as such improve our understanding of complex phenotypes such as human diseases.

|
| December 08, 2025 |
-
Title: Fighting Noise with Noise: Causal Inference with Many Candidate Instruments
Time: 04:00pm
Venue: Room 301, Run Run Shaw Building
Speaker(s): Prof. Linbo Wang
Remark(s): Abstract
Instrumental variable methods provide useful tools for inferring causal effects in the presence of unmeasured confounding. To apply these methods with large-scale data sets, a major challenge is to find valid instruments from a possibly large candidate set. In practice, most of the candidate instruments are often not relevant for studying a particular exposure of interest. Moreover, not all relevant candidate instruments are valid, as they may directly influence the outcome of interest. In this article, we propose a data-driven method for causal inference with many candidate instruments that addresses these two challenges simultaneously. A key component of our proposal involves using pseudo variables, known to be irrelevant, to remove variables from the original set that exhibit spurious correlations with the exposure. Synthetic data analyses show that the proposed method performs favourably compared to existing methods. We apply our method to a Mendelian randomization study estimating the effect of obesity on health-related quality of life.
About the speaker
Linbo Wang is an associate professor at the University of Toronto, Canada, where he holds a joint appointment in the statistics, mathematics, and computer science departments. His research interests are in causal inference and graphical models. He is currently a Canada Research Chair in Causal Machine Learning.

|
| December 01, 2025 |
-
Title: The Three Faces of Networking
Time: 10:30am
Venue: CB 308
Speaker(s): Prof. Ang Chen
Remark(s): Abstract
What does a network connect: people? machines? infrastructures? All three are true, but one tends to dominate at any given time in response to societal needs. Telephony networks interconnected people, but the Internet recast communication as connecting machines. This was a subtle yet profound shift: machines have different failures, misbehaviors, and performance goals, which translated into the network design and still define much of our problem space today. As of late, however, another quiet change is playing out which demands a rethinking of networks. Societal infrastructures, such as power grids, water systems, and datacenters, are increasingly interdependent, but historically they were never designed for coordinated operation. Networking at the "infrastructure nexus" is becoming a pressing need, bringing with it a fresh source of research problems.
About the speaker
Ang Chen is an Associate Professor in Computer Science and Engineering at the University of Michigan, Ann Arbor. Prior to this, he received his PhD at the University of Pennsylvania, and was a faculty member at Rice University. His research interests are in computer systems, networking, and security. He has received an NSF CAREER Award, a VMware Early Career Faculty Grant, Best/Distinguished Paper Awards at FAST, APNet, and USENIX Security, and the ACM SIGCOMM Rising Star Award.

|
| November 28, 2025 |
-
Title: Mortality Surface Modeling With Gaussian Processes
Time: 11:00am
Venue: Room 301, Run Run Shaw Building
Speaker(s): Prof. Mike Ludkovski
Remark(s): Abstract
I will discuss several interrelated projects on the use of Gaussian Process (GP) models for longevity analysis. The underlying Age-Period-Cohort structure is well suited to being captured by a GP in order to address the common actuarial tasks of nowcasting the latest mortality rates and probabilistically projecting them into the future. I will review the GP spatial covariance framework in the context of mortality surfaces and the key steps of kernel and prior mean selection, improvement factor computation, and posterior sampling. Among the various GP implementations we have developed, I will highlight: (i) multi-output GPs for joint analysis of several dozen populations, hierarchically arranged along nationalities, genders and causes-of-death; (ii) compositional GP kernel search to identify the fittest kernels matching the spatio-temporal mortality dynamics in different countries; (iii) deflator GP models to capture the relative mortality of a small pension fund population vis-a-vis a national mortality table. Plentiful illustrations using Human Mortality Database datasets and corresponding insights into evolving mortality patterns will be given. Co-authors include Nhan Huynh, Jimmy Risk, Rodrigo Targino and Eduardo de Melo.

-
Title: Understanding LLM Behaviors via Compression: Data Generation, Knowledge Acquisition and Scaling Laws
Time: 10:30am
Venue: CB 308
Speaker(s): Prof. Jian Li
Remark(s): Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities across numerous tasks, yet principled explanations for their underlying mechanisms and several phenomena, such as scaling laws, hallucinations, and related behaviors, remain elusive. In this work, we revisit the classical relationship between compression and prediction, grounded in Kolmogorov complexity and Shannon information theory, to provide deeper insights into LLM behaviors. By leveraging the Kolmogorov Structure Function and interpreting LLM compression as a two-part coding process, we offer a detailed view of how LLMs acquire and store information across increasing model and data scales, from pervasive syntactic patterns to progressively rarer knowledge elements. Motivated by this theoretical perspective and natural assumptions inspired by Heaps’ and Zipf’s laws, we introduce a simplified yet representative hierarchical data-generation framework called the Syntax-Knowledge model. Under the Bayesian setting, we show that prediction and compression within this model naturally lead to diverse learning and scaling behaviors of LLMs. In particular, our theoretical analysis offers intuitive and principled explanations for both data and model scaling laws, the dynamics of knowledge acquisition during training and fine-tuning, and factual knowledge hallucinations in LLMs. The experimental results validate our theoretical predictions.
About the speaker
Jian Li is a professor at the Institute for Interdisciplinary Information Sciences, Tsinghua University. His research focuses on theoretical computer science, artificial intelligence, FinTech and databases. He has published over 100 papers in major international conferences and journals. His work has received the Best Paper Award at the VLDB conference and the European Symposium on Algorithms (ESA), as well as the Best Newcomer Award at the International Conference on Database Theory (ICDT). Several of his papers have been selected for oral presentations or highlighted as spotlight papers. He has led several research projects, including those funded by NSFC and industry projects with companies such as Baidu, Ant Group, ByteDance, E-Fund Management, and Huatai Securities.

|