Special Sessions

Special Session Titles and Organizers

1. Emerging Technologies and Applications of Computer Vision and Multimodal AI
  • Prof. Chul Lee - Dongguk University, Korea (chullee@dongguk.edu)
  • Prof. Je-Won Kang - Ewha Womans University, Korea (jewonk@ewha.ac.kr)
2. AI-Driven Non-Invasive Health Monitoring Using Radar and Wearable Sensors
  • Dr. Hoang Thi Yen - Le Quy Don Technical University, Hanoi, Vietnam (yenht@lqdtu.edu.vn)
  • Assoc. Prof. Guanghao Sun - The University of Electro-Communications, Tokyo, Japan (Guanghao.Sun@uec.ac.jp)
3. Generative and Multimodal Signal Processing for Vision and Multimedia Systems
  • Prof. Ching-Chun Huang - National Yang Ming Chiao Tung University, Taiwan (chingchun@nycu.edu.tw)
  • Assoc. Prof. Yu-Lun Liu - National Yang Ming Chiao Tung University, Taiwan (yulunliu@cs.nycu.edu.tw)
  • Assoc. Prof. Manh-Hung Nguyen - University of Technology and Engineering, Vietnam (hungnm@hcmute.vn)
4. High Performance Image and Video Processing and Applications
  • Prof. Kosin Chamnongthai - King Mongkut's University of Technology Thonburi, Thailand (kosin.cha@kmutt.ac.th)
5. Intelligent Signal Processing and Resource Optimization for Integrated Sensing and Communications
  • Prof. Kouji Hirata - Kansai University, Japan (hirata@kansai-u.ac.jp)
6. Recent Advances in Enriched Multimedia: Security, Forensics, and Privacy in the Generative AI Era
  • Assoc. Prof. Michiharu Niimi - Kyushu Institute of Technology, Japan (niimi@ai.kyutech.ac.jp)
  • Prof. Masaki Kawamura - Yamaguchi University, Japan (kawamura@sci.yamaguchi-u.ac.jp)
7. Acoustic Scene Analysis and Signal Enhancement Based on Advanced Signal Processing and Machine Learning
  • Prof. Shoji Makino - Waseda University, Japan (s.makino@waseda.jp)
  • Prof. Kouei Yamaoka - The University of Tokyo, Japan (kouei_yamaoka@ipc.i.u-tokyo.ac.jp)
8. Recent Advances in Music Processing
  • Prof. Tetsuro Kitahara - Nihon University, Japan (kitahara.tetsuro@nihon-u.ac.jp)
  • Assoc. Prof. Eita Nakamura - Kyushu University, Japan (nakamura@inf.kyushu-u.ac.jp)
9. Model Design and Practical Applications of Generative AI
  • Assoc. Prof. Thanh-Hai Tran - Hanoi University of Science and Technology, Vietnam (hai.tranthithanh1@hust.edu.vn)
  • Assoc. Prof. Hai Vu - Hanoi University of Science and Technology, Vietnam (hai.vu@hust.edu.vn)
  • Assoc. Prof. Cheng-Kai Lu - National Taiwan Normal University, Taiwan (cklu@ntnu.edu.tw)
  • Prof. Li-Wei Kang - National Taiwan Normal University, Taiwan (lwkang@ntnu.edu.tw)
10. Reconstruction meets Generation: 2D, 3D, 4D and Beyond
  • Prof. Yuchao Dai - Northwestern Polytechnical University, China (daiyuchao@nwpu.edu.cn)
11. Hardware Accelerators for AI-Driven and Real-Time Signal Processing Systems
  • Prof. Trio Adiono - Bandung Institute of Technology, Indonesia (tadiono@gmail.com)
12. Towards Explainable and Interpretable AI in Medicine
  • Dr. Tran Hiep Dinh - VNU University of Engineering and Technology, Hanoi, Vietnam (tranhiep.dinh@vnu.edu.vn)
  • Assoc. Prof. Thi-Thao Tran - Hanoi University of Science and Technology, Hanoi, Vietnam (thao.tranthi@hust.edu.vn)
13. Speech Processing and Health
  • Prof. Ian McLoughlin - Singapore Institute of Technology, Singapore (ian.mcloughlin@singaporetech.edu.sg)
  • Prof. Alessandro Vinciarelli - The University of Glasgow, Scotland, United Kingdom (Alessandro.Vinciarelli@glasgow.ac.uk)
  • Prof. Chng Eng Siong - Nanyang Technological University, Singapore (ASESChng@ntu.edu.sg)
  • Assoc. Prof. Yan Song - The University of Science and Technology of China, China (songy@ustc.edu.cn)
  • Prof. Hamid Sharifzadeh - UNITEC, Auckland, New Zealand (hsharifzadeh@unitec.ac.nz)
  • Assoc. Prof. Lee Kong Aik - Hong Kong Polytechnic University, Hong Kong, China (kong-aik.lee@polyu.edu.hk)
  • Asst. Prof. Tong Rong - Singapore Institute of Technology, Singapore (tong.rong@singaporetech.edu.sg)
  • Dr. Pham Lam - Austrian Institute of Technology, Austria (Lam.Pham@ait.ac.at)
  • Dr. Akshita Abrol - Singapore Institute of Technology, Singapore (akshita.abrol@singaporetech.edu.sg)
  • Prof. Tomoki Toda - Information Technology Center, Nagoya University, Japan (tomoki@icts.nagoya-u.ac.jp)
14. Advanced Topics in Audio Understanding of Sound Events, Scenes, and Beyond
  • Prof. Nobutaka Ono - Tokyo Metropolitan University, Japan (onono@tmu.ac.jp)
  • Dr. Keisuke Imoto - Doshisha University, Japan (keisuke.imoto@ieee.org)
  • Dr. Tatsuya Komatsu - LY Corporation, Japan (komatsu.tatsuya@lycorp.co.jp)
15. Advanced Technologies for ISAC in 6G and Beyond Networks
  • Assoc. Prof. Dinh Thi Thai Mai - VNU University of Engineering and Technology, Hanoi, Vietnam (dttmai@vnu.edu.vn)
  • Assoc. Prof. Hoang Trong Minh - Posts and Telecommunications Institute of Technology, Hanoi, Vietnam (hoangtrongminh@ptit.edu.vn)
16. Multimodal Deepfake Detection and Copyright-Aware Analysis
  • Prof. Sanghoon Lee - Yonsei University, South Korea (slee@yonsei.ac.kr)
  • Dr. Jaesang Hyun - Yonsei University, South Korea (jaesang.hyun@yonsei.ac.kr)
  • Dr. Jongyoo Kim - Yonsei University, South Korea (jy.kim@yonsei.ac.kr)
  • Prof. Jianfei Cai - Monash University, Australia (Jianfei.Cai@monash.edu)
  • Prof. Ping An - Shanghai University, China (anping@shu.edu.cn)
  • Prof. Wen-Huang Cheng - National Taiwan University, Taiwan (wenhuang@csie.ntu.edu.tw)
  • Prof. Weisi Lin - Nanyang Technological University, Singapore (wslin@ntu.edu.sg)
17. Intelligent Radar Signal Processing and Adaptive Sensing
  • Prof. Bo Chen - Xidian University, China (bchen@mail.xidian.edu.cn)
  • Assoc. Prof. Hui Ma - Xidian University, China (h.ma@xidian.edu.cn)
  • Assoc. Prof. Zhihui Xin - Guangzhou University, China (Xinzhihui.lunkcy@163.com)

Emerging Technologies and Applications of Computer Vision and Multimodal AI

Organizers

Prof. Chul Lee - Dongguk University, Korea (chullee@dongguk.edu)
Prof. Je-Won Kang - Ewha Womans University, Korea (jewonk@ewha.ac.kr)

Introduction

This special session focuses on emerging technologies and applications in computer vision and multimodal artificial intelligence. The session highlights recent advances in visual understanding and image processing enabled by deep learning and large-scale multimodal data. Topics of interest include image segmentation, image completion, compressed video processing, depth estimation, image denoising, and multimodal learning approaches that integrate visual information with other modalities such as text or sensor data. This session aims to provide a forum for presenting cutting-edge research and for discussing emerging trends, methodologies, and practical applications in computer vision and multimodal AI.

Important Dates

Submission of Full Papers: May 15, 2026
Notification of Paper Acceptance: July 15, 2026
Camera-Ready Paper Submission: July 31, 2026

Submission Guidelines

Please follow the general Call for Papers formatting guidelines for submission details.

AI-Driven Non-Invasive Health Monitoring Using Radar and Wearable Sensors

Organizers

Dr. Hoang Thi Yen - Le Quy Don Technical University, Hanoi, Vietnam (yenht@lqdtu.edu.vn)
Assoc. Prof. Guanghao Sun - The University of Electro-Communications, Tokyo, Japan (Guanghao.Sun@uec.ac.jp)

Introduction

Recent advances in artificial intelligence (AI), radar sensing technologies, and wearable biomedical devices are transforming the landscape of non-invasive health monitoring. Contactless radar systems and wearable sensors such as PPG, ECG, and inertial devices enable continuous and unobtrusive acquisition of physiological signals, supporting real-time monitoring of cardiovascular and respiratory functions. These technologies are particularly promising for telemedicine, remote patient monitoring, military healthcare, elderly care, and smart hospital environments.

Despite rapid technological progress, several challenges remain, including signal quality under motion artifacts, multimodal data fusion, domain generalization, privacy-preserving edge computing, and clinical validation. AI-driven approaches, especially deep learning, transformer architectures, and self-supervised learning, have demonstrated strong potential in addressing these issues by improving signal interpretation, robustness, and prediction accuracy.

This Special Session aims to bring together researchers working on radar-based vital sign monitoring, wearable sensing systems, AI-based physiological signal processing, and IoT-enabled healthcare platforms. The session will provide a focused forum to discuss emerging methodologies, system integration strategies, clinical applications, and translational research challenges. By fostering interdisciplinary collaboration among engineers, computer scientists, and healthcare researchers, the session will contribute to accelerating the deployment of intelligent, non-invasive health monitoring systems in real-world applications.

Important Dates

Submission of Full Papers: May 15, 2026
Notification of Paper Acceptance: July 15, 2026
Camera-Ready Paper Submission: July 31, 2026

Submission Guidelines

Please follow the general Call for Papers formatting guidelines for submission details.

Generative and Multimodal Signal Processing for Vision and Multimedia Systems

Organizers

Prof. Ching-Chun Huang - National Yang Ming Chiao Tung University, Taiwan (chingchun@nycu.edu.tw)
Assoc. Prof. Yu-Lun Liu - National Yang Ming Chiao Tung University, Taiwan (yulunliu@cs.nycu.edu.tw)
Assoc. Prof. Manh-Hung Nguyen - University of Technology and Engineering, Vietnam (hungnm@hcmute.vn)

Introduction

Recent advances in generative modeling and multimodal learning are reshaping signal processing for vision and multimedia systems. Modern pipelines integrate images, video, audio, text, depth, and 3D/4D sensor signals (e.g., LiDAR/IMU/radar) to enable robust perception, restoration, understanding, and content generation under real-world constraints such as noise, blur, compression, low-light, missing modalities, and domain shifts. While diffusion models, transformers, flow-based models, and implicit neural representations have advanced controllable synthesis and inverse problems, key challenges remain in reliability, efficiency, interpretability, and deployment.

This special session aims to bring together researchers in signal processing, computer vision, and multimedia to explore generative and multimodal signal processing. We encourage contributions grounded in signal-processing principles (e.g., priors, sampling, spectral/time-frequency analysis, compression, uncertainty modeling, and efficiency) and end-to-end multimodal system design for real-world applications.

Topics of Interest

Topics of interest include (but are not limited to):

Generative approaches for inverse problems (super-resolution, deblurring, denoising, inpainting, compressed sensing)
Multimodal generative modeling and cross-modal translation (e.g., text–image/video, audio-driven generation, 2D-to-3D/4D)
Multimodal fusion with generative priors (RGB–IR, RGB–depth, LiDAR–camera, radar–camera) and uncertainty-aware fusion
Signal-aware architectures (frequency/time-frequency-aware diffusion, wavelet/FFT-domain models, physics-informed generative modeling)
Efficient and deployable systems (distillation, quantization, low-latency inference, edge deployment)
Trustworthy generative systems (hallucination control, robustness, watermarking, detection/forensics)
Evaluation protocols and real-world applications (autonomous systems, AR/VR/XR, communication, healthcare, remote sensing)

Important Dates

Submission of Full Papers: May 15, 2026
Notification of Paper Acceptance: July 15, 2026
Camera-Ready Paper Submission: July 31, 2026

Submission Guidelines

Please follow the general Call for Papers formatting guidelines for submission details.

High Performance Image and Video Processing and Applications

Organizers

Prof. Kosin Chamnongthai - King Mongkut's University of Technology Thonburi, Thailand (kosin.cha@kmutt.ac.th)

Introduction

Image and video processing plays an important role as a foundational step for information and knowledge processing. Research at the information and knowledge level fundamentally depends on many tools and modules built on image and video processing. Although research in information and knowledge processing has been pursued actively, image and video processing must continue to improve in performance to enhance the quality of information and knowledge processing systems. This special session collects cases of high-performance image and video processing and their applications at the pixel and frame levels.

Important Dates

Submission of Full Papers: May 15, 2026
Notification of Paper Acceptance: July 15, 2026
Camera-Ready Paper Submission: July 31, 2026

Submission Guidelines

Please follow the general Call for Papers formatting guidelines for submission details.

Intelligent Signal Processing and Resource Optimization for Integrated Sensing and Communications

Organizers

Prof. Kouji Hirata - Kansai University, Japan (hirata@kansai-u.ac.jp)

Introduction

Recent advances in signal processing, artificial intelligence, and wireless technologies are accelerating the convergence of sensing and communication systems. In next-generation networks, such as beyond-5G and 6G, integrated sensing and communications is expected to play a key role in enabling context-aware, data-driven, and resource-efficient services for smart cities, intelligent transportation systems, and cyber-physical infrastructures. Achieving this vision requires not only advanced signal processing techniques but also intelligent resource optimization strategies that can dynamically adapt to complex and heterogeneous environments.

This special session focuses on intelligent signal processing and resource management frameworks for integrated sensing and communication systems. Topics of interest include, but are not limited to, joint sensing–communication waveform design, AI-assisted signal processing, adaptive spectrum and power allocation, cooperative sensing and networking, and edge-enabled signal processing architectures. Both theoretical developments and practical system implementations are welcomed. By bringing together recent research from academia and industry, this session aims to provide insights into how intelligent signal processing can enhance sensing accuracy, communication efficiency, and overall system performance in future wireless networks.

Important Dates

Submission of Full Papers: May 15, 2026
Notification of Paper Acceptance: July 15, 2026
Camera-Ready Paper Submission: July 31, 2026

Submission Guidelines

Please follow the general Call for Papers formatting guidelines for submission details.

Recent Advances in Enriched Multimedia: Security, Forensics, and Privacy in the Generative AI Era

Organizers

Assoc. Prof. Michiharu Niimi - Kyushu Institute of Technology, Japan (niimi@ai.kyutech.ac.jp)
Prof. Masaki Kawamura - Yamaguchi University, Japan (kawamura@sci.yamaguchi-u.ac.jp)

Introduction

The concept of Enriched Multimedia focuses on enhancing the value and security of digital content through advanced signal processing and data hiding techniques. While the rapid development of Generative AI (GenAI) has provided powerful tools for content creation and enrichment, it has also introduced new threats, such as sophisticated deepfakes and unauthorized use of digital assets. These challenges necessitate a comprehensive approach to content protection that integrates both time-tested methodologies and cutting-edge innovations.

This special session aims to provide a forum for researchers to discuss the latest developments in safeguarding multimedia integrity. We emphasize a balanced perspective, welcoming both traditional signal processing-based methods (e.g., classical watermarking and steganography) and modern AI-driven solutions (e.g., neural forensics and adversarial defense). By fostering a dialogue between conventional expertise and emerging AI technologies, this session seeks to establish robust frameworks for the security, forensics, and privacy of "enriched" multimedia content.

Topics of Interest

This session will address the following key topics:

Conventional and AI-Based Watermarking: Robust data hiding and fingerprinting for copyright protection and traceability of diverse media types
Multimedia Forensics and Deepfake Detection: Signal-based and learning-based forensic analysis to identify synthetic media and verify content provenance
Security for and by Generative AI: Countermeasures against AI-driven content misuse and technologies to protect intellectual property in the era of foundation models
Privacy-Preserving Multimedia Processing: Traditional anonymization techniques alongside modern privacy-preserving computation (e.g., differential privacy) for media data
Content Enhancement with Integrated Security: Methods that combine quality enhancement (e.g., super-resolution) with inherent security features

The goal of this session is to deepen the understanding of challenges surrounding digital content by bridging the gap between traditional signal processing and modern AI. Through this exchange, we aim to contribute to the realization of a trustworthy and secure digital society that values both innovation and fundamental technical excellence.

Important Dates

Submission of Full Papers: May 15, 2026
Notification of Paper Acceptance: July 15, 2026
Camera-Ready Paper Submission: July 31, 2026

Submission Guidelines

Please follow the general Call for Papers formatting guidelines for submission details.

Acoustic Scene Analysis and Signal Enhancement Based on Advanced Signal Processing and Machine Learning

Organizers

Prof. Shoji Makino - Waseda University, Japan (s.makino@waseda.jp)
Prof. Kouei Yamaoka - The University of Tokyo, Japan (kouei_yamaoka@ipc.i.u-tokyo.ac.jp)

Introduction

We are surrounded by sounds in our daily lives. To enrich these acoustic experiences, including but not limited to conversation, music, hands-free speech communication, and automatic life logging, advancements in acoustic signal processing and machine learning play a vital role. This research area encompasses a wide range of topics, such as source separation, signal enhancement, source localization, system identification, sound field reproduction, and acoustic scene analysis. Both fundamental and applied studies in these areas contribute to the development of next-generation acoustic systems, which are essential for enabling future communication through human-machine and human-human interfaces.

This special session focuses on acoustic scene analysis and signal enhancement based on advanced signal processing and machine learning techniques. It aims to bring together researchers working on these closely related topics, fostering discussion across complementary research areas and promoting the integration of diverse perspectives and opportunities for future collaboration. By providing this dedicated forum for presentation and technical exchange, this session will accelerate innovation and strengthen the research community in this rapidly evolving field.

Important Dates

Submission of Full Papers: May 15, 2026
Notification of Paper Acceptance: July 15, 2026
Camera-Ready Paper Submission: July 31, 2026

Submission Guidelines

Please follow the general Call for Papers formatting guidelines for submission details.

Recent Advances in Music Processing

Organizers

Prof. Tetsuro Kitahara - Nihon University, Japan (kitahara.tetsuro@nihon-u.ac.jp)
Assoc. Prof. Eita Nakamura - Kyushu University, Japan (nakamura@inf.kyushu-u.ac.jp)

Introduction

Music has long been a major application area of signal processing and information processing, yet past APSIPA conferences have had few dedicated sessions on music processing research. Last year, we organized the first special session on music processing, in which 12 papers were presented. Building on this success, we would like to organize a special session on music processing research at APSIPA ASC 2026. This session aims to bring together researchers working on diverse aspects of music-related research, including but not limited to music signal processing, computational musicology, deep learning for music, and interactive music systems. Through this session, we hope to provide a platform for researchers to exchange ideas, identify key research gaps, and inspire future collaborations that push the boundaries of music-related research within the APSIPA community.

Important Dates

Submission of Full Papers: May 15, 2026
Notification of Paper Acceptance: July 15, 2026
Camera-Ready Paper Submission: July 31, 2026

Submission Guidelines

Please follow the general Call for Papers formatting guidelines for submission details.

Model Design and Practical Applications of Generative AI

Organizers

Assoc. Prof. Thanh-Hai Tran - Hanoi University of Science and Technology, Vietnam (hai.tranthithanh1@hust.edu.vn)
Assoc. Prof. Hai Vu - Hanoi University of Science and Technology, Vietnam (hai.vu@hust.edu.vn)
Assoc. Prof. Cheng-Kai Lu - National Taiwan Normal University, Taiwan (cklu@ntnu.edu.tw)
Prof. Li-Wei Kang - National Taiwan Normal University, Taiwan (lwkang@ntnu.edu.tw)

Introduction

The rapid evolution of generative artificial intelligence (AI) has profoundly transformed the landscape of visual computing and multimedia applications. Modern generative models have demonstrated exceptional capabilities in image and video synthesis, enhancement, restoration, super-resolution, medical imaging, data augmentation, and scene understanding, among many others. This special session aims to bring together researchers, practitioners, and industry experts to discuss the latest advancements in generative AI model design and its practical implementations in vision-related domains. The session will serve as a platform for exchanging innovative ideas, sharing experimental results, and addressing key challenges in deploying generative AI systems for real-world applications. We particularly welcome contributions focusing on the design, optimization, robustness, and interpretability of generative models, as well as their integration into practical multimedia and computer vision tasks.

Topics of Interest

Advanced architectures for generative AI in vision and multimedia
Federated and distributed learning for generative AI and its applications
GANs, VAEs, Transformers, and diffusion models for image/video generation, enhancement, and restoration
Applications of generative AI in medical imaging, remote sensing, and autonomous driving
Synthetic data generation and augmentation using generative models
Robustness, trustworthiness, and security in generative AI systems
Benchmarking methodologies and evaluation metrics for generative AI models

Important Dates

Submission of Full Papers: May 15, 2026
Notification of Paper Acceptance: July 15, 2026
Camera-Ready Paper Submission: July 31, 2026

Submission Guidelines

Please follow the general Call for Papers formatting guidelines for submission details.

Reconstruction meets Generation: 2D, 3D, 4D and Beyond

Organizers

Prof. Yuchao Dai - Northwestern Polytechnical University, China (daiyuchao@nwpu.edu.cn)

Introduction

Overview: The boundaries between geometric reconstruction and generative modeling are rapidly dissolving. While traditional 3D/4D reconstruction excels at deterministic fidelity, it often fails in sparse-data or "unseen" scenarios. Conversely, Generative AI offers powerful visual priors but frequently lacks physical grounding and temporal consistency. This session explores the convergence of these paradigms to create Physically Grounded World Models. By bridging "seeing" (reconstruction) and "imagining" (generation), we aim to advance spatiotemporal intelligence for Embodied AI and autonomous systems.

Motivation: Bridging Fidelity and Imagination

Traditional perception is limited by captured data. We seek to transcend these limits through action-grounded reasoning and generative priors. This session fosters a new class of models that utilize the "physics engine" of 3D/4D reconstruction to anchor generative "hallucinations" in reality, ensuring results are both visually stunning and physically plausible.

Timeliness: The Paradigm Shift to Explicit 4D Models

At this 2026 tipping point, the field is shifting from implicit representations to explicit, editable primitives like 3D Gaussian Splatting (3DGS). With breakthroughs in transferring generative pretraining to wide-baseline or unposed environments and the rise of video foundation models, there is an urgent mandate to unify these technologies for robust deployment.

Topics of Interest

Methodological Synergy: Bi-directional frameworks where generative priors guide reconstruction in extreme environments.
Explicit 4D Representations: Innovations in 3DGS and Neural Rendering for dynamic scene modeling.
Robust Sensing: Perception via Event-based Vision, LiDAR fusion, and uncertainty-aware modeling.
Controllable Generation: Physics-aware 3D/video generation anchored in 4D space-time.
Strategic Applications: Deployment in aerospace vision, autonomous mining, and robotics.

Important Dates

Submission of Full Papers: May 15, 2026
Notification of Paper Acceptance: July 15, 2026
Camera-Ready Paper Submission: July 31, 2026

Submission Guidelines

Please follow the general Call for Papers formatting guidelines for submission details.

Hardware Accelerators for AI-Driven and Real-Time Signal Processing Systems

Organizers

Prof. Trio Adiono - Bandung Institute of Technology, Indonesia (tadiono@gmail.com)

Introduction

Field-Programmable Gate Arrays (FPGAs) and very-large-scale integration (VLSI) circuits have proven to be practical platforms for deploying real-time signal processing with low latency and low power consumption.

As edge computing and the Internet of Things (IoT) continue to grow in popularity, the demand for AI-driven signal processing models is also increasing. Furthermore, hardware-software co-design for application-specific SoC architectures has become a standard practice among hardware engineers developing products that require edge AI capabilities. These trends have positioned hardware-based accelerator design at the forefront of signal processing and AI research.

This special session aims to challenge researchers to develop hardware accelerators for AI and signal processing applications. Researchers are encouraged to propose novel hardware accelerator designs, including pipelined and folded filter architectures for signal processing applications. In the area of AI research, the challenge lies in mapping complex neural network algorithms onto efficient hardware implementations without sacrificing computational accuracy or increasing power consumption.

Through this special session, it is expected that researchers will be inspired to develop robust AI and signal processing hardware accelerators. This session also aims to demonstrate hardware-software co-design as a systematic methodology for building real-time systems, while highlighting the Asia-Pacific community's growing role in advancing efficient embedded AI hardware.

Important Dates

Submission of Full Papers: May 15, 2026
Notification of Paper Acceptance: July 15, 2026
Camera-Ready Paper Submission: July 31, 2026

Submission Guidelines

Please follow the general Call for Papers formatting guidelines for submission details.

Towards Explainable and Interpretable AI in Medicine

Organizers

Dr. Tran Hiep Dinh - VNU University of Engineering and Technology, Hanoi, Vietnam (tranhiep.dinh@vnu.edu.vn)
Assoc. Prof. Thi-Thao Tran - Hanoi University of Science and Technology, Hanoi, Vietnam (thao.tranthi@hust.edu.vn)

Introduction

Advances in artificial intelligence (AI) have been applied across many research areas, although the depth of adoption varies. While the widespread use of AI in a versatile domain such as computer vision owes much to the availability of large-scale datasets and the field's general-purpose nature, its adoption in medicine remains limited, despite recent growth in niche areas such as medical imaging. Possible reasons for this limitation include the lack of sufficient training data, a low tolerance for error in safety-critical settings, and, more importantly, the limited interpretability of AI models' decision-making mechanisms for clinical implementation.

This challenge, however, is a great opportunity for interdisciplinary collaborations, where AI scientists and clinicians can work together to solve the "black box" problem and develop AI models with an explainable rationale. This special session aims to bring together researchers around the world to discuss the vision, challenges, and key approaches towards explainable AI in medicine, covering but not limited to the following topics: medical imaging, predictive analytics, and robotic-assisted surgery.

Important Dates

Submission of Full Papers: May 15, 2026
Notification of Paper Acceptance: July 15, 2026
Camera-Ready Paper Submission: July 31, 2026

Submission Guidelines

Please follow the general Call for Papers formatting guidelines for submission details.

Speech Processing and Health

Organizers

Prof. Ian McLoughlin - Singapore Institute of Technology, Singapore (ian.mcloughlin@singaporetech.edu.sg)
Prof. Alessandro Vinciarelli - The University of Glasgow, Scotland, United Kingdom (Alessandro.Vinciarelli@glasgow.ac.uk)
Prof. Chng Eng Siong - Nanyang Technological University, Singapore (ASESChng@ntu.edu.sg)
Assoc. Prof. Yan Song - The University of Science and Technology of China, China (songy@ustc.edu.cn)
Prof. Hamid Sharifzadeh - UNITEC, Auckland, New Zealand (hsharifzadeh@unitec.ac.nz)
Assoc. Prof. Lee Kong Aik - Hong Kong Polytechnic University, Hong Kong, China (kong-aik.lee@polyu.edu.hk)
Asst. Prof. Tong Rong - Singapore Institute of Technology, Singapore (tong.rong@singaporetech.edu.sg)
Dr. Pham Lam - Austrian Institute of Technology, Austria (Lam.Pham@ait.ac.at)
Dr. Akshita Abrol - Singapore Institute of Technology, Singapore (akshita.abrol@singaporetech.edu.sg)
Prof. Tomoki Toda - Information Technology Center, Nagoya University, Japan (tomoki@icts.nagoya-u.ac.jp)


Introduction

Speech and health are naturally interrelated. Apart from being the main mode of communication between health professionals and patients, and a way of revealing health state, speech is a modality that is important to maintaining, diagnosing, and managing health. This is true for both physical and mental wellbeing. Analysing how we speak, and what we say, can reveal much about our physiological and psychological health; analysis of what we cannot or do not say may be equally telling. Understanding spoken information is likewise very important, for both patients and caregivers.

There are many areas of speech processing research related to health, including speech analysis for physical and mental wellbeing, ASR and TTS for those with speaking impairments, voice conversion, speech therapy, enhancement for those with hearing impairments, virtual clinicians, language biases, and more. Typically, existing research is tied to the closest signal processing research domain, leaving many synergies and parallels unexplored: examples include the need to work with healthcare professionals, operating in clinics, privacy issues, voice interfaces, level normalisation, voice activity detection, background noise, and user interfaces.

Bringing together speech and audio signal processing researchers who have worked in health-related domains will help to find common solutions to common problems and allow cross-domain fertilisation, since solutions developed in one domain may prove useful in others. We also expect this session to foster cross-domain sharing and collaboration.

Important Dates

Submission of Full Papers: May 15, 2026
Notification of Paper Acceptance: July 15, 2026
Camera-Ready Paper Submission: July 31, 2026

Submission Guidelines

Please follow the general Call for Papers formatting guidelines for submission details.

Advanced Topics in Audio Understanding of Sound Events, Scenes, and Beyond

Organizers

Prof. Nobutaka Ono - Tokyo Metropolitan University, Japan (onono@tmu.ac.jp)
Dr. Keisuke Imoto - Doshisha University, Japan (keisuke.imoto@ieee.org)
Dr. Tatsuya Komatsu - LY Corporation, Japan (komatsu.tatsuya@lycorp.co.jp)

Introduction

We are surrounded by various kinds of sounds, including speech, music, and environmental sounds. To enable computers to understand such acoustic information and realize human-like listening systems, audio understanding based on sound event and scene analysis plays a fundamental role. This special session is dedicated to recent advances in audio understanding, including not only traditional topics such as sound event and scene analysis, feature extraction, and deep-learning-based modeling, but also emerging directions such as music signal processing, emotion recognition, microphone array processing, and audio foundation models.

Furthermore, applications ranging from elderly monitoring, surveillance, and life-logging to advanced multimedia retrieval and the integration of large language models (LLMs) for multimodal audio-language understanding are within the scope of this session. We welcome contributions that address these topics and promote the development of next-generation intelligent audio systems.

Important Dates

Submission of Full Papers: May 15, 2026
Notification of Paper Acceptance: July 15, 2026
Camera-Ready Paper Submission: July 31, 2026

Submission Guidelines

Please follow the general Call for Papers formatting guidelines for submission details.

Advanced Technologies for ISAC in 6G and Beyond Networks

Organizers

Assoc. Prof. Dinh Thi Thai Mai - VNU University of Engineering and Technology, Hanoi, Vietnam (dttmai@vnu.edu.vn)
Assoc. Prof. Hoang Trong Minh - Posts and Telecommunications Institute of Technology, Hanoi, Vietnam (hoangtrongminh@ptit.edu.vn)

Introduction

Integrated Sensing and Communication (ISAC) has emerged as a promising technology for 6G and beyond networks, enabling the seamless coexistence of wireless communication and environmental sensing within shared spectrum, hardware, and signal processing frameworks. Unlike conventional systems, where sensing and communication are treated independently, ISAC designs the two functions jointly within a unified framework.

With the evolution toward THz communications, RIS-assisted networks, AI-native air interfaces, UAV-enabled platforms, and edge intelligence, advanced technologies are required to address the associated signal processing and system design challenges.

This special session aims to gather researchers and industry experts to present cutting-edge advances in ISAC technologies, system architectures, algorithms, and experimental validations for 6G and beyond networks.

Important Dates

Submission of Full Papers: May 15, 2026
Notification of Paper Acceptance: July 15, 2026
Camera-Ready Paper Submission: July 31, 2026

Submission Guidelines

Please follow the general Call for Papers formatting guidelines for submission details.

Multimodal Deepfake Detection and Copyright-Aware Analysis

Organizers

Prof. Sanghoon Lee - Yonsei University, South Korea (slee@yonsei.ac.kr)
Dr. Jaesang Hyun - Yonsei University, South Korea (jaesang.hyun@yonsei.ac.kr)
Dr. Jongyoo Kim - Yonsei University, South Korea (jy.kim@yonsei.ac.kr)
Prof. Jianfei Cai - Monash University, Australia (Jianfei.Cai@monash.edu)
Prof. Ping An - Shanghai University, China (anping@shu.edu.cn)
Prof. Wen-Huang Cheng - National Taiwan University, Taiwan (wenhuang@csie.ntu.edu.tw)
Prof. Weisi Lin - Nanyang Technological University, Singapore (wslin@ntu.edu.sg)

Introduction

Multimedia technologies increasingly operate across multiple data modalities, including images, videos, audio, text, and human motion. Recent advances in multimedia synthesis and manipulation techniques have significantly improved the realism and accessibility of generated or modified media content. These developments have expanded applications in content creation, editing, summarization, and transformation. At the same time, such technologies redefine the relationship between media content and existing works, raising new technical challenges related to similarity, originality, attribution, and copyright awareness.

This session focuses on Multimodal Deepfake Detection and Copyright-Aware Analysis, aiming to foster research discussions on both deepfake detection and semantic-level analysis of multimedia content. The scope includes multimodal representations, analysis methods for detecting manipulated or synthesized media, and techniques for evaluating similarity and originality in multimedia data. In particular, the session emphasizes approaches that move beyond low-level signal comparisons and instead analyze multimedia content through semantic representations and multimodal cues.

By bringing together researchers from diverse areas of multimedia signal processing, computer vision, and multimedia forensics, this session aims to provide a platform for exploring shared research directions toward reliable detection, analysis, and protection of multimedia content in increasingly complex multimodal environments.

Topics of Interest

Topics of interest include, but are not limited to:

Multimodal generative models
Large-scale generative approaches for multimodal content
Multimodal representation spaces and semantic structure analysis
Semantic-level similarity and originality analysis for generated content
Multimodal deepfake and synthetic media detection
Generalizable analysis and detection methods for unseen generative models
Datasets, evaluation metrics, and benchmarks for multimodal generation and analysis

Important Dates

Submission of Full Papers: May 15, 2026
Notification of Paper Acceptance: July 15, 2026
Camera-Ready Paper Submission: July 31, 2026

Submission Guidelines

Please follow the general Call for Papers formatting guidelines for submission details.

Intelligent Radar Signal Processing and Adaptive Sensing

Organizers

Prof. Bo Chen - Xidian University, China (bchen@mail.xidian.edu.cn)
Assoc. Prof. Hui Ma - Xidian University, China (h.ma@xidian.edu.cn)
Assoc. Prof. Zhihui Xin - Guangzhou University, China (Xinzhihui.lunkcy@163.com)

Introduction

The proposed special session, “Intelligent Radar Signal Processing and Adaptive Sensing”, aims to provide a focused forum for recent advances at the intersection of radar signal processing, machine learning, and adaptive sensing system design. The motivation for this session lies in the growing need for radar systems to operate reliably in increasingly complex and dynamic environments, where conventional model-driven approaches alone often face limitations in robustness, adaptability, and scalability. Challenges such as non-stationary clutter, dense interference, low-observable targets, spectrum congestion, and real-time processing constraints have motivated the development of learning-based and adaptive methodologies that can complement classical radar processing frameworks.

This special session is particularly timely because recent years have seen rapid progress in deep learning, reinforcement learning, data-driven optimization, and edge intelligence, all of which are reshaping how radar systems perform detection, recognition, imaging, tracking, waveform design, and resource management. At the same time, adaptive sensing has become a central theme in modern radar research, especially in applications requiring real-time environmental awareness, intelligent decision-making, and closed-loop optimization. These trends are highly relevant to emerging radar applications in autonomous systems, smart transportation, remote sensing, infrastructure monitoring, and defense technologies. However, despite this momentum, research efforts remain fragmented across algorithms, sensing architectures, datasets, and deployment considerations. A dedicated special session is therefore needed to consolidate current progress, identify open problems, and promote deeper interaction between theory and practice.

The session is expected to contribute to the field in several concrete ways. First, it will showcase state-of-the-art research in learning-based radar signal processing, including target detection and classification, adaptive waveform optimization, robust anti-interference processing, radar imaging, tracking, and multi-modal or multi-sensor fusion. Second, it will encourage discussion of adaptive sensing strategies that integrate perception, inference, and control, thereby advancing radar systems from passive observation toward intelligent and context-aware operation. Third, it will promote exchange between academia and industry on practical issues such as computational efficiency, hardware constraints, interpretability, reliability, and real-world deployment. Finally, it will help strengthen an interdisciplinary research community spanning signal processing, machine learning, radar engineering, communications, and control, while supporting future collaborations on evaluation protocols, benchmark problems, and shared technical challenges.

Important Dates

Submission of Full Papers: May 15, 2026
Notification of Paper Acceptance: July 15, 2026
Camera-Ready Paper Submission: July 31, 2026

Submission Guidelines

Please follow the general Call for Papers formatting guidelines for submission details.