Special Session 5:
Multimodal Machine Learning in Practice
This special session aims to highlight cutting-edge research in the development and practical deployment of multimodal machine learning algorithms that integrate and process heterogeneous data types (such as text, image, audio, video, and sensor signals) to address complex, real-world challenges. By leveraging the complementary strengths of multiple modalities, these systems enable more robust, context-aware, and intelligent solutions across a wide range of domains, including healthcare, cybersecurity, robotics, smart environments, transportation, and surveillance.
The session invites contributions in which algorithmic innovations in multi-modal AI systems are developed in the context of real-world applications, leading to tangible improvements in performance, scalability, and interpretability. We welcome work demonstrating multi-modal AI in domains such as healthcare, cybersecurity, robotics, agriculture, smart homes, transportation, and surveillance. We also encourage submissions focusing on educational tools, mobile and web-based AI systems, multi-modal chatbot development, and cross-modal retrieval tasks.
Scope and topics:
Topics of interest include, but are not limited to:
- Vision-language and multi-modal foundation models
- Generative models for multi-modal synthesis
- Multi-modal representation alignment and fusion techniques
- Transfer learning and fine-tuning strategies in multi-modal deep learning
- Cross-modal retrieval and matching (e.g., image-to-text, audio-to-video)
- Domain adaptation and self-supervised learning for multi-modal data
- Explainable, interpretable, and trustworthy multi-modal ML systems
- Applications in cybersecurity, medical imaging, transportation, robotics, and smart environments
- Mobile, web, and edge deployment of multi-modal systems
- Real-time architectures and lightweight multi-modal models for deployment
- Anomaly detection with foundation models
- Benchmark datasets, frameworks, and reproducibility in multi-modal ML
Chairs:
Chair: Dr. Md Belayat Hossain: belayat@cs.siu.edu
Co-chair: Dr. Abdur Rahman Bin Shahid: shahid@cs.siu.edu
Co-chair: Dr. Alvi Ataur Khalil: a.khalil@siu.edu
Dr. Md Belayat Hossain is an Assistant Professor at Southern Illinois University, Carbondale, IL, USA. His research focuses on deep learning, multi-modal AI, generative AI, medical imaging, domain adaptation, and computer vision. Dr. Hossain has organized special sessions at IEEE SMC 2026 (accepted), IEEE ICMLA 2025, IEEE ICMLC 2019, and IWACIII 2019. He has also served as a program committee member for the IEEE SMC, ICIEV, and IVPR conferences, as a session co-chair at IEEE SMC and IEEE ICMLC, and as a reviewer for numerous conferences and journals, including MICCAI, IEEE SMC, IEEE ICMLC, OHBM, IEEE Transactions on Biomedical Engineering, IEEE Transactions on Big Data, Pattern Recognition Letters, Scientific Reports, and IEEE Access.
Dr. Abdur Rahman Bin Shahid is an Assistant Professor at Southern Illinois University, Carbondale, IL, USA. His research focuses on cybersecurity, deep learning, adversarial ML, multimodal AI, usable security and privacy, generative AI, cyber-physical systems, and the Internet of Things. Dr. Shahid served as a co-chair for special sessions at IEEE ICMLA 2025 and for the International Workshop on Security, Privacy, and Trust for Emergency Events (EmergencyComm), held in conjunction with SecureComm 2020. He has also served as a program committee member for several conferences, including IEEE CCNC, IEEE FIE, and IEEE SSCI, and as a reviewer for numerous journals, including IEEE Internet of Things Journal, IEEE TII, and IEEE TDSC.
Dr. Alvi Ataur Khalil is a tenure-track Assistant Professor in the Department of Computer Science, School of Computing, at Southern Illinois University Carbondale, USA. Dr. Khalil's research focuses on blockchain security, particularly off-chain Layer-2 vulnerabilities and defenses, intelligent UAV control using reinforcement learning, and AI-driven cybersecurity solutions for cyber-physical systems. He has authored over twenty peer-reviewed papers published in leading venues, including IEEE TNSM, Elsevier Computer Networks, IEEE/IFIP DSN, ACSAC, EAI SecureComm, IEEE CNS, ACM DLT, IEEE LCN, CNSM, IEEE SMARTCOMP, and IEEE COMPSAC. He is a recipient of several awards, including the Outstanding Graduate Scholar of the Year at FIU (2023–24), the Upsilon Pi Epsilon (UPE) Honor Society Scholarship (2024), the FIU Dissertation Year Fellowship (2025), an NSF I-Corps Grant as Entrepreneur Lead, and multiple NSF and IEEE travel awards.
Technical Committee
- Dr. Kento Morita, Mie University, Mie, Japan
- Dr. Hani M Alnami, Jazan University, Jazan, KSA
- Dr. Nur Imtiazul Haque, Northern Illinois University, IL, USA
- Dr. Hussein Zangoti, Jazan University, Jazan, KSA
- Dr. Khaled R Ahmed, Southern Illinois University Carbondale, IL, USA
- Dr. Samia Tasnim, University of Toledo, OH, USA
- Dr. Md Farhadur Reza, Eastern Illinois University, IL, USA
- Dr. Shahriar Badsha, Ford Motor Company, MI, USA
- Dr. Razieh Ganjee, University of Pittsburgh, Pittsburgh, USA
- Dr. Sakib Mohammad, Fairmont State University, WV, USA
Paper Submission Instructions
All papers will be double-blind reviewed and must present original work.
- CMT Submission Site
- Select the track: Special Session 5: Multimodal Machine Learning in Practice
Papers submitted for review should conform to IEEE specifications. Manuscript templates can be downloaded from:
Key Dates
- Submission Deadline: June 20, 2026
- Notification of Acceptance: July 10, 2026
Registration
For your paper to be published in the proceedings, you must register for the conference.
Paper Presentation Instructions
The papers submitted to this track will be presented in person as part of the conference. There is no virtual presentation for this session.
ICMLA'26