[1] Lin, Z., Zhang, D., Tao, Q., Shi, D., Haffari, G., Wu, Q., et al. (2023) Medical Visual Question Answering: A Survey. Artificial Intelligence in Medicine, 143, Article ID: 102611.
[2] Pan, H., He, S., Zhang, K., Qu, B., Chen, C. and Shi, K. (2022) AMAM: An Attention-Based Multimodal Alignment Model for Medical Visual Question Answering. Knowledge-Based Systems, 255, Article ID: 109763.
[3] Long, S., Yang, Z., Li, Y., Qian, X., Zeng, K. and Hao, T. (2023) MAMF: A Multi-Level Attention-Based Multimodal Fusion Model for Medical Visual Question Answering. In: Zhang, H., et al., Eds., International Conference on Neural Computing for Advanced Applications, Springer, 202-214.
[4] Zhan, L., Liu, B., Fan, L., Chen, J. and Wu, X. (2020) Medical Visual Question Answering via Conditional Reasoning. Proceedings of the 28th ACM International Conference on Multimedia, Seattle, 12-16 October 2020, 2345-2354.
[5] Liu, G., He, J., Li, P., Zhao, Z. and Zhong, S. (2024) Cross-Modal Self-Supervised Vision Language Pre-Training with Multiple Objectives for Medical Visual Question Answering. Journal of Biomedical Informatics, 160, Article ID: 104748.
[6] Liu, B., Zhan, L., Xu, L. and Wu, X. (2023) Medical Visual Question Answering via Conditional Reasoning and Contrastive Learning. IEEE Transactions on Medical Imaging, 42, 1532-1545.
[7] Li, P., Liu, G., Tan, L., Liao, J. and Zhong, S. (2023) Self-Supervised Vision-Language Pretraining for Medial Visual Question Answering. 2023 IEEE 20th International Symposium on Biomedical Imaging (ISBI), Cartagena, 18-21 April 2023, 1-5.
[8] Bhatti, U.A., Tang, H., Wu, G., Marjan, S. and Hussain, A. (2023) Deep Learning with Graph Convolutional Networks: An Overview and Latest Applications in Computational Intelligence. International Journal of Intelligent Systems, 2023, Article ID: 8342104.
[9] Ren, F. and Zhou, Y. (2020) CGMVQA: A New Classification and Generative Model for Medical Visual Question Answering. IEEE Access, 8, 50626-50636.
[10] Wang, H. and Du, H. (2023) Knowledge-Enhanced Medical Visual Question Answering: A Survey (Invited Talk Summary). In: Yang, S. and Islam, S., Eds., Web and Big Data. APWeb-WAIM 2022 International Workshops, Springer, 3-9.
[11] Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D. and Batra, D. (2017) Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. 2017 IEEE International Conference on Computer Vision (ICCV), Venice, 22-29 October 2017, 618-626.
[12] Huang, X. and Gong, H. (2024) A Dual-Attention Learning Network with Word and Sentence Embedding for Medical Visual Question Answering. IEEE Transactions on Medical Imaging, 43, 832-845.
[13] Chen, Z., Li, G. and Wan, X. (2022) Align, Reason and Learn: Enhancing Medical Vision-and-Language Pre-Training with Knowledge. Proceedings of the 30th ACM International Conference on Multimedia, Lisboa, 10-14 October 2022, 5152-5161.
[14] Gu, T., Yang, K., Liu, D. and Cai, W. (2024) LaPA: Latent Prompt Assist Model for Medical Visual Question Answering. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, 17-18 June 2024, 4971-4980.
[15] Liu, B., Zhan, L. and Wu, X. (2021) Contrastive Pre-Training and Representation Distillation for Medical Visual Question Answering Based on Radiology Images. In: de Bruijne, M., et al., Eds., Medical Image Computing and Computer Assisted Intervention—MICCAI 2021, Springer, 210-220.
[16] Huang, J., Chen, Y., Li, Y., Yang, Z., Gong, X., Wang, F.L., et al. (2023) Medical Knowledge-Based Network for Patient-Oriented Visual Question Answering. Information Processing & Management, 60, Article ID: 103241.
[17] Chen, Z., Du, Y., Hu, J., Liu, Y., Li, G., Wan, X., et al. (2024) Mapping Medical Image-Text to a Joint Space via Masked Modeling. Medical Image Analysis, 91, Article ID: 103018.
[18] Eslami, S., Meinel, C. and de Melo, G. (2023) PubMedCLIP: How Much Does CLIP Benefit Visual Question Answering in the Medical Domain? Findings of the Association for Computational Linguistics: EACL 2023, Dubrovnik, May 2023, 1181-1193.
[19] Lameesa, A., Silpasuwanchai, C. and Alam, M.S.B. (2025) VG-CALF: A Vision-Guided Cross-Attention and Late-Fusion Network for Radiology Images in Medical Visual Question Answering. Neurocomputing, 613, Article ID: 128730.
[20] Shu, C., Zhu, Y., Tang, X., Xiao, J., Chen, Y., Li, X., et al. (2024) MITER: Medical Image-Text Joint Adaptive Pretraining with Multi-Level Contrastive Learning. Expert Systems with Applications, 238, Article ID: 121526.
[21] Zhan, C., Peng, P., Wang, H., Wang, G., Lin, Y., Chen, T., et al. (2025) UnICLAM: Contrastive Representation Learning with Adversarial Masking for Unified and Interpretable Medical Vision Question Answering. Medical Image Analysis, 101, Article ID: 103464.
[22] Wu, Y., Lu, Y., Zhou, Y., Ding, Y., Liu, J. and Ruan, T. (2025) MKGF: A Multi-Modal Knowledge Graph Based RAG Framework to Enhance LVLMs for Medical Visual Question Answering. Neurocomputing, 635, Article ID: 129999.