[1] Jiang, Y., Natarajan, V., Chen, X.L., et al. (2018) Pythia v0.1: The Winning Entry to the VQA Challenge 2018. arXiv: 1807.09956.
[2] Lee, J., Yoon, W., Kim, S., Kim, D., Kim, S., So, C.H., et al. (2019) BioBERT: A Pre-Trained Biomedical Language Representation Model for Biomedical Text Mining. Bioinformatics, 36, 1234-1240.
[3] Krishna, R., Zhu, Y., Groth, O., Johnson, J., Hata, K., Kravitz, J., et al. (2017) Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations. International Journal of Computer Vision, 123, 32-73.
[4] Li, J. and Liu, S. (2021) ImageCLEFmed VQA-Med 2021: Attention Model Based on Efficient Interaction between Multimodality. Working Notes of CLEF 2021.
[5] Agrawal, A., Batra, D., Parikh, D. and Kembhavi, A. (2018) Don’t Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, 18-23 June 2018, 4971-4980.
[6] Lau, J.J., Gayen, S., Ben Abacha, A. and Demner-Fushman, D. (2018) A Dataset of Clinically Generated Visual Questions and Answers about Radiology Images. Scientific Data, 5, Article No. 180251.
[7] Al-Sadi, A., Talafha, B., Al-Ayyoub, M., Jararweh, Y. and Costen, F. (2019) JUST at ImageCLEF 2019 Visual Question Answering in the Medical Domain. Working Notes of CLEF.
[8] Li, M., Cai, W., Liu, R., Weng, Y., Zhao, X., Wang, C., Chen, X., Liu, Z., Pan, C., Li, M., et al. (2021) FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark. 35th Conference on Neural Information Processing Systems, Canada, 6-14 December 2021, 1-9.
[9] Lin, T.-Y., Maire, M., Belongie, S.J., Hays, J., Perona, P., Ramanan, D., Dollár, P. and Zitnick, C.L. (2014) Microsoft COCO: Common Objects in Context. ECCV.
[10] Krishna, R., Zhu, Y., Groth, O., Johnson, J., Hata, K., Kravitz, J., et al. (2017) Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations. International Journal of Computer Vision, 123, 32-73.
[11] Al-Sadi, A., Al-Theiabat, H. and Al-Ayyoub, M. (2020) The Inception Team at VQA-Med 2020: Pretrained VGG with Data Augmentation for Medical VQA and VQG. Working Notes of CLEF 2020.
[12] Kim, J.-H., Jun, J. and Zhang, B.-T. (2018) Bilinear Attention Networks. 2018 Conference on Neural Information Processing Systems, Montréal, 3-8 December 2018, 1-8.
[13] Loper, E. and Bird, S. (2002) NLTK. Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics, Philadelphia, 7 July 2002, 63-70.