[1] Wikipedia (2013) Quantum Entanglement. https://en.wikipedia.org/wiki/Quantum_entanglement
[2] Brown, R.D. and Frederking, R. (1995) Applying Statistical English Language Modelling to Symbolic Machine Translation. Proceedings of the Sixth Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages, Leuven, 5-7 July 1995, 221-239.
[3] Tao, T., Wang, X., Mei, Q. and Zhai, C. (2006) Language Model Information Retrieval with Document Expansion. Proceedings of the Main Conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, New York, 4-9 June 2006, 407-414. https://doi.org/10.3115/1220835.1220887
[4] Zhai, C. and Lafferty, J. (2001) Model-Based Feedback in the Language Modeling Approach to Information Retrieval. Proceedings of the Tenth International Conference on Information and Knowledge Management, Atlanta, 5-10 October 2001, 403-410. https://doi.org/10.1145/502585.502654
[5] Sutskever, I., Vinyals, O. and Le, Q.V. (2014) Sequence to Sequence Learning with Neural Networks. Advances in Neural Information Processing Systems, 27, 3104-3112.
[6] Li, J., Li, S., Zhao, W.X., He, G., Wei, Z., Yuan, N.J., et al. (2020) Knowledge-Enhanced Personalized Review Generation with Capsule Graph Neural Network. Proceedings of the 29th ACM International Conference on Information & Knowledge Management, Virtual, 19-23 October 2020, 735-744. https://doi.org/10.1145/3340531.3411893
[7] Li, J., Zhao, W.X., Wen, J. and Song, Y. (2019) Generating Long and Informative Reviews with Aspect-Aware Coarse-to-Fine Decoding. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, 28 July-2 August 2019, 1969-1979. https://doi.org/10.18653/v1/p19-1190
[8] Bahdanau, D., Cho, K. and Bengio, Y. (2014) Neural Machine Translation by Jointly Learning to Align and Translate. arXiv Preprint, arXiv:1409.0473.
[9] See, A., Liu, P.J. and Manning, C.D. (2017) Get to the Point: Summarization with Pointer-Generator Networks. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vancouver, 30 July-4 August 2017, 1073-1083. https://doi.org/10.18653/v1/p17-1099
[10] Iqbal, T. and Qureshi, S. (2022) The Survey: Text Generation Models in Deep Learning. Journal of King Saud University-Computer and Information Sciences, 34, 2515-2528. https://doi.org/10.1016/j.jksuci.2020.04.001
[11] Qiu, X., Sun, T., Xu, Y., Shao, Y., Dai, N. and Huang, X. (2020) Pre-Trained Models for Natural Language Processing: A Survey. Science China Technological Sciences, 63, 1872-1897. https://doi.org/10.1007/s11431-020-1647-3
[12] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., et al. (2017) Attention Is All You Need. Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, 4-9 December 2017, 6000-6010.
[13] Devlin, J., Chang, M.W., Lee, K. and Toutanova, K. (2018) BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. arXiv Preprint, arXiv:1810.04805.
[14] Radford, A., Wu, J., Child, R., et al. (2019) Language Models Are Unsupervised Multitask Learners. OpenAI Blog.
[15] Brown, T.B., Mann, B., Ryder, N., et al. (2020) Language Models Are Few-Shot Learners. arXiv Preprint, arXiv:2005.14165.
[16] Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., et al. (2020) BART: Denoising Sequence-to-Sequence Pre-Training for Natural Language Generation, Translation, and Comprehension. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5-10 July 2020, 7871-7880. https://doi.org/10.18653/v1/2020.acl-main.703
[17] Raffel, C., Shazeer, N., Roberts, A., et al. (2020) Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Journal of Machine Learning Research, 21, 1-67.
[18] Kaplan, J., McCandlish, S., Henighan, T., et al. (2020) Scaling Laws for Neural Language Models. arXiv Preprint, arXiv:2001.08361.
[19] Singhal, K., Azizi, S., Tu, T., Mahdavi, S.S., Wei, J., Chung, H.W., et al. (2023) Large Language Models Encode Clinical Knowledge. Nature, 620, 172-180. https://doi.org/10.1038/s41586-023-06291-2
[20] Chowdhery, A., Narang, S., Devlin, J., et al. (2023) PaLM: Scaling Language Modeling with Pathways. Journal of Machine Learning Research, 24, 1-113.
[21] Wang, W., Bi, B., Yan, M., et al. (2019) StructBERT: Incorporating Language Structures into Pre-Training for Deep Language Understanding. arXiv Preprint, arXiv:1908.04577.
[22] Cao, Y., Li, S., Liu, Y., Yan, Z., Dai, Y., Yu, P., et al. (2023) A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT. arXiv Preprint, arXiv:2303.04226.
[23] Sanner, S., Balog, K., Radlinski, F., Wedin, B. and Dixon, L. (2023) Large Language Models Are Competitive Near Cold-Start Recommenders for Language- and Item-Based Preferences. Proceedings of the 17th ACM Conference on Recommender Systems, Singapore, 18-22 September 2023, 890-896. https://doi.org/10.1145/3604915.3608845
[24] Liang, X., Wang, H., Wang, Y., et al. (2024) Controllable Text Generation for Large Language Models: A Survey. arXiv Preprint, arXiv:2408.12599.
[25] Bubeck, S., Chandrasekaran, V., Eldan, R., et al. (2023) Sparks of Artificial General Intelligence: Early Experiments with GPT-4. arXiv Preprint, arXiv:2303.12712.
[26] Anil, R., Dai, A.M., Firat, O., et al. (2023) PaLM 2 Technical Report. arXiv Preprint, arXiv:2305.10403.
[27] Wei, J., Wang, X., Schuurmans, D., et al. (2022) Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. Advances in Neural Information Processing Systems, 35, 24824-24837.
[28] Radford, A., Narasimhan, K., Salimans, T. and Sutskever, I. (2018) Improving Language Understanding by Generative Pre-Training. Preprint.
[29] Abdaljalil, S. and Bouamor, H. (2021) An Exploration of Automatic Text Summarization of Financial Reports. Proceedings of the Third Workshop on Financial Technology and Natural Language Processing, Online, 19 August 2021, 1-7.
[30] Keskar, N.S., McCann, B., Varshney, L.R., et al. (2019) CTRL: A Conditional Transformer Language Model for Controllable Generation. arXiv Preprint, arXiv:1909.05858.
[31] Yao, S., Yu, D., Zhao, J., et al. (2023) Tree of Thoughts: Deliberate Problem Solving with Large Language Models. Proceedings of the 37th International Conference on Neural Information Processing Systems, New Orleans, 10-16 December 2023, 11809-11822.
[32] Fedus, W., Zoph, B. and Shazeer, N. (2022) Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity. Journal of Machine Learning Research, 23, 1-39.
[33] Birhane, A., Kasirzadeh, A., Leslie, D. and Wachter, S. (2023) Science in the Age of Large Language Models. Nature Reviews Physics, 5, 277-280. https://doi.org/10.1038/s42254-023-00581-4
[34] Taylor, R., Kardas, M., Cucurull, G., et al. (2022) Galactica: A Large Language Model for Science. arXiv Preprint, arXiv:2211.09085.
[35] Touvron, H., Lavril, T., Izacard, G., et al. (2023) LLaMA: Open and Efficient Foundation Language Models. arXiv Preprint, arXiv:2302.13971.
[36] Dong, Q., Li, L., Dai, D., et al. (2022) A Survey on In-Context Learning. arXiv Preprint, arXiv:2301.00234.
[37] Biderman, S., Schoelkopf, H., Anthony, Q.G., et al. (2023) Pythia: A Suite for Analyzing Large Language Models across Training and Scaling. Proceedings of the 40th International Conference on Machine Learning, Honolulu, 23-29 July 2023, 2397-2430.
[38] Hoffmann, J., Borgeaud, S., Mensch, A., et al. (2022) Training Compute-Optimal Large Language Models. arXiv Preprint, arXiv:2203.15556.
[39] Rosenfeld, R. (2000) Two Decades of Statistical Language Modeling: Where Do We Go from Here? Proceedings of the IEEE, 88, 1270-1278. https://doi.org/10.1109/5.880083
[40] Touvron, H., Martin, L., Stone, K., et al. (2023) Llama 2: Open Foundation and Fine-Tuned Chat Models. arXiv Preprint, arXiv:2307.09288.
[41] Achiam, J., Adler, S., Agarwal, S., et al. (2023) GPT-4 Technical Report. arXiv Preprint, arXiv:2303.08774.
[42] Wei, J., Tay, Y., Bommasani, R., et al. (2022) Emergent Abilities of Large Language Models. arXiv Preprint, arXiv:2206.07682.
[43] Huberman, B.A. and Hogg, T. (1987) Phase Transitions in Artificial Intelligence Systems. Artificial Intelligence, 33, 155-171. https://doi.org/10.1016/0004-3702(87)90033-6
[44] Rae, J.W., Borgeaud, S., Cai, T., et al. (2021) Scaling Language Models: Methods, Analysis & Insights from Training Gopher. arXiv Preprint, arXiv:2112.11446.
[45] Sanh, V., Webson, A., Raffel, C., et al. (2022) Multitask Prompted Training Enables Zero-Shot Task Generalization. arXiv Preprint, arXiv:2110.08207.
[46] Ouyang, L., Wu, J., Jiang, X., et al. (2022) Training Language Models to Follow Instructions with Human Feedback. Advances in Neural Information Processing Systems, 35, 27730-27744.
[47] Wei, J., Bosma, M., Zhao, V.Y., et al. (2021) Finetuned Language Models Are Zero-Shot Learners. arXiv Preprint, arXiv:2109.01652.
[48] Thoppilan, R., De Freitas, D., Hall, J., et al. (2022) LaMDA: Language Models for Dialog Applications. arXiv Preprint, arXiv:2201.08239.
[49] Chung, H.W., Hou, L., Longpre, S., et al. (2024) Scaling Instruction-Finetuned Language Models. Journal of Machine Learning Research, 25, 1-53.
[50] Luo, R., Sun, L., Xia, Y., Qin, T., Zhang, S., Poon, H., et al. (2022) BioGPT: Generative Pre-Trained Transformer for Biomedical Text Generation and Mining. Briefings in Bioinformatics, 23, bbac409. https://doi.org/10.1093/bib/bbac409
[51] Madani, A., Krause, B., Greene, E.R., Subramanian, S., Mohr, B.P., Holton, J.M., et al. (2023) Large Language Models Generate Functional Protein Sequences across Diverse Families. Nature Biotechnology, 41, 1099-1106. https://doi.org/10.1038/s41587-022-01618-2
[52] Wang, Y., Kordi, Y., Mishra, S., Liu, A., Smith, N.A., Khashabi, D., et al. (2023) Self-Instruct: Aligning Language Models with Self-Generated Instructions. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Toronto, 9-14 July 2023, 13484-13508. https://doi.org/10.18653/v1/2023.acl-long.754
[53] Chen, M., Tworek, J., Jun, H., Yuan, Q., de Oliveira Pinto, H.P., Kaplan, J., et al. (2021) Evaluating Large Language Models Trained on Code. arXiv Preprint, arXiv:2107.03374.
[54] Paperno, D., Kruszewski, G., Lazaridou, A., Pham, N.Q., Bernardi, R., Pezzelle, S., et al. (2016) The LAMBADA Dataset: Word Prediction Requiring a Broad Discourse Context. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Berlin, 7-12 August 2016, 1525-1534. https://doi.org/10.18653/v1/p16-1144
[55] Kocmi, T., Bawden, R., Bojar, O., et al. (2022) Findings of the 2022 Conference on Machine Translation (WMT22). Proceedings of the Seventh Conference on Machine Translation (WMT), Abu Dhabi, 7-8 December 2022, 1-45.
[56] Narayan, S., Cohen, S.B. and Lapata, M. (2018) Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, 31 October-4 November 2018, 1797-1807.
[57] Chen, M., Tworek, J., Jun, H., et al. (2021) Evaluating Large Language Models Trained on Code. arXiv Preprint, arXiv:2107.03374.
[58] Lewis, P., Perez, E., Piktus, A., et al. (2020) Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. Advances in Neural Information Processing Systems, 33, 9459-9474.
[59] Xi, Z., Chen, W., Guo, X., et al. (2023) The Rise and Potential of Large Language Model Based Agents: A Survey. arXiv Preprint, arXiv:2309.07864.
[60] Wu, N., Gong, M., Shou, L., Liang, S. and Jiang, D. (2023) Large Language Models Are Diverse Role-Players for Summarization Evaluation. Natural Language Processing and Chinese Computing, Foshan, 12-15 October 2023, 695-707. https://doi.org/10.1007/978-3-031-44693-1_54
[61] Zhang, T., Ladhak, F., Durmus, E., Liang, P., McKeown, K. and Hashimoto, T.B. (2024) Benchmarking Large Language Models for News Summarization. Transactions of the Association for Computational Linguistics, 12, 39-57. https://doi.org/10.1162/tacl_a_00632
[62] Kocmi, T. and Federmann, C. (2023) Large Language Models Are State-of-the-Art Evaluators of Translation Quality. arXiv Preprint, arXiv:2302.14520.
[63] Wang, L., Lyu, C., Ji, T., Zhang, Z., Yu, D., Shi, S., et al. (2023) Document-Level Machine Translation with Large Language Models. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-10 December 2023, 16646-16661. https://doi.org/10.18653/v1/2023.emnlp-main.1036
[64] Kobusingye, B.M., Dorothy, A., Nakatumba-Nabende, J. and Marvin, G. (2023) Explainable Machine Translation for Intelligent E-Learning of Social Studies. 2023 7th International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, 11-13 April 2023, 1066-1072. https://doi.org/10.1109/icoei56765.2023.10125599
[65] Dathathri, S., Madotto, A., Lan, J., et al. (2019) Plug and Play Language Models: A Simple Approach to Controlled Text Generation. arXiv Preprint, arXiv:1912.02164.
[66] Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., et al. (2023) Survey of Hallucination in Natural Language Generation. ACM Computing Surveys, 55, 1-38. https://doi.org/10.1145/3571730
[67] Maynez, J., Narayan, S., Bohnet, B. and McDonald, R. (2020) On Faithfulness and Factuality in Abstractive Summarization. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5-10 July 2020, 1906-1919. https://doi.org/10.18653/v1/2020.acl-main.173
[68] French, R. (1999) Catastrophic Forgetting in Connectionist Networks. Trends in Cognitive Sciences, 3, 128-135. https://doi.org/10.1016/s1364-6613(99)01294-2
[69] Bender, E.M., Gebru, T., McMillan-Major, A. and Shmitchell, S. (2021) On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, Virtual, 3-10 March 2021, 610-623. https://doi.org/10.1145/3442188.3445922
[70] Raji, I.D., Smart, A., White, R.N., Mitchell, M., Gebru, T., Hutchinson, B., et al. (2020) Closing the AI Accountability Gap: Defining an End-to-End Framework for Internal Algorithmic Auditing. Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, Barcelona, 27-30 January 2020, 33-44. https://doi.org/10.1145/3351095.3372873
[71] Guo, Z., Jin, R., Liu, C., et al. (2023) Evaluating Large Language Models: A Comprehensive Survey. arXiv Preprint, arXiv:2310.19736.