|
[1]
|
Cheng, J., Liang, X., Shi, X., et al. (2023) LayoutDiffuse: Adapting Foundational Diffusion Models for Layout-to-Image Generation. arXiv: 2302.08908. http://arxiv.org/abs/2302.08908
|
|
[2]
|
Zheng, G., Zhou, X., Li, X., Qi, Z., Shan, Y. and Li, X. (2023) LayoutDiffusion: Controllable Diffusion Model for Layout-to-Image Generation. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, 17-24 June 2023, 22490-22499.
|
|
[3]
|
Cho, J., Li, L., Yang, Z., et al. (2024) Diagnostic Benchmark and Iterative Inpainting for Layout-Guided Image Generation. arXiv: 2304.06671. http://arxiv.org/abs/2304.06671
|
|
[4]
|
Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., et al. (2014) Generative Adversarial Networks. arXiv: 1406.2661. http://arxiv.org/abs/1406.2661
|
|
[5]
|
Ashual, O. and Wolf, L. (2019) Specifying Object Attributes and Relations in Interactive Scene Generation. arXiv: 1909.05379. http://arxiv.org/abs/1909.05379
|
|
[6]
|
Johnson, J., Gupta, A. and Fei-Fei, L. (2018) Image Generation from Scene Graphs. arXiv: 1804.01622. http://arxiv.org/abs/1804.01622
|
|
[7]
|
Wang, B., Wu, T., Zhu, M., et al. (2022) Interactive Image Synthesis with Panoptic Layout Generation. arXiv: 2203.02104. http://arxiv.org/abs/2203.02104
|
|
[8]
|
Sun, W. and Wu, T. (2021) Learning Layout and Style Reconfigurable GANs for Controllable Image Synthesis. arXiv: 2003.11571. http://arxiv.org/abs/2003.11571
|
|
[9]
|
Arjovsky, M. and Bottou, L. (2017) Towards Principled Methods for Training Generative Adversarial Networks. arXiv: 1701.04862. http://arxiv.org/abs/1701.04862
|
|
[10]
|
Radford, A., Metz, L. and Chintala, S. (2016) Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv: 1511.06434. http://arxiv.org/abs/1511.06434
|
|
[11]
|
Ho, J., Jain, A. and Abbeel, P. (2020) Denoising Diffusion Probabilistic Models. arXiv: 2006.11239. http://arxiv.org/abs/2006.11239
|
|
[12]
|
Dhariwal, P. and Nichol, A. (2021) Diffusion Models Beat GANs on Image Synthesis. arXiv: 2105.05233. http://arxiv.org/abs/2105.05233
|
|
[13]
|
Zhao, B., Meng, L., Yin, W., et al. (2019) Image Generation from Layout. arXiv: 1811.11389. http://arxiv.org/abs/1811.11389
|
|
[14]
|
Kingma, D.P. and Welling, M. (2014) Auto-Encoding Variational Bayes. arXiv: 1312.6114. http://arxiv.org/abs/1312.6114
|
|
[15]
|
Sun, W. and Wu, T. (2019) Image Synthesis from Reconfigurable Layout and Style. arXiv: 1908.07500. http://arxiv.org/abs/1908.07500
|
|
[16]
|
Liang, J., Pei, W. and Lu, F. (2022) Layout-Bridging Text-to-Image Synthesis. arXiv: 2208.06162. http://arxiv.org/abs/2208.06162
|
|
[17]
|
Perera, P., Nallapati, R. and Xiang, B. (2019) OCGAN: One-Class Novelty Detection Using GANs with Constrained Latent Representations. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, 15-20 June 2019, 8576-8585.
|
|
[18]
|
Liu, L., Ren, Y., Lin, Z., et al. (2022) Pseudo Numerical Methods for Diffusion Models on Manifolds. arXiv: 2202.09778. http://arxiv.org/abs/2202.09778
|
|
[19]
|
Song, J., Meng, C. and Ermon, S. (2022) Denoising Diffusion Implicit Models. arXiv: 2010.02502. http://arxiv.org/abs/2010.02502
|
|
[20]
|
Ho, J. and Salimans, T. (2022) Classifier-Free Diffusion Guidance. arXiv: 2207.12598. http://arxiv.org/abs/2207.12598
|
|
[21]
|
Li, Y., Liu, H., Wu, Q., Mu, F., Yang, J., Gao, J., et al. (2023) GLIGEN: Open-Set Grounded Text-to-Image Generation. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, 17-24 June 2023, 22511-22521.
|
|
[22]
|
Zhang, L., Rao, A. and Agrawala, M. (2023) Adding Conditional Control to Text-to-Image Diffusion Models. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, 1-6 October 2023, 3813-3824.
|
|
[23]
|
Wang, X., Darrell, T., Rambhatla, S.S., et al. (2024) InstanceDiffusion: Instance-Level Control for Image Generation. arXiv: 2402.03290. http://arxiv.org/abs/2402.03290
|
|
[24]
|
Wang, X., Fu, S., Huang, Q., et al. (2025) MS-Diffusion: Multi-Subject Zero-Shot Image Personalization with Layout Guidance. arXiv: 2406.07209. http://arxiv.org/abs/2406.07209
|
|
[25]
|
Balaji, Y., Nah, S., Huang, X., et al. (2023) eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers. arXiv: 2211.01324. http://arxiv.org/abs/2211.01324
|
|
[26]
|
Shirakawa, T. and Uchida, S. (2024) NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging. arXiv: 2403.03485. http://arxiv.org/abs/2403.03485
|
|
[27]
|
Jiménez, Á.B. (2023) Mixture of Diffusers for Scene Composition and High Resolution Image Generation. arXiv: 2302.02412. http://arxiv.org/abs/2302.02412
|
|
[28]
|
Bar-Tal, O., Yariv, L., Lipman, Y., et al. (2023) MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation. arXiv: 2302.08113. http://arxiv.org/abs/2302.08113
|
|
[29]
|
Yang, L., Yu, Z., Meng, C., et al. (2024) Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs. arXiv: 2401.11708. http://arxiv.org/abs/2401.11708
|
|
[30]
|
Kim, Y., Lee, J., Kim, J., Ha, J. and Zhu, J. (2023) Dense Text-to-Image Generation with Attention Modulation. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, 1-6 October 2023, 7667-7677.
|
|
[31]
|
Wang, Z., Xia, X., Chen, R., et al. (2025) LaVin-DiT: Large Vision Diffusion Transformer. arXiv: 2411.11505. http://arxiv.org/abs/2411.11505
|
|
[32]
|
Chen, B., Zhang, Z., Li, W., et al. (2025) Invertible Diffusion Models for Compressed Sensing. arXiv: 2403.17006. http://arxiv.org/abs/2403.17006
|
|
[33]
|
Zhou, Y., Xiao, Z., Yang, S., et al. (2025) Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space. arXiv: 2503.09419. http://arxiv.org/abs/2503.09419
|
|
[34]
|
Liu, Y., Zhang, K., Li, Y., et al. (2024) Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models. arXiv: 2402.17177. http://arxiv.org/abs/2402.17177
|
|
[35]
|
Xu, Z., Zhang, J., Liew, J.H., et al. (2023) MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model. arXiv: 2311.16498. http://arxiv.org/abs/2311.16498
|
|
[36]
|
Zhang, C., Wang, C., Zhang, J., et al. (2023) DREAM-Talk: Diffusion-Based Realistic Emotional Audio-Driven Method for Single Image Talking Face Generation. arXiv: 2312.13578. http://arxiv.org/abs/2312.13578
|
|
[37]
|
Ding, S., Chen, X., Fang, Y., et al. (2023) DesignGPT: Multi-Agent Collaboration in Design. arXiv: 2311.11591. http://arxiv.org/abs/2311.11591
|
|
[38]
|
Sun, W., Cui, B., Dong, X.M., et al. (2025) Attentive Eraser: Unleashing Diffusion Model’s Object Removal Potential via Self-Attention Redirection Guidance. arXiv: 2412.12974. http://arxiv.org/abs/2412.12974
|