arXiv:2304.12519

GlyphDiffusion: Text Generation as Image Generation

Published on Apr 25, 2023

Abstract

Diffusion models have become a new generative paradigm for text generation. Considering the discrete categorical nature of text, in this paper we propose GlyphDiffusion, a novel diffusion approach for text generation via text-guided image generation. Our key idea is to render the target text as a glyph image containing visual language content. In this way, conditional text generation can be cast as a glyph image generation task, and it is then natural to apply continuous diffusion models to discrete texts. Specifically, we utilize a cascaded architecture (i.e., a base and a super-resolution diffusion model) to generate high-fidelity glyph images conditioned on the input text. Furthermore, we design a text grounding module to transform and refine the visual language content from the generated glyph images into the final texts. In experiments over four conditional text generation tasks and two classes of metrics (i.e., quality and diversity), GlyphDiffusion achieves comparable or even better results than several baselines, including pretrained language models. Our model also makes significant improvements compared to a recent diffusion model.
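
The core idea of casting text generation as glyph image generation can be illustrated with a short sketch of the rendering step: the target text is drawn onto a fixed-size canvas so that a continuous image diffusion model can treat it as a pixel-space target. This is a minimal sketch using Pillow; the function name, canvas size, and line-wrapping scheme are illustrative assumptions, not the paper's actual configuration.

```python
# Sketch of rendering target text as a grayscale glyph image
# (assumed setup, not the paper's exact pipeline).
from PIL import Image, ImageDraw, ImageFont


def render_glyph_image(text: str, size: int = 256, chars_per_line: int = 16) -> Image.Image:
    """Render `text` as a square grayscale glyph image."""
    canvas = Image.new("L", (size, size), color=255)  # white background
    draw = ImageDraw.Draw(canvas)
    font = ImageFont.load_default()

    # Wrap the text into fixed-width lines so it fills the canvas row by row.
    lines = [text[i:i + chars_per_line] for i in range(0, len(text), chars_per_line)]
    line_height = size // max(len(lines), 1)
    for row, line in enumerate(lines):
        draw.text((4, row * line_height), line, fill=0, font=font)
    return canvas


# Example usage: the rendered image would serve as the diffusion target,
# conditioned on the input text of the generation task; a text grounding
# module would later map generated glyph images back to discrete text.
img = render_glyph_image("the quick brown fox jumps over the lazy dog")
img.save("glyph_target.png")
```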
