We propose HWT, a novel transformer-based approach to styled handwritten text image generation that aims to learn style-content entanglement as well as both global and local writing style patterns. HWT captures long- and short-range relationships within the style examples through a self-attention mechanism, thereby encoding both global and local style patterns. Further, HWT comprises an encoder-decoder attention that enables style-content entanglement by gathering the style representation of each query character. To the best of our knowledge, we are the first to introduce a transformer-based generative network for styled handwritten text generation. HWT generates realistic styled handwritten text images and significantly outperforms the state of the art, as demonstrated through extensive qualitative, quantitative, and human-based evaluations. It can handle text of arbitrary length and any desired writing style in a few-shot setting. Further, HWT generalizes well to the challenging scenario in which both the words and the writing style are unseen during training, still generating realistic styled handwritten text images. © 2021, CC BY.
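The two attention steps named in the abstract can be illustrated with a minimal sketch. It is not the HWT implementation: it uses plain single-head scaled dot-product attention in pure Python and leaves out learned projections, positional encodings, feed-forward layers, the feature extractor, and the image generator. The names `style_features`, `char_embedding`, and `DIM`, and the random values standing in for learned features, are assumptions made for illustration only.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Single-head scaled dot-product attention over lists of vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        # Similarity of this query to every key, turned into weights.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return outputs


random.seed(0)
DIM = 8  # hypothetical feature width

# Hypothetical style features: one vector per region of the few-shot
# style examples (random stand-ins for learned features).
style_features = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(12)]

# Encoder side: self-attention lets every style vector attend to all the
# others, so nearby and distant regions both contribute.
encoded_style = attention(style_features, style_features, style_features)

# Hypothetical query embeddings, one per distinct character.
text = "hello"
char_embedding = {
    c: [random.gauss(0, 1) for _ in range(DIM)] for c in sorted(set(text))
}
char_queries = [char_embedding[c] for c in text]

# Decoder side: encoder-decoder attention, where each query character
# gathers its own style representation from the encoded style features.
styled_chars = attention(char_queries, encoded_style, encoded_style)

print(len(styled_chars), len(styled_chars[0]))
```

In this sketch the number of output vectors follows the number of query characters, while the number of style vectors can vary independently, which is one way to read the abstract's claim about arbitrary text length and few-shot styles.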
- Computer Vision and Pattern Recognition (cs.CV)
Preprint: arXiv