We propose HWT, a novel transformer-based approach to styled handwritten text image generation that strives to learn style-content entanglement as well as both global and local writing style patterns. HWT captures the long- and short-range relationships within the style examples through a self-attention mechanism, thereby encoding both global and local style patterns. Further, HWT comprises an encoder-decoder attention that enables style-content entanglement by gathering the style representation of each query character. To the best of our knowledge, we are the first to introduce a transformer-based generative network for styled handwritten text generation. HWT generates realistic styled handwritten text images and significantly outperforms the state of the art, as demonstrated through extensive qualitative, quantitative and human-based evaluations. It can handle text of arbitrary length and any desired writing style in a few-shot setting. Further, HWT generalizes well to the challenging scenario where both words and writing style are unseen during training, generating realistic styled handwritten text images. © 2021, CC BY.
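The two attention steps described above can be illustrated with a minimal sketch. This is not the HWT implementation: it assumes single-head scaled dot-product attention with no learned projections, positional encodings, or image decoder, and every name in it (`style_features`, `char_embedding`, `encode_style`, `style_per_character`) is hypothetical. It only shows how self-attention over style features, followed by attention from character queries to the encoded style, yields one style vector per query character.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: each query gets a weighted mean of values."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


def random_vectors(n, d, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]


rng = random.Random(0)
D = 8  # feature dimension (arbitrary choice for the sketch)

# Assumption: stand-ins for features extracted from a writer's few style examples.
style_features = random_vectors(12, D, rng)

# Assumption: stand-ins for learned embeddings of the query characters.
char_embedding = {c: random_vectors(1, D, rng)[0] for c in "abcdefghijklmnopqrstuvwxyz"}


def encode_style(features):
    """Self-attention: every style vector attends to all others (near and far)."""
    return attention(features, features, features)


def style_per_character(text, style_memory):
    """Encoder-decoder attention: each character query gathers its style vector."""
    queries = [char_embedding[c] for c in text]
    return attention(queries, style_memory, style_memory)


style_memory = encode_style(style_features)
char_style = style_per_character("hello", style_memory)
```

Because the number of queries is simply the number of characters, the same code accepts text of any length, and swapping in another writer's `style_features` changes the style without retraining anything in this toy setup. In this stripped-down version, repeated characters receive identical style vectors; a full model would differentiate them, for example through positional information.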
- Computer Vision and Pattern Recognition (cs.CV)
Preprint: arXiv