Tibetan Data Augmentation via GAN-Based Handwritten Text Generation
Dorje Tashi, Bingtian Chen, Tianying Sheng, Yongbin Yu, Xiangxiang Wang, Jin Zhang, Lobsang Yeshi, Rinchen Dongrub, Thupten Tsering, Nyima Tashi
CAAI Transactions on Intelligence Technology, 2026, Vol. 11, Issue 1: 55-65.
Growing awareness of Tibetan cultural preservation, together with technological advances, has spurred substantial academic research on Tibetan. However, the structural complexity of the Tibetan language and the scarcity of labelled handwriting data impede progress in Optical Character Recognition (OCR) and other applications. To address these challenges, this paper proposes a Tibetan data augmentation technique that uses Generative Adversarial Networks (GANs) to synthesise arbitrary handwriting images in variable calligraphic styles, conditioned on the given inputs. During training, the method applies a Real-Fake Cross Inputs Strategy to increase generation diversity and to improve generalisation to handwritten text outside the training set and the pre-defined corpus. The model was trained on three Tibetan handwriting datasets: Umê style numerals, Uchen style consonants, and Khyug-yig style words. Experimental results show that the model generates realistic and recognisable Tibetan numeral and consonant handwriting, achieving Fréchet Inception Distance (FID) scores of 14.45 and 27.63, respectively. The method's effectiveness for augmenting OCR models is evidenced by a reduced OCR Word Error Rate (WER) on the augmented datasets.
computer vision / deep learning / handwriting recognition / image representation / OCR