Rich-text document styling restoration via reinforcement learning

Hongwei LI, Yingpeng HU, Yixuan CAO, Ganbin ZHOU, Ping LUO

Front. Comput. Sci. ›› 2021, Vol. 15 ›› Issue (4) : 154328. DOI: 10.1007/s11704-020-9322-7
RESEARCH ARTICLE

Abstract

Richly formatted documents, such as financial disclosures, scientific articles, and government regulations, are widespread on the Web. However, since most of these documents are published only for reading, their styling information is usually missing, which makes them difficult or even burdensome to display and edit across formats and platforms. In this study, we formulate document styling restoration as an optimization problem: identify the styling settings of the document elements (e.g., lines, table cells, text) so that rendering with the output settings yields a document in which each element holds exactly, or very nearly, the same position as in the original document. Since each styling setting is a decision, the problem can be transformed into a multi-step decision-making task over all document elements and then solved by reinforcement learning. Specifically, Monte-Carlo Tree Search (MCTS) is leveraged to explore different styling settings, and the policy function is learned under the supervision of delayed rewards. As a case study, we restore the styling information of tables, where structural and functional data in documents are usually presented. Experiments show that our best reinforcement method successfully restores the styling in 87.65% of the tables, a 25.75% absolute improvement over the greedy method. We also discuss the tradeoff between inference time and restoration success rate, and argue that although the reinforcement methods cannot be used in real-time scenarios, they are suitable for offline tasks with high quality requirements. Finally, the model has been applied in a PDF parser to support cross-format display.
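
The formulation in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the single-row layout, the names `TEXT_WIDTHS`, `PADDING_CHOICES`, `TARGET_LEFT_EDGES`, `render`, `reward`, and `restore`, and the padding-only styling model. It also replaces MCTS and the learned policy with plain exhaustive enumeration, which is only feasible because the toy search space is tiny. What it does show is the shape of the problem: one styling decision per element, and a reward that is available only after the whole document has been rendered.

```python
from itertools import product

# Hypothetical toy setup: one table row whose cells are laid out left to right.
# Each cell has a fixed text width; the unknown styling setting is its padding.
TEXT_WIDTHS = [30, 50, 20]
PADDING_CHOICES = [0, 4, 8]          # candidate styling settings per cell
TARGET_LEFT_EDGES = [4, 46, 108]     # text positions observed in the original

def render(paddings):
    """Return the left edge of each cell's text for the given paddings."""
    edges, cursor = [], 0
    for width, pad in zip(TEXT_WIDTHS, paddings):
        edges.append(cursor + pad)   # text starts after the left padding
        cursor += width + 2 * pad    # padding is applied on both sides
    return edges

def reward(paddings, tolerance=1):
    """Delayed reward: 1 only if every element lands close to its target."""
    rendered = render(paddings)
    matched = all(abs(r - t) <= tolerance
                  for r, t in zip(rendered, TARGET_LEFT_EDGES))
    return 1 if matched else 0

def restore():
    """Enumerate every decision sequence (a stand-in for guided search)."""
    for paddings in product(PADDING_CHOICES, repeat=len(TEXT_WIDTHS)):
        if reward(paddings) == 1:
            return list(paddings)
    return None

print(restore())  # [4, 8, 4]
```

Enumeration grows exponentially with the number of elements, which is the reason a guided search such as MCTS is attractive for real documents.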

Keywords

styling restoration / monte-carlo tree search / reinforcement learning / richly formatted documents / tables

Cite this article

Hongwei LI, Yingpeng HU, Yixuan CAO, Ganbin ZHOU, Ping LUO. Rich-text document styling restoration via reinforcement learning. Front. Comput. Sci., 2021, 15(4): 154328 https://doi.org/10.1007/s11704-020-9322-7

RIGHTS & PERMISSIONS

2021 Higher Education Press