ExpressMath - Handwritten Mathematical Expression Recognition



Project description

    The main goal of this research project is to study, develop, and validate algorithms and methods for the recognition of handwritten mathematical expressions. This is an important and challenging problem in the field of Pattern Recognition. The variety of symbols to be recognized, the variations in writing style, the need to analyze the 2D spatial arrangement of the symbols, the existence of different notations, and intrinsic ambiguities, among other issues, make this a nontrivial problem.

    An efficient recognition system would be useful in several situations: it would simplify the input of mathematical notation into computer systems, it would allow efficient digitization of handwritten documents, it could help visually impaired people read mathematical notation, and so on.

    In particular, with the advent of touch-screen-based devices, online recognition of handwriting from digital ink data is currently an active research topic.

    This project has received support from CNPq and FAPESP through research project grants and scholarship grants.


Subprojects

    Graph-grammar-based approach for math expression recognition: an ongoing PhD project by Frank D. Julca-Aguilar, co-supervised by Prof. Christian Viard-Gaudin and Prof. Harold Mouchere (Univ. of Nantes, France).

    References:

    1. ICFHR 2014

    ExpressMatch: a system that helps create ground-truth annotated datasets of handwritten mathematical expressions. The main idea is to build a set of model expressions in which important substructures are annotated with ground-truth data. Users are then invited to transcribe those expressions, and an expression matching method is used to match each symbol in the transcribed expression to the respective symbol in the model expression. Once the matching is computed, all ground-truth data in the model expression can be automatically transferred to the transcribed expression. This method allows rapid generation of a large number of samples of ground-truth annotated handwritten math expressions. Details of the matching method are described in reference [3] below.
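
    The match-then-transfer idea can be illustrated with a minimal sketch. This is not the matching method of the papers cited here; it assumes, purely for illustration, that each symbol is reduced to the center of its bounding box, that the model and the transcription contain the same number of symbols, and that the matching cost is the distance between centers after normalizing each expression to its own bounding box. The function names (normalize, match_symbols, transfer_labels) and the sample coordinates are hypothetical.

```python
from itertools import permutations
from math import hypot


def normalize(centers):
    """Rescale symbol centers into the unit square of the expression's bounding box."""
    xs = [x for x, _ in centers]
    ys = [y for _, y in centers]
    width = (max(xs) - min(xs)) or 1.0   # avoid division by zero
    height = (max(ys) - min(ys)) or 1.0
    return [((x - min(xs)) / width, (y - min(ys)) / height) for x, y in centers]


def match_symbols(model_centers, input_centers):
    """Return a tuple p such that input symbol i is matched to model symbol p[i].

    Assumption: a brute-force minimum-cost assignment over all permutations,
    which is only practical for very short expressions.
    """
    m = normalize(model_centers)
    t = normalize(input_centers)

    def cost(perm):
        return sum(hypot(t[i][0] - m[j][0], t[i][1] - m[j][1])
                   for i, j in enumerate(perm))

    return min(permutations(range(len(m))), key=cost)


def transfer_labels(model_labels, assignment):
    """Copy each model symbol's ground-truth label to its matched input symbol."""
    return [model_labels[j] for j in assignment]


# Hypothetical model expression "x^2 + 1" with annotated symbol labels.
model_centers = [(10, 50), (25, 30), (50, 50), (80, 50)]
model_labels = ["x", "2", "+", "1"]

# Hypothetical transcription at a different scale, written in the order +, x, 1, 2.
input_centers = [(105, 98), (22, 101), (158, 103), (49, 62)]

assignment = match_symbols(model_centers, input_centers)
labels = transfer_labels(model_labels, assignment)
print(assignment, labels)
```

    For longer expressions, the brute-force search would need to be replaced by a polynomial-time assignment solver (for example scipy.optimize.linear_sum_assignment), and a realistic cost would likely also compare symbol shapes rather than positions alone.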

    The matching method and the ExpressMatch system, which implements the data collection and annotation processes, are described in the first two references below. Both the code and the generated dataset are available at http://code.google.com/p/express-match/

    References:

    1. Hirata, Nina S. T. ; Honda, W. Y. Automatic Labeling of Handwritten Mathematical Symbols via Expression Matching. In: 8th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition. Lecture Notes in Computer Science -- Graph-Based Representations in Pattern Recognition, 2011. v. 6658. p. 295-304.

    2. F. D. J. Aguilar and N. S. T. Hirata, ExpressMatch: A System for Creating Ground-Truthed Datasets of Online Mathematical Expressions, 10th IAPR International Workshop on Document Analysis Systems (DAS 2012), pp.155-159, 2012.

    3. N. S. T. Hirata and F. D. J. Aguilar, Matching based ground-truth annotation for online handwritten mathematical expressions, Pattern Recognition, in press.

Brief development log

People
  • Nina S. T. Hirata (coordinator)
  • Current: Frank (PhD candidate), Davi (MSc candidate)
  • Former: Marcelo (MSc 2014), Breno and Ricardo; Ricardo Sider (CNPq/IC scholarship holder); Cristiano Perez Garcia (CNPq/ITI scholarship holder, 2008); Fábio Hirano (former CNPq/IC scholarship holder); Ana Paula, Fábio Eiji, Leonardo Ka Wah, Eduardo Komatsu (Math-Picasso team, 2007); Eduardo Gusmão Cáceres Pires, Pedro Henrique Simões de Oliveira, Ricky Ye Lun Chow (SisTREO team, 2006).


(Page created on 06/August/2007)
(Last update on 21/November/2014)