Trabalho de Conclusão de Curso

Undergraduate Thesis

Direção Autônoma de Robôs Móveis por Classificação de Imagens Através de Aprendizado Discriminativo de Redes Soma-Produto

Mobile Robot Self-Driving Through Image Classification Using Discriminative Learning of Sum-Product Networks

Aluno:Student: Renato Lui Geh
Orientador:Advisor: Denis Deratani Mauá

Resumo

O problema de direção autônoma consiste em locomover um veículo de forma autônoma (i.e. sem interação humana direta). Este problema é um desafio de visão computacional que teve grandes avanços recentemente por meio de aprendizado de máquina. Neste TCC, busca-se abordar direção autônoma no domínio de robôs móveis. O objetivo do TCC é guiar um robô móvel de baixo custo utilizando imagens feitas por uma câmera convencional. Seguiremos uma abordagem baseada em aprendizado por imitação [12], onde o problema de controle é posto como um problema de classificação de imagens.

Aprendizado por imitação tem como objetivo fazer um agente (i.e. uma máquina) imitar um comportamento humano em uma certa tarefa. No caso deste TCC, deseja-se que o robô móvel imite o comportamento humano de seguir uma pista. Por meio de imagens coletadas durante o percurso, o robô deve usar classificação de imagem para permanecer dentro da pista de rolagem. Cada imagem é rotulada com uma ação protótipa (e.g. "virar à direita", "seguir reto", "virar à esquerda"). O resultado da classificação é então usada como base para um posterior refinamento das ações para sinais de controle dos motores. Estas imagens utilizadas para o aprendizado do classificador serão obtidas operando o robô manualmente em uma pista e anotando imagens com as ações humanas aplicadas. Para esta tarefa de classificação usaremos redes soma-produto aprendidas de forma discriminativa.

Redes soma-produto (SPN, de Sum-Product Network) são modelos probabilísticos baseados em grafo que representam uma distribuição de probabilidade tratável. Proposta em 2011, SPNs computam inferência exata em tempo linear ao número de arestas de seu grafo se sua estrutura obedecer a certas propriedades [1]. SPNs apresentam uma série de características interessantes, como sua arquitetura profunda que permite representar funções de forma mais eficiente quanto mais profundo seu grafo [2]. Outras interessantes propriedades teóricas incluem uma generalização de SPNs para qualquer semianel em que o produto tenha escopo disjunto [3]. Com relação a aplicações, SPNs tiveram resultados impressionantes em diversas áreas, como enovelamento de proteínas [4], modelagem de sinais [5], classificação e reconstrução de imagens [1,6,7], reconhecimento de atividade [8] e linguagem natural [9].

Após o estudo da teoria e da implementação do algoritmo de aprendizagem, o modelo será aplicado à tarefa de direção autônoma de robôs móveis baseando-se em [11,12,13]. O robô móvel usado será ou o Lego Mindstorms ou um robô de baixo custo construído. No caso do Lego Mindstorms, devido ao baixo poder de processamento e memória limitada, também será usado em conjunto um Raspberry Pi. Uma câmera afixada ao robô enviará imagens ao Raspberry Pi, que tomará uma certa decisão dependendo do resultado da classificação. Apenas o controle de movimento do robô será feito no Lego Mindstorms. O treinamento da SPN será feito num computador separado.

Objetivos

Os objetivos esperados neste trabalho são:

Estudo do algoritmo de aprendizado discriminativo por método de otimização de gradiente.
Implementação do algoritmo de aprendizado discriminativo soft e hard.
Teste do modelo na tarefa de classificação de imagens, comparando com outros algoritmos de aprendizados generativos.
Desenvolvimento de um robô seguidor de pista baseado em classificação de imagens por SPNs.

Metodologia

Uma rede soma-produto pode ser definida como um DAG cujos nós podem ser somas ponderadas, produtos, uma distribuição de probabilidade univariada ou uma variável indicadora. Inferência exata em SPNs é linear no número de arestas. Modelos probabilísticos tradicionais, como redes bayesianas e redes de markov possuem grande expressividade, porém dependem de algoritmos aproximados para computar inferência de maneira tratável, o que implica em tanto aprendizado quanto inferência aproximadas. Robôs móveis, como o Lego Mindstorms, geralmente possuem recursos computacionais limitados, o que significa que decisões tomadas devem ser rápidas. Como a presença de erros nos sensores é comum em robôs móveis, é vantajoso escolher um modelo que captura incerteza nas decisões tomadas, já que desejamos não apenas obter o resultado da classificação, mas também extrair uma estimativa confiável da probabilidade de cada ação.

O trabalho terá seu início com a coleta de imagens a serem usadas como dataset. O conjunto de dados conterá imagens de pistas a serem percorridas. Cada imagem terá uma etiqueta indicando o comando de direção apropriado a situação. Após a extração do dataset, serão feitos testes de baseline com modelos mais simples (e.g. regressão linear, modelo por regras e SVM), implementando-se os algoritmos no robô. Como o robô tem baixo poder de processamento e pouca memória, será usado um Raspberry Pi para computar as decisões dos modelos. Uma câmera afixada no robô enviará as imagens captadas ao Raspberry Pi. Estas imagens serão usadas para decidir quais são os melhores comandos a serem tomados. O mini-computador então enviará os comandos de direção para o processador do robô. Desta forma, o robô apenas cuidará do controle da direção. Estas tarefas serão feitas em conjunto com a disciplina MAC0318 - Introdução a Programação de Robôs Móveis.

Após o final das tarefas de coleta de imagens e de teste de baseline, será estudado o artigo [10]. A implementação do código será livre e aberto, e fará parte da biblioteca de inferência e aprendizado de SPNs GoSPN. Esta biblioteca foi desenvolvida anteriormente como parte de um projeto de iniciação científica, e contém uma base para construções de SPNs, funções de inferência e algoritmos de aprendizado de SPNs generativas. O código será escrito na linguagem Go. Para este TCC, usaremos SPNs aprendidas de forma discriminativa. Aprendizado discriminativo busca otimizar a probabilidade \(P(\mathbf{Y}|\mathbf{X})\), onde \(\mathbf{Y}\) em nossa aplicação é a variável aleatória que representa a ação a ser tomada (e.g. "virar a esquerda", "seguir reto", "virar a direita") e \(\mathbf{X}\) é a imagem a ser classificada. Em contrapartida, modelos generativos buscam aprender a probabilidade conjunta \(P(\mathbf{Y},\mathbf{X})\). Resultados de modelos discriminativos tendem a apresentar melhor desempenho do que generativos em tarefas de classificação. Serão implementadas as versões hard e soft do algoritmo de aprendizado discriminativo por método de otimização de gradiente descrito em [10]. Em seguida, os algoritmos serão comparados com outros algoritmos generativos presentes na biblioteca na tarefa de classificação de imagens.

Com o código implementado e o conjunto de dados coletado, será iniciada a implementação do modelo no robô. Após feita esta integração, serão feitos testes comparando com a baseline e com outras SPNs generativas.

Cronograma

Abaixo segue o cronograma planejado para as atividades propostas para o ano de 2018. Estão assinaladas com ✗ atividades a serem concluídas, e ✓ atividades finalizadas.

Atividade / Mês	Mai	Jun	Jul	Ago	Set	Out	Nov	Dez
a	✓
b		✓
c			✓
d				✓
e					✓
f						✓
g							✓
h							✓
i								✓

Coleta de imagens para o dataset de treino.
Testes de baseline.
Estudo do algoritmo de aprendizado discriminativo de Gens-Domingos [10].
Implementação do algoritmo na biblioteca GoSPN.
Depuração e testes contra outros métodos de aprendizado generativo.
Testes com datasets de classificação de imagens (e.g. Caltech-101, MNIST e Digits).
Implementação do modelo discriminativo no robô.
Avaliação e comparação da SPN discriminativa com baseline e com SPNs generativas.
Finalização do pôster, monografia e apresentação.

Referências

Abstract

The self-driving problem consists of autonomously (i.e. without direct human interaction) driving a vehicle through lanes or other pathways. This problem has proven to be a challenge for computer vision, and has recently had incredible results as a machine learning application. In this undergraduate thesis, we show an application of self-driving in the mobile robot domain by using imitation learning [12], more specifically through image classification. Our objective in this undergraduate thesis is to guide a low-cost mobile robot through a pathway by using images taken from a conventional digital camera.

The objective of imitation learning is to teach an agent (i.e. a machine) to mimic human behavior in a particular task. In our case, we wish to make a mobile robot behave the same way as a human would when driving through a pathway. By using real-time images collected during its route, the robot must use image classification to stay inside the pathway. Each image is then labeled as a certain prototypical action (e.g. "turn right", "straight ahead", "turn left"). The result of the classification is then used as a basis for a later refinement of the actions to engine control signals. The training dataset follows the same label scheme, with images collected by labelling images with human actions. For the task of image classification, we shall use discriminative sum-productt networks.

Sum-product networks (SPNs) are probabilistic graphical models that model a tractable probability distribution. Proposed in 2011, SPNs compute exact inference in linear time in the number of edges of its graph if certain criteria are met [1]. SPNs have shown many interesting properties, such as a higher representability of functions the deeper its architecture whilst maintaining its efficiency [2]. Other interesting theoretical properties include a generalization of SPNs to any semiring with disjoint product [3]. With regards to applications, SPNs have shown impressive results in various tasks, such as protein folding [4], modeling speech [5], image classification and completion [1,6,7], activity recognition [8] and modeling natural language [9].

There are many published generative learning algorithms for SPNs. However, up until now, only two articles on discriminative learning of SPNs have been published so far. Part of this undergrad thesis is to study the seminal article on discriminative SPNs [1]. Once finished, an implementation of both "soft" and "hard" gradient descent versions of the learning algorithm will be written.

Once both theoretical studies and the implementation are complete, the learning algorithm will be applied to the task of autonomous driving of mobile robots. This portion of the work will be based on articles [11,12,13]. We will use the Lego Mindstorms platform or another low-cost built robot as our mobile robot. Due to Lego Mindstorms' low processing power and restricted memory size, a Raspberry Pi will also be used to compute inference. A camera attached to the robot will send images to the Raspberry Pi, which will, in turn, return a certain decision depending on the output of the SPN's classification. Only the actual driving of the robot will be delegated to the Lego Mindstorms. The SPN's training will be done in a separate computer.

Objectives

The expected objectives in this undergraduate thesis are:

To study the discriminative learning algorithm of SPNs by gradient descent.
To implement both the "soft" and "hard" learning algorithms.
To test and validate the model in image classification taks, comparing its performance with other generative learning algorithms.
Development of a lane following mobile robot based on image classification through SPNs.

Methodology

A sum-product network can be defined as a DAG whose nodes are either weighted sums, products, a univariate probability distribution or an indicator variable. Exact inference in SPNs is linear in the number of edges. Classic probabilistic models, such as bayesian networks and markov networks, are very expressive. However, they depend heavily on approximate algorithms to compute inference tractably. Mobile robots, such as the Lego Mindstorms, are usually resource constrained, meaning decision making has to be done fast. The presence of environment error in sensors is also very common in mobile robots, which means a probabilistic model that captures uncertainty in decision making can be very advantageous. Furthermore, we are also interested in extracting a good estimate of the probability of each action.

Work will start with the extraction of images for the dataset. The dataset will contain images of tracks to be driven on. Each image should have a label indicating which drive command would be most appropriate to the situation. Once the dataset is complete, baseline tests will be done with simpler models (e.g. linear regression, rule-based model, SVM) for comparison. Due to the robot's low processing power and limited memory, a Raspberry Pi will be used to compute the model's decisions. A camera attached to the robot will send captured images to the Raspberry Pi. These images will then be used to decide which command should be executed. The mini-computer will then send these commands to the robot's processor. The robot will then only be responsible for the driving controls. These tasks will be done concurrently with the course MAC0318 - Introduction to Mobile Robot Programming.

After the completion of both the dataset extraction and baseline tests, theoretical study on the discriminative learning algorithm [10] will begin. The code implementation will be free and open-sourced, and will be part of the SPN inference and learning framework GoSPN. This library was developed earlier as part of an undergraduate research project, and contains a base for SPN construction, inference functions and other generative learning algorithms for SPNs. The code will be written in Go. For this undergraduate thesis, we shall use discriminative SPNs. Discriminative learning seeks to optimize the probability \(P(\mathbf{Y}|\mathbf{X})\), in contrast to generative models, that attempt to learn the joint probability \(P(\mathbf{Y},\mathbf{X})\), where \(\mathbf{Y}\) in our application is the random variable representing the action to be taken (e.g. "turn left", "straight ahead", "turn right") and \(\mathbf{X}\) is the image to be classified. Generative models, on the other hand, model the joint probability \(P(\mathbf{Y},\mathbf{X})\). Discriminative models tend to achieve better scores compared to generatives. With the implementation done, an image classification performance comparison between the implemented algorithm and other SPN generative learning methods present in the framework will be done.

After the tasks above are completed, we will begin the implementation on the robot. Once this is done, we will compare the model with both the baseline and other generative SPNs.

Schedule

The planned schedule for all proposed tasks for the year 2018 is shown below. Tasks marked with ✗ are ongoing or not yet begun activities. Ones marked with ✓ indicate a concluded task.

Task / Month	May	Jun	Jul	Aug	Sep	Oct	Nov	Dec
a	✓
b		✓
c			✓
d				✓
e					✓
f						✓
g							✓
h							✓
i								✓

Dataset image extraction.
Baseline tests.
Theretical study on Gens-Domingos' discriminative learning algorithm [10].
Code implementation on the GoSPN framework.
Debugging and test comparisons against other generative learning methods.
Performance tests on image classification datasets (e.g. Caltech-101, MNIST and Digits).
Code implementation of the discriminative model on the robot.
Comparison tests between discriminative SPNs, baseline models and other generative SPNs.
Poster, thesis and presentation slides.

References

Hoifung Poon, Pedro Domingos. "Sum-Product Networks: A New Deep Architecture". 2011. Uncertainty in Artificial Intelligence (UAI 2011). Ed. 27.
Olivier Delalleaum, Yoshua Bengio. "Shallow vs. Deep Sum-Product Networks". 2011. Advances in Neural Information Processing Systems (NIPS 2011). Ed. 24.
Abram L. Friesen, Pedro Domingos. "The Sum-Product Theorem: A Foundation for Learning Tractable Models". 2016. International Conference on Machine Learning (ICML 2016). Ed. 33.
Abram L. Friesen, Pedro Domingos. "Recursive Decomposition for Non-convex Optimization". 2015. International Conference on Artificial Intelligence (IJCAI 2015). Ed. 24.
Robert Peharz, Georg Kapeller, Pejman Mowlaee, Franz Pernkopf. "Modeling Speech with Sum-Product Networks: Application to Bandwidth Extension". IEEE International Conference on Acoustics, Speech, and Signal Processing (ICAASSP 2014). Ed. 39.
Robert Gens, Pedro Domingos. "Learning the Structure of Sum-Product Networks". International Conference on Machine Learning (ICML 2013). Ed. 30.
Aaron Dennis, Dan Ventura. "Learning the Architecture of Sum-Product Networks Using Clustering on Variables". Advances in Neural Information Processing Systems (NIPS 2012). Ed. 25.
Mohamed R. Amer, Sinisa Todorovic. "Sum-Product Networks for Activity Recognition". IEEE Transactions on Pattern Recognition and Machine Intelligence (TPAMI 2015).
Wei-Chen Cheng, Stanley Kok, Hoai Vu Pham, Hai Leong Chieu, Kian Ming Chai. "Language Modelling with Sum-Product Networks". Annual Conference of the Speech Communication Association (INTERSPEECH 2014). Ed. 15.
Robert Gens, Pedro Domingos. "Discriminative Learning of Sum-Product Networks". Advances in Neural Information Processing Systems (NIPS 2012). Ed. 25.
Bruno Massoni Sguerra, Fabio G. Cozman. "Image Classification Using Sum-Product Networks for Autonomous Flight of Micro Aerial Vehicles". Brazilian Conference on Intelligent Systems (BRACIS 2016). Ed. 5.
Ahmed Hussein et al. "Imitation Learning: A Survey of Learning Methods". ACM Computing Surveys (CSUR 2017). Volume 50. Issue 2. Article 21.
Mariusz Bojarski et al. "End to End Learning of Self-Driving Cars". Computing Research Repository (CoRR 2016). Volume 1604.07316.