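Contributions of Machine Learning to the Detection, Tracking and Recognition of People in Service Robots (Aportaciones del aprendizaje automático a la detección, seguimiento y reconocimiento de personas en robots de servicio)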
- Álvarez Aparicio, Claudia
- Vicente Matellán Olivera (Supervisor)
- Ángel Manuel Guerrero Higueras (Supervisor)
- Francisco Javier Rodríguez Lera (Supervisor)
University of defense: Universidad de León
Date of defense: 22 December 2022
- Roberto Iglesias Rodríguez (Chair)
- Lidia Sánchez González (Secretary)
- Luis Jesús Manso Fernández-Argüelles (Committee member)
Type: Thesis
Abstract
Social robots aim to interact with people in all kinds of environments. This interaction can occur in different scenarios, from the robot providing information to solving a specific task. Assistive robots are a subset of social robots whose purpose is to help people in restaurants, homes, hospitals, etc. The tasks faced by assistive robots add to the complexity of general robotics the need to interact with humans, who expect the robot's performance to be similar to that of a human in a domestic or customer service environment. In addition, these robots must operate autonomously, i.e. they must be able to make their own decisions in the environment in which they are deployed. One of the classic problems in service robotics is "navigation", i.e. the ability to move through the environment autonomously and without damaging objects or people in the robot's path. Navigation solutions commonly treat people and objects in the environment in the same way, which is not appropriate for assistive robots. The specific problem of autonomous robot navigation in human environments is called "social navigation", which is the problem addressed in this thesis. Social navigation is not only about computing a trajectory and avoiding collisions with people or objects along it; it is also about the robot's ability to approach a person, walk next to them, follow them, etc. These capabilities are closely related to three essential skills: detection, tracking and recognition of people.
The objective of this PhD can be summarised as the development of methods for building a human detection, tracking and recognition pipeline that can be integrated into social navigation systems, promoting human-robot interaction and facilitating the acceptance of these robots in all kinds of environments. To this end, a series of methods have been proposed and evaluated, and integrated into two systems. First, People Tracking (PeTra) enables the detection and tracking of people in the vicinity of the robot. Second, Biometric RecognITion Through gAit aNalYsis (BRITTANY) enables the recognition of people through gait analysis. Both systems rely solely on the information contained in occupancy maps, which can be provided by different sensors.
PeTra is based on a segmentation Convolutional Neural Network (CNN) that processes an occupancy map to determine which of its points belong to a person. By post-processing this output, people can be tracked. Two tracking approaches are considered: the calculation of Euclidean distances and the use of Kalman filters. PeTra has been compared with Leg Detector (LD), the default solution available in Robot Operating System (ROS), which relies on Random Trees to determine the location of people. The final system reported better detection results than LD and, in terms of tracking, the Kalman filter approach reported better results than the implementation based on Euclidean distances.
BRITTANY is based on a classification CNN that processes an aggregated occupancy map to determine which user is walking in front of the robot. The aggregated occupancy map is created by concatenating several occupancy maps that contain only those sensor points that are part of a person; concatenating these maps represents the person's walking action. BRITTANY proposes a new CNN architecture, which has been compared with two well-known classification architectures, LeNet and AlexNet. The final system developed is robust even to users outside the system.
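The Euclidean-distance tracking approach mentioned for PeTra can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes that each detected person has already been reduced to a 2D position in metres, and it uses a simple greedy nearest-neighbour match. The function name `associate`, the `max_dist` gate and its 0.5 m default are hypothetical choices made only for this illustration.

```python
import math


def associate(tracks, detections, max_dist=0.5):
    """Greedy nearest-neighbour association by Euclidean distance.

    tracks:     dict mapping a track id to its last known (x, y) position
    detections: list of (x, y) positions detected in the current frame
    max_dist:   assumed gating distance in metres (illustrative value)

    Returns a new dict of track id -> (x, y). Tracks with no detection
    within max_dist are dropped, which is a simplification of this sketch.
    """
    updated = {}
    unused = list(detections)

    for track_id, position in tracks.items():
        if not unused:
            break
        # Pick the closest detection that has not been assigned yet.
        nearest = min(unused, key=lambda d: math.dist(position, d))
        if math.dist(position, nearest) <= max_dist:
            updated[track_id] = nearest
            unused.remove(nearest)

    # Every leftover detection starts a new track with a fresh id.
    next_id = max(tracks, default=-1) + 1
    for detection in unused:
        updated[next_id] = detection
        next_id += 1

    return updated


# Usage: two known people move slightly and a third person appears.
previous = {0: (1.0, 0.0), 1: (2.0, 1.0)}
current = associate(previous, [(2.1, 1.0), (1.1, 0.1), (4.0, 4.0)])
```

A purely distance-based match of this kind has no motion model, so it can swap identities when two people cross paths. A Kalman filter predicts where each person should be in the next frame before matching, which is one plausible reason for the better tracking results the abstract reports for that approach.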