Artificial neural networks applied to the resolution of regression and classification multivariate analysis problems in the agricultural and the industrial fields

  1. MARTÍNEZ MARTÍNEZ, VÍCTOR
Supervisors:
  1. Jaime Gómez Gil (Supervisor)
  2. Javier M. Aguiar Pérez (Supervisor)

University of defense: Universidad de Valladolid

Defense date: 11 March 2016

Examination committee:
  1. Gonzalo Pajares (Chair)
  2. José Fernando Díez Higuera (Secretary)
  3. Evandro de Castro Melo (Member)
  4. Alonso Alonso Alonso (Member)
  5. Carlos Baladrón Zorita (Member)

Type: Thesis

Abstract

Artificial Neural Networks (ANNs) are processing models inspired by the human brain that have been widely employed to solve regression and classification problems because of their input-output mapping capability. Agriculture and industry are two fields in which many processes affected by multiple variables need to be modelled, and this mapping capability makes ANNs a good alternative for building such models. The main objective of this thesis is to design, implement, and evaluate specific ANN-based models for agricultural and industrial applications. These models are designed by considering a set of variables larger than or different from those previously considered in the scientific literature, by minimizing the prior knowledge about the application needed to design them, and by optimizing their performance in terms of processing time and computation needed. The methodology employed to achieve these objectives is composed of five stages: literature review, hypothesis formulation, development and evaluation, result analysis, and result dissemination. The literature review served to learn about the ANN-based modelling techniques already implemented and about the application to be modelled. After the literature review, a research hypothesis was formulated as the basis of the research done on each application. Then, in the development and evaluation stage, a scenario to develop and test the proposed ANN-based models was defined and implemented. In the result analysis stage, the results obtained during testing were compared with those obtained by other methods proposed in the scientific literature. This comparison determined whether the research hypothesis was valid or whether it needed to be redefined or rejected. 
Finally, if the hypothesis needed to be redefined or rejected, the methodology went back to the hypothesis formulation stage; if the hypothesis was valid, the obtained results were disseminated. The proposed methodology was applied to five agricultural and industrial applications involving different types of regression and classification problems: the tobacco drying process, the switchgrass (Panicum virgatum) drying process, the predictive maintenance of a machine, the evaluation of steel pieces in a production line, and the early detection of plant diseases. An ANN model of the tobacco drying process was proposed to estimate and predict the temperature and relative humidity inside the tobacco mass using temperature and relative humidity data from measurement points placed outside the tobacco mass (first article of the compendium). The switchgrass drying process was modelled to predict the future moisture content of the switchgrass while it was being dried in the field under variable weather conditions (second article of the compendium). A vibration model of a machine was developed for use in a predictive maintenance application, estimating the state of the rotary components of the monitored machine as a function of a vibration signal acquired at a single point of the machine structure (third article of the compendium). A non-destructive testing (NDT) model was proposed to evaluate steel pieces as a function of the impedances measured when applying eddy currents at different frequencies (fourth article of the compendium). The early detection of plant diseases was addressed by means of a reflectance-based disease model, which estimated the severity of a plant disease as a function of reflectance data acquired at earlier stages of the plant's development (fifth article of the compendium). 
The experiments performed with each of the five models developed in this thesis produced findings that can be grouped into three categories. The first is that most of the proposed models were developed with a set of variables larger than or different from those commonly considered in the scientific literature. The second is that the proposed models were developed without relying on prior knowledge about the process being modelled. This characteristic makes the proposed models more general and easier to adapt to other similar processes than other models in the scientific literature. The third focuses on the optimization of the models, minimizing the processing capability or the training time needed to adjust or execute them. This characteristic allows the developed models to be used in real applications with low-latency or low-computational-load requirements. The results obtained in this thesis suggest that ANN-based models are capable of solving regression and classification problems in the agricultural and industrial fields. Three main conclusions can be drawn from these results. The first is that models proposed in the scientific literature can be improved by considering a set of variables larger than or different from those proposed by other authors: for example, the tobacco drying model considers a measurement point inside the product to be dried, the switchgrass drying model employs information about rain events, the vibration model of a machine uses information from a measurement point placed far away from the rotary components considered, and the NDT model employs impedance data at different frequencies. 
The second conclusion is that the generalization capability of ANNs allows researchers to develop ANN-based models that do not include prior knowledge about the application to be modelled: for example, the tobacco drying model does not consider information about the control algorithm, neither the tobacco nor the switchgrass drying model employs any information about the product to be dried, the vibration model of a machine does not use any information about the rotary components monitored, and the models for the NDT of steel pieces and for the early detection of plant diseases can be implemented with acquisition systems whose characteristics differ from those used in the experiments of this thesis. The third conclusion is that the processing time and the computation needed depend on the way the ANN is configured and optimized. For that reason, the models proposed in this thesis were optimized as follows: the tobacco and switchgrass drying models and the early detection of plant diseases model were designed for near real-time applications by employing features with low processing requirements as inputs, the vibration model of a machine and the NDT model do not implement a feature extraction stage so that they can be employed in real-time applications, and the vibration model of a machine uses a genetic algorithm to improve the performance of the model and to reduce the time needed to train the ANN.