The stochastic Levenberg-Marquardt method applied to the training of artificial neural networks

Benatti, Kléber; Bueno, Luis Felipe Cesar da Rocha [UNIFESP]; Nazaré, Tiago

Files

RelatorioEstendidoResumoSBPO.pdf (472.77 KB)

Date

2020-07-30

Authors

Benatti, Kléber

Bueno, Luis Felipe Cesar da Rocha

Nazaré, Tiago

Type

Article

Abstract

This work presents results from the first author's final course project (TCC) in the Data Science specialization course funded by Itaú-Unibanco. The Levenberg-Marquardt method has shown good results on nonlinear least-squares problems because it achieves convergence comparable to Newton's method while using only first-order information, and because all of its iterates are well defined. A natural application of the technique is therefore the minimization of the cost function associated with training artificial neural networks. However, computing the Jacobian matrix associated with the system can be very expensive when the number of instances is very large, which makes the optimization very slow. This work therefore proposes a stochastic Levenberg-Marquardt-type method for minimizing cost functions associated with neural networks. The performance of the proposed algorithm is compared with the classical Levenberg-Marquardt method and with the Adam method, which is commonly used in this setting.
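The abstract does not spell out the algorithm, so the following is only a minimal sketch of what a stochastic Levenberg-Marquardt-type iteration might look like, not the authors' method. It assumes that "stochastic" means evaluating the residuals and the Jacobian on a randomly sampled mini-batch of instances at each iteration, and it uses a simple halve-or-double damping rule. The function names (`lm_step`, `stochastic_lm`), the batch size, the damping schedule, and the single-neuron toy problem are all hypothetical choices made for illustration.

```python
import numpy as np

def lm_step(residual_fn, jacobian_fn, w, batch, damping):
    """One damped Gauss-Newton (Levenberg-Marquardt) step on a mini-batch."""
    r = residual_fn(w, batch)                  # residuals on the batch, shape (b,)
    J = jacobian_fn(w, batch)                  # Jacobian on the batch, shape (b, n)
    A = J.T @ J + damping * np.eye(w.size)     # damped normal-equations matrix
    return w - np.linalg.solve(A, J.T @ r)

def stochastic_lm(residual_fn, jacobian_fn, w0, n_samples,
                  batch_size=32, damping=1.0, n_iters=200, seed=0):
    """Assumed mini-batch variant: each iteration sees only a sampled subset."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float)
    for _ in range(n_iters):
        batch = rng.choice(n_samples, size=batch_size, replace=False)
        w_trial = lm_step(residual_fn, jacobian_fn, w, batch, damping)
        loss_old = 0.5 * np.sum(residual_fn(w, batch) ** 2)
        loss_new = 0.5 * np.sum(residual_fn(w_trial, batch) ** 2)
        if loss_new < loss_old:
            # Accept the step and move toward Gauss-Newton behaviour.
            w = w_trial
            damping = max(damping / 2.0, 1e-8)
        else:
            # Reject the step and move toward a short gradient-like step.
            damping = min(damping * 2.0, 1e8)
    return w

# Hypothetical toy problem: a single tanh neuron fitted to noise-free data.
x = np.linspace(-2.0, 2.0, 200)
w_true = np.array([1.5, -0.5])
y = np.tanh(w_true[0] * x + w_true[1])

def residuals(w, idx):
    return np.tanh(w[0] * x[idx] + w[1]) - y[idx]

def jacobian(w, idx):
    s = 1.0 - np.tanh(w[0] * x[idx] + w[1]) ** 2   # derivative of tanh
    return np.column_stack((s * x[idx], s))

all_idx = np.arange(x.size)
w_start = np.array([0.5, 0.0])
w_fit = stochastic_lm(residuals, jacobian, w_start, n_samples=x.size)
loss_start = 0.5 * np.sum(residuals(w_start, all_idx) ** 2)
loss_fit = 0.5 * np.sum(residuals(w_fit, all_idx) ** 2)
print(w_fit, loss_start, loss_fit)
```

In this sketch the cost per iteration depends on the batch size rather than on the total number of instances, which is the bottleneck the abstract attributes to the classical method. Setting `batch_size` equal to `n_samples` recovers a full-batch Levenberg-Marquardt iteration under the same damping rule.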