Hybrid self-attention NEAT: a novel evolutionary self-attention approach to improve the NEAT algorithm in high dimensional inputs

  • Original Paper
  • Evolving Systems

Abstract

This article presents “Hybrid Self-Attention NEAT”, a method that improves the original NeuroEvolution of Augmenting Topologies (NEAT) algorithm on high-dimensional inputs. Although NEAT has achieved significant results on a range of challenging tasks, it cannot create a well-tuned network when the input representation is high-dimensional. We overcome this limitation by using self-attention as an indirect encoding method that selects the most important parts of the input. To tune the hyper-parameters of the self-attention module, we use the CMA-ES evolutionary algorithm. We also present an innovative method called Seesaw, which evolves the populations of the NEAT and CMA-ES algorithms simultaneously. In addition to NEAT's evolutionary operators for updating the weights, we use a combination method to reach better-fitting weights. We tested our model on a variety of Atari games. The results show that, compared with state-of-the-art evolutionary algorithms, Hybrid Self-Attention NEAT removes this restriction of the original NEAT and achieves comparable scores from raw pixel input while using far fewer parameters (e.g., approximately 300× fewer than HyperNEAT).
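To make the idea of "selecting the most important parts of the input" concrete, the following is a minimal sketch of one common way self-attention can rank image patches. It is not taken from the article. The patch size, the stride, the projection matrices `w_query` and `w_key`, the value of `k`, and all function names are illustrative assumptions. In a setup like the one the abstract describes, the attention parameters would be supplied by an outer optimizer such as CMA-ES, and the selected patch positions would form the small input vector given to the evolved network. Here the parameters are simply random.

```python
import math
import random


def extract_patches(image, size, stride):
    """Slide a size x size window over a 2-D grayscale image (list of rows).

    Returns the flattened patches and the (row, col) centre of each patch.
    """
    height, width = len(image), len(image[0])
    patches, centers = [], []
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            patch = [image[top + r][left + c]
                     for r in range(size) for c in range(size)]
            patches.append(patch)
            centers.append((top + size / 2, left + size / 2))
    return patches, centers


def project(rows, weights):
    """Multiply each row vector by a (len(row) x d) weight matrix."""
    return [[sum(x * w for x, w in zip(row, col)) for col in zip(*weights)]
            for row in rows]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_patches(patches, w_query, w_key, k):
    """Rank patches by the attention they receive; return the k best indices."""
    queries = project(patches, w_query)
    keys = project(patches, w_key)
    scale = math.sqrt(len(queries[0]))
    # Row i holds how strongly patch i attends to every patch j.
    attention = [softmax([sum(q * kk for q, kk in zip(qi, kj)) / scale
                          for kj in keys])
                 for qi in queries]
    # Column sums: the total attention ("votes") each patch receives.
    votes = [sum(row[j] for row in attention) for j in range(len(patches))]
    ranked = sorted(range(len(patches)), key=lambda j: votes[j], reverse=True)
    return ranked[:k]


# Illustrative usage on a random 16 x 16 "frame" (all sizes are assumptions).
random.seed(0)
image = [[random.random() for _ in range(16)] for _ in range(16)]
patches, centers = extract_patches(image, size=4, stride=4)

d = 3  # hypothetical projection dimension
w_query = [[random.gauss(0, 1) for _ in range(d)] for _ in range(16)]
w_key = [[random.gauss(0, 1) for _ in range(d)] for _ in range(16)]

selected = top_k_patches(patches, w_query, w_key, k=5)
# A compact feature vector: the centre coordinates of the selected patches.
features = [coord for j in selected for coord in centers[j]]
```

In this sketch a 256-pixel frame is reduced to 10 numbers, which illustrates why a topology-evolving method could operate on the attention output even when it struggles with raw pixels directly.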

Notes

  1. https://github.com/SamanKhamesian/Hybrid-Self-Attention-NEAT.

  2. https://www.gymlibrary.dev/environments/atari/asteroids/.

  3. https://www.gymlibrary.dev/environments/atari/berzerk/.

  4. https://www.gymlibrary.dev/environments/atari/space_invaders/.

  5. https://www.gymlibrary.dev/environments/atari/seaquest/.

Author information

Corresponding author

Correspondence to Hamed Malek.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Appendix

A. Atari games policy learning experiments

See Tables 3, 4, and 5.

Table 3 Hyper-parameters for the NEAT algorithm
Table 4 Hyper-parameter for the CMA-ES algorithm
Table 5 Hyper-parameter for the Self-Attention part

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Khamesian, S., Malek, H. Hybrid self-attention NEAT: a novel evolutionary self-attention approach to improve the NEAT algorithm in high dimensional inputs. Evolving Systems 15, 489–503 (2024). https://doi.org/10.1007/s12530-023-09510-3

  • DOI: https://doi.org/10.1007/s12530-023-09510-3
