Abstract
Infodemiology is the extraction and analysis of public-health data compiled from the Internet. Among other applications, it can be used to analyse trends on social networks in order to determine the prevalence of infectious disease outbreaks in specific regions. These data provide a better understanding of how infectious diseases spread, as well as insight into how citizens perceive the strategies carried out by public healthcare institutions. In this work, we apply Natural Language Processing techniques to determine the impact of outbreaks of infectious diseases such as Zika, Dengue or Chikungunya, using a compiled dataset of tweets written in Spanish.
Acknowledgements
This work has been supported by the Spanish National Research Agency (AEI) and the European Regional Development Fund (FEDER/ERDF) through projects KBS4FIA (TIN2016-76323-R) and LaTe4PSP (PID2019-107652RB-I00). In addition, José Antonio García-Díaz has been supported by Banco Santander and the University of Murcia through the Doctorado industrial programme.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Apolinario-Arzube, Ó., García-Díaz, J.A., Luna-Aveiga, H., Medina-Moreira, J., Valencia-García, R. (2020). Knowledge Extraction from Twitter Towards Infectious Diseases in Spanish. In: Valencia-García, R., Alcaraz-Marmol, G., Del Cioppo-Morstadt, J., Vera-Lucio, N., Bucaram-Leverone, M. (eds) Technologies and Innovation. CITI 2020. Communications in Computer and Information Science, vol 1309. Springer, Cham. https://doi.org/10.1007/978-3-030-62015-8_4
DOI: https://doi.org/10.1007/978-3-030-62015-8_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62014-1
Online ISBN: 978-3-030-62015-8
eBook Packages: Computer Science (R0)