METHODS OF EXTRACTING CYBERSECURITY OBJECTS FROM ELECTRONIC SOURCES USING ARTIFICIAL INTELLIGENCE
DOI:
https://doi.org/10.17721/ISTS.2024.8.34-41
Keywords:
cyber warfare, cybersecurity, generative artificial intelligence, large language models, Internet, open electronic sources, social networks, text analysis, objects of cyber warfare
Abstract
Background. The rapid development of information technology (IT) has created new threats and challenges in cybersecurity. Cyber warfare has become a real problem for states, organizations, and individual users of cyberspace. Ukraine is taking a number of measures to develop a system of cyber actions in cyberspace, comprising interconnected subsystems of cyber intelligence, cyber defense, cyber influence, and cyber counterintelligence. One form of cyber intelligence is open-source intelligence (OSINT), which is used to search for and obtain intelligence information, including identifying and analyzing cybersecurity objects in order to predict possible cyber threats and their consequences. This requires effective methods for detecting and analyzing cybersecurity objects by extracting factual data about them from large volumes of unstructured text.
Methods. The paper investigates artificial intelligence technologies, in particular large language models (LLMs) and generative artificial intelligence (GenAI), as applied to computer-based intelligence gathering on cybersecurity objects from open electronic sources and social networks.
Results. To support effective analysis of information extraction results, the study proposes a methodology that extracts named entities (the names of hacker groups) and their contextual connections from messages in electronic network sources related to cybersecurity, builds networks of their interconnections, and analyzes these networks substantively. To identify the actors involved in cyber warfare, the author also proposes a methodology for analyzing selected documents available in electronic sources on the Internet and in social networks. Both methodologies rely on artificial intelligence.
Conclusions. The results of the study demonstrate the effectiveness of the proposed approaches and the possibility of applying them in practice to cybersecurity problems. The proposed methods can serve cybersecurity professionals as a tool for developing effective strategies to protect against cyber threats.
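The network-building step described in the results can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not specify the model, prompts, or network construction, so the LLM extraction step is replaced here by a simple watch-list match, and edges are assumed to be plain co-occurrence counts within one message. The group names, the sample messages, and the functions `extract_groups` and `build_cooccurrence_network` are all invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical watch list of invented group names. In the paper the names
# are extracted with an LLM; this list only stands in for that step.
KNOWN_GROUPS = ["AlphaCrew", "BetaTeam", "GammaUnit"]


def extract_groups(message, known_groups=KNOWN_GROUPS):
    """Stand-in extractor: case-insensitive substring match against the list."""
    lowered = message.lower()
    return sorted({g for g in known_groups if g.lower() in lowered})


def build_cooccurrence_network(messages):
    """Count how often each pair of groups is named in the same message."""
    edges = Counter()
    for message in messages:
        groups = extract_groups(message)
        # Sorted names give each undirected pair one canonical key.
        for pair in combinations(groups, 2):
            edges[pair] += 1
    return edges


# Invented sample messages, used only to exercise the functions.
messages = [
    "Researchers link the outage to BetaTeam and AlphaCrew.",
    "New report: AlphaCrew and BetaTeam share infrastructure.",
    "GammaUnit was named alongside AlphaCrew in the advisory.",
    "No attribution has been made so far.",
]

network = build_cooccurrence_network(messages)
for (first, second), weight in sorted(network.items()):
    print(first, second, weight)
```

Edge weights here count shared mentions only. An analysis of contextual connections, as the abstract describes, would need a richer relation-extraction step than co-occurrence.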
