International Journal of Information Security: A Bibliometric Study, 2007-2023

This study employs various bibliometric analysis techniques to examine the intellectual structure of the International Journal of Information Security from 2007 to 2023. The aim is to identify the most cited journals, underlying research themes within the article corpus, and gradual changes in the research themes over time. “Lecture Notes on Computer Science” is the most referenced knowledge source. In addition, articles on the theme of encryption techniques received the highest average citations, followed by key management and authentication protocols. Moreover, machine learning, blockchain, and the Internet of Things are emerging topics of interest among authors publishing in the International Journal of Information Security. Qualitative and quantitative comparisons between open-access and regular articles suggested a few notable differences in author keywords but no differences in the number of citations received. Furthermore, regression analysis found that citation counts are negatively correlated with the length of the article abstract and the article title and positively correlated with page count, publication in a special issue, and at least one author having an affiliation different from the others. Finally, we also identified prominent authors, articles, institutions, and countries publishing in this journal.


Introduction
Information security academic research has progressed and matured over time. Researchers often reflect on a research outlet or a scholarly discipline to explore the underlying research themes and the evolution of accumulated knowledge. In the literature review section, Table I summarizes the intellectual structure of notable journals investigated in the past. Although no previous study has targeted a specific information security journal, similar studies were conducted to examine the intellectual structure of information security as a discipline [1]-[3]. For instance, one such study characterized the development of information security in different waves: technical, management, institutional, information security governance, and, more recently, cybersecurity [4], [5]. In this regard, information security is often distinguished from cybersecurity, with the latter involving the protection of information resources and assets as well as various human factors [6]. Information security is "the preservation of the confidentiality, integrity, and availability of information" [6]. Cybersecurity, on the other hand, is defined as "the collection of tools, policies, security concepts, security safeguards, guidelines, risk management approaches, actions, training, best practices, assurance, and technologies that can be used to protect the cyber environment and organization and user's assets" [6].
The International Journal of Information Security (IJIS) was first published in August 2001 and has been publishing technical work in information security since. Based on its aims and scope statement, it covers such topics as network security, firewalls, mobile security, access control, applied cryptography, and intrusion detection. In other words, the focus is more on the technical aspects of information security than the human or organizational aspects. As mentioned, human factors in information security research usually fall under cybersecurity and behavioral information security. Therefore, we believe that the research themes extracted from the article corpus will map to the journal's research scope.
This study explores the scholarly structure and evolution of information security research published in the International Journal of Information Security in the past fifteen years (2007-2022 and half of 2023). The goal is to answer the following research questions:
1. What are the prominent references or cited journals? Are there any citation patterns and trends?
2. What are the broad underlying research themes for articles published in this journal?
3. What are the most cited research themes in the article corpus?
4. How have the research themes evolved over the years?
5. Are there differences between open-access and regular articles regarding citations received and author keywords?
6. What article attributes determine their citation counts?
7. Who are the influential (most cited and productive) authors, articles, institutions, and countries published in this journal?
Articles published in the International Journal of Information Security from 2007 to 2023 and indexed by Clarivate's Web of Science core collection citation index comprised the articles analyzed in this study. To address the research questions, we used the bibliometric analysis techniques of citation analysis, bibliographic coupling, term frequency counts, co-word network analysis, and negative binomial regression analysis. We aim to investigate the current state of information security research published in IJIS. Given the attention lavished on information security within organizations, this endeavor can provide a foundation for future research and improved practices in the real world.
The organization of the article is as follows. The next section summarizes the relevant literature and briefly overviews our research methods. Subsequently, we explain our data collection process before outlining our findings. The discussion of the study's underlying research themes and conclusions comprise the last two sections, respectively.
Investigating the performance of the research constituents (authors, articles, journals, institutions, and countries) through frequency or citation counts is one of the prominent research questions in most studies. Analyzing the co-authorship network, citation patterns, author keywords using co-word analysis, and co-citation (how documents, authors, or journals are cited together) networks are other vital techniques to explore the intellectual structure of a journal or a research discipline. For identification of the underlying topics, researchers have relied upon qualitative methods and, more recently, topic modeling techniques, such as probabilistic topic models [34] or structural topic models [35].
The research questions in a typical bibliometric study involve identifying underlying research themes or topics, their evolution and impact over a period, identification of knowledge sources or reference disciplines, the performance of different constituents, and exploration of the scientific landscape. In the current study context, the following research questions are addressed. Prominent cited journals and conferences, their citation patterns, and trends (research question 1) are investigated using citation analysis. Underlying research themes are identified (research question 2) using bibliographic coupling and their respective citations (research question 3) using citation analysis. Author keyword analysis and co-word analysis on terms from article titles are used to explore the evolution of research themes (research question 4). Open access and regular articles are compared (research question 5) using author keywords and citation analysis. The statistical significance of specific article attributes' contribution to citation counts (research question 6) is determined using negative binomial regression analysis. Finally, performance analysis of research constituents (research question 7) is carried out through citation analysis and frequency counts.

Research Methodology - Bibliometric Analysis
Alan Pritchard coined the term bibliometrics in an article published in the Journal of Documentation in 1969 [36]. Pritchard defined it as applying mathematical and statistical methods to investigate the process of written communication and understand the nature of a discipline and its development by counting and analyzing different facets of such communication [37]. Over time, many definitions of bibliometrics have been proposed [38]. Indeed, bibliometrics is often confused with scientometrics, which studies science's growth, structure, interrelationships, and productivity, and informetrics, which deals with studying quantitative aspects of information in any form [39], [40]. For this study, we used the working definition of bibliometrics, which involves applying quantitative techniques, such as citation analysis, bibliographic coupling, co-citation analysis, co-word analysis, etc., on bibliometric data to explore the scientific landscape of a journal or a discipline, where the unit of analysis can be authors, articles, journals, institutions, or countries of author affiliations [41].
As Table I suggests, previous bibliometric studies on journals have used one or more bibliometric techniques to answer research questions about the intellectual structure of a journal. This study employs citation analysis, bibliographic coupling, author co-word analysis, and performance analysis. Citation analysis involves counting citations from other journals and analyzing their statistical distributions [42]. Next, articles are bibliographically coupled when they have one or more references (cited papers) in common [43]. In other words, bibliographically connected articles cite the same article(s), and the similarity between the original documents (and the resulting article clusters) is determined based on the commonly cited articles [44]. Third, co-word or word co-occurrence analysis involves a statistical examination of two or more terms used in a single document and often involves analyzing co-word network visualizations [45]. Finally, most studies in Table I carried out performance analysis and science mapping. The former comprises the identification of the most influential authors, articles, institutions, and countries in terms of the number of publications or citation counts. In contrast, science mapping focuses on relationships between these research constituents, primarily as network visualizations [41].
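As an illustration of bibliographic coupling as described above, the following minimal sketch counts the cited references that each pair of articles has in common. The article and reference identifiers are hypothetical toy data, and the function name `coupling_strengths` is our own; this is not the tooling used in the study.

```python
from itertools import combinations


def coupling_strengths(references):
    """Return {(a, b): n_shared} for each article pair sharing at least one cited reference."""
    strengths = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(references[a] & references[b])
        if shared > 0:
            strengths[(a, b)] = shared
    return strengths


# Hypothetical toy corpus: article id -> set of cited reference ids.
toy_references = {
    "A": {"r1", "r2", "r3"},
    "B": {"r2", "r3", "r4"},
    "C": {"r5"},
}

coupling_links = coupling_strengths(toy_references)
print(coupling_links)
```

In this toy corpus only articles A and B are coupled, because they share two references; article C shares none and stays disconnected.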

Data Collection
The first issue of the International Journal of Information Security was published in August 2001; however, it did not start publishing six issues per year regularly until 2007. Also, the journal was indexed in the Web of Science's citation index database that same year. Therefore, articles published in the IJIS and indexed in the Web of Science citation index database from 2007 to 2023 form our article corpus. The initial search resulted in 686 articles. After excluding 23 proceeding papers, 17 editorial materials, and four corrections, the final corpus comprised 642 articles for bibliometric analysis. Figure 1 shows the frequency of articles published over the years. Despite dips in the number of articles published from 2007 to 2008 and 2009 to 2011, there has been a gradual increase since 2011, including a sudden spike after 2021. Note that 2023 is still a year in progress, and there is already a greater number of articles published than in previous years. This indicates the growing importance of information security research and interest among academic researchers.
Analysis and Results

Reference Journals and Conferences
The percentage (as a fraction of total citations for a specific year) of citations received by the top 20 cited references over the years is shown in Table II. For instance, 8.5 at the intersection of 2016 and cited reference Lecture Notes on Computer Science (LNCS) indicates that of the total cited references for 2016, LNCS received 8.5% (or 134 of 1579). We displayed percentages instead of raw citation counts to remove disparity due to differences in the number of publications over different years.
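The normalization described above can be checked with a short calculation. The sketch below uses only the two figures quoted in the text (134 LNCS citations out of 1,579 cited references in 2016); the function name `citation_share` is a hypothetical helper.

```python
def citation_share(count, total):
    """Percentage of a year's cited references that went to one source, to one decimal."""
    return round(100.0 * count / total, 1)


# Figures quoted in the text for LNCS in 2016.
lncs_2016 = citation_share(134, 1579)
print(lncs_2016)
```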

Underlying Research Themes
The underlying research themes were identified based on the technique of bibliographic coupling.

Cluster 12: Authentication on TLS/SSL protected web, collision-resistant hashing, broadcast encryption, dynamic reversed accumulator, double authentication preventing signatures, multicast stream authentication. Research theme: Authentication techniques. Raw citation count: 36. Number of documents in this cluster: 6. Average citation count: 6.
Except for cluster 10, which includes articles based on passwords, graphical passwords, and user authentication, article clusters overlap. This overlap is visually discernible from the closeness of nodes of one color to nodes of other cluster(s) or color(s). Although this is undoubtedly the case in many instances, there is a clear distinction between the cluster groups regarding the underlying research themes, as evidenced by Table III. The presence of articles on authentication and encryption techniques across different groups is likely the reason for these overlaps. On further exploration and investigation of the clusters, articles on encryption and authentication and their applications to biometrics, graphical passwords, and digital signatures are significant constituents of clusters 3, 4, 7, 9, 11, and 12. These clusters also form the middle core of the bibliographic network and suggest the importance of encryption and authentication to the information security field in general and authors publishing in IJIS in particular.
The remaining clusters represent the underlying research themes and the interests of authors publishing in IJIS. For instance, cluster 1 is based on research on attacks, vulnerabilities, and intrusion detection systems, which historically has been an important research area for information security researchers. Within cluster 2, access control, security policies, identity management, and trust are essential topics for organizational security researchers. Another area of interest is applying cipher techniques to mobile and sensor networks (cluster 5). Malware analysis and detection techniques, which form another area of research (cluster 6), recently gained attention not only with their applications to IoT, smart homes, Android, and web environments but also due to the emergence of machine learning, deep learning, and blockchain techniques for malware analysis and detection. Finally, the application of privacy-preserving techniques and algorithms, such as anonymization, in communication and health networks, the web, and social media are recent topics of interest (cluster 8).

Evolution of Research Themes
The evolution of research themes for the International Journal of Information Security can be explored through text analysis techniques. One can analyze the usage of author keywords or perform co-word analysis on article titles or abstracts. For this study, we opted to use both methods.
The article corpus consists of 3,093 author keywords (an average of 4.82 per article), of which 2,194 are unique. Privacy, security, and machine learning are the top three, followed by cloud computing and intrusion detection. The frequency distribution of top author keywords (minimum frequency count of 11 10) over the years (Table IV) provides a glimpse into the evolution of authors' interests. In the co-word network, each node represents a term. The node color corresponds to the average publishing year for the respective period, and the link between words corresponds to their usage within the same article. The size of the node corresponds to term frequency.
Although there is an apparent overlap of keyword usage across the years, on average, some keywords are used more often than others for specific years. For example, authorization, pairing, grid security, mix, trust, principle, formal analysis, and selection are key terms used within article titles from 2010-2011. Next, access control policy, denial, threshold, control, click, content, mobile phone, and cryptographic protocol were featured in article titles in 2012-2013. Moreover, 2014 saw the usage of such terms as cryptosystem, wireless sensor network, identity, access, security risk, identity management, user, cloud environment, and security model, while 2015 saw the usage of cryptanalysis, verification, authentication, key, signature, context, intrusion detection, and smart metering. In addition, spammer, scheme, signcryption, protocol, service, password, public key encryption, performance analysis, the web, Android platform, and ciphertext were used in 2016. The system, analysis, security analysis, privacy, model, network, prevention system, protection, key encapsulation, and security were used in 2017, and the following year included case study, countermeasure, hardware, keyword search, method, implementation, web application, apple, network intrusion detection, encryption, access control, cloud, architecture, study, framework, deployment, botnet, bitcoin, time, challenge, practical privacy, and k-anonymization.
After that, 2019 featured survey, side channel attack, inconsistency, attribute, identification, performance evaluation, game theoretic approach, malware detection, attack, machine, and secure computation. Finally, 2020 and later saw the usage of such terms as cybersecurity operations center (CSOC), smart home, comparison, machine learning, anonymity, ISO IEC, threat, federated learning, smart grid, recommendation, network intrusion detection system, differential privacy, IoT environment, vulnerability, IoT device, reinforcement, cyber-physical system, and blockchain.
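The co-word network described in this section (node size as term frequency, links as co-occurrence within the same title) can be sketched as follows. This is a simplified illustration on hypothetical titles: it splits titles on whitespace, whereas the study works with extracted terms that may span several words, and the function name `coword_network` is our own.

```python
from collections import Counter
from itertools import combinations


def coword_network(titles):
    """Return (term frequencies, co-occurrence link weights) for a list of titles."""
    term_freq = Counter()
    link_weight = Counter()
    for title in titles:
        # Each distinct term counts once per title; sorting makes pair keys canonical.
        terms = sorted(set(title.lower().split()))
        term_freq.update(terms)
        link_weight.update(combinations(terms, 2))
    return term_freq, link_weight


# Hypothetical article titles, already reduced to key terms.
toy_titles = [
    "malware detection android",
    "intrusion detection cloud",
    "android malware analysis",
]

term_freq, link_weight = coword_network(toy_titles)
print(term_freq.most_common(3))
print(link_weight[("android", "malware")])
```

A minimum-frequency filter such as the one mentioned in the text (terms appearing at least twice) could then be applied by keeping only terms whose count in `term_freq` meets the threshold.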

Open-Access Versus Regular Articles
Although most author keywords are shared among the two sets, there are a few discernible differences. For example, intrusion detection, information security, provable security, and trust are common in regular articles but rarely used in open-access articles.
Along with comparing the two sets based on the average number of citations received and most frequent author keywords, we also performed a quantitative comparison of the two groups based on common author keywords, i.e., those shared across the two groups. First, we identified common author keywords among the two sets and found 181. Since the number of articles within the two sets is unequal (201 open access versus 441 regular articles), we cannot compare the raw counts. Hence, the next step involved calculating, for each common author keyword, its share of the set's total author keyword count. For example, privacy appears 24 times in the open-access articles, and the total number of author keywords for open-access articles is 963; thus, the share for this keyword is 24 divided by 963, or 2.49%. Similarly, for regular articles, the share for privacy is 28 divided by 2,130, or 1.31%. As a final step, we compared these shares for common author keywords for the two sets using a t-test, and the difference in the usage of common author keywords between the two groups is statistically significant (p-value < 0.1).
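The normalization step in the worked example above can be reproduced as follows. The sketch uses only the counts quoted in the text for the keyword privacy; the helper name `keyword_share` is hypothetical, and the subsequent t-test over all 181 common keywords is not reproduced here because the per-keyword counts are not given.

```python
def keyword_share(keyword_count, total_keywords):
    """Share of a set's author keywords taken by one keyword, as a percentage."""
    return round(100.0 * keyword_count / total_keywords, 2)


# Counts quoted in the text for the keyword "privacy".
open_access_privacy = keyword_share(24, 963)
regular_privacy = keyword_share(28, 2130)
print(open_access_privacy, regular_privacy)
```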

Regression Analysis on Citation Counts
We utilized regression analysis to uncover any correlations between citation counts as the dependent variable and article attributes, such as the number of authors, number of author keywords, length of article abstract, length of article title, whether all the authors are affiliated with the same institution or at least one is from a different institution (qualitative or categorical variable: 0 if all authors are from the same institute, otherwise 1), number of cited references, and whether the article is open access or not (qualitative or categorical variable: 0 is open access, otherwise 1), as independent variables. The mean or average value for the citation count variable is 10.72, and the variance is 492.499. Since the citation count variable is over-dispersed, with its variance much larger than the mean, we used a negative binomial regression [8]. The results are in Figure IX. Additionally, we computed the variance inflation factor (VIF) to check for any possible correlation or multicollinearity between the independent variables. The VIF values for all the independent variables were between 1.006 and 1.515, less than the suggested maximum value of 3.33 [50].
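The over-dispersion argument above can be made concrete with a variance-to-mean ratio, which is close to 1 for Poisson-like counts. The sketch below applies it to the reported summary statistics and to a small hypothetical sample; the function name `dispersion_ratio` and the toy counts are illustrative assumptions, not the study's data.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; values well above 1 suggest over-dispersion."""
    return variance(counts) / mean(counts)


# Ratio implied by the summary statistics reported in the text.
reported_ratio = 492.499 / 10.72

# Hypothetical citation counts with one highly cited article.
toy_counts = [0, 1, 2, 3, 50]
toy_ratio = dispersion_ratio(toy_counts)
print(round(reported_ratio, 1), round(toy_ratio, 1))
```

A ratio this far above 1 is the usual motivation for preferring a negative binomial model over a Poisson model for count data.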
The independent variables not correlated with the citation counts for the articles within our corpus are the number of authors, number of author keywords, number of cited references, and whether all the authors are affiliated with the same institution. Also, as with the findings from the t-test comparing the citations received by open-access versus regular articles, negative binomial regression analysis found no statistically significant difference. On the other hand, the length of article abstracts and length of article title were significantly but negatively correlated with citation counts. This means that for each unit increase in length (title or abstract), the expected log count for citations decreases by 0.003 (and 0.054). Independent variables found to be positively correlated with citation counts are the number of pages, at least one of the authors being from a different institution, and publication in a special issue. These findings were not surprising and matched with previous studies [8], [51]-[53].
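Because a negative binomial regression with the standard log link models the log of the expected count, a coefficient b corresponds to multiplying the expected citation count by exp(b) for each unit increase in the variable. The sketch below converts the two reported decreases into approximate percentage changes; the text does not state which value belongs to the title and which to the abstract, so they are labeled neutrally, and the helper name is hypothetical.

```python
import math


def percent_change_per_unit(coefficient):
    """Percentage change in the expected count for a one-unit increase in the predictor."""
    return 100.0 * (math.exp(coefficient) - 1.0)


# The two decreases in expected log count reported in the text.
smaller_effect = percent_change_per_unit(-0.003)
larger_effect = percent_change_per_unit(-0.054)
print(round(smaller_effect, 2), round(larger_effect, 2))
```

Under this reading, the smaller coefficient corresponds to roughly a 0.3% drop in expected citations per unit of length, and the larger one to roughly a 5.3% drop.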

Performance Analysis
The corpus of articles consists of 1,770 unique authors. The frequency counts for the most productive authors (minimum article count = 5) and their current institution affiliations are shown in Table VI.

Negative Binomial Regression Analysis Results
Most Prominent Countries
Co-authorship Overlay Visualization Network of Prominent Countries

2. What are the broad underlying research themes for articles published in this journal? 3. What are the most cited research themes in the article corpus? 4. How have the research themes evolved over the years? 5. Are there differences between open-access and regular articles regarding citations received and author keywords?

(possibly due to a delay in indexing by the Web of Science citations database). The citations received by the top 20 cited references and the respective years in which they were published, starting from 1990, are shown in Figure IV.
Figure V shows the clusters or groups of bibliographically coupled articles. The minimum number of citations of a document considered for bibliographic coupling is 1, resulting in 500 out of 642 meeting the criteria. The largest connected component consists of 470 articles, as shown in Figure V. These articles relate to each other by 3,987 links. Each node represents an article, and nodes with the same color belong to the same cluster or group (or underlying research theme as identified in Table V).

Whether the article is published as an open-access article or not: 198 are open-access articles, and 425 are not.

Figures

Figure 1 Articles
Figure 2 Citations Received by Referenced Journals
Figure 3 Most Cited Journals
Figure 4 Top Cited Journals and Their Respective Year of Publication
Figure 6 Co-word Network Analysis on Article Title Terms
Figure 7 Top Author Keywords for Open Access Articles
Figure 8 Top Author Keywords for Regular Articles

Table I. Related Bibliometric Analysis Research on Journals

Table II. Citations Received by Most Cited Journals Over Time

The node size represents the number of citations received by the article, and the link between nodes indicates the cited references shared by the papers [48]. Table III shows the underlying keywords and corresponding research theme labels.

DDoS attack detection, attacks on smart grids, security risk assessment for cyber-physical systems, introspection of web-based attack, realistic attack simulation, attack paths generation algorithms, intrusion detection systems, discovering domain names to be abused in future, defeating SQL injection attacks, detecting zero-day attacks, real-time intrusion detection, detecting abnormal activities in network traffic data, wireless intrusion detection, side channel attack detection, database intrusion detection, XML security, blockchain based collaborative intrusion detection, protection for Android OS, detecting targeted attacks in P2P, scalable intrusion detection system based on deep learning, reinforcement learning approach for network intrusion detection, ML approach to vulnerability detection, malware detection for securing VM in cloud, securing smart grid, robust intrusion detection using reinforcement learning, cyberattack response system, vulnerabilities in web applications, analysis of attack graphs in industry 4.0, protecting smart grids, malware detection for IoT devices.

Framework for establishing trust in cloud, detection of firewall configuration errors, access control in AWS IoT using event driven functions, security policies enforcements, delegation model for extended RBAC, concrete and abstract based access control, secured information sharing in cloud, dynamic delegation of authority, grid security infrastructure, modeling contextual security policies, automated policy refinement in network security systems, trust requirements and management for IoT, trust model for smart home devices, solving identity delegation problem in e-govt environment, attribute based access control for cloud services using blockchain, security policy verification in cloud systems, grid security by behavioral control, identity based cryptography for grid, trust enhanced architecture for mobile systems, reputation mechanism for ubiquitous computing, multi-faceted model of trust, analyzing XACML policies, large scale employee ID system, inconsistency and incompleteness detection in access control policies, data access control in smart home scenario, ensuring web page integrity against malicious browser extensions, distributed authorization with delegation, trust management for federated SOA, trust model for effective selection of sellers in e-commerce.

High performance computation for data mining applications, outsourced encrypted data storage, wildcard and keyword search over encrypted data, password authenticated searchable encryption, key agreement protocol with fault tolerance over broadcast networks, public key system for electronic voting, homomorphic encryption for secure computation in cloud, provably secure PKE, outsourced data pattern matching, watermarking protocols, supervised ML using encrypted training data, security and searchability in outsourced data, symmetric encryption for geospatial data in cloud, secure floating point arithmetic, cryptosystems based on subset sum problem, algorithms for secure outsourcing of cryptographic computations, encryption in cloud storage, secure three party computational protocols, two factor authentication for Bitcoin protocol, privacy enhanced architecture for smart metering, remote voting, data deduplication scheme for cloud storage, countermeasures to attacks in e-voting, public data integrity auditing, practical encrypted email, privacy preserving cloud auditing, enhancing data confidentiality by hiding partial ciphertext, secure cloud storage, blockchain enabled large scale e-voting system.

time-specific encryption, forward secure encryption, generation of elliptic curves, a secure routing protocol for vehicular communication, anonymity in cloud broadcast system, cryptography for cloud, adaptive broadcast encryption, signature schemes, identity-based signature without key escrow problem, certificateless authenticated asymmetric group key agreement.

Privacy in biometric applications, spam detection over IP telephony attacks, securing fingerprint templates, authenticating mobile phone users using keystroke analysis, wireless network scanning, suspect identification in communication networks, authentication in mobile devices using hand gesture recognition, biometric cryptosystem, and behavior profiling, optimization of mobile botnets, privacy information management in video surveillance, privacy-aware video surveillance, analysis of drive-by download operations, biometric keys, masquerade detection, investigation of spam emails, behavioral biometrics.

Cluster 4: Secure key exchange protocol, public key Kerberos, security protocol analysis, performance analysis of authenticated key exchange protocols, secure group key establishment and exchange, authenticated key exchange security for certifications systems, client server authentication protocols, identity based key agreement protocols, key management techniques, secure architecture for TCP/UDP based cloud communications, authentications in mobile systems, pragmatic authenticated key agreement, anonymity guarantees of authentication and connection protocol, blockchain based medical data preservation, passwords authenticated key exchange based on RSA, password authentication for Web, secure Kerberos key exchange with smart cards, authentication protocols for IoT devices, measuring protocol strength, analysis of authorization protocols, key encapsulation with multi-bit encoding method, elliptic curve cryptography based anonymous authentication protocol, challenges of post-quantum digital signing, identity based and certificate based authenticated key exchange protocols, analysis of time-aware protocols, data minimization in communication protocols, key establishment protocol for wireless networks.

malicious remote administration tool, deep learning based malware detection system, detection of Android botnets using ML, data usage control for Android devices, classification of Android malware, honeypot for IoT protocols based on Android devices, NLP techniques to malware detection, access control policies for Android systems, IoT botnet detection, malware classification through clustering, anomaly based detection of malicious events on Android platforms, detection of malware using Bayesian belief network, malware analysis with ML techniques, GPU assisted malware, understanding spread of malicious mobile programs, secure blockchain based firmware update framework for IoT environment, defense framework against malware and vulnerabilities.

Cluster 7: Identity-based non-interactive key exchange, secure pay, certificateless encryption schemes, remote data storage without public keys, key distribution problem, verifier signature schemes, identity-based, attribute-

Cluster 10: Two-factor graphical password scheme, password hashing schemes, patterns in click-based graphical passwords, negative authentication system, side-channel attacks on pin pads and keyboard acoustic, token-based visual password authentication, content-based image authentication, system assigned passwords, privacy-preserving and provably secure authentication protocol, secure steganography, PKI certificate framework, analysis of password policies, shoulder surfing proof graphical password authentication schemes, encouraging users to improve password security and memorability, understanding users passwords through visual analysis and visualizations, image encryption techniques.

Table IV. Top Author Keywords Usage over Time

Based on Table IV, machine learning, cloud computing, blockchain, Internet of Things, and Android featured more in the journal after 2014. Specific author keywords like privacy, security, authentication, and access control were studied. Most prominent author keywords also represent underlying research themes identified using the bibliographic coupling technique. Table IV does present interesting findings; however, we carried out a co-word network analysis on frequent terms (minimum frequency = 2) used in article titles to further investigate the evolution of the journal's underlying research areas (Figure VI). The co-word network comprises 183 terms from article titles, with each node representing a term.

The corpus comprised 201 open-access articles and 441 regular (i.e., non-open access) articles. The open-access articles received 1,824 citations, with a mean of 9.07 per article. On the other hand, the regular articles received 4,947 citations, with a mean of 11.22 per article. Contrary to popular belief [49], no statistically significant difference is found concerning the citations received by the two sets of articles at the 0.05 significance level. We compared the two sets based on author keywords as well. The open-access articles included 963 author keywords with an average of 4.79 per article. The regular articles included 2,130 author keywords with an average of 4.82 per article. Of the former totals, 794 and 1,581 are unique author keywords for open-access and regular articles, respectively. Figures VII and VIII illustrate the frequency distribution for the top author keywords for the two sets.

Table V. Variables Used for Regression Analysis and Their Descriptive Statistics

Table V lists the variable definitions and descriptive statistics used in the regression model.

Whether all authors are from the same institution or at least one of the authors is from a different institution: 382 articles have at least one author from another institution, and 241 articles have all authors from the same institution.

Whether the article is published in a special or regular issue: 85 articles are published in a special issue and 538 otherwise.

Table VI. Authors with the Most Articles