A funnel system is proposed to access a broad spectrum of information and view it objectively, with three filtering stages used to select the full papers included in the analysis (see Fig 3).
The first filter was applied with R software and bibliometric techniques: a general search equation retrieved the all-time production on ITS registered in Scopus (only journal articles were selected), and the analysis was carried out on keywords and abstracts.
Subsequently, text mining supported by artificial intelligence was used to identify topics of interest to the scientific community, followed by a new round of filtering.
The selected full texts were analyzed with NVivo software to extract emerging challenges in the field. This study aims to answer the following questions:
Q1: What is the primary evaluation purpose of the ITS?
Q2: Who is the main evaluating agent in the evaluation process?
Q3: What is the main approach used in the selected ITS?
Q4: Is the ITS evaluation process implemented holistically?
These questions arise from the need to understand evaluation in the context of learning, in particular deep learning: specifically, a holistic and complex evaluation that can account for the student's capacity for critical analysis of new ideas and their integration with prior knowledge, thus favoring long-term understanding and retention that can later be applied to solve problems in different contexts.
An evaluation can be considered holistic when it accounts not only for summative aspects but also for levels of cognitive skills such as "analysis" (comparing, contrasting) and "synthesis" (integrating knowledge in a new dimension), combined with metacognitive aspects that promote understanding and the application of lifelong learning.
Bibliometric analysis
With the search equation *intelligent tutoring*, the results presented in Table 1 were obtained. However, it is crucial to bear in mind that this general equation is only a starting point, since the analysis is expected to yield new filtering criteria leading to a more refined equation.
Table 1. Characteristics of the data
| Main information about data | Results |
| --- | --- |
| Timespan | 1979–2021 |
| Sources (Journals) | 618 |
| Documents | 1,890 |
| Average citations per document | 21.12 |
| Document types: Article | 1,890 |
| Authors | 3,819 |
| Authors collaboration: Single-authored documents | 322 |
| Documents per Author | 0.495 |
| Authors per Document | 2.02 |
A total of 1,890 results were found in Scopus, covering 42 years of academic production. The texts considered were articles published in specialized journals, although it is recognized that this field of knowledge also disseminates much of its work through conferences. However, given the study's objective of identifying structured knowledge with a high level of depth, conference papers were not included in this analysis. Thus, a total of 3,819 authors were considered in this initial search.
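As a point of reference, descriptive indicators of the kind reported in Table 1 can be reproduced with the bibliometrix package in R, which supported this first filter; the sketch below is illustrative only, and the file name scopus_its.csv is an assumption.

```r
# Minimal sketch of the bibliometric filter in R with bibliometrix
# (the file name "scopus_its.csv" is an assumption for illustration)
library(bibliometrix)

# Import the Scopus export into a bibliographic data frame
M <- convert2df("scopus_its.csv", dbsource = "scopus", format = "csv")

# Descriptive analysis: timespan, sources, documents, authors, citations, ...
results <- biblioAnalysis(M)
summary(results, k = 10)  # prints indicators of the kind reported in Table 1
```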
Academic production began in 1979, reached its maximum in 2014 (105 papers), and has decreased slightly since 2016 (Fig 4).
Fig 5 shows that the largest source of texts was the International Journal of Artificial Intelligence in Education, classified as Q1. Fig 6 shows the top five most cited journals in relation to ITS; the journal Computers and Education stands out with a total of 4,814 citations.
The main authors by total citations in the chosen period are presented in Fig 7. For example, Kenneth R. Koedinger, professor of human-computer interaction and psychology at Carnegie Mellon University, is the founding and current director of the Pittsburgh Learning Science Center, with 2,112 citations.
The data represented in Fig 8 are the KeyWords Plus counts. These are generated from words or phrases that frequently appear in the titles of an article's references but do not appear in the article's own title. Using R and the Bibliometrix package, it is possible to obtain them. KeyWords Plus enhances the power of cited-reference searching by looking across disciplines for all articles with commonly cited references.
Garfield claimed that KeyWords Plus terms could capture an article's content with greater depth and variety [16]. However, while KeyWords Plus is as effective as Author Keywords in bibliometric analyses investigating the knowledge structure of scientific fields, it is less comprehensive in representing an article's content [17].
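As a brief illustration, assuming M is the bibliometrix data frame imported from the Scopus export (KeyWords Plus are stored in its ID field), the frequency count behind Fig 8 can be approximated as follows:

```r
# Frequency table of KeyWords Plus (field "ID" in the bibliometrix data frame M)
library(bibliometrix)
kwp <- tableTag(M, Tag = "ID")
head(kwp, 10)  # the ten most frequent KeyWords Plus terms (cf. Fig 8)
```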
In Fig 8, computer-aided instruction stands out as the main topic, representing 17% of the frequencies examined in the text references. Finally, for the elaboration of Fig 9, the co-occurrences could be normalized using similarity measures such as the Salton cosine, the Jaccard index, the equivalence index, and the association strength [18].
The selected measure was the association strength, since it is proportional to the ratio between the observed number of co-occurrences of objects i and j and the expected number of co-occurrences of i and j under the assumption that their occurrences are statistically independent.
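Following [18], the association strength between objects $i$ and $j$ can be written, up to a constant, as

$$ AS_{ij} \;=\; \frac{c_{ij}}{s_i\, s_j} \;\propto\; \frac{c_{ij}}{e_{ij}}, $$

where $c_{ij}$ is the observed number of co-occurrences, $s_i$ and $s_j$ are the total numbers of occurrences of $i$ and $j$, and $e_{ij} \propto s_i s_j$ is the expected number of co-occurrences under statistical independence.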
For the clustering strategy, "Walktrap" was selected, as it is reported to be one of the best-performing algorithms alongside "Louvain" [19]. The graph is interpreted considering the following characteristics:
- Centrality / Periphery (Position)
- Dimension of the bubble (number of citations)
- Strength of relationships (links)
- Clusters (and density)
- Bridges.
The colors represent the groups to which each word belongs; in this case, there are three groups. In the first one, in red, computer-aided instruction is the dominant theme by citations. The green group has no theme that dominates by citations, but in terms of relationships its central theme is expert systems, which links topics of interest such as artificial intelligence. Finally, the third group, colored blue, appears to be a subgroup of the first one, focused on educational issues.
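For reference, a network like the one in Fig 9 can be produced with bibliometrix in R as sketched below; the number of nodes and the layout settings are illustrative assumptions, not the exact settings used in this study.

```r
# Keyword co-occurrence network (cf. Fig 9), built from the KeyWords Plus field
library(bibliometrix)
NetMatrix <- biblioNetwork(M, analysis = "co-occurrences",
                           network = "keywords", sep = ";")

net <- networkPlot(NetMatrix,
                   normalize = "association",  # association-strength normalization
                   cluster   = "walktrap",     # community detection (grouping)
                   n         = 50,             # assumed number of keywords plotted
                   Title     = "Keyword co-occurrence network",
                   type      = "fruchterman",  # force-directed layout
                   size      = TRUE,
                   labelsize = 0.7)
```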
Text Mining
Although the bibliometric analysis identifies the most influential authors and journals in the field, as well as possible thematic areas based on the KeyWords Plus analysis and their classification into groups, additional analysis is necessary to identify more specific thematic groups; for this purpose, the Knime software [20] was used.
Fig 10 shows the scheme under which the database downloaded from Scopus was processed. The data were first filtered from 2003 onwards, when a production peak of interest occurred. Finally, all the abstracts of the selected papers were considered in this analysis.
Search terms:
TITLE-ABS-KEY(*intelligent tutoring System*) AND (LIMIT-TO (DOCTYPE,”ar”)) AND (LIMIT-TO (PUBYEAR,2021) OR LIMIT-TO (PUBYEAR,2020) OR LIMIT-TO (PUBYEAR,2019) OR LIMIT-TO (PUBYEAR,2018) OR LIMIT-TO (PUBYEAR,2017) OR LIMIT-TO (PUBYEAR,2016) OR LIMIT-TO (PUBYEAR,2015) OR LIMIT-TO (PUBYEAR,2014) OR LIMIT-TO (PUBYEAR,2013) OR LIMIT-TO (PUBYEAR,2012) OR LIMIT-TO (PUBYEAR,2011) OR LIMIT-TO (PUBYEAR,2010) OR LIMIT-TO (PUBYEAR,2009) OR LIMIT-TO (PUBYEAR,2008) OR LIMIT-TO (PUBYEAR,2007) OR LIMIT-TO (PUBYEAR,2006) OR LIMIT-TO (PUBYEAR,2005) OR LIMIT-TO (PUBYEAR,2004) OR LIMIT-TO (PUBYEAR,2003) )
Fig 11 shows the workflow developed in Knime, with which it was possible to analyze 1,369 abstracts and extract the hidden thematic structure, identifying the topics that best describe a set of documents.
Table 2 describes each item presented in Fig 11.
Table 2. Item description of Knime workflow
| Name | Description |
| --- | --- |
| Excel Reader | Reads the database obtained from Scopus in Excel format. |
| Missing Value Column Filter | Removes from the input table all columns that contain more missing values than a specified threshold. |
| Strings to Document | Converts the specified strings to documents; for each row, a document is created and attached to that row. |
| Preprocessing | A metanode grouping several nodes responsible for multiple tasks, including part-of-speech tagging, lemmatization, and stop-word and number filtering. The elements inside this metanode are shown in Fig 12. |
Table 3 describes each item presented in Fig 12.
Table 3. Metanode preprocessing item description
| Name | Description |
| --- | --- |
| Punctuation Erasure | Removes all punctuation characters from the terms contained in the input documents. |
| Number Filter | Filters out all numerical values present in the input documents. |
| N Chars Filter | Filters out all terms contained in the input documents with fewer than the specified number of characters. |
| Stanford Tagger | Assigns a part-of-speech tag to each term. |
| Stanford Lemmatizer | Lemmatizes the terms contained in the input documents. |
| Case Converter | Converts terms to uppercase or lowercase. |
One of the main elements of this workflow is the Topic Extractor node, which makes it possible to achieve the following:
- It automatically finds the top K topics, each described by its N most relevant keywords, in a collection of unlabeled documents (an unsupervised approach).
- It represents documents as random mixtures over latent topics, where each topic is characterized by a distribution over words.
- The syntax or order of the words in a document is not important (bag-of-words model).
- Document order is not important.
- The same word can belong to different topics.
- The number of topics needs to be selected/known in advance.
- There are two important hyperparameters of the Dirichlet distributions (see the sketch below):
  - α controls the per-document topic distribution.
  - β controls the per-topic word distribution.
This process corresponds to the simple parallel threaded implementation of LDA [21][22] (see Fig 13).
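A hedged sketch of an equivalent LDA step in R with the topicmodels package is given below, using the document-term matrix from the previous sketch; the values of k, α, β (called delta in this package), and the seed are illustrative assumptions.

```r
# LDA topic extraction (analogous to the Topic Extractor node)
library(topicmodels)

k <- 5  # the number of topics must be fixed in advance
lda_model <- LDA(dtm, k = k, method = "Gibbs",
                 control = list(seed  = 42,
                                alpha = 0.1,    # per-document topic prior (α)
                                delta = 0.01))  # per-topic word prior (β)

terms(lda_model, 10)  # the ten most relevant keywords per topic
topics(lda_model, 1)  # the most likely topic for each document
```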
Fig 14 presents the process for dimensionality reduction, and Table 4 describes each item in Fig 14:
Table 4. Topic extractor items description
| Name | Description |
| --- | --- |
| t-SNE (L. Jonsson) | t-SNE is a manifold learning technique, most often used for visualization, that can capture nonlinear structures in the data. |
| Color Manager | Assigns a color label to data groups. |
| Joiner | Joins two tables in a database-like way. |
| Interactive visualization | The metanode in charge of allowing the interactive visualization of emerging topics. The nodes found inside are shown in Fig 15. |
Table 5 describes the items of the Interactive visualization metanode shown in Fig 15.
Table 5. Interactive visualization metanode items description
| Name | Description |
| --- | --- |
| GroupBy | Groups the rows of a table by the unique values in the selected group columns. |
| Table View | Displays data in an HTML table view. The view offers several interactive features, as well as the possibility to select rows. |
| Scatter Plot | Produces a scatter plot of the input data. |
| Tagging | This metanode groups the nodes presented in Fig 16 and is the last metanode in this section. It performs the labeling that allows viewing the word cloud and the texts associated with each topic. |
Table 6 describes the items of the Tagging metanode.
Table 6. Metanode Tagging items description.
| Name | Description |
| --- | --- |
| Dictionary Tagger | Recognizes named entities specified in a dictionary column and assigns a specified tag value and type. Optionally, the recognized entity terms can be set as unmodifiable, meaning that they are not modified or filtered afterward by any following node. |
| Tag Filter | Filters terms contained in the input documents that have specific tags assigned. A term is not filtered out if at least one of its assigned tags is among the specified tags; if strict filtering is set, all of a term's assigned tags must be among the specified tags. |
| Bag of Words Creator | Creates a bag of words from a set of papers, consisting of at least one column containing the terms that appear in the corresponding document. The programmer can interact with the result and customize the display. |
| IDF | Inverse Document Frequency; determines the number of documents containing the T terms, which come from KeyWords Plus. |
| String to Term | Converts the strings of the specified string column to terms and appends a new column containing these terms. |
| Tag Cloud | A tag cloud view based on JavaScript libraries, which can be customized. |
| Document Data Extractor | Extracts the desired information into columns. |
After passing through these nodes, the algorithm returned the following result: in Fig 17, all the selected terms from the 1,369 abstracts are classified into five topics, each of which requires interpretation. However, the focus of the analysis was to determine whether any of them were related to the category of interest: evaluation.
The program interface allows the analyst to explore each of the five topics, as shown in Fig 18.
For example, topic_0 contains the terms game, instruction, intelligent, language, reading, skill, strategy, study, and system. The "document" column displays the text and the weight with which it contributes to each of the terms.
Topic_3, represented in yellow in Fig 19, emerges naturally among the analyzed abstracts. The terms that compose it are affective, assessment, data, emotion, method, model, performance, result, student, and system, all of them with high values for this study. Therefore, this result, with high values, was the selection criterion for the full texts analyzed in NVivo in the next phase.
One hundred sixty-four papers were selected from the text mining of the emerging group represented in Fig 19. It is essential to consider that the weight of the term assessment is not high compared to the other terms identified in topic_3 and even less compared to the total number of identified terms, as shown in Fig 20.