The Path2Integrity open data collection contains data from the Path2Integrity learning card program [11, 27], including the data from the Path2Integrity FAIR training. Path2Integrity is an open educational program, free of charge and usable for all disciplines in higher education. The learning principles from the handbook describe the training standards [27]. The program aims to teach students "how responsible research needs to be conducted in order to be reliable and thus useful for society" [27].
The Path2Integrity open data was collected through:
- the online P2I questionnaire [28, 29], which focuses on scientific action and justification referring to research integrity topics in the European Code of Conduct (such as the FAIR guiding principles, research procedures, collaborative work, and the research environment), and
- the online Path2Integrity feedback sheet [30], which captures the learners' immediate reaction to the training.
Path2Integrity collected the data between August 2019 and January 2022. (See references 31 and 32 for the data).
From the Path2Integrity open data collection (see Fig. 1), we selected the sample of students who voluntarily attended the international "FAIR training" (intervention group, n = 96 in pre-test) and a sample who filled out the questionnaire (control group, n = 418 in pre-test).
According to the Path2Integrity open data metadata, "the control group was collected in two rounds, mainly between March 2021 and January 2022. The students for the control groups were mostly European students whose educators embedded the [questionnaire] into their courses. Therefore, they were also contacted through their trainers. Due to differences in intensity and content of the courses and to increases in the quantity of the non-randomized control group, both courses in research integrity, responsible conduct of research, good research practices, scientific working, research ethics or related topics, as well as non-related courses, were included in the control groups. In total, we reached out to 864 trainers ... From these, 60 trainers allowed us to conduct [our questionnaire] within a total of 89 of their groups." [33].
The study's intervention group participated in a one-day, role-playing online training that included a 90-minute FAIR training with the standardized learning card M8 [25]. Path2Integrity asked all students to fill out the questionnaires voluntarily. As Table 1 shows, this data was collected online at different times.
Table 1: Data collection before and after the training

| Data collection | Informed consent |
|---|---|
| With the P2I questionnaire [29] at the beginning of each training (pre-test). | Attached to the survey and a condition to proceed to give data. |
| With the feedback sheet [30] after each training on the FAIR guiding principles (M8) [25]. | Attached to the survey and a condition to proceed to give data. |
| With the questionnaire on research integrity [29] directly after each training (post-test). | Attached to the survey and a condition to proceed to give data. |
Path2Integrity collected the data and informed consent sheets in anonymized form via online surveys and stored them securely at Kiel University. To calculate the measures of this study, we used the published open data collection [31, 32] with the support of the evaluation work package team.
Table 2 displays the professional, disciplinary, age, country, and gender distribution of the intervention and control group in the pre-test.
Table 2: Distributions of intervention and control group in the pre-test.
Completing the questionnaires was voluntary, and not all students took part in the pre- or post-evaluation. Because Path2Integrity offered no incentives for completing the questionnaires, we assume that some students grew tired and dropped out before the post-test was collected. (See our comment on attrition rates below.)
As Table 1 outlines, students in the intervention group were given version M of the P2I questionnaire [29] once at the beginning and once immediately at the end of the training. The control group answered the same questionnaire [29] at the beginning and end of their no-FAIR training. In addition, all students in the FAIR training were asked to fill out the feedback sheet [30].
The P2I evaluation form M contains two FAIR-related questions (hereafter referred to as SPM8 and SCSM8), each with four possible answers.
In SPM8/A, students answered the multiple-choice question:
"In his research project, Ali has collected a large amount of research data that he would like to make available open access in accordance with the FAIR guiding principles. To follow good research practices, Ali ensures that his data …" (Please choose only one of the following:)
• A1: are described with rich metadata to be machine-readable.
• A2: are stored on FAIR foundation servers.
• A3: can be found in every database possible.
• A4: do not contain any information about sexual orientation.
In SCSM8/B, students answered the question:
"Ali's decision (above) is in line with good research practices because …" (Please choose only one of the following:)
• B1: it ensures reliable research results.
• B2: it ensures the equal treatment of all research data.
• B3: it is the duty of Ali to follow this process.
• B4: the legal framework governing universities requires it.
The sample sizes from the data collection [31] are as follows:
- n(intervention, pre-test) = 96, with 6 missing values;
- n(intervention, post-test) = 78, with 3 missing values;
- n(control, pre-test) = 418, with 34 missing values;
- n(control, post-test) = 163, with 7 missing values.
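If missing values are excluded listwise (an assumption for illustration; the paper does not state how missing responses were handled), the effective sample sizes can be derived directly from these counts:

```python
# Effective sample sizes under listwise exclusion of missing values
# (an assumption for illustration; the source does not specify
# how missing responses were handled in the analysis).
samples = {
    "intervention_pre":  (96, 6),
    "intervention_post": (78, 3),
    "control_pre":       (418, 34),
    "control_post":      (163, 7),
}
effective = {group: n - missing for group, (n, missing) in samples.items()}
print(effective["intervention_pre"])  # 90
```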
Table 3 summarizes the data sample and its characteristics stratified for each answer.
Table 3: Data sample characteristics for SPM8/A and SCSM8/B
Question SPM8/A targets the students' scientific action, whereas SCSM8/B targets their justification of the answer chosen in SPM8/A. Path2Integrity expected students to answer A1 and B1 [28]. The answers A2, A3, and A4 are mere distractors for A1. However, Zollitsch et al. [28] explain that in the case of SCSM8, the answers B2, B3, and B4 represent different justification patterns.
In the case of the (no-)FAIR training, the feedback sheet [30] comprises the following eleven questions:
Motivational factor: My participation in the (no-)FAIR training was encouraged by the trainer.
Instructional factor: For me, the (no-)FAIR training was adequately guided.
Safe space factor: I could express my opinion freely in the (no-)FAIR group.
Participation factor: I was able to contribute something to the (no-)FAIR group.
Appropriateness factor: The duration of the (no-)FAIR training was appropriate to me.
Comprehensibility factor: I clearly understood the task of the (no-)FAIR training.
Commitment factor: For me, the structure of the (no-)FAIR training was good to follow.
Satisfaction factor: I am satisfied with the (no-)FAIR training as a whole.
Trust factor: I would recommend the (no-)FAIR training to my fellows.
Usefulness factor: I have learned something useful in the (no-)FAIR training.
Practical relevance factor: I could connect the (no-)FAIR training with my everyday life.
In addition to the two FAIR-related questionnaire items, the Path2Integrity feedback sheet dataset [32] shows that 95 students (n(feedback) = 95) answered these eleven Likert-scale questions. We recoded the answers so that 2 marks the positive end of the scale, 0 the neutral midpoint, and −2 the negative end.
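Assuming the raw feedback answers are coded 1 through 5 (a hypothetical coding; the open dataset's actual coding may differ), the recoding described above amounts to a simple shift onto the −2 to +2 scale:

```python
def recode_likert(raw: int) -> int:
    """Recode a 5-point Likert answer (hypothetical raw coding 1..5,
    1 = most negative, 5 = most positive) onto the -2..+2 scale:
    +2 positive end, 0 neutral midpoint, -2 negative end."""
    if raw not in range(1, 6):
        raise ValueError(f"unexpected Likert value: {raw}")
    return raw - 3  # 1 -> -2, 2 -> -1, 3 -> 0, 4 -> +1, 5 -> +2

answers = [5, 3, 1, 4]
print([recode_likert(a) for a in answers])  # [2, 0, -2, 1]
```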
To close the research gap on the FAIR training effectiveness, we hypothesized:
1. that FAIR training had a positive impact on both the suggested action and the justification of the students and would thus yield a clear shift in the response behavior of the students in the post- compared to the pre-testing towards the correct answers in the P2I questionnaire (A1 and B1, respectively) in the intervention group;
2. that a particular training that focuses explicitly on FAIR training is necessary to produce this shift (if present), and we should thus not be able to reproduce the former effect (if it is present) in the control group;
3. the legal framework of the associated universities of the intervention group may impact how students justify their actions. In the pre-test, students in the intervention group may choose to answer B4 over B1 (which is the expected answer).
In each case, we evaluated whether the FAIR training impacted the students' response behavior via Pearson's chi-square test, with the null hypothesis that response behavior is independent of pre- and post-testing. For hypothesis 3, in which we explicitly targeted the answer category B4, the answer categories were first collapsed into B4 versus all other answers to obtain a 2×2 table. We regard a p-value below 0.01 as statistically significant. In case of rejection, we planned to evaluate the source of the shift by inspecting the Pearson standardized residuals of the fit.
We regard standardized residuals with absolute values above 2 (above 3) as an indication that the respective cell has an impact (strong impact) on rejecting the null. As a measure of association and to evaluate the effect size of the shift towards the respective answer, we present the odds ratio for choosing the respective answer (A1, B1, or B4, respectively) over the other categories.
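The test procedure described above can be sketched as follows. The counts are purely illustrative (not the study's actual frequencies), `scipy` is one possible implementation, and the residuals are computed as adjusted standardized residuals, which is our assumption about the exact variant used:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Illustrative pre/post counts for answers A1..A4 (NOT the study's data).
table = np.array([[40, 20, 20, 16],   # pre-test
                  [60, 10,  5,  3]])  # post-test

# Pearson's chi-square test of independence between testing time and answer.
chi2, p, dof, expected = chi2_contingency(table)

# Adjusted standardized residuals:
# r_ij = (O - E) / sqrt(E * (1 - row_prop_i) * (1 - col_prop_j))
n = table.sum()
row_prop = table.sum(axis=1, keepdims=True) / n
col_prop = table.sum(axis=0, keepdims=True) / n
std_resid = (table - expected) / np.sqrt(expected * (1 - row_prop) * (1 - col_prop))

# Odds ratio for choosing A1 over the other categories, post- vs. pre-test.
a1, rest = table[:, 0], table[:, 1:].sum(axis=1)
odds_ratio = (a1[1] / rest[1]) / (a1[0] / rest[0])
```

Cells with `|std_resid| > 2` (or `> 3`) would then be flagged as having an impact (strong impact) on rejecting the null, as described above.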
In a second step, we
4. contrast the learning factors of the FAIR training with the learners' feedback [32].
We assess via a volcano plot which of the learning factors listed above students ranked as highly positive.
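One common construction of a volcano plot places the effect size on the x-axis and −log10(p) on the y-axis. A sketch of the coordinates for the recoded feedback factors, assuming a one-sample t-test against the neutral midpoint 0 (our assumption; the source does not specify the test) and using randomly generated placeholder scores rather than the real dataset:

```python
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(0)
# Placeholder recoded feedback scores (-2..+2) for three of the eleven
# factors (randomly generated, NOT the real feedback data).
factors = {
    "satisfaction": rng.integers(-1, 3, 95),
    "usefulness":   rng.integers(0, 3, 95),
    "motivation":   rng.integers(-2, 3, 95),
}

# Volcano-plot coordinates: x = mean recoded score (effect size),
# y = -log10(p) from a one-sample t-test against the neutral midpoint 0.
points = {}
for name, scores in factors.items():
    t, p = ttest_1samp(scores, 0.0)
    points[name] = (scores.mean(), -np.log10(p))
```

Factors in the upper-right region of such a plot (large positive mean, small p-value) would be the ones students ranked as highly positive.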
In sum, we probe the effectiveness of the FAIR training by analyzing the data from the Path2Integrity open data collection. By comparing the learning success of both the intervention and control group, we assess if FAIR training is effective and what learning factors were rated highest.