Kinetic Researcher FAQs

Frequently Asked Questions: Researcher Role in OpenStax Kinetic

In this article you will find answers to the following questions about the Researcher role in OpenStax Kinetic:

What are the benefits for researchers, research participants, and the research community of using OpenStax Kinetic?
What do I need to do to be able to deploy a study on Kinetic?
What are the various sources of data that are available to me on Kinetic?
Is there a way for me to get individual differences measures for participants in my studies?
Can I invite participants back for delayed/longitudinal measures?
What are Secure Enclaves, and what is the purpose of using this method?
What is the Kinetic Approach?
What data will researchers see? How close will that be to the raw data?
How are synthetic datasets generated on Kinetic?
What if I still want direct access to data?
Who will see the raw data?
What kind of review does Kinetic conduct?
How long will the review process take?
What does it mean when I get an error? What steps should I take to resolve it?
What if I use a different analytic toolkit than RStudio?

What are the benefits for researchers, research participants, and the research community of using OpenStax Kinetic?

For Researchers, Kinetic creates a connection with real learners in their authentic learning environment, while providing secure access to vast amounts of data that can be used to conduct large-scale studies and generate nuanced findings about what works for whom, when, and in what context.

For Participants, Kinetic provides personalized insights intended to help them grow as learners, as well as the opportunity to earn rewards in the process. It also lets them contribute to cutting-edge research studies aimed at improving learning strategies and outcomes for all learners.

The Research Community as a whole benefits from increased collaboration and sharing of data and resources, leading to more impactful research and advancements in the field of digital learning. Explicitly built for researchers, Kinetic promotes sharing of promising research findings with product teams as well as higher-education institutions to ensure that research goes beyond peer-reviewed publications and is effectively translated into changes in instructional practice.

Overall, OpenStax Kinetic aims to be a valuable tool for advancing the understanding of how technology can be used to improve educational experiences for every learner.

What do I need to do to be able to deploy a study on Kinetic?

To access OpenStax Kinetic, first sign up as a Researcher on OpenStax. If you already have an OpenStax account and wish to become a Kinetic researcher, reach out to us at kinetic@openstax.org and we'll get you set up.

Depending on your research needs, you might also need a Qualtrics account, which is often available through your institution. If you plan to use only existing Kinetic data for your research, a Qualtrics account may not be needed. If your study collects new data, you will need Qualtrics to implement it. Either way, our team is here to guide you through the process. Please note that all research on Kinetic must align with the Kinetic IRB protocol.

What are the various sources of data that are available to me on Kinetic?

Kinetic researchers have access to multimodal data sources from OpenStax: data from their own research studies built on Kinetic; individual differences measures in the Kinetic Learner Characteristics Library (KLCL); and, gradually over time, student interactions with OpenStax products and offerings, including the OpenStax online textbooks and the OpenStax Assignable online learning system. With iterative improvements to the Kinetic system, researchers will be able to analyze all of these data sources from a single access point: the Kinetic Secure Enclaves.

Is there a way for me to get individual differences measures for participants in my studies?

Yes. A distinctive feature of conducting research on Kinetic is access to the Kinetic Learner Characteristics Library (KLCL). The library contains existing individual differences measures that are related to learning and academic outcomes. Key measures available in the KLCL include sociodemographics (e.g., SES variables, age, race, gender, employment status, and parental education), Big 5 personality variables, self-efficacy, resilience, working memory measures, and STEM knowledge.

Because a privacy-by-design approach drives the Kinetic workflow, we can provide a secure mechanism for analyzing the full breadth of learner characteristics while preserving participant privacy.

The benefits of the KLCL include:

All researchers using Kinetic have access to the KLCL to use in their own research (e.g., as a focal variable, as a control variable, as a way to further describe the research sample).
The KLCL includes various measures that are openly available, so researchers do not have to search for items or worry about any copyright issues.
Researchers can focus on capturing their core variables of interest and not inflate the study length with learner characteristics measures the KLCL already captures, which shortens the length of time learners need to take the study and improves the user experience.
The KLCL ensures that all researchers on Kinetic use the same instruments to measure all constructs, thereby creating construct measurement consistency in the reported results across different Kinetic studies.
Having access to learner characteristics collected at a different time reduces the common method bias that can result from capturing all variables of interest in a single study, strengthening researchers' study designs.

Can I invite participants back for delayed/longitudinal measures?

Absolutely. When designing your Kinetic study, you'll be able to add one or more follow-up sessions, as well as set the minimum day interval you would like to have before a participant can take the next session in your study.

What are Secure Enclaves, and what is the purpose of using this method?

On Kinetic, we understand that researchers need to know who the learners are to effectively address the impact of learner characteristics on learning outcomes.

These characteristics can range from socio-demographic information (e.g., race, gender, SES, etc.) to psychosocial constructs that may impact learning (e.g., goal orientation, anxiety, etc.), and are key to surfacing issues of inequity in educational opportunities and outcomes.

Large-scale education research depends on data at terabyte or petabyte scale, well beyond gigabytes, but sharing that data with researchers poses significant privacy challenges as well as technical and logistical barriers.

To address this problem, traditional approaches often resort to reducing the size of the data or de-identifying it. Both lead to the loss of critical student and contextual factors, and ultimately hinder researchers trying to understand how learner characteristics affect different learning interventions and their outcomes.

What is the Kinetic Approach?

At Kinetic, we instead adopt modern confidential computing practices for large-scale educational research, which shift the paradigm away from traditional practices: we bring the researcher's analytical software to the data instead of bringing the data to the researcher.

This approach relies on secure enclaves for data analysis. A secure enclave is a protected, containerized environment in which researchers can analyze all individual data points without direct access to the data. The enclave returns only aggregated results, which significantly reduces privacy risks.

What data will researchers see? How close will that be to the raw data?

Researchers develop their analytic scripts on synthetic datasets rather than on the raw data. The quality of the synthetic datasets depends on how many participants are included in the raw data. For newly deployed activities with fewer than 300 respondents, the synthetic dataset follows the data schema of the task and is built by randomly sampling the response options, so researchers can develop their analytic scripts but should not expect the data to reflect the statistical features of the raw data. Once a task reaches 300 or more respondents, the quality of the synthetic data progressively increases to incorporate the statistical features of the raw data.

How are synthetic datasets generated on Kinetic?

Kinetic uses progressively more sophisticated algorithms for synthetic data generation, depending on the volume of raw data available to train the model. At the most basic level, when the raw data contain fewer than 300 respondents, Kinetic generates synthetic datasets by randomly sampling response options for each question in the dataset, in line with the data schema. Once we have 300 or more responses, we use the models outlined in the toolkit to generate higher-fidelity synthetic datasets that represent the statistical features of the raw data.
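
As a rough illustration of the basic, schema-only tier described above, the sketch below draws each answer at random from the response options of its question. It is not Kinetic's implementation: the schema, the question names, the function name sample_schema_only, and the choice of uniform sampling are all assumptions made for the example. It is written in Python purely for illustration; Kinetic's enclaves currently use R.

```python
import random

# Hypothetical data schema: each question maps to its allowed response options.
SCHEMA = {
    "q1_confidence": [1, 2, 3, 4, 5],
    "q2_major": ["STEM", "Humanities", "Social Sciences", "Other"],
    "q3_prior_course": ["yes", "no"],
}


def sample_schema_only(schema, n_rows, seed=None):
    """Build synthetic rows by sampling every answer independently.

    Assumption: options are drawn uniformly at random, so the result matches
    the schema but carries no statistical features of any real data.
    """
    rng = random.Random(seed)
    return [
        {question: rng.choice(options) for question, options in schema.items()}
        for _ in range(n_rows)
    ]


synthetic_rows = sample_schema_only(SCHEMA, n_rows=10, seed=0)
for row in synthetic_rows[:3]:
    print(row)
```

Data produced this way is useful for checking that a script runs from start to finish against the right column names and value types, but any estimates computed from it are meaningless.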

What if I still want direct access to data?

Kinetic will work with its members on a case-by-case basis to provide access to real data, subject to additional training. This ensures that important safeguards, monitoring, and design-enforced limitations are in place to protect our learners' right to privacy and data security.

Who will see the raw data?

Only a limited number of individuals will see the raw data, including core Kinetic team members on an as-needed basis. All of them will have completed additional training covering data security practices, IRB, and FERPA. External researchers can access de-identified raw data on a case-by-case basis, subject to additional training, agreements, and review.

What kind of review does Kinetic conduct?

To ensure that privacy is consistently preserved, OpenStax Kinetic conducts manual reviews of the analytic code. The sole purpose is to ensure that individual cases are not exposed or re-identifiable, and that values aggregated over fewer than 5 respondents per cell are not returned, whether through coding errors or otherwise. We do not check the code for validity or best practices.

Our internal team of engineers and researchers reviews the code for the following:

Are there any coding errors that include messages around individual cases?
Are there any print statements in the code that reveal individually identifiable cases?
Are the aggregated results for subgroups of at least 5 respondents? This number is subject to change as our recruitment and research cycles get faster.
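
Researchers may find it useful to apply a similar small-cell check to their own output before submitting. The sketch below shows one way to suppress group-level results for groups with fewer than 5 respondents. It is an illustration only, not the tooling used in Kinetic's review, which is manual. The function name, field names, and sample data are hypothetical, and Python is used purely for illustration since Kinetic's enclaves currently use R.

```python
from collections import defaultdict
from statistics import mean

# Assumed threshold, taken from the "fewer than 5 respondents" rule above.
MIN_CELL_SIZE = 5


def suppressed_group_means(rows, group_key, value_key, min_cell_size=MIN_CELL_SIZE):
    """Return per-group counts and means, hiding groups below the threshold."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])

    results = {}
    for group, values in groups.items():
        if len(values) < min_cell_size:
            # Too few respondents: report neither the count nor the mean.
            results[group] = {"n": None, "mean": None, "suppressed": True}
        else:
            results[group] = {"n": len(values), "mean": mean(values), "suppressed": False}
    return results


# Made-up example data: group A has 6 respondents, group B has only 3.
rows = [{"group": "A", "score": s} for s in [70, 80, 90, 60, 75, 85]]
rows += [{"group": "B", "score": s} for s in [88, 92, 79]]

results = suppressed_group_means(rows, "group", "score")
for group, summary in sorted(results.items()):
    print(group, summary)
```

In this example the mean for group A is reported, while group B is marked as suppressed and returns no values.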

How long will the review process take?

The review process focuses on maintaining privacy compliance during analysis, following the checks described in the previous answer. Your script will be manually reviewed by our team of engineers, and we strive to complete the review within 1 to 2 business days.

What does it mean when I get an error? What steps should I take to resolve it?

Errors can have many causes. The most common are simple coding mistakes, such as incorrect variable names. We recommend testing a complete run of your code on the synthetic datasets before submitting it, and resolving ("debugging") any errors you encounter along the way. Searching for the error message online will typically help you address it, and Stack Overflow is a particularly useful resource. Here's another resource for typical R coding errors.

What if I use a different analytic toolkit than RStudio?

At OpenStax, we promote the use of open-source toolkits that are loved by our Researcher community. For the moment, we're starting out with R and RStudio, and as we continue to expand we're exploring adding Python, Jamovi, and Julia-based enclaves and resources. We are also considering the feasibility of enabling researchers to use proprietary tools with their own licenses. We will share periodic updates on how our product evolves over time.
