Are your students achieving mastery?

Mastery learning is a philosophy and an instructional strategy focused on ensuring that a student gains sufficient knowledge of a specific area before moving on to subsequent topics. The approach differs from traditional forms of delivery, where everybody is expected to learn the content within the same fixed timeframe. Mastery is not a new concept (Bloom, 1976; Wentling, 1973); however, it has gained momentum with the availability of technology (Motamedi, 2006). Applying the Mastery Learning Approach (MLA) has yielded significant improvements in fields such as Mathematics (Yemi, 2018).

In applying MLA, there are a number of concerns that need to be addressed.

An important fact about the data is the strong temporal dimension associated with it. Students improve over time, and therefore the performance demonstrated at a given step is influenced only by the level of mastery at that point.

The dataset provides some interesting insights into the teaching and learning focus (in this instance the tutoring system), the learning areas students struggle with most and the progression of learning.

The chart below shows the number of attempts for each knowledge component (KC) by all students, along with the corresponding success and failure counts. Note that only half of the KCs are displayed here, but the trend is the same for the rest.

Distribution of Knowledge Components

It is evident from the graph above that there is significant variability in the distribution of the KCs addressed during the student learning journeys. Whether this is acceptable or intended is a discussion for educators and technology providers.

Note that the measurement considered is the “First Attempt”, which records whether or not the student managed to get the step right on the first attempt. Also note that there are significant differences in the number of correct first attempts across the range of KCs. This could be an indication that some KCs are harder to master than others.

On the subject of difficulty, we can consider the average number of incorrect attempts and the average number of hints requested as indicative measures.

Number of incorrects and hints per KC

More incorrect attempts or hints requested would be an indication of a concept that is difficult to grasp. If we closely examine the tallest bar in this graph, “setting the slope”, it corresponds to only a tiny bar in the counts graph (the chart before). This might be an indication of some form of bias in the tutoring system towards the happy path, but a more detailed analysis is required to draw a conclusion, which is beyond the scope of this article.

It might be worthwhile to perform the same analysis based on the “problem hierarchy”, which is a representation of the curriculum.
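As a rough illustration of how such per-KC difficulty measures could be computed, the sketch below aggregates a handful of toy step records. The records, the field names "KC", "Incorrects" and "Hints", and the second KC name are assumptions made for the example; they are not taken from the dataset.

```python
from collections import defaultdict

# Toy step records (made up for illustration). "Incorrects" and "Hints"
# are assumed field names; "hypothetical KC B" is a placeholder KC.
steps = [
    {"KC": "setting the slope", "Correct First Attempt": 0, "Incorrects": 3, "Hints": 2},
    {"KC": "setting the slope", "Correct First Attempt": 1, "Incorrects": 1, "Hints": 0},
    {"KC": "hypothetical KC B", "Correct First Attempt": 1, "Incorrects": 0, "Hints": 0},
    {"KC": "hypothetical KC B", "Correct First Attempt": 1, "Incorrects": 0, "Hints": 0},
    {"KC": "hypothetical KC B", "Correct First Attempt": 0, "Incorrects": 1, "Hints": 1},
]


def difficulty_by_kc(rows):
    """Aggregate attempt counts, first-attempt success rate, and mean
    incorrects/hints for each KC."""
    totals = defaultdict(lambda: {"n": 0, "correct": 0, "incorrects": 0, "hints": 0})
    for row in rows:
        t = totals[row["KC"]]
        t["n"] += 1
        t["correct"] += row["Correct First Attempt"]
        t["incorrects"] += row["Incorrects"]
        t["hints"] += row["Hints"]
    return {
        kc: {
            "attempts": t["n"],
            "first_attempt_rate": t["correct"] / t["n"],
            "mean_incorrects": t["incorrects"] / t["n"],
            "mean_hints": t["hints"] / t["n"],
        }
        for kc, t in totals.items()
    }


summary = difficulty_by_kc(steps)
for kc, stats in summary.items():
    print(kc, stats)
```

A KC with few attempts but high mean incorrects and hints, like the first one in this toy data, is the pattern discussed above.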

Hints and incorrects by problem hierarchy

This type of analysis can provide vital insights into the parts of the curriculum that students struggle with, so that the necessary improvements can be made to the lesson plans.

The more time students spend on solving a problem, the more likely it is that the problem is complex, and we can expect fewer students to be successful on the first attempt. The graph below is a correlation analysis of this point.

Correlation between mean step duration and mean first attempt success ratio

There is a clear negative correlation between the mean step duration and mean first attempt success ratio. This is particularly important for us later when we look at feature selection for building the predictive models. We will look at a more comprehensive correlation analysis later in the article.
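For readers who want to reproduce this kind of check, here is a minimal sketch of a Pearson correlation between the two aggregated measures. The four data points are invented for illustration only and do not come from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented per-problem aggregates: mean step duration (seconds) and
# mean first attempt success ratio.
mean_step_duration = [5.0, 10.0, 20.0, 40.0]
mean_first_attempt_success = [0.9, 0.85, 0.6, 0.3]

r = pearson(mean_step_duration, mean_first_attempt_success)
print(f"Pearson r = {r:.3f}")
```

A value of r close to -1 indicates that longer steps go together with lower first attempt success, which is the relationship described above.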

The dataset provides some interesting insights into the learning pathways of students. Analyzing student performance over a period of time provides an indication of whether they are converging towards mastery or not.

For this analysis, two students were picked, one from the top 10 and one from the bottom 10 in terms of mean first time accuracy rates, and their learning progression was analyzed. Another criterion considered for student selection was the number of interactions, so that there are enough data points to notice a trend. The student ids below are anonymized for privacy reasons.

Student ID : Ds3B2dRQo8 / overall mean first time accuracy : 0.847486

Progression of a more successful student

Note that the records are ordered on the x-axis by the sequence number (which follows the attempt start times) so that it is a reflection of the progress over time. The colors correspond to different KCs. The skill represents the average first time accuracy for the specific KC, for the specific student observed at that point in time.

It is evident from the graph above that this student converges towards mastery (i.e. skill level moves closer to 1 over time) in almost all the KCs. Let’s consider a less successful student.

Student ID : LM96hW22T2 / overall mean first time accuracy : 0.345376

Progression of a less successful student

The graph above shows an up and down trend and there is no sign of convergence in most of the KCs.

This type of analysis could answer some of the questions we raised earlier in this article. It can serve as a useful tool for teachers and parents to understand how their students/children are progressing and whether or not they are achieving mastery.

While the trend analysis can help in making long term intervention decisions, there is also a need for a tool that can quickly assess whether the students are ready for the next level. This can be achieved by building a predictive model, which will be the focus of the next sections.

The data preparation process focussed on achieving two main objectives.

The dataset contains a “KC(Default)” column that lists the KCs associated with the specific step. It is possible that a step is associated with multiple KCs, in which case these are separated by a “~~”. To make the analysis more convenient, these KCs were represented in an expanded format where each KC has its own column and a row value of 1 indicates the presence of the KC in the step. This one-hot encoding approach is particularly useful when we train the machine learning models later.

The presence of a temporal dimension is an important part of the data preparation exercise. There are different methods for incorporating temporal dimensions, such as Bayesian networks, but here we use an aggregation approach so that we can use traditional classifiers to make the predictions. So at each step we calculate the mean success rate for each student/KC combination and add this as a feature.

In the feature selection process, we try to identify the features that drive our outcome, which is being successful on the first attempt. We basically need to look at two categories of features here.
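The expansion described above can be sketched in a few lines. The "KC(Default)" column and the "~~" separator come from the dataset description; the "KC_" column prefix, the function names and the sample rows are assumptions for this example.

```python
def split_kcs(value, sep="~~"):
    """Return the KC names in one cell; an empty list for missing values."""
    if not value:
        return []
    return [kc.strip() for kc in value.split(sep) if kc.strip()]


def one_hot_kcs(rows, column="KC(Default)", sep="~~"):
    """Build one 0/1 column per KC found anywhere in `column`."""
    vocabulary = sorted({kc for row in rows for kc in split_kcs(row.get(column), sep)})
    encoded = []
    for row in rows:
        present = set(split_kcs(row.get(column), sep))
        # "KC_" is a hypothetical prefix used to name the new columns.
        encoded.append({f"KC_{kc}": int(kc in present) for kc in vocabulary})
    return vocabulary, encoded


# Sample rows; "hypothetical KC B" is a placeholder name.
steps = [
    {"KC(Default)": "setting the slope"},
    {"KC(Default)": "setting the slope~~hypothetical KC B"},
    {"KC(Default)": None},
]

vocabulary, encoded = one_hot_kcs(steps)
for row in encoded:
    print(row)
```

In this sketch, a step with no KC listed ends up with 0 in every KC column.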
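One possible way to compute such an aggregated feature is shown below. It assumes the rows are already sorted by time, and it uses only the attempts that came before the current step so that the feature does not include the outcome being predicted. That choice, the "prior_success_rate" name and the simplified "KC" field are assumptions of this sketch, not details confirmed by the article.

```python
from collections import defaultdict


def add_running_success_rate(rows):
    """Attach, to each row, the student's mean first-attempt success on the
    same KC over all EARLIER rows. Rows must be in chronological order."""
    history = defaultdict(lambda: [0, 0])  # (student, KC) -> [attempts, correct]
    result = []
    for row in rows:
        key = (row["Anon Student Id"], row["KC"])
        attempts, correct = history[key]
        # None marks "no history yet" for this student/KC combination.
        rate = correct / attempts if attempts else None
        result.append({**row, "prior_success_rate": rate})
        history[key][0] += 1
        history[key][1] += row["Correct First Attempt"]
    return result


# Made-up interaction log, already ordered by time.
log = [
    {"Anon Student Id": "s1", "KC": "setting the slope", "Correct First Attempt": 0},
    {"Anon Student Id": "s1", "KC": "setting the slope", "Correct First Attempt": 1},
    {"Anon Student Id": "s2", "KC": "setting the slope", "Correct First Attempt": 1},
    {"Anon Student Id": "s1", "KC": "setting the slope", "Correct First Attempt": 1},
]

features = add_running_success_rate(log)
for row in features:
    print(row["Anon Student Id"], row["prior_success_rate"])
```

The None values for first encounters are one natural source of missing values in the accumulated success rate columns.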

We would drop “id” type of features like the Anon Student Id, Problem Hierarchy and Problem Name to avoid overfitting and replace these with appropriate engineered features.

We start the feature selection process by looking at the null values in the original columns.

Out of the columns with null values, we are not interested in Step Start Time, Correct Transaction Time, Correct Setup Duration, Error Step Duration and Opportunity, given that these wouldn’t logically relate closely to the outcome (which is Correct First Attempt). Another reason to drop these columns is the large number of missing values. Any attempt to impute replacements for a large number of missing values may degrade the performance of the model. KC(Default) is still important, but we have already captured it during the data preparation process.

There are still missing values in the Step Duration column, which is an important feature, as shown above in one of the charts. Given that only a small number of rows are affected, we can drop these rows from the dataset. For other missing values (in KC expansion columns and accumulated success rate columns) we can replace the nulls with 0s.

Having got all the features together, we can now look at the correlation matrix to understand the effectiveness of the selected features. Note that the features related to the KCs and the accumulated success rates (216 columns) are left out to keep the analysis focussed.

For training our predictive model we evaluated Logistic Regression, SVM and KNN algorithms. This was appropriate given the size of the dataset (a training set of nearly 600k records), the type of columns (all numerical) and the type of classification (binary classification).
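The two rules above (drop rows without a Step Duration, fill the remaining gaps with 0) can be expressed as a small helper. The function name, the feature column names other than "Step Duration", and the sample rows are hypothetical.

```python
def clean_rows(rows, required="Step Duration", fill_value=0):
    """Drop rows missing the required column; replace other nulls with 0."""
    cleaned = []
    for row in rows:
        if row.get(required) is None:
            continue  # too important to impute, so the row is dropped
        cleaned.append({key: (fill_value if value is None else value)
                        for key, value in row.items()})
    return cleaned


# Hypothetical rows: one-hot KC column and accumulated success rate included.
raw = [
    {"Step Duration": 12.0, "KC_setting the slope": 1, "prior_success_rate": None},
    {"Step Duration": None, "KC_setting the slope": 1, "prior_success_rate": 0.5},
    {"Step Duration": 7.0, "KC_setting the slope": None, "prior_success_rate": 1.0},
]

cleaned = clean_rows(raw)
for row in cleaned:
    print(row)
```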

Out of the algorithms tried, only Logistic Regression converged in a reasonable time period and resulted in an accuracy of 0.907803.

The related confusion matrix is displayed below.

Confusion Matrix
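To make the link between a confusion matrix and the reported accuracy explicit, the sketch below counts the four cells for a binary classifier and derives accuracy from them. The labels are toy values chosen for the example; they are not the model's actual predictions.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            counts["tp" if actual == 1 else "fp"] += 1
        else:
            counts["tn" if actual == 0 else "fn"] += 1
    return counts


def accuracy(counts):
    """Share of all predictions that match the actual label."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total


# Toy labels: 1 = correct on first attempt, 0 = not.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]

counts = confusion_counts(y_true, y_pred)
print(counts, accuracy(counts))
```

When one class dominates, as can happen with first attempt outcomes, it is worth reading the individual cells as well as the overall accuracy.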

The accuracy was further improved through hyperparameter tuning, which resulted in an accuracy of 0.946260.

The hyperparameter values related to this result are,

It is widely accepted that every child is unique and therefore their learning journeys will differ from one another. It is clear that one-size-fits-all approaches would fail, but without the support of technology, individualization can become overwhelming for educators. Throughout this article we tried to look at students’ journeys through the lens of data and to build a predictive model that can provide some indication of how a given group of students would succeed in a given challenge.

The type of data visualizations presented in this article can be really powerful in the hands of teachers and parents. They provide valuable information on where the student is at and whether the student is on the right track for achieving mastery. Looking holistically at how students have responded to tasks can also provide valuable information for curriculum designers and teachers to fix the lesson plans. The type of predictor that we built can be really useful for teachers who need to quickly assess the readiness of their students.

The approach could be further enhanced to build a recommender that suggests learning pathways based on the similarity of individuals. A collaborative filtering approach would likely be well suited to such an implementation. This could fast-track students towards mastery and save the time, effort and resources available to educators.
