Educational data mining and learning analytics

Within the discipline of education, two fields related to Big Data have emerged: Educational Data Mining (EDM) and Learning Analytics (LA). At the inception of these two fields, there was differentiation between their general purposes. EDM was conceptualized as work focused on tools, techniques, and methodologies for collecting, aggregating, and analyzing Big Data related to education. By contrast, LA focused more on applying the findings from these processes to optimize educational interactions and learning. As both fields have grown, however, the distinction between the definitions has blurred, merging into a common vision of extracting information from educational data to help inform and implement instructional decision making (Liñán & Pérez, 2015). To simplify this discussion, within the ensuing sections we default to the combined term EDM/LA to describe the use of Big Data within the education sector.

EDM/LA, at the mile-high view, targets the collection, analysis, and reporting of data from and about students and the learning environments in which they are embedded (Siemens, 2013). The more fine-grained definitions of EDM/LA vary, with some researchers focusing explicitly on the use of student-generated data for the purpose of personalizing learning (Junco & Clem, 2015; Xing, Guo, Petakovic, & Goggins, 2015). EDM/LA can help to model learner behavior as a means to capture patterns in learner strategy, understand what learners know, and identify where misconceptions might lie. Because these data can be captured in real time, they can potentially be used across stakeholders to provide scaffolding for learners, feedback to teachers, or even overall progress indicators for administrators (Bienkowski, Feng, & Means, 2012). However, for educational systems to provide personalized learning experiences, students must provide information about themselves and regularly interact with the system to create a wide array of analyzable data (Jones, 2015). Once a system is able to build a model of the learner, including the student's cognitive and affective characteristics and common interaction patterns, it models what the student knows (or does not) as a means of illuminating the gap between where the student currently is and where they need to be to achieve the desired learning outcomes (Bienkowski et al., 2012). This model can be used to create a learning pathway tailored to the specific needs of individual students rather than a mass instructional design plan that homogenizes learning across an entire group of students.
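To make the notion of a learner model concrete, one widely used EDM technique for estimating what a student knows from a stream of responses is Bayesian Knowledge Tracing. The sketch below is a minimal illustration of that general technique, not a component of any system discussed in this chapter; the parameter values and the `update_mastery` function are hypothetical.

```python
# Minimal Bayesian Knowledge Tracing (BKT) sketch: after each observed
# response (correct/incorrect), update the probability that the student
# has mastered a skill. All parameter values are illustrative only.

P_INIT = 0.3   # P(L0): prior probability the skill is already mastered
P_LEARN = 0.1  # P(T): probability of learning the skill on each attempt
P_GUESS = 0.2  # P(G): probability of answering correctly without mastery
P_SLIP = 0.1   # P(S): probability of answering incorrectly despite mastery

def update_mastery(p_mastery: float, correct: bool) -> float:
    """Bayesian update of P(mastery) given one observed response."""
    if correct:
        posterior = p_mastery * (1 - P_SLIP) / (
            p_mastery * (1 - P_SLIP) + (1 - p_mastery) * P_GUESS)
    else:
        posterior = p_mastery * P_SLIP / (
            p_mastery * P_SLIP + (1 - p_mastery) * (1 - P_GUESS))
    # Account for the chance the student learned the skill on this attempt.
    return posterior + (1 - posterior) * P_LEARN

# Example: a student answers incorrectly twice, then correctly three times.
p = P_INIT
for outcome in [False, False, True, True, True]:
    p = update_mastery(p, outcome)
    print(f"observed {'correct' if outcome else 'incorrect'}: "
          f"P(mastery) = {p:.2f}")
```

As the example output shows, the mastery estimate falls after the incorrect responses and climbs as correct ones accumulate; this is precisely the kind of gap-between-current-and-desired-state signal a personalized learning pathway could act on.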

To date, most of the work operating under this definition of EDM/LA has concentrated on very broad learning metrics to study student retention (Arnold, 2010), engagement, learning outcomes (Hrabowski, Suess, & Fritz, 2011), and return on investment regarding technology procurement and deployment (Norris, Baer, Leonard, Pugliese, & Lefrere, 2008). One of the most well-known EDM/LA systems developed to target these outcomes is the Course Signals (CS) system developed at Purdue University (Arnold & Pistilli, 2012). CS works by extracting data from multiple university sources (e.g., admissions information, course interaction data) and subsequently analyzing this aggregated data to generate actionable intelligence for each student on campus in the form of a risk assessment (Arnold, 2010). The underlying algorithm predicts each student's risk level from four parameters: performance (points earned in current courses), effort (interaction with the course management system compared to peers), prior academic history, and demographics. The algorithm then categorizes students by their likelihood of success, from low to high. Instructors are provided with the risk assessment prediction and, if the student's trajectory needs correcting, can respond in several ways: through computer-generated email, text messages, the course management system, or a referral to a face-to-face meeting (Pistilli & Arnold, 2010). Investigations of the efficacy of the CS system indicate that in courses in which CS is implemented, students obtain more passing grades and there are fewer dropouts and withdrawals (Arnold, 2010).
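As a rough sketch of how a system in the spirit of CS might combine such parameters into a risk category, consider the following. The weights, field names, and thresholds are invented for exposition and do not reproduce Purdue's actual algorithm; only the red/yellow/green-style banding mirrors how CS reports risk.

```python
# Illustrative (not actual) Course Signals-style risk scoring: combine
# normalized performance, effort, prior history, and demographic factors
# into a weighted score, then bucket students into risk categories.
# All weights, field names, and thresholds are hypothetical.

from dataclasses import dataclass

@dataclass
class StudentRecord:
    performance: float    # fraction of available points earned so far (0-1)
    effort: float         # LMS interaction relative to peers (0-1)
    prior_history: float  # e.g., normalized prior GPA / test scores (0-1)
    demographics: float   # preparation/characteristics indicator (0-1)

WEIGHTS = {"performance": 0.5, "effort": 0.2,
           "prior_history": 0.2, "demographics": 0.1}

def risk_category(s: StudentRecord) -> str:
    """Map a weighted success score to a low/moderate/high risk label."""
    score = (WEIGHTS["performance"] * s.performance
             + WEIGHTS["effort"] * s.effort
             + WEIGHTS["prior_history"] * s.prior_history
             + WEIGHTS["demographics"] * s.demographics)
    if score >= 0.7:
        return "low risk"       # "green": on track
    if score >= 0.4:
        return "moderate risk"  # "yellow": instructor may reach out
    return "high risk"          # "red": intervention recommended

print(risk_category(StudentRecord(0.55, 0.3, 0.6, 0.5)))  # -> moderate risk
```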

Although there is clear merit to the CS system, the granularity of both the data and the feedback provided to students is pedagogically limited. While instructors are informed of a student's risk level in an automated fashion, the information provided does little to help the instructor understand why a particular student is faltering. What remains largely untapped in CS, and other systems like it, is the use of EDM/LA as a tool to examine the strategic knowledge and processes that students enact during learning, and how these in situ practices can inform instructors, real or virtual, about real-time intervention supports (Dietz-Uhler & Hurn, 2013). For example, such data could provide guidance on the provision or selection of relevant learning resources and content, the insertion of prompts for reflection and awareness, the detection of unproductive learning strategies, the delivery of timely hints and instruction, and the identification of affective states (e.g., boredom, frustration) of the learner (Verbert et al., 2012).
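One way to picture such real-time supports is as a mapping from detected learner states to interventions. The sketch below is a hypothetical illustration of that idea; the state labels, detector outputs, and intervention messages are assumptions, not features of CS or any other system cited here.

```python
# Hypothetical rule-based mapping from detected learner states to the
# kinds of real-time interventions listed above. State labels and
# messages are invented for illustration.

INTERVENTION_RULES = {
    "unproductive_strategy": "Suggest a worked example for the current topic.",
    "repeated_errors":       "Deliver a targeted hint on the misconception.",
    "idle_after_failure":    "Insert a prompt for reflection on the last attempt.",
    "boredom":               "Offer a more challenging variant of the task.",
    "frustration":           "Recommend a short review resource and encouragement.",
}

def select_interventions(detected_states: list[str]) -> list[str]:
    """Return the interventions triggered by the detected learner states."""
    return [INTERVENTION_RULES[s] for s in detected_states
            if s in INTERVENTION_RULES]

# Example: an affect detector flags frustration alongside repeated errors.
for action in select_interventions(["repeated_errors", "frustration"]):
    print(action)
```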

Other researchers and EDM/LA practitioners focus more squarely on defining the purpose of LA as a tool for adapting interventions to support productive student learning strategies (Drachsler & Kalz, 2016; Rubel & Jones, 2016). In a sense, this perspective on EDM/LA creates an informed strategy for providing targeted feedback to the learner. Feedback informs individuals about what they did correctly or incorrectly, as well as how close they are to accomplishing desired outcomes. When instructors have a nuanced understanding of student successes and failures, they can intervene in much more specific ways, providing students with detailed feedback that helps them regulate their learning behaviors and recognize their strengths and weaknesses. However, examining the types of feedback typically provided to students in online learning environments, Tanes, Arnold, King, and Remnet (2011) noted that instructors rarely provided instructive, elaborated, or process-related feedback. Rather than receiving feedback on how to address deficient strategies in their approach to learning, or misconceptions in their understanding, students identified as “off course” tended to receive messages carrying low-level, summative feedback (Gasevic, Dawson, & Siemens, 2015). The default to summative feedback likely stems from the overwhelming volume of feedback that would be required in a large online course.

EDM/LA is a possible avenue for delivering high-quality formative feedback to every student. For example, IBM’s “Smarter Education Group” has been working on one such automated EDM/LA system, Personalized Education Through Analytics on Learning Systems (PETALS). PETALS is an automated machine-learning application that continuously learns from data collected through students’ interactions with the learning system, their achievements and failures on individual learning modules, and the changes in their learning behaviors in response to intervention adaptations. In addition to the data created through students’ interactions within the PETALS system, data is also culled from external sources to iteratively test and generalize identified patterns and profiles on an independent set of students (IBM, 2013). The system uses this corpus of data to derive a model of the student in math, for example, with specific attention to the strategies used when attempting to solve problems. It may notice that a student is stuck in a “trial and error” strategy loop and nudge them to review a specific resource pertaining to the issue they are currently attempting to address, helping the student to regulate their behavior and learn a more productive approach. While the PETALS system was able to detect the specific strategies students were using, ascertain whether those strategies were productive, and intervene in cases where students’ strategy use was either unproductive or faulty, research examining the implementation of PETALS has indicated that it is no better than standard educational practice at improving student performance in math (Pool, 2015). As we know from psychological and educational research, more feedback is not necessarily better; the quality and timing of feedback are also critical.
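As a concrete illustration of the kind of strategy detection attributed to PETALS, the hypothetical sketch below flags a “trial and error” loop when a student logs several rapid, failing attempts on the same problem without consulting help resources. The event format, thresholds, and function names are assumptions for exposition, not IBM’s implementation.

```python
# Hypothetical strategy-loop detector: flag a "trial and error" pattern
# when a student logs many rapid failed attempts on the same problem
# without opening any help resources. Thresholds are illustrative only.

from dataclasses import dataclass

@dataclass
class Attempt:
    problem_id: str
    correct: bool
    seconds_spent: float   # time spent before submitting
    opened_resource: bool  # did the student consult help material?

def in_trial_and_error_loop(attempts: list[Attempt],
                            min_attempts: int = 4,
                            max_seconds: float = 20.0) -> bool:
    """True if the most recent attempts look like unreflective guessing."""
    recent = attempts[-min_attempts:]
    if len(recent) < min_attempts:
        return False
    same_problem = len({a.problem_id for a in recent}) == 1
    all_failed_fast = all(not a.correct and a.seconds_spent < max_seconds
                          for a in recent)
    no_help_used = not any(a.opened_resource for a in recent)
    return same_problem and all_failed_fast and no_help_used

# Example: four quick failures on the same problem trigger a nudge.
log = [Attempt("alg-7", False, 8.0, False) for _ in range(4)]
if in_trial_and_error_loop(log):
    print("Nudge: review the worked example for this problem type.")
```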

 