Trends and Futures

Day 1: The What and Why of Learning Analytics (LA)

Written by Dr. Patrick Tran, UNSW Canberra

In this post we will briefly cover some basic concepts and provide pointers to further reading. We will not go into the technical details of learning analytics techniques and tools, but rather give you an overview of the topic.

What is Learning Analytics (LA)?

Learning Analytics (LA) is defined as the “measurement, collection, analysis, and reporting of data about learners and their contexts, for the purposes of understanding and optimizing learning and the environments in which it occurs” (Siemens and Long 2011). Learners’ data include their digital footprints in learning environments, feedback data, demographics and enrolment details. LA aims to:

  • induce relationships or patterns that would otherwise go unsuspected, and
  • summarize them in ways that are both understandable and useful to the stakeholders.
Image by NATS.

As described above, LA is not only about applying advanced data science techniques to vast amounts of data to gain insights into learning, but also about interpreting these findings. LA can help group learners based on their learning behaviors (clustering), identify outlier records that may indicate poor performance, and reveal potential correlations between learners’ engagement and their academic progression.
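As a rough illustration of two of these ideas (correlation and outlier detection), here is a minimal Python sketch. The student records, the field choices (forum posts as a proxy for engagement, final mark as a proxy for progression) and the 1.5 standard deviation threshold are all assumptions made up for this example, not a recommended method.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical per-student records: (student id, forum posts, final mark).
records = [
    ("s1", 12, 78),
    ("s2", 3, 55),
    ("s3", 20, 88),
    ("s4", 8, 67),
    ("s5", 0, 21),
    ("s6", 15, 81),
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def low_outliers(values, labels, threshold=1.5):
    """Return the labels whose value lies more than `threshold` population
    standard deviations below the mean (assumes the values are not all equal)."""
    mu, sigma = mean(values), pstdev(values)
    return [lab for lab, v in zip(labels, values) if (v - mu) / sigma < -threshold]


ids = [rec[0] for rec in records]
posts = [rec[1] for rec in records]
marks = [rec[2] for rec in records]

r = pearson(posts, marks)
flagged = low_outliers(marks, ids)
print(f"engagement/mark correlation: {r:.2f}")
print("possible low-performance outliers:", flagged)
```

A high correlation here would only suggest a relationship worth investigating. As noted above, interpreting the finding in its educational context is the other half of LA.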

LA is a multidisciplinary field that intertwines instructional design, psychology, statistics and knowledge discovery to develop a unified explanation of human learning. A number of research fields have objectives similar to those of LA, including:

  • Academic Analytics: applies statistical techniques on large academic data sets to inform pedagogical practices and institution-wide policies. To many LA practitioners, the difference between Academic Analytics and Learning Analytics is more terminological than conceptual.
  • Educational Data Mining (EDM): applies computerized methods to search for patterns in huge educational data sets. EDM is said to transform “unintelligible, raw educational data” into actionable intelligence that is “legible and usable to students as feedback, to professors as assessment, or to universities for strategies” (Romero et al 2016). Unlike Academic Analytics, which takes a holistic approach to analyzing the learning experience as a whole, EDM implements a “reductionist approach” by reducing large volumes of data to smaller, more manageable components and discovering relationships among them (Siemens and Baker 2012).

For the sake of simplicity, we will use the term LA to cover both Academic Analytics and EDM in this course.

A Typical LA Workflow

Many LA frameworks can be found in the relevant literature. Most of them fit into the high-level generic workflow below, which I often use in my analytical work.

A Typical Learning Analytics Workflow Chart

This framework collects and processes data related to learning, and interprets and feeds the analytical findings back to the previous steps. The framework components are described below:

  • Analytic Questions: first, we define the analytic problems or objectives by formulating specific questions for investigation, e.g. “measure student engagement in online forums”, or “what is the relationship between student participation in online forums and course outcomes?”.
  • Pedagogical Insights: identify learning theories, assumptions or observations that may be relevant to this LA application. These discipline-specific insights help guide the other LA processes (Data Processing and Interpretation) and narrow down the search space of solutions to the underlying questions. For example, the literature may suggest that the vocabulary used in forum discussions indicates the quality of students’ engagement, which would justify collecting this data for our LA project.
  • Data Collection: retrieve data from sources.
  • Data Processing: organize data and transform it into meaningful and useful knowledge. I have used MS Excel, Python, R and NVivo for this purpose, depending on the data types and contexts.
    • Data Preparation: pre-process raw data for further analysis. This includes data cleaning (handling missing data and errors), data filtering, and feature engineering (deriving new variables from existing ones).
    • Analytic Engine: apply analysis methods to the prepared data. Depending on the type of data and the analysis objectives, different knowledge-discovery methods can be implemented, including quantitative analysis (descriptive and inferential statistics, data mining) or qualitative analysis (thematic analysis). Data visualization can be used to assist these methods in validating or illustrating the analysis results.
  • Data Interpretation: discuss the analysis results from the previous step. Where appropriate, the results should be contrasted or compared against prior experiences, synthesized and generalized. The educational context of the LA application should be considered when interpreting the results. The analytic outcomes generated from this step could be intelligent feedback to students for an assessment, learning adaptation and personalization, and new knowledge that is useful and actionable.
  • Feedback: provide feedback to the previous steps. In this step, we:
    • Address the analytic questions.
    • Deliver the interpreted analysis results to the intended recipients, i.e. the stakeholders. This is when you write up a report describing your findings, action plans and recommendations. In some user-facing LA applications, a dashboard is used to deliver the insights back to the users.
    • Modify the previous steps based on the insights discovered, e.g. changing parameters used in the analytic engine for better results, or validating the pedagogical insights used.
  • LA Stakeholders: LA aims not only to assist (directly) learners and educators, but also (indirectly) other stakeholders.
Image by Berkeley Lab.
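
To make the Data Preparation step above more concrete, here is a minimal Python sketch of cleaning, filtering and feature engineering on a tiny data set. The column names (logins, posts, weeks_enrolled), the rows and the derived variables are hypothetical and chosen only for illustration. A real project would adapt each rule to its own data and analytic questions.

```python
# Hypothetical raw export: one row per student, with gaps and an error.
raw_rows = [
    {"student": "s1", "logins": "14", "posts": "6", "weeks_enrolled": "7"},
    {"student": "s2", "logins": "", "posts": "2", "weeks_enrolled": "7"},    # missing value
    {"student": "s3", "logins": "-3", "posts": "1", "weeks_enrolled": "7"},  # impossible value
    {"student": "s4", "logins": "21", "posts": "0", "weeks_enrolled": "0"},  # never active for a full week
    {"student": "s5", "logins": "9", "posts": "9", "weeks_enrolled": "6"},
]


def to_int(text):
    """Parse a non-negative integer; return None for missing or invalid input."""
    try:
        value = int(text)
    except (TypeError, ValueError):
        return None
    return value if value >= 0 else None


def prepare(rows):
    """Clean, filter and enrich raw rows; returns a new list of dicts."""
    prepared = []
    for row in rows:
        logins = to_int(row["logins"])
        posts = to_int(row["posts"])
        weeks = to_int(row["weeks_enrolled"])
        # Data cleaning: drop rows with missing or erroneous values.
        if logins is None or posts is None or weeks is None:
            continue
        # Data filtering: keep only students enrolled for at least one week.
        if weeks < 1:
            continue
        # Feature engineering: derive new variables from existing ones.
        prepared.append({
            "student": row["student"],
            "logins_per_week": logins / weeks,
            "posts_per_login": posts / logins if logins else 0.0,
        })
    return prepared


clean = prepare(raw_rows)
for row in clean:
    print(row)
```

Dropping incomplete rows is only one possible cleaning policy. Depending on the analytic question, imputing missing values or keeping the rows and flagging them may be more appropriate.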

Educational Data

Educational data contains students’ digital footprints in virtual learning environments. In particular, it is any data on student learning and academic performance, demographic details, and course-specific data (activities, resources). This rich data is scattered around multiple systems, including learning management systems (LMS) such as Wattle (Moodle) and student information systems. In addition to data generated and stored in virtual environments, real-world contextual information such as classroom settings and face-to-face interactions can also be used to ensure a complete understanding of the learning experience.
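
Because this data is scattered across systems, a first practical step is often to link records by a shared student identifier. The sketch below assumes a hypothetical LMS event log and a hypothetical student information system extract. The field names and values are invented and do not reflect the actual export format of Moodle or any other system.

```python
from collections import Counter

# Hypothetical LMS activity log: one event per row.
lms_events = [
    {"student": "s1", "action": "view_resource"},
    {"student": "s1", "action": "forum_post"},
    {"student": "s2", "action": "view_resource"},
    {"student": "s1", "action": "quiz_attempt"},
    {"student": "s4", "action": "view_resource"},  # no matching enrolment record
]

# Hypothetical student information system extract, keyed by student id.
sis_records = {
    "s1": {"program": "BEng", "mode": "online"},
    "s2": {"program": "BSc", "mode": "on-campus"},
    "s3": {"program": "BA", "mode": "online"},  # enrolled but no LMS activity
}


def combine(events, enrolments):
    """Attach per-student LMS event counts to enrolment records.

    Returns the combined profiles plus the ids seen in the LMS log
    that have no enrolment record, so they can be checked by hand.
    """
    counts = Counter(event["student"] for event in events)
    combined = {
        student: {**details, "lms_events": counts.get(student, 0)}
        for student, details in enrolments.items()
    }
    unmatched = sorted(set(counts) - set(enrolments))
    return combined, unmatched


profiles, unmatched = combine(lms_events, sis_records)
for student, profile in profiles.items():
    print(student, profile)
print("LMS ids without an enrolment record:", unmatched)
```

Keeping the unmatched identifiers visible, rather than silently discarding them, helps surface data quality problems between systems early.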

For a more detailed overview of educational data types and sources, please check out the longer version of this post.

Why adopt LA in your teaching and learning?

The main objective of LA is to enhance student experience through evidence-based analytical intelligence and evaluation. Drivers for the adoption of LA include its abilities to:

  • Explore opportunities to leverage information about learners from multiple sources and various data types, ranging from extracurricular activities and social media postings to demographic details (age, gender), physiological metrics and face-to-face behaviours captured by activity trackers and surveillance cameras. This rich data enables LA to infer learners’ intentions, cognitive and metacognitive processes or even stress levels during learning activities (Zeide 2017).
  • Analyse the large volumes of learning data available on online platforms such as massive open online courses (MOOCs) and social media networks.
  • Discover new patterns and actionable knowledge about learning processes that might otherwise go unnoticed. The insights generated by LA can help drive learning improvements and inform institutional strategies.
  • Meet the diverse needs of a wide range of stakeholders interested in what LA can do, including learners, educators, researchers, institutions and government agencies. The insights generated by LA can be delivered to end users in many forms, such as periodic intelligence reports, ad-hoc reports, and real-time dashboards embedded in a web page or a mobile app.

For more examples on LA, or an overview of commonly used LA techniques, please check out the longer version of this post.

Activity:

In your opinion, who are the stakeholders in an LA application, i.e. who can affect or be affected by the application? What are their interests and skills/knowledge required (if any)?

Share your thoughts by posting a comment in the discussion forum.


C. Romero, R. Cerezo, A. Bogarín, and M. Sánchez-Santillán, “Educational process mining: a tutorial and case study using Moodle data sets,” in Data Mining and Learning Analytics: Applications in Educational Research, S. ElAtia, D. Ipperciel, and O. R. Zaïane, Eds., pp. 1-28: John Wiley & Sons, 2016.

G. Siemens and P. Long, “Penetrating the Fog: Analytics in Learning and Education,” EDUCAUSE Review, vol. 46, no. 5, 2011.

G. Siemens and R. S. J. d. Baker, “Learning Analytics and Educational Data Mining: Towards Communication and Collaboration,” in Proceedings of the 2nd International Conference on Learning Analytics and Knowledge, Vancouver, British Columbia, Canada, 2012, pp. 252–254.

E. Zeide, “The limits of educational purpose limitations,” University of Miami Law Review, vol. 71, 2017.

19 thoughts on “Day 1: The What and Why of Learning Analytics (LA)”

  1. The first obvious stakeholder with Learning Analytics (LA) is the student. What is done should benefit the student, not just treat them as a data source for exploitation, as happens with many apps.

    The next stakeholder is the teacher. Here again, there should be something in the LA for the teacher. The risk is that LA will be used to monitor the teacher’s performance and squeeze maximum work out of them, like they were a bicycle courier. To date, LA is not that much use to the teacher. I have read claims that with LA it is possible to identify students at risk, but you don’t need LA for that, just a weekly test.

    Other stakeholders are academics who want to research learning. It should be kept in mind that they require ethics approval to use student data, especially if the results are to be published. I suggest students would not mind approving this use of their data, if they were told about it.

    Then there are the various levels in the educational organization. Here again I worry the LA will be used to minimize cost and maximize revenue, rather than improve the quality of education.

    Government and non-government funding bodies, including industry, are also stakeholders, as are the community, who ultimately pay for much of the education, and stand to benefit from it.

    Whoever is doing LA will need some data analysis skills. The average academic will have enough training in quantitative research techniques to work with LA. However, they will need LA specialists to work with. It happens I used to do analysis of data from all Australian non-government schools using SAS, to work out new funding policy for the Federal Department of Education. The analysis was not that hard, and was used to work out how to share the $10B each year for schools.

    1. Hi Tom,

      Thanks for participating in the discussion. As you rightly pointed out: learners, educators (instructors and researchers), institutions and government agencies all have a stake in LA. I also share your concerns about the (potential) misuse of LA. We will definitely look into these a lot more on Day 3 of this course.

      I would also like to add some details on how LA can benefit the stakeholders. In particular, LA helps learners develop their self-awareness and self-reflection by making their learning (progress, performance) visible to themselves, and hence increases their participation and engagement.

      In most universities, you would have a group of Learning Administrators (educational designers and system admins): Educational designers work closely with instructors to design curriculum and learning processes, and are interested in the pedagogical implications of analytic outcomes (e.g. natural groupings of learners reflecting their learning styles). System admins, on the other hand, focus on the technical side of the learning environment, including implementing system-wide changes and collecting data.

      I am not sure about other institutions, but at UNSW we have a dedicated team called “Educational Intelligence & Analytics” that supports schools and individual academics with their LA needs. The educational data scientists (EDSs) in this team identify opportunities for leveraging data to build models and insights that drive learning improvement. EDSs are responsible for producing intelligence reports and for developing analytic models and solutions to support all LA stakeholders. EDSs are required to have strong technical skills (system, programming) not only to develop algorithms and models, but also to automate the other steps in this workflow.


  2. I think Tom was right on the money with listing the stakeholders and I agree it should ultimately be about what is most beneficial to students – as all education decisions/actions should be! I think LA is important and has a place at all levels of the university. University higher management should have an infrastructure in place to be conducting analysis at the highest levels to inform our strategic decisions and I’m not entirely confident this is occurring. At a local course level I think it is very important, as Pat mentioned, for self-reflection and adjusting content to be more engaging etc. I do think that it’s not as easy as it should be. Moodle logs aren’t great. It is a deep dive to understand what the logs are capturing, how to make sense of the data and then it requires at the very least knowing how to manipulate data in Excel – which I have found not all academics know how to use. I hope this course has some practical elements because I’d love to be able to help academics more with LA, particularly making better use of Moodle logs/reports.

    1. Hi Natalie, I have found similar limitations with the Moodle logs and reports – yes they give you an indication of who clicked on what and when, but I am always left thinking things like “did they understand it?” This is a big challenge for me in applying learning analytics in my own practice because I feel they provide a very incomplete picture and I worry about making generalisations from this. I think this is something we will consider in detail in Day 3 of the course in any case.

  3. Hi Natalie,
    We definitely should make use of the built-in Moodle reports (and the more communication-focused Moodleroom reports) for LA needs. On Day 2 of this course, we will briefly look into these built-in tools though you may find a more detailed overview of these techniques in the longer version of the post. The data analytics part is not for everyone, hence the need for collaboration with EDSs.

  4. I too see the main LA stakeholders as the students. I have heard of LA being used successfully in the UK as the basis for authentic, personalised, early intervention for non-engaged students. While imperfect, LA can be used (with other measures) to support student wellbeing. Thanks, Tom for your explanation of the other stakeholders – educators/trainers, university management, government, NGOs – the ecosystem stretches far and wide. I’d be interested in hearing everyone’s thoughts on data privacy/protection in this space.

    1. Hi Imogen, welcome! We’ll be looking at data privacy in more depth on Day 3 but obviously that could be an entire course on its own! I have a lot of thoughts on these issues myself – hoping to discuss them all with you in Day 3 and beyond!

  5. In a different Australian University, I was part of a team that used basic LA, coupled with direct input from teaching staff, to identify and support “at-risk” students. While we didn’t deep dive into the data, log tracking and social media were extremely useful in identifying students who were not engaging with their studies. It didn’t take long to see patterns emerge, which we were then able to address through necessary policy changes.

    Despite having been a part of the positive side of LA, I continue to hold deep concerns about using it. It genuinely felt like stalking. Once at-risk students were identified, that job then involved cold calling them to confirm the situation and initiate early intervention. 100% of students who I contacted had no idea I was able to track them in that manner (although, this probably says as much about digital literacy as LA). As has been expressed above, there is a lot of room for such information to be misused, or even abused.

    Having said that, I still continue to use some of the general ideas and lessons from that role to help keep tabs on my students. The main difference is that, now, I tend to stick to the low-tech data to identify at-risk students, e.g. attendance, body language, etc.

    1. Bhavani, thanks so much for sharing your experiences in using LA in this way! I have very similar concerns, and I note that Tom commented similarly on Day 2 about the impetus to act when LA is giving you information about students in trouble. Your calls with students, and their confusion, are certainly a key issue when thinking about LA – are students truly informed about what data they are giving the institution and how it will be used? And do they understand the impacts it might have? Digital literacies are a key part of this, as you say, but I also worry about teaching staff’s digital literacies and data interpretation. Working with people on things like interpreting Turnitin originality reports has taught me that there is significant room for interpretation and mis-interpretation in how data is presented to teachers!

      1. Hi Katie,

        I’m glad you raised interpretation. That has me more worried than the data collection side. In recent conversations with undergrad students, many opined that they have resigned themselves to the fact that their (digital) lives are out of their control and that big data knows everything about them. Yet, only a few showed interest in the value that data contains, how to understand and interpret it, and the consequences of misinterpretation.

        1. I have had similar experiences! Working with academics on interpreting relatively “straightforward” data from the LMS (Turnitin originality reports, Moodle logs, Echo360 analytics) has shown me how willing we all are to ascribe intention onto the data, with no evidence from the students themselves about what their intentions were. Our former DVCA Marnie Hughes-Warrington wrote about this sort of thing in her blog looking at Echo360 data + heat maps from lecture theatres, for example –

          What’s interesting to me is how these data sets are actively being used by universities to make decisions around pedagogical approaches, campus and classroom design, and other things. Sara de Freitas also wrote about this in relation to the LA project she oversaw while in the executive at Curtin in WA: This use of data was very far reaching, including everything from admissions and demographic data, learning analytics, as well as use of campus wifi and buildings (drawing on location tracking of student cards and logins) and so on.

    2. Hi Bhavani, thanks for sharing your story. Reading it I have the same concerns you, Katie and Natalie mention. Out of curiosity, did you find that other people involved in this process tended to discriminate against these students for their ‘at-risk’ status, or generally not treat them as favourably as others? Providing that extra level of support sounds like a great idea in theory, but only if handled by staff with a mature and sensitive approach to these issues (like you, I assume!)

      1. Hi Gemma,

        Based on my experiences, this did not result in discrimination. Fellow “LA stalkers” were given extensive training, on the human side as well as the analytics. Although we were a team, we were assigned to individual colleges/faculties, and occupied a hybrid space between academic and professional staff. This meant we could track students across their entire major/degree. Within my college, I worked closely with the academic staff to determine the best way to support our students. We decided that I would effectively be a proactive Access and Inclusion, seeking out students instead of waiting for them to ask for help. I worked with relevant academics to arrange appropriate support, although, eventually, they trusted my ability to come up with appropriate measures that didn’t hinder their teaching objectives. Simultaneously, academics and students both liked that I was providing a one-stop-shop. Students didn’t have to re-tell everything to each individual academic. Academics didn’t have to manage the student support. I could help students manage multiple courses, and knew that by granting an extension here, I wasn’t placing an additional burden on them in another course (where the extension inevitably clashed with that assignment). With appropriate supports in place, at-risk students were able to lose the “at-risk” label, and integrate into being a “mainstream” student. If anything, that role helped to reduce discrimination, as staff and students didn’t have as many negative interactions with each other.

        This sort of role is still possible without the LA ethical dilemma. It would simply require much more proactive manual inputs and student monitoring.

  6. Exactly Katie – I worry about the incomplete picture and us acting or reacting incorrectly in response. A colleague is doing her thesis on this very topic and using Moodle logs as well as qual interviews. I’m interested to see her findings and kind of hope it debunks some of the things we assume about students!

    1. Natalie, I am also interested in your colleague’s findings. From my own experiences, we found that non-engagement was often a conscious choice. These students didn’t appreciate the “stalking” and, for the most part, rejected offers of support. In contrast, the students who were struggling genuinely appreciated the early intervention. It would be interesting to find out whether that trend has continued, or whether that was specific to a particular discipline/cohort.

      1. I am fascinated by this idea of conscious non-engagement – it raises a range of other questions for me around what our expectations of students really are, and how clearly we communicate this. I’m interested in balancing how we support students who are struggling with over-surveillance, and where student choice and agency fits into this. If students know what they are supposed to do, but choose not to do it, is that not their prerogative? It’s a concern I have with increasing measurement of things like active learning approaches. If students choose not to engage in these modes of teaching, can we force them? So many things to consider! (Loving the discussion so far everyone!)

        1. Just to add to the questions, I am also increasingly torn by the dichotomy between higher education being an adult learning environment, and the fact that, at the end of the day, most of my students are still kids. Turning 18 does not magically equip one with andragogic principles and practices. Is a modicum of hand-holding, especially about meta-cognition, necessary? As technical adults, how far do any of our obligations extend?
