"Field Notes" is an occasional Connect column covering practical and philosophical issues facing admissions and registrar professionals. The columns are authored by various AACRAO members. If you have an idea for a column and would like to contribute, please send an email to the editor at connect@aacrao.org.
By Loralyn Taylor, PhD, Registrar and Director of Institutional Research, Paul Smith's College
The growth of the internet and, in particular, mobile access to the internet is driving a digital explosion in data and the applications of those data. Starting with applications such as Google’s search optimization and Amazon’s and Netflix’s content recommendation systems, big data has resulted in a revolution in data mining and its use to both predict and influence consumer behavior.
In contrast, higher education has far more data on its customers, i.e., students, than Google gets from a search or Amazon from a purchase, but these data have seldom been used. Every transaction with a campus office; every click, page view, discussion post or question answered in the learning management system (LMS); and every course registration, drop, midterm or final grade in the student information system (SIS) collects drop by drop into a vast trove of student data which, until recently, institutions have only tentatively accessed, mostly for post hoc analysis and aggregated reporting.
Learning analytics and student success
By applying big data mining techniques, learning analytics seeks to move institutional data use from post hoc reporting to generating immediately actionable information. Learning analytics can provide faculty and students with both reactive, real-time information on a student’s progress during a course and proactive, predictive information on how a student might do in a course, whether a student is likely to drop out or transfer, or whether a particular major may be a good fit based on the student’s grades to this point.
The first International Conference on Learning Analytics and Knowledge defined learning analytics as: “the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs.” (1)
Learning analytics works by combining data from an institution’s SIS, LMS and other resources (CRM, courses offered through adaptive learning platforms, tutoring software, virtual labs, etc.) and then applying data mining and predictive modeling techniques to gain actionable information. This information could include predictions about a student’s likely performance at the level of completing their degree or a specific major, down to personalizing the timing and order of presentation of learning topics within a single course for each student.
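As a rough illustration of the combining step described above, the sketch below joins SIS and LMS records on a student ID and applies a simple hand-written rule to flag students for follow-up. The field names, sample values and thresholds are all assumptions made for illustration; a production system would more likely use a fitted predictive model than a fixed rule.

```python
# Hypothetical records; field names and values are illustrative assumptions.
sis_records = [
    {"student_id": "S1", "midterm_grade": 82, "credits_attempted": 15},
    {"student_id": "S2", "midterm_grade": 58, "credits_attempted": 12},
    {"student_id": "S3", "midterm_grade": 71, "credits_attempted": 18},
]
lms_records = [
    {"student_id": "S1", "logins_last_14_days": 11, "assignments_missed": 0},
    {"student_id": "S2", "logins_last_14_days": 2, "assignments_missed": 3},
    {"student_id": "S3", "logins_last_14_days": 1, "assignments_missed": 1},
]


def combine(sis, lms):
    """Join SIS and LMS rows on student_id, keeping students present in both."""
    lms_by_id = {row["student_id"]: row for row in lms}
    combined = []
    for row in sis:
        match = lms_by_id.get(row["student_id"])
        if match is not None:
            combined.append({**row, **match})
    return combined


def risk_flags(student, min_grade=65, min_logins=3, max_missed=2):
    """Return reasons a student might warrant outreach (assumed thresholds)."""
    reasons = []
    if student["midterm_grade"] < min_grade:
        reasons.append("low midterm grade")
    if student["logins_last_14_days"] < min_logins:
        reasons.append("low LMS activity")
    if student["assignments_missed"] > max_missed:
        reasons.append("missed assignments")
    return reasons


flagged = {s["student_id"]: risk_flags(s) for s in combine(sis_records, lms_records)}
for student_id, reasons in flagged.items():
    print(student_id, reasons or "no flags")
```

Returning the reasons, and not only a yes/no flag, keeps the result explainable to the advisor or student who sees it.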
In student success, learning analytics holds tremendous promise for helping institutions identify at-risk students as early as possible and identify the best possible intervention for each individual student’s challenges. Learning analytics can be used to quickly identify students who have strayed or fallen off their path to graduation, as well as students whose skill sets do not appear aligned with their desired choice of major.
Big data and privacy concerns
The tremendous promise and power of learning analytics come bundled with a wide array of ethical concerns, including the potential for misuse and violations of student privacy. Data mining techniques can infer things about a student that the student may not know or may not be comfortable with anyone else knowing. Remember when Target correctly inferred that a teenage girl was pregnant, much to the consternation of her father? (2) Perhaps even more importantly, what happens when the predictive models get it wrong?
Some issues of concern:
1. What information is used?
Similar to the ethical questions facing digital health records that can now follow a patient from doctor to doctor throughout their life, should all possible student information be collected for learning analytics? Should a student have the right to opt out of having some or all of their data collected and utilized? Should that freshman F in biology follow them forever? Their admissions process data? What about their high school grades? What about their K-12 record? How do we ensure transparency for the student to understand what is collected and how it is used? Should students give informed consent?
FERPA affords students a means of redress if their educational record includes inaccurate information or data. How will inaccuracies or errors in the data collection or predictive modeling process be addressed? Should students have to grant informed consent to have their data mined for predictions about their future selves? How can we ensure that these predictions will not be used against the student?
2. Whose information is it?
In pursuit of pedagogical innovation, more faculty are bringing Twitter, Facebook, LinkedIn and other social media or online services into the classroom or requiring their use for projects or homework. While institutions have control of the data collected through their own systems, who controls and monitors the digital exhaust generated by students through programs and resources not directly controlled by the institution? Who is monitoring to ensure proper data collection, use and, importantly, destruction?
3. Data are inherently biased.
Data miners make choices about which data to collect and what meaning they assign to those data. “In developing models data miners link their own meanings, values and assumptions to similar ones taken from the problem and the intended intervention.” (3) Data from one context can be utilized in another context, losing meaning and introducing error. In addition, while correlation is not causation, it is sometimes used as a proxy for causation. Noting that students who logged in to their course on the first day had a higher pass rate, a college sent emails to all students encouraging them to log in on the first day. The college failed to realize that the first-day login was an indicator of student motivation in the course and not a success factor in and of itself. (4)
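The first-day login example can be made concrete with a small simulation. The sketch below is not drawn from the college's data; it assumes, purely for illustration, that motivation alone drives both the chance of logging in on the first day and the chance of passing. All probabilities are invented. Under those assumptions, students who log in still show a noticeably higher pass rate overall, yet the gap largely disappears once students are compared within the same motivation group.

```python
import random


def simulate(n=20000, seed=0):
    """Simulate students where motivation drives both first-day login and passing."""
    rng = random.Random(seed)
    students = []
    for _ in range(n):
        motivated = rng.random() < 0.5
        # Assumed probabilities: login depends on motivation only.
        logged_in = rng.random() < (0.8 if motivated else 0.3)
        # Passing depends on motivation only, NOT on logging in.
        passed = rng.random() < (0.85 if motivated else 0.5)
        students.append((motivated, logged_in, passed))
    return students


def pass_rate(students, logged_in, motivated=None):
    """Pass rate for a login group, optionally restricted to one motivation group."""
    group = [
        p for m, l, p in students
        if l == logged_in and (motivated is None or m == motivated)
    ]
    return sum(group) / len(group)


students = simulate()

# Naive comparison: looks as if logging in "causes" passing.
overall_gap = pass_rate(students, True) - pass_rate(students, False)

# Comparison within each motivation group: the apparent effect shrinks.
gap_motivated = pass_rate(students, True, True) - pass_rate(students, False, True)
gap_unmotivated = pass_rate(students, True, False) - pass_rate(students, False, False)

print(f"overall pass-rate gap:        {overall_gap:.3f}")
print(f"gap among motivated students: {gap_motivated:.3f}")
print(f"gap among unmotivated:        {gap_unmotivated:.3f}")
```

In real institutional data, motivation is rarely measured directly, which is part of why a correlation like this one is so easy to misread.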
4. Sorta you is NOT you.
I can’t say it any better than this commercial: https://www.youtube.com/watch?v=hb6uNLjahCM. The data mining and predictive modeling process classifies students as belonging to a particular group of students who share common characteristics. Based on this classification, predictions are made about their future success in a course, a major or their degree. In essence, we are labeling students and making decisions about what courses or majors to recommend and what services they may need to be successful. What if students disagree or don’t identify with their label? What if the label leads to prejudicial action against the students’ interests? Who determines what the student’s interests are? Do students have a right to know their label? Is it a label or a self-fulfilling prophecy?
By definition, our students are in a state of development. They are changing who they are from moment to moment—their motivations, their likes and dislikes, their goals and aspirations. When do our students no longer fit the learning analytics label we placed on them?
To appreciate the potential downside of collecting Big Data-size databases on our students, one need only review the cautionary tale of the short-lived, Gates Foundation-funded project inBloom, Inc. Launched in March 2013, by June the effort was already under attack from privacy experts and parents for obtaining and maintaining records on millions of students without parental consent. Further, sensitive information such as learning disabilities, health records and disciplinary records was combined with academic information in a database that could not guarantee complete security (5). As concerns grew and pressure mounted from parents opposed to their children’s personal records being collected and mined for insights, inBloom shut down in April 2014 (6).
As custodians of our students’ educational records, we must not only stay abreast of these rapid advances in learning analytics and data mining, but also raise questions concerning the accuracy, appropriateness, potential misuse and potential privacy challenges surrounding these powerful new techniques to ensure that our students’ rights and privacy are protected.
Two informative and helpful infographics on Learning Analytics:
1. How it works: http://www.opencolleges.edu.au/informed/learning-analytics-infographic/
2. How it is being used: http://www.onlinedegrees.org/how-big-data-is-changing-the-college-experience/
References:
(1) 1st International Conference on Learning Analytics and Knowledge, Banff, Alberta, February 27th
(2) Ellenberg, J. What’s Even Creepier Than Target Guessing That You’re Pregnant. Slate.com. http://www.slate.com/blogs/how_not_to_be_wrong/2014/06/09/big_data_what_s_even_creepier_than_target_guessing_that_you_re_pregnant.html
(3) Johnson, J. “The Ethics of Big Data in Higher Education”. International Review of Information Ethics Vol. 07 (2014).
(4) Parry, Marc. Colleges Mine Data to Tailor Students’ Experience, 2011. http://chronicle.com/article/A-Moneyball-Approach-to/130062/
(5) Strauss, Valerie. Privacy concerns grow over Gates-funded student database. http://www.washingtonpost.com/blogs/answer-sheet/wp/2013/06/09/privacy-concerns-grow-over-gates-funded-student-database/
(6) Dwoskin, E. and Fleisher, L. Parental Opposition Fells inBloom Education-Software Firm. http://www.wsj.com/articles/SB10001424052702304049904579516111954826916
Recommended for further reading:
Long, P., & Siemens, G. “Penetrating the Fog: Analytics in Learning and Education.” Educause Review, 31-40 (2011, September/October).
Parry, Marc. College Degrees, Designed by the Numbers, 2012. http://chronicle.com/article/College-Degrees-Designed-by/132945/
Slade, Sharon and Prinsloo, Paul. “Learning analytics: ethical issues and dilemmas.” American Behavioral Scientist, 57(10) pp. 1509-1528 (2013).
U.S. Department of Education, Office of Educational Technology, Enhancing Teaching and Learning Through Educational Data Mining and Learning Analytics: An Issue Brief, Washington, D.C., 2012. Retrieved from http://www.ed.gov/edblogs/technology/files/2012/03/edm-la-brief.pdf