Jun 28 / Dr John Brown

How to stop worrying and learn to love AI in education

AI’s invasion of the classroom is already well established: children are using it to compose writing and to guide the content of assignments, while teachers are using it to mark, provide feedback and write model answers to exam questions. There seem to be a couple of big reasons to worry about the effect all this might have on the quality of education. One is that if it is the AI that is coming up with the content, doing the thinking and providing the statements, then the child and the teacher are doing less of the thinking themselves. This makes it less likely that they will internalise, assimilate and digest the ideas. Children might come to know what the right sort of response to a question looks like but not why it is right, and teachers might come to know less about where each individual student is in their understanding.

Another worry is that AI is, by its very design, intended to be plausible and superficially convincing: to provide just enough content knowledge and detail to persuade the reader that the writer has thought about and understood the assignment. This might devalue the dialogue between children and teachers into an exchange of generic educational platitudes. For example, Daisy Christodoulou found that AI marked an essay about Shakespeare's Romeo and Juliet exactly the same as an essay about Charles Dickens' Great Expectations, when the only difference between the two essays was that the words Romeo and Juliet had been replaced with Great Expectations. Clearly the AI knew nothing about what is specific and unique to either Romeo and Juliet or Great Expectations that would let it make an evaluation.

So, should we be very worried? We argue we should if AI is used in a dumb way, straight out of the box and not adapted for education, but we should not if its use is adapted for the educational context.

Large Language Model (LLM) AIs are designed to provide plausibly intelligent-seeming responses to any dialogue. In effect, they are designed to pass the ‘Turing test’: to convince an audience that they are speaking to a human. To do this, a system must be able to make reasonable-sounding attempts at answering any possible query. An LLM is trained on vast amounts of human writing on any and all topics and, in effect, aggregates these to produce common, consensual responses, much as we search our own knowledge in conversations that could take any turn. This means the AI has to keep an open mind about the context of who it is speaking to, the purpose they have in interacting with it, and the format its responses should take. Mostly its priority is to appear human: to give a response, and to not let slip that it does not know what it is talking about, just like us. It seems key to recognise that it is designed for anybody to ask it anything: to talk about dreams, write marketing blurb, explain quantum physics, do a horoscope, write computer code, diagnose a health problem or interpret the law. It would be more effective if it asked a few probing questions when you interact with it, to gauge the context of the dialogue and the purpose of its audience, but if it did that it would seem less human; we do not ask a series of questions before responding to a query.
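As a toy illustration of this ‘common consensual response’ idea, here is the crudest possible language model in Python: it counts which word most often follows each word in a tiny invented corpus, then generates text by always picking the most frequent continuation. This is emphatically not how real LLMs work internally (they are neural networks trained on vastly more text), but it shows how output can be fluent and plausible while blending its sources together with no understanding of them.

```python
# Toy "most common continuation" model, for illustration only.
# The corpus is invented; real LLMs are trained on billions of words.
from collections import Counter, defaultdict

corpus = (
    "romeo loves juliet . juliet loves romeo . "
    "pip loves estella . estella rejects pip . "
    "romeo dies . juliet dies ."
).split()

# Count which word most often follows each word in the corpus.
following = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    following[word][nxt] += 1

def continue_text(word, length=4):
    """Always pick the most frequent next word: fluent and plausible,
    but driven entirely by what the training text happened to say."""
    out = [word]
    for _ in range(length):
        if word not in following:
            break
        word = following[word].most_common(1)[0][0]
        out.append(word)
    return " ".join(out)

print(continue_text("romeo"))          # -> "romeo loves juliet . juliet"
print(continue_text("pip", length=2))  # -> "pip loves juliet"
```

Notice the second output: the model confidently blends the two books into ‘pip loves juliet’, for much the same reason the essay marker above could not tell Romeo and Juliet from Great Expectations.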

AI might be a bit like this: imagine you set up a Punch and Judy tent on a high street and got into it, so that you cannot see out and no one can see you, with a sign that said, ‘I know everything, ask me anything’. What sort of interactions might you get? Probably some verbal abuse, requests for bus times, jokes, religion, politics, requests for directions, people trying to sell you things. You could probably come up with a response to each interaction, but how helpful or informed would it be? Probably you would come up with some generic response that got you through each interaction and seemed to satisfy your interlocutor. You would invent most of the information based on a best guess, you would want to sound confident, and you would couch your responses in reasonable and plausible-sounding phrases, so that your audience felt they had got what they wanted and went on their way. Arguably, this is similar to what LLMs do, and it is clearly inappropriate for education. Although this is a silly analogy, it holds a grain of truth about how LLMs prioritise plausibility, tend to be generic, and tend not to ask, or to know, whether they are being asked an educational question. It seems likely that to get anything useful for education out of an LLM, you need to tell it the educational context.

This raises the issue of prompts: the instructions you give an AI to describe what you want from it. Much of the debate about AI in education seems to have rapidly evolved into a discussion about prompts. What do you ask?

Several questions arise. Does it make a difference to kids' grades what they ask ChatGPT? If so, which parts of a prompt are the most important in improving answers?
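To make this concrete, here is a minimal sketch of what a course-specific prompt can look like in code, using the OpenAI Python SDK. The question, course and assessment criteria below are invented placeholders, not a real specification or marking scheme; a real study would paste in the actual ones.

```python
# Comparing a bare, out-of-the-box prompt with a context-rich one.
# Requires the OpenAI Python SDK (pip install openai) and an
# OPENAI_API_KEY in the environment. All course details are invented.
from openai import OpenAI

client = OpenAI()

question = "How does Shakespeare present conflict in Romeo and Juliet?"

# 1. Bare prompt: the model must guess the context and audience.
bare = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": question}],
)

# 2. Context-rich prompt: course, audience and criteria are spelled out.
context = (
    "You are helping a 16-year-old student revise for a GCSE English "
    "Literature exam. Write a model answer to the question below that "
    "would score highly against these criteria: a personal critical "
    "response supported by quotations, and analysis of Shakespeare's "
    "language, form and structure. Refer to specific scenes and lines, "
    "not generic statements that could fit any play."
)
rich = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": context},
        {"role": "user", "content": question},
    ],
)

print(bare.choices[0].message.content)
print(rich.choices[0].message.content)
```

Whether such contextualised prompts actually improve students' answers, and which parts of the context matter most, are empirical questions, which the studies described below set out to test.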

What Worked Education is running a series of studies into the effect on students' writing, specifically their answers to A-level long-answer questions, of students asking ChatGPT prompts specific to the course and marking scheme. In each case this is compared with the writing of students who get no help from AI.

If this or any other issue around AI use in education is of interest to you, please get in touch.
