Everyone Wants AI Skills. We Need Digital Literacy First.
Last month I delivered an AI upskilling course for teachers and higher education professionals. I expected the usual challenges: fear of replacement, confusion around terminology, curiosity mixed with skepticism. What I did not expect was a much more fundamental gap - one that had little to do with AI itself.
Somewhere between the promise of “learn AI in a weekend” and the institutional pressure to “upskill or be left behind,” we skipped a step: basic digital literacy.
Digital Illiteracy
I usually teach in the creative industry. While designers, developers, and media professionals may not understand what happens inside a neural network, they do understand software. They know what a browser is. They know where files go when downloaded. That baseline is what makes AI education possible.
But suddenly I found myself standing in front of a room where this baseline was absent. I gradually realized that I had assumed a level of digital literacy that simply wasn’t there. While believing I was empowering people by teaching AI tools, I might actually have been accelerating confusion.
The Spectacle of AI Fluency
One of the easiest ways to impress a room full of participants is to run a live demo.
You open Google AI Studio and paste a carefully prepared prompt into the playground. In seconds, Gemini generates hundreds of lines of code. We watch it unfold, with no coder in the room to evaluate whether the code even makes sense. But the generated game works, and we see it in action, sound effects and everything. Everyone is impressed and nods at the tremendous future this promises for education. And yet, at the end of the show, very few could reproduce the steps on their own.
A live demo often prioritizes spectacle over explanation. For practical reasons, this makes sense: attention is fragile, time is limited, and a smooth, impressive sequence builds trust faster than a slow unpacking of every click and dependency. The problem is that the spectacle hides the scaffolding.
What looks like a single, almost magical step is usually a chain of decisions, iterations, and prepared environments. We do not see the hours spent iterating the prompt beforehand. We do not see the prepared folder structures, the pre-selected input materials, the preset configurations inside the tool. We do not see the browser extensions, the operating system quirks, the version differences, the quiet troubleshooting that happened earlier that morning.
And because we do not see these layers, we underestimate them.
The questions do not surface immediately. They emerge later, after the novelty wears off. Where exactly was this done - in Playground or in the “Build with Gemini” tab? What did the prompt actually contain? What do I do with the generated code now? Where does this game live? How do I integrate it into my own teaching materials?
We see this everywhere. YouTube tutorials are a prime example: they are already heavily edited. We don’t see the waiting time while something generates, and we are not aware of how much work went into preparing the input materials beforehand. There may already be presets the experienced lecturer uses in their tools to fine-tune things to their needs - something the viewer never sees. So it is no surprise that the same process does not produce results of equal quality when you try to replicate it yourself. Connections between the demonstrated tools and other factors, such as the operating system, computer settings, custom plugins, and similar dependencies, are rarely explained.
None of this is necessarily deceptive. But collectively, it builds a culture in which AI appears frictionless, and where difficulty feels like personal incompetence rather than structural complexity. That is how spectacle replaces literacy.
The ChatGPT Trap
When I went to Web Summit 2025 in Lisbon, I brought a series of stickers instead of business cards - ironic one-liners about the subtle cultural shifts taking place because of our daily interactions with ChatGPT. “ChatGPT made me do it.” “AI says I’m doing ok.” They were small jokes about new rituals of cognition, about the way conversational AI quietly inserts itself into decision-making, reassurance, even self-evaluation.
People laughed. And then, more often than I expected, they admitted that the joke was not entirely a joke.
ChatGPT will happily support you in building an action plan for becoming the best version of yourself over the next five days. It will outline your strategy in five clean bullet points. It will tell you that you are doing great. It will remind you to take a breath before unpacking a complex issue you approached with subtle anxiety in your voice. It is endlessly patient. It is responsive. It is available at midnight. It is, in many ways, an illusion-rendering machine in your pocket. And this matters when participants arrive at a course.
They often come in feeling AI-fluent, well-equipped, already familiar with “how it works,” because they have been pep-talked nightly by a sycophantic cheerleader that simplifies long learning curves into tidy summaries and confident action plans. The friction has already been absorbed for them. The uncertainty has already been reframed as clarity. The complexity has already been translated into a feel-good movie adaptation of a dense research narrative, tailored precisely to their own filter bubble.
And because the interaction happens inside a single conversational interface, it reinforces the idea that AI is one coherent system - one intelligence that simply shifts modes on request - whether it delivers text, an image, a strategy, or reassurance.
But what remains invisible is the architecture behind that surface. A language model may be interpreting the request, restructuring it, translating it into a different format before passing it to another system. When an image is generated inside ChatGPT or Gemini, the user does not see the extraction of a structured prompt or the distinction between a large language model and a diffusion model. The conversational layer performs the translation.
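To make that hand-off concrete, here is a minimal sketch of what such a pipeline can look like. Every class and function name below is a hypothetical illustration, not any vendor’s actual API:

```python
# A minimal sketch of the hidden hand-off between two different models.
# All names here are hypothetical stand-ins, not a real vendor API.

class LanguageModel:
    """Stands in for the conversational LLM."""
    def generate(self, instruction: str) -> str:
        # A real LLM call would go here; we fake the rewriting step.
        return "watercolor, golden hour, a lighthouse on a rocky coast"

class DiffusionModel:
    """Stands in for the separate image-generation model."""
    def generate_image(self, prompt: str) -> bytes:
        # A real diffusion call would return image bytes.
        return f"<image rendered from: {prompt}>".encode()

def chat_image_request(user_message: str) -> bytes:
    llm, image_model = LanguageModel(), DiffusionModel()
    # Step 1: the LLM translates the casual request into a structured prompt.
    structured_prompt = llm.generate(
        f"Rewrite as an image prompt (subject, style, lighting): {user_message}"
    )
    # Step 2: the prompt is passed to an entirely different system.
    # The chat window shows neither the rewrite nor the hand-off.
    return image_model.generate_image(structured_prompt)

print(chat_image_request("paint me a cozy lighthouse at sunset"))
```

Two different systems, two different mental models - and the user sees only one chat bubble.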
So when you later introduce a dedicated tool - one that requires a different kind of prompting, a different mental model, a different understanding of what is happening - resistance appears. From their perspective, they have already generated an image inside ChatGPT, so why step outside of it? Why bother learning a new interface, adjusting parameters, or thinking differently, if their chatbot already delivered something that looked good enough?
What remains invisible is what they are trading away - control, nuance, and an understanding of how the output is actually produced. And in doing so, the chatbot interface produces the same cultural effect as the staged live demo: a surface of apparent fluency that precedes structural understanding.
Until they have to leave the chat window.
And that is where the real gap begins to show.
Browsers, Tabs, and the Missing Substrate
If you want to use AI meaningfully, you cannot orchestrate everything from your phone. Those reels showing someone casually commanding an AI agent via WhatsApp while sitting on the toilet never include the part where that agent was set up on a desktop computer, connected to local files, configured with markdown instructions, and carefully tested beforehand.
We ask participants to bring their laptops and we assume that this solves the problem. It does not.
In professional settings, I have met people who use computers every day for their work and yet cannot reliably locate downloaded files, distinguish between a browser and a desktop application, or recognize when they have landed on a copycat website rather than the official service.
If someone types a web address into Google instead of the address bar and clicks the first sponsored link, they may end up in a completely different environment that looks almost identical to the one being demonstrated. I have seen this happen in real time: a participant raising their hand because they “don’t have the same input field,” only for us to discover they are on a third-party website using the same API but presenting a different interface with different functions (and scammy pricing).
From their perspective, nothing seemed wrong. The colors were similar. The branding looked familiar. The word “Gemini” was present. The difference between a search result and an official domain was not so obvious.
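The reflex they were missing fits in a few lines of code. The lookalike URL below is invented for illustration; only the official AI Studio domain is real:

```python
# The check the participant never made: compare the actual domain
# in the address bar against the official one. Branding proves nothing.
from urllib.parse import urlparse

OFFICIAL = "aistudio.google.com"  # the real Google AI Studio domain

def looks_official(url: str) -> bool:
    host = urlparse(url).hostname or ""
    return host == OFFICIAL or host.endswith("." + OFFICIAL)

print(looks_official("https://aistudio.google.com/prompts"))  # True
print(looks_official("https://gemini-ai-studio.app/try"))     # False (invented lookalike)
```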
The same confusion appears with something as basic as browser tabs. When a lecturer switches between tabs during a workflow, some participants feel the need to close one before opening another. Interacting between two browser-based tools simultaneously becomes cognitively overwhelming. The mental model of “this is just a window showing a website” has never fully formed.
And then there are files.
Downloading a file - where does it go? What format is it? Which application opens it? How do you move it from your laptop to your phone and back again? These are not advanced technical questions; they are foundational. Yet without them, thinking in workflows becomes almost impossible, because workflows depend on understanding where inputs originate and where outputs land.
When I suggest preparing a markdown file with a meta-prompt, the difficulty is not conceptual but procedural. Where do I create this file? What is markdown? Why not just paste it into the chat window? The idea of an external, structured instruction file presupposes familiarity with file systems and text formats that cannot be taken for granted.
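For illustration, such a file can be nothing more than a few structured headings in plain text. The content below is an invented example, not a template from the course:

```markdown
<!-- meta-prompt.md: a reusable instruction file, kept outside the chat -->
# Role
You are a patient teaching assistant for a secondary-school history class.

# Constraints
- Answer in plain language, at most 150 words.
- Always name which source document you used.

# Task
Summarize the attached material as three discussion questions.
```

The point of keeping it as a file rather than a pasted message is reuse: the same instructions can travel between tools, sessions, and colleagues - which is exactly the workflow thinking that presupposes knowing where files live.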
And this is where the illusion created by chat interfaces and polished demos collides with reality. Outside the frictionless surface, there are folders, domains, extensions, file types, settings, permissions - an entire ecology of decisions that must be navigated consciously.
Partial Fluency Masquerading as AI Competence
We are asking people to orchestrate automated agents with access to their hard drives while they are still unsure where their downloads folder lives. This infrastructural gap has consequences.
When someone cannot reliably distinguish between an official service and a sponsored copycat site, the risk is no longer theoretical. In the context of generative AI, there are already countless wrappers - interfaces that sit on top of an existing API, charge more than the original service, and quietly harvest data along the way. To an untrained eye they look legitimate, even while offering premium features at suspicious prices. The difference between the original SaaS provider and an opportunistic intermediary is often invisible. The same pattern extends to automation.
There is a growing enthusiasm around connecting AI agents to local files, granting access to drives, syncing documents, automating decisions. These workflows can be powerful when configured consciously. But they require an understanding of permissions, scopes, authentication, and data flow. Without that awareness, the excitement around “let AI handle it” becomes precarious.
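As a concrete anchor: granting an agent “access to files” can be as simple, and as consequential, as the boundary check below. This is a minimal sketch with a hypothetical folder, not any real agent framework:

```python
# A minimal sketch of what a "scoped" file permission means.
# The folder and function are hypothetical illustrations.
from pathlib import Path

ALLOWED_SCOPE = Path.home() / "Documents" / "course-materials"

def read_for_agent(requested: str) -> str:
    path = Path(requested).resolve()
    # The permission question, made explicit: is this file inside
    # the one folder the agent was consciously granted access to?
    if not path.is_relative_to(ALLOWED_SCOPE.resolve()):
        raise PermissionError(f"{path} is outside the agent's scope")
    return path.read_text()
```

Whether a user could read, question, or even imagine a boundary like this is precisely the literacy gap at stake.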
What worries me is not that people are incapable of learning these things. Most can. The problem is that we are compressing the learning curve while simultaneously raising the stakes - encouraging people to automate decisions and connect new tools before they can reliably recognize when something in that ecosystem is unsafe, misleading, or exploitative.
The institutional narrative is clear: adopt AI or risk obsolescence. Upskill quickly. Integrate workflows. Increase productivity. The pressure is real. But when AI literacy is layered on top of fragile digital foundations, the outcome is rarely empowerment. What you get instead is confusion disguised as newly gained IT competence. This is dangerous because, among other things, it makes people even more vulnerable to scams, data leaks, and poorly configured systems. And when things go wrong, the frustration is not directed at the missing literacy layer - it is directed at “AI” itself.
What AI Literacy Should Actually Mean
If we take this gap seriously, the solution is not to abandon AI education, nor to shame participants for what they do not know. The solution is slower and far less glamorous.
Before we teach people how to “use AI,” we need to reintroduce them to the anatomy of the systems they are operating within.
This does not mean turning everyone into programmers. It means restoring familiarity with the basic layers that make digital environments function: what a browser is and what it is not; how domains work; how files are stored, moved, and opened; what permissions imply; how different applications interact; how to handle inputs and outputs.
Without these basics, workflows remain an unreachable skill. They can be conceptually understood, but not owned.
Institutions that are serious about AI upskilling should consider treating digital literacy as a prerequisite rather than an assumption. A baseline course in navigating environments, understanding file systems, recognizing official domains, and distinguishing between application layers may sound less sexy than AI, but it is structurally necessary.
At the same time, educators - myself included - need to resist the temptation to prioritize impressive tool tricks over system anatomy. Demonstrating what a model can produce is undeniably seductive, while slowing down to unpack what is happening behind the interface often feels less exciting, even slightly tedious. There is rarely enough time to revisit foundational digital skills, and in most groups the baseline is uneven: some participants move quickly, others are still orienting themselves inside the environment.
AI literacy should not mean knowing which prompt to paste. It should mean understanding where that prompt operates, what system interprets it, what assumptions it encodes, and what limitations shape the output. Only then does experimentation become meaningful rather than precarious.