So what inherent capabilities does it really possess, if any, and where do we meet the limits of those capabilities?
The Promises and Pitfalls of Machine Intelligence
In the early days of AI’s emergence, or at least before ChatGPT was released into the world in November 2022, a common approach to these questions was to prescribe a straightforward division of labour that echoed how humans have used machines for hundreds of years: procedural, standardized, repetitive tasks (of varying levels of intricacy) could be fully delegated to “machine intelligence” (just as the artisan craft of textile weaving was delegated to “narrow frame” machines in the Industrial Revolution), leaving humans to spend their time and energy elsewhere, or – as has often been the case instead – rendering their skills redundant and their livelihoods obsolete. However, in their 2024 paper on ‘Responsible Use of LLMs’, researchers Lissack and Meagher explain that the relationship between AI tools and their human users today is, in practice, much less clear-cut. They observe that LLMs are now frequently deployed as valued collaborators in highly complex tasks, especially as a source of novel propositions and alternative perspectives on a subject, on which humans can then elaborate. Another study, published in 2025 and led by researchers at Harvard Business School, found that using LLMs not only enables the synthesis of conflicting agendas across silos within organisations, but can also foster social connectivity and emotional wellbeing among teammates. Traditional conceptual limits on the capabilities of “machine intelligence”, then, appear increasingly untenable in the era of LLMs. Of course, this won’t come as news to anyone following the field. While academic critique has struggled to stay abreast of AI’s rapidly proliferating use cases, many of us are already daily consumers of AI-generated newspaper articles, explainer videos, and background music, sometimes without even realizing it. Indeed, AI tools are now consistently able to “outperform” humans in many tasks, in fields from business innovation to medical diagnosis (although performance benchmarks can be somewhat arbitrary, and benchmarking methods controversially imperfect).
This rupture with centuries of tradition in human-machine relations has both delighted and disappointed, giving credence to Silicon Valley techno-utopias of automated abundance, while also seeming to heighten the likelihood that even the most treasured facets of human intelligence, like caring for one another and expressing ourselves creatively, will be superseded and eventually die out. But these dramatic (and highly political) future visions, while not altogether unconvincing, rely on a no-limits appraisal of machine intelligence’s potential capabilities, wherein AI is understood to be conceptually capable of matching and even surpassing the various attributes of general human intelligence: the accelerationist, or maximalist, stance.
Is such an appraisal justified? While claiming to remain neutral on the question, neuroscientist Christopher Summerfield has observed the many similarities between machine and human intelligences, noting for example that human brains are also increasingly thought to work on the basis of “predictive processing”, and, just like LLMs, are also prone to error and “hallucination”. Indeed, the rise of LLMs is posing interesting and important challenges to prevailing understandings of the workings of the human brain and the nature of human consciousness, much like the rise of cryptocurrencies challenged prevailing understandings of money and the financial system. Phenomena like “emergence”, whereby LLMs appear able to propose ideas or solve problems that did not feature in their training data, mystifying even their creators, also seem to add weight to this stance. A fun but illuminating example of this phenomenon, recently offered by AI developer Dan Kwiatkowski, goes as follows: ask a human to explain the similarities between volcanoes and cheese and they will typically be unable to name any; ask the same question of a high-quality LLM and, despite the absence of any discussion of this topic in its training data, it is likely to offer surprisingly compelling and factually accurate observations in response (try it yourself!). And yet, in a paper published in 2024, Sheng Lu, Irina Bigoulaeva, and colleagues present substantial evidence that “purported emergent abilities are not truly emergent, but result from a combination of in-context learning, model memory, and linguistic knowledge”. They argue that these three characteristics are in fact the only inherent capabilities of LLMs, and that any attendant ability to successfully follow instructions should not be misconstrued as a capability to reason independently, strongly cautioning against any over-estimation of the scope of “machine intelligence”. Indeed, a growing chorus of critics resists the notion that LLMs are intelligent agents at all, understanding them instead as strictly deterministic social technologies that merely reconfigure pre-existing human knowledge, and arguing that any claims to the contrary risk derailing important debates about AI’s real impact on society.
What is it Good For?
No matter how valid these critiques might be, LLMs are used every week by hundreds of millions of people. So even if claims to their intelligence are exaggerated, what does their core set of capabilities really enable a human user to learn or accomplish? In this section, we’ll try to understand AI’s capabilities in practice: the kinds of tasks for which it’s most effective, and what exactly it offers its users that they didn’t have before. But AI researcher Fabrizio Dell’Acqua and colleagues have demonstrated the practical challenges of tackling these questions:
“AI capabilities cover an expanding, but uneven, set of knowledge work we call a ‘jagged technological frontier.’ Within this growing frontier, AI can complement or even displace human work; outside of the frontier, AI output is inaccurate, less useful, and degrades human performance. However, because the capabilities of AI are rapidly evolving and poorly understood, it can be hard for professionals to grasp exactly what the boundary of this frontier might be.”
Despite these challenges, Ethan Mollick, a leading voice on the uses of AI, close collaborator of Dell’Acqua, and Professor of Management at the University of Pennsylvania, has elaborated on which kinds of tasks he thinks sit within this “jagged frontier”.
Ideation
He finds, for example, that AI is especially effective at supporting humans in conceptual exploration and creative processes, pointing out how LLMs are able to continually generate contrasting answers to open-ended questions, or alternative endings to unfinished narratives. But Mollick suggests that it is really the quantity and variety of the AI’s outputs, and the way a human user elaborates on them with their own ideas, that make it useful for creativity, rather than the accuracy or quality of any individual output.
Synthesis
Mollick, and many others besides, also finds that AI is effective at retrieving and summarising large and diverse volumes of complex information, sparing a human user the time-consuming process of digesting and redacting it themselves. But this isn’t just a time-saver: it can also change the character of the knowledge a human user has access to in that time frame. Even the Pope himself (yep) has argued that AI’s synthesising capability means that it can integrate deep knowledge from a wide variety of disciplines and sources, each with its distinct vocabularies, concepts, and methodologies, enabling a user to rapidly produce genuinely multiperspectival accounts of real-world problems. More than saving just a few hours of reading, then, accessing this breadth of insight might otherwise require a user to spend years of their life building up interdisciplinary experience and expertise. Critics will of course argue, though, that spending years of your life learning about a problem through many different approaches, and acquiring all of the embodied, social, and ethical knowledge that comes with this, produces a fundamentally deeper kind of insight than anything an AI could offer. So while an LLM can radically augment our ability to summarise and integrate information sources, it shouldn’t be allowed to usurp the role of human learning in solving human problems.
Feedback
Closely linked to this capability for synthesising diverse perspectives, Mollick argues, is AI’s effectiveness at impersonating specific ones. This makes it an effective source of feedback, for example: capable of simulating both sympathetic and critical viewpoints, it can offer insightful critiques of human work, surfacing potential preconceptions and blind spots without the need for external review. Others have argued that this capability to inhabit specific personas can not only quickly complement research with countervailing approaches where they might otherwise be lacking, but also sharpen an LLM’s usefulness in general: users can, for instance, prefix their requests for feedback with an instruction for the LLM to impersonate a particular character in its response, like a senior technical specialist with an eye for detail, or an inspiring leader always looking for new opportunities.
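To make this persona-prefixing pattern concrete, here is a minimal sketch in Python, written against the OpenAI chat completions client. The persona descriptions, the get_feedback helper, and the model name are illustrative assumptions rather than a recipe drawn from Mollick or the studies cited above.

```python
# Minimal sketch of persona-prefixed feedback prompts (illustrative only).
# Assumes the official `openai` package and an OPENAI_API_KEY in the environment;
# the personas, helper name, and model name are placeholder choices.
from openai import OpenAI

client = OpenAI()

PERSONAS = {
    "technical_specialist": (
        "You are a senior technical specialist with an eye for detail. "
        "Critique the draft for accuracy, rigour, and unstated assumptions."
    ),
    "inspiring_leader": (
        "You are an inspiring leader always looking for new opportunities. "
        "Highlight the draft's potential and suggest ways to build on it."
    ),
}

def get_feedback(draft: str, persona: str, model: str = "gpt-4o-mini") -> str:
    """Ask the model for feedback on `draft`, answering as the chosen persona."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": PERSONAS[persona]},
            {"role": "user", "content": f"Please give feedback on this draft:\n\n{draft}"},
        ],
    )
    return response.choices[0].message.content

# Example: collect contrasting critiques of the same draft.
draft_text = "Our plan is to migrate every service to the new platform in a single release."
for name in PERSONAS:
    print(f"--- {name} ---")
    print(get_feedback(draft_text, name))
```

The same pattern works with any chat-style model; the substantive choice is simply which contrasting personas are worth consulting for a given piece of work.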
Reconfiguration
Furthermore, Mollick finds AI to be effective at translating content between different audiences, styles, tones, complexity levels, lengths, formats, and structures, enabling human users to instantly reconfigure existing content to suit various contexts and user needs. This capability would appear to be mostly a time-saving benefit, rather than a qualitative augmentation of a pre-existing human skill.
Programming
Finally, Mollick observes that AI is highly effective at generating and debugging code snippets, so much so that LLMs enable users to develop software without writing a single line of code themselves.
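To illustrate what that workflow can look like in practice, here is a minimal, hedged sketch of asking a model to explain and repair a failing function, again using the OpenAI Python client. The buggy snippet, the error message, and the model name are invented for illustration, and whatever fix comes back still needs to be read and tested by a human before it is trusted.

```python
# Illustrative sketch of an LLM-assisted debugging request.
# Assumes the official `openai` package and an OPENAI_API_KEY in the environment;
# the snippet, error message, and model name are placeholders.
from openai import OpenAI

client = OpenAI()

buggy_snippet = '''
def average(values):
    return sum(values) / len(values)  # fails when `values` is empty
'''
error_message = "ZeroDivisionError: division by zero"

prompt = (
    "This Python function raises the error shown below for some inputs.\n"
    "Explain the bug and return a corrected version.\n\n"
    f"Code:\n{buggy_snippet}\nError:\n{error_message}"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```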
Breaching the Jagged Frontier
But each of these capabilities, however useful, has its limits, and recent literature has raised important concerns about users straying beyond them. This next section outlines some of these limitations and the potential consequences of over-relying on these systems across various contexts.
One structural limitation of LLMs concerns the relationship between a model’s accuracy and the availability of training data. As noted in sources like the Vatican’s Guidelines on the Responsible Use of Large Language Models, LLMs tend to perform well on topics that are widely covered and well-represented in their training data, but their responses become less reliable when they are prompted on niche or less-documented subjects. LLMs are also trained on a time-bound corpus of content, an inherent limitation that means every model currently has a knowledge cut-off date.
Models like ChatGPT also tend to be trained predominantly on Western (and primarily English-language) textual sources, and therefore lean heavily on a Western, Educated, Industrialized, Rich, and Democratic (WEIRD) cultural frame of reference. Even though many LLMs are capable of processing non-English queries, in numerous instances they internally translate and process such inputs in English “under the hood” before generating responses in the query’s original language, a process that can introduce a pronounced WEIRD bias into their reasoning and responses.
A further source of distortion is the political bias embedded in AI dialogue systems, an issue with profound implications for public opinion and societal cohesion. AI systems, particularly LLMs used in policy generation, political commentary, journalism, or even simple queries, exert a subtle yet significant influence over public discourse. As highlighted in recent literature on the responsible use of AI, including The Effects of Over-Reliance on AI Dialogue Systems, Responsible Use of Large Language Models, and Managing Extreme AI Risks amid Rapid Progress, such models inherit the ideological leanings of their data sources and may output content that disproportionately favours one side of the political spectrum or worldview. LLMs can also inadvertently reinforce the societal prejudices contained in their training data, including patriarchal biases: some models respond more completely or accurately to certain questions when asked to answer as a man rather than as a woman.
These biases distort the representation of complex political realities and risk entrenching users in ideological echo chambers. Amplified by the algorithmic optimization of engagement, such echo chambers reduce exposure to diverse perspectives and hinder meaningful dialogue, an essential component of democratic health. The problem is made worse by the fact that AI systems lack the ethical reasoning capacities of human experts, a limitation that can lead to ethically questionable outcomes in high-stakes fields such as law, medicine, and policy, particularly when AI outputs are accepted uncritically. Moreover, the reward signals used to train these systems do not always align with human values or democratic principles, so AI may optimize for unintended outcomes, reinforcing polarization rather than fostering informed debate and pluralism.
A further difficulty is the uniform tone of confidence with which LLMs present all outputs, regardless of their accuracy. This makes it hard for users, especially non-experts, to discern whether the information is trustworthy without external verification or prior knowledge. Without rigorous oversight and clear standards for data sourcing, AI systems risk reinforcing negative societal patterns, undermining trust, and compromising the integrity of technologies that are increasingly integral to public and private decision-making.
Another challenge is the so-called “toxicity of the commons”, which emerges when curating open-source pre-training data, and is compounded by the fact that many closed models rely on proprietary data harvested without the creators’ consent. This opaque process not only raises ethical concerns about data ownership and transparency but also increases the risk of perpetuating harmful biases and toxic content (as seen above). Moreover, when AI generates content influenced by copyrighted material, questions of ownership and fair use become increasingly complex.
Lastly, LLMs are critically unable to replicate human-like understanding and experience. While they can process and generate language at an impressive scale, their intelligence remains purely computational. As highlighted in documents such as the Vatican’s Guidelines on the Responsible Use of Large Language Models, this distinction is critical: human intelligence is not just a matter of data processing; it is deeply rooted in lived experience, emotional depth, relationality, and social context. This, combined with the limits of data availability, the toxicity of the commons, and the authoritative tone adopted regardless of accuracy, makes LLMs fertile ground for misinformation: misinformation now poised to be unquestioningly adopted by millions, to deleterious effect on our already-strained political cultures.
In the next section, we’ll detail what economists would call the “externalities” of AI, including the many material and immaterial impacts it has on society and the environment. AI is not just prone to error and “hallucination”, and limited to whatever sits inside its hard-to-appraise jagged frontier; it also causes impacts far beyond the immediate contexts of its use.