Alchemy, a pseudo-science preceding chemistry, attracted the attention of numerous people for a remarkably long time before it was finally displaced by the rise of the natural sciences. Its method was trial and error, and it aimed at transmuting base metals into gold and obtaining a universal elixir. Although alchemists achieved tangible results, such as advances in metallurgy and glassmaking, their core beliefs were eventually dismantled because practitioners lacked a true scientific understanding of what they were doing. It may seem surprising, but the way modern machine learning and artificial intelligence are often practised today bears an uncanny resemblance to alchemy.
Machine learning was first publicly likened to alchemy in 2017 by Ali Rahimi, a researcher in artificial intelligence at Google. His statement was not well received, with pioneering researcher Yann LeCun criticising him. Since then, AI models have advanced rapidly, but his remark remains relevant. The comparison between machine learning and alchemy rests on the fact that although modern algorithms can outperform humans at games and identify images with great accuracy, they work in practice without anyone quite knowing why. The thousands of opaque and complex mathematical operations inside them create a "black box" problem: no one can precisely explain why a model made a specific decision. This problem has already been noted in real-world implementations of these systems. In McKinsey's 2024 State of AI survey, 40% of respondents identified explainability as a key risk in adopting generative AI. Another survey, conducted in 2021 by the Fair Isaac Corporation, found that about 70% of respondents were unable to explain how specific AI model decisions or predictions are made.
This black box problem with AI models is a natural consequence of the approach used to build them. Consider a scenario where an AI practitioner is tasked with creating a model to predict cancer probability using historical patient data (medical conditions, age, history, etc.). The practitioner trains a sophisticated algorithm on this data to achieve maximum predictive accuracy. When the finished model analyses a new patient, it will reliably output a prediction. However, a crucial question remains unanswered: what specific factors did the model use to make that decision? A curious mind wants the model to reveal a causal mechanism, a scientifically grounded reason for the prediction. But the answer isn't in the model's architecture. The user will ultimately be disappointed because they are demanding something from the machine that it was never designed to do.
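For readers who want to see the point concretely, here is a minimal sketch in Python using the scikit-learn library and entirely synthetic, made-up "patient" data; it is an illustration of the black box, not a clinical model. The trained model readily returns a probability for a new patient, but peering inside it turns up only thousands of numerical weights, none of which amounts to a reason.

```python
# Illustrative sketch only: synthetic, invented "patient" data, not a clinical model.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))   # stand-ins for age, markers, history, etc.
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

# A flexible black-box model, tuned purely for predictive accuracy.
model = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=1000).fit(X, y)

new_patient = rng.normal(size=(1, 5))
print("predicted probability:", model.predict_proba(new_patient)[0, 1])

# Asking "why?" yields only thousands of learned weights, not a causal reason.
n_params = sum(w.size for w in model.coefs_) + sum(b.size for b in model.intercepts_)
print("learned parameters:", n_params)
```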
Now, what exactly was the practitioner doing when training the algorithm? Any data we use to build an AI model comes with a clear assumption: some causal mechanism running behind the scenes connects physical conditions to cancer, and the data at hand is a by-product of that mechanism. So there must be a mathematical rule through which the causality flows. Let us call this rule the ‘data generating process’. Given the limits of human knowledge, this process is always unknown, and we can only make educated guesses about it. However, ML algorithms are not trained to estimate this process. Their goal is to achieve maximum predictive performance on the sample data. In other words, they learn ‘statistical regularities’ rather than ‘causal mechanisms’. The difference between the two can be understood as follows. Imagine you need to identify a mountain. The proper approach is to study its shape, structure, and geological context. The machine learning approach, however, takes a shortcut: it learns to recognise the shadow (the statistical regularity). If the shadow cast on the ground is consistently large and triangular, the algorithm concludes it is a mountain, without ever examining the mountain itself. The shadow is merely a by-product of the mountain's actual form, a natural outcome. The algorithm simply tunes itself, through complex mathematical abstraction, to predict this shadow reliably and consistently. Crucially, the mathematics, while not random, is not causally interpretable; it does not convey any empirically grounded reasoning mechanism the way standard scientific equations do. This fundamental lack of transparent reasoning is exactly the problem that threatens to undermine the effective role AI can play in our lives.
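The shadow analogy can likewise be made concrete with a small, hypothetical simulation; the "cause", the "shadow" proxy and the shift in conditions below are all invented for illustration. A model trained on the shadow predicts almost perfectly while the shadow faithfully tracks the cause, and collapses the moment that link changes, even though the causal rule itself never moved.

```python
# Illustrative sketch only: a made-up "cause" (the mountain) and a correlated
# proxy (the shadow). The outcome depends only on the cause, but the model
# is trained on the shadow, the regularity that happens to predict well.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def make_data(n, shadow_scale):
    cause = rng.normal(size=n)                                     # the mountain
    shadow = shadow_scale * cause + rng.normal(scale=0.1, size=n)  # its by-product
    label = (cause > 0).astype(int)                                # driven only by the cause
    return shadow.reshape(-1, 1), label

X_train, y_train = make_data(2000, shadow_scale=1.0)
model = LogisticRegression().fit(X_train, y_train)     # learns the shadow's regularity

X_same, y_same = make_data(2000, shadow_scale=1.0)     # conditions match training
X_shift, y_shift = make_data(2000, shadow_scale=-1.0)  # the "sun moves": shadow flips

print("accuracy while the shadow tracks the cause:", model.score(X_same, y_same))
print("accuracy after the link between them shifts:", model.score(X_shift, y_shift))
```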
It is clear that ML practitioners aim to trace statistical regularities, not to understand the underlying data generating processes. This limitation hinders the future of AI. Remaining stuck at statistical prediction means models are only truly reliable for simple, repetitive tasks in environments similar to their training data. AI will struggle when tasked with creative work, strategy, decision-making, and knowledge building. The lack of explainability is particularly severe in fields like healthcare and law. In medicine, for example, the core mechanisms driving a disease can change. If we cannot interpret why a model made a specific diagnosis, we cannot justify its output, nor can we determine the appropriate treatment when different causes of the same detected disease require different therapies.
Some may dismiss these concerns by arguing that AI is not meant for complex, high-stakes tasks and that human supervision will always remain essential, but such reasoning raises a deeper question. If that were true, the growing investment in AI would imply that we are merely building an “AI clerk”, useful only for simple, repetitive tasks. The real ambition, however, should be to make AI genuinely useful, and that will depend on making it more explainable so it can contribute meaningfully to areas where data-driven decision-making is crucial, such as public policy. In these domains, if AI models become more transparent and better at capturing true causal mechanisms, their outputs will be not only more reliable but also far more valuable because of their objective rigour. In that case, the vast sums of money flowing into AI development would be well justified. To make AI a productive and lasting part of our lives, it must therefore become more explainable, a goal that demands long-term commitment and sustained effort.
(Amit Kapoor is Chair and Mohammad Saad is a researcher at the Institute for Competitiveness)
(Disclaimer: The opinions expressed in this column are those of the writer. The facts and opinions expressed here do not reflect the views of www.economictimes.com)