AI will change the world – but not through the AI products of today

The rising skepticism about the value generated by AI is justified. The AI products on the market today are only the first step toward meaningful transformation. We must push forward to cover the rest of the critical path while momentum lasts.

Existing AI products are designed to find, recognize and synthesize information

There is a distinct pattern that emerges when we look at the features and use cases of major AI products released over the past few years:

  • Microsoft Copilot can search the web, summarize documents/emails/meetings.
  • Google Gemini can search the web for you and summarize what it finds.
  • OpenAI just launched SearchGPT, which says it all in the name.
  • Anthropic Claude, xAI Grok, Bing AI, Meta LLaMA and others are more of the same.

The capabilities these products offer are focused on finding and recognizing insights in large collections of information. The products compete on how well they understand the complexities of language, on how well they see the meaning behind the raw words, on how well they memorize and retrieve things they encounter in large data sets, or how well they recall the important bits from previous interaction history. They either look things up for you or synthesize responses on the fly. Even typical add-on features such as Claude Artifacts exist merely to give operational data a more durable form.

This is perfectly understandable as this is the domain in which AI research often takes place. Consider how these products often come to be – the researchers create an AI experiment that can demonstrate yet another type or level of advanced reasoning based on user inputs and the product folks give it a catchy name and slap a textbox on it. This is the easy and simple path to reach the market and catch attention.

This will not change the world because this is not what the world needs.

Existing AI products are not capable of performing useful work

As I was writing the draft of this article in Word, I asked Copilot to format part of the above section from my unformatted sentences into a proper bulleted list and it simply refused – such capabilities are beyond the grasp of its textbox (though it can at least make tables if you hold its hand).

This unveils the simplicity at the heart of AI products today – each exists in its own bubble of a “chat session”. They can still typically peek out from their box to see what document the human operator is working on but require explicit direction to apply themselves to producing output. AI products today fall short of doing the truly useful work that would benefit their human operators.

The research community is busy empowering AI with ever more intricate reasoning abilities, ever greater memory, and finding ways to feed it more data from different sources at a reasonable cost. Meanwhile, the product community is sitting largely idle as it fails to realize that making the AI smarter is not nearly enough for true success and impact. Even the most innovative attempts to productize AI simply try to bring the chatbot experience into an audio-visual portable form, such as the Humane AI Pin. This is nothing more than startups jumping on the novelty bandwagon.

We do not need AI that can merely summarize articles, we do not need AI that can merely create pictures, we do not need AI that can merely understand and generate speech – these are only building blocks.

We do not need an assistant to chat with, we need an assistant that will work for us. The world will have changed when a discussion of work-life balance includes a large element of AI doing the work for us, leaving us free time to enjoy a high quality life and allowing us to focus our working hours on activities that have the biggest impact from human input.

The limiting factor

The AI is trapped in its textbox, both in space and time. Even when it is empowered to access a large data corpus such as your Microsoft 365 documents, emails and calendar entries, the AI accesses the data set only on demand and only to answer your queries. A productive AI must be free from such limitations.

A productive AI is a humanlike AI not only in its text and voice but also in its actions! This means:

  • The AI will respond to real-world events – an email arrives, a document is dropped into a folder, an item is added to a SAP list, or a specific time of day arrives on a specific weekday.
  • The AI gets a job description and training/reference materials to inform it what to do in response to specific situations and events, just as a human would. There is no magic that makes it automatically do the right thing just because it is AI – rather, it needs to be supervised, taught and trained the same as humans do. Its actions can be evaluated against the job description and training materials, creating a clear causal relationship between the two.
  • The AI will seek out the data it needs based on the instructions given to it, without having to be explicitly told what to do in every case. It is given access to all the data it needs for its job, limited only by the same kinds of permission checks that apply to humans. An AI assistant will have its own user account to allow humans to control its access rights.
  • The AI is empowered to act on the real world – just like I can send an email, so can the AI. Just like I can save a new document into a folder, so can the AI. Just like I can add a new item in SAP, so can the AI.
  • The AI can escalate to its human supervisor when it gets confused, or ask its peers for advice, just like a human would.
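The list above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the names Event, Decision, Assistant and stub_decide are hypothetical, and stub_decide is a hard-coded stand-in for the call to a language model that would read the job description. The sketch shows the shape of the idea: an event arrives, the assistant decides on an action, a permission check is applied, and anything unclear or not permitted goes to the human supervisor.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Event:
    kind: str      # hypothetical event types: "email_received", "file_added", "schedule"
    payload: str


@dataclass(frozen=True)
class Decision:
    action: str    # hypothetical actions: "send_email", "save_document", "escalate"
    detail: str


@dataclass
class Assistant:
    name: str
    job_description: str
    allowed_actions: set[str]                   # access rights of the assistant's own account
    decide: Callable[[str, Event], Decision]    # stand-in for the language model call
    log: list[str] = field(default_factory=list)

    def handle(self, event: Event) -> str:
        decision = self.decide(self.job_description, event)
        if decision.action == "escalate":
            outcome = f"escalated to supervisor: {decision.detail}"
        elif decision.action not in self.allowed_actions:
            # Same kind of permission check that would apply to a human.
            outcome = f"escalated to supervisor: not permitted to {decision.action}"
        else:
            outcome = f"{decision.action}: {decision.detail}"
        self.log.append(outcome)  # the log lets actions be evaluated against the job description
        return outcome


def stub_decide(job_description: str, event: Event) -> Decision:
    # Assumption: a real system would send job_description and event to a model here.
    if event.kind == "email_received" and "invoice" in event.payload.lower():
        return Decision("save_document", "filed invoice email under Invoices")
    if event.kind == "schedule":
        return Decision("send_email", "weekly summary sent")
    return Decision("escalate", f"no guidance for {event.kind}")
```

The point of the design is that the assistant is driven by events rather than by a chat prompt, and that its permissions and escalation path are explicit rather than implied.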

We have the platform and all the building blocks for this today – modern large language models like GPT-4 are perfectly capable of performing the “thinking” needed for any of this to happen.

What we lack are the integrations to empower the AI to act in the right way in the right place at the right time. There are no general purpose mechanisms to act on real world events, no integration with business software beyond plugging in a chat panel, and no mechanism to act on files dropped in a folder. There is no meaningful way for the AI to act without the human prompting each action, or even any way for the AI to influence the outside world beyond showing the user something in a chat or adjusting a document.

It is unlikely that the AI platform providers will be able to act with sufficient momentum and flexibility to create these capabilities that bring AI out of its box. This is an opportunity for others on the market to provide products offering these integration capabilities.

Minimal SDKs are an opportunity for platform differentiation

Not only are the integrations for productive work missing, but the SDKs for building on the AI platforms also lack much of the core functionality needed to work effectively with the specialized interaction style of large language models. There are things that every reasonable AI product needs to do.

For example, large language models have a limited input and output context size, which means that every serious AI product needs to support incremental execution of one form or another to fulfill tasks piece by piece if either context window is exceeded. Today, every product needs to invent its own incremental output mechanisms and windowing mechanisms or some form of “LLM MapReduce” to work on large inputs. While there are different approaches here that can yield results of different quality, anything is better than nothing in this domain.
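As a rough illustration of what such an “LLM MapReduce” could look like, here is a minimal sketch. It makes several assumptions: words are used as a crude proxy for tokens, the function names are hypothetical, and stub_model is a placeholder for a real model call (it just keeps the first three words). The map step runs the model on each window-sized piece, the reduce step joins the partial results and repeats until the text fits into one window.

```python
from typing import Callable


def split_into_windows(words: list[str], window: int) -> list[list[str]]:
    # Cut the input into consecutive pieces of at most `window` words.
    return [words[i:i + window] for i in range(0, len(words), window)]


def map_reduce(text: str, call_model: Callable[[str], str], window: int) -> str:
    words = text.split()
    if len(words) <= window:
        # Fits into one context window: a single call is enough.
        return call_model(" ".join(words))
    # Map: process every window independently.
    partials = [call_model(" ".join(chunk)) for chunk in split_into_windows(words, window)]
    combined = " ".join(partials)
    if len(combined.split()) >= len(words):
        # Guard against looping forever if the model does not condense its input.
        raise ValueError("model output did not shrink the input; cannot reduce")
    # Reduce: treat the joined partial results as the new input.
    return map_reduce(combined, call_model, window)


def stub_model(prompt: str) -> str:
    # Assumption: stands in for a summarization call to a language model.
    return " ".join(prompt.split()[:3])
```

A real SDK would count tokens rather than words and would likely split on sentence or section boundaries, but even this naive form is the kind of plumbing that every product currently rebuilds for itself.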

There is a business opportunity here for AI platforms that can supply SDKs with batteries included, to differentiate themselves beyond simply being “yet another LLM as a service” with a simple REST API.

Knowledge worker to knowledge factory

There is huge potential for AI to change the life of knowledge workers – people whose daily job centers around receiving, processing and publishing information. Accountants, software engineers, analysts and even physicians or lawyers have specialized skills that AI is unlikely to replace, yet they have to spend large amounts of their time on menial work simply to move and process information. This routine menial work is where AI can have a huge impact on productivity.

The conceptual shift necessary to harness AI is to start thinking of each knowledge worker as a factory. Instead of the accountant reviewing the previous day’s transactions each morning and generating a report, they would spawn an AI assistant that will do the majority of this work while the human focuses on something more important or simply enjoys their morning coffee.

The AI assistant can be given a job description and training on how to review these transactions and how to generate the necessary report from them. The accountant can set up different AI assistants with different job descriptions to incrementally process ongoing work, feeding the output from one as input for another as needed by business processes. Part of every knowledge worker’s job would become the orchestration of AI assistants into a cohesive whole that performs as much of the routine information processing as possible.

We need to create the integrations that enable the AI to query the necessary data sources to see the full picture, to react to new or updated data in real time, perhaps even to autonomously contact the employees from whom the transactions originate for more information when needed.

There will no doubt be escalations from the AI assistant that require human intervention – these are the scenarios where human work is justified. The job description for the AI will include when and how to escalate, and the mechanisms for harnessing the AI must support forking the process into multiple paths to accommodate the decisions from human intervention.

Instead of the accountant collecting and submitting a weekly list of payments to be executed, they can spawn another AI assistant and tell it that on every Thursday morning at 10:00, it needs to generate such a list and, after human approval, submit it for execution. If there are no anomalies that require human intervention, a huge chunk of the employee’s time has suddenly been saved and can be applied in more fruitful ways.

Today’s large language models can already perform such tasks satisfactorily and will get even better at it with time – achieving the comprehension required is not an area for innovation, it is an area of steady progress already. What is missing is the integration, the ability to right click on the “Weekly reports” folder, click on “New AI assistant” and give it a job description of “Every Thursday at 10:00, go look <here> and <there> for pending payments and use the logic at <training materials> to generate a report looking like <example report> in this folder”.
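To make the missing piece concrete, here is a minimal sketch of how such a job description might be represented once the integration exists. Everything in it is an assumption for illustration: ScheduledJob, run and the callback parameters are hypothetical names, and the placeholders in the instructions are kept exactly as in the example above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable


@dataclass(frozen=True)
class ScheduledJob:
    instructions: str
    weekday: int                  # Monday is 0, as in datetime.weekday()
    hour: int
    minute: int
    requires_human_approval: bool

    def is_due(self, now: datetime) -> bool:
        return (now.weekday(), now.hour, now.minute) == (self.weekday, self.hour, self.minute)


def run(job: ScheduledJob, now: datetime,
        generate: Callable[[str], str],
        approved_by_human: Callable[[str], bool]) -> str:
    if not job.is_due(now):
        return "skipped"
    report = generate(job.instructions)   # stand-in for the assistant doing the work
    if job.requires_human_approval and not approved_by_human(report):
        return "held for review"          # the human keeps the final decision
    return "submitted"


weekly_payments = ScheduledJob(
    instructions="Go look <here> and <there> for pending payments and use the logic "
                 "at <training materials> to generate a report looking like <example report>.",
    weekday=3, hour=10, minute=0,         # every Thursday at 10:00
    requires_human_approval=True,
)
```

The job itself is plain data that a knowledge worker could edit, while the schedule check and the approval gate belong to the rigid platform around it.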

None of this will eliminate the human from the picture. A computer cannot be held accountable, so the ultimate responsibility to review escalations and make the final decision must reside in a human – AI is always here to assist us, not to replace us. It is never fully autonomous because its purpose is to enable us to accomplish more with our time and energy.

AI assistants are better than programming

An obvious counterargument to the concept of human workers harnessing AI assistants is that this amounts to a very computationally expensive and particularly imprecise form of software engineering – instead of creating AI assistants to do the menial work, one could simply write software that does this work.

This is only true to an extent, however. Crucially, large language models do have the skills to reason about things and apply their knowledge to new situations that the training materials have not foreseen. Programming is rigid and requires all the possibilities to be considered up-front, whereas an AI can figure it out on the go and only requires escalation to a human operator when there is no clear right answer.

Aside from that, software engineering is slow and expensive – everyone working in the software industry has a story about how adding one button took months and months. With AI assistants, you can apply changes instantly by changing their instructions. This is as simple as editing a Word document (perhaps it would literally be that) and promises to greatly speed up the evolution of business processes when needs change from day to day.

There is still a role for software but it is on a different layer than AI. The software provides the rigid platform for the semi-flexible AI logic defined by the human operators.

The platform is here, now is the time for the product

The massive investment in AI has succeeded in building the platform for generative models and ongoing research keeps driving down the cost of executing the models. However, this momentum will not last if we keep using AI as a novelty – for true economic transformation we need to enable the AI to interact with the world and to integrate into the everyday business processes inhabited by humans.

The AI models with sufficient reasoning skills have already arrived. We now need a new generation of AI products that will release AI from its chat window. These products will change the world.
