Generative AI and the Potential of What is to Come

Julian Waters-Lynch

The first in our series of articles exploring AI from a human-centred perspective.

10-Second Summary

  • New AI applications like ChatGPT and Stable Diffusion are just early demos of the generative capability that will be built into our knowledge work tools going forward.

  • As generative AI increasingly permeates knowledge work, it is important to understand what it’s doing and how it can be used productively, and to recognise its limitations.

  • AI-powered tools work best as co-pilots that complement human understanding, creativity and common sense, rather than as replacements for human judgement and oversight. 

We Finally Built a Machine We can Talk to

In 2022, a new set of generative Artificial Intelligence (AI) applications became available for the public to use. These began with image-creating tools like DALL-E 2, Midjourney and Stable Diffusion, which mostly made a splash creating weird AI art (1). These image-generating tools are fun to play with, and raise some interesting and important questions for visual designers and the art world (2), but it’s hard to see how most of us would use them in our work, studies or day-to-day life. Then, in November 2022, OpenAI released ChatGPT, the first intelligent chat tool that appears to…actually work. You can ask it almost any question and it not only easily understands you, but usually gives coherent and useful responses.

ChatGPT is just the Beginning

Normally it’s wise to treat tech demos - especially about AI - with a decent dose of scepticism (3), so it’s wonderful that you can try ChatGPT yourself (4). ChatGPT can generate Wikipedia-esque answers to most questions in real time, tell decent jokes (5) and even write pretty good songs (6). But, more importantly, try to discover how you might use it in your work by giving it tasks you actually need to do - ask it to outline an idea for an article, offer advice on an issue you’re grappling with at work, or edit a paragraph you’re working on. You’ll see how it can be a useful sounding board, even a collaborator, for text-based work.

But you’ll also likely spot some of its current shortcomings. It sometimes makes completely incorrect pronouncements with unperturbed confidence; it doesn’t have access to information beyond 2021; and it can’t currently cite its sources - if pressed, it often invents completely fictitious but plausible-sounding ones.

Still, it’s a remarkable achievement. Its immediate value is reflected in its adoption rate as the fastest-growing app in history (7), a trend that took even OpenAI by surprise. But here’s the thing: ChatGPT was never intended to be a product, merely a public demonstration of what Large Language Models (LLMs) like GPT can now do. Although the company has been working on the underlying technology for years, OpenAI built the chat interface in just a few weeks (8). OpenAI demoing ChatGPT is something like Thomas Edison showing off an electric phonograph that could play the French national anthem at a 19th-century expo (9). It was built more as a toy than a tool. The underlying technology, the equivalent of the electricity that powers the toy, is the real disruptor that will transform the future.

It’s not the tool, it’s how we interact with the tool

I think the analogy with electricity is helpful in understanding how these technologies will be deployed. Electricity is what’s called a general purpose technology (10) (ironically also abbreviated as GPT!): an innovation that impacts and transforms multiple sectors of our economy, culture and society. Once we discovered how to generate and harness electricity, for example, it was progressively put to work powering a diverse array of products and tools, making them more effective and easier to use. General purpose technologies, whether we’re talking about the wheel or steel, are less products in themselves and more like civilisational infrastructure that we build new things upon. In fact, they often become so ubiquitously embedded in everyday things, like cement and plastics, that we stop even thinking of them as technology.

In this sense, the new generation of AI (11) will transform how we interact with information by becoming embedded in many of our daily tools. Just as we gradually electrified manufactured products in the 20th century - lights, fridges, drills etc. - we will start to see machine learning and generative AI used to power our knowledge tools - our search engines, word documents, spreadsheets and slide presentations. This might sound like a bold claim, but in many ways it’s just a more explicit layer of machine intelligence mediating our relationship with the infosphere (12). Search engines like Google, let alone social media sites, already curate the information we see on our screens by executing algorithmic ‘choices’ - showing us first the page most linked to by other sites, or those paid for by advertisers, or the post most likely to get us to “like, share and subscribe” (13). However, rather than simply directing our attention towards information created by other humans, machines can now spontaneously generate content in response to our queries and requests, by synthesising and summarising data extracted from the infosphere.

The potential for Generative AI to Transform the World of Work

Generative AI is already being put to work in everyday knowledge work tools, including search engines, productivity and document management software. Here are a few early-stage products that illustrate the potential of what’s to come.

1. SEARCH

Microsoft’s multiyear, multibillion-dollar partnership with OpenAI might end up being one of the most commercially strategic deals in history (14). After spending much of the 2000s lagging behind the likes of Alphabet (Google), Apple, Amazon and Meta (Facebook), Microsoft has used the partnership to effectively leapfrog its tech rivals - at least in marketing (15). While Microsoft trailed competitors in its early execution of the major digital trends of the 2010s, such as social, mobile, analytics and cloud, it kept a lock on the workplace through its Office suite. Although innovative competitors, including a few gutsy startups, have made inroads in productivity tools at the consumer level, Microsoft’s productivity suite, especially the flagship products Word, Excel and PowerPoint, remains dominant at the enterprise level and in the workplace. As cybersecurity concerns continue to grow, IT departments are likely to place even greater value on the perceived safety and reliability of Microsoft’s offerings.

The partnership has also revived Bing, Microsoft’s almost forgotten, and long-mocked, web search engine. Bing now has a feature that uses GPT to spontaneously generate a summarised response to search queries, alongside the conventional list of links. Although the feature is still in beta and its bugs are still being ironed out (16), this new format of directly asking questions and receiving responses generated as real-time summaries from multiple data sources - visible on the right side of the screen in the image above - will likely become the dominant format of search. Other search engines (17) like Neeva and You.com have already released their own beta versions.

Microsoft has integrated generative AI within Bing

These developments also put Google in a bit of a bind. Despite the company’s pioneering role in developing LLM technology (18), and the release of its own conversational AI product, Bard, this new wave of AI-enhanced search engines presents it with a significant business model challenge. Google relies heavily on paid advertising in the form of links that appear in response to search queries, so adopting the new format would undermine the cash cow that powers its dominant position as the incumbent in the search engine market. Google is caught in a classic case of “the innovator’s dilemma” (19).

2. PRODUCTIVITY

In addition to search, those of us who work on computers tend to spend most of our time in document management applications - creating and manipulating text, numbers or images - as well as communicating with others through email.

Both Microsoft and Google have been quietly integrating AI capabilities into their email and productivity tools, as you might have noticed in the improved autocomplete suggestions that appear as you begin writing emails. But the options for prompt-based generation of text and images, or analysis in the case of numbers, are set to explode.

GPT for Work has already created plugins for Google Docs and Sheets (20), which embed the ability to clean up lists, convert formats, edit and summarise text, translate between languages, classify content into categories and generate new taglines, blogs or emails in response to prompts. Microsoft is expected to launch similar functions soon, which will likely be based on GPT-4 - the next-generation upgrade from the GPT-3.5 model that ChatGPT is currently using (21).

Other companies are currently testing similar products. For example, Canva recently released Magic Write for Canva Docs (22), and Notion is testing Notion AI (23), both of which embed this generative functionality directly in the application. These tools will also eventually integrate prompt-generated images, audio and video. As for email, given how straightforward the requests in many emails are, and how generically they are written, the future of email communication will likely involve autonomous bots negotiating to schedule our meeting times, or simply seeking our permission to answer questions or share documents with other people’s bot-assistants.

3. CODING

The abilities that allow generative AI models to read and process natural language text can also be applied to various programming languages. OpenAI has developed a product, called OpenAI Codex (24), that is specifically designed to translate natural language instructions into code. Microsoft has already integrated this technology into products like the aptly named GitHub Copilot (25), which some software engineers claim enables them to shift their attention from writing boilerplate but time-consuming code to higher-order problems of software design (26). As with the productivity tools, lesser-known specialist companies are building similar natural-language-to-code products, like the Ghostwriter tool by Replit featured above. These have the potential to make software development vastly more efficient, and also to open up the world of building software to a much wider audience.

An Open Frontier Full of Opportunity

What does this mean for those of us who don’t work for AI companies? I think these tools are a good illustration of an old saying about this technology - that AI itself won’t take your job, but humans using AI might. The emerging tools work best as co-pilots, rather than total substitutes for human knowledge, skill and judgement. On a good day, they function like a super capable assistant or team that you can direct towards a task. On a bad day, tools like ChatGPT act like a brilliant assistant with a secret ayahuasca addiction: they can’t distinguish between reality and their hallucinations. Many of these bugs are temporary, and the technology will continue to evolve. But it’s important to remember that much of this disruption will occur at the level of the task, rather than the job itself. A job is, in essence, a collection of tasks that serve a larger goal - hopefully one that adds value to other people.

The new generation of AI will certainly automate some of the tasks currently performed by humans. The composition of tasks that make up many jobs will change, but many of the larger goals will remain. The open frontier before us involves learning how to incorporate the power of this technology, as we did with electricity a century earlier, to create value more effectively, and sometimes more efficiently, for the people our jobs are designed to serve. This future is full of opportunity for those with imagination and the willingness to experiment and learn.

In our next article we will dive into the three different ways humans process and learn information, what this means in the context of Generative AI and how to elevate outcomes based on this.

References:

(1) Visible here for Weird AI Art

(2) https://www.theverge.com/2023/1/16/23557098/generative-ai-art-copyright-legal-lawsuit-stable-diffusion-midjourney-deviantart

(3) https://techcrunch.com/2023/01/17/tesla-engineer-testifies-that-2016-video-promoting-self-driving-was-faked/

(4) https://chat.openai.com/chat

(5) https://medium.com/@paul.k.pallaghy/chatgpt-writes-pretty-good-stand-up-comedy-lol-91ca03837fd1

(6) https://www.audiocipher.com/post/chatgpt-music

(7) https://news.yahoo.com/chatgpt-fastest-growing-app-history-142338104.html

(8) https://www.nytimes.com/2023/02/03/technology/chatgpt-openai-artificial-intelligence.html

(9) https://journalofantiques.com/features/beginning-19th-century-worlds-fairs/

(10) https://en.wikipedia.org/wiki/General-purpose_technology

(11) In a narrow sense, the recent advances of generative AI are the result of Large Language Models (LLMs) built using deep learning neural networks. But AI is really an umbrella term used to refer to the convergence of multiple techno-scientific advances, spanning natural language processing, computer vision, decision and Bayesian networks, probabilistic programming, neural networks, autonomous systems, evolutionary algorithms etc.

These various components are well summarised and visually represented by Francesco Corea in this post: AI Knowledge Map: how to classify AI technologies. In more practical terms, AI involves employing these various technologies to enable a machine to behave in ways that, if a human behaved this way, we would call it intelligent.

(12) Floridi, L. (2014). The fourth revolution: How the infosphere is reshaping human reality. OUP Oxford. The ‘infosphere’ is a term used by philosophy of information theorist Luciano Floridi to refer to the sum of all information and communication in a society, from academic and media articles to online forums, social media posts and personal blogs. Human generated (and much non-human generated) information is increasingly digitised, and thus machine readable. Floridi argues we should consider this as a new and important layer of reality, on par with the biosphere and physiosphere.

(13) Thaler, R. H., Sunstein, C. R., & Balz, J. P. (2013). Choice architecture. The behavioral foundations of public policy, 25, 428-439. The notion of ‘choice architecture’ involves the design of an environment - physical or digital - that presents options in such a way that we are more or less likely to select them, such as strategically placing sweet food at the supermarket checkout counter.

(14) https://blogs.microsoft.com/blog/2023/01/23/microsoftandopenaiextendpartnership/

(15) https://medium.com/enrique-dans/how-microsoft-caught-google-napping-fd5b7d9f6ad1

(16) https://futurism.com/microsoft-your-fault-ai-going-insane

(17) https://kinsta.com/blog/alternative-search-engines/

(18) GPT stands for ‘Generative Pre-trained Transformer’. Google actually developed the ‘T’, or the transformer architecture, as outlined in a pathbreaking paper, “Attention Is All You Need”.

(19) Christensen, C. M. (2013). The innovator's dilemma: when new technologies cause great firms to fail. Harvard Business Review Press.

(20) https://gptforwork.com/

(21) https://www.gizmochina.com/2023/02/13/microsoft-office-chatgpt-demo-soon/

(22) https://www.canva.com/magic-write/

(23) https://www.notion.so/help/guides/using-notion-ai

(24) https://openai.com/blog/openai-codex/

(25) https://github.com/features/copilot/

(26) https://www.economist.com/interactive/briefing/2022/06/11/huge-foundation-models-are-turbo-charging-ai-progress
