Is this “AGI?”

… or just an LLM-enabled agent?

The term AGI (Artificial General Intelligence) is thrown around loosely these days. I’m actually OK with that: we’re at the peak of the generative AI hype cycle, and it’s a term everyone needs to get used to. We don’t have AGI yet; what we do have is a new generation of software agents empowered by LLMs.

At Pioneer Square Labs, we’re actively working to create AI “Copilots” for several industries. But what does Copilot mean exactly?

I’ve started using this distinction:

  1. A Copilot is a combination of features within an app that invokes AI for specific behaviors. (Example: Grammarly AI)
  2. A Chatbot is a conversational interface that helps humans retrieve information. (Example: ChatGPT)
  3. An Agent is code working on your behalf (one you might interact with like a virtual co-worker).

Obviously, many products will incorporate elements of all three, but I think these distinctions are useful. 

We’re seeing incredible Copilots and Chatbots in the market today. Agents are being called the “new frontier” in artificial intelligence, though they’ve actually been an area of research for decades. AGI is harder to define. To me, it means an agent with near-human or super-human reasoning abilities, a healthy set of skills or tools, and the ability to self-improve.

The proof-of-concept “AGI” agents of the last couple of months are not actually AGIs. In many cases, they’re top-down task generators. While they use GPT to generate the tasks, humans are still writing the prompts and code (“tools”) that complete the tasks.
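The top-down pattern described above can be sketched in a few lines. This is a toy, not any particular project’s code: the `llm()` function is a canned stub standing in for a real model call, and the prompt wording is illustrative.

```python
# Minimal sketch of a "top-down task generator" agent loop (BabyAGI/AutoGPT
# style). The LLM decomposes an objective into tasks; each task is then
# handled by a human-written "tool". llm() is a stub so this runs offline.
from collections import deque

def llm(prompt: str) -> str:
    """Hypothetical model call; a real agent would query GPT here."""
    if "break the objective" in prompt:
        return "research the market\nwrite a summary"
    return f"done: {prompt}"

def run_agent(objective: str) -> list[str]:
    # 1. Top-down planning: the model turns the objective into a task list.
    plan = llm(f"break the objective into tasks: {objective}")
    tasks = deque(line.strip() for line in plan.splitlines() if line.strip())

    results = []
    while tasks:
        task = tasks.popleft()
        # 2. Execution: each task goes through a pre-written tool. Nothing
        #    here lets the agent question or revise its own plan.
        results.append(llm(task))
    return results

print(run_agent("launch a product"))
```

Because the loop only ever drains the queue the model produced up front, there is no mechanism for recovering when a generated task turns out to be off-target, which is part of why these agents wander.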

These examples are inspirational and have (unsurprisingly) validated the huge demand for actual AGI. However, if you’re expecting AGI when you run Baby AGI or Auto GPT, you’ll be disappointed. You quickly discover that they get sidetracked easily. Another deficit: these agents are not designed to update their own code.

I’ve been enthralled by experimentation with ReAct, Chain of Thought, Tree of Thought, and other prompting strategies. Basically, experimenters are trying to find the best way to leverage LLMs to produce software agents. I was working on cataloging these and started to code my own when a paper dropped with the elegant approach I was looking for.
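To make one of those strategies concrete, here is a minimal ReAct-style loop: the model alternates Thought, Action, and Observation until it emits a final answer. Again, `llm()` is a scripted stub, and the tool name and prompt format are my own illustrative choices, not a specific paper’s API.

```python
# Minimal ReAct loop: the model's output interleaves reasoning ("Thought")
# with tool calls ("Action"); the runtime executes the tool and feeds the
# result back as an "Observation". llm() is canned so the example is
# self-contained and deterministic.
import re

def llm(transcript: str) -> str:
    """Hypothetical model call; scripted for this example."""
    if "Observation:" not in transcript:
        return "Thought: I need the population.\nAction: lookup[France]"
    return "Thought: I have the answer.\nFinal Answer: 68 million"

# A "tool" is just a plain function the runtime can dispatch to.
TOOLS = {"lookup": lambda q: f"Population of {q} is 68 million."}

def react(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = llm(transcript)
        transcript += "\n" + step
        if "Final Answer:" in step:
            return step.split("Final Answer:")[1].strip()
        match = re.search(r"Action: (\w+)\[(.+)\]", step)
        if match:
            tool, arg = match.groups()
            transcript += f"\nObservation: {TOOLS[tool](arg)}"
    return "gave up"

print(react("What is the population of France?"))
```

The transcript itself is the agent’s working memory: every turn, the whole history goes back into the prompt, which is what lets the model condition its next thought on earlier observations.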

Voyager

Plenty has been written about Voyager (“An Open-Ended Embodied Agent with Large Language Models”) already. I just want to call out how elegant it is to use GPT-4’s code-writing ability as the core planner. Genius! I had to implement the same approach with my agent, Edgar.
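The core idea can be sketched as follows. In Voyager itself, GPT-4 writes JavaScript skills for Minecraft; this toy uses Python, a stubbed `llm()`, and an invented `skill`/`state` convention purely to show the shape of code-as-planner with a skill library.

```python
# Sketch of the code-as-planner pattern: the LLM's output is an executable
# program, and programs that work are saved to a skill library for reuse.
# llm() is a canned stub; the skill(state) convention is illustrative.
skill_library: dict[str, str] = {}

def llm(task: str) -> str:
    """Hypothetical model call returning a program for the task."""
    return (
        "def skill(state):\n"
        "    state['logs'] = state.get('logs', 0) + 3\n"
        "    return state\n"
    )

def solve(task: str, state: dict) -> dict:
    # Reuse a stored skill if we have one; otherwise ask the model to
    # write new code. The generated program IS the plan.
    code = skill_library.get(task) or llm(task)
    namespace: dict = {}
    exec(code, namespace)
    new_state = namespace["skill"](state)
    skill_library[task] = code  # keep working skills for later reuse
    return new_state

print(solve("collect wood", {}))  # -> {'logs': 3}
```

The elegance is that planning, execution, and skill acquisition collapse into one representation: a program the model can write, the runtime can run, and the library can store.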