… or just an LLM-enabled agent?
The term AGI (Artificial General Intelligence) is thrown around a bit loosely these days. I’m actually OK with that: we’re at the peak of the generative AI hype cycle, and I think it’s a term everyone needs to get used to. We don’t have AGI yet. What we do have is a new generation of software agents empowered by LLMs.
At Pioneer Square Labs, we’re actively working to create AI “Copilots” for several industries. But what does Copilot mean exactly?
I’ve started using this distinction:
- A Copilot is a combination of features within an app that invokes AI for specific behaviors. (Example: Grammarly AI)
- A Chatbot is a conversational interface that helps humans retrieve information. (Example: ChatGPT)
- An Agent is code working on your behalf (that you might interact with like a virtual co-worker).
Obviously, many products will incorporate elements of all three, but I think these distinctions are useful.
We’re seeing incredible Copilots and Chatbots in the market today. Agents are being called the “new frontier” in artificial intelligence. They’ve actually been an area of research for decades. AGI is harder to define. To me, it means an agent with near-human or super-human reasoning abilities, a healthy set of skills or tools, and the ability to self-improve.
The proof-of-concept “AGI” agents of the last couple of months are not actually AGIs. In many cases, they’re top-down task generators. While they use GPT to generate the tasks, humans are still writing the prompts and code (“tools”) to complete the tasks.
- Auto GPT generates tasks to be completed by tool Plugins.
- LangChain has agents that use human-coded tools.
- Baby AGI is an AI-powered task management system. “The author does not mean to imply that this is AGI.” It generates tasks and has GPT complete them (no tools beyond the summarization and imagination that GPT is capable of).
- Engineer GPT uses a chain of steps to produce a project description and code through a chat with GPT-4, then writes the output to the file system.
- Otto is a detailed, opinionated, top-down, production-quality code generator that uses Slack and GitHub agents and a cool Figma integration (closed source for now).
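The pattern these projects share can be sketched roughly as follows. This is a minimal illustration and not code from any of the projects above: `fake_llm`, `TOOLS`, and `run_agent` are hypothetical names, and the model call is replaced by a stub so the example runs offline. The point it shows is that the model only picks tasks, while every tool is still written by a human.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

# Human-written tools: the model chooses among them but never writes them.
def search(query: str) -> str:
    return f"results for {query!r}"

def summarize(text: str) -> str:
    return f"summary of {text!r}"

TOOLS: Dict[str, Callable[[str], str]] = {"search": search, "summarize": summarize}

Task = Tuple[str, str]  # (tool name, tool input)

def fake_llm(objective: str) -> List[Task]:
    """Hypothetical stand-in for an LLM call that splits an objective into tasks."""
    return [("search", objective), ("summarize", f"notes on {objective}")]

def run_agent(objective: str, plan: Callable[[str], List[Task]] = fake_llm) -> List[str]:
    """Top-down loop: ask the planner for tasks, then run each with a fixed tool."""
    tasks = deque(plan(objective))
    results: List[str] = []
    while tasks:
        tool_name, tool_input = tasks.popleft()
        tool = TOOLS.get(tool_name)
        if tool is None:
            # The agent cannot invent a missing tool; a human has to add it.
            results.append(f"no tool named {tool_name!r}")
            continue
        results.append(tool(tool_input))
    return results
```

Calling `run_agent("agents")` runs the two stubbed tasks in order. If the planner asks for a tool that is not in `TOOLS`, the loop can only report the gap.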
These examples are inspirational and have (unsurprisingly) validated the huge demand for actual AGI. However, if you’re expecting AGI when you run Baby AGI or Auto GPT, you’ll be disappointed. You quickly discover that they get sidetracked easily. Another deficit: these systems are not designed to update their own code.
I’ve been enthralled by experimentation with ReAct, Chain of Thought, Tree of Thought, and other prompting strategies. Basically, experimenters are trying to find the best way to leverage LLMs to produce software agents. I was working on cataloging these and started to code my own when a paper dropped with the elegant approach I was looking for.
Plenty has been written about Voyager (“An Open-Ended Embodied Agent with Large Language Models”) already. I just want to call out how elegant it is to use GPT-4’s code-writing ability as the core planner. Genius! I had to implement the same approach with my agent, Edgar.
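To make the idea concrete, here is a rough sketch of what “code as the plan” can look like. It is an assumption-laden toy, not Voyager’s or Edgar’s actual implementation: `fake_code_llm`, `Agent`, `learn`, and `perform` are hypothetical names, and the stub returns a hard-coded function where a real agent would call a code-writing model.

```python
from typing import Any, Callable, Dict

def fake_code_llm(task: str) -> str:
    """Hypothetical stand-in for a code-writing model: returns Python source."""
    return "def skill(x):\n    return x * 2\n"

class Agent:
    """Toy agent whose 'plan' for a task is code written by the model."""

    def __init__(self, write_code: Callable[[str], str] = fake_code_llm) -> None:
        self.write_code = write_code
        self.skills: Dict[str, Callable[[Any], Any]] = {}

    def learn(self, task: str) -> Callable[[Any], Any]:
        source = self.write_code(task)
        namespace: Dict[str, Any] = {}
        # exec runs arbitrary code; a real agent would need a sandbox here.
        exec(source, namespace)
        self.skills[task] = namespace["skill"]
        return self.skills[task]

    def perform(self, task: str, arg: Any) -> Any:
        # Reuse a stored skill if one exists, otherwise have the model write one.
        skill = self.skills.get(task)
        if skill is None:
            skill = self.learn(task)
        return skill(arg)
```

Unlike the top-down generators above, nothing here depends on a human having written the tool in advance: the generated function is both the plan and the tool, and it is kept for reuse.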