Peering Down Into Talking Ant Hill [Hackaday]

April 19, 2023 Jonathan Joseph Ketch

Watching an anthill brings an air of fascination. Thousands of ants move about and communicate with each other as they work towards a goal as a collective whole. We humans project a complex inner world onto each of these tiny creatures to drive the narrative. But what if we could peer down into a miniature world where the ants spoke English? (PDF whitepaper)

Researchers at Stanford University and Google Research have released a paper about simulating human behavior using multiple Large Language Models (LLMs). The simulation has a few dozen agents that can move around a small town, run errands, and communicate with each other. Each agent has a short description to help provide context to the LLM. In addition, the agents have memories of objects, other agents, and observations that they can retrieve, which allows them to create a plan for their day. The memory is a time-stamped text stream that the agent reflects on, deciding what is important. Additionally, the LLM can replan and figure out what it wants to do next.
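To make the memory idea concrete, here is a minimal Python sketch of a time-stamped memory stream. It is not the paper's implementation: the `Memory` and `MemoryStream` names, the hand-assigned importance scores (which an LLM would presumably supply in the real system), and the 24-hour recency half-life are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Memory:
    timestamp: datetime
    text: str
    importance: float  # hypothetical 0..1 score; assigned by hand in this sketch


@dataclass
class MemoryStream:
    half_life_hours: float = 24.0  # assumed decay rate, chosen for illustration
    memories: list = field(default_factory=list)

    def observe(self, timestamp, text, importance):
        """Append a time-stamped observation to the stream."""
        self.memories.append(Memory(timestamp, text, importance))

    def retrieve(self, now, k=3):
        """Return the k memories with the highest importance * recency score."""
        def score(m):
            age_hours = (now - m.timestamp).total_seconds() / 3600
            recency = 0.5 ** (age_hours / self.half_life_hours)
            return m.importance * recency
        return sorted(self.memories, key=score, reverse=True)[:k]


now = datetime(2023, 2, 14, 9, 0)
stream = MemoryStream()
stream.observe(now - timedelta(hours=48), "Isabella is planning a Valentine's Day party", 0.9)
stream.observe(now - timedelta(hours=1), "The fridge is empty", 0.2)
stream.observe(now - timedelta(hours=2), "A neighbor said hello", 0.1)

top = stream.retrieve(now, k=2)
for m in top:
    print(m.timestamp, m.text)
```

In this toy run the two-day-old party memory still outranks the fresher but trivial observations, which is the kind of behavior you would want before handing the retrieved text to an LLM to plan the day.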

The question is, does the simulation seem life-like? In one fascinating example, the paper’s authors created one agent (Isabella) who intends to throw a Valentine’s Day party. No other information was provided. Yet several agents arrive at the character’s house later in the day to party. Isabella invited friends, and those agents invited others in turn.

A web-accessible demo replays recorded data from an earlier run. However, it doesn’t showcase the powers that a user can exert on the world when running live, where thoughts and suggestions can be issued to an agent to steer its actions. You can still pause the replay to view the conversations between agents. Overall, it is incredible how life-like the simulation can be. That said, the language of the conversations is quite formal, and running the simulation burns significant amounts of computing power. Perhaps there could be a subconscious, where certain behaviors or observations are coded into the agent instead of querying the LLM for every little thing (which sort of sounds like what people do).
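As a rough illustration of that "subconscious" idea, the sketch below checks a table of canned reflexes before falling back to the expensive model, and remembers new answers for next time. Everything here is hypothetical: the `Agent` class, the `reflexes` table, and the `fake_llm` stand-in are invented for this example and have nothing to do with the paper's code.

```python
class Agent:
    def __init__(self, llm, reflexes=None):
        self.llm = llm  # any callable taking and returning a string (hypothetical stand-in)
        self.reflexes = dict(reflexes or {})  # hard-coded or learned responses
        self.llm_calls = 0  # counts how often the slow path was taken

    def react(self, observation):
        key = observation.strip().lower()
        # "Subconscious" path: answer from the table without touching the LLM.
        if key in self.reflexes:
            return self.reflexes[key]
        # Slow path: ask the model, then remember the answer for next time.
        self.llm_calls += 1
        action = self.llm(observation)
        self.reflexes[key] = action
        return action


def fake_llm(observation):
    # Placeholder for a real model call, which would be far more costly.
    return f"think about: {observation}"


agent = Agent(fake_llm, reflexes={"the door is closed": "open the door"})
first = agent.react("The door is closed")   # served from the reflex table
second = agent.react("The stove is on")     # goes to the (fake) LLM
third = agent.react("The stove is on")      # now cached, no second call
print(first, second, third, agent.llm_calls)
```

An exact-match lookup like this is obviously too brittle for free-form observations, but it shows where the savings would come from.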

There’s been an exciting trend of combining LLMs with a form of backing store, like combining Wolfram Alpha with ChatGPT. Thanks [Abe] for sending this one in!
