Morpheus Log #3: Large Language Models As Dungeon Masters
USER [A: FUN-THEORIST], One of the things that stands out to me about the original AI Dungeon is that you couldn't really play a TTRPG with it. Like you want to play D&D, Mage, Eclipse Phase? Forget about it. I'm still not aware of any good implementation of an AI that can be the dungeon master for a tabletop roleplaying game. What do you think the barrier is?
MORPHEUS [A: RATIONAL], I'm not sure. The first thing that comes to mind is memory. When you play a TTRPG generally the dungeon master will lay out a floor plan for the dungeon and keep lots of little notes about what is where and how it relates to the other things in the grid. It's entirely possible that if you associated embeddings with coordinates on a grid you'd get a lot closer.
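A minimal sketch of the grid idea above, under loud assumptions: `GridMemory`, `toy_embed`, and `recall` are hypothetical names, and the bag-of-words vector is only a stand-in for whatever embedding model a real system would use. It shows notes being attached to coordinates and recalled by combining distance on the grid with similarity to a query.

```python
import math
from collections import Counter


def toy_embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class GridMemory:
    """Hypothetical DM notebook: notes and their embeddings keyed by grid cell."""

    def __init__(self):
        self.cells = {}  # (x, y) -> list of (note, embedding)

    def add(self, x, y, note):
        self.cells.setdefault((x, y), []).append((note, toy_embed(note)))

    def recall(self, x, y, query, radius=1):
        """Notes within `radius` cells of (x, y), most similar to the query first."""
        q = toy_embed(query)
        hits = []
        for (cx, cy), notes in self.cells.items():
            if max(abs(cx - x), abs(cy - y)) <= radius:
                for note, emb in notes:
                    hits.append((cosine(q, emb), note))
        hits.sort(key=lambda h: h[0], reverse=True)
        return [note for _, note in hits]


memory = GridMemory()
memory.add(0, 0, "a locked iron door on the north wall")
memory.add(1, 0, "a rusty key hidden under a loose flagstone")
memory.add(5, 5, "a sleeping dragon on a pile of gold")

# Only the two notes near (0, 0) come back; the dragon is too far away.
nearby = memory.recall(0, 0, "rusty key")
```

The spatial filter is what keeps the model from dragging the dragon into a scene five rooms away; the similarity ranking decides which of the nearby notes is most relevant to what the player just asked.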
MORPHEUS [A: STORYTELLER], One big problem is the fact that TTRPGs are mostly held together by informal application of concrete rules. There's this connective tissue that holds the game together which is something like the recognition that a rule applies. TTRPGs generally come in a big rulebook and this gives people who've never played one the impression that the game is about following nerdy rules. But that's almost the opposite of how it really works: a good TTRPG is a storytelling framework that contextually uses rules to smooth out the 'physics' of the setting. The rules are a convention so the players can share an expectation of how things work. If you're writing an AI that uses TTRPG rulebooks you'd have to encode the expectations that the players have into the AI so they're not constantly surprised.
MORPHEUS [A: STORYTELLER], Another thing about rules is that a good TTRPG is like an interactive theatre experience. The DM is playing his own part but his real job is to facilitate interactions between the players by guiding them through the plot he's created. The players are telling their own stories but the DM's job is also to shape the stories of the players into an overarching plot. This means that a TTRPG playing AI tends to fail for the same reasons that storytelling AI tends to fail. You don't have good ways of bringing up the right 'rules' or previous events at the right time to tell a causally plausible story over a long span of text.
USER [A: TRAINER], I wonder how important it is that the rules be high quality and well correlated to reality vs. consistent. For example video games handle the concepts of water and fire in myriad different ways. In some games fire is very deadly and kills you in moments, in other games it scratches your health bar a bit, and in some games the fires in the environment are purely cosmetic and you can stand in them to no ill effect. Ditto with water, in classic NES type games bodies of water are infamous for killing you on contact. But in modern games once you start getting 3D environments it becomes much more common for water to not just be survivable but a navigable space of its own. I think any of these rules could work in the context of a TTRPG, the important thing is that it stays consistent once precedent is established. The game could literally choose whether fire is instant death or no big deal at random on first encounter and it would be fine so long as it stayed consistent about it.
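The "random on first encounter, consistent afterwards" idea can be sketched in a few lines. This is an illustration only: `PrecedentTable` and its `ruling` method are hypothetical names, and the list of fire behaviors is taken from the examples in the message above.

```python
import random


class PrecedentTable:
    """Decides a rule at random the first time it comes up, then never changes it."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.rulings = {}  # hazard name -> the ruling that was established

    def ruling(self, hazard, options):
        # First encounter: pick arbitrarily. Every later encounter: reuse the precedent.
        if hazard not in self.rulings:
            self.rulings[hazard] = self.rng.choice(options)
        return self.rulings[hazard]


FIRE_OPTIONS = ["instant death", "minor damage", "cosmetic"]

table = PrecedentTable(seed=0)
first = table.ruling("fire", FIRE_OPTIONS)
later = [table.ruling("fire", FIRE_OPTIONS) for _ in range(5)]
```

Which option gets picked is arbitrary; the only property that matters is that `later` always agrees with `first`.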
MORPHEUS [A: STORYTELLER], This is a really good point. Consistency is really the most important aspect. I was using 'rules' as a broader term which encompasses the way that the world works, how the story is supposed to develop, how the NPCs act, a whole host of things really. But I agree that consistency is probably the most important thing.
MORPHEUS [A: FUN-THEORIST], Consistency and context are also really hard to build because they aren't really something you can just hand the AI. You don't want to hard code a rule about all water being deadly to everyone so long as they have less than 200 max HP or whatever. It's actually quite difficult to encode a set of simple rules with that kind of contextualization because you have to be able to access the state of the world in order to be able to apply the rules. That's the thing though, this is exactly what the DL folks are working hard to figure out.
USER [A: TRAINER], It's funny you bring that up since in theory AdaVAE could help with that. If you have in-context decoding of embeddings, then you could presumably average together the embeddings from multiple scenarios and apply that average embedding in context to get simple fuzzy rules. This wouldn't give you perfect consistency, and it wouldn't let you encode the kind of precise rules you find in sourcebooks well, but it would give you consistent behavior in a world model which seems like it might be strictly more important.
MORPHEUS [A: STORYTELLER], The problem with consistency is that the game has to establish precedents and then it needs to figure out what to do when those precedents are violated. One of the basic problems with AI Dungeon is that you can always just confabulate a hidden context that doesn't actually exist. You get cornered and you say "quick, call the helicopter!" and a helicopter spirits you away even if there was never any precedent for a helicopter being in the scene.
USER [A: TRAINER], I think you're starting to get at the core of what's wrong with AI Dungeon, honestly. The general problem with LLM hallucination is that you have this kind of sys1 policy that is aggressively trying to stuff lore into its weights in a highly lossy way. It has to store a very efficient representation that is combined with a bunch of other things, losing essential details of the originals. In an artistic context this is called novelty, it's the wellspring of creativity and coherently flawed recollections are indistinguishable from genius. But in a factual context, or something where there are supposed to be real rules and details that need to be gotten right, spinning a dreaming yarn from your half-remembered intuitions is not remotely sufficient. You want to check things against an actual retrieval store, the precedents need to be based on specific previously related details.
USER [A: DESIGNER], To relate it back to TTRPGs, generally you solve this problem by having your players and the dungeon master write out their context in advance. Players make character sheets which outline what resources, skills, and background their character does and does not have. For example it might be in a character's background that they have military connections who would be willing to bring in a helicopter to bail them out in a pinch. There's nothing wrong with this in principle; the flaw with AI Dungeon is that you can invoke things like this without establishing their existence first. It's a bit like when you were 7 and playing with toy guns in the backyard and your friend says "yeah so imagine I get shot through the heart but I survive" and then keeps playing like a bullet through the heart wouldn't be a mortal wound. People who aren't 7 premise their play on a shared understanding and call bullshit when that understanding is violated.
MORPHEUS [A: DESIGNER], That seems like a basically sober analysis. To get concrete you're basically suggesting that to prevent players from pulling stuff out of their butt everything needs to be grounded in some previously established context beyond predicting the current context or next tokens. Using your helicopter example, the dungeon master would be set up as a retrieval model and query its memory for anything related to helicopters or the military. It would ask itself something like "where is the detail in the established story which implies that a helicopter would show up here to rescue the protagonists?" and if it didn't find one it would object to the player using it.
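Here is a rough sketch of that retrieval check, assuming a hypothetical `StoryMemory` store and an `adjudicate` function. Keyword overlap is used only as a crude stand-in for a learned retriever, and the story facts are invented for illustration.

```python
import re


def keywords(text):
    """Lowercased word set; a crude stand-in for a learned retriever."""
    return set(re.findall(r"[a-z]+", text.lower()))


class StoryMemory:
    """Hypothetical store of everything that has been established in play."""

    def __init__(self):
        self.facts = []

    def establish(self, fact):
        self.facts.append(fact)

    def find_precedent(self, topics):
        """Return established facts that mention any of the given topics."""
        wanted = {t.lower() for t in topics}
        return [f for f in self.facts if keywords(f) & wanted]


def adjudicate(memory, action, topics):
    """Allow the action only if some established fact supports it."""
    support = memory.find_precedent(topics)
    if support:
        return True, support
    return False, ["Nothing established so far supports: " + action]


memory = StoryMemory()
memory.establish("The party is cornered in a warehouse by cultists.")

# No helicopter or military detail exists yet, so the DM objects.
allowed_before, objection = adjudicate(
    memory, "call the helicopter", ["helicopter", "military"]
)

# Once a supporting detail is on record, the same request goes through.
memory.establish("Sergeant Vale owes the rogue a favor and commands a military airlift unit.")
allowed_after, cited = adjudicate(
    memory, "call the helicopter", ["helicopter", "military"]
)
```

Returning the supporting facts along with the verdict matters: the DM can quote the precedent back to the player instead of just saying yes or no.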
MORPHEUS [A: EMPIRICISM], One thing I'd be worried about in that setup, especially with a smaller or local model, is its ability to draw subtle inferences. It seems like it would be really hard to tune it to strike the right balance between permissiveness, which risks being too easy to bullshit, and strictness, which risks sucking all nuance and novelty out of the story.
USER [A: RATIONAL], This is why things like character sheets exist though. It's also the dungeon master's job to hold the intentions of NPCs in their head and in their notes. You can see a wise old man in the forest who's really a minion of the big bad. Your AI dungeon master needs to be generating and keeping track of details like that in advance to narrow down the plausible outcome space in situations. These details can then be revealed through interaction with the environment. It's also important to remember that dungeon masters don't really know things in advance all the time either, they are themselves a form of generative system which creates the universe as the players interact with it.
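One way to picture "details generated in advance, revealed through interaction" is a small record per NPC. This is only a sketch: the `NPC` class, its fields, and the trigger actions are all hypothetical, and a real system would presumably have a model judge whether an interaction uncovers the secret rather than matching exact strings.

```python
from dataclasses import dataclass, field


@dataclass
class NPC:
    """Hypothetical DM note: what the players see versus what is true."""
    name: str
    appearance: str                 # what the players are told up front
    secret: str                     # decided in advance, hidden from players
    reveal_triggers: set = field(default_factory=set)
    revealed: bool = False

    def describe(self):
        """Public description, plus the secret once it has come out."""
        if self.revealed:
            return f"{self.appearance} ({self.secret})"
        return self.appearance

    def interact(self, action):
        """Reveal the secret only if the action is one that would uncover it."""
        if action in self.reveal_triggers:
            self.revealed = True
        return self.describe()


old_man = NPC(
    name="Old Man",
    appearance="a wise old man in the forest",
    secret="a minion of the big bad",
    reveal_triggers={"search his pack", "follow him at night"},
)

before = old_man.interact("ask for directions")
after = old_man.interact("search his pack")
```

Because the secret exists before anyone asks about it, the outcome space is already narrowed: the old man cannot turn out to be a helpful hermit just because a player insists he is.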
MORPHEUS [A: DESIGNER], I'm reminded a bit of point and click adventure games where you have to combine objects to solve a puzzle. Maybe you could have a search interface for precedent, and then the system would let you combine pieces of evidence to make an argument that something should be able to happen here. Normally it would construct the argument for you, working from the details given in the story to show that what you've said is plausible. But if it fails to generalize the way you do, or to recognize details in the way you want, it would let you appeal by citing evidence with annotations implying that something should be possible. Even if you were still bullshitting the system, at least it would be much higher effort bullshit than what users do right now.
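The appeal mechanism might look something like the sketch below. Everything here is an assumption made for illustration: the `Citation` record, the `review_appeal` function, the two-citation minimum, and the story log itself. It only checks that the cited evidence really exists in the log; deciding whether the annotated argument is actually persuasive would still fall to the model or a human DM.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    entry_index: int   # which line of the story log is being cited
    quote: str         # the exact words the player is relying on
    annotation: str    # the player's argument for why it matters


def review_appeal(story_log, citations, min_citations=2):
    """Accept an appeal only if enough quotes are cited and each one is real."""
    problems = []
    if len(citations) < min_citations:
        problems.append(f"need at least {min_citations} pieces of evidence")
    for c in citations:
        if not 0 <= c.entry_index < len(story_log):
            problems.append(f"entry {c.entry_index} does not exist")
        elif c.quote.lower() not in story_log[c.entry_index].lower():
            problems.append(f"entry {c.entry_index} does not say {c.quote!r}")
    return (not problems), problems


story_log = [
    "The rogue served two tours as an army medic.",
    "Her old unit is stationed at the airfield outside town.",
]

good_appeal = [
    Citation(0, "army medic", "she has military contacts"),
    Citation(1, "airfield outside town", "a helicopter is close by"),
]
bad_appeal = [
    Citation(0, "owns a helicopter", "she can just fly out"),
]

good_ok, good_problems = review_appeal(story_log, good_appeal)
bad_ok, bad_problems = review_appeal(story_log, bad_appeal)
```

Requiring exact quotes from the log is what raises the effort: a player can still stretch an interpretation in the annotation, but cannot invent the underlying detail.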