Winning Cluedo

Cluedo (also known as Clue) is a deduction-oriented board game where players use their logical reasoning skills to solve a murder case. At the start of the game, a random perpetrator, murder weapon, and location are selected secretely, and are put into an envelope. The remaining cards get shuffled, and dealt out evenly to all players. (If they can't all be dealt out evenly, the remaining cards are shown face up to all players.) Players then take turns suggesting possible combinations of perpetrator, weapon, and location cards, to which the first player in clockwise order who holds any of the named cards in their hand reveals one of them to the currently active player, but not the others. The goal is to be the first to correctly guess the secret cards in the envelope.

There are two main aspects to the game:

deducing which cards people have in their hands, and hence which cards must be in the envelope, and;
asking the right questions to learn more (which as we will shortly see can also involve deception).

We will focus on the former in this article, and write a program in Prolog that makes use of constraint logic programming to tackle it!

A short introduction to constraint logic programming

Constraint logic programming is a fancy name for combining logic programming with concepts from constraint satisfaction. This enables us to define constraints on variables, which are limitations on the values that these variables can take.

It's easiest to illustrate what this entails through an example. Consider the following arithmetic puzzle: replace the boxes by the digits 1 through 9 in the following grid to make all four equations hold:

\begin{matrix} □ - □ & = & □ \\ * \\ □ / □ & = & □ \\ = \\ □ + □ & = & □ \end{matrix}

The naive way to solve this programmatically is to enumerate all $9! = 362.880$ possible assignments, and return those for which the equations hold (called the generate and test paradigm in logic programming, but is otherwise known as brute-force search). In Prolog, this would look something like this:

% Finds the assignments of 1..9 for A..I such that:
%   A-B=C
%       *
%   D/E=F
%       =
%   G+H=I
naive_solution(Vars) :-
  % Enumerate all possible permutations of 1..9
  length(Vars, 9),
  permutation([1,2,3,4,5,6,7,8,9], Vars),
  Vars = [A,B,C,D,E,F,G,H,I],

  % Check whether the equations hold for a permutation
  C is A - B, % First row
  F is D / E, % Second row
  I is G + H, % Third row
  I is C * F. % Column

It's important to note that the call to permutation causes the variable Vars to become bound: from the next line onwards, each of A,B, C, D, E, F, G, H, and I are assigned a concrete number that will be directly checked against each of the equations from the grid.

Since the search space is small, the naive approach succeeds in this case. But the approach does not scale very well to larger problems: even just adding two meaningless variables causes the same program to take more than $100$ times as long to finish executing, adding three would mean an increase in run time of more than $1.000$ times!

% Finds the assignments of 1..11 for A..K such that:
%   A-B=C
%       *
%   D/E=F
%       =
%   G+H=I
naive_solution_with_meaningless_variables(Vars) :-
  % Enumerate all possible permutations of 1..11
  length(Vars, 11),
  permutation([1,2,3,4,5,6,7,8,9,10,11], Vars),
  Vars = [A,B,C,D,E,F,G,H,I,J,K],
  
  % Check whether the equations hold for a permutation
  C is A - B, % First row
  F is D / E, % Second row
  I is G + H, % Third row
  I is C * F. % Column

Luckily, a more elegant solution exists that scales better to larger problems, by using constraint logic programming. Rather than enumerating and checking all possibilities in a search space directly, we define constraints that can efficiently eliminate large irrelevant chunks of the search space. Each added constraint transforms the current problem into a reduced one; this process, called constraint propagation, is one of the core features of a constraint solver.

In SWI-Prolog and SICStus Prolog, the libraries CLPFD and CLPZ define constraint logic programming for finite domains and integers respectively. Using CLPFD, the equation in the first row of our puzzle can be represented by the constraint A - B #= C. Since C must be positive, adding this constraint excludes any solution where A is less than B - which already amounts to excluding half the search space! How efficiently this can be done depends on the in-memory representation of the search space within the constraint solver, but is not something we have to worry about.

The new program could look something like this:

:- use_module(library(clpfd)).

% Finds the assignments of 1..9 for A..I such that:
%   A-B=C
%       *
%   D/E=F
%       =
%   G+H=I
constraint_logic_solution(Vars) :-
  % The variables are some permutation of 1..9
  Vars = [A, B, C, D, E, F, G, H, I],
  Vars ins 1..9,
  all_distinct(Vars),

  % Introduce the grid constraints
  A - B #= C, % First row
  D #= E * F, % Second row
  G + H #= I, % Third row
  C * F #= I, % Column
  
  % Solve the constraint satisfaction problem
  label(Vars).

The first line simply imports CLPFD.

The program begins by stating we are interested in nine variables A through I, which should all take values from 1 through 9 via the ins predicate. ([A, B] ins 1..9 is equivalent with A in 1..9, B in 1..9.) The all_distinct predicate adds a constraint on the provided list of variables that guarantees that they all get assigned distinct values. After this predicate, Vars is not yet bound - we have only started building up a constraint satisfaction problem. Actually finding and enumerating the remaining possible solutions is done through label, which we will only call when we've added all constraints.

Next, we add the constraints corresponding to the sets of equations in the grid. Arithmetic constraints can be added through #= (equality constraint), #> (greater than constraint), #< (less than constraint), and #\= (inequality constraint). Again, the fundamental difference with naive_solution is that at this point A through I are still not yet bound.

Finally, only when we call label(Vars) will we start to find all possible assignments of Vars that satisfy the given constraints, and finding these assignments is done much more efficiently than enumerating and checking all permutations individually: constraint_logic_solution outperforms naive_solution by a factor $10$ :

time(naive_solution(X)). takes about 0.16 seconds on average, and;
time(constraint_logic_solution(X)). takes about 0.018 seconds on average.

You can play around with the code on SWISH. (SWISH doesn't work on all mobile devices. There should be a blue button titled "Run!" in the bottom right corner; this button does not always appear.)

constraint_logic_solution has another advantage over naive_solution, namely that it is even faster than it already was for alternate modes of execution. Querying for example naive_solution([1, _, _, _, _, _, _, _, _]). still takes relatively long to execute, as it still enumerates all permutations behind the scenes. But the equivalent query for constraint_logic_solution correctly returns false almost immediately, since the first constraint now becomes 1 - B #= C, and the constraint solver can quickly deduce that no satisfying assignment exists.

Back to murder mysteries

Constraint logic programming can also be applied to more complicated problems, like deducing the hidden cards in a game of Cluedo. Every card needs to be somewhere, so it seems natural to introduce variables for the location of each card. Each in-game event adds additional constraints on which cards can or can not reside somewhere. When enough knowledge is available such that only a single possible combination of cards can be inside the envelope, we have found the answer.

We start our Prolog program with some basic facts:

there are six suspects, six weapons, and nine rooms as listed below, and;
each of these suspects, weapons, and rooms are cards in the game.

% All possible suspects
suspect(plum).
suspect(white).
suspect(scarlet).
suspect(green).
suspect(mustard).
suspect(peacock).

% All possible weapons
weapon(rope).
weapon(dagger).
weapon(wrench).
weapon(pistol).
weapon(candlestick).
weapon(lead_pipe).

% All possible rooms
room(courtyard).
room(game_room).
room(study).
room(dining_room).
room(garage).
room(living_room).
room(kitchen).
room(bedroom).
room(bathroom).

% A card is either a suspect, weapon, or room
card(Card) :- suspect(Card).
card(Card) :- weapon(Card).
card(Card) :- room(Card).

As previously stated, each of these cards must be somewhere. They are either: removed from the game at the start, hidden inside the envelope, or in one of the player's hands. Let's associate a number with all of these locations:

$-1$ for being removed from the game;
$0$ for being hidden in the envelope, and;
$n$ for being in player $n$ 's hand, where $1$ denotes you, and the rest are ascending integers for the other players in clockwise order.

Hence, if there are NumberOfPlayers amount of players, we introduce the list of variables CardLocations (one for each card's location), such that CardLocations ins -1..NumberOfPlayers. Now, what can we say about CardLocations? A few things actually:

your hand cards are known;
the discarded cards are known;
the number of cards in each location is known;
the envelope contains exactly one card of each category, and;
the questions and responses each turn put additional restrictions on the card locations.

This leads to the following starting point:

cluedo(NumberOfPlayers, HandCards, DiscardedCards, QuestionsAndResponses, EnvelopeCards) :-
  % Create a location variable for each card
  bagof(Card, card(Card), Cards),
  length(Cards, AmountOfCards),
  length(CardLocations, AmountOfCards),
    
  % The cards are dealt evenly among the players, and the rest are discarded
  AmountOfHandCards is (AmountOfCards - 3) div NumberOfPlayers,
  AmountOfDiscardedCards is (AmountOfCards - 3) mod NumberOfPlayers,
  (length(HandCards, AmountOfHandCards) -> true;
    print("Unexpected number of hand cards"), fail ),
  (length(DiscardedCards, AmountOfDiscardedCards) -> true;
    print("Unexpected number of discarded cards"), fail ),

  % Create a mapping of each card to its location variable
  pair(Cards, CardLocations, CardCardLocationPairs),
  dict_pairs(CardToCardLocation, _, CardCardLocationPairs),

  % Each card is either discarded (-1), hidden (0), or held by a player (>=1)
  CardLocations ins -1..NumberOfPlayers,
  
  % Constraints
  your_hand_cards_are_known(NumberOfPlayers, CardToCardLocation, HandCards),
  discarded_cards_are_known(NumberOfPlayers, CardToCardLocation, DiscardedCards),
  number_of_cards_per_location(NumberOfPlayers, CardLocations, AmountOfHandCards, AmountOfDiscardedCards),
  the_envelope_contains_one_card_of_each_category(CardToCardLocation),
  questions_and_responses(NumberOfPlayers, CardToCardLocation, QuestionsAndResponses),
  
  % Enumerate all combinations of cards that can be inside the envelope
  envelope_cards(Cards, CardLocations, EnvelopeCards).

% Relates two lists with a list where all their contents are paired together
pair([], [], []).
pair([Element1|List1], [Element2|List2], [Element1-Element2|Zipped]) :-
  pair(List1, List2, Zipped).

% Relates multiple keys to their respective values in a dictionary
get_multiple_from_dict([], _, []).
get_multiple_from_dict([Key|Keys], Dict, [Value|Values]) :-
  get_dict(Key, Dict, Value),
  get_multiple_from_dict(Keys, Dict, Values).

In a nutshell, this specifies that all of the following need to be specified when calling cluedo:

NumberOfPlayers is an integer that specifies how many players are participating in the game;
HandCards is a list of atoms representing the cards in your hand, e.g. [mustard, plum, game_room, garage];
DiscardedCards is a list of atoms representing the discarded cards, e.g. [dagger, courtyard], and;
QuestionsAndResponses is a list of the questions asked by the players, and the responses they got (we will get into the specifics later).

A location variable for each card is created (CardLocations), with CardToCardLocation mapping each card atom to its corresponding location variable. Next, all constraints are applied, and we solve for the cards inside the envelope EnvelopeCards.

Two helper predicates are defined: pair and get_multiple_from_dict:

pair relates two lists, with a list where each element has been paired up, much like the zip function in most programming languages. Such that, for example: pair([bla, blabla], [foo, bar], [bla-foo, blabla-bar])., which is used to create the CardToCardLocation mapping. And;
get_multiple_from_dict relates multiple keys within a dictionary with their corresponding values, such that, for example: get_multiple_from_dict([key1, key2], _{key1: value1, key2: value2, key3: value3}, [value1, value2]),

All that remains is to define the constraints, and to actually solve the constraint satisfaction problem.

Known card constraints

The constraints your_hand_cards_are_known and discarded_cards_are_known are pretty easy to define:

your_hand_cards_are_known(CardToCardLocation, HandCards) :-
  get_multiple_from_dict(HandCards, CardToCardLocation, HandCardLocations),
  maplist(#=(1), HandCardLocations).

We simply find all card location variables corresponding to the given list of hand cards, and apply the partial relation 1 #= to each of them, using the maplist meta-predicate. (maplist(#=(1), [X, Y]) is equivalent with 1 #= X, 1 #= Y.) Remember that $1$ represents your location, so this constraint just states that your hand cards are really in your hand. Analogously, discarded_cards_are_known constrains the location variables of the discarded cards to $- 1$ , which represents being discarded:

discarded_cards_are_known(CardToCardLocation, DiscardedCards) :-
  get_multiple_from_dict(DiscardedCards, CardToCardLocation, DiscardedCardLocations),
  maplist(#=(-1), DiscardedCardLocations).

Cardinality constraints

number_of_cards_per_location permits a simple implementation using the pre-defined global_cardinality constraint, which specifies how often specific values occur within a given list of variables. (For example, global_cardinality([A, B, C, D], [1-2, 2-1, 3-1]). states that the value one occurs twice, and the values two and three occur once across the variables A, B, C, and D.) We know the number of discarded cards, that there are three cards in the envelope, and that each player has the same number of hand cards, so:

number_of_cards_per_location(NumberOfPlayers, CardLocations, AmountOfHandCards, AmountOfDiscardedCards) :-
  bagof(Player-AmountOfHandCards, between(1, NumberOfPlayers, Player), PlayerCardinality),
  global_cardinality(CardLocations, [(-1)-AmountOfDiscardedCards, 0-3 | PlayerCardinality]).

Where we have used the bagof meta-predicate to find the "cardinalities" of the player locations in the card location variables. bagof relates some template and goal with all alternatives of that template for which the goal holds. (For example, bagof(X-4, between(1, 3, X), Xs) is equivalent with Xs = [1-4, 2-4, 3-4].)

the_envelope_contains_one_card_of_each_category can also be defined using global_cardinality, by stating that the value $0$ occurs once within the card locations of the person cards, once for the weapon cards, and the room cards. Remember that zero represents being inside the envelope (for each card category, we leave the cardinalities of the card locations variables being discarded or in a player's hand unspecified, denoted by _):

the_envelope_contains_one_card_of_each_category(NumberOfPlayers, CardToCardLocation) :-
  bagof(Person, person(Person), Persons),
  bagof(Weapon, weapon(Weapon), Weapons),
  bagof(Room, room(Room), Rooms),
  one_of_cards_is_in_envelope(NumberOfPlayers, CardToCardLocation, Persons),
  one_of_cards_is_in_envelope(NumberOfPlayers, CardToCardLocation, Weapons),
  one_of_cards_is_in_envelope(NumberOfPlayers, CardToCardLocation, Rooms).

one_of_cards_is_in_envelope(NumberOfPlayers, CardToCardLocation, Cards) :-
  get_multiple_from_dict(Cards, CardToCardLocation, CardLocations),
  bagof(Player-_, between(1, NumberOfPlayers, Player), PlayerCardinality),
  global_cardinality(CardLocations, [(-1)-_, 0-1 | PlayerCardinality]).

Questions and responses

Now we get to the most important, and most difficult constraint: questions_and_responses. During their turn, a player asks for a suspect, weapon, and room card, after which one of two things can happen:

none of the other players has one of the mentioned cards, in which case no cards are revealed, or;
at least one of the other players has one of the mentioned cards, in which case the first player in turn order after the asking player reveals one of the mentioned cards to them.

Depending on whether you are the asking player, the latter case reveals different information. If you are, then you know the exact card that the responding player has. If you aren't, you only know the responding player has at least one of the cards.

So from the point of view of constraints, there are three cases for each entry in the "questions and responses"-list. questions_and_responses consists of applying the appropriate of each of these cases to the given list of QuestionsAndResponses through maplist:

questions_and_responses(NumberOfPlayers, CardToCardLocation, QuestionsAndResponses) :-
  maplist(question_and_response(NumberOfPlayers, CardToCardLocation), QuestionsAndResponses).

The individual cases become:

% Each player, except the one asking for the cards, didn't have them
question_and_response(NumberOfPlayers, CardToCardLocation, no(Asker, Cards)) :-
  get_multiple_from_dict(Cards, CardToCardLocation, CardLocations),
  players_in_between_dont_have_cards(NumberOfPlayers, CardLocations, Asker, Asker).

% Responder has one of the cards, and we know which one
question_and_response(NumberOfPlayers, CardToCardLocation, yes(1, Responder, Cards, ShownCard)) :-
  % The responding player has the shown card
  get_dict(ShownCard, CardToCardLocation, ShownCardLocation),
  ShownCardLocation #= Responder,

  % Each player before the responding player in the list didn't have them
  get_multiple_from_dict(Cards, CardToCardLocation, CardLocations),
  players_in_between_dont_have_cards(NumberOfPlayers, CardLocations, 1, Responder).

% Responder has one of the cards, but we don't know which one, since we weren't shown the card
question_and_response(NumberOfPlayers, CardToCardLocation, yes(Asker, Responder, Cards)) :-
  Asker \= 1,
  get_multiple_from_dict(Cards, CardToCardLocation, CardLocations),
    
  % The responding player has at least one of the cards
  maplist(reified_equal_to(Responder), CardLocations, RespondingPlayerHasCard),
  sum(RespondingPlayerHasCard, #>=, 1),
    
  % Each player before the responding player in the list didn't have them
  players_in_between_dont_have_cards(NumberOfPlayers, CardLocations, Asker, Responder).

Let's go over them one by one.

The first case somewhat verbosely states that all players from Asker to Asker in turn order (so all other players) don't have the given Cards. For this, we use the helper predicate players_in_between_dont_have_cards that will also come in handy for the other cases:

% Each player between the asking and responding player in the list didn't have the listed cards
players_in_between_dont_have_cards(NumberOfPlayers, CardLocations, Asker, Responder) :-
  (Asker < Responder ->
    % "Asking < Responder"-case
    bagof(Player, (
      between(Asker, Responder, Player),
      Player \= Asker,
      Player \= Responder
    ), PlayersWithoutCards);
    
    % "Asking >= Responder"-case
    bagof(Player, (
      between(1, NumberOfPlayers, Player),
      \+ between(Responder, Asker, Player)
    ), PlayersWithoutCards)),
  maplist(player_doesnt_have_cards(CardLocations), PlayersWithoutCards).

% The given card locations are not the given player
player_doesnt_have_cards(CardLocations, Player) :- maplist(#\=(Player), CardLocations).

The second case is when we are the asking player, and Responder responds to our question by showing us ShownCard. We again use players_in_between_dont_have_cards to state that the players that didn't respond don't have any of the requested cards.

The last case is when another player Asker asks for some cards, and Responder responds that they have the card (such that we don't know what was shown to Asker). Now, rather than assigning a single card location to Responder, we introduce extra variables for the card locations of the cards that were asked. These variables equal $1$ if and only if their respective card is held by Responder, otherwise they are $0$ . This is done through the helper predicate reified_equal_to:

reified_equal_to(Value, Variable, Truth) :- Variable #= Value #<==> Truth.

The sum of these extra variables being greater than or equal to $1$ means at least one of the cards is held by Responder.

Turning a constraint into a variable is known as reification, and is for example done by the #<==> and #\/ predicates. Reification allows us to reason about the truth of constraints.

Labeling

So far, we've only imposed constraints on CardLocations. All that remains is to actually solve the constraint satisfaction problem. We'll do exactly that within the envelope_cards predicate:

envelope_cards(Cards, CardLocations, EnvelopeCards) :-
  maplist(reified_equal_to(0), CardLocations, EnvelopeLocations),
  pair(Cards, EnvelopeLocations, CardEnvelopeLocationPairs),
  label(EnvelopeLocations), % Solve the constraint satisfaction problem
  include(second_equals_one, CardEnvelopeLocationPairs, CardEnvelopeLocationPairsInEnvelope),
  pair(EnvelopeCards, _, CardEnvelopeLocationPairsInEnvelope).

second_equals_one(_-1).

label will actually enumerate the possible assignments permitted by the imposed constraints. In a nutshell, what is happening in this predicate is that we are expicitly not calling label(CardLocations), because we don't need to know where every single card is in order to deduce what's inside the envelope. Labeling CardLocations would cause us to also enumerate all possible assignments of irrelevant cards. Instead, we only care about which of CardLocations is assigned to $0$ (the envelope), so we introduce extra variables for this through reification, and label those. The rest of the predicate is concerned with actually finding the cards from these variables.

With this last piece added, our solver is complete! You can play around with it on SWISH. For example, running:

cluedo(4,
  [mustard, plum, game_room, garage],
  [dagger, courtyard],
  [
   yes(1, 3, [rope, green, study], rope),
   yes(2, 1, [candlestick, plum, kitchen]),
   no(3, [candlestick, green, living_room]),
   no(4, [pistol, green, courtyard])
  ], X).

tells us that so far we know it was Reverend Green with the candlestick in either the bathroom, bedroom, kitchen, living room, dining room or study.

Wrapping up

In this article we saw how we can write a slightly more complicated constraint logic program using CLPFD. Making use of the bagof and maplist meta-predicates allowed us to write very concise code. If you want to learn more about the material covered in this article, check out these excellent videos from The Power of Prolog:

Meta-Predicates in Prolog (YouTube) covers among others bagof and maplist meta-predicates;
Sudoku in Prolog (YouTube) shows how to write a Sudoku solver using CLPFD.

This wraps up the Prolog series for now!