Einstein's Riddle in D

Some examples of strong static typing in D

By leonardo maffi
V.1.2, Dec 14 2011
Keywords: programming, d language, c++, python, static typing, correctness

[Go back to the article index]

This article is about the D translation of two old little C++ programs, to show some differences in the idioms of the two languages, and to discuss a little about static typing too.

I have found the original C++ code in this page:
See also: http://www.weitz.de/einstein.html

Both C++ programs solve a logic problem named "Einstein's Riddle". Here you find more info about its variants:
http://en.wikipedia.org/wiki/Zebra_Puzzle

The original C++ versions are by Klaus Betzler and by Dr. Gabor Csanyi.

This post is not a discussion of the Einstein's Riddle, or of algorithms that solve it. The C++ code is about ten years old, so it doesn't use some of the handy features of modern C++0x, that are sometimes comparable to D features. So this is not meant to be a fair comparison between modern C++ and modern D V.2 language.

To compile the D code I am using DMD v.2.056 (or an alpha version of v.2.057 from GitHub).

See here for the complete C++ and D code:
http://www.fantascienza.net/leonardo/js/einstein.zip

Archive contents:
- einstein1.cpp: original first C++ (fast) solution.
- einstein1a.d: D translation of the first C++ solution.
- einstein2.cpp: original second C++ (slow) solution.
- einstein2a.d: first D translation of the second C++ solution. This contains a good amount of strong static typing. This uses more idiomatic D arrays.
- einstein2b.d: second D translation of the second C++ solution. This uses arrays that are more efficient but less idiomatic D.
- einstein2c.py: Python2 translation of the first D translation of the second C++ solution. This program is rather slow, even counting it's in Python, I don't fully know why.

======================================================

Comments on the first program (einstein1.cpp and einstein1a.d):


C> inline unsigned long BIT(unsigned n) {return 1<<n;}
C>
C> const unsigned long
C>   yellow = BIT(0),
C>   blue = BIT(1),
C>   red = BIT(2),
C>   green = BIT(3),
C>   white = BIT(4),
C>
C>   norwegian = BIT(5),
C>   dane = BIT(6),
C>   brit = BIT(7),
C>   german = BIT(8),
C>   swede = BIT(9),
C>
C>   water = BIT(10),
C>   tea = BIT(11),
C>   milk = BIT(12),
C>   coffee = BIT(13),
C>   beer = BIT(14),
C>
C>   dunhill = BIT(15),
C>   marlboro = BIT(16),
C>   pallmall = BIT(17),
C>   rothmans = BIT(18),
C>   winfield = BIT(19),
C>
C>   cat = BIT(20),
C>   horse = BIT(21),
C>   bird = BIT(22),
C>   fish = BIT(23),
C>   dog = BIT(24);
C>
C> const char * Label[] = {
C>   "Yellow","Blue","Red","Green","White",
C>   "Norwegian","Dane","Brit","German","Swede",
C>   "Water","Tea","Milk","Coffee","Beer",
C>   "Dunhill","Marlboro","Pallmall","Rothmans","Winfield",
C>   "Cat","Horse","Bird","Fish","Dog"
C> };


D> string bitFlags(BaseType=uint)(string enumName, string enumItems) {
D>     string result = text("enum ", enumName, " : ", BaseType.stringof, " {\n");
D>     result ~= text("    None = 0,\n");
D>     foreach (i, flag; enumItems.split())
D>         result ~= text("    ", flag, " = (1 << ", i, "),\n");
D>     return result ~ "}";
D> }
D>
D> mixin(bitFlags!uint("E", "
D> Yellow    Blue     Red      Green    White
D> Norwegian Dane     Brit     German   Swede
D> Water     Tea      Milk     Coffee   Beer
D> Dunhill   Marlboro Pallmall Rothmans Winfield
D> Cat       Horse    Bird     Fish     Dog"));



The D version uses named enums, so D writeln is able to print their names, this avoids the need of the Label array. A simplified bitFlags compile-time function allows to create a simple enumeration of bit flags.

---------------------------

C> const unsigned long color = yellow+blue+red+green+white;
C> const unsigned long country = norwegian+dane+brit+german+swede;
C> const unsigned long drink = water+tea+milk+coffee+beer;
C> const unsigned long cigar = dunhill+marlboro+pallmall+rothmans+winfield;
C> const unsigned long animal = cat+horse+bird+fish+dog;
C>
C> unsigned long house[5] = {norwegian,blue,milk,0,0};  // rules 8,9,14
C> unsigned long result[5];
C>
C> const unsigned long comb[] = { // simple rules
C>   brit+red,                    // 1
C>   swede+dog,                   // 2
C>   dane+tea,                    // 3
C>   green+coffee,                // 5
C>   pallmall+bird,               // 6
C>   yellow+dunhill,              // 7
C>   winfield+beer,               // 12
C>   german+rothmans              // 13
C> };
C>
C> const unsigned long combmask[] = { // corresponding selection masks
C>   country+color,
C>   country+animal,
C>   country+drink,
C>   color+drink,
C>   cigar+animal,
C>   color+cigar,
C>   cigar+drink,
C>   country+cigar
C> };


D> enum E color =   E.Yellow    | E.Blue     | E.Red      | E.Green    | E.White;
D> enum E country = E.Norwegian | E.Dane     | E.Brit     | E.German   | E.Swede;
D> enum E drink =   E.Water     | E.Tea      | E.Milk     | E.Coffee   | E.Beer;
D> enum E cigar =   E.Dunhill   | E.Marlboro | E.Pallmall | E.Rothmans | E.Winfield;
D> enum E animal =  E.Cat       | E.Horse    | E.Bird     | E.Fish     | E.Dog;
D>
D> immutable E[] comb = [ // simple rules
D>     E.Brit     | E.Red,     // 1
D>     E.Swede    | E.Dog,     // 2
D>     E.Dane     | E.Tea,     // 3
D>     E.Green    | E.Coffee,  // 5
D>     E.Pallmall | E.Bird,    // 6
D>     E.Yellow   | E.Dunhill, // 7
D>     E.Winfield | E.Beer,    // 12
D>     E.German   | E.Rothmans // 13
D> ];
D>
D> immutable E[] combMask = [ // corresponding selection masks
D>     country | color,
D>     country | animal,
D>     country | drink,
D>     color   | drink,
D>     cigar   | animal,
D>     color   | cigar,
D>     cigar   | drink,
D>     country | cigar
D> ];



The D version uses | (or) instead of +, for better clarity. comb and combMask are not enums for more efficiency.


D> int main() {
D>     E[5] house = [E.Norwegian, E.Blue, E.Milk, E.None, E.None]; // rules 8, 9, 14
D>     E[5] result;



The D code defines 'house' and 'result' inside the main(), this avoids global mutable variables. 'house' is needed as argument by most functions, but the overall performance of the program is about unchanged.

'house' is an array of E enums, that are strongly typed, so I have had to add E.None, that is represented with 0.

======================================================

Comments on the second program (einstein2.cpp and einstein2a.d):


C> char *NationS[] = {"British", "Swedish", "Danish", "Norvegian", "German"};
C> enum Nation {British, Swedish, Danish, Norvegian, German};
C> char *NumberS[] = {"One", "Two", "Three", "Four", "Five"};
C> enum Number {One, Two, Three, Four, Five};
C> char *ColourS[] = {"Red", "Green", "Blue", "White", "Yellow"};
C> enum Colour {Red, Green, Blue, White, Yellow};
C> char *DrinkS[] = {"Milk", "Coffee", "Water", "Beer", "Tea"};
C> enum Drink {Milk, Coffee, Water, Beer, Tea};
C> char *SmokeS[] = {"PallMall", "Dunhill", "Blend", "BlueMaster", "Prince"};
C> enum Smoke {PallMall, Dunhill, Blend, BlueMaster, Prince};
C> char *PetS[] = {"Dog", "Cat", "Fish", "Horse", "Bird"};
C> enum Pet {Dog, Cat, Fish, Horse, Bird};


D> enum Number : ubyte { One,      Two,     Three,  Four,       Five   };
D> enum Color  : ubyte { Red,      Green,   Blue,   White,      Yellow };
D> enum Drink  : ubyte { Milk,     Coffee,  Water,  Beer,       Tea    };
D> enum Smoke  : ubyte { PallMall, Dunhill, Blend,  BlueMaster, Prince };
D> enum Pet    : ubyte { Dog,      Cat,     Fish,   Horse,      Bird   };
D> enum Nation : ubyte { British,  Swedish, Danish, Norvegian,  German };



In D enums are almost strongly typed, but not as strongly as C++0x "enum class" enumerations:
http://en.wikipedia.org/wiki/C%2B%2B0x#Strongly_typed_enumerations

In D writeln is able to print the name of named enums, so the arrays like PetS are not necessary. In D unsigned bytes are named "ubyte".

---------------------------

C> void
C> myprint(char *ptr, char **s)
C> {
C>   printf("%12s%12s%12s%12s%12s\n", s[ptr[0]], s[ptr[1]], s[ptr[2]], s[ptr[3]], s[ptr[4]]);
C> }
...
C> main()
C> {
...
C> 		      // woo-hoo!
C> 		      printf("Found a solution:\n");
C> 		      printf("Nation: ");
C> 		      myprint(nation, NationS);
C> 		      printf("Number: ");
C> 		      myprint(number, NumberS);
C> 		      printf("Colour: ");
C> 		      myprint(colour, ColourS);
C> 		      printf("Drink : ");
C> 		      myprint(drink, DrinkS);
C> 		      printf("Smoke : ");
C> 		      myprint(smoke, SmokeS);
C> 		      printf("Pet   : ");
C> 		      myprint(pet, PetS);
C> 		      printf("\n");


D> void showRow(E)(in E[] data) {
D>     auto names = [EnumMembers!E];
D>     writefln("%6s: %12s%12s%12s%12s%12s",
D>              (Unqual!E).stringof,
D>              text(names[data[0]]), text(names[data[1]]),
D>              text(names[data[2]]), text(names[data[3]]),
D>              text(names[data[4]]));
D> }
...
D> void main() {
...
D>                                             writeln("Found a solution:");
D>                                             showRow(nation);
D>                                             showRow(number);
D>                                             showRow(color);
D>                                             showRow(drink);
D>                                             showRow(smoke);
D>                                             showRow(pet);
D>                                             writeln();
D>                                         }



D here is doing more at compile-time. The printing template function showRow is able to find the names of the enums and of their elements.

text() is used because formats like "%12s" are not yet supported (DMD 2.056) for enum members.

---------------------------

C> char perms[][5] = {
C>   {     4,     3,     2,     1,     0},
C>   {     3,     4,     2,     1,     0},
C>   {     4,     2,     3,     1,     0},
C>   {     2,     4,     3,     1,     0},
C>   {     3,     2,     4,     1,     0},
C>   {     2,     3,     4,     1,     0},
C>   {     4,     3,     1,     2,     0},
C>   {     3,     4,     1,     2,     0},
C>   {     4,     1,     3,     2,     0},
C>   {     1,     4,     3,     2,     0},
C>   {     3,     1,     4,     2,     0},
C>   {     1,     3,     4,     2,     0},
C>   {     4,     2,     1,     3,     0},
C>   {     2,     4,     1,     3,     0},
C>   {     4,     1,     2,     3,     0},
C>   {     1,     4,     2,     3,     0},
C>   {     2,     1,     4,     3,     0},
C>   {     1,     2,     4,     3,     0},
C>   {     3,     2,     1,     4,     0},
C>   {     2,     3,     1,     4,     0},
C>   {     3,     1,     2,     4,     0},
C>   {     1,     3,     2,     4,     0},
C>   {     2,     1,     3,     4,     0},
C>   {     1,     2,     3,     4,     0},
C>   {     4,     3,     2,     0,     1},
C>   {     3,     4,     2,     0,     1},
C>   {     4,     2,     3,     0,     1},
C>   {     2,     4,     3,     0,     1},
C>   {     3,     2,     4,     0,     1},
C>   {     2,     3,     4,     0,     1},
C>   {     4,     3,     0,     2,     1},
C>   {     3,     4,     0,     2,     1},
C>   {     4,     0,     3,     2,     1},
C>   {     0,     4,     3,     2,     1},
C>   {     3,     0,     4,     2,     1},
C>   {     0,     3,     4,     2,     1},
C>   {     4,     2,     0,     3,     1},
C>   {     2,     4,     0,     3,     1},
C>   {     4,     0,     2,     3,     1},
C>   {     0,     4,     2,     3,     1},
C>   {     2,     0,     4,     3,     1},
C>   {     0,     2,     4,     3,     1},
C>   {     3,     2,     0,     4,     1},
C>   {     2,     3,     0,     4,     1},
C>   {     3,     0,     2,     4,     1},
C>   {     0,     3,     2,     4,     1},
C>   {     2,     0,     3,     4,     1},
C>   {     0,     2,     3,     4,     1},
C>   {     4,     3,     1,     0,     2},
C>   {     3,     4,     1,     0,     2},
C>   {     4,     1,     3,     0,     2},
C>   {     1,     4,     3,     0,     2},
C>   {     3,     1,     4,     0,     2},
C>   {     1,     3,     4,     0,     2},
C>   {     4,     3,     0,     1,     2},
C>   {     3,     4,     0,     1,     2},
C>   {     4,     0,     3,     1,     2},
C>   {     0,     4,     3,     1,     2},
C>   {     3,     0,     4,     1,     2},
C>   {     0,     3,     4,     1,     2},
C>   {     4,     1,     0,     3,     2},
C>   {     1,     4,     0,     3,     2},
C>   {     4,     0,     1,     3,     2},
C>   {     0,     4,     1,     3,     2},
C>   {     1,     0,     4,     3,     2},
C>   {     0,     1,     4,     3,     2},
C>   {     3,     1,     0,     4,     2},
C>   {     1,     3,     0,     4,     2},
C>   {     3,     0,     1,     4,     2},
C>   {     0,     3,     1,     4,     2},
C>   {     1,     0,     3,     4,     2},
C>   {     0,     1,     3,     4,     2},
C>   {     4,     2,     1,     0,     3},
C>   {     2,     4,     1,     0,     3},
C>   {     4,     1,     2,     0,     3},
C>   {     1,     4,     2,     0,     3},
C>   {     2,     1,     4,     0,     3},
C>   {     1,     2,     4,     0,     3},
C>   {     4,     2,     0,     1,     3},
C>   {     2,     4,     0,     1,     3},
C>   {     4,     0,     2,     1,     3},
C>   {     0,     4,     2,     1,     3},
C>   {     2,     0,     4,     1,     3},
C>   {     0,     2,     4,     1,     3},
C>   {     4,     1,     0,     2,     3},
C>   {     1,     4,     0,     2,     3},
C>   {     4,     0,     1,     2,     3},
C>   {     0,     4,     1,     2,     3},
C>   {     1,     0,     4,     2,     3},
C>   {     0,     1,     4,     2,     3},
C>   {     2,     1,     0,     4,     3},
C>   {     1,     2,     0,     4,     3},
C>   {     2,     0,     1,     4,     3},
C>   {     0,     2,     1,     4,     3},
C>   {     1,     0,     2,     4,     3},
C>   {     0,     1,     2,     4,     3},
C>   {     3,     2,     1,     0,     4},
C>   {     2,     3,     1,     0,     4},
C>   {     3,     1,     2,     0,     4},
C>   {     1,     3,     2,     0,     4},
C>   {     2,     1,     3,     0,     4},
C>   {     1,     2,     3,     0,     4},
C>   {     3,     2,     0,     1,     4},
C>   {     2,     3,     0,     1,     4},
C>   {     3,     0,     2,     1,     4},
C>   {     0,     3,     2,     1,     4},
C>   {     2,     0,     3,     1,     4},
C>   {     0,     2,     3,     1,     4},
C>   {     3,     1,     0,     2,     4},
C>   {     1,     3,     0,     2,     4},
C>   {     3,     0,     1,     2,     4},
C>   {     0,     3,     1,     2,     4},
C>   {     1,     0,     3,     2,     4},
C>   {     0,     1,     3,     2,     4},
C>   {     2,     1,     0,     3,     4},
C>   {     1,     2,     0,     3,     4},
C>   {     2,     0,     1,     3,     4},
C>   {     0,     2,     1,     3,     4},
C>   {     1,     0,     2,     3,     4},
C>   {     0,     1,     2,     3,     4}
C> };


D> const(T[][]) permutations(T)(in T[] items) pure nothrow {
D>     const(T[])[] result;
D>
D>     void perms(in T[] s, in T[] prefix=[]) nothrow {
D>         if (s.length)
D>             foreach (i, c; s)
D>                perms(s[0 .. i] ~ s[i+1 .. $], prefix ~ c);
D>         else
D>             result ~= prefix;
D>     }
D>
D>     perms(items);
D>     return result;
D> }
...
D> void main() {
D>     static immutable permsNumber = permutations([EnumMembers!Number]);
D>     static immutable permsColor = cast(immutable(Color[][]))permsNumber;
D>     static immutable permsDrink = cast(immutable(Drink[][]))permsNumber;
D>     static immutable permsSmoke = cast(immutable(Smoke[][]))permsNumber;
D>     static immutable permsPet = cast(immutable(Pet[][]))permsNumber;



Here I have used one of the peculiar features of D, the compile-time function evaluation. The C++ 'perms' array is just an array of all permutations of the [0, 1, 2, 3, 4] numbers. So I have written a little permutations() function (a function as commonly useful as this one, maybe as a lazy range, needs to be present in Phobos. Python has a similar generator in the itertools standard library module) that returns an array of all the permutations.

In D the enums are strongly typed, and the functions of this program work with strongly typed enums to avoid bugs, so I need different permutations of the enum members. Probably a good compiler is able to see all those immutable arrays contain the same bit patterns (despite being of different types), but this program iterates on them many times, so to be sure there is low pressure on the CPU cache, the other arrays are just aliases of the first one. I don't last casts a lot, because they are quite unsafe in D, but here I think they are acceptable.

The D 2D arrays like permsNumber are dynamic arrays of dynamic arrays of 5 enum members, that are represented with ubytes. In the C++ code perms is an array of fixed-sized arrays, this gives a bit higher performance to the C++ code. I have created another D version that solves this problem, but needs to use the arrays in a not D idiomatic way, see below.

---------------------------

C> char
C> possible(char *number, char *colour, char* drink, char *smoke, char *pet)
C> {
C>   // Rules, with the rule number as a comment
C>
C>   // single property rules
C>   if(number != NULL && number[Norvegian] != One) // 9
C>     return FALSE;
C>   if(colour != NULL && colour[British] != Red) // 1
C>     return FALSE;
C>   if(drink != NULL && drink[Danish] != Tea)  // 3
C>     return FALSE;
C>   if(smoke != NULL && smoke[German] != Prince) // 13
C>     return FALSE;
C>   if(pet != NULL && pet[Swedish] != Dog)    // 2
C>     return FALSE;


D> bool isPossible(in Number[] number, in Color[] color, in Drink[] drink,
D>                 in Smoke[] smoke, in Pet[] pet) pure nothrow {
D>     // Rules, with the rule number as a comment
D>
D>     // single property rules
D>     if (!number.empty && number[Nation.Norvegian] != Number.One) // 9
D>         return false;
D>     if (!color.empty && color[Nation.British] != Color.Red)      // 1
D>         return false;
D>     if (!drink.empty && drink[Nation.Danish] != Drink.Tea)       // 3
D>         return false;
D>     if (!smoke.empty && smoke[Nation.German] != Smoke.Prince)    // 13
D>         return false;
D>     if (!pet.empty && pet[Nation.Swedish] != Pet.Dog)            // 2
D>         return false;



Here the stronger typing of the D enums allows to write safer code, with less chance for bugs.

Generally the D function uses const/immutable/pure/nothrow everywhere it's possible, this gives performance and safety.

======================================================

Comments on the second program, second version (einstein2b.d):

This is very similar to einstein2a.d, the main difference comes from the desire to increase the performance avoiding one pointer level using fixed-size arrays to represent the is various permutations.


D> import std.stdio, std.math, std.traits, std.array, std.conv;
D>
D>
D> const(T[N][]) permutationsFixed(T, size_t N)(in T[N] items) pure nothrow {
D>     const(T[N])[] result;
D>     T[N] row;
D>
D>     void perms(in T[] s, in T[] prefix=[]) nothrow {
D>         if (s.length)
D>             foreach (i, c; s)
D>                perms(s[0 .. i] ~ s[i+1 .. $], prefix ~ c);
D>         else {
D>             row[] = prefix[];
D>             result ~= row;
D>         }
D>     }
D>
D>     perms(items);
D>     return result;
D> }



The permutationsFixed() function returns an dynamic array of fixed size arrays.

---------------------------

D> enum Number : uint { One,      Two,     Three,  Four,       Five   };
D> enum Color  : uint { Red,      Green,   Blue,   White,      Yellow };
D> enum Drink  : uint { Milk,     Coffee,  Water,  Beer,       Tea    };
D> enum Smoke  : uint { PallMall, Dunhill, Blend,  BlueMaster, Prince };
D> enum Pet    : uint { Dog,      Cat,     Fish,   Horse,      Bird   };
D> enum Nation : uint { British,  Swedish, Danish, Norvegian,  German };


The enums have uint as base type because it's about or more efficient than ubytes, in this program.

---------------------------

D> bool isPossible(immutable(Number[5])* number,
D>                 immutable(Color[5])* color,
D>                 immutable(Drink[5])* drink,
D>                 immutable(Smoke[5])* smoke,
D>                 immutable(Pet[5])* pet
D>                 ) pure nothrow {
D>     // Rules, with the rule number as a comment
D>
D>     // single property rules
D>     if (number && (*number)[Nation.Norvegian] != Number.One) // 9
D>         return false;
D>     if (color && (*color)[Nation.British] != Color.Red)      // 1
D>         return false;
D>     if (drink && (*drink)[Nation.Danish] != Drink.Tea)       // 3
D>         return false;
D>     if (smoke && (*smoke)[Nation.German] != Smoke.Prince)    // 13
D>         return false;
D>     if (pet && (*pet)[Nation.Swedish] != Pet.Dog)            // 2
D>         return false;
D>
D>     // only check complicated rules, if everything is defined
D>     if (!number || !color || !drink || !smoke || !pet)
D>         return true;
D>
D>     // double property rules
D>     foreach (i; 0 .. 5) {
D>         if ((*color)[i] == Color.Green && (*drink)[i] != Drink.Coffee)    // 5
D>             return false;
D>         if ((*smoke)[i] == Smoke.PallMall && (*pet)[i] != Pet.Bird)       // 6
D>             return false;
...
D> void main() {
...
D>     foreach (ref number; permsNumber)
D>         if (isPossible(&number, null, null, null, null))
D>             foreach (ref color; permsColor)
D>                 if (isPossible(&number, &color, null, null, null))



Normally in D you pass static arrays efficiently using "const ref T[N]", but in this program also nulls must be supported. So I have had to perform some experiments to find a flexible enough but efficient way to give them to the isPossible() function.

So I pass them as pointers to immutable fixed size arrays:

D>                 immutable(Color[5])* color,


They are pointers to arrays (so they are like C/C++/D1 fixed size arrays), so they are efficient, despite this is a not idiomatic way to use arrays in D.

To use the array items you need to deference the pointer first:

(*color)[i]


An alternative syntax, but it's less natural because the first isn't really an array of size 1, it's a pointer to an array:

color[0][i]


======================================================

Some benchmarks:

Runtime timings, seconds:
  einstein1.cpp:  0.01
  einstein1a.d:   0.05
  einstein2.cpp:  0.70
  einstein2a.d:   1.09
  einstein2b.d:   0.76
  einstein2c.py: 19.03 (with Psyco)


DMD 2.057 head
G++ 4.6.1
Python 2.6.6

Celeron 2.3 GHz, 32 bit OS.

---------------------------

Strong static typing is useful, but it often also requires lot of work. In practical D programs I try to find a balance point in the amount of strong static typing I use, to avoid wasting some of my time. Yet, as I get more experienced in programming, I like to express more and more code invariants in statically enforced constraints, like the ones of a static type system, because despite they ask me lot of work, they help me avoid some bugs.

So "Make illegal states unrepresentable", as it's said here regarding OCaML code:
http://ocaml.janestreet.com/?q=node/85

So write your code to turn the illegal states into static errors.

[Go back to the article index]