Some D gotchas for Python programmers

By leonardo maffi
V.1.11, Mar 30 2014
keywords: programming, D language, Python


The D language (http://www.digitalmars.com/d/ ) version 2 is a nice language that shares some similarities with Python, but because it is statically compiled there are also many differences. This is a list of some of the differences that may cause bugs when porting Python code to D. It is only a partial list of the most common problems; others can be found.

All tests here are performed with Python 2.6.6 on Windows, and the D 2 compiler DMD 2.051 (with warnings on, and not release mode).

--------------------

In Python all data is managed by name, which is similar to by reference, but in D some kinds of data are values. This is a common source of problems when porting Python code to D.

In D dynamic arrays aren't references: currently they are a small 2-word struct that contains the length of the array plus a pointer to the start of the data. This simple fact may cause several kinds of bugs.

The modifyArray() function of the program below modifies the length of arr and changes its first item to 20, but after the function call the length of a is unchanged, and even its first item is unchanged:
void modifyArray(int[] arr) {
    arr.length += 1000;
    arr[0] = 20;
}

void main() {
    int[] a = [10];
    modifyArray(a);
    assert(a.length == 1);
    assert(a[0] == 10);
}

The length increase inside modifyArray has not changed the original array because it has caused a reallocation, so the value 20 has modified the contents of the new allocation only. In such situations, to avoid bugs when porting Python code to D, you need to pass arr by reference, using the 'ref' keyword.

In D2 fixed-size arrays are passed to functions by value, so you often have to pass them by reference, especially if the array is large.
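
The by-reference fix for the modifyArray() example can be sketched like this. setFirst() is a hypothetical function added here only to show the same 'ref' on a fixed-size array:

```d
// With 'ref' the function works on the caller's slice (pointer + length),
// so both the new length and the new first item are seen by the caller.
void modifyArray(ref int[] arr) {
    arr.length += 1000;
    arr[0] = 20;
}

// Hypothetical example: without 'ref' the whole int[3] would be copied,
// and the change would be applied to the copy only.
void setFirst(ref int[3] arr, int value) {
    arr[0] = value;
}

void main() {
    int[] a = [10];
    modifyArray(a);
    assert(a.length == 1001);
    assert(a[0] == 20);

    int[3] b = [1, 2, 3];
    setFirst(b, 7);
    assert(b[0] == 7);
}
```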

--------------------

A related problem can be seen while translating this Python code:
d = {1:[2, 3]}
l = d[1]
l2 = l.pop(0)
print d, "", l, "", l2

It outputs:
{1: [3]}  [3]  2

While an apparently similar D2 program:
import std.stdio;
import std.array: popFront;
void main() {
    auto d = [1:[2, 3]];
    auto l = d[1];
    auto l2 = l[0];
    l.popFront();
    writeln(d, "  ", l, "  ", l2);
}

outputs:
[1:[2, 3]]  [3]  2

D dynamic arrays are partially a value and partially a reference. Here 'l' is a copy of the 2-word struct stored inside the associative array 'd', so popFront() changes only that copy, and the 'value' of the dynamic array inside 'd' doesn't change. This may cause bugs.
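
One way to get the Python behaviour is to work on the slice stored inside the associative array, through the pointer returned by the 'in' operator. This is only a sketch, and popFirst() is a hypothetical helper name:

```d
// Hypothetical helper: removes and returns the first item of the
// array stored under 'key', changing the slice kept inside 'd'.
int popFirst(int[][int] d, int key) {
    int[]* p = key in d; // pointer to the slice stored in d, null if key is missing
    assert(p !is null && (*p).length > 0);
    immutable first = (*p)[0];
    *p = (*p)[1 .. $];   // shrinks the stored slice, not a local copy of it
    return first;
}

void main() {
    auto d = [1: [2, 3]];
    assert(popFirst(d, 1) == 2);
    assert(d[1] == [3]);
}
```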

--------------------

In Python an empty collection is false, but this convention doesn't exist in D, so if you translate this Python code:
a = [1, 2]
if a: ...

To this D2 code:
auto a = [1, 2];
if (a) { ...

you may introduce bugs, and the D2 compiler doesn't warn you at all.

There are some ways to write that correctly in D, like:

if (a.length) { ...

Or:

if (!a.empty()) { ...

Where empty() is from the std.array module.

This D2 program shows where bugs may come from:
void main() {
    int[] a1 = [1, 2];
    assert(!!a1);

    int[] a2;
    assert(!a2);

    a1.length = 0;
    assert(!!a1);
}

In the first two asserts D behaves like Python, but in the third case 'a1' is true because, although its length is zero, its pointer to memory is not null: it still points to the start of the memory that contains the integer values 1 and 2.

--------------------

Python has no structs; tuples and class instances are managed by name. D structs are handy and efficient, but you need to be careful:
struct S { int x; }
void main() {
    S[] array = [S(1), S(2)];
    foreach (item; array)
        item.x *= 10;
    assert(array[0].x == 1);
}

'item' in the foreach is a copy of the current item of the array, so the original structs don't change.

To avoid this you have to use objects, use pointers, or just iterate by reference:

foreach (ref item; array)

This is a common cause of bugs even for experienced D programmers, so be careful about it.
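
A complete version of the by-reference loop may look like this (scaleAll() is a hypothetical name used only for this sketch):

```d
struct S { int x; }

// 'ref' makes 'item' an alias of the array item instead of a copy of it.
void scaleAll(S[] array, int factor) {
    foreach (ref item; array)
        item.x *= factor;
}

void main() {
    S[] array = [S(1), S(2)];
    scaleAll(array, 10);
    assert(array[0].x == 10);
    assert(array[1].x == 20);
}
```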

--------------------

In Python slices are "saturating": when the required range is too wide they take as many items as possible, with no errors:
s = "abcdefg"
upper = 20
s2 = s[2 : upper]
assert s2 == "cdefg"

But D slices don't saturate (maybe for performance) and cause a run-time range violation, so you must be careful to use correct bounds (in release mode wrong bounds may cause even more problems).

This means that in some situations you are forced to use min() or max() to be sure your bounds are correct ($ means the length of the array). This is not handy, makes the code longer, and may be a source of D bugs:
import std.algorithm: min;
void main() {
    string s = "abcdefg";
    int upper = 20;
    string s2 = s[2 .. min($, upper)];
    assert(s2 == "cdefg");
}
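
If this is needed in many places, the min() calls can be put inside a small function. The following saturatingSlice() is a hypothetical helper, not something from Phobos, and it handles only non-negative bounds:

```d
import std.algorithm: min;

// Hypothetical helper: like the Python slice arr[lo : hi] for
// non-negative bounds, it clamps both bounds instead of failing.
T[] saturatingSlice(T)(T[] arr, size_t lo, size_t hi) {
    immutable stop = min(hi, arr.length);
    immutable start = min(lo, stop);
    return arr[start .. stop];
}

void main() {
    string s = "abcdefg";
    assert(saturatingSlice(s, 2, 20) == "cdefg");
    assert(saturatingSlice(s, 10, 20) == "");
    assert(saturatingSlice(s, 5, 3) == "");
}
```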

--------------------

Python indexes may be negative, to wrap around:
s = "abcdefg"
assert s[-2] == 'f'

But D doesn't allow that; you need to use $:
auto s = "abcdefg";
assert(s[$ - 2] == 'f');

Here the -2 is a constant known at compile-time. If it's a run-time variable then you need more complex code, like:
void main() {
    auto s = "abcdefg";

    int index = -2;
    char c = (index < 0) ? s[$ + index] : s[index];
    assert(c == 'f');

    index = 3;
    c = (index < 0) ? s[$ + index] : s[index];
    assert(c == 'd');
}

And you need to be careful to use a signed value for the index.
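
The run-time test can be moved into a function that takes a signed index. pyIndex() is a hypothetical name, and this is only a sketch of the idea:

```d
// Hypothetical helper: converts a Python-style index (negative values
// count from the end) to a normal D index. The index must be signed.
size_t pyIndex(ptrdiff_t index, size_t length) {
    if (index < 0)
        return length - cast(size_t)(-index);
    return cast(size_t)index;
}

void main() {
    auto s = "abcdefg";
    assert(s[pyIndex(-2, s.length)] == 'f');
    assert(s[pyIndex(3, s.length)] == 'd');
}
```

An index that is still out of range after the conversion causes the usual range violation when it is used.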

--------------------

In Python the length of a list is a normal (signed) integer, but currently in D it is an unsigned value. Combined with the implicit conversion rules that D has inherited from the C language, plus the lack of warnings about signed-unsigned comparisons, this may cause a lot of "wonderful" bugs:
void main() {
    int index = -3;
    int[] arr = [1, 2, 3];
    assert(index > arr.length);
}

So far D2 has no simple way to avoid this kind of bug. Among the basic simple bugs this is one of the most common in my D code.
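
A manual workaround is to cast the length to a signed type before the comparison. This is a sketch, isValidIndex() is a hypothetical helper, and the cast assumes the length fits in a ptrdiff_t:

```d
// Hypothetical helper: the comparison is done between signed values,
// so a negative index is not silently converted to a huge unsigned one.
bool isValidIndex(int index, const int[] arr) {
    return index >= 0 && index < cast(ptrdiff_t)arr.length;
}

void main() {
    int index = -3;
    int[] arr = [1, 2, 3];
    assert(index < cast(ptrdiff_t)arr.length); // signed comparison
    assert(!isValidIndex(index, arr));
    assert(isValidIndex(2, arr));
    assert(!isValidIndex(3, arr));
}
```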

--------------------

In D V.2 strings are immutable, but it is only a partial immutability. In D a truly immutable string is:
immutable(char[])

But D strings are defined as:
immutable(char)[]

So the array struct itself (its pointer and its length) may change, and you are allowed to do this (but usually it is not a good idea):
void main() {
    string s = "hello".dup.idup;
    s.length += 1;
}
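
The difference between the two types can be checked at compile time with __traits(compiles). In this sketch shout() is a hypothetical function:

```d
// Appending is allowed on a string: the slice 's' is rebound to a new,
// longer array, while the chars of the original are never modified.
string shout(string s) {
    s ~= "!";
    return s;
}

void main() {
    string a = "hello";
    assert(shout(a) == "hello!");
    assert(a == "hello"); // the caller's string is unchanged

    immutable(char[]) t = "hello";
    // The fully immutable array can't be appended to in place:
    static assert(!__traits(compiles, t ~= "!"));
    // And the single chars of a normal string can't be assigned:
    static assert(!__traits(compiles, a[0] = 'H'));
}
```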

--------------------

In Python the list.index() method raises an exception if the item is missing, but currently (DMD V. 2.048) in D the std.algorithm.indexOf() function returns -1 if the item is missing. This may cause bugs:
import std.algorithm: indexOf;
void main() {
    int[] arr = [1, 2, 3];
    int x = 3;
    int y = 5;
    assert(arr.indexOf(x) > arr.indexOf(y));
}

While similar code in Python raises a ValueError and shows the stack trace:
arr = [1, 2, 3]
x = 3
y = 5
assert arr.index(x) > arr.index(y)

The Python output:
Traceback (most recent call last):
  File "...\test.d", line 4, in <module>
    
    assert arr.index(x) > arr.index(y)
ValueError: 5 is not in list

Also be careful in mixing signed and unsigned array indexes in expressions, to avoid silent casts to unsigned.
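
If the Python behaviour is preferred, the -1 can be turned into an exception by a small wrapper. This sketch assumes a more recent Phobos, where the search function is named std.algorithm.countUntil (it returns -1 too when the item is missing). indexOrThrow() is a hypothetical name:

```d
import std.algorithm: countUntil;
import std.exception: enforce, assertThrown;

// Hypothetical helper: like Python list.index(), it throws if the
// item is missing instead of returning -1.
size_t indexOrThrow(T)(T[] arr, T item) {
    immutable pos = arr.countUntil(item);
    enforce(pos >= 0, "item is not in the array");
    return cast(size_t)pos;
}

void main() {
    int[] arr = [1, 2, 3];
    assert(indexOrThrow(arr, 3) == 2);
    assertThrown(indexOrThrow(arr, 5)); // a missing item throws an Exception
}
```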

-------------------

In D array indexes and lengths are of type size_t, which is one CPU word long. So when you port 32-bit D code to 64 bit (DMD for 64 bit is in the works) these assignments will throw away half of the bits (and the compiler will complain):
uint n1 = someArray1.length;
int n2 = someArray2.length;

So it's good practice (until and unless D array indexes and lengths become something else) to use a size_t even in 32 bit code:
size_t n = someArray.length;
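
When an int is really needed (for example to mix the length with signed values), a checked conversion is safer than a cast. This is a sketch that uses std.conv.to, which throws an exception if the value doesn't fit; checkedLength() is a hypothetical helper:

```d
import std.conv: to;

// Hypothetical helper: returns the length as an int, throwing if the
// length is too large for an int instead of silently losing bits.
int checkedLength(T)(T[] arr) {
    return arr.length.to!int;
}

void main() {
    int[] someArray = [1, 2, 3];
    size_t n = someArray.length; // always correct
    int m = checkedLength(someArray);
    assert(n == 3);
    assert(m == 3);
}
```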

-------------------

Currently D typesafe variadic arguments don't copy their data to the heap, so inside this() you need to dup 'data':
class SomeClass {
    this(int[] data...) {
        array = data; // needs a dup
    }
    int[] array;
}

SomeClass someFunction() {
    return new SomeClass(30, 20, 10); // passes stack data to SomeClass
}

void main() {
    assert(someFunction().array == [30, 20, 10]); // this assert may fail
}
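
With the dup added, the same program may be written like this:

```d
class SomeClass {
    int[] array;

    this(int[] data...) {
        // 'data' may refer to the caller's stack, so keep a heap copy of it
        array = data.dup;
    }
}

SomeClass someFunction() {
    return new SomeClass(30, 20, 10);
}

void main() {
    assert(someFunction().array == [30, 20, 10]);
}
```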

-------------------

The syntax to define and instantiate arrays in D2 is a little tricky, also because there are two main kinds of arrays, the fixed-size ones and the dynamic ones (the dynamic ones are similar to Python array.array). Here are some examples:
void main() {
    enum int n = 5;

    auto arr01 = new int[n];
    pragma(msg, "arr01: " ~ typeof(arr01).stringof); // int[]

    auto arr02 = new int[](n);
    pragma(msg, "arr02: " ~ typeof(arr02).stringof); // int[]

    int[][] arr03;
    arr03.length = n;
    foreach (i; 0 .. n)
        arr03[i].length = n;
    pragma(msg, "arr03: " ~ typeof(arr03).stringof); // int[][]

    auto arr04 = new int[][n];
    foreach (ref row; arr04)
        row = new int[n];
    pragma(msg, "arr04: " ~ typeof(arr04).stringof); // int[][]
    pragma(msg, "arr04[0]: " ~ typeof(arr04[0]).stringof); // int[]


    auto arr05 = new int[][n];
    pragma(msg, "arr05: " ~ typeof(arr05).stringof); // int[][]
    pragma(msg, "arr05[0]: " ~ typeof(arr05[0]).stringof); // int[]

//    int[][] arr06 = new int[n][]; // error

//    int[][] arr07 = new int[n][n]; // error

    int[n][n] arr08;
    pragma(msg, "arr08: " ~ typeof(arr08).stringof); // int[5u][5u]

    auto arr09 = new int[][](n, n);
    pragma(msg, "arr09: " ~ typeof(arr09).stringof); // int[][]

    alias int[n] StaticArr;
    auto arr10 = new StaticArr[n];
    pragma(msg, "arr10: " ~ typeof(arr10).stringof); // int[5u][]
    pragma(msg, "arr10[0]: " ~ typeof(arr10[0]).stringof); // int[5u]

    auto arr11 = new int[n][](n);
    pragma(msg, "arr11: " ~ typeof(arr11).stringof); // int[5u][]
    pragma(msg, "arr11[0]: " ~ typeof(arr11[0]).stringof); // int[5u]

    int[][n] arr12;
    foreach (ref row; arr12)
        row = new int[n];
    pragma(msg, "arr12: " ~ typeof(arr12).stringof); // int[][5u]
    pragma(msg, "arr12[0]: " ~ typeof(arr12[0]).stringof); // int[]

//    auto arr13 = new int[][n][](n, n); // error

    auto arr14 = new typeof(arr12)[n];
    pragma(msg, "arr14: " ~ typeof(arr14).stringof); // int[][5u][]
    pragma(msg, "arr14[0]: " ~ typeof(arr14[0]).stringof); // int[][5u]
    pragma(msg, "arr14[0][0]: " ~ typeof(arr14[0][0]).stringof); // int[]
}
-------------------

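Division and modulus are different for negative operands. In Python integer division rounds toward negative infinity and the result of the modulus has the sign of the divisor:
>>> -1 / 10
-1
>>> -1 % 10
9

While in D, as in C, integer division truncates toward zero and the remainder has the sign of the dividend: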
void main() {
    assert((-1 / 10) == 0);
    assert((-1 % 10) == -1);
}
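
When the ported code relies on the Python results, they can be emulated with two small functions. floorDiv() and floorMod() are hypothetical helpers written for this sketch; they are not part of Phobos:

```d
// Hypothetical helper: integer division that rounds toward negative
// infinity, like the Python integer division.
int floorDiv(int a, int b) {
    int q = a / b;
    // D has truncated toward zero: fix the inexact results with a negative sign
    if (a % b != 0 && ((a < 0) != (b < 0)))
        q--;
    return q;
}

// Hypothetical helper: modulus with the sign of the divisor, like Python %.
int floorMod(int a, int b) {
    int r = a % b;
    if (r != 0 && ((r < 0) != (b < 0)))
        r += b;
    return r;
}

void main() {
    assert(floorDiv(-1, 10) == -1);
    assert(floorMod(-1, 10) == 9);
    assert(floorDiv(7, -2) == -4);
    assert(floorMod(7, -2) == -1);
    assert(floorDiv(7, 2) == 3);
    assert(floorMod(7, 2) == 1);
}
```
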
-------------------

Currently D1/D2 associative arrays have one important trap:
void test(int[int] arraya, int x) {
    arraya[x] = x;
}

void main() {
    int[int] d;
    test(d, 0);
    int[int] d0;
    assert(d == d0); // d is empty, 0:0 is lost

    d[1] = 1;
    test(d, 2);
    assert(d == [1: 1, 2: 2]); // now 2:2 is not lost
}
A similar Python program acts differently, in a more sane way:
def test(arraya, x):
    arraya[x] = x

def main():
    d = {}
    test(d, 0)
    assert d == {0: 0}

    d[1] = 1
    test(d, 2)
    assert d == {0: 0, 1: 1, 2: 2}

main()
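D associative arrays act like reference types, but an associative array that has just been defined is like a null pointer passed by value: the first insertion inside the function allocates the array, and the caller doesn't see that allocation.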
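
Passing the associative array by reference avoids the trap, because the caller's variable is updated even when the first insertion allocates the array. A sketch, where insert() is a hypothetical name:

```d
// With 'ref' the function changes the caller's variable, so an
// insertion into a still empty (null) associative array is not lost.
void insert(ref int[int] arraya, int x) {
    arraya[x] = x;
}

void main() {
    int[int] d;
    insert(d, 0);
    assert(d == [0: 0]); // 0:0 is kept

    d[1] = 1;
    insert(d, 2);
    assert(d == [0: 0, 1: 1, 2: 2]);
}
```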

-------------------

Python has tuples, but it doesn't have function overloading. In D there is function overloading and two kinds of tuples, one of them ("typetuples") is built-in. But (4, 5) in D is not a tuple: it's a C-style comma expression, whose value is its last item, 5. So a Python programmer needs to keep in mind the output of this D2 program:
import std.stdio;
void foo(int i) { writeln("A ", i); }
void foo(int i, int j) { writeln("B ", i, " ", j); }
void main() {
    foo(4, 5);   // Prints: B 4 5
    foo((4, 5)); // Prints: A 5
}
-------------------

D integral numbers have a fixed size, they aren't multi-precision as in Python. This is a source of bad bugs in D. It means the following TERA1 can't store the desired result, and the D compiler gives no error message to warn you. TERA1 is equal to 0:
int TERA1 = 1024 * 1024 * 1024 * 1024;

Using auto, hoping the result will be stored in a long (a 64 bit value), will not solve the problem, because these integral literals are ints and there is no automatic promotion to long. So TERA2 is equal to 0:
auto TERA2 = 1024 * 1024 * 1024 * 1024;

This looks better: TERA3 is a long, and a long is able to store the desired result, but the computation is still performed with signed 32 bit ints, so TERA3 too is 0:
long TERA3 = 1024 * 1024 * 1024 * 1024;

If you want a correct result (hoping it will not go past the signed 64 bit precision) you need to use long literals too and store their result in a long. TERA4 is the desired 1_099_511_627_776:
long TERA4 = 1024L * 1024L * 1024L * 1024L;

The computation produces a long, so using auto produces the same correct result:
auto TERA5 = 1024L * 1024L * 1024L * 1024L;
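
When the operands are known only at run-time, the overflow can at least be detected. This sketch assumes a druntime that contains the core.checkedint module (it is missing in the older compilers); product() is a hypothetical helper:

```d
import core.checkedint: muls;

// Hypothetical helper: multiplies the items using 32 bit ints.
// 'overflow' is set to true if any multiplication overflows;
// muls() never resets it to false.
int product(const int[] factors, ref bool overflow) {
    int result = 1;
    foreach (f; factors)
        result = muls(result, f, overflow);
    return result;
}

void main() {
    bool overflow = false;
    assert(product([1024, 1024, 1024], overflow) == 1_073_741_824);
    assert(!overflow);

    product([1024, 1024, 1024, 1024], overflow);
    assert(overflow); // 2 ^^ 40 doesn't fit in an int
}
```
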
-------------------

In Python the integer 10 to the power -1 gives 0.1:
>>> x = 1
>>> 10 ** (x - 2)
0.10000000000000001
While in D it gives an "Integer Divide by Zero" error:
import std.stdio, std.math;
void main() {
    int x = 1;
    writeln(10 ^^ (x - 2));
}
To avoid this the base needs to be a floating point:
import std.stdio, std.math;
void main() {
    int x = 1;
    writeln(10.0 ^^ (x - 2));
}
-------------------

It's better to avoid mutable default arguments in Python. If you use them be aware that D acts in a better but different way:

def foo(a = [1, 2]):
    a[0] += 10
    print a
foo()
foo()

Output:

[11, 2]
[21, 2]

While the similar D2 code:

import std.stdio;
void foo(int[] a = [1, 2]) {
    a[0] += 10;
    a.writeln;
}
void main() {
    foo;
    foo;
}

Output:

[11, 2]
[11, 2]
-------------------

Inner functions in Python can use variables and functions regardless of their definition order:

def foo():
    def bar():
        spam()
    def spam():
        print x
    x = 10
    bar()
foo()

Output:

10

While similar code is not allowed in D:

import std.stdio;
void foo() {
    void bar() {
        spam;
    }
    void spam() {
        x.writeln;
    }
    int x = 10;
    bar;
}
void main() {
    foo;
}

Output:

test.d(4,9): Error: undefined identifier spam
test.d(7,9): Error: undefined identifier x

You have to reorder the names and definitions:

import std.stdio;
void foo() {
    int x = 10;
    void spam() {
        x.writeln;
    }
    void bar() {
        spam;
    }
    bar;
}
void main() {
    foo;
}

In some cases a static struct helps (like when you have mutually recursive functions):

void foo() {
    static struct Namespace {
        static void bar() {
            Namespace.spam;
        }
        static void spam() {
        }
    }
    Namespace.bar;
}
void main() {
    foo;
}

Another solution is to declare an empty delegate before the first function, and to assign the second function to it after both functions are defined.
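
A sketch of the delegate solution, with two mutually recursive nested functions (all the names here are hypothetical):

```d
bool isEven(uint n) {
    // Empty delegate declared before its first use, assigned below
    bool delegate(uint) isOddDg;

    bool isEvenImpl(uint k) {
        return (k == 0) ? true : isOddDg(k - 1);
    }

    bool isOddImpl(uint k) {
        return (k == 0) ? false : isEvenImpl(k - 1);
    }

    isOddDg = &isOddImpl; // now isEvenImpl can reach isOddImpl
    return isEvenImpl(n);
}

void main() {
    assert(isEven(10));
    assert(!isEven(7));
}
```

Calling isEvenImpl before the assignment would call a null delegate, so the assignment has to come before the first call.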

-------------------

Update Oct 18 2010: fixed typos and the like. Added size_t to represent array lengths.
Update Nov 13 2010: added typesafe variadic arguments.
Update Nov 13 2010: added dynamic/static arrays syntax.
Update Jan 2 2011: added division and modulus differences.
Update Feb 1 2011: added a trap of D associative arrays.
Update Feb 12 2011: added trap with tuples.
Update Mar 6 2011: added ugly trap with long integers.
Update Jul 7 2012: added power of negative integer number.
Update Oct 5 2012: one more note on foreach of structs.
Update Jul 1 2012: updated textual output of D associative array.
Update Mar 30 2013: added mutable default arguments in Python code, and scoping troubles for nested functions.
