Golang interface value: fat pointer again?

Recently, a project required me to learn golang in a short time. IMO, every language with a GC is easy to learn, and golang is no exception.

Comparing golang with Rust

I started my journey with a book, The Go Programming Language, which is well written. I skimmed the book and found that golang is really similar to Rust in terms of syntax, which made my learning curve super flat. Both languages aim to be an alternative to C in the domain of systems programming.

IMO, both want to address 2 problems that C has, with different solutions:

memory safety

Memory leaks and memory access safety (dangling pointers, bad accesses, etc.).

Golang uses GC, so it pays for collection pauses at run time, as Java does (its collector does most of its work concurrently, so the stop-the-world phases are short). Thus golang has a relatively heavy runtime.

Rust uses compile-time lifetime analysis to avoid memory safety issues. Therefore, Rust has a super lightweight runtime, so it can be used for embedded systems programming, which is dominated by C.

Both aim to free programmers from managing bare pointers and memory by hand, so that they can finish their tasks more efficiently.

safe concurrent programming

Golang follows the CSP model, a concurrency paradigm that advocates passing messages over channels instead of sharing state, in a functional programming style. It is well suited to micro-service development.
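As a minimal sketch of that style (the names square and sumOfSquares are made up for illustration), a worker goroutine receives its input and returns its results over channels, so no variable is shared between the two sides:

```go
package main

import "fmt"

// square reads numbers from in, sends their squares on out,
// and closes out once in is drained.
func square(in <-chan int, out chan<- int) {
	for n := range in {
		out <- n * n
	}
	close(out)
}

// sumOfSquares feeds nums to a worker goroutine and sums the replies.
func sumOfSquares(nums []int) int {
	in := make(chan int)
	out := make(chan int)
	go square(in, out)

	// Feed the worker from a second goroutine so that sending and
	// receiving can make progress at the same time.
	go func() {
		for _, n := range nums {
			in <- n
		}
		close(in)
	}()

	total := 0
	for sq := range out {
		total += sq
	}
	return total
}

func main() {
	fmt.Println(sumOfSquares([]int{1, 2, 3})) // 1 + 4 + 9
}
```

The only synchronization here is the channel operations themselves; there is no lock and no shared counter.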

Rust relies on its memory access model (ownership and borrowing) to ensure there's no data race in multi-threaded code. It is still the shared-state concurrent programming model adopted by most sequential programming languages.

go interface value

Other than the above 2 big differences, there are other obvious differences, such as the fantastic pattern matching and generic programming that Rust provides but golang (as of 1.14) doesn't. That's why golang is much simpler. If there were no notion of pointers, golang would be something like Java.

One thing to notice is that golang and Rust are almost the same in advocating interface-oriented programming, unlike the classical class-inheritance style OOP of Java or C++. Rust uses traits and golang uses interfaces.

Quoting from the golang book:

Conceptually, a value of an interface type, or interface value, has 2 components: a concrete type (type descriptor) and a value of that type.

In general, we cannot know at compile time what the dynamic type of an interface value will be, so a call through an interface must use dynamic dispatch. Instead of a direct call, the compiler must generate code to obtain the address of the method from the type descriptor, then make an indirect call to that address.

Sounds familiar? The author only depicts a conceptual model, not the actual implementation. From the above description, it's the same as the Rust fat pointer. Feel free to check my explanation of Rust's fat pointer. The type descriptor is a pointer to a virtual table, and the value is a pointer to the instance of the concrete type that implements the interface. The layout should look like this:

// see the asm below for more
// the compiler names the vtable pointer __methods
// and the object pointer __object
[*__methods, *__object]

Sometimes, people use interface value to denote __object, the pointer to the object. To avoid any confusion, I will use interface object to denote the fat pointer as a whole.
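The two-word shape can be checked from Go itself. This is a small sketch, not a look at the runtime's internals: interfaceWords and dynamicType are helper names invented here, and the only things used are unsafe.Sizeof and the %T verb of fmt.

```go
package main

import (
	"fmt"
	"unsafe"
)

type Foo interface {
	method(x int) int
}

type Bar int

func (b Bar) method(x int) int { return x + int(b) }

// interfaceWords reports how many pointer-sized words a Foo value occupies.
func interfaceWords() uintptr {
	var x Foo
	return unsafe.Sizeof(x) / unsafe.Sizeof(uintptr(0))
}

// dynamicType returns the name of the concrete type stored in x.
func dynamicType(x Foo) string {
	return fmt.Sprintf("%T", x)
}

func main() {
	var x Foo
	fmt.Println(interfaceWords()) // size of the interface object in words
	fmt.Println(dynamicType(x))   // nothing stored yet
	x = Bar(10)
	fmt.Println(dynamicType(x)) // the concrete type now carried by x
}
```

The size stays the same no matter which concrete type is stored, which is what you would expect from a pair of pointers.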

Let's go through a simple example to verify my observation.

package main

type Foo interface {
    method(x int) int
}

type Bar int

func (b Bar) method(x int) int {
    return x + int(b)
}

type Baz int

func (p *Baz) method(x int) int {
    if p == nil {
        return x
    }

    return x + int(*p)
}

func main() {
    var x Foo
    var y Bar = 10

    // pass by value
    x = y
    x.method(4)

    // pass by pointer
    var z Baz = 10
    x = &z
    x.method(4)
}

After compiling it to assembly, we get:

// init the interface value to nil
// var x Foo
mov     QWORD PTR [rbp-400], 0
mov     QWORD PTR [rbp-392], 0

// var y Bar = 10
// x = y
mov     edi, OFFSET FLAT:main.Bar..d
call    runtime.newobject         # ptr_tmp = new(Bar); rax = ptr_tmp
mov     QWORD PTR [rbp-24], rax   # store the value of ptr_tmp to stack
mov     rax, QWORD PTR [rbp-24]   # reload ptr_tmp from stack
mov     rdx, QWORD PTR [rbp-8]    # tmp92, y (y = 10)
mov     QWORD PTR [rax], rdx      # *ptr_tmp = tmp92
mov     rax, QWORD PTR [rbp-24]   # reload ptr_tmp from stack
mov     QWORD PTR [rbp-48], OFFSET FLAT:imt..interface.4.main.method.0func.8int.9.8int.9.5..main.Bar      # x.__methods, the vtable
mov     QWORD PTR [rbp-40], rax   # x.__object = ptr_tmp

As we can see, golang first allocates a new temporary object by calling runtime.newobject and copies the value of y into it, because the object part of the interface (self, or __object) needs to be a pointer, which has a fixed size on the stack.

The call then loads the method address from the vtable in x.__methods:

// x.method(4)
mov     rax, QWORD PTR [rbp-48]     # _4, x.__methods
mov     rdx, QWORD PTR [rax+8]      # _5, _4->method; offset to method. see vtable below
mov     rax, QWORD PTR [rbp-40]     # _6, x.__object
mov     esi, 4                      # argument value: 4
mov     rdi, rax                    # load object address from _6
call    rdx                         # call method address _5
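To make that call sequence concrete, here is a hand-written emulation in Go. The vtable and fatPointer types, and the boxBar and call helpers, are hypothetical; they only mirror the layout seen in the assembly and are not the runtime's actual data structures.

```go
package main

import (
	"fmt"
	"unsafe"
)

type Bar int

func (b Bar) method(x int) int { return x + int(b) }

// vtable is a hypothetical stand-in for the imt.. table:
// a type descriptor slot followed by one slot per interface method.
type vtable struct {
	typeName string
	method   func(object unsafe.Pointer, x int) int
}

// fatPointer mirrors [*__methods, *__object].
type fatPointer struct {
	methods *vtable
	object  unsafe.Pointer
}

// barVtable adapts Bar.method to a call that takes an untyped object pointer.
var barVtable = vtable{
	typeName: "Bar",
	method: func(object unsafe.Pointer, x int) int {
		return (*(*Bar)(object)).method(x)
	},
}

// boxBar copies y into a fresh allocation, as the newobject call does,
// and pairs the resulting pointer with the vtable for Bar.
func boxBar(y Bar) fatPointer {
	tmp := new(Bar)
	*tmp = y
	return fatPointer{methods: &barVtable, object: unsafe.Pointer(tmp)}
}

// call loads the method slot from the table, then makes an indirect call
// with the object pointer as the first argument.
func call(x fatPointer, arg int) int {
	return x.methods.method(x.object, arg)
}

func main() {
	x := boxBar(10)
	fmt.Println(call(x, 4)) // 4 + 10
}
```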

Let’s check the vtable for y of type Bar:

// for normal type: Bar
imt..interface.4.main.method.0func.8int.9.8int.9.5..main.Bar:
        .quad   main.Bar..d
        .quad   main.Bar.method
// for pointer type: *Bar
pimt..interface.4.main.method.0func.8int.9.8int.9.5..main.Bar:
        .quad   type...1main.Bar
        .quad   main.Bar.method

As we can see, golang automatically generates a vtable for the *Bar type too, which means you can also do x = &y.

I bought this golang book 3 years ago, and it's based on golang 1.5. By the time I wrote this blog, the latest golang was 1.14. Note that x = &y is not a recent convenience: the language specification defines the method set of *Bar to include every method declared with receiver Bar, so *Bar satisfies Foo without any explicit implementation for the pointer type. The restriction runs the other way. A method declared with a pointer receiver, like (*Baz).method, is not in the method set of the value type, so x = z won't pass compilation because type Baz doesn't satisfy the interface Foo, whereas type *Baz does.
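Which of the four types satisfy Foo can be checked at run time with a type assertion. In this sketch, satisfiesFoo is a helper introduced for illustration; the types are the ones from the example above.

```go
package main

import "fmt"

type Foo interface {
	method(x int) int
}

type Bar int

func (b Bar) method(x int) int { return x + int(b) }

type Baz int

func (p *Baz) method(x int) int {
	if p == nil {
		return x
	}
	return x + int(*p)
}

// satisfiesFoo reports whether the dynamic type of v implements Foo.
func satisfiesFoo(v interface{}) bool {
	_, ok := v.(Foo)
	return ok
}

func main() {
	var y Bar = 10
	var z Baz = 10
	fmt.Println(satisfiesFoo(y))  // Bar: method has a value receiver
	fmt.Println(satisfiesFoo(&y)) // *Bar: method set includes Bar's methods
	fmt.Println(satisfiesFoo(z))  // Baz: method needs a *Baz receiver
	fmt.Println(satisfiesFoo(&z)) // *Baz: method has a pointer receiver
}
```

Only the Baz value fails the check, because its single method is declared on the pointer type.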

From the above, we can see that golang explicitly creates a copy of y on the heap instead of taking its address on the stack. The reasoning is that when the receiver is passed by value, the method can't mutate the original, so it's okay to create a disposable temporary object on the heap holding a copy of the original value. As a result, the interface object has a known size at compile time: 2 pointers. Otherwise, the __object part could have an arbitrary size. As Professor Edwards, who taught me PLT at Columbia University, said, the philosophy of another level of indirection is pervasive, and most problems in computer engineering can be solved by it; the renowned LLVM is one example.
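The copy is observable from ordinary Go code. A minimal sketch, where callAfterReassign is a name made up for this example: once y has been stored in the interface, later writes to y no longer affect what the interface holds.

```go
package main

import "fmt"

type Foo interface {
	method(x int) int
}

type Bar int

func (b Bar) method(x int) int { return x + int(b) }

// callAfterReassign stores y in an interface, changes y afterwards,
// and returns the results of calling through the interface and through y.
func callAfterReassign() (viaInterface, viaValue int) {
	var y Bar = 10
	var x Foo = y // x holds its own copy of y
	y = 20        // changes y only, not the copy inside x
	return x.method(4), y.method(4)
}

func main() {
	viaInterface, viaValue := callAfterReassign()
	fmt.Println(viaInterface) // computed from the copy taken at assignment
	fmt.Println(viaValue)     // computed from the updated y
}
```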

However, that's forbidden in Rust since it doesn't have a GC. In Rust, you must explicitly create the trait object (fat pointer) from a reference (pointer).

// assumes a trait Foo that is implemented for i32
let y = 10;
// passing by reference explicitly
let x: &dyn Foo = &y;

When you pass a pointer to the interface object, things are quite simple.

// var z Baz = 10
mov     edi, OFFSET FLAT:main.Baz..d
call    runtime.newobject
mov     QWORD PTR [rax], 10         # rax is the pointer to newly allocated memory
mov     QWORD PTR [rbp-16], rax     # store rax, i.e to z, a pointer to int(10)

// x = &z
mov     QWORD PTR [rbp-48], OFFSET FLAT:pimt..interface.4.main.method.0func.8int.9.8int.9.5..main.Baz     # x.__methods,
mov     rax, QWORD PTR [rbp-16]     # tmp93, z
mov     QWORD PTR [rbp-40], rax     # x.__object, tmp93

// x.method(4)
mov     rax, QWORD PTR [rbp-48]     # _7, x.__methods
mov     rdx, QWORD PTR [rax+8]      # _8, _7->method
mov     rax, QWORD PTR [rbp-40]     # _9, x.__object
mov     esi, 4
mov     rdi, rax                    #, _9
call    rdx                         # _8

When a pointer is assigned to the interface object, the address of the object goes into x.__object directly, without allocating a temporary copy. Here z itself was created on the heap, and that's why there's no extra temp object.
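Two consequences of storing the pointer itself can be seen from Go code. In this sketch (callAfterMutation and nilInside are illustrative names), a write to z after the assignment is visible through x, and an interface holding a nil *Baz is itself not nil, since its type half is still set. That second case is what the p == nil check in the method is for.

```go
package main

import "fmt"

type Foo interface {
	method(x int) int
}

type Baz int

func (p *Baz) method(x int) int {
	if p == nil {
		return x
	}
	return x + int(*p)
}

// callAfterMutation stores &z in an interface, then changes z.
// The interface shares z, so the call sees the new value.
func callAfterMutation() int {
	var z Baz = 10
	var x Foo = &z
	z = 20
	return x.method(4)
}

// nilInside stores a nil *Baz in an interface and reports whether the
// interface compares equal to nil, plus the result of calling through it.
func nilInside() (isNil bool, result int) {
	var p *Baz
	var x Foo = p
	return x == nil, x.method(4)
}

func main() {
	fmt.Println(callAfterMutation())
	fmt.Println(nilInside())
}
```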
