How WaitGroup in Golang has changed over time


WaitGroup is a concurrency primitive commonly used in Golang to wait for a group of goroutines to finish. It has only a few simple methods and is relatively easy to use. Its internal implementation has changed several times, mainly to optimize the atomic operations on its fields.
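As a reminder of what those few methods look like in practice, here is a minimal usage sketch. The helper name `runWorkers` and the shared `done` counter are made up for illustration; only `Add`, `Done`, and `Wait` come from the `sync` package.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runWorkers (hypothetical helper) starts n goroutines, waits for all of
// them with a WaitGroup, and returns how many of them ran.
func runWorkers(n int) int64 {
	var wg sync.WaitGroup
	var done int64

	for i := 0; i < n; i++ {
		wg.Add(1) // register one unit of work before starting the goroutine
		go func() {
			defer wg.Done() // Done is equivalent to Add(-1)
			atomic.AddInt64(&done, 1)
		}()
	}

	wg.Wait() // blocks until the counter is back to zero
	return atomic.LoadInt64(&done)
}

func main() {
	fmt.Println(runWorkers(5)) // prints 5
}
```

Calling `Add` before the `go` statement, rather than inside the goroutine, ensures the counter is already positive when `Wait` is reached.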

The original implementation of WaitGroup in Golang

The earliest implementation of WaitGroup is as follows.

    type WaitGroup struct {
        m       Mutex
        counter int32
        waiters int32
        sema    *uint32
    }

    func (wg *WaitGroup) Add(delta int) {
        v := atomic.AddInt32(&wg.counter, int32(delta))
        if v < 0 {
            panic("sync: negative WaitGroup count")
        }
        if v > 0 || atomic.LoadInt32(&wg.waiters) == 0 {
            return
        }
        wg.m.Lock()
        for i := int32(0); i < wg.waiters; i++ {
            runtime_Semrelease(wg.sema)
        }
        wg.waiters = 0
        wg.sema = nil
        wg.m.Unlock()
    }

The meaning of each field is clear, but the implementation is still slightly rough; for example, sema is a pointer.

Then the fields counter and waiters were merged into a single 64-bit state. To guarantee the 8-byte alignment that 64-bit atomic operations require, the code has to find the aligned 8 bytes inside state1. sema is no longer a pointer.

    type WaitGroup struct {
        // 64-bit value: high 32 bits are counter, low 32 bits are waiter count.
        // 64-bit atomic operations require 64-bit alignment, but 32-bit
        // compilers do not ensure it. So we allocate 12 bytes and then use
        // the aligned 8 bytes in them as state.
        state1 [12]byte
        sema   uint32
    }

    func (wg *WaitGroup) state() *uint64 {
        if uintptr(unsafe.Pointer(&wg.state1))%8 == 0 {
            return (*uint64)(unsafe.Pointer(&wg.state1))
        } else {
            return (*uint64)(unsafe.Pointer(&wg.state1[4]))
        }
    }

Later, WaitGroup settled on the following implementation, which then remained stable.

    type WaitGroup struct {
        noCopy noCopy

        // 64-bit value: high 32 bits are counter, low 32 bits are waiter count.
        // 64-bit atomic operations require 64-bit alignment, but 32-bit
        // compilers do not ensure it. So we allocate 12 bytes and then use
        // the aligned 8 bytes in them as state, and the other 4 as storage
        // for the sema.
        state1 [3]uint32
    }

    // state returns pointers to the state and sema fields stored within wg.state1.
    func (wg *WaitGroup) state() (statep *uint64, semap *uint32) {
        if uintptr(unsafe.Pointer(&wg.state1))%8 == 0 {
            return (*uint64)(unsafe.Pointer(&wg.state1)), &wg.state1[2]
        } else {
            return (*uint64)(unsafe.Pointer(&wg.state1[1])), &wg.state1[0]
        }
    }

The state and sema fields are combined into a single field, state1, which is an array of three uint32 values (12 bytes in total). Because the array is at least 4-byte aligned, either its first element or its second element sits on an 8-byte boundary. The aligned 8 bytes are used as the state, and the remaining 4 bytes are used as sema.

There is nothing wrong with this implementation; it is just a bit roundabout, because you have to check the alignment of state1 to determine which part holds the counter and waiters and which part is sema.

A question: What is the maximum number of waiters in a WaitGroup?

Changes in Go 1.18

In Go 1.18, WaitGroup was changed again to take advantage of the fact that fields of type uint64 are already aligned to 8 bytes on 64-bit architectures.