Table of Contents
- Memory Layout and Pointers
- Slices - Internal Implementation
- Maps - Swiss Tables in Go 1.24
- Garbage Collector
- String Optimizations
- Interface Optimizations
- Write Barriers
- Escape Analysis
- Memory Ordering & sync.atomic
- Memory Alignment & Struct Optimization
- Channel Internals & Scheduler
- Runtime Optimizations
- Practical Solutions
1. Memory Layout and Pointers
Stack vs Heap Allocation
// stack_heap.go
package main
import "strings"
func stackExample() int {
x := 42 // allocated on stack
return x
}
func heapExample() *int {
x := 42 // escapes to heap
return &x
}
// Inefficient: multiple allocations
func badStringConcat(strs []string) string {
result := ""
for _, s := range strs {
result += s // creates new string each time
}
return result
}
// Efficient: single allocation
func goodStringConcat(strs []string) string {
var builder strings.Builder
builder.Grow(len(strs) * 10) // pre-allocate
for _, s := range strs {
builder.WriteString(s)
}
return builder.String()
}
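A minimal benchmark sketch (file name and inputs are illustrative) that could be used to compare the two approaches with go test -bench=.:
// concat_bench_test.go (illustrative; assumes the two functions above)
package main

import "testing"

var input = []string{"alpha", "beta", "gamma", "delta", "epsilon"}

func BenchmarkBadConcat(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = badStringConcat(input)
	}
}

func BenchmarkGoodConcat(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = goodStringConcat(input)
	}
}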
Escape Analysis Examples
// escapes.go
package main
// Does NOT escape to heap
func noEscape() {
x := make([]int, 100)
_ = x[0]
}
// Escapes to heap
func escapes() []int {
x := make([]int, 100)
return x // returns reference
}
// Escapes due to interface{}
func interfaceEscape() interface{} {
x := 42
return x // boxing to interface{}
}
2. Slices - Internal Implementation
Slice Header Structure
// slice_internals.go
package main
import (
"fmt"
"unsafe"
)
type SliceHeader struct {
Data uintptr
Len int
Cap int
}
func sliceInternals() {
s := []int{1, 2, 3, 4, 5}
header := (*SliceHeader)(unsafe.Pointer(&s))
fmt.Printf("Data: %v, Len: %d, Cap: %d\n",
header.Data, header.Len, header.Cap)
}
// Dangerous shared backing array
func dangerousSlicing() {
original := []int{1, 2, 3, 4, 5}
slice1 := original[:2] // [1, 2]
slice2 := original[2:] // [3, 4, 5]
slice1 = append(slice1, 999) // OVERWRITES slice2[0]!
fmt.Println(slice2) // [999, 4, 5] - surprise!
}
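One way to avoid this surprise is the three-index (full) slice expression: capping the capacity forces append to allocate a new backing array instead of growing into the shared one. A minimal sketch:
// safe_slicing.go
package main

import "fmt"

func safeSlicing() {
	original := []int{1, 2, 3, 4, 5}
	// The third index caps slice1's capacity at 2, so append
	// cannot grow into original's backing array.
	slice1 := original[:2:2]
	slice2 := original[2:]
	slice1 = append(slice1, 999) // allocates a new array instead of overwriting
	fmt.Println(slice1)          // [1 2 999]
	fmt.Println(slice2)          // [3 4 5] - unchanged
}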
Memory Leaks with Slices
// slice_leaks.go
package main
// BAD: memory leak
func badSubslice(data []byte) []byte {
// Keeps reference to entire array!
return data[100:110]
}
// GOOD: copy needed data
func goodSubslice(data []byte) []byte {
result := make([]byte, 10)
copy(result, data[100:110])
return result
}
// Efficient append
func efficientAppend() {
// Pre-allocate with known capacity
good := make([]int, 0, 1000)
for i := 0; i < 1000; i++ {
good = append(good, i)
}
}

Nil vs Empty Slices
// nil_slices.go
package main

import "fmt"
func nilVsEmpty() {
var nilSlice []int
emptySlice := make([]int, 0)
literalEmpty := []int{}
fmt.Printf("nil: len=%d, cap=%d, isNil=%v\n",
len(nilSlice), cap(nilSlice), nilSlice == nil)
fmt.Printf("make: len=%d, cap=%d, isNil=%v\n",
len(emptySlice), cap(emptySlice), emptySlice == nil)
fmt.Printf("literal: len=%d, cap=%d, isNil=%v\n",
len(literalEmpty), cap(literalEmpty), literalEmpty == nil)
}
3. Maps - Swiss Tables in Go 1.24
New Swiss Tables Implementation
// swiss_tables.go
package main
// Go 1.24: Swiss Tables instead of buckets + chaining
// Groups of 8 key-value pairs + control word (metadata)
// Conceptual Swiss Table structure:
type SwissGroup struct {
control [8]byte // control word: 7-bit hash + status
keys [8]Key // keys
values [8]Value // values
}
// Control byte contains:
// - 7 bits: h2 (lower hash bits)
// - 1 bit: status (empty/occupied/deleted)
// Pseudocode lookup (hashKey, findGroup, matchControlWord are illustrative helpers):
func swissTableLookup(key string) (value interface{}, ok bool) {
hash := hashKey(key)
h1 := hash >> 7 // upper bits - group selection
h2 := hash & 0x7F // lower 7 bits - for control word
group := findGroup(h1)
	// SIMD comparison of all 8 control bytes simultaneously!
	matches := group.matchControlWord(h2) // candidate slot indices
	for _, slot := range matches {
if group.keys[slot] == key {
return group.values[slot], true
}
}
return nil, false
}
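The map API itself is unchanged; from user code the main lever is still the size hint to make, which lets the runtime size the table up front instead of growing it incrementally while you insert. A small sketch:
// map_presize.go
package main

func buildIndex(words []string) map[string]int {
	// A size hint lets the runtime allocate enough groups once,
	// avoiding repeated growth during the loop below.
	index := make(map[string]int, len(words))
	for i, w := range words {
		index[w] = i
	}
	return index
}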
Swiss Tables vs Old Implementation
// comparison.go
package main
// OLD IMPLEMENTATION (before Go 1.24):
// ❌ Buckets (8 entries) + overflow chaining
// ❌ Cache misses due to pointer chasing
// ❌ Memory fragmentation
// ❌ Slow growth (rehash entire map)
// NEW IMPLEMENTATION (Go 1.24 Swiss Tables):
// ✅ Groups of 8 with control word
// ✅ Cache-friendly linear layout
// ✅ SIMD for parallel comparison
// ✅ Incremental growth
// ✅ Better load factor
func performanceComparison() {
// Swiss Tables improvements:
// - Lookup: ~20-30% faster
// - Insert: ~15-25% faster
// - Memory usage: ~10-15% less
// - Cache miss rate: significantly lower
}
Thread Safety Remains Unchanged
// map_safety.go
package main
import "sync"
// Maps are STILL NOT thread-safe!
func racySwissMap() {
m := make(map[int]int) // Swiss Table, but race conditions remain
go func() {
for i := 0; i < 1000; i++ {
m[i] = i // RACE!
}
}()
go func() {
for i := 0; i < 1000; i++ {
_ = m[i] // RACE!
}
}()
}
// Use sync.Map for concurrent access
func safeConcurrentMap() {
var m sync.Map
go func() {
for i := 0; i < 1000; i++ {
m.Store(i, i)
}
}()
go func() {
for i := 0; i < 1000; i++ {
if val, ok := m.Load(i); ok {
_ = val
}
}
}()
}
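sync.Map is tuned for mostly-read workloads and disjoint key sets; the other common option mentioned above is wrapping a plain map with sync.RWMutex. A minimal sketch:
// mutex_map.go
package main

import "sync"

type SafeCounter struct {
	mu sync.RWMutex
	m  map[int]int
}

func NewSafeCounter() *SafeCounter {
	return &SafeCounter{m: make(map[int]int)}
}

func (c *SafeCounter) Store(k, v int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[k] = v
}

func (c *SafeCounter) Load(k int) (int, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.m[k]
	return v, ok
}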
4. Garbage Collector
GC Tuning and Configuration
// gc_tuning.go
package main
import (
	"fmt"
	"runtime"
	"runtime/debug"
	"sync"
)
func gcTuning() {
// Read current settings
stats := debug.GCStats{}
debug.ReadGCStats(&stats)
// Change GOGC (default 100)
debug.SetGCPercent(50) // more aggressive GC
// Force GC
runtime.GC()
// Memory statistics
var m runtime.MemStats
runtime.ReadMemStats(&m)
fmt.Printf("Alloc: %d KB\n", m.Alloc/1024)
fmt.Printf("Sys: %d KB\n", m.Sys/1024)
fmt.Printf("NumGC: %d\n", m.NumGC)
}
// Object pool for reuse
var bufferPool = sync.Pool{
New: func() interface{} {
return make([]byte, 1024)
},
}
func usePool() {
buf := bufferPool.Get().([]byte)
defer bufferPool.Put(buf)
// use buf
}
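Since Go 1.19 the GC also honors a soft memory limit (the GOMEMLIMIT environment variable or debug.SetMemoryLimit), which is often combined with GOGC tuning; a minimal sketch, with the 512 MiB figure purely illustrative:
// memory_limit.go
package main

import "runtime/debug"

func configureMemoryLimit() {
	// Soft limit of 512 MiB (Go 1.19+). The GC runs more aggressively as the
	// heap approaches this limit; equivalent to setting GOMEMLIMIT=512MiB.
	debug.SetMemoryLimit(512 << 20)

	// GOGC can even be disabled so the limit becomes the only trigger:
	// debug.SetGCPercent(-1)
}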
Write Barriers and GC Performance
// gc_performance.go
package main

import "runtime"
// BAD: many pointers = many write barriers
type BadStruct struct {
ptrs []*int // each assignment = write barrier
}
// GOOD: fewer pointers
type GoodStruct struct {
values []int // primitives, no write barriers
}
// Finalizers - use with caution!
func finalizerExample() {
obj := &MyResource{}
runtime.SetFinalizer(obj, (*MyResource).cleanup)
}
type MyResource struct {
data []byte
}
func (r *MyResource) cleanup() {
r.data = nil
}
5. String Optimizations
Unsafe String Conversions
// string_opts.go
package main
import "unsafe"
// Unsafe but fast conversion
func unsafeString(b []byte) string {
return *(*string)(unsafe.Pointer(&b))
}
func unsafeBytes(s string) []byte {
return *(*[]byte)(unsafe.Pointer(
&struct {
string
Cap int
}{s, len(s)},
))
}
// String interning
var stringCache = make(map[string]string)
func intern(s string) string {
if cached, exists := stringCache[s]; exists {
return cached
}
stringCache[s] = s
return s
}
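Since Go 1.20 the same zero-copy conversions can be written with unsafe.String, unsafe.StringData, and unsafe.Slice instead of header tricks; the usual caveat still applies (the byte slice must not be mutated afterwards). A minimal sketch:
// string_conversions_120.go
package main

import "unsafe"

func zeroCopyString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(&b[0], len(b)) // shares b's memory; do not modify b
}

func zeroCopyBytes(s string) []byte {
	if len(s) == 0 {
		return nil
	}
	return unsafe.Slice(unsafe.StringData(s), len(s)) // shares the string's memory
}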
6. Interface Optimizations
Avoiding Boxing and Interface Overhead
// interface_opts.go
package main
// Avoid boxing primitives
func avoidBoxing(x int) {
// BAD: interface{} boxing
	var i interface{} = x // boxing; may allocate
	_ = i
// GOOD: direct type usage
processInt(x)
}
// Use type assertions efficiently
func efficientTypeAssertion(x interface{}) {
if i, ok := x.(int); ok {
processInt(i)
}
}
// Small interfaces are faster
type Reader interface {
Read([]byte) (int, error) // small interface
}
type LargeInterface interface {
Method1()
Method2()
Method3()
Method4() // large interface - slower
}
func processInt(i int) { _ = i }
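Since Go 1.18, generics offer another way to keep hot paths working on concrete types instead of passing interface{}; a hedged sketch (the constraint and function names are illustrative):
// generic_no_boxing.go
package main

// A type parameter keeps the argument concrete at the call site,
// so values are not boxed into an interface.
func sumInts[T ~int | ~int32 | ~int64](xs []T) T {
	var total T
	for _, x := range xs {
		total += x
	}
	return total
}

func useSum() {
	_ = sumInts([]int{1, 2, 3}) // no interface{} boxing
}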
7. Write Barriers
Understanding Write Barriers
// write_barriers.go
package main
// A write barrier is a small piece of code the compiler inserts around pointer
// stores to heap memory; it runs while the GC is marking so the collector
// learns about pointers created mid-cycle.
type pointerHolder struct{ p *int }

func writeBarrierExample(h *pointerHolder) {
	x := 42
	h.p = &x // pointer store into a heap object -> write barrier (during marking)
}
// Many pointers = many write barriers = slow
type SlowStruct struct {
ptr1 *int
ptr2 *string
ptr3 *[]byte
ptr4 *map[string]int
}
func slowOperations(s *SlowStruct) {
x, str, b, m := 1, "test", []byte{}, make(map[string]int)
// Each assignment = write barrier
s.ptr1 = &x // write barrier
s.ptr2 = &str // write barrier
s.ptr3 = &b // write barrier
s.ptr4 = &m // write barrier
}
// Optimization: prefer plain value fields; only fields with no internal
// pointers (ints, bools, arrays of such) avoid barriers entirely
type FastStruct struct {
val1 int
val2 string
val3 []byte
val4 map[string]int
}
func fastOperations(s *FastStruct) {
	s.val1 = 1                    // plain int: no write barrier
	s.val2 = "test"               // string header still holds a pointer -> barrier possible
	s.val3 = []byte{}             // slice header still holds a pointer -> barrier possible
	s.val4 = make(map[string]int) // map value is a pointer -> barrier possible
}
Tricolor Algorithm and Write Barriers
// tricolor_gc.go
package main
// Tricolor GC algorithm:
// WHITE - not yet visited
// GRAY - visited, but children not processed
// BLACK - fully processed
func explainTricolor() {
// Without write barrier:
// 1. GC marks A as BLACK (processed)
// 2. Mutator creates link A -> C (WHITE)
// 3. GC doesn't know about C, deletes it!
//
	// With write barrier:
	// When creating A -> C, the barrier executes and shades an object GRAY
	// (Go's hybrid barrier shades the pointer being overwritten and, if the
	// current stack is still unscanned, the new pointee), so C is not lost
}
8. Escape Analysis
How Escape Analysis Works
// escape_analysis.go
package main
// Compiler analyzes code and decides:
// stack (fast, automatic cleanup) VS heap (slow, GC)
func stackAllocation() {
x := 42 // does NOT escape - stays on stack
_ = x
} // x automatically destroyed
func heapAllocation() *int {
x := 42 // ESCAPES - goes to heap
return &x // returning pointer -> escape!
}
// Complex escape cases
func complexEscape() {
// Case 1: Interface boxing
var i interface{} = 42 // ESCAPE: boxing to interface{}
_ = i
// Case 2: Closure capture
x := 42
fn := func() int {
		return x // may escape: x is captured by reference; it moves to the heap if the closure itself escapes
}
_ = fn
// Case 3: Too large for stack
big := make([]int, 100000) // ESCAPE: too large for stack
_ = big[0]
// Case 4: Unknown size at compile time
n := unknownSize()
dynamic := make([]int, n) // ESCAPE: size unknown at compile time
_ = dynamic
}
func unknownSize() int { return 10 }
Avoiding Escapes
// escape_avoidance.go
package main
// Use values, not pointers
type Point struct{ X, Y int }

func avoidEscape() {
	p := Point{1, 2} // on stack
	processPoint(p)  // pass by value
}

func processPoint(p Point) {
	// work with a copy
}
// Compile with: go build -gcflags="-m"
// Shows: "moved to heap" or "does not escape"
9. Memory Ordering & sync.atomic
Memory Reordering Problems
// memory_ordering.go
package main
import "sync/atomic"
// PROBLEM: CPU and compiler can reorder instructions!
var (
flag int32
data int32
)
// Writer goroutine
func writer() {
data = 42 // may execute AFTER flag = 1!
flag = 1 // signal ready
}
// Reader goroutine
func reader() {
for atomic.LoadInt32(&flag) == 0 {
runtime.Gosched()
}
value := data // may read 0 instead of 42!
println(value)
}
// CORRECT solution - atomic operations
func atomicWriter() {
atomic.StoreInt32(&data, 42) // atomic store
atomic.StoreInt32(&flag, 1) // release semantics
}
func atomicReader() {
for atomic.LoadInt32(&flag) == 0 { // acquire semantics
runtime.Gosched()
}
value := atomic.LoadInt32(&data)
println(value) // guaranteed 42
}
Lock-free Data Structures
// lockfree.go
package main
import (
"sync/atomic"
"unsafe"
)
// Lock-free stack (Treiber stack)
type LockFreeStack struct {
head unsafe.Pointer // *node
}
type node struct {
value interface{}
next unsafe.Pointer // *node
}
func (s *LockFreeStack) Push(value interface{}) {
newNode := &node{value: value}
for {
head := atomic.LoadPointer(&s.head)
newNode.next = head
// CAS: compare head with old, set to newNode
if atomic.CompareAndSwapPointer(&s.head, head, unsafe.Pointer(newNode)) {
break // success!
}
// retry if another thread changed head
}
}
func (s *LockFreeStack) Pop() (interface{}, bool) {
for {
head := atomic.LoadPointer(&s.head)
if head == nil {
return nil, false // empty stack
}
headNode := (*node)(head)
next := atomic.LoadPointer(&headNode.next)
// CAS: remove head, set next as new head
if atomic.CompareAndSwapPointer(&s.head, head, next) {
return headNode.value, true
}
// retry if another thread changed head
}
}
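A short usage example for the stack above: several goroutines push concurrently, then the stack is drained (the counts are illustrative):
// lockfree_usage.go
package main

import (
	"fmt"
	"sync"
)

func lockFreeStackUsage() {
	var s LockFreeStack // zero value is ready to use (nil head)
	var wg sync.WaitGroup

	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for j := 0; j < 100; j++ {
				s.Push(id*100 + j)
			}
		}(i)
	}
	wg.Wait()

	count := 0
	for {
		if _, ok := s.Pop(); !ok {
			break
		}
		count++
	}
	fmt.Println("popped", count, "items") // 400
}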
atomic.Value Best Practices
// atomic_value.go
package main
import "sync/atomic"
// DANGER: atomic.Value panics on type change!
func atomicValuePanic() {
var value atomic.Value
value.Store("string")
// value.Store(42) // PANIC! inconsistent type int
}
// Correct usage with consistent types
type Config struct {
Timeout int
Retries int
}
var globalConfig atomic.Value
func updateConfig(newConfig *Config) {
globalConfig.Store(newConfig) // always *Config
}
func getConfig() *Config {
return globalConfig.Load().(*Config)
}
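Since Go 1.19 the typed atomic.Pointer[T] removes both the type assertion and the panic risk; a minimal sketch of the same config pattern:
// atomic_pointer.go
package main

import "sync/atomic"

var configPtr atomic.Pointer[Config] // zero value is ready to use

func updateConfigTyped(c *Config) {
	configPtr.Store(c)
}

func getConfigTyped() *Config {
	return configPtr.Load() // no type assertion; nil before the first Store
}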
10. Memory Alignment & Struct Optimization
Struct Memory Layout
// struct_alignment.go
package main
import (
"fmt"
"unsafe"
)
// BAD alignment - lots of padding
type BadStruct struct {
a bool // 1 byte
// 7 bytes padding for int64 alignment
b int64 // 8 bytes
c bool // 1 byte
// 7 bytes padding at end
// Total: 24 bytes
}
// GOOD alignment - minimal padding
type GoodStruct struct {
b int64 // 8 bytes
a bool // 1 byte
c bool // 1 byte
// 6 bytes padding at end
// Total: 16 bytes
}
func alignmentDemo() {
fmt.Printf("BadStruct size: %d bytes\n", unsafe.Sizeof(BadStruct{})) // 24
fmt.Printf("GoodStruct size: %d bytes\n", unsafe.Sizeof(GoodStruct{})) // 16
// 33% memory savings by reordering fields!
}
fieldalignment Tool
go install golang.org/x/tools/go/analysis/passes/fieldalignment/cmd/fieldalignment@latest
// fieldalignment_demo.go
package main
// Usage: fieldalignment ./...
// BAD structure for fieldalignment
type Inefficient struct {
flag1 bool // 1 byte
counter int64 // 8 bytes (+ 7 padding)
flag2 bool // 1 byte
id int32 // 4 bytes (+ 3 padding)
flag3 bool // 1 byte (+ 7 padding at end)
// Total: 32 bytes
}
// fieldalignment suggests:
type Efficient struct {
counter int64 // 8 bytes
id int32 // 4 bytes
flag1 bool // 1 byte
flag2 bool // 1 byte
flag3 bool // 1 byte
// Total: 16 bytes - 50% less!
}
// Commands:
// fieldalignment -fix ./... # auto-fix
// fieldalignment -test=false ./... # analyze only
Cache Line Optimization
// cache_optimization.go
package main
// Cache line is typically 64 bytes
// Important to place frequently used fields together
type CacheOptimized struct {
// Hot data - first 64 bytes (one cache line)
hotField1 int64 // frequently read
hotField2 int64 // frequently read
hotField3 int64 // frequently read
hotField4 int64 // frequently read
hotField5 int64 // frequently read
hotField6 int64 // frequently read
hotField7 int64 // frequently read
hotField8 int64 // frequently read = 64 bytes
// Cold data - next cache lines
coldField1 [1000]byte // rarely used
coldField2 string // rarely used
}
// False sharing - problem for concurrent access
type BadConcurrentStruct struct {
counter1 int64 // CPU core 1 reads/writes
counter2 int64 // CPU core 2 reads/writes
// Both on same cache line = false sharing!
}
// Solution - padding between fields
type GoodConcurrentStruct struct {
counter1 int64
_ [56]byte // padding to 64 bytes
counter2 int64 // on separate cache line
_ [56]byte // padding
}
11. Channel Internals & Scheduler
Channel Structure
// channel_internals.go
package main
// Internal channel structure (conceptual)
type hchan struct {
qcount uint // elements in buffer
dataqsiz uint // buffer size
buf unsafe.Pointer // buffer pointer
elemsize uint16 // element size
closed uint32 // closed flag
elemtype *_type // element type
sendx uint // send index
recvx uint // receive index
recvq waitq // waiting receivers queue
sendq waitq // waiting senders queue
lock mutex // synchronization mutex
}
type waitq struct {
first *sudog // first waiting goroutine
last *sudog // last waiting goroutine
}
Channel Operations
// channel_mechanisms.go
package main
// Send operation (conceptual)
func channelSend(ch chan int, value int) {
// 1. Check if there's a waiting receiver
if len(recvq) > 0 {
// Direct send - copy data directly to receiver
receiver := recvq.dequeue()
memmove(receiver.elem, &value, sizeof(int))
goready(receiver.g) // schedule goroutine for execution
return
}
// 2. Space in buffer?
if qcount < dataqsiz {
// Copy to buffer
buf[sendx] = value
sendx = (sendx + 1) % dataqsiz
qcount++
return
}
// 3. Block - add to sendq
mysudog := acquireSudog()
mysudog.elem = &value
sendq.enqueue(mysudog)
gopark() // block current goroutine
}
Select Statement Internals
// select_internals.go
package main
// Select uses pollorder and lockorder internally
func selectInternals() {
ch1 := make(chan int)
ch2 := make(chan string)
ch3 := make(chan bool)
select {
case v1 := <-ch1:
_ = v1
case v2 := <-ch2:
_ = v2
case ch3 <- true:
// send
default:
// default case
}
// Compiler generates approximately:
// 1. Create array of cases
// 2. Generate random pollorder (fairness)
// 3. Generate lockorder (deadlock prevention)
// 4. Lock channels in lockorder
// 5. Check readiness in pollorder
// 6. Execute ready case or block on all
}
Channel Best Practices
// channel_best_practices.go
package main
import (
	"context"
	"runtime"
)
// 1. Buffer size = amount of work in flight
func optimalBufferSize() {
numWorkers := runtime.NumCPU()
// Buffer size = concurrent work units
jobs := make(chan Job, numWorkers*2) // 2x for peak load
for i := 0; i < numWorkers; i++ {
go worker(jobs)
}
}
// 2. Graceful shutdown pattern
func gracefulShutdown() {
jobs := make(chan Job)
quit := make(chan struct{})
go func() {
defer close(jobs)
for {
select {
case job := <-getNextJob():
select {
case jobs <- job:
// sent job
case <-quit:
// shutdown signal
return
}
case <-quit:
return
}
}
}()
// Shutdown
close(quit)
// Wait for remaining jobs
for range jobs {
// process remaining jobs
}
}
// 3. Context-aware channels
func contextAwareChannel(ctx context.Context) {
ch := make(chan Data)
go func() {
defer close(ch)
for {
select {
case data := <-source():
select {
case ch <- data:
// sent
case <-ctx.Done():
return // context canceled
}
case <-ctx.Done():
return
}
}
}()
}
type Job struct{ ID int }
type Data struct{ Content string }
func worker(jobs <-chan Job) {
for job := range jobs {
// process job
_ = job
}
}
func getNextJob() <-chan Job {
ch := make(chan Job, 1)
go func() {
defer close(ch)
ch <- Job{ID: 1}
}()
return ch
}
func source() <-chan Data {
ch := make(chan Data, 1)
go func() {
defer close(ch)
ch <- Data{Content: "test"}
}()
return ch
}
12. Runtime Optimizations
Profile-Guided Optimization (PGO)
// pgo_optimization.go
package main

import (
	"flag"
	"io"
	"log"
	"os"
	"runtime/pprof"
	"strings"
)

// Go 1.20+ supports PGO - the compiler uses runtime profiles
// to guide optimization decisions.
// Enable PGO:
//   go build -pgo=auto        # use default.pgo if present (the default mode since Go 1.21)
//   go build -pgo=profile.pgo # explicit profile

// Illustrative flag so the profiling snippet below compiles.
var cpuprofile = flag.String("cpuprofile", "", "write CPU profile to this file")
func generateProfile() {
// 1. Add CPU profiling to production code
if *cpuprofile != "" {
f, err := os.Create(*cpuprofile)
if err != nil {
log.Fatal(err)
}
defer f.Close()
pprof.StartCPUProfile(f)
defer pprof.StopCPUProfile()
}
// 2. Run typical workload
runTypicalWorkload()
// 3. Profile saved to cpu.prof
// 4. Convert for PGO: go tool pprof -proto cpu.prof > default.pgo
}

// Stub for the workload exercised while profiling (illustrative).
func runTypicalWorkload() {}
// PGO improves:
func pgoImprovements() {
// 1. Inlining decisions - more accurate inlining
hotFunction() // likely to be inlined with PGO
// 2. Devirtualization - interface calls to direct calls
var i io.Reader = strings.NewReader("test")
data, _ := io.ReadAll(i) // may become direct call with PGO
_ = data
// 3. Function layout - hot functions placed together
// 4. Better register allocation in hot paths
}
func hotFunction() {
// Frequently called function from profile
// PGO increases inlining chances
}
// Typical improvement: 2-14% performance
// Especially effective for:
// - HTTP servers
// - Database applications
// - Games
// - Scientific computing
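In a long-running server the profile is usually collected via net/http/pprof rather than a manual StartCPUProfile call; a minimal sketch (the listen address is illustrative):
// pprof_server.go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on the default mux
)

func startProfilingEndpoint() {
	go func() {
		// Collect 30s of CPU profile from production, e.g.:
		//   curl -o cpu.pprof "http://localhost:6060/debug/pprof/profile?seconds=30"
		// then save the result as default.pgo next to main.go.
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()
}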
Inlining Optimizations
// inlining_optimizations.go
package main
// Compiler automatically inlines simple functions
func inliningExamples() {
// This function will be inlined (simple, small)
result := simpleAdd(5, 10)
// This may not be inlined (complex logic)
complex := complexFunction(result)
_ = complex
}
//go:noinline
func forceNoInline(x int) int {
return x * 2 // forcibly prevent inlining
}
// Simple function - inlining candidate
func simpleAdd(a, b int) int {
return a + b // will be inlined into calling code
}
// Complex function - probably won't be inlined
func complexFunction(x int) int {
if x > 100 {
for i := 0; i < x; i++ {
x = x * i / (i + 1)
if x < 0 {
panic("overflow")
}
}
}
return x
}
// Check inlining: go build -gcflags="-m" main.go
// Shows: "can inline simpleAdd", "cannot inline complexFunction"
Devirtualization
// devirtualization.go
package main
import (
"io"
"strings"
)
// Interface call devirtualization
func devirtualizationExample() {
// Without optimization: interface method call (slow)
var r io.Reader = strings.NewReader("hello")
// With PGO: compiler may notice r is always *strings.Reader
// and replace interface call with direct call
buffer := make([]byte, 100)
n, err := r.Read(buffer) // may become direct call
_ = n
_ = err
}
// Type assertion optimization
func typeAssertionOpt(i interface{}) {
// Common pattern in hot paths
if s, ok := i.(string); ok {
processString(s) // PGO optimizes this path
} else if num, ok := i.(int); ok {
processInt(num) // and this too
}
}
func processString(s string) { _ = s }
func processInt(i int) { _ = i }
Compiler Intrinsics
// compiler_intrinsics.go
package main
import (
	"math"
	"math/bits"
	"runtime"
	"unsafe"
)
// Some functions are replaced with CPU instructions
func intrinsicsExamples() {
// bits.TrailingZeros compiles to TZCNT/BSF instruction
x := uint(0b1000)
zeros := bits.TrailingZeros(x) // CPU instruction
_ = zeros
// bits.Len compiles to LZCNT/BSR
length := bits.Len(x) // CPU instruction
_ = length
// math.Sqrt may use FSQRT
sqrt := math.Sqrt(16.0) // CPU instruction if available
_ = sqrt
}
// Memory barriers are also intrinsics
func memoryBarriers() {
// runtime.KeepAlive prevents optimization
data := make([]byte, 1000)
ptr := unsafe.Pointer(&data[0])
// Use ptr...
_ = ptr
// Guarantee data won't be GC'd until this point
runtime.KeepAlive(data)
}
13. Practical Solutions
Optimized String Processing
// Solution 1: Optimize string processing function
package main

import (
	"strings"
	"sync"
	"unicode"
)

func efficientStringProcessing(data []string) []string {
// Pre-allocate with capacity estimate
result := make([]string, 0, len(data)/2)
// Use sync.Pool for string builders
builder := builderPool.Get().(*strings.Builder)
defer builderPool.Put(builder)
for _, item := range data {
if len(item) > 5 {
builder.Reset()
builder.Grow(len(item))
// Efficient uppercase conversion
for _, r := range item {
builder.WriteRune(unicode.ToUpper(r))
}
result = append(result, builder.String())
}
}
return result
}
var builderPool = sync.Pool{
New: func() interface{} {
return &strings.Builder{}
},
}
Memory Leak Fix
// Solution 2: Fix memory leak in slice operation
package main

func noMemoryLeak() []int {
big := make([]int, 1000000)
// BAD: return big[999999:] // holds reference to entire array
// GOOD: Copy needed data to new slice
result := make([]int, 1)
result[0] = big[999999]
// big can now be garbage collected
return result
}
// For subslices:
func safeSubslice(data []byte, start, end int) []byte {
// Copy needed portion to break reference
result := make([]byte, end-start)
copy(result, data[start:end])
return result
}
Thread-Safe Cache with TTL
// Solution 3: Thread-safe cache with TTL
package main
import (
"sync"
"time"
)
type CacheItem struct {
Value interface{}
Expiration time.Time
}
type TTLCache struct {
mu sync.RWMutex
items map[string]CacheItem
ttl time.Duration
stopCleanup chan struct{}
wg sync.WaitGroup
}
func NewTTLCache(ttl time.Duration) *TTLCache {
cache := &TTLCache{
items: make(map[string]CacheItem),
ttl: ttl,
stopCleanup: make(chan struct{}),
}
// Background cleanup goroutine
cache.wg.Add(1)
go cache.cleanupExpired()
return cache
}
func (c *TTLCache) Set(key string, value interface{}) {
c.mu.Lock()
defer c.mu.Unlock()
c.items[key] = CacheItem{
Value: value,
Expiration: time.Now().Add(c.ttl),
}
}
func (c *TTLCache) Get(key string) (interface{}, bool) {
c.mu.RLock()
defer c.mu.RUnlock()
item, exists := c.items[key]
if !exists {
return nil, false
}
if time.Now().After(item.Expiration) {
return nil, false
}
return item.Value, true
}
func (c *TTLCache) Delete(key string) {
c.mu.Lock()
defer c.mu.Unlock()
delete(c.items, key)
}
func (c *TTLCache) cleanupExpired() {
defer c.wg.Done()
ticker := time.NewTicker(c.ttl / 2)
defer ticker.Stop()
for {
select {
case <-ticker.C:
c.mu.Lock()
now := time.Now()
for key, item := range c.items {
if now.After(item.Expiration) {
delete(c.items, key)
}
}
c.mu.Unlock()
case <-c.stopCleanup:
return
}
}
}
func (c *TTLCache) Close() {
close(c.stopCleanup)
c.wg.Wait()
}
Interview Tips
Key Questions to Expect
- "What happens when you append to a slice with insufficient capacity?"
  - A new backing array is allocated; capacity roughly doubles for small slices and grows by about 1.25x for large ones (see the sketch after this list)
  - All elements are copied to the new array
  - The old array becomes eligible for GC
- "Why are maps not thread-safe?"
  - Performance optimization for single-threaded use
  - Concurrent access would require locks on every operation
  - Use sync.Map or manual synchronization for concurrent access
- "When does an object escape to the heap?"
  - Returning a pointer from a function
  - Storing in an interface
  - Being captured by an escaping closure
  - Too large for the stack
  - Size unknown at compile time
- "How do write barriers work with the GC?"
  - Run on pointer stores to heap objects while the GC is marking
  - Inform the GC about pointer changes during collection
  - Enable concurrent GC without long stop-the-world pauses
  - Support the tricolor marking algorithm
- "What's the difference between buffered and unbuffered channels?"
  - Unbuffered: synchronization point; send and receive must happen together
  - Buffered: decouples sender and receiver until the buffer is full
  - Both can cause goroutine leaks if misused
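A quick way to observe the append growth policy on your Go version (the exact factors are an implementation detail and have changed between releases):
// append_growth.go
package main

import "fmt"

func main() {
	var s []int
	prevCap := -1
	for i := 0; i < 2000; i++ {
		s = append(s, i)
		if cap(s) != prevCap {
			fmt.Printf("len=%d cap=%d\n", len(s), cap(s)) // prints each time the backing array is reallocated
			prevCap = cap(s)
		}
	}
}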
Performance Analysis Commands
# Escape analysis
go build -gcflags="-m" main.go
# PGO build
go build -pgo=default.pgo main.go
# Race detection
go build -race main.go
# Disable optimizations for debugging
go build -gcflags="-N -l" main.go
# Memory alignment check
fieldalignment ./...
# CPU profiling
go tool pprof cpu.prof
# Memory profiling
go tool pprof mem.prof
Summary
This guide covers advanced Go topics essential for senior-level interviews:
- Memory Management: Stack vs heap, escape analysis, GC tuning
- Data Structures: Slice internals, Swiss Tables maps, channel implementation
- Concurrency: Memory ordering, atomic operations, channel patterns
- Performance: PGO, inlining, devirtualization, memory alignment
- Best Practices: Pool patterns, leak prevention, optimization techniques