Go Interview Preparation: Advanced Topics


Table of Contents

  1. Memory Layout and Pointers
  2. Slices - Internal Implementation
  3. Maps - Swiss Tables in Go 1.24
  4. Garbage Collector
  5. String Optimizations
  6. Interface Optimizations
  7. Write Barriers
  8. Escape Analysis
  9. Memory Ordering & sync.atomic
  10. Memory Alignment & Struct Optimization
  11. Channel Internals & Scheduler
  12. Runtime Optimizations
  13. Practical Solutions

1. Memory Layout and Pointers

Stack vs Heap Allocation

// stack_heap.go
package main

import "strings"

func stackExample() int {
    x := 42 // allocated on stack
    return x
}

func heapExample() *int {
    x := 42 // escapes to heap
    return &x
}

// Inefficient: multiple allocations
func badStringConcat(strs []string) string {
    result := ""
    for _, s := range strs {
        result += s // creates new string each time
    }
    return result
}

// Efficient: single allocation
func goodStringConcat(strs []string) string {
    var builder strings.Builder
    builder.Grow(len(strs) * 10) // pre-allocate (rough estimate: ~10 bytes per string)
    for _, s := range strs {
        builder.WriteString(s)
    }
    return builder.String()
}

Escape Analysis Examples

// escapes.go
package main

// Does NOT escape to heap
func noEscape() {
    x := make([]int, 100)
    _ = x[0]
}

// Escapes to heap
func escapes() []int {
    x := make([]int, 100)
    return x // returns reference
}

// Escapes due to interface{}
func interfaceEscape() interface{} {
    x := 42
    return x // boxing to interface{}
}

2. Slices - Internal Implementation

Slice Header Structure

// slice_internals.go
package main

import (
    "fmt"
    "unsafe"
)

// Mirrors the runtime slice header (cf. the deprecated reflect.SliceHeader)
type SliceHeader struct {
    Data uintptr
    Len  int
    Cap  int
}

func sliceInternals() {
    s := []int{1, 2, 3, 4, 5}
    header := (*SliceHeader)(unsafe.Pointer(&s))

    fmt.Printf("Data: %v, Len: %d, Cap: %d\n",
        header.Data, header.Len, header.Cap)
}

// Dangerous shared backing array
func dangerousSlicing() {
    original := []int{1, 2, 3, 4, 5}
    slice1 := original[:2]  // [1, 2]
    slice2 := original[2:]  // [3, 4, 5]

    slice1 = append(slice1, 999) // OVERWRITES slice2[0]!
    fmt.Println(slice2) // [999, 4, 5] - surprise!
}
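
The fix is the full (three-index) slice expression, which caps the subslice's capacity so append is forced to reallocate. A minimal sketch:

// safe_slicing.go
package main

import "fmt"

func safeSlicing() {
    original := []int{1, 2, 3, 4, 5}
    slice1 := original[0:2:2] // full slice expression: len=2, cap=2
    slice2 := original[2:]    // [3, 4, 5]

    // cap(slice1) is exhausted, so append allocates a fresh
    // backing array instead of overwriting original[2]
    slice1 = append(slice1, 999)

    fmt.Println(slice1) // [1 2 999]
    fmt.Println(slice2) // [3 4 5] - unchanged
}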

Memory Leaks with Slices

// slice_leaks.go
package main

// BAD: memory leak
func badSubslice(data []byte) []byte {
    // Keeps reference to entire array!
    return data[100:110]
}

// GOOD: copy needed data
func goodSubslice(data []byte) []byte {
    result := make([]byte, 10)
    copy(result, data[100:110])
    return result
}

// Efficient append
func efficientAppend() {
    // Pre-allocate with known capacity
    good := make([]int, 0, 1000)
    for i := 0; i < 1000; i++ {
        good = append(good, i)
    }
}

Nil vs Empty Slices

// nil_slices.go
package main

import "fmt"

func nilVsEmpty() {
    var nilSlice []int
    emptySlice := make([]int, 0)
    literalEmpty := []int{}

    fmt.Printf("nil: len=%d, cap=%d, isNil=%v\n",
        len(nilSlice), cap(nilSlice), nilSlice == nil)
    fmt.Printf("make: len=%d, cap=%d, isNil=%v\n",
        len(emptySlice), cap(emptySlice), emptySlice == nil)
    fmt.Printf("literal: len=%d, cap=%d, isNil=%v\n",
        len(literalEmpty), cap(literalEmpty), literalEmpty == nil)
}
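
One practical consequence worth knowing: encoding/json distinguishes the two, marshaling a nil slice as null and an empty slice as []. A short sketch:

// nil_vs_empty_json.go
package main

import (
    "encoding/json"
    "fmt"
)

func jsonDifference() {
    var nilSlice []int
    emptySlice := []int{}

    a, _ := json.Marshal(nilSlice)
    b, _ := json.Marshal(emptySlice)

    fmt.Println(string(a)) // null
    fmt.Println(string(b)) // []
}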

3. Maps - Swiss Tables in Go 1.24

New Swiss Tables Implementation

// swiss_tables.go
package main

// Go 1.24: Swiss Tables instead of buckets + chaining
// Groups of 8 key-value pairs + control word (metadata)

// Conceptual Swiss Table structure (Key and Value stand in for
// the real per-map slot types):
type SwissGroup struct {
    control [8]byte    // control word: 7-bit hash + status
    keys    [8]Key     // keys
    values  [8]Value   // values
}

// Control byte contains:
// - 7 bits: h2 (lower hash bits)
// - 1 bit: status (empty/occupied/deleted)

// Conceptual pseudocode - hashKey and findGroup are illustrative,
// not real runtime identifiers
func swissTableLookup(key string) (value interface{}, ok bool) {
    hash := hashKey(key)
    h1 := hash >> 7  // upper bits - group selection
    h2 := hash & 0x7F // lower 7 bits - for control word

    group := findGroup(h1)

    // All 8 control bytes compared at once (SIMD, or SIMD-within-a-register)
    matches := group.matchControlWord(h2)

    for slot := range matches {
        if group.keys[slot] == key {
            return group.values[slot], true
        }
    }
    return nil, false
}

Swiss Tables vs Old Implementation

// comparison.go
package main

// OLD IMPLEMENTATION (before Go 1.24):
// ❌ Buckets (8 entries) + overflow chaining
// ❌ Cache misses due to pointer chasing
// ❌ Memory fragmentation
// ❌ Slow growth (rehash entire map)

// NEW IMPLEMENTATION (Go 1.24 Swiss Tables):
// ✅ Groups of 8 with control word
// ✅ Cache-friendly linear layout
// ✅ SIMD for parallel comparison
// ✅ Incremental growth
// ✅ Better load factor

func performanceComparison() {
    // Reported Swiss Tables improvements (workload-dependent):
    // - Lookup: ~20-30% faster
    // - Insert: ~15-25% faster
    // - Memory usage: ~10-15% less
    // - Cache miss rate: significantly lower
}

Thread Safety Remains Unchanged

// map_safety.go
package main

import "sync"

// Maps are STILL NOT thread-safe!
func racySwissMap() {
    m := make(map[int]int) // Swiss Table, but race conditions remain

    go func() {
        for i := 0; i < 1000; i++ {
            m[i] = i // RACE!
        }
    }()

    go func() {
        for i := 0; i < 1000; i++ {
            _ = m[i] // RACE!
        }
    }()
}

// Use sync.Map for concurrent access
func safeConcurrentMap() {
    var m sync.Map

    go func() {
        for i := 0; i < 1000; i++ {
            m.Store(i, i)
        }
    }()

    go func() {
        for i := 0; i < 1000; i++ {
            if val, ok := m.Load(i); ok {
                _ = val
            }
        }
    }()
}
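
Note that sync.Map is tuned for read-mostly workloads with disjoint key sets; for general mixed workloads, a plain map behind a sync.RWMutex is often simpler and at least as fast. A minimal sketch:

// mutex_map.go
package main

import "sync"

// Plain map guarded by an RWMutex - frequently the better choice
// for write-heavy or mixed read/write access patterns.
type SafeMap struct {
    mu sync.RWMutex
    m  map[int]int
}

func NewSafeMap() *SafeMap {
    return &SafeMap{m: make(map[int]int)}
}

func (s *SafeMap) Store(k, v int) {
    s.mu.Lock()
    defer s.mu.Unlock()
    s.m[k] = v
}

func (s *SafeMap) Load(k int) (int, bool) {
    s.mu.RLock()
    defer s.mu.RUnlock()
    v, ok := s.m[k]
    return v, ok
}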

4. Garbage Collector

GC Tuning and Configuration

// gc_tuning.go
package main

import (
    "fmt"
    "runtime"
    "runtime/debug"
    "sync"
)

func gcTuning() {
    // Read current settings
    stats := debug.GCStats{}
    debug.ReadGCStats(&stats)

    // Change GOGC (default 100)
    debug.SetGCPercent(50) // more aggressive GC

    // Force GC
    runtime.GC()

    // Memory statistics
    var m runtime.MemStats
    runtime.ReadMemStats(&m)

    fmt.Printf("Alloc: %d KB\n", m.Alloc/1024)
    fmt.Printf("Sys: %d KB\n", m.Sys/1024)
    fmt.Printf("NumGC: %d\n", m.NumGC)
}

// Object pool for reuse
var bufferPool = sync.Pool{
    New: func() interface{} {
        return make([]byte, 1024)
    },
}

func usePool() {
    buf := bufferPool.Get().([]byte)
    defer bufferPool.Put(buf)

    // use buf
}
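
Go 1.19 also added a soft memory limit (the GOMEMLIMIT environment variable, or debug.SetMemoryLimit) that works alongside GOGC. A sketch with an illustrative 512 MiB limit:

// memory_limit.go
package main

import "runtime/debug"

// Go 1.19+: soft memory limit complements GOGC.
func setMemoryLimit() {
    // Ask the runtime to keep total memory near 512 MiB
    // (illustrative value); the GC runs more aggressively
    // as usage approaches the limit.
    debug.SetMemoryLimit(512 << 20)

    // GOGC and the limit cooperate: GOGC paces routine
    // collections, the limit caps worst-case footprint.
    debug.SetGCPercent(100)
}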

Write Barriers and GC Performance

// gc_performance.go
package main

import "runtime"

// BAD: many pointers = more write-barrier work and more GC scanning
type BadStruct struct {
    ptrs []*int // every pointer store here may trigger a write barrier
}

// GOOD: fewer pointers
type GoodStruct struct {
    values []int // primitives, no write barriers
}

// Finalizers - use with caution!
func finalizerExample() {
    obj := &MyResource{}
    runtime.SetFinalizer(obj, (*MyResource).cleanup)
}

type MyResource struct {
    data []byte
}

func (r *MyResource) cleanup() {
    r.data = nil
}

5. String Optimizations

Unsafe String Conversions

// string_opts.go
package main

import "unsafe"

// Unsafe but fast conversion - the resulting string shares memory
// with b, so b must never be mutated afterwards
func unsafeString(b []byte) string {
    return *(*string)(unsafe.Pointer(&b))
}

func unsafeBytes(s string) []byte {
    return *(*[]byte)(unsafe.Pointer(
        &struct {
            string
            Cap int
        }{s, len(s)},
    ))
}

// String interning (NOTE: this map is not safe for concurrent
// use - guard it with a mutex in real code)
var stringCache = make(map[string]string)

func intern(s string) string {
    if cached, exists := stringCache[s]; exists {
        return cached
    }
    stringCache[s] = s
    return s
}
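
Since Go 1.20 the unsafe package provides sanctioned helpers (unsafe.String, unsafe.StringData, unsafe.Slice, unsafe.SliceData) that replace these hand-rolled header casts. A sketch, with the same no-mutation caveat:

// string_opts_modern.go
package main

import "unsafe"

// Go 1.20+ equivalents of the header-cast tricks above.
func modernUnsafeString(b []byte) string {
    if len(b) == 0 {
        return ""
    }
    return unsafe.String(&b[0], len(b))
}

func modernUnsafeBytes(s string) []byte {
    if len(s) == 0 {
        return nil
    }
    return unsafe.Slice(unsafe.StringData(s), len(s))
}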

6. Interface Optimizations

Avoiding Boxing and Interface Overhead

// interface_opts.go
package main

// Avoid boxing primitives
func avoidBoxing(x int) {
    // BAD: interface{} boxing
    var i interface{} = x // may allocate (boxing)
    _ = i

    // GOOD: direct type usage
    processInt(x)
}

// Use type assertions efficiently
func efficientTypeAssertion(x interface{}) {
    if i, ok := x.(int); ok {
        processInt(i)
    }
}

// Small interfaces are faster
type Reader interface {
    Read([]byte) (int, error) // small interface
}

type LargeInterface interface {
    Method1()
    Method2()
    Method3()
    Method4() // large interfaces are harder to satisfy, devirtualize, and mock
}

func processInt(i int) { _ = i }
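
Since Go 1.18, generics offer another way to keep hot paths monomorphic instead of boxing values into interface{}. A minimal sketch:

// generics_no_boxing.go
package main

// A type parameter keeps the argument unboxed, unlike an
// interface{} parameter.
func sum[T int | int64 | float64](xs []T) T {
    var total T
    for _, x := range xs {
        total += x
    }
    return total
}

func useSum() {
    _ = sum([]int{1, 2, 3}) // no interface allocation
}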

7. Write Barriers

Understanding Write Barriers

// write_barriers.go
package main

// Write barriers run on pointer stores to heap memory while the GC
// is marking; stores to local stack variables don't need one
type Node struct {
    next *Node
}

func writeBarrierExample(n *Node, child *Node) {
    n.next = child // heap pointer store - write barrier while GC is active
}

// Many pointers = many write barriers = slow
type SlowStruct struct {
    ptr1 *int
    ptr2 *string
    ptr3 *[]byte
    ptr4 *map[string]int
}

func slowOperations(s *SlowStruct) {
    x, str, b, m := 1, "test", []byte{}, make(map[string]int)

    // Each pointer store = write barrier (while the GC is marking)
    s.ptr1 = &x     // write barrier
    s.ptr2 = &str   // write barrier
    s.ptr3 = &b     // write barrier
    s.ptr4 = &m     // write barrier
}

// Optimization: fewer pointers
type FastStruct struct {
    val1 int
    val2 string
    val3 []byte
    val4 map[string]int
}

func fastOperations(s *FastStruct) {
    // s.val1 needs no write barrier (plain integer store);
    // strings, slices, and maps still contain internal pointers,
    // so those stores may incur barriers - but inline values avoid
    // the extra indirection and separate heap objects
    s.val1 = 1
    s.val2 = "test"
    s.val3 = []byte{}
    s.val4 = make(map[string]int)
}

Tricolor Algorithm and Write Barriers

// tricolor_gc.go
package main

// Tricolor GC algorithm:
// WHITE - not yet visited
// GRAY - visited, but children not processed
// BLACK - fully processed

func explainTricolor() {
    // Without write barrier:
    // 1. GC marks A as BLACK (processed)
    // 2. Mutator creates link A -> C (WHITE)
    // 3. GC never revisits A, so C is collected while still reachable!
    //
    // With write barrier:
    // The store A -> C runs through the barrier, which shades the
    // affected pointers GRAY (Go 1.8+ uses a hybrid Yuasa/Dijkstra
    // barrier), so C is guaranteed to be scanned.
}

8. Escape Analysis

How Escape Analysis Works

// escape_analysis.go
package main

// Compiler analyzes code and decides:
// stack (fast, automatic cleanup) VS heap (slow, GC)

func stackAllocation() {
    x := 42 // does NOT escape - stays on stack
    _ = x
} // x automatically destroyed

func heapAllocation() *int {
    x := 42  // ESCAPES - goes to heap
    return &x // returning pointer -> escape!
}

// Complex escape cases
func complexEscape() {
    // Case 1: Interface boxing
    var i interface{} = 42 // may ESCAPE: boxing to interface{} (small ints are cached)
    _ = i

    // Case 2: Closure capture
    x := 42
    fn := func() int {
        return x // x escapes only if the closure itself escapes
    }
    _ = fn

    // Case 3: Too large for stack
    big := make([]int, 100000) // ESCAPE: too large for stack
    _ = big[0]

    // Case 4: Unknown size at compile time
    n := unknownSize()
    dynamic := make([]int, n) // ESCAPE: size unknown at compile time
    _ = dynamic
}

func unknownSize() int { return 10 }

Avoiding Escapes

// escape_avoidance.go
package main

type Point struct{ X, Y int }

// Use values, not pointers
func avoidEscape() {
    p := Point{1, 2} // on stack
    processPoint(p)  // pass by value
}

func processPoint(p Point) {
    // work with a copy
}

// Compile with: go build -gcflags="-m"
// Shows: "moved to heap" or "does not escape"

9. Memory Ordering & sync.atomic

Memory Reordering Problems

// memory_ordering.go
package main

import "sync/atomic"

// PROBLEM: CPU and compiler can reorder instructions!
var (
    flag int32
    data int32
)

// Writer goroutine
func writer() {
    data = 42        // may execute AFTER flag = 1!
    flag = 1         // signal ready
}

// Reader goroutine
func reader() {
    for atomic.LoadInt32(&flag) == 0 {
        runtime.Gosched()
    }
    value := data    // may read 0 instead of 42!
    println(value)
}

// CORRECT solution - atomic operations
func atomicWriter() {
    atomic.StoreInt32(&data, 42)  // atomic store
    atomic.StoreInt32(&flag, 1)   // release semantics
}

func atomicReader() {
    for atomic.LoadInt32(&flag) == 0 { // acquire semantics
        runtime.Gosched()
    }
    value := atomic.LoadInt32(&data)
    println(value) // guaranteed 42
}
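
Go 1.19 added typed atomics (atomic.Int32, atomic.Bool, atomic.Pointer[T], ...), which make it impossible to mix atomic and plain access by accident. The same flag/data pair with the typed API:

// typed_atomics.go
package main

import (
    "runtime"
    "sync/atomic"
)

// Typed atomics cannot be read or written non-atomically,
// unlike a plain int32 shared between goroutines.
var (
    readyFlag atomic.Int32
    payload   atomic.Int32
)

func typedWriter() {
    payload.Store(42)
    readyFlag.Store(1) // release: reader will see payload = 42
}

func typedReader() int32 {
    for readyFlag.Load() == 0 { // acquire
        runtime.Gosched()
    }
    return payload.Load() // guaranteed 42
}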

Lock-free Data Structures

// lockfree.go
package main

import (
    "sync/atomic"
    "unsafe"
)

// Lock-free stack (Treiber stack)
type LockFreeStack struct {
    head unsafe.Pointer // *node
}

type node struct {
    value interface{}
    next  unsafe.Pointer // *node
}

func (s *LockFreeStack) Push(value interface{}) {
    newNode := &node{value: value}
    for {
        head := atomic.LoadPointer(&s.head)
        newNode.next = head

        // CAS: compare head with old, set to newNode
        if atomic.CompareAndSwapPointer(&s.head, head, unsafe.Pointer(newNode)) {
            break // success!
        }
        // retry if another thread changed head
    }
}

func (s *LockFreeStack) Pop() (interface{}, bool) {
    for {
        head := atomic.LoadPointer(&s.head)
        if head == nil {
            return nil, false // empty stack
        }

        headNode := (*node)(head)
        next := atomic.LoadPointer(&headNode.next)

        // CAS: remove head, set next as new head
        if atomic.CompareAndSwapPointer(&s.head, head, next) {
            return headNode.value, true
        }
        // retry if another thread changed head
    }
}
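
A quick usage sketch; note that Go's garbage collector sidesteps the classic ABA hazard of CAS-based stacks, because a node's memory cannot be reused while any goroutine still holds a pointer to it:

// lockfree.go (continued)
func useLockFreeStack() {
    var s LockFreeStack
    s.Push(1)
    s.Push(2)

    if v, ok := s.Pop(); ok {
        _ = v // 2 - LIFO order
    }
}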

atomic.Value Best Practices

// atomic_value.go
package main

import "sync/atomic"

// DANGER: atomic.Value panics on type change!
func atomicValuePanic() {
    var value atomic.Value

    value.Store("string")
    // value.Store(42) // PANIC! inconsistent type int
}

// Correct usage with consistent types
type Config struct {
    Timeout int
    Retries int
}

var globalConfig atomic.Value

func updateConfig(newConfig *Config) {
    globalConfig.Store(newConfig) // always *Config
}

func getConfig() *Config {
    return globalConfig.Load().(*Config)
}

10. Memory Alignment & Struct Optimization

Struct Memory Layout

// struct_alignment.go
package main

import (
    "fmt"
    "unsafe"
)

// BAD alignment - lots of padding
type BadStruct struct {
    a bool   // 1 byte
    // 7 bytes padding for int64 alignment
    b int64  // 8 bytes
    c bool   // 1 byte
    // 7 bytes padding at end
    // Total: 24 bytes
}

// GOOD alignment - minimal padding
type GoodStruct struct {
    b int64  // 8 bytes
    a bool   // 1 byte
    c bool   // 1 byte
    // 6 bytes padding at end
    // Total: 16 bytes
}

func alignmentDemo() {
    // Sizes assume a 64-bit platform
    fmt.Printf("BadStruct size: %d bytes\n", unsafe.Sizeof(BadStruct{}))   // 24
    fmt.Printf("GoodStruct size: %d bytes\n", unsafe.Sizeof(GoodStruct{})) // 16

    // 33% memory savings just by reordering fields!
}

fieldalignment Tool

go install golang.org/x/tools/go/analysis/passes/fieldalignment/cmd/fieldalignment@latest
// fieldalignment_demo.go
package main

// Usage: fieldalignment ./...

// BAD structure for fieldalignment
type Inefficient struct {
    flag1   bool    // 1 byte
    counter int64   // 8 bytes (+ 7 padding)
    flag2   bool    // 1 byte
    id      int32   // 4 bytes (+ 3 padding)
    flag3   bool    // 1 byte (+ 7 padding at end)
    // Total: 32 bytes
}

// fieldalignment suggests:
type Efficient struct {
    counter int64   // 8 bytes
    id      int32   // 4 bytes
    flag1   bool    // 1 byte
    flag2   bool    // 1 byte
    flag3   bool    // 1 byte
    // Total: 16 bytes - 50% less!
}

// Commands:
// fieldalignment -fix ./...          # auto-fix
// fieldalignment -test=false ./...   # skip _test.go files

Cache Line Optimization

// cache_optimization.go
package main

// Cache line is typically 64 bytes
// Important to place frequently used fields together

type CacheOptimized struct {
    // Hot data - first 64 bytes (one cache line)
    hotField1 int64   // frequently read
    hotField2 int64   // frequently read
    hotField3 int64   // frequently read
    hotField4 int64   // frequently read
    hotField5 int64   // frequently read
    hotField6 int64   // frequently read
    hotField7 int64   // frequently read
    hotField8 int64   // frequently read = 64 bytes

    // Cold data - next cache lines
    coldField1 [1000]byte  // rarely used
    coldField2 string      // rarely used
}

// False sharing - problem for concurrent access
type BadConcurrentStruct struct {
    counter1 int64  // CPU core 1 reads/writes
    counter2 int64  // CPU core 2 reads/writes
    // Both on same cache line = false sharing!
}

// Solution - padding between fields
type GoodConcurrentStruct struct {
    counter1 int64
    _        [56]byte  // padding to 64 bytes
    counter2 int64     // on separate cache line
    _        [56]byte  // padding
}
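
Hand-written [56]byte padding assumes 64-byte cache lines; the golang.org/x/sys/cpu module exports CacheLinePad, sized per target architecture, for portable padding:

// padding_portable.go
package main

import "golang.org/x/sys/cpu"

// CacheLinePad is sized per-architecture, so the padding stays
// correct on platforms with 128-byte cache lines.
type PortableConcurrentStruct struct {
    counter1 int64
    _        cpu.CacheLinePad
    counter2 int64
    _        cpu.CacheLinePad
}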

11. Channel Internals & Scheduler

Channel Structure

// channel_internals.go
package main

// Internal channel structure (conceptual)
type hchan struct {
    qcount   uint           // elements in buffer
    dataqsiz uint           // buffer size
    buf      unsafe.Pointer // buffer pointer
    elemsize uint16         // element size
    closed   uint32         // closed flag
    elemtype *_type         // element type
    sendx    uint           // send index
    recvx    uint           // receive index
    recvq    waitq          // waiting receivers queue
    sendq    waitq          // waiting senders queue
    lock     mutex          // synchronization mutex
}

type waitq struct {
    first *sudog  // first waiting goroutine
    last  *sudog  // last waiting goroutine
}

Channel Operations

// channel_mechanisms.go
package main

// Send operation (conceptual pseudocode of runtime.chansend -
// recvq, qcount, buf, etc. are hchan fields, not compilable Go)
func channelSend(ch chan int, value int) {
    // 1. Check if there's a waiting receiver
    if len(recvq) > 0 {
        // Direct send - copy data directly to receiver
        receiver := recvq.dequeue()
        memmove(receiver.elem, &value, sizeof(int))
        goready(receiver.g) // schedule goroutine for execution
        return
    }

    // 2. Space in buffer?
    if qcount < dataqsiz {
        // Copy to buffer
        buf[sendx] = value
        sendx = (sendx + 1) % dataqsiz
        qcount++
        return
    }

    // 3. Block - add to sendq
    mysudog := acquireSudog()
    mysudog.elem = &value
    sendq.enqueue(mysudog)
    gopark() // block current goroutine
}

Select Statement Internals

// select_internals.go
package main

// Select uses pollorder and lockorder internally
func selectInternals() {
    ch1 := make(chan int)
    ch2 := make(chan string)
    ch3 := make(chan bool)

    select {
    case v1 := <-ch1:
        _ = v1
    case v2 := <-ch2:
        _ = v2
    case ch3 <- true:
        // send
    default:
        // default case
    }

    // Compiler generates approximately:
    // 1. Create array of cases
    // 2. Generate random pollorder (fairness)
    // 3. Generate lockorder (deadlock prevention)
    // 4. Lock channels in lockorder
    // 5. Check readiness in pollorder
    // 6. Execute ready case or block on all
}

Channel Best Practices

// channel_best_practices.go
package main

import (
    "context"
    "runtime"
)

// 1. Buffer size = amount of work in flight
func optimalBufferSize() {
    numWorkers := runtime.NumCPU()

    // Buffer size = concurrent work units
    jobs := make(chan Job, numWorkers*2) // 2x for peak load

    for i := 0; i < numWorkers; i++ {
        go worker(jobs)
    }
}

// 2. Graceful shutdown pattern
func gracefulShutdown() {
    jobs := make(chan Job)
    quit := make(chan struct{})

    go func() {
        defer close(jobs)
        for {
            select {
            case job := <-getNextJob():
                select {
                case jobs <- job:
                    // sent job
                case <-quit:
                    // shutdown signal
                    return
                }
            case <-quit:
                return
            }
        }
    }()

    // Shutdown
    close(quit)
    // Wait for remaining jobs
    for range jobs {
        // process remaining jobs
    }
}

// 3. Context-aware channels
func contextAwareChannel(ctx context.Context) {
    ch := make(chan Data)

    go func() {
        defer close(ch)
        for {
            select {
            case data := <-source():
                select {
                case ch <- data:
                    // sent
                case <-ctx.Done():
                    return // context canceled
                }
            case <-ctx.Done():
                return
            }
        }
    }()
}

type Job struct{ ID int }
type Data struct{ Content string }

func worker(jobs <-chan Job) {
    for job := range jobs {
        // process job
        _ = job
    }
}

func getNextJob() <-chan Job {
    ch := make(chan Job, 1)
    go func() {
        defer close(ch)
        ch <- Job{ID: 1}
    }()
    return ch
}

func source() <-chan Data {
    ch := make(chan Data, 1)
    go func() {
        defer close(ch)
        ch <- Data{Content: "test"}
    }()
    return ch
}

12. Runtime Optimizations

Profile-Guided Optimization (PGO)

// pgo_optimization.go
package main

import (
    "flag"
    "io"
    "log"
    "os"
    "runtime/pprof"
    "strings"
)

var cpuprofile = flag.String("cpuprofile", "", "write CPU profile to this file")

// Go 1.20+ supports PGO - compiler uses profiles
// for optimization decisions

// Enable PGO:
// go build -pgo=auto               # look for default.pgo
// go build -pgo=profile.pgo        # explicit profile

func generateProfile() {
    // 1. Add CPU profiling to production code
    if *cpuprofile != "" {
        f, err := os.Create(*cpuprofile)
        if err != nil {
            log.Fatal(err)
        }
        defer f.Close()
        pprof.StartCPUProfile(f)
        defer pprof.StopCPUProfile()
    }

    // 2. Run typical workload
    runTypicalWorkload()

    // 3. Profile saved to cpu.prof
    // 4. Convert for PGO: go tool pprof -proto cpu.prof > default.pgo
}

// PGO improves:
func pgoImprovements() {
    // 1. Inlining decisions - more accurate inlining
    hotFunction() // likely to be inlined with PGO

    // 2. Devirtualization - interface calls to direct calls
    var i io.Reader = strings.NewReader("test")
    data, _ := io.ReadAll(i) // may become direct call with PGO
    _ = data

    // 3. Function layout - hot functions placed together
    // 4. Better register allocation in hot paths
}

func hotFunction() {
    // Frequently called function from profile
    // PGO increases inlining chances
}

func runTypicalWorkload() {
    // Exercise representative hot paths so the profile
    // reflects real production behavior
}

// Typical improvement: 2-14% performance
// Especially effective for:
// - HTTP servers
// - Database applications
// - Games
// - Scientific computing
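
In practice the production profile usually comes from a live service via net/http/pprof. A minimal sketch (localhost:6060 is conventional, not required):

// pprof_server.go
package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers /debug/pprof handlers
)

func startProfileServer() {
    go func() {
        // Collect a 30s CPU profile suitable for PGO with:
        //   curl -o default.pgo \
        //     "http://localhost:6060/debug/pprof/profile?seconds=30"
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()
}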

Inlining Optimizations

// inlining_optimizations.go
package main

// Compiler automatically inlines simple functions
func inliningExamples() {
    // This function will be inlined (simple, small)
    result := simpleAdd(5, 10)

    // This may not be inlined (complex logic)
    complex := complexFunction(result)

    _ = complex
}

//go:noinline
func forceNoInline(x int) int {
    return x * 2  // forcibly prevent inlining
}

// Simple function - inlining candidate
func simpleAdd(a, b int) int {
    return a + b  // will be inlined into calling code
}

// Complex function - probably won't be inlined
func complexFunction(x int) int {
    if x > 100 {
        for i := 0; i < x; i++ {
            x = x * i / (i + 1)
            if x < 0 {
                panic("overflow")
            }
        }
    }
    return x
}

// Check inlining: go build -gcflags="-m" main.go
// Shows: "can inline simpleAdd", "cannot inline complexFunction"

Devirtualization

// devirtualization.go
package main

import (
    "io"
    "strings"
)

// Interface call devirtualization
func devirtualizationExample() {
    // Without optimization: interface method call (slow)
    var r io.Reader = strings.NewReader("hello")

    // With PGO: compiler may notice r is always *strings.Reader
    // and replace interface call with direct call
    buffer := make([]byte, 100)
    n, err := r.Read(buffer) // may become direct call
    _ = n
    _ = err
}

// Type assertion optimization
func typeAssertionOpt(i interface{}) {
    // Common pattern in hot paths
    if s, ok := i.(string); ok {
        processString(s) // PGO optimizes this path
    } else if num, ok := i.(int); ok {
        processInt(num)  // and this too
    }
}

func processString(s string) { _ = s }
func processInt(i int)       { _ = i }

Compiler Intrinsics

// compiler_intrinsics.go
package main

import (
    "math"
    "math/bits"
    "runtime"
    "unsafe"
)

// Some functions are replaced with CPU instructions
func intrinsicsExamples() {
    // bits.TrailingZeros compiles to TZCNT/BSF instruction
    x := uint(0b1000)
    zeros := bits.TrailingZeros(x) // CPU instruction
    _ = zeros

    // bits.Len compiles to LZCNT/BSR
    length := bits.Len(x) // CPU instruction
    _ = length

    // math.Sqrt may use FSQRT
    sqrt := math.Sqrt(16.0) // CPU instruction if available
    _ = sqrt
}

// Memory barriers are also intrinsics
func memoryBarriers() {
    // runtime.KeepAlive prevents optimization
    data := make([]byte, 1000)
    ptr := unsafe.Pointer(&data[0])

    // Use ptr...
    _ = ptr

    // Guarantee data won't be GC'd until this point
    runtime.KeepAlive(data)
}

13. Practical Solutions

Optimized String Processing

// Solution 1: Optimize string processing function
package main

import (
    "strings"
    "sync"
    "unicode"
)

func efficientStringProcessing(data []string) []string {
    // Pre-allocate with capacity estimate
    result := make([]string, 0, len(data)/2)

    // Use sync.Pool for string builders
    builder := builderPool.Get().(*strings.Builder)
    defer builderPool.Put(builder)

    for _, item := range data {
        if len(item) > 5 {
            builder.Reset()
            builder.Grow(len(item))

            // Efficient uppercase conversion
            for _, r := range item {
                builder.WriteRune(unicode.ToUpper(r))
            }
            result = append(result, builder.String())
        }
    }
    return result
}

var builderPool = sync.Pool{
    New: func() interface{} {
        return &strings.Builder{}
    },
}

Memory Leak Fix

// Solution 2: Fix memory leak in slice operation
func noMemoryLeak() []int {
    big := make([]int, 1000000)

    // BAD: return big[999999:] // holds reference to entire array

    // GOOD: Copy needed data to new slice
    result := make([]int, 1)
    result[0] = big[999999]

    // big can now be garbage collected
    return result
}

// For subslices:
func safeSubslice(data []byte, start, end int) []byte {
    // Copy needed portion to break reference
    result := make([]byte, end-start)
    copy(result, data[start:end])
    return result
}

Thread-Safe Cache with TTL

// Solution 3: Thread-safe cache with TTL
package main

import (
    "sync"
    "time"
)

type CacheItem struct {
    Value      interface{}
    Expiration time.Time
}

type TTLCache struct {
    mu    sync.RWMutex
    items map[string]CacheItem
    ttl   time.Duration

    stopCleanup chan struct{}
    wg          sync.WaitGroup
}

func NewTTLCache(ttl time.Duration) *TTLCache {
    cache := &TTLCache{
        items:       make(map[string]CacheItem),
        ttl:         ttl,
        stopCleanup: make(chan struct{}),
    }

    // Background cleanup goroutine
    cache.wg.Add(1)
    go cache.cleanupExpired()

    return cache
}

func (c *TTLCache) Set(key string, value interface{}) {
    c.mu.Lock()
    defer c.mu.Unlock()

    c.items[key] = CacheItem{
        Value:      value,
        Expiration: time.Now().Add(c.ttl),
    }
}

func (c *TTLCache) Get(key string) (interface{}, bool) {
    c.mu.RLock()
    defer c.mu.RUnlock()

    item, exists := c.items[key]
    if !exists {
        return nil, false
    }

    if time.Now().After(item.Expiration) {
        return nil, false
    }

    return item.Value, true
}

func (c *TTLCache) Delete(key string) {
    c.mu.Lock()
    defer c.mu.Unlock()
    delete(c.items, key)
}

func (c *TTLCache) cleanupExpired() {
    defer c.wg.Done()
    ticker := time.NewTicker(c.ttl / 2)
    defer ticker.Stop()

    for {
        select {
        case <-ticker.C:
            c.mu.Lock()
            now := time.Now()
            for key, item := range c.items {
                if now.After(item.Expiration) {
                    delete(c.items, key)
                }
            }
            c.mu.Unlock()

        case <-c.stopCleanup:
            return
        }
    }
}

func (c *TTLCache) Close() {
    close(c.stopCleanup)
    c.wg.Wait()
}
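
A short usage sketch:

func useTTLCache() {
    cache := NewTTLCache(5 * time.Minute)
    defer cache.Close() // stops the cleanup goroutine

    cache.Set("user:42", "alice")
    if v, ok := cache.Get("user:42"); ok {
        _ = v // "alice" until the TTL expires
    }
}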

Interview Tips

Key Questions to Expect

  1. "What happens when you append to a slice with insufficient capacity?"

    • New backing array allocated (~2x capacity for small slices, ~1.25x growth for large ones; see the sketch after this list)
    • All elements copied to new array
    • Old array becomes eligible for GC (once nothing else references it)
  2. "Why are maps not thread-safe?"

    • Performance optimization for single-threaded use
    • Concurrent access would require locks on every operation
    • Use sync.Map or manual synchronization for concurrent access
  3. "When does an object escape to the heap?"

    • Returning pointer from function
    • Storing in interface
    • Capturing in closure
    • Too large for stack
    • Unknown size at compile time
  4. "How do write barriers work with the GC?"

    • Execute on every pointer assignment
    • Inform GC about pointer changes during collection
    • Enable concurrent GC without stop-the-world
    • Support tricolor marking algorithm
  5. "What's the difference between buffered and unbuffered channels?"

    • Unbuffered: synchronization point, send/receive must happen together
    • Buffered: decouples sender and receiver until buffer is full
    • Both can cause goroutine leaks if misused
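
A quick demo of the growth policy referenced in question 1 (exact factors are a runtime implementation detail):

// growth_demo.go
package main

import "fmt"

// Prints each capacity jump; roughly 2x for small slices,
// tapering toward ~1.25x for large ones.
func appendGrowth() {
    s := make([]int, 0)
    oldCap := cap(s)
    for i := 0; i < 2000; i++ {
        s = append(s, i)
        if cap(s) != oldCap {
            fmt.Printf("len=%d cap=%d\n", len(s), cap(s))
            oldCap = cap(s)
        }
    }
}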

Performance Analysis Commands

# Escape analysis
go build -gcflags="-m" main.go

# PGO build
go build -pgo=default.pgo main.go

# Race detection
go build -race main.go

# Disable optimizations for debugging
go build -gcflags="-N -l" main.go

# Memory alignment check
fieldalignment ./...

# CPU profiling
go tool pprof cpu.prof

# Memory profiling
go tool pprof mem.prof

Summary

This guide covers advanced Go topics essential for senior-level interviews:

  • Memory Management: Stack vs heap, escape analysis, GC tuning
  • Data Structures: Slice internals, Swiss Tables maps, channel implementation
  • Concurrency: Memory ordering, atomic operations, channel patterns
  • Performance: PGO, inlining, devirtualization, memory alignment
  • Best Practices: Pool patterns, leak prevention, optimization techniques