Systems Notes

Writing that shows how I think about systems.

This section is about reasoning, not content marketing. Each note is framed around context, problem definition, tradeoffs, and implementation choices.

Building a Thread Pool in Rust

Draft topic
Context
Building a pool by hand is a way to understand scheduling and ownership instead of treating concurrency as a black box.
Problem
High-level concurrency abstractions are useful, but they can hide how work is actually queued, executed, and shut down safely.
Approach
Implement a small thread pool from first principles, define the worker lifecycle, and reason about channels, backpressure, and shutdown paths.
Tradeoffs
A custom implementation teaches scheduling mechanics well, but it also exposes how easy it is to get error handling and teardown wrong.
Takeaways
The value is not the thread pool itself. The value is learning how execution models affect latency, contention, and ownership boundaries.

Understanding Async Runtimes

Draft topic
Context
Most backend work depends on async execution, but many engineers never inspect how runtimes behave under load.
Problem
It is easy to use Tokio productively without understanding task scheduling, wakeups, fairness, or where overhead comes from.
Approach
Compare runtime designs conceptually, trace how tasks are queued, polled, and woken, and map the cost of convenience abstractions onto real execution behavior.
Tradeoffs
Managed runtimes accelerate delivery, but they can make performance bottlenecks harder to reason about when the system grows.
Takeaways
Understanding the runtime makes it easier to choose better boundaries, isolate blocking work, and design more predictable services.

Designing Backend Systems for ML Workloads

Draft topic
Context
ML-backed products create a different set of infrastructure pressures than standard request-response APIs.
Problem
Inference is expensive, bursty, and often constrained by latency, model size, or GPU availability.
Approach
Separate API and inference concerns, treat the model-serving path as a specialized subsystem, and optimize around queuing, storage, and cost.
Tradeoffs
The cleanest architecture is not always the cheapest one, and the cheapest one is not always good enough for user-facing latency.
Takeaways
A good ML backend is mostly a systems design problem: cost control, workload isolation, and reliable movement of data through the pipeline.