All Posts
DesignJudgementProblem SolvingLessons from the Codebase · Part 1 of 1

The ThreadLocal Trap: How a One-Line Spec Hid a Security Decision

Share

The story of building the masking serializer at the heart of Ihawu — and the moment I realized the easy ticket I'd picked up was quietly asking me to make a security call.

First — what's Ihawu?

If you haven't met it: Ihawu is an open-source Kotlin library for dynamic data masking on the JVM. The idea is simple to state. Your application works with complete, truthful data right up until the instant it serializes a response — and at that boundary, Ihawu removes or obfuscates the fields a particular caller isn't allowed to see. A manager fetching an employee record sees the salary; a peer fetching the same record sees REDACTED; the controller code is identical for both.

It hooks into Jackson (the JSON library most Spring apps already use) and takes its masking decisions from a pluggable policy resolver — so Ihawu itself stays an enforcer, never a policy engine. You annotate a response object with @IhawuResource, and Ihawu does the rest as the JSON is written.

This article is about building that "does the rest" part — the serializer that actually applies the masking. It turned out to be the most quietly demanding ticket in the project.

The ticket I thought would be a quick win

I'll be honest: I picked up this ticket because it looked like a clean afternoon. Here's roughly what it said:

Add a Jackson BeanSerializerModifier that intercepts @IhawuResource types and applies a resolved List<FieldPolicy>. HIDE omits the field; REDACT writes a placeholder. Mask nested resources and collection items. No reflection on hot paths.

I'd written Jackson modules before. There was a clear input, a clear output, and a tidy little HIDE/REDACT rule to implement. My hands were already reaching for the keyboard. You know that feeling — the ticket where you can see the whole shape of it before you've typed a line.

Do me a favour before you read on: glance at that spec and decide how you'd build it. Keep your answer in your head. I promise it matters later.

Following the request until something didn't sit right

Ihawu exists to do one thing: make sure the wrong person never sees protected data. So before I committed to an approach, I traced a real request through it — GET /employees/42/profile — just to make sure I understood the path:

  1. The request comes in with an auth header.
  2. Spring Security authenticates and stashes the caller.
  3. The controller loads the full, truthful Employee, maps it to an EmployeeProfile annotated @IhawuResource("employee.profile"), and returns it. The business logic sees everything — that's intentional.
  4. Spring hands the object to the ObjectMapper to write the response.
  5. Here's where Ihawu does its work. The serializer has to know who's asking, resolve their policy, and mask the fields — recursing into the nested Address on the way.
  6. Masked JSON goes out.

And somewhere around step 5, the quick-win feeling drained out of me. Because one question surfaced that the ticket simply… didn't mention:

The Jackson serializer is one shared object, built once at startup, used by every request on every thread. The principal is per request. So how on earth does "who's asking" get into that serializer at the exact moment it's writing bytes?

I sat with that for a minute, slightly annoyed, because I knew what it meant. The afternoon job wasn't an afternoon job. The actual work was a decision nobody had written down — and it was hiding inside a ticket dressed up as plumbing.

That's the first thing this problem taught me, and it's stuck with me since: when a spec looks purely mechanical, the hard part is usually the choice it's not mentioning. The skill isn't typing faster. It's noticing the gap and having the discipline to stop at it. The old me would've implemented the AC word-for-word and shipped. This time I made myself put the keyboard down.

I made myself say it in one sentence

When I'm out of my depth, my instinct is to make the problem smaller until I can hold it. So I forced the whole ticket into a single line:

Get per-request state into a shared, long-lived, thread-safe object that runs during the request.

And the moment I wrote that down, I felt the relief of recognition — because it's not a weird, one-off problem. It's a shape I'd seen before without realizing it. It's the same thing logging's MDC does. The same thing Spring's own SecurityContextHolder does. The "carry call-scoped context into a shared component" problem. I wasn't inventing anything; I was standing in front of a door lots of people had walked through.

That reframing is the move I'm proudest of, honestly — not because it's clever, but because it turned "I've never done this" into "oh, this is that kind of problem," and that's a much less frightening sentence to start from.

Three doors — and why I trusted there were only three

Once I saw the class of problem, the options almost laid themselves out. And I want to be clear about why there were three, because for a long time I used to brainstorm options like I was throwing darts — quantity over structure. This time I found the one thing the decision actually turned on: where does the context live, relative to the serialization call? There are only three answers to that:

  • Inside the call. Jackson lets you pin data to a single serialization: withAttribute(KEY, principal), read back with getAttribute(KEY). It lives for that one call and then it's gone. (Option A)
  • Around the call, on the thread. A ThreadLocal<IhawuPrincipal> — a filter sets it when the request starts, clears it in a finally. The serializer just reads the ambient value. (Option B)
  • Before the call. Resolve everything up front and hand the serializer a finished map to look up. (Option C)

Inside, around, before. That's not three ideas I happened to have — that's every place the context can possibly be. And realizing that gave me something I'd never felt during a design before: confidence that I wasn't missing a secret fourth option I'd kick myself over later.

And here's my confession: the answer I'd been holding in my head since the first paragraph? It was Option B. The ThreadLocal. It's the one I knew best, the one that felt like home. Hold onto that, because it's where the story turns.

The moment it flipped

For most features I'd have compared these on what I usually care about — which is cleanest, which is fastest, which I can write without fighting the framework. By that scoring, Option B wins easily. It's familiar, it needs almost no Jackson plumbing, the read is trivial.

But I caught myself, and I think this is the single most important judgment call in the whole story: Ihawu is a security library, and you don't get to score a security library on elegance. You score it on the worst thing that happens when it breaks. So I made myself ask each option one cold question: when this fails, who gets hurt?

And the ThreadLocal — my comfortable favourite — gave an answer that made my stomach drop.

A ThreadLocal is only correct if it's always cleared and serialization always runs on the same thread that set it. But threads are pooled and reused. If a single code path skips the cleanup — an exception in the wrong spot, a filter ordering I didn't anticipate, some library wrapping the chain — then the next request lands on that thread and inherits the previous user's principal. In a masking library, that's not a glitch. That's one person being served another person's data, or protected fields leaking under someone else's leftover admin role. And the second anyone introduces WebFlux or coroutines or @Async, the thread can change mid-flight and the whole thing quietly comes apart.

What got me wasn't that Option B couldn't be made correct. It was that its correctness depended on discipline I could never enforce — discipline living in other people's applications, on cleanup code they might never write. I'd be shipping a loaded footgun and a note saying "please don't pull the trigger."

Option A, the per-call attribute I'd almost overlooked, doesn't have a worst case like that. The principal is tied to one serialization and physically cannot outlive it. There's no shared slot to leak through, no finally to forget. The bad outcome isn't prevented — it's impossible. (Option C, for the record, dodged the cross-request leak but introduced a different one: forget to pre-resolve a nested type and it sails out unmasked. Same family of "safe only if you're perfect" risk.)

That was the flip. The option I trusted least on instinct was the only one I could trust with someone's private data. I've thought about that a lot since — how often the comfortable choice is comfortable precisely because it hides its failure mode.

Building for the tickets that didn't exist yet

There was one more thing that settled it, and it's a habit I want to keep: I judged the options against work that hadn't been built. Two siblings were coming — request-scoped caching (don't resolve 200 times for a list of 200) and fail-closed (if anything throws, emit {}, never leak).

Option A made both almost free. The per-call attribute bag is already a request-scoped scratchpad, so caching just memoizes into it and disappears when the call ends — leak-free without trying. And "no principal? mask everything" gave fail-closed a single clean place to live. The other options would've made me fight for both.

I used to think good design meant solving today's ticket well. I think now it means making the next three cheaper. That reframe alone made this feel like a different tier of work.

The part that made me grin

Once I'd chosen Option A, I still had to actually wire it into Jackson — and there were two routes. One meant subclassing BeanSerializer and re-implementing a pile of constructors to survive Jackson's internal copying. The other meant wrapping each property writer: one constructor, one method. I went for the small one, half-expecting to pay for the shortcut later.

I didn't. It turned out the small approach also handed me the recursion for free — every annotated type independently gets wrapped, so nested objects and list items just work, no graph-walking code at all. The whole per-field decision collapsed to this:

override fun serializeAsField(bean: Any, gen: JsonGenerator, prov: SerializerProvider) {
    val policies = policies(prov) ?: return // no principal -> omit field -> resource becomes {}
    when (policies[name]?.strategy) {
        null -> super.serializeAsField(bean, gen, prov) // not in policy -> normal output
        MaskingStrategy.HIDE -> Unit                    // omit entirely
        MaskingStrategy.REDACT -> {
            gen.writeFieldName(name)
            gen.writeString(/* placeholder or strategy default */)
        }
    }
}

That's when I grinned at my screen. The simplest thing I could build turned out to be the safest thing I could build. They weren't in tension — they were the same choice. I've come to read that as a signal: when simple and correct converge like that, it usually means you stopped fighting the problem and found its grain.

I wrote down why

I didn't just leave code behind. I wrote an Architecture Decision Record — the lifecycle, the three doors, the worst-case analysis, the choice. Partly for the team. Mostly, honestly, for future-me, who in six months will look at this and think "why didn't I just use a ThreadLocal?" — and deserves an answer better than a shrug. A decision you can't reconstruct is a decision you'll end up making all over again.

What I actually took away

Strip out the Jackson and the Kotlin and here's what I'm keeping — the part I'd tell a friend over coffee, not just write in a postmortem:

  1. When a ticket feels mechanical, go looking for the decision it's hiding. The quick win is often a quiet choice in disguise.
  2. Shrink the problem until you can say it in one sentence — that's usually the moment you recognize you've seen it before.
  3. Find the one axis the decision turns on, and take its natural answers. Three doors you can trust beats ten ideas you can't.
  4. Score by what failure costs, not by what's familiar. The comfortable option is comfortable because it hides its worst case.
  5. Build for the tickets that don't exist yet. Good design makes the next three easier.

The thing I'm proudest of on this one isn't a line of code. It's the breach that never happened — the one the ticket, read literally, would have shipped without anyone noticing. I almost did. Catching it is the part of this job I didn't know I'd love.

Share

References

  1. JacksonThe JSON library for the JVM that Ihawu hooks into
  2. Architecture Decision Records (ADRs)Lightweight documents that capture a single architectural decision and the reasoning behind it
Back to all posts