When working with AI assistants like Claude Code, there’s a fundamental tension that becomes apparent after the initial honeymoon phase. You want the AI to have all the context it needs - your preferences, your codebase conventions, your product knowledge, your way of thinking. But loading everything upfront creates its own problems: token costs spike, responses slow down, and paradoxically, having too much context often makes the AI less precise rather than more. The signal gets lost in the noise.
We at ArcBlock spent a while living with this friction before stepping back to design a solution. The result is a pattern we call ALP, or Active Loading Policy. It’s not a product, not a framework, not something we’re selling. It’s just an engineering pattern - a way of structuring how AI assistants load context on demand rather than all at once. The core insight is simple: define explicit rules for what gets loaded when, put those rules in a file Claude reads, and let Claude follow them as conversations unfold. One afternoon of design work, continuous dividends ever since.
What makes this worth writing about isn’t the cleverness of the solution - it’s deliberately not clever. It’s that the pattern encodes a principle we keep rediscovering in different contexts: constraints enable efficiency. By explicitly declaring what loads when, we gained control over something that previously felt chaotic. And that control compounds over time as the knowledge base grows.
## The Problem We Were Solving
Before arriving at ALP, our experience with Claude Code had accumulated several friction points that individually seemed minor but collectively created real drag. We had built up a substantial collection of knowledge files over months of work - product documentation for a dozen different products, technical architecture notes covering everything from our file system abstraction to our identity layer, engineering insights and decision records, personal preferences about coding style and communication patterns. The obvious approach was to make all of this available to Claude, but “making it available” raised immediate questions about how.
The naive solution was to stuff everything into context at the start of each session. This works for small knowledge bases but scales poorly. Token usage climbed, responses became noticeably slower, and we observed something counterintuitive: Claude’s outputs became less focused. When you give an AI assistant context about twelve products and it’s trying to help you with one, it sometimes tries to be too helpful, drawing connections that aren’t relevant or hedging its answers to account for possibilities you don’t care about. More context was creating more noise, not more signal.
But the deeper problem wasn’t just efficiency - it was the absence of any structured memory system. Each session felt like meeting someone with amnesia. We’d repeatedly explain which files to read, re-establish what we were working on, re-provide context that should have been persistent. There was no clean separation between knowledge that was personal and knowledge that should be shared across the team. And there was no system for the AI to learn what knowledge was relevant to what situations. What we really wanted was something like a context graph: structured, persistent, queryable knowledge with clear rules about what loads when. The industry is starting to talk about this problem, but we needed a solution that worked with today’s tools, not tomorrow’s platforms.
The friction cost wasn’t just time - it was cognitive load. Every session required a mental warm-up period where we’d re-orient Claude to our context. That warm-up was stealing attention from the actual work. And because the process was ad-hoc, it was inconsistent. Sometimes we’d forget to point Claude at an important file; sometimes we’d load too much and muddy the context. There had to be a better way.
## Why a Rule Table
When we sat down to design a solution, we considered several approaches with different tradeoffs. The first option was to let Claude auto-infer what context to load based on conversation content. This has obvious appeal - no manual configuration, just smart behavior. But it also has obvious risks. Claude might guess wrong about what’s relevant, and when it does, you have no visibility into why. There’s no audit trail, no way to adjust the behavior, no transparency. The AI is making decisions about its own context, which felt like giving up exactly the control we were trying to establish.
Another option was to build something more sophisticated - an embedding-based retrieval system, perhaps, where we’d vectorize all our documentation and let Claude query it semantically. This approach has merit and is probably where the industry is heading. But for our immediate needs, it felt like overkill. It would introduce dependencies on embedding models and vector databases, add infrastructure complexity, and require ongoing maintenance. We wanted something that could be set up in an afternoon and modified with a text editor. We wanted something that worked within the existing Claude Code paradigm rather than requiring us to build new tooling around it.
The rule table approach won because of two properties: simplicity and transparency. It’s a Markdown table in a `claude.md` file that maps trigger conditions to file paths. When Claude sees a conversation about “AFS” or “file system architecture,” it checks the table, finds a matching rule, and loads `technical/afs.md`. When we’re writing blog posts, it loads our writing style preferences. When we’re doing architecture reviews, it loads our engineering insights. The rules are explicit - you can read them and understand exactly what will happen. They’re editable - change a line in a Markdown file and the behavior changes. They’re human-readable - new team members can look at the table and immediately understand how context loading works. No magic, no black boxes, no machine learning models making opaque decisions.
The format we settled on is intentionally minimal. A table with two columns: trigger condition and file path. The trigger conditions are natural language descriptions - Claude interprets them, which gives us flexibility without requiring complex pattern matching. The file paths are relative to a known root. That’s it. Here’s a representative example:
## Active Loading Policy (ALP)
| Trigger | Load File |
|---------|-----------|
| Discussing ArcSphere product | `products/arcsphere.md` |
| Discussing AFS architecture | `technical/afs.md` |
| Writing blog posts or articles | `content-profile/writing-style.md` |
| Making architecture decisions | `profile/engineering-insights.md` |
| Discussing DID or identity | `technical/did-capability.md` |
The simplicity is a feature, not a bug. We could have designed something more expressive - boolean combinations of triggers, priority weights, temporal conditions. But each increment of expressiveness adds cognitive overhead. The simple version handles 90% of our cases, and for the remaining 10%, we can always explicitly tell Claude what to load. Premature optimization of the rule language would have been exactly the kind of over-engineering we were trying to avoid.
## The Role of README Files
The rule table tells Claude when to load files, but it doesn’t help Claude decide whether a topic is actually relevant before committing to loading detailed documentation. If every file is substantial, you don’t want Claude loading three or four of them on a hunch. You want a lighter-weight way for Claude to pattern-match against your knowledge base and make informed decisions about what’s actually needed.
This is where README files become essential. Every directory in our knowledge base has a README that serves as a high-density index. For our products directory, the README has a one-liner for each product - enough information for Claude to recognize when a topic is relevant, but not so much that reading the README itself is expensive. For our technical documentation, the README summarizes each architecture component in a sentence or two. These aren’t documentation for humans (though they’re useful for that too) - they’re indices for Claude.
This creates what we think of as a two-tier loading strategy. First tier: Claude reads the README to get the lay of the land. This is cheap - a few hundred tokens to understand what knowledge exists in a directory. Second tier: based on the README and the current conversation, Claude decides which specific files are actually needed and loads only those. It’s analogous to how you might use a book’s table of contents before diving into specific chapters, or how operating systems use directory metadata to avoid reading entire file contents during path resolution.
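One way to make the two-tier behavior explicit is to spell it out in the same policy file that holds the rule table. The wording below is an illustrative sketch rather than our exact file, and the `products/README.md` path simply assumes the directory layout described above.

```markdown
## Loading strategy

1. When a trigger matches, read that directory's `README.md` first
   (for example `products/README.md`) to see what files exist and what each covers.
2. Load a detailed file only when its README entry matches the current conversation.
3. If no entry matches, don't load anything further from that directory.
```

Because these instructions are natural language, Claude applies them the same way it applies the trigger column: by interpretation, not by rigid pattern matching.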
The README approach also helps with a subtle problem: false matches. Without the summary layer, Claude might see a trigger like “discussing identity” and load our DID documentation when actually the conversation is about brand identity or user identity in a non-technical sense. The README provides enough context for Claude to disambiguate. It reads “DID + Capability: decentralized identity verification using verifiable credentials” and can determine whether that’s actually what the conversation is about. This lightweight semantic matching happens naturally because Claude is a language model - we’re just giving it the information it needs to match accurately.
Writing good READMEs for this purpose is a skill worth developing. The goal is maximum information density in minimum tokens. Every word should help Claude pattern-match. Avoid boilerplate phrases like “this directory contains” - Claude knows it’s a directory index. Lead with the distinctive terms that will appear in relevant conversations. Think about how you’d describe each item if you could only use one sentence, then edit that sentence to remove anything redundant. The time invested in crafting these summaries pays off every time Claude makes a smarter loading decision.
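To make the density target concrete, here is roughly what one entry per line looks like in a technical directory’s index. The DID line is the one quoted earlier; the AFS line is paraphrased for illustration rather than copied from our actual README.

```markdown
- **AFS** (`afs.md`) - file system abstraction: architecture and key design decisions.
- **DID + Capability** (`did-capability.md`) - decentralized identity verification using verifiable credentials.
```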
## Override Priority and Team Dynamics
When we started using ALP individually, the design was straightforward: personal knowledge files in a personal directory, rules in a personal claude.md. But as we moved toward team usage, new requirements emerged. We wanted shared knowledge that everyone could access - product documentation, technical architecture, company strategy. But we also wanted individuals to be able to customize or extend that shared knowledge without forking the entire system. And we wanted projects to be able to have their own context that might differ from both individual and team defaults.
The solution is a priority chain that mirrors how configuration typically works in well-designed systems. Three levels, checked in order: project-level overrides first, then user-level overrides, then plugin defaults. In concrete terms:
1. `./.claude/arcblock-context/` - Project-specific overrides in the current directory
2. `~/.claude/arcblock-context/` - User-specific overrides in the home directory
3. Plugin default - The shared team knowledge base distributed as a Claude Code plugin
When Claude needs to load `products/arcsphere.md`, it first checks if a project-level override exists. If not, it checks for a user-level override. If neither exists, it falls back to the plugin default. This means the team can maintain authoritative documentation - the version that represents official product positioning and technical decisions - while individuals can extend or override specific files for their own needs. A developer working deeply on one product might have a richer personal version of that product’s documentation. A team member exploring a new approach might override a technical document with experimental notes. None of this affects anyone else.
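If you want the chain to be unambiguous to Claude as well as to readers, the lookup order can be written into the policy file alongside the rule table. The snippet below is a sketch of that instruction under the directory layout listed above, not a mechanism Claude Code enforces on its own.

```markdown
## Resolution order

When a rule says to load `products/arcsphere.md`, check these locations in order
and use the first file that exists:

1. `./.claude/arcblock-context/products/arcsphere.md` (project override)
2. `~/.claude/arcblock-context/products/arcsphere.md` (user override)
3. `products/arcsphere.md` inside the plugin (team default)
```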
The override mechanism also creates a natural contribution flow. When someone’s personal override proves valuable, they can contribute it back to the shared plugin through a pull request. The override started as a personal experiment, got validated through use, and graduated to shared knowledge. Changes flow upward through a review process when they should be shared; personal notes stay personal when they shouldn’t be. This is much cleaner than having everyone edit a shared repository directly, which would create coordination overhead and risk conflicting changes.
For teams considering this pattern, the key insight is that the priority chain should match your trust model. Project overrides take precedence because project-specific context is the most immediate - if you’re working on a particular codebase, its conventions should override general defaults. User overrides come next because personal workflow matters - if someone has a better way of describing a product for their own use, that shouldn’t be blocked. Plugin defaults anchor the system with shared truth - the documentation that represents team consensus. This ordering felt natural once we articulated it, but it’s worth being explicit about the reasoning since different teams might have different trust models.
## What Changed After Adopting ALP
The immediate, measurable benefits were what you’d expect: faster responses because Claude isn’t processing irrelevant context, lower token usage because we’re loading files on demand rather than upfront, more focused outputs because Claude’s attention isn’t diluted across a dozen products when we’re working on one. These efficiency gains are real, but they’re also the least interesting part of the change. You save tokens, responses feel snappier, costs go down a bit. Fine.
The more significant shift was behavioral and cultural. Before ALP, we thought about AI context as a bucket - dump in everything you know and hope the AI figures out what’s relevant. After ALP, we started thinking about context as a designed system. What knowledge exists? When is each piece relevant? How should pieces relate to each other? The knowledge base became something we actively curate rather than passively accumulate. We started writing documentation differently, knowing that Claude would be consuming it. We thought about information density because we knew READMEs would serve as indices. The act of designing ALP rules forced us to articulate what we know and when that knowledge matters.
For teams, ALP provides something that’s been missing from most AI-augmented workflows: a natural structure for shared knowledge. When a new team member joins, they install a plugin and immediately have access to the team’s product documentation, technical architecture, and strategic context - loaded on demand, not dumped upfront. They don’t need to ask “where’s the documentation for X” because Claude already knows where it is and when to load it. The rule table itself serves as a kind of meta-documentation, a map of what knowledge exists and when it’s relevant. Onboarding becomes faster because the knowledge system is explicit and discoverable rather than tacit and fragmented.
The pattern also changed how we think about the relationship between AI capabilities and human systems. ALP isn’t particularly clever - it’s a Markdown table and some naming conventions. But it encodes a useful principle: constraints enable efficiency. By accepting the constraint of explicit loading rules, we gained the efficiency of precise context. By accepting the constraint of README indices, we gained the efficiency of two-tier loading. This is the same principle that shows up in good software architecture, in Unix philosophy, in effective organizational design. Constraints aren’t limitations - they’re the structure that makes freedom possible. We’ve found that AI tools become more powerful, not less, when we think carefully about the constraints we put around them. ALP is a small example of this larger pattern, but it’s a tangible one.
