Problem Context
Credit decisions required manual review across inconsistent evaluation criteria. Different products had different rule sets hardcoded into application logic. Changing a single rule meant a code change, PR review, deployment cycle, and regression testing. The process was slow, error-prone, and didn't scale.
The core question: how do you make business rules changeable without code deployments, while maintaining full auditability in a regulated financial system?
System Architecture
The rule engine follows a plugin architecture with three layers:
Rule Definitions are configuration, not code. Each rule is a JSON/YAML document specifying:
- Field to evaluate
- Operator (equals, greater_than, in_range, etc.)
- Threshold value
- Action on match (approve, reject, flag_for_review)
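Concretely, a rule document might look like this (the exact field names here are illustrative, not the production schema):

```yaml
# Hypothetical rule document -- field names are illustrative
rule_id: dti_ceiling
version: 2
field: debt_to_income
operator: greater_than
threshold: 0.43
action: reject        # action taken when the rule matches
```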
Evaluator Pipeline processes rules in sequence with short-circuit logic — if a hard rejection rule fires, remaining rules are skipped.
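The pipeline's short-circuit behavior can be sketched as follows. This is a minimal illustration, not the production code; the default-approve fallback and the exact result shape are assumptions:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional

class Action(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    FLAG_FOR_REVIEW = "flag_for_review"

@dataclass
class RuleResult:
    rule_id: str
    matched: bool
    action: Optional[Action]  # action to take when the rule matched

def evaluate_pipeline(
    rules: list[Callable[[dict], RuleResult]], context: dict
) -> tuple[Action, list[RuleResult]]:
    """Evaluate rules in order; stop as soon as a hard rejection fires."""
    trail: list[RuleResult] = []
    for rule in rules:
        result = rule(context)
        trail.append(result)
        if result.matched and result.action is Action.REJECT:
            # Short-circuit: remaining rules are skipped entirely.
            return Action.REJECT, trail
    return Action.APPROVE, trail  # assumed default when nothing rejects
```

Note that the returned trail records every rule that actually ran, which is exactly what the audit layer below consumes.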
Audit Trail captures every rule evaluation, not just the final decision. Each audit record stores the rule version, input data snapshot, evaluation result, and timestamp.
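An audit record with those four fields might be modeled as a frozen dataclass that serializes to one JSON line per evaluation (a sketch; the field names and log format are assumptions):

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass(frozen=True)
class AuditRecord:
    rule_id: str
    rule_version: int     # version of the rule that was active
    input_snapshot: dict  # exact input data the rule saw
    result: str           # evaluation outcome at this step
    evaluated_at: str     # UTC timestamp, ISO 8601

    def to_log_line(self) -> str:
        # One JSON line per evaluation, suitable for an append-only log
        return json.dumps(asdict(self), sort_keys=True)

record = AuditRecord(
    rule_id="dti_ceiling",
    rule_version=2,
    input_snapshot={"debt_to_income": 0.50},
    result="reject",
    evaluated_at=datetime.now(timezone.utc).isoformat(),
)
log_line = record.to_log_line()
```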
Key Engineering Decisions
1. Plugin Architecture Over Hardcoded Conditionals
The temptation was to build a big if/elif/else chain. Instead, rules are defined as config documents and loaded dynamically at startup. New rule types are added as evaluator plugins that implement a simple interface:
```python
from typing import Protocol

class RuleEvaluator(Protocol):
    def evaluate(self, context: EvaluationContext) -> RuleResult: ...
```

This means a new rule type (e.g., "check if applicant's state is in an allowed list") is a ~20-line class, not a branch in a growing conditional tree.
2. Dataclass-Based Evaluators for Type Safety
Every evaluator uses Python dataclasses for input/output schemas. This gives us:
- IDE autocompletion and type checking
- Automatic validation at the boundary
- Clean serialization for audit logs
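The "validation at the boundary" point can be sketched with `__post_init__` (the field names and ranges here are illustrative assumptions, not the production schema):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CreditInput:
    credit_score: int
    debt_to_income: float

    def __post_init__(self):
        # Validate once at the boundary so evaluators can trust their inputs
        if not 300 <= self.credit_score <= 850:
            raise ValueError(f"credit_score out of range: {self.credit_score}")
        if not 0.0 <= self.debt_to_income <= 1.0:
            raise ValueError(f"debt_to_income out of range: {self.debt_to_income}")
```

Evaluators downstream never re-check these invariants; a malformed input fails loudly before any rule runs.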
3. Full Audit on Every Evaluation, Not Just Decisions
In financial systems, knowing why a decision was made is as important as the decision itself. The audit logger captures:
- Which rules were evaluated
- What version of each rule was active
- What data was used as input
- What the result was at each step
This makes any decision reproducible months later — critical for regulatory inquiries.
Technical Tradeoffs
| Decision | Benefit | Cost |
|---|---|---|
| Dynamic rule loading | Rules change without deployments | Startup validation cost; no compile-time safety |
| Runtime schema validation | Non-engineers can author rules | Validation errors surface at deploy, not build time |
| Full audit trail | Complete decision reproducibility | Storage overhead; ~3x write amplification |
| Short-circuit evaluation | Faster rejections, cleaner logic | Rule ordering matters; reordering changes behavior |
We chose runtime validation with schema checks at deployment over building a DSL compiler. The tradeoff: slightly less safety, but the system stays accessible to credit analysts who define rules alongside engineers.
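A minimal version of the deployment-time schema check might look like this (illustrative; a production system would more likely use a schema library such as jsonschema, and the allowed operator/action sets here are assumptions):

```python
# Hypothetical deployment-time validation of a rule document
ALLOWED_OPERATORS = {"equals", "greater_than", "in_range"}
ALLOWED_ACTIONS = {"approve", "reject", "flag_for_review"}
REQUIRED_KEYS = {"rule_id", "version", "field", "operator", "threshold", "action"}

def validate_rule_doc(doc: dict) -> list[str]:
    """Return a list of human-readable errors; empty list means the doc is valid."""
    errors = []
    missing = REQUIRED_KEYS - doc.keys()
    if missing:
        errors.append(f"missing keys: {sorted(missing)}")
    if doc.get("operator") not in ALLOWED_OPERATORS:
        errors.append(f"unknown operator: {doc.get('operator')!r}")
    if doc.get("action") not in ALLOWED_ACTIONS:
        errors.append(f"unknown action: {doc.get('action')!r}")
    return errors
```

Returning a list of errors rather than raising on the first one lets a rule author fix everything in one pass, which matters when the author is a credit analyst rather than an engineer.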
Impact
- ~30% reduction in manual review effort — rules that previously required human judgment were automated
- Consistent credit decisions — same inputs always produce same outputs, across products
- Rule changes in hours, not sprints — new rules are config changes, tested in staging, deployed without code review
- Full regulatory compliance — every decision is traceable to specific rule versions
Lessons Learned
- Rule engines need versioning from day one. We learned this when a credit analyst asked "why was this application rejected three months ago?" and we couldn't reproduce the decision because the rules had changed. Now every rule evaluation stores the rule version alongside the result.
- Audit trails are not optional in financial systems. They become the source of truth during disputes, regulatory reviews, and internal investigations. Design for auditability upfront; retrofitting is painful.
- Keep the rule language simple. The temptation is to build a Turing-complete DSL. Resist it. Simple field-operator-value rules cover 90% of cases. For the remaining 10%, write a custom evaluator plugin.
- Test rule interactions, not just individual rules. Rules that are individually correct can produce unexpected results when combined. Integration tests with realistic data caught bugs that unit tests missed.
Open Source Demo: RuleFlow
To explore these patterns in a simpler context, I built RuleFlow — a Drools-style rule engine in Node.js with an airline loyalty program example. Tier upgrades and discounts run on editable JSON rules, demonstrating how business logic can evolve without code changes.