Problem Context
Credit decisions required manual review across inconsistent evaluation criteria. Different products had different rule sets hardcoded into application logic. Changing a single rule meant a code change, PR review, deployment cycle, and regression testing. The process was slow, error-prone, and didn't scale.
The core question: how do you make business rules changeable without code deployments, while maintaining full auditability in a regulated financial system?
System Architecture
The rule engine follows a plugin architecture with three layers:
Rule Definitions are configuration, not code. Each rule is a JSON/YAML document specifying:
- Field to evaluate
- Operator (equals, greater_than, in_range, etc.)
- Threshold value
- Action on match (approve, reject, flag_for_review)
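Concretely, a rule document might look like this (the exact field names here are illustrative, not the production schema):

```yaml
# Hypothetical rule document -- field names are illustrative
rule_id: dti_ceiling
version: 2
field: debt_to_income
operator: greater_than
threshold: 0.43
action: reject        # action taken when the rule matches
```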
Evaluator Pipeline processes rules in sequence with short-circuit logic — if a hard rejection rule fires, remaining rules are skipped.
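The pipeline's short-circuit behavior can be sketched as follows. This is a minimal illustration, not the production code; the default-approve fallback and the exact result shape are assumptions:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional

class Action(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    FLAG_FOR_REVIEW = "flag_for_review"

@dataclass
class RuleResult:
    rule_id: str
    matched: bool
    action: Optional[Action]  # action to take when the rule matched

def evaluate_pipeline(
    rules: list[Callable[[dict], RuleResult]], context: dict
) -> tuple[Action, list[RuleResult]]:
    """Evaluate rules in order; stop as soon as a hard rejection fires."""
    trail: list[RuleResult] = []
    for rule in rules:
        result = rule(context)
        trail.append(result)
        if result.matched and result.action is Action.REJECT:
            # Short-circuit: remaining rules are skipped entirely.
            return Action.REJECT, trail
    return Action.APPROVE, trail  # assumed default when nothing rejects
```

Note that the returned trail records every rule that actually ran, which is exactly what the audit layer below consumes.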
Audit Trail captures every rule evaluation, not just the final decision. Each audit record stores the rule version, input data snapshot, evaluation result, and timestamp.
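An audit record with those four fields might be modeled as a frozen dataclass that serializes to one JSON line per evaluation (a sketch; the field names and log format are assumptions):

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass(frozen=True)
class AuditRecord:
    rule_id: str
    rule_version: int     # version of the rule that was active
    input_snapshot: dict  # exact input data the rule saw
    result: str           # evaluation outcome at this step
    evaluated_at: str     # UTC timestamp, ISO 8601

    def to_log_line(self) -> str:
        # One JSON line per evaluation, suitable for an append-only log
        return json.dumps(asdict(self), sort_keys=True)

record = AuditRecord(
    rule_id="dti_ceiling",
    rule_version=2,
    input_snapshot={"debt_to_income": 0.50},
    result="reject",
    evaluated_at=datetime.now(timezone.utc).isoformat(),
)
log_line = record.to_log_line()
```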
Key Engineering Decisions
1. Plugin Architecture Over Hardcoded Conditionals
The temptation was to build a big if/elif/else chain. Instead, rules are defined as config documents and loaded dynamically at startup. New rule types are added as evaluator plugins that implement a simple interface:
```python
from typing import Protocol

class RuleEvaluator(Protocol):
    def evaluate(self, context: EvaluationContext) -> RuleResult: ...
```

This means a new rule type (e.g., "check if applicant's state is in an allowed list") is a ~20-line class, not a branch in a growing conditional tree.
2. Dataclass-Based Evaluators for Type Safety
Every evaluator uses Python dataclasses for input/output schemas. This gives us:
- IDE autocompletion and type checking
- Automatic validation at the boundary
- Clean serialization for audit logs
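The "validation at the boundary" point can be sketched with `__post_init__` (the field names and ranges here are illustrative assumptions, not the production schema):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CreditInput:
    credit_score: int
    debt_to_income: float

    def __post_init__(self):
        # Validate once at the boundary so evaluators can trust their inputs
        if not 300 <= self.credit_score <= 850:
            raise ValueError(f"credit_score out of range: {self.credit_score}")
        if not 0.0 <= self.debt_to_income <= 1.0:
            raise ValueError(f"debt_to_income out of range: {self.debt_to_income}")
```

Evaluators downstream never re-check these invariants; a malformed input fails loudly before any rule runs.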
3. Full Audit on Every Evaluation, Not Just Decisions
In financial systems, knowing why a decision was made is as important as the decision itself. The audit logger captures:
- Which rules were evaluated
- What version of each rule was active
- What data was used as input
- What the result was at each step
This makes any decision reproducible months later — critical for regulatory inquiries.
Technical Tradeoffs
| Decision | Benefit | Cost |
|---|---|---|
| Dynamic rule loading | Rules change without deployments | Startup validation cost; no compile-time safety |
| Runtime schema validation | Non-engineers can author rules | Validation errors surface at deploy, not build time |
| Full audit trail | Complete decision reproducibility | Storage overhead; ~3x write amplification |
| Short-circuit evaluation | Faster rejections, cleaner logic | Rule ordering matters; reordering changes behavior |
We chose runtime validation with schema checks at deployment over building a DSL compiler. The tradeoff: slightly less safety, but the system stays accessible to credit analysts who define rules alongside engineers.
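A minimal version of the deployment-time schema check might look like this (illustrative; a production system would more likely use a schema library such as jsonschema, and the allowed operator/action sets here are assumptions):

```python
# Hypothetical deployment-time validation of a rule document
ALLOWED_OPERATORS = {"equals", "greater_than", "in_range"}
ALLOWED_ACTIONS = {"approve", "reject", "flag_for_review"}
REQUIRED_KEYS = {"rule_id", "version", "field", "operator", "threshold", "action"}

def validate_rule_doc(doc: dict) -> list[str]:
    """Return a list of human-readable errors; empty list means the doc is valid."""
    errors = []
    missing = REQUIRED_KEYS - doc.keys()
    if missing:
        errors.append(f"missing keys: {sorted(missing)}")
    if doc.get("operator") not in ALLOWED_OPERATORS:
        errors.append(f"unknown operator: {doc.get('operator')!r}")
    if doc.get("action") not in ALLOWED_ACTIONS:
        errors.append(f"unknown action: {doc.get('action')!r}")
    return errors
```

Returning a list of errors rather than raising on the first one lets a rule author fix everything in one pass, which matters when the author is a credit analyst rather than an engineer.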
Impact
- ~30% reduction in manual review effort — rules that previously required human judgment were automated
- Consistent credit decisions — same inputs always produce same outputs, across products
- Rule changes in hours, not sprints — new rules are config changes, tested in staging, deployed without code review
- Full regulatory compliance — every decision is traceable to specific rule versions
Lessons Learned
- Rule engines need versioning from day one. We learned this when a credit analyst asked "why was this application rejected three months ago?" and we couldn't reproduce the decision because the rules had changed. Now every rule evaluation stores the rule version alongside the result.
- Audit trails are not optional in financial systems. They become the source of truth during disputes, regulatory reviews, and internal investigations. Design for auditability upfront; retrofitting is painful.
- Keep the rule language simple. The temptation is to build a Turing-complete DSL. Resist it. Simple field-operator-value rules cover 90% of cases. For the remaining 10%, write a custom evaluator plugin.
- Test rule interactions, not just individual rules. Rules that are individually correct can produce unexpected results when combined. Integration tests with realistic data caught bugs that unit tests missed.
Open Source Demo: RuleFlow
To explore these patterns in a simpler context, I built RuleFlow — a Drools-style rule engine in Node.js with an airline loyalty program example. Tier upgrades and discounts run on editable JSON rules, demonstrating how business logic can evolve without code changes.