Design an extensible corporate-card rules engine. Evaluate per-expense and per-trip policies, choose a useful return type, then support composite AND / OR / NOT rules and scale the design to millions of expenses and tens of thousands of rules.

Expense Rules Engine — Rippling Interview Question

Requirements

Input expenses are dictionaries with string keys and values. Typical fields include expense_id, trip_id, amount_usd, expense_type, vendor_type, and vendor_name.
Implement evaluate_rules(rules: list<rule>, expenses: list<expense>) -> ...; the return type is part of the design discussion, but it should preserve which expense or trip violated which rule.
Base individual-expense rules include:
- no restaurant expense over $75 where vendor_type == "restaurant";
- no airfare expenses;
- no entertainment expenses;
- no single expense over $250.
Base group / trip rules include:
- no trip over $2000 total;
- no meal expenses over $200 total per trip.
Design for future rule types and API-created rules. Rules should be treated as data rather than hardcoded as one function per policy.
Reuse the same predicate / condition layer across individual and group rules. For example, the expense_type == "meals" condition can filter expenses before summing a trip-level amount.
Follow-up: handle millions of expenses per day and tens of thousands of rules; discuss storage, rule indexing, streaming evaluation, and notification of violations.
Follow-up: support composite rules such as (restaurant AND meals AND amount > 50), (entertainment AND amount > 100) OR client_hosting, and (amount > 100) AND NOT vendor_name == "Staples".

Examples

{
  "expense_id": "001",
  "trip_id": "001",
  "amount_usd": "49.99",
  "expense_type": "client_hosting",
  "vendor_type": "restaurant",
  "vendor_name": "Outback Roadhouse"
}

A useful violation payload is explicit enough for API clients and notifications:

{
  "rule_id": "single-expense-limit",
  "expense_id": "004",
  "trip_id": "002",
  "message": "Expense 004 exceeds $250"
}

Notes

A good return type separates individual-expense violations from trip-level violations and preserves which rule was violated.
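One possible shape for that return type is sketched below. The class names Violation and EvaluationResult, and the by_rule helper, are assumptions made for illustration; only the payload fields rule_id, expense_id, trip_id, and message come from the example above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Violation:
    rule_id: str
    message: str
    trip_id: Optional[str] = None
    expense_id: Optional[str] = None  # None marks a trip-level violation


@dataclass
class EvaluationResult:
    expense_violations: list[Violation] = field(default_factory=list)
    trip_violations: list[Violation] = field(default_factory=list)

    def add(self, violation: Violation) -> None:
        # Route by whether the violation points at a single expense.
        if violation.expense_id is not None:
            self.expense_violations.append(violation)
        else:
            self.trip_violations.append(violation)

    def by_rule(self) -> dict[str, list[Violation]]:
        # Secondary view: every violation grouped under the rule that fired.
        grouped: dict[str, list[Violation]] = {}
        for v in self.expense_violations + self.trip_violations:
            grouped.setdefault(v.rule_id, []).append(v)
        return grouped


result = EvaluationResult()
result.add(Violation("single-expense-limit", "Expense 004 exceeds $250",
                     trip_id="002", expense_id="004"))
result.add(Violation("trip-total-limit", "Trip 002 exceeds $2000 total",
                     trip_id="002"))
```

Keeping two lists rather than one flat list lets an API client render per-expense and per-trip problems separately without inspecting each record.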
The object-oriented design (OOD) signal is the core of the round: use a predicate / rule interface, then compose rules into an expression tree for AND / OR / NOT. The canonical shape is the Rules / Specification pattern: a Rule interface with a single evaluate(context) -> bool or evaluate(context) -> Violation? method, plus composite rules AndRule, OrRule, and NotRule that hold child rules and combine their results.
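A minimal sketch of that expression tree follows, using the boolean variant of the interface. The method name matches and the Leaf wrapper are assumptions chosen for brevity; AndRule, OrRule, and NotRule are the composites named above.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable, Protocol

Expense = dict[str, str]


class Predicate(Protocol):
    def matches(self, expense: Expense) -> bool: ...


@dataclass(frozen=True)
class Leaf:
    """Wraps a plain function so it can sit in the tree (hypothetical helper)."""
    test: Callable[[Expense], bool]

    def matches(self, expense: Expense) -> bool:
        return self.test(expense)


@dataclass(frozen=True)
class AndRule:
    children: tuple[Predicate, ...]

    def matches(self, expense: Expense) -> bool:
        return all(child.matches(expense) for child in self.children)


@dataclass(frozen=True)
class OrRule:
    children: tuple[Predicate, ...]

    def matches(self, expense: Expense) -> bool:
        return any(child.matches(expense) for child in self.children)


@dataclass(frozen=True)
class NotRule:
    child: Predicate

    def matches(self, expense: Expense) -> bool:
        return not self.child.matches(expense)


# (amount > 100) AND NOT vendor_name == "Staples"
amount_over_100 = Leaf(lambda e: Decimal(e["amount_usd"]) > 100)
is_staples = Leaf(lambda e: e.get("vendor_name") == "Staples")
non_staples_over_100 = AndRule((amount_over_100, NotRule(is_staples)))

# (entertainment AND amount > 100) OR client_hosting
is_entertainment = Leaf(lambda e: e.get("expense_type") == "entertainment")
is_client_hosting = Leaf(lambda e: e.get("expense_type") == "client_hosting")
hosting_or_big_entertainment = OrRule(
    (AndRule((is_entertainment, amount_over_100)), is_client_hosting)
)
```

Because every node exposes the same method, the evaluator never needs to know whether it holds a leaf or a nested tree.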
Split the type hierarchy into two layers: per-transaction rules that read one expense, and per-trip aggregate rules that read a grouped list. Aggregate rules need an explicit grouping step in the evaluator; running them per expense would either double-count or miss trip totals.
A practical data-driven model has Condition(field, operator, value) for leaf predicates and Rule(rule_id, condition, violation_message) for individual checks. Group rules add group_by, aggregate, threshold, and an optional filter_condition that reuses the same Condition object the individual rules use.
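The data model above might look like the sketch below. The operator vocabulary, the NUMERIC_FIELDS set, the class name GroupRule, the helper violating_groups, and the choice that a missing field never matches are all assumptions of this sketch; only "sum" is handled as an aggregate.

```python
import operator
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional

Expense = dict[str, str]

# Assumed operator vocabulary; a real API would likely validate against it.
OPERATORS = {"==": operator.eq, "!=": operator.ne,
             ">": operator.gt, "<": operator.lt}
NUMERIC_FIELDS = {"amount_usd"}  # assumed: compared as numbers, not strings


@dataclass(frozen=True)
class Condition:
    field: str
    op: str
    value: str

    def matches(self, expense: Expense) -> bool:
        actual = expense.get(self.field)
        if actual is None:
            return False  # assumption: a missing field never matches
        if self.field in NUMERIC_FIELDS:
            return OPERATORS[self.op](Decimal(actual), Decimal(self.value))
        return OPERATORS[self.op](actual, self.value)


@dataclass(frozen=True)
class Rule:
    rule_id: str
    condition: Condition
    violation_message: str


@dataclass(frozen=True)
class GroupRule:
    rule_id: str
    group_by: str
    aggregate: str
    threshold: Decimal
    violation_message: str
    filter_condition: Optional[Condition] = None


def violating_groups(rule: GroupRule, expenses: list[Expense]) -> list[str]:
    """Return the group keys whose filtered sum exceeds the threshold."""
    if rule.aggregate != "sum":
        raise ValueError(f"unsupported aggregate: {rule.aggregate}")
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for e in expenses:
        if rule.filter_condition is None or rule.filter_condition.matches(e):
            totals[e[rule.group_by]] += Decimal(e["amount_usd"])
    return [key for key, total in totals.items() if total > rule.threshold]


# The same Condition type serves as a per-expense check and a trip filter.
single_limit = Rule("single-expense-limit",
                    Condition("amount_usd", ">", "250"),
                    "Expense exceeds $250")
meal_cap = GroupRule("trip-meal-limit", "trip_id", "sum", Decimal("200"),
                     "Meals exceed $200 for the trip",
                     filter_condition=Condition("expense_type", "==", "meals"))
expenses = [
    {"expense_id": "1", "trip_id": "T1", "amount_usd": "120.00", "expense_type": "meals"},
    {"expense_id": "2", "trip_id": "T1", "amount_usd": "95.50", "expense_type": "meals"},
    {"expense_id": "3", "trip_id": "T2", "amount_usd": "180.00", "expense_type": "meals"},
    {"expense_id": "4", "trip_id": "T2", "amount_usd": "400.00", "expense_type": "airfare"},
]
```

Since every object here is plain data, a rule created through an API can be stored as a row or JSON document and rebuilt on load.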
When both individual and group rules run, combine results into one response rather than short-circuiting after the first violation; interviewers usually expect all violated policies to be visible.
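A rough illustration of collecting every violation in one pass is below. It uses plain functions as rules for brevity; the (rule_id, scope, check) tuple layout, the scope strings "expense" and "trip", and the dictionary-shaped result are assumptions of this sketch rather than a prescribed design.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Callable

Expense = dict[str, str]
# (rule_id, scope, check); check returns True when the input VIOLATES the rule.
Rule = tuple[str, str, Callable]


def evaluate_rules(rules: list[Rule], expenses: list[Expense]) -> dict[str, list[tuple[str, str]]]:
    result: dict[str, list[tuple[str, str]]] = {
        "expense_violations": [],  # (rule_id, expense_id)
        "trip_violations": [],     # (rule_id, trip_id)
    }
    expense_rules = [r for r in rules if r[1] == "expense"]
    trip_rules = [r for r in rules if r[1] == "trip"]

    trips: dict[str, list[Expense]] = defaultdict(list)
    for e in expenses:
        trips[e["trip_id"]].append(e)
        for rule_id, _, violates in expense_rules:
            if violates(e):  # no break: later rules are still checked
                result["expense_violations"].append((rule_id, e["expense_id"]))

    for trip_id, group in trips.items():
        for rule_id, _, violates in trip_rules:
            if violates(group):
                result["trip_violations"].append((rule_id, trip_id))
    return result


rules = [
    ("restaurant-over-75", "expense",
     lambda e: e["vendor_type"] == "restaurant" and Decimal(e["amount_usd"]) > 75),
    ("single-expense-over-250", "expense",
     lambda e: Decimal(e["amount_usd"]) > 250),
    ("trip-over-2000", "trip",
     lambda group: sum(Decimal(e["amount_usd"]) for e in group) > 2000),
]
expenses = [
    {"expense_id": "004", "trip_id": "002", "amount_usd": "300.00", "vendor_type": "restaurant"},
    {"expense_id": "005", "trip_id": "002", "amount_usd": "1800.00", "vendor_type": "airline"},
]
report = evaluate_rules(rules, expenses)
```

Expense 004 appears under two rules in the report, which is the behavior the paragraph above asks for.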
For scale, pre-index rules by the fields they read so each expense only triggers the rules whose predicates touch its keys; aggregate per-trip metrics incrementally instead of re-scanning; stream evaluation per trip-window; and emit violation events to an async notification path rather than blocking ingestion.
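Two of those ideas can be sketched in a few lines. The first class is one possible refinement of field-based indexing: it keys rules on an equality anchor (field, value) so that, for example, an airfare rule is only considered for airfare expenses. The second keeps running per-trip sums and reports only the expense that crosses the limit. RuleIndex, TripTotals, and the anchor concept are hypothetical names introduced here, and the in-memory dictionaries stand in for whatever store a real deployment would use.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Optional

Expense = dict[str, str]


class RuleIndex:
    """Maps (field, value) equality anchors to rule ids."""

    def __init__(self) -> None:
        self._by_anchor: dict[tuple[str, str], set[str]] = defaultdict(set)
        self._unanchored: set[str] = set()  # rules that must see every expense

    def add(self, rule_id: str, anchor: Optional[tuple[str, str]] = None) -> None:
        if anchor is None:
            self._unanchored.add(rule_id)
        else:
            self._by_anchor[anchor].add(rule_id)

    def candidates(self, expense: Expense) -> set[str]:
        # Only rules whose anchor equals one of the expense's items are returned.
        found = set(self._unanchored)
        for item in expense.items():
            found |= self._by_anchor.get(item, set())
        return found


class TripTotals:
    """Running per-trip sums, so each new expense is O(1) instead of a re-scan."""

    def __init__(self, threshold: Decimal) -> None:
        self.threshold = threshold
        self._totals: dict[str, Decimal] = defaultdict(Decimal)

    def add(self, expense: Expense) -> bool:
        """Return True only when this expense pushes the trip over the limit."""
        trip_id = expense["trip_id"]
        before = self._totals[trip_id]
        after = before + Decimal(expense["amount_usd"])
        self._totals[trip_id] = after
        return before <= self.threshold < after


index = RuleIndex()
index.add("no-airfare", anchor=("expense_type", "airfare"))
index.add("single-expense-limit")  # reads only amount_usd, so no anchor

totals = TripTotals(Decimal("2000"))
crossings = [
    totals.add({"trip_id": "T1", "amount_usd": amount})
    for amount in ("1500", "600", "100")
]
```

Reporting only the crossing keeps the notification path from emitting a new event for every later expense on a trip that is already over budget.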
Edge cases: amount_usd arrives as a string, so parse it as a number before comparing; handle missing fields consistently; decide whether negative or zero amounts are invalid input or valid input that simply never trips a threshold rule; clarify whether multiple rules can produce duplicate-looking messages for the same expense.
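The numeric-parsing point is easy to get wrong because string comparison is lexicographic. The helper below is a small sketch; the name parse_amount and the choice to return None for missing or malformed values are assumptions, and the policy for negative amounts is deliberately left to the caller.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_amount(expense: dict[str, str]) -> Optional[Decimal]:
    """Return amount_usd as a Decimal, or None if it is missing or malformed."""
    raw = expense.get("amount_usd")
    if raw is None:
        return None
    try:
        amount = Decimal(raw.strip())
    except InvalidOperation:
        return None
    if not amount.is_finite():  # rejects "NaN" and "Infinity"
        return None
    return amount


# Comparing the raw strings gives the wrong answer: "1" sorts before "7".
string_compare = "100.00" > "75"
numeric_compare = parse_amount({"amount_usd": "100.00"}) > Decimal("75")
```

Decimal is used instead of float so that cent values such as 49.99 sum exactly when trip totals are accumulated.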
In AI-assisted rounds, interviewers expect the candidate to own the rule interface and the per-trip aggregation contract before prompting the tool. Generating composite-rule boilerplate is fine, but the interface design has to come from you.

Preparation

Implement the base rule engine twice: once with simple per-expense functions, once with Rule objects + AndRule / OrRule / NotRule composites so the same evaluator handles both flat and nested rules.
Group expenses by trip_id once per evaluation pass; precompute trip totals and per-expense_type sums and feed them to aggregate rules so the same totals are not recomputed per rule.
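As a sketch of that single pass, the function below builds both sets of totals at once so that every aggregate rule becomes a dictionary lookup. The name summarize_trips and the (trip_id, expense_type) key layout are assumptions; the thresholds are the two base group rules from the requirements.

```python
from collections import defaultdict
from decimal import Decimal

Expense = dict[str, str]


def summarize_trips(expenses: list[Expense]):
    """One pass over the expenses; returns (trip totals, per-type totals)."""
    trip_totals: dict[str, Decimal] = defaultdict(Decimal)
    type_totals: dict[tuple[str, str], Decimal] = defaultdict(Decimal)
    for e in expenses:
        amount = Decimal(e["amount_usd"])
        trip_totals[e["trip_id"]] += amount
        type_totals[(e["trip_id"], e["expense_type"])] += amount
    return dict(trip_totals), dict(type_totals)


expenses = [
    {"trip_id": "T1", "amount_usd": "120.00", "expense_type": "meals"},
    {"trip_id": "T1", "amount_usd": "95.50", "expense_type": "meals"},
    {"trip_id": "T1", "amount_usd": "1900.00", "expense_type": "airfare"},
    {"trip_id": "T2", "amount_usd": "60.00", "expense_type": "meals"},
]
trip_totals, type_totals = summarize_trips(expenses)

# Both base group rules read the precomputed sums instead of re-scanning.
trips_over_2000 = sorted(t for t, total in trip_totals.items() if total > 2000)
trips_over_meal_cap = sorted(
    t for (t, kind), total in type_totals.items()
    if kind == "meals" and total > 200
)
```

Rules that filter on something other than expense_type would still need their own pass or an extra precomputed key, which is worth mentioning when discussing the trade-off.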
Prepare a layered scale answer: rule storage, rule compilation into composite trees on load, field-based indexing of leaf predicates, streaming per-trip evaluation, and violation events emitted to a queue for notifications.