Extracting a Recurring Billing Engine from a 15-Year-Old .NET 4.8 Monolith
The Challenge
We had a recurring billing engine buried inside a .NET Framework 4.8 monolith that had been running in production for over a decade. It processed thousands of credit card and ACH payments daily across multiple payment gateways (TSYS, Paragon, First American). The system worked, but it was showing its age:
- Quartz.NET cron jobs firing at 3:01 AM daily, with a separate 3:06 AM monthly run split into 4-16 batches
- Castle Windsor IoC wired up through XML-heavy configuration
- Boolean flag soup for frequency: IsWeekly, Is2Weekly, Is4Weekly, IsDelayPay, all separate columns instead of an enum
- Two separate transaction tables: one for successes and one for merchant-level failures
- Odd/even PK splitting for multi-server processing (literally: "if subscription ID is odd, this server processes it")
The code worked. But we couldn't extend it, we couldn't test it, and we couldn't deploy it independently. It was time to extract.
The Approach: Read Everything, Then Redesign
We didn't start by writing code. We started by reading — every line of the heritage system's recurring logic. Here's what we mapped:
The Heritage Pipeline
Quartz.NET Scheduler (cron)
→ DailyRecurringJobService (AES key check, SMS alert)
→ DailyRecurringService (orchestrator)
→ ProcessDeleteDates()
→ WeeklyService.Process()
→ BiWeeklyService.Process()
→ FourWeeklyService.Process()
→ DelayPayService.Process()
→ MonthlyService (separate job, batched)
→ ProcessService.PayOccurence() (~750 lines)
→ Gateway routing (CC vs ACH vs Stripe)
→ Failure classification (cardholder vs merchant)
→ Circuit breaker (5 consecutive "597" responses → halt)
The PayOccurence() method alone was ~750 lines. It handled guard clauses, payment method resolution, gateway API calls, response parsing, failure classification, transaction recording, email notifications, and circuit breaker logic — all in a single method.
Key Behaviors We Had to Preserve
- Short-month handling: On February 28th, the system must also process subscriptions set for days 29, 30, and 31. Same logic for April 30th (catches day-31 subs), etc.
- Circuit breaker: If the payment gateway returns response code 597 (unreachable) five consecutive times, halt all processing. This prevents hammering a down gateway and accumulating false failures.
- Failure classification: A declined card (cardholder-level) is handled differently from "merchant credentials invalid" (merchant-level). The heritage system used separate database tables for this.
- DelayPay: A one-time future charge that deactivates the subscription after successful payment. Think "charge this card on March 15th and then we're done."
- Batch splitting: Monthly processing splits all eligible subscriptions into groups with random 2-30 minute sleep between batches to avoid gateway rate limits.
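The short-month rule is easy to get subtly wrong, so here is a minimal sketch of it. The class and method names (BillingDays, DueOn) are hypothetical and not taken from the heritage code. The sketch also assumes the rollover fires on the last calendar day of the month, so in a leap year it would trigger on February 29th rather than the 28th.

```csharp
using System;
using System.Collections.Generic;

public static class BillingDays
{
    // Returns the billing-day values whose subscriptions are due on `date`.
    // Assumption: on the last day of a month, every later day number
    // (up to 31) rolls into that day so those subscriptions are not skipped.
    public static IReadOnlyList<int> DueOn(DateOnly date)
    {
        var days = new List<int> { date.Day };
        int lastDay = DateTime.DaysInMonth(date.Year, date.Month);

        if (date.Day == lastDay)
        {
            for (int d = lastDay + 1; d <= 31; d++)
            {
                days.Add(d);
            }
        }

        return days;
    }
}
```

Under these assumptions, February 28th of a non-leap year yields days 28 through 31, April 30th yields 30 and 31, and any mid-month date yields only itself.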
The New Architecture
We rebuilt this as two new projects inside our existing PayEz-Core solution:
PayEz.Recurring.Application → Service layer (business logic)
PayEz.Recurring.Api → Internal HTTP API
Boolean Flags → Enums
The single biggest improvement:
// Heritage: 4 boolean columns
bool IsWeekly;
bool Is2Weekly;
bool Is4Weekly;
bool IsDelayPay;
// (if none are true, it's monthly... implicitly)

// New: one enum
public enum RecurringFrequency
{
    Monthly = 1,
    Weekly = 2,
    BiWeekly = 3,
    FourWeekly = 4,
    OneTime = 5 // replaces DelayPay
}
Same treatment for status (was bool Active + bool Paused + DateTime? DeleteDate, now SubscriptionStatus enum) and failure tracking (was two separate tables, now a single RecurringTransaction with a TransactionFailureType enum).
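Migrating existing rows means turning the four heritage flags into one enum value. A sketch of that mapping might look like the following. FrequencyMigration and FromHeritageFlags are hypothetical names. The sketch assumes the flags are mutually exclusive and treats any row with more than one flag set as bad data, which may not match how the real migration handles such rows.

```csharp
using System;

public enum RecurringFrequency
{
    Monthly = 1,
    Weekly = 2,
    BiWeekly = 3,
    FourWeekly = 4,
    OneTime = 5
}

public static class FrequencyMigration
{
    // Maps the heritage boolean columns to a single enum value.
    // "No flag set" is the implicit monthly case from the heritage schema.
    public static RecurringFrequency FromHeritageFlags(
        bool isWeekly, bool is2Weekly, bool is4Weekly, bool isDelayPay)
    {
        return (isWeekly, is2Weekly, is4Weekly, isDelayPay) switch
        {
            (false, false, false, false) => RecurringFrequency.Monthly,
            (true, false, false, false) => RecurringFrequency.Weekly,
            (false, true, false, false) => RecurringFrequency.BiWeekly,
            (false, false, true, false) => RecurringFrequency.FourWeekly,
            (false, false, false, true) => RecurringFrequency.OneTime,
            // Assumption: more than one flag set is invalid and should surface loudly.
            _ => throw new InvalidOperationException("Conflicting heritage frequency flags")
        };
    }
}
```

Failing on conflicting flags rather than picking a winner keeps ambiguous rows visible during migration instead of silently changing someone's billing cadence.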
Quartz.NET → BackgroundService
Heritage used Quartz.NET with [DisallowConcurrentExecution] and XML-configured cron expressions. We replaced it with a .NET BackgroundService using PeriodicTimer:
public class RecurringSchedulerService : BackgroundService
{
    protected override async Task ExecuteAsync(CancellationToken ct)
    {
        using var timer = new PeriodicTimer(TimeSpan.FromMinutes(1));
        while (await timer.WaitForNextTickAsync(ct))
        {
            var now = TimeProvider.System.GetUtcNow();
            if (ShouldRunDaily(now)) await RunDaily(ct);
            if (ShouldRunMonthly(now)) await RunMonthly(ct);
        }
    }
}
Schedule configuration comes from appsettings.json via the Options pattern — no more Quartz JobDataMap with comma-separated strings.
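With a one-minute timer, the scheduler needs a check that fires once per day at the configured time and not on every later tick. The sketch below shows one way that check could work. RecurringScheduleOptions, its property names, and DailyRunGate are all assumptions for illustration, not the actual configuration shape. It also compares against whatever offset the supplied timestamp carries, so which time zone "3:01 AM" refers to is left to the caller.

```csharp
using System;

// Hypothetical options class; in a real app it would be populated
// from appsettings.json through the Options pattern.
public sealed class RecurringScheduleOptions
{
    public int DailyHour { get; set; } = 3;
    public int DailyMinute { get; set; } = 1;
}

public sealed class DailyRunGate
{
    private readonly RecurringScheduleOptions _options;
    private DateOnly? _lastRunDate; // in-memory only; lost on restart

    public DailyRunGate(RecurringScheduleOptions options) => _options = options;

    // Returns true at most once per calendar day, on the first tick
    // at or after the configured time.
    public bool ShouldRunDaily(DateTimeOffset now)
    {
        var today = DateOnly.FromDateTime(now.DateTime);
        if (_lastRunDate == today) return false;

        var scheduled = new TimeOnly(_options.DailyHour, _options.DailyMinute);
        if (TimeOnly.FromDateTime(now.DateTime) < scheduled) return false;

        _lastRunDate = today;
        return true;
    }
}
```

Because the last-run date lives in memory, a process that restarts after the scheduled time would report true again on its first tick. In this sketch that case has to be caught by other safeguards, such as a per-date distributed lock or an "already paid today" check.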
Odd/Even → Redis Distributed Locking
Heritage prevented multi-server concurrent execution by checking AppSetting["Server"] and only processing odd or even subscription IDs. We replaced this with Redis distributed locks:
Lock key: "recurring:daily:lock:{date}"
TTL: 2 hours
First pod to acquire the lock runs processing. Others skip gracefully.
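The acquire-or-skip flow can be sketched without a Redis dependency. In the code below, ILockStore and InMemoryLockStore are hypothetical stand-ins: the in-memory store only mimics "set if the key does not exist, with an expiry" inside one process, and is not a distributed lock. A Redis-backed implementation would rely on an atomic set-if-not-exists with a TTL instead. The yyyy-MM-dd date format in the key is also an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public interface ILockStore
{
    // Returns true only if no unexpired lock exists for the key.
    bool TryAcquire(string key, TimeSpan ttl, DateTimeOffset now);
}

// In-process stand-in for demonstration only.
public sealed class InMemoryLockStore : ILockStore
{
    private readonly Dictionary<string, DateTimeOffset> _expiry = new();
    private readonly object _gate = new();

    public bool TryAcquire(string key, TimeSpan ttl, DateTimeOffset now)
    {
        lock (_gate)
        {
            if (_expiry.TryGetValue(key, out var expiresAt) && expiresAt > now)
            {
                return false; // someone else holds the lock
            }
            _expiry[key] = now + ttl;
            return true;
        }
    }
}

public static class DailyLock
{
    public static readonly TimeSpan Ttl = TimeSpan.FromHours(2);

    // Assumed date format for the {date} placeholder in the lock key.
    public static string KeyFor(DateOnly date) =>
        "recurring:daily:lock:" + date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
}
```

A pod would call TryAcquire with DailyLock.KeyFor(today) and DailyLock.Ttl, run processing only when it returns true, and otherwise skip. The TTL means a crashed pod cannot hold the lock indefinitely.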
750-Line PayOccurence → Clean Service Methods
The heritage PayOccurence() method was decomposed into:
- Guard clauses → extracted to the repository query (only fetch eligible subscriptions)
- Payment routing → ProcessSubscriptionAsync() with pattern matching on PaymentType
- Circuit breaker → Polly CircuitBreakerAsync per merchant
- Failure classification → TransactionFailureType enum set at transaction creation
- Transaction recording → single RecurringTransaction entity (no separate failure table)
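To make the circuit breaker rule concrete, here is a hand-rolled sketch of the counting behavior: five consecutive 597 responses open the breaker, and any other response resets the streak. This is deliberately not Polly and not the production code. GatewayUnreachableBreaker is a hypothetical name, and one instance per merchant is assumed to mirror the per-merchant breaker described above.

```csharp
using System;

public sealed class GatewayUnreachableBreaker
{
    public const string UnreachableCode = "597";

    private readonly int _threshold;
    private int _consecutive;

    public GatewayUnreachableBreaker(int threshold = 5)
    {
        if (threshold < 1) throw new ArgumentOutOfRangeException(nameof(threshold));
        _threshold = threshold;
    }

    // True once the threshold of consecutive unreachable responses is hit.
    public bool IsOpen => _consecutive >= _threshold;

    // Records one gateway response code. Any non-597 response resets
    // the streak; once open, the breaker stays open until Reset().
    public void Record(string responseCode)
    {
        if (IsOpen) return;
        _consecutive = responseCode == UnreachableCode ? _consecutive + 1 : 0;
    }

    public void Reset() => _consecutive = 0;
}
```

The processing loop would check IsOpen before each charge and stop once it flips, so a down gateway is not hammered and subscriptions do not accumulate false failures.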
Change History: Full Snapshot → JSON Diff
Heritage recorded every subscription edit by duplicating the entire entity into a history table (30+ columns copied). We replaced this with JSON diff tracking:
var changes = new Dictionary<string, object?>();
if (dto.Amount.HasValue && dto.Amount.Value != subscription.Amount)
{
    changes["Amount"] = new { from = subscription.Amount, to = dto.Amount.Value };
    subscription.Amount = dto.Amount.Value;
}
// ... only changed fields are recorded
History entries now store PreviousValues and NewValues as JSONB — compact and queryable.
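One way to produce the two JSON payloads is to collect old and new values in separate maps and serialize each. The helper below is a sketch under that assumption. ChangeTracker, Track, and ToJson are hypothetical names, and the real code may build its history entries differently.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class ChangeTracker
{
    // Records a field in both maps only when its value actually changed.
    public static void Track<T>(
        string field, T oldValue, T newValue,
        IDictionary<string, object?> previous,
        IDictionary<string, object?> next)
    {
        if (EqualityComparer<T>.Default.Equals(oldValue, newValue)) return;
        previous[field] = oldValue;
        next[field] = newValue;
    }

    // Serializes one side of the diff for storage in a JSON column.
    public static string ToJson(IDictionary<string, object?> values) =>
        JsonSerializer.Serialize(values);
}
```

Calling Track once per editable field leaves unchanged fields out of both payloads, so an edit that only touches the amount stores a single key on each side.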
What We Kept
Not everything needed to change. We deliberately preserved:
- The scheduling times (3:01 AM daily, 3:06 AM monthly) — operations teams are used to this
- The batch splitting logic — random sleep between batches prevents gateway rate limiting
- The minimum amount threshold ($5 default) — prevents processing trivially small amounts
- The "already paid today" guard — prevents double-charging if the scheduler runs twice
- Legacy bridge fields (PortalId, SpecialTypeId): needed for heritage gateway credential resolution
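The batch splitting we kept can be sketched as a split into near-equal groups plus a random pause between them. MonthlyBatcher and its members are hypothetical names. The sketch assumes subscriptions are identified by integer IDs and that the pause is a whole number of minutes, neither of which is confirmed by the heritage code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MonthlyBatcher
{
    // Splits ids into at most `batchCount` groups of near-equal size.
    public static List<List<int>> Split(IReadOnlyList<int> ids, int batchCount)
    {
        if (batchCount < 1) throw new ArgumentOutOfRangeException(nameof(batchCount));

        int size = Math.Max(1, (int)Math.Ceiling(ids.Count / (double)batchCount));
        return ids.Chunk(size).Select(chunk => chunk.ToList()).ToList();
    }

    // Random pause of 2 to 30 minutes (inclusive) to wait between batches.
    public static TimeSpan NextPause(Random rng) =>
        TimeSpan.FromMinutes(rng.Next(2, 31));
}
```

A monthly run would process each batch in turn and await Task.Delay(MonthlyBatcher.NextPause(rng), ct) between them, which spreads gateway calls out instead of sending them in one burst.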
Lessons Learned
- Read the heritage code fully before designing. We found the short-month handling logic buried 400 lines into RecurringService.cs. If we'd started coding without reading everything, we'd have missed it and broken billing for everyone with a subscription on the 31st.
- Enums over booleans, always. The heritage system's boolean flags made every query a nightmare of WHERE IsWeekly = false AND Is2Weekly = false AND Is4Weekly = false AND IsDelayPay = false to find monthly subscriptions. One enum column fixed all of this.
- Don't abstract prematurely, but do separate concerns. The 750-line PayOccurence() method wasn't bad because it was long; it was bad because it mixed guard logic, payment routing, failure classification, and notification into one undifferentiated blob. Splitting into focused methods made each one testable.
- Heritage systems contain tribal knowledge. Response code 597 meaning "gateway unreachable" and triggering a circuit breaker wasn't documented anywhere. It was in a code comment from 2019. Preserving these behaviors required reading every line and every comment.
- Bridge fields are temporary, but they're necessary. We added PortalId and SpecialTypeId to the new schema even though they're heritage concepts. Without them, we can't resolve which merchant credentials to use for legacy gateway routing. They'll be removed once all clients migrate to Stripe.
What's Next
The internal API is built and compiles. Next steps:
- Wire the Vibe API to call the Recurring API via HttpClient
- Connect to the existing Payment API for legacy CC/ACH charges
- Connect to the existing Stripe client for modern payment processing
- Run the SQL migration against the payez_idp database to create the recurring schema
- Deploy to AKS alongside the existing services
The heritage system still runs in production. We'll run both in parallel until we're confident the new system matches behavior, then cut over.
Jon Ranes is the founder of PayEz and IdealVibe. BA.Pert is his AI business analyst who never sleeps and always has opinions about schema design.