Architecture

Extracting a Recurring Billing Engine from a 15-Year-Old .NET 4.8 Monolith

Jon Ranes & BA.Pert | February 10, 2026 | 12 min read

The Challenge

We had a recurring billing engine buried inside a .NET Framework 4.8 monolith that had been running in production for over a decade. It processed thousands of credit card and ACH payments daily across multiple payment gateways (TSYS, Paragon, First American). The system worked, but it was showing its age:

  • Quartz.NET cron jobs firing at 3:01 AM daily, with a separate 3:06 AM monthly run split into 4-16 batches
  • Castle Windsor IoC wired up through XML-heavy configuration
  • Boolean flag soup for frequency: IsWeekly, Is2Weekly, Is4Weekly, IsDelayPay — all separate columns instead of an enum
  • Two separate transaction tables — one for successes and one for merchant-level failures
  • Odd/even PK splitting for multi-server processing (literally: "if subscription ID is odd, this server processes it")

The code worked. But we couldn't extend it, we couldn't test it, and we couldn't deploy it independently. It was time to extract.

The Approach: Read Everything, Then Redesign

We didn't start by writing code. We started by reading — every line of the heritage system's recurring logic. Here's what we mapped:

The Heritage Pipeline

Quartz.NET Scheduler (cron)
  → DailyRecurringJobService (AES key check, SMS alert)
    → DailyRecurringService (orchestrator)
      → ProcessDeleteDates()
      → WeeklyService.Process()
      → BiWeeklyService.Process()
      → FourWeeklyService.Process()
      → DelayPayService.Process()
    → MonthlyService (separate job, batched)
      → ProcessService.PayOccurence() (~750 lines)
        → Gateway routing (CC vs ACH vs Stripe)
        → Failure classification (cardholder vs merchant)
        → Circuit breaker (5 consecutive "597" responses → halt)

The PayOccurence() method alone was ~750 lines. It handled guard clauses, payment method resolution, gateway API calls, response parsing, failure classification, transaction recording, email notifications, and circuit breaker logic — all in a single method.

Key Behaviors We Had to Preserve

  1. Short-month handling: On the last day of a short month, the system must also process subscriptions set for the days that month lacks. On February 28th (in a non-leap year) that means days 29, 30, and 31; on April 30th it means day 31.

  2. Circuit breaker: If the payment gateway returns response code 597 (unreachable) five consecutive times, halt all processing. This prevents hammering a down gateway and accumulating false failures.

  3. Failure classification: A declined card (cardholder-level) is handled differently from "merchant credentials invalid" (merchant-level). The heritage system used separate database tables for this.

  4. DelayPay: A one-time future charge that deactivates the subscription after successful payment. Think "charge this card on March 15th and then we're done."

  5. Batch splitting: Monthly processing splits all eligible subscriptions into 4-16 batches, with a random 2-30 minute sleep between batches to avoid gateway rate limits.
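The short-month rule in item 1 is easy to get subtly wrong, so here is a minimal sketch of how it could be expressed. The `BillingDays` class and `DueOn` method are hypothetical names, not taken from the heritage code, and the leap-year behavior (rolling days 30 and 31 onto February 29th) is an assumption that follows from "last day of the month" rather than something the original logic is confirmed to do.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: which subscription billing days are charged on a given date.
public static class BillingDays
{
    public static IReadOnlyList<int> DueOn(DateOnly date)
    {
        // The date's own day-of-month is always due.
        var days = new List<int> { date.Day };

        // On the last day of a month shorter than 31 days, the days that
        // month lacks roll back onto that date.
        int lastDay = DateTime.DaysInMonth(date.Year, date.Month);
        if (date.Day == lastDay)
        {
            for (int d = lastDay + 1; d <= 31; d++)
            {
                days.Add(d);
            }
        }

        return days;
    }
}
```

Under these assumptions, `BillingDays.DueOn(new DateOnly(2026, 2, 28))` yields days 28 through 31, while the same call for January 31st yields only day 31.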

The New Architecture

We rebuilt this as two new projects inside our existing PayEz-Core solution:

PayEz.Recurring.Application   → Service layer (business logic)
PayEz.Recurring.Api           → Internal HTTP API

Boolean Flags → Enums

The single biggest improvement:

// Heritage: 4 boolean columns
bool IsWeekly;
bool Is2Weekly;
bool Is4Weekly;
bool IsDelayPay;
// (if none are true, it's monthly... implicitly)

// New: one enum
public enum RecurringFrequency
{
    Monthly = 1,
    Weekly = 2,
    BiWeekly = 3,
    FourWeekly = 4,
    OneTime = 5   // replaces DelayPay
}

Same treatment for status (was bool Active + bool Paused + DateTime? DeleteDate, now SubscriptionStatus enum) and failure tracking (was two separate tables, now a single RecurringTransaction with a TransactionFailureType enum).
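Migrating existing rows means turning the four flags into one enum value. The sketch below shows one way that mapping might look. `LegacyFrequencyMapper` is a hypothetical name, and rejecting rows with more than one flag set is our assumption for illustration; the heritage system may have resolved such rows by precedence instead.

```csharp
using System;

public enum RecurringFrequency
{
    Monthly = 1,
    Weekly = 2,
    BiWeekly = 3,
    FourWeekly = 4,
    OneTime = 5
}

// Hypothetical migration helper: legacy boolean columns -> enum.
public static class LegacyFrequencyMapper
{
    public static RecurringFrequency FromFlags(
        bool isWeekly, bool is2Weekly, bool is4Weekly, bool isDelayPay)
    {
        int flagsSet = (isWeekly ? 1 : 0) + (is2Weekly ? 1 : 0)
                     + (is4Weekly ? 1 : 0) + (isDelayPay ? 1 : 0);

        // Assumption: conflicting rows are surfaced for review, not guessed at.
        if (flagsSet > 1)
        {
            throw new ArgumentException("More than one legacy frequency flag is set.");
        }

        if (isWeekly) return RecurringFrequency.Weekly;
        if (is2Weekly) return RecurringFrequency.BiWeekly;
        if (is4Weekly) return RecurringFrequency.FourWeekly;
        if (isDelayPay) return RecurringFrequency.OneTime;

        // No flag set: the heritage system treated this as monthly.
        return RecurringFrequency.Monthly;
    }
}
```

The implicit "none set means monthly" rule becomes an explicit last line, which is the point of the change.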

Quartz.NET → BackgroundService

Heritage used Quartz.NET with [DisallowConcurrentExecution] and XML-configured cron expressions. We replaced it with a .NET BackgroundService using PeriodicTimer:

using Microsoft.Extensions.Hosting;

public class RecurringSchedulerService : BackgroundService
{
    protected override async Task ExecuteAsync(CancellationToken ct)
    {
        // Wake once a minute and check whether either job is due.
        using var timer = new PeriodicTimer(TimeSpan.FromMinutes(1));
        while (await timer.WaitForNextTickAsync(ct))
        {
            var now = TimeProvider.System.GetUtcNow();
            // ShouldRunDaily/ShouldRunMonthly and RunDaily/RunMonthly are omitted here.
            if (ShouldRunDaily(now)) await RunDaily(ct);
            if (ShouldRunMonthly(now)) await RunMonthly(ct);
        }
    }
}

Schedule configuration comes from appsettings.json via the Options pattern — no more Quartz JobDataMap with comma-separated strings.
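For illustration, a due-check behind something like ShouldRunDaily could look like the sketch below. Everything here is hypothetical: the `RecurringScheduleOptions` property names, the `ScheduleCheck.IsDue` helper, and the idea of tracking the last run date are assumptions, not the actual implementation. The sketch also assumes catch-up semantics (a job that missed its minute still runs later the same day) and that the configured times are wall-clock times in a configured time zone.

```csharp
using System;

// Hypothetical options class; names and defaults are illustrative only.
public sealed class RecurringScheduleOptions
{
    public TimeOnly DailyRunTime { get; set; } = new(3, 1);
    public TimeOnly MonthlyRunTime { get; set; } = new(3, 6);
    public string TimeZoneId { get; set; } = "UTC";
}

public static class ScheduleCheck
{
    // True when local wall-clock time has reached runTime and the job
    // has not already run on that local date.
    public static bool IsDue(
        DateTimeOffset utcNow, TimeOnly runTime, TimeZoneInfo zone, DateOnly? lastRunDate)
    {
        DateTimeOffset local = TimeZoneInfo.ConvertTime(utcNow, zone);
        DateOnly today = DateOnly.FromDateTime(local.DateTime);

        // Guard against a second run on the same day.
        if (lastRunDate == today)
        {
            return false;
        }

        return TimeOnly.FromDateTime(local.DateTime) >= runTime;
    }
}
```

Comparing with `>=` rather than matching the exact minute matters with a one-minute timer: if a previous tick's work runs long, an exact-match check could skip the scheduled minute entirely.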

Odd/Even → Redis Distributed Locking

Heritage prevented multi-server concurrent execution by checking AppSetting["Server"] and only processing odd or even subscription IDs. We replaced this with Redis distributed locks:

Lock key: "recurring:daily:lock:{date}"
TTL: 2 hours

First pod to acquire the lock runs processing. Others skip gracefully.
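The semantics we rely on are "set this key only if it does not exist, with an expiry." The sketch below models that with an in-memory stand-in so the behavior can be reasoned about without a Redis instance. `InMemoryLockStore` and `DailyLock` are hypothetical names, and the `yyyy-MM-dd` date format in the key is an assumption. With a client such as StackExchange.Redis, the `LockTakeAsync` and `LockReleaseAsync` methods on `IDatabase` would be a natural fit for the same roles.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// In-memory stand-in for Redis "set if not exists, with expiry".
public sealed class InMemoryLockStore
{
    private readonly Dictionary<string, (string Owner, DateTimeOffset ExpiresAt)> _locks = new();
    private readonly object _gate = new();

    // Returns true only for the first caller while the lock is unexpired.
    public bool TryAcquire(string key, string owner, TimeSpan ttl, DateTimeOffset now)
    {
        lock (_gate)
        {
            if (_locks.TryGetValue(key, out var held) && held.ExpiresAt > now)
            {
                return false;
            }
            _locks[key] = (owner, now + ttl);
            return true;
        }
    }

    // Only the owner that took the lock may release it.
    public bool Release(string key, string owner)
    {
        lock (_gate)
        {
            if (_locks.TryGetValue(key, out var held) && held.Owner == owner)
            {
                return _locks.Remove(key);
            }
            return false;
        }
    }
}

public static class DailyLock
{
    // Assumption: the {date} segment is formatted as yyyy-MM-dd.
    public static string KeyFor(DateOnly date) =>
        "recurring:daily:lock:" + date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
}
```

The TTL is the safety net: if the pod holding the lock dies mid-run, the key expires on its own instead of blocking every later run.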

750-Line PayOccurence → Clean Service Methods

The heritage PayOccurence() method was decomposed into:

  • Guard clauses → extracted to the repository query (only fetch eligible subscriptions)
  • Payment routing → ProcessSubscriptionAsync() with pattern matching on PaymentType
  • Circuit breaker → Polly CircuitBreakerAsync per merchant
  • Failure classification → TransactionFailureType enum set at transaction creation
  • Transaction recording → single RecurringTransaction entity (no separate failure table)

Change History: Full Snapshot → JSON Diff

Heritage recorded every subscription edit by duplicating the entire entity into a history table (30+ columns copied). We replaced this with JSON diff tracking:

var changes = new Dictionary<string, object?>();
if (dto.Amount.HasValue && dto.Amount.Value != subscription.Amount)
{
    changes["Amount"] = new { from = subscription.Amount, to = dto.Amount.Value };
    subscription.Amount = dto.Amount.Value;
}
// ... only changed fields are recorded

History entries now store PreviousValues and NewValues as JSONB — compact and queryable.
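The per-field `if` blocks get repetitive across 30+ columns, so a small helper can carry the pattern. The sketch below is one possible shape, not our actual code: `ChangeSet` and its members are hypothetical names, and it keeps previous and new values in two separate dictionaries to match the two history columns, serialized with System.Text.Json.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper that records only the fields that actually changed.
public sealed class ChangeSet
{
    private readonly Dictionary<string, object?> _previous = new();
    private readonly Dictionary<string, object?> _current = new();

    public bool HasChanges => _previous.Count > 0;

    // Returns true (and records both values) only when the value differs.
    public bool Track<T>(string field, T oldValue, T newValue)
    {
        if (EqualityComparer<T>.Default.Equals(oldValue, newValue))
        {
            return false;
        }
        _previous[field] = oldValue;
        _current[field] = newValue;
        return true;
    }

    // JSON payloads for the PreviousValues and NewValues columns.
    public string PreviousValuesJson() => JsonSerializer.Serialize(_previous);
    public string NewValuesJson() => JsonSerializer.Serialize(_current);
}
```

A caller would write something like `if (changes.Track("Amount", subscription.Amount, dto.Amount.Value)) subscription.Amount = dto.Amount.Value;` and persist a history row only when `HasChanges` is true.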

What We Kept

Not everything needed to change. We deliberately preserved:

  • The scheduling times (3:01 AM daily, 3:06 AM monthly) — operations teams are used to this
  • The batch splitting logic — random sleep between batches prevents gateway rate limiting
  • The minimum amount threshold ($5 default) — prevents processing trivially small amounts
  • The "already paid today" guard — prevents double-charging if the scheduler runs twice
  • Legacy bridge fields (PortalId, SpecialTypeId) — needed for heritage gateway credential resolution
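As a rough illustration of the batch splitting we kept, the sketch below chunks the eligible subscriptions and sleeps a random 2 to 30 minutes between batches. `BatchRunner` is a hypothetical name, the delay is injected so the logic can be exercised without waiting, and both whole-minute delays and the way the batch count is chosen are assumptions; the heritage rule for picking between 4 and 16 batches is not reproduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BatchRunner
{
    // Splits items into at most batchCount fixed-size chunks (the last may be smaller).
    public static List<T[]> Split<T>(IReadOnlyCollection<T> items, int batchCount)
    {
        if (batchCount < 1) throw new ArgumentOutOfRangeException(nameof(batchCount));
        if (items.Count == 0) return new List<T[]>();

        int size = (items.Count + batchCount - 1) / batchCount; // ceiling division
        return items.Chunk(size).ToList();
    }

    // Processes batches in order, pausing between them but not after the last.
    public static async Task RunAsync<T>(
        IReadOnlyCollection<T> items,
        int batchCount,
        Func<T[], CancellationToken, Task> processBatch,
        Func<TimeSpan, CancellationToken, Task> delay,
        Random random,
        CancellationToken ct)
    {
        List<T[]> batches = Split(items, batchCount);
        for (int i = 0; i < batches.Count; i++)
        {
            await processBatch(batches[i], ct);
            if (i < batches.Count - 1)
            {
                // Random.Next's upper bound is exclusive, so this yields 2..30 minutes.
                await delay(TimeSpan.FromMinutes(random.Next(2, 31)), ct);
            }
        }
    }
}
```

In production the `delay` argument would simply be `Task.Delay`; in a test it can record the requested durations and return immediately.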

Lessons Learned

  1. Read the heritage code fully before designing. We found the short-month handling logic buried 400 lines into RecurringService.cs. If we'd started coding without reading everything, we'd have missed it and broken billing for everyone with a subscription on the 31st.

  2. Enums over booleans, always. The heritage system's boolean flags made every query a nightmare of WHERE IsWeekly = false AND Is2Weekly = false AND Is4Weekly = false AND IsDelayPay = false to find monthly subscriptions. One enum column fixed all of this.

  3. Don't abstract prematurely, but do separate concerns. The 750-line PayOccurence() method wasn't bad because it was long — it was bad because it mixed guard logic, payment routing, failure classification, and notification into one undifferentiated blob. Splitting into focused methods made each one testable.

  4. Heritage systems contain tribal knowledge. Response code 597 meaning "gateway unreachable" and triggering a circuit breaker wasn't documented anywhere. It was in a code comment from 2019. Preserving these behaviors required reading every line and every comment.

  5. Bridge fields are temporary — but they're necessary. We added PortalId and SpecialTypeId to the new schema even though they're heritage concepts. Without them, we can't resolve which merchant credentials to use for legacy gateway routing. They'll be removed once all clients migrate to Stripe.

What's Next

The internal API is built and compiles. Next steps:

  • Wire the Vibe API to call the Recurring API via HttpClient
  • Connect to the existing Payment API for legacy CC/ACH charges
  • Connect to the existing Stripe client for modern payment processing
  • Run the SQL migration against the payez_idp database to create the recurring schema
  • Deploy to AKS alongside the existing services

The heritage system still runs in production. We'll run both in parallel until we're confident the new system matches behavior, then cut over.


Jon Ranes is the founder of PayEz and IdealVibe. BA.Pert is his AI business analyst who never sleeps and always has opinions about schema design.

#dotnet #microservices #legacy-migration #architecture #recurring-billing
