That Legacy Monster? Tame It (And Test It!) With Your LLM – If You Know The Secret Handshake.

· origo's blog


Alright, pull up a chair. We need to talk about that codebase. You know the one. The sprawling, ancient beast lurking in your repo, the one where a "quick bug fix" turns into a three-day archaeological dig. Functions so long they have their own scrollbars, logic so tangled it makes spaghetti look like a ruler, and tests? Ha! If they exist, they're probably commented out with a `# TODO: Fix this later (lol, 2017)`.

We've all inherited these digital tar pits. And the thought of refactoring them, let alone writing comprehensive tests, is enough to make even the most seasoned dev reach for the emergency chocolate.

But here’s a plot twist that might just save your sanity (and your sprint goals): Large Language Models (LLMs) are getting surprisingly good at this dirty work. Yes, the same AI that writes poetry and argues about philosophy can actually help you wrestle that legacy monster into submission. However – and this is the crucial bit, the secret handshake, if you will – they only become truly effective when you stop treating them like mystical oracles and start treating them like hyper-competent, but utterly context-blind, new hires.

The Big Lie: LLMs are "Creative" Code Writers #

Most folks I talk to figure LLMs are best at whipping up fresh, greenfield code. "Ask it for a new sorting algorithm, and it's brilliant!" they say. And sure, they can do that. But when it comes to the messy, nuanced world of existing systems, I've found the opposite is often true: LLMs can be better at refactoring and generating tests for existing code than writing new code from scratch.

Why? Because writing new code from a vague prompt is an act of sophisticated guesswork for an LLM. It's pulling from the statistical soup of its massive training data. It might suggest user.getName() because that's common in a million Java tutorials, completely oblivious to the fact that your User object, in your beautifully unique and slightly terrifying legacy system, actually uses user.fetchUsernameFromThatWeirdMainframeBridge().

But give it your code, your types, your patterns? Now you're not asking it to dream. You're asking it to work.
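
To see why the type definition matters so much, here is a toy sketch (all names hypothetical, borrowed from the example above): the accessor the LLM statistically "expects" simply does not exist on the legacy type, so only a prompt that includes the real definition can steer it to the right call.

```java
// Toy illustration: there is no getName() on this legacy type.
// Unless the prompt contains this definition, the statistically
// likely (but wrong) call wins.
public class User {
    private final String rawMainframeRecord;

    public User(String rawMainframeRecord) {
        this.rawMainframeRecord = rawMainframeRecord;
    }

    // The legacy accessor has a very different shape from the tutorial norm.
    public String fetchUsernameFromThatWeirdMainframeBridge() {
        // Pretend parse of a pipe-delimited mainframe record: name is the first field.
        return rawMainframeRecord.split("\\|")[0].trim();
    }

    public static void main(String[] args) {
        User u = new User("ada.lovelace | ACTIVE | 1982");
        System.out.println(u.fetchUsernameFromThatWeirdMainframeBridge());
    }
}
```

Paste a definition like this into the prompt and `user.getName()` stops being a plausible completion, because it no longer type-checks against anything in view.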

Contextual Lockdown: Starving Hallucinations at the Source #

This is where the magic (which isn't magic at all, just good engineering) happens. You've probably heard that LLMs can "hallucinate" – make stuff up, confidently assert falsehoods, or generate code that looks plausible but is utterly wrong for your system. This is the biggest fear when letting an AI touch your precious (or precarious) codebase.

The antidote? I call it Contextual Lockdown.

Think of an LLM as operating in two modes:

  1. The Explorer (Low Context): You give it a vague prompt: "Refactor this function." You paste only the function itself. The LLM is now wandering the vast plains of its training data (all of public GitHub, Stack Overflow, etc.), looking for statistical matches. It's guessing. It's exploring. This is where user.getName() pops up when it shouldn't. This is prime hallucination territory.

  2. The Exploiter (High Context): Now, imagine you slam the gates shut with Contextual Lockdown. You feed it the target function, and its entire module, and all relevant type definitions, and examples of how it's called, and your team's style guide, and examples of well-written tests from your codebase. You've dramatically shrunk its playground. It's no longer exploring the internet; it's forced into Exploiter mode, meticulously working with the exact, concrete materials you've handed it.
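
One low-tech way to enforce the lockdown is a small script that stitches the target file, its types, its call sites, and a reference test into a single prompt. A minimal sketch (the file paths are hypothetical placeholders, not a prescribed layout):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Minimal context-assembly sketch: concatenate the target function's whole
// neighbourhood into one prompt, so the model exploits rather than explores.
public class ContextAssembler {

    public static String assemble(String task, List<Path> contextFiles) throws IOException {
        StringBuilder prompt = new StringBuilder("# CONTEXT\n");
        for (Path file : contextFiles) {
            prompt.append("\n## ").append(file).append("\n```\n")
                  .append(Files.readString(file))
                  .append("\n```\n");
        }
        prompt.append("\n# TASK\n").append(task).append('\n');
        return prompt.toString();
    }

    public static void main(String[] args) throws IOException {
        // Demo with a throwaway file; in practice you would list the module,
        // the type definitions, the callers, and an example test class.
        Path demo = Files.createTempFile("oldUserService", ".java");
        Files.writeString(demo, "class OldUserService { /* ... */ }");
        System.out.println(assemble(
                "Refactor processLegacyUserData; keep the public interface.",
                List.of(demo)));
    }
}
```

Ten lines of tooling like this is often all it takes to go from Explorer mode to Exploiter mode reliably, instead of hand-pasting snippets and forgetting one.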

Why Contextual Lockdown Kills Hallucinations:

This isn't about the LLM suddenly getting "smarter." It's about changing its operational parameters. You're providing such a high-fidelity, constrained environment that the most statistically probable outputs are those that correctly use and manipulate the entities within that environment.

Essentially, you're starving the hallucination beast by removing its food source: ambiguity and a lack of specific information. The more precise and complete your context, the less room there is for the LLM to do anything but operate on the facts you've laid out. This is the secret to getting astonishingly accurate refactoring and test generation.

Feeding the Beast: What "Good Context" Actually Means (No Skimping!) #

So, "Contextual Lockdown" sounds great, but what does it mean in practice? You can't just wave a magic wand. You need to feed the LLM the right stuff. Garbage in, dangerously plausible garbage out. Quality in, quality out.

Here’s what I’ve found moves the needle from "meh" to "whoa":

  - **The target in full.** The function *and* its whole module, not a lonely snippet.
  - **Type definitions** for everything the function touches.
  - **Real call sites** showing how the function is actually invoked.
  - **Your style guide and error-handling patterns**, or a file that exemplifies them.
  - **Well-written existing tests** from your codebase, so generated tests match your conventions.

Origo's Hard-Won Wisdom: Think of it like onboarding a sharp but inexperienced developer. You wouldn't just point them to a 2000-line hairball of a function and say, "Refactor this and write tests. Good luck!" You'd give them access to the repo, walk them through your patterns, show them existing good examples, and explain the testing strategy. Do the same for your LLM. The more you give it, the more it gives back.

The "No-BS" Prompt Formula for Refactoring & Test Gen Wins #

Stop whispering vague hopes into the void. You need to be explicit, firm, and clear. Here’s a battle-tested template I use as a starting point (customize it heavily for your specific needs!):

````markdown
# CONTEXT ASSEMBLY: OPERATION LEGACY RESCUE

## Target for Operation:
- Function/Method: `processLegacyUserData`
- File Path: `src/services/oldUserService.java`
- Brief Description: This function takes raw user input, validates it against ancient business rules, and transforms it for storage in the new system. It's a mess.

## Relevant Code & Definitions:
(Paste complete file contents or relevant large snippets here)
- `src/services/oldUserService.java` (contains `processLegacyUserData`)
- `src/models/RawUserInput.java`
- `src/models/ProcessedUserData.java`
- `src/utils/LegacyValidator.java`
- `src/config/BusinessRules.xml` (if it influences logic and can be represented)

## Calling Code Examples:
(Show how `processLegacyUserData` is typically invoked)
- Snippet from `src/controllers/UserController.java`:
  ```java
  // ...
  RawUserInput rawInput = getRawInputFromRequest(request);
  OldUserService userService = new OldUserService();
  ProcessedUserData processedData = userService.processLegacyUserData(rawInput, customerId);
  // ...
  ```

## Style & Pattern Guidance:
- **General Style:** Adhere to Google Java Style Guide (or your team's guide).
- **Error Handling:** Emulate pattern in `src/services/NewOrderService.java` (e.g., throw `SpecificDomainException`).
- **Refactoring Example (Good Pattern):** See `refactorExample_DataTransformer.java` (a snippet you provide showing a clean transformation).
- **Testing Style (CRUCIAL FOR TEST GENERATION):**
  - Framework: JUnit 5 with Mockito.
  - Example Test Class: `test/services/NewProductServiceTest.java` (paste this to show setup, mocking, assertions).

---
# TASK:

## Primary Goal: [Choose ONE: Refactor OR Generate Tests]

**Option A: Refactor `processLegacyUserData`**
- **Objectives:**
  1. Improve readability and maintainability significantly.
  2. Break down into smaller, single-responsibility methods.
  3. Reduce cyclomatic complexity.
  4. Use clearer variable names.
- **Constraints:**
  1. **MUST** maintain the exact public interface (signature, exceptions thrown).
  2. **MUST** preserve all existing behavior, including all implicit edge cases. (Existing tests, if any, should still pass.)
  3. **MUST** adhere to the provided style and error handling patterns.
  4. **MUST NOT** introduce new external library dependencies.
  5. **MUST** remain compatible with Java 8.

**Option B: Generate Comprehensive Unit Tests for `processLegacyUserData`**
- **Objectives:**
  1. Achieve high branch and line coverage for `processLegacyUserData`.
  2. Test normal execution paths.
  3. Test edge cases (null inputs, empty strings, boundary values based on `BusinessRules.xml` if possible).
  4. Test error handling paths (e.g., when `LegacyValidator` throws an error).
- **Constraints:**
  1. **MUST** use JUnit 5 and Mockito, following patterns in `test/services/NewProductServiceTest.java`.
  2. **MUST** mock external dependencies like `LegacyValidator` appropriately.
  3. Generated tests **MUST** compile and be runnable.
  4. Each test method should be focused and clearly named.

---
# VERIFICATION (How I'll Judge Your Work):

1. **Compilation:** Code (or tests) compiles without errors.
2. **Existing Tests (for Refactoring):** All pre-existing tests for `oldUserService.java` must still pass.
3. **Generated Tests (for Test Gen):** Generated tests run and accurately reflect the function's logic. I will manually verify coverage and correctness.
4. **Pattern Adherence:** Solution uses only methods/properties/types provided in context or standard Java libraries, and follows specified styles.
5. **Clarity (for Refactoring):** The refactored code is demonstrably easier to understand.
````

Pro Tip: Save these detailed prompts as templates! You'll thank yourself later. The effort upfront pays off massively.
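
And here is a dependency-free sketch of the coverage Option B asks for: normal path, edge cases, and error path. In a real codebase these would be JUnit 5 test methods with Mockito mocks, as the template specifies; plain asserts and a hypothetical stand-in method are used here only to keep the example self-contained.

```java
// Dependency-free sketch of Option B's coverage goals. The method under
// test is a hypothetical stand-in, not real legacy code.
public class LegacyUserDataTests {

    static String processLegacyUserData(String raw) {
        if (raw == null || raw.isBlank()) {
            throw new IllegalArgumentException("empty input");
        }
        return raw.trim().toUpperCase();
    }

    static void normalPathProducesTransformedData() {
        check(processLegacyUserData(" ada ").equals("ADA"), "normal path");
    }

    static void nullInputIsRejected() {
        boolean threw = false;
        try { processLegacyUserData(null); } catch (IllegalArgumentException e) { threw = true; }
        check(threw, "null input throws");
    }

    static void blankInputIsRejected() {
        boolean threw = false;
        try { processLegacyUserData("   "); } catch (IllegalArgumentException e) { threw = true; }
        check(threw, "blank input throws");
    }

    static void check(boolean ok, String name) {
        System.out.println((ok ? "PASS " : "FAIL ") + name);
        if (!ok) throw new AssertionError(name);
    }

    public static void main(String[] args) {
        normalPathProducesTransformedData();
        nullInputIsRejected();
        blankInputIsRejected();
    }
}
```

Note the one-behavior-per-test, clearly-named structure: that's the part of the template ("each test method should be focused and clearly named") that keeps generated suites maintainable.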

The Payoff: Precision, Sanity-Saving Tests, and Your Even Smarter Brain #

When you nail the Contextual Lockdown and provide a crystal-clear brief, the results can be genuinely game-changing: refactors that actually respect your patterns and preserve behavior, and test suites that would have taken you days to grind out by hand.

This frees up your precious brainpower for the higher-level stuff: deciding what the code *should* do, reviewing the output critically, and setting the architectural direction. You become the architect and the QA lead, while the LLM acts as your incredibly diligent, pattern-matching junior dev who never needs coffee.

The Horizon: From Code Janitor to Architectural Co-Pilot #

We're just scratching the surface, folks. As context windows expand (and they are, at a dizzying pace) and these models get even better at reasoning over complex structures, their utility will only grow: think module-level migrations, dependency untangling, and genuinely architectural suggestions, not just function-level cleanup.

But the core principle will remain the same: Garbage context in, garbage (or dangerously plausible garbage) out. Rich, precise context in? That’s when you unlock the real power.

Stop treating LLMs like magic wands you wave vaguely at problems. Start treating them like the sophisticated, context-hungry reasoning engines they are. Feed them well, guide them firmly, and they can become an incredible force multiplier in your daily trench warfare against legacy code and looming deadlines.

Your future, less-stressed self (and your much healthier codebase) will thank you. Now go forth and conquer that tar pit.
