fix: add JSON-aware comparison to Python comparator fallback by mashraf-222 · Pull Request #1276 · codeflash-ai/codeflash

mashraf-222 · 2026-02-03T00:44:11Z

Summary

Fixed JSON comparison bug in Python fallback comparator
Prevents false negatives when comparing semantically identical JSON
Matches Java Comparator behavior for cross-language consistency

Problem

The compare_invocations_directly() function in comparator.py was using simple string comparison for JSON results:

elif orig_result != cand_result:  # Line 311

This caused false negatives when JSON was semantically identical but formatted differently:

Example False Negative:

Original: {"a":1,"b":2}
Candidate: { "b": 2, "a": 1 }

These are semantically identical but fail string comparison due to:

Different whitespace
Different key ordering

The Java Comparator handles this correctly by parsing JSON and doing deep equality, but the Python fallback did not.
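
A minimal sketch of the failure mode, using the two strings from the example above. It relies only on the standard-library `json` module; the variable names are illustrative and do not come from comparator.py.

```python
import json

original = '{"a":1,"b":2}'
candidate = '{ "b": 2, "a": 1 }'

# Plain string comparison sees two different strings.
strings_equal = original == candidate

# Parsing first compares the resulting dicts, which ignores
# whitespace and key order.
parsed_equal = json.loads(original) == json.loads(candidate)

print(strings_equal, parsed_equal)
```

The string check reports a difference while the parsed check reports equality, which is the gap the fix below closes.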

Solution

Added JSON-aware comparison to match Java Comparator behavior:

New Helper Function: `_compare_json_values()`

import json

def _compare_json_values(json1: str | None, json2: str | None) -> bool:
    # 1. Handle None values
    if json1 is None and json2 is None:
        return True
    if json1 is None or json2 is None:
        return False
    
    # 2. Fast path: exact string match
    if json1 == json2:
        return True
    
    # 3. Parse JSON and compare objects
    try:
        obj1 = json.loads(json1)
        obj2 = json.loads(json2)
        return obj1 == obj2
    except (json.JSONDecodeError, TypeError):
        # 4. Parsing failed: fall back to string comparison. The strings are
        # already known to differ (step 2), so this evaluates to False.
        return json1 == json2

Updated Comparison Logic:

elif not _compare_json_values(orig_result, cand_result):
    # Results differ (using JSON-aware comparison)
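
The rest of compare_invocations_directly() is not shown in this description, so the following is only a hypothetical sketch of how a JSON-aware check might slot into a per-invocation comparison loop. The name `find_mismatches`, the dict-of-results input shape, and the compact `json_equal` stand-in are assumptions made for illustration, not the project's actual API.

```python
from __future__ import annotations

import json


def json_equal(a: str | None, b: str | None) -> bool:
    """Compact stand-in for the helper above (illustrative only)."""
    if a is None or b is None:
        return a is b  # equal only when both are None
    if a == b:
        return True
    try:
        return json.loads(a) == json.loads(b)
    except (json.JSONDecodeError, TypeError):
        return False  # unparseable and not string-equal


def find_mismatches(
    original: dict[str, str | None],
    candidate: dict[str, str | None],
) -> list[str]:
    """Return invocation ids whose results differ (hypothetical shape)."""
    mismatches = []
    for call_id, orig_result in original.items():
        if call_id not in candidate:
            mismatches.append(call_id)  # candidate never produced this call
        elif not json_equal(orig_result, candidate[call_id]):
            mismatches.append(call_id)  # results differ after JSON parsing
    return mismatches


original = {"call_1": '{"a":1,"b":2}', "call_2": "[1,2,3]"}
candidate = {"call_1": '{ "b": 2, "a": 1 }', "call_2": "[3,2,1]"}
print(find_mismatches(original, candidate))
```

Only `call_2` is flagged here: the reordered object is treated as equal, while the reordered array is not.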

What This Fixes

✅ Whitespace differences - {"a":1} == { "a": 1 }
✅ Key ordering - {"a":1,"b":2} == {"b":2,"a":1}
✅ Nested objects - Deep equality on nested JSON structures
✅ Numeric types - 42 == 42.0 (json.loads yields int 42 and float 42.0, which Python compares as equal)
✅ Invalid JSON - Gracefully falls back to string comparison

❌ Array ordering - [1,2,3] != [3,2,1] (correct behavior - order matters)

Tests Added

Updated Existing Test:

test_whitespace_in_json - Now expects True (was incorrectly expecting False)

New Test Class: `TestJsonComparison` (8 tests)

test_json_key_ordering_difference
- {"a":1,"b":2,"c":3} == {"c":3,"a":1,"b":2}
test_json_whitespace_and_ordering_combined
- Tests combined whitespace and key order differences
test_json_nested_object_comparison
- {"outer":{"inner":{"value":123}}} with whitespace variations
test_json_array_comparison_order_matters
- Ensures [1,2,3] != [3,2,1] (order matters)
test_json_invalid_json_falls_back_to_string
- Non-JSON strings are compared as plain strings
test_json_null_vs_string_null
- JSON null value comparison
test_json_empty_object_vs_null
- {} != null (correctly different)
test_json_numeric_equivalence
- {"value":42} == {"value":42.0}
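
The test bodies are not included in this description. The sketch below shows what a few of them might look like if written against the helper directly; the real tests may instead go through compare_invocations_directly(), and the inputs for the null cases are an assumed reading of the test names.

```python
from __future__ import annotations

import json


def _compare_json_values(json1: str | None, json2: str | None) -> bool:
    # Repeated from this PR so the block is self-contained.
    if json1 is None and json2 is None:
        return True
    if json1 is None or json2 is None:
        return False
    if json1 == json2:
        return True
    try:
        return json.loads(json1) == json.loads(json2)
    except (json.JSONDecodeError, TypeError):
        return json1 == json2


class TestJsonComparison:
    def test_json_array_comparison_order_matters(self):
        assert not _compare_json_values("[1,2,3]", "[3,2,1]")

    def test_json_null_vs_string_null(self):
        # Assumed reading: JSON null versus the JSON string "null".
        assert not _compare_json_values("null", '"null"')

    def test_json_empty_object_vs_null(self):
        assert not _compare_json_values("{}", "null")

    def test_json_numeric_equivalence(self):
        assert _compare_json_values('{"value":42}', '{"value":42.0}')


# Run the tests without a test runner; pytest would also collect the class.
suite = TestJsonComparison()
ran = [name for name in dir(suite) if name.startswith("test_")]
for name in ran:
    getattr(suite, name)()
print(f"{len(ran)} tests passed")
```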

Impact

Before (Buggy):

# These would fail comparison even though they're semantically identical
original = '{"name":"test","value":42}'
candidate = '{ "value": 42, "name": "test" }'
# Result: FALSE NEGATIVE - reports as different

After (Fixed):

# Same example now correctly recognized as identical
original = '{"name":"test","value":42}'
candidate = '{ "value": 42, "name": "test" }'
# Result: CORRECT - reports as equivalent

Testing

✅ All 344 Java tests pass (7 skipped)
✅ All 26 comparator tests pass (18 existing + 8 new)
✅ No regressions - existing tests still pass
✅ Comprehensive coverage - edge cases, invalid JSON, nested objects

Why This Matters

Prevents False Negatives - Won't incorrectly reject valid optimizations
Cross-Language Consistency - Python fallback matches Java Comparator
Better User Experience - Fewer spurious failures in behavior verification
Robustness - Handles real-world JSON variations gracefully

🤖 Generated with Claude Code

Fixed a bug where the Python fallback comparator used simple string comparison for JSON results, causing false negatives when JSON was semantically identical but formatted differently.

Problem:
The compare_invocations_directly() function compared result_json fields using direct string comparison (orig_result != cand_result). This failed for semantically identical JSON with:
- Different whitespace: {"a":1,"b":2} vs { "a": 1, "b": 2 }
- Different key ordering: {"a":1,"b":2} vs {"b":2,"a":1}

The Java Comparator handles this correctly by parsing JSON, but the Python fallback did not.

Solution:
- Added _compare_json_values() helper function that:
  1. Handles None values correctly
  2. Fast-path for exact string matches
  3. Parses JSON and compares deserialized objects
  4. Falls back to string comparison if JSON parsing fails
- Updated compare_invocations_directly() to use JSON-aware comparison

Impact:
- Prevents false negatives in behavior verification
- Matches Java Comparator behavior for consistency
- Handles whitespace, key ordering, and nested objects correctly
- Gracefully handles invalid JSON by falling back to string comparison

Tests added:
- Updated test_whitespace_in_json to expect correct behavior (True)
- Added TestJsonComparison class with 8 comprehensive tests:
  * test_json_key_ordering_difference
  * test_json_whitespace_and_ordering_combined
  * test_json_nested_object_comparison
  * test_json_array_comparison_order_matters
  * test_json_invalid_json_falls_back_to_string
  * test_json_null_vs_string_null
  * test_json_empty_object_vs_null
  * test_json_numeric_equivalence

Test results: 344 Java tests pass (26 comparator tests)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

KRRT7 · 2026-02-03T00:50:18Z

@claude format and push

claude · 2026-02-03T00:50:39Z

Claude finished @KRRT7's task in 44s

Tasks

Read repository setup and formatting guidelines
Check current git status
Attempt to run code formatter

Status

I attempted to run the code formatter, but I don't have permission to execute uv run ruff format commands.

The working tree is currently clean with no uncommitted changes, which suggests that either:

The code is already formatted correctly, or
The formatting needs to be done locally by someone with the appropriate permissions

What needs to be done

To format and push the code, you can run these commands locally:

# Format the code
uv run ruff format codeflash/ tests/

# Check for linting issues
uv run ruff check codeflash/ tests/

# If there are changes, commit and push
git add -A
git commit -m "chore: format code with ruff

Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>"
git push origin fix/java-comparator-json-comparison

Alternatively, if you'd like me to have the permissions to run these commands, you can update the --allowedTools or --allowedPrompts configuration for the Claude Code Action to include formatter commands.

Branch: fix/java-comparator-json-comparison

KRRT7 merged commit 1765c9b into omni-java Feb 3, 2026
21 of 24 checks passed

KRRT7 deleted the fix/java-comparator-json-comparison branch February 3, 2026 00:59

mashraf-222 mentioned this pull request Feb 3, 2026

Revert "fix: add JSON-aware comparison to Python comparator fallback" #1278

Merged
