fix(extract): string-dispatch CALLS edges never emitted — detection was in dead code #447

Open
isc-tdyar wants to merge 1 commit into DeusData:main from isc-tdyar:fix/string-dispatch-dead-code

@isc-tdyar

Contributor

String-dispatch CALLS edges never emitted — detection was in dead code

Discovered while indexing a Python codebase that uses string-literal method dispatch — calls of the form:

obj.classMethodValue('some.module.Class', 'MethodName', arg1, arg2)

where the class and method are passed as string literals rather than referenced directly. These produced no CALLS edge in the graph. The callee appeared as an anonymous call with no link to the target.

This pattern appears in Python reflection libraries, plugin systems, and RPC frameworks — any API that takes a class name and method name as strings and dispatches dynamically.

Root cause

The string-dispatch detection existed in walk_calls() in extract_calls.c, but walk_calls() is dead code: the pipeline calls cbm_extract_unified() → handle_calls(), never cbm_extract_calls() / walk_calls(). The detection logic was therefore never reachable during normal indexing.

Fix

Move the detection into handle_calls(), immediately after the standard CALLS edge is pushed. When a Python call's callee name ends in one of the four dispatch suffixes (.classMethodValue, .classMethodVoid, .classMethodBoolean, .classMethodObject) and both string-literal arguments can be extracted, emit a synthetic CALLS edge to Arg0.Arg1, that is, the class-name string joined to the method-name string with a dot.
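The check described above can be sketched in isolation as follows. This is a minimal illustration, not the code in the PR: `has_dispatch_suffix`, `build_dispatch_target`, and the `dispatch_suffixes` array are hypothetical names, and the real detection works on parsed call nodes inside handle_calls() rather than on plain C strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical NULL-terminated list mirroring the four suffixes named above. */
static const char *const dispatch_suffixes[] = {
    ".classMethodValue", ".classMethodVoid",
    ".classMethodBoolean", ".classMethodObject", NULL,
};

/* Returns true when `callee` ends with one of the NULL-terminated `suffixes`. */
static bool has_dispatch_suffix(const char *callee, const char *const *suffixes) {
    if (callee == NULL || suffixes == NULL) return false;
    size_t clen = strlen(callee);
    for (size_t i = 0; suffixes[i] != NULL; i++) {
        size_t slen = strlen(suffixes[i]);
        if (clen >= slen && strcmp(callee + clen - slen, suffixes[i]) == 0) {
            return true;
        }
    }
    return false;
}

/* Writes "<cls>.<method>" into `out`. Returns false if either input is
 * missing or the result does not fit in `cap` bytes. */
static bool build_dispatch_target(const char *cls, const char *method,
                                  char *out, size_t cap) {
    assert(out != NULL && cap > 0);
    if (cls == NULL || method == NULL) return false;
    int n = snprintf(out, cap, "%s.%s", cls, method);
    return n > 0 && (size_t)n < cap;
}
```

With these helpers, a callee of `obj.classMethodValue` with string arguments `'some.module.Class'` and `'MethodName'` would yield the target `some.module.Class.MethodName`. A callee that lacks the leading-dot suffix, or a call whose first two arguments are not both string literals, would produce no synthetic edge.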

Also adds extract_nth_string_arg() — a small helper to extract the N-th string-literal positional argument from an argument list node.
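To make the helper's contract concrete, here is a simplified sketch over a toy argument list. The `Arg` struct, the `ArgKind` enum, and the name `nth_string_arg` are assumptions for illustration only; the actual extract_nth_string_arg() operates on a parser argument-list node, and its signature is not shown in this thread. The sketch also assumes that "N-th" refers to the positional index, so a non-literal argument at that position yields no result.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a parsed argument: its kind plus raw source text. */
typedef enum { ARG_STRING_LITERAL, ARG_OTHER } ArgKind;

typedef struct {
    ArgKind kind;
    const char *text; /* raw source text; string literals keep their quotes */
} Arg;

/* If positional argument `n` is a quoted string literal, copies its unquoted
 * text into `out` and returns `out`. Returns NULL when the index is out of
 * range, the argument is not a string literal, or the text does not fit. */
static const char *nth_string_arg(const Arg *args, size_t argc, size_t n,
                                  char *out, size_t cap) {
    assert(out != NULL && cap > 0);
    if (args == NULL || n >= argc || args[n].kind != ARG_STRING_LITERAL) {
        return NULL;
    }
    const char *s = args[n].text;
    size_t len = (s != NULL) ? strlen(s) : 0;
    /* Require matching single or double quotes around the literal. */
    if (len < 2 || (s[0] != '\'' && s[0] != '"') || s[len - 1] != s[0]) {
        return NULL;
    }
    if (len - 2 >= cap) return NULL;
    memcpy(out, s + 1, len - 2);
    out[len - 2] = '\0';
    return out;
}
```

Returning NULL for anything other than a plain quoted literal keeps the caller simple: the synthetic edge is emitted only when both lookups succeed.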

The suffix list is easily extended for other APIs that follow the same shape.

Regression test

python_iris_classMethodValue in tests/test_extraction.c — indexes a minimal Python snippet using .classMethodValue('Pkg.Class', 'Method', ...) and asserts a CALLS edge to Pkg.Class.Method is emitted.

@DeusData

Owner

Thanks for the contribution — and for the clean, signed-off commits. Will review properly when time allows.

One heads-up on CI: everything is green except test-unix (macos-15-intel), and the failure there is not in the test suite (all tests passed) — it's the post-suite parent-death watchdog check (child survived parent death), which has nothing to do with your extraction change. I've rerun the failed job to rule out a runner flake; if it stays red, that's a platform issue on our side and won't be held against this PR.

@isc-tdyar

Contributor Author

Great, thanks!

As a heads-up, I have some actual feature additions in the hopper. I recently spent time adding support for ObjectScript, InterSystems' proprietary but well-established language that is tied to our database technology. I will follow the CONTRIBUTING guidelines and submit an issue, but at a high level I propose adding CBM_LANG_OBJECTSCRIPT to lang_specs.c, plus supporting infrastructure for the specifics of that language. I hope this will be viewed as a positive for CBMM, and I am willing to shape it however it can fit within the wider team's conception of "good code" :)

@DeusData

Owner

Thanks @isc-tdyar — moving the string-dispatch detection into the live path is the right call. Two things before merge:

  1. The PR adds the detection in handle_calls() but leaves the original unreachable copy in walk_calls() in place — please remove the dead copy (the PR's premise is fixing dead code, so it shouldn't leave a duplicate behind).
  2. The dispatch list is IRIS-specific (.classMethodValue, .classMethodVoid, …). Could we generalize this into a small configurable/extensible dispatch table rather than hardcoding one vendor's methods in extract_calls.c? Happy to discuss the shape. 🙏

@isc-tdyar force-pushed the fix/string-dispatch-dead-code branch from 64576ff to 297e258 on June 22, 2026 21:48
@isc-tdyar

Contributor Author

Done — both addressed:

  1. Removed walk_calls() and cbm_extract_calls() entirely (including the declaration in cbm.h) — they had no callers in src/ and the unified handle_calls() path supersedes them.
  2. Moved the dispatch suffixes out of the inline static array and into a new string_dispatch_suffixes field on CBMLangSpec. Python's row gets py_string_dispatch_suffixes[] (the four .classMethod* names); all other languages implicitly get NULL via C designated initializers — no other rows needed updating. The detection in handle_calls() now reads spec->string_dispatch_suffixes rather than a hardcoded IRIS-specific list.
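The per-language table described in point 2 might look roughly like the sketch below. The field name `string_dispatch_suffixes` and the array name `py_string_dispatch_suffixes` come from the comment above; everything else (the reduced `LangSpec` struct standing in for CBMLangSpec, the `name` field, the `"go"` row, and `dispatch_suffix_count`) is hypothetical. It relies on the C rule that members omitted from a designated initializer are zero-initialized, so pointer fields become NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for CBMLangSpec. */
typedef struct {
    const char *name;
    /* NULL-terminated suffix list, or NULL when the language has none. */
    const char *const *string_dispatch_suffixes;
} LangSpec;

static const char *const py_string_dispatch_suffixes[] = {
    ".classMethodValue", ".classMethodVoid",
    ".classMethodBoolean", ".classMethodObject", NULL,
};

static const LangSpec lang_specs[] = {
    {.name = "python", .string_dispatch_suffixes = py_string_dispatch_suffixes},
    {.name = "go"}, /* field omitted: zero-initialized, so the pointer is NULL */
};

/* Counts the suffixes configured for a language; 0 when the field is NULL. */
static size_t dispatch_suffix_count(const LangSpec *spec) {
    assert(spec != NULL);
    if (spec->string_dispatch_suffixes == NULL) return 0;
    size_t n = 0;
    while (spec->string_dispatch_suffixes[n] != NULL) n++;
    return n;
}
```

Under this layout, adding string dispatch for another language means adding one array and one field assignment on that language's row, and a NULL check in the call handler is enough to skip languages that have no such list.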

Rebased onto current main and force-pushed.

@isc-tdyar force-pushed the fix/string-dispatch-dead-code branch 3 times, most recently from aacbbc0 to fd55187 on June 22, 2026 23:46
…o target

Python calls of the form iris_obj.classMethodValue('Pkg.Class', 'Method')
emitted no CALLS edge. The detection existed in walk_calls, which is dead
code — the pipeline uses cbm_extract_unified → handle_calls, not walk_calls.

Move the IRIS Native API string-dispatch detection to handle_calls. When a
Python call's callee ends in .classMethodValue, .classMethodVoid,
.classMethodBoolean, or .classMethodObject, and both string-literal arguments
can be extracted, emit a synthetic CALLS edge to Arg1.Arg2.

The pattern generalizes beyond this specific API: any library that takes
class/method names as string literals follows the same shape.

Regression test: python_iris_classMethodValue in test_extraction.c.

Signed-off-by: Thomas Dyar <tdyar@intersystems.com>
@isc-tdyar force-pushed the fix/string-dispatch-dead-code branch from fd55187 to 0b94f44 on June 22, 2026 23:56