Skip to content

Add MethodDefinition#signatures to Ruby API#697

Open
soutaro wants to merge 27 commits intomainfrom
soutaro-definition-signatures
Open

Add MethodDefinition#signatures to Ruby API#697
soutaro wants to merge 27 commits intomainfrom
soutaro-definition-signatures

Conversation

@soutaro
Copy link
Copy Markdown
Contributor

@soutaro soutaro commented Mar 26, 2026

Summary

  • Expose method signature information through MethodDefinition#signatures, returning an array of Rubydex::Signature objects
  • Each Signature holds an array of typed Parameter subclasses (PositionalParameter, KeywordParameter, etc.) with name and location attributes
  • Add Rust C API (ParameterKind, SignatureEntry, SignatureArray) and C extension glue for conversion

Extracted from #686.

Test plan

  • Tests cover all 8 parameter kinds (positional, optional, rest, keyword, optional keyword, rest keyword, block, forward)
  • Tests cover RBS-defined signatures, overloads, untyped parameters, and no-parameter methods

🤖 Generated with Claude Code

@soutaro soutaro requested a review from a team as a code owner March 26, 2026 05:34
@soutaro soutaro self-assigned this Mar 26, 2026
soutaro added a commit that referenced this pull request Mar 27, 2026
- Rename ParameterKind variants to full names (e.g. Req -> RequiredPositional)
- Merge two let-else bindings into one in rdx_definition_signatures
- Make collect_method_signatures private
- Change Signature#initialize from keyword arguments to positional arguments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@soutaro soutaro force-pushed the soutaro-definition-signatures branch from 79877ae to 39084db Compare March 31, 2026 08:58
}

VALUE rdxi_signatures_to_ruby(SignatureArray *arr, VALUE graph_obj, VALUE default_method_def) {
if (arr == NULL || arr->len == 0) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we raise in rdxr_method_definition_signatures and rdxr_method_alias_definition_signatures if the pointer is NULL?

This should never really happen, so I'm trying to avoid having it fail silently.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think the design is weird. Will take a look at this tonight.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the implementation. rdx_method_definition_signatures and rdx_method_alias_definition_signatures always return an array. rdxi_signatures_to_ruby can safely skips checking if arr is NULL.

let mut boxed = sig_entries.into_boxed_slice();
let len = boxed.len();
let items_ptr = boxed.as_mut_ptr();
std::mem::forget(boxed);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to forget here if we're ultimately passing this to Box::into_raw?

Copy link
Copy Markdown
Contributor Author

@soutaro soutaro Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't use forget, the contents of the Box will be deallocated by the destructor.
We could use Box::into_raw instead, but I don't think it's worth the rewrite since the mem::forget pattern is already used elsewhere in the codebase.

let items_ptr = Box::into_raw(boxed) as *mut SignatureEntry;

ParameterEntry param_entry = sig_entry.parameters[j];

VALUE param_class = parameter_class_for_kind(param_entry.kind);
VALUE name_sym = ID2SYM(rb_intern(param_entry.name));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does rb_intern make a copy of the string we pass? I'm asking because we free the signature array object further below. If Ruby interning just keeps a pointer, it could lead to a seg fault.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe rb_intern copies the string, but I couldn't find any documentation that explicitly says so. Changed to rb_str_intern with rb_utf8_str_new_cstr, which creates dynamic symbols (subject to GC) from a copied C string.

///
/// Panics if the definition or its document does not exist.
#[must_use]
pub fn source_at(&self, definition_id: &DefinitionId) -> &str {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can move this to Offset and pass the document as we do with https://github.com/Shopify/rubydex/blob/main/rust/rubydex/src/offset.rs#L90-L103

@soutaro soutaro force-pushed the soutaro-definition-signatures branch from b33e8c6 to e0b3e8a Compare April 8, 2026 04:48
}

/// Returns a newly allocated array of signatures for the given method definition id.
/// Returns NULL if the definition is not a method definition.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it panics in this case?

end
end

def test_method_definition_signatures_with_various_parameter_kinds
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we include test cases for singleton methods?

@soutaro soutaro force-pushed the soutaro-definition-signatures branch from eef4e19 to aeeebb3 Compare April 13, 2026 21:34
/// C-compatible struct representing a single method signature (a list of parameters).
#[repr(C)]
pub struct SignatureEntry {
pub definition_id: u64,
Copy link
Copy Markdown
Member

@st0012 st0012 Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is never used. Do we keep it for future updates?

If that's the case, please add a comment for that.
If not, let's remove it and simplify related API.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think we can remove this one.

/// C-compatible struct representing a single method signature (a list of parameters).
#[repr(C)]
pub struct SignatureEntry {
pub definition_id: u64,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think we can remove this one.

Comment on lines +47 to +62
self.sources.insert(uri.to_string(), source);
}

/// Returns the normalized source for the given URI.
///
/// # Panics
///
/// Panics if the URI has not been indexed.
#[must_use]
pub fn source(&self, uri: &str) -> &str {
self.sources.get(uri).expect("source not found for URI")
}

/// Returns the source text for a definition, sliced by its offset.
///
/// # Panics
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I wouldn't include these changes as part of this PR. It's already pretty large and we can ship test improvements separately.

Comment on lines +585 to +597
let ancestors: Vec<_> = ns.ancestors().iter().collect();

let search_start = if only_inherited {
let pos = ancestors
.iter()
.position(|a| matches!(a, Ancestor::Complete(id) if *id == namespace_id))
.expect("namespace_id should be present in its own ancestor chain");
pos + 1
} else {
0
};

for ancestor in &ancestors[search_start..] {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This refactor is switching 1 loop for 3. Before, we were iterating on ancestors and performing the checks inline.

Now, we're eagerly collecting ancestors using iter().collect() (which I think is not necessary? ancestors already returns a collection). Then we iterate to find the main namespace position and iterate again to search the ancestors.

I'd vote for us to keep the current implementation. At least for now.

///
/// Panics if any alias definition in the chain has no corresponding declaration.
#[must_use]
pub fn deep_dealias_method(graph: &Graph, alias_id: DefinitionId) -> Vec<DefinitionId> {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think part of the test related changes are a result of this method returning definitions.

Should this return a DeclarationId instead? Then we return the declaration that an alias points to and, if desired, the consumer of the API can fetch the related definitions.

case ParameterKind_RequiredPositional: return cPositionalParameter;
case ParameterKind_OptionalPositional: return cOptionalPositionalParameter;
case ParameterKind_RestPositional: return cRestPositionalParameter;
case ParameterKind_Post: return cPostParameter;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
case ParameterKind_Post: return cPostParameter;
case ParameterKind_Post: return cPostParameter;

for (size_t i = 0; i < arr->len; i++) {
SignatureEntry sig_entry = arr->items[i];

VALUE parameters = rb_ary_new_capa((long)sig_entry.parameters_len);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a follow-up PR, potential performance optimization:

Parameter-less functions are pretty common. We might see some nice memory gains if we reuse the same empty array between them all.

attr_reader :parameters

#: (Array[Parameter]) -> void
def initialize(parameters)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No return value?

# frozen_string_literal: true

module Rubydex
class Signature
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Not for this PR, but in a follow-up)

In my experience, having all the parameters in a single array ends up making callers need to do lots of filtering (e.g. to find just positional args, just keyword args, pull out the block param, etc.).

As the need arises, I would encourage adding convenience methods like positional_parameters, keyword_parameters, block, etc.

Block = 8,
}

fn map_parameter_kind(param: &Parameter) -> ParameterKind {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be nicer to have in the impl Parameter block, so you can just call like p.kind

use std::ptr;

/// C-compatible enum representing the kind of a parameter.
#[repr(C)]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can be explicit about the size, too

Suggested change
#[repr(C)]
#[repr(C, u8)]

Comment on lines +85 to +89
let len = boxed.len();
let items_ptr = boxed.as_mut_ptr();
std::mem::forget(boxed);

Box::into_raw(Box::new(SignatureArray { items: items_ptr, len }))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're already using Box::into_raw, we can apply it here to remove the need for this std::mem::forget(boxed).

Suggested change
let len = boxed.len();
let items_ptr = boxed.as_mut_ptr();
std::mem::forget(boxed);
Box::into_raw(Box::new(SignatureArray { items: items_ptr, len }))
let len = sig_entries.len();
let items_ptr = Box::into_raw(sig_entries.into_boxed_slice()) as *mut _;
Box::into_raw(Box::new(SignatureArray { items: items_ptr, len }))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(or we can use cast_const(), depending on the convention this repo follows


let parameters_len = param_entries.len();
let parameters_ptr = param_entries.as_mut_ptr();
std::mem::forget(param_entries);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

likewise


for param_entry in &mut param_slice {
if !param_entry.name.is_null() {
let _ = unsafe { CString::from_raw(param_entry.name.cast_mut()) };
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Equivalent, but more clear intent IMO

Suggested change
let _ = unsafe { CString::from_raw(param_entry.name.cast_mut()) };
drop(unsafe { CString::from_raw(param_entry.name.cast_mut()) });

// Take ownership of the SignatureArray
let arr = unsafe { Box::from_raw(ptr) };

if !arr.items.is_null() && arr.len > 0 {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be inverted into an early-return, similar to the ptr.is_null() check above?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants