"Real" method overloading vs Union types #17813

Closed
procodix opened this issue Feb 15, 2025 · 10 comments

Comments

@procodix

Description

I think it's time to re-raise a question that has only ever been partially answered and implemented in the past, limited by the current state of development of PHP.

How should PHP methods support overloading?

Historically, there were magic methods like __call(). Now there are union types and named parameters, which allow different signatures for one method. This was called and sold as method overloading, although it actually is not. Union types in particular feel like an extreme step backwards from strong and clean typing. They are no substitute for method overloading: they require unsightly type queries in the code and the argument order is not flexible, not to mention variadic arguments. In a way, union types are the counterpoint to method overloading. Named parameters interfere with variadic arguments, and the two only work together in a few limited use cases.
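To make the "type queries" complaint concrete, here is a minimal sketch of how a single constructor emulates several signatures with a union type today. The `Element` class and its `$html` property are hypothetical stand-ins for the class discussed in this issue, not an existing API.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the Element class discussed in this issue.
final class Element
{
    public string $html;

    // One signature has to cover every accepted input shape.
    public function __construct(string|array|Element $source)
    {
        // The body branches on the runtime type; this is the manual
        // "type query" that an overload would make unnecessary.
        $this->html = match (true) {
            is_string($source) => $source,
            is_array($source) => implode('', array_map(
                static fn (Element $e): string => $e->html,
                $source
            )),
            $source instanceof Element => $source->html,
        };
    }
}

$a = new Element('<p>a</p>');
$b = new Element([$a, new Element('<p>b</p>')]);
echo $b->html, PHP_EOL; // <p>a</p><p>b</p>
```

The signature alone no longer tells a reader which combinations are meaningful; that knowledge lives in the `match` arms.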

"Method overloading" by the book, however, should allow multiple methods of the same name with different argument lists. Ideally, even static and instance methods with the same name could coexist. In the days when PHP had no typing, this was not feasible. Fortunately, that has changed, so the way is now clear.

Sure, you can simulate all of this with __call(), but that destroys any typing and also the IDE support. I think it's time to merge all the partial concepts into a holistic solution of real method overloading, the same way it is offered in other modern languages. Since this could be added as a new language feature, the current substitutes can still be usable for compatibility reasons, albeit less powerful and therefore lower-value.

Constructors are a primary use case, but other methods up to property hooks also benefit from it.

// representing a HTML Element
class Element {
	function fromHtml(string $html) {...}
	function fromContentClasses(string $content, string $classes) {...}
	function fromContentClassesTag(string $content, string $classes, string $tag) {...}
	function fromDom(DOM $dom, string $selector) {…}

	// vs
	
	function __construct(string $html) {…}
	function __construct(string $content, string $classes) {…}
	function __construct(DOM $dom) {…}
	...

	// property hooks with overloading and automatic type conversion
	array $children {
		get {…}
		set(string $html) {…}
		set(array $elements) {…}
		set(Element $e) {...}
	}
}

Please let me know whether it would make sense to put effort into creating an RFC for this.

@iluuu1994
Member

Why do you think this has been partially answered? It's been discussed to death over many years. It's not that people all agree that it's an undesirable feature (although certainly some do, me included), but it's also that there is no feasible implementation, both in terms of semantics and performance. These limitations mainly come from the fact that the function dispatch part would need to happen at runtime, whereas most (if not all) other languages do so statically.

@procodix
Author

I followed these debates :) As it is now, PHP surely does heavy checks to ensure type safety while putting parameters on the stack before a method dispatch (which yields runtime errors when types mismatch). I don't know whether selecting one of several overloaded methods would be more costly than this, especially when named parameters are involved, which have to be brought into order, or variadic arguments, which need to be unpacked/merged/repacked before dispatch.

This all feels like it compensates for the missing overloads.

I'm not too familiar with PHP internals. Is it safe to say, choosing an overloaded method has to happen at runtime? Their signatures are known at compile time. If function argument types were also known to PHP the way IDEs know them, there sure would be a way.

@iluuu1994
Member

iluuu1994 commented Feb 15, 2025

Is it safe to say, choosing an overloaded method has to happen at runtime? Their signatures are known at compile time. If function argument types were also known to PHP the way IDEs know them, there sure would be a way.

Yes, that's safe to say. PHP compiles each file individually, so if you call foo($someVar) from a.php, and foo() is declared in b.php, we do not know what the signature of this function is. What's worse, even if foo() is declared in a.php, with function overloading we would no longer know if this is the function to be called, because possibly some other file declares a different overload of this function whose signature fits the passed arguments better.

As mentioned, most languages pick the correct overload statically. I.e. they infer function argument types, and then choose which function to call at compile time. However, since PHP doesn't have knowledge about the types of many expressions (again, because they may rely on symbols from other files), it would have to pick the correct overload by looking at the actual values passed to the function. However, these types can change between function calls, which also means that the same foo($value) may trigger a call to foo(int) or foo(float). But what happens if I call foo(string)? With coercion, we could call either. We can say that int has precedence, but that just delays the problem. What about foo(int, float) and foo(float, int) called with foo(string, string)? There are bound to be inconsistencies.
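The ambiguity can be reproduced with today's PHP. The sketch below is only an illustration: the two closures stand in for the hypothetical overloads foo(int, float) and foo(float, int), and `matching()` is a made-up helper that reports which candidates accept a given argument list in the default coercive typing mode.

```php
<?php
// No declare(strict_types=1) on purpose: coercive mode is what creates the ambiguity.

// Stand-ins for the two hypothetical overloads of foo().
$candidates = [
    'foo(int, float)' => static fn (int $a, float $b): bool => true,
    'foo(float, int)' => static fn (float $a, int $b): bool => true,
];

// Hypothetical helper: returns the labels of all candidates whose
// parameter types accept the given arguments.
function matching(array $candidates, mixed ...$args): array
{
    $hits = [];
    foreach ($candidates as $label => $candidate) {
        try {
            $candidate(...$args);
            $hits[] = $label;
        } catch (TypeError) {
            // This candidate rejects the arguments; try the next one.
        }
    }
    return $hits;
}

// Numeric strings coerce to both int and float, so both candidates accept.
print_r(matching($candidates, '1', '2'));

// Non-numeric strings are rejected by both candidates.
print_r(matching($candidates, 'abc', 'def'));
```

Each individual check only answers "accepted" or "rejected", so when two candidates accept the same call, nothing in the existing type system ranks one above the other.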

I don't think anybody has ever demonstrated there is actually a good solution to any of these issues, which is why this never goes anywhere when somebody proposes it.

@procodix
Author

Nice explanation. So union type checks also happen between method invocation and method dispatch?

@iluuu1994
Member

In terms of the VM, parameter types are always checked at runtime, including union types. Primitive type unions are very fast (a <<-shift and &), unions of classes are more complex (actual loop over the class names). This happens as part of the ZEND_RECV opcode.
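As a rough userland illustration of the two strategies mentioned above (this is not the engine's C code, and the numeric type codes and function names are assumptions made for the sketch): a union of primitive types can be stored as a bitmask and tested with one shift and one AND, while a union of classes needs a loop over the listed names.

```php
<?php
declare(strict_types=1);

// Illustrative type codes, one bit position each (assumed values for this sketch).
const CODE_NULL = 1;
const CODE_INT = 4;
const CODE_FLOAT = 5;
const CODE_STRING = 6;

// Build a mask with one bit set per allowed primitive type.
function typeMask(int ...$codes): int
{
    $mask = 0;
    foreach ($codes as $code) {
        $mask |= 1 << $code;
    }
    return $mask;
}

// Primitive union check: a shift and a bitwise AND, independent of union size.
function maskAccepts(int $mask, int $code): bool
{
    return ($mask & (1 << $code)) !== 0;
}

// Class union check: walk the list of class names until one matches.
function classListAccepts(array $classNames, object $value): bool
{
    foreach ($classNames as $className) {
        if ($value instanceof $className) {
            return true;
        }
    }
    return false;
}

$intOrStringOrNull = typeMask(CODE_INT, CODE_STRING, CODE_NULL);
var_dump(maskAccepts($intOrStringOrNull, CODE_STRING)); // bool(true)
var_dump(maskAccepts($intOrStringOrNull, CODE_FLOAT));  // bool(false)
var_dump(classListAccepts([ArrayObject::class, DateTimeInterface::class], new DateTimeImmutable())); // bool(true)
```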

@procodix
Author

So the minimal effort to do runtime dispatch for overloaded methods would be to build a hash from the method name and its arguments. The argument count would be easy to distinguish, but it only allows overloads with different argument counts. So a minimal hash would have to include argument types/names.

One approach that comes to mind is to base the hash on argument names.

public function Method(string $s, int $i, ?array $a) {...}

can be hashed to Method_s_i_a() based on argument names or Method_string_int_array() based on argument types. This looks easy to implement.

The problem arises at the call site:

// this seems straightforward to hash, check (method_exists()) and dispatch, possibly in O(1)
$this->Method(s: 'foo', i: 42, a: ['foo', 'bar']);

// literals are easy as well
$this->Method('foo', 42, ['foo', 'bar']);

// requires inspecting the runtime type of each argument (its zval), resulting in O(n)
$this->Method($a, $b, $c);
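The type-based variant of this idea can be approximated in userland today. The following is a minimal sketch, not a proposal for the engine: `Printer` and its suffixed methods are hypothetical, `__call()` plays the role of the dispatcher, and the key is built from `get_debug_type()` of each argument, which is the O(n) step mentioned above.

```php
<?php
declare(strict_types=1);

// Hypothetical class: the suffixed private methods play the role of the hashed overloads.
final class Printer
{
    public function __call(string $name, array $args): mixed
    {
        // Build the lookup key from the runtime type of every argument.
        $types = array_map(static fn (mixed $arg): string => get_debug_type($arg), $args);
        $target = $name . '_' . implode('_', $types);

        if (!method_exists($this, $target)) {
            throw new BadMethodCallException("No overload {$target}()");
        }

        return $this->$target(...$args);
    }

    private function show_string_int(string $s, int $i): string
    {
        return "string/int: {$s} x {$i}";
    }

    private function show_int_string(int $i, string $s): string
    {
        return "int/string: {$i} x {$s}";
    }
}

$printer = new Printer();
echo $printer->show('a', 2), PHP_EOL; // string/int: a x 2
echo $printer->show(2, 'a'), PHP_EOL; // int/string: 2 x a
```

This only finds an overload when the runtime types match the declared ones exactly; it also gives up static analysis and IDE support, as noted earlier in the thread.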

@iluuu1994
Member

So the minimal effort to do runtime dispatch for overloading methods would be to build a hash from method name and arguments.

Due to sub-typing and coercion, this doesn't really work.

public function Method(string $s, int $i, ?array $a) {...}

Let's say you call this with string, int, null, how do you know you're looking for Method_s_i_a and not Method_s_i_n? This obviously gets worse with other union types, sub-classes, coercion, etc. Another problem is foo(...) or Closure::fromCallable('foo'), which will not know which concrete function is being referenced.
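A small sketch of why an exact-match key breaks down, assuming keys are built from runtime types with `get_debug_type()`; `runtimeKey()` and the declared key strings are hypothetical names used only for this illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: derives a lookup key from the runtime types of the arguments.
function runtimeKey(string $name, mixed ...$args): string
{
    return $name . '_' . implode('_', array_map('get_debug_type', $args));
}

// Key derived from the declaration Method(string $s, int $i, ?array $a).
$declared = 'Method_string_int_array';

// An actual array produces the declared key.
var_dump(runtimeKey('Method', 'foo', 42, ['bar']) === $declared); // bool(true)

// null is a valid argument for ?array, but the key now ends in "_null".
var_dump(runtimeKey('Method', 'foo', 42, null) === $declared); // bool(false)

// Sub-typing has the same effect: a parameter declared as Throwable
// receives a RuntimeException, and the keys no longer match.
var_dump(runtimeKey('log', new RuntimeException('x')) === 'log_Throwable'); // bool(false)
```

Making these calls resolve would require trying every compatible key (each nullable variant, parent class, interface, and coercion target), which is no longer a single hash lookup.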

@procodix
Author

Let's say you call this with string, int, null, how do you know you're looking for Method_s_i_a and not Method_s_i_n?

I'd say this demonstrates a problem of the union-type realm. ?array equals array|null. Obviously PHP's union types and overloading are contradictory. (Interestingly Swift calls them optionals and they work quite fine, because they are boxed types, inside each other, but in PHP they are more like alternatives next to each other, so it can't work with "union types").

But then I'd assume this is a precision requirement on the definition side. Overlaps must not be allowed. So you can't have
Method_s_i_?a
and
Method_s_i_a --> with array, not ?array, so an array must be provided.

Method_s_i_n doesn't make sense, because it would force the caller to pass null for that parameter.

Due to sub-typing and coercion, this doesn't really work.

Isn't this what happens now when checking argument types in calls?

@cmb69
Member

cmb69 commented Feb 17, 2025

This discussion leads nowhere. Besides the technical difficulties (I'd rather call it impossibility), it's not even clear that method overloading is a reasonable concept (some may reject it outright in OOP languages; others may claim it is a poor man's substitute for multiple dispatch).

Feel free to pursue the RFC process; I'm closing this ticket as WONTFIX for the time being.

@cmb69 closed this as not planned on Feb 17, 2025
@iluuu1994
Member

iluuu1994 commented Feb 17, 2025

Interestingly Swift calls them optionals and they work quite fine, because they are boxed types, inside each other

I'm not sure that makes sense. Swift does static overloading. If you try this, Swift will complain at compile time:

func foo(x: Int?) {}
func foo(x: String?) {}
foo(nil)

As mentioned, we don't have that luxury.

But then I'd assume this is a precision requirement on the definition side. Overlaps must not be allowed. So you can't have

Overlaps would need to be allowed for all except one parameter, which makes it even more obscure. Otherwise, you can't have something like print(string $value, bool $stderr) and print(int $value, bool $stderr), as the second param obviously overlaps.

Isn't this what happens now when checking argument types in calls?

Type checking in PHP will either succeed or fail. However, we don't have a concept of "which signature may fit the arguments better".

Somebody would need to demonstrate this is actually possible. Otherwise we just keep repeating the same talking points. 🙂
