
Investigate By-Ref Vector and Matrix Operators in C# 8.0 #600

Open
ilexp opened this issue Jan 4, 2018 · 22 comments
Labels
Breaking Change (Breaks binary or source compatibility), Cleanup (Improving form, keeping function), Core (Area: Duality runtime or launcher), Performance (Related to runtime or editor performance), Task (ToDo that's neither a Bug, nor a Feature)

Comments

@ilexp
Member

ilexp commented Jan 4, 2018

Summary

As mentioned in issue #598, the new `in` parameter keyword is allowed on operators and also accepts literal arguments, so operator implementations can supersede the previous static by-ref method alternatives.

Analysis

  • Wait until the core has adopted C# 7.2.
  • Make sure that passing non-referencable values (such as literals or property return values) to in parameters has no negative performance impact. Read the docs, and if that's not 100% clear, inspect the generated IL and x86 code in a sample project.
  • Remove all static by-ref methods from Vector2/3/4, Quaternion and Matrix3/4 that have an operator equivalent.
  • Use the in keyword on the parameters of the operators in question.
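The replacement described in the analysis above can be sketched roughly like this. Note this is an illustrative sketch only: `Vec3` and the static `Add` signature are assumptions modeled after typical by-ref math APIs, not Duality's exact code.

```csharp
public struct Vec3
{
	public float X, Y, Z;
	public Vec3(float x, float y, float z) { X = x; Y = y; Z = z; }

	// Old style: a static by-ref helper that avoids copying the operands.
	public static void Add(ref Vec3 left, ref Vec3 right, out Vec3 result)
	{
		result = new Vec3(left.X + right.X, left.Y + right.Y, left.Z + right.Z);
	}

	// New style (C# 7.2+): "in" passes a readonly reference. Callers can
	// still pass literals or other rvalues; the compiler emits a hidden temp.
	public static Vec3 operator +(in Vec3 left, in Vec3 right)
	{
		return new Vec3(left.X + right.X, left.Y + right.Y, left.Z + right.Z);
	}
}
```

With the operator form in place, the static helper becomes redundant and can be removed.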
@ilexp added the Breaking Change, Cleanup, Core, Performance and Task labels on Jan 4, 2018
@ilexp ilexp added this to the Future milestone Jan 4, 2018
@Barsonax
Member

Barsonax commented Jan 8, 2018

Also, I'm not sure the current arithmetic implementation is optimal. There is quite a big difference in the generated IL. Take for example the Vector3 + operator:

Source:

public static Vector3 operator +(Vector3 left, Vector3 right)
{
	left.X += right.X;
	left.Y += right.Y;
	left.Z += right.Z;
	return left;
}

IL:

  .method public hidebysig static specialname valuetype Duality.Vector3 
    op_Addition(
      valuetype Duality.Vector3 left, 
      valuetype Duality.Vector3 right
    ) cil managed 
  {
    .maxstack 8

    // [502 4 - 502 22]
    IL_0000: ldarga.s     left
    IL_0002: ldflda       float32 Duality.Vector3::X
    IL_0007: dup          
    IL_0008: ldind.r4     
    IL_0009: ldarg.1      // right
    IL_000a: ldfld        float32 Duality.Vector3::X
    IL_000f: add          
    IL_0010: stind.r4     

    // [503 4 - 503 22]
    IL_0011: ldarga.s     left
    IL_0013: ldflda       float32 Duality.Vector3::Y
    IL_0018: dup          
    IL_0019: ldind.r4     
    IL_001a: ldarg.1      // right
    IL_001b: ldfld        float32 Duality.Vector3::Y
    IL_0020: add          
    IL_0021: stind.r4     

    // [504 4 - 504 22]
    IL_0022: ldarga.s     left
    IL_0024: ldflda       float32 Duality.Vector3::Z
    IL_0029: dup          
    IL_002a: ldind.r4     
    IL_002b: ldarg.1      // right
    IL_002c: ldfld        float32 Duality.Vector3::Z
    IL_0031: add          
    IL_0032: stind.r4     

    // [505 4 - 505 16]
    IL_0033: ldarg.0      // left
    IL_0034: ret          

  } // end of method Vector3::op_Addition

Assembly:

 push        ebp  
 mov         ebp,esp  
 push        edi  
 push        esi  
 sub         esp,0Ch  
 lea         edx,[ebp-14h]  
 fld         dword ptr [ebp+14h]  
 fstp        dword ptr [edx]  
 fld         dword ptr [ebp+18h]  
 fstp        dword ptr [edx+4]  
 fld         dword ptr [ebp+1Ch]  
 fstp        dword ptr [edx+8]  
 lea         eax,[ebp-14h]  
 fld         dword ptr [ebp+8]  
 fadd        dword ptr [eax]  
 fstp        dword ptr [eax]  
 lea         eax,[ebp-10h]  
 fld         dword ptr [ebp+0Ch]  
 fadd        dword ptr [eax]  
 fstp        dword ptr [eax]  
 lea         eax,[ebp-0Ch]  
 fld         dword ptr [ebp+10h]  
 fadd        dword ptr [eax]  
 fstp        dword ptr [eax]  
 mov         edi,ecx  
 lea         esi,[ebp-14h]  
 movq        xmm0,mmword ptr [esi]  
 movq        mmword ptr [edi],xmm0  
 add         esi,8  
 add         edi,8  
 movs        dword ptr es:[edi],dword ptr [esi]  
 lea         esp,[ebp-8]  
 pop         esi  
 pop         edi  
 pop         ebp  
 ret         18h

And compare it to this:

Source:

public static Vector3 operator +(Vector3 left, Vector3 right)
{
	return new Vector3(left.X + right.X, left.Y + right.Y, left.Z + right.Z);
}

IL:

  .method public hidebysig static specialname valuetype Duality.Vector3 
    op_Addition(
      valuetype Duality.Vector3 left, 
      valuetype Duality.Vector3 right
    ) cil managed 
  {
    .maxstack 8

    // [502 4 - 502 76]
    IL_0000: ldarg.0      // left
    IL_0001: ldfld        float32 Duality.Vector3::X
    IL_0006: ldarg.1      // right
    IL_0007: ldfld        float32 Duality.Vector3::X
    IL_000c: add          
    IL_000d: ldarg.0      // left
    IL_000e: ldfld        float32 Duality.Vector3::Y
    IL_0013: ldarg.1      // right
    IL_0014: ldfld        float32 Duality.Vector3::Y
    IL_0019: add          
    IL_001a: ldarg.0      // left
    IL_001b: ldfld        float32 Duality.Vector3::Z
    IL_0020: ldarg.1      // right
    IL_0021: ldfld        float32 Duality.Vector3::Z
    IL_0026: add          
    IL_0027: newobj       instance void Duality.Vector3::.ctor(float32, float32, float32)
    IL_002c: ret          

  } // end of method Vector3::op_Addition

Assembly:

 push        ebp  
 mov         ebp,esp  
 push        eax  
 fld         dword ptr [ebp+14h]  
 fadd        dword ptr [ebp+8]  
 fstp        dword ptr [ebp-4]  
 fld         dword ptr [ebp-4]  
 fld         dword ptr [ebp+18h]  
 fadd        dword ptr [ebp+0Ch]  
 fstp        dword ptr [ebp-4]  
 fld         dword ptr [ebp-4]  
 fld         dword ptr [ebp+1Ch]  
 fadd        dword ptr [ebp+10h]  
 fstp        dword ptr [ebp-4]  
 fld         dword ptr [ebp-4]  
 fxch        st(2)  
 fstp        dword ptr [ecx]  
 fstp        dword ptr [ecx+4]  
 fstp        dword ptr [ecx+8]  
 mov         esp,ebp  
 pop         ebp  
 ret         18h 

BenchmarkDotNet also shows that creating a new vector is a lot faster than modifying the parameter vector. Creating a small test project in Duality with a stopwatch confirms this as well. Might be worth looking into, and if you want to use the in keyword (readonly ref, C# 7.2) you would have to restructure the operators this way anyway.

@ilexp
Member Author

ilexp commented Jan 9, 2018

Can you provide the C# code for each of the IL code snippets?

To be really sure how this ends up on the CPU, we'll also need to compare the JITted assembly code, since a lot of optimizations only happen in the JIT stage.

BenchmarkDotNet also shows that creating a new vector is a lot faster than modifying the vector.

What do you mean by creating a new vector vs. modifying one? Using the in keyword shouldn't change a lot in that regard; the result value is still a newly created vector?

@Barsonax
Member

Barsonax commented Jan 9, 2018

Basically it's this:

The current version (vector is the vector you get through the parameters):

vector.X = someCalculatedValue1;
vector.Y = someCalculatedValue2;
vector.Z = someCalculatedValue3;
return vector;

vs this:

return new Vector3(someCalculatedValue1, someCalculatedValue2, someCalculatedValue3);

The last one is also compatible with the in keyword, since you do not modify the vector you get in through the parameters (meaning you can add the in keyword to the parameter without having to change the body). At least in IL it's also a lot shorter, and measuring shows it's faster. I haven't checked the JITted code.

EDIT: added the source to my previous post

@ilexp
Member Author

ilexp commented Jan 9, 2018

The last one is also compatible with the in keyword, since you do not modify the vector you get in through the parameters (meaning you can add the in keyword to the parameter without having to change the body)

Ah, sure. That aspect of the source will have to change for the in keyword, yes. It could still be either a temporary vector assigned element by element, or a return + constructor combo, though. The latter might be faster, though I would have assumed it shouldn't make a difference after all optimizations have fired.

@Barsonax
Member

Barsonax commented Jan 9, 2018

Might be better memory locality, since it doesn't have to go back to that vector between the calculations? I don't really know; I just noticed this when I looked at the IL and decided to measure it. Could be a good idea to try to reproduce it and see whether the difference shows up on your machine as well?

@ilexp
Member Author

ilexp commented Jan 9, 2018

I don't really know; I just noticed this when I looked at the IL and decided to measure it. Could be a good idea to try to reproduce it and see whether the difference shows up on your machine as well?

Good call. Not sure I'll get around to benchmarking this myself, but maybe you can just post the C# file with your benchmarking code and the full BenchmarkDotNet results for reference? That should be enough of a statement for when we get back to this issue later.

@Barsonax
Member

Barsonax commented Jan 10, 2018

I have added the assembly to the source and IL post. Here is the benchmark code:

using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using Duality;

class Program
{
	static void Main(string[] args)
	{
		BenchmarkRunner.Run<VectorOperationBench>();
		Console.ReadKey();
	}
}

public class VectorOperationBench
{
	private Vector3 left = new Vector3(4, 7, 8);
	private Vector3 right = new Vector3(1, 2, 3);

	[Benchmark]
	public Vector3 OldAdd()
	{
		return this.left + this.right;
	}

	[Benchmark]
	public Vector3 NewAdd()
	{
		return new Vector3(this.left.X + this.right.X, this.left.Y + this.right.Y, this.left.Z + this.right.Z);
	}
}

These are the results I get on an i7-4790K:

[screenshot: BenchmarkDotNet results]

On my laptop it's slightly slower in both cases (slower CPU), but overall the same result.

@ilexp
Member Author

ilexp commented Jan 10, 2018

Okay, it really looks like there's an obvious perf gain from restructuring the operators that way. Very nice find 👍 If you're up for it, this could be a worthwhile master PR until we get around to dealing with this issue / the in keyword stuff.

@Barsonax
Member

So, first, to be really sure this wasn't just some exception to the rule, I added more benchmarks and measured the current performance:
[screenshot: benchmark results for the current operators]

Then I changed those operators and benchmarked them again:
[screenshot: benchmark results for the changed operators]

So except for the division, which is more or less the same, it's a very big improvement. I wonder why the compiler doesn't optimize this.

Also, I fixed some trivial code duplication (for instance, vec * float and float * vec now share the same code instead of two separate implementations doing the same thing). I will do a PR once I've done all the operators.

@ilexp
Member Author

ilexp commented Jan 11, 2018

Also, I fixed some trivial code duplication (for instance, vec * float and float * vec now share the same code instead of two separate implementations doing the same thing). I will do a PR once I've done all the operators.

Are you 100% sure that you can still multiply both ways, e.g. 2.5f * vec and vec * 2.5f, even when defining only one of them? Please check before including this. I'm not sure the operand order is reversed implicitly by the compiler here; honestly, I would expect it not to be.

@Barsonax
Member

I didn't remove them; I changed the reversed variant to use the logic of the non-reversed variant in order to reduce code duplication.
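In other words, something like this sketch (illustrative type name, not the actual Duality code): both overloads stay defined, because C# does not generate the commuted operator automatically, and the reversed one simply forwards to the primary one.

```csharp
public struct Vec3
{
	public float X, Y, Z;
	public Vec3(float x, float y, float z) { X = x; Y = y; Z = z; }

	// Primary implementation.
	public static Vec3 operator *(Vec3 v, float scale)
	{
		return new Vec3(v.X * scale, v.Y * scale, v.Z * scale);
	}

	// The reversed operand order must still be declared explicitly;
	// it forwards to the primary overload instead of duplicating it.
	public static Vec3 operator *(float scale, Vec3 v)
	{
		return v * scale;
	}
}
```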

@ilexp
Member Author

ilexp commented Jan 11, 2018

I didn't remove them; I changed the reversed variant to use the logic of the non-reversed variant in order to reduce code duplication.

Ah, makes sense. If this came at the cost of performance, I would value performance over avoiding code duplication in this case; otherwise, good call.

@Barsonax
Member

Ah, makes sense. If this came at the cost of performance, I would value performance over avoiding code duplication in this case; otherwise, good call.

No, I measured it and the performance is the same. The JIT seems to inline the forwarding call, so there is no difference.

@Barsonax
Member

Now that my PR is merged, aside from the much better performance when using these operators, adding the in keyword is trivial and will require no other code changes:
Current:

public static Vector3 operator -(Vector3 left, Vector3 right)
{
	return new Vector3(
		left.X - right.X, 
		left.Y - right.Y, 
		left.Z - right.Z);
}

New:

public static Vector3 operator -(in Vector3 left, in Vector3 right)
{
	return new Vector3(
		left.X - right.X, 
		left.Y - right.Y, 
		left.Z - right.Z);
}

This issue can be tackled as soon as we switch to C# 7.2. Do remember to benchmark these changes to see whether they improve performance; there is a chance it actually reduces performance for smaller vectors!

@Barsonax
Member

Barsonax commented Nov 8, 2018

There are cases where defensive copies can occur when using the in keyword, which can reduce performance. Making the struct readonly prevents this:

https://blogs.msdn.microsoft.com/seteplia/2018/03/07/the-in-modifier-and-the-readonly-structs-in-c/

However, since the X, Y etc. fields are part of the public API, making the struct readonly is basically a breaking change.
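A minimal illustration of the defensive-copy problem (illustrative types, not Duality's): calling a member on a non-readonly struct received through an in parameter forces the compiler to copy the struct first, because the member might mutate this.

```csharp
public struct MutableVec
{
	public float X, Y;
	public float Length() => (float)System.Math.Sqrt(X * X + Y * Y);
}

public readonly struct ReadonlyVec
{
	public readonly float X, Y;
	public ReadonlyVec(float x, float y) { X = x; Y = y; }
	public float Length() => (float)System.Math.Sqrt(X * X + Y * Y);
}

public static class Demo
{
	// The compiler cannot prove Length() leaves `v` unchanged, so it
	// emits a hidden defensive copy of `v` before the call.
	public static float LenMutable(in MutableVec v) => v.Length();

	// `readonly struct` guarantees no member mutates `this`, so the
	// call goes straight through the reference with no copy.
	public static float LenReadonly(in ReadonlyVec v) => v.Length();
}
```

Both return the same value; the difference only shows up in the generated code (and in benchmarks).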

@Barsonax
Member

Barsonax commented Jun 26, 2019

As a further enhancement, in C# 8.0 one can now add readonly to individual properties and methods to denote that they do not change state: https://docs.microsoft.com/en-us/dotnet/csharp/whats-new/csharp-8#readonly-members

(In C# 7.3 this was only possible at the struct level.)

This will enable the compiler to skip more defensive copies.
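Sketched on a hypothetical vector type, C# 8.0 readonly members look like this:

```csharp
public struct Vec2
{
	public float X, Y;

	// `readonly` on the member promises it does not mutate the struct,
	// so calls through an `in` parameter need no defensive copy, while
	// the struct itself stays mutable (fields remain assignable).
	public readonly float Length => (float)System.Math.Sqrt(X * X + Y * Y);

	public readonly Vec2 Normalized()
	{
		float len = Length;
		return new Vec2 { X = X / len, Y = Y / len };
	}
}
```

This keeps the field-assignment usability of a plain mutable struct while still telling the compiler which members are safe to call without copying.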

@Barsonax
Member

Barsonax commented Jul 30, 2019

Now that Duality is using C# 7.3, this issue can be picked up at any time:

  • Add the in keyword to the operators.
  • Consider making the vector types readonly to reduce the number of defensive copies. This changes the public API, though.

@ilexp
Member Author

ilexp commented Aug 2, 2019

Making vector or matrix types readonly structs would be quite a usability downgrade, since that would prevent field assignments. We can still avoid performance regressions, though, if we use the new readonly members that C# 8.0 offers to flag all non-mutating properties and methods as readonly.

With that in mind, I think it makes sense to defer this until we're done with #698 right up to C# 8.0.

@ilexp ilexp changed the title Investigate By-Ref Vector and Matrix Operators in C# 7.2 Investigate By-Ref Vector and Matrix Operators in C# 8.0 Aug 2, 2019
@Barsonax
Member

Barsonax commented Aug 4, 2019

Making vector or matrix types readonly would be quite a usability downgrade, since that would prevent field assignments

Agreed. One could also consider adding methods like WithX(float value) that return a new vector with a different value for X. That does basically the same as assigning the field, but it makes it clearer when copies happen, and it lets us mark the vector itself as readonly.

So this:

var vector = new Vector3(1,2,3);
vector.X = 6;
vector.Z = 6;
// Do something with vector

Changes to:

var vector = new Vector3(1,2,3);
var modifiedVector = vector.WithX(6)
                           .WithZ(6);
// Do something with modifiedVector

The meaning is slightly different, but this does allow you to mark the entire vector as readonly.
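A sketch of how such With* methods could look on a readonly struct (hypothetical, mirroring the suggestion above):

```csharp
public readonly struct Vec3
{
	public readonly float X, Y, Z;
	public Vec3(float x, float y, float z) { X = x; Y = y; Z = z; }

	// Each With* method returns a copy with one component replaced,
	// so the struct itself can stay fully readonly.
	public Vec3 WithX(float value) => new Vec3(value, Y, Z);
	public Vec3 WithY(float value) => new Vec3(X, value, Z);
	public Vec3 WithZ(float value) => new Vec3(X, Y, value);
}
```

Usage would then be, for example, `var v = new Vec3(1, 2, 3).WithX(6).WithZ(6);`.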

@ilexp
Member Author

ilexp commented Aug 4, 2019

I acknowledge the technical possibility, but that just reads terribly and wrong 😄

(Also: ref / out support, additional copies, and potentially other stuff I'm not even thinking about. Just give me plain old data vectors.)

@Barsonax
Member

Barsonax commented Aug 4, 2019

I acknowledge the technical possibility, but that just reads terrible and wrong

Hmm, I don't think it's that bad, though. However, another reason not to implement my suggestion (yup, I now disagree with myself :P) might be this: https://referencesource.microsoft.com/#System.Numerics/System/Numerics/Vector3_Intrinsics.cs,3e75f0b7820a3bf5

[screenshot: System.Numerics Vector3 source excerpt]

There is a chance we would want to switch to these vectors at some point in the future, since the System.Numerics vectors are a lot faster thanks to SIMD optimizations. Better to try to match that public API if possible.

@ilexp
Member Author

ilexp commented Aug 4, 2019

Yeah, I was eyeballing that possibility as well - definitely something to keep in mind.
