-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimisation pass #1365
base: develop
Are you sure you want to change the base?
Optimisation pass #1365
Conversation
Had a scanthrough and made the following changes in the name of optimisation (even micro-optimisations) where possible: - player.Get* -> player.Iterator - ents.GetAll -> ents.Iterator - ipairs -> for i=1,#tbl (only main uses of ipairs has been replaced, not every use) - Vector:Distance -> Vector:DistToSqr - Vector:Length -> Vector:LengthSqr - net.Read/WriteEntity -> net.Read/WritePlayer where it is definitely intended to be a player - Saved some GetConVar calls to local vars - Localised/reused some Vector and Color objects - Localised some functions where they are frequently used in a few files - Misc minor styling cleanup (mostly spacing) - BONUS: Fixed pac.ColorToNames being incorrect when used by some parts (some now use the new pac.VectorColorToNames func)
lua/pac3/core/client/parts/lock.lua
Outdated
@@ -458,7 +461,7 @@ function PART:DecideTarget() | |||
|
|||
--sort for the closest | |||
for i,ent_candidate in ipairs(ents_candidates) do | |||
local test_distance = (ent_candidate:GetPos()):Distance( self:GetWorldPosition()) | |||
local test_distance = (ent_candidate:GetPos()):Distance(self:GetWorldPosition()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This Distance can be optimized
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, I missed this file
Thanks 😄
I feel like for loops that don't need optimization should use ipairs (which is faster in luajit). Each for loop now being 3 lines is pretty messy and uncomfortable to look at. |
I agree it's nicer to look at when it's just ipairs. However when I test the run times between them, Results:
The code I'm using to test the run times: local tbl = {"a", "b", "c", "a", "b", "c", "a", "b"}
jit.on() -- Make sure its on
local runTimes = 100
local time = 0
for x = 1, runTimes do
local start = SysTime()
for i = 1, 1000 do
for k, v in pairs(tbl) do end
end
local e = SysTime()
time = time + (e - start)
end
print("pairs", (time / runTimes) * 1000)
time = 0
for x = 1, runTimes do
local start = SysTime()
for i = 1, 1000 do
for k, v in ipairs(tbl) do end
end
local e = SysTime()
time = time + (e - start)
end
print("ipairs", (time / runTimes) * 1000)
time = 0
for x = 1, runTimes do
local start = SysTime()
for i = 1, 1000 do
for k = 1, #tbl do
local v = tbl[k]
end
end
local e = SysTime()
time = time + (e - start)
end
print("for k = 1, #tbl", (time / runTimes) * 1000) |
That makes me really sad for glua. Still, compromising clean code in spots that don't need the performance is a bit of a headache. I'm fine with the changes either way. |
I don't wanna dictate too much here as I'm not very active in the project at the moment, but how does this affect startup time, framerate of rendering a complex outfit, and so on? In other words, if it doesn't improve the performance users care about then all we're really doing is making the code more complex. |
Of most significant, these changes can reduce the number of calls to player.GetAll() and ents.GetAll(), for i=1,#tbl can go through the table faster. |
That's stating the obvious. He just wants to know if the fps gain is worth the messy code. Imo only the hot loops should be optimized, but it doesn't really matter. |
Having to resolve the conflicts in that many files changed will be an absolute pain for me who hoards heaps and heaps of changes before publishing them, but I'll manage. As for the style, yeah it adds some lines and it's a different way but I'll accept it if it makes pac a little less bulky in the crucial areas that need their inevitable loops. Especially Think and OnDraw ones, that's good to optimize. Even if ultimately there's always a bigger source of cost out there. How much of pac's runtime cost are we fighting over? If it turns out to be less than a percent, fighting over a tiny fraction is useless in my opinion. Just don't make the code unreadable. Which these changes don't. If we're gonna talk about performance, I'm more interested in numbers and quantifying the impact.
Neither the word "can" nor "greatly increase performance" don't mean anything by themselves. For any theoretical "can", you have to show the real truth. I can trust your intent but it's not enough to establish. For any "greatly increase", you must specify how much. What's the run time in milliseconds (or microseconds) before, and what's the runtime after? What's the FPS before and after? If you have numbers, I'm sure they'll speak for themselves. |
Perhaps I didn’t express myself clearly enough. What I meant was that it could improve performance in certain situations. For example, if used on a larger server with a bunch of complex PAC3 outfits, the difference would definitely be noticeable. However, if you were to scrutinize it under a magnifying glass in single-player, there would hardly be any noticeable change. About complicating the code, I don’t think that anything can complicate it except for k = 1, #tbl, because the rest of the changes simply consist of using a different function |
local function get_enable_convar(classname) | ||
local convar = pac_enable_convars[classname] | ||
|
||
if not convar then | ||
convar = GetConVar("pac_enable_" .. string.Replace(classname, " ", "_"):lower()) | ||
pac_enable_convars[classname] = convar | ||
end | ||
|
||
return convar | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be the __index metamethod of pac_enable_convars. Then replace calls to this with table indexes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tested the perf difference between using the local function and using __index for this, it's pretty much the same. It's a 50-50 chance that one will run faster than the other, so I feel inclined to leave this as is.
Test results (ran code 50 times, console output of last run):
use __index 0.00066750000542015
use function 0.00056749999657768
* times won:
[1] = 21
[2] = 24
Code:
local tbl = {}
setmetatable(tbl, {
__index = function(t, k)
local convar = GetConVar("pac_enable_" .. string.Replace(k, " ", "_"):lower())
t[k] = convar
return convar
end
})
local tbl2 = {}
local function get_enable_convar(classname)
local convar = tbl2[classname]
if not convar then
convar = GetConVar("pac_enable_" .. string.Replace(classname, " ", "_"):lower())
tbl2[classname] = convar
end
return convar
end
local a
local str1, str2 = "model", "light" -- Just in case creating strings was causing a discrepency
jit.on() -- Make sure its on
timesWon = timesWon or {}
local runTimes = 200
local time = 0
for x = 1, runTimes do
local start = SysTime()
for i = 1, 1000 do
a = tbl[str1]
a = tbl[str2]
a = tbl[str1]
a = tbl[str1]
a = tbl[str2]
end
local e = SysTime()
time = time + (e - start)
end
local result = (time / runTimes) * 1000
print("use __index ", result)
time = 0
for x = 1, runTimes do
local start = SysTime()
for i = 1, 1000 do
a = get_enable_convar(str1)
a = get_enable_convar(str2)
a = get_enable_convar(str1)
a = get_enable_convar(str1)
a = get_enable_convar(str2)
end
local e = SysTime()
time = time + (e - start)
end
local result2 = (time / runTimes) * 1000
print("use function", result2)
if result < result2 then
timesWon[1] = (timesWon[1] or 0) + 1
else
timesWon[2] = (timesWon[2] or 0) + 1
end
print("* times won:")
PrintTable(timesWon)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Speed doesn't matter for this function. it's just cleaner to use __index. And yeah jit is kinda random about what it blacklists or compiles
@@ -2673,7 +2716,7 @@ function pace.UltraCleanup(obj) | |||
if IsSafe(part) or Important(part) then | |||
return | |||
elseif IsMildlyRisky(part) then | |||
if table.Count(part:GetChildren()) == 0 then | |||
if #part:GetChildren() == 0 then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure these table.Count replacements will always work? vgui can be pretty inconsistent with the type of table this returns iirc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use table.Empty instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory part:GetChildren() should always return a numerically indexed table. I'm not sure what's up with this code, but I guess in theory is just in theory..
Also I believe [1] is faster than getting the tables length, which part:HasChildren() should do.
I finished reviewing. Just a couple more suggestions. I think a big potential performance improvement is minimizing __index calls on entities. |
ive seen people complain that player.Iterator ents.Iterator can rarely return NULL entities on clientside, im not exactly how it works and how to fix it but its enough of a issue that i can recommend not using it clientside when possible |
Yeah ent, weapon, player __index calls are significantly laggy. Running something like fprofiler and a __index counter would probably help narrow down the lag. Small optimization passes like these arent bad but they wont make significiant differences |
Basically running FProfiler and seeing what I can do to make the most expensive and most ran functions run faster
I've had some updates I've been meaning to push for a while and had decided to wait. But it's been a month, which is a far cry from what the "within a few days" initially stated. So I'm doing some of my updates and you can deal with the git conflicts later. |
That's fine go ahead. I may exclude my changes to do with combat related things for the time being anyway. And sorry, there were a lot of things pointed out that I could try improving, so I kept this back to keep working on it. |
@@ -272,7 +276,7 @@ do -- scene graph | |||
-- this will handle cases like if a part is removed and added again | |||
for _, key in pairs(self.PartReferenceKeys) do | |||
if self[key] and self[key].UniqueID == part.UniqueID then | |||
self["Set" .. key](self, part) | |||
self[prefix_set .. key](self, part) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making this a variable is pointless. Getting rid of the concatenation would be a good performance improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I don't like making these constant strings into variables since it fragments and complicates the code.
This PR is nearly unreviewable with the amount of changes. You're going to just have to test it on a server probably. |
Conflicts need to be resolved too. |
I figured I'd continue on after #1364 and look at replacing all player.GetAll calls where applicable, which led into replacing all ents.GetAll calls with ents.Iterator. That brings us to now where I figured I'd keep going with every optimisation method I could think of within reason.
Instead of pushing this straight to develop, I've made this PR to give everyone interested a chance to check I haven't missed anything obvious. I've tried usual PAC usage on my own game with these changes and it seems fine.
If there's no objections, I'll merge this within a few days.
Commit description:
Had a scanthrough and made the following changes in the name of optimisation (even micro-optimisations) where possible: