forked from wled-dev/WLED
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Remove the large stack buffer as we're just going to copy it in to a heap buffer anyways. Later we can refine the length estimation or use a rope-style dynamic data structure like DynamicBufferList.
- Loading branch information
1 parent
dfcfeaf
commit 4dbe596
Showing
8 changed files
with
472 additions
and
525 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
4dbe596
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This already looks wonderful!
Did you check before and after code size? I can see that it could be way smaller.
4dbe596
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I haven't checked the code size yet. I expect this formulation will be slightly larger than before - we save a few bytes for
oappend
itself, but we'll likely pay for it with all the vtable lookups on thePrint
object. We could go through and replace a lot of the small append constructions withprintf_P()
s or the like which could reduce the call count by quite a bit, though. I tried a test example insappend()
to reduce some of the code duplication there, but the principle could be applied more broadly.4dbe596
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I'd suspected, this build is actually slightly bigger -- 2k on ESP8266, 1k on ESP32. (Apologies for my custom envs.) It can likely be improved a lot by using
printf
s instead, though.nodemcv2_debug
aws-queue
RAM: [====== ] 57.9% (used 47420 bytes from 81920 bytes)
Flash: [========= ] 91.0% (used 950663 bytes from 1044464 bytes)
end-oappend
RAM: [====== ] 57.8% (used 47380 bytes from 81920 bytes)
Flash: [========= ] 91.2% (used 952147 bytes from 1044464 bytes)
esp32_new_ar
aws-queue
RAM: [== ] 18.0% (used 59104 bytes from 327680 bytes)
Flash: [==========] 95.3% (used 1499681 bytes from 1572864 bytes)
end-oappend
RAM: [== ] 18.0% (used 59104 bytes from 327680 bytes)
Flash: [==========] 95.4% (used 1500637 bytes from 1572864 bytes)
4dbe596
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the RAM decreased on ESP8266 which is excellent (if that is not attributed to some other change).
AFAIC this is another magnificent optimisation across the board.
If you can PR this to 0_15 (also replacing
oappendi()
in some usermods) I think we can work together to consolidate those strings.4dbe596
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hesitate to use the term optimization - it's a different tradeoff, a little more (and most likely slower) code for the benefit of using dynamic buffers. Glad you're still interested, I'll polish it up for PR today or tomorrow - I want to do a proper before and after comparison on all the XML responses to ensure they are unchanged. I also tweaked the AsyncStreamResponse for better allocator performance as well as doing early buffer free()s. Finally the MQTT case needs a flat buffer Print class that there's a thousand implementations of, but I haven't found one in one of our existing deps yet - so I added one to my development AsyncWebServer, but I might just inline it in the MQTT code to avoid the dependency change.
The ESP8266 RAM change is entirely from removing some of the short non-F strings. All those 2 byte shared string literals really do add up to a few k of RAM.
The latest tip gets:
nodemcuv2_debug
RAM: [====== ] 57.8% (used 47352 bytes from 81920 bytes) (-68 from aws-queue)
Flash: [========= ] 91.0% (used 950783 bytes from 1044464 bytes) (+120 from aws-queue)
esp32_new_ar
RAM: [== ] 18.0% (used 59096 bytes from 327680 bytes) (-8 from aws-queue)
Flash: [==========] 95.3% (used 1499265 bytes from 1572864 bytes) (-416 from aws-queue)
I did make some choices to improve the sizes at the cost of code readability, though. :(