Upgrade odin memory to match ysera? #1105

Closed
Firefishy opened this issue Jun 25, 2024 · 17 comments
Labels
hardware location:amsterdam Equinix AM6 data centre service:tiles The raster map on tile.openstreetmap.org

Comments

@Firefishy
Member

ysera and odin are twins.

We recently upgraded ysera from 256GB to 384GB of RAM.

Do we also want to upgrade odin from 256GB to 384GB of RAM?

Cost: £260 inc
Remote hands is an extra fee, else upgrade can be timed with the next site visit.

@Firefishy Firefishy added hardware location:amsterdam Equinix AM6 data centre labels Jun 25, 2024
@Firefishy
Member Author

Firefishy commented Jun 25, 2024

@pnorman
Collaborator

pnorman commented Jun 27, 2024

When both machines are at 100%, ysera consistently generates 5% more tiles. This is under 1 metatile per second.

A "new" odin or ysera would cost 2200 GBP, so the upgrade is 12% of the cost of an identical server for a 5% performance gain.

We wouldn't buy another odin/ysera, we'd buy something newer, and the cost effectiveness of the upgrade is even worse in comparison to that.

I don't think we should do this.
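The arithmetic behind that comparison can be sketched as follows. This is only an illustration using the figures quoted in this thread; the variable names are made up here, and the "relative value" ratio is one possible way to express the argument, not a metric anyone in the thread used.

```python
# Figures quoted in this thread.
upgrade_cost_gbp = 260    # RAM upgrade for odin
server_cost_gbp = 2200    # a "new" odin or ysera
throughput_gain = 0.05    # ysera renders about 5% more tiles at full load

# Share of a whole server's price that the upgrade costs.
cost_fraction = upgrade_cost_gbp / server_cost_gbp

# Gain bought per unit of cost, where buying an identical server scores 1.0
# (100% more capacity for 100% of the price). This ratio is an illustrative
# assumption, not something stated in the thread.
relative_value = throughput_gain / cost_fraction

print(f"cost fraction:  {cost_fraction:.1%}")
print(f"relative value: {relative_value:.2f}")
```

A relative value below 1.0 means the upgrade buys less throughput per pound than a second identical server would.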

@mboeringa

@Firefishy

Have you ever considered upgrading the CPUs instead of RAM?

Although @pnorman may be right that upgrading the RAM on machines that already have 256GB is not worth the money in terms of performance gains, upgrading the CPUs might be.

I noticed many of the current OpenStreetMap servers use mid-range CPUs. That is perfectly understandable when buying a brand new system, as the CPU choice can be a very significant part of the cost. It also means there is considerable room for performance improvements once more powerful high-end refurbished CPUs of the same generation hit the market, often at a fraction of the original new price.

E.g. the current Supermicro systems run dual Intel Xeon Gold 5120 105W TDP CPUs. According to PassMark benchmarks, those have a 29566-MT and 1730-ST score in dual CPU configuration, which is well below the top of their generation.

A nice replacement for those CPUs would be the Intel Xeon Gold 6148. It is not from the very latest series, but it uses the same FCLGA3647 socket and, at 150W TDP, should be supported by the system according to the Supermicro docs.

These processors score 46668-MT and 2110-ST PassMark results in dual CPU configuration, which is about 55-60% higher in multi-threaded performance than the current processors. So essentially, replacing all four 5120 processors in odin and ysera with 6148 CPUs would potentially result in a performance equal to three of the current servers, with added bonus of about 20% higher single threaded performance as well.
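A quick check of those ratios, using only the PassMark scores quoted above. This sketch assumes tile rendering throughput scales linearly with the multi-threaded benchmark score, which is an assumption and may not hold in practice.

```python
# PassMark scores quoted above for dual-CPU configurations.
old_mt, old_st = 29566, 1730   # 2x Xeon Gold 5120
new_mt, new_st = 46668, 2110   # 2x Xeon Gold 6148

mt_gain = new_mt / old_mt - 1  # multi-threaded improvement
st_gain = new_st / old_st - 1  # single-threaded improvement

# Two upgraded servers expressed in units of one current server,
# assuming throughput is proportional to the multi-threaded score.
equivalent_servers = 2 * new_mt / old_mt

print(f"multi-threaded gain:  {mt_gain:.0%}")
print(f"single-threaded gain: {st_gain:.0%}")
print(f"two upgraded servers ~ {equivalent_servers:.1f} current servers")
```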

These processors can now be had for as little as about 200-225 euros including VAT, e.g. see these sites:
https://www.omniserver.nl/producten/onderdelen/processors/xeon-gold-processors/xeon-gold-6148-processor/
https://www.servershop24.de/en/intel-xeon-gold-6148-cpu/a-123546/
I am sure if you do more searching, maybe even better deals can be had, but even these prices are only 1/10th or less of original retail price, which is a major deal.

So for about 800 euros, you could extend the life of these two servers for another few years with a clear uptick in performance.

Replacing CPUs is entirely doable. I replaced the dual Xeon E5-2680 v4 CPUs in my HP Z840 workstation with dual E5-2699 v4 processors, and got a similar uptick of about 45% in multi-threaded performance, closely matching the PassMark results.

Yes, it is not as quick and easy as popping a memory module in and out of a slot. But reading the available maintenance documentation and carefully following the torque instructions for screwing down the heat sink on the processors should allow a successful replacement without too much hassle. From what I read, only the very latest behemoth processors and sockets seem to be quite finicky and fragile, and may need special torque-adjustable screwdrivers to safely replace a CPU, but not these older processors.

@mboeringa

mboeringa commented Jul 14, 2024

I now noticed that piasa is already using that suggested Intel Xeon Gold 6148 processor in an HPE Proliant system in a similar role as tile server, so it would be nice to compare throughput on piasa when under max load with the odin and ysera systems to see if the figures of real world performance match up to expectations.

@Firefishy
Member Author

@mboeringa Yes, I did put some time into looking at potential CPU upgrades. I had primarily been looking at Cascade Lake, as these are the best the system supports. I wasn't able to find any great value Xeon Cascade Lake CPUs on the reseller market. The best I could easily find is the Xeon Gold 6230.

In pure render speed there doesn't seem to be a huge improvement between a Xeon Gold 5120 and a Xeon Gold 6148. The realistic difference is maybe only a 20% to 25% improvement. The Xeon 6148 does indeed look like very good value.

I've a large collection of different high-end thermal pastes and high-end thermal pads. Despite this I am not a huge fan of server CPU upgrades. It is a risk prone exercise. Our Supermicro systems are built by Supermicro in the Netherlands and pre-certified by them.

@mboeringa

> It is a risk prone exercise.

Is this anecdotal or your own experience? Have you ever had real failure of a system after an upgrade?

My track record of doing these upgrades is admittedly non-existent, as I only did this twice (swapped out a Core i5 for i7, and these Xeons in the HP Z840), but I honestly and generally found it easier than expected if done with some due care. The main challenge is spreading the thermal paste properly, so as not to create heat spots on the processor where it cannot properly dissipate its heat, as you undoubtedly know.

Since the upgrade some 1.5 years ago though, I have thrashed my HP Z840 for months on end at max CPU load without any noticeable (thermal) issues. The system has been rock solid ever since.

Yes, the Cascade Lake processors do not yet seem to have come down enough in price to be worth it, while the 6148 is nearly the same speed for a fraction of the price.

Of course, if the real world experience still doesn't add up, it may need some further deliberation. Any other potential bottlenecks that may cause it, or some configuration needing a change to push the processors to their true potential?

@pnorman
Collaborator

pnorman commented Jul 15, 2024

I wouldn't want remote hands to do this, so it would have to be Grant on one of his site visits. Time always feels limited on those.

@pnorman
Collaborator

pnorman commented Jul 25, 2024

We have decided to upgrade, if possible. At the same time we will upgrade the RAM to make the machines the same.

The Supermicro chassis supports up to 165W TDP except for 6144, 6146, 6244, 6246, 6250 and 6256. The motherboard itself supports 205W CPUs and up to 28 cores. The manual says it supports Dual Intel Xeon 81xx/61xx/51xx/41xx/31xx series or 82xx/62xx/52xx/42xx/32xx series processors.

The cheap processors are

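| model | cores | speed (GHz) | TDP | cost (EUR) |
|-------|-------|-------------|------|------------|
| 8173M | 28 | 2.0 | 165W | 245 |
| 8160 | 24 | 2.1 | 150W | 304 |
| 6152 | 22 | 2.1 | 140W | 186 |
| 6148 | 20 | 2.4 | 150W | 161 |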

8173M is about 10% faster than 8160
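As a rough illustration, the chassis and motherboard limits stated above can be applied to the candidate list like this. The limits and candidate figures come from this comment; the `Cpu` class and the `fits` and `tdp_headroom_w` helpers are hypothetical names for this sketch, and the check ignores anything not mentioned in the thread (BIOS version, heat sink variant, airflow).

```python
from dataclasses import dataclass

# Limits as stated in this thread.
CHASSIS_MAX_TDP_W = 165
CHASSIS_EXCLUDED = {"6144", "6146", "6244", "6246", "6250", "6256"}
BOARD_MAX_CORES = 28


@dataclass(frozen=True)
class Cpu:
    model: str
    cores: int
    tdp_w: int
    cost_eur: int


# Candidates from the table above.
CANDIDATES = [
    Cpu("8173M", 28, 165, 245),
    Cpu("8160", 24, 150, 304),
    Cpu("6152", 22, 140, 186),
    Cpu("6148", 20, 150, 161),
]


def fits(cpu):
    """True if the CPU is within the stated chassis and motherboard limits."""
    return (
        cpu.tdp_w <= CHASSIS_MAX_TDP_W
        and cpu.cores <= BOARD_MAX_CORES
        and cpu.model not in CHASSIS_EXCLUDED
    )


def tdp_headroom_w(cpu):
    """Watts left below the chassis TDP limit."""
    return CHASSIS_MAX_TDP_W - cpu.tdp_w


for cpu in CANDIDATES:
    print(cpu.model, "fits" if fits(cpu) else "does not fit",
          f"headroom {tdp_headroom_w(cpu)}W")
```

All four candidates pass these stated limits; they differ in how much TDP headroom they leave.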

During daytime peak the CPU pressure is 4% and IO pressure is < 1% so we're not currently IO limited. We limit parallelism to slightly under the capacity of the machine so it remains responsive, thus the CPU pressure would be much higher if we didn't limit.
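Assuming the pressure figures here come from Linux pressure stall information (PSI), which the comment does not state explicitly, they could be read with a sketch like the one below. PSI files live under `/proc/pressure/` on kernels with PSI enabled; the `parse_psi` and `read_psi` helpers and the sample text are made up for illustration.

```python
from pathlib import Path


def parse_psi(text):
    """Parse PSI text into {"some": {"avg10": ..., ...}, "full": {...}}."""
    parsed = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        parsed[kind] = {
            key: float(value)
            for key, value in (field.split("=", 1) for field in fields)
        }
    return parsed


def read_psi(resource):
    """Read /proc/pressure/<resource>; resource is "cpu", "io" or "memory"."""
    return parse_psi(Path("/proc/pressure", resource).read_text())


# Made-up sample in the PSI file format, not measured data from odin or ysera.
SAMPLE_CPU = (
    "some avg10=4.00 avg60=3.90 avg300=3.80 total=123456789\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
)

cpu = parse_psi(SAMPLE_CPU)
print(f"CPU pressure (some, 10s average): {cpu['some']['avg10']}%")
```

On a live machine, `read_psi("cpu")` and `read_psi("io")` would return the current values in the same structure.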

@Firefishy
Member Author

Firefishy commented Jul 25, 2024

The servers have a SuperServer 1029P-WTRT chassis, which says it ships with 2x SNK-P0067PSMB "1U Passive High Performance Front CPU Heat Sink"

SNK-P0067PSMB specs say "Up to 165 Watts"

@Firefishy
Member Author

I have found online resellers selling systems with 2x Xeon 8173M using the X11DDW-NT motherboard, so they are likely compatible.

@Firefishy
Member Author

Firefishy commented Jul 25, 2024

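| model | cores | speed (GHz) | TDP | cost (EUR) | CPU Mark (all core) |
|-------|-------|-------------|------|------------|---------------------|
| 6248 | 20 | 2.5 | 150W | 489 | 29504 (fastest) |
| 8173M | 28 | 2.0 | 165W | 245 | 28025 |
| 8160 | 24 | 2.1 | 150W | 304 | 28825 |
| 6152 | 22 | 2.1 | 140W | 186 | 25441 |
| 6148 | 20 | 2.4 | 150W | 161 | 28999 |
| 5120 | 14 | 2.2 | 105W | existing | 17686 |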
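One way to read the table above is benchmark score per euro and gain over the existing 5120. The sketch below uses only the figures from the table; "points per EUR" is an illustrative metric chosen here, and benchmark gains may well overstate real render-speed gains.

```python
BASELINE_SCORE = 17686  # Xeon Gold 5120, the existing CPU

# model: (cost in EUR, CPU Mark all-core score), copied from the table above.
CANDIDATES = {
    "6248": (489, 29504),
    "8173M": (245, 28025),
    "8160": (304, 28825),
    "6152": (186, 25441),
    "6148": (161, 28999),
}


def gain_over_baseline(score):
    """Fractional benchmark improvement over the existing 5120."""
    return score / BASELINE_SCORE - 1


def points_per_eur(cost, score):
    """Benchmark points bought per euro spent."""
    return score / cost


# Rank candidates from best to worst value.
ranked = sorted(
    CANDIDATES,
    key=lambda model: points_per_eur(*CANDIDATES[model]),
    reverse=True,
)

for model in ranked:
    cost, score = CANDIDATES[model]
    print(f"{model:>6}: {gain_over_baseline(score):+.0%} vs 5120, "
          f"{points_per_eur(cost, score):.0f} points/EUR")
```

By this metric the 6148 ranks first and the 6248 last.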

@mboeringa

mboeringa commented Jul 27, 2024

The 6148 still seems the best value for the money in this comparison: high MT performance, high base frequency, and low cost. An added benefit is the slightly lower TDP, which stays below the supported TDP limit of the systems and leaves some headroom.

@pnorman pnorman added the service:tiles The raster map on tile.openstreetmap.org label Jul 28, 2024
@Firefishy
Member Author

I have ordered the matching memory.

I am also going to go ahead with the purchase of the 4x Xeon 6148. Thank you @mboeringa for the recommendation!

@Firefishy
Member Author

Order of 4x Xeon 6148 CPUs confirmed. Bargain price of £425 total.

@Firefishy
Member Author

Orders arrived. CPU and RAM now in stock in Catford.

@Firefishy Firefishy added this to the 2024 AM6 Visit milestone Sep 15, 2024
@Firefishy
Member Author

Memory upgraded and confirmed working. CPU upgrade pending.

@Firefishy
Member Author

CPU upgraded but running very, very hot. The thermal paste takes a while to settle in and there is a very bad airflow pattern with the switches. We can create new tickets if needed.

3 participants