LinuxQuestions.org
Old 12-03-2020, 05:31 AM   #1
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 17,620

Rep: Reputation: 2618
CPU Turbo Boost - when?


There are really 2 questions, and I'm interested in the smaller end of CPUs, i.e. 2/4/8 & just maybe 16 cores. Ampere Mt Snow servers come with an 80 core Arm CPU, but I'm not in that territory, or Threadrippers either.

1. When does boost turn on? When 1 core is busy? Does it need more than one thread under load? And how is 'under load' defined? I am presuming 'boost mode' keeps a watchful eye on the thermals.

2. When does boost turn off? I presume overheating and load are factors. Interestingly, I came across Ampere Computing's specs for its machine, which quoted only its boost speed as a cpu frequency, because it had the cooling to guarantee thermal issues would not drive you out of turbo. But how fine things are cut would be interesting.

My sucky cpu has no boost, but it did show me the perfect way to get boost to cut out
Code:
mksquashfs <some backup dir> <output image>
On my 2 core/4 thread i3, top showed ~380%, whereas 'make -j <plenty>' and most things showed only ≤200%. Making a squashfs also played havoc with the cpu thermals.

It's actually an important question. Apple's M1 boasts 8 cores, 2.5GHz with boost to 4.5GHz, and offers 50% faster compile times than previous MacBook Pros. My son has a previous MacBook Pro from his job, and that runs on an i7-9750H, 6 cores/12 threads, 2.6GHz rising to 4.5GHz on boost, and that's only 50% as fast. They're both 45W power dissipation. So I guess doing 4.5GHz is a lot more power hungry on the i7 than the M1, and the M1 can nearly live in boost, while the i7 can only make short infrequent visits.

So I conclude from the above, that improving cpu cooling gets you more boost, to the point where you can live there if you have the load?
 
Old 12-04-2020, 01:42 AM   #2
obobskivich
Member
 
Registered: Jun 2020
Posts: 614

Rep: Reputation: Disabled
Quote:
Originally Posted by business_kid View Post
There are really 2 questions, and I'm interested in the smaller end of CPUs, i.e. 2/4/8 & just maybe 16 cores. Ampere Mt Snow servers come with an 80 core Arm CPU, but I'm not in that territory, or Threadrippers either.

1. When does boost turn on? When 1 core is busy? Does it need more than one thread under load? And how is 'under load' defined? I am presuming 'boost mode' keeps a watchful eye on the thermals.
Depends on the CPU and how it's set up vs load. Sorry for the vague answer but that's really how it is with modern chips. Going back like 15 years it was basically opportunistic across 'all' cores until it hit a thermal limit, but modern Intel and AMD chips can set clockspeed per-core and will opportunistically boost to whatever the absolute maximum is (which is pre-defined and presumably based on some understanding of reliability) vs their multi-thread utilization. Intel calls these "Turbo Bins" (AMD does the same thing but has different language) - so for '1 core' there are more available bins (e.g. higher clock multiplier) than for 8 or 16 or whatever cores. You'd have to look this up per-CPU to have a clear answer.

The 'boost' is sustained until it hits thermal limits, yes, and also has time limits (at least on Intel) where it has a specific power limit time, however most retail motherboards (e.g. non-Dell/HP/etc OEM stuff and not mobile) will set those limits to 999 or infinite so thermals become the only limiting factor.

This article talks about the whole process and all the various limits in context of Intel's newest chips (Comet Lake): https://www.anandtech.com/show/15785...ke-we-go-again (scroll down to "Getting Complicated with Turbo").

For a 'simpler' example, the older AMD 15h chips will engage 'Turbo' for half-loads - so if a single or dual threaded application wants 100% load, it can send half of the chip up to >max advertised clock until thermal limits. So for example the FX-9590 8-core, which has a maximum clock of 4.7GHz, can go up to 5.0GHz on 4 threads (or less) until/unless it hits thermal limits, and then it will go back down (first) to 4.7, and lower (if needed to cool it). If it is running all 8 threads it will only go up to 4.7.
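To make the 'bins' idea a bit more concrete, here's a rough Python sketch of that lookup using only the FX-9590 figures above. The table and the function name are purely illustrative (nothing AMD publishes in this form), and it ignores thermal throttling entirely.

```python
# Illustrative turbo-bin table built from the FX-9590 numbers in this post:
# (max busy threads, clock ceiling in GHz). Not vendor data.
FX9590_BINS = [(4, 5.0), (8, 4.7)]

def boost_ceiling_ghz(busy_threads, bins=FX9590_BINS):
    """Return the highest clock the chip may try for a given number of busy threads."""
    if busy_threads < 1:
        raise ValueError("need at least one busy thread")
    for max_threads, ghz in bins:
        if busy_threads <= max_threads:
            return ghz
    raise ValueError("more busy threads than the chip has")

for n in (1, 4, 5, 8):
    print(n, "threads ->", boost_ceiling_ghz(n), "GHz")
```

A real chip would then pull the result down further whenever temperature or power limits say so, which is the part this sketch leaves out.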

Quote:
2. When does boost turn off? I presume overheating and load are factors. Interestingly, I came across Ampere Computing's specs for its machine, which quoted only its boost speed as a cpu frequency, because it had the cooling to guarantee thermal issues would not drive you out of turbo. But how fine things are cut would be interesting.
Modern CPU makers just generally quote maximum theoretical boost as the 'nominal clock' - it's all megapixel marketing.

Boost turns off when the load is insufficient to warrant it - in other words if the CPU is idle or not doing very much. Note that this adjusts multiple times per second so it can be 'boosting' and returning to 'idle' without necessarily being observed, depending on what the workload is doing. If I remember right the host system has to support ACPI to be 'aware' of this as well, in other words if you took a very old kernel or a very old version of Windows the chip will usually just switch between 'idle' and 'max nominal' (non-boost) clocks, or just sit at 'max nominal' because there's no software flags telling it to do anything else (at least, this is what I've observed playing around with Windows 98/2000/XP on newer CPUs).
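If you want to watch this on Linux, a minimal sketch along these lines polls the per-core clocks the kernel exposes through the cpufreq sysfs files. It assumes a cpufreq driver is loaded and that `scaling_cur_freq` exists for each core; the function names are my own, and at a 0.2 s interval it will still miss boosts shorter than that.

```python
import glob
import os
import time

def read_core_mhz(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: MHz} from each core's cpufreq/scaling_cur_freq (the file is in kHz)."""
    freqs = {}
    pattern = os.path.join(cpu_root, "cpu[0-9]*", "cpufreq", "scaling_cur_freq")
    for path in glob.glob(pattern):
        cpu = path.split(os.sep)[-3]  # .../cpuN/cpufreq/scaling_cur_freq
        with open(path) as f:
            freqs[cpu] = int(f.read().strip()) / 1000.0
    return freqs

def sample(count=5, interval_s=0.2, cpu_root="/sys/devices/system/cpu"):
    """Take a few snapshots; boosts that start and end between samples are not seen."""
    snapshots = []
    for _ in range(count):
        snapshots.append(read_core_mhz(cpu_root))
        time.sleep(interval_s)
    return snapshots

if __name__ == "__main__":
    for snap in sample():
        print(" ".join(f"{cpu}:{mhz:.0f}" for cpu, mhz in sorted(snap.items())))
```

Run it once idle and once with a single busy thread and you should see only some cores climb, if the chip does per-core clocks.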

Quote:

It's actually an important question. Apple's M1 boasts 8 cores, 2.5GHz with boost to 4.5GHz, and offers 50% faster compile times than previous MacBook Pros. My son has a previous MacBook Pro from his job, and that runs on an i7-9750H, 6 cores/12 threads, 2.6GHz rising to 4.5GHz on boost, and that's only 50% as fast. They're both 45W power dissipation. So I guess doing 4.5GHz is a lot more power hungry on the i7 than the M1, and the M1 can nearly live in boost, while the i7 can only make short infrequent visits.
A lot of Apple's claims on M1 were basically naked numbers, so do with them what you will (i.e. compiling what, with what conditions, etc). As clocks go up and heat goes up, leakage goes up too, and power (current) draw will go off the charts quickly. Look at the 10900 series - there's a non-K SKU that sits inside of a 65W TDP, but the K SKU will draw >240W for some (relatively) marginal increase in clockspeed (as in, it isn't an order of magnitude or anything like that). Whether or not M1 can 'live at boost' across all cores is another question - Apple is the absolute worst at both cooling their hardware and giving straight #s about their hardware, and usually advertises absolute maximum theoretical boost as if it were 'all cores under nominal load' - generally speaking those MBPs with i7 and i9 chips are never seeing 'max boost' because they're throttled so hard due to thermals, and the M1 probably will suffer the same fate, even if it is more efficient per-watt, because that'll just be an invitation to Apple to remove even more cooling capacity. I would not consider that marketing blurb usable (at all) for a 1:1 IPC comparison - that Intel chip is probably living sub-2GHz under ACA loads, and from what I've read the M1 only advertises a maximum boost of 3.2GHz (https://www.anandtech.com/show/16252...pple-m1-tested - and they say that's only for single-threaded workloads).


Quote:
So I conclude from the above, that improving cpu cooling gets you more boost, to the point where you can live there if you have the load?
As long as the power limits/time constraints are also respected, in theory yes this is possible (so on laptops/SFFs/etc the answer will be 'no' because they will enforce Tau and PL2 and whatnot). A lot of overclocked systems will try to do this, and if they're cooled effectively, will achieve it. Bear in mind that 'ACA' (All Core Active) will be 'lower' than single-core boosts - but modern marketing wants to advertise single core as 'the whole chip' (when usually it really is only one core, not even the entire package, coming up to N speed). In other words, taking your 10900k to 5.3GHz on one core is perfectly normal - TVB will do that out of the box if the cooling is there. Taking your 10900k to 5.3GHz ACA and sustaining it under heavy workloads/stability tests is another matter entirely. Same is true for the FX-9590 example from above as well. Also note that on newer CPUs, Intel will no longer publish the turbo bins because it makes their marketing look bad (especially on their >8 core chips), where the 'advertised maximum clock' is basically fantasy land #s at this point.

With ARM I'm sure this gets even wilder because of big.LITTLE letting them play with the #s and move the goalposts even more.


If you want to see a real-world example, look at this review of a Supermicro board that takes the i9-9900k, but actually tries to abide by Intel's TDP/PL2/Tau specs out of the box (which is what OEM systems, laptops, etc will generally do) as well: https://www.youtube.com/watch?v=wJ6GDHvab8E (note that around 16:00 he summarizes this, and says the 'overall performance is not good' but then notes 'many tasks' run just fine (as in should be equivalent to running with all the limits off) - so 'always being at max turbo' doesn't always mean 'more performance')

Here's an analysis of the throttling that the MBP will go through with those 'big' mobile chips (6/8-core Intels): https://www.youtube.com/watch?v=NW8VMpl3XO4 (I laughed at 'temps at as low as 91* C' - sure that works 'right now' but what's long-term reliability like? (I mean, it *is* an Apple product, so it's ultimately designed to be thrown away every 9-18 months after all...)).
 
Old 12-04-2020, 05:15 AM   #3
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 17,620

Original Poster
Rep: Reputation: 2618
Thanks for the detailed answer, obobskivich. That's what LQ is great for: finding experts.

You're throwing that term 'Tau' at me. I know it's a letter in some weirdo alphabet (Greek?) but what's Tau?

Summarising what you say:
  • There are no hard and fast rules; it's nearly a set of rules per model for each manufacturer.
  • Single core turbo is possible on most systems.
  • Thermal issues or lack of workload turn boost off.
  • Board manufacturers have some control but usually leave boost on hard.
  • Arm can confuse the already clouded picture with their big.LITTLE builds.
  • All cores might not get boost because it's thermal suicide.
  • Boost isn't always full speed boost, especially with dodgy thermals.
 
Old 12-04-2020, 07:38 AM   #4
obobskivich
Member
 
Registered: Jun 2020
Posts: 614

Rep: Reputation: Disabled
Quote:
Originally Posted by business_kid View Post
Thanks for the detailed answer, obobskivich. That's what LQ is great for: finding experts.

You're throwing that term 'Tau' at me. I know it's a letter in some weirdo alphabet (Greek?) but what's Tau?
It's an 'Intel thing' - they use it in their literature for 'Turbo Boost.' Here's a table that shows examples for some newer processors, including Tau: https://www.kitguru.net/components/c...op-processors/

Basically it's a time variable for how long the chip can/should sustain PL2 boost - it's basically 'funny math' that lets them advertise 125W TDP on a 250W part by only counting the 250W component for some fraction of a larger time horizon that averages down to 125W. On a lot of gamer/overclocker motherboards these limits are either ignored or massively increased, which allows the chip to boost a lot longer/indefinitely until/unless thermal throttle occurs (or the system locks up/crashes due to VRM instability or somesuch). If I remember right it also has some meaning in context of the newer 'Thermal Velocity Boost' but I don't know exactly how it's used there since I admittedly haven't bothered to read much in-depth on TVB (because it looks to only apply to a handful of fairly expensive chips).

As far as I know, AMD does not have an equivalent to Tau on their implementation of Turbo, and is more current/thermal limited (on 15h I know that's the case, on Ryzen I know they also have the 'most favored core' thing where it will pick a specific core on the die that boosts higher in low/single threaded tasks under XFR (another name for their 'Turbo')).
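A hedged way to picture the 'funny math': as I understand it, the budget is tracked as an exponentially weighted running average of package power with time constant Tau, and PL2 is allowed until that average reaches PL1. The Python sketch below solves that simplified model for how long PL2 can last. The 125W/250W figures are the ones above; Tau = 56 s and the starting averages are assumed example values, and the real firmware logic is certainly more involved.

```python
import math

def seconds_at_pl2(pl1_w, pl2_w, tau_s, start_avg_w=0.0):
    """Time until an exponentially weighted average of power, starting at
    start_avg_w and fed a constant pl2_w, climbs to pl1_w."""
    if not (start_avg_w < pl1_w < pl2_w):
        raise ValueError("expected start_avg_w < pl1_w < pl2_w")
    # avg(t) = pl2 + (start - pl2) * exp(-t / tau); solve avg(t) = pl1 for t
    return tau_s * math.log((pl2_w - start_avg_w) / (pl2_w - pl1_w))

# 125 W / 250 W come from the post; tau = 56 s is an assumed example value
for start in (0.0, 50.0, 100.0):
    t = seconds_at_pl2(125.0, 250.0, 56.0, start)
    print(f"starting from {start:5.1f} W average: about {t:.0f} s at PL2")
```

The point of the model is just that a chip which was already busy gets far less time at PL2 than one coming from idle, and that raising Tau or PL1 in the BIOS stretches the window.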

Quote:
Summarising what you say:
  • There are no hard and fast rules; it's nearly a set of rules per model for each manufacturer.
  • Single core turbo is possible on most systems.
  • Thermal issues or lack of workload turn boost off.
One thing I forgot to add - on AMD 15h processors it's per-module not per-core, so it will look like 'two cores' are boosting even in single threaded loads (and I'd rather not have the 'core count' debate on 15h haha). If I remember right 2nd gen K10 also behaves something like this - the older K10, and K8 processors, along with all Intel chips up until (supposedly) Broadwell, will clock the whole chip up/down - so for example a Core 2 Quad Q9550 (yes I know it's old) has a 'boost' clock of 2.83GHz, but 'idle' clock of 2.0GHz. It has no ability to set 1 core at 2.83 and leave the other 3 at 2.0 (so it's all or nothing to 2.83GHz), while a newer AMD chip can (and will) do exactly that (e.g. a Ryzen or an FX can set one core at say 2.8GHz and leave the rest idle, if the workload calls for it), and the very newest Intel chips are supposed to do that as well (but fwiw my Broadwell-H does *not* behave this way *shrug* maybe Coffee/Comet Lake does?).

Quote:
  • Board manufacturers have some control but usually leave boost on hard.
Be a bit cautious with this - what I mean here are like Asus ROG or ASRock Extreme kind of boards, stuff that gamers or overclockers are buying, but if you're looking at a Dell or HP build it will probably follow the spec-sheet TDP and PL values to the letter, and boost will be very constrained even if cooling is sufficient, because it doesn't want to exceed the advertised TDP values. Some of this is changeable in XTU (which I don't remember if it is available for Linux - I know it's available for macOS and Windows) or BIOS settings, depending on the board.
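On Linux, one place to at least see what limits a board has programmed is the kernel's powercap (RAPL) sysfs tree, on Intel systems where the intel_rapl driver is loaded. The sketch below reads each constraint's name, power limit and time window for one zone. As far as I know the 'long_term' and 'short_term' constraints correspond roughly to PL1/Tau and PL2, but treat that mapping as my assumption; the function name and the default zone path are just what I'd expect on a typical single-socket box.

```python
import os

def read_rapl_constraints(zone="/sys/class/powercap/intel-rapl:0"):
    """Return a list of (name, watts, window_seconds) for each constraint in a RAPL zone.
    watts or window_seconds is None if that file is missing or unreadable."""
    def read_text(fname):
        try:
            with open(os.path.join(zone, fname)) as f:
                return f.read().strip()
        except OSError:
            return None

    def micro_to_unit(text):
        return int(text) / 1e6 if text is not None else None

    constraints = []
    idx = 0
    while True:
        name = read_text(f"constraint_{idx}_name")
        if name is None:
            break
        watts = micro_to_unit(read_text(f"constraint_{idx}_power_limit_uw"))
        window_s = micro_to_unit(read_text(f"constraint_{idx}_time_window_us"))
        constraints.append((name, watts, window_s))
        idx += 1
    return constraints

if __name__ == "__main__":
    for name, watts, window_s in read_rapl_constraints():
        print(f"{name}: {watts} W over {window_s} s")
```

On an OEM box I'd expect modest values here, and on an enthusiast board something enormous, which is the 'limits set to 999' behaviour mentioned earlier.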

Quote:
  • Arm can confuse the already clouded picture with their big.LITTLE builds.
I'm guessing at that - I could be wrong. My understanding from reading on Wikipedia about big.LITTLE is that the two 'sets' can be on different clock planes, so I could see taking the little cores and dialing them up to 11 to say 'look it does 5GHz!' while the big cores lag in some way. There's still of course probably some advantage to that kind of configuration (aside from marketing), but it isn't the same as the whole thing running at 5GHz.

Quote:
  • All cores might not get boost because it's thermal suicide.
  • Boost isn't always full speed boost, especially with dodgy thermals.
Yep. Something else I forgot to mention: that 'thermal suicide' may not be limited to the CPU, it may be on the VRMs on the mainboard - this was somewhat of a news item/pain-point when the i9 9900k came out, and later the i9 10900k, which draw significant power (>230W at PL2) and present an issue for a lot of the cheaper motherboards that are supposed to be 'compatible' with them. This was a lesser issue for FX-9370 and FX-9590 on AM3+, but at that time AMD provided a list of certified motherboards (the 'lamest' of which had a doubled 5-phase power stage, so 10 total stages, and handles 200W+ (which is like 140-150A) just fine - some of those newer Intel boards are straight-up 4-6 stage parts with no heatsinking trying to run like 200A+...).

When you get into Threadripper/HEDT this all gets amplified, because with so many more cores you usually have a lot more 'bins' to negotiate, so it may advertise say 4.5 or 4.8GHz 'max clocks' but that's on 1 or 2 threads on a 16/18/24/something-core chip, and 'max threads load' may only be running like 2.somethingGHz (and still drawing 200W+) [I'm thinking of 10980XE for this example if you were curious]

And all of this just so the machine can have slightly lower idle power draw and skimpy little heatsinks can quiet down sometimes...
 
Old 12-04-2020, 09:35 AM   #5
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 17,620

Original Poster
Rep: Reputation: 2618
I've always gone underpower with the cpu on previous builds, but intend not to on my next. I don't intend to replace my next box. I feel it's at the stage where not a lot of significant progress is left. So no need to quote historical heavyweight CPUs at me. I don't know them.

The formula for power dissipation in a capacitor (or capacitive load) is ½CV²F, where C=capacitance, V=voltage, & F=frequency. AMD have this little 'safety triangle' between die temperature, core voltage and cpu frequency and will attempt to stay within sane boundaries. They have sensors as part of the die, which gives them optimal feedback. C is fixed, but V and F are variable. The V² element isn't too hostile, because V≅1, and 1²=1. It's the F that pushes up power for any given cpu. Likewise, the increase in potential speed comes from a reduction in C or capacitance. That's why lithography or fab size is so important, because less of C means more of F at the same power.
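As a quick sanity check on that formula, here is a small Python sketch that plugs in a few voltage/frequency pairs. The lumped capacitance is a made-up figure chosen only so the baseline comes out at a round number; the ratios are what matter, not the absolute watts.

```python
def dynamic_power_w(c_farads, volts, freq_hz):
    """Switching power from the formula above: P = 1/2 * C * V^2 * F."""
    return 0.5 * c_farads * volts ** 2 * freq_hz

# C is a made-up lumped value purely for illustration
C = 20e-9
base = dynamic_power_w(C, 1.0, 3.0e9)
for volts, ghz in ((1.0, 3.0), (1.0, 4.5), (0.9, 3.0), (1.2, 4.5)):
    p = dynamic_power_w(C, volts, ghz * 1e9)
    print(f"{volts:.1f} V @ {ghz:.1f} GHz: {p:6.1f} W ({p / base:.2f}x)")
```

The last row is there because, in practice, higher clocks usually need a voltage bump as well, so the V² term tends to bite harder than V≅1 suggests.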
 
Old 12-04-2020, 02:57 PM   #6
obobskivich
Member
 
Registered: Jun 2020
Posts: 614

Rep: Reputation: Disabled
Quote:
Originally Posted by business_kid View Post
I feel it's at the stage where not a lot of significant progress is left.
Unfortunately you're probably right, if the last few years are any indication...



Quote:
The formula for power dissipation in a capacitor (or capacitive load) is ½CV²F, where C=capacitance, V=voltage, & F=frequency. AMD have this little 'safety triangle' between die temperature, core voltage and cpu frequency and will attempt to stay within sane boundaries. They have sensors as part of the die, which gives them optimal feedback. C is fixed, but V and F are variable. The V² element isn't too hostile, because V≅1, and 1²=1. It's the F that pushes up power for any given cpu. Likewise, the increase in potential speed comes from a reduction in C or capacitance. That's why lithography or fab size is so important, because less of C means more of F at the same power.
I think all else being equal this is right, but transistor count (and density) has increased massively (which also seems to at least weakly correlate with performance) as feature size has gone down, and I think this explains why frequencies really haven't increased significantly over the last 10+ years as a result ("4-ish GHz" has been pretty common for a while - even Pentium 4 chips were getting into that realm - with more exotic chips targeting "5-ish GHz" for at least 5 years now). There's probably some other detail(s) I'm missing here that explains things more succinctly...
 
Old 12-05-2020, 06:16 AM   #7
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 17,620

Original Poster
Rep: Reputation: 2618
Quote:
Originally Posted by obobskivich View Post
Unfortunately you're probably right, if the last few years are any indication...

I think all else being equal this is right, but transistor count (and density) has increased massively (which also seems to at least weakly correlate with performance) as feature size has gone down, and I think this explains why frequencies really haven't increased significantly over the last 10+ years as a result ("4-ish GHz" has been pretty common for a while - even Pentium 4 chips were getting into that realm - with more exotic chips targeting "5-ish GHz" for at least 5 years now). There's probably some other detail(s) I'm missing here that explains things more succinctly...
So we agree. The MOSFET gate is the source of the capacitance. MOSFET gates initially were about 15pF (=15e-12F) but at a tiny fab size, that will be way down. Current through the MOSFETs is minimal. If you get your core voltage below 1.0V, the V² element actually reduces power (e.g. 0.9²=0.81). That's why you always see me waffling about fab size. That's why AMD's core voltage going down to 0.8V is so great, because 0.8²=0.64! I think 5GHz is just about do-able - Intel's 10th-gen i9s are offering burst mode there already, but the power consumption must be crazy. I think it's territory for gamers and servers. But I don't see 6GHz happening. When AMD catch up squeezing every last ounce out of their potential, they should come out ahead. I believe teams are still working to get sub 5nm fab, but I'm not hearing of breakthroughs. It's been 2 years now, since I could see that as a real target. But no news is actually bad news.

Transistor count adds complexity - more complex control of areas like burst control, or more cores. In any core, you want as few pipeline stages as possible because it takes a finite amount of time to turn on or off each device - propagation time is the relevant gobbledygook. But sure, you can parallel devices. A powered on fet acts as a resistor (On resistance) and paralleling devices halves that resistance. Bigger cache allotments are also possible. Also the circuitry about bus sharing when some cores are on different levels of turbo speed, and others are slower must be very complex. As a hardware guy, I can tell you optimizing that would be a nightmare. So transistors will be used.

Interestingly, capacitance is also affected by the distance between capacitor plates, in this case, the gate and fet channel. So as the fab gets smaller, the gate capacitance is increased by the gate/channel proximity, and reduced by their size. So capacitance might not decrease by as much as would be hoped. So the next target would not be 4nm, but 3nm. That's purely because 5nm-->4nm wouldn't be a significant improvement. Of course, below 5nm, they'll take what they can get. But it only takes 1 insulation issue in a zillion transistors to wreck a chip … failure rate must be 99.99% or worse.
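A rough way to see the two opposing effects is the ideal parallel-plate formula C = ε₀·εr·A/d. The Python sketch below shrinks every linear dimension of a hypothetical square gate by the same factor; the dimensions are invented, the 3.9 is roughly the relative permittivity of SiO2, and real gates don't scale this neatly (node names like '5nm' aren't literal dimensions either), so take it only as an illustration of the trend.

```python
EPS0 = 8.854e-12  # vacuum permittivity, F/m

def plate_capacitance_f(area_m2, gap_m, rel_permittivity=3.9):
    """Ideal parallel-plate capacitance C = eps0 * eps_r * A / d (3.9 is roughly SiO2)."""
    return EPS0 * rel_permittivity * area_m2 / gap_m

def shrink(side_m, gap_m, factor):
    """Capacitance of a square plate after scaling side and gap by the same factor."""
    return plate_capacitance_f((side_m * factor) ** 2, gap_m * factor)

before = shrink(100e-9, 2e-9, 1.0)   # hypothetical 100 nm square gate, 2 nm gap
after = shrink(100e-9, 2e-9, 0.5)    # every linear dimension halved
print(f"ratio after/before = {after / before:.2f}")
```

Area falls with the square of the shrink while the gap falls linearly, so in this idealised case capacitance only drops in proportion to the shrink, not its square.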

Last edited by business_kid; 12-05-2020 at 06:21 AM.
 
Old 12-05-2020, 08:02 AM   #8
obobskivich
Member
 
Registered: Jun 2020
Posts: 614

Rep: Reputation: Disabled
Quote:
Originally Posted by business_kid View Post
So we agree. The MOSFET gate is the source of the capacitance. MOSFET gates initially were about 15pF (=15e-12F) but at a tiny fab size, that will be way down. Current through the MOSFETs is minimal. If you get your core voltage below 1.0V, the V² element actually reduces power (e.g. 0.9²=0.81). That's why you always see me waffling about fab size. That's why AMD's core voltage going down to 0.8V is so great, because 0.8²=0.64! I think 5GHz is just about do-able - Intel's 10th-gen i9s are offering burst mode there already, but the power consumption must be crazy. I think it's territory for gamers and servers. But I don't see 6GHz happening. When AMD catch up squeezing every last ounce out of their potential, they should come out ahead. I believe teams are still working to get sub 5nm fab, but I'm not hearing of breakthroughs. It's been 2 years now, since I could see that as a real target. But no news is actually bad news.
The i9s that do 5GHz aren't doing it on all 8 or 10 cores, at least not 'stock' - overclockers are doing that on liquid cooling though. Power draw is something like 250W at the socket. If you dial all that back to 'stock' specs, like the Supermicro board example I provided, it all behaves much more sanely.

AMD doesn't need to 'catch up' to anything in terms of hitting 5GHz - FX-9590 was doing 5GHz in 2013, on 32nm. But that also wasn't on all 8 cores (it's on up to 4), and power draw was easily 160W+ (official spec TDP says 220W but I remember reading some Q/A with an AMD engineer a few years back and he basically admitted that was a very liberal estimate to ensure boardmakers and whatnot were provisioning properly - in practice it's somewhere over 150W and under 200W). The thing is, outside of 'retail shipped' stuff, 5GHz has been 'doable' for a long time - Pentium 4s were overclocking to that (with admittedly pretty exotic cooling) in the early 2000s, and more recent Intel and AMD hardware *is* doing 6GHz+ (I think the standing world record is still right under 9GHz on a 32nm AMD) with overclocking. Again this is all liquid/phase-change/LN2 cooling territory, but it is doable on modern hardware. Intel just released a revised i9 with TEC cooling as a limited edition that's supposed to do something like 5.5GHz with TVB...

I thought both TSMC and Samsung were shipping 5nm (or will be within a year), and TSMC was supposed to be working on a 3nm node? I know GloFo basically threw in the towel on EUV nodes, and Intel has been having on-and-off trouble with their 10nm-and-below nodes (they actually *are* shipping 10nm parts to retail, but only for mobile client devices (11th gen for example)).

Quote:
Transistor count adds complexity - more complex control of areas like burst control, or more cores. In any core, you want as few pipeline stages as possible because it takes a finite amount of time to turn on or off each device - propagation time is the relevant gobbledygook. But sure, you can parallel devices. A powered on fet acts as a resistor (On resistance) and paralleling devices halves that resistance. Bigger cache allotments are also possible. Also the circuitry about bus sharing when some cores are on different levels of turbo speed, and others are slower must be very complex. As a hardware guy, I can tell you optimizing that would be a nightmare. So transistors will be used.

Interestingly, capacitance is also affected by the distance between capacitor plates, in this case, the gate and fet channel. So as the fab gets smaller, the gate capacitance is increased by the gate/channel proximity, and reduced by their size. So capacitance might not decrease by as much as would be hoped. So the next target would not be 4nm, but 3nm. That's purely because 5nm-->4nm wouldn't be a significant improvement. Of course, below 5nm, they'll take what they can get. But it only takes 1 insulation issue in a zillion transistors to wreck a chip … failure rate must be 99.99% or worse.
I guess the part I just don't 'get' is: I've seen the leaked per-wafer prices for TSMC's 7nm and 5nm stuff, and I have an idea of how real-world 7nm GPUs and CPUs are stacking up against 14nm or 28nm or 32nm 'remnants' and frankly don't see the business sense in pushing 'further' here. The prices are just sky-high and the yields are low (and all that gets passed on to the consumer) and they aren't really getting big performance uplifts out of it - for mobile/embedded devices I get it, because that's battery life and so forth - Intel's strategy to stand on 14nm doesn't look so insane I guess is what I'm saying, and I wouldn't be surprised if in a few years they end up having the last laugh. Sure it means 'the end' of huge superscalar performance uplift but wasn't the original point of multi-core x86 (way back in 2005) supposed to negate that brick-wall by going sideways? And yet here we are again 12-15 years later...
 
Old 12-05-2020, 12:41 PM   #9
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 17,620

Original Poster
Rep: Reputation: 2618
I would see it that 5-7nm is enough, and I'm working on that basis. I trained as a hardware engineer, and imho Intel is wrong to stick on 14nm, unless they're getting great yields from it, which makes everything cheap.

From that formula Watts=½CV²F, and the fab size you can get a ballpark wattage figure for things. Once you have a believable watts spec for something similar, you can ignore the quiescent (gobbledygook = always present) current. Fab size changes the C element (probably non-linearly), and F changes the frequency. NicoD did reviews of an AWS inhouse server, and the Ampere computing server. The AWS spec he quoted was 32 A-76 'Neoverse' cores @ 2.5GHz and TDP of 1W per core @ 2.5GHz. That's a spec I can leverage. That's ≅32 Watts + a little. Ampere computing specifies their 32 core Arm server @ 3.3GHz, as they stay cool enough to remain in turbo permanently. But the power calculation would be 32W/2.5*3.3 = 42.24 Watts, which is trivial beside the X86 behemoths.

It's interesting also that Ampere and AWS chose 32core units. Ok, AWS used the base frequency, and Ampere chose the turbo speed, but who cares? I smell a hardware part being manufactured that plugs into an X86_64 type motherboard complete with all the usual suspects. The only issue is the BIOS. Ampere is hawking the thing as a server, and AWS is pimping them as servers. No wonder they can pimp them at $1.50 per hour. BTW, Ampere has moved on to an 80 core version. Even at 2W per core×80 = 160W, it would still be manageable by very sane methods.

So there's probably off the shelf components somewhere, although as always in Electronics, the MOQ may be frightening.

Last edited by business_kid; 12-05-2020 at 02:04 PM.
 
Old 12-05-2020, 01:48 PM   #10
obobskivich
Member
 
Registered: Jun 2020
Posts: 614

Rep: Reputation: Disabled
Quote:
Originally Posted by business_kid View Post
I would see it that 5-7nm is enough, and I'm working on that basis. I trained as a hardware engineer, and imho Intel is wrong to stick on 14nm, unless they're getting great yields from it, which makes everything cheap.
From what I've seen in the press, they're sticking with 14nm for client and server parts currently due to yields yes - the 10nm stuff has been going into mobile (e.g. the new Macbook Air) though. Price-wise I don't know that their strategy is really so bad - going by tray prices a top-end current i9 is what? $500 or so? And they basically worked against inflation for the last decade, and probably half of that was on 14nm (they're something like 5-6 generations into 14nm at this point). AMD CPU prices have gone up something like 5-10x in the same period, as they've moved from 32nm to 7nm (and performance, while higher, is not 5-10x). *shrug*

The 'big ARM' stuff is certainly cool, but like you say, what's the MOQ look like (and at what price/unit) and who is going to be brave enough to go 'first' - this isn't the first time some non-x86-based platform can beat up on x86 in perf/watt/core-count but historically none of those other contenders ever make it out of enterprise-oriented gear.
 
Old 12-05-2020, 02:52 PM   #11
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 17,620

Original Poster
Rep: Reputation: 2618
No, they never make it out of enterprise, because 90% of desktops use Windows, bugs and all.

Anyhow, we have another firm spec to throw into the fray. Up to 2018, Ampere were on 16nm lithography, and their 32 core unit was specc'ed at 125W ≅ 3.9W per core @ 3.0GHz, with turbo to 3.3GHz.

In 2019 they and AWS moved to 7nm fab. There is a product out there for OEMs. When Ampere upgraded themselves, they gifted a (presumably obsolete stock) 32 core server to Debian. Using the AWS Graviton figure of 1W per core @ 2.5GHz with 7nm lithography, Ampere would be doing 1.2W per core @ 3.0GHz, or 1.32W @ 3.3GHz. So those figures make Ampere's 80 core Mount Snow server look appealing. 80 cores @ 1.32W per core = 106W.

That difference (3.9W down to 1.2W) was achieved by 16nm going down to 7nm, so 3nm would be great, if they could get it. I don't think they will. Looking at it, I can see a year or two more tweaking things before the X86 designs run out of road.
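For anyone who wants to check the arithmetic, here it is as a few lines of Python. It assumes power scales linearly with clock at fixed voltage and capacitance, which is the same simplification used in these posts; the function name is just for illustration.

```python
def scale_watts(watts, from_ghz, to_ghz):
    """Scale power linearly with clock, holding voltage and capacitance fixed."""
    return watts * to_ghz / from_ghz

per_core_30 = scale_watts(1.0, 2.5, 3.0)   # 1 W per core @ 2.5 GHz -> 3.0 GHz
per_core_33 = scale_watts(1.0, 2.5, 3.3)   # -> 3.3 GHz
print(f"per core: {per_core_30:.2f} W @ 3.0 GHz, {per_core_33:.2f} W @ 3.3 GHz")
print(f"80 cores @ 3.3 GHz: {80 * per_core_33:.1f} W")
print(f"32 W chip @ 3.3 GHz: {scale_watts(32.0, 2.5, 3.3):.2f} W")
```

Since a higher clock usually needs a bit more voltage too, the real figures would likely land somewhat above these.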
 
  

