12-14-2020, 03:25 PM, #1
LQ Guru
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 17,215
CPU speeds, Bus Bandwidth, & similar stuff.
Back in the early days, you always had 4+ stages per instruction cycle, one clock each:
- Address (for upcoming instruction)
- Read Instruction
- Compute (= internally decode) Instruction
- Write Output
Some instructions went to ≥5 clock cycles, and if you had one of the crappy things with the Address & Data bus combined, they got much longer, but mostly the effective CPU speed was crystal/4. All CPU I/O chips sat on this single Address & Data bus.
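As a rough worked example of that crystal/4 arithmetic (a minimal sketch with purely illustrative numbers, not taken from any particular chip):

```python
# Old-style CPU: every instruction walks through the same bus stages,
# one clock each, so throughput is roughly crystal / stages-per-instruction.
# Both numbers below are illustrative assumptions.
crystal_hz = 8_000_000              # assumed 8 MHz crystal
clocks_per_instruction = 4          # address, read, decode/compute, write

ips = crystal_hz / clocks_per_instruction
print(f"~{ips / 1e6:.0f} million instructions/s, i.e. 'crystal/4'")
```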
Now, in these days of multiple cores and caches, you lose track a bit. Take this example: https://www.solid-run.com/arm-server...b-workstation/
That has 16 Arm Cortex-A72 cores @ 2 GHz. For networking, they are IMHO vastly overspecified. They offer:
- 1× 100 Gbps NIC or 4× 25 Gbps NICs
- Also 4× 10 Gbps NICs
I looked at this and it didn't add up. They've a set of 2 DIMMs, so maybe it's 8 cores each, but it still didn't seem to compute. I sent some searching questions and got a reply back on the state of software progress, plus a link to their software forum. It's definitely a software work in progress: a 10 Gbps NIC is only doing 2.5 Gbps unless they use a passthrough IOMMU, in which case it's 6.5-7 Gbps. They're telling me they'll get more cores talking to the NICs, but haven't written the software yet.
- Sure it's got 16 cores, but how much use are they?
- Could that thing ever keep a 100 Gb NIC at top whack?
- What on earth is going on instead of the single buses of old?
- What sort of bus bandwidth do you need to feed a 100 Gb/s (about 12.5 gigabytes per second) NIC? (Rough numbers below.)
- How does the IOMMU become the bottleneck?
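To put rough numbers on the bandwidth question, here's a back-of-envelope sketch. The per-lane PCIe figures are the usual published ones (after 128b/130b encoding); nothing here is specific to this particular board:

```python
# Back-of-envelope: bus bandwidth needed to feed a 100 Gbit/s NIC.
# Usable PCIe throughput per lane after 128b/130b encoding, in GB/s.
PCIE_LANE_GBPS = {"PCIe 3.0": 0.985, "PCIe 4.0": 1.969}

nic_gbits = 100
nic_gbytes = nic_gbits / 8              # 12.5 GB/s of line-rate payload

for gen, per_lane in PCIE_LANE_GBPS.items():
    lanes = nic_gbytes / per_lane
    print(f"{gen}: ~{lanes:.1f} lanes for {nic_gbytes} GB/s "
          f"(an x16 link offers ~{16 * per_lane:.1f} GB/s)")
```

So, on paper, an x16 Gen3 link can just about carry 100 GbE line rate, before you even ask whether the cores and memory can keep it fed.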
EDIT: Just Found this: https://www.solid-run.com/arm-server...b-workstation/
Last edited by business_kid; 12-14-2020 at 03:44 PM.
Reason: Wrong url stuck in
12-16-2020, 05:07 AM, #2
LQ Guru
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 17,215
Original Poster
OK. I've timed out on this. Marking it solved.
I consider myself a hardware guy, with designs under my belt. But it appears everyone is like me - they don't have a clue.
I'm going to grab some figures and play with them to see what I can come up with. If I crack it, I'll report back.
12-16-2020, 07:00 AM, #3
LQ Guru
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 17,215
Original Poster
Right, I kinda got this as clear as it's ever going to be.
On the old IBM XT/AT, there was ONE CPU, ONE address & data bus, and life was simple. Now it's extremely complex. Sitting at several GHz are your CPU cores, and everything else is throttled to some extent.
Using 2 DIMMs, there's a 128-line memory bus. The last RAM I read up on had a 6-1-1-1 access cycle. That means (I think): fresh address = 6 cycles wait; then, with a 64-bit CPU doing a 128-bit read, you should get 2 successive reads in 2 cycles (so 8 cycles total for a fresh RAM address); increment the address, and another 2 reads = 3 cycles, and so on. The RAM can't go at core speed, so that's throttled back a bit.
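For anyone who wants to plug numbers in, here's a minimal sketch. It assumes the textbook reading of 6-1-1-1 (first transfer after 6 cycles, each of the next three after 1 cycle) and an invented 200 MHz bus clock, so treat the result as an order-of-magnitude figure only:

```python
# Rough sustained bandwidth for a 128-bit-wide memory bus with 6-1-1-1
# burst timing: 4 transfers in 6+1+1+1 = 9 bus cycles.
bus_clock_hz    = 200_000_000       # assumed 200 MHz memory bus
bus_width_bytes = 128 // 8          # 2 DIMMs -> 128 bits = 16 bytes/transfer

cycles_per_burst = 6 + 1 + 1 + 1    # fresh address, then 3 burst transfers
bytes_per_burst  = 4 * bus_width_bytes

bandwidth = bus_clock_hz / cycles_per_burst * bytes_per_burst
print(f"~{bandwidth / 1e9:.2f} GB/s sustained for back-to-back bursts")
```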
Then, rather than one distributed parallel bus, there are many highly efficient serial I/Os dedicated to different parts. SATA, for example, uses one lane and maxes out around 500 MB/s on an SSD. But NVMe sits in a PCIe slot and can use more (16?) lanes to get to ~3 GB/s, which is seriously fast. USB 1/2/3 get their own slower speeds, and it kind of becomes obvious why GPIO is such a pain in the butt.

Caches are thrown around like confetti, and everyone is happy. This gets you away from the hard physical addressing of days of yore, when all µP support chips had fixed physical addresses, which was a hacker's dream come true.

So bus bandwidth is sales talk, unless somebody can explain exactly what it means. But it probably relates to the potential: all other things being equal, the more cores (at the same speed), the more potential bus bandwidth, if the I/O lanes are correctly routed. It kinda makes me glad I've retired.
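As a quick sanity check on those interface figures, worked from the published line rates and assuming the common x4 NVMe arrangement (real drives sit somewhat below these theoretical ceilings):

```python
# Theoretical payload ceilings for the interfaces mentioned above.
def usable_gb_per_s(line_rate_gbps, encoding_efficiency):
    """Payload bandwidth in GB/s after line-encoding overhead."""
    return line_rate_gbps * encoding_efficiency / 8

sata3   = usable_gb_per_s(6, 8 / 10)            # SATA III, 8b/10b encoding
nvme_x4 = 4 * usable_gb_per_s(8, 128 / 130)     # PCIe 3.0 x4, 128b/130b

print(f"SATA III ceiling: ~{sata3:.2f} GB/s (real SSDs ~0.5 GB/s)")
print(f"NVMe x4 Gen3 ceiling: ~{nvme_x4:.2f} GB/s (real drives ~3 GB/s)")
```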
12-24-2020, 07:27 AM, #4
Member
Registered: Jun 2020
Posts: 609
FWIW: I think NVMe can only do up to 4 lanes per device; at least that's what all of the consumer devices I've toyed around with support (and it's 1:1 with the PCIe generation, at least on paper; there's also stuff 'behind' NVMe, since there's a whole ARM CPU, DRAM, etc. between the NAND (however arranged) and the NVMe interface...). What I've seen from fancier arrangements is basically a bridged (e.g. PLX) or bifurcated (on newer systems) setup: some 'device' takes in x8 or x16 lanes and spits out N*x4 to NVMe devices that you could then soft-RAID back together. That can mean crazy fast speeds - 10 GB/s is probably possible with modern NVMe drives, which should get close to saturating your 100 Gbit network - but as you point out, there's a lot 'in between' that may or may not let that happen. Non-standard stuff like FusionIO was/is after a similar goal, but I think most of that has gone the way of the dodo in favor of NVMe's ubiquity.
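To put numbers on the soft-RAID idea (a sketch under assumptions: ~3 GB/s sequential per PCIe 3.0 x4 drive, and ideal RAID-0 scaling, which real arrays never quite reach):

```python
import math

# How many x4 NVMe drives, striped together, to match 100 GbE line rate?
nic_gb_per_s   = 100 / 8        # 100 Gbit/s -> 12.5 GB/s
drive_gb_per_s = 3.0            # assumed sequential throughput per drive

drives = math.ceil(nic_gb_per_s / drive_gb_per_s)
print(f"Need >= {drives} drives ({drives * drive_gb_per_s:.1f} GB/s aggregate) "
      f"to cover {nic_gb_per_s} GB/s of network line rate")
```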