LinuxQuestions.org
Slackware - ARM This forum is for the discussion of Slackware ARM.

Old 09-09-2019, 04:29 AM   #1
leeeoooooo
Member
 
Registered: Jan 2009
Distribution: Slackware64 14.2 (current)
Posts: 121

Rep: Reputation: 20
Slackware for Raspberry Pi zero W?


Greetings!

I have a Raspberry Pi zero W that I would like to install Slackware onto. Debian just doesn't do it for me.

I was looking at SARPi but it doesn't support installation on the RPi zero W.

I would like to have the most complete and current release of Slackware.

I understand that, since this board has a Broadcom BCM2835, it supports hard float (VFPv2) and is internally a 32-bit processor with 64-bit data lines.

What would you recommend?
 
Old 09-09-2019, 07:41 PM   #2
glorsplitz
Member
 
Registered: Dec 2002
Distribution: slackware!
Posts: 766

Rep: Reputation: 164
Maybe look here Slackware 14.1 on Raspberry Pi Zero
or here Slackware ARM on the Raspberry Pi 1
 
1 members found this post helpful.
Old 09-11-2019, 05:40 AM   #3
leeeoooooo
Member
 
Registered: Jan 2009
Distribution: Slackware64 14.2 (current)
Posts: 121

Original Poster
Rep: Reputation: 20
Thanks glorsplitz!

Those links look to be very helpful. I missed those when I searched the forums.
 
Old 09-11-2019, 07:51 PM   #4
glorsplitz
Member
 
Registered: Dec 2002
Distribution: slackware!
Posts: 766

Rep: Reputation: 164
post back if you get anywhere
 
Old 09-12-2019, 09:39 PM   #5
gus3
Member
 
Registered: Jun 2014
Distribution: Slackware (x86 and ARM)
Posts: 202

Rep: Reputation: Disabled
If you do get Slackware ARM 14.1 to work on your RPi Zero, you can get a small boost by following the instructions at https://mindplusplus.wordpress.com/2...-raspberry-pi/ (updated to reflect Slackware ARM 14.1 instead of -current). It will rebuild glibc using "-mfloat-abi=softfp"; this lets glibc use the VFP floating-point coprocessor, internally to glibc itself, without breaking the soft-float ABI that other programs and libraries are expecting.
 
Old 09-13-2019, 03:59 PM   #6
abga
Senior Member
 
Registered: Jul 2017
Location: EU
Distribution: Slackware
Posts: 1,117

Rep: Reputation: 595
@leeeoooooo

If you follow the "Manual installation method" instructions from the link glorsplitz suggested:
https://docs.slackware.com/howtos:ha...rm:raspberrypi
you'll get the Slackware 14.2 ARM SoftFloat working on your Pi Zero W. I own a bunch of Pi Zeros (without WiFi) on which I installed and used Slackware 14.2 and can confirm that the "Manual installation method" works.
Note that the output of fdisk -l will show different Start - End entries on your card; adapt the rest of the instructions accordingly.
The /opt/vc libs are useless (compiled for HardFloat) on your Pi Zero W, and if you need them, you have to get the source files and compile them on your own. You can follow the section "___VC-USERLAND___START" from here:
https://www.linuxquestions.org/quest...on-4175612537/

If you use the latest Raspbian image to extract the kernel&firmware, your WiFi/Bluetooth chip should also work properly.
The CPU from your Pi Zero, as you correctly mentioned, does support hard float (VFP), but Slackware 14.2, for compatibility considerations, is only SoftFloat.
Slackware ARM -current is HardFloat, but compiled to support only armv7(and above) and useless for your armv6 Pi Zero W.
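
A quick way to confirm what the board itself reports is to read /proc/cpuinfo. The sketch below is only an illustration: the function name cpu_float_summary is made up here, and it assumes the kernel labels the CPU with a "model name" (or, on older kernels, "Processor") line and lists "vfp" in the "Features" line.

```shell
#!/bin/sh
# Summarise what a cpuinfo file says about the CPU model and VFP support.
# Usage: cpu_float_summary [FILE]   (FILE defaults to /proc/cpuinfo)
# cpu_float_summary is a hypothetical helper name, not part of any distribution.
cpu_float_summary() {
    file="${1:-/proc/cpuinfo}"
    # Newer kernels print "model name", older ARM kernels print "Processor".
    model=$(grep -m1 -E '^(model name|Processor)[[:space:]]*:' "$file" \
        | cut -d: -f2- | sed 's/^[[:space:]]*//')
    # The Features line is a space-separated list of CPU capabilities.
    features=$(grep -m1 -E '^Features[[:space:]]*:' "$file" | cut -d: -f2-)
    echo "model: ${model:-unknown}"
    case " $features " in
        *" vfp "*) echo "vfp: yes" ;;
        *)         echo "vfp: no" ;;
    esac
}
```

Calling cpu_float_summary with no argument on the Pi reads the live /proc/cpuinfo. An ARMv6 model string together with "vfp: yes" matches the situation described above: the hardware has VFP, but an armv7-only userland will not run on it.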

You can find some extra hints for your Slackware 14.2 ARM installation in this post:
https://www.linuxquestions.org/quest...9/#post5846280

I suggest getting the smaller Raspbian Buster Lite; you only need the kernel+modules+firmware+videocore (/opt/vc) libs:
https://www.raspberrypi.org/downloads/raspbian/

Grab the latest slack-14.2-miniroot*.xz from here:
ftp://ftp.slackware.uk/slackwarearm/...irootfs/roots/
- the root password is contained in the corresponding slack-14.2-miniroot_details.txt

You might want to add a swap partition (not included in the Slackware doc)
Code:
# use mkswap to format it & add it manually to /etc/fstab - change X to reflect the actual partition
/dev/mmcblk0pX   swap             swap        defaults         0   0
- I'd recommend setting the swappiness to 1 - add the following to your /etc/rc.d/rc.S
Code:
echo 1 > /proc/sys/vm/swappiness
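
The swap steps above can be sketched as a small dry-run helper. This is an illustration only: swap_setup_plan is a made-up name, it only prints the commands and the fstab line rather than running anything, and the partition you pass in is whatever your own fdisk -l output shows.

```shell
#!/bin/sh
# Print (do not run) the steps for turning a partition into swap.
# swap_setup_plan is a hypothetical helper; nothing is written to disk.
swap_setup_plan() {
    part="$1"
    [ -n "$part" ] || { echo "usage: swap_setup_plan /dev/mmcblk0pX" >&2; return 1; }
    # 1. Format the partition as swap.
    echo "mkswap $part"
    # 2. Enable it for the running system.
    echo "swapon $part"
    # 3. The line to add to /etc/fstab so it is enabled at boot.
    printf '%s   swap             swap        defaults         0   0\n' "$part"
}
```

For example, swap_setup_plan /dev/mmcblk0p3 (p3 being a placeholder) prints the two commands followed by the fstab entry, which you can review before running anything as root.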
From here you can download the Slackware ARM 14.2 packages:
ftp://ftp.slackware.uk/slackwarearm/...4.2/slackware/
- I usually install Slackware ARM with the help of a USB flash drive containing the Slackware packages tree, which I mount after I boot the miniroot. This is how I download the tree:
Code:
rsync --exclude '*/source/*' --delete -Pavv ftp.slackware.uk::slackwarearm/slackwarearm-14.2/ .


@gus3
According to my investigation&conclusions from here:
https://www.linuxquestions.org/quest...ml#post5753118
the softfp might create more overhead instead of performance improvements. Can you prove different?
 
Old 09-13-2019, 08:15 PM   #7
gus3
Member
 
Registered: Jun 2014
Distribution: Slackware (x86 and ARM)
Posts: 202

Rep: Reputation: Disabled
Quote:
Originally Posted by abga View Post
According to my investigation&conclusions from here:
https://www.linuxquestions.org/quest...ml#post5753118
the softfp might create more overhead instead of performance improvements. Can you prove different?
Yes, I can prove different.

First, the kernel is completely unaffected by userspace's hard- or soft-float ABI. The kernel's only concern w.r.t. FP coprocessors is to save/restore the FPU state on task switch. The kernel does not use the FPU otherwise. If an ARM processor has no FPU, the kernel saves/restores nothing, but that's a decision made when the kernel is built, not at runtime.

As for userspace: Slackware ARM up to 14.1 was soft-float, putting all FP arguments on the program stack before calling a subroutine. The subroutine then extracted the FP arguments from the stack, did whatever calculations were necessary, then put a FP result (if any) onto the stack before returning. This is what happens when you pass "-mfloat-abi=soft" or "-mfloat-abi=softfp" as a gcc option. But that's where the similarity ends.

If the float-abi is "soft", gcc avoids all FPU co-processor instructions, generating calls to routines in glibc that emulate an FPU in pure ARM code instead. The infrastructure for all this is included in the glibc source code. So every FADDS, FADDD, FMULS, FMULD turns into "push, push, call, (fetch op1 from stack, fetch op2 from stack, emulate FPU op, store result in stack, return), pop result". That's a minimum of nine instructions per emulated FPU instruction, likely many more to carry out the actual emulation. A hypotenuse calculation gets fairly hairy, with two multiplications, an addition, and a square root.

If the float-abi is "softfp", glibc gets built with actual FPU instructions, obviating all that emulation infrastructure. FP values are still passed back and forth between routines via the stack, but internally to a routine, calculations are carried out without emulation. So instead of several thousand pure ARM instructions to calculate a hypotenuse, it's reduced to just a few: "push x, push y, call, (fetch op1, FMUL, fetch op2, FMUL, FADD, push sum-of-squares, call sqrt, (fetch square, ... Newton-Raphson algorithm here(*)... store root, return), fetch result, store as final result, return), pop hypotenuse". Note that FMUL and FADD correspond to the actual VFP/Neon co-processor instructions, not calls to emulation code.

It still conforms to the "soft" ABI, so it doesn't look any different to outside code. Internally, the "softfp" code uses far fewer instructions, and more silicon in parallel, to do the same work.

(*)Newton-Raphson, in an optimized form, will itself call an external log10() function a couple times per iteration... but remember, that routine also uses VFP/Neon directly, where possible! Otherwise, N-R uses multiplication, division, subtraction, and absolute values, all of which are supplied directly in VFP/Neon.

EDIT: I didn't have the ARM ARM available to look this up as I composed the above, but VFP and Neon do support square roots directly, both single- and double-precision. So there's no need to call a Newton-Raphson algorithm at all. This makes the hypotenuse function in softfp much more direct: fetch, FMUL, fetch, FMUL, FADD, FSQRT, store result, return. 7 simple steps, 10 ARM instructions. Even better than my sub-optimal concoction from last night.
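
One way to see the difference described above is to compile a small hypotenuse function both ways with "gcc -S" and count what ends up in the assembly. The helper below is a rough sketch under assumptions: count_fp_ops is a made-up name, hyp.c is a hypothetical test file, and the patterns only cover the common VFP arithmetic mnemonics (old spelling such as fmuld, unified spelling such as vmul.f64) and the __aeabi_* soft-float helper calls that GCC emits for EABI targets.

```shell
#!/bin/sh
# Count hardware FP instructions and soft-float helper calls in an ARM
# assembly listing. count_fp_ops is a hypothetical helper name.
#
# Example of producing listings on the Pi (hyp.c is an assumed test file):
#   gcc -O2 -march=armv6 -mfloat-abi=soft -S hyp.c -o hyp-soft.s
#   gcc -O2 -march=armv6 -mfpu=vfp -mfloat-abi=softfp -S hyp.c -o hyp-softfp.s
count_fp_ops() {
    asm="$1"
    # VFP arithmetic in either spelling: fmuld/faddd/fsqrtd or vmul.f64/vadd.f64/...
    vfp=$(grep -cE '^[[:space:]]*(f(add|sub|mul|div|sqrt)[sd]|v(add|sub|mul|div|sqrt)\.f(32|64))[[:space:]]' "$asm" || true)
    # Calls to EABI soft-float helpers such as __aeabi_dmul or __aeabi_dadd.
    soft=$(grep -cE '^[[:space:]]*bl[[:space:]]+__aeabi_[fd](add|sub|mul|div)' "$asm" || true)
    echo "vfp=$vfp softcalls=$soft"
}
```

Running count_fp_ops on each listing gives a crude side-by-side view: the "soft" build should show helper calls and no VFP instructions, the "softfp" build the reverse. It says nothing about actual run time, only about which kind of code was generated.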

Last edited by gus3; 09-14-2019 at 08:00 PM. Reason: ARM VFP does square roots.
 
1 members found this post helpful.
Old 09-14-2019, 12:03 AM   #8
abga
Senior Member
 
Registered: Jul 2017
Location: EU
Distribution: Slackware
Posts: 1,117

Rep: Reputation: 595
Quote:
Originally Posted by gus3 View Post
Yes, I can prove different.
...
If the float-abi is "softfp", glibc gets built with actual FPU instructions, obviating all that emulation infrastructure. FP values are still passed back and forth between routines via the stack, but internally to a routine, calculations are carried out without emulation. So instead of several thousand pure ARM instructions to calculate a hypotenuse, it's reduced to just a few: "push x, push y, call, (fetch op1, FMUL, fetch op2, FMUL, FADD, push sum-of-squares, call sqrt, (fetch square, ... Newton-Raphson algorithm here(*)... store root, return), fetch result, store as final result, return), pop hypotenuse". Note that FMUL and FADD correspond to the actual VFP/Neon co-processor instructions, not calls to emulation code.
Thank you for your very informative analysis. I really appreciate your effort & insights.
I wish we could have gone through this discussion in the appropriate thread:
https://www.linuxquestions.org/quest...987/page2.html
and not pollute this Pi Zero W installation thread.

In asking you for proof, I was expecting some empirical performance measurements, because my question referenced an older post where I came to the conclusion:
Quote:
softfp might create more overhead than simple soft and I'm still not sure why it was created in the first place
And it was based on this article (also available in the older post):
https://wiki.debian.org/ArmHardFloat...#A.22softfp.22
Stating:
Quote:
The caveat is that copying data from integer to floating point registers incurs a pipeline stall for each register passed (rN->fN) or a memory read for stack items. This has noticeable performance implications in that a lot of time is spent in function prologue and epilogue copying data back and forth to FPU registers. This could be 20 cycles or more.
followed by some more details and explanations.

Should you choose to reply, let's move to the appropriate thread I mentioned in the beginning. Thanks again for your explanations.
 
Old 09-14-2019, 01:51 PM   #9
gus3
Member
 
Registered: Jun 2014
Distribution: Slackware (x86 and ARM)
Posts: 202

Rep: Reputation: Disabled
Point taken. Yes, I did some empirical testing on the math routines involved, and yes, trig and log functions ran visibly faster with "softfp" than with just "soft", without having to re-compile the test harness. If the speed-up had been negligible, or even questionable, I wouldn't have posted the article. (I no longer have a 2835-based RPi running Slackware, so I can't give you the speed figures, sorry.)
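
For anyone who wants to repeat that kind of measurement, a minimal workload might look like the sketch below. It is not the original test harness: libm_workload is a made-up name, and it assumes awk on the system is dynamically linked against libm (worth checking with ldd), so the same awk binary exercises whichever glibc build is installed.

```shell
#!/bin/sh
# Call libm's sin() and log() N times through awk and print the running sum.
# libm_workload is a hypothetical helper; awk is used only because it is
# normally dynamically linked against libm, so no recompilation is needed.
libm_workload() {
    n="${1:-1000000}"
    # LC_ALL=C keeps the decimal point a "." regardless of locale.
    LC_ALL=C awk -v n="$n" 'BEGIN {
        s = 0
        for (i = 1; i <= n; i++) s += sin(i) * log(i)
        printf "%.6f\n", s
    }'
}
```

In bash, "time libm_workload 1000000" run once under the stock glibc and once under the rebuilt one gives two wall-clock figures to compare. The printed sum should be the same, or very nearly the same, in both runs, which also serves as a sanity check on the rebuilt library.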

However, compared to the latest link you provide, that's frankly apples vs. oranges. Gcc's '-mfloat-abi=hard' uses a totally different ABI, one that isn't compatible with "soft" or "softfp". And yes, it gives the fastest-running FP library, short of hand-coding it in assembly. The "hard" ABI passes FP values directly in the VFP/Neon registers, instead of through the stack. This means no (or at least *fewer*) memory accesses when calling a floating-point subroutine, and no memory access to return a bare FP value. That's also why it's incompatible with "soft" and "softfp".
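
Which of the three variants a given binary or library was built with can usually be read from its ARM build attributes. The sketch below is a heuristic, not a definitive test: classify_float_abi is a made-up name, it expects "readelf -A" output on stdin, and it assumes the attribute spellings used by common binutils versions (Tag_ABI_VFP_args for the hard-float calling convention, Tag_FP_arch or the older Tag_VFP_arch when VFP instructions are present).

```shell
#!/bin/sh
# Guess the float ABI of an ARM ELF file from "readelf -A" output on stdin.
# classify_float_abi is a hypothetical helper; attribute names may differ
# between binutils versions, so treat the result as a hint.
classify_float_abi() {
    attrs=$(cat)
    case "$attrs" in
        # FP arguments passed in VFP registers: the hard-float ABI.
        *"Tag_ABI_VFP_args: VFP registers"*) echo "hard" ;;
        # VFP instructions present but soft calling convention: softfp.
        *Tag_FP_arch*|*Tag_VFP_arch*)        echo "softfp" ;;
        # No VFP attributes at all: plain soft-float.
        *)                                   echo "soft" ;;
    esac
}
```

On the Pi, something like "readelf -A /lib/libc.so.6 | classify_float_abi" (the library path is an assumption) would be expected to print "soft" for the stock 14.1 glibc and "softfp" after a rebuild with VFP enabled, while a library from a hard-float distribution prints "hard".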
 
Old 09-14-2019, 05:15 PM   #10
abga
Senior Member
 
Registered: Jul 2017
Location: EU
Distribution: Slackware
Posts: 1,117

Rep: Reputation: 595
Thanks again for your inputs. The last link I provided, again:
https://wiki.debian.org/ArmHardFloat...#A.22softfp.22
contains the section "softfp" and that's what I was pointing at (also quoted). softfp is the subject of our discussion and no "apples vs. oranges".
I take your word for the performance improvements you measured with softfp, but I'm not really convinced. I fear that the overhead created by "copying data from integer to floating point registers" could actually cancel out (or even outweigh) the benefit of using the available HW VFP functions.
I do still own a bunch of 2835 Pi Zeros but I gradually moved them all to an armv6 HardFloat distro after failing to get a toolchain for recompiling the whole Slackware:
https://www.linuxquestions.org/quest...v6-4175612701/
I use these boards solely for Kodi (multimedia), not really needing a full distro for that purpose and just maintaining (compiling & updating) now the applications I need to expose the system to the Internet (ssl&curl&co). I only noticed a slight performance improvement in Kodi's OSD display - it's snappier - between Slackware SoftFloat & new distro HardFloat. No other subjective performance improvements noticed.

I've tried to reference and continue the discussion from these last 3 posts in the more appropriate slackware-arm-faq-soft-float-and-hard-float thread, but I wasn't able to; that thread looks locked. Weird!
https://www.linuxquestions.org/quest...987/page2.html

I'll stop here and apologize to leeeoooooo for the slightly off-topic discussion.
 
Old 09-14-2019, 08:03 PM   #11
gus3
Member
 
Registered: Jun 2014
Distribution: Slackware (x86 and ARM)
Posts: 202

Rep: Reputation: Disabled
Um, I'm still not making it clear, apparently.

Slackware ARM 14.1, as built, didn't use the HW VFP, since it was built with '-mfloat-abi=soft'. The parameter passing during function call/return went through the stack already. So memory access penalty for the function arguments is there, just as much as with 'softfp'. Add to that the much larger code base to provide FP emulation, and you're looking at a massive slow-down on the machine level.

The overhead of copying from integer to FP register is more than offset by the speedup of VFP/Neon. That "overhead" is a pittance, compared to FP emulation.
 
1 members found this post helpful.
Old 09-14-2019, 09:17 PM   #12
abga
Senior Member
 
Registered: Jul 2017
Location: EU
Distribution: Slackware
Posts: 1,117

Rep: Reputation: 595
Clear now!
 
  


All times are GMT -5.
