LinuxQuestions.org


abga 07-08-2017 10:46 AM

USB issues under the new 2017-07-05 Raspbian official kernel 4.9.28-v7+ on Slackware ARM
 
Greetings Earthlings,

I'd like to start by thanking all of the Slackers for helping me out with their work in my last 17 years of continuous "slacking", the last 2 also on ARM.

I'm using Slack ARM current HF on a bunch of Raspberry Pi2 & Zero devices and have been updating my kernel & firmware always directly from the Raspbian official images (once they were released) for the last 2 years.

With my last update to the 2017-07-05 Raspbian image, I stumbled upon some peculiar issues with the USB bus and failed to identify the cause. I asked for help on the Raspberry Forums and made a big mistake by telling them that I'm not using Raspbian but Slack and that I don't really trust Debian derivatives for my needs. I got a bored moderator on the brink of resigning together with his trolls and came here to ask if someone has had the same experience as me with the new kernel 4.9.28-v7+.
I've documented my problem both on the Kodi Forum - under ABGA:
https://forum.kodi.tv/showthread.php?tid=267284
And on the Raspberry Forum under justme123 (actually banned for the impertinence to ask for help and being a hardcore Slacker, not to mention - asking Santa Claus to put his glasses on):
https://www.raspberrypi.org/forums/v...?f=63&t=187697

Just for the sake of consistency and completion I'm also inserting (copy/paste) here the original problem description from the Raspberry Forum:

I'm not using Raspbian myself and just took the firmware&kernel from the latest image dated 2017-07-05, which comes with kernel 4.9.28-v7+ and Firmware: May 15 2017 16:57:15, version 9469ea3706e34c4de62f38a5008f69a429b4b43e (clean) (release)
Just for information, I make whole SD card backups between kernel updates and update only the firmware and kernel (/opt/vc included) once they are released. The configuration files and packages are kept in the same state/version (preserved); I only start to modify/upgrade them after making sure that the new kernel is OK. The last kernel I was using without any issues was the previously released 4.4.50-v7+.

Among other interesting things, I use one of my Pi 2 B boards for streaming video through tvheadend from a USB DVB-S2 adapter. The only part I'm compiling myself (updating the kernel modules) is the media_tree (linuxtv.org) because I need to patch a tuner file.
With the latest kernel & firmware from 2017-07-05 I experience issues with tvheadend (launched with non-root privileges), which complains every minute in syslog that it's getting DVB stream continuity errors from the USB DVB adapter:
tvheadend[19131]: TS: Astra/11347V/ZDFinfo HD: H264 @ #6710 Continuity counter error (total 14)
tvheadend[19131]: TS: Astra/11347V/ZDFinfo HD: AC3 @ #6722 Continuity counter error (total 11)
tvheadend[19131]: TS: Astra/11347V/ZDFinfo HD: MPEG2AUDIO @ #6720 Continuity counter error (total 10)
and eventually stops streaming as the errors become more frequent. It works again for a while after I restart the tvheadend process.
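To follow how quickly these errors accumulate, the syslog lines can be summarized per stream. The following is only a minimal sketch: it assumes the exact message layout shown in the three lines above, and the names `CC_ERROR` and `latest_totals` are made up for this illustration and are not part of tvheadend.

```python
import re
from typing import Dict, Iterable, Tuple

# Pattern derived from the log lines quoted above (assumed layout):
#   TS: <service>: <stream type> @ #<pid> Continuity counter error (total <n>)
CC_ERROR = re.compile(
    r"TS: (?P<service>.+?): (?P<stream>\S+) @ #(?P<pid>\d+) "
    r"Continuity counter error \(total (?P<total>\d+)\)"
)


def latest_totals(lines: Iterable[str]) -> Dict[Tuple[str, str], int]:
    """Return the highest reported error total per (service, stream type)."""
    totals: Dict[Tuple[str, str], int] = {}
    for line in lines:
        match = CC_ERROR.search(line)
        if match is None:
            continue  # not a continuity counter message
        key = (match.group("service"), match.group("stream"))
        totals[key] = max(totals.get(key, 0), int(match.group("total")))
    return totals


if __name__ == "__main__":
    sample = [
        "tvheadend[19131]: TS: Astra/11347V/ZDFinfo HD: H264 @ #6710 Continuity counter error (total 14)",
        "tvheadend[19131]: TS: Astra/11347V/ZDFinfo HD: AC3 @ #6722 Continuity counter error (total 11)",
        "tvheadend[19131]: TS: Astra/11347V/ZDFinfo HD: MPEG2AUDIO @ #6720 Continuity counter error (total 10)",
    ]
    for (service, stream), total in sorted(latest_totals(sample).items()):
        print(f"{service} {stream}: {total}")
```

Running the same function over the syslog once with the old kernel and once with the new one would give comparable numbers for the two setups.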

My CPU frequency governor is set to ondemand (the clock only jumps from 600 to 900 MHz under very intensive usage), and if I switch it to performance (CPU frequency stays at 900 MHz) the tvheadend errors almost disappear - I get one every 10-20 minutes.
The CPU temperature is usually at 45C in idle mode, 50-60C when streaming and only while compiling I get between 65-70C.
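The governor, clock and temperature mentioned above can be sampled together while streaming, to see whether the errors line up with frequency changes. This is a rough sketch that assumes the usual Linux sysfs locations (`cpu0/cpufreq` and `thermal_zone0`) are present on the board; the helper names are invented for this example.

```python
from pathlib import Path
from typing import Optional

# Assumed sysfs locations; they may differ or be missing on other kernels.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")
THERMAL_FILE = Path("/sys/class/thermal/thermal_zone0/temp")


def read_text(path: Path) -> Optional[str]:
    """Return the stripped file content, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def snapshot(cpufreq_dir: Path = CPUFREQ_DIR,
             thermal_file: Path = THERMAL_FILE) -> dict:
    """Collect governor, current frequency (MHz) and temperature (deg C)."""
    governor = read_text(cpufreq_dir / "scaling_governor")
    freq_khz = read_text(cpufreq_dir / "scaling_cur_freq")  # reported in kHz
    temp_mc = read_text(thermal_file)                        # millidegrees C
    return {
        "governor": governor,
        "freq_mhz": int(freq_khz) / 1000.0 if freq_khz is not None else None,
        "temp_c": int(temp_mc) / 1000.0 if temp_mc is not None else None,
    }


if __name__ == "__main__":
    print(snapshot())
```

Logging one snapshot per second next to the tvheadend messages would be one way to check the CPU throttling suspicion.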
While trying to identify the cause of this real-time streaming issue with the new kernel & firmware, I stumbled upon an old tvheadend thread, not ARM related, which suggests that CPU throttling might be a valid cause:
https://tvheadend.org/boards/5/topics/8502?r=8790

To exclude the media_tree (media_build) from the equation, I recompiled it (the latest git release) under the older backup image (older firmware & 4.4.50-v7+ kernel) and did not experience any issues.

The kernel being the only component that I have changed and not fully investigated yet, I believe that the issue is somewhere in the new usb subsystem / modules.

I'm also considering compiling my own kernel after this latest experience and am looking for an ARM toolchain under x86.

Thank you in advance!

Off-Topic, there are some General Raspberry kernel related threads at the Raspberry Forums that you might find informative (that's from when I was still able to find help):
https://www.raspberrypi.org/forums/v...?f=63&t=182116
https://www.raspberrypi.org/forums/v...f=107&t=187222
And on the Kodi Forums, if you look for compiling Kodi, FFmpeg, etc on Slackware, search for my nickname ABGA - or simply ask me, now that I'm registered - Always happy to help Slackers!

abga 07-10-2017 01:36 PM

Hi,

I had some time today and searched the official 2017-07-05 Raspbian image for new /boot/config.txt and cmdline.txt settings, module parameters, or udev rules that might have been the cause of my reported issue. I couldn't find anything new or special in comparison to the previously released Raspbian image with kernel 4.4.50.

However, I've identified several outstanding USB related issues with the 4.9 kernel branch on Raspberry's git, and if you're looking for stability, I'd advise sticking with the 4.4.50 kernel for the moment, as it is confirmed as stable even by those who reported the issues with the new 4.9 branch:

https://github.com/raspberrypi/linux/issues/1943
https://github.com/raspberrypi/linux/issues/2097
https://github.com/raspberrypi/linux/issues/2026

As a side note, kernels prior to 4.4.50 are affected by this issue (I can confirm it for 4.4.34 & 4.4.48):
https://github.com/raspberrypi/linux/issues/1753

I'll stick with 4.4.50 for the moment and will stall the investigation on 4.9 - as there are several confirmed USB issues with this branch.

Regarding a Slackware x86-64 Raspberry Pi kernel cross-compilation environment, I'm still studying/learning from the google results I've found.
I'd be thankful if anyone could help me be more efficient by pointing me at a tested Slackware "recipe", if such a recipe exists.

Thanks!

Exaga 07-12-2017 07:12 AM

Quote:

Originally Posted by abga (Post 5733274)
Hi,

https://github.com/raspberrypi/linux/issues/1753

I'll stick with 4.4.50 for the moment and will stall the investigation on 4.9 - as there are several confirmed USB issues with this branch.

I haven't noticed any USB issues under the 4.9 kernel running Slackware ARM. That's not to say there aren't any, just that I haven't experienced any issues in the months I've been using the 4.9 kernel.

Quote:

Originally Posted by abga (Post 5733274)
Regarding a Slackware x86-64 Raspberry Pi kernel cross-compilation environment, I'm still studying/learning from the google results I've found.
I'd be thankful if anyone could help me be more efficient by pointing me at a tested Slackware "recipe", if such a recipe exists.

These days, when it comes to cross-compiling for ARM architecture I refer to this FAQ by Mozes.

abga 07-12-2017 11:49 AM

@Exaga

Thank you for your feedback, and cross-compiler hints.

Kernel related:
I'm using a Pi2B board as a testbed on my desk, running several services (Firewall, Snort, VPN, DNS Resolver, PostgreSQL, Kodi, etc.), monitored with the help of Monitorix, and with 3 USB ports always directly connected: a DVB-S2 USB adapter (max 20Mbps), a 3G USB dongle for Internet fail-over (max 21Mbps) and a USB card reader (16GB Sandisk SD card) for the PostgreSQL db storage (an older type that does not support write speeds over 4-5MB/s or reads over 10MB/s). The Ethernet (which is also connected internally on the USB bus), the DVB-S2 adapter and the card reader are active and in service 24/7. It's in this environment that I observe USB issues and not on other boards that are more "relaxed". As stated, I'm happy with 4.4.50.


Cross-Compiler related:

Thanks again for pointing me to that Q&A link, I don't know why I missed it, maybe because I was focusing solely on kernel compilation.

I was asking for help because I got confused by the multiple choices I have with respect to toolchains:
http://elinux.org/Toolchains#Getting_a_toolchain

And by the fact that most of the Raspberry kernel compilation recipes I've found on the Internet advise using the "official" and "optimized" Linaro toolchain that Raspberry itself is using:
https://github.com/raspberrypi/tools

But all this should be part of a new thread, as it is off-topic here. I'll maybe start a thread and document a Raspberry kernel compilation recipe on Slackware, once I've learned my lessons and become more confident.

Exaga 07-13-2017 05:08 AM

Quote:

Originally Posted by abga (Post 5734027)
@Exaga

Thank you for your feedback, and cross-compiler hints.

Thanks again for pointing me to that Q&A link, I don't know why I missed it, maybe because I was focusing solely on kernel compilation.

You're welcome to any help and advice. :)

At one time I had the idea that it may be beneficial to use a more powerful and much faster Intel x86_64 based Linux system to save a shed load of compiling time. However, after having spent approx. 1 month trying to get a cross-compile to work, and being quite unsuccessful throughout, producing more errors than one could shake a stick at, I gave up on the idea and stuck to compiling natively on the various RPi devices. Compiling natively produces zero errors for me and is 100% successful every time.

abga 08-20-2017 06:13 PM

There are confirmed kernel issues that will affect DVB devices on USB starting apparently with the 4.8 branch:

https://forum.libreelec.tv/thread/42...-kernel-4-9-x/

abga 01-27-2018 12:52 AM

Good News!

It's an old thread, my first post on LQ, but the issue reported persisted (affecting both X86 & ARM) to this very day (latest kernel releases) and it appears that the buggy kernel commit was finally found:
https://forum.libreelec.tv/thread/42...5965#post75965
https://git.kernel.org/pub/scm/linux...500098f2d5f882

Later in the LibreELEC thread (page 15) you'll see the kernel folks (media subsystem maintainer) getting involved and some links to the kernel discussion board - work in progress.

Bad News! (for me only) as I'll soon need to "unglue" the Slackware ARM (Pi0) temporary streamer running tvheadend and the DVB friendly 4.4.50 kernel from my DVB tuner ...
https://s18.postimg.org/m6xkkwjsp/lego1-streamer.jpg

abga 03-23-2018 06:19 PM

A short update on this, curbing my enthusiasm for expecting a quick fix soon and wondering ATM on how deep into the kernel core the rabbit hole goes:
https://forum.kodi.tv/showthread.php...186#pid2717186

You won't notice these issues/effects on powerful systems, but only on slower ARM/X86 ones.

OldHolborn 03-24-2018 04:15 AM

Quote:

Originally Posted by abga (Post 5834635)

to

https://github.com/raspberrypi/linux/issues/2134#

to

"mutability commented on Sep 5, 2017 •
I flattened the 4.9.41 - 4.9.43 history and bisected it.
So far it looks like 9ef8b23 is the commit that fixed things.
"

So

"JamesH65 commented on Sep 13, 2017
Closing this issue as questions answered/issue resolved.
"

Is quite a correct statement and so your statement on https://github.com/raspberrypi/linux/issues/2134#

"JamesH65 has apparently found a resolution that he doesn't want to share with anyone, just keeping it for himself:
"

is quite simply wrong.

In short - mutability filed a detailed bug report, later commenting that the bug he reported was fixed, and JamesH65 closed the bug report.

Looks like you are wrongly conflating two problems, one fixed, one not.

So what's your problem? That the RPi kernel hasn't fixed an upstream bug or that a bug report you haven't filed hasn't been fixed?

abga 03-24-2018 02:19 PM

@OldHolborn

Thank you for your quite objective and quite nontechnical feedback.
In short, driven mainly by my initiative to be helpful in the Slackware community, as I'm a Slacker myself, and still using Slackware ARM -current on some Pi2 boards, I decided to keep this thread updated and to report on the development of this peculiar bug. That's for helping any Slacker that might "get hit" by this still unresolved issue, maybe even helping you to understand that you're currently running a broken kernel (X86 included).

In order to answer some of your last questions for yourself, you should have maybe quoted the whole section from my referenced Kodi Forum post:
Quote:

Looking at the the Raspberry Foundation kernel issues list to see if this was admitted as a bug, I found out that JamesH65 has apparently found a resolution that he doesn't want to share with anyone, just keeping it for himself:
https://github.com/raspberrypi/linux/issues/2134#
JamesH65 commented Sep 13, 2017
"Closing this issue as questions answered/issue resolved."
Again, in short, I was just checking if it was admitted as a bug, not necessarily expecting any resolution, since there might be no competence at all for that at the Raspberry Foundation. And I was not able to report this bug on my own on the Raspberry Foundation Forums, because I got trolled and banned - see the first post on this thread.
The kernel bug report that was filed by mutability (the title was correct - suspecting regressions between the 4.9.x and 4.4.x) came just some weeks after my first report on their forum and if you read it again, with a little more attention maybe, you'll notice that mutability was repeatedly confused, did the whole investigation on his own and at the end nobody asked him if he's happy with his findings, but your friend Jamesch just came by and closed it.
https://github.com/raspberrypi/linux/issues/2134#
This is what I considered a level of arrogance and superficiality, thus denial, that wasn't new for me - as it happened also when I tried to report it on my own. I made a sarcastic side note about that situation, which I stand by.

I would also like to point out that, unlike here on the Slackware Forum, where we spend our time helping each other without any remuneration, the Raspberry Foundation is a commercial entity; even running as a foundation and not making a profit, their costs are still covered and the support is paid (Jamesch is a paid employee). I am a customer of the Raspberry Foundation, I did buy a few boards (dozens), and I consider myself entitled to expect some higher level of quality from their support.

OldHolborn 03-24-2018 04:59 PM

Quote:

Originally Posted by abga (Post 5834917)
I'm a Slacker myself, and still using Slackware ARM -current

Well, we probably wouldn't be here otherwise so, let's take that as a given :)

Dear Abga,

I would suggest you read it again.

mutability has a quite particular setup, in his own words "Reproducing this is somewhat involved and needs specific hardware."

mutability later*[1] reports that the problem is fixed "I had a chance to retest this with 4.9.46-v7+"..."and the issue seems to be fixed."

mutability bisects the kernel and comments "it looks like 9ef8b23 is the commit that fixed things."

We can only assume mutability was not faking the whole issue just to annoy you and that he did indeed have a problem and that it was now fixed.

What do you expect him to say after he finds his problem has been fixed?

So then, if JamesH65 then closes the ticket with "Closing this issue as questions answered/issue resolved", where is he wrong?

Note - if you search issues on github using "is:issue is:closed involves:JamesH65" you'll see he closes many issues with this same line.

So quite where in this you find a conspiracy I don't know.

[1] There was a little over a month in between mutability's first report and his reporting it fixed, what happened during this period we can only guess, here a few of mine:

JamesH65 was involved in problems affecting more than just one user?
JamesH65 was involved in work for the 3B+?
JamesH65 was enjoying his summer holidays?

What we do know was the issue was not closed before mutability reported his problems fixed.

So where's the problem?

Ah, your problem!

Ok, not all problems that have the same signs have the same causes, taking a non-computing example - nosebleeds!

https://en.wikipedia.org/wiki/Nosebleed

So, just because you and mutability both had systems that worked with 4.4 and didn't work with 4.9 does not mean the problems had the same cause.

That he then reported his problem fixed should have been a hint!

OK, the RPi Foundation...

They develop in public and are progressively moving towards mainline kernel and their staff are active on the forums etc, compare this with the support from the manufacturers of many other SBCs who seem to throw their product onto the market and run away leaving support/onward development to the community.

Compare it to even Intel and the less than perfect support provided after Meltdown and their kit costs a heck of a lot more than £35

You appear to have unrealistic demands, are confusing problems and then inventing conspiracy.

abga 03-24-2018 06:14 PM

This very issue was confusing a lot of people for a long time and only after jahutchi spent several months bisecting the kernel commits, it was first recognized and assumed by the kernel folks:
https://forum.libreelec.tv/thread/42...5965#post75965
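For readers unfamiliar with the term, bisecting is a binary search over an ordered commit history between a known good and a known bad kernel. The sketch below only illustrates the idea under simplifying assumptions (a linear history, and a regression that stays present once introduced); the commit names and the `is_bad` test are hypothetical, and each real test step means building and booting a kernel.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in an ordered history.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is bad as well.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    good, bad = 0, len(commits) - 1  # indices of known good / known bad
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid    # regression is at mid or earlier
        else:
            good = mid   # regression is after mid
    return commits[bad]


if __name__ == "__main__":
    # Hypothetical history c0..c9 where the regression appears in c6.
    history = [f"c{i}" for i in range(10)]
    print(first_bad(history, lambda c: int(c[1:]) >= 6))
```

The number of test steps grows only with the logarithm of the number of commits, but on a slow board every step can take hours, which fits the months mentioned above.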

There is no conspiracy here, at least not if a healthy rational thinking is involved, I was just emphasizing the arrogance & superficiality in a side note that got you irritated and me uselessly wasting my time again arguing on subjective interpretations.

The user mutability was confused during his investigation:
- first he was considering the firmware being buggy:
mutability commented Sep 4, 2017
Initial results:
Good: 4.9.43-v7+ (Hexxeh/rpi-firmware@bf19fe7)
Bad: 4.9.41-v7+ (Hexxeh/rpi-firmware@b9becbb)
- then, his last - and at least for me - not convincing report - "it looks like":
mutability commented Sep 5, 2017
"So far it looks like 9ef8b23 is the commit that fixed things."
- additional to that, jsiverskog opened another ticket that he relates to mutability's, which actually remained open, your friend Jamesch apparently remained a little bit more cautious in dumping that report too:
https://github.com/raspberrypi/linux/issues/2140

If you had taken some more time and read some of the kernel threads (discussions) I referenced in the Kodi Forum post, objectively related to the reported issue, you might well have observed that the effects of that softirq.c commit on the USB subsystem are causing all sorts of weird issues and unexpected behaviors. Again, the superficiality with which these issues were treated caused a huge delay in pinpointing the exact cause and that's what I was emphasizing in my rather innocent but sarcastic observation.
I don't care about what Jamesch is doing, I'm sure he's a nice guy in his private life, nor about the amount of work he needs to do (management issue, not mine) and I deleted my registration on Raspberry Forums long time ago, mainly because I haven't had too much success in receiving help there, but more off-topic arguing about why am I not using Raspbian ... off-topic arguing much like here in this Slackware appendix forum apparently :)

OldHolborn 03-24-2018 07:34 PM

You take issue with mutability using "seems fixed" yet take jsiverskog's "possibly related to #2134" as categorical statement of connection? Despite even P33M's remark "It's more likely to be related to #1709"?

*boggle*

And all the while still managing to miss what is right under your nose.

Quote:

Originally Posted by abga (Post 5835021)

jahutchi: "So this indicates the troublesome commit is: 4cd13c21b207e80ddb1144c576500098f2d5f882"
Eric Dumazet 2016-08-31 10:42:29 -0700 "softirq: Let ksoftirqd do its job"

Meanwhile back in the grand conspiracy thread...
mutability: "So far it looks like 9ef8b23 is the commit that fixed things."
matijaGP: "Fixes: 46c8f0b ("timers: Fix get_next_timer_interrupt() computation")"
cmetcalf-tilera: "Fixes: 500462a "timers: Switch to a non-cascading wheel""
Thomas Gleixner Jul 4, 2016 "timers: Switch to a non-cascading wheel"

Each tracks back to a different original commit, yet somehow, you consider this the basis to accuse someone of withholding fixes?
https://forum.kodi.tv/showthread.php...186#pid2717186
"I found out that JamesH65 has apparently found a resolution that he doesn't want to share with anyone, just keeping it for himself:"

Now, I know jack about kernels, but I do know my calendar well enough that "Thomas Gleixner Jul 4, 2016" is just shy of 2 months earlier than "Eric Dumazet 2016-08-31", so unless he has a timetravel machine - Oh wait - that's what all his real-time patches are about!!!

Why abga, you are a genius! Conspiracy proven! Timetravel does exist and it's being used to sabotage your kodi box!

Quote:

Originally Posted by abga (Post 5835021)
I haven't had too much success in receiving help there

I wonder why...

abga 03-24-2018 08:47 PM

Quote:

Originally Posted by OldHolborn (Post 5835041)
You take issue with mutability using "seems fixed" yet take jsiverskog's "possibly related to #2134" as categorical statement of connection? Despite even P33M's remark "It's more likely to be related to #1709"?

*boggle*

None of those statements that you quote are categorical, are they? I was linking it in the context of the confusion that was in the air at that time and regard it as a potential effect of the softirq.c issue. But now that I look closely at that period, I'm afraid I can't see you around doing some work to investigate it. Indeed, *boggle*

Quote:

Originally Posted by OldHolborn (Post 5835041)
And all the while still managing to miss what is right under your nose.


jahutchi: "So this indicates the troublesome commit is: 4cd13c21b207e80ddb1144c576500098f2d5f882"
Eric Dumazet 2016-08-31 10:42:29 -0700 "softirq: Let ksoftirqd do its job"

This was the objective information I posted here on this thread before your conspiracy theory got initiated and developed. Good observation BTW.

Quote:

Originally Posted by OldHolborn (Post 5835041)
Meanwhile back in the grand conspiracy thread...
mutability: "So far it looks like 9ef8b23 is the commit that fixed things."
matijaGP: "Fixes: 46c8f0b ("timers: Fix get_next_timer_interrupt() computation")"
cmetcalf-tilera: "Fixes: 500462a "timers: Switch to a non-cascading wheel""
Thomas Gleixner Jul 4, 2016 "timers: Switch to a non-cascading wheel"

Each tracks back to a different original commit, yet somehow, you consider this the basis to accuse someone of withholding fixes?
https://forum.kodi.tv/showthread.php...186#pid2717186
"I found out that JamesH65 has apparently found a resolution that he doesn't want to share with anyone, just keeping it for himself:"

Now, I know jack about kernels, but I do know my calendar well enough that "Thomas Gleixner Jul 4, 2016" is just shy of 2 months earlier than "Eric Dumazet 2016-08-31", so unless he has a timetravel machine - Oh wait - that's what all his real-time patches are about!!!

Now, this is your interpretation and an interesting work of fiction - I wish time traveling was possible, BTW. With my sarcastic side note, I was referring strictly to:
https://github.com/raspberrypi/linux/issues/2134
I repeat, I don't take mutability's findings as convincing and complete, his initial report, as the title and his first sentence describes:
"TL;DR: USB bulk data gets intermittently dropped on Pi 2B + kernel 4.9. The same hardware with kernel 4.4 does not drop data"
came with the 4.9.x kernel upgrade, the previous one 4.4.x had no issues and I can confirm that as I was also experiencing the issue with my 4.4.50-4.9.28 kernel upgrade. The buggy commit appears somewhere in the kernel 4.8.x branch. Please go again through the LibreELEC and kernel discussions links, as I fear you have missed some information. It's difficult, I must admit, as you have the kernel folks discussion spread over several threads that are referencing the softirq.c
This is also a thread related to the subject, you have apparently missed, it also references Jamesch's resolution ( https://github.com/raspberrypi/linux/issues/2134 ):
https://www.spinics.net/lists/linux-usb/msg164414.html
So please don't take it out of context. And it wasn't an accusation but a joke, I hope you're able to tell the difference. Here's some help, as you sure like Wikipedia:
https://en.wikipedia.org/wiki/Sarcasm
I count on your sharp perceptiveness and hope you understand that sarcasm is healthy, at least for healthy minds. ;)

Quote:

Originally Posted by OldHolborn (Post 5835041)
Why abga, you are a genius! Conspiracy proven! Timetravel does exist and it's being used to sabotage your kodi box!

Thank you! I'm very modest myself, but very pleased when being recognized as a genius!

Quote:

Originally Posted by OldHolborn (Post 5835041)
I wonder why...

It's simple: as mentioned before, I was focusing on the firmware, kernel and drivers related to the Raspberry Pi platform (they're the only ones with access to the closed source Broadcom HW documentation, and it was their firmware & kernel I was using (and still am)). Those are the areas of competence in which I expected help from the Raspberry Foundation, and instead I got a lot of Raspbian usage suggestion/enforcement. You can follow my posts there, they're still available.

I'd like to ask you, if you're so kind to stop vandalizing this thread, as I, and hopefully some other users, find it useful for tracking the reported kernel issue and instead, if you're still filled with energy, maybe start a blog on your own, go on Facebook, even start a new thread here and continue your interpretations.
Thank you in advance!

OldHolborn 03-25-2018 02:04 AM

This is pointless, we are going round in circles

You provide a link saying - look, it mentions #2134
I say, yes it mentions it, but it says it's not that.

You provide another link saying - look, it mentions #2134
I say, yes, it does mention it, but it also says it's not that.

for example, you say
Quote:

Originally Posted by abga (Post 5835054)

and I say, look
"
The main issue is really the logic changes a the core softirq logic.

Using Kernel 4.14.10 on a Raspberry Pi 3 with 4cd13c2 commit reverted
fixed the issue.
"

We could play this game all day really, but I have better things to do.

Bye


All times are GMT -5.