Bugzilla – Bug 825
ksoftirqd high load
Last modified: 2006-02-23 10:56:46
Hi,

during data transfers I'm experiencing up to 70% CPU time eaten up by ksoftirqd (ieee80211-package 1.1.6), making the driver mostly unusable. This is what happens:

- ieee80211_xmit() is called via dev->hard_start_xmit()
- it checks the queue size using ieee->is_queue_full(), ipw_net_is_queue_full()
- if the queue is full, it returns NETDEV_TX_BUSY and the network layer reschedules immediately

If the network is slow for some reason, this sums up to several thousand iterations until a packet can finally be transmitted.

Workaround: call netif_stop_queue() from ipw_net_is_queue_full(). This may interact with QOS as it triggers when one queue is full, but at least makes the driver usable.

Also, I'm wondering why the high_mark of a queue is calculated depending on the queue size. Shouldn't we always try to fill the xmit queue as much as possible?

Stefan

Patch attached.
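To make the difference between the two behaviours concrete, here is a minimal user-space sketch. It is not ipw2200, ieee80211, or kernel code: the struct, the sim_* functions, and the tick-based "scheduler" are all hypothetical stand-ins. It only models the idea described above: when a full ring merely returns "busy", the stack calls the transmit hook again on every pass, whereas stopping the queue suppresses those calls until a transmit completion wakes it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; names are illustrative, not driver APIs. */
enum tx_result { TX_OK, TX_BUSY };

struct sim {
    int ring_used;       /* occupied slots in the modelled TX ring */
    int ring_size;       /* capacity of the modelled TX ring */
    bool queue_stopped;  /* stands in for the stopped/woken queue state */
    long xmit_calls;     /* how often the "stack" called the transmit hook */
};

/* Models the transmit hook. With stop_on_full set, a full ring also
 * stops the queue (the workaround idea); otherwise it only reports busy. */
static enum tx_result sim_xmit(struct sim *s, bool stop_on_full)
{
    s->xmit_calls++;
    if (s->ring_used >= s->ring_size) {
        if (stop_on_full)
            s->queue_stopped = true;
        return TX_BUSY;
    }
    s->ring_used++;
    return TX_OK;
}

/* Models a TX completion: frees one slot and wakes the queue. */
static void sim_tx_complete(struct sim *s)
{
    if (s->ring_used > 0)
        s->ring_used--;
    s->queue_stopped = false;
}

/* Sends `packets` packets while the modelled hardware completes one
 * packet every `ticks_per_completion` ticks. Returns the number of
 * calls made to the transmit hook. */
static long sim_run(int packets, int ring_size,
                    int ticks_per_completion, bool stop_on_full)
{
    struct sim s = { 0, ring_size, false, 0 };
    int sent = 0;
    long tick = 0;

    assert(ring_size > 0 && ticks_per_completion > 0);

    while (sent < packets) {
        tick++;
        if (tick % ticks_per_completion == 0)
            sim_tx_complete(&s);
        if (s.queue_stopped)
            continue;   /* a stopped queue is not offered to the driver */
        if (sim_xmit(&s, stop_on_full) == TX_OK)
            sent++;
    }
    return s.xmit_calls;
}
```

Calling sim_run(100, 8, 1000, false) and sim_run(100, 8, 1000, true) compares the two modes on a slow link: in this model the first makes a call on every tick while the ring is full, and the second makes at most one failed call per completion. Whether a real driver can safely stop the whole queue when only one of several QoS queues is full is exactly the open question raised in the report.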
Created an attachment (id=586): patch to workaround
(In reply to comment #0)

The same thing happens here. Thank you for providing the fix.
(In reply to comment #2)
> The same thing happens here. Thank you for providing fix.

Unfortunately, this patch doesn't fix the problem for me. On the contrary, it completely broke the driver.
How does completely broken manifest? Driver crash, many firmware restarts, low throughput, cannot associate...?
(In reply to comment #4)
> How does completely broken manifest? Driver crash, many firmware restarts, low
> throughput, cannot associate...?

My mistake, it's not completely broken. What I see is extremely low throughput, and when I enable the debug option, I see a lot of "ipw2200: I ipw_rx Dropping" and "ipw2200: I ipw_rx_notification link deterioration" messages.
This reminds me that I did have throughput problems with 1.0.8 against a Linksys WRT54G when quality of service was compiled in. Can you try to recompile your driver with the line CONFIG_IPW_QOS=y in the Makefile commented out? If it helps, I can open another bug for this. Stefan
FYI, this issue was also reported downstream at https://bugs.gentoo.org/show_bug.cgi?id=113820
I was running into high ksoftirqd loads as well (opening 9 pages in konqueror at the same time) and after some searching found this bug. Recompiled 1.0.8 without QOS and the problem is solved now.

driver: 1.0.8
kernel: 2.6.14

Hope this helps. Let me know if there's any testing I can do.

Martin
As reported by Julian Oliver on the mailing list, this problem persists in 1.0.10. I also didn't find anything in the ipw+ieee diffs that could have fixed this one. Hey, it's a serious regression and I've even provided a workaround patch. Care to apply?
The problem is fixed in ipw2200-1.0.11.
Stefan, can you retest it?
I just tested 1.0.12 running netio while moving the laptop between zones of different reception quality. In the past, this triggered the problem immediately, but 1.0.12 worked. The driver was built with CONFIG_IPW_QOS, but my AP doesn't support that feature anyway. I am therefore closing the bug, thanks!

Stefan