Bug 825 - ksoftirqd high load
Status: CLOSED FIXED
Product: IPW2200
Component: Driver Load
Version: 1.0.8
Platform: All All
Importance: P1 major
Reported: 2005-10-29 09:27 by
Modified: 2006-02-23 10:56


Attachments
patch to workaround (706 bytes, patch)
2005-10-29 09:28, Stefan Rompf




Description From 2005-10-29 09:27:49
Hi,  
  
during data transfers I'm experiencing up to 70% CPU time eaten up by  
ksoftirqd (ieee80211-package 1.1.6), making the driver mostly unusable. This 
is what happens: 
 
-ieee80211_xmit() is called via dev->hard_start_xmit()
-it checks the queue size using ieee->is_queue_full(), i.e. ipw_net_is_queue_full()
-if the queue is full, it returns NETDEV_TX_BUSY and the network layer
reschedules immediately
 
If the network is slow for some reason, this sums up to several thousand 
iterations until a packet can finally be transmitted. 
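
To make the retry loop concrete, here is a minimal user-space sketch, not the driver code. All names (sim_queue, sim_xmit, sim_retries_until_sent, SIM_TX_BUSY) are hypothetical stand-ins, and the model assumes the stack retries once per tick while the hardware frees one slot every drain_period ticks:

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's xmit return codes. */
enum { SIM_TX_OK = 0, SIM_TX_BUSY = 1 };

struct sim_queue {
    int used;      /* slots currently occupied */
    int capacity;  /* total number of slots */
};

/* Mimics the reported xmit path: refuse the packet while the queue is full. */
static int sim_xmit(struct sim_queue *q)
{
    if (q->used >= q->capacity)
        return SIM_TX_BUSY;
    q->used++;
    return SIM_TX_OK;
}

/*
 * The stack retries on every tick; the simulated hardware completes one
 * frame every drain_period ticks (a slow link means a large period).
 * Returns how many xmit calls were needed to queue a single packet.
 */
static long sim_retries_until_sent(struct sim_queue *q, long drain_period)
{
    long calls = 0;
    long tick = 0;

    assert(drain_period > 0);
    for (;;) {
        tick++;
        if (tick % drain_period == 0 && q->used > 0)
            q->used--;              /* hardware finished one frame */
        calls++;
        if (sim_xmit(q) == SIM_TX_OK)
            return calls;
    }
}
```

Starting from a full queue with a drain period of 5000 ticks, this model needs 5000 xmit calls to queue one packet; all but the last one do no useful work.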
 
Workaround: call netif_stop_queue() from ipw_net_is_queue_full(). This may 
interact with QOS as it triggers when one queue is full, but at least makes 
the driver usable. 
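
The idea behind the workaround can be sketched in user space as follows. This is not the attached patch: the names (demo_dev, demo_stack_tick, demo_tx_complete, demo_run) are hypothetical, the stopped flag only models the effect of netif_stop_queue(), and waking the queue on TX completion is an assumption about how the driver resumes transmission:

```c
#include <assert.h>
#include <stdbool.h>

struct demo_dev {
    int used;         /* occupied TX slots */
    int capacity;     /* total TX slots */
    bool stopped;     /* models the state set by netif_stop_queue() */
    long xmit_calls;  /* how often the stack entered the xmit path */
};

/* Stack side: the xmit path is only entered while the queue is not stopped. */
static void demo_stack_tick(struct demo_dev *d)
{
    if (d->stopped)
        return;
    d->xmit_calls++;
    if (d->used >= d->capacity) {
        d->stopped = true;   /* stop the queue instead of only reporting busy */
        return;
    }
    d->used++;
}

/* Driver side (assumed): a TX completion frees a slot and wakes the queue. */
static void demo_tx_complete(struct demo_dev *d)
{
    if (d->used > 0)
        d->used--;
    d->stopped = false;
}

/* Run the model; the hardware completes one frame every drain_period ticks. */
static long demo_run(struct demo_dev *d, long ticks, long drain_period)
{
    assert(drain_period > 0);
    for (long t = 1; t <= ticks; t++) {
        if (t % drain_period == 0)
            demo_tx_complete(d);
        demo_stack_tick(d);
    }
    return d->xmit_calls;
}
```

With a full queue, 5000 ticks and a drain period of 5000, this model enters the xmit path only twice: once to discover the full queue and once after the wake-up.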
 
Also, I'm wondering why the high_mark of a queue is calculated depending on
the queue size. Shouldn't we always try to fill the xmit queue as much as
possible?
 
Stefan 
 
Patch attached.
------- Comment #1 From 2005-10-29 09:28:39 -------
Created an attachment (id=586)
patch to workaround
------- Comment #2 From 2005-11-03 19:26:18 -------
The same thing happens here. Thank you for providing a fix.

(In reply to comment #0)

------- Comment #3 From 2005-11-03 20:54:47 -------
Unfortunately, this patch doesn't fix the problem for me. On the contrary, it
completely broke the driver.

(In reply to comment #2)
> The same thing happens here. Thank you for providing fix.
> 
> 
------- Comment #4 From 2005-11-04 02:03:38 -------
How does completely broken manifest? Driver crash, many firmware restarts, low 
throughput, cannot associate...? 
------- Comment #5 From 2005-11-05 10:17:57 -------
(In reply to comment #4)
> How does completely broken manifest? Driver crash, many firmware restarts, low 
> throughput, cannot associate...? 

My mistake, it's not completely broken. What I see is extremely low
throughput, and when I enable the debug option, I see a lot of "ipw2200: I ipw_rx
Dropping" and "ipw2200: I ipw_rx_notification link deterioration" messages.
------- Comment #6 From 2005-11-08 11:29:31 -------
This reminds me that I did have throughput problems with 1.0.8 against a  
Linksys WRT54G when quality of service was compiled in. Can you try to  
recompile your driver with the line CONFIG_IPW_QOS=y in the Makefile commented  
out? If it helps, I can open another bug for this.  
  
Stefan  
  
  
------- Comment #7 From 2005-11-28 11:39:51 -------
FYI, this issue was also reported downstream at
https://bugs.gentoo.org/show_bug.cgi?id=113820
------- Comment #8 From 2006-01-02 01:44:46 -------
I was running into high ksoftirqd loads as well (opening 9 pages in konqueror at
the same time) and after some searching found this bug. I recompiled 1.0.8 without
QOS and the problem is solved now.
 
driver: 1.0.8 
kernel: 2.6.14 
 
Hope this helps.  
 
Let me know if there's any testing I can do. 
Martin  
------- Comment #9 From 2006-01-23 11:30:41 -------
As reported by Julian Oliver on the mailing list, this problem persists in  
1.0.10. I also didn't find anything in the ipw+ieee diffs that could have  
fixed this one.  
  
Hey, it's a serious regression and I've even provided a workaround patch. Care  
to apply?  
------- Comment #10 From 2006-02-15 00:19:20 -------
The problem is fixed in ipw2200-1.0.11.
------- Comment #11 From 2006-02-21 00:50:48 -------
Stefan, can you retest it?
------- Comment #12 From 2006-02-23 10:56:46 -------
I just tested 1.0.12 by running netio while moving the laptop between zones of
different reception quality. In the past, this triggered the problem
immediately, but 1.0.12 worked. The configuration was with CONFIG_IPW_QOS, but my
AP doesn't support that feature anyway.
 
I therefore close the bug, thanks! 
 
Stefan