Bugzilla – Bug 1703
wrong command queue
Last modified: 2009-04-14 03:18:15
Linux 2.6.26 kernel on a Dell D830. Distribution: Scientific Linux 5 (derived from Red Hat EL5).

iwl4965: Error wrong command queue 63 command id 0x0
kernel BUG at drivers/net/wireless/iwlwifi/iwl4965-base.c:3465!
(In reply to comment #0) BTW, I'm using the 228.57.1.21 microcode version.
Can you confirm if it only happens on firmware 228.57.1.21 or 4.44.1.20 as well?
Are you using 11n? If so, can you disable it to narrow down if it is 11n related? BTW, What AP do you use?
(In reply to comment #2)
> Can you confirm if it only happens on firmware 228.57.1.21 or 4.44.1.20 as
> well?

I'm 99% sure it is happening with 4.44.1.20 also. I have re-installed that firmware and will verify 100%. It may take a day or so, as I do not know how to make the bug happen quickly.
(In reply to comment #3)
> Are you using 11n? If so, can you disable it to narrow down if it is 11n
> related? BTW, What AP do you use?

These are great questions! Where I work, in a 15-story office building, the network guys recently started upgrading the access points to 802.11n Cisco equipment (the older equipment is also from Cisco Systems), and this is when I started having trouble. So currently there is a mix of 11n-capable APs along with the a/b/g-capable APs.

Now that I think about it, I know I was having trouble (in this wireless environment, which changed within the last 2 weeks) with the older firmware and the previous versions of the iwl drivers in 2.6.26-rc7, -rc8, and -rc9. That is why I upgraded to the latest kernel and firmware.

I have not experienced the bug while using my laptop at home, where the wireless environment is a/b/g-capable APs only (mine and my neighbors') and where each wireless cell has a different ESSID.

I will try to disable the 11n feature (I assume you mean in the kernel configuration, "Enable 802.11n HT features...") and report back. I did try with all the features (HT, LEDS, Spectrum, Sensitivity Calibration) disabled, and the driver did not work. For example, a "ping <node>" command would always hang on the 10th ping, then after about 20 seconds display a sendmsg error, something about "buffers". (I'm sorry, I do not recall the exact error; I will have to recompile, install, etc., then record and report.)
(In reply to comment #5) This is what happens when HT (802.11n) is disabled: PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data. 64 bytes from 192.168.1.1: icmp_seq=1 ttl=150 time=1.00 ms 64 bytes from 192.168.1.1: icmp_seq=2 ttl=150 time=1.10 ms 64 bytes from 192.168.1.1: icmp_seq=3 ttl=150 time=2.64 ms 64 bytes from 192.168.1.1: icmp_seq=4 ttl=150 time=1.00 ms 64 bytes from 192.168.1.1: icmp_seq=5 ttl=150 time=1.00 ms 64 bytes from 192.168.1.1: icmp_seq=6 ttl=150 time=1.01 ms 64 bytes from 192.168.1.1: icmp_seq=7 ttl=150 time=0.989 ms 64 bytes from 192.168.1.1: icmp_seq=8 ttl=150 time=1.09 ms 64 bytes from 192.168.1.1: icmp_seq=9 ttl=150 time=0.997 ms 64 bytes from 192.168.1.1: icmp_seq=10 ttl=150 time=0.980 ms [approximate 15 second pause] ping: sendmsg: No buffer space available ping: sendmsg: No buffer space available ping: sendmsg: No buffer space available ping: sendmsg: No buffer space available ping: sendmsg: No buffer space available ^C --- 192.168.1.1 ping statistics --- 15 packets transmitted, 10 received, 33% packet loss, time 33018ms rtt min/avg/max/mdev = 0.980/1.183/2.648/0.491 ms Now I'm at work (in my office building) where a scan shows: # iwlist wlan0 scan wlan0 Scan completed : Cell 01 - Address: 00:17:DF:A9:3C:10 ESSID:"fgz" Mode:Master Channel:1 Frequency:2.412 GHz (Channel 1) Quality=58/100 Signal level=-73 dBm Noise level=-127 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s 11 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s; 36 Mb/s 48 Mb/s; 54 Mb/s Extra:tsf=00000063ddd5a178 Cell 02 - Address: 00:14:A8:00:AF:80 ESSID:"fgz" Mode:Master Channel:1 Frequency:2.412 GHz (Channel 1) Quality=57/100 Signal level=-74 dBm Noise level=-127 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s 11 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s; 36 Mb/s 48 Mb/s; 54 Mb/s Extra:tsf=0000005d0cd4d1ca Cell 03 - Address: 00:16:46:B8:D4:F0 ESSID:"fgz" Mode:Master Channel:1 Frequency:2.412 GHz (Channel 1) Quality=45/100 Signal level=-82 dBm Noise 
level=-127 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s 11 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s; 36 Mb/s 48 Mb/s; 54 Mb/s Extra:tsf=000000592ccf07a0 Cell 04 - Address: 00:11:5C:93:C9:C0 ESSID:"fgz" Mode:Master Channel:1 Frequency:2.412 GHz (Channel 1) Quality=100/100 Signal level=-29 dBm Noise level=-94 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s 11 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s; 36 Mb/s 48 Mb/s; 54 Mb/s Extra:tsf=000000279cc8a198 Cell 05 - Address: 00:1B:2A:AB:C7:60 ESSID:"fgz" Mode:Master Channel:1 Frequency:2.412 GHz (Channel 1) Quality=31/100 Signal level=-90 dBm Noise level=-127 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 11 Mb/s; 12 Mb/s 18 Mb/s; 24 Mb/s; 36 Mb/s; 48 Mb/s; 54 Mb/s Extra:tsf=0000169022857191 Cell 06 - Address: 00:16:46:B8:C1:A0 ESSID:"fgz" Mode:Master Channel:6 Frequency:2.437 GHz (Channel 6) Quality=72/100 Signal level=-62 dBm Noise level=-127 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s 11 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s; 36 Mb/s 48 Mb/s; 54 Mb/s Extra:tsf=000000279caaf1aa Cell 07 - Address: 00:16:46:B8:D4:60 ESSID:"fgz" Mode:Master Channel:6 Frequency:2.437 GHz (Channel 6) Quality=26/100 Signal level=-93 dBm Noise level=-127 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s 11 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s; 36 Mb/s 48 Mb/s; 54 Mb/s Extra:tsf=0000006016d724d6 Cell 08 - Address: 00:16:46:B8:BA:90 ESSID:"fgz" Mode:Master Channel:6 Frequency:2.437 GHz (Channel 6) Quality=28/100 Signal level=-92 dBm Noise level=-127 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s 11 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s; 36 Mb/s 48 Mb/s; 54 Mb/s Extra:tsf=00000058ee9552b7 Cell 09 - Address: 00:14:A8:5F:B1:E0 ESSID:"fgz" Mode:Master Channel:6 Frequency:2.437 GHz (Channel 6) Quality=42/100 Signal level=-84 dBm Noise level=-127 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s 11 Mb/s; 12 Mb/s; 18 
Mb/s; 24 Mb/s; 36 Mb/s 48 Mb/s; 54 Mb/s Extra:tsf=00000058eef18192 Cell 10 - Address: 00:14:A8:7E:19:A0 ESSID:"fgz" Mode:Master Channel:6 Frequency:2.437 GHz (Channel 6) Quality=81/100 Signal level=-53 dBm Noise level=-127 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s 11 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s; 36 Mb/s 48 Mb/s; 54 Mb/s Extra:tsf=0000005b09cdf55d Cell 11 - Address: 00:17:DF:AA:47:30 ESSID:"fgz" Mode:Master Channel:11 Frequency:2.462 GHz (Channel 11) Quality=55/100 Signal level=-75 dBm Noise level=-127 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s 11 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s; 36 Mb/s 48 Mb/s; 54 Mb/s Extra:tsf=00000063df687ab2 Cell 12 - Address: 00:16:46:B8:BE:C0 ESSID:"fgz" Mode:Master Channel:11 Frequency:2.462 GHz (Channel 11) Quality=48/100 Signal level=-80 dBm Noise level=-127 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s 11 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s; 36 Mb/s 48 Mb/s; 54 Mb/s Extra:tsf=0000005b06c3d193 Cell 13 - Address: 00:14:A8:7E:1B:D0 ESSID:"fgz" Mode:Master Channel:11 Frequency:2.462 GHz (Channel 11) Quality=43/100 Signal level=-83 dBm Noise level=-127 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s 11 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s; 36 Mb/s 48 Mb/s; 54 Mb/s Extra:tsf=00000058eedba217 Cell 14 - Address: 00:16:46:B8:BF:50 ESSID:"fgz" Mode:Master Channel:11 Frequency:2.462 GHz (Channel 11) Quality=48/100 Signal level=-80 dBm Noise level=-127 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s 11 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s; 36 Mb/s 48 Mb/s; 54 Mb/s Extra:tsf=000000279cba920f Cell 15 - Address: 00:16:46:B8:BB:30 ESSID:"fgz" Mode:Master Channel:11 Frequency:2.462 GHz (Channel 11) Quality=35/100 Signal level=-88 dBm Noise level=-127 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s 11 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s; 36 Mb/s 48 Mb/s; 54 Mb/s Extra:tsf=000000279cb2c192 Cell 16 - 
Address: 00:17:DF:A9:3C:1F ESSID:"fgz" Mode:Master Channel:64 Frequency:5.32 GHz (Channel 64) Quality=37/100 Signal level=-87 dBm Noise level=-127 dBm Encryption key:off Bit Rates:6 Mb/s; 9 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s 36 Mb/s; 48 Mb/s; 54 Mb/s Extra:tsf=00000063ddf83835 Cell 17 - Address: 00:17:DF:AA:47:3F ESSID:"fgz" Mode:Master Channel:149 Frequency:5.745 GHz (Channel 149) Quality=38/100 Signal level=-86 dBm Noise level=-127 dBm Encryption key:off Bit Rates:6 Mb/s; 9 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s 36 Mb/s; 48 Mb/s; 54 Mb/s Extra:tsf=00000063df62c035 Cell 18 - Address: 00:16:46:B8:D5:60 ESSID:"fgz" Mode:Master Channel:1 Frequency:2.412 GHz (Channel 1) Quality=33/100 Signal level=-89 dBm Noise level=-127 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s 11 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s; 36 Mb/s 48 Mb/s; 54 Mb/s Extra:tsf=00000058ee7c58c4 Cell 19 - Address: 00:1B:2A:AB:82:00 ESSID:"fgz" Mode:Master Channel:6 Frequency:2.437 GHz (Channel 6) Quality=30/100 Signal level=-91 dBm Noise level=-127 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 11 Mb/s; 12 Mb/s 18 Mb/s; 24 Mb/s; 36 Mb/s; 48 Mb/s; 54 Mb/s Extra:tsf=0000107abd828344 Cell 20 - Address: 00:17:DF:A9:0E:B0 ESSID:"fgz" Mode:Master Channel:6 Frequency:2.437 GHz (Channel 6) Quality=38/100 Signal level=-86 dBm Noise level=-127 dBm Encryption key:off Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s 11 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s; 36 Mb/s 48 Mb/s; 54 Mb/s Extra:tsf=00000060dedaa481 I've confirmed with our network guys that Cells 01, 11, and 20 are 802.11n. I've tried to associate to Cell 01 (by specifying "ap 00:17:DF:A9:3C:10" to iwconfig and my system can _not_ associate, so, for now, I associate to one of the non-802.11n cells.
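With twenty cells sharing one ESSID, picking out a specific BSSID and its channel by eye is error-prone. The following is a minimal sketch, not part of the original report, of how the scan text above could be split into per-cell records. The function name `parse_scan` and the dictionary keys are hypothetical, and the regular expressions assume the wireless-tools `iwlist` field layout shown in this comment.

```python
import re

# A cell starts at "Cell NN - Address: <BSSID>" in the iwlist output shown above.
CELL_RE = re.compile(r"Cell (\d+) - Address: ([0-9A-Fa-f:]{17})")


def parse_scan(text):
    """Split `iwlist <iface> scan` output into one dict per cell.

    Works on both the normal multi-line output and a flattened copy,
    because it only relies on the "Cell NN - Address:" markers.
    """
    cells = []
    matches = list(CELL_RE.finditer(text))
    for i, m in enumerate(matches):
        # The body of a cell runs until the next cell marker (or end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[m.end():end]
        essid = re.search(r'ESSID:"([^"]*)"', body)
        channel = re.search(r"Channel:(\d+)", body)
        signal = re.search(r"Signal level=(-?\d+) dBm", body)
        cells.append({
            "cell": int(m.group(1)),
            "bssid": m.group(2).upper(),
            "essid": essid.group(1) if essid else None,
            "channel": int(channel.group(1)) if channel else None,
            "signal_dbm": int(signal.group(1)) if signal else None,
        })
    return cells


if __name__ == "__main__":
    sample = (
        'Cell 01 - Address: 00:17:DF:A9:3C:10 ESSID:"fgz" Mode:Master '
        "Channel:1 Frequency:2.412 GHz (Channel 1) Quality=58/100 "
        "Signal level=-73 dBm Noise level=-127 dBm "
        'Cell 16 - Address: 00:17:DF:A9:3C:1F ESSID:"fgz" Mode:Master '
        "Channel:64 Frequency:5.32 GHz (Channel 64) Quality=37/100 "
        "Signal level=-87 dBm Noise level=-127 dBm"
    )
    for cell in parse_scan(sample):
        print(cell)
```

Filtering the result on `channel > 14` would list only the 5 GHz cells, which are the ones that later turn out to be hard to associate with.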
1. Did you do iwlist scan _before_ making the association?
2. Did you use the full iwconfig (ap, channel, essid) to make the association?
3. Is 1x being used in your office?

Please enable 11n, load the driver with debug=0x04071009 (as a module parameter), and attach the dmesg (kernel) log.
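One way to confirm that a module parameter such as the debug mask requested above actually took effect is to read it back from sysfs. This is a minimal sketch under assumptions: the kernel exposes a module parameter under /sys/module/<module>/parameters/ only when the driver declares it with readable permissions, and whether the iwl4965 build in question does so for `debug` is not confirmed by this thread. The function names are hypothetical.

```python
from pathlib import Path


def read_module_param(module, param, sysfs_root="/sys/module"):
    """Return the parameter value as a string, or None if it is not exposed."""
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        # Module not loaded, or the parameter is not visible in sysfs.
        return None


def debug_mask_matches(value, expected):
    """Compare a sysfs value (decimal or 0x-prefixed hex text) to an int mask."""
    if value is None:
        return False
    try:
        return int(value, 0) == expected
    except ValueError:
        return False


if __name__ == "__main__":
    value = read_module_param("iwl4965", "debug")
    if value is None:
        print("debug parameter not visible in sysfs")
    else:
        print("debug =", value, "matches:", debug_mask_matches(value, 0x04071009))
```

If the parameter is not visible, the dmesg volume after loading the module is the remaining indicator that the debug mask was applied.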
(In reply to comment #7)
> 1. Did you do iwlist scan _before_ make the association?
> 2. Did you use full iwconfig (ap,channel,essid) to make the association?
> 3. Is 1x being used in your office?
>
> Please enable 11n, and load the driver with debug=0x04071009 (by module
> parameter), and attach the dmesg (kernel) log.

I'm so confused. I must apologize for starting is bug report and then entering a state of confusion. I'm sorry; please bear with me.

The 4.44.1.20 firmware _might_ be OK; I haven't seen a BUG yet.

My start script does a scan before association, but there is a 7-second delay in between. Are there timing constraints? (If so, what are they?)

I have added

options iwl4965 debug=0x04071009

to my /etc/modprobe.conf file, but do not know for sure if that has done the trick. When I rmmod iwl4965 and then modprobe iwl4965, I get:

iwl4965: Intel(R) Wireless WiFi Link 4965AGN driver for Linux, 1.2.26ks
iwl4965: Copyright(c) 2003-2008 Intel Corporation
ACPI: PCI Interrupt 0000:0c:00.0[A] -> GSI 17 (level, low) -> IRQ 17
PCI: Setting latency timer of device 0000:0c:00.0 to 64
iwl4965: Detected Intel Wireless WiFi Link 4965AGN
iwl4965: Tunable channels: 11 802.11bg, 13 802.11a channels
phy2: Selected rate control algorithm 'iwl-4965-rs'
ACPI: PCI Interrupt 0000:0c:00.0[A] -> GSI 17 (level, low) -> IRQ 17
firmware: requesting iwlwifi-4965-1.ucode
Registered led device: iwl-phy2:radio
Registered led device: iwl-phy2:assoc
Registered led device: iwl-phy2:RX
Registered led device: iwl-phy2:TX
iwl4965: TX Power requested while scanning!
ADDRCONF(NETDEV_UP): wlan0: link is not ready
wlan0: Initial auth_alg=0
wlan0: authenticate with AP 00:11:5c:93:c9:c0
wlan0: RX authentication from 00:11:5c:93:c9:c0 (alg=0 transaction=2 status=0)
wlan0: authenticated
wlan0: associate with AP 00:11:5c:93:c9:c0
wlan0: RX AssocResp from 00:11:5c:93:c9:c0 (capab=0x401 status=0 aid=37)
wlan0: associated
wlan0: CTS protection enabled (BSSID=00:11:5c:93:c9:c0)
ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
wlan0: Initial auth_alg=0
wlan0: authenticate with AP 00:11:5c:93:c9:c0
wlan0: RX authentication from 00:11:5c:93:c9:c0 (alg=0 transaction=2 status=0)
wlan0: authenticated
wlan0: associate with AP 00:11:5c:93:c9:c0
wlan0: RX ReassocResp from 00:11:5c:93:c9:c0 (capab=0x401 status=0 aid=37)
wlan0: associated

I do _not_ use the channel number, just the essid and ap. I now see that I _must_ use the channel: if I don't AND the channel of the AP is different in the iwconfig output, the iwconfig command seems to be ignored (and I do not get anything in dmesg). I will make this change, and I suspect that my wireless operation will go much smoother! Thanks.

I'm not sure what you mean by "1x". I asked my network guys and they said you might be referring to the draft version number. They said the Cisco equipment we are running is 802.11n draft standard 2.

Here are the things I'm confused about:

1. I must enable 11n because the driver will not work without it (Comment #6).
2. I CAN associate to the 11n APs, but not on the 5+ GHz channels. (Also, before, when I was not using the channel in the iwconfig command, I was having trouble when the last AP was different. That is fixed now. Thanks for helping work through this!) (dmesg for 5 GHz below)

(... and a couple off on a tangent ...)

3. With the 1.2.26ks, my txpower does list above 25 mW, where in the 1.1.17k era, I saw 500 mW listed.
4. There are 2 buildings where I work: "high-rise" has a mix of g and n, and "computer-center" has all n now. In neither have I seen any output indicating the ability to go faster than 54 Mb (under Windows, I've seen connects at 144 Mb).

dmesg output for "ifconfig wlan0 up":

ACPI: PCI Interrupt 0000:0c:00.0[A] -> GSI 17 (level, low) -> IRQ 17
Registered led device: iwl-phy2:radio
Registered led device: iwl-phy2:assoc
Registered led device: iwl-phy2:RX
Registered led device: iwl-phy2:TX
ADDRCONF(NETDEV_UP): wlan0: link is not ready

Then I do:

iwlist wlan0 scan | awk '/00:17:DF:A9:3C:1F/,/Channel/'

and get:

Cell 12 - Address: 00:17:DF:A9:3C:1F
ESSID:"fgz"
Mode:Master
Channel:64

Then I do:

iwconfig wlan0 essid fgz channel 64 ap 00:17:DF:A9:3C:1F

and get:

wlan0: Initial auth_alg=0
wlan0: authenticate with AP 00:17:df:a9:3c:1f
wlan0: Initial auth_alg=0
wlan0: authenticate with AP 00:17:df:a9:3c:1f
wlan0: authenticate with AP 00:17:df:a9:3c:1f
wlan0: authenticate with AP 00:17:df:a9:3c:1f
wlan0: authentication with AP 00:17:df:a9:3c:1f timed out

I'll continue for now with the 4.44.1.20 firmware. This and always using the channel when bringing up wlan0 should really help. Thanks.
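The lesson from this comment is that the association command should always carry the ESSID, the channel, and the AP address together. A minimal sketch of building that command line follows; the helper name `iwconfig_assoc_args` is hypothetical, and the argument order simply mirrors the `iwconfig wlan0 essid fgz channel 64 ap ...` invocation quoted above. The sketch only builds the argument list and does not run anything.

```python
import re

BSSID_RE = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$")


def iwconfig_assoc_args(iface, essid, channel, bssid):
    """Build an iwconfig argument list that pins essid, channel and AP at once."""
    if not BSSID_RE.match(bssid):
        raise ValueError("bssid must look like 00:17:DF:A9:3C:1F")
    if int(channel) <= 0:
        raise ValueError("channel must be a positive integer")
    return [
        "iwconfig", iface,
        "essid", essid,
        "channel", str(int(channel)),
        "ap", bssid.upper(),
    ]


if __name__ == "__main__":
    args = iwconfig_assoc_args("wlan0", "fgz", 64, "00:17:df:a9:3c:1f")
    # Printing instead of executing keeps the sketch side-effect free;
    # subprocess.run(args, check=True) would run it (root is required).
    print(" ".join(args))
```

Keeping the three settings in one call avoids the situation described above, where a stale channel from a previous association causes the new `ap` setting to be silently ignored.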
(In reply to comment #8) > I'm so confused. I must appologize for starting is bug report and then ................starting this bug report ..... > 3. With the 1.2.26ks, my txpower does list above 25 mW, where in the .............does _not_ list above 25 mW.......
(In reply to comment #9)
Hi,

Over the weekend, because of problems with my DHCP server, I ended up doing several "associations":

wlan0: Initial auth_alg=0
wlan0: authenticate with AP 00:0c:41:17:3b:2f
wlan0: RX authentication from 00:0c:41:17:3b:2f (alg=0 transaction=2 status=0)
wlan0: authenticated
wlan0: associate with AP 00:0c:41:17:3b:2f
wlan0: RX AssocResp from 00:0c:41:17:3b:2f (capab=0x401 status=0 aid=4)
wlan0: associated
wlan0: switched to short barker preamble (BSSID=00:0c:41:17:3b:2f)
ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
wlan0: no IPv6 routers present

Then (within a few seconds) I suddenly got:

wlan0: No ProbeResp from current AP 00:0c:41:17:3b:2f - assume out of range

Then a few seconds later, I got:

iwl4965: Error sending REPLY_STATISTICS_CMD: enqueue_hcmd failed: -5

Then I tried a scan and got "No scan results".

Then I did "ifconfig wlan0 down" and got many:

iwl4965: WARNING: Requesting MAC access during RFKILL wakes up NIC
iwl4965: WARNING: Requesting MAC access during RFKILL wakes up NIC
iwl4965: WARNING: Requesting MAC access during RFKILL wakes up NIC
...

Then an "ifconfig wlan0 up" returned:

SIOCSIFFLAGS: No such device

and dmesg had:

ACPI: PCI Interrupt 0000:0c:00.0[A] -> GSI 17 (level, low) -> IRQ 17
iwl4965: Radio disabled by SW RF kill (module parameter)

At that point, I had to rmmod and re-modprobe the iwl4965 module to get it out of that state and be able to use the wireless card.

Should this be a new bug report?
> iwl4965: Error wrong command queue 63 command id 0x0
> kernel BUG at drivers/net/wireless/iwlwifi/iwl4965-base.c:3465!

Do you see this problem with SLAB (I assume you use SLUB by default)?
(In reply to comment #11)
> > iwl4965: Error wrong command queue 63 command id 0x0
> > kernel BUG at drivers/net/wireless/iwlwifi/iwl4965-base.c:3465!
>
> Do you see this problem with SLAB (I assume you use SLUB by default)?

I am using SLAB by default:

$ grep '_SL[AU]B' .config
CONFIG_SLAB=y
# CONFIG_SLUB is not set
CONFIG_SLABINFO=y
# CONFIG_DEBUG_SLAB is not set

(I'm re-doing my config so I can do: zgrep '_SL[AU]B' /proc/config.gz)

So, yes, I do see this problem with SLAB.

BTW, it appears as if this BUG only happens with the (new) 228.57.1.21 firmware. I'm using the 4.44.1.20 now, where it does _not_ seem to happen. (I hit the 1% in comment #2.)

So, with the combination of the 4.44.1.20 firmware and the 2.6.26 kernel and associated iwl driver, I cannot a) associate with 5 GHz APs or b) connect at speeds higher than 54 Mb in the 802.11n environment (Cisco equipment).

I'm hoping you can comment: Are either of these issues known? Should I start another bug report on either? Are either addressed by the new firmware? I see there is a version of the new firmware that has the "-2" interface; will there be a driver version that supports that interface soon?

Thanks for any info and for working on the driver/firmware for the 4965agn card for Linux!
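The grep above can also be done programmatically, which is handy when collecting the same detail from several reporters. The following is a minimal sketch; the function names are hypothetical, and reading /proc/config.gz assumes the kernel was built with the in-kernel config exposed there, which is what the reporter says they are rebuilding to enable.

```python
import gzip
import re


def allocator_from_config(config_text):
    """Return "SLAB", "SLUB" or "SLOB" depending on which option is set to y."""
    for name in ("SLAB", "SLUB", "SLOB"):
        # Anchored match so CONFIG_SLABINFO=y is not mistaken for CONFIG_SLAB=y.
        if re.search(rf"^CONFIG_{name}=y$", config_text, re.MULTILINE):
            return name
    return None


def read_running_config(path="/proc/config.gz"):
    """Read the running kernel's config text; raises OSError if unavailable."""
    with gzip.open(path, "rt") as handle:
        return handle.read()


if __name__ == "__main__":
    sample = (
        "CONFIG_SLAB=y\n"
        "# CONFIG_SLUB is not set\n"
        "CONFIG_SLABINFO=y\n"
        "# CONFIG_DEBUG_SLAB is not set\n"
    )
    print(allocator_from_config(sample))
```

The sample text is the reporter's own `.config` excerpt, so the sketch reports the same allocator that the grep output shows.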
Created an attachment (id=1513) [details] Kernel trace image for bug at iwl4965-base.c:3465

I too am getting this error. The error happens when I shut down or go into suspend mode. I have attached the backtrace from my computer regarding this error. I have tried using both the SLUB and the SLAB allocator, but both give me the same error. I also tried firmware 228.57.1.21 and 4.44.1.20, both with the same crash. The bug seems to appear after I have associated with a wireless-N (5 GHz) network, but I am not 100% sure.

Kernel version: 2.6.26 on Gentoo Linux
*** Bug 1726 has been marked as a duplicate of this bug. ***
*** Bug 1744 has been marked as a duplicate of this bug. ***
Created an attachment (id=1563) [details] Log of kernel crash (netconsole dump) This kernel panic happens when I use iwl4965 in an area with Cisco 1250 APs (11n capable) and bluetooth enabled. I used netcfg2 in Arch Linux to associate to the AP and 11n is enabled in the kernel. I also see warnings earlier that look like this: WARNING: at include/../net/mac80211/rate.h:152 rs_get_rate+0x101/0x220 [iwlagn]() Call Trace: [<ffffffff8023a644>] warn_on_slowpath+0x64/0xb0 [<ffffffffa00267b6>] ieee80211_rx_bss_info+0x3c6/0xd90 [mac80211] [<ffffffffa00273f0>] ieee80211_rx_mgmt_beacon+0x1d0/0x220 [mac80211] [<ffffffffa006ece1>] rs_get_rate+0x101/0x220 [iwlagn] [<ffffffffa002cdc8>] rate_control_get_rate+0x88/0x190 [mac80211] [<ffffffffa00336b9>] invoke_tx_handlers+0x659/0xd30 [mac80211] [<ffffffffa0032adb>] __ieee80211_tx_prepare+0x17b/0x360 [mac80211] [<ffffffff804ad758>] pskb_expand_head+0xf8/0x170 [<ffffffffa0035428>] ieee80211_master_start_xmit+0x1c8/0x460 [mac80211] [<ffffffff804c811e>] __qdisc_run+0x21e/0x250 [<ffffffff804b4c23>] dev_queue_xmit+0x203/0x5a0 [<ffffffffa002a563>] ieee80211_associated+0x1a3/0x210 [mac80211] [<ffffffffa002c2e0>] ieee80211_sta_work+0x700/0x7b0 [mac80211] [<ffffffff802349bb>] finish_task_switch+0x2b/0xe0 [<ffffffff8056328f>] thread_return+0x3d/0x63e [<ffffffffa002bbe0>] ieee80211_sta_work+0x0/0x7b0 [mac80211] [<ffffffff8024d4b5>] run_workqueue+0x85/0x150 [<ffffffff8024d61f>] worker_thread+0x9f/0x110 [<ffffffff802519a0>] autoremove_wake_function+0x0/0x30 [<ffffffff8024d580>] worker_thread+0x0/0x110 [<ffffffff802515d7>] kthread+0x47/0x90 [<ffffffff802378b7>] schedule_tail+0x27/0x70 [<ffffffff8020d419>] child_rip+0xa/0x11 [<ffffffff80251590>] kthread+0x0/0x90 [<ffffffff8020d40f>] child_rip+0x0/0x11
Why is this severity "Enhancement"? I cannot use this firmware at all with this bug.
*** Bug 1755 has been marked as a duplicate of this bug. ***
Created an attachment (id=1582) [details] a patch to try
Still crashes with the patch on top of 2.6.27-rc7 (+tuxonice,color-printk,nvidia). However, the warnings at rs_get_rate no longer appear. This is all I could get over netconsole: ------------[ cut here ]------------ kernel BUG at drivers/net/wireless/iwlwifi/iwl-tx.c:1198! invalid opcode: 0000 [1] PREEMPT SMP CPU 1 Modules linked in: netconsole ipv6 btusb bluetooth joydev asus_oled usbhid mmc_block hid ff_memless uvcvideo compat_ioctl32 videodev v4l1_compat pcspkr psmouse serio_raw sg sr_mod cdrom sdhci_pci sdhci ohci1394 mmc_core ieee1394 ricoh_mmc iTCO_wdt video output uhci_hcd ehci_hcd intel_agp usbcore thermal fan button battery ac vboxdrv cpufreq_userspace cpufreq_ondemand cpufreq_conservative cpufreq_powersave cpufreq_stats acpi_cpufreq freq_table processor asus_laptop fuse nvidia(P) snd_seq_oss i2c_core snd_seq_midi_event snd_seq snd_seq_device evdev snd_hda_intel snd_hwdep snd_pcm_oss snd_pcm snd_timer snd_page_alloc snd_mixer_oss snd soundcore arc4 ecb iwlagn iwlcore rfkill led_class mac80211 cfg80211 r8169 rtc_cmos rtc_core rtc_lib [last unloaded: netconsole] Pid: 0, comm: swapper Tainted: P 2.6.27-rc7-blackice1 #1 RIP: 0010:[<ffffffffa0056473>] [<ffffffffa0056473>] iwl_tx_cmd_complete+0x2a3/0x2d0 [iwlcore] RSP: 0018:ffff88013f8c3e30 EFLAGS: 00010086 RAX: 0000000000000037 RBX: ffff88013e6c1900 RCX: 0000000000000000 RDX: ffff88013d6edd78 RSI: 0000000000000082 RDI: ffffffff80668d40 RBP: ffff88013e6c3fa0 R08: 0000000000000001 R09: 00010026fc4ce5a0 R10: 0000000000000000 R11: ffffffff802228c0 R12: 0000000000000001
Created an attachment (id=1587) [details] rx_dma debug patch
(In reply to comment #20)
> Still crashes with the patch on top of 2.6.27-rc7

Thanks! Could you try the debug patch above to help determine whether the root cause is rx_dma address related? Check whether the message below appears before the BUG message in your netconsole output:

"ERROR: Bad dma_addr is 0x..."
Nope. Both patches are applied.

NET: Registered protocol family 10
lo: Disabled Privacy Extensions
wlan0: no IPv6 routers present
eth0: no IPv6 routers present
*time passes*
iwlagn: Error wrong command queue 63 command id 0x0
------------[ cut here ]------------
p thermal fan button battery ac cpufreq_userspace cpufreq_ show_trace_log_lvl+0x54/0x80
<4> [<ffffffff80245454>] lock_timer_base+0x34/0x70
<4> [<ffffffff8043d162>] do_unblank_screen+0x92/0x150
<4> [<ffffffff803d830d>] bust_spinlocks+0x1d/0x40
<4> [<ffffffff8020db3c>] oops_end+0x2c/0x90
<4> [<ffffffff8020f3c6>] do_invalid_op+0x86/0xa0
<4> [<ffffffffa00564d3>] iwl_tx_cmd_complete+0x2a3/0x2d0 [iwlcore]
<4> [<ffffffff805628a8>] printk+0x4000003f
<4> [<ffffffff8040c6a5>] aceq evdev snd_seq_device snd_hda_in

There are a lot more lines on my screen console before this, but they didn't make it over netconsole (it looks like dmesg history). The screen pointed to a BUG at iwl-tx.c:1198.
*** Bug 1768 has been marked as a duplicate of this bug. ***
*** Bug 1770 has been marked as a duplicate of this bug. ***
Created an attachment (id=1624) [details] iwl kernel panic I am running Ubuntu 8.04.1 on my laptop. I think the iwl code is coming in from linux-backports-modules-2.6.24-21-generic. This /just/ started happening. Is it possible this bug was not in 2.6.24-19, but is in 2.6.24-21? (attached screenshot)
OK. I tested with 2.6.24-19 and it works fine, but with 2.6.24-21 it kernel panics 1-2 minutes after I start using it.
If you still need more information please let me know and I will see what I can do.
Also reported in linux-wireless mailing list - http://marc.info/?l=linux-wireless&m=122487466302285&w=2
(In reply to comment #29) > Also reported in linux-wireless mailing list - > http://marc.info/?l=linux-wireless&m=122487466302285&w=2 > Per a suggestion on linux-wireless, I've tried 2.6.28-rc1. It continues to show similar problems, but thus far has not actually hung the system. dmesg output is attached, in case it helps.
Created an attachment (id=1633) [details] 2.6.28-rc1 dmesg
Another user reported issue on linux-wireless ML. http://marc.info/?l=linux-wireless&m=122531601713439&w=2 Below is dmesg of error: WARNING: at drivers/net/wireless/iwlwifi/iwl-tx.c:1241 iwl_tx_cmd_complete+0x50/0x1f0 \ [iwlcore]() wrong command queue 31, command id 0x0 Modules linked in: tun aes_x86_64 aes_generic af_packet i915 drm binfmt_misc rfcomm \ sco bnep l2cap bridge stp llc kvm_intel kvm ipv6 ip_tables x_tables tpm_tis tpm \ tpm_bios fuse loop btusb bluetooth pcmcia arc4 ecb cryptomgr aead crypto_blkcipher \ crypto_algapi iwlagn iwlcore thinkpad_acpi sdhci_pci sdhci rfkill backlight mac80211 \ firewire_ohci firewire_core led_class nvram mmc_core sg piix ide_core crc_itu_t \ cfg80211 ehci_hcd yenta_socket rsrc_nonstatic pcmcia_core uhci_hcd usbcore e1000e \ evdev unix Pid: 0, comm: swapper Tainted: G W 2.6.28-rc2-wl #1 Call Trace: <IRQ> [<ffffffff80233f6c>] warn_slowpath+0xae/0xd5 [<ffffffff80252c00>] ? trace_hardirqs_off_caller+0x8/0x9f [<ffffffff80256110>] ? print_lock_contention_bug+0x1e/0x110 [<ffffffff80256110>] ? print_lock_contention_bug+0x1e/0x110 [<ffffffff80256110>] ? print_lock_contention_bug+0x1e/0x110 [<ffffffff80256110>] ? print_lock_contention_bug+0x1e/0x110 [<ffffffff80256110>] ? print_lock_contention_bug+0x1e/0x110 [<ffffffff80256110>] ? print_lock_contention_bug+0x1e/0x110 [<ffffffff80211dc4>] ? native_sched_clock+0x76/0x88 [<ffffffff80252ca4>] ? trace_hardirqs_off+0xd/0xf [<ffffffff80252c19>] ? trace_hardirqs_off_caller+0x21/0x9f [<ffffffffa0156ab4>] iwl_tx_cmd_complete+0x50/0x1f0 [iwlcore] [<ffffffffa016c251>] iwl_rx_handle+0x127/0x226 [iwlagn] [<ffffffffa016c562>] iwl4965_irq_tasklet+0x212/0x2c9 [iwlagn] [<ffffffff802389b8>] tasklet_action+0x7f/0xda [<ffffffff8023926a>] __do_softirq+0x8d/0x163 [<ffffffff8020c82c>] call_softirq+0x1c/0x28 [<ffffffff8020dd65>] do_softirq+0x39/0x8a [<ffffffff80238dcc>] irq_exit+0x4e/0x91 [<ffffffff8020e059>] do_IRQ+0x150/0x173 [<ffffffff8020b9cb>] ret_from_intr+0x0/0xf <EOI> [<ffffffff8038b846>] ? 
acpi_idle_enter_simple+0x1a4/0x21f [<ffffffff802542ed>] ? trace_hardirqs_on+0xd/0xf [<ffffffff8038b850>] ? acpi_idle_enter_simple+0x1ae/0x21f [<ffffffff8038b846>] ? acpi_idle_enter_simple+0x1a4/0x21f [<ffffffff8038b420>] ? acpi_idle_enter_bm+0xd1/0x353 [<ffffffff803f9779>] ? cpuidle_idle_call+0x94/0xcf [<ffffffff8020a5f7>] ? cpu_idle+0x54/0x9d [<ffffffff80469018>] ? rest_init+0x5c/0x5e ---[ end trace c15dac81b0f1f4b9 ]---
I'm seeing this as well, using latest wireless-testing. Mind you I have my 11d patch in but this shouldn't affect iwlwifi, I hit the same WARNING at the same place [ 222.180781] ------------[ cut here ]------------ [ 222.180789] WARNING: at drivers/net/wireless/iwlwifi/iwl-tx.c:1241 iwl_tx_cmd_complete+0x256/0x260 [iwlcore]() [ 222.180796] wrong command queue 31, command id 0x0 [ 222.180801] Modules linked in: af_packet binfmt_misc rfcomm l2cap uinput ipv6 acpi_cpufreq cpufreq_userspace c pufreq_conservative cpufreq_ondemand cpufreq_powersave cpufreq_stats freq_table pci_slot sbs sbshc container ipta ble_filter ip_tables x_tables sbp2 snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm arc4 snd_seq_dummy ecb joydev snd_seq_oss iwlagn iwlcore pcmcia mac80211 snd_seq_midi snd_rawmidi sdhci_pci sdhci yenta_socket snd_seq_midi_eve nt btusb mmc_core rsrc_nonstatic iTCO_wdt iTCO_vendor_support psmouse serio_raw cfg80211 ricoh_mmc snd_seq snd_ti mer snd_seq_device pcmcia_core bluetooth ac battery intel_agp video output snd soundcore wmi button agpgart snd_p age_alloc shpchp pci_hotplug thinkpad_acpi rfkill hwmon led_class nvram evdev ext3 jbd mbcache sr_mod cdrom sg sd _mod crc_t10dif ata_piix pata_acpi ohci1394 ieee1394 ata_generic ahci libata scsi_mod e1000e uhci_hcd ehci_hcd us bcore thermal processor fan fuse [ 222.181031] Pid: 0, comm: swapper Not tainted 2.6.28-rc2-wl #11 [ 222.181036] Call Trace: [ 222.181051] [<c0135270>] warn_slowpath+0x60/0x80 [ 222.181068] [<c0151f2b>] ? getnstimeofday+0x4b/0x100 [ 222.181078] [<c015bd94>] ? __lock_acquire+0x2a4/0xf90 [ 222.181087] [<c015564a>] ? clockevents_program_event+0x9a/0x150 [ 222.181107] [<f7eefd96>] iwl_tx_cmd_complete+0x256/0x260 [iwlcore] [ 222.181124] [<c0262781>] ? _raw_spin_lock+0x41/0x120 [ 222.181141] [<f7f03bf6>] iwl_rx_handle+0xe6/0x270 [iwlagn] [ 222.181157] [<f7f0579d>] iwl_irq_tasklet+0x1cd/0x2e0 [iwlagn] [ 222.181167] [<c015b85b>] ? 
trace_hardirqs_on+0xb/0x10 [ 222.181176] [<c013a585>] tasklet_action+0x75/0x100 [ 222.181185] [<c013aa27>] __do_softirq+0xa7/0x180 [ 222.181193] [<c013ab85>] do_softirq+0x85/0x90 [ 222.181201] [<c013ad15>] irq_exit+0x65/0xa0 [ 222.181210] [<c0106953>] do_IRQ+0x83/0xa0 [ 222.181218] [<c0105298>] common_interrupt+0x28/0x30 [ 222.181227] [<c015007b>] ? pm_qos_power_open+0x2b/0xc0 [ 222.181248] [<f7d54317>] ? acpi_idle_enter_bm+0x276/0x2c5 [processor] [ 222.181259] [<c02e38c4>] cpuidle_idle_call+0x74/0xd0 [ 222.181267] [<c0102862>] cpu_idle+0x72/0xd0 [ 222.181276] [<c0380cfb>] start_secondary+0x196/0x1fb [ 222.181282] ---[ end trace 9a92be5adb8fe71a ]--- [ 303.504927] ------------[ cut here ]------------ [ 303.504936] WARNING: at drivers/net/wireless/iwlwifi/iwl-tx.c:1241 iwl_tx_cmd_complete+0x256/0x260 [iwlcore]() [ 303.504943] wrong command queue 31, command id 0x0 [ 303.504948] Modules linked in: af_packet binfmt_misc rfcomm l2cap uinput ipv6 acpi_cpufreq cpufreq_userspace c pufreq_conservative cpufreq_ondemand cpufreq_powersave cpufreq_stats freq_table pci_slot sbs sbshc container ipta ble_filter ip_tables x_tables sbp2 snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm arc4 snd_seq_dummy ecb joydev snd_seq_oss iwlagn iwlcore pcmcia mac80211 snd_seq_midi snd_rawmidi sdhci_pci sdhci yenta_socket snd_seq_midi_eve nt btusb mmc_core rsrc_nonstatic iTCO_wdt iTCO_vendor_support psmouse serio_raw cfg80211 ricoh_mmc snd_seq snd_ti mer snd_seq_device pcmcia_core bluetooth ac battery intel_agp video output snd soundcore wmi button agpgart snd_p age_alloc shpchp pci_hotplug thinkpad_acpi rfkill hwmon led_class nvram evdev ext3 jbd mbcache sr_mod cdrom sg sd _mod crc_t10dif ata_piix pata_acpi ohci1394 ieee1394 ata_generic ahci libata scsi_mod e1000e uhci_hcd ehci_hcd us bcore thermal processor fan fuse [ 303.505176] Pid: 0, comm: swapper Tainted: G W 2.6.28-rc2-wl #11 [ 303.505182] Call Trace: [ 303.505196] [<c0135270>] warn_slowpath+0x60/0x80 [ 303.505211] [<c012ed14>] ? 
try_to_wake_up+0x104/0x290 [ 303.505223] [<c0151f2b>] ? getnstimeofday+0x4b/0x100 [ 303.505231] [<c0115003>] ? lapic_next_event+0x13/0x20 [ 303.505240] [<c015564a>] ? clockevents_program_event+0x9a/0x150 [ 303.505249] [<c0151f2b>] ? getnstimeofday+0x4b/0x100 [ 303.505257] [<c0115003>] ? lapic_next_event+0x13/0x20 [ 303.505265] [<c015564a>] ? clockevents_program_event+0x9a/0x150 [ 303.505274] [<c0156933>] ? tick_dev_program_event+0x33/0xc0 [ 303.505294] [<f7eefd96>] iwl_tx_cmd_complete+0x256/0x260 [iwlcore] [ 303.505311] [<c0262781>] ? _raw_spin_lock+0x41/0x120 [ 303.505329] [<f7f03bf6>] iwl_rx_handle+0xe6/0x270 [iwlagn] [ 303.505344] [<f7f0579d>] iwl_irq_tasklet+0x1cd/0x2e0 [iwlagn] [ 303.505360] [<f7f05d94>] ? iwl_isr+0x24/0x110 [iwlagn] [ 303.505370] [<c013a585>] tasklet_action+0x75/0x100 [ 303.505378] [<c013aa27>] __do_softirq+0xa7/0x180 [ 303.505387] [<c013ab85>] do_softirq+0x85/0x90 [ 303.505395] [<c013ad15>] irq_exit+0x65/0xa0 [ 303.505403] [<c0106953>] do_IRQ+0x83/0xa0 [ 303.505411] [<c0105298>] common_interrupt+0x28/0x30 [ 303.505433] [<f7d54317>] ? acpi_idle_enter_bm+0x276/0x2c5 [processor] [ 303.505443] [<c02e38c4>] cpuidle_idle_call+0x74/0xd0 [ 303.505451] [<c0102862>] cpu_idle+0x72/0xd0 [ 303.505460] [<c0374e3e>] rest_init+0x4e/0x60 [ 303.505465] ---[ end trace 9a92be5adb8fe71a ]--- This repeats over and over. Let me know if you want me to test any patches.
Oh, in case it helps this is with an 11g AP, and I have a very noisy environment here at work, probably over 150 APs around me.
Created an attachment (id=1663) [details] a patch to try For those who can reproduce this bug easily, please help verify whether this patch fixes the problem for you.
Created an attachment (id=1666) [details] dmesg from 2.6.28-rc4 This is a dmesg log from 2.6.28-rc4 with your patch. Please let me know if you would like to test something else (other kernel version, etc.)
Thanks Karol. Please try the patch below against the mainline 2.6.28-rc4 kernel. It adds the single frame bit and dumps 256 bytes of the RX buffer when the error occurs.
Created an attachment (id=1670) [details] 2nd try patch against mainline 2.6.28-rc4 kernel.
Created an attachment (id=1672) [details] 3rd try This patch adds the remove_sta debug code to the 2nd patch. Please try this one.
Created an attachment (id=1673) [details] new dmesg from 2.6.28-rc4
Regression bug has been created for this issue at http://bugzilla.kernel.org/show_bug.cgi?id=11983. Please update that bug when this issue has been resolved.
(In reply to comment #39) > Created an attachment (id=1672) [details] > 3rd try > > This patch add the remove_sta debug code to 2nd patch. Please try this one. Thank you, I think this one fixes it here. 14 hours of uptime without a crash now.
The latest patch didn't work for me on v2.6.28-rc4-322-g58e20d8 I get the second warning once a minute. ------------[ cut here ]------------ WARNING: at drivers/net/wireless/iwlwifi/iwl-tx.c:1208 iwl_tx_cmd_complete+0x2ad/0x330 [iwlcore]() wrong command queue 31, command id 0x0 Modules linked in: ipv6 vboxdrv asus_oled btusb usbhid hid bluetooth joydev uvcvideo compat_ioctl32 videodev v4l1_compat mmc_block pcspkr psmouse serio_raw sg sr_mod cdrom sdhci_pci sdhci mmc_core ohci1394 ieee1394 iTCO_wdt video output ehci_hcd uhci_hcd usbcore intel_agp thermal fan button battery ac cpufreq_userspace cpufreq_ondemand cpufreq_conservative cpufreq_powersave cpufreq_stats acpi_cpufreq freq_table processor asus_laptop fuse nvidia(P) i2c_core evdev snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_hda_intel snd_hwdep snd_pcm_oss snd_pcm snd_timer snd_page_alloc snd_mixer_oss snd soundcore arc4 ecb iwlagn iwlcore rfkill led_class mac80211 cfg80211 r8169 mii rtc_cmos rtc_core rtc_lib Pid: 7938, comm: VirtualBox Tainted: P W 2.6.28-rc4-blackice1 #1 Call Trace: <IRQ> [<ffffffff80245b27>] warn_slowpath+0xb7/0xf0 [<ffffffff8023e37b>] ? enqueue_task_fair+0x3b/0x80 [<ffffffff803f1ec9>] ? __next_cpu+0x19/0x30 [<ffffffff8023e1ba>] ? enqueue_entity+0x11a/0x2a0 [<ffffffff8023e3bb>] ? enqueue_task_fair+0x7b/0x80 [<ffffffff8059b305>] ? _spin_unlock_irqrestore+0x55/0x60 [<ffffffff80241367>] ? try_to_wake_up+0x107/0x330 [<ffffffff8059b2d4>] ? _spin_unlock_irqrestore+0x24/0x60 [<ffffffffa007474d>] iwl_tx_cmd_complete+0x2ad/0x330 [iwlcore] [<ffffffff80265739>] ? 
getnstimeofday+0x49/0xc0 [<ffffffffa008c7d2>] iwl_rx_handle+0xf2/0x2d0 [iwlagn] [<ffffffffa008cbbe>] iwl4965_irq_tasklet+0x20e/0x380 [iwlagn] [<ffffffff8024b6e5>] tasklet_action+0x75/0x100 [<ffffffff8024c1ac>] __do_softirq+0x9c/0x180 [<ffffffff8020d90c>] call_softirq+0x1c/0x30 [<ffffffff8020f305>] do_softirq+0x75/0xb0 [<ffffffff8024bc2d>] irq_exit+0x9d/0xb0 [<ffffffff8020f5b9>] do_IRQ+0xc9/0x110 [<ffffffff8020ca3b>] ret_from_intr+0x0/0xf <EOI> [<ffffffffa0afce03>] ? g_abExecMemory+0xb3a3/0x180000 [vboxdrv] [<ffffffff80211b86>] ? IRQ0xe1_interrupt+0x0/0xa [<ffffffffa0afce03>] ? g_abExecMemory+0xb3a3/0x180000 [vboxdrv] [<ffffffffa0afcdac>] ? g_abExecMemory+0xb34c/0x180000 [vboxdrv] [<ffffffff80211b86>] ? IRQ0xe1_interrupt+0x0/0xa [<ffffffffa0afdbf5>] ? g_abExecMemory+0xc195/0x180000 [vboxdrv] [<ffffffffa0ae763f>] ? supdrvIOCtlFast+0x7f/0x90 [vboxdrv] [<ffffffffa0ae7234>] ? VBoxDrvLinuxIOCtl+0x44/0x1e0 [vboxdrv] [<ffffffff802e3c51>] ? vfs_ioctl+0x31/0xa0 [<ffffffff802e3d34>] ? do_vfs_ioctl+0x74/0x470 [<ffffffff802e415e>] ? sys_ioctl+0x2e/0xa0 [<ffffffff8023d680>] ? sub_preempt_count+0x60/0x90 [<ffffffff802e41c9>] ? sys_ioctl+0x99/0xa0 [<ffffffffa0afce03>] ? g_abExecMemory+0xb3a3/0x180000 [vboxdrv] [<ffffffff8020c4fb>] ? 
system_call_fastpath+0x16/0x1b ---[ end trace 7a38aab7c293e2f9 ]--- txq[0] readp=232 writep=232 txq[1] readp=0 writep=0 txq[2] readp=92 writep=92 txq[3] readp=0 writep=0 txq[4] readp=28 writep=28 txq[5] readp=0 writep=0 txq[6] readp=0 writep=0 txq[7] readp=0 writep=0 txq[8] readp=0 writep=0 txq[9] readp=0 writep=0 txq[10] readp=0 writep=0 txq[11] readp=0 writep=0 txq[12] readp=0 writep=0 txq[13] readp=0 writep=0 txq[14] readp=0 writep=0 txq[15] readp=0 writep=0 txq[16] readp=0 writep=0 txq[17] readp=0 writep=0 txq[18] readp=0 writep=0 txq[19] readp=0 writep=0 ------------[ cut here ]------------ WARNING: at include/net/mac80211.h:1883 rs_get_rate+0x91/0x1a0 [iwlagn]() Modules linked in: ipv6 vboxdrv asus_oled btusb usbhid hid bluetooth joydev uvcvideo compat_ioctl32 videodev v4l1_compat mmc_block pcspkr psmouse serio_raw sg sr_mod cdrom sdhci_pci sdhci mmc_core ohci1394 ieee1394 iTCO_wdt video output ehci_hcd uhci_hcd usbcore intel_agp thermal fan button battery ac cpufreq_userspace cpufreq_ondemand cpufreq_conservative cpufreq_powersave cpufreq_stats acpi_cpufreq freq_table processor asus_laptop fuse nvidia(P) i2c_core evdev snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_hda_intel snd_hwdep snd_pcm_oss snd_pcm snd_timer snd_page_alloc snd_mixer_oss snd soundcore arc4 ecb iwlagn iwlcore rfkill led_class mac80211 cfg80211 r8169 mii rtc_cmos rtc_core rtc_lib Pid: 581, comm: iwlagn Tainted: P W 2.6.28-rc4-blackice1 #1 Call Trace: [<ffffffff80245bbf>] warn_on_slowpath+0x5f/0x90 [<ffffffff80221d48>] ? smp_apic_timer_interrupt+0x88/0xc0 [<ffffffff8024bbf1>] ? irq_exit+0x61/0xb0 [<ffffffff8024be0d>] ? local_bh_disable+0xd/0x10 [<ffffffff8059ae11>] ? _spin_lock_bh+0x11/0x40 [<ffffffffa003b1c8>] ? ieee80211_rx_bss_get+0xa8/0xc0 [mac80211] [<ffffffffa003b1c8>] ? ieee80211_rx_bss_get+0xa8/0xc0 [mac80211] [<ffffffff8024bea5>] ? local_bh_enable_ip+0x95/0x120 [<ffffffffa008e2b1>] rs_get_rate+0x91/0x1a0 [iwlagn] [<ffffffffa003b830>] ? 
ieee80211_rx_bss_put+0x20/0x100 [mac80211] [<ffffffffa003b8f5>] ? ieee80211_rx_bss_put+0xe5/0x100 [mac80211] [<ffffffffa004317c>] rate_control_get_rate+0xac/0x150 [mac80211] [<ffffffffa0049f4e>] invoke_tx_handlers+0x6ae/0xda0 [mac80211] [<ffffffff8059b2d4>] ? _spin_unlock_irqrestore+0x24/0x60 [<ffffffffa0049323>] ? __ieee80211_tx_prepare+0x193/0x380 [mac80211] [<ffffffffa004bed4>] ieee80211_master_start_xmit+0x234/0x5e0 [mac80211] [<ffffffff804e4985>] dev_hard_start_xmit+0x255/0x2e0 [<ffffffff804f89de>] __qdisc_run+0x23e/0x290 [<ffffffff804e4df8>] dev_queue_xmit+0x2b8/0x5b0 [<ffffffffa004d5f4>] ieee80211_tx_skb+0x64/0x70 [mac80211] [<ffffffffa003ea87>] ieee80211_send_probe_req+0x1a7/0x220 [mac80211] [<ffffffffa0041210>] ? ieee80211_sta_work+0x0/0x840 [mac80211] [<ffffffffa003ec69>] ieee80211_associated+0x169/0x1d0 [mac80211] [<ffffffff804dbfde>] ? skb_dequeue+0x5e/0x80 [<ffffffffa00417e3>] ieee80211_sta_work+0x5d3/0x840 [mac80211] [<ffffffff8023d680>] ? sub_preempt_count+0x60/0x90 [<ffffffff80598791>] ? thread_return+0x81/0x720 [<ffffffff80259509>] ? run_workqueue+0x19/0x170 [<ffffffff8025956e>] ? run_workqueue+0x7e/0x170 [<ffffffffa0041210>] ? ieee80211_sta_work+0x0/0x840 [mac80211] [<ffffffffa0041210>] ? ieee80211_sta_work+0x0/0x840 [mac80211] [<ffffffff80259589>] run_workqueue+0x99/0x170 [<ffffffff80259707>] worker_thread+0xa7/0x120 [<ffffffff8025dd00>] ? autoremove_wake_function+0x0/0x40 [<ffffffff80259660>] ? worker_thread+0x0/0x120 [<ffffffff8025d849>] kthread+0x49/0x90 [<ffffffff8020d5a9>] child_rip+0xa/0x11 [<ffffffff8023b95b>] ? finish_task_switch+0x2b/0xe0 [<ffffffff8059afc6>] ? _spin_unlock_irq+0x16/0x40 [<ffffffff8020ca95>] ? restore_args+0x0/0x30 [<ffffffff8059ad66>] ? _spin_lock+0x16/0x40 [<ffffffff8025d800>] ? kthread+0x0/0x90 [<ffffffff8020d59f>] ? child_rip+0x0/0x11 ---[ end trace 7a38aab7c293e2f9 ]--- dhcpcd[9523]: wlan0: carrier lost IWL DEBUG: iwl_clear_stations_table is called IWL DEBUG: iwl_clear_stations_table is called
Created an attachment (id=1679) [details] driver patch to roll back to use "-1" version ucode Please apply this patch to 2.6.28-rc4 kernel. Then install the iwlwifi-4965-ucode-4.44.1.20 version firmware from http://intellinuxwireless.org/iwlwifi/downloads/iwlwifi-4965-ucode-4.44.1.20.tgz. See if the problem still happens.
I have applied the patch ("driver patch to roll back to use "-1" version ucode ") to 2.6.28-rc4 and 2.6.28-rc5 and both exhibit the same problem. I cannot associate with my linksys wireless-N(5ghz) accesspoint when the patch is applied. Without the patch, association is fine. Post-patch output: [ 665.720756] wlan0: direct probe to AP 00:21:29:6c:2d:ea try 1 [ 665.920090] wlan0: direct probe to AP 00:21:29:6c:2d:ea try 2 [ 666.120100] wlan0: direct probe to AP 00:21:29:6c:2d:ea try 3 [ 666.320081] wlan0: direct probe to AP 00:21:29:6c:2d:ea timed out NetworkManager: <info> Old device 'wlan0' activating, won't change. [ 677.654189] wlan0: direct probe to AP 00:21:29:6c:2d:ea try 1 [ 677.660814] wlan0: direct probe to AP 00:21:29:6c:2d:ea try 1 [ 677.860068] wlan0: direct probe to AP 00:21:29:6c:2d:ea try 2 [ 678.061057] wlan0: direct probe to AP 00:21:29:6c:2d:ea try 3 [ 678.260066] wlan0: direct probe to AP 00:21:29:6c:2d:ea timed out NetworkManager: <info> Old device 'wlan0' activating, won't change. [ 689.593349] wlan0: direct probe to AP 00:21:29:6c:2d:ea try 1 [ 689.600059] wlan0: direct probe to AP 00:21:29:6c:2d:ea try 1 [ 689.800076] wlan0: direct probe to AP 00:21:29:6c:2d:ea try 2 [ 690.000069] wlan0: direct probe to AP 00:21:29:6c:2d:ea try 3 [ 690.200066] wlan0: direct probe to AP 00:21:29:6c:2d:ea timed out NetworkManager: <info> Old device 'wlan0' activating, won't change. [ 701.503828] wlan0: direct probe to AP 00:21:29:6c:2d:ea try 1 [ 701.509195] wlan0: direct probe to AP 00:21:29:6c:2d:ea try 1 [ 701.709108] wlan0: direct probe to AP 00:21:29:6c:2d:ea try 2 [ 701.909069] wlan0: direct probe to AP 00:21:29:6c:2d:ea try 3 [ 702.109068] wlan0: direct probe to AP 00:21:29:6c:2d:ea timed out NetworkManager: <info> Activation (wlan0/wireless): association took too long (>40s), failing activation. NetworkManager: <info> Activation (wlan0) failure scheduled... 
NetworkManager: <info> Activation (wlan0) failed for access point (MMNet5) NetworkManager: <info> Activation (wlan0) failed. Patch-less output: [ 130.046412] wlan0: authenticate with AP 00:21:29:6c:2d:ea [ 130.246052] wlan0: authenticate with AP 00:21:29:6c:2d:ea [ 130.446055] wlan0: authenticate with AP 00:21:29:6c:2d:ea [ 130.646102] wlan0: authentication with AP 00:21:29:6c:2d:ea timed out NetworkManager: <info> Old device 'wlan0' activating, won't change. [ 141.905558] wlan0: authenticate with AP 00:21:29:6c:2d:ea [ 141.912429] wlan0: authenticate with AP 00:21:29:6c:2d:ea [ 142.112088] wlan0: authenticate with AP 00:21:29:6c:2d:ea [ 142.312117] wlan0: authenticate with AP 00:21:29:6c:2d:ea [ 142.512065] wlan0: authentication with AP 00:21:29:6c:2d:ea timed out NetworkManager: <info> Old device 'wlan0' activating, won't change. [ 153.795976] wlan0: authenticate with AP 00:21:29:6c:2d:ea [ 153.802677] wlan0: authenticate with AP 00:21:29:6c:2d:ea [ 154.003195] wlan0: authenticate with AP 00:21:29:6c:2d:ea [ 154.004357] wlan0: authenticated [ 154.004365] wlan0: associate with AP 00:21:29:6c:2d:ea [ 154.005474] wlan0: RX AssocResp from 00:21:29:6c:2d:ea (capab=0x11 status=0 aid=1) [ 154.005481] wlan0: associated
(In reply to comment #45) > I have applied the patch ("driver patch to roll back to use "-1" version ucode > ") to 2.6.28-rc4 and 2.6.28-rc5 and both exhibit the same problem. I cannot > associate with my linksys wireless-N(5ghz) accesspoint when the patch is > applied. Without the patch, association is fine. Yup, the new firmware is targeted at fixing this issue. I guess your AP is on a passive channel. Can you try the 2.4GHz band, just to see whether the old firmware has this problem or not?
I tried the RX_skb patch from linux-wireless on top of 2.6.28-rc5 with wireless-testing (without the last patch) and I still hit the issue. I have not tried rolling back to v1 ucode yet. ------------[ cut here ]------------ WARNING: at drivers/net/wireless/iwlwifi/iwl-tx.c:1268 iwl_tx_cmd_complete+0x28a/0x290 [iwlcore]() wrong command queue 31, sequence 0x7FFF readp=141 writep=141 Modules linked in: ipv6 btusb bluetooth mmc_block joydev usbhid hid uvcvideo compat_ioctl32 videodev v4l1_compat psmouse sdhci_pci sdhci ohci1394 serio_raw mmc_core ieee1394 pcspkr ricoh_mmc sg sr_mod cdrom iTCO_wdt video output ehci_hcd uhci_hcd intel_agp thermal fan button battery ac asus_oled usbcore cpufreq_userspace cpufreq_ondemand cpufreq_conservative cpufreq_powersave cpufreq_stats acpi_cpufreq freq_table processor asus_laptop fuse nvidia(P) i2c_core evdev snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_hda_intel snd_hwdep snd_pcm_oss snd_pcm snd_timer snd_page_alloc snd_mixer_oss snd soundcore arc4 ecb iwlagn iwlcore rfkill led_class mac80211 cfg80211 r8169 mii rtc_cmos rtc_core rtc_lib Pid: 0, comm: swapper Tainted: P W 2.6.28-rc5-blackice1 #1 Call Trace: <IRQ> [<ffffffff80245b57>] warn_slowpath+0xb7/0xf0 [<ffffffff8059b534>] ? _spin_unlock_irqrestore+0x24/0x60 [<ffffffff803f20d9>] ? __next_cpu+0x19/0x30 [<ffffffff8023e73c>] ? tg_shares_up+0xcc/0x230 [<ffffffffa00766aa>] iwl_tx_cmd_complete+0x28a/0x290 [iwlcore] [<ffffffffa008f282>] iwl_rx_handle+0xf2/0x2d0 [iwlagn] [<ffffffffa008f66e>] iwl_irq_tasklet+0x20e/0x380 [iwlagn] [<ffffffff8024b6f5>] tasklet_action+0x75/0x100 [<ffffffff8024c1bc>] __do_softirq+0x9c/0x180 [<ffffffff8020d90c>] call_softirq+0x1c/0x30 [<ffffffff8020f305>] do_softirq+0x75/0xb0 [<ffffffff8024bc3d>] irq_exit+0x9d/0xb0 [<ffffffff8020f5b9>] do_IRQ+0xc9/0x110 [<ffffffff8020ca3b>] ret_from_intr+0x0/0xf <EOI> [<ffffffffa09717d6>] ? acpi_idle_enter_simple+0x1ae/0x21f [processor] [<ffffffffa09717cc>] ? 
acpi_idle_enter_simple+0x1a4/0x21f [processor] [<ffffffff804cb0e5>] ? cpuidle_idle_call+0xa5/0x100 [<ffffffff8020b5ce>] ? cpu_idle+0x6e/0xe0 [<ffffffff80586f20>] ? rest_init+0x70/0x80 ---[ end trace 90a3cd80115354aa ]---
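The "queue 31, sequence 0x7FFF" pair in the warning above can be reproduced with a small decoding sketch. It assumes the 16-bit sequence field of a command reply carries the slot index in bits 0 to 7 and the TX queue number in bits 8 to 12; that layout is an assumption, not something this thread spells out, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED layout of the 16-bit sequence field (not confirmed by this thread):
 *   bits 0-7  : index of the entry within the TX queue
 *   bits 8-12 : TX queue number
 * Helper names are hypothetical, for illustration only. */
static inline unsigned seq_to_queue(uint16_t seq)
{
    return (seq >> 8) & 0x1f;
}

static inline unsigned seq_to_index(uint16_t seq)
{
    return seq & 0xff;
}

/* Decode the value reported in the warning and check it against the log. */
static inline void decode_reported_sequence(void)
{
    uint16_t seq = 0x7FFF;            /* "sequence 0x7FFF" from the warning */

    assert(seq_to_queue(seq) == 31);  /* matches "wrong command queue 31" */
    assert(seq_to_index(seq) == 255);
}
```

Under that assumption the reply names queue 31, while the txq dump earlier in this thread only lists txq[0] through txq[19], which suggests the driver was handed a garbage sequence value rather than a reply to a command it actually sent.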
(In reply to comment #46) > (In reply to comment #45) > > I have applied the patch ("driver patch to roll back to use "-1" version ucode > > ") to 2.6.28-rc4 and 2.6.28-rc5 and both exhibit the same problem. I cannot > > associate with my linksys wireless-N(5ghz) accesspoint when the patch is > > applied. Without the patch, association is fine. > > Yup, the new firmware is target to fix this issue. I guess your AP is on a > passive channel. Can you try to use 2.4GHz band? Just to see if old firmware > has this problem or not. > The old ucode fixes it for me on the 2.4ghz band -- only rs_get_rate warnings for 6 hours now
> The old ucode fixes it for me on the 2.4ghz band -- only rs_get_rate warnings > for 6 hours now Thanks for testing. Are you able to use 5GHz band with the old uCode?
I was able to associate on channel 161, but I could not associate on channel 48 (see below -- channel 48 was tried first followed by 161). Both tries were directly with iwconfig ------- iwlagn 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17 iwlagn 0000:03:00.0: restoring config space at offset 0x1 (was 0x100102, writing 0x100106) iwlagn 0000:03:00.0: irq 39 for MSI/MSI-X iwlagn 0000:03:00.0: firmware: requesting iwlwifi-4965-1.ucode DEBUG: txq_id 0, CONFIG_REG 0x1d00 DEBUG: txq_id 1, CONFIG_REG 0x1d20 DEBUG: txq_id 2, CONFIG_REG 0x1d40 DEBUG: txq_id 3, CONFIG_REG 0x1d60 DEBUG: txq_id 4, CONFIG_REG 0x1d80 DEBUG: txq_id 5, CONFIG_REG 0x1da0 DEBUG: txq_id 6, CONFIG_REG 0x1dc0 DEBUG: txq_id 7, CONFIG_REG 0x1de0 DEBUG: txq_id 9, CONFIG_REG 0x1e20 DEBUG: txq_id 10, CONFIG_REG 0x1e40 IWL DEBUG: iwl_clear_stations_table is called IWL DEBUG: iwl_clear_stations_table is called IWL DEBUG: iwl_clear_stations_table is called Registered led device: iwl-phy0:radio Registered led device: iwl-phy0:assoc Registered led device: iwl-phy0:RX Registered led device: iwl-phy0:TX IWL DEBUG: iwl_clear_stations_table is called IWL DEBUG: iwl_clear_stations_table is called IWL DEBUG: iwl_clear_stations_table is called IWL DEBUG: iwl_clear_stations_table is called IWL DEBUG: iwl_clear_stations_table is called IWL DEBUG: iwl_clear_stations_table is called IWL DEBUG: iwl_clear_stations_table is called wlan0: direct probe to AP ff:ff:ff:ff:ff:ff try 1 IWL DEBUG: iwl_clear_stations_table is called IWL DEBUG: iwl_clear_stations_table is called IWL DEBUG: iwl_clear_stations_table is called wlan0: direct probe to AP 00:1c:58:6d:32:30 try 1 IWL DEBUG: iwl_clear_stations_table is called wlan0: direct probe to AP 00:1c:58:6d:32:30 try 2 wlan0: direct probe to AP 00:1c:58:6d:32:30 try 3 wlan0: direct probe to AP 00:1c:58:6d:32:30 timed out IWL DEBUG: iwl_clear_stations_table is called IWL DEBUG: iwl_clear_stations_table is called IWL DEBUG: iwl_clear_stations_table is called wlan0: direct probe 
to AP 00:21:d8:d6:8c:0e try 1 IWL DEBUG: iwl_clear_stations_table is called IWL DEBUG: iwl_clear_stations_table is called IWL DEBUG: iwl_clear_stations_table is called wlan0: direct probe to AP 00:21:d8:d6:7f:ce try 1 wlan0: direct probe to AP 00:21:d8:d6:7f:ce try 2 wlan0 direct probe responded wlan0: authenticate with AP 00:21:d8:d6:7f:ce wlan0: authenticated wlan0: associate with AP 00:21:d8:d6:7f:ce wlan0: RX AssocResp from 00:21:d8:d6:7f:ce (capab=0x1 status=0 aid=2) wlan0: associated
FYI: Commit in latest 2.6.28-rc6 kernel...

commit 4018517a1a69a85c3d61b20fa02f187b80773137
Author: Johannes Berg <johannes@sipsolutions.net>
Date: Tue Nov 18 01:47:21 2008 +0100

iwlagn: fix RX skb alignment

So I dug deeper into the DMA problems I had with iwlagn and a kind soul helped me in that he said something about pci-e alignment and mentioned the iwl_rx_allocate function to check for crossing 4KB boundaries. Since there's 8KB A-MPDU support, crossing 4k boundaries didn't seem like something the device would fail with, but when I looked into the function for a minute anyway I stumbled over this little gem:

BUG_ON(rxb->dma_addr & (~DMA_BIT_MASK(36) & 0xff));

Clearly, that is a totally bogus check, one would hope the compiler removes it entirely. (Think about it)

After fixing it, I obviously ran into it, nothing guarantees the alignment the way you want it, because of the way skbs and their headroom are allocated. I won't explain that here nor double-check that I'm right, that goes beyond what most of the CC'ed people care about.

So then I came up with the patch below, and so far my system has survived minutes with 64K pages, when it would previously fail in seconds. And I haven't seen a single instance of the TX bug either. But when you see the patch it'll be pretty obvious to you why.

This should fix the following reported kernel bugs:
http://bugzilla.kernel.org/show_bug.cgi?id=11596
http://bugzilla.kernel.org/show_bug.cgi?id=11393
http://bugzilla.kernel.org/show_bug.cgi?id=11983

I haven't checked if there are any elsewhere, but I suppose RHBZ will have a few instances too... I'd like to ask anyone who is CC'ed (those are people I know ran into the bug) to try this patch.

I am convinced that this patch is correct in spirit, but I haven't understood why, for example, there are so many unmap calls. I'm not entirely convinced that this is the only bug leading to the TX reply errors.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
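For anyone puzzled by the "(Think about it)" in the commit message, here is a small sketch of why the quoted BUG_ON() can never fire. The DMA_BIT_MASK definition mirrors the kernel macro. The 256-byte alignment helpers are only a guess at the intent, inferred from the 0xff in the original check; they are hypothetical and are not taken from the actual patch.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the kernel's DMA_BIT_MASK(): a mask with the low n bits set. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* The mask used by the quoted BUG_ON(). ~DMA_BIT_MASK(36) keeps only
 * bits 36..63, and 0xff keeps only bits 0..7, so the two never overlap. */
static inline uint64_t bogus_mask(void)
{
    return ~DMA_BIT_MASK(36) & 0xff;
}

/* HYPOTHETICAL intended check (assumption): is the address 256-byte aligned? */
static inline int is_256_aligned(uint64_t dma_addr)
{
    return (dma_addr & 0xff) == 0;
}

/* HYPOTHETICAL helper: round an address up to the next 256-byte boundary. */
static inline uint64_t align_up_256(uint64_t dma_addr)
{
    return (dma_addr + 0xff) & ~(uint64_t)0xff;
}

/* The mask is zero, so "addr & mask" is zero for every possible address. */
static inline void show_check_is_dead(void)
{
    assert(bogus_mask() == 0);
    assert((0x12345678ULL & bogus_mask()) == 0);
}
```

Because the mask is the constant 0, the original condition is false for every address, so a misaligned buffer would pass unnoticed. A check such as is_256_aligned() would actually reject an address like 0x1010, and align_up_256() would move it to 0x1100.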
Created an attachment (id=1694) [details] a debug firmware Please verify whether this ucode fixes your problem. Copy it to your /lib/firmware directory, overwriting the original file, then reload the driver to use it.
(In reply to comment #52) > Created an attachment (id=1694) [details] [details] > a debug firmware > > Please verify if this ucode fix your problem. Copy it to your /lib/firmware and > overwrite the original one. Then reload the driver to use it. > Using this provided firmware fixed the issue for me. Using 2.6.27.7.
mark as fixed
Yi, when will this firmware be released?
(In reply to comment #54) > mark as fixed > The firmware is not (as of 11-28 AM) available on the http://www.intellinuxwireless.org/?n=downloads&f=ucode website. But thanks for resolving the issue. Please get the firmware release on the website so people will easily know the version number. Thanks.
Sorry, maybe I missed something, but with the firmware in Comment #52 I'm still getting errors in dmesg. I'm using kernel 2.6.27; is 2.6.28 required for the fix to work? I'm attaching my dmesg file.
Created an attachment (id=1723) [details] dmesg output with debug firmware and kernel 2.6.27
(In reply to comment #57) > Sorry, maybe I missed something but with the firmware in Comment #52, I'm > still getting errors in dmesg.
1. Make sure you copied the firmware image to the right place for your distro.
2. You need to reload the iwlwifi driver to use the new ucode.
3. If it still doesn't work after the above, try the patch in comment #51. It addresses another cause of this problem.
1. I'm sure it is in the right place, and I had to overwrite the existing firmware. 2. I'm pretty sure iwlwifi was reloaded because I have rebooted a couple of times since I overwrote my existing firmware. 3. I will compile the git version of the kernel and see how it works.
Created an attachment (id=1732) [details] dmesg with kernel git and new debug firmware I tried kernel git (because it has the patch you were mentioning). Now I get the following error in dmesg: [73627.971057] __ratelimit: 10781 callbacks suppressed [73627.971063] iwlagn: Can not allocate SKB buffers Note that I am still using your new debug firmware.
(In reply to comment #61) > Created an attachment (id=1732) [details] [details] > dmesg with kernel git and new debug firmware > > I tried kernel git (because it has the patch you were mentioning). > > Now I get the following error in dmesg: > [73627.971057] __ratelimit: 10781 callbacks suppressed > [73627.971063] iwlagn: Can not allocate SKB buffers > > Note that I am still using your new debug firmware. > This issue is not specific to the new debug firmware. Could you please add your test results to http://www.intellinuxwireless.org/bugzilla/show_bug.cgi?id=1842 instead?
The new uCode 228.57.2.23 is released. Mark as Verified.
Any ETA for a fixed microcode to the 5000 series? I see that this issue has already been marked as FIXED. Open another bug for 5000 series or reopen this one for tracking? http://www.intellinuxwireless.org/?n=Downloads still only has the one from last October with this bug present. 4965 Images iwlwifi-4965-ucode-228.57.2.23.tgz 12-11-08 10:42:34 R L 5000 Images iwlwifi-5000-ucode-5.4.A.11.tar.gz 06-10-08 19:46:21 R L
(In reply to comment #64) > Any ETA for a fixed microcode to the 5000 series? > I see that this issue has already been marked as FIXED. Open another bug for > 5000 series or reopen this one for tracking? > > http://www.intellinuxwireless.org/?n=Downloads still only has the one from last > October with this bug present. > > 4965 Images > iwlwifi-4965-ucode-228.57.2.23.tgz 12-11-08 10:42:34 R L > > 5000 Images > iwlwifi-5000-ucode-5.4.A.11.tar.gz 06-10-08 19:46:21 R L > Oops, I guess the new bug for the 5000 series is http://www.intellinuxwireless.org/bugzilla/show_bug.cgi?id=1946