Bug 1918 - iwl4965 - SENSITIVITY_CMD failed - Kernel 2.6.28 and 2.6.29
Summary: iwl4965 - SENSITIVITY_CMD failed - Kernel 2.6.28 and 2.6.29
Status: ASSIGNED
Product: iwlwifi
Component: firmware error
Version: official kernel (2.6.*)
Platform: 4965 (Intel(R) WiFi Link 4965) __UNSPECIFIED__
Priority: P1 critical
Reported: 2009-03-01 12:40
Modified: 2009-10-29 15:02


Attachments
Log of erros (75.97 KB, application/gzip)
2009-03-07 20:20, Robson Peixoto
dmesg log (14.62 KB, application/gzip)
2009-03-07 20:21, Robson Peixoto
kernel log (80.61 KB, application/gzip)
2009-03-07 20:23, Robson Peixoto
erros using debug=0x43fff (289.97 KB, application/gzip)
2009-03-12 10:12, Robson Peixoto
Kern.log snippet with relevant errors (5.41 KB, text/plain)
2009-03-19 09:12, Brett Ussher
dmesg.log for comment #19 (77.62 KB, text/x-log)
2009-04-02 17:59, Brett Ussher
NMI Interrupt WDG Errors (Replacing SENSITIVITY_CMD errors with RF_SENSITIVITY disabled) (46.72 KB, text/plain)
2009-04-06 23:12, John Ranson
Log file containing error messages. (30.14 KB, application/octet-stream)
2009-05-27 10:53, Harald Judt
Latest dmesg with 2.6.28-13 kernel (45.89 KB, text/x-log)
2009-06-21 06:47, Brett Ussher
kern.log w/ debug kernel showing 0x82000000 error. (112.32 KB, application/x-gzip)
2009-06-27 14:55, Brett Ussher
gzipped tarball with kern.log and dmesg after 0x82000000 error w/ latest ucode (7-9-09) (88.00 KB, application/x-gzip)
2009-07-15 18:41, Brett Ussher
kern.log w/ debug kernel showing 0x82000000 error (7-17-09). (189.99 KB, application/x-gzip)
2009-07-17 14:59, Brett Ussher



Description From 2009-03-01 12:40:38
# uname  -a
Linux robinho 2.6.28-ARCH #1 SMP PREEMPT Sun Feb 22 11:03:50 UTC 2009 i686 Intel(R) Core(TM)2 Duo CPU T6400 @ 2.00GHz GenuineIntel GNU/Linux


# pacman -Q | grep kernel
kernel26 2.6.28.7-1
Patches used in that kernel:
http://projects.archlinux.org/?p=linux-2.6-ARCH.git;a=tree;f=patches;h=3c8715dbc296650d4f98e902e760581c3f22af6e;hb=57bf92198297226c520f993016a25d5d5171edeb


# pacman -Q | grep iwl
iwlwifi-4965-ucode 228.57.2.23-1


# lspci
0b:00.0 Network controller: Intel Corporation PRO/Wireless 4965 AG or AGN [Kedron] Network Connection (rev 61)


# lspci -vv
0b:00.0 Network controller: Intel Corporation PRO/Wireless 4965 AG or AGN [Kedron] Network Connection (rev 61)
        Subsystem: Intel Corporation Device 1121
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 761
        Region 0: Memory at fe7fe000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: [c8] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [d0] MSI: Mask- 64bit+ Count=1/1 Enable+
                Address: 00000000fee0300c  Data: 41e9
        Capabilities: [e0] Express (v1) Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 unlimited
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <128ns, L1 <64us
                        ClockPM+ Surprise- LLActRep- BwNot-
                LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [100] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr+ BadTLP+ BadDLLP- Rollover+ Timeout+ NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
        Capabilities: [140] Device Serial Number 67-90-a0-ff-ff-5c-21-00
        Kernel driver in use: iwlagn
        Kernel modules: iwlagn


# lspci  -nn
0b:00.0 Network controller [0280]: Intel Corporation PRO/Wireless 4965 AG or AGN [Kedron] Network Connection [8086:4229] (rev 61)


# tail -F /var/log/errors.log
Mar  1 17:33:26 robinho iwlagn: Microcode SW error detected.  Restarting 0x82000000.
Mar  1 17:33:27 robinho iwlagn: Can't stop Rx DMA.
Mar  1 17:33:27 robinho iwlagn: No space for Tx
Mar  1 17:33:27 robinho iwlagn: Error sending SENSITIVITY_CMD: enqueue_hcmd failed: -28
Mar  1 17:33:27 robinho iwlagn: SENSITIVITY_CMD failed
------- Comment #1 From 2009-03-03 09:27:32 -------
# grep "^Mar 3" /var/log/errors.log | grep "Microcode SW error detected"
Mar 3 00:01:32 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 07:33:36 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 08:17:18 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 08:35:10 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 09:01:24 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 09:45:03 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 10:20:05 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 12:05:38 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 12:21:13 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 12:42:34 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 13:03:49 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 13:17:47 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 13:29:27 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 13:38:16 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 14:01:24 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 14:12:22 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
Mar 3 14:19:09 robinho iwlagn: Microcode SW error detected. Restarting 0x82000000.
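
To see how the restarts cluster over a day, the same grep can be extended to count them per hour. This is only a sketch: the function name ucode_errors_per_hour is made up for illustration, and it assumes classic syslog timestamps like the ones above (month, day, HH:MM:SS).

```shell
#!/bin/sh
# Count "Microcode SW error detected" entries per hour for one syslog day.
# $1 is the date prefix exactly as it appears in the log (e.g. "Mar 3"),
# $2 is the log file. Both names are illustrative assumptions.
ucode_errors_per_hour() {
    day=$1
    logfile=$2
    grep "^$day " "$logfile" \
        | grep "Microcode SW error detected" \
        | awk '{ split($3, t, ":"); count[t[1]]++ }
               END { for (h in count) print h, count[h] }' \
        | sort
}
```

Usage would be something like: ucode_errors_per_hour "Mar 3" /var/log/errors.log. Each output line is an hour followed by the number of restarts logged in that hour.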
------- Comment #2 From 2009-03-03 09:51:56 -------
Also present in the latest 2.6.29 RC.
------- Comment #3 From 2009-03-07 11:50:11 -------
Tested on 2.6.28.6: same bug!
------- Comment #4 From 2009-03-07 20:20:43 -------
Created an attachment (id=1863) [details]
Log of erros

$ uname  -a
Linux robinho 2.6.28.7 #1 SMP Sat Mar 7 22:39:35 BRT 2009 i686 Intel(R) Core(TM)2 Duo CPU T6400 @ 2.00GHz GenuineIntel GNU/Linux

$ grep IWL /usr/src/linux/.config
CONFIG_IWLWIFI=m
CONFIG_IWLCORE=m
CONFIG_IWLWIFI_LEDS=y
# CONFIG_IWLWIFI_RFKILL is not set
CONFIG_IWLWIFI_DEBUG=y
CONFIG_IWLAGN=m
# CONFIG_IWLAGN_SPECTRUM_MEASUREMENT is not set
CONFIG_IWLAGN_LEDS=y
CONFIG_IWL4965=y
# CONFIG_IWL5000 is not set
# CONFIG_IWL3945 is not set

$ cat /etc/modprobe.conf
#
# /etc/modprobe.conf (for v2.6 kernels)
#

options snd-hda-intel.model=dell-bios 
options iwl4965 debug=0x4000
------- Comment #5 From 2009-03-07 20:21:32 -------
Created an attachment (id=1864) [details]
dmesg log

------- Comment #6 From 2009-03-07 20:23:29 -------
Created an attachment (id=1865) [details]
kernel log

------- Comment #7 From 2009-03-09 17:18:27 -------
I am testing on kernel 2.6.27.19 and it is working very well.
------- Comment #8 From 2009-03-09 18:01:29 -------
(In reply to comment #7)
> I am testing on kernel 2.6.27.19 and it is working very well.
> 

After 01:54:04 of running, the problem appeared.
------- Comment #9 From 2009-03-12 09:24:26 -------
Could you try capturing a log following the suggestions at:

http://www.intellinuxwireless.org/?n=fw_error_report

Thanks!  That will provide more info.

-- Ben --
------- Comment #10 From 2009-03-12 10:12:00 -------
Created an attachment (id=1874) [details]
erros using debug=0x43fff

Microcode SW error detected.  Restarting 0x82000000.
Can't stop Rx DMA.
No space for Tx
Error sending SENSITIVITY_CMD: enqueue_hcmd failed: -28
SENSITIVITY_CMD failed
------- Comment #11 From 2009-03-19 09:12:34 -------
Created an attachment (id=1887) [details]
Kern.log snippet with relevant errors

I have this bug too.  I see exactly the output shown in comment #10 in my
kern.log, and I'm attaching a relevant snippet of it.  Note that I'm using the
2.6.27-11 kernel on Ubuntu 8.10.
------- Comment #12 From 2009-03-23 15:05:17 -------
I had the same problem on Ubuntu and on Arch Linux x86_64.

More reports on Gentoo:
http://bugs.gentoo.org/show_bug.cgi?id=261248
------- Comment #13 From 2009-03-23 16:43:08 -------
*** Bug 1923 has been marked as a duplicate of this bug. ***
------- Comment #14 From 2009-04-01 15:48:48 -------
I think I know how to reproduce this bug now, at least for me (Lenovo ThinkPad
T61 with an Nvidia card).  It only seems to happen when I switch from a
terminal to X11.  It doesn't happen every time, but it seems to happen only
then.  This could indicate a problem with a display driver (there were some
strange problems with nvidia-drivers-180.35 before).
------- Comment #15 From 2009-04-01 16:43:30 -------
Nvidia is not related. Anyway, has there been any progress with this bug? Beta
firmware update perhaps?
------- Comment #16 From 2009-04-01 19:35:39 -------
Thanks all.

The log from Robson's comment #10 shows symptoms very similar to bugzilla #1941.

Unfortunately, the log from Brett's comment #11 did not have the needed debug
level applied, so it does not have enough info to help.  See comment #9.

Julian, do you have a log available (I didn't see one in bugzilla #1923)?  See
comment #9.

Bugzilla #1941 and Julian's comment #14 sound like video-driver-related
disabling of interrupts, which in turn prevents the driver from servicing the
Rx and Tx queues.  The comment #10 log shows the same symptoms as the 1941 log:
Rx underrun (no place to put Rx data), and the driver log shows no place to put
Tx data.  If interrupts stop, there's not much we can do.

Could you check Bugzilla #1941, and see if there's anything there that is
helpful?  Let me know.

-- Ben --
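
One rough way to check the "interrupts stop" theory from user space is to sample the device's interrupt counter in /proc/interrupts twice and see whether it advances. A sketch only, assuming the interrupt line is labelled with the driver name (iwlagn); irq_count and irq_delta are hypothetical helper names, not part of any driver tooling.

```shell
#!/bin/sh
# Sum the per-CPU interrupt counts of every /proc/interrupts line that
# mentions the given name. The second argument (the file to read) is
# optional and exists mainly so the function can be tried on a saved copy.
irq_count() {
    awk -v name="$1" '
        index($0, name) {
            # Fields 2..N are per-CPU counters until the first non-numeric field.
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
        }
        END { print total + 0 }' "${2:-/proc/interrupts}"
}

# Sample the live counter twice (default: five seconds apart) and print
# how many interrupts arrived in between.
irq_delta() {
    before=$(irq_count "$1")
    sleep "${2:-5}"
    after=$(irq_count "$1")
    echo $(( after - before ))
}
```

Running irq_delta iwlagn while traffic is flowing should print a number above zero. A zero while the link is supposedly busy would at least be consistent with the interrupt theory, though it would not prove it.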
------- Comment #17 From 2009-04-02 09:46:53 -------
(In reply to comment #16)

I'm running the module in debug now.  I'll post a new log when I get the SW
microcode error again.

Brett
------- Comment #18 From 2009-04-02 16:11:17 -------
(In reply to comment #16)
I downgraded to nvidia-drivers-180.29 last night and was about to report
success (that the problem was the nvidia driver and downgrading fixed it), but
right before I hit commit I got another error, the same as before:
   [ 1161.315700] iwlagn 0000:03:00.0: Microcode SW error detected.  Restarting 0x82000000.
   [ 1161.331165] iwlagn 0000:03:00.0: No space for Tx
   [ 1161.331170] iwlagn 0000:03:00.0: Error sending SENSITIVITY_CMD: enqueue_hcmd failed: -28
   [ 1161.331172] iwlagn 0000:03:00.0: SENSITIVITY_CMD failed
I'm going to recompile with debug flags soon and see what I can find.
------- Comment #19 From 2009-04-02 17:57:29 -------
(In reply to comment #16)

Ok, ran dmesg after getting the error with debug=0x43fff set.  However, the
output is no different.  I tried "modprobe iwlagn debug=0x43fff" after first
doing a rmmod iwlagn.  That made no change in the logs.  Then I put the debug
statement in /etc/modprobe.d/options and ran modprobe -v iwlagn.  I saw from
the verbose output that the debug statement was passed to the module at load
time.  However, still no change in the log.  I'm posting the dmesg anyhow, just
in case it does say something useful.
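
One possible explanation, raised later in this thread, is that the running kernel was built without CONFIG_IWLWIFI_DEBUG, in which case the debug parameter may have no visible effect. A small sketch for checking that, assuming the distribution installs the kernel config as /boot/config-<release> (that path is an assumption and is not present on every system); iwl_debug_enabled is an illustrative name.

```shell
#!/bin/sh
# Succeed if the given kernel config file enables iwlwifi debug output.
iwl_debug_enabled() {
    grep -q '^CONFIG_IWLWIFI_DEBUG=y' "$1"
}

# Assumed location of the running kernel's config; skipped if it is missing.
cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ]; then
    if iwl_debug_enabled "$cfg"; then
        echo "CONFIG_IWLWIFI_DEBUG is enabled"
    else
        echo "CONFIG_IWLWIFI_DEBUG is not enabled; debug= may add no output"
    fi
fi
```

For a self-built kernel the same function can be pointed at /usr/src/linux/.config, as in the grep shown in comment #4.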
------- Comment #20 From 2009-04-02 17:59:23 -------
Created an attachment (id=1918) [details]
dmesg.log for comment #19

upload of dmesg.log for comment #19.  Hope it helps.

Brett
------- Comment #21 From 2009-04-06 21:46:34 -------
Brett,

See https://bugs.launchpad.net/ubuntu/+source/linux/+bug/200509 . Ubuntu
helpfully disables debugging of iwlagn. Someone has apparently posted modules
with debugging enabled. You might check them out.

Anyway, I want to give a ditto for this. I'm running 2.6.29 in Gentoo and
seeing RX SENSITIVITY errors. (I was seeing them previously with other kernels,
and I just disabled it with
"echo 1 > /sys/kernel/debug/ieee80211/phy0/iwlagn/rf/disable_sensitivity")
I'll post some logs when I have the chance.

John
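
For anyone trying the same workaround, a slightly more defensive version of that echo might look like the sketch below. The default path is the one quoted above and may differ by kernel version and phy index; the function name disable_sensitivity is illustrative.

```shell
#!/bin/sh
# Write 1 to the disable_sensitivity debugfs file if it exists and is
# writable. The path can be overridden as the first argument; the default
# is the path quoted in this comment and is an assumption for other setups.
disable_sensitivity() {
    f=${1:-/sys/kernel/debug/ieee80211/phy0/iwlagn/rf/disable_sensitivity}
    if [ -w "$f" ]; then
        echo 1 > "$f"
    else
        echo "missing or not writable: $f (debugfs mounted? running as root?)" >&2
        return 1
    fi
}
```

Calling disable_sensitivity with no argument (as root) tries the default path and fails with a message instead of a bare shell error when debugfs is not mounted.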
------- Comment #22 From 2009-04-06 23:12:02 -------
Created an attachment (id=1925) [details]
NMI Interrupt WDG Errors (Replacing SENSITIVITY_CMD errors with RF_SENSITIVITY
disabled)

I regularly get SENSITIVITY_CMD errors. When I disable RF_SENSITIVITY, I get
NMI_INTERRUPT_WDG errors. It's an either-or thing.

Unfortunately, in my current config, RF_SENSITIVITY and debug level are set at
the same time, so when I get the SENSITIVITY_CMD errors, I usually don't have
debugging enabled.

Apr  6 20:01:47 shabbar [264098.665311] iwlagn: Microcode SW error detected.  Restarting 0x82000000.
Apr  6 20:01:47 shabbar [264098.687038] iwlagn: No space for Tx
Apr  6 20:01:47 shabbar [264098.687042] iwlagn: Error sending SENSITIVITY_CMD: enqueue_hcmd failed: -28
Apr  6 20:01:47 shabbar [264098.687045] iwlagn: SENSITIVITY_CMD failed
Apr  6 20:01:47 shabbar [264098.890668] Registered led device: iwl-phy0:radio
Apr  6 20:01:47 shabbar [264098.890684] Registered led device: iwl-phy0:assoc
Apr  6 20:01:47 shabbar [264098.890698] Registered led device: iwl-phy0:RX
Apr  6 20:01:47 shabbar [264098.890712] Registered led device: iwl-phy0:TX
------- Comment #23 From 2009-04-15 08:42:27 -------
Also present in 2.6.30 rc1...

That would make four kernel releases with the same debilitating bug. Also, four
releases where my wireless does not work at all.
------- Comment #24 From 2009-04-17 05:28:40 -------
(In reply to comment #21)

John,

Thanks for the link.  I went into the kernel's .config file (for the kernel I'm
actually running) and set the options listed there to "=y".  I then heard
that Intel had released a version that fixed this issue, so I downloaded the
latest compat-wireless source and compiled it against the kernel with the
.config file I had modified.  My output is still unchanged from what comment #19
describes, and the latest compat-wireless didn't fix the issue.

Brett
------- Comment #25 From 2009-05-27 10:53:36 -------
Created an attachment (id=2012) [details]
Log file containing error messages.

Similar problem here on kernel 2.6.29 using compat-wireless 2009-05-27
snapshot.
Module load options: swcrypto=1 debug=0x43fff
I filtered out the many lines of the form
ieee80211 phy0: I iwl_mac_tx enter
ieee80211 phy0: I iwl_mac_tx leave
to make the log smaller.
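
That kind of filtering can be done with grep -v. A minimal sketch (filter_mac_tx is just an illustrative name) that reads a log on stdin or from the files given as arguments:

```shell
#!/bin/sh
# Drop the per-packet iwl_mac_tx enter/leave trace lines and pass
# everything else through unchanged.
filter_mac_tx() {
    grep -v -e 'iwl_mac_tx enter' -e 'iwl_mac_tx leave' "$@"
}
```

For example: filter_mac_tx /var/log/kern.log | gzip > kern-filtered.log.gz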
------- Comment #26 From 2009-06-21 06:47:46 -------
Created an attachment (id=2058) [details]
Latest dmesg with 2.6.28-13 kernel

I'm wondering if there is any progress on this bug.  I'm using Ubuntu 9.04 with
the 2.6.28-13-generic kernel and this bug still exists.  I've tried downloading
the stable compat-wireless for 2.6.30, but the problem is still in that version
of the driver.  I'm posting the dmesg I get with this kernel using the 2.6.30
compat-wireless.  The log hasn't changed much except for an additional line stating
that the MAC is in deep sleep.  I've personally been having this issue since
Ubuntu 8.04.  At this point, I'm willing to ship my laptop to one of you
developers if that would help with fixing this problem.

Thanks,
Brett Ussher
------- Comment #27 From 2009-06-21 09:11:29 -------
(In reply to comment #26)

The Intel folks have been claiming that they will release a firmware that will
fix this problem. See
http://intellinuxwireless.org/bugzilla/show_bug.cgi?id=1989 It's been a month
since they released updates for every other iwl firmware. Don't hold your
breath waiting for it.

John
------- Comment #28 From 2009-06-27 14:55:46 -------
Created an attachment (id=2064) [details]
kern.log w/ debug kernel showing 0x82000000 error.

I finally had the time to figure out how to compile the kernel with the debug
options turned on for Ubuntu.  I'm still using the 2.6.28-13 kernel on Ubuntu
9.04.  I'm attaching my kern.log rather than dmesg, because the log grew past
the point where dmesg would still show the moment the error occurred.  Look at
line 7647 (I'm using gedit with line numbering turned on) for the moment when I
get the 0x82000000 error.  This kern.log file has been trimmed to include only
that particular session (I rebooted my laptop with the debug kernel and began
watching).  I had to tar.gz this file so it would upload.  Hope this helps.
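
The line of the error can also be located without an editor. A sketch, with illustrative function names, that prints the line number of the first "Microcode SW error detected" entry and, optionally, the lines leading up to it:

```shell
#!/bin/sh
# Print the line number of the first microcode error in a log file.
first_ucode_error_line() {
    grep -n 'Microcode SW error detected' "$1" | head -n 1 | cut -d: -f1
}

# Print the lines leading up to (and including) the first microcode error.
# $2 is how many preceding lines to show (default 20).
ucode_error_context() {
    file=$1
    before=${2:-20}
    n=$(first_ucode_error_line "$file")
    [ -n "$n" ] || return 1
    start=$(( n > before ? n - before : 1 ))
    sed -n "${start},${n}p" "$file"
}
```

For example, ucode_error_context kern.log 50 shows the 50 lines before the first error plus the error line itself.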
------- Comment #29 From 2009-06-27 14:57:50 -------
(From update of attachment 1887 [details])
Replaced by Attachment 2064 [details]
------- Comment #30 From 2009-06-27 14:58:14 -------
(From update of attachment 1918 [details])
Replaced by Attachment 2064 [details]
------- Comment #31 From 2009-06-27 14:59:48 -------
(From update of attachment 2058 [details])
Replaced by Attachment 2064 [details]
------- Comment #32 From 2009-07-07 08:40:35 -------
Are there any plans from the main engineering group to fix this in the near
future?
------- Comment #33 From 2009-07-15 13:39:13 -------
Could you try new 4965 uCode, released 7/9/09?

Thanks!

-- Ben --
------- Comment #34 From 2009-07-15 14:05:47 -------
(In reply to comment #33)
> Could you try new 4965 uCode, released 7/9/09?
> 
> Thanks!
> 
> -- Ben --
> 

Sorry, I have since replaced my wireless card.
------- Comment #35 From 2009-07-15 17:10:16 -------
I just installed the new ucode file and did an rmmod/modprobe on iwlagn.  I'll
post when it breaks or in a few days if it does not.
------- Comment #36 From 2009-07-15 17:13:33 -------
Thanks.

Sorry we couldn't help Robson in a timely way.  Gnarly bug.  :-(

-- Ben --
------- Comment #37 From 2009-07-15 18:41:44 -------
Created an attachment (id=2091) [details]
gzipped tarball with kern.log and dmesg after 0x82000000 error w/ latest ucode
(7-9-09)

Ok, it failed.  I'm attaching a new dmesg and kern.log.  The kern.log has all
debug output from the moment the iwlagn interface was added via modprobe.

Error occurs at line 5691 (in gedit with line numbering turned on).
------- Comment #38 From 2009-07-17 05:56:15 -------
Were my kern.log and dmesg in attachment id=2091 helpful to you guys?  If
not, what can I do to make them more useful?
------- Comment #39 From 2009-07-17 06:19:43 -------
Hmm, looks like Rx FIFO is backed up / overflowing.

Could you add one more bit to debug param?  Try 0x01043fff, adding RX bit will
show the Rx queue indexes.

ALSO, are you on a 64-bit platform????

There's another bugzilla (2039) that has some symptoms of Rx overflow, and he's
on a 64-bit system.

-- Ben --
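
For reference, 0x01043fff is the earlier 0x43fff mask with one more bit set. A small sketch to check which bit that is (mask_diff and set_bits are illustrative helper names):

```shell
#!/bin/sh
# Print the bits that differ between two debug masks, as a hex mask.
mask_diff() {
    printf '0x%08x\n' $(( $1 ^ $2 ))
}

# Print the index of every set bit in a mask, one per line.
set_bits() {
    m=$(( $1 ))
    i=0
    while [ "$m" -ne 0 ]; do
        if [ $(( m & 1 )) -eq 1 ]; then
            echo "$i"
        fi
        m=$(( m >> 1 ))
        i=$(( i + 1 ))
    done
}

mask_diff 0x01043fff 0x43fff                 # prints 0x01000000
set_bits "$(mask_diff 0x01043fff 0x43fff)"   # prints 24
```

So the only change is bit 24 (0x01000000), which per the comment above is the RX debug bit.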
------- Comment #40 From 2009-07-17 07:25:28 -------
(In reply to comment #39)

I'll add the extra debug parameter this afternoon and test it out.

As for my platform, my hardware is 64-bit, but I'm running a 32-bit OS (Mint 7
main -- same as Ubuntu 9.04)

I'll post the new debug output when I get it.

Thanks!
------- Comment #41 From 2009-07-17 14:59:33 -------
Created an attachment (id=2101) [details]
kern.log w/ debug kernel showing 0x82000000 error (7-17-09).

Ok, I used the 0x01043fff debug setting.  I'm attaching the kern.log.  There was
way too much output for dmesg to be of use.  The kern.log has been truncated to
show only the log from the moment I loaded the iwlagn module with modprobe
until I removed it with rmmod (after the error had occurred).  I had to
zip the file due to its size.

Error occurs at line 14315 (using gedit with line numbering turned on).

As I stated in my last post, I'm using a 32-bit OS.

Hope this helps!  
------- Comment #42 From 2009-07-19 16:51:47 -------
I can confirm this bug after upgrading (only) the firmware from
iwlwifi-4965-ucode-228.57.2.23 to iwlwifi-4965-ucode-228.61.2.24. Reverting the
firmware resolves the issue for me. Starting a torrent download (loads of
simultaneous connections) was enough to trigger this error repeatedly within 5
seconds on my system. More info:

# uname -a
Linux emerald 2.6.30-ARCH #1 SMP PREEMPT Sat Jul 4 02:24:43 CEST 2009 x86_64
Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz GenuineIntel GNU/Linux

# lspci | grep 4965
10:00.0 Network controller: Intel Corporation PRO/Wireless 4965 AG or AGN
[Kedron] Network Connection (rev 61)

# dmesg
wlan0: authenticate with AP 00:14:bf:eb:89:44
wlan0: authenticated
wlan0: associate with AP 00:14:bf:eb:89:44
wlan0: RX ReassocResp from 00:14:bf:eb:89:44 (capab=0x411 status=0 aid=2)
wlan0: associated
iwlagn 0000:10:00.0: Microcode SW error detected.  Restarting 0x2000000.
Registered led device: iwl-phy0::radio
Registered led device: iwl-phy0::assoc
Registered led device: iwl-phy0::RX
Registered led device: iwl-phy0::TX
------- Comment #43 From 2009-07-19 16:54:00 -------
(In reply to comment #42)

Ignore my comment. After reading this thread more carefully, it seems my errors
are unrelated to the original report.
------- Comment #44 From 2009-07-20 08:08:37 -------
Brett,

Thanks for log (comment #41).  It shows the driver is doing a good job of
keeping Rx queue serviced and happy.  Which means driver and device are not
communicating with each other very well at some point.

Since you're running on 64-bit hardware (comment #40), I'll hope that's the
related cause (a number of other bugs, with equally crazy symptoms, are on
64-bit systems).  We're looking into this, but wouldn't mind help from
community as well.

-- Ben --
------- Comment #45 From 2009-07-20 08:12:36 -------
Hi Eivind,

Thanks for report ... could you open a new bugzilla for this, and call it
something like "iwlwifi-4965-ucode-228.61.2.24 regression"?

I'm very disappointed the new uCode causes problems.  :-(

In the meantime, if the older uCode works better for you, keep using the old
uCode.  We issued the new uCode hoping it would fix bugs, not create new ones.

-- Ben --
------- Comment #46 From 2009-10-29 15:02:26 -------
I found a post on RedHat's site that suggested setting an option to stop this
problem:

https://bugzilla.redhat.com/show_bug.cgi?id=519154

Look at comments #10 and #11 there for details.  Basically, it says you should
turn on the swcrypto option when loading the module.  I tried this and had
partial success.  I say partial because my connection still died with the error
I've posted from kern.log several times now.  However, despite the error still
occurring, my connection did last for about an hour and a half.  Usually I get
only seconds, or a few minutes at best.  So I'm wondering whether the swcrypto=1
option gives some insight as to the nature of this bug?
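
For anyone who wants to try the same thing, reloading the module with that option could be sketched as below. reload_iwlagn is an illustrative wrapper name; it only combines the rmmod and modprobe steps already described earlier in this thread, and it has to be run as root.

```shell
#!/bin/sh
# Unload iwlagn and load it again with the given module options.
# Must be run as root; the wireless link drops while the module is reloaded.
reload_iwlagn() {
    rmmod iwlagn && modprobe iwlagn "$@"
}

# Examples (not run automatically):
#   reload_iwlagn swcrypto=1
#   reload_iwlagn swcrypto=1 debug=0x43fff
```

To make the option persistent, a line such as "options iwlagn swcrypto=1" in the modprobe configuration (for example the /etc/modprobe.d/options file mentioned in comment #19) should have the same effect at the next module load.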