Bugzilla – Bug 592
Firmware error detected. Restarting (with ERROR_NMI_INTERRUPT)
Last modified: 2008-07-29 03:10:11
When loading the module, dmesg sometimes shows this error message: ipw2200: Firmware error detected. Restarting. After that it successfully connects to the AP.
Created an attachment (id=272) [details] Dmesg output
Created an attachment (id=273) [details] Dmesg output
I am also seeing the same error. I didn't see this error earlier, but it seems to work fine after restarting.
Please set the debug level to 0x40000 to capture the full firmware status log, and provide the dmesg output containing that log. You can do this via the module parameter 'debug':
  % modprobe ipw2200 debug=0x40000
or by adding the module option to your modprobe.conf equivalent (distribution specific).
Created an attachment (id=281) [details] Dmesg output with debug=0x40000
James, what data do you need? I recently upgraded my ipw2200 driver from 0.15 to 1.01 with the 2.6.11 kernel and started to see these errors. In my case, once this happens I can't associate with the AP again until a reboot; reloading the module doesn't help. In one case, I was not able to connect to the AP at all: every attempt would result in this firmware error state. I will capture the dmesg output and post it here. Is there any other data I can provide? What would be useful to debug this issue? Thanks! Richard Ferguson
I hate to ask -- but can you try and make it die a few more times at the same debug level and attach any debug logs that don't list: ipw2200: ERROR_NMI_INTERRUPT ... on the fourth line? I'm hoping there may be another error that sometimes shows up (race condition) If all you ever see is the NMI, let us know that too. Thanks, James
I created a lot more errors by loading and unloading the module. It happens every time on my system. I tried about 10 times, and every time the 4th line was the NMI one. I'll upload the dmesg output...
Created an attachment (id=293) [details] The error reproduced
Created an attachment (id=296) [details] My /var/log/messages
Hi, I experience exactly the same problem with my IBM T42 running 2.6.11-mm4 when trying to associate in Ad-Hoc mode (Managed runs fine), driver ipw2200 1.0.1. I can sometimes associate, but as soon as I try to ping something I get this "firmware error. restarting". The attached /var/log/messages was obtained with debug=0xffff.
The bug is fixed in ipw2200 1.0.2, could you try this version to see whether the problem is fixed? Thanks.
It solves nothing for me, everything stays identical as before (still running 2.6.12-rc1-mm1 with ieee80211 included in the kernel, I don't know if this is important).
Sorry, the bug is not fixed in 1.0.2, I got a wrong message from CHANGES. I will separate the problem found by Alexandre Buisse from this bug, since the usage steps are different.
Created an attachment (id=309) [details] dmesg output
Not sure if you guys are still looking for non-NMI data, but I have some here. The attached log is the dmesg output after a firmware restart, and it doesn't show an NMI error. It shows an ERROR_SYSASSERT. Hope this helps.
Note: this is with version 1.0.1; version 1.0.2 will not load for me, see bug #617.
My kernel version: Linux firebolt 2.6.11 #5 Tue Mar 15 18:50:01 EST 2005 i686 Intel(R) Pentium(R) M processor 1.60GHz GenuineIntel GNU/Linux
Created an attachment (id=321) [details] firmware fix
Please try this patch against ipw2200 1.0.2. Try to run the test with the LED disabled, since there are some deadlock problems with the LED, or apply the LED patch if the LED test is needed. Please report any problem you see, together with a log.
patch against ipw2200 1.0.2
I applied your patch and tried to compile, but something is missing from your file. You changed the struct ipw_priv in ipw2200.h and added a field called sem. This change is missing from your patch.
Created an attachment (id=327) [details] firmware fatal fix
Please try this one instead of 321; I forgot to include the ipw2200.h changes.
Created an attachment (id=358) [details] dmesg.out.1.0.3
The bug is verified in 1.0.3.
Environment: FC3-IBM T40, SuSE 9.3-Compaq X1000
Steps:
1. modprobe -r ipw2200
2. modprobe ipw2200 debug=0x40000
3. iwconfig eth1 key restricted (must be done within a few seconds; maybe the error occurs during some connection init actions)
4. Firmware error detected. Restarting...
The dmesg output is in the attachment. I didn't find any other config command that reproduces this error except "iwconfig eth1 key restricted"; "iwconfig eth key open" has no problem.
Sorry, I found the bug again. Not "verified" in my last comment.
Created an attachment (id=363) [details] Slackware Current / 2.6 Kern -> Dmesg Debug Output -> 1.0.3
Created an attachment (id=364) [details] fix for race condition in auth process please try the above patch against ipw2200-1.0.3
I could not reproduce the bug on my machine, but there might be a race condition in the association process. Please try the latest patch against ipw2200 1.0.3. I am not sure whether Mark is getting the same error with the same steps that Yan is using; if not, please provide the full log so I can step through it.
I haven't seen it any more after using the patch from Mohamed.
Mark the bug as resolved: TESTED_PATCH_EXITS.
I had this problem with 1.0.4, and always when I was using Samba to download large quantities of data over the network. I was always able to reassociate, but the samba download couldn't go on. Mohamed's patch made it work perfectly. Just downloaded 350 MB without a glitch. I hope you'll add this patch to the CVS so future versions can be used (the patch gave errors against 1.0.4, only works on 1.0.3). (On a sidenote: The Changelog says bug #592 is fixed since 1.0.2, but has wrong details next to it and it's not patched.)
The patch is valid. Please add it to 1.0.5.
Patch attachment 364 [details] is correct and has been applied, but I am not sure how it fixed this problem.
mark as fixed
Stephan and Mark, Has the behavior in managed mode improved for you guys in 1.0.5? Are the ERROR_NMI_INTERRUPT errors gone?
Well, I have not seen an ERROR_NMI_INTERRUPT for some time. There are still some firmware errors, especially under load. It's better now. I'll tell you if I get new problems.
This error occurs each time I try to access an NFS-mounted file with ipw2200 1.0.6.
NFS over ipw2200 was fairly unreliable with 1.0.4 on big files, but with 1.0.6 it's worse; I get the following messages each time I try to access an NFS-mounted file. I use firmware 2.3, kernel 2.6.12.2. If you want more logs, tell me.
Alexandre
=======
Jul 15 13:22:07 dell17 kernel: ipw2200: Firmware error detected. Restarting.
Jul 15 13:22:25 dell17 last message repeated 4 times
Jul 15 13:22:26 dell17 kernel: nfs: server ifar not responding, timed out
Jul 15 13:22:27 dell17 last message repeated 15 times
Jul 15 13:22:28 dell17 kernel: ipw2200: Firmware error detected. Restarting.
Jul 15 13:23:00 dell17 last message repeated 8 times
Jul 15 13:24:03 dell17 last message repeated 14 times
Jul 15 13:24:22 dell17 last message repeated 5 times
Jul 15 13:24:28 dell17 kernel: nfs: server ifar not responding, timed out
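Since syslog collapses repeats into "last message repeated N times", the number of firmware restarts in an excerpt like the one above is easy to undercount. The following is a rough Python sketch for tallying them; the function name and the regular expressions are illustrative assumptions, not part of the driver or of any syslog tool, and the sample is copied from the comment above.

```python
import re
from collections import Counter

# Sample lines copied from the comment above.
SAMPLE = """\
Jul 15 13:22:07 dell17 kernel: ipw2200: Firmware error detected. Restarting.
Jul 15 13:22:25 dell17 last message repeated 4 times
Jul 15 13:22:26 dell17 kernel: nfs: server ifar not responding, timed out
Jul 15 13:22:27 dell17 last message repeated 15 times
Jul 15 13:22:28 dell17 kernel: ipw2200: Firmware error detected. Restarting.
Jul 15 13:23:00 dell17 last message repeated 8 times
Jul 15 13:24:03 dell17 last message repeated 14 times
Jul 15 13:24:22 dell17 last message repeated 5 times
Jul 15 13:24:28 dell17 kernel: nfs: server ifar not responding, timed out
"""

# Assumed layout: "Mon DD HH:MM:SS host message"
LINE_RE = re.compile(r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+(.*)$")
REPEAT_RE = re.compile(r"^last message repeated (\d+) times?$")


def count_messages(text):
    """Count each message, crediting 'repeated N times' to the message before it."""
    counts = Counter()
    previous = None
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not look like syslog entries
        message = match.group(1)
        repeat = REPEAT_RE.match(message)
        if repeat:
            if previous is not None:
                counts[previous] += int(repeat.group(1))
        else:
            counts[message] += 1
            previous = message
    return counts


counts = count_messages(SAMPLE)
firmware_errors = sum(n for msg, n in counts.items() if "Firmware error detected" in msg)
nfs_timeouts = sum(n for msg, n in counts.items() if "not responding" in msg)
print(firmware_errors, nfs_timeouts)
```

On this excerpt the two explicit "Firmware error detected" lines stand for considerably more restarts once the repeat lines are credited to them.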
Same problem here, with ipw2200 v1.0.6 / ieee80211 v1.0.3 / kernel 2.6.12.2. After some time the managed WEP connection drops:
ipw2200: Firmware error detected. Restarting.
ipw2200: Start IPW Error Log Dump:
ipw2200: Status: 0x000000E0, Config: 00000347
ipw2200: NMI_INTERRUPT 10766004 0x000003b4 0x00000000 0x0001ad50 0x00015dcc 0x00000000
ipw2200: DMA_STATUS 10766008 0x00027d50 0x00027170 0x01540002 0x00000000 0x00000000
ipw2200: DMA_STATUS 10766011 0x00028400 0x00028490 0x00540001 0x00000000 0x00000001
ipw2200: DMA_STATUS 10766015 0x00028000 0x00028000 0x00540000 0x9c6a4200 0x00000002
ipw2200: DMA_STATUS 10766019 0x00400000 0x00408000 0x300081cc 0x00000086 0x00000003
Tell me if you want more logs...
Laurent
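To compare dumps like the one above across reports (for example, to check whether the first entry is always NMI_INTERRUPT), the entries can be split into fields. This is only a sketch: the regular expression and the function name are assumptions based on the lines pasted in this thread, and the meaning of the decimal column and of the five hex words is not stated here, so they are kept as opaque values.

```python
import re

# Dump lines copied from the comment above.
SAMPLE = """\
ipw2200: Firmware error detected. Restarting.
ipw2200: Start IPW Error Log Dump:
ipw2200: Status: 0x000000E0, Config: 00000347
ipw2200: NMI_INTERRUPT 10766004 0x000003b4 0x00000000 0x0001ad50 0x00015dcc 0x00000000
ipw2200: DMA_STATUS 10766008 0x00027d50 0x00027170 0x01540002 0x00000000 0x00000000
ipw2200: DMA_STATUS 10766011 0x00028400 0x00028490 0x00540001 0x00000000 0x00000001
ipw2200: DMA_STATUS 10766015 0x00028000 0x00028000 0x00540000 0x9c6a4200 0x00000002
ipw2200: DMA_STATUS 10766019 0x00400000 0x00408000 0x300081cc 0x00000086 0x00000003
"""

# Assumed entry layout: "ipw2200: NAME <decimal> <hex word> <hex word> ..."
ENTRY_RE = re.compile(r"^ipw2200:\s+([A-Z_]+)\s+(\d+)((?:\s+0x[0-9a-fA-F]+)+)\s*$")


def parse_dump(text):
    """Return a list of (name, counter, words) tuples for the dump entries."""
    entries = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match:
            name, counter, words = match.groups()
            # 'counter' is the decimal column; its unit is not given in the thread.
            entries.append((name, int(counter), [int(w, 16) for w in words.split()]))
    return entries


entries = parse_dump(SAMPLE)
names = [name for name, _, _ in entries]
print(names)
```

The header and "Status:" lines do not match the entry pattern and are skipped, so only the NMI_INTERRUPT and DMA_STATUS rows are returned.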
Would you please attach the dmesg log with debug=0x43fff?
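Several debug values appear in this thread (0x40000, 0xffff, 0x43fff, 0x43000). Assuming the 'debug' parameter is a bitmask, which the combined value 0x43fff suggests, the following Python sketch shows which bit positions each value sets and whether it includes the 0x40000 value requested earlier. The helper name is hypothetical, and no meaning is assigned to individual bits because the thread does not give one.

```python
def set_bits(mask):
    """Return the positions of the bits set in mask, lowest first."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# Value requested earlier in this thread for the firmware status log.
FW_STATUS_LOG = 0x40000

for value in (0x40000, 0x43fff, 0x43000, 0xffff):
    includes = bool(value & FW_STATUS_LOG)
    print(f"{value:#x}: bits {set_bits(value)}, includes 0x40000: {includes}")
```

Under that assumption, 0x43fff and 0x43000 both contain 0x40000, while 0xffff does not, so a log taken with debug=0xffff may lack the firmware status log that was asked for.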
Created an attachment (id=483) [details] A Firmware error log with debug=0x43fff
This occurred this afternoon with 1.0.6, ieee80211 1.0.3, kernel 2.6.12.3. I hope I have supplied enough context to make it useful. The machine is an IBM Thinkpad R52.
0000:04:02.0 0280: 8086:4220 (rev 05)
        Subsystem: 8086:2712
        Flags: bus master, medium devsel, latency 64, IRQ 21
        Memory at 90301000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [dc] Power Management version 2
Created an attachment (id=518) [details] dmesg output ipw 1.0.4
On my machine the error occurs while uploading large amounts of data via Samba or to a local FTP server. The error only occurs with WPA AES during UPLOAD; it always disconnects and reconnects, so throughput is very low. TKIP works fine without disconnecting.
Machine: Acer TravelMate 291LCi using SuSE 9.3, 2.6.11.4-20a-default.
wpa_supplicant 0.38, I also tried 0.40
ieee80211 1.0.3
ipw2200 1.0.6, also tried 1.0.4, and various combinations of wpa_supplicant and ipw2200.
dmesg output for both 1.0.6 and 1.0.4 with debug=0x40000 is attached.
I noticed that the error depends on the transmit power of the AP: if I set it to 20mW everything works well. If txpower is at 150mW or 250mW, I get these firmware errors on my notebook. Very strange....
Created an attachment (id=519) [details] dmesg output ipw2200 1.0.6
I'll attach a copy of the output when I get a chance. I got the same error over and over. It seems the easiest way to trigger it is to download a torrent, perhaps while surfing and downloading other things.
I just wanted to emphasize that this bug is strongly connected with WEP. I had a WPA network for a while, and the symptom was pretty much gone. Now that I'm back on WEP, it starts appearing again.
After upgrading to kernel 2.6.14-rc2, which claims (in dmesg) to include ipw2200 version 1.04 and loads firmware 2.2, the problems have disappeared for me completely. Even running a network benchmark like iperf causes no problems at all, whereas it crashes the firmware after about five seconds with ipw2200 v. 1.06 / fw 2.3 and Ubuntu kernel 2.6.12-9. This is using 128-bit WEP, with the module loaded without any parameters. The config for both kernels is pretty much identical (on 2.6.14 I enabled PREEMPT and selected Pentium M instead of Pentium Pro as the processor).
There is an unrelated warning in dmesg when I load the module:
[17179753.096000] eth1 (WE) : Driver using old /proc/net/wireless support, please fix driver !
Also, I get the message
[17179747.768000] ieee80211_crypt: registered algorithm 'WEP'
which I think did not appear in 2.6.12/ipw1.06/fw2.3 unless I specified hwcrypto=0 (which caused the card not to associate at all).
(In reply to comment #43)
> After upgrading to Kernel 2.6.14-rc2 which claims (in dmesg) to include ipw2200
> version 1.04 and loads firmware 2.2 the problems have disappeared for me
> completely.
Correction: I just got a single firmware error after repeatedly plugging in a USB scanner within maybe 15 seconds. The system froze for a short time because of this (maybe because hotplug uploads firmware to the scanner). The connection recovered immediately after the system had settled down again.
I get this error too, using: kernel-2.6.12-gentoo-r10, ipw2200-1.0.6, ipw2200-firmware-2.3
Created an attachment (id=544) [details] dmesg output (with debug: 0x43000) The output from dmesg.
New instances of fw failures should NO LONGER be added to this bug. Please scan for other bugs that may be related to the problem being seen, and if none is found, open a new bug. Please refer to Ben's mailing list post on how to most effectively report firmware bugs: http://sourceforge.net/mailarchive/forum.php?thread_id=8685492&forum_id=38938
Different reasons for fw failure have been crowded into this same bug report. Assigning to Ben to see which of the issues have already been resolved and which are still nagging bugs, which will be tracked individually.