Bugzilla – Bug 1248
oops with kernel 2.6.20
Last modified: 2008-12-08 22:14:08
I am using the ipw3945 packages from atrpms. These are the versions:

ipw3945-1.2.0-18.2.fc6.at
ipw3945d-1.7.22-4.at
ipw3945-kmdl-2.6.19-1.2911.6.4.fc6-1.2.0-18.2.fc6.at
ipw3945-kmdl-2.6.19-1.2911.6.5.fc6-1.2.0-18.2.fc6.at
ipw3945-kmdl-2.6.19-1.2911.fc6-1.2.0-18.2.fc6.at
ipw3945-kmdl-2.6.20-1.2925.fc6-1.2.0-18.2.fc6.at
ipw3945-kmdl-2.6.20-1.2933.fc6-1.2.0-18.2.fc6.at
ipw3945-ucode-1.14.2-4.at

ONLY with the 2.6.20 kernel do I get randomly an oops. This is the latest:

ipw3945: Microcode SW error detected. Restarting.
ipw3945: request scan called when driver not ready.
Mar 30 21:44:13 dufus NetworkManager: <WARNING> nm_device_802_11_wireless_get_essid (): error getting ESSID for device eth1: Resource temporarily unavailable
Mar 30 21:44:15 dufus NetworkManager: <WARNING> nm_device_802_11_wireless_get_essid (): error getting ESSID for device eth1: Resource temporarily unavailable
ipw3945: Error sending ADD_STA: time out after 500ms.
invalid opcode: 0000 [#1]
SMP
last sysfs file: /class/net/eth0/carrier
Modules linked in: arc4 ecb blkcipher ieee80211_crypt_wep rfcomm hidp l2cap bluetooth ohci1394 ieee1394 button usb_storage aes ieee80211_crypt_ccmp vfat fat ipt_LOG xt_tcpudp nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink xt_multiport iptable_filter ip_tables x_tables nfsd exportfs lockd nfs_acl autofs4 vmnet(P)(U) vmmon(P)(U) sunrpc cpufreq_ondemand video sbs i2c_ec dock battery asus_acpi backlight ac parport_pc lp parport snd_hda_intel snd_hda_codec snd_seq_dummy ipw3945(F)(U) snd_seq_oss snd_seq_midi_event snd_seq ieee80211 snd_seq_device joydev iTCO_wdt ieee80211_crypt nvidia(P)(U) snd_pcm_oss snd_mixer_oss iTCO_vendor_support snd_pcm sr_mod snd_timer cdrom serio_raw snd tg3 ide_cs pcspkr sg soundcore i2c_i801 snd_page_alloc i2c_core dm_snapshot dm_zero dm_mirror dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
CPU: 0
EIP: 0060:[<c0434847>] Tainted: PF VLI
EFLAGS: 00010206 (2.6.20-1.2933.fc6 #1)
EIP is at run_workqueue+0x8a/0x125
eax: f73fcd4c ebx: f73fe258 ecx: 00000018 edx: 00000018
esi: f7782e40 edi: 00000246 ebp: f73fe254 esp: f7d2cf6c
ds: 007b es: 007b ss: 0068
Process ipw3945/0 (pid: 1581, ti=f7d2c000 task=f71ccbf0 task.ti=f7d2c000)
Stack: 00000000 00000282 f70ccd34 c061ffaf f903b9d4 f7782e40 f70ccd34 f7d2cfbc
       00000000 c04351b4 00000001 00000000 00000000 00010000 00000000 00000000
       f71ccbf0 c04226ab 00100100 00200200 ffffffff ffffffff f7782e40 c04350bb
Call Trace:
 [<c061ffaf>] _spin_lock_irqsave+0x9/0xd
 [<f903b9d4>] ipw_bg_disassociate+0x0/0x31 [ipw3945]
 [<c04351b4>] worker_thread+0xf9/0x124
 [<c04226ab>] default_wake_function+0x0/0xc
Mar 30 21:44:17 dufus NetworkManager: <WARNING> nm_device_802_11_wireless_get_essid (): error getting ESSID for device eth1: Resource temporarily unavailable
 [<c04350bb>] worker_thread+0x0/0x124
Mar 30 21:44:20 dufus NetworkManager: <WARNING> nm_device_802_11_wireless_get_essid (): error getting ESSID for device eth1: Resource temporarily unavailable
 [<c04377c7>] kthread+0xb0/0xd9
 [<c0437717>] kthread+0x0/0xd9
 [<c0404b33>] kernel_thread_helper+0x7/0x10
=======================
Code: e8 9f b7 1e 00 8b 43 fc 83 e0 fc 39 f0 74 04 0f 0b eb fe 8b 43 fc a8 02 75 06 f0 0f ba 73 fc 00 89 e8 ff 54 24 10 89 e0 25 00 f0 <ff> ff 8b 48 14 f7 c1 ff ff ff ef 74 48 65 a1 08 00 00 00 8b 90
Can you please try the patch below for ipw3945? After applying the patch, recompile and reinstall, then run `modprobe ipw3945 debug=0x43fff`. Please report back with dmesg if you still see the oops.
Created an attachment (id=1020) [details] a patch to try
Created an attachment (id=1021) [details]
kernel messages

Ok, got another last night. I ran dmesg as advised, but all I got was:

ated.
ipw3945: I ipw_net_hard_start_xmit Tx attempt while not associated.
(the last line repeated 1846 times)

I'll upload the messages from /var/log/messages about the oops. thanks.
Thanks for testing. Can you tell me under what conditions you got the oops? During normal transfers or at driver unload time?
Not really sure. I've gotten it at night while the laptop is just sitting, while I was using it reading email, and once while I was surfing. It has NOT happened while I was doing heavy file transfers, compiles, or such. Last night, nothing happened at all so it seems to be random, or at least from the user perspective. I'm connecting to a linksys rev 6 wrt54g using WEP personal where the group key renewal is 600 sec. Thank you. I think I mentioned that this just started with the FC6 2.6.20 kernels and the 2.6.19 never had a problem.
Hello, I am also seeing this problem occasionally (no special conditions) with the 2.6.20-1.2933.fc6 kernel and the drivers from atrpms:

ipw3945d-1.7.22-4.at
ipw3945-ucode-1.14.2-4.at
ipw3945-kmdl-2.6.20-1.2933.fc6-1.2.0-18.2.fc6.at
ieee80211-kmdl-2.6.20-1.2933.fc6-1.2.16-17.fc6.at

Here are the messages; they are similar to those in the earlier reported case.

Apr 7 15:43:56 localhost kernel: ipw3945: Error sending cmd #07 to daemon: time out after 500ms.
Apr 7 15:43:58 localhost kernel: ipw3945: Error sending SCAN_ABORT_CMD: time out after 500ms.
Apr 7 15:43:58 localhost kernel: ipw3945: Error sending cmd #08 to daemon: time out after 500ms.
Apr 7 15:43:59 localhost kernel: ipw3945: Error sending ADD_STA: time out after 500ms.
Apr 7 15:43:59 localhost kernel: invalid opcode: 0000 [#1]
Apr 7 15:43:59 localhost kernel: SMP
Apr 7 15:43:59 localhost kernel: last sysfs file: /devices/pci0000:00/0000:00:1c.1/0000:0c:00.0/cmd
Apr 7 15:43:59 localhost kernel: Modules linked in: vfat fat usb_storage aes ieee80211_crypt_ccmp(F)(U) ipw3945(F)(U) ieee80211(F)(U) ieee80211_crypt(F)(U) autofs4 hidp rfcomm l2cap bluetooth sunrpc nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables cpufreq_ondemand video sbs i2c_ec dock button battery asus_acpi backlight ac ipv6 parport_pc lp parport joydev snd_hda_intel snd_hda_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device sr_mod nvidia(PF)(U) snd_pcm_oss snd_mixer_oss tg3 i2c_i801 cdrom serio_raw ohci1394 sdhci iTCO_wdt pcspkr snd_pcm i2c_core mmc_core ieee1394 iTCO_vendor_support snd_timer sg snd soundcore snd_page_alloc dm_snapshot dm_zero dm_mirror dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Apr 7 15:43:59 localhost kernel: CPU: 0
Apr 7 15:43:59 localhost kernel: EIP: 0060:[<c0434847>] Tainted: PF VLI
Apr 7 15:43:59 localhost kernel: EFLAGS: 00010216 (2.6.20-1.2933.fc6 #1)
Apr 7 15:43:59 localhost kernel: EIP is at run_workqueue+0x8a/0x125
Apr 7 15:43:59 localhost kernel: eax: f4954d4c ebx: f4956258 ecx: 00000018 edx: 00000018
Apr 7 15:43:59 localhost kernel: esi: f694ebc0 edi: 00000246 ebp: f4956254 esp: f4fedf6c
Apr 7 15:43:59 localhost kernel: ds: 007b es: 007b ss: 0068
Apr 7 15:43:59 localhost kernel: Process ipw3945/0 (pid: 3392, ti=f4fed000 task=f6d83470 task.ti=f4fed000)
Apr 7 15:43:59 localhost kernel: Stack: 00000000 00000282 f4918d34 c061ffaf f8ad19d4 f694ebc0 f4918d34 f4fedfbc
Apr 7 15:43:59 localhost kernel: 00000000 c04351b4 00000001 00000000 00000000 00010000 00000000 00000000
Apr 7 15:43:59 localhost kernel: f6d83470 c04226ab 00100100 00200200 ffffffff ffffffff f694ebc0 c04350bb
Apr 7 15:43:59 localhost kernel: Call Trace:
Apr 7 15:43:59 localhost kernel: [<c061ffaf>] _spin_lock_irqsave+0x9/0xd
Apr 7 15:43:59 localhost kernel: [<f8ad19d4>] ipw_bg_disassociate+0x0/0x31 [ipw3945]
Apr 7 15:43:59 localhost kernel: [<c04351b4>] worker_thread+0xf9/0x124
Apr 7 15:43:59 localhost kernel: [<c04226ab>] default_wake_function+0x0/0xc
Apr 7 15:43:59 localhost kernel: [<c04350bb>] worker_thread+0x0/0x124
Apr 7 15:43:59 localhost kernel: [<c04377c7>] kthread+0xb0/0xd9
Apr 7 15:43:59 localhost kernel: [<c0437717>] kthread+0x0/0xd9
Apr 7 15:43:59 localhost kernel: [<c0404b32>] kernel_thread_helper+0x6/0x10
Apr 7 15:43:59 localhost kernel: =======================
Apr 7 15:43:59 localhost kernel: Code: e8 9f b7 1e 00 8b 43 fc 83 e0 fc 39 f0 74 04 0f 0b eb fe 8b 43 fc a8 02 75 06 f0 0f ba 73 fc 00 89 e8 ff 54 24 10 89 e0 25 00 f0 <ff> ff 8b 48 14 f7 c1 ff ff ff ef 74 48 65 a1 08 00 00 00 8b 90
Apr 7 15:43:59 localhost kernel: EIP: [<c0434847>] run_workqueue+0x8a/0x125 SS:ESP 0068:f4fedf6c

I'm running on a Dell XPS M1710.
Created an attachment (id=1023) [details]
2nd try

Here is another patch to try. Please attach dmesg with debug=0x6bfff if you see the oops again.

BTW, Dan Krejsa, did you see a firmware error before the oops?
*** Bug 1256 has been marked as a duplicate of this bug. ***
Created an attachment (id=1025) [details]
3rd try

This patch works around a stack overwrite bug. Please help test whether it fixes the oops.
Created an attachment (id=1026) [details]
oops for patch #2

I had an oops last night, logged in to see "try #3", so I will. This is an oops for "try #2". This is what was on the terminal (minus the syslog part):

dufus kernel: Oops: 0002 [#1]
dufus kernel: SMP
dufus kernel: CPU: 0
dufus kernel: EIP: 0060:[<f903ba85>] Tainted: PF VLI
dufus kernel: EFLAGS: 00010286 (2.6.20-1.2933.fc6 #1)
dufus kernel: EIP is at ipw_bg_disassociate+0x46/0x55 [ipw3945]
dufus kernel: eax: 00000001 ebx: f6c2cd4c ecx: 00000018 edx: 00000246
dufus kernel: esi: f6c2cbd0 edi: 00000246 ebp: f6c2e254 esp: f773bf5c
dufus kernel: ds: 007b es: 007b ss: 0068
dufus kernel: Process ipw3945/0 (pid: 1511, ti=f773b000 task=f701d430 task.ti=f773b000)
dufus kernel: Stack: f9053869 f6c2e258 f7375c40 c0434842 00000000 00000282 f7fc5d34 c061ffaf
dufus kernel: f903ba3f f7375c40 f7fc5d34 f773bfbc 00000000 c04351b4 00000001 00000000
dufus kernel: 00000001 00010000 00000000 00000000 f701d430 c04226ab 00100100 00200200
dufus kernel: Call Trace:
dufus kernel: [<c0434842>] run_workqueue+0x85/0x125
dufus kernel: [<c061ffaf>] _spin_lock_irqsave+0x9/0xd
dufus kernel: [<f903ba3f>] ipw_bg_disassociate+0x0/0x55 [ipw3945]
dufus kernel: [<c04351b4>] worker_thread+0xf9/0x124
dufus kernel: [<c04226ab>] default_wake_function+0x0/0xc
dufus kernel: [<c04350bb>] worker_thread+0x0/0x124
dufus kernel: [<c04377c7>] kthread+0xb0/0xd9
dufus kernel: [<c0437717>] kthread+0x0/0xd9
dufus kernel: [<c0404b33>] kernel_thread_helper+0x7/0x10
dufus kernel: =======================
dufus kernel: Code: 05 f9 81 eb 08 15 00 00 e8 72 c1 3e c7 e8 83 9b 3c c7 c7 04 24 69 38 05 f9 e8 61 c1 3e c7 89 d8 e8 42 38 5e c7 89 f0 e8 fa fe ff <ff> 89 d8 5b 5b 5e e9 e0 37 5e c7 59 5b 5e c3 55 57 56 89 c6 53
dufus kernel: EIP: [<f903ba85>] ipw_bg_disassociate+0x46/0x55 [ipw3945] SS:ESP 0068:f773bf5c

I'll try the "try #3". I'm patching the clean ipw3945.c from the ipw3945-linux-1.2.0.tgz. Is that correct? Thanks.
(In reply to comment #10)
> I'll try the "try #3".

Thanks.

> I'm patching the clean ipw3945.c from the ipw3945-linux-1.2.0.tgz. Is that
> correct?

Yes. It would be appreciated if you could load the module with "modprobe ipw3945 debug=0x6bfff" when trying the 3rd patch and attach the full log (not only the oops but also the ipw3945-related messages logged before it).
Created an attachment (id=1029) [details] kernel BUG with "3rd-try" patch Tested patching against 1.2.0, kernel 2.6.20-1.2933.fc6. After OOPS the computer hangs.
(In reply to comment #7)
> Created an attachment (id=1023) [details]
> 2nd try
>
> Here is another patch to try. Please attach dmesg with debug=0x6bfff if you see
> the oops again.
>
> BTW, Dan Krejsa, did you see firmware error before the oops?

Hi, sorry for not checking back in a while. No, I didn't see the firmware error (no 'ipw3945: Microcode SW error detected').
Created an attachment (id=1031) [details]
4th try

I think I've found the root cause. In ipw_send_cmd(), cmd->meta.u.skb is actually the same storage as cmd->meta.u.source, because meta.u is a union. A non-NULL value there does not necessarily mean it is an skb, so we cannot just free it whenever it is not NULL. Please try this patch and see if the oops happens again.
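For readers following along, the hazard described in this comment can be sketched in plain user-space C. This is an illustration only, not the attached patch and not the driver's real definitions: the field names `skb` and `source` come from the comment above, while `struct fake_skb`, the `kind` tag, and both helper functions are hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a socket buffer; the real driver uses struct sk_buff. */
struct fake_skb {
    int len;
};

/* Hypothetical tag recording which union member is currently meaningful. */
enum meta_kind { META_NONE, META_SKB, META_SOURCE };

/* Assumed layout: only the member names skb and source come from the bug comment. */
struct cmd_meta {
    enum meta_kind kind;
    union {
        struct fake_skb *skb; /* buffer owned by the command, may be freed */
        void *source;         /* borrowed pointer, must never be freed here */
    } u;
};

/* Returns 1 when both members start at the same address, which is always
 * the case for members of a union. */
int members_alias(void)
{
    struct cmd_meta m;
    return (void *)&m.u.skb == (void *)&m.u.source;
}

/* Frees the buffer only when the tag says the union holds an skb.
 * Testing u.skb != NULL alone would also match a stored source pointer.
 * Returns 1 if something was freed, 0 otherwise. */
int cmd_meta_release(struct cmd_meta *m)
{
    assert(m != NULL);
    if (m->kind != META_SKB || m->u.skb == NULL)
        return 0;
    free(m->u.skb);
    m->u.skb = NULL;
    m->kind = META_NONE;
    return 1;
}
```

With a tag like this, a command whose union holds `source` is left untouched by the release path, whereas a bare NULL check on `skb` would try to free memory the command does not own. Whether the 4th-try patch uses a tag or some other way of telling the two cases apart is not stated in this thread.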
Well, nothing so far, I've been using "4th try" and nothing yet. I'll keep testing. Also, FC6 upgraded to kernel 2.6.20-1.2944. But it looks good.
Created an attachment (id=1033) [details]
syslog of firmware error with "4-try" patch

With the 4-try patch there is no more kernel bug, only a firmware error (see the attachment). I discovered that there is another access point (probably in a nearby house) with a strong signal, on a different channel and with WEP encryption. I noticed that the firmware errors show up more frequently when I take the notebook to where the signal from the "foreign" AP is stronger. I don't know if it's related, but I think you should know...
The ipw3945 driver was replaced by iwl3945 in the official kernel a long time ago. We suggest using the iwl3945 driver instead of the obsolete ipw3945 driver. If you find a bug, please report it with product=iwlwifi and platform="Intel(R) Wifi Link 3945". Thanks so much!