Bugzilla – Bug 667
"No space for Tx" error when hwcrypto is enabled
Last modified: 2006-10-10 17:20:19
I have come across many issues with the 1.0.4 driver: multiple errors and multiple firmware resets. I've been able to get it to connect, but if I unload and reload it, it may not reconnect the next time. The debug output from syslog is quite long and I'm not familiar with Bugzilla, so I put it on my website: http://chriscarey.us/hardware/myhardware/thinkpad-t41/ipw2200/
Created an attachment (id=385) [details] syslog output 1
Created an attachment (id=386) [details] syslog output 2
Created an attachment (id=387) [details] syslog output 3
I have two access points in WDS mode. They both have the same SSID and are both within range. I'm not sure whether that contributes to the error, but I thought I'd point it out because it is the only non-traditional part of my setup.
I have seen the command failure issue with ipw2200-1.0.4 here as well, which results in the 'no space for tx' message.
Created an attachment (id=388) [details] dmesg output without debug
I have no idea what debug level might yield anything interesting for the above situation; please advise.
Running "echo 100 > /sys/class/firmware/timeout" has improved reliability for me. I was able to remove and modprobe the driver successfully. Once the card is connected to the access point, the errors don't continue and things seem OK. The errors show up during the scanning phase.
The best debug level (for 99% of the issues) is 0x43fff. That will capture the extra firmware data during restarts, and it also traces all internally and externally invoked changes to the state of the driver, allowing us to follow the configuration logic.
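To see which bits a mask such as 0x43fff (or the debug=255 value that also appears in this report) switches on, it can help to list the set bit positions. The following is a minimal Python sketch; the helper name `set_bits` is hypothetical, and what each bit means is defined by the driver source and is not assumed here.

```python
def set_bits(mask):
    """Return the positions of the bits that are set in mask, lowest first."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


if __name__ == "__main__":
    # Only the bit positions are shown; their names live in the driver headers.
    for level in (0x43fff, 255):
        print(f"{level:#x}: bits {set_bits(level)}")
```

For 0x43fff this lists bits 0 through 13 plus bit 18; for 255 it lists bits 0 through 7.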
changing title, per bug scrub
Created an attachment (id=414) [details] fix patch
v1.0.4 is completely unreliable for me due to all the firmware errors. Others in my office have seen the same thing. I'm applying the patch just posted today and I'll test it tomorrow.
The patch (attachment #414 [details]) seems to do it here. I have not seen that error message since.
Darn, just encountered the problem again on system boot. Unfortunately, I didn't catch the error log. Seems the proposed patch doesn't work anyway :/
I can trigger a firmware restart with 1.0.4+patch (applied with some offset fuzz) within 1-3 minutes by trying to transfer a large file to my laptop. This is no change from plain 1.0.4.
Does it happen only when you are using WPA? Please provide dmesg with debug level 0x43fff for the patched version.
I'm just using 128bit WEP. I'll try to get some debug. Is the proper way "modprobe ipw2200 debug=0x43fff" ??
(In reply to comment #16)
> I'm just using 128bit WEP. I'll try to get some debug.
>
> Is the proper way "modprobe ipw2200 debug=0x43fff" ??
It must not be, because I loaded it with that command, then transferred a large file, which, several minutes into the transfer, only resulted in this output:
ipw2200: Firmware error detected. Restarting.
Dax, if you got firmware errors but only "No space for Tx", bug 697 should be the right place to put your comment. It seems you didn't enable CONFIG_IPW_DEBUG in your Makefile (if you have ipw2100 enabled in your kernel .config, you should enable it there).
Brix, do you use encryption when you see the "No space for Tx" warning?
It happens both with and without encryption - but I see it very rarely, and have not been able to capture debug output yet...
I have the same problem here. I don't know if it helps, but I could isolate and reproduce the errors. The problem only occurs when I use wpa_supplicant to connect to an AP. When I use iwconfig to connect without encryption or with WEP encryption, everything works fine. I think it's a problem with TKIP. Whenever I try to connect to my AP, the driver begins to produce the errors after the state in wpa_supplicant changes from 4-WAY HANDSHAKE -> GROUP HANDSHAKE. I have to reset the IPW2200 Firmware with the kill switch; then everything works fine. But when I reload the ieee80211_crypt_tkip module and try again, I get the same errors until I reset the firmware. I have tried this with two different APs. My system is an Acer Travelmate 292 with Ubuntu 5.04 installed. I will try it with an unpatched kernel later.
(In reply to comment #20)
> I have to reset the IPW2200 Firmware with the kill switch.
Is the problem reproducible if you don't reset the kill switch? Did you try the patch in attachment 414 [details]? Please attach your dmesg with debug=255. Does your dmesg contain something like the following?
> failed to send ASSOCIATE command
> failed to send SCAN_REQUEST_EXT command
> failed to send SYSTEM_CONFIG command
> ipw_send_system_config failed
> failed to send SCAN_REQUEST_EXT command
> No space for Tx
Created an attachment (id=423) [details] dmesg output without patch
Created an attachment (id=424) [details] dmesg output with patch
I have added two files with my dmesg output at debug level 255. I wrote some comments between the lines to make clear when the problem begins. For these logs I used the original Ubuntu kernel 2.6.10-5-386 with the ipw2200 driver, with and without patch 414 included. Both produced the same errors on my system. As you can see, there is an endless loop at the end of each file. When I stop wpa_supplicant and restart it, sometimes the second or third try is successful, but only after a firmware restart. On my Travelmate 292 I can reproduce the error by unloading the ieee80211_crypt_tkip module and then turning my kill switch off and on again. The message "ipw2200: Firmware error detected. Restarting" appears. My little workaround is to switch the kill switch off and on again without unloading the module mentioned above. In roughly 70% of my tries I get a connection afterwards. This problem only occurs on my laptop when I use WPA encryption; WEP and "no encryption" work fine. It doesn't matter whether the SSID is hidden or not. I hope this helps you a bit. As I said, I'll try it with a vanilla kernel later.
I finally found a solution for the problem; maybe someone else can check this. I disabled the hwcrypto option of the ipw2200 driver. There seems to be a problem with this option and authentication with WPA-PSK / TKIP. Now everything works fine and I'm happy with a wonderful driver for Linux :-)
Patrick, thanks for reporting this. I've changed the title to indicate that this bug only happens when using hwcrypto. There are some related fixes in 1.0.5 (coming soon); please verify the bug again once it is released.
*** Bug 716 has been marked as a duplicate of this bug. ***
Created an attachment (id=448) [details] debug information using debug=0x43fff
Created an attachment (id=449) [details] debug information using debug=0x43fff
Created an attachment (id=450) [details] debug information using debug=0x43fff
23:32 timeframe: it lost connectivity twice during a large download
23:33 timeframe: it lost connectivity once during a large download
23:34:49-50 timeframe: it lost connectivity once during a large download
23:35:15-17,34-35 timeframe: it lost connectivity twice during a large download
23:36:00-01 timeframe: it lost connectivity once during a large download
/var/log/kern contains all the output for debugging module and kernel related issues. The file can also be found at http://www.ldb-jab.org/bugs/kern or http://www.ldb-jab.org/bugs/kern/gz
Created an attachment (id=452) [details] a patch to try
Please load module without hwcrypto=0 and see if the problem is fixed.
For me, your patch is working fine, thanks a lot. No more problems with hwcrypto enabled.
Please don't mark as FIXED before the patch has been included upstream.
Marking as fixed in v1.0.5.
Those who used to get the "No space for Tx" error in 1.0.4, can you please comment whether this error is gone in 1.0.6? -thx
Not fixed for me. I'm using 1.0.6 now. I tried Yi's patch3 against 1.0.6 and it did not help either. I still use hwcrypto=0 in order for the driver to function.
Marking as reopened. I still cannot use the driver without the hwcrypto=0 switch. I see that the patch worked for Patrick Renkowski. Is anyone else still having this trouble?
Assigning to Yi. From scrub:
<chuyee> this one need more look
<chuyee> I think I did something to make it better, but not totally solved
<chuyee> or do you want a walkaround ;)
<logics_sbux> we really need to nail this one
<logics_sbux> unless we suspect it is being caused by a fw lockup
<logics_sbux> (i noticed a NMI in the most recent debug information attachement)
<chuyee> it is not easily reproducable
<logics_sbux> is the bug about seeing the 'no space for tx' or about firmware restarts?
<chuyee> "no space for tx" is caused by a "firmware halt"
<logics_sbux> ok; so the firmware is dying, trasnfer attempts to continue until all the slots are full?
<logics_sbux> s/trasnfer/transfer/
<chuyee> yeah
<chuyee> but only saw with hwcrypto enabled
<logics_sbux> the specific firmware dumps random, or consistently NMIs?
<chuyee> I think the dump is in the late time after quite a lot "no space"
<chuyee> because firmware is in a unstable state at that time. But the initial reason of "firmware halt" is unclear.
<logics_sbux> hmm
<logics_sbux> i wonder if the root cause is actually overflowing the ring buffer
<logics_sbux> that then confuses things
<logics_sbux> meaning our queue full logic might be in error
<chuyee> the error is in the early phase of association
<chuyee> do you mean your recent change for NETIF_TX_FULL or something?
<chuyee> maybe TX_DROP
<logics_sbux> oh; its during association?
<chuyee> yes
<logics_sbux> that's odd. why would hwcrypto play a role there unless its shared key authentication?
<logics_sbux> (it is consistent that with hwcrypto=0 the problem goes away-- correct?)
<chuyee> it sends TGi key
<chuyee> I thought it might be the sequence of SYSTEM_CONFIG and TGI key
<chuyee> but in bug 792 I change the sequence to the right order, it *fix* for me, but still someone see the bug
<logics_sbux> even if its an open or wep AP we set the TGi?
<chuyee> no, only with AES/TKIP
<chuyee> wep and open no problem
<logics_sbux> ok
<salwan> is the dmesg output that submitter provided with 1.0.4 sufficient for now, or do need new dmesg info with 1.0.6?
<chuyee> yechun said he sometimes can reproduce the bug on his laptop, I can use his laptop to reproduce
<chuyee> so I don't need the log this time
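The mechanism discussed in the scrub (the firmware halts, the driver keeps queuing commands until every slot is taken) can be illustrated with a toy model. This is only a sketch in Python, not the ipw2200 code: the class `TxRing`, its slot count, and the `simulate` helper are hypothetical, and the use of -EBUSY is an assumption based on the "Reason -16" value that appears in a log later in this report.

```python
import errno
from collections import deque


class TxRing:
    """Toy fixed-size transmit queue; the 'firmware' is whoever calls consume()."""

    def __init__(self, slots):
        self.slots = slots
        self.pending = deque()

    def submit(self, command):
        """Queue a command; return 0 on success or -EBUSY when no slot is free."""
        if len(self.pending) >= self.slots:
            print("No space for Tx")
            return -errno.EBUSY
        self.pending.append(command)
        return 0

    def consume(self):
        """Model the firmware completing the oldest command, freeing its slot."""
        return self.pending.popleft() if self.pending else None


def simulate(slots=4, attempts=6, firmware_alive=False):
    """Submit `attempts` commands; a live firmware drains one slot per command."""
    ring = TxRing(slots)
    results = []
    for i in range(attempts):
        results.append(ring.submit(f"cmd{i}"))
        if firmware_alive:
            ring.consume()
    return results
```

With a halted firmware, `simulate()` accepts the first four commands and rejects the rest, while `simulate(firmware_alive=True)` never fills up. In this model the "No space for Tx" message is a symptom of the stalled consumer, which matches the view in the scrub that the initial cause of the firmware halt is the real open question.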
*** Bug 791 has been marked as a duplicate of this bug. ***
*** Bug 828 has been marked as a duplicate of this bug. ***
Please try ipw2200-1.0.11.
I don't know about everyone else on this bug, but I haven't needed hwcrypto=0 for this since 1.0.10 came out (I use it for unrelated reasons. :) )
I'm seeing something very similar to this bug in version 1.1.2kmprq in kernel 2.6.18-rc7. If hwcrypto is enabled, the wireless link stops working after a few seconds of heavy load, with the [ipw2200/0] process eating 100% CPU and the following error:
Oct 9 22:41:33 ophelia kernel: [ 1090.151000] ipw2200: No space for Tx
Oct 9 22:41:33 ophelia kernel: [ 1090.151000] ipw2200: Failed to send SCAN_REQUEST_EXT: Reason -16
(In reply to comment #43)
> Oct 9 22:41:33 ophelia kernel: [ 1090.151000] ipw2200: No space for Tx
> Oct 9 22:41:33 ophelia kernel: [ 1090.151000] ipw2200: Failed to send
> SCAN_REQUEST_EXT: Reason -16
Can you attach the full log please?
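When going through a long kernel log for this failure, the two messages quoted in this comment can be picked out mechanically. The following is a minimal Python sketch, assuming the message formats are exactly as quoted here; the function name `find_tx_errors` and the regular expressions are illustrative and not part of the driver or of any existing tool.

```python
import re

# Patterns assume the exact wording of the two log lines quoted in this report.
NO_SPACE = re.compile(r"ipw2200: No space for Tx")
FAILED = re.compile(r"ipw2200: Failed to send (\w+): Reason (-?\d+)")


def find_tx_errors(lines):
    """Return (no_space_count, [(command, errno), ...]) for the given log lines."""
    no_space = 0
    failures = []
    for line in lines:
        if NO_SPACE.search(line):
            no_space += 1
        match = FAILED.search(line)
        if match:
            failures.append((match.group(1), int(match.group(2))))
    return no_space, failures


sample = [
    "Oct 9 22:41:33 ophelia kernel: [ 1090.151000] ipw2200: No space for Tx",
    "Oct 9 22:41:33 ophelia kernel: [ 1090.151000] ipw2200: Failed to send SCAN_REQUEST_EXT: Reason -16",
]
print(find_tx_errors(sample))
```

Counting the failed commands by name can help show whether the errors start during scanning or during association, which is the distinction raised earlier in this bug.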
need more info