Bug 608 - stability problems - problem choosing 802.11b instead of 802.11g
Status: CLOSED NEEDSMOREDATA
Product: IPW2200
Component: __UNSPECIFIED__
Version: 1.0.1
Platform: Dell All
Importance: P1 major
Reported: 2005-03-21 23:53
Modified: 2005-12-07 15:21


Attachments
capture of dmesg when debug enabled (15.40 KB, text/plain)
2005-03-30 22:43, Klavs Klavsen
fix for firmware fatal error (30.94 KB, patch)
2005-04-02 10:37, Mohamed Abbas
firmware fatal fix (31.55 KB, patch)
2005-04-02 18:42, Mohamed Abbas
Linux-2.6.11-gentoo-r1-kernelconfig (32.22 KB, text/plain)
2005-04-06 00:29, Klavs Klavsen




Description From 2005-03-21 23:53:00
I've got several stability problems with my wireless card (I'm running in ad-hoc
mode with a WEP key, and only just got that working at all, in bug 587).

1. When I initially configure my card, I use a script that does this:
iwconfig eth1 enc restricted <thekey>
iwconfig eth1 mode Ad-Hoc
iwconfig eth1 essid MYESSID

When I run it, the iwconfig output says 802.11g, but the other end is 802.11b,
so obviously there's no connectivity.
If I add a sleep 2 between each command, it works like a charm.
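
For reference, the workaround can be sketched as a small shell function that pauses between the three iwconfig calls. The names configure_adhoc, IFACE and PAUSE are only illustrative. The two-second pause is simply the value that worked here and may not be the minimum needed.

```sh
#!/bin/sh
# Sketch: apply the ad-hoc settings one at a time, pausing between calls.
# configure_adhoc, IFACE and PAUSE are hypothetical names for illustration.
IFACE=${IFACE:-eth1}
PAUSE=${PAUSE:-2}

configure_adhoc() {
    key=$1
    essid=$2
    iwconfig "$IFACE" enc restricted "$key" || return 1
    sleep "$PAUSE"
    iwconfig "$IFACE" mode Ad-Hoc || return 1
    sleep "$PAUSE"
    iwconfig "$IFACE" essid "$essid" || return 1
}

# Usage (as root): configure_adhoc <thekey> MYESSID
```

Stopping at the first failing call keeps a half-configured card from looking like a successful setup.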

2. If I transfer a big file from the local network (so the wireless should hit
its transfer ceiling), the card also switches to 802.11g and I'm thrown off.

The worst part about this is that there's apparently no way to tell the card
that I want it to use 802.11b, now that it can't figure it out by itself :(
------- Comment #1 From 2005-03-22 00:17:56 -------
You can set the card mode w/

$ iwpriv eth1 set_mode 2

1 == 802.11a, 2 == 802.11b, 3 = 802.11g

and verify w/

$ iwpriv eth1 get_mode
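
Put together, forcing 802.11b and checking the result might look like the sketch below. force_11b is an illustrative name. The exact text printed by get_mode is not shown in this report, so the function passes it through rather than parsing it.

```sh
#!/bin/sh
# Sketch: force 802.11b (set_mode 2) and print whatever get_mode reports.
# force_11b is a hypothetical helper name; only set_mode and get_mode
# come from the comment above.
force_11b() {
    iface=${1:-eth1}
    iwpriv "$iface" set_mode 2 || return 1
    iwpriv "$iface" get_mode
}

# Usage (as root): force_11b eth1
```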
------- Comment #2 From 2005-03-22 00:27:40 -------
Perfect. As soon as I get time, I'll see if I can fix it that way when I get
thrown off. Still, it shouldn't switch modes when sending a lot of traffic :)
------- Comment #3 From 2005-03-22 07:56:15 -------
I just had the problem (not transferring much data) - and I can fix it by
setting: iwpriv eth1 set_mode 2
but it only works for half a minute, and then I'm thrown off again. The iwconfig
status doesn't even show 802.11g - it still shows 802.11b - but setting the mode
again, fixes it again.

Also I've noticed, that the link quality says=100 - when it's "offline" - and
when I force it into mode 2, it says 100/100 - and then goes down to approx.
85-90/100 in my case (which means it is actually measuring).
------- Comment #4 From 2005-03-22 22:49:03 -------
Was your laptop still associated after the big file transfer stopped? What's
the result of `iwpriv eth1 get_mode` at this point?
------- Comment #5 From 2005-03-22 23:29:41 -------
I just got online, and had to do iwpriv eth1 set_mode 2 several times, for it to
work.

it still shows and says (also get_mode) 802.11b and says it's associated and
all. The only difference, is the link quality - which is 100/100 (and then drops
to about 87/100) when it's connected - and when nothing is getting through, it
says link quality=100 (and not 100/100).
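
The one consistent symptom so far is the format of the quality field: "NN/NN" while traffic flows, a bare number when nothing gets through. A check for that could be sketched as below. link_alive is an illustrative name, and the pattern assumes iwconfig prints the field as "Quality=" on this setup.

```sh
#!/bin/sh
# Sketch: succeed only if the text on stdin contains a quality value of
# the form "NN/NN" (the format seen while the link is passing traffic).
# link_alive is a hypothetical helper name.
link_alive() {
    grep -Eq 'uality=[0-9]+/[0-9]+'
}

# Usage: iwconfig eth1 | link_alive || iwpriv eth1 set_mode 2
```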

I guess the shift to 802.11g is something it does after a while of being
disconnected or something. If you can give me a debug version, I'll post the
debug info it outputs.
------- Comment #6 From 2005-03-24 07:57:02 -------
Log from scrub:

logics_sbux	for 608, i wonder if we are configuring the card to use all data rates in ad-hoc instead of limiting based on what other cells are in the network
crystal_	will it be the same case in BSS?
chuyee	logics: does the second sta need to support all rates of the first sta to associate?
logics_sbux	crystal: no, bss limits the rates based on the AP
logics_sbux	chuyee: i'm not sure
logics_sbux	if you have a 3 node network, one b/g, one g-only and one b-only...
logics_sbux	should they all be able to join if the b/g node creates the network?
logics_sbux	and if so, how do we configure the driver to use b-rates with one node and g-rates with another
logics_sbux	or if you have two b/g cards and one b card... can the two g-cards use 54mbs and the b card just use 11mb?
crystal_	but b card should not be able to communicate with g card.
logics_sbux	crystal: correct; it wouldn't be able to communicate with that node in the network-- similar to if they were out of range
crystal_	logics: so it will be in trouble if there is 1 b/g, 1 b, 1 g..... how to create the adhoc network...
crystal_	once the mode is decided, it should not be changed... I think
crystal_	such as if b/g create a g ad-hoc network, only g can associate with it. b can't.
logics_sbux	this might be covered in the 802.11 spec
logics_sbux	or one of the networking books
logics_sbux	it's one of those topics i keep meaning to research more, but haven't had time
logics_sbux	assign this one to mohamed
logics_sbux	he's been looking at the ad-hoc stuff recently; he may have found the data we seek
salwan	ok
crystal_	P1?
logics_sbux	yeah
------- Comment #7 From 2005-03-29 03:11:46 -------
Hi. Just wanted to let you know, It's driving me crazy :(

EVERY time I have more than just a little traffic (unless sometimes, when
downloading steadily it can run smoothly) it dies.

An emerge sync (Gentoo update of tree - an rsync actually) kills it every few
seconds at worst. 

It would help me a lot if I could set a "force_mode 2" option for the module, to
make it not even consider switching.
Every time it goes bad coincides with a "firmware error" message in the kernel
log. So the real problem, it seems, is twofold.

1: it sometimes chooses the wrong mode upon firmware restart
2: the firmware is apparently very error-prone, sometimes failing the second I
do set_mode, so I have to do it one or two more times.

I would very much like to help you debug this issue, if you would give me a
version that outputs debug info (just use kmsg's) - and I'll post right back
with the output :)
------- Comment #8 From 2005-03-29 13:09:23 -------
Klavs,
I cannot follow how you produce this error. Can you, if possible, write out step
by step how you make it fail with the "Firmware Fatal error", so I can trace it
down? Also, can you run this command once you have loaded the driver:
#echo 0x43FFF > /sys/bus/pci/drivers/ipw2200/debug_level
Then capture the dmesg output into a file after the fatal error happens. I am
working on a patch that might solve this problem; I will post it once it is
working right.
Thanks
Mohamed
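
The two steps requested above (raise the debug level, then save the kernel log after the failure) could be wrapped as in the sketch below. The sysfs path and the 0x43FFF value are the ones given in this comment. The function names, the DEBUG_FILE variable and the default output file name are made up for illustration.

```sh
#!/bin/sh
# Sketch: raise the ipw2200 debug level, then save the kernel log.
# DEBUG_FILE defaults to the sysfs path given above; enable_debug,
# capture_dmesg and ipw2200-dmesg.txt are hypothetical names.
DEBUG_FILE=${DEBUG_FILE:-/sys/bus/pci/drivers/ipw2200/debug_level}

enable_debug() {
    echo "${1:-0x43FFF}" > "$DEBUG_FILE"
}

capture_dmesg() {
    dmesg > "${1:-ipw2200-dmesg.txt}"
}

# Usage (as root): enable_debug, reproduce the failure, then capture_dmesg
```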
------- Comment #9 From 2005-03-30 22:43:55 -------
Created an attachment (id=312) [details]
capture of dmesg when debug enabled

capture of dmesg, from when the Firmware error occurred.
------- Comment #10 From 2005-03-30 22:47:23 -------
I've attached the debug info. There's one weird experience I just had. I was at
a meeting, and used the wireless for 802.11g in Managed mode, and it worked just
fine. That night, and the ENTIRE day after that the 802.11b in Ad-Hoc worked
without ever getting thrown off. 
Today it's back with a vengeance, throwing me off as soon as I make more than a
small sniff on the network (had to do: iwpriv eth1 set_mode 2 approx. 10 times,
to get the attached file uploaded :)

It seems it had two good days, after having been on a 802.11g network.. I have
no plausible explanation at all.
------- Comment #11 From 2005-03-30 22:55:28 -------
I forgot to note, that I have never been on an 802.11g network before that day,
so I don't know if this scenario will happen every time. As soon as I get a
chance to  do it again, I'll try and see if it happens again :)
------- Comment #12 From 2005-03-30 23:21:28 -------
Here's a script for everyone, that restarts the connection, when it dies :)
#!/usr/bin/perl
# Re-applies 802.11b mode (set_mode 2) whenever the eth1 link looks dead.
use strict;
use warnings;

while (1)
{
   my $result = `iwconfig eth1`;
   # A live link reports the quality as "NN/NN"; a dead one omits the "/NN".
   if ($result !~ /uality=[0-9]+\/[0-9]+/)
   {
      system("iwpriv", "eth1", "set_mode", "2");
      print "eth1 failed again!\n";
   }
   sleep 2;
}
------- Comment #13 From 2005-04-02 10:37:15 -------
Created an attachment (id=320) [details]
fix for firmware fatal error

Please try this patch against ipw 1.0.2. Try to run the test with the LED
disabled, since there is a deadlock problem with the LED code, or apply the LED
patch if you need to test with the LED. Please report any problem you see, with
a log.
------- Comment #14 From 2005-04-02 10:37:53 -------
patch against ipw2200 1.0.2
------- Comment #15 From 2005-04-02 18:42:02 -------
Created an attachment (id=326) [details]
firmware fatal fix

Please try this one instead of 321; I forgot to include the ipw2200.h changes.
------- Comment #16 From 2005-04-05 00:56:56 -------
patching file ipw2200.h
Hunk #2 FAILED at 1008.
1 out of 2 hunks FAILED -- saving rejects to file ipw2200.h.rej
patching file ipw2200.c
Hunk #1 succeeded at 69 (offset -1 lines).
Hunk #2 succeeded at 3451 (offset 2747 lines).
Hunk #3 succeeded at 3500 (offset 2747 lines).
Hunk #4 succeeded at 3539 (offset 2747 lines).
Hunk #5 succeeded at 3552 (offset 2747 lines).
Hunk #6 succeeded at 3580 (offset 2747 lines).
Hunk #7 FAILED at 4456.
Hunk #8 FAILED at 4493.
Hunk #9 FAILED at 5766.
Hunk #10 FAILED at 6144.
Hunk #11 FAILED at 6480.
..
Hunk #78 succeeded at 7986 with fuzz 2 (offset 44 lines).
Hunk #79 FAILED at 8006.
Hunk #80 FAILED at 8014.
Hunk #81 FAILED at 8046.
Hunk #82 FAILED at 8067.
Hunk #83 succeeded at 8021 (offset -35 lines).
Hunk #84 succeeded at 8088 (offset -35 lines).
Hunk #85 succeeded at 8175 (offset -35 lines).
Hunk #86 succeeded at 8201 (offset -35 lines).
Hunk #87 succeeded at 8211 (offset -35 lines).
Hunk #88 succeeded at 8235 (offset -35 lines).
Hunk #89 succeeded at 8266 (offset -35 lines).
Hunk #90 succeeded at 8277 (offset -35 lines).
Hunk #91 succeeded at 8306 (offset -35 lines).
Hunk #92 succeeded at 8410 (offset -36 lines).
Hunk #93 succeeded at 8545 (offset -42 lines).
Hunk #94 succeeded at 8570 (offset -42 lines).
9 out of 94 hunks FAILED -- saving rejects to file ipw2200.c.rej

I'm going to try to merge them in manually - but odds are I'll make a mistake.

This is a patch against v1.0.2.
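
With this many hunks, it may help to pull just the failures out of the patch output before merging by hand. A rough sketch, with failed_hunks as an illustrative name; the pattern matches the "Hunk #N FAILED at LINE." lines shown above. Hunk numbers restart for each patched file, so the output is easiest to read one file at a time.

```sh
#!/bin/sh
# Sketch: print "hunk line" pairs for every failed hunk in patch output.
# failed_hunks is a hypothetical helper; it reads the output on stdin.
failed_hunks() {
    sed -n 's/^Hunk #\([0-9][0-9]*\) FAILED at \([0-9][0-9]*\)\.$/\1 \2/p'
}

# Usage: patch -p1 --dry-run < patchfile 2>&1 | failed_hunks
```

GNU patch's --dry-run option reports what would happen without touching the source tree, so the check can be repeated against different tarballs.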
------- Comment #17 From 2005-04-05 01:29:29 -------
Tried emerging 1.0.2 without patching anything. It "craps out" when trying to
modprobe the ipw2200 module, and the machine freezes 2 seconds later, so I
can't save the message :(

I'll gladly give you my kernel config or whatever you need to reproduce.
------- Comment #18 From 2005-04-05 07:33:09 -------
Yes, please attach the kernel config and any logs that can help debug this. This
patch should work fine on 1.0.2; I don't know why the hunks are failing when you
apply it. Just make sure you get plain 1.0.2 and disable the LED. Also try
loading the driver with ". load": don't install the driver, just load it from
where you built it.
------- Comment #19 From 2005-04-06 00:29:29 -------
Created an attachment (id=331) [details]
Linux-2.6.11-gentoo-r1-kernelconfig

Hi Mohamed,

I've attached the kernel config. The wireless has been more stable the last few
days - not so much need for my fixwlan.pl script :) - some days it's bad, some
days it's not. Yesterday it totally went bananas: it had link but no
connectivity, and I couldn't get on the WLAN with another machine until I
removed the driver (rmmod ipw2200) from the box. I rebooted and it worked fine
again. Just re-inserting the driver didn't help.

I have a few questions:
1) How do I disable led? (what is led? a led is AFAIK a small light-diode :)
2) what do you mean by I need to . load the driver? I load it by doing:
modprobe ipw2200
3)I untarred the ipw2200-1.0.2.tgz (md5:a6974b0b7399a53fe497b68bd2fca10a) and
did: cat patchfile | patch -p1 (when in the ipw2200-1.0.2 dir) and the patch
fails as described. We seem to be working on different 1.0.2 release versions?
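
To rule out a mismatched tarball on either side, the checksum comparison can be done mechanically. A minimal sketch, assuming GNU coreutils md5sum is available; md5_matches is an illustrative name.

```sh
#!/bin/sh
# Sketch: compare a file's MD5 sum against an expected value.
# md5_matches is a hypothetical helper name.
md5_matches() {
    actual=$(md5sum "$1" | cut -d' ' -f1)
    [ "$actual" = "$2" ]
}

# Usage: md5_matches ipw2200-1.0.2.tgz a6974b0b7399a53fe497b68bd2fca10a
```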
------- Comment #20 From 2005-04-07 12:00:03 -------
Actually, you can use pre-1.0.3 from Ketrenos and try it out. By default the LED
is disabled, so if you never enabled it you don't have to worry about the LED.
What I meant by ". load" is: after you compile your ipw2200 driver, don't run
#make install
Instead, load the driver from the directory where you built it:
#. load debug=0x43FFF
load is a script in the ipw2200 directory. Just make sure to uninstall the old
ipw2200 driver with
#make uninstall
before you build and install the new driver.
------- Comment #21 From 2005-04-07 12:11:22 -------
I would gladly test 1.0.3-pre, but the code is not in CVS at SF, and I can't
find Ketrenos' homepage (I presume you mean James Ketrenos - I googled :), so I
don't know where I can find snapshots of 1.0.3 or the CVS repository.
------- Comment #22 From 2005-04-11 23:40:53 -------
please verify on 1.0.3
------- Comment #23 From 2005-04-12 00:14:58 -------
I'm sorry I haven't had time to test it - I've moved my location a bit, and
since then it has been a lot more stable, rarely failing.

Still running 1.0.1 with patch though.

I'll test the latest prerelease this week.
------- Comment #24 From 2005-04-21 10:02:39 -------
I finally (sorry for the delay) got to test 1.0.3 - it works without patching,
but it seems to be more unstable.

sometimes I get the firmware error when the connection goes down - but sometimes
I don't - and then I have to rmmod the ipw2200 module, and modprobe it again :(
This didn't happen with 1.0.1 version.

My fixwlan.pl script serves me well - but does not work in the second situation,
where it simply dies (it says unassociated and keeps the link quality to
something that looks valid, but the connection is dead).
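
For the second situation, a separate check could reload the module instead of calling set_mode. This is only a sketch: reload_if_unassociated is an illustrative name, and matching on the word "unassociated" in the iwconfig output is an assumption based on what I see on screen. Re-inserting the module did not help on one earlier occasion, so this may not always recover the link.

```sh
#!/bin/sh
# Sketch: reload the ipw2200 module when iwconfig reports "unassociated".
# reload_if_unassociated is a hypothetical name; the match string is an
# assumption taken from the description above.
reload_if_unassociated() {
    iface=${1:-eth1}
    if iwconfig "$iface" 2>/dev/null | grep -q 'unassociated'; then
        rmmod ipw2200 && modprobe ipw2200
    fi
}

# Usage (as root): reload_if_unassociated eth1
```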

I'll gladly test any debugging patch or whatever you'd like me to try.

None of you have this problem when running adhoc - or do you simply not have an
adhoc network to test in?
------- Comment #25 From 2005-04-28 07:28:04 -------
From scrub:
Might be useful to get a firmware dump for 1.0.3 since Mohamed fixed some state
machine race conditions since 1.0.2. That could have changed the root cause of
the error.
------- Comment #26 From 2005-06-05 17:40:39 -------
Hi Mohamed,

Did the firmware fatal patch
(http://bughost.org/bugzilla/attachment.cgi?id=327), referred to in bug #592,
ever make it into version 1.0.4?

I'm wondering whether to ask submitter to retest 1.0.4, or whether to first 
apply the patch to it.
------- Comment #27 From 2005-07-13 13:30:29 -------
According to Mohamed, the firmware fatal patch
http://bughost.org/bugzilla/attachment.cgi?id=327 that was referred to in bug
#592 has been in since 1.0.4.

Klavs, can you please retest with latest version (1.0.5), and provide dmesg
output at debug level 0x43fff.
------- Comment #28 From 2005-11-04 20:29:26 -------
Request testing with latest version and/or close.
------- Comment #29 From 2005-12-07 15:21:10 -------
Presuming fixed in 1.0.8; if not, reopen.