ettv_b6 causing MAJOR lags on broadcasts

arni
Posts: 188
Joined: Sun Feb 20, 2005 2:32 pm

Post by arni »

Ok, as this is the second borked #gamestv.org broadcast, and we have tried 3 different matchservers, all with plenty of resources, we have come to the conclusion that ettv_b6 must be causing these lags.

Today we were running a rather large ettv broadcast for the summercup finals - almost all servers were having huge lag problems (except for one, and we don't know why).

What i have come across when looking at the logfiles:

The recorder's logfile (delay recorder) is perfectly fine, not a single error in there, but when it comes to the replayer, lots and lots of demo frames are skipped for no reason.
Example:
skipped demo frame: svs.time [2192000] cl.serverTime [2192200]
skipped demo frame: svs.time [2192050] cl.serverTime [2192200]
skipped demo frame: svs.time [2192100] cl.serverTime [2192200]
skipped demo frame: svs.time [2192150] cl.serverTime [2192200]
skipped demo frame: svs.time [2192600] cl.serverTime [2193050]
skipped demo frame: svs.time [2192650] cl.serverTime [2193050]
skipped demo frame: svs.time [2192700] cl.serverTime [2193050]
skipped demo frame: svs.time [2192750] cl.serverTime [2193050]
skipped demo frame: svs.time [2192800] cl.serverTime [2193050]
What you notice right away is: it's exactly every 50th frame that is skipped - don't think that could be due to server problems (server had ~15% cpu idle throughout the whole broadcast, and i've had lagfree broadcasts of this size with b4 before)
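
To make the pattern in the replayer log easier to see, here is a minimal Python sketch that parses the `skipped demo frame` lines and groups them into bursts that share the same cl.serverTime. It assumes the log lines look exactly like the excerpt above and that both values are in milliseconds (usual for Quake 3 derived engines, but an assumption here). The function names are made up for illustration and are not part of ETTV.

```python
import re

# Matches lines like:
#   skipped demo frame: svs.time [2192000] cl.serverTime [2192200]
LINE_RE = re.compile(
    r"skipped demo frame: svs\.time \[(\d+)\] cl\.serverTime \[(\d+)\]"
)


def parse_skipped_frames(log_text):
    """Return a list of (svs_time, cl_server_time) tuples found in the log."""
    return [(int(svs), int(cl)) for svs, cl in LINE_RE.findall(log_text)]


def group_bursts(frames):
    """Group consecutive skipped frames that share the same cl.serverTime."""
    bursts = []
    for svs_time, cl_time in frames:
        if bursts and bursts[-1]["cl_time"] == cl_time:
            bursts[-1]["svs_times"].append(svs_time)
        else:
            bursts.append({"cl_time": cl_time, "svs_times": [svs_time]})
    return bursts


def summarize(bursts):
    """Per burst: frame count, distinct svs.time steps, and the largest lag."""
    rows = []
    for burst in bursts:
        times = burst["svs_times"]
        steps = sorted({later - earlier for earlier, later in zip(times, times[1:])})
        rows.append({
            "cl_time": burst["cl_time"],
            "frames": len(times),
            "steps": steps,
            # Assumed unit: milliseconds.
            "max_lag": burst["cl_time"] - min(times),
        })
    return rows


# The nine lines quoted in the post above.
SAMPLE = """\
skipped demo frame: svs.time [2192000] cl.serverTime [2192200]
skipped demo frame: svs.time [2192050] cl.serverTime [2192200]
skipped demo frame: svs.time [2192100] cl.serverTime [2192200]
skipped demo frame: svs.time [2192150] cl.serverTime [2192200]
skipped demo frame: svs.time [2192600] cl.serverTime [2193050]
skipped demo frame: svs.time [2192650] cl.serverTime [2193050]
skipped demo frame: svs.time [2192700] cl.serverTime [2193050]
skipped demo frame: svs.time [2192750] cl.serverTime [2193050]
skipped demo frame: svs.time [2192800] cl.serverTime [2193050]
"""

if __name__ == "__main__":
    for row in summarize(group_bursts(parse_skipped_frames(SAMPLE))):
        print(row)
```

On the nine lines quoted above this gives two bursts, of 4 and 5 skipped frames, with svs.time advancing by 50 between consecutive skipped frames in each burst. Running it over a whole replayer log shows how often the bursts occur and how large they get.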

Bug in b6 - dunno but i think so ...
stgraber
Posts: 3
Joined: Mon Aug 08, 2005 5:20 am
Location: Switzerland

Post by stgraber »

Same problems when I had 100 slots ETTV connected. No problem with 50 slots.
But It's a really good idea to fix this problem.

PS : I'm one admin of ETTV.fr
Perform #pr3dators-klan.et and #ettv.fr
stgraber
Posts: 3
Joined: Mon Aug 08, 2005 5:20 am
Location: Switzerland

Post by stgraber »

Another problem tonight:
They loaded dubrovnik_final; my recorder loaded it without problems and wrote the demo. One minute later, my hub tried to read the demo and printed this:
-------- UNRECOVERABLE ERROR --------
This may be due to a bug in etpro
Information to be used in a bug report is being generated:
------------- CUT HERE --------------
Version: etpro 3.2.0
Platform: Linux
Signal: Segmentation violation (11)
Signal code: 1
fault address: 0x14d0
Load addresses:
0x4001f000 /lib/libm.so.6
0x40041000 /lib/libdl.so.2
0x40044000 /lib/libc.so.6
0x40000000 /lib/ld-linux.so.2
0x49a14000 /lib/libnss_compat.so.2
0x49a1c000 /lib/libnsl.so.1
0x49a31000 /lib/libnss_nis.so.2
0x49a3a000 /lib/libnss_files.so.2
0x49a43000 /lib/libnss_dns.so.2
0x49a47000 /lib/libresolv.so.2
0x49a59000 /usr/local/games/ettv.fr/logs/hub/pb/pbsv.so
0x4a3fb000 /usr/local/games/ettv.fr/server/hub/etpro/tvgame.mp.i386.so
EIP: 4a4269e4
edi:00000000 esi:000014d0 ebp:00000000 esp:bfff4bc0
eax:00000000 ebx:4a43935c ecx:000001c3 edx:0000070c
stack:
00000000 00000004 00000000 00000000 4a9afec0 4c51ae30 4c5174e0 4a43935c
00000000 4a426a90 00000001 4a426a9e 00000000 08d884a0 4a43935c 4a426a95
00000001 08daaa80 00000000 0808e2e0 00000001 00000000 08061b0f 080aeeb4
080caf40 0000044d 08061c37 080b1f60 00000000 00000000 00000000 00000000
code:
f3 a5 45 8b 44 24 18 81 44 24 0c c4 5b 00 00 81 44 24 08 0c 07 00 00 3b 68 0c
7c a0 83 c4 1c 5b 5e 5f 5d c3 90 8d b4 26 00 00 00 00 57 56 53 e8 b8 c2 ff ff
Stack trace: 1 entries
/usr/local/games/ettv.fr/server/hub/etpro/tvgame.mp.i386.so[0x4a4185f7]
The recorder, hub and slave all have the same etpro and etmain directories.
There is only one version of dubrovnik and its _ETPRO .pk3.
My etpro is 3.2.0.
Perform #pr3dators-klan.et and #ettv.fr
arni
Posts: 188
Joined: Sun Feb 20, 2005 2:32 pm

Post by arni »

about the original issue:

Replaying the borked demo files on an idle server causes the same problems, i can provide those if needed ...
bani
Site Admin
Posts: 2780
Joined: Sun Jul 21, 2002 3:58 am

Post by bani »

your recorder doesnt have enough bandwidth to the master server.

did you try b5 on the same machines to verify exactly that it was only a bug in b6? if not, then you can't say it's a b6 bug :)
arni
Posts: 188
Joined: Sun Feb 20, 2005 2:32 pm

Post by arni »

well, i was assuming it was a bug that got introduced (hehe) from 4 to 5 and 6, as 5 and 6 were released the same day ...

Not enough bandwidth ...

... all servers that were involved have a 100Mbit full duplex connection or similar - I can push 6Mbyte/s around through the net, both directions.

Dont really think thats the problem ...
bani
Site Admin
Posts: 2780
Joined: Sun Jul 21, 2002 3:58 am

Post by bani »

the recorder is getting dropped frames, so it's either an insufficient cpu or bandwidth problem.
skipped demo frame: svs.time [2192150] cl.serverTime [2192200]
skipped demo frame: svs.time [2192600] cl.serverTime [2193050]
here about 8 seconds of data was dropped.
arni
Posts: 188
Joined: Sun Feb 20, 2005 2:32 pm

Post by arni »

any chance this problem is caused by the master?

recorders (actually 3 of which i know) were having the same problems and all had enough bandwidth & cpu

so, can this problem also be caused if the master doesnt have enough cpu? (players didnt lag tho)
bani
Site Admin
Posts: 2780
Joined: Sun Jul 21, 2002 3:58 am

Post by bani »

stgraber wrote:Same problems when I had 100 slots ETTV connected. No problem with 50 slots.
But It's a really good idea to fix this problem.

PS : I'm one admin of ETTV.fr
you either don't have enough cpu or enough bandwidth.
arni
Posts: 188
Joined: Sun Feb 20, 2005 2:32 pm

Post by arni »

what he meant to say (i've spoken to him) is that two 50-slot servers work fine on the same machine, while one 100-slot server lags.

--> Also we have had the same setup working fine with beta4
bani
Site Admin
Posts: 2780
Joined: Sun Jul 21, 2002 3:58 am

Post by bani »

scaling is completely different for 1 x 100 vs 2 x 50

it doesnt take the same amount of cpu.

many smaller servers will take less cpu in aggregate than one large one, so it's likely a cpu limit.

can you verify 1 x 100 works perfectly on b4 and 1 x 100 fails 100% on b6?

did you have exactly the same amount of spectators on b4 as you did on b6?
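
As a purely illustrative toy model of why 1 x 100 and 2 x 50 need not cost the same: suppose the per-frame cost of one server has a part that grows linearly with the number of viewers plus a part that grows faster than linearly. The quadratic term and both weights below are assumptions picked for the example, not measurements of ETTV.

```python
def frame_cost(clients, linear=100, pairwise=1):
    """Hypothetical per-frame CPU cost (arbitrary units) for one server.

    `linear` and `pairwise` are made-up weights, not values taken from ETTV.
    """
    return linear * clients + pairwise * clients * clients


one_big = frame_cost(100)        # a single 100-slot server
two_small = 2 * frame_cost(50)   # two 50-slot servers on the same machine

if __name__ == "__main__":
    print("1 x 100:", one_big)
    print("2 x  50:", two_small)
```

Under any model with a superlinear term like this one, the single large server costs more than the two small ones combined. How large the difference really is can only be settled by measuring.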
bani
Site Admin
Posts: 2780
Joined: Sun Jul 21, 2002 3:58 am

Post by bani »

-=[VP]+arni+=- wrote:any chance this problem is caused by the master?

recorders (actually 3 of which i know) were having the same problems and all had enough bandwidth & cpu

so, can this problem also be caused if the master doesnt have enough cpu? (players didnt lag tho)
if the master doesnt have enough cpu then the players will lag. the debug on the recorder indicates that the link between master and recorder is dropping packets.

it could also be caused by the recorder running out of cpu. if that happens, you'll see hitch warnings in the recorder console. if you dont see any, then its likely a bandwidth issue between master and recorder.
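
The rule of thumb above could be scripted roughly like this. The sketch assumes hitch warnings show up in the recorder console log as lines containing the text "hitch warning" (matched case-insensitively). The exact message format is an assumption, and the function name and returned labels are made up for illustration.

```python
def triage_recorder_log(log_text, marker="hitch warning"):
    """Classify a recorder console log using the rule of thumb above.

    Returns ("cpu", matching_lines) if any line contains `marker`,
    otherwise ("bandwidth", []). `marker` is an assumed substring,
    not a verified ETTV message format.
    """
    needle = marker.lower()
    hits = [line for line in log_text.splitlines() if needle in line.lower()]
    if hits:
        return "cpu", hits
    return "bandwidth", hits


if __name__ == "__main__":
    import sys

    # Usage: python triage.py recorder_console.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        verdict, lines = triage_recorder_log(fh.read())
    print("likely cause:", verdict)
    for line in lines:
        print("  ", line)
```

A "bandwidth" verdict here only means no hitch warnings were found. It points at the link between master and recorder as the next thing to check and does not prove anything about it.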
arni
Posts: 188
Joined: Sun Feb 20, 2005 2:32 pm

Post by arni »

i can 100% confirm the following:

* The recorder's log doesnt throw any errors whatsoever (maybe a dropped frame or 2 on map change ...)

* The recorder has enough cpu - the recorder always has a nicelevel below the broadcasting server - so it will always get as much cpu as it needs before server will even think about giving cpu to replayer ...

* while 125 slots have always worked fine (had it filled up till the end) with b4, its now causing problems with b6 (already at 75 slots), although in both setups there were ~15+% idle - throughout the whole broadcast ...

* i have enough bandwidth and i presume master also does if players didnt lag ...
bani
Site Admin
Posts: 2780
Joined: Sun Jul 21, 2002 3:58 am

Post by bani »

change to b4 and see if you get the same problems with exact same number of spectators.
* i have enough bandwidth and i presume master also does if players didnt lag ...
this says nothing about the quality of the network link between the master and the recorder.

"the matchserver in china has enough bandwidth for all the chinese players, and my recording server in argentina has enough bandwidth for all the argentina spectators".

this says nothing if the link from china to argentina is crap or not.

i need a real test b4 vs b6. verify the problem exists only in b6 and goes away exactly when you switch to b4, on exact same servers with exact same setting and exact same number of viewers.
arni
Posts: 188
Joined: Sun Feb 20, 2005 2:32 pm

Post by arni »

bani wrote:this says nothing about the quality of the network link between the master and the recorder.

"the matchserver in china has enough bandwidth for all the chinese players, and my recording server in argentina has enough bandwidth for all the argentina spectators".

this says nothing if the link from china to argentina is crap or not.
This is ofc true, but pretty unrealistic, with players all over europe as well as ettv recorders all over europe.
The reason i am neglecting this problem is that within europe, network topology is usually negligible - if you have proper connections on both sides it's hard to get pings >50 (between servers). One example connection which had this problem has a 14ms ping and 3.5Mbyte/s ...

I will try with b4 though and see if the b6 servers on the same network still lag ... (i think tonight)