From: "Bob Dixon"
To: "The Megaconference" ; ; "Video Development Initiative"
Subject: RESULTS OF THE 2ND COMMONS/MEGACONFERENCE MCU LOAD TEST
Date: Tuesday, September 04, 2001 12:44 PM

On Wed Aug 22 we conducted the 2nd Commons/Megaconference MCU load test. Everyone in the world accessible thru the various H.323-related email lists was asked to participate, so as to create as much realistic load as possible. At least 127 H.323 clients signed up in advance to participate. RADVision and Accord were informed of the test, and invited to participate in any way they chose. Accord sent their video engineer Andy Shapira here to observe and assist in the test. Pankaj Shah, manager of ITEC-Ohio, was also here as an observer. This report has been reviewed by Accord and RADVision, and both have provided helpful input.

Three multipoint conferences were set up, all cascaded together.

Conference 1 - Accord MCU, 30 users.
Conference 2 - Same Accord MCU, 30 users.
Conference 3 - RADVision ViaIP MCU, 50 users.

The gatekeeper for all 3 MCUs is a RADVision ECS-100, running on a dedicated stand-alone NT PC. It was necessary to split the load on the Accord MCU into two conferences, since it is not designed to handle single conferences larger than 30 users.

The RADVision performed flawlessly at its full load of 50 users, at all times, as it had in the 1st load test.

The Accord MCU performed much better than in the 1st load test. The latest software version had been installed in it, which fixed the incompatibility problem with the RADVision ECS gatekeeper that occurred in the 1st test. However there were still some smaller problems. Shortly before the load test started, the Accord MCU communications module stopped working, requiring a quick trip to the MCU location to restart it. This was apparently a random failure. But the MCU conferences continued to run during this problem, even though we could not communicate with the MCU until it was fixed.
There were also software failures later in two other cards within the Accord MCU, which were repaired remotely while the rest of the Accord continued to run.

Users were instructed to connect to the conferences in the order given above, so as to provide a full load for the Accord MCU. As the test began, the first Accord conference filled up and worked fine. The RADVision MCU also filled up and worked fine. But as the 2nd Accord conference began to fill, problems developed. Users reported blue screens and buzzing noises when they connected to the 2nd Accord conference. Accord reports they now understand what caused this problem.

As the total number of users connected to all systems approached 100, the 2nd Accord conference and the gatekeeper stopped communicating, which prevented any more users from connecting to that conference. About 10 minutes later, the 1st Accord conference and the gatekeeper also stopped communicating. Both the Accord and the gatekeeper seemed to be functioning normally by themselves (the gatekeeper was still communicating with the RADVision MCU fine), but not together. The Accord MCU was able to make dial-out calls without the gatekeeper. It did appear that the 2nd Accord conference could have reached its full capacity of 30 users, if this communication problem had not developed.

The RADVision gatekeeper has a rated capacity of 100 calls, and we probably attempted to exceed that. It is not clear how the gatekeeper reacts when its capacity is reached. There are still some known incompatibilities between the RADVision gatekeeper and the Accord MCU, and perhaps one of them manifests itself when there is heavy load on the gatekeeper.

At the same time the above problems developed, we noted an increase in dropped packets and jitter on the network. Perhaps something else was becoming overloaded on the MCU network. Just to be safe, we are planning to install a gigabit Ethernet link to the MCU network switch.

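The capacity figures quoted in this report can be put side by side with a little arithmetic. The following is only an illustrative sketch, not part of the test procedure: it assumes each connected user counts as exactly one gatekeeper call and ignores the cascade links between conferences, neither of which the report confirms. The names `CONFERENCE_CAPACITY`, `GATEKEEPER_RATED_CALLS` and `headroom` are made up for this example.

```python
# Conference sizes and gatekeeper ratings as quoted in the report.
CONFERENCE_CAPACITY = {
    "Accord conference 1": 30,
    "Accord conference 2": 30,
    "RADVision ViaIP conference": 50,
}
GATEKEEPER_RATED_CALLS = {"ECS-100": 100, "ECS-500": 500}


def headroom(gatekeeper: str) -> int:
    """Rated gatekeeper calls minus the combined conference capacity.

    Assumption: one user equals one gatekeeper call; cascade links
    between the conferences are not counted.
    """
    return GATEKEEPER_RATED_CALLS[gatekeeper] - sum(CONFERENCE_CAPACITY.values())


for name in GATEKEEPER_RATED_CALLS:
    print(name, headroom(name))
```

Under those assumptions the three conferences together allow 110 users, which is 10 more than the ECS-100 rating and well within the ECS-500 rating.
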
After the test, we rebooted the gatekeeper and then it communicated fine with the Accord MCU, but of course by then the load was small. We also tested the Accord with an ECS-500 gatekeeper and it worked fine, at low load. Shortly after the test, RADVision sent us a software patch for the ECS-100 gatekeeper, which we have applied, but it is not clear if that has anything to do with the problems we experienced. And we cannot tell until the next load test.

The ECS-500 (500 calls capacity) was not available for this load test, but will be used for all future tests. It runs on a separate card in the same chassis as the RADVision MCU, and is located closer to the MCUs than the ECS-100 used in this test.

Thus the actual cause of the communications loss is unknown, and cannot be determined now. We will take more detailed traces on the RADVision gatekeeper to be sure we can find out.

In subsequent testing, Accord has determined that the communications problem is reproducible in a controlled way, and we have reproduced it ourselves. We are now in close discussions with RADVision and Accord to get this problem fixed. Until then, we cannot schedule any further load tests. The Megaconference is getting close, so its rehearsals may have to serve as the next load tests.

These tests are teaching us all many important things, and are helping the vendors improve their products, so you don't have these problems in the future with your own systems.

Bob

Robert S. Dixon, Ph D, PE
Chief Research Engineer                    | Voice 614-292-1638
Office of the CIO                          | Fax   614-292-7081
Ohio State University                      | Lab   614-292-7425
1971 Neil Ave, room 451                    | Email dixon.8@osu.edu
Columbus, OH 43210                         | Video 128.146.199.98
and
Senior Systems Developer/Engineer          | Voice 614-728-8100 X232
Ohio Academic Resources Network (OARnet)   | Fax   614-486-4594
                                           | Email rdixon@oar.net