RFC2544 Testing in Production Networks
Juniper Networks
rbonica@juniper.net
Cisco Systems
Green Park, 250, Longwater Avenue,
Reading
RG2 6GB
UK
stbryant@cisco.com
General
I-D
Internet-Draft
This document considers the use of RFC2544 type tests in an
production network.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC2119 .
This document considers the issues related to conducting tests
similar to RFC2544 in a prodcution
network.
Some types of production network offer a high degree of assured
issolation between the traffic of different users. An example of a
network technique that provides this issolation is a pseudowire . Some types of network, known as transport
networks are traffic engineered to the
point where there is normally no resource contention betwen user
traffic. The incautous use of unmodified RFC2544 testing is NOT
RECOMMENDED . However provided suitable
approprate caution is applied the existing RFC2544 tests may provide
some useful information on the behaviour of such a network.
When using unmodified RFC2544 tests on such a network the tester
needs to be aware that the latency and latency variation will be
significantly higher than in a laboratory environment, and that error
rate and hence packet loss may be higher than in a laboratory
environment. It is therefore necessary to repeat tests a number of times
to be confident of the results and to present the outcome in a
statistical rather than definitive form. It is also necessary to note
that the results represent the behaviour of the system and network at
the time of measurement and may not reperesent the behaviour at some
other time in the past or in the future.
Unmodified RFC2544 tests MUST NOT be conducted on a production
network unless the tester is confident that the required degree of
resource issolation is in place such that the tests will not be harmful
to the network itself, other user traffic, or the performance
applications using the network as a data path.
The following text does not seem useful as it is self evident and
thus I propose to delete the section.
[2. Real world
In producing this document the authors attempted to keep in mind the
requirement that apparatus to perform the described tests must actually
be built. We do not know of "off the shelf" equipment available to
implement all of the tests but it is our opinion that such equipment can
be constructed.]
The advice in Section 3 regarding the
applicability and usefulness of the tests applies to the situation
considered in this document
As noted in RFC2544 Section 4 the test
will produce a great deal of data, more so in the case when the tests
are conducted on a production network, since it is advisable to repeat
the tests to ensure that the results are statically representative of
the behaviour of the devices and network path at the time of the test.
In particular behavior such as latency variation or packet loss needs to
be properly characterised. Automation of the instumentation and
improvements in display and visualization since RFC2544 was written
should assist with this.
As noted in RFC2544 the selection of the to be run and evaluation of
the test data must therefore be done with an understanding of generally
accepted testing practices regarding repeatability, variance, the
tatistical significance of small numbers of trials and the nature of the
production network.
An implementation is not compliant if it fails to satisfy one or more
of the MUST requirements for the protocols it implements. An
implementation that satisfies all the MUST and all the SHOULD
requirements for its protocols is said to be "unconditionally
compliant"; one that satisfies all the MUST requirements but not all the
SHOULD requirements for its protocols is said to be "conditionally
compliant".
There are two cases to be considered when setting up a test. The
first case is where the production network is providing an emulated
physical or datalink serice such as a pseudowir, and this service is the
subject of the test. This is similar to the case shown in Figure 2 of
, noting that either the service
emulation will need to be put in loopback, or a second service will need
to be provsioned to return the test traffic to the tester.
The second case concerns where it is desired to determine the
performance of a system that is handing off packets a packet equipment
such as a Router (or a lable switched router (LSR)), such as when
testing a virtual private network (VPN) service. This is similar to the
case shown in Figure 3 of RFC2544.
In both cases it should be noted that measurements taken will be
subject to imparement on both the outward path and the return path, and
that it is not possible to easily determine the degree of imparement
attributable to each path. It will be the case that the path that is the
subject of the test will, other than under pathalogical conditions,
exhibit a performance that is at least as good as the measurements
taken.
The advice in section 7 of applies,
with the exception that protocols and protocol modes not supported by
the production network MUST NOT be configured or tested. The protocol
profile tested MUST be included or referenced in ant test report.
The advice in Section 8 of regarding
Frame Formats applies. Given the evolution of network technology since
1999, some additional frame formats will be required, but these are out
of scope for this document.
The advice in Section 9 of regarding
frame size should be followed.
Since the publication of RFC2544 the maximum packet size supported on
Ethernet [anything else???] has increased with the introduction of jumbo
frames. The set of frame sizes tested SHOULD include the maxium size
supported by the network. The maximum size tested MUST be included in
the test report. This maximum size may be set by configuration, or may
be determined by the tester automatically using a discovery technique.
Such a discovery technique should preferable be log n efficient such as
a binary chop search. The discovered maximum packet size should be
verified a number of times (at least five times is RECOMMENDED) until a
consistant result is achieved.
The advice in Section 10 of SHOULD be
followed regarding verification of frames
The advice in Section 10 of SHOULD be
followed regarding verification of frames
[ We might want to update the following, although that really needs
to be aligned with more modern advice on the lab version of this
specification
The rest of section 13 to be deleted
It might be useful to know the DUT performance under a number of
conditions; some of these conditions are noted below. The reported
results SHOULD include as many of these conditions as the test equipment
is able to generate. The suite of tests SHOULD be first run without any
modifying conditions and then repeated under each of the conditions
separately. To preserve the ability to compare the results of these
tests any frames that are required to generate the modifying conditions
(management queries for example) will be included in the same data
stream as the normal test frames in place of one of the test frames and
not be supplied to the DUT on a separate network port.
11.1 Broadcast frames
In most router designs special processing is required when frames
addressed to the hardware broadcast address are received. In bridges (or
in bridge mode on routers) these broadcast frames must be flooded to a
number of ports. The stream of test frames SHOULD be augmented with 1%
frames addressed to the hardware broadcast address. The frames sent to
the broadcast address should be of a type that the router will not need
to process. The aim of this test is to determine if there is any effect
on the forwarding rate of the other data in the stream. The specific
frames that should be used are included in the test frame format
document. The broadcast frames SHOULD be evenly distributed throughout
the data stream, for example, every 100th frame.
The same test SHOULD be performed on bridge-like DUTs but in this
case the broadcast packets will be processed and flooded to all
outputs.
It is understood that a level of broadcast frames of 1% is much
higher than many networks experience but, as in drug toxicity
evaluations, the higher level is required to be able to gage the effect
which would otherwise often fall within the normal variability of the
system performance. Due to design factors some test equipment will not
be able to generate a level of alternate frames this low. In these cases
the percentage SHOULD be as small as the equipment can provide and that
the actual level be described in the report of the test results.
11.2 Management frames
Most data networks now make use of management protocols such as SNMP.
In many environments there can be a number of management stations
sending queries to the same DUT at the same time.
The stream of test frames SHOULD be augmented with one management
query as the first frame sent each second during the duration of the
trial. The result of the query must fit into one response frame. The
response frame SHOULD be verified by the test equipment. One example of
the specific query frame that should be used is shown in Appendix C.
11.3 Routing update frames
The processing of dynamic routing protocol updates could have a
significant impact on the ability of a router to forward data frames.
The stream of test frames SHOULD be augmented with one routing update
frame transmitted as the first frame transmitted during the trial.
Routing update frames SHOULD be sent at the rate specified in
Appendix C for the specific routing protocol being used in the test. Two
routing update frames are defined in Appendix C for the TCP/IP over
Ethernet example. The routing frames are designed to change the routing
to a number of networks that are not involved in the forwarding of the
test data. The first frame sets the routing table state to "A", the
second one changes the state to "B". The frames MUST be alternated
during the trial.
The test SHOULD verify that the routing update was processed by the
DUT.
11.4 Filters
Filters are added to routers and bridges to selectively inhibit the
forwarding of frames that would normally be forwarded. This is usually
done to implement security controls on the data that is accepted between
one area and another. Different products have different capabilities to
implement filters.
The DUT SHOULD be first configured to add one filter condition and
the tests performed. This filter SHOULD permit the forwarding of the
test data stream. In routers this filter SHOULD be of the form:
forward input_protocol_address to output_protocol_address
In bridges the filter SHOULD be of the form:
forward destination_hardware_address
The DUT SHOULD be then reconfigured to implement a total of 25
filters. The first 24 of these filters SHOULD be of the form:
block input_protocol_address to output_protocol_address
The 24 input and output protocol addresses SHOULD not be any that are
represented in the test data stream. The last filter SHOULD permit the
forwarding of the test data stream. By "first" and "last" we mean to
ensure that in the second case, 25 conditions must be checked before the
data frames will match the conditions that permit the forwarding of the
frame. Of course, if the DUT reorders the filters or does not use a
linear scan of the filter rules the effect of the sequence in which the
filters are input is properly lost.
The exact filters configuration command lines used SHOULD be included
with the report of the results.
11.4.1 Filter Addresses
Two sets of filter addresses are required, one for the single filter
case and one for the 25 filter case.
The single filter case should permit traffic from IP address
198.18.1.2 to IP address 198.19.65.2 and deny all other traffic.
The 25 filter case should follow the following sequence.
deny aa.ba.1.1 to aa.ba.100.1 deny aa.ba.2.2 to aa.ba.101.2 deny
aa.ba.3.3 to aa.ba.103.3 ... deny aa.ba.12.12 to aa.ba.112.12 allow
aa.bc.1.2 to aa.bc.65.1 deny aa.ba.13.13 to aa.ba.113.13 deny
aa.ba.14.14 to aa.ba.114.14 ... deny aa.ba.24.24 to aa.ba.124.24 deny
all else
All previous filter conditions should be cleared from the router
before this sequence is entered. The sequence is selected to test to see
if the router sorts the filter conditions or accepts them in the order
that they were entered. Both of these procedures will result in a
greater impact on performance than will some form of hash coding.]
Where the test traffic is fully issolated from production traffic,
for example when running over a PW, the advice in Section 12 of MAY be followed.
Where the test traffic shares the network with the production
traffic, the addresses used MUST be those that are correctly routed to
the designated test traffic receiver. Correct routing of this traffic at
a low data rate MUST be verified prior to running tests that subject the
test receiver to a significant load.
Where the test traffic is fully issolated from production traffic,
for example when running over a PW, the advice in Section 12 of MAY be followed.
Where the test traffic shares the network with production traffic,
the existing routing protocols SHOULD be used to set up the routed path
between the test traffic source and the test traffic destination.
The advice on traffic bidirectionality in Section 14 of SHOULD be followed.
The advice on stream selection in Section 15 of SHOULD be followed.
The advice on exercising the ability of the DUT to concurrently
receive packets for the same destination port giveb in Section 16 of
SHOULD be followed.
The advice on multiple protocols given in Section 17 of SHOULD be followed.
The advice on multiple frame sizes given in Section 18 of SHOULD be followed.
Section 19 of discusses testing of
multiple systems. This is the normal case in a production network. As
noted in RFC2544 such tests require care in care in interpretation.
Unlike the laboratory benchmrk case however, the test will be exercising
the network in the expected configuration, and thus is representive of
the opertion of the system with the background traffic load of the time.
It is RECOMMENDED that the test be repeated a number of times to guage
the effect of variation of background traffic over both short and long
term time frames. The times of the test SHOULD be logged, and the
results presented in such a way that the statistical nature of the test
be clear to the reviewer. It should be noted whether the type of
production network was such that it would or not it would be anticipated
that the test would be repeatable within the statistical significance of
the measurements.
The advice on maximum frame rate given in Section 20 of RFC2544 SHOULD be followed.
The advice in Section 21 of on bursty
traffic SHOULD be followed.
Editor's Note Do we need to recomemnd larger burst sizes?
[This is probably obselete now - suggest we delete the section
Although it is possible to configure some token ring and FDDI
interfaces to transmit more than one frame each time that the token is
received, most of the network devices currently available transmit only
one frame per token. These tests SHOULD first be performed while
transmitting only one frame per token.
Some current high-performance workstation servers do transmit more
than one frame per token on FDDI to maximize throughput. Since this may
be a common feature in future workstations and servers, interconnect
devices with FDDI interfaces SHOULD be tested with 1, 4, 8, and 16
frames per token. The reported frame rate SHOULD be the average rate of
frame transmission over the total trial period.]
If the prodcution network is providing a layer 1 or layer 2 service,
then the test may be conducted as described in Section 23 of . If the production network is providing a layer
3 service care MUST be taken to ensure that any routes intruduced may be
safely announced by the DUT without causing disruption to production
traffic, and at such a volume that the route processors in the
production network are not overloaded.
The advice in Section 24 of on trial
duration is applicable.
As stated in Section 25 of , the DUT
SHOULD be able to respond to address resolution requests sent by the DUT
wherever the protocol requires such a process.
The throughput test described in Section 26.1 of applies. However the reporting of a single
number is NOT RECOMMENDED. If a single is reported, it must be
measured at a time during the busy period, and the value of the lower
decile SHOULD be reported.
The latency testing described in Section 26.1 of applies. Measurements SHOULD be carried out
during the busy period. The statistical averaging approach is for
further study.
The frame loss proceedure described in Section 26.3 of is modified as follows:
The test should be repeated a number of times at each frame rate.
The number of repeats SHOULD be configured by the tester and reported
with the results. The default value is 10 (pulled out of a hat). The
exit criteria of loss free transmissions SHOULD be configured by the
tester and reported with the results. The default value is 10 (also
pulled out of a hat).
The advice on back-to-back frames provided in Section 26.4 of applies.
The advice on system recovery frames provided in Section 26.5 of
applies.
The advice provided in Section 26.6 of on reset SHOULD be followed, except that the
test MUST NOT be carried out on a DUT that is being used for
production traffic unless a specific decision is made that disruption
of the user traffic is acceptable. This test MAY be carried out during
the quiet period.
There are no IANA considerations which arise from this document.
To be provided in a future version.
The Authors of are acknowledged.
This Appendix will be deleted or updated in next version. Text is a
copy of Appendix A of
A.1 Scope Of This Appendix
This appendix discusses certain issues in the benchmarking
methodology where experience or judgment may play a role in the tests
selected to be run or in the approach to constructing the test with a
particular DUT. As such, this appendix MUST not be read as an amendment
to the methodology described in the body of this document but as a guide
to testing practice.
1. Typical testing practice has been to enable all protocols to be
tested and conduct all testing with no further configuration of
protocols, even though a given set of trials may exercise only one
protocol at a time. This minimizes the opportunities to "tune" a DUT for
a single protocol.
2. The least common denominator of the available filter functions
should be used to ensure that there is a basis for comparison between
vendors. Because of product differences, those conducting and evaluating
tests must make a judgment about this issue.
3. Architectural considerations may need to be considered. For
example, first perform the tests with the stream going between ports on
the same interface card and the repeat the tests with the stream going
into a port on one interface card and out of a port on a second
interface card. There will almost always be a best case and worst case
configuration for a given DUT architecture.
4. Testing done using traffic streams consisting of mixed protocols
has not shown much difference between testing with individual protocols.
That is, if protocol A testing and protocol B testing give two different
performance results, mixed protocol testing appears to give a result
which is the average of the two.
5. Wide Area Network (WAN) performance may be tested by setting up
two identical devices connected by the appropriate short- haul versions
of the WAN modems. Performance is then measured between a LAN interface
on one DUT to a LAN interface on the other DUT.
The maximum frame rate to be used for LAN-WAN-LAN configurations is a
judgment that can be based on known characteristics of the overall
system including compression effects, fragmentation, and gross link
speeds. Practice suggests that the rate should be at least 110% of the
slowest link speed. Substantive issues of testing compression itself are
beyond the scope of this document.]
This Appendix will be updated looks updated in next version. Text is
a copy of Appendix B of
(Provided by Roger Beeman, Cisco Systems)
Size Ethernet 16Mb Token Ring FDDI (bytes) (pps) (pps) (pps)
64 14880 24691 152439 128 8445 13793 85616 256 4528 7326 45620 512
2349 3780 23585 768 1586 2547 15903 1024 1197 1921 11996 1280 961 1542
9630 1518 812 1302 8138
Ethernet size Preamble 64 bits Frame 8 x N bits Gap 96 bits
16Mb Token Ring size SD 8 bits AC 8 bits FC 8 bits DA 48 bits SA 48
bits RI 48 bits ( 06 30 00 12 00 30 ) SNAP DSAP 8 bits SSAP 8 bits
Control 8 bits Vendor 24 bits Type 16 bits Data 8 x ( N - 18) bits FCS
32 bits ED 8 bits FS 8 bits
Tokens or idles between packets are not included
FDDI size Preamble 64 bits SD 8 bits FC 8 bits DA 48 bits SA 48 bits
SNAP
DSAP 8 bits SSAP 8 bits Control 8 bits Vendor 24 bits Type 16 bits
Data 8 x ( N - 18) bits FCS 32 bits ED 4 bits FS 12 bits
The considerations provided in Apendix C of apply.
The requirement for any new frame formats will be considered in a
future version.