"... I was just looking over the [ promised or expected ] features of the DNSBenchmark v2 and I've seen that it runs as a service in windows and that it can change the DNS server of a Windows system ..."
Nix that.
Automatically changing DNS Nameserver resolvers is too much of a
challenge at this time, considering the variety of places where
resolvers are set and controlled: Network Interface Cards (NICs),
different operating systems, different routers, and other variables.
And as for long testing, DNSBench 2 has evolved during development
to offer long test runs as a regular program instead of as a service,
and to recommend only manually resetting our DNS Nameserver
resolvers - stay tuned.
- - - - -
From a recent DNSBench 2 pre-release exploration sample:
Given more time, greater accuracy is available
Additional measurements increase the certainty of the benchmark's conclusions
Benchmarking LONGER means benchmarking BETTER.
The buttons below determine the amount of data collected by the benchmark.
Although this will require more time, it will deliver more certain results.
5x (default) benchmark testing:
This recommended testing level determines each resolver's performance when
querying for the IP addresses of each of the top 50 domain names 5 times. This
collects 250 samples of each type (cached, uncached and dotcom) for a total of
750 DNS queries per resolver (the arithmetic is sketched in code after
this list). Taking this number of samples averages out enough of the
Internet's timing uncertainty to produce highly reliable results.
1x (quick) benchmark testing:
If running time is at a premium or you just want to see sample results, you may
reduce the total number of samples taken from 750 to 150 by performing a single
round of queries instead of five. The results will have lower confidence since
Internet timing "jitter" will tend to skew measured individual resolver performance.
10x, 20x, 50x, 100x for the most accurate benchmark testing:
Please see the next section for an explanation of why taking many more samples
results in significantly more stable and higher confidence results. Although it is
not necessary, the benchmark can be run much longer to collect more data and
produce extremely high-reliability measurements.
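For the curious, here's that sample-count arithmetic as a few lines of
Python - an editorial sketch of the multiplication, not output from the
benchmark itself (the 50-domain and three-query-type figures are taken
from the text above):

    DOMAINS = 50      # top 50 domain names, per the text above
    QUERY_TYPES = 3   # cached, uncached, and dotcom

    for multiplier in (1, 5, 10, 20, 50, 100):
        samples_per_type = DOMAINS * multiplier
        total_queries = samples_per_type * QUERY_TYPES
        print(f"{multiplier:>3}x: {samples_per_type:>5} samples per type, "
              f"{total_queries:>6} DNS queries per resolver")

At 1x that comes to the 150 queries per resolver mentioned above, and at
5x the 250 samples per type and 750 total queries.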
Why benchmarking longer produces better results:
Imagine that you needed to determine exactly how much time is required to drive
a car between two locations. Measurement of that time would be consistent if all
traffic lights were green and no one else was on the road. But driving through a
busy city with buses, cyclists, construction, traffic lights and railroad crossings
introduces a huge amount of uncertainty into that measurement. And every time
you made the drive, a different set of circumstances would present themselves.
Just like driving a car through a busy city, the Internet is very busy and travel
across it introduces a huge amount of uncertainty into any timing measurement.
What we want is an average travel time we can trust. So let's say that we make
10 cross-town drives and take their average. But what if during 5 of those 10 test
drives a railroad crossing gate just happened to be down, and a lot of time was
lost waiting for the train to pass? Averaging those very long delays into the 10
trips would significantly increase the travel time average. We could say it was just
"bad luck" those 5 times, except that we've seen that it could happen, because it
did happen, during our travel testing.
We cannot simply ignore those 5 bad trips since that would be cheating
when what we want is the truth. So what we could do is take the average of 50
trips instead of just 10. That way, the effects of those 5 exceedingly "bad luck"
events will tend to be "averaged out" of our final result. What this really means is
that the effects of any "luck" (statistical anomalies), whether good or bad, will be
greatly diminished, which is exactly our ultimate goal.
In school, you may have been surprised to see how just one bad grade can pull
down an average, and how difficult it can be to bring that average back up. If
Internet traffic briefly hits a "slow spot," a resolver's apparent performance
can be unfairly penalized. You could re-run the benchmark and hope for a
better result, but selectively keeping only the data you like does not lead
to the truth.
The buttons below allow you to take many more measurements to average out
both good and bad luck. If you were to run the "quick" 1x benchmark several
times, you would get useful results each time, but they would be different each
time due to "railroad crossings" that were not averaged out by a sufficient number
of non-railroad crossing measurements. Running 5x helps to smooth those out.
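To put a number on that, here's a toy Python simulation of the
railroad-crossing effect. It's an editorial illustration only, not the
benchmark's measurement code, and every figure in it (a ~100 ms typical
trip, a 10% chance of a 200-600 ms "train" delay) is invented:

    import random
    import statistics

    random.seed(1)  # deterministic, for a repeatable illustration

    def trip_time():
        t = random.gauss(100, 5)           # normal cross-town variation
        if random.random() < 0.10:         # occasional railroad crossing
            t += random.uniform(200, 600)  # long wait for the train
        return t

    # Repeat each "benchmark" 20 times and see how much its average wanders.
    for n in (10, 50, 500):
        averages = [statistics.mean(trip_time() for _ in range(n))
                    for _ in range(20)]
        print(f"{n:>4} trips per run: averages span "
              f"{max(averages) - min(averages):5.1f} ms")

With only 10 trips per run, the run-to-run averages swing widely depending
on how many "trains" each run happened to hit; with 500 trips per run they
settle close together - which is exactly the stability the higher sampling
levels buy.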
Since the benchmark may be stopped at any time, you could select the maximum
100x sampling and manually stop the benchmark once the results have stabilized
after collecting sufficient data - probably once the progress bar reaches
around 5% complete, since 5% of a 100x run collects roughly as much data as
a full 5x run. This will take comparatively longer, but one thing that will
help is
reducing the number of resolvers that are deeply compared. You probably only
care about the top 10 or 20, because super-accurately determining how slow the
slowest resolvers are will not be very useful and just wastes time.
Regardless, we hope you enjoy the exploration and have fun.
[ 1x ][ 5x ][ 10x ][ 20x ][ 50x ][ 100x ]
Or your choice of "x" continuous iterations of a benchmark run via the
/COMMAND LINE interface.
Along with Speed / Inter-Query Queue Delay of up to 9999 milliseconds
- that's ~10 seconds between DNS queries.
And up to 500 resolvers.
My math suggests:
1 good resolver can take 8:20, or 8.33 minutes, to benchmark 1x at
9999 milliseconds between queries - that's roughly 50 queries at
~10 seconds each.
Times 100 continuous benchmarks = 833 minutes ≈ 13.9 hours for
one (good) resolver, longer for less-'good' resolvers.
Most folks may narrow down their comparison to a dozen or so
competing DNS Nameserver resolvers to test.
So, for example, comparing 12 DNS Nameserver resolvers x ~13.9
hours each ≈ 166.6 hours = 6.94 days - that's almost a week's worth
of testing and data reporting.
Hey, a full comparison of 500 different DNS Nameserver resolvers at
100x tests and a 9999-millisecond Speed / Inter-Query Queue Delay
would take at least ~6,940 hours - that's about 289 days, the better
part of a year.
That's for full-performance 'good' resolvers.
Slower resolvers in that mix will stretch that test out even longer.
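Here's that back-of-the-envelope duration math as a short Python sketch.
The 50-queries-per-1x-pass figure is inferred from the 8:20 estimate above
and is an assumption, not a documented DNSBench 2 number:

    QUERIES_PER_PASS = 50  # assumed timed queries per 1x pass (from 8:20)
    DELAY_S = 9.999        # maximum inter-query delay: 9999 ms

    one_pass_s = QUERIES_PER_PASS * DELAY_S  # ~500 s, i.e. 8:20
    hundred_x_h = one_pass_s * 100 / 3600    # ~13.9 hours per resolver

    print(f"1x pass:             {one_pass_s / 60:6.2f} minutes")
    print(f"100x, 1 resolver:    {hundred_x_h:6.1f} hours")
    print(f"100x, 12 resolvers:  {hundred_x_h * 12 / 24:6.1f} days")
    print(f"100x, 500 resolvers: {hundred_x_h * 500 / 24:6.1f} days")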
So it looks as if there are ways to run long tests other than running
in the background as a service. By running in the foreground as a
program, with automatic multiple runs and delays between DNS queries,
DNSBench 2 can take its own sweet time.
We'll see what the final release version offers, but the above is one
example being explored.
More at
https://www.grc.com/groups/dns.dev