Benchmarking 101 PHP: array_walk() and conditionals
Written by TDavid   

This will be the first of several benchmark articles I will be releasing over time. I always enjoy reading benchmarks that others have done because they can save me the time of doing them myself. So I'll be sharing my results with you in the hope that they save you some time too.

Often you will come to situations in coding where you can do something in a variety of ways to achieve the same end result. Being able to benchmark the alternatives and decide which branch of code is the most efficient and/or the fastest is a very good tool to have in your coding arsenal. It is one thing that newer and/or inexperienced programmers tend to overlook in the development process: how do I take working production code and make it *efficient* working production code?

Today I'm going to show you a simple way to benchmark branches or sections of PHP code and test which one performs more efficiently. The method I'll show you is not as neat as using one of the several benchmarking utilities that exist out there, but it is easy to follow along with and it doesn't require anything more than putting two timing statements around your existing section of code and using a calculator or spreadsheet to determine the average execution times. In other words: it's the quick and dirty way to get the answer during the code development phase.

You might think that the answer should be obvious by looking at two pieces of code that do the same thing but it isn't always this way. I'm going to show you how function changes (and choices) can make a significant difference in the performance of your code.

Always use built-in functions in PHP? Yes or No?

Before I give you the answer to this question, see if you can guess correctly. Should you always use built-in PHP functions? I've read manuals and articles by some very accomplished programmers which preach that it is something akin to sacrilege not to do so. Logic would suggest (wouldn't it?) that a function written in C and neatly served as a built-in PHP function would be more efficient than one you create yourself in the higher level language to do essentially the same thing.

Let's break down one example using the array_walk() function in PHP 4. The array_walk() function iterates through an array and allows you to alter and/or output the contents of the array by passing each item through a function you define in your script. If this doesn't make sense right now, let's look at how to do this without using this built-in PHP function first.

For the first example we'll assume you want to iterate through an array and print each item in the array to the browser. Let's prepare a piece of code that counts from 1 to 1000 and stores each number in a single array, so our array contains the numbers 1 to 1000. Pretty basic, right? Here is the code to do this:

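<?php
// Build the array the benchmarks iterate over: the numbers 1 to 1000.
// It is named $test1 because that is the name the timed code reads.
$test1 = array();
for ($i = 1; $i < 1001; $i++) {
    $test1[] = $i;
}
// ready to set first benchmark
?>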

Now we'll create our benchmarking timer using the microtime() function, printing the start time in green and the end time in red to mark the section of code we're benchmarking. This grabs microseconds from the system clock, which is what we need since the majority of functions will take much less than 1 second to execute. In the string microtime() returns, the first number is the microseconds part (a fraction of 1 second) and the second number is a UNIX timestamp showing the number of seconds since January 1, 1970. The print statement does use up some microseconds, but I'll also demonstrate how we can benchmark that as well so it can be backed out of the equation when making the comparison.

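<?php
// TEST 1: plain for loop, no function call per item.
print("<font color='green'>TEST1 Start " . microtime() . "</font><br>");
$sizeof = count($test1);
for ($i = 0; $i < $sizeof; $i++) {
    print("$test1[$i]");
}
print("<font color='red'>TEST1 End " . microtime() . "</font><br>");

// PAUSE: the gap between "TEST1 End" and this reading is the cost of
// printing one timer line.
print("<br><font color='blue'>PAUSE " . microtime() . "</font><p>");
?>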

The PAUSE line tells us how long it takes to print one microtime() reading: it is the gap between the TEST1 End time and the PAUSE time. We need to subtract this time from the figures we are given for TEST 1 and TEST 2.
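If you'd rather not do the subtraction by hand, the arithmetic can be sketched in PHP. This is only an illustration: elapsed_seconds() is a hypothetical helper name, not something built into PHP, and it assumes both readings are the "fraction seconds" strings that microtime() returns when called with no arguments.

```php
<?php
// Hypothetical helper (not part of PHP): seconds elapsed between two
// microtime() strings such as "0.13012000 1012345678".
// The whole seconds and the fractions are subtracted separately so that
// no precision is lost on the small fractional part.
function elapsed_seconds($start, $end) {
    list($startFraction, $startSeconds) = explode(' ', $start);
    list($endFraction, $endSeconds) = explode(' ', $end);
    return ((int)$endSeconds - (int)$startSeconds)
         + ((float)$endFraction - (float)$startFraction);
}

$start = microtime();
for ($i = 0; $i < 1000; $i++) {
    $x = $i * 2;            // stand-in for the code being benchmarked
}
$end   = microtime();
$pause = microtime();       // read straight after $end: timer overhead only

$overhead = elapsed_seconds($end, $pause);
$net      = elapsed_seconds($start, $end) - $overhead;
printf("net time: %.7f s (timer overhead %.7f s)\n", $net, $overhead);
```

The overhead measured here is only the cost of one extra microtime() call; in the article's version it also includes printing the timer line, which is the larger part.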

Now let's do the same thing as in TEST 1 using the array_walk() function in PHP 4. We need to create a function to print each $array_item.

<?php
print("<font color='green'>TEST2 Start " . microtime() . "</font><br>");
// Callback that array_walk() calls once for every item in the array.
function prnt($array_item) {
    print("$array_item");
}
array_walk($test1, "prnt");
print("<font color='red'>TEST2 End " . microtime() . "</font><br>");
?>

At first glance the second piece of code looks a bit shorter and more efficient, doesn't it? Now let's execute the script which runs both sections of code and then we can compare the actual execution times to see if looks in this test case are deceiving.

Below are the results for TEST 1, which ran 5 times. The more you run the process the more accurate average you will get, but for this "quick and dirty method" I usually do only 5-10 times and throw them into a spreadsheet to obtain a general idea. I could write code that would do this averaging on the fly for me, but in this article I wanted to show you a simple way to get bench times and also to eliminate as much overhead when testing the branch of code as possible.

TEST 1 (using a for loop with no function call)
time in tenths of a microsecond (the first seven digits after the decimal point of the elapsed time in seconds):
1301200
1274500
1301800
1300800
1319200
average = 1299500

Pause (this is the process time to print the microsecond message to the browser and should be deducted from the TEST 1 and TEST 2 times respectively)
2600
2900
2700
2800
2700
average = 2740

TEST 2 (using built-in array_walk() function)
1593400
1845000
1752000
1601800
1531700
average = 1664780

OK, now we can subtract the pause time from each to get an average execution time. The difference in execution times doesn't appear to be very much in absolute terms, but when benchmarking you have to think in percentages, not in hundredths or millionths of a second, as you'll see when the two are put side by side below. Let's compare them, treating the higher number as 100%:

Test 1 = 0.1296760 = 78.0%
Test 2 = 0.1662040 = 100.0% (TEST 1 needs 22% less time; put the other way, TEST 2 is about 28% slower than TEST 1! Yikes!)
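For anyone who wants to check the spreadsheet work, here is a small sketch that redoes the arithmetic from the five runs listed above. The figures are copied from the results; mean() is just a hypothetical helper name for the averaging step.

```php
<?php
// Raw readings copied from the TEST 1, Pause and TEST 2 lists above.
$test1 = array(1301200, 1274500, 1301800, 1300800, 1319200);
$pause = array(2600, 2900, 2700, 2800, 2700);
$test2 = array(1593400, 1845000, 1752000, 1601800, 1531700);

// Hypothetical helper: arithmetic mean of a list of readings.
function mean($values) {
    return array_sum($values) / count($values);
}

// Back the timer overhead out of each average.
$net1 = mean($test1) - mean($pause);   // 1299500 - 2740
$net2 = mean($test2) - mean($pause);   // 1664780 - 2740

// The same gap expressed two ways, depending on which test is the baseline.
printf("Test 1 takes %.1f%% of the time Test 2 takes\n", 100 * $net1 / $net2);
printf("Test 2 takes %.1f%% longer than Test 1\n", 100 * ($net2 / $net1 - 1));
```

The two percentages describe the same gap. Which one you quote depends on whether the slower or the faster test is treated as 100%.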

Yeah, you might say, big deal it is still a tiny fraction of a second, who will notice? But now let's increase the array size from 1,000 numbers to 10,000 which would represent a fairly decent sized array and do the same comparison. I'll skip showing you all the numbers to arrive at the averages and just show you the overall difference in time:

Test 1 = 0.13530900
Test 2 = 3.28378400

People will begin to notice 3 1/4 seconds versus a little more than 1/10th of one second in a small routine like this. Why is this happening? Creating your own functions is great for repetitive tasks and will save you time as a programmer, but they often cost you time from an execution standpoint, especially when we start talking about larger array sizes and functions that are called over and over, as array_walk() does with your function for every item in the array.
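To see the cost of the function calls on their own, without any printing in the way, you could try something like the sketch below. It is an illustration rather than the test I ran above: add_one() is a made-up function, and microtime(true), which returns the time as a float, assumes PHP 5 or later.

```php
<?php
$data = range(1, 10000);
$n    = count($data);

// Made-up user function doing the same work as the inline expression.
function add_one($value) {
    return $value + 1;
}

// Version A: the work is written directly inside the loop.
$start = microtime(true);
$sumInline = 0;
for ($i = 0; $i < $n; $i++) {
    $sumInline += $data[$i] + 1;
}
$inlineTime = microtime(true) - $start;

// Version B: the same work, but one user function call per item.
$start = microtime(true);
$sumCalled = 0;
for ($i = 0; $i < $n; $i++) {
    $sumCalled += add_one($data[$i]);
}
$calledTime = microtime(true) - $start;

printf("inline: %.6f s, with function call: %.6f s\n", $inlineTime, $calledTime);
```

Both loops produce the same sum, so any difference in the two times comes from the call itself. How large that difference is will depend on your PHP version and server.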

If we could simply use the built-in function without having it call a function we created, the difference would be negligible. But a built-in function that simply iterated through an array and printed the results would be far less flexible than one that allows a custom function call, so I see why array_walk() was added by the PHP development team. The comparison between the two isn't always going to come out this way, which is my point here. There very well might be times when using array_walk() will be more efficient than not. It's important to note that with new functions the emphasis is often placed on making things more convenient for the programmer rather than most efficient for the end program.

So while you might read in a lot of manuals and articles that you should *always* use a built-in PHP function to perform tasks, the real answer is that, to be absolutely certain, you should *always* benchmark the function against your existing code and see which performs better. Perhaps the built-in function will save you coding time and be more flexible across multiple applications, but the end result on the server for any one application could be a very different story because of overhead imposed by the built-in function that you don't need for the operation in question.
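If you find yourself benchmarking often, the averaging can be done on the fly instead of in a spreadsheet. The sketch below is one way to do it, not the method used for the numbers in this article: average_seconds() is a hypothetical helper, closures assume PHP 5.3 or later, and output buffering is used here only to keep the printed numbers off the screen, which also means the cost of sending output to a browser is not measured.

```php
<?php
// Hypothetical helper: run $fn $runs times, return the mean seconds per run.
function average_seconds($fn, $runs = 5) {
    $total = 0.0;
    for ($r = 0; $r < $runs; $r++) {
        $start = microtime(true);
        call_user_func($fn);
        $total += microtime(true) - $start;
    }
    return $total / $runs;
}

$test1 = range(1, 1000);

// Branch 1: plain for loop.
$forLoop = function () use ($test1) {
    $sizeof = count($test1);
    for ($i = 0; $i < $sizeof; $i++) {
        print($test1[$i]);
    }
};

// Branch 2: array_walk() with a user-defined callback.
$walk = function () use ($test1) {
    array_walk($test1, function ($array_item) {
        print($array_item);
    });
};

ob_start();                          // swallow the printed numbers
$forAverage  = average_seconds($forLoop);
$walkAverage = average_seconds($walk);
ob_end_clean();

printf("for loop: %.6f s, array_walk(): %.6f s\n", $forAverage, $walkAverage);
```

Swap your own two branches of code into the closures and raise $runs for a steadier average.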

For contrast let's use the built-in count() function. Try counting the items in an array with 100 items by looping over it and adding one for each item, and then compare that to simply calling the count() function. You'll see a drastic savings using the built-in PHP count() function. I'll go along with the idea that using built-in functions is best most of the time, but not all of the time. "Always" and "never" are pretty strong words in any application, not just programming.
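A quick sketch of that count() comparison might look like this. The function manual_count() and the 10,000 repetitions are my own choices for illustration, and microtime(true) again assumes PHP 5 or later.

```php
<?php
$items = range(1, 100);

// Made-up function: count the items the long way, one at a time.
function manual_count($array) {
    $total = 0;
    foreach ($array as $item) {
        $total++;
    }
    return $total;
}

// Repeat each approach many times so the difference is measurable.
$start = microtime(true);
for ($r = 0; $r < 10000; $r++) {
    $byHand = manual_count($items);
}
$manualTime = microtime(true) - $start;

$start = microtime(true);
for ($r = 0; $r < 10000; $r++) {
    $builtIn = count($items);
}
$builtInTime = microtime(true) - $start;

printf("manual: %.6f s, count(): %.6f s\n", $manualTime, $builtInTime);
```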

If-else-elseif or switch-case - which one is better?

Multiple if/elseif/else statements or switch/case, each matching a consistent left side against a varying right side ($leftside == $rightside): which is better? You can check this using the following simple piece of code, which iterates 100 times over an admittedly non-real-world example.

<?php
$days = array('monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'saturday', 'sunday');

// TEST 1: if/elseif/else. $days[5] is 'saturday', so every comparison
// fails and the final else branch runs on each iteration.
print("<font color='green'>TEST1 Start " . microtime() . "</font><br>");
for ($i = 0; $i < 100; $i++) {
    if ($days[5] == 'sunday') { print("it's sunday"); }
    elseif ($days[5] == 'monday') { print("it's monday"); }
    elseif ($days[5] == 'tuesday') { print("it's tuesday"); }
    elseif ($days[5] == 'wednesday') { print("it's wednesday"); }
    elseif ($days[5] == 'thursday') { print("it's thursday"); }
    elseif ($days[5] == 'friday') { print("it's friday!"); }
    else { echo($i); } // just print iteration with no <br>
}
print("<font color='red'>TEST1 End " . microtime() . "</font><br>");
print("<br><font color='blue'>PAUSE " . microtime() . "</font><p>");

// TEST 2: the same comparisons written as switch/case.
print("<font color='green'>TEST2 Start " . microtime() . "</font><br>");
for ($i = 0; $i < 100; $i++) {
    switch ($days[5]) {
        case 'sunday': print("it's sunday"); break;
        case 'monday': print("it's monday"); break;
        case 'tuesday': print("it's tuesday"); break;
        case 'wednesday': print("it's wednesday"); break;
        case 'thursday': print("it's thursday"); break;
        case 'friday': print("it's friday!"); break;
        default: echo($i); // just print iteration
    }
}
print("<font color='red'>TEST2 End " . microtime() . "</font><br>");
?>

Test 1 (if/else/elseif) = 295400 = 100%
Test 2 (switch/case) = 250400 = 84.77% (15.23% FASTER than using if/else/elseif)

Our test here indicates that using switch/case is superior to if/elseif/else in this example. Remember to test your actual code, because results can and do vary. A 15.23% reduction in execution time can be very significant over a lot of script executions.

Remember that individual code results do vary, so you should never generalize. Always compare and contrast similar sections of code to determine which will perform best in the actual application. I hope you find this information useful in your code development efforts.

 