Tuesday, March 26, 2013

Benchmark Testing: maps vs for-loop

Speed and efficiency should always be at the forefront of a developer's mind when writing code. Most agree on this. What is not agreed upon is how to achieve it. Each developer has their own preferences for how code should be written. Usually that comes down to how they were taught or trained; sometimes it's because one style is more aesthetically pleasing or reads cleaner. Regardless of the cause, the effect can be code that runs slower or less efficiently than it could. There is always more than one way to accomplish a task, and benchmark testing is how to show which approach is faster, even if it isn't the prettiest.

The Argument:

Jim thinks that it is faster to collect data via a for-loop:

Set<Id> accountIds = new Set<Id>();

for (Account a : [select Id from Account]) {
  accountIds.add(a.Id);
}
someMethod(accountIds);

Jane, however, insists that it is faster to load the SOQL results straight into a Map:

Map<Id, Account> accounts = new Map<Id, Account>(
  [select Id from Account]
);
someMethod(accounts.keySet());

After much debate, a few harsh words and a broken pen, a bet was struck.

The Test:

The comparison can be written as a test class. It could also be run in 'execute anonymous', but with a test class the org's real data won't be affected. The tests will run different amounts of data through each pattern and record how long each takes. First, the data they will iterate over needs to be created.


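// Creates one batch of 200 accounts whose names start with pName.
private static void createAccounts(String pName) {
  List<Account> accounts = new List<Account>();
  for (Integer i = 0; i < 200; i++) {
    // The '-' separator keeps names distinct across batches
    // (without it, '1' + '11' and '11' + '1' are both '111').
    accounts.add(new Account(Name = pName + '-' + i));
  }
  insert accounts;
}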

The name should be different for each Account created; to help accomplish this, a string is passed in and combined with the loop counter when the account is created. Salesforce governor limits restrict how much a single transaction can do, so accounts are created here in batches of 200, and a second method calls createAccounts() repeatedly to build larger data sets.

private static void setupTests(Integer pLimit) {
  for (Integer i = 0; i < pLimit; i++) {
    createAccounts(String.valueOf(i));
  }
}

To time the for-loop and the map, each method records a timestamp at the start and returns the elapsed time in milliseconds, which makes the comparison easier. Since each runs only against the data created in the test, the number of records being iterated over is known and no limit is needed in the SOQL statement.


private static Long runViaForLoop() {
  Long lStart = DateTime.now().getTime();
  Set<Id> accountIds = new Set<Id>();
  for (Account a : [select Id from Account]) {
    accountIds.add(a.Id);
  }
  return DateTime.now().getTime() - lStart;
}

private static Long runViaMap() {
  Long lStart = DateTime.now().getTime();
  Map<Id, Account> accounts = new Map<Id, Account>(
    [select Id from Account]
  );
  return DateTime.now().getTime() - lStart;
}
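Timing aside, the comparison is only fair if both patterns end up with the same set of Ids. A minimal sketch of that check follows; the class name, method name, and account names are made up for illustration, and it assumes the org has no validation rules or required custom fields that would block a bare Account insert.

```apex
@isTest
private class MapVsLoopEquivalenceTest {
  // Hypothetical sanity check, not part of the original benchmark.
  @isTest
  static void bothPatternsCollectTheSameIds() {
    List<Account> accounts = new List<Account>();
    for (Integer i = 0; i < 200; i++) {
      accounts.add(new Account(Name = 'Equivalence ' + i));
    }
    insert accounts;

    // Jim's pattern: iterate over the query and collect each Id.
    Set<Id> viaLoop = new Set<Id>();
    for (Account a : [select Id from Account]) {
      viaLoop.add(a.Id);
    }

    // Jane's pattern: build a Map from the query and take its keys.
    Set<Id> viaMap = new Map<Id, Account>([select Id from Account]).keySet();

    System.assertEquals(200, viaLoop.size());
    System.assertEquals(viaLoop, viaMap);
  }
}
```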

It's time to actually do some testing. The first test should run over a low number of records, like 1 or 5, and slowly work up to larger numbers like 10k. Debug statements are necessary to see the numbers in the logs after the tests complete. This will show which pattern took less time to iterate over the various record sizes, and thus which one is better to use, and who wins the bet!

private static testmethod void compareTime() {
  setupTests(1); // one batch of 200 accounts
  System.debug('for-loop: ' + runViaForLoop() + 'ms');
  System.debug('map: ' + runViaMap() + 'ms');
}

After the tests run, the output will show how long each took to complete the same task. 

DEBUG| for-loop: 36ms
DEBUG| map: 10ms

Looks like the Map is the winner. Unfortunately, Jim is a sore loser, and since other factors can affect the output, it's best to run the tests a few times and expand them to larger sets of records such as 4k and 10k:

private static testmethod void compareTime() {
  setupTests(20); // 20 batches x 200 = 4,000 accounts
  System.debug('4k for-loop: ' + runViaForLoop() + 'ms');
  System.debug('4k map: ' + runViaMap() + 'ms');
  setupTests(30); // 6,000 more, for 10,000 accounts in total
  System.debug('10k for-loop: ' + runViaForLoop() + 'ms');
  System.debug('10k map: ' + runViaMap() + 'ms');
}


DEBUG| 4k for-loop: 876ms
DEBUG| 4k map: 138ms

DEBUG| 10k for-loop: 1777ms
DEBUG| 10k map: 246ms

DEBUG| 4k for-loop: 787ms
DEBUG| 4k map: 107ms

DEBUG| 10k for-loop: 2234ms
DEBUG| 10k map: 340ms

DEBUG| 4k for-loop: 1580ms
DEBUG| 4k map: 113ms

DEBUG| 10k for-loop: 1740ms
DEBUG| 10k map: 220ms

DEBUG| 4k for-loop: 796ms
DEBUG| 4k map: 115ms

DEBUG| 10k for-loop: 1898ms
DEBUG| 10k map: 279ms



The graph above shows the average of 20 tests: as the number of records increases, so does the time saved by using the map. Jane is clearly the victor here, and Jim has to pay up. Numbers always win arguments.
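Averaging over repeated runs could be sketched as below. This is not the code behind the graph. It is a minimal illustration in which the class name, the BATCHES and ROUNDS constants, and the alternating run order are all assumptions. Alternating which pattern goes first is one way to reduce any bias from whichever query happens to run first. The sizes are kept small (4,000 accounts, 3 rounds) so the test stays under the per-transaction query row limit.

```apex
@isTest
private class MapVsLoopAverageTest {
  // Hypothetical settings for this sketch, not from the original tests.
  private static final Integer BATCHES = 20; // 20 x 200 = 4,000 accounts
  private static final Integer ROUNDS = 3;   // 3 x 2 x 4,000 = 24,000 rows queried

  private static void createAccounts(String prefix) {
    List<Account> accounts = new List<Account>();
    for (Integer i = 0; i < 200; i++) {
      accounts.add(new Account(Name = prefix + '-' + i));
    }
    insert accounts;
  }

  private static Long runViaForLoop() {
    Long lStart = DateTime.now().getTime();
    Set<Id> accountIds = new Set<Id>();
    for (Account a : [select Id from Account]) {
      accountIds.add(a.Id);
    }
    return DateTime.now().getTime() - lStart;
  }

  private static Long runViaMap() {
    Long lStart = DateTime.now().getTime();
    Map<Id, Account> accounts = new Map<Id, Account>(
      [select Id from Account]
    );
    return DateTime.now().getTime() - lStart;
  }

  @isTest
  static void compareAverageTime() {
    for (Integer b = 0; b < BATCHES; b++) {
      createAccounts(String.valueOf(b));
    }

    Long loopTotal = 0;
    Long mapTotal = 0;
    for (Integer r = 0; r < ROUNDS; r++) {
      // Swap the order each round so neither pattern always runs first.
      if (Math.mod(r, 2) == 0) {
        loopTotal += runViaForLoop();
        mapTotal += runViaMap();
      } else {
        mapTotal += runViaMap();
        loopTotal += runViaForLoop();
      }
    }

    System.debug('avg for-loop: ' + (loopTotal / ROUNDS) + 'ms');
    System.debug('avg map: ' + (mapTotal / ROUNDS) + 'ms');
  }
}
```

The averages use integer division, so they are truncated to whole milliseconds.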


4 comments:

  1. I've always been a fan of maps but I certainly wish that we could do a
    Map = [select name, {stuff} from account] ;

    Most of the ways that I am "stuck" using maps is indexed by a string not the ID. So for me I tend to have to do a select to a list and then iterate over the list in order to prep the map for direct access later.

    Replies
    1. Ok, it didn't like the map syntax I used to identify the map as Map(string, account). Dang HTML seeing that as a tag. :)

  2. Put them in the opposite order -- I bet you'll find whatever happens first happens slower :)
