Monday, November 14, 2011

What is in my Core Data managed store?

Suppose you have an iOS application. Suppose it uses the Core Data API to persist data. And let us further suppose that you are having a few problems; it seems like the data in your store isn't exactly as you expect.

It might prove useful to be able to get some basic information about what the heck is happening with your data at runtime. For example, let us suppose you have an API in your code that lets you get a reference to the NSManagedObjectModel. We can then print out which entities are in your store and how many records exist for each. Create a simple view controller with a multi-line text view on it and set up an outlet named 'output'. In the viewDidAppear method, gather up some basic diagnostic information and spit it out to the output text view:
//NOTE: sample assumes ARC, hence no calls to release
//this goes in the .m file of your diagnostic output view controller

- (void)viewDidAppear:(BOOL)animated
{
    [super viewDidAppear:animated];
    
    NSString *diagnostics = @"";
    
    @try {
        diagnostics = [NSString stringWithFormat:@"%@%@", diagnostics, @"Data Diagnostics\n"];
        NSManagedObjectModel *mom = nil; //CALL WHATEVER YOUR APP USES TO GET NSManagedObjectModel (assign it instead of nil)
        if (mom) {
            for (NSEntityDescription *entity in mom.entities) {
                NSUInteger countOf = [self countOfEntity:entity];
                diagnostics = [NSString stringWithFormat:@"%@  entity:%@ (%d records)\n", diagnostics, entity.name, countOf];
            }
        } else {
            diagnostics = [NSString stringWithFormat:@"%@%@", diagnostics, @"  objectModel is nil :(\n"];
        }
    }
    @catch (NSException *exception) {
        diagnostics = [NSString stringWithFormat:@"%@%@", diagnostics, @"  ERROR getting data diagnostics\n"];
    }
    
    output.text = diagnostics;
}

//note that countForFetchRequest: returns NSNotFound if an error occurs; that typically shouldn't happen here since we got the entity description directly from our object model
- (NSUInteger)countOfEntity:(NSEntityDescription *)entity
{
    NSFetchRequest *fr = [[NSFetchRequest alloc] init];
    fr.entity = entity;
    [fr setIncludesSubentities:NO];
    
    NSError *err = nil;
    NSManagedObjectContext *moc = nil; //CALL WHATEVER YOUR APP USES TO GET NSManagedObjectContext (assign it instead of nil)
    NSUInteger count = [moc countForFetchRequest:fr error:&err];
    return count;
}
If you print this to a text view you should end up with a simple list of your entity names and the number of records each contains. For some types of app this is surprisingly useful.

The main things I personally find this useful for are:

  1. Identifying glaring errors, such as "your model is completely missing an entity type; something is very wrong"
  2. Identifying errors where data isn't added or removed as expected, such as "when you do X in the UI there should be +1 of those (or -2 of those or whatever) if I quickly peek at diagnostics"
    1. Hmm ... perhaps it would be useful if the diagnostic view actually told you the delta for each type since you last looked?
There are many ways to wire this in; it doesn't have to be via its own view. Another approach I often find useful is to have a debug/logging mode, enabled only in development, where this type of information prints to the console when key events occur in the app.


Sunday, October 16, 2011

How to wait while an NSTimer runs in a unit test

Suppose you were a lowly noob to iOS development. Further suppose you had a Cool Idea (tm) which involved timers. The internets might rapidly guide you to NSTimer and you might decide to try to get it to log to the console in a unit test. The most obvious approach seems to be to set up a timer that ticks frequently, let's say every 0.1 seconds, give it a callback that logs something, then make a test that sleeps for a couple of seconds. Presumably during the sleep period we'll see a bunch of timer output. The code might look like this (inside an Xcode 4.2 test implementation class):

- (void)onTimerTick:(NSTimer*)timer
{
    NSLog(@"MY TIMER TICKED");
}

- (void)testTimerBasics
{
    NSLog(@"timer time");
    
    NSTimer *timer = [NSTimer scheduledTimerWithTimeInterval:0.1
                   target:self
                   selector:@selector(onTimerTick:)
                   userInfo:nil
                   repeats:YES];
    
    [timer fire]; //manually calling fire DOES log 'MY TIMER TICKED'
    
    NSLog(@"about to wait");    
    [NSThread sleepForTimeInterval:2.0]; //absolutely no logs of 'MY TIMER TICKED' occur; somehow the timer doesn't fire during a thread sleep :(
    NSLog(@"wait time is over");    
}

Sadly, absolutely no log messages are printed during our two-second sleep ([NSThread sleepForTimeInterval:2.0]); WTF?!

After much Googling, and literally in the midst of typing a Stack Overflow question, I came across a question about waiting for something else that mentioned NSRunLoop in passing. The very existence of a run loop class suggests an answer: our tests run on the same thread as the run loop. This means that if we put that thread to sleep nothing gets processed. Instead of sleeping we need some sort of "run the run loop for a while" approach. Luckily it turns out that NSRunLoop provides a runUntilDate: API, so we can re-write the test above as follows:

- (void)onTimerTick:(NSTimer*)timer
{
    NSLog(@"MY TIMER TICKED");
}

- (void)testTimerBasics
{
    NSLog(@"timer time");
    
    NSTimer *timer = [NSTimer scheduledTimerWithTimeInterval:0.1
                              target:self
                              selector:@selector(onTimerTick:)
                              userInfo:nil
                              repeats:YES];
    
    //[timer fire];
    
    NSDate *runUntil = [NSDate dateWithTimeIntervalSinceNow: 3.0 ];
    
    NSLog(@"about to wait");    
    [[NSRunLoop currentRunLoop] runUntilDate:runUntil];
    NSLog(@"wait time is over");    
}
We've found the right magic incantation! Knuth would be proud.

Speaking of magic incantations, I am using the SyntaxHighlighter libraries hosted @ http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/. However, there is no Objectionable-C brush there, so I took the one posted @ http://www.undermyhat.org/blog/wp-content/uploads/2009/09/shBrushObjectiveC.js and updated the casing and namespace names to the newer highlighter standard. The updated brush looks like this:


dp.sh.Brushes.ObjC = function()
{
 var datatypes = 'char bool BOOL double float int long short id void';
 
 var keywords = 'IBAction IBOutlet SEL YES NO readwrite readonly nonatomic nil NULL ';
 keywords += 'super self copy ';
 keywords += 'break case catch class const copy __finally __exception __try ';
 keywords += 'const_cast continue private public protected __declspec ';
 keywords += 'default delete deprecated dllexport dllimport do dynamic_cast ';
 keywords += 'else enum explicit extern if for friend goto inline ';
 keywords += 'mutable naked namespace new noinline noreturn nothrow ';
 keywords += 'register reinterpret_cast return selectany ';
 keywords += 'sizeof static static_cast struct switch template this ';
 keywords += 'thread throw true false try typedef typeid typename union ';
 keywords += 'using uuid virtual volatile wchar_t while';
 // keywords += '@property @selector @interface @end @implementation @synthesize ';
 
  
 this.regexList = [
  { regex: dp.sh.RegexLib.SingleLineCComments,  css: 'comments' },  // one line comments
  { regex: dp.sh.RegexLib.MultiLineCComments,  css: 'comments' },  // multiline comments
  { regex: dp.sh.RegexLib.DoubleQuotedString,  css: 'string' },   // double quoted strings
  { regex: dp.sh.RegexLib.SingleQuotedString,  css: 'string' },   // single quoted strings
  { regex: new RegExp('^ *#.*', 'gm'),      css: 'preprocessor' },  // preprocessor
  { regex: new RegExp(this.GetKeywords(datatypes), 'gm'),  css: 'datatypes' },  // datatypes
  { regex: new RegExp(this.GetKeywords(keywords), 'gm'),  css: 'keyword' },   // keyword
  { regex: new RegExp('\\bNS\\w+\\b', 'g'),     css: 'keyword' },   // keyword
  { regex: new RegExp('@\\w+\\b', 'g'),      css: 'keyword' },   // keyword
  ];
 this.CssClass = 'dp-objc';
 this.Style = '.dp-objc .datatypes { color: #2E8B57; font-weight: bold; }'; 
}
dp.sh.Brushes.ObjC.prototype = new dp.sh.Highlighter();
dp.sh.Brushes.ObjC.Aliases  = ['objc'];

Friday, October 14, 2011

Integrating Javascript tests into a CLI build

Wherein we walk through a basic setup for running Javascript unit tests on the command line. After some initial investigation (here) I didn't find time to get back to Javascript unit testing until recently. I have now managed to get Javascript unit tests running fairly gracefully in a command line build at work; here is an outline of how, simplified from the "real" implementation to highlight the basics. Fortunately a great deal of the work is already done for us; it's always nice when it turns out that way.

We are going to run everything off the filesystem to avoid having our tests impacted by external influences.

Part 1: Basic test setup
  1. Create a directory to house Javascript unit test files; we will refer to this as \jsunit henceforth when giving paths. 
  2. Download qunit.js and qunit.css into \jsunit
  3. Download run-qunit.js into \jsunit
  4. Create a file testme.js in \jsunit with the following content
    /**
     * var-args; adds up all arguments and returns sum
     */
    function add() {
    }
    
  5. Create a file testme.test.htm in \jsunit with the following content
    • Note we are using local filesystem paths to load all content; we have no external dependencies
    • <!DOCTYPE html>
      <html>
      <head>
      	<!-- we need QUnit as a test runner -->
          <link rel="stylesheet" href="qunit.css" type="text/css" media="screen" />
          <script src="qunit.js"></script>
      	
      	<!-- we'd like to have the file we're going to test -->
          <script src="testme.js"></script>
      	
      	<!-- and finally lets write some tests -->
      	<script>
      		console.log("test time baby");
      
      		test("add is defined", function() {
      			equals(typeof window.add, "function", "add isn't a function :(");
      		});
      	</script>
          
      </head>
      <body>
      	 <h1 id="qunit-header">QUnit Tests</h1>
      	 <h2 id="qunit-banner"></h2>
      	 <div id="qunit-testrunner-toolbar"></div>
      	 <h2 id="qunit-userAgent"></h2>
      	 <ol id="qunit-tests"></ol>
      	 <div id="qunit-fixture"></div>    
      </body>
      </html>
      
  6. Download PhantomJS (1.3.0 at time of writing)
    • For the example PhantomJS commands I will assume it is on PATH (eg phantomjs args); use the qualified path if not (eg C:\where\phantom\is\phantomjs args)
  7. Open testme.test.htm in a browser; you should see the QUnit results page with the single test passing

  8. Open a command prompt, navigate to \jsunit and run phantomjs run-qunit.js testme.test.htm
    • Output should be similar to:
      test time baby
      'waitFor()' finished in 211ms.
      Tests completed in 57 milliseconds.
      1 tests of 1 passed, 0 failed.
      
    • Note we don't see any "test blah pass" or "test two fail" style output
Part 2: CLI build integration prep
So far so good; now we need to get set up to run in a CLI build. There are a couple of things we'd like here, most of which are already implemented in run-qunit.js:
  1. Output each test pass/fail
  2. Output log messages from tests to the console
    1. This "just works" courtesy of run-qunit.js, yay!
  3. Exit with non-zero error code if tests fail
    1. This makes it easy for build to detect failure and do something in response; for example an Ant build could simply set failonerror
    2. This "just works" courtesy of run-qunit.js, yay!
We just have to set up output of test pass/fail information to the console. We'll also add a test that fails to show what that looks like. Proceed as follows:
  1. Create a file test-support.js in \jsunit with the following content:
    //create a scope so we don't pollute global
    (function() {  
       var testName;
       
       //arg: { name }
    	QUnit.testStart = function(t) {
    	    testName = t.name;
    	};
    	
    	//arg: { name, failed, passed, total }
    	QUnit.testDone = function(t) {
    	    console.log('Test "' + t.name + '" completed: ' + (0 === t.failed ? 'pass' : 'FAIL'))
    	};
    	
    	//{ result, actual, expected, message }
    	QUnit.log = function(t) {
    	    if (!t.result) {
    	        console.log('Test "' + testName + '" assertion failed. Expected <' + t.expected + '> Actual <' + t.actual + '>' + (t.message ? ': \'' + t.message + '\'' : ''));
    	    }
    	};
    }());
    
  2. Edit testme.test.htm to pull in test-support.js and add a test that will currently fail
    • <!DOCTYPE html>
      <html>
      <head>
      	<!-- we need QUnit as a test runner -->
          <link rel="stylesheet" href="qunit.css" type="text/css" media="screen" />
          <script src="qunit.js"></script>
      	
      	<!-- where would our tests be without support! -->
      	<script src="test-support.js"></script>
      	
      	<!-- we'd like to have the file we're going to test -->
          <script src="testme.js"></script>
      	
      	<!-- and finally lets write some tests -->
      	<script>
      	
      		test("add is defined", function() {
      			equals(typeof window.add, "function", "add isn't a function :(");
      		});
      		
      		test("add 1+1", function() {
      			equals(add(1, 1), 2);
      		});		
      	</script>
          
      </head>
      <body>
      	 <h1 id="qunit-header">QUnit Tests</h1>
      	 <h2 id="qunit-banner"></h2>
      	 <div id="qunit-testrunner-toolbar"></div>
      	 <h2 id="qunit-userAgent"></h2>
      	 <ol id="qunit-tests"></ol>
      	 <div id="qunit-fixture"></div>    
      </body>
      </html>
      
  3. Open a command prompt, navigate to \jsunit and run phantomjs run-qunit.js testme.test.htm
    • Output should be similar to:
      Test "add is defined" completed: pass
      Test "add 1+1" assertion failed. Expected <2> Actual <undefined>
      Test "add 1+1" completed: FAIL
      'waitFor()' finished in 209ms.
      Tests completed in 70 milliseconds.
      1 tests of 2 passed, 1 failed.
    • If you print the exit code (echo %ERRORLEVEL% in Windoze) you should get a 1, indicating we have fulfilled the 'exit with non-zero exit code on failure' requirement :)
Part 3: Ant integration
At long last we are ready to integrate this mess into a build. For this example I will use Ant and will assume Ant is on PATH. At time of writing I am using Ant 1.8.2.

  1. Create a phantomjs.bat file in \jsunit with the following content
    @echo off
    C:\where\you\put\phantom\phantomjs.exe %*
    
    • Alternately create phantomjs.sh with equivalent functionality if on *nix
  2. Create a build.xml file in \jsunit with the following content
    • <?xml version="1.0" encoding="UTF-8"?>
      <project name="jsunittests" basedir="." default="main">
      	<property name="builddir" location="${basedir}/target"/>
      	
      	<condition property="phantom.filename" value="phantomjs.bat"><os family="windows"/></condition>
      	<condition property="phantom.filename" value="phantomjs.sh"><os family="unix"/></condition>   
      	
      	<target name="clean">
      		<delete dir="${builddir}"/>
      	</target>
      	
      	<target name="prep">
      		<mkdir dir="${builddir}"/>
      	</target>
      	
      	<target name="jstest">
            <!--Run all tests w/phantom, fail if tests fail. Execute all files w/extension .test.htm. -->
            <apply executable="${phantom.filename}" failonerror="true" dir="${basedir}" relative="true">
               <arg value="run-qunit.js"/>
               <fileset dir="${basedir}">
                  <include name="**/*.test.htm" />
               </fileset>
            </apply>			
      	</target>
      	
      	<target name="main" depends="clean, prep, jstest">
      	</target>
      </project>
      
  3. Run 'ant'; you should get output similar to the following (yes, it's supposed to fail; remember we set up a failing test on purpose)
    • Buildfile: build.xml
      
      clean:
         [delete] Deleting directory C:\Code\jsunit-trial\target
      
      prep:
          [mkdir] Created dir: C:\Code\jsunit-trial\target
      
      jstest:
          [apply] Test "add is defined" completed: pass
          [apply] Test "add 1+1" assertion failed. Expected <2> Actual 
          [apply] Test "add 1+1" completed: FAIL
          [apply] 'waitFor()' finished in 218ms.
          [apply] Tests completed in 58 milliseconds.
          [apply] 1 tests of 2 passed, 1 failed.
      
      BUILD FAILED
      C:\Code\jsunit-trial\build.xml:18: apply returned: 1
      
      Total time: 0 seconds
      
  4. Edit testme.js just enough to fix the test
    • /**
       * var-args; adds up all arguments and returns sum
       */
      function add() {
      	var sum =0;
      	for (var i=0; i<arguments.length; i++)
      		sum += arguments[i];
      	return sum;
      }
  5. Run 'ant'; you should get output similar to the following
    • Buildfile: build.xml
      
      clean:
         [delete] Deleting directory C:\Code\jsunit-trial\target
      
      prep:
          [mkdir] Created dir: C:\Code\jsunit-trial\target
      
      jstest:
          [apply] Test "add is defined" completed: pass
          [apply] Test "add 1+1" completed: pass
          [apply] 'waitFor()' finished in 214ms.
          [apply] Tests completed in 59 milliseconds.
          [apply] 2 tests of 2 passed, 0 failed.
      
      main:
      
      BUILD SUCCESSFUL
      Total time: 0 seconds
Pretty sweet, we've got Javascript tests running in an Ant build as a first-class citizen. Now if you break my Javascript my Continuous Integration server will let me know!

Part 4: Code coverage
Finally we are ready to get some code coverage. We are going to get code coverage by instrumenting our js files using JSCoverage, running our QUnit tests such that the relative paths resolve to the instrumented copies, and then using the PhantomJS file system APIs to create a colorized copy of the original js file to visually display coverage. We'll do a quick and dirty percentage coverage output to the console as well.


  1. Download JSCoverage 0.5.1
  2. Create a jscoverage.bat file in \jsunit with the following content
    @echo off
    C:\where\you\put\jscoverage\jscoverage.exe %*
    
  3. Create a template file for coverage information named coverageBase.htm in \jsunit
    • <!DOCTYPE html>
      <html>
      <head>
          <style>
              .code {
                  white-space: pre;
                  font-family: courier new;
                  width: 100%;            
              }
              
              .miss {
                  background-color: #FF0000;
              }
              
              .hit {
                  background-color: #94FF7C;
              }
              
              .undef {
                  background-color: #AFFF9E;
              }        
          </style>
      </head>
      <body>
      
      COLORIZED_LINE_HTML
      
      </body>
      </html>
      
  4. Update build.xml to perform a few new steps
    1. Create a \target\testjs\js directory and copy our js files into it
    2. Instrument our js files for code coverage, putting the instrumented version into \target\testjs\jsinstrumented
    3. Copy *.test.htm into \target\testhtm
    4. Copy base resources to run tests (run-qunit.js, qunit.js, qunit.css) into \target\testhtm
    5. Copy the instrumented js files into \target\testhtm
      1. Note that because we used relative paths to our test js files the *.test.htm QUnit html files will now resolve js to the instrumented version when we run the files out of \target\testhtm
    6. Run PhantomJS on *.test.htm in \target\testhtm
    7. The updated build.xml looks like this:
    8. <?xml version="1.0" encoding="UTF-8"?>
      <project name="jsunittests" basedir="." default="main">
      	<property name="builddir" location="${basedir}/target"/>
      	<property name="jstestdir" location="${builddir}/testjs"/>
      	<property name="jsdir" location="${jstestdir}/js"/>
      	<property name="jsinstrumenteddir" location="${jstestdir}/jsinstrumented"/>
      	<property name="testhtmdir" location="${builddir}/testhtm"/>
      	
      	<condition property="phantom.filename" value="phantomjs.bat"><os family="windows"/></condition>
      	<condition property="phantom.filename" value="phantomjs.sh"><os family="unix"/></condition>   
      	
      	<property name="jscoverage.filename" value="jscoverage.bat" />
      	
      	<target name="clean">
      		<delete dir="${builddir}"/>
      	</target>
      	
      	<target name="prep">
      		<mkdir dir="${jsdir}"/>
      		<mkdir dir="${jsinstrumenteddir}"/>		
      		<mkdir dir="${testhtmdir}"/>
      		
      		<!-- copy non-test js files to target so we can mess with 'em. how we select which files may vary; for this 
      			 example just pick the one file we are testing.-->
      		<copy todir="${jsdir}">
      			<fileset dir="${basedir}">
      				<include name="testme.js" />
      			</fileset>
      		</copy>
      				
      		<!-- run jscoverage to produce a version of the file instrumented for code coverage -->
      		<exec executable="${jscoverage.filename}" failonerror="true">
      			<arg value="${jsdir}"/>
      			<arg value="${jsinstrumenteddir}"/>
      		</exec>   		
      		
      		<!-- copy our test htm files and modify them to point to the coverage indexed version of the test file. -->
      		<copy todir="${testhtmdir}">
      			<fileset dir="${basedir}">
      				<include name="**/*.test.htm" />
      			</fileset>
      		</copy>		
      		
      		<!-- copy core resources to testhtmdir so we can load them with same paths as when executing test htm files directly -->
      		<copy todir="${testhtmdir}">
      			<fileset dir="${jsinstrumenteddir}">
      				<include name="**/*.js" />
      				<exclude name="jscoverage.js"/>
      			</fileset>
      		</copy>				
      		<copy todir="${testhtmdir}">
      			<fileset dir="${basedir}">
      				<include name="test-support.js" />
      				<include name="run-qunit.js" />
      				<include name="qunit.css" />
      				<include name="qunit.js" />
      			</fileset>
      		</copy>				
      	</target>
      	
      	<target name="jstest">
            <!--Run all tests w/phantom, fail if tests fail. Execute all files w/extension .test.htm. -->
            <apply executable="${basedir}/${phantom.filename}" failonerror="true" dir="${testhtmdir}" relative="false">
               <arg value="run-qunit.js"/>
      		 <srcfile/>
      		 <arg value="${basedir}"/>
               <fileset dir="${testhtmdir}">
                  <include name="**/*.test.htm" />
               </fileset>
            </apply>			
      	</target>
      	
      	<target name="main" depends="clean, prep, jstest">
      	</target>
      </project>
      
  5. Modify our test-support.js to look for jscoverage data and compute a rough coverage percentage from the count of lines hit, missed, and irrelevant (non-executable). Also expose a function that a caller outside of the page context can use to access coverage information. The new version should look like this:
    • //create a scope so we don't pollute global
      (function() {  
         var testName;
         
         //arg: { name }
      	QUnit.testStart = function(t) {
      	    testName = t.name;
      	};
      	
      	//arg: { name, failed, passed, total }
      	QUnit.testDone = function(t) {
      	    console.log('Test "' + t.name + '" completed: ' + (0 === t.failed ? 'pass' : 'FAIL'))
      	};
      	
      	//{ result, actual, expected, message }
      	QUnit.log = function(t) {
      	    if (!t.result) {
      	        console.log('Test "' + testName + '" assertion failed. Expected <' + t.expected + '> Actual <' + t.actual + '>' + (t.message ? ': \'' + t.message + '\'' : ''));
      	    }
      	};
      	
      	//we want this at global scope so outside callers can find it. In a more realistic implementation we
      	//should probably put it in a namespace.
      	window.getCoverageByLine = function() {
      		var key = null;
              var lines = null;
              //look for code coverage data    
              if (typeof _$jscoverage === 'object') {
      			for (key in _$jscoverage) {}
      			lines = _$jscoverage[key];
              } 
      
      		if (!lines) {
                 console.log('code coverage data is NOT available');
              } 
              		
              return { 'key': key, 'lines': lines };
         };
      
         QUnit.done = function(t) {
              var cvgInfo = getCoverageByLine();
              if (!!cvgInfo.lines) {
                  var testableLines = 0;
                  var testedLines = 0;
      			var untestableLines = 0;
                  for (lineIdx in cvgInfo.lines) {
      				var cvg = cvgInfo.lines[lineIdx];
      				if (typeof cvg === 'number') {
      					testableLines += 1;
      					if (cvg > 0) {
      						testedLines += 1;
      					}					
      				} else {
      					untestableLines += 1;
      				}
                  }     
                  var coverage = '' + Math.floor(100 * testedLines / testableLines) + '%';
                  
      			var result = document.getElementById('qunit-testresult');
      			if (result != null) {
      				result.innerHTML = result.innerHTML + ' ' + coverage + ' test coverage of ' + cvgInfo.key;
      			} else {
      				console.log('can\'t find test-result element to update');
      			}			
              }
         };  	
      }());
      
  6. Finally, modify run-qunit.js to load the original js file and produce a colorized version based on the coverage data we get by running the test against the instrumented copy of the js file. The new version should look like this:
    • /**
       * Wait until the test condition is true or a timeout occurs. Useful for waiting
       * on a server response or for a ui change (fadeIn, etc.) to occur.
       *
       * @param testFx javascript condition that evaluates to a boolean,
       * it can be passed in as a string (e.g.: "1 == 1" or "$('#bar').is(':visible')" or
       * as a callback function.
       * @param onReady what to do when testFx condition is fulfilled,
       * it can be passed in as a string (e.g.: "1 == 1" or "$('#bar').is(':visible')" or
       * as a callback function.
       * @param timeOutMillis the max amount of time to wait. If not specified, 3 sec is used.
       */
      function waitFor(testFx, onReady, timeOutMillis) {
          var maxtimeOutMillis = timeOutMillis ? timeOutMillis : 3001, //< Default Max Timout is 3s
              start = new Date().getTime(),
              condition = false,
              interval = setInterval(function() {
                  if ( (new Date().getTime() - start < maxtimeOutMillis) && !condition ) {
                      // If not time-out yet and condition not yet fulfilled
                      condition = (typeof(testFx) === "string" ? eval(testFx) : testFx()); //< defensive code
                  } else {
                      if(!condition) {
                          // If condition still not fulfilled (timeout but condition is 'false')
                          console.log("'waitFor()' timeout");
                          phantom.exit(1);
                      } else {
                          // Condition fulfilled (timeout and/or condition is 'true')
                          console.log("'waitFor()' finished in " + (new Date().getTime() - start) + "ms.");
                          typeof(onReady) === "string" ? eval(onReady) : onReady(); //< Do what it's supposed to do once the condition is fulfilled
                          clearInterval(interval); //< Stop this interval
                      }
                  }
              }, 100); //< repeat check every 100ms
      };
      
      
      if (phantom.args.length === 0 || phantom.args.length > 3) {
          console.log('Usage: run-qunit.js URL basedir');
          phantom.exit(1);
      }
      
      var fs = require('fs');
      var page = require('webpage').create();
      
      // Route "console.log()" calls from within the Page context to the main Phantom context (i.e. current "this")
      page.onConsoleMessage = function(msg) {
          console.log(msg);
      };
      
      var openPath = phantom.args[0].replace(/^.*(\\|\/)/, '');
      var basedir = phantom.args[1];
      var coverageBase = fs.read(basedir + fs.separator + 'coverageBase.htm');
      
      page.open(openPath, function(status){
          if (status !== "success") {
              console.log("Unable to access network");
              phantom.exit(1);
          } else {
              waitFor(function(){
                  return page.evaluate(function(){
                      var el = document.getElementById('qunit-testresult');
                      if (el && el.innerText.match('completed')) {
                          return true;
                      }
                      return false;
                  });
              }, function(){
      			//BEGIN MODIFIED: output colorized code coverage
      			//reach into page context and pull out coverage info. stringify to pass context boundaries.
      			var coverageInfo = JSON.parse(page.evaluate(function() { return JSON.stringify(getCoverageByLine()); }));
      			var lineCoverage = coverageInfo.lines;
      			var originalFile = basedir + fs.separator + coverageInfo.key;
      			var fileLines = readFileLines(originalFile);
      			
                  var colorized = '';
                  
      			console.log('lines=' + JSON.stringify(lineCoverage));
                  for (var idx=0; idx < lineCoverage.length; idx++) { 
                      //+1: coverage lines count from 1.
                      var cvg = lineCoverage[idx + 1];
                      var hitmiss = '';
                      if (typeof cvg === 'number') {
                          hitmiss = ' ' + (cvg>0 ? 'hit' : 'miss');
                      } else {
                          hitmiss = ' ' + 'undef';
                      }
                      var htmlLine = fileLines[idx].replace('<', '&lt;').replace('>', '&gt;');
                      colorized += '<div class="code' + hitmiss + '">' + htmlLine + '</div>\n';
                  };        
                  colorized = coverageBase.replace('COLORIZED_LINE_HTML', colorized);
                  
                  var coverageOutputFile = phantom.args[0].replace('.test.htm', '.coverage.htm');
                  fs.write(coverageOutputFile, colorized, 'w');
                  
                  console.log('Coverage for ' + coverageInfo.key + ' in ' + coverageOutputFile);			
      			//END MODIFIED
      		
                  var failedNum = page.evaluate(function(){
                      var el = document.getElementById('qunit-testresult');
                      console.log(el.innerText);
                      try {
                          return el.getElementsByClassName('failed')[0].innerHTML;
                      } catch (e) { }
                      return 10000;
                  });
                  phantom.exit((parseInt(failedNum, 10) > 0) ? 1 : 0);
              });
          }
      });
      
      //MODIFIED: add new fn
      function readFileLines(filename) {
          var stream = fs.open(filename, 'r');
          var lines = [];
          var line;
          while (!stream.atEnd()) {
              lines.push(stream.readLine());
          }
          stream.close();
          
          return lines;
      }  
      
      
  7. Run 'ant'; you should see output similar to:

  8. Open \jsunit\target\testhtm\testme.test.htm in a browser; you should see something similar to this (note coverage % appears):

  9. Open \jsunit\target\testhtm\testme.coverage.htm in a browser; you should see something similar to this (red for untested, green for tested, light green for non-executable lines):

So where does that leave us?
We have clearly demonstrated that we can accomplish some important things:


  • Write unit tests for Javascript
  • Run unit tests for Javascript in a command line build
  • Index Javascript files for code coverage
  • Output coverage percentage to the test runner (QUnit html file)
  • Render a colorized version of the Javascript under test clearly indicating which lines are/aren't being tested
I think this is awesome! Bear in mind that in a real version we would of course make numerous refinements to this rather basic implementation; what we have is a proof of concept, not by any stretch of the imagination an implementation ready for a team to consume.

Friday, September 23, 2011

Tracking a running standard deviation

It is fairly common to want to track statistics on your software as it runs. Sometimes it makes sense to scrape logs and aggregate a massive set of samples. Other times you really want "live" numbers, eg running statistics; most commonly, arithmetic mean and standard deviation. Tracking arithmetic mean alone is relatively easy but it can be very misleading. I frequently see claims of "good performance" based on the mean alone that completely break down once standard deviation is considered: a significant percentage of the population is actually experiencing something much, much worse than the mean.

So, how do we track a standard deviation from a series of samples? We could keep all samples in memory but that sounds bad; ideally we'd just like a few variables used to keep a running count of the stat. Luckily Knuth solved this problem for us, and even more luckily a statistician has written sample code in C# for us at http://www.johndcook.com/standard_deviation.html.
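For reference, the recurrences the code implements (Welford's method, from the same Knuth reference cited in the code comments) are, for the k-th sample \(x_k\):

\[ M_1 = x_1, \qquad S_1 = 0, \]
\[ M_k = M_{k-1} + \frac{x_k - M_{k-1}}{k}, \qquad S_k = S_{k-1} + (x_k - M_{k-1})(x_k - M_k), \]

giving a running mean of \(M_k\), sample variance \(S_k/(k-1)\), and standard deviation \(\sqrt{S_k/(k-1)}\). In the code below, m_newM and m_newS play the roles of \(M_k\) and \(S_k\).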

In Java this winds up looking remarkably similar:
//class
public class RunningStat {
	private int m_n;
	private double m_oldM;
	private double m_newM;
	private double m_oldS;
	private double m_newS;
	
	public RunningStat() {
		m_n = 0;
	}
	
	public void clear() { m_n = 0; }
	
	public void addSample(double sample) {
        m_n++;

        // See Knuth TAOCP vol 2, 3rd edition, page 232
        if (m_n == 1)
        {
            m_oldM = m_newM = sample;
            m_oldS = 0.0;
        }
        else
        {
            m_newM = m_oldM + (sample - m_oldM)/m_n;
            m_newS = m_oldS + (sample - m_oldM)*(sample - m_newM);

            // set up for next iteration
            m_oldM = m_newM; 
            m_oldS = m_newS;
        }
	}
	
	public int getNumSamples() { return m_n; }
	public double getMean() { return (m_n > 0) ? m_newM : 0.0; }
	public double getVariance() { return ( (m_n > 1) ? m_newS/(m_n - 1) : 0.0 ); }
	public double getStdDev() { return Math.sqrt(getVariance()); }	
}

//usage
RunningStat timeStat = new RunningStat();		
...
long time = System.nanoTime();			
//do the thing we are tracking...						
time = (System.nanoTime() - time) / (1000*1000); //ms
			
timeStat.addSample(time); 

This does have some limitations for a typical Java web application; in particular, it isn't threadsafe. We can simply slap synchronized on it, using a private lock to avoid unwanted publicity of our synchronization primitives:
public class RunningStat {
	private int m_n;
	private double m_oldM;
	private double m_newM;
	private double m_oldS;
	private double m_newS;
	
	private Object m_lock = new Object();
	
	public RunningStat() {
		m_n = 0;
	}
	
	public void clear() {
		synchronized(m_lock) {
			m_n = 0;
		}
	}
	
	public void addSample(double sample) {
		synchronized(m_lock) {
	        m_n++;
	
	        // See Knuth TAOCP vol 2, 3rd edition, page 232
	        if (m_n == 1)
	        {
	            m_oldM = m_newM = sample;
	            m_oldS = 0.0;
	        }
	        else
	        {
	            m_newM = m_oldM + (sample - m_oldM)/m_n;
	            m_newS = m_oldS + (sample - m_oldM)*(sample - m_newM);
	
	            // set up for next iteration
	            m_oldM = m_newM; 
	            m_oldS = m_newS;
	        }
		}
	}
	
	public int getNumSamples() { synchronized(m_lock) { return m_n; } }
	public double getMean() { synchronized(m_lock) { return (m_n > 0) ? m_newM : 0.0; } }
	public double getVariance() { synchronized(m_lock) { return ( (m_n > 1) ? m_newS/(m_n - 1) : 0.0 ); } }
	public double getStdDev() { synchronized(m_lock) { return Math.sqrt(getVariance()); } }	
}
This leaves open the question of how we actually distribute references to the stat. For example, it could be that anytime a given handler executes we want to update a stat, and we want to grab that stat to use in a periodic stat logger, and we want to grab it for our view statistics page. All these locations need to get a reference to the same instance of RunningStat. If we use something like Spring we can trivially inject the same reference. If not we may wish to manage a stats registry ourselves. A simple implementation using ConcurrentHashMap might look something like this:
import java.util.Collection;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class StatisticsRegistry {
	private static final StatisticsRegistry instance = new StatisticsRegistry();				
	
	private final ConcurrentMap<String, RunningStat> statById = new ConcurrentHashMap<String, RunningStat>();
	private final Object writeLock = new Object();
	
	public static StatisticsRegistry getInstance() { return instance; }
	
	public RunningStat getNamedStatistic(String name) {
		/**
		 * Usually the stat will exist; avoid extra sync ops on write lock when possible
		 */
		RunningStat stat = statById.get(name);
		
		if (null == stat) {
			synchronized(writeLock) {
				//someone else could have just inserted it; if so putIfAbsent will return the value they put in
				RunningStat created = new RunningStat();
				RunningStat existing = statById.putIfAbsent(name, created);
				stat = (existing != null) ? existing : created;
			}
		}
		
		return stat;
	}
		
	public Collection<RunningStat> getAllStatistics() {
		return statById.values();
	}
}
Clients can request a stat by name, and clients like a stat logger or statistics view can grab either all stats or specific ones as needed. As the same instance is always returned for a given name, it is safe for clients to retain RunningStat references.
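To make the wiring concrete, here is a minimal usage sketch against the classes above; the stat name "handler.time.ms" and the handler/logger split are just illustrations:

//in a handler (or wherever the event we want to measure occurs)
RunningStat stat = StatisticsRegistry.getInstance().getNamedStatistic("handler.time.ms");
long time = System.nanoTime();
//do the thing we are tracking...
stat.addSample((System.nanoTime() - time) / (1000.0 * 1000.0)); //ms

//elsewhere, eg in a periodic stat logger or the statistics view page
for (RunningStat s : StatisticsRegistry.getInstance().getAllStatistics()) {
	System.out.printf("samples=%d mean=%.2fms stddev=%.2fms%n",
			s.getNumSamples(), s.getMean(), s.getStdDev());
}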

Monday, July 11, 2011

Javascript Unit Tests with QUnit, Ant, and PhantomJS, Take 1

Recently I have been finding that bugs in Javascript slip by a lot more easily than bugs in Scala, Java, or various other languages for which we write unit tests. jQuery seems to use QUnit (http://docs.jquery.com/QUnit) but QUnit appears to expect a web page to be set up to host it. This is better than nothing but really I want to run my js unit tests in an automated build, in my case using Ant.

The problem seemed remarkably likely to be solved already so I took to the Google and discovered some blog posts (http://twoguysarguing.wordpress.com/2010/11/26/qunit-cli-running-qunit-with-rhino/, http://twoguysarguing.wordpress.com/2010/11/06/qunit-and-the-command-line-one-step-closer/) where the author was attempting to build a command line unit test runner using Rhino and QUnit. Apparently John Resig tweeted some time ago (http://twitter.com/#!/jeresig/status/4477641447) that QUnit should be operable in this manner, so things seemed promising.

The twoguysarguing (great title) blog posts I found seemed to require modifications to QUnit source, plus Rhino not being a full browser apparently caused some issues as well. I really didn't want a custom version of QUnit, but the general approach seemed promising. In the comments for the second post someone suggested use of PhantomJS (http://twoguysarguing.wordpress.com/2010/11/06/qunit-and-the-command-line-one-step-closer/#comment-599), a headless WebKit browser. I decided to give this a try as it sounded remarkably reasonable.

My first step was to verify PhantomJS worked for me at all. It ran my first test without any issue:
//try1.js
console.log('Hello, World');
phantom.exit();
Executed via phantomjs try1.js, this prints Hello, World just as one might hope.

The next question seemed to be whether or not PhantomJS could actually load QUnit using injectJs (ref http://code.google.com/p/phantomjs/wiki/Interface). I git cloned QUnit and attempted to invoke injectJs on it:
//try2.js
if (window.QUnit == undefined)
 console.log('no QUnit yet!');
else
 console.log('somehow we already haz QUnit !!');
phantom.injectJs('D:\\Code\\3.7-scalaide\\JavaScriptUnitTests\\QUnit\\qunit.js');
if (window.QUnit != undefined)
 console.log('goodness; injectJs seems to have worked');

phantom.exit();
This prints:
no QUnit yet!
goodness; injectJs seems to have worked
So far so good!!

So, that means we should be able to setup and run a test, right? Something like this:
test("This test should fail", function() {
  console.log('the test is running!');
  ok( true, "this test is fine" );
  var value = "hello";
  equals( "hello", value, "We expect value to be hello" );
  equals( "duck", value, "We expect value to be duck" );
});

test("This test should pass", function() {
  console.log('the test is running!');
  ok( true, "this test is fine" );
  var value = "hello";
  equals( "hello", value, "We expect value to be hello" );
  equals( "duck", value, "We expect value to be duck" );
});
Well ... sadly this part didn't "just work". QUnit tries to execute the test queue on timers, and despite PhantomJS supporting timers they just never seemed to execute. Furthermore, QUnit's default feedback is via DOM modifications that are rather unhelpful to the PhantomJS runner. My first draft was to modify QUnit by adding a function that directly executed the test queue, inline, without using timers. This worked, but it required modifying QUnit source, which I specifically wished to avoid.

Luckily, something very much like that "run the tests directly" change works just fine from outside QUnit as well. The key is that we will:
  1. Track test pass/fail count via the QUnit.testDone callback (http://docs.jquery.com/Qunit#Integration_into_Browser_Automation_Tools)
    1. We need our own pass/fail counters as QUnit tells us how many assertions passed/failed rather than how many tests passed/failed.
  2. Track whether or not the test run is done overall via the QUnit.done callback (http://docs.jquery.com/Qunit#Integration_into_Browser_Automation_Tools)
  3. Directly execute the QUnit test queue from our own code
    1. hack but the point here is to see if we can make this work at all
  4. Split tests into their own file
    1. This facilitates using an Ant task to run a bunch of different test sets; eg using apply to pick up all .js test files by naming convention or location convention
  5. Return the count of failures as our PhantomJS exit code
    1. This facilitates setting an Ant task to failonerror to detect unit test failures
So, without further ado, error handling, namespaces/packages, or any other cleanup, here is a version that works in a manner very near to the desired final result:
try4.js
function importJs(scriptName) {
 console.log('Importing ' + scriptName);
 phantom.injectJs(scriptName);
}

console.log('starting...');

//Arg1 should be QUnit
importJs(phantom.args[0]);

//Arg2 should be user tests
var usrTestScript = phantom.args[1];
importJs(usrTestScript);

//Run QUnit
var testsPassed = 0;
var testsFailed = 0;

//extend copied from QUnit.js
function extend(a, b) {
 for ( var prop in b ) {
  if ( b[prop] === undefined ) {
   delete a[prop];
  } else {
   a[prop] = b[prop];
  }
 }

 return a;
}

QUnit.begin({});

// Initialize the config, saving the execution queue
var oldconfig = extend({}, QUnit.config);
QUnit.init();
extend(QUnit.config, oldconfig);

QUnit.testDone = function(t) {
 if (0 === t.failed) 
  testsPassed++;
 else
  testsFailed++;
  
 console.log(t.name + ' completed: ' + (0 === t.failed ? 'pass' : 'FAIL'))
}

var running = true;
QUnit.done = function(i) {
 console.log(testsPassed + ' of ' + (testsPassed + testsFailed) + ' tests successful');
 console.log('TEST RUN COMPLETED (' + usrTestScript + '): ' + (0 === testsFailed ? 'SUCCESS' : 'FAIL')); 
 running = false;
}

//Instead of QUnit.start(); just directly exec; the timer stuff seems to invariably screw us up and we don't need it
QUnit.config.semaphore = 0;
while( QUnit.config.queue.length )
 QUnit.config.queue.shift()();

//wait for completion
var ct = 0;
while ( running ) {
 if (ct++ % 1000000 == 0) {
  console.log('queue is at ' + QUnit.config.queue.length);
 }
 if (!QUnit.config.queue.length) {
  QUnit.done();
 }
}

//exit code is # of failed tests; this facilitates Ant failonerror. Alternately, 1 if testsFailed > 0.
phantom.exit(testsFailed);

try4-tests.js
test("This test should fail", function() {
  ok( true, "this test is fine" );
  var value = "hello";
  equals( "hello", value, "We expect value to be hello" );
  equals( "duck", value, "We expect value to be duck" );
});

test("This test should pass", function() {
  equals( "hello", "hello", "We expect value to be hello" );
});

This runs as follows:
>phantomjs.exe try4.js qunit.js try4-tests.js
starting...
Importing qunit\qunit.js
Importing javascript\try4-tests.js
This test should fail completed: FAIL
This test should pass completed: pass
queue is at 0
1 of 2 tests successful
TEST RUN COMPLETED (try4-tests.js): FAIL

Note that we have not modified qunit.js, and we have split our tests into their own file. This allows us to easily set the whole thing up to run from Ant:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<project default="js-tests4"> 
 <target name="js-tests4">
  <property name="phantomjs.exe.file" value="phantomjs.exe" />
  <property name="qunit.js.file" location="path/to/qunit.js" />

  <apply executable="${phantomjs.exe.file}" failonerror="true">
   <arg value="path/to/try4.js"/>
   <arg value="${qunit.js.file}" />
   <srcfile/>
   
   <fileset dir="path/to/tests">
    <include name="try4-tests.js" />
   </fileset>
  </apply>
 </target>
</project>

It even works when run this way:
js-tests4:
    [apply] starting...
    [apply] Importing path\to\qunit.js
    [apply] Importing path\to\try4-tests.js
    [apply] This test should fail completed: FAIL
    [apply] This test should pass completed: pass
    [apply] queue is at 0
    [apply] 1 of 2 tests successful
    [apply] TEST RUN COMPLETED (try4-tests.js): FAIL

BUILD FAILED

Note that Ant has detected that a js unit test failed and failed the build, just as we intended.

This leaves us with a proof of concept demonstrating that using PhantomJS to run QUnit-based Javascript tests from a command line build is fundamentally possible. Doubtless if/when we try to use it "for real" additional problems will emerge ;)

Tuesday, June 14, 2011

Beginning Scala: Building a project with Maven and Eclipse

Let us suppose that after many years languishing in the dark corridors of Java, frantically scrambling to avoid the various framework ghouls, one finally emerged into the light and saw a Scala frolicking in the sun on a much greener grassy field. The terse little Scala looks inviting to the verbose, ragged, framework-ridden, somewhat aged Java developer ... but how to start? Well, luckily there are a few nice introductions (Scala for Java Refugees, The busy Java developer's guide to Scala) out there, and the staircase book (by the language creator, no less) is a must if one is serious about learning the language.

Introductions are all well and good, but as a Java developer I want some first-class tools and I want to run some code! Until recently first-class tools were seriously lacking. Luckily, in recent months even the language's creator has acknowledged the tooling gap and taken aggressive steps to correct it.

At the time of writing I use Eclipse 3.6.2 Helios (Java Developer edition) with Scala-IDE 2.0.0.somethingOrOther-beta. For the sake of this example I will assume we wish to build the project using Maven. First step, download the tools:
  1. Eclipse Helios
    1. At the time of writing I favor the Java developer edition as it is somewhat less bloated than the EE version; http://www.eclipse.org/downloads/packages/eclipse-ide-java-developers/heliossr2
  2. Scala IDE 2
    1. The 2.0 release is a major modification and the first solid Eclipse IDE plugin for Scala; the 1.x version was prone to all sorts of "exciting" behaviors
    2. Add to Eclipse using the update site from link on the front page of http://www.scala-ide.org/
  3. m2eclipse
    1. Add to Eclipse using the update site from http://m2eclipse.sonatype.org/installing-m2eclipse.html
  4. Maven 3
    1. Download from http://maven.apache.org/download.html
    2. Make sure that running mvn -v in a command prompt prints the expected version
Now we have the tools, let's set up a project! First up, create a directory for your project and in that directory create a file called pom.xml similar to:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.blogspot.whileonefork</groupId>
  <artifactId>mvn-scala-trial1</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>Sample Project for Blog</name>
  <url>http://com.blogspot.whileonefork</url>
</project>

Next, create a directory named src containing a directory named main containing a directory named scala. Your project should now contain the following:
mvn-scala-project
|--pom.xml
`--src
   `--main
      `--scala

In the src/main/scala directory, create a new file HelloWorldJustLikeJava.scala with the simplest hello world implementation possible:
object HelloWorldJustLikeJava {
  def main(args:Array[String]) = {
    println("Hello, World");
  }
}

This, while arguably not the absolute simplest Scala hello world, is the definition closest to the one you would write in Java. Now to get it compiling in Maven. If you open a command prompt in the mvn-scala-project directory and run mvn clean install it will succeed but you will get a warning that no content was marked for inclusion; this occurs because by default Maven doesn't include the plugin necessary to build Scala code.

To compile Scala files we must update our pom.xml; we'll tell Maven what plugin to use to build Scala files and we'll advise it of the existence of a URL where it can find the plugin in question. Our updated pom.xml should look like this:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.blogspot.whileonefork</groupId>
  <artifactId>mvn-scala-trial1</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>Sample Project for Blog</name>
  <url>http://com.blogspot.whileonefork</url>
  
  <!-- Notify Maven it can download things from here -->
  <repositories>
    <repository>
      <id>scala-tools.org</id>
      <name>Scala-tools Maven2 Repository</name>
      <url>http://scala-tools.org/repo-releases</url>
    </repository>
  </repositories>  
  
  <dependencies>
    <!-- Scala version is very important. Luckily the plugin warns you if you don't specify: 
        [WARNING] you don't define org.scala-lang:scala-library as a dependency of the project -->
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>2.9.0-1</version>
    </dependency>
  </dependencies>
  
  <build>  
    <!-- add the maven-scala-plugin to the toolchain -->
    <plugins>
      <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
        <version>2.14.2</version>
      </plugin>
    </plugins>
  </build>
</project>

If you run mvn scala:compile in the mvn-scala-project directory the project should build successfully, and in /target/classes you should find two .class files: HelloWorldJustLikeJava$.class and HelloWorldJustLikeJava.class. Scala (usually) compiles to .class files that run on the JVM, just like Java.

All the usual Java tools can be used on these .class files; for example, if we run javap on HelloWorldJustLikeJava$ we can see what the Scala source was compiled to; in this case, a fairly predictable Java-style class:
public final class HelloWorldJustLikeJava$ extends java.lang.Object implements scala.ScalaObject{
    public static final HelloWorldJustLikeJava$ MODULE$;
    public static {};
    public void main(java.lang.String[]);
}
Digging a little deeper, using javap -verbose, we can see that our HelloWorldJustLikeJava class invokes HelloWorldJustLikeJava$:
public static final void main(java.lang.String[]);
  Code:
   Stack=2, Locals=1, Args_size=1
   0:   getstatic       #11; //Field HelloWorldJustLikeJava$.MODULE$:LHelloWorldJustLikeJava$;
   3:   aload_0
   4:   invokevirtual   #13; //Method HelloWorldJustLikeJava$.main:([Ljava/lang/String;)V
   7:   return
HelloWorldJustLikeJava$ provides a simple implementation of main and some other odds and ends:

public static final HelloWorldJustLikeJava$ MODULE$;

public static {};
  Code:
   Stack=1, Locals=0, Args_size=0
   0:   new     #9; //class HelloWorldJustLikeJava$
   3:   invokespecial   #12; //Method "<init>":()V
   6:   return

public void main(java.lang.String[]);
  Code:
   Stack=2, Locals=2, Args_size=2
   0:   getstatic       #19; //Field scala/Predef$.MODULE$:Lscala/Predef$;
   3:   ldc     #22; //String Hello, World
   5:   invokevirtual   #26; //Method scala/Predef$.println:(Ljava/lang/Object;)V
   8:   return
  LineNumberTable:
   line 3: 0

  LocalVariableTable:
   Start  Length  Slot  Name   Signature
   0      9      0    this       LHelloWorldJustLikeJava$;
   0      9      1    args       [Ljava/lang/String;

The practice of outputting additional classes beyond what you directly coded is VERY common in Scala. A class using closures and other features of Scala will often output an exciting whack of $whatever classes.

Anyway, at this point the objective of making our project compile with Maven is achieved; we can now move on to getting it set up for editing in Eclipse. In the mvn-scala-project directory (the same one as pom.xml), create a .classpath file and a .project file:

.classpath
<?xml version="1.0" encoding="UTF-8"?>
<classpath>
  <classpathentry kind="src" path="src/main/scala"/>
 <classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER"/>
 <classpathentry kind="con" path="org.maven.ide.eclipse.MAVEN2_CLASSPATH_CONTAINER"/>
 <classpathentry kind="con" path="org.scala-ide.sdt.launching.SCALA_CONTAINER"/>
 <classpathentry kind="output" path="target/classes"/>
</classpath>


.project
<?xml version="1.0" encoding="UTF-8"?>
<projectDescription>
 <name>mvn-scala-project</name>
 <comment></comment>
 <projects>
 </projects>
 <buildSpec>
  <buildCommand>
   <name>org.scala-ide.sdt.core.scalabuilder</name>
   <arguments>
   </arguments>
  </buildCommand>
  <buildCommand>
   <name>org.maven.ide.eclipse.maven2Builder</name>
   <arguments>
   </arguments>
  </buildCommand>
 </buildSpec>
 <natures>
  <nature>org.maven.ide.eclipse.maven2Nature</nature>
  <nature>org.scala-ide.sdt.core.scalanature</nature>
  <nature>org.eclipse.jdt.core.javanature</nature>
 </natures>
</projectDescription>

At this point your project should contain the following:
mvn-scala-project
|--.classpath
|--.project
|--pom.xml
|--src
|  `--main
|     `--scala
|        `--HelloWorldJustLikeJava.scala
`--target
   `--classes
      |--HelloWorldJustLikeJava$.class
      `--HelloWorldJustLikeJava.class
If you happened to run mvn clean the target directory may be absent.

In Package Explorer in Eclipse your project should look similar to:


And now we can edit our Scala in Eclipse and compile with Maven, and can therefore easily load our project into just about any continuous integration server. Hooray!

EDIT: Assuming you are using m2eclipse, you will want to make sure your Eclipse IDE boots using a JDK VM; this is accomplished by editing eclipse.ini and adding a -vm argument. Bumping memory up a bit usually helps too; the file I am currently using on a Windows dev box follows:
-startup
plugins/org.eclipse.equinox.launcher_1.1.1.R36x_v20101122_1400.jar
--launcher.library
plugins/org.eclipse.equinox.launcher.win32.win32.x86_1.1.2.R36x_v20101222
-showsplash
org.eclipse.platform
--launcher.XXMaxPermSize
256m
-vm
C:/Program Files/Java/jdk1.6.0_07/bin/javaw.exe
--launcher.defaultAction
openFile
-product
org.eclipse.epp.package.java.product
--launcher.defaultAction
openFile
--launcher.XXMaxPermSize
256M
-vmargs
-Dosgi.requiredJavaVersion=1.5
-Xms40m
-Xmx768m
EDIT2: If you want the project to build a bit more gracefully in Eclipse, consider binding Scala compilation so that it runs by default (i.e. when you run mvn clean install rather than the somewhat novel mvn scala:compile). Modify the plugin element to add the execution:
<build>  
    <!-- add the maven-scala-plugin to the toolchain -->
    <plugins>
      <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
        <version>2.15.2</version>
        <executions>
            <execution>
                <goals><goal>compile</goal></goals>
            </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

Friday, June 3, 2011

When the UI designer doesn't use the UI

Suppose you wanted to provide a basic interface where users could set training goals, stuff like "Run 50km in June" perhaps. This seems semi-reasonable, right? Well, at least as long as nobody ever wants to set a goal for two weeks.


So surely if I want a goal for one month I can just set Calendar month, pick a day in the month (or maybe it'll even flip to be a month dropdown?), and go, right?

Not so fast bucko, UI has got to be validated and you jackasses aren't going to start putting goals for the past into our system!
Not to worry though, the "calendar month" relative to June 3rd is a slightly weird way to put it but it does make sense. Or does it?
So um ... I can't enter the first because it's in the past, and I must pick the first day of a calendar month ... so it's impossible to specify a goal for June? And you don't even give me an arbitrary start/end date option so I could at least set June 3rd-30th?

Clearly the designer of this never had to actually use it! Arglefarg.

Tuesday, May 31, 2011

The magic of getOrElse

For the last month or so I've been coding in Scala. From time to time I use Options; they seem pretty handy; it's much nicer to get an Option[T] back than a null. I semi-often find myself writing things like:

//where f is a function returning Option[Something]
   val t = f(args) getOrElse 3

Recently I realized (classic lightbulb moment :) that getOrElse actually takes its argument by name, as a lazily evaluated expression rather than an already-computed value:

def getOrElse [B >: A] (default: ⇒ B): B
  //Returns the option's value if the option is nonempty, otherwise return the result of evaluating default.

Why does this matter? It means that the code passed to getOrElse doesn't run at all (default is not evaluated) if the Option is defined! For example, this code:

object GetOrElseTest {
 def main(args : Array[String]) : Unit = {
  def a() = Some(3)
  def b() = None
  
  val aV = a getOrElse { println("nothing from a :("); 1 }
  val bV = b getOrElse { println("nothing from b :("); 1 }
  
  println("aV=%d, bV=%d".format(aV, bV)) 
  
  0
 }
}

Will print this output (note that the println for getting a, which is defined, never runs):

nothing from b :(
aV=3, bV=1

That's awesome and really hard to accomplish in Java. Imagine if log4j could do this: things like log.debug("a" + objectThatIsExpensiveToToString.toString()) would have no runtime cost when debug isn't enabled, because the expression that builds the message would never be evaluated at all.
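As a rough illustration, here is a minimal sketch in Scala (none of this is a real log4j API; LazyLogger, debugEnabled and the rest are names made up for this example) of the same by-name trick applied to logging:

// Minimal sketch of a logger whose message argument is passed by name.
class LazyLogger(debugEnabled: Boolean) {
  // 'message' is by-name (=> String), so the expression that builds it is
  // only evaluated if we actually decide to log.
  def debug(message: => String): Unit =
    if (debugEnabled) println("DEBUG: " + message)
}

object LazyLoggerDemo {
  def main(args: Array[String]): Unit = {
    val log = new LazyLogger(debugEnabled = false)
    // The expensive message-building expression below never runs, because debug is off.
    log.debug("state=" + expensiveToString())
  }

  def expensiveToString(): String = {
    println("building an expensive message...") // never printed when debug is off
    "lots of detail"
  }
}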

The new uniforms are pretty snappy eh first officer!

Monday, May 30, 2011

How to print Ivy's cachepath one artifact per line with Ant

Many moons ago one Andrew Beacock shared how to use Ant's pathconvert task to dump out a classpath in a human-readable fashion (see http://blog.andrewbeacock.com/2005/08/pretty-printing-java-classpaths-using.html). The amount of Ant pain this gem saved me back in the day is ... well a lot.

Fast forward to 2011 and we're using Ant with Ivy as our dependency manager. We use Ivy's retrieve Ant task to get our artifacts but how do we print out a simple list of what Ivy got for us? Well, it turns out cachepath can get this for us, optionally filtered if we see fit:

<!-- after a resolve; in our case this target depends on our resolving target -->
    <ivy:cachepath pathid="ivy.cachepath" settingsRef="ivy.settings" />
    <pathconvert pathsep="${line.separator}  "
                 property="dependency.list"
                 refid="ivy.cachepath"/>
    <echo>DEPENDENCY LIST</echo>
    <echo>  ${dependency.list}</echo>

This will print something along the lines of:

     [echo]   C:\...\.ivy2\cache\org.slf4j\slf4j-api\jars\slf4j-api-1.6.1.jar
     [echo]   C:\...\.ivy2\cache\org.slf4j\jcl-over-slf4j\jars\jcl-over-slf4j-1.6.1.jar
     [echo]   C:\...\.ivy2\cache\org.slf4j\log4j-over-slf4j\jars\log4j-over-slf4j-1.6.1.jar
     [echo]   C:\...\.ivy2\cache\ch.qos.logback\logback-classic\jars\logback-classic-0.9.28.jar
     [echo]   C:\...\.ivy2\cache\ch.qos.logback\logback-core\jars\logback-core-0.9.28.jar

Once in a while this is a lifesaver, as it makes it easy to see a simple list of our dependencies. Since we print it out line-by-line it's nice and easy to grep through and find specific things, like what version of the components in such-and-such a group we are getting.

Thursday, April 21, 2011

Deceiving yourself in Scala

I found a cool way to trip myself up in Scala today. This is what I meant to write:
private val lockByKey = new java.util.concurrent.ConcurrentHashMap[Any, Object]
...
lockByKey.get(key)
This is what I actually typed:
private def lockByKey = new java.util.concurrent.ConcurrentHashMap[Any, Object]
...
lockByKey.get(key)
Because I put 'def lockByKey' instead of 'val lockByKey' I created a function that builds a brand-new ConcurrentHashMap every time it is called, instead of a single shared instance. Scala lets you call a no-argument method without parentheses, so code like lockByKey.get(...) compiled fine, but since every access was creating a new (empty) map things didn't work entirely correctly.
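A standalone sketch (not the original code; ValVsDef is just an illustrative name) that makes the difference visible:

// val is evaluated once; def is re-evaluated on every reference.
object ValVsDef {
  val asVal = new java.util.concurrent.ConcurrentHashMap[Any, Object] // one instance, created once
  def asDef = new java.util.concurrent.ConcurrentHashMap[Any, Object] // a new instance on every call

  def main(args: Array[String]): Unit = {
    println(asVal eq asVal) // true  - the same instance both times
    println(asDef eq asDef) // false - each reference builds a fresh map
  }
}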

Thankfully unit tests of basic functionality (e.g. the ability to get the same value back twice) caught that something was awry and I was able to identify that I had cleverly created a factory function rather than a single shared instance. Thank god for unit tests ;)

Dynamic lock allocation

Suppose you had some code that manipulated things in the filesystem and you wanted to ensure only one thread worked on any given file at a time. Various parts of the code may construct a File object and then want to synchronize access to it. However, each part of the code may have a different instance of the File, so while the instances are equal (.equals would return true in Java) they are not the same instance (== would not return true in Java). Visually:
We could just declare a lock and synchronize on that, but that means that regardless of how many different resources we have only one thread can use any of them. It would be handy if we could say (pseudo-code):

val resource = new SomeResourceOrOther("""blah""")
   val lock = lockFor(resource )
   lock synchronized {
      //do stuff w/exclusive access to resource
   }

By acquiring a different lock for each resource we ensure optimal concurrency. This is somewhat similar to how structures like java.util.concurrent.ConcurrentHashMap partition their data into many segments and allocate a lock per segment, thus potentially allowing as many threads as there are segments to safely operate on the data structure concurrently. We can build such a structure fairly easily:

import scala.annotation.tailrec

object Locks {
 private val lockByKey = new java.util.concurrent.ConcurrentHashMap[Any, Object]

 @tailrec
 def lockFor(key: Any): Object = {
  val lock = lockByKey.get(key)
  if (lock == null) {
   //another thread may have inserted a lock first; putIfAbsent keeps whichever got there first
   lockByKey.putIfAbsent(key, new Object())
   lockFor(key)
  } else {
   lock
  }
 }
}

...

class ConsumerOfLocks {
 val resource = new String("Maybe it's a string?") //contrived means of ensuring a different instance!
 Locks.lockFor(resource) synchronized {
  //wherein I have exclusive access to resource
 }
}


Obviously the resource could be a file/directory or just about anything else. This structure can be particularly useful if you have multiple dynamic local resources you want to synchronize access to. If you move to a lock that is distributed in some manner then the same structure works across multiple nodes. For example, I have used a logically similar structure to ensure that only one node runs a given batch process.

This structure works best if the same set of resources are used consistently. If over time we are continually creating and locking new resources then we have to introduce additional complexity to ensure the lock map doesn't simply grow forever.
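One common way to sidestep that growth entirely is lock striping: keep a fixed pool of locks and hash each key onto one of them. Distinct keys may share a lock, which trades a bit of contention for bounded memory. A minimal sketch (StripedLocks and NumStripes are invented names, not part of the code above):

object StripedLocks {
 private val NumStripes = 64
 //fixed pool of lock objects; created once, never grows
 private val stripes = Array.fill(NumStripes)(new Object)

 //map the key's hash onto one of the stripes; unrelated keys that collide
 //share a lock, the price paid for a bounded structure
 def lockFor(key: Any): Object =
  stripes(math.abs(key.hashCode % NumStripes))
}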

Source samples are for Scala 2.8.1.

Thursday, March 10, 2011

100% - 30px: Mixed-unit calculation in Firefox 4

Firefox 4 (see release notes, or dev edition thereof) adds some pretty cool new features. I love the tab candy.

A fairly big one that almost slipped under my radar is that they shipped an initial implementation of CSS calc: "Support for -moz-calc has been added. This lets you specify  values as mathematical expressions." (see https://developer.mozilla.org/en/CSS/-moz-calc).

Support for calc is kind of a big deal as it allows you to do things that are annoying as hell to accomplish today. For example, perhaps you'd like to have three columns, left and right fixed width and center resized to fill available space:

[200px][100% - 400px, minimum 250px][200px]

The problematic bit is the middle column - it's actually fairly arcane to set up mixed-unit calculations presently. calc makes it easy:
<!DOCTYPE html> 
<html>
 <head>
  <title>-moz-calc</title>
  <style type="text/css">
   body { margin: 0px }
   #container {
    min-width: 650px;
    width: 100%;
   }
   #left {
    width: 200px;
    background-color: #F0F0F0; 
    float: left;
   }
   #center {
    min-width: 250px;
    width: -moz-calc(100% - 400px);
    background-color: #A0A0A0;
    float: left; 
   }
   #right {
    width: 200px;
    background-color: #F0F0F0;
    float: left; 
   }
  </style>
 </head>
 <body>
  <div id="container">
   <div id="left">left!</div>
   <div id="center">center!!</div>
   <div id="right">righter!</div>
  </div> 
 </body>
</html>
This snippet produces fixed-width left and right columns with a center column whose width is dynamically calculated using a mixture of length units (% and px). The container div prevents the left/center/right divs from dropping to the next line when the browser is sized down far enough that the min-width on the center kicks in (we don't want it infinitely narrow!). Images below show the result before/after resizing the browser window:


The ability to easily (the ease being the key here; it was possible to pull off similar results before but not easily) create page structures using mixed-unit calculated sizes is an AWESOME improvement. Now all we need is a full implementation in Firefox and the other browsers :) Oh and user adoption sufficient to justify use on significant sites.

Monday, March 7, 2011

C# using is the loan pattern in Scala

One of the remarkably handy little features of C# is the using statement (http://msdn.microsoft.com/en-us/library/yh598w02.aspx). I miss it so in Java! However, Scala will let me make my own, complete with some nice additions. using relies on IDisposable for cleanup; for example purposes we'll use Java's Closeable instead. Our goal is to be able to write code similar to:
...
using(new FileWriter(file)) { fw => fw append code }
...
Simple but pretty cool. Our FileWriter could be any Closeable. This code breaks down like this:
using is just a Scala function:
package com.active.scala.util

import java.io.Closeable

object Loans {
 def using[T <: Closeable, R](c: T)(action: T => R): R = {
  try {
   action(c)
  } finally {
   if (null != c) c.close
  }
 }
}
And again with some notes:
Note that we can return a value from the Scala using if we wish. This is a simple example of the loan pattern in Scala. We can apply this to any resource we would typically have to use in a try { ... } finally { cleanup my resource } structure in Java.
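As a quick illustration of that return value, here is a hedged sketch (the file path and the firstLine variable are made up for the example; any java.io Closeable works as the resource):

import java.io.{BufferedReader, FileReader}
import com.active.scala.util.Loans.using

object LoanDemo {
 def main(args: Array[String]): Unit = {
  //the reader is closed for us by 'using', and the value of the block
  //(the first line of the file) is returned to the caller
  val firstLine: String = using(new BufferedReader(new FileReader("/tmp/example.txt"))) {
   reader => reader.readLine()
  }
  println(firstLine)
 }
}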

How jQuery.ready works

For some time I have been meaning to poke a little at how jQuery works. Line references and code samples are from the dev version of jQuery 1.5.1 - http://code.jquery.com/jquery-1.5.1.js.

First up, the ready function (http://api.jquery.com/ready/).
<!DOCTYPE html> 
<html>
 <head>
  <title>How the heck does jQuery work?</title>
  <script type="text/javascript" src="http://code.jquery.com/jquery-1.5.1.js" ></script>
  <script type="text/javascript">
   $(onStart); //short-hand for $(document).ready(onStart);
   
   function onStart($) {
    alert('hello');
   }
  </script> 
 </head>
 <body>
  
 </body>
</html>

The $(onStart) call requests that our function be run when the DOM is available; normally if we just called onStart directly at this point the DOM probably wouldn't be loaded yet, and we would thus be unable to do cool stuff like attach event handlers or manipulate our page because none of the objects have been created. The stack at time of loading is very short: we are at the very bottom!

Somewhat later our onStart function will be run via callback from jQuery. Presumably jQuery runs the ready function(s) we submit in response to some event so it seems natural to wonder what the callstack looks like when our onStart function actually runs. It turns out we are called from a stack of jQuery functions, rooted on DOMContentLoaded():
The handler function for DOMContentLoaded is set up differently depending on what browser capabilities are available:

jQuery uses the presence of addEventListener as a signal that the DOMContentLoaded event is available (line 1052 above). In the past it has been fairly normal to use addEventListener as a way to detect a Mozilla-based browser (e.g. http://dean.edwards.name/weblog/2005/09/busted/). DOMContentLoaded is ideal for ready() as it fires after the DOM is loaded but before things like large images have necessarily downloaded (M$ demo of this: http://ie.microsoft.com/testdrive/HTML5/DOMContentLoaded/Default.html). The "load" event, on the other hand, may wait for resources to load. Unfortunately, IE8 and lower don't support addEventListener (or DOMContentLoaded), so the failover to attachEvent is required.

IE9 will support addEventListener (http://www.davidflanagan.com/2010/03/ie9-will-have-a.html) and DOMContentLoaded (http://blogs.msdn.com/b/ie/archive/2010/03/26/dom-level-3-events-support-in-ie9.aspx) ... Yay!

The code at line ~1050 (above) merely sets up a DOMContentLoaded function based on the capabilities that appear to be available; something still needs to actually call it. The callbacks are attached to events by bindReady, at line 429.

Multiple events (e.g. both DOMContentLoaded and load) are set up to call back to jQuery.ready to ensure that something actually runs it; exactly which ones are set up varies based on the browser capabilities available. Taking this type of nonsense away from most developers and having it "just work" across browsers is a large part of why jQuery is awesome - it takes FOREVER to figure out how to make this stuff work reliably.

When DOMContentLoaded ultimately fires it calls jQuery.ready. Ready does some work to confirm it's really time to run, double-checks the document is actually ready (line 407, again shielding us from browser weirdness), then executes the list of functions that have been registered to run on ready via resolveWith:

resolveWith does some gymnastics to try to avoid running repeatedly, then runs each of our callbacks:

That was kind of cool but it is sort of hard to understand merged into all the other jQuery code. Perhaps what is called for is to build our own :) Let us suppose we wanted our library to focus on running functions when something happened. We want usage to be something like this:
//SAMPLE USE OF OUR LIBRARY
when.DOMLoaded(onDomLoad);

function onDomLoad() {
 alert('hello');
}
For this to work we'll need to declare 'when' and expose DOMLoaded. Something basic should suffice:
//OUR LIBRARY
function When() {
 this.DOMLoaded = function(userFn) {
 }
}
var when = new When();
Spiffy! All we have to do now is implement it. Luckily jQuery already did, so we can rip off much of the relevant part of their implementation (simplified somewhat to try to keep the example clear):
//OUR LIBRARY
function When() {
 var domLoadedHandlers = [];
 var handleReady = function() {
  while (domLoadedHandlers.length > 0) {
   domLoadedHandlers.shift()();
  }
 }
   
 // Mozilla, Opera and webkit nightlies currently support this event
 if ( document.addEventListener ) {
  // Use the handy event callback
  document.addEventListener( "DOMContentLoaded", handleReady, false );

  // A fallback to window.onload, that will always work
  window.addEventListener( "load", handleReady, false ); 
 } //else IE event model is used ... etc
    
   
 this.DOMLoaded = function(userFn) {
  domLoadedHandlers.push(userFn);
 }
}
var when = new When();

//SAMPLE USE OF OUR LIBRARY
when.DOMLoaded(onDomLoad);

The implementation above has plenty of limitations: It only works on some browsers, it doesn't handle submission of a ready function after the loaded event has fired, errors from a ready function abort all processing, and so on.

At this point we can start to really appreciate the hard work jQuery is doing for us. Even seemingly trivial scripts - do something when the page is ready - are fairly ugly, with excitingly frequent browser-specific corner cases and hacks (IE-specific ready detection with special casing based on whether or not we are in a frame...). To make matters worse, just when you think you have it all correct a new version of the browser comes out and muddles everything up again. Thank god for jQuery!

Friday, February 25, 2011

Scala Permutations 2

My second attempt at permutations in Scala (see first attempt):

def perms3[T](L: List[T]):List[List[T]] = L match {
   //permutations of a single element is simply a list containing a list with that element
   case head :: Nil => List(List(head))
   case head :: tail => {         
     (List[List[T]]() /: L)((result, head) => (List[List[T]]() /: perms3(L-head))((accum, tailperm) => (head :: tailperm) :: accum) ::: result)     
   }
    case Nil => List[List[T]]()
  }  

I actually wound up with three versions (perms, perms2, perms3, all shown below) while trying to get rid of the mutable variable from the first version:

object Perm {
  def main(args : Array[String]) : Unit = {
   tryPerms ("A" :: Nil)
   tryPerms ("A" :: "B" :: Nil)
   tryPerms ("A" :: "B" :: "C" :: Nil)
  }
  
  def tryPerms(input: List[String]) {
   println ("Permutations of " + input.mkString(",") + ", v1")   
   for (perm <- perms (input))
     println (perm.mkString(","))  
   println ("Permutations of " + input.mkString(",") + ", v2")   
   for (perm <- perms2 (input))
     println (perm.mkString(","))  
   println ("Permutations of " + input.mkString(",") + ", v3")   
   for (perm <- perms3 (input))
     println (perm.mkString(","))       
   println
  }
  
  def perms[T](L: List[T]):List[List[T]] = L match {
   //permutations of a single element is simply a list containing a list with that element
   case head :: Nil => List(List(head))
   case head :: tail => {          
    var result = List[List[T]]()
    for (head <- L; tail <- perms(L-head))  
     result = (head :: tail) :: result
    result     
   }
    case Nil => List[List[T]]()
  }
  
  def perms2[T](L: List[T]):List[List[T]] = L match {
   //permutations of a single element is simply a list containing a list with that element
   case head :: Nil => List(List(head))
   case head :: tail => {          
     L.foldLeft(List[List[T]]())((result, current) => {
       perms2(L-current).foldLeft(List[List[T]]())((r, c) => {
        (current :: c) :: r        
       }) ::: result
     })
   }
    case Nil => List[List[T]]()
  }
  
  def perms3[T](L: List[T]):List[List[T]] = L match {
   //permutations of a single element is simply a list containing a list with that element
   case head :: Nil => List(List(head))
   case head :: tail => {         
     (List[List[T]]() /: L)((result, head) => (List[List[T]]() /: perms3(L-head))((accum, tailperm) => (head :: tailperm) :: accum) ::: result)     
   }
    case Nil => List[List[T]]()
  }  
}


Version two gets rid of the offensive mutable result variable from version one. Version three re-expresses version two more concisely (perhaps to the point of mild illegibility!).

Output is:
Permutations of A, v1
A
Permutations of A, v2
A
Permutations of A, v3
A

Permutations of A,B, v1
B,A
A,B
Permutations of A,B, v2
B,A
A,B
Permutations of A,B, v3
B,A
A,B

Permutations of A,B,C, v1
C,A,B
C,B,A
B,A,C
B,C,A
A,B,C
A,C,B
Permutations of A,B,C, v2
C,A,B
C,B,A
B,A,C
B,C,A
A,B,C
A,C,B
Permutations of A,B,C, v3
C,A,B
C,B,A
B,A,C
B,C,A
A,B,C
A,C,B
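For what it's worth, a for-comprehension with yield can also express this without any mutable accumulator. A rough sketch (perms4 is just an illustrative name; it uses diff to drop one occurrence of the chosen head and, like the Erlang version, treats the empty list as having a single empty permutation):

//sketch: permutations via for/yield, no mutable result variable
def perms4[T](l: List[T]): List[List[T]] = l match {
  case Nil => List(Nil)
  case _   => for (h <- l; t <- perms4(l diff List(h))) yield h :: t
}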

Thursday, February 24, 2011

Scala vs Erlang vs Java permutations

Erlang permutations is ... terse:
perms([]) -> [[]];
perms(L) -> [[H|T] || H <- L, T <- perms(L--[H])].
I have an entire post about this terse little gem, including the Java equivalent (rather less terse!); see Erlang: so awesome it makes my brain bleed.

It seemed rather reasonable to try this in Scala. It's a functional programming language with list comprehensions so surely it will be pretty terse! My first draft as a Scala newb:
def perms[T](L: List[T]):List[List[T]] = L match {
   //permutations of a single element is simply a list containing a list with that element
   case head :: Nil => List(List(head))
   case head :: tail => {          
    //not quite sure how to dodge use of the result var to accumulate results :(
    var result = List[List[T]]()
    for (head <- L; tail <- perms(L-head))  
     result = (head :: tail) :: result
    result
   }
    case Nil => List[List[T]]()
  }
Much nicer than the Java version but it doesn't hold a candle to Erlang for terse! It does however have the marked benefit of being vastly easier to read and comprehend ;) I wish I knew how to get rid of the result var :(

For reference, the Java equivalent (written to clearly express what is happening more than to be as lean as possible):
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
 
 
public class Perms {
 public static void main(String[] argv) {
  System.out.println(perms(Arrays.asList("A", "B", "C")));
 }
 
 public static List<List<String>> perms(List<String> input) {
  List<List<String>> output = new ArrayList<List<String>>();
  for (String H : input) {
   //L--[H]
   List<String> LminusH = new ArrayList<String>(input);
   LminusH.remove(H);   
   
   if (LminusH.isEmpty()) {
    //[H|T] when T is empty
    output.add(new ArrayList<String>(Arrays.asList(H)));
   } else {
    for (List<String> T : perms(LminusH)) {    
     //a list made up of [H|T]
     List<String> HT = new ArrayList<String>();
     HT.add(H);
     HT.addAll(T);
     output.add(HT);
    }
   }
  }
  return output;
 }
}