Wednesday, March 24, 2010

AWK'ward logs: Finding errors by type or time with bash

Just in case I ever need to do this again, a couple of notes about quickly summarizing error logs using Bash. Script examples all run on Fedora 6.

Error By Time
Recently I ran into a couple of log files (multiple nodes and log rolling) with more errors than I cared to examine one by one.

2010-03-10 06:32:57,982 ERROR CasLoginView:0 - CAS Login failedin btnLogin_Click. Root error = Could not send Message.
  System.Web.Services.Protocols.SoapHeaderException: Could not send Message.
  at System.Web.Services.Protocols.SoapHttpClientProtocol.ReadResponse(SoapClientMessage message, WebResponse response, Stream responseStream, Boolean asyncCall)
  at System.Web.Services.Protocols.SoapHttpClientProtocol.Invoke(String methodName, Object[] parameters)
  at ...etc...
2010-03-10 07:47:21,900 ERROR HBXTaggingControl:0 - Error initializing
  System.Collections.Generic.KeyNotFoundException: The given key was not present in the dictionary.
  at System.ThrowHelper.ThrowKeyNotFoundException()
  at System.Collections.Generic.Dictionary`2.get_Item(TKey key)
  at ActiveFoundation.ConsumerUI.PageTypeCore.HBXTaggingControl.PrepareHBXScript()
  at ActiveFoundation.ConsumerUI.PageTypeCore.HBXTaggingControl.OnInit(EventArgs e)
2010-03-10 07:50:23,759 ERROR HBXTaggingControl:0 - Error initializing
  System.Collections.Generic.KeyNotFoundException: The given key was not present in the dictionary.
  at System.ThrowHelper.ThrowKeyNotFoundException()
  at System.Collections.Generic.Dictionary`2.get_Item(TKey key)
  at ...etc...

In order to start making sense of it all I rapidly found I needed summaries: errors by time, by type, and so on.

The first and most basic need was to filter down to just the errors per hour for a specific day. The desired result is date, hour, number of errors with entries for each hour, not just the ones that have errors.

2003-03-10 00 0
2003-03-10 01 12

If we just wanted an entry for every hour with more than 0 errors we could just grep and count. To get entries for hours that have no entries in our log it is easiest to just loop through the days/hours desired. Looping through days and hours desired is easy enough in a Bash script.

The dates in the log are padded with leading zeros so to quickly get a similar effect in Bash we can use printf to produce the statement we want. Putting this together, we start by writing a simple loop script that builds up roughly the statement we want and stores it into a variable:

#!/bin/bash

wd=`pwd`
for day in {23..23}
do
   for hr in {0..23}
   do
      cmd=`printf "grep \"^2010-03-%02d %02d\" ${wd}/*.log  | grep ERROR | wc -l" ${day} ${hr}`
      echo ${cmd}
   done
done

Note that the cmd= line uses backticks (old school syntax for command substitution) to execute printf and capture the result in a variable. It would be more modern of us to use $(printf...) instead. If we run this script it will print the series of commands we wish to execute to find the error count for each hour:


[build@localhost dm-logs]$ ./blog.sh 
grep "^2010-03-23 00" /home/build/dm-logs/ConsumerUI1.log /home/build/dm-logs/ConsumerUI2.log /home/build/dm-logs/ConsumerUI3.log | grep ERROR | wc -l
grep "^2010-03-23 01" /home/build/dm-logs/ConsumerUI1.log /home/build/dm-logs/ConsumerUI2.log /home/build/dm-logs/ConsumerUI3.log | grep ERROR | wc -l
...etc...


So far so good, but we'd really much prefer to actually execute the command! It turns out we'd really like to print the day and hour, then the grep command output (eg the error count for that hour) so we'll modify the command a little bit while we're at it:

#!/bin/bash

wd=`pwd`
for day in {23..23}
do
   for hr in {0..23}
   do
      eval $(printf "echo \"2010-03-%02d %02d \" \$(grep \"^2010-03-%02d %02d\" ${wd}/*.log  | grep ERROR | wc -l)" ${day} ${hr} ${day} ${hr})
   done
done

What's going on here is that we're building up a command using printf, capturing the result by way of command substitution $(printf ...blah...), then executing the result using eval. The statement we run eval on will look like this:

echo "2010-03-23 23 " $(grep "^2010-03-23 23" /home/build/dm-logs/ConsumerUI1.log /home/build/dm-logs/ConsumerUI2.log /home/build/dm-logs/ConsumerUI3.log | grep ERROR | wc -l)

We're going to echo the date echo "2010-03-23 23 " followed by the output (courtesy of command substitution) of the command that counts up error entries for the date $(grep "^2010-03-23 23" /home/build/dm-logs/ConsumerUI1.log /home/build/dm-logs/ConsumerUI2.log /home/build/dm-logs/ConsumerUI3.log | grep ERROR | wc -l).

The result looks might look something like this:

2010-03-23 00  0
2010-03-23 01  0
2010-03-23 02  0
2010-03-23 03  0
2010-03-23 04  0
2010-03-23 05  0
2010-03-23 06  54
2010-03-23 07  304
2010-03-23 08  236
2010-03-23 09  320
2010-03-23 10  175
2010-03-23 11  199
2010-03-23 12  109
2010-03-23 13  0
2010-03-23 14  0
2010-03-23 15  0
2010-03-23 16  0
2010-03-23 17  0
2010-03-23 18  0
2010-03-23 19  0
2010-03-23 20  0
2010-03-23 21  0
2010-03-23 22  0
2010-03-23 23  0

From such a result we might infer that we had an exciting morning! This type of output works very nicely in Excel as a series so we can get a pretty picture out of it, graph and compare multiple series, and so on.

Error By Type
In addition to wanting to know what is failing by time, we may wish to know where the failures are coming from. Our log format gives this to us, so lets dig up the source components and count up how many errors each one is filing. The source is the bit highlighted in bold below:

2010-03-10 06:32:57,982 ERROR CasLoginView:0 - CAS Login failedin btnLogin_Click. Root error = Could not send Message.

So, first step is how do we pull out the source for each error. Our tokens are roughly space-separated so they'll fit perfectly into awk (book, brief introduction). For our purposes it is sufficient to yank the fourth token out, then throw away the duplicates:

grep ^2010-03-23 *.log | grep ERROR | awk '{ print $4 }' | sort | uniq -c

This will give us a nice list of error count by source, similar to:

.
      7 CasLoginView:0
     34 SomeOtherComponent:0

This is nice, but maybe what we'd like is to see the count of each exception message from each source. We can just loop through the sources, grab the errors from each source, drop the time part (by substituting "" for those fields using awk), then count it all up:

#!/bin/bash

wd=`pwd`
for source in `grep ^2010-03-23 *.log | grep ERROR | awk '{ print $4 }' | sort | uniq`
do
  grep "^2010-03-23 " *.log  | grep "ERROR ${source}" | awk '{v=$0; gsub($1" "$2" "$3" "$4" - ", "", v); print $4" "v; }' | sort | uniq -c
done

This will give us something like:

.
      3 LoginControl:0 Some clever failure message
     54 LoginControl:0 Another clever failure message
      1 LoginControl:0 A rare but still fascinating failure

Like most things in Bash there are nearly endless ways to make such little scripts faster, shorter, and so on.

Friday, March 5, 2010

The Joy of Blogger Syntax Highlighting

On a whim I decided to setup a blog where I could capture the odd technical thought, primarily to replace my trusty notebook (the paper kind) which turns out to be a bit prone to getting lost, coffee coated, and so on.

On encountering some problems with Spring and thread safety I decided to scratch a few notes about it. The post included a little bit of Java and XML so I thought it would be nice to have it highlighted neatly. Ninety minutes later I had tried several variants of setting up SyntaxHighlighter on blogger without using any of my own hosting. Most of these methods did not work or only partially worked. Ironically most of the methods I tried were posted on other peoples blogs. The closest one to working out of the box was this one. The result was the text formatted but none of the styles applied properly.

On inspection of a blog where the highlighter seemed to be working (http://alisteroz.blogspot.com/) it appeared that the styles were simply copied inline. So, the steps that ultimately worked for me are:

  1. Open the blogger Layout tab and click 'Edit HTML'
  2. Locate <head/> and add the core syntax highlighting script, plus the script(s) specific to any types of code you wish to post. In my case:
    <script language='javascript' src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shCore.js'></script>
    <script language='javascript' src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushCSharp.js'></script>
    <script language='javascript' src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushXml.js'></script>
    <script language='javascript' src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushSql.js'></script>
    <script language='javascript' src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushJava.js'></script>
    <script language='javascript' src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushCss.js'></script>
    <script language='javascript' src='http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/shBrushJScript.js'></script>
    
  3. Open http://syntaxhighlighter.googlecode.com/svn/trunk/Styles/SyntaxHighlighter.css and copy/paste the entire content into the layout just before the --></style>. This will typically be just above the spot where the .js file <script> tags were added
  4. Locate the </body> tag and add the following just before it
    <script language="javascript">
    dp.SyntaxHighlighter.BloggerMode();
    dp.SyntaxHighlighter.HighlightAll('code');
    </script>
    
  5. Place code, html escaped (many online tools, such as http://accessify.com/tools-and-wizards/developer-tools/quick-escape/default.php will do this for you) into pre blocks
    <pre class="Xml" name="code">
    &lt;script language='javascript'
    </pre>
    

All in all much less easy than I had expected!! Note that you must have the brush (shBrushlanguage.js) script imported from http://syntaxhighlighter.googlecode.com/svn/trunk/Scripts/ to get highlighting for the corresponding language.

Thread safety (or not) of Spring singletons using setter injection

Recently I have seen several instances, one of which would have caused very serious defects had it been released, of very experienced programmers creating code with thread safety issues due to lack of familiarity with the implications of the Spring singleton scope.

It is very common we create a class:

public class MySingleton {
    private int myVar;

    public void setMyVar(int myVar) {
      this.myVar = myVar;
    }

    public int getMyVar() {
      return myVar;
    }
  }

And configure it in Spring:

<bean class="com.example.spring.threading.MySingleton">
    <property name="myVar" value="7" />
  </bean>

So far so good... or is it? We've got a mutable instance variable on a class that is setup as a singleton. As soon as we configure it to be used by something that runs on multiple threads we've got a serious potential problem. Since the majority of our Spring applications are web applications they will very naturally tend to utilize some of these beans from request processing threads so we'll end up with concurrent access. Normally it is a non-issue: the property is only ever set through Spring so even though it is sitting there mutable we don't actually get any problems. However, this is not guaranteed by our code; it's essentially accidental that it works.

So, moving right along, our hapless developer - who is, btw, a Java expert though perhaps not a Spring expert yet - creates some instance variable and a method that can alter its value.

public class MySingleton {
    private int myVar;
    private List myList;

    public void setMyVar(int myVar) {
      this.myVar = myVar;
    }

    public int getMyVar() {
      return myVar;
    }
    
    public void doSomethingToMyList() {
      //...code that alters myList...
    }
  }

Looks fine, the type doesn't show any signs of needing to be threadsafe, right? This looks fine in unit testing, works fine when they try the software through a web browser, and then blows up (sometimes) when the software is used by concurrent users.

Since our singleton is probably really intended to be immutable (or would be if the developer stopped and thought it through) we could presumably set it up with final fields and an initialization constructor whose arguments are provided via Spring constructor injection. This is rather ugly in Spring due to the inability to provide named constructor-arg values (which is in turn due to lack of parameter name metadata in Java). Our developers strongly favor setter injection and the Spring docs on the subject seem to generally recommend this approach so for now we're left with nothing but code reviews between us and threading disaster.

This sucks. None of our current automated checks will notice. A code review should catch it but inevitably some will slip through. Possibly we can use annotations with @Resource on the field, as suggested in http://www.infoq.com/articles/spring-2.5-part-1, and provide only a getter. This is probably the most promising solution at present.