rain city digest

Sunday, May 25, 2014

A Database for node.js Projects

LevelDb is a key/value store written three or four years ago by engineers at Google. Since then it has been adopted by many projects including Google Chrome, Bitcoin, Apache ActiveMQ, and others. It was introduced to the node community as levelup/leveldown in July of 2012. Since then there have been over 13K downloads and many extensions.

When I first started experimenting with LevelDb I was pleased with its speed and ease of use, but also surprised at how easy it was to configure. Simply give it a file name and you are good to go. Or if you want an in-memory database, use memdown as the database backing store.

SimpleNodeDb is an extension to LevelDb that adds query, insert, update, delete and backup/restore capabilities. The keys are domain specific and data values are stored as objects with JSON encoding.

The project is published to the npm registry and can be installed using this:

npm install simple-node-db

Here is a more complete description as pulled from the published npm page:

A database implementation on top of levelup, leveldown, and memdown. SimpleNodeDb leverages the document store aspects of levelup to provide a data-model centric implementation.

Models are stored as JSON strings with domain-scoped keys. For example a user data model's key of '12345' would have an associated domain key of 'user:12345'. So querying for users as opposed to orders or inventory parts is as easy as including records where keys begin with 'user:'.
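
To make the domain-key idea concrete, here is a minimal sketch. It assumes a plain Map as a stand-in for the LevelDb store, and the helper names createDomainKey and queryDomain are hypothetical; they are not part of the SimpleNodeDb API.

```javascript
// Hypothetical helpers for illustration; not the SimpleNodeDb API.
function createDomainKey(domain, id) {
    return domain + ':' + id;
}

// Return the parsed models whose keys begin with the domain prefix.
function queryDomain(store, domain) {
    var prefix = domain + ':',
        results = [];

    store.forEach(function(json, key) {
        if (key.indexOf(prefix) === 0) {
            results.push(JSON.parse(json));
        }
    });

    return results;
}

// A plain Map stands in for the LevelDb store (an assumption).
var store = new Map();
store.set(createDomainKey('user', '12345'), JSON.stringify({ id: '12345', name: 'sam' }));
store.set(createDomainKey('order', '777'), JSON.stringify({ id: '777', total: 42 }));

var users = queryDomain(store, 'user');
console.log(users.length); // only the user record matches the 'user:' prefix
```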

Automatic model attributes include dateCreated, lastUpdated and version. The version attribute is used to enforce optimistic locking.
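
A rough sketch of how a version attribute can enforce optimistic locking follows. The updateModel helper and the Map-backed store are assumptions made for illustration, not the way SimpleNodeDb is actually implemented.

```javascript
// Hypothetical update helper; a Map stands in for the real store.
function updateModel(store, key, model) {
    var stored = store.has(key) ? JSON.parse(store.get(key)) : null;

    // Reject the write if the record changed since this copy was read.
    if (stored && stored.version !== model.version) {
        throw new Error('version mismatch for ' + key);
    }

    var now = new Date().toISOString();
    var saved = Object.assign({}, model, {
        dateCreated: stored ? stored.dateCreated : now,
        lastUpdated: now,
        version: stored ? stored.version + 1 : 0
    });

    store.set(key, JSON.stringify(saved));
    return saved;
}

var db = new Map();
var first = updateModel(db, 'user:12345', { name: 'a' }); // stored as version 0
var second = updateModel(db, 'user:12345', first);        // stored as version 1

var staleRejected = false;
try {
    updateModel(db, 'user:12345', first); // 'first' still carries version 0
} catch (err) {
    staleRejected = true;
}
```

The stale copy is refused because its version no longer matches the stored one; the caller would re-read the record and try again.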

Typically SimpleNodeDb is well suited for small to medium datasets (less than 100K rows) or data stores that don't require complex querying. It also provides robust caching when used as an in-memory data store. To support more than 100K rows you should probably create alternate indexing schemes or stick with redis, mongo, or a traditional SQL database.

The project includes a full suite of unit tests and examples. If you get a chance to try it out, let me know what you think. For more info on leveldb, check out this video from Richard Astbury.

Wednesday, January 22, 2014

Javascript Asynchronous Operations

Javascript is single threaded. That's good. But to take advantage of our ever increasing multi-processor world, we need to adopt functional solutions. That's where asynchronous operations come in.

Javascript applications that run in the browser or in node have asynchronous events occurring all the time. This is especially true in node where most actions return results through asynchronous callbacks rather than direct return values. This takes some time to get used to.

Consider the following recursive function/snippet:

var list = getList(); 
var nextListItem = function() {
    var item = list.pop();
 
    if (item) {
        // process the item...
        nextListItem();
    }
};
nextListItem();

The recursive list item processor will work fine if the list is a reasonable size. But if the list is huge, then one has to worry about the size of the call stack. With every iteration it grows by one call, so it may exceed the stack limit and fail (in a very bad way) if there are too many iterations.

Now consider this small change:

var list = getList();
var nextListItem = function() {
    var item = list.pop();
 
    if (item) {
        setTimeout(function() {
            // process the item...
            nextListItem();
        }, 0);
    }
};
nextListItem();

The stack overflow problem is now eliminated. Why? Because javascript's event system handles the recursion, not the call stack. When nextListItem runs, if 'item' is truthy (not null or undefined) then the timeout function is defined and pushed to the event queue, then the function exits and the call stack is clear. When the event queue runs its timed-out event, the item is processed and nextListItem is called again.

Again, the method is processed from start to finish without a direct recursive call, so the call-stack is again clear–regardless of the number of iterations.

Now consider this example:

var userList,
    cleverProgrammerBonus = 10000;

var userUpdateCallback = function(err) {

    if (err) return updateCompleteCallback( err );
    nextUser();
};

var nextUser = function() {
    var user = userList.pop();
    if (user) {
        user.pay += cleverProgrammerBonus;
        dao.updateUser( user, userUpdateCallback );
    } else {

        updateCompleteCallback();

    }
};

var userFetchCallback = function(err, list) {
    if (err) return updateCompleteCallback( err );
    userList = list;
    nextUser();
};

dao.fetchUsers( userFetchCallback );

As with most functional, asynchronous solutions, it's best to start at the bottom: the dao fetch of users. The fetch callback sets the user list then invokes the recursive nextUser() to get the iteration started. Then the user is updated asynchronously, and when it completes userUpdateCallback is invoked. This callback continues the recursion by invoking nextUser. This is repeated until the list is empty.

There isn't a need for setTimeout (or nextTick()) because the dao calls to update the user are asynchronous, always invoked from the event queue, not the call stack. So the call stack remains static no matter how many users are in the list.

It's also important to realize that the function dao.updateUser doesn't really start working until nextUser has exited. It's queued to do some work, but not at the direction of the call stack, just the event queue.
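
To try the pattern without a database, here is a self-contained sketch. The in-memory dao is hypothetical, and Node's setImmediate stands in for real asynchronous I/O; the giveBonus wrapper is just a convenient way to package the callbacks from the example above.

```javascript
// Hypothetical in-memory dao; setImmediate simulates asynchronous I/O.
var dao = {
    users: [],
    fetchUsers: function(callback) {
        var list = this.users.slice();
        setImmediate(function() { callback(null, list); });
    },
    updateUser: function(user, callback) {
        setImmediate(function() { callback(null); });
    }
};

function giveBonus(bonus, done) {
    var userList,
        count = 0;

    var nextUser = function() {
        var user = userList.pop();
        if (!user) return done(null, count);

        user.pay += bonus;
        count++;

        // the callback fires from the event queue, so the stack never grows
        dao.updateUser(user, function(err) {
            if (err) return done(err);
            nextUser();
        });
    };

    dao.fetchUsers(function(err, list) {
        if (err) return done(err);
        userList = list;
        nextUser();
    });
}

// a list large enough to overflow the stack with direct recursion
for (var i = 0; i < 100000; i++) {
    dao.users.push({ id: i, pay: 0 });
}

giveBonus(10000, function(err, count) {
    if (err) throw err;
    console.log('updated ' + count + ' users');
});
```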

Asynchronous event driven operations. Another reason to love coding in javascript.

Tuesday, January 7, 2014

Javascript Coding Standards

I love javascript. I love how it's flexible enough to be as expressive as you want. The power of closures used in the traditional way or simply as function pointers like in the old C days.

Javascript's flexible nature enables an infinite combination of styles, from prototypal and IIFE to classical, where functions are called "classes" even though no actual classes exist.

This flexibility can lead to endless discussions of "what's best" or "what's most efficient" and, worse, can lead to project code that resembles a big steamy bowl of pasta with brownish marinara. As a team leader, the last thing you want is to have to code walk dozens of coding styles. As a coder and team member, the last thing you want to do is traverse hundreds of lines of code that have bad indentation or horrible style.

So, as a manager, you need to adopt a strict coding standard and ruthlessly enforce it.

Coding Standard Objectives

Here is a short bullet list. The standards should...

match the team's current experience and talent,
closely follow at least one major published standard,
have justifications for each decision,
and include team consensus.

Team Experience

Javascript isn't new, but it's new to many programmers. If your team has years of java and c++ experience, the wide-open flexibility of javascript will come as a frustrating surprise. If your team has a ruby influence, then you have a leg up. Understanding closures, dynamic typing, etc will ease the transition to production ready javascript. If your team is primarily PHP based, then they know a bit about javascript but probably don't understand its full capability.

For the Java Team

So let's begin with the java programming team--what standards would suit them best? The choice is easy--use a classical implementation with factory generated components that implement IoC through parameter injection. Think of it this way: every file is a class defined by a named function variable (the function is still technically anonymous). Its injection is always through construction (think ruby opts), and it is always factory built. Files are organized as pseudo-package folders that may (or may not) have package namespaces.

The coding style is very similar to the java specs. K&R formatting, title cased classes, mixed case variables, no underscores. Complete separation of concerns, i.e. MVC plus services, delegates and factories. And, unit tests--either TDD or BDD for everything. And not just the normal tests, but also tests to find bugs that the compiler would normally find. (mocha is probably the best test platform for both client and server javascript today).
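
As a rough illustration of that style, here is a sketch of a "class" defined by a named function variable, injected through a constructor options object and built by a factory. UserService, ServiceFactory, and the stubbed log and dao are hypothetical names chosen for the example.

```javascript
// UserService.js: the "class" is a named function variable.
var UserService = function(options) {
    var log = options.log,
        dao = options.dao;

    this.findUser = function(id, callback) {
        log.info('find user: ' + id);
        dao.findById(id, callback);
    };
};

// ServiceFactory.js: builds components and injects their dependencies.
var ServiceFactory = function(options) {
    var userService = null;

    this.createUserService = function() {
        if (!userService) {
            userService = new UserService({ log: options.log, dao: options.dao });
        }
        return userService;
    };
};

// Stubbed dependencies, the sort of thing a unit test would inject.
var messages = [];
var factory = new ServiceFactory({
    log: { info: function(msg) { messages.push(msg); } },
    dao: { findById: function(id, callback) { callback(null, { id: id }); } }
});

var found = null;
factory.createUserService().findUser('12345', function(err, user) {
    found = user;
});
```

Because every dependency arrives through the constructor, a test can hand in stubs without touching the class itself.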

Published Standards

Douglas Crockford. JSLint. Javascript, the good parts. If you can only read one javascript book, please read "the good parts". Coding conclusions aren't drawn on what is best--just what the alternatives are. But, he does describe what he feels are some best practices for coding style, indentation, curly braces, etc. And, if you have the time, watch his video series on javascript.

For good published standards, start here:

Standards you probably want to stay away from:

if you are writing applications, then don't follow these standards; utility modules maybe, but...
M$ standards--known only to a few Seattle east-siders.

Our Standards Document

Here is a link to our current javascript coding standards. For the most part, this is for client applications, but most parts apply to node/server projects as well.

Conclusions

My best advice is to find a good published standard, modify it for your team's abilities, then stick with it. You will need to add rules as time goes on--because coders will always find differing ways to interpret the rules. So be prepared for additions--not so much changes, just clarifications.

Happy coding...

Thursday, September 23, 2010

Easy Mac Installs--MySQL & NetBeans

I just went through two very easy installs on my new mac; the first, MySQL 5.1.5. A few years ago, when I installed on my old mac it was the typical tar ball install that linux users are familiar with. Not bad, especially if you have a canned script to remind you of all the steps.

But now there are two dmg package files that do most of the work for you. And, it installs in /usr/local with a dynamic link to mysql--just like it's supposed to be. The only steps left were to remove the anonymous users, supply the root and admin passwords and reboot to start the service. Easy.

Next was NetBeans 6.9. I have been using 6.8 for a few projects, and have loved the support for groovy and especially grails, but I have held off upgrading while in full development mode. But now, with a new grails project in its infant stages, I decided to make the move.

Installation was easy. I did have to manually replace the 6.8 item with 6.9 on my dock, but other than that all went smoothly. It found my current projects and I was able to run unit tests from within the IDE. Once again, easy.

Happy coding...

Wednesday, September 22, 2010

Grails 1.3.4, Groovy 1.7 and NetBeans 6.8

After about a year of doing nothing but Adobe Action Script projects, I'm finally back to a java/groovy/grails project. So, I downloaded the latest grails and fired up my version of NetBeans (6.8) and generated some code. Here's what I discovered...

First, NetBeans is very groovy/grails friendly. I was able to generate the project inside the IDE. I then dropped to the command line to generate domain classes and tests. I was able to run the full test suite inside the IDE, but I still prefer the command line.

I noticed that the generated code doesn't include an ant script, but just as well. I'm so sick of the XML crap, and the grails command line does everything you need, so...

Next step is to generate controllers, and maybe a few views, but I usually use grails as a JSON or XML server with a Flex/Action Script UI. I'm also creating all the unit tests as I go and running them from inside the IDE and from the command line.

Next post will discuss the controllers and any problems related to getting the application into production. So far everything works as advertised--better than previous versions, and very productive. My new best friend...

Tuesday, August 10, 2010

UMLet UML Diagramming

Just had to comment on a very useful new(ish) project for editing UML documents called UMLet. No frills or bs* just a good productivity tool in the open source space. Human readable file structure to enable old timers like me to hack in attributes with vi.

Converts your drawings to png, jpg, eps, pdf, svg, etc from inside the UI or even from the command line (a real programmer's tool).

It comes in two flavors 1) eclipse plugin, and 2) standalone Swing application. They have a few skins that help ease the swing pain but who cares about the L&F, I just want a tool that works with me instead of against me.

So, thanks to the UMLet team for a great product!

Friday, April 16, 2010

HTML5, CSS3, Apple, Google and the W3C

About a year ago, I remember responding to a question from a colleague concerning Flex in comparison to HTML5 by saying that HTML5 was not ready yet. For the most part, I still hold that opinion but with additional qualifications.

We are currently bidding on a job that requires iPad as the target device and another that needs a private label application that will work on the desktop, iPhone and eventually Android and Blackberry. After logging many hours researching objective-c and HTML5/CSS3 my first thoughts were that all development needed to be done in objective-c. But, after viewing some of the other videos from Apple, MIT, and Stanford, and doing some small tests, I think that most applications for the iPad/iPhone can be implemented completely in HTML5/CSS3. So here is my revised/updated response...

If you have complete control over the target web browser (i.e., webkit/khtml based) then HTML5 and CSS3 can't be beat. The 'canvas' tag/object enables drawing anything that you can draw with flex/flash and also includes drawing text (not possible with flex). CSS3 enables all the transitions that are currently available in flex, including easing methods. WebSocket provides real-time, two-way communications with the server. So if your target includes Safari, iPhone, iPad, iPod/Touch, Google Chrome, Opera, Firefox, as well as the Android, Sprint and Blackberry browsers, the new technology is ready today.

The missing piece is IE, currently at 70% of desktop browser installations. An application that needs to run only in a standard client browser is still better off using Flex (95% penetration) until Microsoft complies with the latest W3C specs. But, the flash player will probably never make it to small devices (it's a hog), so if the requirement is to run on desktop and phone, especially with touch gestures, HTML5/CSS3 is probably a better choice. The down side is waiting for Microsoft to catch up with the fully developed world and hoping they will fully comply with the W3C.

Thursday, February 11, 2010

Action Script Threads for Report Generation

My requirement: to create a complex report that may take several seconds to run without blocking the user's interaction with the application. In java the solution is simple: create a thread and invoke start(). In action script, it's a bit more difficult.

Action Script doesn't support threads directly out of the box but it is run in a multi-threaded framework, so I knew there must be a solution. I took some time to research how other flex developers were implementing Action Script threads and concluded that the best solution was to simply use the Timer class to create new threads.

The timer class has an asynchronous event called TIMER_COMPLETE that is perfect for creating a bloated, heavy weight thread. But, if all you need is a separate thread to run while creating a huge report, this does the trick.

One of the practical considerations is that you will probably want a supporting class to encapsulate run parameters for each thread you create. For reports this works well because there is enough logic in each individual report to warrant its own class. And as long as the report class implements the timer start method then you are in good shape.

The simplest implementation looks like this:


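import flash.events.TimerEvent;
import flash.utils.Timer;

public function start():void {
    // a one-shot timer with a 1 ms delay; run() fires from the event loop
    var timer:Timer = new Timer(1, 1);
    timer.addEventListener(TimerEvent.TIMER_COMPLETE, run, false, 0, true);
    timer.start();
}

// override this method in a subclass to implement the run worker
protected function run(event:TimerEvent):void {
    // create the report or whatever you need...
}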

Implement the run handler and you are set. The class can encapsulate any run time parameters that you require including a common dispatcher to dispatch a 'COMPLETE' event when the process is finished. Easy.

This is an acceptable work-around for a language that does not support threads right out of the box. But for the rare occasion that you really need a thread to enable interaction while a heavy-weight, time consuming task is running, this really does the trick.

Happy coding...

Sunday, January 24, 2010

Transforming Average Programmers into Power Developers

The term software crisis, coined in the late 60s, described the gap between the software required by industry and the ability of software engineers to deliver correct, understandable, and verifiable programs. Back then the problem was blamed on the relatively high power of new computers and the generally low quality of programmers.

And now 40 years on, our industry continues to suffer the software crisis, but not because computing power has outstripped programmer ability. And our crisis cannot be blamed on a lack of programmers; the world has provided an over abundance of programmers, so that's not an issue. But, as an industry, we still fail to deliver on a consistent basis. Here are some tips designed to help alleviate our crisis and improve the quality of our programming community.

Learn UNIX. Ok, linux, mac osx, or cygwin; the main idea is to use the command line in a well thought out shell. Good programmers program all the time, just like musicians practice scales. Every time you enter a series of commands into the command line you are exercising your skills. When you learn the power of command pipelines to solve basic problems you are flexing your ability to solve problems pragmatically. Learn and use tar, jar, grep, tail, find, sed, awk, lex, yacc, etc. Use curl, svn, git, ant, make, ssh and scp. Find a good shell like zsh, bash, ksh or whatever and write scripts to help you organize your work. Write aliases that tame the command line. Write sed or egrep scripts to exercise your Regular Expression patterns. The important thing is that "Programmers program--always".
Expose yourself to as many languages as you can. You probably have one or two primary languages that you use on the job. Don't limit yourself to just those. Create a sandbox using subversion or git and create scripts in ruby, php, perl, javascript and python. Create java, action script, c++ sandboxes with either ant, gant, rake, or make files to build small applications. Learn new, or newly recognized languages like Scala, OCaml, Erlang, Haskell and Go.
Contribute to an open source project. There are a gazillion open source projects on the market now, most of them throwaway. Rather than creating your own, join one that has a chance of succeeding. Read the project's standards and rules and begin by submitting documentation updates. Search the bug list and submit proposed fixes, complete with unit tests (don't bother with projects that ignore tested designs). After a while, if you pass the initial stages, you will become a committer--your rite of passage into the open source community.
Read and re-read the APIs. The programming language that you primarily work in was created by several brilliant people that have devoted thousands of hours to create a useful platform. Don't ignore their work. Even if the platform solution may not exactly fit your situation, at a minimum, know what it does first before creating your own. Dive deep and read the platform source code.
Read the expert blogs. Posts to Object Mentor, InfoQ, Artima and experts like Martin Fowler, Ward Cunningham, Dave Thomas, Kent Beck, and a host of others. Find the blogs that are specific to your language. Look for web or pod casts.
Keep up with technology changes. IoC from Spring Source or Google, Agile and Extreme Programming, Rails Active Record, Hibernate, Aspect Oriented Objects, JUnit 4, Flex Unit 4, DSLs, Patterns, Idioms. Lots to keep up with, some hype, others real.
Write readable, clean code. The definition of clean code is 1) code that is understandable by someone other than the original author, 2) package, class, and method names that are accurate and fully describe their intended purpose, and 3) code that is fully tested with a suite of automated test tools. Always choose readability over writability. As Knuth says, "A program is essentially a work of literature".
Read. Most programmers are in the habit of purchasing lots of technical books. That's great, but don't let them sit on the shelf. Read and re-read them. Dog-ear or sticky note sections that you want to re-visit. Highlight important points that resonate with you. And, don't just read the cook-book tutorials. Your library should contain a healthy mix of literature from publishers like Prentice Hall and Addison Wesley to offset your APress, O'Reilly, Manning, and Wrox vocational studies.

I'm sure that there are more items, but these eight represent a good start. Happy coding!

Sunday, January 10, 2010

Maximizing FlexUnit4 with Mocks and Closures

After spending the past few months using FlexUnit4, which comes with the new version of Adobe Flex called Flash Builder, I have a few tips on how to maximize your testing using mocks and closures. This comes as second nature for java/groovy/ruby programmers, but may be new to some in the Flex/Action Script community, so here goes.

Mocks: The project that I am currently on uses Swiz for IoC wiring and configuration of controllers, delegates, components, services and loggers. But it's not good practice to use IoC for tests, so we use mocks: mocks that configure themselves, expose protected members, and reassign closures to methods that modify behavior for test purposes.

Example: The Shape factory: We have a factory class that generates various shapes such as lines, rectangles, ellipses, comment boxes, polylines, curves, etc. The method closure that creates specific shapes is called createShapeFromType. In the class constructor this closure is mapped to the factory's local private method createShape that creates and returns a shape object for the specified type. During some tests, this closure is mapped to createMockShape, which returns a mock shape that can be tested independent of the factory. Other tests still use the mock factory but with the default createShape method in place to test and verify that it's behaving as expected. Think of it like a second level of polymorphism without having to create complex and rigid hierarchies (class bloat reduction).

MockShapeFactory: This mock extends ShapeFactory to inherit all of its normal behaviour. So when I write a unit test against one of its methods, I know that I'm exercising the real thing. But the mock allows me to keep track of what's going on inside the class as the tests run. And, the mock does its own configuration, mimicking what IoC performs in production.
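
The closure swap is easier to see in code. What follows is a JavaScript analogue of the idea, not the project's ActionScript: the object shapes, the created list, and the useMockShapes method are assumptions made for the sketch.

```javascript
// JavaScript analogue of the ActionScript classes described above.
var ShapeFactory = function() {
    var factory = this;

    // default implementation: build a real shape
    var createShape = function(type) {
        return { type: type, mock: false };
    };

    // the replaceable closure, mapped in the constructor
    this.createShapeFromType = createShape;

    this.createShapes = function(types) {
        return types.map(function(type) {
            return factory.createShapeFromType(type);
        });
    };
};

var MockShapeFactory = function() {
    ShapeFactory.call(this); // inherit the real behaviour

    var mock = this;
    this.created = [];       // lets a test see what went on inside

    // remap the closure so the factory hands out mock shapes
    this.useMockShapes = function() {
        mock.createShapeFromType = function(type) {
            var shape = { type: type, mock: true };
            mock.created.push(shape);
            return shape;
        };
    };
};

var mockFactory = new MockShapeFactory();
var real = mockFactory.createShapes(['line']);  // default createShape in place
mockFactory.useMockShapes();
var mocked = mockFactory.createShapes(['line', 'ellipse']);
```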

MockShape: This mock has two primary purposes: 1) to configure itself for use (logs, delegates, etc.), and 2) to expose protected methods and variables. For example, the shape has mouse event listeners that can be invoked through the test to see if the object moves to a specific set of coordinates. Or if it's a non-scaling shape, you can verify that the target shape performed a move and resize rather than a simple scaleX/scaleY operation. You can also test that a vetoed request for focus does indeed ignore the request and remain unfocused. There are many possibilities.

Some of this is very basic for test junkies, but as I said earlier, the Flex community seems to be just catching on to test driven development. And, with the new Flex/FlashBuilder and FlexUnit4 there is no excuse to ignore testing any longer.

Monday, October 26, 2009

Flex 4 BitmapImage

Flex 4, and specifically the new spark framework, has a BitmapImage object in the spark.primitives package. Its associated mxml tag is <s:BitmapImage>. I found this after trying to add a Bitmap object to a VGroup only to get errors. So, after searching a bit, I found the BitmapImage and simply set the source property to my BitmapData. Very easy.

My code problem was to read a remote image then scale it to as close to a target width & height as possible without changing the original aspect ratio. From there, the user can rescale the image and drag a focus rectangle over the image to select the desired image section. Then, with the click of a button, the image is scaled and cropped to the optimum size. I used JPEGEncoder to write the data back out to a new image file.
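
The aspect-ratio arithmetic is small enough to sketch. This is a JavaScript illustration with a hypothetical fitWithin helper, not the original Action Script: take the smaller of the two scale factors so that neither dimension exceeds its target.

```javascript
// Hypothetical helper: scale to fit inside the target box, keeping aspect ratio.
function fitWithin(width, height, targetWidth, targetHeight) {
    var scale = Math.min(targetWidth / width, targetHeight / height);

    return {
        scale: scale,
        width: Math.round(width * scale),
        height: Math.round(height * scale)
    };
}

var fitted = fitWithin(1600, 1200, 400, 400);
console.log(fitted.width + ' x ' + fitted.height); // 400 x 300
```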

Tuesday, October 20, 2009

Flex 4/Flash Builder Beta

Adobe released Flex 4 Beta 2 a few weeks ago with many changes and improvements over Flex 3. The new lightweight spark framework has many advantages over the older mx components and separates skinning in an attempt to facilitate and advocate better separation of concerns and improved maintainability.

There are also some significant improvements to the Flash Builder IDE that take much better advantage of the eclipse framework. It was always a big mystery to me why Flex 2 and 3 Builders seemed to strip so many of the basic features out of eclipse when they created their IDE. Many of those features, although not all, are now included.

One basic but vital feature is the ability to create your own standard templates for AS3 classes and MXML applications and components. The approach follows the eclipse use of internal variables, like ${user}, ${date}, ${year}, etc.

Another feature is conditional breakpoints and watchpoints. There is also a way to do remote debugging.

So it looks like the new Flash Builder comes closer to providing the basic eclipse functionality that has been missing since Flex 1.x was introduced. I'll post later with other features that have been included while I work my way through the latest Max 2009 presentations and the on-line "Flex 4 in a week" training series. And, in subsequent posts, I will share what is offered by the new Spark platform.

Friday, September 4, 2009

Hotlinks External Page Viewer

I first heard about twitter in 2007 while listening to a Ruby On Rails podcast. It took a while, but I finally got on board, mainly as a result of the mandatory SEO stuff required for my current project. So, I'm now all a twitter.

Twitter's 140 character limit leads to a mandatory link for almost every post. Some links are huge, but with the help of tinyurl.com or bit.ly you can hash down your URL to a very manageable size. At the same time, the link reduction sites allow you to track the number of hits the link has received.

There are a couple of companies that have mashed up the URL reducers with the ability to display a personalized reference at the top of the target page. So, if I tweet about Hillary Duff's new utube video I would be able to display a header that includes a link to my site at the top of the page. Rather than use one of the off the shelf products, I decided to build my own, mainly because 1) it was trivial to implement, and 2) I (and my customer) wanted custom graphics displayed on the banner without the third party interference.

So, here is a sample of what I mean. The original tweet would look a bit like this: "check out Hillary Duff's new utube thingy here: http://bit.ly/30YbdB". When the tweet recipient clicks on the link, they see the video as described, but they also see the Rain City Software banner at the top of the page, complete with links to my web site and twitter account. Go ahead, give it a try.

As it turns out, this is very simple to do with an iframe to display the target URL. The banner is small and hopefully not offensive and can be closed by clicking on the 'X' icon. Or, the user can click on the logo to get to the Rain City site or click on the tweety birds to get to Rain City's twitter account. All sorts of possibilities (including incorporating swf's, but let's keep it simple for now). And, the clicks are tracked not only by bit.ly but by google analytics as well.

Creating the Hotlink: I made the hotlink creation easy with a small web based UI. The user simply enters a title and pastes in the link then clicks create (after selecting the correct server). The hotlink is created and can be copied to bit.ly, tinyurl or your favorite URL shortener. Here is what we get for the Hillary Duff link:

The user can test the link by clicking 'Test Link' to see if all is in order. If this looks ok, then copy the link to the shortener, make the tweet and that's it.

Adobe/Flex UI Code: As you can see from the code below, the code is quite simple. Click handlers for create and test, and the use of escape to get the title and target URL (hotlink) to work correctly as http parameters. I thought about using the open APIs provided by many URL shorteners, but decided to just let the user copy/paste the target URL. Here is the Flex xml code:
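
In JavaScript terms, the escaping step might look like the sketch below. The service URL and the parameter names title and link are placeholders assumed for illustration, and encodeURIComponent plays the role that escape plays in the Flex code.

```javascript
// Hypothetical service endpoint and parameter names.
var SERVICE_URL = 'http://example.com/hotlink';

function createHotlink(title, link) {
    // encode both values so '&', '?' and spaces survive as http parameters
    return SERVICE_URL +
        '?title=' + encodeURIComponent(title) +
        '&link=' + encodeURIComponent(link);
}

var hotlink = createHotlink('New video', 'http://example.com/watch?v=1&hd=true');
console.log(hotlink);
```

Without the encoding, the '&hd=true' in the target URL would be read as a third parameter of the hotlink itself.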

The Hotlink Service Code: I decided to implement the service code in ruby but php or java would have worked just as well. The objective is to pull the title and link from the URL parameters and use those inside the standard template code. The template code should probably be separated out, but it was easy enough to keep everything all in one file. The banner can be any combination of backgrounds, links, whatever. Anyway, here is the ruby code:

So that's it. Easy to use and all the tracking that an SEO could hope for. And, very customizable to suit the customer's requirements. Let me know if you want help setting up your own. I would be happy to help.

Thursday, August 27, 2009

Social Network Common Login in Flex

JanRain provides a free version of their industry-leading OpenID solution. RPX Basic enables websites to accelerate user registration and login success by allowing visitors to easily sign in with one of their existing third party accounts from AOL, Facebook, Google, MySpace, Yahoo! and more.

Their product includes a drop-in widget written in Javascript that does all the communications to their open id server and third party providers. But, for Flex applications (or Java Applets) they provide a set of API calls that do the job.

Here is what our custom login dialog looks like:

The user simply clicks the preferred provider, and the appropriate API call is made to RPX to open a new window. For example, if the user clicks the facebook icon, the following page is displayed.

If the user successfully logs into facebook, a token is returned to the Flex application. The next step is to use this token to access the user's profile, email, photo, etc. Here's what the user sees after they log in.

The same thing happens if the user logs in with any of the other providers. Subsequent logins don't require the approval, but simply log in without the intermediate page. So, if I'm already logged into facebook and have approved access from my Flex application, the click skips the facebook login and returns the token. Simple.

The Code: The code adheres to the MVC paradigm and is short and sweet. A single controller talks to the RPX API and a login dialog mxml handles the display and clicks. The controller publishes two events loginComplete and accessFailed. The dialog processes the provider click through the controller while listening for the response event.

Configuration: When you sign up with JanRain for RPX Basic, there are a few configurations that need to be made to get facebook and other third party authorization platforms to work. The process is simple and only takes a few minutes.

Testing was easy for all six of the providers that our application uses. And, there are more coming on line all the time. It's a great example of distributed processing that eliminates user resistance to signing up for your latest social network application.

Friday, July 17, 2009

Simple Digest to Test File Uniqueness

Here is a simple way to use the Groovy Digest class from the org.raincity.glib.crypto package. It's limited to reasonably small files, but could be used with chunked sections of files. In any case, here is the test script:


import org.raincity.glib.crypto.Digest

void testStringDigest() {
  def algorithm = 'SHA-512'
  // two inputs that differ by a single character
  def s1 = 'this is test 1'
  def s2 = 'this is test 2'

  def hash1 = new Digest().createHashString( s1, algorithm )
  def hash2 = new Digest().createHashString( s2, algorithm )

  println "h1 = ${hash1}"
  println "h2 = ${hash2}"

  // different content must produce different hashes
  assert hash1 != hash2
}

In practice, s1 stands in for the byte stream from the first file and s2 for the byte stream from the second. If the hashes match, the files can be treated as identical, i.e., not unique.

Digest's default algorithm is SHA-256, but SHA-512 is available in java > 1.4 and is much stronger. The 256 default was designed to match the SHA-256 limit for Adobe/Flex.
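
For files too large to hold in memory, the chunked approach mentioned above might look something like the following. This is only a sketch in plain Java using the JDK's java.security.MessageDigest, not the raincity Digest class; the class and method names (ChunkedDigestSketch, hexDigest, sameContent) and the 8 KB buffer size are my own assumptions for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of org.raincity.glib.crypto.
class ChunkedDigestSketch {

    // Feed the stream to the digest in 8 KB chunks so the whole
    // file never has to sit in memory at once.
    static String hexDigest(InputStream in, String algorithm) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                md.update(buffer, 0, n);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("unknown algorithm: " + algorithm, e);
        }
    }

    // Two files are treated as duplicates when their SHA-512 hashes match.
    static boolean sameContent(Path a, Path b) throws IOException {
        try (InputStream ia = Files.newInputStream(a);
             InputStream ib = Files.newInputStream(b)) {
            return hexDigest(ia, "SHA-512").equals(hexDigest(ib, "SHA-512"));
        }
    }
}
```

Calling sameContent with two file paths gives the same yes/no answer as comparing hash1 and hash2 in the test script, without the small-file limit.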

Thanks to Brad Rhoads for inspiring this post...

Thursday, June 4, 2009

Amazon S3 Access with JetS3t

Back in my ruby/rails days, I always accessed Amazon S3 from the command line. Now that I'm coding in groovy/grails, I thought I would search for a groovy solution. Well, I found one, but it's not ready for prime-time. But I did come across JetS3t, written in java with examples in groovy, so I thought I would give it a try.

After downloading, I opened the README and found that JetS3t comes with a nifty set of stand-alone and web based UI application/applets for accessing, browsing and uploading files. The application is called Cockpit and I was able to easily use it inside Firefox and as a standalone Java application.

Find out more about the JetS3t suite of applications and be sure to read Andrew Glover's tutorial.

Wednesday, April 22, 2009

Lame Flex IDE

Ok, Adobe Flex, or AS3 is an OK language. But their commercial Flex IDE sucks. Adobe uses (abuses) the eclipse framework for their commercial IDE. They charge big bucks for a somewhat visual editor that amounts to a bucket of shit.

After wrestling with it for over two years I'm tempted to go back to vi and the command line. Anyone else feel this way?

Monday, March 23, 2009

Parsing ISO 8601 Dates in Flex

Flex has many capabilities, but parsing dates is not one of them. As a member of the JSR-310 team I am focused on date formatting and parsing across many formats, so it comes as a bit of a surprise that Flex has very little support for this. I guess it's time to roll my own.

Unlike Java date formatters, Flex doesn't let you set the parse format to accept custom inputs. The formats must match one of the seven standard parse formats that Flex supports. Some third party examples try to use Date.parse() after massaging the input string, but they are a bit lame.

A good way to parse dates in Flex is to use RegExp to extract the numbers. Then the numbers need to be validated prior to assignment. First, let's look at a few code snippets to see how to pull the numbers out of a local date (no time zone offset).


// ISO 8601 date time as YYYY-MM-DDThh:mm:ss.SSS
var dateTimeArray:Array = inputDate.split('T')
var dateString:String = dateTimeArray[0]
var timeString:String = dateTimeArray[1]

// parse the date from YYYY-MM-DD
var pattern:RegExp = /(^\d{4})-(\d{2})-(\d{2}$)/
var result:Object = pattern.exec( dateString )
if (result == null)
  throw(new Error("invalid date format"))

var year:uint = new uint( result[1] )
var month:uint = new uint( result[2] )
var day:uint = new uint( result[3] )

if (!validateDate(year, month, day))
   throw(new Error("invalid date format"))

var date:Date = new Date(year, month - 1, day, 0, 0, 0, 0)

// now parse the time from HH:MM:SS.SSS
// escape the dot so it only matches a literal '.', and anchor the end of the string
pattern = /^(\d{2}):(\d{2}):(\d{2})\.(\d{3})$/
result = pattern.exec( timeString )
if (result == null)
   throw(new Error("invalid time format"))

var hour:uint = new uint( result[1] )
var minute:uint = new uint( result[2] )
var second:uint = new uint( result[3] )
var milli:uint = new uint( result[4] )

if (!validateTime(hour, minute, second, milli))
   throw(new Error("invalid time format"))

date.setHours(hour, minute, second, milli)

Lenient Dates and Times: It appears that in Flex world, March 32nd == April 1st. And April 2nd at 25 hours == April 3rd at 1am. Is this an April Fools' joke? No, it's just the lenient way that Flex handles date parameters. That's no problem until you start parsing; then you need to be strict.

Strict Dates: Let's say a vendor sends an electronic invoice to you with a payment due date of 2009-05-32. Does he want his money on June 1st? Probably not. The best response is to reply that his date is invalid and he must submit a valid date.

That's what being strict means. Unfortunately, Flex does not have a way of controlling strict dates, so you need to do this manually.

Validation: Without validation constraints a Flex date can range from 0100-01-01T00:00:00.000 to beyond the year 10,000. Years earlier than 100 are coerced into the 1900s. This is clearly not what we want. For parsing purposes there should be a range of minimum/maximum acceptable dates. For discussion purposes, I'll set the range to start at the recognized Gregorian start date of 1582-10-15T00:00:00.000 and end at an arbitrary future date of 2999-12-31T23:59:59.999.

Now that we have a date range, the years are easy: 1582..2999. Months in Flex follow the old C convention of zero indexing, so they range from 0..11, but we will use 1..12 for validation and then decrement the month prior to creating the Flex Date object.

Days are a bit trickier, but let's use 31 as the default, then 30 for Apr, Jun, Sep, Nov and 28/29 for Feb, after determining the leap year. Here is the standard calculation for leap year lifted from the white book:


var leap:Boolean = ((year % 4 == 0) && ( !(year % 100 == 0) || (year % 400 == 0)))

Hours, minutes, seconds and milliseconds are bound by 0..23, 0..59, 0..59, and 0..999. This ignores time zone changes and the occasional leap-second, but otherwise works fine.
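
The snippets above call validateDate and validateTime without showing them. Here is a rough sketch of what those checks could look like, written in Java rather than ActionScript so it is easy to run on its own. The class name and the MIN_YEAR/MAX_YEAR constants are my own placeholders, and the sketch only checks the year bounds, so it does not reject dates in 1582 that fall before the Gregorian start.

```java
// Hypothetical strict validators mirroring the rules described above.
class StrictDateSketch {
    static final int MIN_YEAR = 1582;  // assumed lower bound from the text
    static final int MAX_YEAR = 2999;  // arbitrary upper bound from the text

    // Divisible by 4, except centuries, unless also divisible by 400.
    static boolean isLeapYear(int year) {
        return (year % 4 == 0) && ((year % 100 != 0) || (year % 400 == 0));
    }

    // month is 1..12 here; subtract 1 only when building the Flex Date.
    static int daysInMonth(int year, int month) {
        switch (month) {
            case 4: case 6: case 9: case 11:
                return 30;
            case 2:
                return isLeapYear(year) ? 29 : 28;
            default:
                return 31;
        }
    }

    static boolean validateDate(int year, int month, int day) {
        if (year < MIN_YEAR || year > MAX_YEAR) return false;
        if (month < 1 || month > 12) return false;
        return day >= 1 && day <= daysInMonth(year, month);
    }

    // Ignores leap seconds, as noted in the text.
    static boolean validateTime(int hour, int minute, int second, int milli) {
        return hour >= 0 && hour <= 23
            && minute >= 0 && minute <= 59
            && second >= 0 && second <= 59
            && milli >= 0 && milli <= 999;
    }

    public static void main(String[] args) {
        System.out.println(validateDate(2009, 5, 31));  // true
        System.out.println(validateDate(2009, 5, 32));  // false
        System.out.println(validateTime(25, 0, 0, 0));  // false
    }
}
```

With checks like these in front of the Date constructor, a lenient value such as day 32 or hour 25 is rejected instead of silently rolling over.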

The Final Code: I created a demo application that allows you to plug in and parse dates and run the unit tests. The demo and source code are available here. I'm still working on a parser for time-zoned dates, but this is a good start.

Saturday, February 28, 2009

JSR-310 javax.time Periods

The proposed JSR-310 date/time API comes with many representations of date and time including Instant, Duration, LocalDate, LocalTime, LocalDateTime, OffsetDate, OffsetTime, OffsetDateTime, ZonedDateTime, MonthDay, YearMonth and others. One of the more interesting classes is Period. It represents a quantity of time that is not fixed to any point in time, just a quantity. This entry will discuss periods, how they are parsed, and examples of use.

Parsing: Periods are parsed from strings using formats that conform to the ISO-8601 duration format PnYnMnDTnHnMn.nS. Variations of this format are parsed to create Period objects. Parsing is done in PeriodParser, a standalone helper class that is easily accessed through the static method Period.parse(). Here are a few examples:


assert Period.parse("PT0S") == Period.ZERO
assert Period.parse("P1Y") == Period.years(1)
assert Period.parse("P10Y8M22DT3M") == Period.period(10, 8, 22, 3)
assert Period.parse("PT1M") == Period.minutes(1)
assert Period.parse("P-4Y") == Period.years(-4)

As you can see, the parsing scheme is robust and flexible. The main thing to keep in mind is that the Period object is more of a value container as opposed to a process unit. So 60 minutes won't directly translate to 1 hour, and 24 months do not directly equal 2 years. But this aside, there are many uses for the Period class.
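
The value-container behavior can be illustrated with the API as it was eventually released in Java 8 under java.time, which differs in places from the draft javax.time used in this entry. In the released version Period holds only years, months, and days, and time-based amounts such as PT1M are handled by Duration. The class name and the equalAfterNormalizing helper below are my own, so treat this as a sketch rather than the draft API's behavior.

```java
import java.time.Duration;
import java.time.Period;

// Sketch against the released java.time API (Java 8+), not the 2009 draft.
class PeriodValueSketch {

    // Compare two periods after rolling whole groups of 12 months into years.
    static boolean equalAfterNormalizing(Period a, Period b) {
        return a.normalized().equals(b.normalized());
    }

    public static void main(String[] args) {
        Period months = Period.ofMonths(24);
        Period years = Period.ofYears(2);

        // Period compares field by field, so 24 months is not "equal" to 2 years.
        System.out.println(months.equals(years));                  // false
        System.out.println(equalAfterNormalizing(months, years));  // true

        // Parsing produces the same value objects as the factory methods.
        System.out.println(Period.parse("P1Y").equals(Period.ofYears(1)));        // true
        System.out.println(Duration.parse("PT1M").equals(Duration.ofMinutes(1))); // true
    }
}
```

Normalizing explicitly, rather than having equals do it, keeps the stored values exactly as they were supplied, which matches the value-container idea.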

For example, let's say I want to periodically archive old temp files from a system. I could set the period to minutes, hours, days, whatever, then based on an arbitrary start time, begin the archive sweep. In the same moment, I could compute the next archive sweep by adding my pre-defined period object to the current time.

A more elaborate implementation would be to self-adjust the scheduled periods based on some criteria. Sticking with the archive sweep, let's say I set the initial period to 2 hours, or better yet, 120 minutes to give a finer resolution. Then, at the end of the sweep, I tally archived files. The smaller the number, the longer the period, on a sliding scale, up to 240 minutes. A large number would decrease the period to zero minutes, or a continual sweep, so I want to make sure that this is the very worst case scenario.

The Period class makes this easy to implement and maintain. To keep it simple, let's just use a linear equation. The number of minutes between sweeps reduces to period in minutes = mx + b, where b = 240 minutes (our maximum interval), x = the number of files archived, and m = the slope. Let's say the highest anticipated archive rate is 2 files per minute, so after 240 minutes we would have 480 files that needed to be archived. If that is our worst case, then m = -0.5 (delta y divided by delta x, or -240 / 480). So the curve looks like this:

Now each time I do an archive sweep, I tally the files and calculate the next sweep interval using:


maxMinutes = 240
slope = -0.5
periodInMinutes = slope * fileCount + maxMinutes

Or, as a single groovy closure:

def periodToNextSweep = { fileCount, slope = -0.5, maxMinutes = 240 ->
 // slope is a BigDecimal in groovy, so cast the result to whole minutes
 Period.minutes( (int)(slope * fileCount + maxMinutes) )
}

You might think there is a danger in allowing the returned period to be a negative number of minutes. But in all practicality this is acceptable because the objective is to determine the next instant when a sweep should occur. If this time is in the past, simply do it now. Of course you would design the slope to target the worst case, so a negative time should seldom if ever occur. And the good part is that the parameters are easy to modify to fit changing environments.

Testing with a Fixed Time Source: My previous entries have demonstrated using TimeSource and Clock tied to the System clock. But, for these tests, I think a fixed time source would be more appropriate. The syntax is like this:


millis = 1234920035991L // 2009-02-17T17:20:35.991-08:00 Tuesday...
timeSource = TimeSource.fixed( Instant.millisInstant( millis ) )
clock = Clock.clockDefaultZone( timeSource )

So now when I create a date, time or date/time object from the clock, the time is always the same. Not very meaningful for real life, but great for testing.

If I create a class that uses clock, I can inject the TimeSource based on the system clock. And for testing, I can inject a fixed TimeSource, run tests, and not have to worry about the specific time, but simply base my tests on a static source. Here is the class:


class SweepController {
 def clock = Clock.systemDefaultZone()
 def slope = -0.5
 def maxMinutes = 240
 
 def periodToNextSweep = { fileCount ->
  int x = (int)(slope * fileCount + maxMinutes)
  x < 0 ? Period.ZERO : Period.minutes( x )
 }
 
 def nextSweepTime = { fileCount ->
  def period = periodToNextSweep( fileCount )
  
  clock.offsetDateTime() + period
 }
}

And here is the test script:


millis = 1235721600000L // 2009-02-27T00:00-08:00
fixed = TimeSource.fixed( Instant.millisInstant( millis ) )
clock = Clock.clockDefaultZone( fixed )

sweep = new SweepController( clock:clock )
println "now -> ${clock.offsetDateTime()}"
source = [ 
 [ 480, '2009-02-27T00:00-08:00' ],
 [ 0, '2009-02-27T04:00-08:00' ],
 [ 240, '2009-02-27T02:00-08:00' ],
 [ 120, '2009-02-27T03:00-08:00' ],
]
source.each { count, value ->
 println "count: ${count} -> ${sweep.periodToNextSweep( count )}, ${sweep.nextSweepTime( count )}"
 assert value == sweep.nextSweepTime( count ).toString()
}

When I run the script, here is what I get:


now -> 2009-02-27T00:00-08:00
count: 480 -> PT0S, 2009-02-27T00:00-08:00
count: 0 -> PT240M, 2009-02-27T04:00-08:00
count: 240 -> PT120M, 2009-02-27T02:00-08:00
count: 120 -> PT180M, 2009-02-27T03:00-08:00

The main advantage is that I can run this independent of the current date, but still use the clock object without changing anything inside the class.
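
For comparison, the same fixed-clock pattern can be sketched in plain Java against the API as it finally shipped in Java 8 (java.time), where Clock.fixed plays the role of the fixed TimeSource. The class name SweepSchedulerSketch and the fixedAt factory are my own placeholders, and Duration stands in for the draft's minute-based Period, since the released Period only carries years, months, and days.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Sketch against released java.time; names here are illustrative only.
class SweepSchedulerSketch {
    private final Clock clock;
    private final double slope;
    private final int maxMinutes;

    SweepSchedulerSketch(Clock clock, double slope, int maxMinutes) {
        this.clock = clock;
        this.slope = slope;
        this.maxMinutes = maxMinutes;
    }

    // Linear rule: more archived files means a shorter wait, never below zero.
    Duration periodToNextSweep(int fileCount) {
        long minutes = (long) (slope * fileCount + maxMinutes);
        return minutes < 0 ? Duration.ZERO : Duration.ofMinutes(minutes);
    }

    // "Now" comes from the injected clock, so tests can pin it.
    OffsetDateTime nextSweepTime(int fileCount) {
        return OffsetDateTime.now(clock).plus(periodToNextSweep(fileCount));
    }

    // Test helper: a scheduler whose clock is frozen at the given instant.
    static SweepSchedulerSketch fixedAt(long epochMillis, int offsetHours) {
        Clock fixed = Clock.fixed(Instant.ofEpochMilli(epochMillis),
                                  ZoneOffset.ofHours(offsetHours));
        return new SweepSchedulerSketch(fixed, -0.5, 240);
    }

    public static void main(String[] args) {
        SweepSchedulerSketch sweep = fixedAt(1235721600000L, -8);
        System.out.println(sweep.nextSweepTime(240));  // 2009-02-27T02:00-08:00
        System.out.println(sweep.nextSweepTime(0));    // 2009-02-27T04:00-08:00
    }
}
```

In production code the constructor would receive Clock.systemDefaultZone() instead, and nothing inside the class changes.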

Conclusion: This was a quick look at Period and PeriodParser to see how they fit into JSR-310 from the groovy coder's perspective. Next time we'll look closer at Date, Time and DateTime math capabilities and how they work with groovy plus/minus operator overloading.

Sunday, February 22, 2009

Adjusting Date and Time with javax.time

Java's JSR-310 date and time API, co-led by Michael Nascimento Santos and Stephen Colebourne, is a natural spinoff from the venerable Joda-Time. The implementation has many advantages over java.util's Date and Calendar. Compared to java.util.Date the API has many types of Date and Time including LocalDate, LocalTime, LocalDateTime, OffsetDate, OffsetTime, OffsetDateTime, and ZonedDateTime.

The new JSR-310 date, time, and date/time classes are immutable and thread safe, unlike java.util.Calendar, but at the same time they offer basic math operations, adjusters, and matchers that enable date and time calculations missing in java.util.Date. They are created from a TimeSource that can be tied to the System clock, fixed, or offset in time. This makes the classes extremely test friendly.

This entry discusses how the basic date/time math works and how the date and time adjusters can be used to solve common problems. Let's look first at the basic math.

Look Ma, no Setters: The JSR-310 date/time classes are immutable and thread safe. To accomplish this, the API doesn't include any setXX methods. Days, Hours, Years, etc are manipulated through "plus", "minus", and "with" methods that return new objects of the same type. Here are a few examples, first with date then time:


// tomorrow and 5 years from now
clock = Clock.systemDefaultZone()
today = clock.today()
assert clock.tomorrow() == today.plusDays(1)
fiveYearsFromNow = today.plusYears( 5 )
assert today.plusMonths( 60 ).year == fiveYearsFromNow.year

// two hours ago
now = clock.timeToSecond()
twoHoursAgo = now.minusHours( 2 )
assert now.minusMinutes( 120 ) == twoHoursAgo

// use offset datetime to get today at noon
noon = clock.offsetDateTime().withTime( 12, 0, 0 )
assert noon.hourOfDay == 12
assert noon.minuteOfHour == 0
assert noon.secondOfMinute == 0
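
The same no-setters style survived into the API as released in Java 8 under java.time, although several names changed from the draft used here. A small sketch, assuming the released API and using class and method names of my own (ImmutableDateMathSketch, noonOn), shows that every operation returns a new object and leaves the original untouched:

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;

// Sketch against released java.time; the draft javax.time names differ.
class ImmutableDateMathSketch {

    // "with" returns a copy with the time replaced; the argument is unchanged.
    static LocalDateTime noonOn(LocalDate date) {
        return date.atStartOfDay().with(LocalTime.NOON);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2009, 2, 22);
        LocalDate later = today.plusYears(5);

        System.out.println(today);  // 2009-02-22, the original is not modified
        System.out.println(later);  // 2014-02-22
        System.out.println(today.plusMonths(60).equals(later));  // true

        LocalTime now = LocalTime.of(14, 30);
        System.out.println(now.minusHours(2).equals(now.minusMinutes(120)));  // true

        System.out.println(noonOn(today));  // 2009-02-22T12:00
    }
}
```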

The Adjusters: Here is a tricky problem: how do you compute the date of the nth given weekday of a month, for example the 3rd Friday or 4th Tuesday? Bay Area residents know that not being able to calculate these simple problems can cost real money (street sweeping days). So here is how JSR-310 handles this:


dt = clock.offsetDate()

thirdFridayAdjuster = DateAdjusters.dayOfWeekInMonth( 3, DayOfWeek.FRIDAY )
fourthTuesdayAdjuster = DateAdjusters.dayOfWeekInMonth( 4, DayOfWeek.TUESDAY )

thirdFriday = dt.with( thirdFridayAdjuster )
println "third friday -> ${thirdFriday}, ${thirdFriday.toDayOfWeek()}, ${thirdFridayAdjuster}"
assert DayOfWeek.FRIDAY == thirdFriday.toDayOfWeek()

fourthTuesday = dt.with( fourthTuesdayAdjuster )
println "fourth tuesday -> ${fourthTuesday}, ${fourthTuesday.toDayOfWeek()}, ${fourthTuesdayAdjuster}"
assert DayOfWeek.TUESDAY == fourthTuesday.toDayOfWeek()
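
In the API as released in Java 8 (java.time), the equivalent adjuster lives in TemporalAdjusters rather than DateAdjusters. The sketch below assumes that released API; the class name and the nthWeekdayOfMonth helper are mine. It uses a fixed date so the result does not depend on when it is run.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

// Sketch against released java.time; the draft used DateAdjusters instead.
class NthWeekdaySketch {

    // Returns the nth given weekday in the month containing anyDayInMonth.
    static LocalDate nthWeekdayOfMonth(LocalDate anyDayInMonth, int ordinal, DayOfWeek dayOfWeek) {
        return anyDayInMonth.with(TemporalAdjusters.dayOfWeekInMonth(ordinal, dayOfWeek));
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2009, 2, 22);  // the date of this entry

        System.out.println(nthWeekdayOfMonth(date, 3, DayOfWeek.FRIDAY));   // 2009-02-20
        System.out.println(nthWeekdayOfMonth(date, 4, DayOfWeek.TUESDAY));  // 2009-02-24
    }
}
```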

These examples just scratch the surface of the many date/time manipulation methods in JSR-310's javax.time package. The next entry will discuss how JSR-310 matchers help determine if a date lands on a leap year, leap day, last day of the month, etc. and how to use this in groovy.