Monday, December 22, 2008

A Groovy Cryptology Library

Rain City's open-source groovy library has recently been enhanced to include a crypto package. The implementation includes symmetric and asymmetric algorithms, plus HMAC, digest, GUID, key pool and key store utilities. This post will explain the main classes and show a few examples.

Guid: The Guid class is a simple wrapper around java's UUID class. It makes it easy to create a new guid or read an existing guid string to create a guid object.

println new Guid().toString() -> e79a8462-32ad-48db-9905-3d53e1471de3

SaltedSecureRandom: SaltedSecureRandom provides a salt-seeded SecureRandom object using the hardware-dependent /dev/urandom device. This works on all non-Microsoft-Windows platforms. On Windows, a plain SecureRandom object is returned. (If you are serious about security, don't use Windows).
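A minimal sketch of the idea, my illustration rather than the library source, reads a few bytes of salt from /dev/urandom and falls back to a plain SecureRandom elsewhere:

import java.security.SecureRandom

def createSaltedRandom = {
    def urandom = new File('/dev/urandom')
    if (urandom.exists()) {
        byte[] salt = new byte[20]
        urandom.withInputStream { it.read(salt) }   // pull 20 bytes of hardware-derived salt
        return new SecureRandom(salt)
    }
    return new SecureRandom()                       // fallback for platforms without /dev/urandom
}

println createSaltedRandom().nextInt()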

Digest: This class is a thin wrapper around the JCE digest algorithms, including SHA-512. The class defaults to SHA-256 to maintain compatibility with other languages and platforms. MD5 is also available. The default provider is Sun. The class also provides a 'main' to enable checking digests from the command line.
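Under the covers this is standard JCE; the raw equivalent of the default SHA-256 digest looks something like this:

import java.security.MessageDigest

def md = MessageDigest.getInstance('SHA-256')    // the class's default algorithm
def hash = md.digest('this is a test'.bytes)
println new BigInteger(1, hash).toString(16)     // render as hex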

ByteContainer: The ByteContainer class has utilities for converting strings to byte streams, base64 bytes and hex strings. Invaluable for exchanging encrypted data.

Hmac: The Hmac class supports SHA-256 and MD5 and provides methods to set text and key to compute and render keyed digests. It also includes a key generator that defaults to SHA-256 algorithm.

def bc = new ByteContainer()
def k = Hmac.generateKey()

bc.bytes = k.encoded
println bc -> 85a749fd3818fccf0dc6997b6c9f25129babbf6b98606f8b51a14e68b5551fa6

Crypto: This is the primary class to use when you need symmetric encryption using well-known algorithms from various providers. The default algorithm is AES using the Bouncy Castle provider with PKCS7 padding. Other supported algorithms are Blowfish, DES, Triple DES, and all the others supported by the Sun provider.

def crypto = new Crypto() // requires bouncy castle

assert Security.getProvider("BC")

def pw = "my password".bytes
def key = crypto.createKeySpec( pw )
assert 'AES' == key.algorithm

def text = 'this is a test of plain text'

crypto.generateIV()
println crypto.ivSpec.getIV()

def enc = crypto.encrypt( text, pw )
def s64 = enc.encodeBase64().toString()
println s64 -> q7teBCCBK+n3qTSr518QIj04qa/cS/P2bhsUw/eg8a8=

def dec = crypto.decrypt( enc, pw )
def s = new String( dec )
println s -> this is a test of plain text

crypto.clearBytes pw
println "${pw}" -> [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

RsaKeyPool/RsaKeyStore: 2048 bit public/private keys use very large random prime numbers. So generating secure key pairs takes lots of CPU cycles and a fair amount of time. To alleviate any performance bottlenecks, the best approach is to create a key pool to serve up single use key pairs. The pool actually gets keys from a key store that can be built off-line to any desired size. This approach means that a server won't be bogged down initializing the pool when it first comes up (or is bounced).

The actual generation of the key pairs is through a JCE KeyPairGenerator using the Bouncy Castle provider. It's initialized to the key size (defaulting to 2048 bits) and uses the salted secure random class mentioned earlier.
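Roughly, each pair in the store comes from something like the following sketch of the generation step (not the pool code itself):

import java.security.KeyPairGenerator
import java.security.SecureRandom
import java.security.Security
import org.bouncycastle.jce.provider.BouncyCastleProvider

Security.addProvider(new BouncyCastleProvider())

def generator = KeyPairGenerator.getInstance('RSA', 'BC')
generator.initialize(2048, new SecureRandom())   // the library would pass its salted random here
def pair = generator.generateKeyPair()

assert 'RSA' == pair.public.algorithm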

Rsa: The Rsa class provides encryption and decryption of data using public/private keys. For the uninitiated, the main thing to remember is that the public key is used to encrypt and the private key to decrypt. The Rsa class makes this a bit easier by providing encrypt() and decrypt() methods. A simple use case example goes like this:
  • Alice wants to send an encrypted document to Bob. She begins the process by requesting a public key from Bob. This is what will be used to encrypt the document.
  • Bob creates a key pair (or gets a pair from the pool) and sends the public key back to Alice.
  • Alice creates a Rsa object and sets the public key to Bob's key. The document is encrypted then sent to Bob.
  • Bob receives the encrypted document and decrypts with Rsa using his private key.
Now let's take this simple use case and implement some of the parts. For our purposes, we will use two classes: Alice, the sender, and Bob, the receiver.

import org.raincity.glib.crypto.*

class Alice {
    def encrypted
    def send = { bob, text ->
        def key = bob.getKey()
        def rsa = Rsa.instance
        rsa.pubKey = key
        encrypted = rsa.encrypt( text.bytes )
        bob.receive( encrypted )
    }
}

class Bob {
    def text
    def keyPair

    def getKey() {
        keyPair = RsaKeyPool.instance.keyPair
        return keyPair.publicKey
    }

    def receive = { encrypted ->
        def rsa = Rsa.instance
        rsa.privKey = keyPair.privateKey
        text = new String( rsa.decrypt( encrypted ) )
    }
}

alice = new Alice()
bob = new Bob()
alice.send( bob, "This is a super secret message" )
println "encrypted=${new BigInteger(alice.encrypted).toString(16)}"
println "decrypted=${bob.text}"
And when you run the script you see...

encrypted=496bc13c9ba194a2becc525839bab8d9c40a7c126a1ac32f0aa21...
decrypted=This is a super secret message

Future posts will discuss how these classes are used to support Adobe Flex to Grails encrypted exchange sessions. If you are interested you can download the library and examples in a single jar file here.

Friday, August 1, 2008

Groovy Excel Report Library

I just finished a series of reports for one of my customers using grails as the backend and Flex for the user interface. As with many of my reports for this customer, they requested the output be delivered as CSV files. Easy, and very boring.

So, with the help of JExcel and the jxl APIs from Andy Khan, I created an alternate output of an excel spreadsheet rather than a CSV flat file. This has the additional benefit of delivering multi-dimensional results, e.g., summary and details on two separate sheets, output statistics and charting on a third. This is much more useful for the end-user.

To make life easy, I created a wrapper around the jxl libraries to make it possible to swap in an alternate excel API such as Apache POI. At the same time, I created a set of conventions that use a template spreadsheet to create all the output without having to program any of the "view" components (i.e., classic MVC). The controller class is Workbook and the model data is configured in Worksheet objects. All the formatting for dates, numbers, layout, etc. is configured in templates (xls files) that can easily be modified by non-programmers.

Since the target for this is inside war-deployed web applications I had to configure an external templates folder to enable report format changes without having to re-war/deploy or bounce the server. The templates folder is part of a more extensive Exchange class that coordinates remote data exchanges to a known area outside the web-space.

Conventions: Below is a list of conventions used to build reports...
  • xls template contains all formatting and sheets for the report
  • key words embedded into the template provide cell locations where data is to be written. Values stored in the Worksheet object are used by the Workbook controller to update the spreadsheet output.
  • a special type of keyword called "ReportData" is used to insert multiple rows into the spreadsheet, based on query result sets (or any list of data).
The Worksheet models also have a placeholder for "options" that change or enhance the controller's behavior as it builds output. Examples include "greenbar" to create green-bar reports and "showItemNumbers" to write the actual row number to the left of the ReportData.

Working Example: To demonstrate the report library I created a "Customer Invoice" report. The template has three sheets, includes graphic images, and shows two of the three "ReportData" sets in green bar (and blue bar) with automatic item numbering.

The Code: The full report can be built and run with just a few lines of code. Here is a typical example that prints the invoice, details, and expenses for a single period.
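A sketch of what those few lines might look like, using the Workbook and Worksheet conventions described above (domain classes, finder methods, field names and file names are illustrative):

// sketch only: domain classes, finders and field names below are illustrative
def period = '2008-06'

def workbook = new Workbook( template:'customer-invoice.xls', output:"invoice-${period}.xls" )

def details = new Worksheet( name:'details', options:[ 'greenbar', 'showItemNumbers' ] )
details.reportData = InvoiceDetail.findAllByPeriod( period ).collect {
    [ it.workDate, it.description, it.hours, it.amount ]
}

def expenses = new Worksheet( name:'expenses', options:[ 'greenbar', 'showItemNumbers' ] )
expenses.reportData = Expense.findAllByPeriod( period ).collect {
    [ it.expenseDate, it.description, it.amount ]
}

def invoice = new Worksheet( name:'invoice' )
invoice.values = [ invoiceNumber:1024, periodEnding:period, terms:'net 30' ]
invoice.reportData = [
    [ 'Consulting services', details.reportData.size() ],
    [ 'Expenses', expenses.reportData.size() ]
]

workbook.write( [ invoice, details, expenses ] )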

The three sheets are defined as details, expenses and invoice. Invoice details and expense rows are fetched for a specific period with GORM in the typical way. Report fields are extracted to match the template report data layout. The main invoice sheet has a few extra named values and constructs two lines of data.

Report Output: Here are the three sheets created by the code above.

Saturday, July 26, 2008

Dynamic Grails Application Configuration

If you deploy your grails applications with war files you may find it necessary to implement an application configuration mechanism that lives outside the web-space. My first thought was to use an external file, probably xml, and read it in when needed. But I decided that persisting configurations to the database provided a much more universal solution. It also lends itself to adding a UI or acting as a remote web-service.

The Domain Object, AppConfig is very simple. It has a name, parameter, value and status:

class AppConfig {
    String name
    String parameter
    String value
    String status
}

The "name" and "parameter" columns uniquely identify the configuration setting and the "value" is a string representation of the parameter's value. The status can be anything, but I find it useful to set the target environment, e.g., 'dev' or 'test'. I usually use the class name as the configuration name, so looping through a set of parameters is as easy as this:

AppConfig.findAllByName( this.class.name ).each { p ->
    switch(p.parameter) {
        case 'enabled': enabled = (p.value == 'true'); break
        case 'other':   other = p.value; break
        case 'anInt':   myInt = Integer.parseInt( p.value ); break
    }
}

I use a standard dataset loader to read in and insert settings on application startup taking care not to clobber existing values. Configurations are read in during development, test and production.

A typical use is to set parameters for Quartz Jobs. This way, the parameters are read each time the job starts to dynamically configure actions. The most common is to enable or disable the job. This comes in handy if a job goes bad but you can't bring down the web server.
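As a sketch, a job using this pattern might look like the following (class and parameter names are illustrative):

// hypothetical job: class and parameter names are illustrative
class FeedProcessorJob {
    def execute() {
        def enabled = false
        AppConfig.findAllByName( this.class.name ).each { p ->
            if (p.parameter == 'enabled') enabled = (p.value == 'true')
        }
        if (!enabled) return     // switched off in the database, no server bounce required

        // ...do the real work here...
    }
}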

Wednesday, July 16, 2008

Groovy Polymorphics with Closures

Let me begin by stating the problem: I need an object that can locate files in a specific folder, parse the selected files, then execute database updates against the parsed results. The twist is that the update process might be inside a grails application using GORM, or it might be a standalone jdbc process that has its own data access mechanism. Here is the proposed object:

class Process {
    def findFiles() { /* return a list of files */ }
    def parseFile(file) { /* parse the file, return the data */ }

    // the update function (closure)
    def update

    // the runner
    def run() {
        def list = findFiles()
        def records = []
        list.each { records << parseFile( it ) }
        update( records )
    }
}
Forget about the empty methods for now, let's just assume they work. Notice that run() finds the files, parses them, then updates them. Here is how the target object is used:
def process = new Process()
process.update = (isGorm ? gormUpdater : jdbcUpdater)
process.run()
The first thing you might notice is that 'update' in the Process class has no body--it's simply a tag for the closure name. When the class becomes an object, update is assigned the closure method based on the runtime environment. Here is how a very simple GORM closure would be defined:
gormUpdater = { records ->
    records.each { row ->
        row.save()
    }
}
Simple. Loop through the records and save each one. The polymorphic magic is implemented with a closure, not a sub class. This eliminates class bloat by using portable stand-alone methods assigned to normal variables of a class--like named function pointers. An added benefit is that the closure can be used on similar classes that require the same functionality.
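And the jdbc flavor would follow the same shape; here is a sketch using groovy.sql (connection settings, table and column names are made up):

import groovy.sql.Sql

// placeholder connection settings and schema, shown only to illustrate the closure shape
def sql = Sql.newInstance( 'jdbc:mysql://localhost/feeds', 'user', 'pass', 'com.mysql.jdbc.Driver' )

jdbcUpdater = { records ->
    records.each { row ->
        sql.executeUpdate( 'update feed_item set status = ? where id = ?', [ row.status, row.id ] )
    }
}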

Real World Scenario: Let's say I have a small group of software engineers. Two of my engineers are data-feed experts, another three are database domain experts. Me, well, I'm the manager.

So, since I have the big picture, I design the base class and create test fixture data. Then, I tell my feed experts to give me methods for parsing a set of files. They don't have to worry about where the files are, just how to parse and create datasets. At the same time I assign the database ops to the database group. They don't have to concern themselves with where the data comes from, just how to convert datasets into database updates for all our target databases.

Even though I'm a manager, I contribute in a real way to the code base. And, to keep my team productive, I create the test data for the parser and database groups. The experts are assigned tasks that they handle best.

Combine this with daily scrum communications and, for the hands-on CTO, this is very Groovalicious!

Tuesday, July 15, 2008

Flex on Grails

Borrowing from Flex/Rails libraries created about a year ago, I have created a Flex/Grails application using similar HTTP/XML protocols. The application requests report data via HTTP, the server creates the report, persisting it to a file, then the client pulls the data back with a file request. This was the only way I could get Flex/Flash to open a "file-save" dialog to let the user write to their local hard-drive.

I used Flex's HTTPService object to do the initial report request, then FileReference with additional URLVariables parameters to retrieve the file. Grails/GORM made the data retrieval easy and the controller/service packages were simple implementations. This was comparable to the Flex/Rails implementation that I worked on a year ago.

The server side runs in a war distribution so the output file had to exist outside the webspace. The Flex application requests the file with an HTTP post using URLRequest with specific parameters (URLVariables) to retrieve the file. Because the server is deployed with only war files, the request is made to a specific controller with a "getfile" action that retrieves the file from someplace on the server's drive (obviously not in the war file) and is consistent with where the file was originally written. Once the file is returned to the client it is removed, so it can only be read back a single time.
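On the Grails side, the "getfile" action boils down to streaming the file and deleting it; a sketch (controller name, paths, params and content type are illustrative):

// sketch of the server-side action described above (names and paths are illustrative)
class ReportController {
    def getfile = {
        def file = new File( '/var/local/myapp/exchange/reports', params.filename )
        if (file.exists()) {
            response.contentType = 'text/csv'
            response.setHeader( 'Content-disposition', "attachment; filename=${params.filename}" )
            response.outputStream << file.bytes
            file.delete()          // single use: remove once it has been read back
        } else {
            response.sendError( 404 )
        }
    }
}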

This first iteration returns a CSV file, but creating a full xls spreadsheet or a PDF document would be easy to implement. There is also an xml document that can be returned to enable showing the report in a local viewer, e.g., a data grid.

Thursday, July 10, 2008

Replacing GORM with Groovy Sql for Lightweight Jobs

GORM, the Grails Object-Relational Mapping persistence framework, provides a rational way to use Hibernate based on pojo (or groovy pogo) domain classes and basic conventions. Create a domain class, define some constraints, start up grails and voila! Tables, columns, and indexes are created without coding a line of SQL/DDL or creating annoying XML mappings.

But, what if you have a simple standalone process that requires single or multiple database access? Currently GORM has trouble working outside of grails, and creating a new grails project just to support a simple, no-client script is overkill. For that, I use groovy's database access classes in groovy.sql and a small custom library to easily access multiple data sources and provide simple query tools optimized for batch operations.

Here's an example: Tracking multiple UPS shipments. The application must access existing shipment requests from a local database, then query the UPS tracking remote service to discover shipment status. When a shipment is picked up or delivered, another local database is updated and a third system is updated via XML feed.

To summarize, the requirements are:
  • provide simultaneous access to multiple databases
  • query UPS tracking via HTTP requests
  • create XML responses and write to a messaging system
Cron triggers this process multiple times during the working day. It runs on the production machine (1GB slice at slicehost), so it should be kind to existing web and database applications serving the user base.

Design Implementation:
As always, I start with the tests. Using GroovyTestCase, I created tests for the database queries, inserts, and updates, then the target classes and methods. Queries are always the easiest, and updates are straightforward as well. Inserts with auto-generated IDs are a bit trickier, but here is where groovy sql comes to the rescue.

Creating new tables presented a new problem: should I create SQL/DDL scripts? No! For this, it was easy enough to open the current grails application, create the new domain classes, and run grails to have Hibernate create the new tables.

The groovy Sql class provides a method called executeInsert(). The magic in this method is that it returns an array of all auto-generated keys. So, if I'm working in MySql or Oracle or whatever, inserts all look the same--no database specific code required.
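For example (connection settings, table and data are placeholders):

import groovy.sql.Sql

// placeholder connection settings and table
def db = Sql.newInstance( 'jdbc:mysql://localhost/shipping', 'user', 'pass', 'com.mysql.jdbc.Driver' )

def keys = db.executeInsert(
    'insert into shipment_status (tracking_number, status) values (?, ?)',
    [ '1Z999...', 'DELIVERED' ] )

def id = keys[0][0]       // the auto-generated primary key, same call on MySQL or Oracle
println "new row id: ${id}"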

Database access defaults to the production environment but database tests must operate on the test database (similar to grails). This is easy to accomplish by simply overriding the default behavior in the test scripts.

Conclusions:
  • Use groovy Sql for small, non-web applications that require single or multiple database access.
  • Use Grails/GORM for all database/table creation and maintenance (no DDL scripts). This also has the additional benefit of enabling access to new tables through grails if required.
  • Use executeInsert() to return auto-generated keys when inserting new rows.

Friday, May 9, 2008

MySQL InnoDB Dialect

I missed a configuration detail in conf/DataSource.groovy prior to going to production on a recent grails project. I didn't notice the damage until I looked closely at the backup script created by mysqldump. The re-build script called out ENGINE=MyISAM rather than InnoDB. So, no transactions, cascading, or other basic features.

The Fix: To correct the problem, I inserted the following into the hibernate section:
dialect='org.hibernate.dialect.MySQL5InnoDBDialect'
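In context, my sketch of that part of conf/DataSource.groovy looks something like this (the surrounding cache settings are shown only for orientation; the dialect line is the fix):

// conf/DataSource.groovy -- surrounding settings shown only for orientation
hibernate {
    cache.use_second_level_cache = true
    cache.use_query_cache = true
    dialect = 'org.hibernate.dialect.MySQL5InnoDBDialect'
}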
When I ran my tests, there were a couple of changes that needed to be made, mainly because transactions were now working correctly. The next step was to repair the production database.

The Repair: The sql rebuild script needed to be modified to replace MyISAM with InnoDB--a perfect job for groovy. Here is the script:
#!/usr/bin/env groovy
file = new File( 'production.sql' )

new File('fix.sql').withPrintWriter { writer ->
    file.eachLine { line ->
        writer.println(line.replace('MyISAM', 'InnoDB'))
    }
}

Simple to implement. Very groovy, and now with transactions!

Thursday, May 8, 2008

Alternate Logging in Grails

For most cases, the Grails logging system, based on log4j, works fine, especially for development. But in production, if you deploy using a single war file, it gets a bit trickier to enable and disable logging. So, to that end, I have created a quick and dirty logging system using groovy's advanced file features.

Groovy Files: In java, log files are simple to configure, but you end up writing the same old try/catch/finally code--very old style. In groovy, all you need is a file name and you are good to go. Here is a groovy example:
def file = new File("test.log")
file.write("this")
No try/catch. No worrying about opening or closing. Just name it and write to it. Very groovy.

Application Log Files: Each application has its own specific configuration. So, the grails-app/conf folder is the perfect place for an ApplicationConfig class to hold the application's configurations. Inside this class are a few methods that define the external logging system. The methods include getLogDirectory(), getLogProperties(), getLogfile(name), etc. For our purposes, getLogfile() is the only method we need to invoke. Again, an example:
def logger = new ApplicationConfig().getLogfile('my-service.log')
logger?.append("my message")
And that's it. The logger?.append() call only writes the message if a log file exists. The log file is external to the web container, and easy to create (touch my-service.log) or remove. If it exists, then it gets logged to. If not, then no logging occurs. The sys-admin, armed with a list of available log file targets, can create or remove logs on the fly without touching the web server. Again, very groovy.
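For reference, a minimal getLogfile() could be as simple as this sketch (the directory path is an example):

// a minimal sketch; the directory path is an example
class ApplicationConfig {
    def logDirectory = '/var/local/myapp/logs'     // external to the web container

    def getLogfile(name) {
        def file = new File( logDirectory, name )
        return file.exists() ? file : null         // null means no logging for this target
    }
}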

Thursday, March 6, 2008

Grails Application Deployed to Slicehost

We deployed our beta application on slicehost virtual server. The slice is 1GB using fedora 7, mysql 5.1, jetty 6.1.6 and resin 3.2. Groovy version 1.5.4 is installed on the slice and we used Grails 1.0.1 for development, but it's not installed on the slice.

The application at this point is minimal, but is scheduled to go live later this month. I can't go into too much detail, but the application provides a data service between United Parcel Service and a major customer to schedule shipments of large and small freight.

Tuesday, March 4, 2008

Groovy DateTime Math

DateTime is capable of simple increment (++), decrement (--), and all the add/roll methods provided by Calendar. Here is an example of some of the ops:
refDate = new Date()
dt = new DateTime(date:refDate)
tomorrow = dt + 1
yesterday = dt - 1
nextWeek = dt + 7

assert refDate == dt // ensure that the original did not change
assert refDate + 1 == tomorrow
assert refDate - 1 == yesterday
assert refDate + 7 == nextWeek
Groovy Equality: Groovy evaluates '==' as equality, not identity. So it was necessary to override equals() to compare to the millisecond and return true or false. The override makes it possible to return true when the underlying milliseconds match whether from Date, Calendar, Long, or DateTime.
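A sketch of how such an override might look inside DateTime (the real class may differ):

// sketch of a millisecond-level equals() for the DateTime class described above
boolean equals(Object other) {
    def millis
    switch (other) {
        case DateTime: millis = other.calendar.timeInMillis; break
        case Calendar: millis = other.timeInMillis; break
        case Date:     millis = other.time; break
        case Long:     millis = other; break
        default:       return false
    }
    return calendar.timeInMillis == millis
}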

Problems with Milliseconds: When working through the plus and minus methods I made the mistake of using the raw milliseconds to add days. This worked fine on my linux machine, but mysteriously died on OSX. What I neglected to take into consideration was the switch between standard and daylight time (luckily my development cycle is close to the change or I would have missed it). So, a quick fix was to rely on java.util.Date to add days and return a new DateTime copy. A better approach may be to use groovy's Duration classes...
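The quick fix amounts to a one-liner along these lines (a sketch, using the date constructor shown earlier):

DateTime plus(int days) {
    return new DateTime( date: calendar.time + days )   // let java.util.Date handle the DST boundary
}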

Durations: The groovy library includes a set of duration classes in the groovy.time package. The duration classes are containers for years, months, days, hours, minutes, seconds and millis. They support basic math functions to add and subtract durations--very groovy. DateTime uses the duration classes for basic math with the following methods:
now = new DateTime()
current = now.asDuration()
twodays = new Duration(2, 0, 0, 0, 0)
dt = DateTime.fromDuration( current + twodays )

Combined with groovy's TimeCategory, you can do this:
use(TimeCategory) {
future = new DateTime(date:2.weeks.from.now)
oneWeek = 1.week // this is a duration

// schedule something once each week for the next 5 weeks...
dates = []
5.times { dates << (it + 1).weeks.from.now }
}

Additional utility methods and parsing to come next...

Groovy DateTime Comparisons

Those of you following my posts know that I'm working on enhanced date/time support for groovy. In a previous post, I discussed a new DateTime class that leverages the power of java's Calendar and updates the interface in a groovy way. Which brings me to after(), before(), and compareTo() methods.

Calendar implements compareTo() that accepts only a calendar object. If you pass it a Date, an exception is thrown. How restrictively lame.

DateTime's compareTo() accepts multiple types that can be evaluated to the millisecond level. So, DateTime, Date, Calendar, even long types are accepted. This approach enables this to work:

now = [ new Date(), Calendar.getInstance(), System.currentTimeMillis(), new DateTime() ]

yesterday = new DateTime() - 1
tomorrow = new DateTime() + 1
later = new DateTime(hours:23, minutes:59)

now.each {
    assert yesterday.before( it )
    assert tomorrow.after( it )
    assert later.after( it )
}
DateTime. How groovy is that!

Groovy Temporal Support Continued

One area where ruby outshines groovy is in date and date time support. Ruby's date time support isn't more capable, just easier to use. Java and groovy have Calendar--very complete, and very clunky. There is also joda time--very complete, but a bit of an overkill for my tastes.

So here are the objectives:
  • ability to add/subtract dates with integers representing days, months, etc.
  • ability to subtract two dates to yield intervals representing days, hours, etc.
  • ability to compare two dates
  • ability to parse any type of date input, and most easily parse the standards
  • ability to format dates in a wide variety of ways
  • ability to create dates with integers and words like 10.days.ago or nextweek
Java's calendar class supports the first three requirements, and parsing is available through the SimpleDateFormat class. Groovy's TimeCategory enables some of the most common integer enhancements to create dates based on common temporal words. So, the pieces are in place; the next step is to wrap this in a groovy way.

DateTime, the Groovy Calendar: The first step is to create an easy (easier) to use, full featured DateTime class. One approach is to simply extend GregorianCalendar and add the new methods, mostly getters and setters. That's where I started, but I soon discovered a better solution was to contain the calendar object and use invokeMethod() to pass methods and arguments. Then, rather than define properties of the DateTime object, define getters and setters in bean style, making the variables appear as if they were members. For example, I define these two methods:
int getYear() { return calendar.get(Calendar.YEAR) }
void setYear(int yr) { calendar.set(Calendar.YEAR, yr) }
Now, I can use shorthand to access "year" like this:
def dateTime = new DateTime(year:2008)
assert dateTime.year == 2008
dateTime.year = 2020
assert dateTime.year == 2020
dateTime.year++
assert dateTime.year == 2021
Groovy, right? And adding other methods like setSeconds(), clearTime(), isWednesday() were easy one-liners to implement. I also took the liberty of shifting the month to 1..12 rather than 0..11, the calendar default.
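A couple of those one-liners, as I would sketch them:

// sketches of the one-liner methods mentioned above
boolean isWednesday() { calendar.get(Calendar.DAY_OF_WEEK) == Calendar.WEDNESDAY }

void setSeconds(int s) { calendar.set(Calendar.SECOND, s) }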

Constructors: This was also very groovy. After I added the getter/setters, constructing with the virtual member vars was easy. So the DateTime class can be constructed like this:
dt = new DateTime(year:2020, month:4, day:1)
dt = new DateTime(hours:0, minutes:0, seconds:0)
dt = new DateTime().clearTime()
dt = new DateTime() + 5

Date/Time Formatting: The Rails group enhanced ruby's DateTime.to_s() method by adding formatting like this: to_s(format). This makes a lot of sense. To implement this I use a format hash that stores formatting strings by name. The map is static to enable use application wide (I may regret this later) and there is a default format member variable that controls how toString() is formatted. I also added toString(format) to display a date in multiple formats like this:
// default format is a modified ISO-8601 called 'db'
dt = new DateTime(year:2010, day:25, month:4, hours:15, minutes:35, seconds:29)
assert dt.toString() == '2010-04-25 15:35:29'
assert dt.toString(8601) == '2010-04-25T15:35:29-0800'
assert dt.toString('mdy') == '04/25/2010'
assert dt.toString('dmy') == '25.04.2010'
assert dt.toString('dMony') == '25-Apr-2010'
Other formats are available in the formats hash. The actual formatting is done using SimpleDateFormat that supplies a wide range of date formats and is compatible with Java 1.4, important in the groovy community.
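Internally that boils down to little more than a map lookup and a SimpleDateFormat call; a sketch of the idea (format list abbreviated):

// sketch of the format-hash approach; format strings abbreviated
static formats = [ 'db':'yyyy-MM-dd HH:mm:ss', 'mdy':'MM/dd/yyyy', 'dMony':'dd-MMM-yyyy' ]
def defaultFormat = 'db'

String toString(fmt) {
    new java.text.SimpleDateFormat( formats[fmt] ).format( calendar.time )
}

String toString() { toString(defaultFormat) }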

A Groovy Date Parser: I've seen many approaches to this, usually making use of java's SimpleDateFormat class with a boat load of formats. I think there may be a better way: first extract the date parts, usually the numbers, then construct date time objects based on the numeric values. I'll discuss this implementation later...

Sunday, March 2, 2008

Groovy Temporal Support Classes

Groovy adds a rich set of features to java including closures, ranges, dynamic metaClass, access to existing java code base, etc. What is missing is full featured date processing. One approach to fixing this is to use existing libraries of java date extensions, such as joda time. Today I decided to implement some basic date/time features without a third party dependency. I may end up using joda as the project unfolds, but for now I'm trying to resist the temptation.

HourMinuteSecond: This class acts as a container and formatter of hours, minutes and seconds similar to groovy's TimeDuration class. In native form it's completely mutable, so not thread safe. But it can be made immutable by invoking asImmutable(). I'll explain how groovy makes this easy to implement later on.

The class API is as follows:
static HourMinuteSecond now()
static HourMinuteSecond fromDate(date)

def parse('hh:mm:ss')
def set(hours, minutes, seconds)
def set(Date dt)
def next() // enables ++
def previous() // enables --
def add(hours, minutes, seconds)
def sub(hours, minutes, seconds)
boolean isZero()
String toString() // formatted as hh:mm:ss
void addZeroEventListener(closure)
def asImmutable()
The object can be used as a simple up/down timer measured in seconds. When attached to a one second ticker thread, only the next() method needs to be invoked to tally up the seconds. Used as a count down timer, an event is triggered when zero is reached.
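A count-down usage sketch, driving the tick with a simple loop instead of a ticker thread (API names taken from the list above):

def timer = new HourMinuteSecond()
timer.set(0, 0, 10)                              // ten second count-down
timer.addZeroEventListener { println 'time is up!' }

while ( !timer.isZero() ) {
    sleep 1000                                   // one second tick
    timer--                                      // previous(), counting down
}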

The methods are actually closures, so creating the immutable clone was as simple as pointing all the mutators to a single closure that throws an UnsupportedOperationException when a mutating method is called. This mimics how groovy's List.asImmutable() method works.
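Since the mutators are closure-valued properties, the immutable clone can simply repoint them; a simplified sketch, assuming hours, minutes and seconds properties on the class:

def asImmutable() {
    def copy = new HourMinuteSecond()
    copy.set( hours, minutes, seconds )          // snapshot the current value
    def readOnly = { Object[] args ->
        throw new UnsupportedOperationException( 'immutable HourMinuteSecond' )
    }
    // repoint every mutating closure at the one that throws
    [ 'parse', 'set', 'next', 'previous', 'add', 'sub' ].each { copy."$it" = readOnly }
    return copy
}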

I will post more as the temporal project unfolds, and supply a link once the library is ready for distribution.

Thursday, February 7, 2008

Grails Dataset Loaders, Part 2

About a month ago I posted a few thoughts on Dataset Loaders for Grails. Now that one of my Grails projects is going into beta, I thought it would be a good time to revisit the issue.

Rails History: With rails I used fixtures for test data and had to hand code data loaders for production. I put the loaders in lib/tasks. They were rake tasks that did the initial loading and/or refreshing of data for a specified environment. So data loading for production or development was a manual task. Not a huge burden, but just one more thing to remember.

Now that I'm in the Grails realm I have created automatic loaders that are environment aware. A DatasetBootStrap class loads data for development, test, and production whenever the server starts. This is in line with what hibernate does when it starts up--verify that the domain models match the current data source, and make changes where necessary. In rails this is done with migrations--a good idea, but yet another manual step.

DatasetBootStrap: The DatasetBootStrap class first determines the current environment, then loads the appropriate dataset classes and invokes the "load" method (or closure if you will). A list of datasets is configured manually for each environment. This approach makes it possible to share loaders for all three environments (or four if you use staging). Some data is just for test. Other data, like a list of US States, is loaded for all three environments. The main point is nothing is repeated, keeping my code as DRY as possible.
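Stripped down, the bootstrap looks something like this sketch (the dataset class names are examples):

import grails.util.GrailsUtil

class DatasetBootStrap {
    // dataset classes configured per environment; some are shared, nothing is repeated
    static datasetClasses = [
        development : [ StateDataset, UserDataset ],
        test        : [ StateDataset, UserDataset ],
        production  : [ StateDataset ]
    ]

    def init = { servletContext ->
        datasetClasses[ GrailsUtil.environment ].each { it.newInstance().load() }
    }
}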

Datasets: I currently keep my datasets in src/groovy/datasets, but I think that it's time to move this folder into grails-app/conf/datasets. I also use a package declaration (package datasets) to insure that dataset classes don't conflict with other classes (I've had problems with this in the test structure when trying to create unit tests that have a name conflict with integration tests).

Simulating Rails-Like Test Fixtures: This was the easy part. In each test that requires a specific dataset, the setUp() method is used to load as many datasets as necessary to create a solid test environment. Loading all the data is as easy as calling the DatasetBootStrap's init() method.

Grails Integration: My goal for Datasets is to propose its use for all grails projects by integrating it into the grails core. I have a bit of refactoring to do before this is practical, but would welcome any comments or suggestions from other grails (or rails) users as to what features a Dataset loader module should include.

Wednesday, February 6, 2008

UPS Web Services Interface

One of my customers has a requirement to use United Parcel Service to ship large and small packages. UPS has a great set of documentation, but they handle large and small packages with different technologies. Small packages are simple XML request/response transactions that provide methods for requesting a shipment, accepting shipments, and voiding shipments. Large packages allow you to request a shipment, but the request/response is all in SOAP, something of an overkill in my opinion.

UPS requires lots of data concerning who, what and where to establish their pricing. Data is sent to a URL/end point and a response is returned. The response includes pricing and tracking numbers, as well as image data for printing barcoded shipping labels. So, it is mainly a data transfer operation-- a single service.

SOAP is great if you have lots of defined services, like weather in Seattle or Paris or zip code 94705. Or conversion of units from/to. But for simple data transfer, REST, or just plain old XML request/response is much more efficient.

Here is a diagram of the complete data transfer...



As you can see, there are lots of communications. The first question is, why is there a server in the middle? Well, here's the deal--the SunReturns client is just that, a client. The data really resides at the DataServer--the client simply persists its own private session. The UPS server is at their site. All communications are via SOA/services--some simple request/response (ala REST) and others use SOAP.

The technology used in the middle is grails/groovy. The SunReturns server is currently a mixture of JSP (legacy) and groovy/GSP. The groovy components are XML only and include controllers, services, domain objects, etc--similar to grails but without the Hibernate layer. It talks to legacy databases, both Oracle and MySQL--not impossible with grails, but definitely outside the conventions.

Conclusions: The natural loose coupling of SOA provides a new vehicle to solve business problems. At the same time, distributing business processes across diverse servers provides a workable migration path to newer technologies. Groovy, grails, and simple XML transport provide a way to implement SOA with a minimum of effort.

Wednesday, January 16, 2008

Business Service Components

Every business application has a set of commonalities that lend themselves to standardization. In the rails, and to some extent grails world, this translates into plugins. The restful authorization plugin being a good example.

But, what if rather than a plugin, you had access to a software service that provides the same functionality? Here is a comparison:

Using a business related plugin:
  1. download and install the plugin into your project
  2. build out all the required tables
  3. hookup your internal logic to insure that the plugin loads correctly
  4. use the plugin

Using a Software Business Service:
  1. use the service

The choice should be obvious. Rather than dealing with database migrations, version control, broken or non-existent tests, integration into your pristine project, you simply use the service. A good example is the captcha service called Captchator from Andreas Schwartz. This service replaces logic that traditionally comes in the form of a plugin with a callable service that does what a captcha component needs to do. No downloads required. No migration of new tables. Just use the service.

Now imagine a full suite of services for business processes. Authentication, company and contact information, time tracking, invoicing, fulfillment--all discrete services that a VAR or software developer can simply use as components in a vertical solution. Wouldn't that be cleaner than installing plugins?

Saturday, January 5, 2008

Git for distributed backups

I usually use svn for code and document backups but today I decided to use git to automate my database backups. Git seems to run much faster and offers a bit more security. It also functions as a good transport for multiple distribution to remote sites.

The backups are in ~/backups so I created a git repository and moved it to my remote slicehost site. Here are the steps:
  • cd backups
  • git init
  • git add .
  • git commit -a -m 'initial repo'
  • cd ~
  • git clone --bare ~/backups backups.git
  • scp -r backups.git dpw@raincity.slice.com:
The next step was to move the backups.git folder to /public/git and chown -R git:git. Then to read the repo back:
  • mv backups /tmp
  • git clone ssh://git@raincity.slice.com/public/git/backups.git
At this point I have local and remote repos. I went to other machines and cloned the repo to insure that I have multiple locations on and off site. The next step was to hook git up to my standard backup from the master database. At the tail end of the SQL backup script I do this:
  • git add .
  • git commit -m "automated backup on `date`..."
  • git push
And there you go. This provides a local and remote backup. Configuring cron on other machines is as simple as doing a 'cd ~/backups ; git pull'. Life is easy...

Survey Project--A Grails Implementation

I started a small grails application today to enable creating surveys and questions and tracking questions and results. The domain classes are:
  • Survey: the primary table that defines a survey project
  • SurveyQuestion: object to hold a specific question mapped to a survey
  • SurveyResponder: the person that is answering the survey questions
  • SurveyResponse: the recorded answers that the responder enters
And here is the proposed ER diagram, created using dbDesigner.


As the diagram shows, surveys have many questions, questions have many responses. Surveys also have many responders and responders have many responses. You can also see that this is a very simple survey with no branches.

Grails/GORM Implementation:
I began by generating the Survey and SurveyQuestion domain classes and basic validation tests. Next was to create the SurveyResponder and SurveyResponse classes. The implementation was straightforward, but I had to fiddle with the domain class definitions to get hibernate to understand my intent. It was just a matter of explicitly defining class references in dependent models rather than depending on the 'belongsTo' and 'hasMany' declarations.

Datasets and Loaders (Fixtures):
Unlike Rails, Grails doesn't use fixtures for test data. Lucky for me I have a Grails enhancement that solves this, based on a sandbox proposal from the Grails guys. So I generated the dataset loader classes using: grails create-dataset [domain]. This creates the following:
  • a DataLoaderBootStrap class in grails-app/conf/ (unless it exists)
  • a DomainDataset class in src/groovy/datasets
  • a set of random data to be loaded
An example of what gets created is here:

> grails create-dataset survey
> ...
> file grails-app/conf/DataLoaderBootStrap.groovy exists...
> created file src/groovy/datasets/SurveyDataset.groovy
>

The SurveyDataset class looks like this:

class SurveyDataset {
    def dataset = {
        def set = [
            [ name:'879', description:'510', email:'680' ],
        ]

        return set
    }

    def load() {
        dataset().each { data ->
            def obj = new Survey( data )
            obj.save()
        }
    }
}

As you can see, the data is contrived but all of the fields are present and other than the actual data, the script is good to go. I'll show how this data can be loaded from the console, command line and within test scripts in a later post.

Friday, January 4, 2008

Grails Dataset Loaders, aka Test Fixtures

One thing I've been missing since working in Grails is test fixtures ala Rails. The first step I follow when developing data models is to create test data, which is fairly easy in Rails, although I had to do some extra work to get this data to load into the development db, and I also had to augment the yaml files with ERB to generate random data.

The Grails sandbox area has a thread, begun in late 2006, that proposes creating test fixtures. As the conversation unfolded, there was a decision to use the term Dataset rather than Fixture--a welcomed change. But the thread kind of died as of October of 2006, so I guess I need to implement this on my own. Here's my attempt...

Datasets and Loaders (Fixtures): I followed the sandbox proposal and created Dataset classes rather than yaml files. Closures were an obvious choice to create datasets, but I also had some other requirements. My objectives were:
  • datasets are usually a single set, but may be separated by dev, test, and production
  • datasets are implemented as closures to enable generating random data
  • dataset loaders may be separated into development, test, and production
  • they can be loaded from within the application on start up
  • they can be loaded on demand from test classes (similar to calling out fixtures)
  • they can be loaded through scripts (ant or grails)
  • they can be loaded from grails shell and console
For all environments a DataLoaderBootStrap class controls loading when the application starts (grails run-app). The bootstrap is environment aware, and loads specified datasets from individual Dataset classes, e.g. UserDataset, CountryDataset, etc. based on the current environment.

The datasets are classes that have a dataset closure that defines hash maps of data (not model classes) and a load() method to do the actual database inserts (or updates). The classes are placed in the src/groovy/datasets folder. The load() method creates objects from the hash set then invokes save() to either insert or update depending on the state of the database. This has the benefit of creating predictable data without having to continually drop or truncate tables (ala rails).

When testing in the console, the loaders can be run manually. Test scripts can invoke loaders at any time to insure that there is predictable data. A command line script can be invoked to load all data for the given environment using "grails [env] load-data".

I thought of creating a plugin for this but it doesn't really fit the plugin framework. So, for now I plan to simply keep it as a platform enhancement.

Wednesday, January 2, 2008

Web Containers on SliceHost

I have a small virtual server at slice host (256M). I originally thought it would be appropriate for rails/mongrel and apache, but probably too small for a java web container. But, after installing resin and running it through some basic tests, it looks like even the small slice is fine for testing and semi-live (e.g. demo) applications.

My next step is to test jetty and glassfish. I don't think there will be problems with jetty, but glassfish may be too much of a memory hog to work correctly.

Running 'top' shows that resin actually uses about 60M when it starts up and goes up to 90M under basic use. This will undoubtedly change when spring/hibernate and real database access kick in, but it looks good so far. My goal is to install a grails application to see if slice host can be used for basic demo purposes.