
Easy multi-threading with Groovy

I finished writing an ETL process today. I know, you’re so jealous. Actually it was pretty fun, and it pulls in some cool data. Although it was satisfying to have it working, some quick calculations showed it was going to take, “um, way too long” (58 hours!). Here is the original sequential loop:

List stuffToProcess = Stuff.findAllByProcessedIsNull()

stuffToProcess.each {stuff ->
    try {
        Map data = someRestServiceClient.fetchDataThatTakesOneSecond(stuff)
        importService.storeTheStuff(data, stuff)
    } catch(Exception e) {...}
}

Enter Groovy’s parallel features (GPars) to the rescue.

import groovyx.gpars.GParsPool

List stuffToProcess = Stuff.findAllByProcessedIsNull()

// Run the same loop body concurrently on a pool of 64 threads
GParsPool.withPool(64) {
   stuffToProcess.eachParallel {stuff ->
      try {
         Map data = someRestServiceClient.fetchDataThatTakesOneSecond(stuff)
         importService.storeTheStuff(data, stuff)
      } catch(Exception e) {...}
   }
}

Two lines of code and it’s 10 times faster!
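If you want to try the idea without GPars or a real REST service, here is a minimal sketch using only the JDK's executor API. Everything in it is an assumption for illustration: `fetchData` is a hypothetical stand-in that sleeps for 20 ms instead of calling a service, and the item count and pool size are arbitrary.

```groovy
import java.util.concurrent.Callable
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Hypothetical stand-in for the slow REST call: sleeps instead of using the network.
def fetchData = { int id ->
    Thread.sleep(20)
    [id: id, value: id * 2]
}

List<Integer> stuffToProcess = (1..64).toList()

// Push every item through a fixed-size pool and collect results in input order.
def runWithPool = { int poolSize ->
    def pool = Executors.newFixedThreadPool(poolSize)
    try {
        long start = System.nanoTime()
        def tasks = stuffToProcess.collect { id -> ({ fetchData(id) } as Callable) }
        def results = pool.invokeAll(tasks).collect { it.get() }
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start)
        [results: results, elapsedMs: elapsedMs]
    } finally {
        pool.shutdown()
    }
}

def sequential = runWithPool(1)    // one thread: effectively the original loop
def parallel = runWithPool(16)     // sixteen threads waiting on I/O at once
println "1 thread: ${sequential.elapsedMs} ms, 16 threads: ${parallel.elapsedMs} ms"
```

The speedup comes from the work being mostly waiting. For CPU-bound work, a pool much larger than the number of cores would not help in the same way.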


Less is more for Machine Learning inputs

I came across some good advice on the maintenance of Machine Learning algorithms. The short version: less is more when you are deciding what data to feed into your algorithm.

It also reminded me of some systems issues Jaron Lanier pointed out in a recent interview.

  • Over time will your model be primarily processing its own predictions? The Netflix movie recommendation engine has narrowed its available inputs because most users only choose between the top predictions.
  • Will your prediction model cannibalize its source of data? Machine learning powered language translation is automating the work of many human translators, whose previous work provides the input needed by the machine learning algorithms.

Never, ever do anything important in Windows

Just learned a great Microsoft Windows lesson.

There is a database program I like that I use in a Windows VM. I was in the middle of executing some commands when: BAM, Windows Update and restart.

What brilliant design, Microsoft. Of course the update should just suddenly take control mid-keystroke.

Those commands I was in the middle of could have been something critical. For instance, when switching over from one system to another, the last step is often to quickly run a couple of commands to execute the changeover so that there is just a second or two of downtime.

“Ok, tell the request router to stop sending millions of users to the old system.”
“Now, tell it to point at the new… <WINDOWS UPDATE>”

So, never, ever do anything important in Windows.  And change your settings to not automatically apply updates.

Age Adjusted Median Income

The other day my wife mentioned that median income in the US had gone down over the past few years, which led to a random thought while I was driving home. Perhaps when median income is plotted over time, it should be adjusted for the population’s age distribution, weighting each age group by its relative earning power. When a large portion of the population is at the age of peak earning power (around 50), you would expect median income to be higher than when a large portion is very young or very old.

http://www.census.gov/hhes/www/income/data/historical/household/

At least it’ll make a good pair programming scenario next time I’m interviewing a candidate.
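A rough sketch of what that exercise might look like, assuming the simplest possible approach: hold the age mix fixed at a reference distribution and take a weighted average of per-age-group medians. Every number below is made up for illustration and is not Census data, the three age groups are hypothetical, and a weighted average of group medians is only a crude index, not a true median of the reweighted population.

```groovy
// All figures are invented for illustration; they are NOT Census data.
// Fixed reference age mix used to standardize every year.
def referenceShares = [young: 0.30, prime: 0.40, older: 0.30]

def years = [
    2000: [shares:  [young: 0.25, prime: 0.50, older: 0.25],
           medians: [young: 30000, prime: 60000, older: 35000]],
    2010: [shares:  [young: 0.30, prime: 0.35, older: 0.35],
           medians: [young: 30000, prime: 60000, older: 35000]],
]

// Weighted average of per-group medians under a given age mix.
def composite = { Map shares, Map medians ->
    shares.collect { group, share -> share * medians[group] }.sum()
}

// For each year, compare the figure under that year's own age mix
// with the figure under the fixed reference mix.
def results = years.collectEntries { year, d ->
    [year, [unadjusted: composite(d.shares, d.medians),
            adjusted:   composite(referenceShares, d.medians)]]
}

results.each { year, r ->
    println "$year unadjusted=${r.unadjusted} adjusted=${r.adjusted}"
}
```

With these invented inputs nobody's income within an age group changes, yet the unadjusted figure falls from 46,250 to 42,250 purely because fewer people are in the high-earning group. The adjusted figure stays at 43,500 in both years.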

Today’s Challenge: Build a better system during the training meeting

Had a mandatory training meeting today on our time tracking software. Since we already had to figure out the 33-step process a few weeks ago so we could get paid, today we had an impromptu hackathon.

This software is so bad that surely I can build something better during the training meeting for said software.  -Me

Now, how to beat my coworker Mikkel, who is wicked good? We both have to vaguely pay attention to the webinar, so that’s an equal handicap. Aha, don’t acknowledge that I’m serious until about 15 minutes in. Especially since my computer is creakily working toward four years of service, so a few tasks that should take 20 seconds instead drag on for several minutes. It’s about five minutes into the hour when I start; in theory that leaves 55 minutes to build something cool.

Ok, grails create-app timetrack.
Create a user domain class to sub in for a real authentication plugin.

Now the biggest problem with the real time-tracking software is that you have to enter hours for each day.  This is dumb when the only useful purpose it serves is to track vacation time and try to avoid everyone being on vacation at once during a big release.

Don’t want an hours-worked-for-the-day object, so let’s do the opposite and create-domain-class TimeOff. Next create some controllers, set scaffold=true and voilà, we have the world’s simplest app for entering time off.

One of those commands takes much longer than it should, and after firing up IntelliJ as well I’m at the 15 minute mark. Mikkel realizes I’m serious and starts cranking out a Rails app.

No problem, time for a secret weapon: Dojo.

I know Dojo has some great calendar widgets and a calendar sounds like a good interface for something concerned with days and time. Start looking through docs; not that calendar; this looks right; nope wasn’t that either.  Ok, finally got the right code, still getting a weird error though…

And the webinar is over, done after 40 minutes.

What? I’ve only been coding for 35 mins and Mikkel for 20. Well, time to show our results. Mikkel’s lets you enter hours worked one day after another. With about 15 steps to fill out a two week pay period, it’s a more than 100% improvement over the same part of the real app. And mine? Giant JavaScript error, typical Dojo. Damn.


Couldn’t let it end there though. Put another 15 minutes in to wire up JSON output in Grails and get the JavaScript error fixed and… Bam, a decent prototype for the interface.

[Screenshot: timetrack prototype]

For comparison, here’s the real app.  Now I just need to turn my prototype into a real app…

[Screenshot: Submit Time Sheet Express Page]

How to strangle productivity

Step 1 Outlook Web App.  Step 2 Time Tracking software.

I’m not quite sure why companies adopt time tracking software. A friend mentioned that he thought there was some accounting treatment that encouraged it, but some web searching has only led me to BS statements like these.

“Time is money, so once it is clear where employees are spending their time, an employer can decide how to better prepare and deal with projects.”

“With a distributed company it ensures that everyone is working.”

Yes, entering eights into boxes ensures that I’m working.  Apparently if the time tracking interface is as cumbersome as possible someone can be extra sure that I’m hard at work.

Now, HR people tend to be extremely nice and well-meaning; perhaps this makes them especially susceptible to the sales pitch of time tracking companies. I understand if a company has part-time employees or is billing hours to clients; then tracking those hours can make sense. For salaried employees who are otherwise being treated as adults, it’s a gigantic waste of time. Particularly if the software requires a training meeting.

Which leads to TODAY’S CHALLENGE.