All Opinions are…

All Opinions expressed in this blog are my own … the OFFICIAL POSITION OF THE UNITED STATES OF AMERICA!

And furthermore, the official position of YOU. Yes that’s right. Didn’t get the memo? Well I sent you an email. And in the footer there was a bunch of legalese that by receipt of the email included the right to officially represent you in this blog, oh and your eternal soul. Nothing to worry about, just standard, “if you are or are not the intended recipient” stuff. Could be in your spam folder, you might want to check there. Still applies of course because, “even if this is in your spam folder”.

You have hereby been notified.

Easy multi-threading with Groovy

I finished writing an ETL process today. I know, you’re so jealous. Actually it was pretty fun, it pulls in some cool data. Although it was satisfying to have it working, some quick calculations showed it was going to take, “um, way too long” (58 hours!).

List stuffToProcess = Stuff.findAllByProcessedIsNull()

// Process every record one at a time; each REST call takes about a second.
stuffToProcess.each { stuff ->
    try {
        Map data = someRestServiceClient.fetchDataThatTakesOneSecond(stuff)
        importService.storeTheStuff(data, stuff)
    } catch (Exception e) {
        // ...
    }
}

Enter Groovy’s parallel features (the GPars library) to the rescue.

import groovyx.gpars.GParsPool

List stuffToProcess = Stuff.findAllByProcessedIsNull()

// Run the same closure on a pool of 64 threads instead of one record at a time.
GParsPool.withPool(64) {
    stuffToProcess.eachParallel { stuff ->
        try {
            Map data = someRestServiceClient.fetchDataThatTakesOneSecond(stuff)
            importService.storeTheStuff(data, stuff)
        } catch (Exception e) {
            // ...
        }
    }
}

Two lines of code and it’s 10 times faster!


Less is more for Machine Learning inputs

I came across some good advice on maintaining machine learning algorithms. The short version: less is more when you are deciding what data to feed into your algorithm.

It also reminded me of some systems issues Jaron Lanier pointed out in a recent interview.

  • Over time will your model be primarily processing its own predictions? The Netflix movie recommendation engine has narrowed its available inputs because most users only choose between the top predictions.
  • Will your prediction model cannibalize its source of data? Machine learning powered language translation is automating the work of many human translators, whose previous work provides the input needed by the machine learning algorithms.

Never, ever do anything important in Windows

Just learned a great Microsoft Windows lesson.

There is a database program I like that I use in a Windows VM. I was in the middle of executing some commands when: BAM, Windows Update and restart.

What brilliant design, Microsoft. Of course the update should just suddenly take control mid-keystroke.

Those commands I was in the middle of could have been something critical. For instance when switching over from one system to another, the last step is often to quickly run a couple of commands to execute the changeover so that there is just a second or two of downtime.

“Ok, tell the request router to stop sending millions of users to the old system.”
“Now, tell it to point at the new… <WINDOWS UPDATE>”

So, never, ever do anything important in Windows.  And change your settings to not automatically apply updates.

Brooks’ law, why software engineering is not programming

Part 2 of Managing Software Projects

Do you know the difference between programming and software engineering?

It’s not a Jeopardy question; the difference affects tens of millions of people who work directly on software. Programming is writing instructions for a computer, something everyone should learn. Software engineering occurs when two or more people work on a project, which introduces a major difference: communication. Communication is also something everyone should learn. Writing good instructions is clearly important, but the communication between two or ten or a hundred programmers quickly becomes the critical factor in producing a good product.


The important thing to know here is Brooks’ law. Fred Brooks managed the development of IBM’s System/360 and its OS/360 operating system in the 1960s; thousands of smart people writing millions of lines of code. For large projects like operating systems he tried to scale up by adding more programmers, just as a construction project would add more workers or a factory would add more assembly lines. But it didn’t work. He concluded that the more programmers he added, the slower things went.

Brooks stated his law as “adding manpower to a late software project makes it later.”

Here is the problem. Every programmer has a communication link to every other programmer working on the project, for n(n-1)/2 links in total. For two programmers you have one communication link; for three, three links. So far, not so bad: sit those three people in the same room and they might avoid too many misunderstandings. Keep going up: with seven programmers you’ve got 21 links, and about half of everyone’s time is spent coordinating with others. A sixteen-programmer team? 120 links!
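To make the arithmetic concrete, here is a small Groovy sketch of the n(n-1)/2 formula. The method name communicationLinks is my own, made up for illustration.

```groovy
// Number of pairwise communication links in a team of n people: n(n-1)/2.
// n * (n - 1) is always even, so integer division loses nothing.
int communicationLinks(int n) {
    return (n * (n - 1)).intdiv(2)
}

// Team sizes from the text, plus a doubled sixteen-person team.
[2, 3, 7, 16, 32].each { n ->
    println "${n} programmers -> ${communicationLinks(n)} links"
}
```

Doubling the team from 16 to 32 takes the count from 120 links to 496, roughly four times as many, which is the heart of why doubling the workers does not halve the schedule.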

Clearly the project cannot be built in half the time by doubling the number of workers.

Very quickly the number of communication paths increases faster than the number of programmers. Worse, it’s not just programmers, it’s everyone who needs to intimately know the software: QA staff, the development manager, the product manager. This limits the number of people who can work on a single project.

But wait, plenty of companies employ thousands of programmers, how does Lockheed Martin build something like an artillery targeting system for the army? One piece at a time, with vast amounts of planning, hundreds of people dedicated to project communication, working for decades and probably 100% over budget. Much of this effort goes into splitting the project into pieces that a small team can handle and defining detailed interfaces to minimize the communication they need to do with each other.

So that’s the difference, communication is the key skill of software engineering. Software engineers are often not known for their communication skills – or maybe we should say programmers are not known for their communication skills but a well-functioning software engineering team is one that communicates well.

Age Adjusted Median Income

The other day my wife mentioned that median income in the US had gone down over the past few years, which led to a random thought while I was driving home.  Perhaps when median income is plotted over time it should be weighted based on the relative earning power of the current age of the population.  When a large portion of the population is at the age of their optimum earning power (around 50) you would expect median income to be higher than when a large portion is very young or very old.

At least it’ll make a good pair programming scenario next time I’m interviewing a candidate.
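Here is roughly where that pairing exercise might start. This is only one possible reading of the idea: divide each person’s income by a relative earning-power factor for their age, then take the median. The earningPower bands and the five sample people are entirely made up for illustration; they are not real statistics.

```groovy
// Median of a list of numbers (sorts a copy, leaves the input alone).
Number median(List<Number> values) {
    List<Number> sorted = values.sort(false)
    int mid = sorted.size().intdiv(2)
    return sorted.size() % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2
}

// HYPOTHETICAL relative earning power by age (1.0 = peak, around 50).
BigDecimal earningPower(int age) {
    if (age < 30) return 0.6
    if (age < 40) return 0.8
    if (age < 60) return 1.0
    return 0.7
}

// Made-up sample population.
def people = [
    [age: 25, income: 30000],
    [age: 35, income: 48000],
    [age: 50, income: 70000],
    [age: 52, income: 60000],
    [age: 68, income: 35000],
]

def rawMedian = median(people.collect { it.income })
def adjustedMedian = median(people.collect { it.income / earningPower(it.age) })

println "raw median: ${rawMedian}, age-adjusted median: ${adjustedMedian}"
```

With real data the factors would have to come from somewhere defensible, such as median income by age band in a reference year, which is exactly the kind of question worth arguing about with a candidate.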

Today’s Challenge: Build a better system during the training meeting

Had a mandatory training meeting today on our time tracking software. Since we had already figured out the 33-step process a few weeks ago so we could get paid… today we had an impromptu hackathon.

This software is so bad that surely I can build something better during the training meeting for said software.  -Me

Now, how to beat my coworker Mikkel who is wicked good? We both have to vaguely pay attention to the webinar so that’s an equal handicap. Aha, don’t acknowledge that I’m serious until about 15 minutes in. Especially since my computer is creakily working toward four years of service so a few tasks that should take 20 seconds instead drag on for several minutes. It’s about five minutes into the hour when I start, in theory that leaves 55 minutes to build something cool.

Ok, grails create-app timetrack.
Create a user domain class to sub in for a real authentication plugin.

Now the biggest problem with the real time-tracking software is that you have to enter hours for each day.  This is dumb when the only useful purpose it serves is to track vacation time and try to avoid everyone being on vacation at once during a big release.

Don’t want an hours-worked-for-the-day object, so let’s do the opposite and create-domain-class TimeOff. Next create some controllers, set scaffold=true and voilà, we have the world’s simplest app for entering time off.
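For anyone who hasn’t used Grails scaffolding, the steps above come out to something like the sketch below. The field names are my guesses for illustration, not the actual prototype, and in a real Grails 2.x app each class would live in its own file under grails-app/domain or grails-app/controllers. Newer Grails versions point scaffold at the domain class instead of true.

```groovy
// Hypothetical stand-in for a real authentication plugin's user class.
class User {
    String name
}

// One stretch of time off; the field names are assumptions.
class TimeOff {
    User user
    Date startDate
    Date endDate
}

// In Grails 2.x this single static property asks the framework to generate
// the list/create/edit/delete actions and views for the TimeOff domain class.
class TimeOffController {
    static scaffold = true
}
```

Outside of Grails these are just plain Groovy classes, so the scaffold flag does nothing on its own; the framework is what turns it into screens.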

One of those commands takes much longer than it should and after firing up IntelliJ as well I’m at the 15 minute mark. Mikkel realizes I’m serious and starts cranking out a Rails app.

No problem, time for a secret weapon: Dojo.

I know Dojo has some great calendar widgets and a calendar sounds like a good interface for something concerned with days and time. Start looking through docs; not that calendar; this looks right; nope wasn’t that either.  Ok, finally got the right code, still getting a weird error though…

And the webinar is over, done after 40 minutes.

What? I’ve only been coding for 35 mins and Mikkel for 20. Well, time to show our results. Mikkel’s lets you enter hours worked one day after another. With about 15 steps to fill out a two week pay period it’s a more than 100% improvement over the same part of the real app. And mine? Giant JavaScript error, typical Dojo. Damn.

Couldn’t let it end there though. Put another 15 minutes in to wire up JSON output in Grails and get the JavaScript error fixed and… Bam, a decent prototype for the interface.


For comparison, here’s the real app.  Now I just need to turn my prototype into a real app…

[Screenshot: Submit Time Sheet Express Page]

How to strangle productivity

Step 1 Outlook Web App.  Step 2 Time Tracking software.

I’m not quite sure why companies adopt time tracking software. A friend mentioned that he thought there was some accounting treatment that encouraged it, but some web searching has only led me to BS statements like these.

“Time is money, so once it is clear where employees are spending their time, an employer can decide how to better prepare and deal with projects.”

“With a distributed company it ensures that everyone is working.”

Yes, entering eights into boxes ensures that I’m working.  Apparently if the time tracking interface is as cumbersome as possible someone can be extra sure that I’m hard at work.

Now, HR people tend to be extremely nice and well-meaning; perhaps this makes them especially susceptible to the sales pitch of time tracking companies. I understand if a company has part-time employees, or is billing hours to clients; then tracking those hours can make sense. For salaried employees who are otherwise being treated as adults, it’s a gigantic waste of time. Particularly if the software requires a training meeting.

Which leads to TODAY’S CHALLENGE.

Decision Making

How a team makes decisions determines how effective that team is.

All teams have a decision making process, whether explicit or implicit, and all members of the team have a hand in it. Even in a pure command-and-control structure, such as a fast food restaurant where process is centrally decided, every member of the team at the restaurant is reinterpreting the decision and evolving it into what actually happens.

Modern organizations do knowledge work where decisions are the primary output: whether to have a sale or run an ad campaign; which feature to build next. Therefore teams need to be highly concerned with the effectiveness of those decisions. This may take a large amount of time and certainly takes effort and focus. Since every member of the team will evolve what they are working on toward what that member wants, a big portion of the decision making process may be mostly invisible. This hidden misalignment is a big part of why outputs differ from what was decided in ‘the meeting’ on what those outputs will be.

That’s not to say that everyone at a company needs to be involved in every decision. However, visibility about what decisions are going to be made and when can allow people to self-select to be involved in the decisions that most affect their work. Make it clear that anyone can attend any meeting and that it is everyone’s responsibility to exclude themselves from meetings as well.

Meetings are frequently inefficiently run.  Which is a big problem.  So solve it.  Make meetings short and focused.  If you have set up the meeting, you may need to exclude yourself from arguing a side of the decision to be made in order to keep the meeting on task.  If that is the case, you can appoint a surrogate to help convey your point of view, or appoint a surrogate to run the meeting.  Is there more to discuss?  Have another short meeting tomorrow where whoever still needs to be heard should attend and everyone will have had time to consider and refine their ideas.  Communication needs to happen and it is best if it is explicit and visible to anyone who wants to know.

Another symptom of bad meetings is when communication primarily happens in informal groups. Part of this is natural; a huge amount of communication and productive thinking happens over beers or while chatting about something else. That is good and worth encouraging. What is bad is if the only forum for people to voice their concerns or for strategies to be realigned is when enough people happen to get together for lunch or drinks.

Want your team to make better decisions? Here are a few questions to ask yourself and your team members.

What decisions do you need to be involved in that you are not now? Which do you not need to be involved in?

Who should be making the tradeoff between X and Y (often the customer if possible)? How do we convey the reasoning behind that tradeoff to everyone who is working on it?

Help me out, what other good questions should a team be asking in order to make effective decisions?