10 Print "Hello": 2006

Wednesday, December 13, 2006

Playing in the Sandbox

This message showed up in the Manning Sandbox forum for wxPython In Action. After saying some nice things about the book, the poster has some suggestions:

I would love to see an advanced volume covering topics such as XRC, using XML to define a screen layout; creating custom widgets... internationalization, and a full chapter or more expanding on chapter 5 "Creating your blueprint." I find that... program organization is most important yet little seems to be written about it, for any programming language.... A book that illustrates solutions to design problems using patterns, Python, and wxPython will help many people...

Thanks for the kind words. I'm thrilled that you found the book helpful.

There are about three or four things that I regret not being able to include in the book. XRC is definitely on that list -- I think that XML or other GUI description languages are going to be increasingly important. (Also on the list: multimedia and how to distribute your finished program).

These all got pushed out of the book over space considerations. We were something like 75 pages over budget as it was. Owing to some communication issues, nobody realized that we were that far over our page count until the pages were already written, and while it wasn't a problem to get them approved, it did mean that we didn't push forward into other topics.

There are no current plans for a second book, although it's not out of the question that we'd fill a gap or two with an article online somewhere. (To be clear, there are no current plans for that, either). Frankly, sales are not high enough for a publisher to seek us out for a second book at this point, especially since an advanced book would, almost by definition, sell fewer copies than the original. That's not a knock on the sales, which seem to be roughly in line with publisher expectations, just a comment on the size of the potential market.

As for a patterns/program organization book for Python, I'd love to write one. For one thing, it'd give me a chance to rant at length about the finer points of architecture and class structure. I've been known to have an opinion or two on the topic. The main problem would be selling it. My sense is that it's rather hard to sell a general-topic programming book compared to one that's tied to a specific tool or product. This would especially be true for a relatively small market like Python.

Still, that's what blogs are for, so hopefully I'll be able to get some interesting thoughts on those lines here.

Friday, December 08, 2006

GWT Article Now Online

I'm happy to announce that Part One of my four part series on using Google Web Toolkit is now available at: http://www-128.ibm.com/developerworks/opensource/library/os-ad-gwt1/

This part focuses on creating a GUI front-end using GWT. In case you wonder about the lead time on these things, it was originally written in August, and slightly updated right before it was published.

I think that the future parts of the series will appear monthly. The next one is about using the Derby database as your back-end.

I'm pleased with how the article turned out, although I sort of wish I had taken the time to polish the look of the sample a bit -- that's one ugly Web 2.0 application. On the other hand, there's only a limited amount of space available in this article (2000 words, and I was already over), and any graphical enhancement needs to be put in code and explained. So it's a tradeoff.

Hope you like it.

Saturday, December 02, 2006

Don't Ask Questions, It Only Encourages Him

Let me promote this from the comment section -- it's not hard to find, it's the only comment on the previous post.

What is your favorite Python IDE? Your editor choices are interesting and valid but I wondered if you have a preffered IDE for Python and wxPython work?

I may have covered this somewhere, either on this site, or in the Python 411 podcast interview. If so, I'm sorry.

When I'm on a Windows machine, my Python editor of choice is JEdit, and has been for quite a while. I should say that I only rarely use the Python debugger and Jython interpreter add-ons that are available for it. I do have a couple of custom scripts that execute the script I'm working on and so on, but it's not a very elaborate set-up. What I like about JEdit is that it has a clean interface, has familiar Windows keyboard shortcuts and menus, and has a lot of plugin and macro power. The main downside is that it's kind of a memory hog for a text editor.

When I jumped Mac-side for my personal stuff, I tried just about every free programmer's editor I could find (JEdit, like a lot of Java/Swing programs, behaves a bit oddly on a Mac). Finally, I somewhat reluctantly decided to pay for TextMate. Although Python-mode on TextMate hasn't quite gotten the love and attention that Ruby has, there's still a lot of power there. It's really easy to script and customize, and it's about the only editor I've seen that has dynamic mode, so that if you are writing, say, a Django template, TextMate knows that the HTML portion gets colored and edited under HTML scope, and inside the Django tags, you use Django/Python mode.

My main issue with the various IDEs is that I haven't seen a Python one where the gain in the IDE functionality makes up for the editor itself not being as powerful as my normal editor. Some of this is my personal workflow -- using Python and test-first programming, it's very rare for me to feel that I need a step-through debugger. I'm also not very fond of code-generating GUI builders, for reasons that are probably worth another post.

That said, one thing I do like when I'm in an IDE for whatever reason is the ability to specify a project and do useful things across project scope. (Both JEdit and TextMate have this ability, but not fully formed.) I've done some Ruby work using Eclipse with Ruby plugins, and it's nice to, say, have one-button access to running all your unit tests.

That's what works for me, but I'm always looking for new tools and finding out how other people work.

Friday, November 24, 2006

Editors I Like

Two tools I use all the time. Neither is free, and since my strong bias is to use free tools where possible, these are some really impressive editors.

IntelliJ IDEA For all my Java needs. It's got more features and is more usable than any other Java IDE out there. The only downsides are that it's not free, and there are about a half-dozen keyboard shortcuts you have to get down before you achieve anything like full Zen mastery (well, and it could be nicer about the way it arranges tabs in the editor). I converted one of my teams to IntelliJ largely by just using it in team programming sessions. (Sample comment: "You don't seem to be typing very much"). IntelliJ is so good at using the class and type information that it single-handedly makes it useful to have static typing.

There are a couple of killer features that I miss horribly when using another tool, like the control-n series of shortcuts to browse classes, files, or symbols in the project. Even better, in the search box, the search is camel-case aware. So if you type MBGC, it'll match the class MyBigGoofyClass, or the symbol makeBeveragesGreenCoffee. It's brilliant, completely intuitive and makes searches through the code very quick. Ten seconds after you try it once, you're hooked.
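To make the camel-case matching concrete, here is a rough Python sketch of the idea. It is not how IntelliJ actually implements its search; the function names `camel_humps` and `matches_abbreviation` are made up for illustration, and the rule used (each typed letter must match the first letter of the identifier's successive camel-case words) is an assumption based on the behavior described above.

```python
import re


def camel_humps(identifier):
    """Split an identifier into camel-case words, e.g. 'myBigClass' -> ['my', 'Big', 'Class']."""
    return re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])", identifier)


def matches_abbreviation(abbreviation, identifier):
    """True if each typed letter starts the corresponding camel-case word, in order."""
    humps = camel_humps(identifier)
    if not abbreviation or len(abbreviation) > len(humps):
        return False
    # zip stops at the end of the abbreviation, so a short abbreviation
    # matches any identifier whose leading words start with those letters.
    return all(a.lower() == h[0].lower() for a, h in zip(abbreviation, humps))


for name in ["MyBigGoofyClass", "makeBeveragesGreenCoffee", "MyBigClass"]:
    print(name, matches_abbreviation("MBGC", name))
```

The comparison is case-insensitive so that the same abbreviation finds both class names and lower-camel-case symbols.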

Another cool feature -- a help menu item that lists cool features along with a count of how often you use them, so you can see what neat stuff you are missing out on.

TextMate My Mac-based editor of choice for everything other than Java. The Mac editor wars (BBEdit vs. Everybody Else) do sometimes reach a fever pitch normally associated only with critical topics like Emacs vs Vi. So I don't want this to be a bash on other Mac editors -- if I want to do that, I'll do another post. Besides, if what you are doing works for you, more power to you. What I like about TextMate is that it really was built from the bottom up to be scriptable, which gives it a tremendous amount of flexibility and power to handle different file modes, integrate shortcuts, and run scripts from TextMate. It's just as easy as could be to use that stuff, and even pretty easy to create new commands. Biggest quibbles are the single-keystroke only Undo, which I understand as a design choice even as I find it kind of frustrating, and the project mode, which is useless to me, at least partially because it will only display like five tabs across before going to the overflow menu, and that's just not enough.

Killer feature: select a chunk of text, type the left half of an enclosing pair, like a left paren or bracket. Rather than replace the selected text with the keypress, which is almost never what you want to do, TextMate will surround the selected text with the enclosing pair, which is so useful that I try and fail to use this in all my other editors now. And yes, the list of enclosing pairs is adjustable for each file mode.

Sunday, November 19, 2006

Less Frequently Asked Questions

Hi. Miss me? Thought not. Well, I've been feeling increasingly guilty about not posting something here, especially as the comments continue to trickle in -- we're up to eight now that I've cleared off some comment spam!

I'm going to boldly ignore the three partially written posts and do another round of publishing questions.

How's the wxPython book selling? Not bad. Got my first-ever royalty check a couple of weeks ago (actually, a royalty direct deposit). Went well into the low three figures, which is relevant to the older question about how much money you can make at this thing. (To be fair, that's artificially lowered a little bit because some of the royalties still were going to finish paying off the advance.) I was hoping that the book would substantially outsell the Jython book, which hasn't happened yet. A little surprising given both the relative health of the mailing lists for the two tools and also the fact that the wx book had a substantially better rating on Amazon. And oh yes, I know what the Amazon rank is. Also the current total of reviews -- thanks to all ten people who have gone to Amazon and said nice things.

The implication of the Amazon numbers would seem to be that either a) Amazon is selling fewer books than four years ago or b) Manning doesn't have the distribution within bookstores that O'Reilly did. I suspect the latter, especially since the Jython book seems to have had more foreign sales.

What tools do you use to write the books? I'm actually pretty interested in this one with respect to how other authors do it. It seems to me that there isn't a fully satisfying tool for authoring technical books. In both cases, the main tool decisions were made before I came on board. For the Jython book, we used Word. There was exactly one positive of this -- it was nice to use the change-tracking system to do comments back and forth. Word, though, seems to be optimized for a two page office memo, and is really a pain for anything more complex. Especially when you are changing font styles frequently for things like code literals, or inserting code blocks, or numbered lists (which can be nightmarish). Most of my demo code was written in jEdit. O'Reilly provided a Word template with predefined paragraph and character styles for things like lists, headers, and literals, and they then automatically converted that for production.

Manning had something similar, but we only used it indirectly. For the wx book, we used ReStructured Text and a CVS server to do change tracking. This enabled us to do the writing in an ordinary text editor -- I used jEdit again (Robin used Emacs, I think). The CVS server was a great idea. Using ReST had some benefits -- for one thing, it was much easier to integrate figures and code snippets with include statements. The biggest downside from just a typing standpoint was tables, which were a pain. However, Manning would only accept submissions in Word at that time, so we had a kind of convoluted system where we would auto-convert the ReST to OpenOffice, then hand save it as a Word file. Once we turned the books in, we were stuck doing final proofreading and edits in Word. (Although Robin and I used Writely, now Google Docs, to do our change lists from the proofreading -- that worked really nicely). One significant difference between the publishers was that O'Reilly did the indexing in-house, whereas Manning asked us to do it. It's actually more difficult than it looks to do well -- I'm under no illusion that we did the index as well as somebody with more experience would have.

One more question along these lines upcoming, but it's longer and it's late. I'll try not to wait six weeks.

Sunday, October 01, 2006

Now with sound

Robin Dunn and I were interviewed for Ron Stevens' excellent Python 411 podcast -- you can download and listen to the .mp3 file here. It's about 45 minutes long, which should mean that Ron used nearly all of the interview. Hope you like it. It was a lot of fun to do. Thanks again to Ron for having us on, and also for the very nice review of the book he wrote on Slashdot. Haven't listened to the podcast myself yet, so I hope I don't sound like a gibbering idiot.

Saturday, September 30, 2006

wxWorld

I'm pleased to be able to link to a new article: Build cross-platform GUIs using wxWidgets available on the IBM developerWorks site. The original title was "wxWorld", and it's a quick look at wxPython, the wxWidgets toolkit, and some of the other wxWidgets language bindings. I had some fun digging through the different language tools trying to create short wx programs in each. Hope you like it.

Friday, September 22, 2006

Tips-First for Test-First

Of all the exciting ideas and revelations that came from Kent Beck's original XP book, Test-First Programming has been the one that most significantly affected the way I work on a day-to-day basis.

I love programming test-first. It's a great way to take a large, amorphous task and solve it piece by piece. It's also a nice morale boost -- "Hey, I know that my code does nine things. Let's go for ten..."

Here are a bunch of things I wish somebody had told me about test-first programming.

Unit Testing is not All Testing

So, after I started doing test-first, I walked around for about six months all, "my code is perfect because I wrote tests". My smugness came crashing down when testers found some bugs. My code was better because I wrote tests, but it turns out I had made some dicey assumptions about the inputs, and so my tests passed, but the code was still incorrect. Test-first is not a complete test suite. You still need to do acceptance testing, you still need to do GUI testing where appropriate.

However, you can still automate a very large percentage of acceptance tests. The more you can automate the tests, the more they'll be run, and the happier you'll be.

Test-First is a structure for writing good code at least as much as it's a means for verifying code

Code that has been written in a test-first style tends to have certain qualities. Small methods, loosely coupled. Small objects, loosely coupled. Code that causes side effects (such as output) tends to be separated from code that doesn't. These are all side effects of what's easy to test -- it's easy to write small methods in a tight test-first loop. And dependencies between methods or objects make tests harder to write.

As it happens, those exact qualities -- tight cohesion and loose coupling -- are exactly what characterizes the best software architectures. My test-first experience is that I wind up with much better code architectures from test-first than I do when I try to guess the design before I start. (Which is not to say that a little bit of pre-design can't be helpful, just that it can be overdone).

Test-first is better suited for some things than others

That doesn't mean that you shouldn't try, of course. Test-first is vital in cases where you know the input and output, but not the process. It's also critical in cases where your program can be incorrect in subtle ways. It's somewhat less important for things that will visibly or loudly break. GUIs are a challenge because GUI layouts tend to change in ways that can break unit tests. GUI behaviors are more stable and easier to test. Again, though, you should try and have automated coverage of even those areas that weren't developed test-first.

Trust the process. Look ahead on tests, not implementation.

It works. The tight process is: Write a test. Run the test so that it fails. Make the simplest fix that will pass the test. Run the test so that it passes. Refactor. Keep to that tight loop. Resist the temptation to guess about what you'll need to pass the next test. What I usually do is put a list of the tests I'm going to write in comments in my test class -- that's my lookahead plan, and keeps me from forgetting something. But the design I do in my actual code comes during the refactor step, which is where I see duplication and abstraction.
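As a small illustration of that loop in Python's unittest, here is a sketch of what a test class might look like partway through. The `word_count` function and the planned tests are hypothetical, chosen only to show the lookahead list living in comments while the implementation stays as simple as the current tests allow.

```python
import unittest


def word_count(text):
    # Simplest code that passes the tests written so far; no guessing ahead.
    return len(text.split())


class WordCountTest(unittest.TestCase):
    # Lookahead plan (tests still to write, not code to write):
    #   - punctuation attached to words
    #   - hyphenated words
    #   - non-string input

    def test_empty_string_has_no_words(self):
        self.assertEqual(0, word_count(""))

    def test_counts_words_separated_by_spaces(self):
        self.assertEqual(3, word_count("go for ten"))
```

Running this with `python -m unittest` gives the pass/fail signal for each trip around the loop. The next step would be to promote one line from the comment list into a failing test.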

The earlier you start the better off you are.

It gets increasingly hard to convert a code base to test-first the longer you wait. I've even had 20 line classes that needed significant refactoring to unit test (mostly because output was intertwined with functionality -- the code was better after the refactoring). Test-first is a good place to start, anyway -- pick something to test, and go.

Treat your tests like code and refactor them

Pretty much every test-first guide gives sort of a perfunctory nod to, "oh yeah, keep your test code clean." But I think this could stand a little more attention. For one thing, your unit tests are critically important to your ability to deliver quality code -- and they have no tests of their own. The cleaner your tests are, the better you'll be able to see issues with the tests themselves.

One thing that has worked nicely for me is extracting sets of assertions into custom assert methods. If you are continually making the same five assertions to check validity of your objects, throw them into an assert_instance method of some kind. Another common case is making the same assertion over a range of values -- move the for loop to your custom assertion and pass the range endpoints in.

There are two big advantages to doing this consistently. The first is that it's easier to see what's going on from one line of assert_person(expected_name, expected_addr) than from five lines of assert_equals. The second is that it ensures that you actually do make all the assertions every time. Hey, everybody slacks, and test-first is about making the test setups as quickly as possible. If you can trigger all umpteen tests on your class with one method call, you're more likely to do the whole set every time, rather than just picking one or two at random each time.
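Here is a sketch of both kinds of custom assertion using Python's unittest. The `Person` class, the `is_teenager` function, and the assertion names are all hypothetical stand-ins; the point is only the shape: one helper that bundles the checks you always make, and one that owns the for loop over a range.

```python
import unittest


class Person:
    """Hypothetical class under test."""

    def __init__(self, name, addr):
        self.name = name
        self.addr = addr

    def is_valid(self):
        return bool(self.name) and bool(self.addr)


def is_teenager(age):
    """Hypothetical function under test."""
    return 13 <= age <= 19


class CustomAssertions:
    """Mixin holding assertions shared across test classes."""

    def assert_person(self, person, expected_name, expected_addr):
        # Every check we always make on a Person, in one call.
        self.assertIsInstance(person, Person)
        self.assertEqual(expected_name, person.name)
        self.assertEqual(expected_addr, person.addr)
        self.assertTrue(person.is_valid())

    def assert_teenager_for_ages(self, expected, first_age, last_age):
        # The loop lives here; the test only passes the endpoints (inclusive).
        for age in range(first_age, last_age + 1):
            self.assertEqual(expected, is_teenager(age), "age %d" % age)


class PersonTest(CustomAssertions, unittest.TestCase):
    def test_constructor(self):
        self.assert_person(Person("Wilma", "Bedrock"), "Wilma", "Bedrock")

    def test_teenager_boundaries(self):
        self.assert_teenager_for_ages(False, 0, 12)
        self.assert_teenager_for_ages(True, 13, 19)
        self.assert_teenager_for_ages(False, 20, 30)
```

Putting the helpers in a mixin rather than in the test class itself is one design choice among several; it just makes them easy to share.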

Don't reuse instance variables

This is another refactoring issue. It's tempting as you add new unit test cases to do something like this:

Person p = new Person("noel", "david", "rappin");
assertEquals(15, p.nameLength());
p.setLastName("flintstone");
assertEquals(19, p.nameLength());
p.setFirstName("pebbles");
assertEquals(22, p.nameLength());
p.setFirstName("betty");
assertEquals(22, p.nameLength());

The last test fails -- quick, what's it testing? Okay, now we have to trace the life of that instance variable all the way back up. It's hard to read, and prone to dangerous errors. You should never reuse an instance variable like this in a unit test -- every assertion should, where it's at all feasible, be completely distinct:

void assertNameLength(int expected, String first, String middle, String last) {
  // Build a fresh Person for every assertion so no state carries over.
  Person p = new Person(first, middle, last);
  assertEquals(expected, p.nameLength());
}

public void testNameLength() {
  assertNameLength(15, "noel", "david", "rappin");
  assertNameLength(19, "noel", "david", "flintstone");
  assertNameLength(22, "pebbles", "david", "flintstone");
  assertNameLength(22, "betty", "david", "flintstone");
}

Now when the last test fails, you can actually see what's going on.

Avoid tautologies

The scariest issue you can have with tests is a test that passes when it should fail, allowing you to continue blithely along, ignorant of a bug you should have already caught. There will come a day, for instance, when you will forget to put any assertions in a test. There are a couple of things you can do to make tautologies less likely.

Follow the process. The process says each test has to fail before you add code. Adding tests that you already know will pass can easily lead to writing a test that will never fail.
If you have constants for text or numerical values in your code, don't reuse those in the tests -- use the literal or create a separate constant in the test.
Be careful with Mock Objects. Try not to test the things that you are explicitly inserting in the Mock when it's created.
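The point about constants is easy to see in a small Python sketch. The `GREETING` constant and `greet` function below are hypothetical. The first test can never catch a typo in the constant, because the typo shows up on both sides of the comparison; the second spells out the expected value independently.

```python
import unittest

GREETING = "Hello, %s!"  # constant in the (hypothetical) production code


def greet(name):
    return GREETING % name


class GreetTest(unittest.TestCase):
    def test_greet_tautology(self):
        # Weak: reuses the production constant, so changing GREETING
        # changes both sides and the test still passes.
        self.assertEqual(GREETING % "Noel", greet("Noel"))

    def test_greet_literal(self):
        # Better: the expected value is written out as a literal.
        self.assertEqual("Hello, Noel!", greet("Noel"))
```

Both tests pass against the code as written; only the second one would start failing if somebody mangled the constant.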

Mock Objects Rule

Mock Objects are the missing link in helping you test all the things that are traditionally hard to unit test, like databases, GUIs, web servers... anything where your code is dependent on an external system or person. The Mock can step in and pretend to be that third party, allowing you to send and receive data in a testable way. Mock Object packages exist for a variety of languages, and using a package will save you time and effort on your tests.
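As one example of such a package, current versions of Python ship `unittest.mock` in the standard library. The sketch below assumes a hypothetical `welcome_message` function that depends on a database object with a `lookup_name` method; neither is from any real system. The mock stands in for the database, and the test checks what the code does with the data rather than just reading back what was put into the mock.

```python
import unittest
from unittest import mock


def welcome_message(db, user_id):
    """Code under test: depends on an external database object."""
    name = db.lookup_name(user_id)
    return "Welcome back, %s" % name.title()


class WelcomeTest(unittest.TestCase):
    def test_welcome_formats_name_from_database(self):
        # The mock pretends to be the database.
        fake_db = mock.Mock()
        fake_db.lookup_name.return_value = "pebbles flintstone"

        message = welcome_message(fake_db, 42)

        # Assert on the transformed result, not on the canned value itself.
        self.assertEqual("Welcome back, Pebbles Flintstone", message)
        # Also confirm the code asked the "database" the right question.
        fake_db.lookup_name.assert_called_once_with(42)
```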

Hope that helps -- go out and test something.

Friday, September 15, 2006

Why, Johnny, Why?

We interrupt Python week to bring you the following alternative programming rant. I know, Python week has sort of gone up in smoke. But one of our mottoes here is "Whenever a Hugo Award winning SF novelist writes a hyperbolic screed about BASIC in the public schools, 10 Print Hello will be there". As a motto, it's not very catchy. We're working on it.

As soon as I mentioned "Hugo Award winner", "BASIC" and "hyperbolic screed" many of you were probably able to quickly deduce that the author is David Brin, here on Salon wondering what happened to BASIC (you'll have to watch an ad to view the article):

Only, quietly and without fanfare, or even any comment or notice by software pundits, we have drifted into a situation where almost none of the millions of personal computers in America offers a line-programming language simple enough for kids to pick up fast. Not even the one that was a software lingua franca on nearly all machines, only a decade or so ago. And that is not only a problem for Ben and me; it is a problem for our nation and civilization.

Does he have your attention yet? He'll equate the loss of BASIC to an act of war later in the essay. Brin seems to be making three separate points:

BASIC used to be available on all computers that kids touch, and that is no longer the case.

This is obviously true, but a bit less dramatic than Brin implies.

Brin implies that BASIC was available for kids for a long time, and only recently disappeared. Actually, that's close to backwards. BASIC was generally available for less than a decade, and has been fading ever since. Even though Brin says a couple of times that "20 years ago" millions of kids could have used BASIC, the fact is that by 1986 BASIC was on its way out as a standard part of home computers.

Although it was invented in the early '60s, BASIC is associated most strongly with the late '70s and early '80s generation of computers. This market would eventually be dominated by the Apple ][ line, but earlier included things like the TRS-80 and Texas Instruments. (I even remember the Bally TV-based game system having a BASIC module circa 1981 or so.) In any case, neither the Mac (introduced in 1984) nor the IBM PC and clones featured BASIC to that same degree. By 1985, the idea that all computers would have BASIC was much less strong, although Apple ][ and BASIC instruction lingered in schools for a few years after that. Which is why Brin's son is still seeing it in math textbooks, although that says more about the textbook industry than anything else. Data point -- my middle school had an Apple ][ computer lab in 1984 or 5. In my high school, a couple of years later, the computers were already Macs and PCs without BASIC -- we learned Pascal.

There is much less of a sense that kids should be taught programming (particularly in BASIC) than there may have been in the mid-'80s.

Largely true. Another data point -- my younger relatives about ten years later were no longer taught BASIC, nor were the kids at elementary schools I studied about the same time. By now, computers had migrated into the actual classroom and were being used as reference and also for what I guess you'd have to call multimedia authoring. I do think this is a loss. But at the same time, I've always kind of suspected that the reason why elementary school kids were taught BASIC in the early 80s was because the schools were kind of floundering around for what to do with the shiny new computers. My read of the educational literature during the time I was studying educational technology was that eventually this petered out because it was not clear that teaching programming was helping students become better general learners. To be fair, that's not exactly the point Brin is arguing, but it does suggest that, perhaps, losing BASIC is not the end of civilization.

BASIC has some magical set of properties (Brin calls it "line-programming") that makes it uniquely suitable for introducing programming concepts. Because of this, we're losing an entire generation of tinkerers. This, I don't get at all.

So, Brin goes on at some length about how the computer people he's talked to don't seem to feel that it's a problem that BASIC isn't around anymore, while he, Brin, knows better. (Anybody familiar with Brin's essay entitled "The Dogma of Otherness" should catch at least a hint of irony.) Anyway, while Brin does acknowledge that BASIC has a lot of limitations, he goes on at length about line-programming and how important it is.

I'm not completely sure what Brin means by line-programming. Google didn't give me a relevant link. I'm going to assume that it has something to do with the fact that BASIC circa the Applesoft years was coded on a line-by-line basis. Brin suggests that this was an experience that modern languages can't give:

The "scripting" languages that serve as entry-level tools for today's aspiring programmers -- like Perl and Python -- don't make this experience accessible to students in the same way. BASIC was close enough to the algorithm that you could actually follow the reasoning of the machine as it made choices and followed logical pathways. Repeating this point for emphasis: You could even do it all yourself, following along on paper, for a few iterations, verifying that the dot on the screen was moving by the sheer power of mathematics, alone. Wow! (Indeed, I would love to sit with my son and write "Pong" from scratch. The rule set -- the math -- is so simple. And he would never see the world the same, no matter how many higher-level languages he then moves on to.)

I confess, I have no idea what he's saying here, though I do like the scare-quotes around "scripting". I'm kind of trying to get my head around the idea that you can't program a mathematical algorithm in Python and follow the reasoning of the machine. I mean, there are higher level constructs, but if we're talking about loops and conditionals for 5th grade math problems... I think Python would be pretty easy to follow and would look a lot like the logical structure of the algorithm. Python even has an interactive interpreter so you could type the code in line-by-line if you wanted. That actually could be pretty cool in a learning-math setting. And you could even track it with pencil and paper.
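For instance, here is roughly what the "dot moving by the sheer power of mathematics" might look like in Python. This is just an illustrative sketch (the variable names and the five-column "screen" are made up), but every step can be followed with pencil and paper, one loop iteration at a time.

```python
# A dot bouncing between two walls, one step per loop iteration.
x = 0          # current column of the dot
dx = 1         # direction: +1 moves right, -1 moves left
WIDTH = 5      # columns are numbered 0 through 4

positions = []  # record of where the dot has been, for checking by hand
for step in range(10):
    positions.append(x)
    # Draw one row of the "screen" with the dot in column x.
    print("." * x + "O" + "." * (WIDTH - 1 - x))
    if x + dx < 0 or x + dx >= WIDTH:
        dx = -dx   # bounce off the wall
    x = x + dx
```

Typing those same lines into the interactive interpreter one at a time works too, which is about as close to line-by-line as you could ask for.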

It is true, though, that it was much more conceptually simple to do simple graphics in Applesoft BASIC than in Pascal. That's not the language's fault, and it's not because you write Python in a full text editor. It's because modern programming languages sit on an operating system that mediates access to the drawing controls, and Applesoft BASIC didn't. It wouldn't be hard to come up with a Python package that emulated the draw controls of Applesoft BASIC (which were on the order of "Make that pixel blue. Now make that one red").
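To give a sense of how little such a package would need, here is a toy sketch. The `Screen` class and its method names are hypothetical, only loosely inspired by Applesoft's color-then-plot style, and it "draws" into an in-memory grid printed as text rather than onto a real display.

```python
class Screen:
    """A tiny in-memory grid of colored cells (hypothetical; not a real package)."""

    def __init__(self, width=40, height=40, background="black"):
        self.width = width
        self.height = height
        self.background = background
        self.color = "white"
        self.cells = [[background] * width for _ in range(height)]

    def set_color(self, color):
        """Choose the color used by later plot calls."""
        self.color = color

    def plot(self, x, y):
        """Make that one pixel the current color."""
        self.cells[y][x] = self.color

    def hline(self, x1, x2, y):
        """Plot every pixel from x1 through x2 on row y."""
        for x in range(x1, x2 + 1):
            self.plot(x, y)

    def render(self):
        """Return the grid as text: '.' for background, else the color's first letter."""
        return "\n".join(
            "".join("." if c == self.background else c[0] for c in row)
            for row in self.cells
        )


screen = Screen(width=8, height=3)
screen.set_color("blue")
screen.plot(2, 1)      # make that pixel blue
screen.set_color("red")
screen.plot(5, 1)      # now make that one red
print(screen.render())
```

A real version would put the grid in a window, but the interface a kid types against could stay about this small.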

I think what I object to is the implication that this is somehow a difficult time to be learning to program, that it's harder now to get into programming. That's totally wrong -- it's a fabulous time to be learning to program. Brin says his son is now learning C++, so I'll assume he's interested and motivated. Twenty years ago, yeah, he would have had BASIC. And that's it. Unless you wanted to pay some money. As for seeing any examples of what a real program looked like, forget it.

These days... Well, a Mac OS X box ships with what, a half-dozen or so programming languages right out of the box, with who knows how many more available for free. Want to see basic algorithm code for free? It's there. All kinds of code, complex and simple, are available online. There's a whole industry of programming books, something I would have devoured as a kid. We've traded The One True Teaching Language for many different languages. An elementary school teacher explaining "20 goto 10" is now a publishing empire, plus the internet. Coloring individual dots on a screen is now building a web page, or a web application, or a sprite animation. Even an elementary school child who is motivated can do more and understand more about computers than I would have dreamed in 1985. Have we lost something? Maybe. Have we gained something? Oh yes.

Wednesday, September 13, 2006

Obligatory Apple Post

Since what every tech blog reader needs is another round up of Apple's Showtime event...

Overall, nice incremental stuff, perhaps a little disappointing to those who were expecting a radical new mainline iPod.

New Shuffle: This is getting close to being jewelry, actually is starting to look like a cufflink to me.
New Nano: Smaller, bigger drive, better battery life, colors. Solid incremental upgrade.
New iPod: A very small incremental upgrade. It's very irritating that the new search function is not being backported to existing video iPods.

Here's my bold prediction -- the widescreen, touch panel iPod will never be released as it is currently rumored. My guess is that there will be practical troubles either keeping the screen clean and scratch free when people are putting their grubby mitts right on it and/or having a touch screen wheel UI that is actually usable. More likely the former, but I think the latter might be a problem too. Just wait for me to be wrong!
iTunes UI Enhancements: Mostly quite nice. The album art view and the cover flow view are both very pretty. They don't really mesh with how I use iTunes, but I can see where they will be useful. Points to Apple for actually buying CoverFlow rather than just ripping it off. The library enhancements and the iPod view are nice (although the iPod prefs aren't that much nicer than they were in the Preferences screen). As usual, the UI is in a state of flux; it's a bit more subdued, but some elements (like the tabs in the iPod screen) just look weird.
iTunes Movies: Better seen as the beginning of a long-term strategy than as a goal in itself. Although if I were the manufacturer of a portable DVD player, I'd start getting a little worried. Still, I can't imagine genuinely sitting through a full-length widescreen movie on an iPod screen unless I was trapped on an airplane.
iTunes Games: Kind of underplayed, but in some ways the most interesting potential. If, that is, Apple releases an SDK such that the open source hackers of the world can get their hands on it. That could be very, very interesting... UPDATE: Looks like Apple has "no plans to offer an SDK". Bummer. Developers should be agitating for this.
iTV: I get where they are going with this, and I really want to like it, but absent DVR capability (even if it was on the networked Mac) I really can't see this being a major player.

Python tie-in (what with it being Python Week and all...): nothing really. I do have a pretty cool and obsessive Python script that creates fancy random playlists for iTunes/iPod, including things like, randomly pick albums, randomly play multiple songs in a row from the same artist, play songs in specific genres more often, etc., etc. I'll post it here someday, but the code really needs a good sweep first.

Tuesday, September 12, 2006

Re-refactoring

Here's a little riff inspired by one of the examples in Martin Fowler's book Refactoring, which is another great programming book that deserves an appreciation post one of these days. This was actually also spawned by code that I've read, and I later realized that Fowler did a similar example. Thing is, I don't think Fowler went far enough in this case.

Here's the example. (page 243 for those of you playing the home game). But, since it's Python Week here, I'll translate to Python.

if isSpecialDeal():
   total = price * 0.95
   send()
else:
   total = price * 0.98
   send()

Fowler correctly notices the duplicate call to send(), and refactors to:

if isSpecialDeal():
   total = price * 0.95
else:
   total = price * 0.98
send()

This is fine as far as it goes, but as I see it, there's a second duplication in this snippet -- the formula for calculating the total. I'd rather see something like this (using the new Python 2.5 ternary syntax):

multiplier = (0.95 if isSpecialDeal() else 0.98)
total = price * multiplier
send()

There are a couple of advantages to this last snippet. We've separated the calculation of the total from the act of gathering the data for that calculation. This makes the actual formula for the total clearer, and allows you to easily spawn the multiplier getter off to its own method if it gets more complicated. Plus we've removed more duplication, and I think made the code structure match the logical structure of the calculation a little bit better.
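As a rough sketch of that "spawn it off to its own method" step, here is one way it might look. The names discount_multiplier and order_total are made up for illustration (they are not Fowler's), and the special-deal flag is passed in as an argument so the snippet runs on its own.

```python
def discount_multiplier(is_special_deal):
    """Pick the price multiplier; the conditional logic lives only here."""
    return 0.95 if is_special_deal else 0.98


def order_total(price, is_special_deal):
    """The formula itself is stated once, with no branching."""
    return price * discount_multiplier(is_special_deal)


special = order_total(100.0, True)
regular = order_total(100.0, False)
```

If the discount rules grow (tiers, coupons, whatever), only discount_multiplier has to change; the total formula stays a one-liner.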

This is a simple example, and you could quibble with it. The general idea of separating conditional logic from calculations is a solid way to clean up code and make it easier to maintain in the long run.

Before I leave... I'm not sold on the syntax for the Python ternary yet. I'm told that the syntax was chosen over the perhaps more consistent if cond then x else y end because it was felt that in most use cases you'd have a clear preferred choice and a clear alternate choice, and putting the preferred choice before the conditional emphasized that. I don't know if that matches how I'd use a ternary. Although I guess it's reminiscent of listcomp syntax. I need to use it in real programs to know for sure.

Monday, September 11, 2006

Some 411 of my own

Saturday, Robin and I had the pleasure of being interviewed by Ron Stephens for the excellent Python 411 podcast. I think this was the first time I've ever been interviewed for anything, and while it's always fun to talk about Python, the book, and me (not necessarily in that order), it does take some getting used to.

Anyway, I do mention this here blog during the interview, and while I don't want to talk about the actual interview in detail until I hear the edited version, it did occur to me that I might want to have some actual Python content on board in case anybody comes by to check the place out.

Python content all week, then, starting with today's Things I Love About Python:

Whitespace. I know that I said just a few short days ago that I wasn't going to redefend Python's whitespace blocks. That was then, this is now. Now, I'm just going to gush over them. I love using whitespace to mark blocks. It enforces what I'd be doing anyway. It encourages consistent style, with the result that other people's Python code is actually intelligible. It encourages short methods and shallow nesting, both good habits, and it lets you get about 10-25% more code on a page. Nobody is ever going to have an Obfuscated Python contest (okay, I looked it up... somebody has, but they realize it's a joke).
List Comprehensions. One of my favorite syntax features in any language. So concise and yet so clear... Try to describe the following any more clearly in any language, programming or not.

[x.name for x in students if x.grade > 90]

Okay, they do sometimes blow up if you make them too complicated.
First Class Functions. It's easier to pass around named function objects in Python than in just about any language not named Lisp. This is a very good thing. It enables all kinds of elegant abstractions (especially since classes and instances can all be made callable). Over time, using Python has made all my coding move to a more functional style that's easier to test, verify, and maintain.
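To make the last two items concrete, here is a small runnable sketch. The Student record and the helper names (names_where, is_struggling) are hypothetical, invented only so the comprehension above has something to chew on.

```python
from collections import namedtuple

# Hypothetical record type, just enough to make the comprehension run.
Student = namedtuple("Student", ["name", "grade"])

students = [
    Student("Ann", 95),
    Student("Bob", 82),
    Student("Cy", 91),
]

# The comprehension from above, unchanged.
honor_roll = [x.name for x in students if x.grade > 90]


# First-class functions: the test is passed around like any other value.
def names_where(predicate, records):
    return [r.name for r in records if predicate(r)]


def is_struggling(student):
    return student.grade < 85


needs_help = names_where(is_struggling, students)
```

Swapping in a different named function changes the question being asked without touching names_where at all.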

Of course, not everything in Python needs to be elegant and abstracted. Last night I had a problem. I wanted to download all episodes of a popular podcast that does not have an easily accessible archive page. Rather than walk through months of postings, I decided to write a script that would take advantage of the page's naming conventions, loop to find the shows for given days, find the downloadable URL and download, then add to iTunes. Final code, just under 60 lines. Elapsed time, under 45 minutes start to first download, including downloading, installing, and using a new library (Beautiful Soup, which is a nice HTML parser). The point is not that I'm particularly good at this (the script is a little sloppy and doesn't handle error conditions well), but that Python is particularly good at this. Plus, it was fun -- no fighting with compilers and interpreters, able to find support for the libraries when I needed it.
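The script itself isn't shown here, but the "loop over days using the naming convention" part might be sketched like this. The URL pattern is purely hypothetical (example.com stands in for the real site), and the fetching, HTML parsing, and iTunes steps are left out entirely.

```python
from datetime import date, timedelta

# Hypothetical URL pattern; the real site's naming convention isn't shown here.
PAGE_PATTERN = "http://example.com/shows/{:%Y/%m/%d}.html"


def candidate_pages(start, end):
    """Yield one guessed page URL per day from start through end, inclusive."""
    day = start
    while day <= end:
        yield PAGE_PATTERN.format(day)
        day += timedelta(days=1)


pages = list(candidate_pages(date(2006, 9, 1), date(2006, 9, 3)))
```

Each candidate page would then be fetched and handed to an HTML parser to dig out the actual download link.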

Saturday, September 09, 2006

Hybrids In Bloom

A couple of big stories in the wide world of scripting languages running on virtual machine platforms.

IronPython, the .NET implementation of Python created by Jython creator Jim Hugunin, released version 1.0.
The two primary developers of the JRuby project, implementing a Java-based Ruby interpreter, were hired by Sun with the mandate to bring JRuby to 1.0.

Unsurprisingly, I think this is all great. Programming hybrids are a beautiful thing. The more tools the merrier, and the more ways to combine the best parts of different tools, the merrier squared.

The amazing thing about IronPython is that it benchmarks as being faster than traditional CPython, which seems sort of counter-intuitive. (I'm assuming, based on nothing at all, that there's a higher memory load, but if I'm wrong, I'm sure somebody will point that out).

One of the interesting things about JRuby is a certain shift in momentum. When Jim Hugunin created JPython, the primary goal was to be able to use existing Java libraries with Python syntax. The JRuby team (and by extension, Sun), in contrast, seem to be comparatively more interested in using existing Ruby libraries (like Campfire and Rails) on a JVM backdrop than in using existing Java classes.

For instance, JRuby does not seem to have an analogue to the Jython shortcut of converting attribute assignment in Jython (foo.bar = 3) to a Java bean setter (foo.setBar(3)), which makes Java classes feel more Pythonic. (Again, correct me if wrong, the existing tutorials don't touch on this point, and I'm basing this on a possibly out-of-date article). (UPDATE: Somebody did correct me -- JRuby does have this style. I wonder if it also works on constructors the way Jython does. So the point below is somewhat invalidated.)
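For anyone who hasn't seen the bean shortcut, here is a pure-Python emulation of the idea. This is only a sketch of the behavior, not how Jython implements it; BeanProxy and Foo are made-up names, with Foo standing in for a Java bean.

```python
class BeanProxy(object):
    """Wrap an object so `proxy.bar = 3` calls `target.setBar(3)`."""

    def __init__(self, target):
        # Bypass our own __setattr__ while storing the wrapped object.
        object.__setattr__(self, "_target", target)

    def __setattr__(self, name, value):
        setter = getattr(self._target, "set" + name[0].upper() + name[1:])
        setter(value)

    def __getattr__(self, name):
        # Only called when normal lookup fails, e.g. for `proxy.bar`.
        getter = getattr(self._target, "get" + name[0].upper() + name[1:])
        return getter()


class Foo(object):
    """Stand-in for a Java bean with conventional accessors."""

    def __init__(self):
        self._bar = 0

    def setBar(self, value):
        self._bar = value

    def getBar(self):
        return self._bar


foo = Foo()
proxy = BeanProxy(foo)
proxy.bar = 3  # routed to foo.setBar(3)
```

The caller writes plain attribute access, and the accessor methods are still what actually run underneath.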

And I don't mean this as a good/bad thing, either -- it's perfectly all right for the different tools to have different priorities. It's fascinating that Ruby is now seen as bringing a host of useful tools to the Java platform, in a way that I think we would have been laughed at a few years ago for suggesting that strongly about Jython.

Good luck and more power to everyone, I can't wait for Java on Rails...

Thursday, September 07, 2006

I/O, I/O, It's Off To Work I Go

Welcome to our program, Things I Agree With Totally And Wish I Had Said First. Our hero tonight is Tim Ottinger with his hit, "Frameworks are for the Impatient". It seems Ottinger is puzzled by a library he's trying to use.

Look, this framework is not the game Myst. I did not install this thing so that I could amuse myself for days by running around the file system trying to figure out what it is about...

In the Java world... [E]ven opening and reading a file is cause to go google the library one more time. Heaven forbid you have to manipulate dates or the like. These are small things, and should be very easy.... You shouldn't have to crack open a half-dozen US$50.00 books.

One point of "obvious" is probably worth one hundred points of "clever".

In the immortal words of Arnold Horshack, "Right you are, Mr. Kotter". Hmm. That sounds better if you imagine it in a Horshack voice. Doesn't look like much in print...

The Java io library is a particular nemesis of mine. I've been using it for what, just under a decade now, and the only way I was able to stop looking up the API every other week was to write a utility class that had an API that was actually useful (you know, obscure methods like copyFile, readlines, writeToFile...).
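For contrast, and in Python rather than Java, here is roughly how little code that kind of helper API takes. The function names just mirror the ones mentioned above; this is an illustrative sketch, not the actual utility class.

```python
import os
import shutil
import tempfile


def write_to_file(path, text):
    """Replace the contents of path with text."""
    with open(path, "w") as handle:
        handle.write(text)


def read_lines(path):
    """Return the lines of a text file without trailing newlines."""
    with open(path) as handle:
        return [line.rstrip("\n") for line in handle]


def copy_file(source, destination):
    """Copy a file's contents; shutil does the real work."""
    shutil.copyfile(source, destination)


# Exercise the helpers in a throwaway directory.
workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "notes.txt")
duplicate = os.path.join(workdir, "copy.txt")
write_to_file(original, "one\ntwo\n")
copy_file(original, duplicate)
copied_lines = read_lines(duplicate)
shutil.rmtree(workdir)
```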

The io library may actually be the purest example of the Java school of OO design, marked by principles like:

Use an abstraction (streams) that looks interesting on paper, but that nobody ever uses. Ignore the abstraction (files) that is already established. When that doesn't work, add a completely new abstraction (readers/writers) that's completely non-interoperable with the first one.
Make sure the user has to type a lot of words to get anything done.
Make it just as easy to do obscure, complicated things as it is to do typical things, even if this means making it harder to do normal tasks. Doesn't everybody want to randomly access binary files just as often as write some text? (Actually, there are obscure things in the API that are much easier than say, iterating over the lines in a text file).

Don't even get me started on the nio package. Really, don't. I couldn't explain it on a bet.

Sigh. Java is too easy a target sometimes.

Wednesday, September 06, 2006

Fonts

I'm curious -- how do you set up your screen in your text editor when you are programming?

Based on people I've worked with, I seem to do two things in my setup that are unusual. I use fairly large fonts (16-18 point, if I can) and I'm aggressive about cutting off lines at 80 characters. The upshot is that I'm showing less text on the screen at a time than most programmers I know.

Good? Bad? Not sure. The 80 character habit came from I think a combination of Code Complete where it's recommended on the (outdated now) premise that that's as many characters as you could print on a line, plus doing the books, where you generally do have to cut the examples off at 72 or 80 characters, plus some nasty HTML bugs where somebody tried to do a whole table in one line, and there was a missing </td> in column 436 or something. I'll stretch beyond 80 characters, but I really don't like having code hang that I have to scroll right to see.

The bigger font I think is more of an aesthetic choice (and not, say, a vision issue). I do like that it tends to focus me on one method at a time, and encourages me to keep methods shorter to stay on one screen.

A couple of years ago, I did a little internet search on fonts for programming. Among other things, I found there are people who really like programming with a 7 or 9 point font. Anyway, I picked Bitstream Vera Sans Mono (and also the open source twin DejaVu Sans Mono) because it is a bit heavier weight than Courier, a lot prettier to look at, and it's specifically designed to differentiate between similar characters.

Monday, September 04, 2006

Web Apps and Language Wars

I wasn't planning on posting about either web apps or linking to Joel Spolsky again, but this language wars post is just too interesting to pass up. Besides, a jillion people have already commented on this, so what's a jillion and one?

Spolsky is riffing on what language or platform you should use for an enterprise web project. He makes a few points (note, I'm paraphrasing him here -- these are his points, not mine):

There are 3 1/2 platforms that are proven to work in the enterprise web app space (Java, C#, PHP, and maybe Python).

Within that group, there's no difference large enough to offset expertise, so pick the one that you know the best.
Rails is not part of that group. Even though it's fun.

I suspect you know which one of those three points has gotten the most attention. Obviously the Rails people are ticked off, which I think is a combination of Spolsky taking his point too far, and Rails partisans taking his point even farther.

Look, I love Rails as much as a person can love a framework. I wish I had been smart enough to put all the pieces together myself (another post for another time...). My Rails experience has been uniformly positive. Nevertheless, if I wanted to pitch Rails for a mission-critical enterprise application, I would expect to have to justify the choice. Using Rails is still a risk relative to the others: it's still newer, people are still trying to work out optimal deployment, and it still doesn't have the library support the others have. Where I would differ from what Spolsky is saying is that I think it might be a justifiable risk even in a mission-critical enterprise application.

Scaling and library support are not the only sources of risk. There's also the risk that your code will get bogged down in a huge ball of intertangled display and logic code (PHP). Or the risk that your developer time will be slowed down enough that it delays deployment (Java). Or the risk of deploying in a system that is owned by Microsoft (guess...). Choosing one of the "nobody ever got fired for choosing X" languages is a safer choice. Which doesn't always make it the best choice.

(And yes, I know that Spolsky ends his essay by mentioning that one of his apps is written in a custom in-house version of VBScript. Red herring. He's not saying that Java, C#, and PHP are the only languages to use ever, just that they are the only languages that currently have the ecological support to be guaranteed safe in a "death before failure" scenario.)

I'd argue the following corollary: I agree that, all else being equal, expertise trumps any difference between these platforms. That's a little circular, of course, because how will you get experience without using a tool? (I know about apprenticeship as a junior member on a larger project, but it's not always feasible.) Almost every project or team spins off low-level applications -- bug trackers, vacation trackers, internal chat rooms. Things that are not high-priority, but are still useful. So, when putting those together, I think it's a good idea to range far and wide and try new things that might pay off in a future project (I almost wrote that you "have the right, no, the duty" to do that, but I thought that might be a little over the top). Me, I'm going to try out Python/Django next chance I get...

Friday, September 01, 2006

Java Closures

Here's a nice item being proposed for Java 1.7: closures in Java. On behalf of all those people who actually do create entire classes just to be able to use map and other functional styles in Java, may I say, please, please, please put this in Java. (This seems a good place to link to Joel Spolsky's wonderful programming fable "Can Your Programming Language Do This").

The proposed syntax looks like this:

public static void main(String[] args) {
  int plus2(int x) { return x+2; }
  int(int) plus2b = plus2;
  System.out.println(plus2b(2));
}

Line one of that syntax creates a closure object that takes and returns an int using what is basically Java method syntax. Line two assigns that closure to another variable using the syntax int(int) to specify the types of the signature. Line three shows that you can call the closure object as you'd expect, although notice that, unlike most Java calls, there's no receiver object specified, and it's not using an implicit this -- it's purely a function.

The proposal also specifies an alternate syntax for creating short closure objects -- I don't like this one as much:

int(int) plus2b = (int x) : x+2;

This is all nice, and I know I'd use it pretty much daily. Unfortunately, though, I wonder if the strict typing will wind up making the closures less useful than, say, Ruby blocks. I assume there'd be some way to tie this into the generics system so that methods that might take blocks with different type signatures would be able to convince the compiler that everything is okay. Let's see... if I wanted to do a new method of List collect, it would be something like this.

public List<V> collect(V(T) closure) {
    List<V> result = new ArrayList<V>();
    for (T obj : iterator()) {
        result.add(closure(obj));
    }
    return result;
}

int plus2(int x) { return x+2; }
List fred = list.collect(plus2);

Is that right? If so, that's certainly a lot better than we have now.
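For comparison, here is the same collect idea in Python, where no type signatures are needed at all. This is only a sketch: make_adder is a made-up helper added to show a function that really does close over a variable, which the plus2 example doesn't.

```python
def collect(items, closure):
    """Apply closure to each item and gather the results in a new list."""
    result = []
    for obj in items:
        result.append(closure(obj))
    return result


def plus2(x):
    return x + 2


def make_adder(n):
    # A true closure: the inner function captures n from this scope.
    def add(x):
        return x + n
    return add


fred = collect([1, 2, 3], plus2)
barney = collect([1, 2, 3], make_adder(10))
```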

I have three quibbles and an enhancement.

Quibble 1: Like generics, what looks nice for int(int) is going to look a lot less pleasant when the signature is, say, OrderLineItem(Order, Product) or even better List<List<OrderLineItem>>(List<Order>, List<Customer>, List<Product>), which I could easily see as a real world case.

Quibble 2: To do this right would require including support for closures up and down the standard library -- all through the util classes, all through Swing, JDBC -- there's all sorts of places in the library that would be cleaned up by being able to take closures. I suspect that's unlikely to happen quickly.

Quibble 3: The proposal says "We are also experimenting with generalizing this to support an invocation syntax that interleaves parts of the method name and its arguments, which would allow more general user-defined control structures that look like if, if-else, do-while, and so on." I'm thinking this is more of a Smalltalk or Objective-C style? That would look odd within Java.

What I really want, though, is a method literal analogous to a class literal. Something like...

Integer(MyClass) closure = MyClass.someSillyThing.method;
MyClass obj = new MyClass();
Integer x = obj.closure(3);

or even better:

MyClass obj = new MyClass();
Integer(MyClass) closure = obj.someSillyThing.method;
Integer x = closure(3);

And yes, I realize that's basically Python bound and unbound methods. The tricky part in Java is that there might be more than one overloaded method called someSillyThing, and so I'm assuming that whatever closure object I'm creating would be able to get the right one based on the declared type (or, alternately, I suppose, dispatch properly when called). That should be doable, though. And then my Java code can look even more like Python...
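Here is what the Python side of that comparison looks like, as a minimal sketch with a made-up MyClass. One caveat: in Python 2 the class-level lookup gives an unbound method object, while in Python 3 it is just a plain function; the calls below behave the same either way.

```python
class MyClass(object):
    def some_silly_thing(self, n):
        return n * 2


# "Unbound": looked up on the class, so the instance is passed explicitly.
unbound = MyClass.some_silly_thing
obj = MyClass()
x = unbound(obj, 3)

# Bound: looked up on the instance, which is remembered as the receiver.
bound = obj.some_silly_thing
y = bound(3)
```

Since Python has no overloading, there is only ever one some_silly_thing to find, which is exactly the part that gets tricky in Java.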

Good stuff. I hope something like this gets in.

Thursday, August 31, 2006

Languages I Use

Continuing in the getting to know you kind of vein, I thought I'd ground some of what I say by talking about the three programming languages that have made up the bulk of my professional and hobby work for the past five years or so -- Java, Python, and Ruby.

Java: I've been programming Java since either just before or just after the 1.0 release... can't quite remember at this point. I think I've covered most of the major Java libraries (although I've done very little EJB work). Basically, Java is the station wagon of programming languages. It's not elegant or efficient, but it gets you where you're going and you won't offend anybody along the way.

Far and away the best feature of working in Java is the tool support, especially the IDE support. If you're doing something in Java, odds are somebody has done it before and there's an open source .jar file somewhere that will help. Plus, IntelliJ IDEA makes working in Java almost as productive on a time-basis as working in a scripting language.

Java is, of course, famously verbose, and there's the constant sense of telling the compiler things that the compiler should already know. (The 1.5 language features improve this somewhat, at the cost of moving some of the verbosity elsewhere). Java's original design goal was basically to get C programmers to do object-oriented stuff without scaring them away, and at that it's a success, but that does lead to oddities like having both basic types and object wrappers for them, and keeping the ridiculous C-style switch statement.

There's also the "Java style" of OO design, enshrined early on by Sun, and followed by many third-party libraries. To oversimplify, there's a lot of design made complicated by the desire to make less common tasks as privileged in the API as more common ones. For example, the need to spell out what are basically boilerplate properties in a web.xml file or a Struts config file. Swing has several features like this, including the event system and say, supporting multiple listeners for a button click.

Python: With Python I suppose you have to start with the whitespace thing, although I know that anybody who actually works in Python is sick of hearing about it. I wrote what I hope was a spirited defense of it in the Jython book, and I'm not going to repeat myself. What I like most about working in Python is the conceptual consistency -- objects are like classes are like modules are like dictionaries. I find that to be a very powerful equation, and it makes Python code more predictable to me. I also think the syntax is very clear and readable. (I particularly like the list comprehension syntax.)
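A quick sketch of what that consistency looks like in practice. The Point class and the throwaway settings module are invented for the example; the point is only that an instance, a class, and a module all expose their attributes as a dictionary-like mapping.

```python
import types


class Point(object):
    dimensions = 2  # class attribute

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# A throwaway module object, so the example needs no extra files.
settings = types.ModuleType("settings")
settings.debug = True

# Instance, class, and module each expose their attributes as a mapping.
instance_namespace = vars(p)
class_namespace = vars(Point)
module_namespace = vars(settings)
```

Attribute access and dictionary lookup are two views of the same data: p.x and instance_namespace["x"] name the same value.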

On the down side, although there probably are more Python libraries than Ruby ones floating around (although I'm not as sure of that as I was even a year ago), there's no central repository so they are harder to find. It is true, though, that Python style quirks are more likely to bleed into my programming in other languages than vice-versa -- I write a lot of Java code that really looks like it wants to be written in Python.

Ruby: I actually got into Python, Ruby, and XP at about the same time. I started on Ruby because a number of the early XP gurus were excited about it. And while I know I'm supposed to pick a side or something, I actually like Python and Ruby both quite a bit. There are some particular bits of Ruby syntax that I think are particularly well done, for example the way that accessors are handled. I even like that you can leave parentheses off method calls if the line is unambiguous, although a little of that can go a long way. Blocks make the language very flexible, and very easy to build non-redundant code in.

There are a couple of pieces of Ruby syntax that make me nervous, like the syntactic sugar for hashes as the last argument of a method or the way you don't have to specify that a block is needed in the signature of a method. To be fair, I haven't experienced practical problems with these features yet. Ruby has a lot of syntax sweetener compared to Python, which is sometimes good (elegant Ruby code is very elegant) and sometimes bad (I've had some trouble following Ruby examples if they are very magical). Because Ruby has been the focus of a lot of XP/Agile writers, the testing tools and general XPish support is very good. For a long time, I thought that general library support lagged Python, but that's becoming less true daily. And of course there is also Rails, about which more at another time.

Sunday, August 27, 2006

Code Complete: An Appreciation

It's been about 25 years since I first typed 10 PRINT "HELLO", and in that time I've read dozens of books aimed at making me better at creating software. There are several things I want to do with this site, but certainly one of them is to recognize those books that had a particularly strong impact on my professional career.

The first one is Code Complete, by Steve McConnell. It stands out on the shelf because it's not about learning a new language, tool, or discipline, and it's not a big picture rethinking of software engineering itself. Instead, it's a series of presentations of empirical data about specific features of the coding process, as well as very specific examples of how to generate elegant, readable code.

I read this book the summer before I entered a graduate program in Computer Science, in a defensive panic that I didn't know enough actual programming. As soon as I finished it, I started rewriting my existing programs to align with McConnell's suggestions, and I've never really stopped. In particular, the sections on control structures and layout paint a clear picture of what maintainable software looks like in practice, simply by setting out the principles and demonstrating them example by example.

The book's influence is all the more amazing because its examples are in languages (C, Basic, and Pascal) that have not been useful to me professionally. (I had stopped using Pascal by the time I read it, wrote a little Visual Basic since, and probably an even smaller amount of C). The basic ideas, though, are adaptable to any declarative language. There's a second edition, dated to 2004, that I haven't read, but which I understand updates the data and examples in the book (the examples now include Java and C#). (Come to think of it, I probably should check it out...).

It's rather amazing to me that so many of the Amazon reviews of this book still, nearly fifteen years after its original publication, say that there's no other book that covers this kind of ground: the kind of pinhole cleanup of code that is so much of the difference between a great program and a mess. I actually think there are one or two other books that cover similar ground, but it's clear to me that this is a gap in the kind of knowledge about programming that is shared. This book will make you a better programmer.

Saturday, August 26, 2006

Occasionally Asked Questions

I wouldn't say it happens often, but I do sometimes get asked some questions about being a technical author. Seemed like a good place to start.

For a long time, the most common question was Did you pick the animal on the cover of the Jython book? The answer is no. The cover animals are picked by the O'Reilly production team, and the mechanism they use for assigning animals to books is somewhat mystical. I think we could have rejected it had we had a really strong reason (I know of at least one other book that has). For the wxPython book, Manning offered us a selection of a few different art figures, and we also picked the color of the spine.

The other most common question is something like Can you make a living at this? or more generally, how publishing finances work. I'm not a complete expert, but I suspect my experiences generalize. Tech books are generally sold on the basis of a proposal (in contrast to fiction novels from new authors, which are usually not sold until the book is complete). Publishers generally describe their proposal formats on their websites -- I can't talk as much about that part of the process because I came in after the proposal phase in both cases.

The contract specifies payment as an advance and a royalty rate. The advance is paid up front in stages as the book is completed. There's room for negotiation on this, but it's typically something like 1/3 on signing, 1/3 at the halfway point, and 1/3 when the final manuscript is approved. My sense is that newbie authors can expect an advance in the mid to high four digit range. The royalty rate is the share of each book's sale that goes to the author (the amount is based on what the publisher is paid by the store, not on the cover price of the book). However, the amount of the advance is subtracted from the royalties -- the author does not see additional payment until the total royalty amount exceeds the initial advance. At this point, the book is said to have "earned out". In case you are wondering, the author does not have to return the advance if the book never earns out -- the advance is a gamble by the publisher. A typical royalty rate is about 10%, but some publishers (notably Pragmatic) offer more. If there is more than one author, then the authors decide how the money will be split among them, and that split is also enshrined in the contract.

Oh, and if you have an agent, then the agent typically takes 15 percent off the top. Typically, they earn it, too, either by getting the contract in the first place, or by dealing with the publisher when you don't want to.
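If the arithmetic is easier to follow as code, here is a rough sketch. Every number in it is hypothetical (not from either of my contracts), author_payout is a made-up name, and it assumes the agent's share applies to everything the author is paid.

```python
def author_payout(copies_sold, net_per_copy, royalty_rate, advance, agent_cut=0.0):
    """Return (royalties_earned, extra_beyond_advance, author_take_home).

    All figures are illustrative; real contracts vary.
    """
    royalties = copies_sold * net_per_copy * royalty_rate
    # Nothing further is paid until royalties exceed the advance ("earning
    # out"), and the advance is never returned if they fall short.
    extra = max(0.0, royalties - advance)
    gross = advance + extra
    take_home = gross * (1.0 - agent_cut)
    return royalties, extra, take_home


# Hypothetical: 8000 copies, $15 net per copy, 10% royalty, $7500 advance.
royalties, extra, take_home = author_payout(
    8000, 15.0, 0.10, 7500.0, agent_cut=0.15
)

# Same book selling only 3000 copies: it never earns out.
flop_royalties, flop_extra, flop_take_home = author_payout(
    3000, 15.0, 0.10, 7500.0, agent_cut=0.15
)
```

In the second case the author still keeps the full advance (less the agent's share), which is the gamble the publisher takes.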

Tim O'Reilly said on his blog some time ago that the typical O'Reilly book earns about $15,000 for its author. In my case, however, we earned less than that, since the Jython book has yet to earn out. It will probably never earn out -- in fact, based on the figures O'Reilly gave in his post, it's probably among the lowest selling O'Reilly books ever (at around 6000 copies or so), so I've got that going for me. I haven't gotten sales figures on the wxPython book yet, but its Amazon ranking has been pretty good, so I'm hopeful.

I've just started writing some articles for sites like IBM developerWorks, which is more lucrative on a per-word basis, but I'm not planning on quitting my day job anytime soon.

That'll do for now. As I think of some other questions of interest, I'll post them here.