Recently in Programming Category

The Case Against TDD

I have been following TDD since the days of extreme programming, as I am always interested in better ways of doing software development. At first blush, test-driven development sounds like manna from heaven: it embraces YAGNI as a guiding principle and focuses on comprehensive unit testing.

Far too much development is anticipatory, with overengineering the usual result. Faced with huge, complicated code bases (or the prospect of them when commencing a new project) it is tempting to adopt TDD as a check against that tendency. In my experience, though, the overengineering mindset does not go away under TDD; it just transforms into a zealotry towards code coverage.

And that is my biggest gripe with TDD: creating unit tests feels like development and looks like development but it isn't development. It is very easy to get into the business of producing unit tests rather than releasing product. (I say this as someone who presented a class on unit testing at a company conference in 2008. I did a pragmatic style of TDD for approximately 2 years around that presentation.)

Assuming then that you can rein in that tendency to strive for 100% code coverage, the next biggest problem with TDD is that the biggest part of an application lives on the client—mostly outside the reach of the xUnit systems. Obviously there are ways to attack client-side unit testing but they are all inherently brittle due to cross-browser issues and sometimes frequent changes to user interfaces. As Web applications become increasingly client-based, more and more development is outside the scope of TDD. (Or within scope but at considerable cost and effort.)

I have become disenchanted with TDD and its ilk over the years for these reasons. In my opinion, the best way to improve quality is to build time for developer testing into the estimates and hold developers accountable for quality product. Emphasizing cross-browser and functional testing at the developer level before a handoff to QA has resulted in better quality in my experience than a doctrinaire emphasis on unit testing and code coverage. (Accountability is another subject entirely and I have a lot of thoughts surrounding that as well, which I'll share someday. Read Codermetrics by Jonathan Alexander for more along those lines.)

Dabbling with Linux

I've finally decided to take the plunge: I'm installing a Linux distro as a VM on my Mac.

I have resisted doing this for years and years and years. I've long thought that going Linux just meant that you're doomed to perennial tweaking and figuring out incompatible drivers. I don't give a rip about either of the "free as in's" when it comes to operating systems—I'm an unabashed Mac user, I pay for all of my software, and my programmatic life is completely Windows-based.

So why am I doing this? And why now?

Python.

I have been reading some really intriguing books on data analysis, social networking, and monitoring and all of the examples are in Python. I've always been tempted by Python the language and Python the community, and I've even made minor forays into that world. I know that Mac OS X is a great platform for Python but I have zero familiarity with Linux.

In the end, if I make anything significant, I'm going to want to host it on Linux, so why not start now? With a virtual machine, I can duplicate my final environment without polluting my Mac or worrying about the differences between the two. I initially looked at Ubuntu but I think it's really more of a consumer-grade distro whereas I want a raw server.

A colleague at work said that he uses CentOS; I figured that's as good as any and he certainly knows more than I do. So I downloaded CentOS 6.0 minimal and I'll see how it goes.

Amen, Joel Spolsky!

I couldn't agree more:

I don't understand why this "leaving the industry" thought process is so predominant in this little corner of the universe.

This is a TERRIBLE time to leave the industry. I don't know if you've noticed, but there are half a million NEW unemployed people JUST THIS MONTH.

Although the tech industry is not immune, programming jobs are not really being impacted. Yes, there are fewer openings, but there are still openings (see my job board for evidence). I still haven't met a great programmer who doesn't have a job. I still can't fill all the openings at my company.

Our pay is great. There's no other career except Wall Street that regularly pays kids $75,000 right out of school, and where so many people make six figures salaries for long careers with just a bachelors degree. There's no other career where you come to work every day and get to invent, design, and engineer the way the future will work.

Despite the occasional idiot bosses and workplaces that forbid you from putting up dilbert cartoons on your cubicle walls, there's no other industry where workers are treated so well. Jesus you're spoiled, people. Do you know how many people in America go to jobs where you need permission to go to the bathroom?

Stop the whining, already. Programming is a fantastic career. Most programmers would love to do it even if they didn't get paid. How many people get to do what they love and get paid for it? 2%? 5%?

SQLcached

I have to agree with Dare Obasanjo's latest blog entry about in-memory caching. After working on high-transaction, heavy database-using Web applications for the last nine years, there is one thing above all else that I have learned and taken to heart: a Web application is only as good as its caching strategy. My career has seen a progression from light to heavy cache usage and each new application has benefitted in scalability from that.

Dare's entry got me thinking: why couldn't the RDBMS itself incorporate a distributed, in-memory cache like memcached or Project Velocity? What if a Web application could basically eliminate the need for its own caching layer by relying solely on the database, which would then aggressively and algorithmically use one of the caching services to expand its memory-based caching?

If the problem with query caching in MySQL or SQL Server is the amount of server RAM that can be installed, then distributed caching seems like the perfect solution. It's what the Web server layer uses: why not bring it down to the data layer? Moreover, given the common replication and clustering scenarios, there are likely idle database servers whose memory is already going unused for the most part. Putting a distributed caching system in place would put them in action while still keeping them ready for failovers.
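To make the idea a little more concrete, here is a minimal read-through sketch in JavaScript. Everything in it is an assumption for illustration: `ReadThroughStore` and `runQuery` are hypothetical names, and the in-process `Map` merely stands in for a distributed cache such as memcached. It shows only the shape of a data layer that answers from memory when it can, not how any real RDBMS does or would implement this.

```javascript
// Hypothetical sketch: a data layer that consults a cache before running a query.
// The Map stands in for a distributed cache; runQuery stands in for the real database call.
class ReadThroughStore {
  constructor(runQuery) {
    this.runQuery = runQuery; // function (sql) => rows; assumed to be expensive
    this.cache = new Map();   // placeholder for memcached or a similar service
    this.hits = 0;
    this.misses = 0;
  }

  query(sql) {
    if (this.cache.has(sql)) {
      this.hits++;
      return this.cache.get(sql);
    }
    this.misses++;
    const rows = this.runQuery(sql);
    this.cache.set(sql, rows);
    return rows;
  }

  // A write must evict anything it could have made stale; this sketch simply clears everything.
  invalidateAll() {
    this.cache.clear();
  }
}

let dbCalls = 0;
const store = new ReadThroughStore((sql) => {
  dbCalls++;
  return [{ sql, fetchedBy: "database" }];
});

store.query("SELECT * FROM entries");
store.query("SELECT * FROM entries"); // served from the cache
console.log(dbCalls, store.hits, store.misses);
```

The hard part that the sketch waves away is invalidation: deciding which cached results a given write makes stale is exactly the work the database would have to take on.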

The main objections I can see are that going to the database might increase network usage, since some cache calls in the Web server layer would never leave the server, and that the database would have to work to decide between file-level and cache-level access. But that would be minimal and the simplification it would engender on the Web application level would make the costs even less objectionable.

It's entirely possible that Project Velocity is being undertaken with exactly this thought in mind. (It's not clear that there's any movement afoot in MySQL AB towards this end—at least from my cursory searches.) This idea would have to be implemented at the RDBMS level.

Bravo! Bravo!

"All I ask is that you restrict them to 'layout-only' check-ins. In other words, if you want to do some source code reformatting and change some code, please split it up into two check-ins, one that does the reformatting and the other that changes the code." – Raymond Chen, "If you're going to reformat source code, please don't do anything else at the same time"

jQuery On the Rise

Wow. Microsoft will bundle the jQuery library with all ASP.NET MVC projects, enhance its interaction with Visual Studio through IntelliSense, use it in future ASP.NET Ajax controls, and provide support for the library itself. This is huge because most of the Microsoft universe doesn't take note of anything not provided by Microsoft in a standard distribution.

A Hard Decision

I've been a consumer of the Facebook.NET framework for a few months now as I've developed several Facebook applications in ASP.NET. I chose that framework instead of the official Microsoft one because it seemed more logical and straightforward.

Honestly, I'm not at all certain why I originally chose one over the other. I read all of the blog commentary about each, I looked over the source, and I checked out the sample applications included. Facebook.NET struck me as elegantly designed, well conceived, and actively developed. Sure, Microsoft had commissioned and paid for the development and maintenance of the Facebook Developer Toolkit, which meant it was more likely to be around in the future. This possible objection was easily dismissed since Facebook.NET was open source and could be extended privately as long as need be.

What I couldn't have foreseen were sweeping changes by Facebook to the underlying API within six months and a complete abandonment of the open-source project by its sole maintainer. Facebook has made a lot of mistakes in handling the transition but there's precious little that I, as a third-party application developer, can do about that. So my sole responsibility is to keep up with updates to the framework and alter my code to accommodate the new (or changed) functionality.

Faced with a framework that isn't getting updates, the responsibility expands considerably. One must either abandon the abandoned framework to search for greener pastures or one must take up the mantle of leadership by forking the project. Neither is a path to be chosen lightly for each entails considerable pain.

The choice was made easier for me by the fact that the Facebook Developer Toolkit was just as inactive at the time. I tried corresponding with the Facebook.NET maintainer and even succeeded a couple times: I would much rather have been a developer on a project than the man responsible. In the end, it became clear that the maintainer had moved on to other projects and that I was going to have to fork.

The result is fb.net. I largely brought it up to parity with the API changes in a span of two days but then I got distracted by work, family, and other projects myself. As it stands, there's just a little more to go and then I can make a release candidate.

My only hope is that I can get this framework ready for a full release and then start looking to build a community that can assist in its maintenance. The Facebook.NET maintainer got it off to a good start; now it's my turn to finish the job.

That's Another Way to Go About It

I generally don't like to showcase others' inelegant code, but this couldn't be more timely given yesterday's Twitter created_at parsing tip. There's always more than one way to do something, I guess.

private static DateTime ParseDateString(string DateString)
{
    // Named groups pull apart a value such as "Thu Sep 04 11:09:28 +0000 2008".
    Regex re = new Regex(
        @"(?<DayName>[^ ]+) (?<MonthName>[^ ]+) (?<Day>[^ ]{1,2}) " +
        @"(?<Hour>[0-9]{1,2}):(?<Minute>[0-9]{1,2}):(?<Second>[0-9]{1,2}) " +
        @"(?<TimeZone>[+-][0-9]{4}) (?<Year>[0-9]{4})");
    Match CreatedAt = re.Match(DateString);

    // Reassemble the pieces into a string that DateTime.Parse accepts.
    // The captured TimeZone group is never used, so the offset is dropped.
    DateTime parsedDate = DateTime.Parse(
        string.Format(
            "{0} {1} {2} {3}:{4}:{5}",
            CreatedAt.Groups["MonthName"].Value,
            CreatedAt.Groups["Day"].Value,
            CreatedAt.Groups["Year"].Value,
            CreatedAt.Groups["Hour"].Value,
            CreatedAt.Groups["Minute"].Value,
            CreatedAt.Groups["Second"].Value));

    return parsedDate;
}

For me, the lesson here is to know your libraries and always assume that someone else has done it better than you already. It then becomes a quest to find that better solution to the problem.

How to Parse DateTimes from the Twitter API

If you want to interact with the Twitter API under .NET, you might find yourself trying to convert a date value from the XML results to a DateTime in your code. Twitter uses a weird format for its dates (Thu Sep 04 11:09:28 +0000 2008) that doesn't get converted properly with just a good ol' DateTime.Parse. You could spend a lot of time iteratively trying to figure out the correct format to use. Or, you could just use the following (my free gift to you, dear Twitter-API-consuming reader):

DateTime.ParseExact((directMessage.SelectSingleNode("//created_at").InnerText), "ddd MMM dd HH:mm:ss zzzz yyyy", CultureInfo.InvariantCulture, DateTimeStyles.AdjustToUniversal);

directMessage here is an XmlNode representing a single direct message extracted from a list of direct messages. I found this out after many iterations since all of the .NET libraries for Twitter just punt on this conversion.
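The same created_at format turns up for anyone consuming the API from script rather than .NET, so here is a rough JavaScript counterpart. It is a sketch only: `parseTwitterDate` and `MONTHS` are hypothetical names, and it assumes the value always has the shape shown above (English month abbreviation, two-digit day, four-digit numeric offset).

```javascript
// Parses Twitter's created_at format, e.g. "Thu Sep 04 11:09:28 +0000 2008".
// parseTwitterDate is a hypothetical helper name, not part of any Twitter library.
const MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

function parseTwitterDate(value) {
  const pattern = /^\w{3} (\w{3}) (\d{2}) (\d{2}):(\d{2}):(\d{2}) ([+-])(\d{2})(\d{2}) (\d{4})$/;
  const match = pattern.exec(value);
  if (!match) {
    throw new Error("Unrecognized created_at value: " + value);
  }
  const [, monthName, day, hour, minute, second, sign, offsetHours, offsetMinutes, year] = match;
  const month = MONTHS.indexOf(monthName);
  if (month === -1) {
    throw new Error("Unrecognized month: " + monthName);
  }

  // Treat the wall-clock fields as UTC first, then subtract the stated offset to get true UTC.
  const asUtc = Date.UTC(Number(year), month, Number(day), Number(hour), Number(minute), Number(second));
  const direction = sign === "-" ? -1 : 1;
  const offsetMs = direction * (Number(offsetHours) * 60 + Number(offsetMinutes)) * 60000;
  return new Date(asUtc - offsetMs);
}

console.log(parseTwitterDate("Thu Sep 04 11:09:28 +0000 2008").toISOString());
```

Unlike the regex-and-reassemble approach, this keeps the offset, so a value stamped +0100 comes out an hour earlier in UTC.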

ReSharper *IS* All That

Longtime readers may already know that I am a big fan of JetBrains' ReSharper add-in for Visual Studio. I recommend it highly to any .NET developer, but especially to any of those who are interested in being more productive. You'd think that that'd be everyone, but I've found it not to be the case. Many developers I recommend the tool to don't want to learn a new tool, aren't particularly keyboard-enthusiastic, or mistakenly believe that ReSharper is superfluous. On that last point, I've stumbled upon a nice comparison chart showing that that is not the case.

There's overlap, to be sure, but in nearly every instance I've encountered ReSharper's implementation is more thorough and more thoughtful. It's worth every penny.

I think I just coined a new word today—encomp (verb): the act of getting an application to match the comprehensive design provided by the designers. Usage: "Oh, I just got the slices from Andy so now it's time to start encomping."

I can't find any usages of it on Google, just confusion with encompassing. I claim this neologism then.

JavaScript Splice Ain't Remove

I wanted to remove an item from a JavaScript array and looked in vain for a remove function. I did find an implementation of remove using the Array.prototype but I just needed a function to get rid of items that matched a specific criterion.

That's when the trouble started. The splice function—calling its array.splice(index, count) variant—will indeed remove an item or items from the array (and return an array of the items removed for some reason) but it also closes the void left by the removed item and resizes the array all in one fell swoop. So code like this doesn't work as you'd think:

for (var i = 0; i < array.length; i++)
{
   if (array[i].someParameter == "someValue")
   {
      array.splice(i, 1);
   }
}

Each removal shifts the remaining items down by one, so the item that slides into index i is skipped when the loop increments i. Consecutive matches therefore get missed. In the interest of saving future Googling, frustrated JavaScripters, here's how I solved the issue:

for (var i = array.length - 1; i >= 0; i--)
{
   if (array[i].someParameter == "someValue")
   {
      array.splice(i, 1);
   }
}

By starting the loop from the top of the array and working backwards, item removals and the dynamic resizing have no effect. If anyone has a better way, I'm all ears. Well, eyes.
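One candidate for a better way, in environments that provide Array.prototype.filter, is to build a new array of the items worth keeping instead of removing in place. The sketch below uses made-up sample data (the `id` field and the variable names are assumptions) and also reruns the forward loop to show which item it skips.

```javascript
// Sample data; someParameter and "someValue" follow the naming used in the loops above.
const items = [
  { id: 1, someParameter: "someValue" },
  { id: 2, someParameter: "someValue" },
  { id: 3, someParameter: "other" },
  { id: 4, someParameter: "someValue" },
];

// Forward loop with splice: the element after each removal slides into index i and is skipped.
const forward = items.slice();
for (let i = 0; i < forward.length; i++) {
  if (forward[i].someParameter === "someValue") {
    forward.splice(i, 1);
  }
}

// filter returns a new array of the items to keep and leaves the original untouched.
const kept = items.filter((item) => item.someParameter !== "someValue");

console.log(forward.map((item) => item.id)); // [ 2, 3 ] because id 2 was skipped
console.log(kept.map((item) => item.id));    // [ 3 ]
```

The trade-off is that filter allocates a second array, so the backwards loop is still the one to use when the array has to be modified in place.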

Regular Expressions Aren't the Devil

I love regular expressions. Okay, I love the challenge of crafting regular expressions. I do not enjoy reading regular expressions that I have not created or, really, even the ones I do create. But give me a problem and tell me to make a regular expression to match things and I am all over it.

A co-worker wanted a regular expression to turn unlinked URLs in text into HTML links and to correct linked URLs that lacked a protocol into valid URLs. In other words, if "www.google.com" appeared in some text, it needed to be replaced with <a href="http://www.google.com/">www.google.com</a> and <a href="www.google.com">some link text</a> needed to turn into <a href="http://www.google.com">some link text</a>

My first pass was a monster regular expression that handled both situations but I couldn't get the replacement string to account for the fact that there was already link text in the invalid URL example. And I couldn't adequately cover the situation where there were attributes before the href attribute. So scrap that one.

This is what I came up with after separating it into two replacement passes. I share it with you both as a testament to my regular expression abilities (good or bad, you decide) and because this situation seems like one that might come up pretty frequently.

Pass 1
  Regular expression: (?<=\s|^)(?<domain>www\.[^\s]+)(?=\s)|(?<=\s)(?<protocol>http[s]?://){1}(?<domain>(www)?\.?[^\s]+)(?=\s)
  Replacement string: <a href="http://${domain}">${domain}</a>

Pass 2
  Regular expression: href="(?<domain>www\.[^"]+)"
  Replacement string: href="http://${domain}"
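For anyone who wants to try the two-pass approach outside .NET, it can be approximated in JavaScript. This is an adaptation, not a port: `linkify` is a hypothetical name, JavaScript has historically rejected two groups with the same name, so the sketch uses numbered groups and a callback, and it also accepts a URL at the very end of the text, which the expressions above do not. Like the replacement string above, it always writes http:// into the href.

```javascript
// Hypothetical helper; a JavaScript adaptation, not the original .NET expressions.
function linkify(text) {
  // Pass 1: wrap stand-alone URLs (bare "www." or carrying an http/https protocol) in an anchor.
  const standalone = /(?<=^|\s)(?:(www\.\S+)|https?:\/\/(\S+))(?=\s|$)/g;
  const linked = text.replace(standalone, (match, bare, withProtocol) => {
    // Exactly one of the two groups participates in any given match.
    const domain = bare !== undefined ? bare : withProtocol;
    return `<a href="http://${domain}">${domain}</a>`;
  });

  // Pass 2: add the missing protocol to href attributes that start with "www.".
  return linked.replace(/href="(www\.[^"]+)"/g, 'href="http://$1"');
}

console.log(linkify("Try www.google.com for that"));
console.log(linkify('<a href="www.google.com">some link text</a>'));
```

Because pass 1 only fires after whitespace or at the start of the text, it never touches a URL that is already inside an href attribute, which is what lets the second pass stay so small.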

Five Things I Hate About C#

Inspired by Brian D. Foy's entry and Titus Brown's one for Python, here are five things I hate about C#:

  1. You can't have methods whose signatures differ only in their return types. I think I understand why it might not have been a good idea in the past, but with Visual Studio .NET I don't see how anyone could get confused any longer.
  2. You can split a class into partial classes. Why you'd want to do that is beyond me. Luckily, it's a 2.0 thing so I was able to refrain from seeing it for most of my development life.
  3. virtual. I hate how you have to explicitly state which methods can be overridden by descendants.
  4. Constructor chaining. I wish it could be done like normal method overloading where you can have the constructor do something and then call another constructor. Again, I understand why that's not feasible since the CLR would have to divine when a constructor was being called by another constructor versus a normal call to limit object creation. But the way that it's handled currently obscures what's actually happening when it treats the called constructor as a sort of pass-through to another one.
  5. Documentation. This appears to be a universal language complaint. MSDN is quite comprehensive but shallow at the same time. Everything is minimally documented and I can generally start there. But I can almost never end there: I invariably have to go to places like CodeProject in order to get a decent example of the syntax in action. Even that is generally not the final word because I usually then have to go to the blogs to see if there are any "gotchas" involved in using the language construct. Oh, and MSDN does a horrible job of differentiating between .NET 1.1 and .NET 2.0, something I desperately need because I'm currently working in both versions of the framework.

That was a lot tougher than I thought. I had to comb through the code to find things that I hated that were specifically C# and not ASP.NET. C# is really quite a lovely language. In general, it's the bee's pajamas.

Just for grins, here are five things I hate about ASP.NET:

  1. WebControls. Yes, I generally can't stand WebControls. I'm quite obsessive about my HTML and CSS; a lot of that combination's power is stripped away when you use a WebControl because Microsoft makes a lot of assumptions about how you're going to be using them.
  2. AutoEventWireup defaults to true in 2.0. Okay, so I just talked about this one but it's so wrong to change the default like that.
  3. Application event lifecycle. There's really no adequate documentation about the ASP.NET sequence of events. This article from Peter Bromberg was the best I found. I ended up spending a lot of time in debugging trying to determine what is available when in the event sequence.
  4. I hate ViewState. It is nothing but massive bloat and there's almost always a better way of storing data. I've also had several utterly inexplicable errors caused by ViewState corruption or changing.
  5. PostBack. There's a reason why the <form> tag has an action attribute, Microsoft. I know that the Javascript trickery involved in POSTing to the same page allows for some nice automagical black-boxing. I just don't like that model of Web application development.

That one was a whole lot easier. I had to limit myself to just five.

Revisiting the Past

Today I finally had the chance to revisit the first service I ever wrote. I christened it Pingarooni and it handles all the outgoing trackbacks and pings for Quick Blog. It's been a long time since I've been in a position to re-examine early code done in a state of relative ignorance—I've been coding in ASP.NET for so long that my code is generally something of which I'm quite proud.

But I had never created a Windows service. The only non-Web application I'd ever created was a console application—certainly a world apart. I didn't really know what I was doing so my code evinced a certain textbook formality that subsequent services I wrote had thankfully shed. The flow of it was horrendous and I'm surprised it lasted as long as it did.

Here are the things I learned in revising it:

  • Make your service work on batches at a time. By doing so, you'll be able to see progress and some freak error will only affect a small number of work items.
  • Make the service work on discrete work units so that many instances of the service can be run in parallel.
  • Make the service update the database to report its progress as soon as possible. At the least, it should report every batch if not throughout the batch.
  • Make the service run continuously if possible, stopping only when an OnStop event is raised. If the work load is neverending, there's no sense in pausing between runs.
  • Make the service use plugins along command pattern lines to define its work. If possible, these plugins should be put into a separate assembly or module so that additional plugins can be introduced with a simple restart of the service.
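Several of these lessons fit into one small loop. The sketch below is in JavaScript rather than as a Windows service, and every name in it (`runWorker`, `plugins`, `reportProgress`, `shouldStop`, the item shapes) is hypothetical. It is not Pingarooni's code; it only illustrates batches, discrete work units, per-batch progress reporting, a stop check, and command-style plugins.

```javascript
// Hypothetical sketch of a batch-oriented worker; every name here is illustrative.
// Plugins follow the command pattern: each one exposes an execute(item) method.
const plugins = {
  ping: { execute: (item) => `pinged ${item.target}` },
  trackback: { execute: (item) => `trackback sent to ${item.target}` },
};

// Processes the queue one batch at a time and reports progress after every batch.
// shouldStop is polled between batches, standing in for a service's stop signal.
function runWorker(queue, batchSize, reportProgress, shouldStop) {
  let processed = 0;
  let failed = 0;
  while (queue.length > 0 && !shouldStop()) {
    const batch = queue.splice(0, batchSize); // claim a discrete unit of work
    for (const item of batch) {
      try {
        const plugin = plugins[item.type];
        if (!plugin) throw new Error(`No plugin for ${item.type}`);
        plugin.execute(item);
        processed++;
      } catch (err) {
        failed++; // one bad item should not take down the rest of the batch
      }
    }
    reportProgress({ processed, failed, remaining: queue.length });
  }
  return { processed, failed };
}

const queue = [
  { type: "ping", target: "a" },
  { type: "trackback", target: "b" },
  { type: "unknown", target: "c" },
  { type: "ping", target: "d" },
  { type: "ping", target: "e" },
];
const reports = [];
const result = runWorker(queue, 2, (progress) => reports.push(progress), () => false);
console.log(result, reports.length);
```

In a real service the queue would be claimed from the database, so that parallel instances never pick up the same batch, and reportProgress would write back to it.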

The views expressed on this website/weblog are mine alone and do not necessarily reflect the views of Go Daddy.com Software, Inc.

Job Satisfaction

This entry by Jeff Atwood made me realize why I like working on Quick Blog so much. It's not just that it mirrors my own interests or that it's challenging on a daily basis. The thing that makes it all so wonderful is that people are using it. As Jeff puts it:

A smart software developer realizes that their job is far more than writing code and shipping it; their job is to build software that people will actually want to use.

I'm not turning into some altruist. Be sure that I'm doing what I do because I get paid. But it's enormously satisfying to know that you're helping people find their voice. Every day I see the entry counts grow, the comment counts grow, and the number of fascinating blogs grow. Sure, there are other blogging engines out there—but they're paying me (well, Go Daddy) for mine.

I Wish It Were Fake

If there is one thing I cannot stand at work it is wrestling with my tools. I've got two Web projects in Visual Studio 2003. I hate Web projects, hate 'em—there is no reason that I've found that you cannot do a Web application using just a standard project and it makes life so much easier since Web projects are the devil.

Anyhow, I have to make this mission-critical change to one of the Web projects and neither one will load automatically. Fine, I've danced this dance before about a month ago when suddenly Visual Studio refused to allow those Web projects into the solution. You just have to remove them and re-add them each time you open Visual Studio or change solutions. Man, after writing that, I can see how messed up my work life is—constantly rerouting around lost settings, malfunctioning applications, and halfassery in order to get things done.

So I did that and one of the Web projects loaded. Well, it loaded on the second attempt to add it. Again, not an unusual thing as I also discovered that the first Web project added generally needs to be done again since the Web server takes too long and times out. At least, that's my presumption because it works the second time. (Insert existential scream here.) The second one, however, refuses to load, giving me a timeout every single time I try it.

Usually, I can just f'ing Google the error message and have a handful of things to try. Unfortunately, this seems to be a problem unique to me since I couldn't find a single instance of my particular error message. The help, unusually helpful on this topic, didn't even list my failure as one of the options in its lengthy catalog of possible messages and vague steps to resolve them.

I tried the usual panoply of magic actions that sometimes yield results: restart VS, restart Windows, go directly to the project instead of using the braindead Web project loader, create a new solution, and try again the next day hoping that last night's hell was just a bad dream. Nothing worked. Went to the IIS logs to see what sort of things Visual Studio was trying to do to load the sucker: lots of unfamiliar actions ("PROPFIND", never heard of it—oh it's a WebDAV thing, interesting) and lots of 403s.

Ahh, the 403. Permissions. I know something about that. The permissions for both Web projects appeared to be the same, but let's just give those muthas full control. The panacea of frustrated users (and bane of administrators). Nothing. Machine.config: give the app SYSTEM access instead of ASPNET. NOTHING. NOOOOOOOOOOOOOOOOOOOOOOOOOTTTTTTTTTHHHHHHHING!

So three hours completely wasted while a mission-critical fix waits. I ended up editing the file in Notepad; now I just have to figure out how to csc a Web application. I hate wrestling with my tools.

[UPDATE: Great news! I got the Web project loaded … but the source control bindings were lost. The words "Pyrrhic victory" spring to mind. Visual Studio and source control binding problems are a separate level of hell unto themselves. Bleh.]

[UPDATE 2: In a dramatic turn of events, it bound. I hit the "Bind" button in the Change Source Control dialog and it came back in a few seconds. I was dumbfounded. It's never worked that well without manual diddling about in the VSS-VS project files. Finally.]

The Idea that Never Was

Damn! There goes the one patentable idea I've had since I started working at Go Daddy. I wasn't going to have any noise and basically make it look like a static image. But the idea is now decimated by prior art.

My implementation was going to be better because my plan was to fragment the CAPTCHA text randomly. Each piece would then be a separate frame in the animated GIF with a minimal delay between frames. To the human eye, it would look like a fixed image and would be incredibly easy to recognize. To the automated CAPTCHA bot, the interpolation required would make defeat not worthwhile. As bots became more sophisticated, tweaks could be done to add randomness to the slicing or noise elements.
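The fragmenting step is easy to sketch, even though the post stops short of an implementation. The JavaScript below assumes the fragmenting happens per pixel on a 1-bit image, which is only one reading of the description; `fragmentMask`, `overlay`, and the tiny sample glyph are hypothetical, and encoding the frames into an animated GIF is left out entirely.

```javascript
// Hypothetical sketch: split a 1-bit image (rows of 0/1) into frames so that each set pixel
// lands in exactly one frame. Overlaying all frames reproduces the original image.
// random defaults to Math.random; pass a deterministic function for repeatable output.
function fragmentMask(mask, frameCount, random = Math.random) {
  const frames = Array.from({ length: frameCount }, () =>
    mask.map((row) => row.map(() => 0))
  );
  mask.forEach((row, y) => {
    row.forEach((pixel, x) => {
      if (pixel === 1) {
        const frameIndex = Math.floor(random() * frameCount);
        frames[frameIndex][y][x] = 1;
      }
    });
  });
  return frames;
}

// Recombine the frames, roughly what the eye does when they cycle with minimal delay.
function overlay(frames) {
  return frames[0].map((row, y) =>
    row.map((_, x) => (frames.some((frame) => frame[y][x] === 1) ? 1 : 0))
  );
}

const glyph = [
  [1, 1, 1],
  [1, 0, 1],
  [1, 1, 1],
];
const captchaFrames = fragmentMask(glyph, 3);
console.log(JSON.stringify(overlay(captchaFrames)) === JSON.stringify(glyph)); // true
```

No single frame is guaranteed to contain a readable glyph, which is the point: a bot has to combine the frames before it can even start on recognition.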

I realize, of course, that I could still make this and release it as open-source but my motivation's been sapped. Lazy Web, enjoy.

Breaking APIs Considered Harmful

If you've got any responsibility for an API, please please please do not break your API. OOP has a lot of neat things like overloads, encapsulation, and, hell, class name changes to make it easy for you to not completely screw over those downwind (metaphorically speaking) of you.

How Not to Program

Uh oh, Leon Bambrick has spilled the beans about how to be a programmer. (Just kidding, that's programming by coincidence.)

About this Archive

This page is an archive of recent entries in the Programming category.
