The Pitfalls of Cut and Paste Coding


We’ve all been guilty of it at one time or another in our development careers. When you’re starting out with a language or framework that you’ve never used before, you often have no choice. What I’m talking about is the act of “copy-paste coding”, and it’s as common in the programming world as chewing gum under seats. When you copy and paste other developers’ code into your application, it’s important to fully understand what the code does before you continue, or you risk joining the many fools who have gone before you.

When I was in primary school there was a really smart kid in our class; let’s call him Joshua. He got top marks pretty much all the time. My class also had a not-so-smart kid called David; to be fair there were a few Davids in my year group in primary school. David was the kind of kid who’d do anything for a shortcut, so he used to sit horizontally behind Joshua and take notes from his test sheets every time we had a Maths test. Primary school obviously seemed a lot harder when you were 10.

This went on for a couple of months. Remember that I mentioned Joshua was a pretty switched-on kid; he hadn’t missed the gist of what was going on. So one day we had another test, and we all sat down and got stuck in. Joshua smashed the test in only 10 of the 60 minutes we had been given – to say he made us all feel inferior when he put his pencil down would be an understatement.

After a slight pause, he then proceeded to pull out a second test sheet and really start his test. Joshua had taken two test sheets and written fake answers to all the questions on the first.

The look of shock on David’s face was priceless.

Time to go on Safari

Recently a lot of the sites that I look after have been receiving some serious ELMAH spam. So much so that I've had to implement ELMAH filters just to get the signal-to-noise ratio of my applications’ exceptions back to a manageable level.

The offending exception that I've continually seen:

System.Web.HttpException: A potentially dangerous Request.Path value was detected from the client (:).

So someone is making bad requests to my web applications.

From looking at my ELMAH logs over time, these requests come from a number of different sources on different IP address ranges around the globe.

All of them have the same pattern to their exception.

Request Path Info: “https:/mywebsite.com/”
User agent: “Test Certificate Info”

They are all missing the second “/” in the https:// of my SSL site’s addresses.

So I think to myself:

“Why the hell does every site that I manage that uses SSL seem to be getting the same exceptions?”

My exceptions were all for different websites on a number of different servers.

So someone out there is scanning websites with a piece of code that puts bad paths into URLs, and uses the user-agent string “Test Certificate Info”.
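If you want to pick these requests out of your own logs, the pattern above is simple enough to check for. Here is a rough Python sketch, not the ELMAH filter I actually use: the function name and the idea of passing the path info and user agent as plain strings are my own assumptions for illustration.

```python
import re

# Matches a scheme followed by exactly one slash, e.g. "https:/example.com/".
# The negative lookahead rejects the well-formed "https://".
SINGLE_SLASH_SCHEME = re.compile(r"https?:/(?!/)", re.IGNORECASE)

# The user-agent string seen in the offending requests.
SCANNER_USER_AGENT = "Test Certificate Info"


def looks_like_scanner_noise(path_info: str, user_agent: str) -> bool:
    """Return True when a logged request matches the pattern described above.

    Hypothetical helper: both the malformed path and the user agent
    must be present for the request to count as scanner noise.
    """
    has_bad_path = SINGLE_SLASH_SCHEME.search(path_info) is not None
    has_scanner_agent = user_agent.strip() == SCANNER_USER_AGENT
    return has_bad_path and has_scanner_agent
```

Requiring both conditions keeps the check narrow, so a genuine exception from a real browser is unlikely to be swallowed along with the noise.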

Following the paper trail

So I set out on the interwebs in search of answers.

I can’t be the only one experiencing this?

And it turns out that I’m not. It seems that more than a few of us are experiencing the same thing.

One of the responses to the above linked StackOverflow post seems to nail what has happened:

“…
My guess someone read this and didn't end up changing the example code.
…”

The link in the quote above is to an MSDN C++ article that walks through making an SSL request – but until recently the code sample in the MSDN article had an error that wrote an invalid HTTP header while making its request.

And therein lies the problem.

A copy of a copy, of a copy, of a copy

Just like my not-so-intelligent primary schoolmate David mentioned earlier, when whoever wrote the piece of software that’s been visiting my sites was cranking out code in a haze of Doritos and Pepsi, they copied someone else’s work – and with it they copied a bug.

Unlike David, however, when you’re a developer this is pretty normal behaviour and wouldn’t be considered unsavoury at all – hell, open source thrives on the act of copying.

However, there is one thing that, like David, we all have to be careful of: copying other developers’ code from the internet has its own set of risks that you need to be aware of.

If you copy someone else’s code and you don’t fully understand what it is doing, then you probably aren’t in a place to use it in a production environment.

Whoever wrote the web spider that is out there indexing all my SSL websites and causing my ELMAH error spam obviously didn’t notice the issue in the code sample they copied from. If you do the same, you might end up with a piece of code in your application that simply makes you look like an amateur.

And it doesn’t have to be that way.
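Even a tiny sanity check on the URL before sending a request would have flagged this particular bug. The following Python sketch is only an illustration under my own assumptions (the function name is hypothetical, and I have no idea what language the spider is actually written in); it uses the standard library’s `urllib.parse.urlsplit` to confirm that a URL really has an `https` scheme and a host.

```python
from urllib.parse import urlsplit


def is_well_formed_https_url(url: str) -> bool:
    """Return True only if the URL has an https scheme and a host.

    Hypothetical pre-flight check: "https:/mywebsite.com/" parses with an
    empty host (the text after the single slash becomes the path), so it
    is rejected here.
    """
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit raises ValueError for some malformed inputs,
        # such as an unterminated IPv6 literal.
        return False
    return parts.scheme == "https" and bool(parts.hostname)
```

A check like this is no substitute for understanding the code you borrowed, but it is the sort of one-minute test that catches a copied bug before it leaves your machine.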

Jeff Atwood and Jon Galloway started adding the following badge of certification to code samples nearly 5 years ago:

image

The above image was basically a sarcastic joke about the fact that a lot of code in the world has simply never been properly tested. And I don’t mean test-driven, test-first level tested – I mean code that hasn’t really ever been used outside of knocking together a proof-of-concept sample for a blog post.

I’ve been guilty of this before when I’ve posted things on this very blog that have needed an update after the fact.

So if you’re out there copying code from other developers, open source projects or blog posts – make sure you read the logic you borrow thoroughly, take the time to understand it, and for your own sake make sure you test anything that spiders websites all over the internet before you hit the launch button.

My event logs will thank you.