Hacker News

As a general rule, researchers do not test or document their programs rigorously, and they rarely release their codes, making it almost impossible to reproduce and verify published results generated by scientific software, say computer scientists.

Just stop doing that!

Seriously, testing is not wasted effort. For a very small, simple project it might slow you down, but for any project that's large enough, testing makes you faster. The same goes for documentation. And full source code should be part of every paper.
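To make the "testing makes you faster" point concrete: even a throwaway analysis script can carry a couple of checks against known answers. A minimal sketch (the integrator and its test are hypothetical, just the kind of routine a paper's pipeline might contain):

```python
import math

def simpson_integrate(f, a, b, n=100):
    """Approximate the integral of f over [a, b] with Simpson's rule."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate weights 4 (odd index) and 2 (even index)
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def test_simpson_integrate():
    # Known analytic results act as regression tests:
    assert abs(simpson_integrate(math.sin, 0, math.pi) - 2.0) < 1e-6
    assert abs(simpson_integrate(lambda x: x ** 2, 0, 1) - 1 / 3) < 1e-9

test_simpson_integrate()
```

A test like this takes minutes to write and catches the "quick fix" that silently changes the numbers six months later.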

Many programmers in industry are also trained to annotate their code clearly, so that others can understand its function and easily build on it.

No, you document code primarily so YOU can understand it yourself. Debugging is twice as hard as writing the code in the first place, so if you were only just smart enough to write it, you have no hope of debugging it.
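What "documenting for yourself" buys you is mostly recorded units and assumptions, exactly the things future-you forgets first. A minimal sketch (the function and its domain are made up for illustration):

```python
import math

def decay_constant(half_life_s):
    """Convert a half-life to a decay constant.

    Parameters
    ----------
    half_life_s : float
        Half-life in seconds; must be positive.

    Returns
    -------
    float
        Decay constant (1/seconds), from lambda = ln(2) / t_half.
    """
    if half_life_s <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / half_life_s
```

Without the docstring, "is that in seconds or years?" becomes a debugging session instead of a one-line lookup.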



The point is that since software development is not their main goal or background, their practices tend to be ad-hoc. We know the value of testing and documentation, but they do not. People don't know to stop doing something until they know it's a bad practice. And they're not going to know it's a bad practice until they discover that fact on their own (which can be a slow process) or someone teaches them (faster, but potential cultural problems).

That they should is basically a given in the article. The question is how to make it happen.


The mindset of a scientist is that the code is a one-time thing to achieve a separate goal - data for a paper. The code isn't supposed to last, it's simply a stepping stone. For a lot of folks, whose research areas tend to move around, there isn't always the expectation that you'll get to a 2nd or 3rd paper on the same data.

Now, all of this is different if your research actually is building the model. But I'm speaking from experience on the rest: I've built plenty of software tools that I need "right now" to get a set of data.


It may not be intended to last, but it's still supposed to be correct. And, of course, there are probably gobs of software out there that was not intended to last, yet did.


The thing about scientific code is that it's often a potential dead end. The maintenance phase of the software life cycle is not as assured as it is in industry.

Writing good engineering software is not the scientist's goal so much as demonstrating that someone else with a greater tolerance for tedium (also someone better-paid) could write good engineering software.


Exactly. I'd go further: in industry, the software is typically the end product, and the quality of the software is inherently relevant. In science, the output (the prediction of the simulation, the result of the analysis, etc.) is typically the end product, and the quality of the software is relevant only insofar as it affects the quality of the output.

In practice, of course, the quality of the software often does affect the quality of the output, but time spent on software quality creates less immediate value than it does in industry.


You won't see any clean code written by scientists until (major) journals make it mandatory to submit code for peer review and publication.

When it happens, I hope that they'll manage to agree on a sensible license (even though I won't set my hopes too high).


I have all of my code on github under a CRAPL license [1]. It assumes a certain amount of good faith from others, but I feel that if you're worrying about getting scooped, your problem isn't ambitious enough. Luckily, my adviser agrees, and is very much in favor of open releases of data [2].

[1] http://matt.might.net/articles/crapl/
[2] http://www.michaeleisen.org/blog/?p=440


The terms of the license are good, but its name is literally crappy :-/


That will never happen, because the universities are bigger than the journals and will push back. The universities want to own the code if there's money to be made. Stanford made a small fortune from Google, for example. If journals required code review, other journals would pop up that wouldn't require it.



