Kooiti Masuda makes some very interesting points in response to the question of openness in science, especially regarding data and code. Please read them before reading my response, which follows.
1) Commercial codes used as part of scientific process
I agree with KM that the closed-source, expensive software many scientists rely on will complicate matters. You will certainly find strong objections from the commercial software community to any suggestion that, by virtue of its use in science, their code must be released or even made auditable, or that competitors to their products should be publicly funded.
But none of this prevents code developed in-house at public expense from being released as open source. Such codes are substantial and are crucial to much of the controversy. Nobody is suggesting that any significant bias is introduced via bugs in Matlab, so the fact that Matlab is licensed is not practically important for these purposes.
2) Complications due to institutionally expected commercialization of
In fields far from climate, commercialization of academic codes is the norm. Institutions, for their own interests, presume that codes developed there have commercial value unless demonstrated otherwise. Publicly funded institutions act very much like corporations in this regard, except perhaps with less agility.
Should these limitations be applied only to particular fields? How can one reasonably establish boundaries between fields where publication is expressly required and others where it is discouraged?
I am actually gearing up for negotiations with my university to permit me to release a modest piece of general-purpose code I have written. The default position of the intellectual property office of the university is “no, this code belongs to the institution, and if there is potential for outside use, they should pay us for it”.
As in a corporation, at least in America, the case for open source must be made explicitly and must focus on the needs of the university or public laboratory, not those of the science or the general public. Support for the contrary view may end up coming from the funding agencies, the principal investigators and/or the general public. The general public, especially people working in small, closely-held businesses, has difficulty understanding the bureaucratic barriers to open source.
3) Informal coding experiments
As for the difficulties implied by informally developed code, I actually have some technical ideas that would greatly reduce these, which (ironically, I am afraid) I need to keep somewhat private at present. I hope to find funding for this work and release it into the public domain. Wouldn’t it be bizarre to have to close the source for a tool facilitating the open sourcing of academic software?
Another problem you do not raise is the difficulty with very large calculations, which tend to be performed on one-of-a-kind machines. Here, the efforts may not be repeatable in practice even within the given research team, as the constantly shifting experimental platforms subvert exact repeatability and require occasional adaptation of the codes just to keep up with the requirements of the machine. As the machine is unique not only as an instance, but as a configuration, supercomputing undermines reproducibility.
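One concrete mechanism behind this, offered as an illustration rather than a description of any particular model: floating-point addition is not associative, so the same sum split across a different number of processors can give a different answer. Below is a minimal Python sketch. The names `naive_sum` and `chunked_sum` and the sample values are hypothetical, chosen only to make the effect visible.

```python
import math


def naive_sum(values):
    """Left-to-right floating-point accumulation, with no compensation."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Mimic a parallel reduction: each 'processor' sums one contiguous
    chunk, then the partial sums are combined in order."""
    size = math.ceil(len(values) / n_chunks)
    partials = [naive_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return naive_sum(partials)


# Illustrative data: large terms that cancel, plus small terms that are
# easily lost to rounding along the way.
values = [1e16, 1.0, -1e16, 1.0]

for n in (1, 2):
    print(n, "chunk(s):", chunked_sum(values, n))
print("correctly rounded:", math.fsum(values))
```

The two chunk counts disagree with each other and with the correctly rounded result from `math.fsum`, although the input and the arithmetic are nominally identical. A real model adds compiler, library and interconnect differences on top of this, so bit-for-bit agreement across machines is a much harder target than this toy suggests.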
5) Still, Open Science is Always Better
All this said, I remain a strong supporter of publication, documentation and reproducibility in dramatically more detail than was possible in the past. There are far more difficulties than the rather vicious critics of the field acknowledge. However, arguments from within the field that openness is technically impossible or socially undesirable are very unhelpful and, in my opinion, very wrong.
6) Can openness backfire?
There is certainly the risk that open science will appear to facilitate misunderstanding. The widespread misuse of the web-based portal to the MODTRAN program, by people who don’t understand the precise nature of the problem it solves, is very illustrative. My feeling about that is that people who get things wrong will get things wrong no matter how little or much information you give them.
In the end, the only defense of genuine science remains peer review, though the structure of peer review may also need to change in the future. But that’s another topic.