

What you confuse here is doing something that can benefit from applying logical thinking with doing science.
I’m not confusing that. Effective programming requires and consists of small scale application of the scientific method to the systems you work with.
the argument has become “but it seems to be thinking to me”
I wasn’t making that argument so I don’t know what you’re getting at with this. For the purposes of this discussion I think it doesn’t matter at all how it was written or whether what wrote it is truly intelligent, the important thing is the code that is the end result, whether it does what it is intended to and nothing harmful, and whether the programmer working with it is able to accurately determine if it does what it is intended to.
The central point of it is that, by the very nature of LKMs to produce statistically plausible output, self-experimenting with them subjects one to very strong psychological biases because of the Barnum effect and therefore it is, first, not even possible to assess their usefulness for programming by self-exoerimentation(!) , and second, it is even harmful because these effects lead to self-reinforcing and harmful beliefs.
I feel like “not even possible to assess their usefulness for programming by self-exoerimentation(!)” is necessarily a claim that reading and testing code is something no one can do, which is absurd. If the output is often correct, then the means of creating it is likely useful, and you can tell if the output is correct by evaluating it in the same way you evaluate any computer program, without needing to directly evaluate the LLM itself. It should be obvious that this is a possible thing to do. Saying not to do it seems kind of like some “don’t look up” stuff.
Does it? I read the whole interview in the OP post and it does not seem like this would be the opinion of the researcher:
How do you justify the claim that her data shows the usefulness of violent civil resistance campaigns?