ChatGPT could be the key to unbundling science
Most research projects can be divided into discrete fun parts and boring parts. I usually have the most fun coming up with the question and designing the analysis, but doing the analysis and interpreting the results are fun too, especially when visualization is involved. Even data collection has its rewards: its tedium is often tinged with excitement about what you might find when the pieces finally come together.
The boring part descends on you with the first keystroke of the manuscript. The modern scientific paper has all the charm of a form letter. Its rigid delineation of sections and the dry style mandated by virtually every journal leave little space for researchers to communicate the interest they feel in their work. And in most cases revisions are even worse, as the peer review process increasingly impinges on domains it was never meant to tackle, like enforcing prescriptive grammatical rules or requesting the addition of citations that clearly didn’t inform the paper itself.
Despite all this, the currency of scientific reputation is the journal article. I think of this as a portrait and its frame: the researcher paints with data but has to enclose it in a laboriously carved frame, as if to capture the meaning of what they’ve found.
This bundling causes a number of problems. First, it creates the illusion of authority by excluding the data exploration and the many analytic choices made along the way to the final result. The effect is a highlight reel with a clean narrative arc that only barely resembles the actual process of conducting the research behind it. It can also exclude people who took part in portions of the research, but not enough to qualify for authorship on the final paper. Next, and related to the first point, publications tend to gloss over the uncertainty at the heart of the scientific process. New questions ripple out from every experiment worth conducting, but confining them within an article can cut off the process of exploring what the results really mean.
To understand why ChatGPT has the potential to dethrone the journal article from its central position in scientific knowledge production, I’d like to mention a quote from Joel Spolsky that I first read on Stratechery: “All else being equal, demand for a product increases when the prices of its complements decrease.” A complement in this sense is a product that is consumed in conjunction with another. Spolsky largely focused on how the commoditization of different elements of computing affected software vendors. For example, he described how Microsoft worked to push down the price of computers so that it could sell more operating systems.
In the context of scientific research, though, what has ChatGPT commoditized?
ChatGPT doesn’t just write prose; it also writes code and generates ideas. It usually does these things well enough without being spectacular. Ian Leslie interprets this to mean that demand for unique and unusual perspectives has increased as the price of mediocrity has plummeted: “It’s going to get less and less valuable and more and more perilous to be generic in any way – to be a generic writer, or to be a generic person, a generic thinker. Because the machines are very good at [creating] the generic model of this form of thought, this form of expression, whatever. There will be a much higher premium on cultivating your own distinctive, inimitable voice in whatever form that takes.”
I don’t dispute that this will likely be true for the arts, but I don’t think it will apply consistently in scientific research, simply because the same tools that power ChatGPT have also opened the door to tools for extracting meaning from scientific papers. Services like Consensus and Elicit look to the scientific literature and try to answer your questions by pulling insights out of articles. The result is generally a single sentence from each paper, to which they add a few ways of exploring more context around that idea.
The development process for Consensus provides some insight into where the value of journal articles lies. In an interview with The Atlantic, one of the founders describes how they had readers annotate the key messages of each paper. These annotations then went into a training set alongside the full article text. The article’s author writes parenthetically that each paper only has one or two key messages, which is casually devastating to people who have spent days wrangling with their coauthors over turns of phrase and trying to please reviewers who each seem to have their own set of grammatical hobby horses. All for one or two key messages.
Creating these tools required extensive effort, but there’s an unignorable oddity that will eventually arise as we use AI tools to help create articles and, on the other end, interpret the articles using different versions of those same tools. It’s all the weirder when you recognize that these services toss aside the vast majority of the language in their quest to identify the paper’s one or two key messages. I have enough faith in the scientific process to hope that this closed loop doesn’t last long. Everyone will have to see the absurdity of computers writing papers for other computers, and will want to find a way out.
To me, the way out seems to be to unbundle the research process. This means that, rather than stuffing a question, a dataset, an analysis, and an interpretation into one product, we can unpack these disparate processes and publish each individual element separately. A useful, pithily phrased question could be its own publication, maybe accompanied by a little background on why it might be important. A dataset would stand on its own, without having any interpretive frame around it at all. A visualization of someone else’s data could add meaning that even its initial publishers didn’t see.
I don’t think it’s likely that exceptional writing will save the journal article from being deconstructed in this way, but I still think that writing will play an important role in research. Each analysis can be accompanied by a few key points written by its authors (if they want to) and, again, published independently. This will no longer be the canonical interpretation, as others can just as easily publish competing interpretations. Making sense of findings across analyses can become a genre of its own, with new conventions that develop to best suit the needs of each field and subject. Lay summaries of research will be as important as ever, as will tutorials for working with data from different sources.
The advantages of this deconstruction will be huge. Individuals can specialize in different aspects of the research process and get credit for the parts that they contribute to. A given researcher could specialize in creating visualizations of datasets that others publish or in identifying emerging knowledge by pulling together different analyses. This specialization will generate efficiencies, and assigning credit to specific portions of the knowledge creation process can reduce the problem of unearned authorships.
At the same time, though, there’s the challenge of changing how we think about researchers’ reputations and all the entrenched interests that will resist that change. Journals make enormous profits off their role as reputation brokers, and universities like the easy legibility that publication provides, since it turns knowledge production into a fungible unit. That same fungibility breaks down in interdisciplinary fields, where job applicants from areas with very different publication standards have to compete for positions. Most decision makers have a hard time comparing the productivity of researchers from fields like economics that require very long publications to those from fields like public health that generally stick to much shorter papers.
Thus, unbundling science requires acknowledging the uniqueness of each scientific contribution. In that sense, ChatGPT really is the complement of something “distinct [and] inimitable” in research, too, even if that’s not exactly the researcher’s voice.
There are platforms under development that try to decouple knowledge creation from the journal article by combining the reputational and monetary incentives of unbundled scholarship. They’d allow researchers to commission analyses and peer reviews from others with a proprietary currency that would also let outsiders see a researcher’s specific contributions. This is a great way of exercising our imaginations about what is possible. But to me this is primarily a cultural problem rather than a technical one. I don’t see any reason why a CV full of unbundled knowledge products couldn’t be legible as a marker of reputation, except that institutions haven’t figured out how to interpret them. One possible solution is for journals to live on as aggregators of remarkable researchers, sort of like columnists at a newspaper or the web rings of the early internet, rather than as aggregators of articles.
There’s that grim saying attributed to Max Planck, that science advances one funeral at a time, and I think that’s true for paradigms around how knowledge is produced and credit distributed. The problem is that publications live a long time. Nature is over 150 years old and more prestigious than ever. Despite the creaking of the journal system under overburdened peer reviewers and an ever-increasing rate of publication, there aren’t many signs yet of these behemoths stepping aside. The spread of ChatGPT and other NLP products could change that, but only if researchers themselves start to recognize the absurdity of using AI to write for AI.