This is an article in the series “what is an intellectual life”. It is a revised version of an earlier draft, with the example of ChatGPT added. The post discusses the concept of the independent researcher.

This might not be a strange concept in general, but it may well be one in the Chinese context. To summarize, independent researchers are a structural feature of scientific progress, and are perhaps the agents of blue skies research. Here “structural” should be understood as follows. Every now and then, a system needs something fundamentally new, and such new ideas typically start at the edge of the system, where different mainstream ideas mix. Because those ideas are newborn, they do not yet have ecosystem support in society (though an idea could start a new ecosystem once it works). Therefore, the people pursuing them typically go without institutional support for some time, and are either self-funded or supported by a blue skies grant.

The coexistence of indie researchers and academic societies is a general pattern in the history of science, and in the following, we break this down step by step.

Blue skies research

To begin with, we shall explain what we mean by blue skies research. For this, we now have a miraculous contemporary example: the discovery of ChatGPT.

In an interview with Ilya Sutskever, the interviewer asked what surprised him the most in his journey of working out ChatGPT. Ilya said,

perhaps it was that neural networks work at all.

To understand such a surprise is to understand blue skies research, and for that we need to go back to Geoffrey Hinton’s days. At a few earlier academic conferences, Hinton would retell the dark days of neural network research: before the Deep Belief Network paper, “neural network” was considered a dirty word, and Hinton was told by conference organizers that

we do not publish papers on artificial neural networks;

or

we already have one paper on neural networks, and we do not need another one.

The exact phrases might be slightly different as quoted, but the message is there.

In the early days of artificial neural network research, the idea that a cartoon imitation of the biological nervous system could produce meaningful intelligence was not even considered serious science. We shall not delve too deeply into the past here, but this image also accumulated through its own history.

The ImageNet moment started with Ilya’s speculative idea: what if we scale the dataset and the model up? Perhaps neural networks did not work because they were far too small. In cooperation with Alex Krizhevsky, who was among the first to apply GPU acceleration to neural networks, they managed to scale the network up for the first time with CUDA programming, and won the 2012 ImageNet competition by a large margin over second place. This started a new era of computer vision.

From then on, the progress of deep learning is a story of visionaries making big bets on scale. AlphaGo was the bet of DeepMind’s founders on large-scale deep reinforcement learning, in which Ilya was also involved. And ChatGPT was the outcome of exploring Ilya’s idea: what if we keep scaling the model? At the time, next-word prediction was the only task that had a massive amount of data and could be trained in a self-supervised way. Even when GPT-2 showed great promise, and the scaling laws of transformers had been discovered, it was not clear where the path would lead if the model kept being scaled up, and it would take millions of dollars to train GPT-3.

And eventually, Sam Altman the Resourceful, Greg Brockman the Architect, and Ilya Sutskever the Visionary managed to push through, and discovered ChatGPT.

This is a typical process of blue skies research. It took the efforts of multiple generations of scientists: before Ilya, there was Hinton; before Hinton, there were McCulloch and Pitts (who built the first model of neural networks, which led to the current models).

Blue skies research is scientific research in domains where “real-world” applications are not immediately apparent. It is closely related to basic research, but with a subtle difference. Basic research is scientific research that aims to improve scientific theories for better understanding and prediction of natural or other phenomena, and its impact is rather well understood: it leads to major advances in technology in the decades ahead. Building giant particle colliders to investigate theoretical physics would therefore be basic research, and its utility is supported by the societal progress that a century of theoretical physics has already delivered. Blue skies research, however, is research like theoretical physics at the turn of the 20th century, when it was not at all clear that investigating black-body radiation would lead to quantum physics. There is no justification or rational argument beforehand, and sometimes not even awareness that the undertaking is possible: it may be considered too difficult to be feasible, or it may concern an issue that most people are not aware of (e.g., climate change in the 1970s).

The life cycle of such major discoveries, from blue skies research to industrial products, tends to be punctuated by steps separated by five to ten years. The narratives above are the exciting ones; before them, neural networks went through roughly 70 years of lesser-known research punctuated by steps of this kind. And this is, more or less, how almost all big discoveries are made.

This sounds markedly different from how research is carried out nowadays, when a researcher may be expected to publish a dozen papers a year. To quote Ilya again from his interview:

the goal is to discover small bits of knowledge, share them with other researchers, and get scientific recognition as a result; it is a very valid goal, and is very understandable.

The current academic style of research might be detrimental to the progress of science; further discussion can be found on Bengio’s blog, which promotes slow science. However, this is also a deep topic, which we shall not delve into too much.

In the following, we look back into the past and see that blue skies research follows patterns similar to how science progressed then.

Science in the enlightenment era

In its early stage, practicing science, which then was known as natural philosophy, was a hard way to earn a living and an easy one for making enemies.

Those early seekers continually moved from court to court and university to university, living a vagabond life. At that time, universitas, the Latin word from which “university” is derived, referred to the group of teachers in a cathedral school who, together with a few students, set up a place for higher education and governed themselves like a guild. The courts were a federation of feudal lords nominally under the leadership of the Christian papacy. The courts wanted magic, predictions and later weaponry; the universities, polemical teaching. These traveling salesmen in ideas were not philosophers in the sense of pale meditators tied to a desk, yet they managed to write an incredible number of books, often not published till after their death but widely circulated in manuscript.

This was the case for the most extraordinary characters of the 16th century. The better known is Giordano Bruno, who was protected by princes and cities for his skill in magic (which then meant seeing and utilizing, through natural philosophy, patterns unnoticed by average people) until they could protect him no longer, and he was burned at the stake by the Inquisition. The lesser known is Telesio, whose work of 1565, On Nature, led Bacon to call him “the first of the moderns”. Those pioneers are less known than their successors because the first, who struggle out of the establishment and form new and useful conceptions, appear only half right, incomplete, and remote in the history of ideas; yet they are perhaps more to be cherished than those who come after, who clear off the debris and offer a neater, more full-blown view.

And this is a repeated pattern not only in the history of ideas, but also in the history of science, though in a less dangerous fashion.

In the early Renaissance, Erasmus was a vagabond who hopped among universities and royal patrons, deliberately avoiding funds that would curb his independence while writing the translation of the Bible that unified the Greek and the Latin traditions of the New Testament. In the early Enlightenment, Spinoza was excluded from the Jewish community and expelled by the Amsterdam municipal authorities, and made a living by giving private philosophy lessons and grinding lenses while working as an independent scholar on the Ethics (published posthumously), among other works. In the early Industrial Revolution, Sadi Carnot, the father of thermodynamics, was employed on the General Staff in Paris, on call for military duty, but from then on he dedicated most of his attention to private intellectual pursuits and received only two-thirds pay. He died, unfortunately, in a cholera epidemic in 1832 at the age of 36, and his work was rediscovered by Kelvin in 1853. As is well known, in early Modernity, Einstein could not secure an academic post after graduation, and worked as a patent clerk outside academia before he worked out the theory of relativity. Bertrand Russell spent his twenties living on inherited money and led an isolated life for ten years while writing Principia Mathematica, which started analytic philosophy in revolt against German idealism. And Gödel could not find a job after obtaining his doctorate, yet attended the meetings of the Vienna Circle regularly and proved the incompleteness theorems.
Even in the period between WWII and the 1980s, which we might call the golden age of science, complexity science started in a similar way: Brian Arthur’s hair grayed while he advocated the concept of positive feedback in economics; Langton worked hourly jobs, both as a carpenter with a home remodeling company and as an assistant at a stained-glass shop, while developing the prototypical ideas that started the field of Artificial Life; and Stephen Wolfram needed to start a company to publish A New Kind of Science.

To start something fundamentally new, the road is hard and long because the evidence is sparse, the benefits are obscure, and the old systems were good: they had consistency, completeness, and proven access to material abundance; only at a few points did contrary facts or gaps in explanation threaten their validity. None of the pioneers were in “suitable” places where important work was supposed to happen.

To appreciate the timeline of scientific advances, we could look at the example of computer science. Technological advances aside, the theoretical foundation of computer science, the Turing machine, was the accumulated result of almost a century’s effort. It started with Gottlob Frege, who reworked logic in the late 19th century and is widely considered the greatest logician since Aristotle, though he was largely ignored during his lifetime. Giuseppe Peano, Bertrand Russell and Ludwig Wittgenstein continued the effort and tried to rework the foundations of mathematics and even philosophy through logic, which led to analytic philosophy. The effort ultimately led Kurt Gödel, of the Vienna Circle, to prove that any consistent formal system expressive enough for arithmetic is incomplete and cannot prove its own consistency. The result inspired Alan Turing to investigate what is reducible to logic and what is not, which led to the concept of what is computable and what is not, formalized as the Turing machine. This process took roughly a century. Similar timelines hold for statistical physics, quantum physics, the theories of relativity, and perhaps all branches of science.

None of them published many papers.

The granularity of research developments of this significance is decades. Even if the current information society can accelerate the propagation of information, it cannot accelerate the genesis of ideas, which come from the concentrated efforts of individuals or small groups that span 5-10 years, interlock with one another, and accumulate into a breakthrough over decades. This becomes obvious when one takes the effort to look into the history of philosophy and science.

Science in the Chinese context

In our times, because of the business and economic impact of science, research has become an organized effort, and thus this type of blue skies research (when it can happen) is intertwined with incremental research in academia and commercially oriented research in industry: those are the sources of funding that supported Ilya’s work until ChatGPT could be worked out.

However, the situation in China is more chaotic and difficult. I will try to describe the situation without sounding cynical. The science system in China was built on deep pragmatism, for national rejuvenation under severe resource and time pressure; as a consequence, it has been playing the game of catching up, and the research done mostly builds incrementally on previous work in order to understand the field. Meanwhile, it mostly focuses on the parts of science that have practical benefits in the short term. As a result, Chinese science takes the role of developing commercially or economically oriented science in the global science ecosystem, while the role of doing original 0-to-1 blue skies research is predominantly played by the Western system. Consequently, mainstream science in China consists mostly of commercially driven incremental research, and almost no zero-to-one original research happens, for both historical and structural reasons.

In this system, blue skies research has very little institutional support, and Yitang Zhang is a great example. Zhang was a top graduate of Peking University and obtained his PhD in the U.S. After graduation, he was deeply motivated to study profound mathematical problems. However, Zhang would rather take precarious jobs in the U.S. than come back to China, simply because of the tolerance for all kinds of individual life choices in the West: he took precarious jobs for 7 years (as an accountant, as a delivery worker for a New York City restaurant, in a motel in Kentucky, and in a Subway sandwich shop) and then worked as a lecturer for 14 years before he made a world-renowned advance in mathematics. This is an inspiring and sad story from the Chinese system, but at the same time it reveals that, as a foreigner, he did not have access to institutional support for his research in the U.S. either.

The institutional support that Ilya had in the Western system does not exist in the Chinese context. If we say that Western scientists can play the easy mode of blue skies research, then Chinese scientists have to play the hard mode of the Renaissance thinkers.

Blue skies research and independent researcher

Now we could discuss the typical characteristics of blue skies research, and how it is related to independent researchers.

This type of research typically goes through four phases: a preparation phase, in which a great amount of prerequisite training is undertaken; a hermit-gestation phase, in which individuals devote a concentrated stretch of time to working in isolation, synthesizing previous experiences, and working out a proof of concept; a monastery-order phase, in which a critical group is formed to polish the proof of concept; and lastly, a society-engagement phase, in which the result is disseminated and ultimately leads to societal impact. A more elaborate discussion of the phases of science is given by the president of the Santa Fe Institute, David Krakauer, and it is highly recommended. The Santa Fe Institute is known as the Mecca of complexity science, was set up by a number of Nobel Prize winners, and can in some sense be regarded as the continuation of the Los Alamos Lab in spirit.

The gestation phase is perhaps the most critical one: little guidance is available given that the territory is mostly uncharted, and it is not clear whether the envisioned program is significant in any sense. Consequently, it is a phase that stands between associate research and a fully grown research program. Meanwhile, support for research mostly exists at the preparation phase and the monastery-order phase, in the form of jobs and funding. The monastery-order phase is a typical university or institutional setting (recall that monasteries are precursors to universities and research institutions) or, in modern days, an industrial lab (whose support spans both the monastery-order phase and the society-engagement phase).

In recognition of this phenomenon, support for blue skies research has existed in varied forms at each stage of a scholar’s career cycle. In the early stage, the most prestigious institutions have scholarship systems that select mainly on the merit of the researcher rather than the research proposal. Cambridge University has its formidable tradition of the Prize Fellowship, under which top graduates can be awarded a fellowship for several years to pursue whatever research directions interest them: the work of Isaac Newton, Bertrand Russell, and Stephen Hawking was all supported by such fellowships. Their work was unprecedented, and no faculty at Cambridge could guide them, though some mentorship existed in the form of senior fellows. Harvard modeled its Society of Fellows on the Prize Fellowship of Cambridge. In the later stage, the institution of tenured professorship offers institutional support for this type of inquiry. And there are funding agencies that offer blue skies research grants to people who do not have access to the previously mentioned systems, which are path dependent and rather privileged choices. For example, after the miracle of ChatGPT, independent research has been promoted more systematically than before, and more institutions fund independent researchers: OpenAI and the Long Term Future Fund, to give a few examples.

However, sadly, to the best of my knowledge, such institutional support does not exist in Mainland China, even for researchers who are educated in, and later work in, the Western system. Except for isolated lucky cases, structurally and statistically, researchers in the preparation phase have to work first as apprentices to accumulate knowledge and experience, and then as contracted workers through a series of contracts, instead of as institutionally supported researchers undertaking independent inquiry (e.g., a Prize Fellow at Cambridge University, or a member of the Harvard Society of Fellows). This nostalgically resembles the experience of the Enlightenment thinkers, who needed to move from university to university and court to court, trading magic, prediction, and tutorship for patronage. And the gestation phase, which requires a high concentration of energy and time, is very hard to accommodate in the system, so being an independent researcher might be one of the few structural options available.

As for what it takes to break through these structural difficulties, Yitang Zhang, mentioned earlier, is an inspiring example of someone who perhaps broke through the system by sheer will and brute intelligence. Another, better documented example is Grigori Perelman, a Russian who proved the Poincaré conjecture, a century-old problem in mathematics, and who shares a similar socio-economic background. In his biography, the author documents a series of key figures in Grigori’s career who gave him critical help, which came about through extremely coincidental luck:

Perhaps the single most fateful incident in Perelman’s lifetime was the appearance, in Perelman’s first year at Leningrad University, of a larger-than-life presence in the form of a small old man with a square gray beard. His name was Alexander Danilovich Alexandrov; he was a living legend, and miraculously and almost ridiculously, he was teaching geometry to first-year Mathmech students.

Alexandrov was to the Soviet Union, simplistically, what Poincaré was to France, and in the year Grigori went to college, he happened to be teaching first-year math after a series of political events:

Alexandrov risked his career—and ultimately lost his post as university president—by supporting mathematicians who came under attack for being either ideologically unreliable or Jewish.

Alexandrov was fired as the president in 1964 and proceeded to spend the next two decades in what still amounted to exile of the not-entirely-self-imposed variety in Siberia, helping to create a science town there. In his seventies, he returned to his university with what turned out to be a vain hope of reclaiming a place there: he wanted to fill a vacant chair in geometry. In the run-up to the chair election, he taught a first-year course and charmed students in part because of his openness about the absurdity of his predicament.

Without such a series of supports, being an independent researcher might be the structurally unavoidable option. Simplistically, each institution has its life cycle: it started from a certain research program that was once blue skies research, and matured when this program achieved at least its initial goal. Individuals in the system need to align with that goal to find support. However, new blue skies research typically does not align with the goal very much. Thus, being outsiders frees researchers from vested structural interests, which manifest as personal, political, historical and cultural constraints that slow progress down enormously, by something like 10 times. Considering that such work typically takes at minimum a few years of concentrated effort, such a slowdown would stretch the work to a few decades. In reality, quantitative difference leads to qualitative change: this supposedly decades-long program would mostly end up being impossible, and this impossibility manifests as the institutionalized thinking of individuals in institutionalized systems, making breaking out of the box unthinkable. This is also why there is a myth that the creative energy of an individual is exhausted before one’s thirties if one has not done anything notable, and the myth that a tenured professor is a retired professor.

I hope these discussions capture what independent researchers are: they are a feature of scientific progress. The concept basically refers to people who work on something that does not yet have ecosystem support in society, though it could start a new ecosystem once it works. Many of them achieve nothing, but many major discoveries come about this way.