Academic Cycle

1. Background

About the Author: Alexander Wu (geekan) is the creator of MetaGPT, a startup CEO, and a seasoned software engineer. He authored the first version of MetaGPT’s codebase and brings 18 years of programming expertise to the project. At the peak of his technical versatility, Wu worked with 21 programming languages within a year.

2. Introduction

Since January 22, many people have asked me about the principles behind DeepSeek R1. However, enthusiasm for the technology itself typically resides within a small circle. Today, I would prefer to focus on a different question: why DeepSeek was the organization that created R1. DeepSeek’s true merit lies in being an organization that emphasizes the academic cycle, and this has significantly shaped its ability to create R1. By comparison, the algorithm itself matters less. Today’s innovative algorithms may be supplanted at any moment, whereas a distinguished organization consistently propels technology forward.

Academic Cycle: Engaging in organizational-level critical thinking to consistently create atomized innovations that advance the frontiers of science.

The strength of an organization acts as a catalyst for innovation, thereby establishing a competitive advantage. This phenomenon has recurred consistently over the past decade. ByteDance exemplifies this trend, having entered the market later while outperforming established entities such as Tencent, Meta, and Google in the realm of recommendation systems. This notable success can be attributed to ByteDance’s foundational organizational structure and its stringent A/B testing methodology, which cultivated an academic cycle and yielded superior performance.

Early OpenAI followed a similar trajectory. Ilya Sutskever established a robust academic organization, maintaining a closed innovation cycle, ultimately producing ChatGPT. This breakthrough significantly distanced OpenAI from competitors and secured a valuation exceeding $100 billion. These cases demonstrate that specific organizational structures naturally facilitate innovation. Without such frameworks, large-scale innovation rarely occurs.

While DeepSeek, OpenAI, and early ByteDance all maintain academic cycles, they follow distinctly different pathways, as illustrated in the following comparison table:

	DeepSeek	OpenAI	ByteDance
Academic Cycle	Yes	Yes (internal)	Yes (internal)
Critical Thinking	High (open-source enhancement)	High	High (passive acquisition)
Achievements	DeepSeek App DAU exceeded 10 million in February 2025, just two weeks after R1’s release	ChatGPT App DAU reached over 20 million in February 2025 after long-term growth	Industry-leading recommendation systems: TikTok, Toutiao, Douyin, and Xigua Video
Open-Sourcing	Extensive; both V3 and R1 open-sourced	Substantial in early stages; primarily closed-source later	Limited; includes BytePS and a few others
Strategic Governance Framework	Highly autonomous; anyone can mobilize all resources; complete equality among team members	Initially autonomous; evolved to a hierarchical control structure	Partially autonomous
Process Visibility	High	Initially high, now low.	High (internal)
Scaling Pathway	Shaped academic cycle through open source practices and team members with strong critical thinking skills	Shaped academic cycle through Ilya’s organizational design and team members with strong critical thinking skills	Shaped academic cycle by cultivating critical thinking through experimental systems and OKR-driven organizational structure

It’s easy to identify the common thread among these companies: they all cultivated academic cycles. This commonality exists because innovation rarely emerges from isolated breakthroughs, but rather from synthesizing numerous incremental achievements. Only academic cycles can consistently produce these small advances that ultimately coalesce into significant innovation.

This explains an interesting observation: initially, few recognized the potential of DeepSeek, OpenAI, or ByteDance. Many industry observers even considered these companies “questionable.” This perception occurs because when an organization’s reasoning extends beyond established consensus, that same consensus becomes an unreliable metric for evaluation. More fundamentally, this misconception stems from a widespread lack of appreciation for the critical importance of academic cycles.

Using DeepSeek as an example, they could not have developed DeepSeek R1 without innovations like MLA, GRPO, new MoE architectures, PTX, self-play techniques, and their documented failures with MCTS and PRM.
Without OpenAI’s progression through GPT-3, Scaling Laws, InstructGPT, and numerous setbacks in reinforcement learning (their original focus area), they could not have created GPT-3.5 and GPT-4.
Without ByteDance’s experimental system quantifying the contribution of each algorithmic strategy, they couldn’t have validated the effectiveness of numerous features in their recommendation systems. This validation was crucial to significantly outperforming competitors and developing successful products like Toutiao, Xigua, and Douyin.

More specifically, an academic cycle enables a small team to substantially exceed the boundaries of human knowledge within a finite timeframe, reaching previously unattained heights.

In practice, academic cycles require every organizational member to possess several critical attributes, a combination that is exceedingly difficult to achieve. The most essential elements include:

Rational thinking: Establishing critical thinking capabilities throughout the entire organization.
Flow state immersion: Maintaining strong motivation, dedicating 100% focus, making commitments, and self-activating to achieve flow states.
Accepting human limitations: Initiating without hesitation, acknowledging mistakes without dwelling on them, documenting lessons and moving forward, and maintaining authenticity.
Pursuing global optimization: Recognizing that all processes may be suboptimal until true objectives become fully defined.

3. Content

3.1 Rational Thinking

3.1.1 What Is Knowledge?

Recently, at an academic conference, we discussed methods for evaluating civilizational evolution with a Stanford professor. During this discussion, a Berkeley PhD suggested defining evolution as “the discovery of new knowledge.” We debated the definition of knowledge extensively without reaching any conclusion. Afterward, I asked DeepSeek R1 for its definition of knowledge. The final paragraph of its response was deeply unsettling:

Knowledge is the universe’s tool for understanding itself. From quantum fluctuations to the emergence of consciousness, knowledge is essentially a recursive attempt by a localized universe region (human civilization) to comprehend the whole. It serves both as our shield against entropy and as a prison constraining our cognition. True wisdom may lie in maintaining reverence for the “unknown unknowns,” as Socrates observed: “The only thing I know is that I know nothing.”

Knowledge encompasses many different vehicles. DNA represents 4 billion years of accumulated evolutionary knowledge, obtained through Darwinian optimization processes. However, what we commonly refer to as knowledge primarily derives from scientific advancement, gradually formed over the past two millennia and predominantly conveyed through text and imagery.

Civilization advances at a pace that significantly surpasses the random mutations and genetic recombination of deoxyribonucleic acid (DNA). This acceleration transpires due to our mastery of symbols as a means of communication, which facilitates widespread innovation and enhances our species.

But how does innovation actually happen? I believe systematic innovation depends on collective critical thinking.

3.1.2 Critical Thinking


A few months ago, while writing a PhD recommendation letter for a student, I noticed almost every top academic institution ranked the same criterion first: critical thinking. This priority likely stems from recognizing that critical thinking is essential for efficiently advancing scientific boundaries.

Science originates from philosophy, and the roots of critical thinking can be traced back to ancient Greece in the fifth century BCE. Socrates, who referred to himself as a philosopher (from the Greek φίλος (philos) meaning ‘love’ and σοφία (sophia) meaning ‘wisdom’, thus ‘lover of wisdom’), encouraged his students to question assumptions and examine evidence through the ‘Socratic method’, which is regarded as an early practice of critical thinking. Aristotle systematically integrated mathematical methods with philosophical speculation, transforming empirical debates into formalized syllogisms. Together, they laid much of the groundwork for critical thinking.

Edward Glaser gave the term “critical thinking” an influential early definition in 1941. In 1990, the American Philosophical Association’s Delphi Project established a consensus definition, describing it as purposeful, self-regulatory judgment grounded in evidential, conceptual, methodological, and other considerations.

I wrote a document on critical thinking that I share with collaborators. It expands the scope, requiring people to distinguish facts from opinions, evaluate evidence quality, understand reasoning methods, formulate hypotheses, and conduct experiments. The process emphasizes atomic and incremental progress while developing high-quality, reusable experiences—ideally in code format.

Critical thinking strongly relates to Elon Musk’s first-principles thinking; the core components of first-principles thinking are nearly equivalent to critical thinking.

However, individual critical thinking doesn’t automatically translate to collective critical thinking. Many organizations recruit numerous PhDs without gaining organizational advantages.

Recently, a number of prominent technology companies have established extensive artificial intelligence research departments, recruiting leading PhDs in machine learning. However, organizations employing 50 to 100 AI PhDs are frequently observed to innovate more slowly than startups with only 5 to 10 researchers but a more robust academic-cycle culture.

3.1.3 Evidence Hierarchy

The most crucial aspect of critical thinking is distinguishing facts from opinions. To achieve sufficient evidence quality, we must verify the authenticity and reliability of collected data and evidence, avoiding misleading or unverified information. Generally, I categorize evidence levels as follows:

Evidence hierarchy: Opinion < Consensus < Statistics < Experiment < Double-blind experiment (A/B testing) < Common knowledge
Higher-level evidence should take precedence when addressing the same question. For instance, “X is better than Y” is merely an opinion, while “A/B testing shows X improved over Y by 5% in scenario Z” constitutes evidence. Naturally, obtaining higher-level evidence incurs costs.
Consensus is not strong evidence. When everyone believes a particular stock will rise, it often declines significantly. Quality evidence requires sufficient evidentiary grade, such as reasoning based on facts: “Satellite imagery reveals widespread pest infestation in soybean production regions, projected to impact xxx” → “Sell soybean futures”

However, evidence hierarchies aren’t absolute. For example, when Ilya states that LLM pretraining has concluded, his opinion may carry more weight than the general consensus. Nevertheless, this partial ordering of evidence levels holds true in most contexts.
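The ordering above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names `EvidenceLevel`, `Claim`, and `strongest` are hypothetical, and the hierarchy is treated as a strict total order even though, as just noted, real evidence levels form only a partial ordering that depends on context.

```python
from dataclasses import dataclass
from enum import IntEnum


class EvidenceLevel(IntEnum):
    """Evidence grades from the hierarchy above, weakest to strongest."""
    OPINION = 1
    CONSENSUS = 2
    STATISTICS = 3
    EXPERIMENT = 4
    AB_TEST = 5            # double-blind experiment
    COMMON_KNOWLEDGE = 6


@dataclass(frozen=True)
class Claim:
    statement: str
    level: EvidenceLevel


def strongest(claims: list[Claim]) -> Claim:
    """Return the claim backed by the highest evidence level (first one wins on ties)."""
    return max(claims, key=lambda c: c.level)


# Hypothetical competing claims about the same question.
claims = [
    Claim("X is better than Y", EvidenceLevel.OPINION),
    Claim("Most of the team prefers Y", EvidenceLevel.CONSENSUS),
    Claim("A/B test: X improved over Y by 5% in scenario Z", EvidenceLevel.AB_TEST),
]
best = strongest(claims)
print(best.level.name, "-", best.statement)
```

In a discussion, the practical use is less about picking a winner automatically and more about making everyone state which level their claim sits at before it is debated.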

3.1.3.1 A/B Testing

When discussing evidence hierarchies, we must highlight A/B testing, which is the highest level of evidence we can produce.

This introduces an interesting phenomenon: collective critical thinking doesn’t necessarily require developing critical thinking in each individual team member.

ByteDance provides an illustrative example. In 2012, Zhang Yiming wrote A/B testing code himself. Around 2014, he established an experimental system requiring all improvements to undergo A/B testing validation. This effectively mandated that all team members support their views with high-grade evidence.
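To make “high-grade evidence” concrete, here is a minimal sketch of how an A/B result on click-through rate might be checked for statistical significance. It is not ByteDance’s system: the function name `two_proportion_z_test` and the traffic numbers are assumptions chosen for illustration, and the method shown is a standard pooled two-proportion z-test.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (relative_lift, z, two_sided_p) for variant B against control A."""
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    # Pooled click rate under the null hypothesis that A and B perform the same.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    relative_lift = (p_b - p_a) / p_a
    return relative_lift, z, p_value


# Hypothetical experiment: 100,000 impressions per arm.
lift, z, p = two_proportion_z_test(
    clicks_a=5000, views_a=100_000, clicks_b=5250, views_b=100_000
)
print(f"relative lift = {lift:.1%}, z = {z:.2f}, p = {p:.4f}")
```

A claim such as “X improved over Y by 5%” only reaches the A/B-test evidence level when it comes with the sample sizes and a significance check like this one; the same 5% lift on a much smaller sample could easily be noise.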

This approach essentially forced everyone in the system to acquire critical thinking skills. One could argue this represents a form of Darwinism, as those unable to develop critical thinking would face layoffs. I recall that around 2014, ByteDance discovered what might be called a “PhD scaling law”: each doctoral graduate hired for algorithm optimization correlated with a relative 1% annual CTR improvement.
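The word “relative” matters in that recollection. The short sketch below shows the arithmetic with a hypothetical 5% baseline CTR, under the assumption (mine, not stated in the source) that independent improvements stack multiplicatively; `apply_relative_gains` is an illustrative helper name.

```python
def apply_relative_gains(base_ctr: float, gains: list[float]) -> float:
    """Compound relative gains multiplicatively: ctr * (1 + g1) * (1 + g2) * ..."""
    ctr = base_ctr
    for g in gains:
        ctr *= 1 + g
    return ctr


base_ctr = 0.05  # hypothetical baseline: 5% click-through rate

# One relative 1% gain moves 5.00% to 5.05%, not to 6%.
one_gain = apply_relative_gains(base_ctr, [0.01])

# Ten such gains, assuming they compound independently.
ten_gains = apply_relative_gains(base_ctr, [0.01] * 10)

print(f"one gain:  {one_gain:.4%}")
print(f"ten gains: {ten_gains:.4%} (total relative lift {ten_gains / base_ctr - 1:.2%})")
```

Each individual gain is small enough that it could only be detected and attributed with a rigorous experimental system, which is why the A/B testing platform came first.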

3.1.3.2 Fuzzy Correctness

At this point, we need to consider another question: high-evidence standards work well for recommendation systems, but are they beneficial in all domains?

Not necessarily. Recommendation systems operate in well-defined, easily verifiable environments. Most challenges in other fields involve lengthy experimental cycles where hypotheses can’t be rapidly confirmed or refuted. Much as Process Reward Models (PRM) proved difficult to implement, rigorous evidence for every step is often out of reach; perhaps what we need instead is what I call “fuzzy correctness.”

Groundbreaking innovations typically emerge from combinations of fuzzily correct components. Regarding LLM development, papers I frequently mention include Tomas Mikolov’s word2vec, which demonstrated that word pairs like “man-woman” and “king-queen” are separated by similar vector offsets, and the Transformer architecture, which increased parallelization and reduced training cost compared to RNNs. These were only fuzzily correct components for LLMs, as their future benefits were unknown and difficult to estimate.

Evaluating cutting-edge problems is extraordinarily challenging, often as difficult as solving them. Who could have predicted GPT-4’s powerful brand effect that drove global adoption? Likely no one, probably not even OpenAI, before its release. Similarly, who could have foreseen R1’s current level of breakthrough success? Presumably no one, including DeepSeek.

Our pursuit of evidence standards becomes nuanced: success requires high-quality evidence, yet some evidence remains nearly impossible to obtain, forcing us to advance through fuzzy correctness.

Nevertheless, the pursuit of evidence must permeate the entire organization, establishing the foundation for collective rationality and enabling efficient communication among all team members.

3.1.4 Collective Rationality

Collective critical thinking is challenging to achieve in most fields. It requires individuals with essential critical thinking abilities alongside qualities like sincerity, goodwill, and optimism, all within an organization with appropriate standard operating procedures. One crucial SOP is “focus on issues, not people,” which effectively encourages collective rationality.

In a rational group environment, everyone must distinguish between facts and opinions in their communications. This skill is relatively accessible even for those who have not mastered the entire critical thinking process. When confronted with a questionable opinion, the appropriate response is simply: “What evidence supports this view?” When facing unfamiliar terminology, ask: “How is this term defined?” If someone lacks search skills, teach them effective search techniques. If emotions creep into discussions, guide them back to the “focus on issues, not people” principle.

This approach addresses typical problems like individual bias. Despite considering myself reasonably intelligent, I make frequent mistakes. I must encourage everyone to point out my errors, listen carefully to feedback, quickly accept valid points, and thoughtfully defend positions I believe are correct until a decision cycle completes. While collective critical thinking requires most individuals to possess critical thinking skills, this alone is insufficient. We can achieve true collective rationality only when everyone encourages participation and treats others with goodwill.

Each person’s mental context differs completely. Even subtle contextual differences can lead to wildly divergent conclusions about the same issue. When hearing something that seems absurd, asking “What’s the context here?” often reveals that our own position was mistaken. Like language models, we can be overconfident. Knowing when to be confident is a difficult skill that most people lack.

The key is guiding others toward sound reasoning. Correcting muddled logic in someone lacking critical thinking skills can be tremendously time-consuming. It’s often more efficient to identify the issue and request that they formalize their thinking in a structured document (like an arXiv paper or RFC) rather than engaging in open discussion or debate. In articulating their logic, they naturally discover flaws, saving everyone considerable time.

Collective rationality is difficult to achieve, but it remains the most vital component of the academic cycle.

3.2 Flow State Induction

3.2.1 Motivation

Without proper motivation, mental focus becomes difficult. Your mind wanders elsewhere, making it nearly impossible to enter a flow state.

If money is someone’s primary motivation, they rarely accumulate truly significant wealth. Paradoxically, those focused on solving important problems often earn unexpected fortunes. An organization filled with self-interested individuals requires extremely robust SOPs to function effectively. Conversely, an organization comprised of people with altruistic motivations requires far fewer procedural controls.

Generally, the best motivation is simply enjoying what you do. Programmers who love coding, lawyers who enjoy debate, architects who think deeply, and salespeople who thrive on communication all derive satisfaction directly from their work without needing external validation. This presents an interesting paradox: both productive work and scrolling through TikTok consume the same amount of time, yet some people genuinely prefer meaningful work while others only want entertainment. It isn’t easy to continuously motivate someone whose neural pathways are wired for constant stimulation.

Truthfully, people with genuine intrinsic motivation are exceedingly rare. Developing such motivation requires either environmental conditioning throughout one’s history or deliberate self-training—both uncommon paths. Those with strong intrinsic motivation have essentially overcome “genetic gravity”—our biological programming that prioritizes survival and reproduction above all else. When Zhang Yiming emphasizes delayed gratification, he addresses this fundamental issue.

3.2.2 Commitment

Once motivation is in place, complete dedication becomes imperative. Insights often develop from subtle moments of intuition that surface abruptly, perhaps triggered by an overlooked piece of context raised in conversation or by an earlier idea that resurfaces during divergent thinking. Inspiration tends to arrive when one stays engaged with the subject while dining, walking, or resting, without letting attention drift elsewhere. Every exceptional individual I have encountered works this way. People frequently ask how I came up with a specific idea; there is no single technique, only consistent contemplation at every moment.

Naturally, even with complete dedication, we sometimes find ourselves less productive than usual. During these times, making bold commitments to others becomes important: “I’ll deliver a draft of this paper within 24 hours.” We may not always succeed, and such challenges can be significant, but the key is to make these commitments fearlessly, without worrying about criticism. As you continually challenge yourself, you gradually map out your capability boundaries and steadily strengthen your abilities.

3.2.3 Flow

All these behaviors contribute to achieving a flow state, an extraordinarily efficient cognitive condition attainable through moderate challenges within a relaxed environment.

As illustrated in the diagram, when individuals enter a state of flow, they tend to disregard most external distractions and lose track of time. Tasks are accomplished rapidly while preserving a sense of emotional calm, typically at several times normal efficiency. If an organization promotes an environment that makes flow states accessible, the collective output of N people may rise from roughly N to roughly kN, where k could be several times greater or even exceed ten. In big-O terms both are O(N), so scaling remains linear in headcount; the gain is a much larger constant factor.

3.3 Imperfection-Informed Systems Design

In reality, even the most exceptional individuals possess numerous flaws.

3.3.1 Start First, Don’t Hesitate

A prominent human limitation is hesitation. Many challenges are intrinsically ambiguous, which makes their outcomes hard to predict in advance.

People with strong credentials are frequently reluctant to embark on new endeavors, because innovation inherently involves chance. That reluctance diminishes the potential for innovation itself. One must act before assessing: one cannot know what lies beyond a closed door without first opening it.

Fleming’s discovery of penicillin was not intentional, but rather due to his initial experimentation and vigilant observation of unexpected phenomena. In academic circles, acting before achieving perfection is a fundamental principle. Those who wait for ‘inspiration to strike’ or the ‘perfect moment’ often achieve nothing. As Einstein said, ‘I have no special talent. I am only passionately curious.’ This curiosity drove him to experiment continuously, rather than waiting for inspiration.

3.3.2 Record Everything

Innovation relies on documentation, not memory. Human memory is notoriously unreliable, with a steep decay curve. Reviewing documents I wrote just a month ago, I'm often amazed: "Did I really write something this insightful?" It sometimes seems absurd. Over the past 13 years, I've created thousands of documents recording my thoughts. I document everything, then forget it all, clearing my mind.

Documentation serves beyond preventing forgetfulness—it allows us to observe how our thinking evolves. By reviewing early records, we can trace the development of ideas and identify overlooked insights that often become fountains of innovation.

Making mistakes isn’t problematic; failing to learn from them is. Edison tried over a thousand materials while inventing the light bulb, with each failure being a necessary step toward success. He is often quoted as saying: “I didn’t fail 1,000 times; I found 1,000 materials unsuitable for filaments.”

In scientific exploration, errors frequently provide more insight than correct answers. Quantum mechanics was developed through a series of “incorrect” theories undergoing continuous refinement. Though ultimately proven incomplete, the Bohr model guided quantum mechanics’ developmental direction.

3.3.3 Maintaining Sincerity

One critical aspect of dealing with yourself and others is maintaining authenticity.

Upon mastering fundamental critical thinking skills, individuals often develop a heightened awareness of boundaries. Everyone understands each other’s limits, adheres to established protocols, and conscientiously avoids overstepping, focusing solely on designated responsibilities. The result is a scenario in which “all individuals seem exemplary, yet no tangible outcomes are achieved.”

A quintessential illustration is observed during project evaluations involving specialists from diverse functions. Each expert provides many professional perspectives, resulting in an elongated project approval process that may extend for an entire year. Consequently, by this time, the project tends to have forfeited its competitive advantage compared to its initial proposition.

A further illustration pertains to forming a distinguished academic team wherein all members maintain politeness and regard for each other’s expertise and perspectives. Consequently, individuals may experience discomfort when it comes to identifying issues, leading to superficial dialogues that regress into philosophical debates without yielding practical results.

In truth, we must uphold authenticity by addressing ambiguous matters, articulating potentially unsettling truths, and refraining from concealing thoughts that may overstep boundaries. Concurrently, we must be willing to accept feedback from others genuinely.

People often worry that a misspoken word or imprecise reasoning will confuse others or be misinterpreted. Perfect precision, however, is not always required. One should state one’s views candidly, with a reasonable degree of rigor. Even when others disagree, being corrected is not detrimental: the speaker loses nothing and gains knowledge that yields long-term benefits.

3.4 Pursuing Global Optimization

A common organizational problem involves mistaking partial optimization for global optimization. At its core, the academic cycle represents a continuous exploration process seeking global optima.

3.4.1 Doing the Right Thing vs. Doing Things Right

“Doing things right” focuses on efficiency, while “doing the right things” prioritizes effectiveness. Efficiently pursuing the wrong objectives only gets us to the wrong destination faster. Most failed projects don’t suffer from poor execution but from choosing incorrect directions.

In DeepSeek’s case, they pursue technical excellence and directional correctness. If they had focused solely on optimizing existing algorithms without questioning whether the algorithmic paradigm itself was appropriate, they would have fallen into the local optimization trap like many AI teams. Innovation requires continuously asking, “Are we doing the right things?” rather than just, “Are we doing things right?”

3.4.2 Process Optimization vs. Outcome Optimization

Systems are typically designed to pursue process optimization: following standards, submitting on schedule, and avoiding anomalies. However, genuine innovation frequently emerges from disrupting established processes. Einstein didn’t develop relativity by strictly adhering to the physics conventions of his time, but by questioning fundamental assumptions.

In academic cycles, result optimization significantly outweighs process optimization. This requires organizations to tolerate “beneficial chaos”—allowing rule-breaking when it serves higher objectives. Early OpenAI culture exemplified this approach. Rather than limiting themselves to academia’s standard publication processes, they selected the most effective methods to advance research, even when this meant departing from tradition.

3.4.3 KPI vs. OKR

KPIs fundamentally measure progress along known paths, while OKRs explore uncharted territories. Organizations fixated on KPIs typically become shortsighted and risk-averse. KPI-driven teams optimize established metrics while overlooking potentially breakthrough directions. In contrast, OKRs encourage setting ambitious goals, even when full achievement seems unlikely. This approach better suits innovative environments facing uncertainty. ByteDance’s success partly stems from adopting OKR management rather than pure KPI systems, enabling teams to transcend existing boundaries. True innovation requires the courage to pursue objectives that appear impossible by current standards. The academic cycle inherently challenges cognitive boundaries—not precisely executing known paths, but systematically exploring unknown domains. Ultimately, successful organizations aren’t perfecting known processes but continuously discovering and executing the right initiatives.

4. Conclusion

Innovation isn’t accidental but the inevitable result of internal logic. As an organizational-level intellectual operating mechanism, the academic cycle systematically amplifies and optimizes human cognitive processes. When examining cases like DeepSeek, OpenAI, and ByteDance, we observe not merely technological achievements but the triumph of a specific mindset. This approach transcends the limitations of individual genius to establish a collective intelligence system capable of continuously generating breakthroughs.

The core value of the academic cycle lies in eliminating the randomness of innovation and transforming it into a sustainable, replicable pathway. This isn’t achieved through singular management techniques or team composition but through cultivating a specific cognitive ecosystem: rational thinking provides the foundation, flow states supply momentum, accepting human limitations ensures resilience, and pursuing global optimization offers direction.

Within this framework, innovation is no longer contingent upon extraordinary genius or fortuitous inspiration; instead, it transforms into an organizational competency - a cultural attribute that can be intentionally nurtured. Exceptional organizations are not solely defined by the quantity of experts or resources at their disposal, but rather by their ability to achieve sustained and practical innovation.

Ultimately, the academic cycle reveals a profound truth: innovation’s essence lies not in technological tools or methodologies, but in how we organize collective thinking. When an organization systematically transcends the limitations of human cognition, breakthroughs emerge as naturally as water flowing downhill.

This may explain why organizational culture and cognitive frameworks are recognized as paramount competitive advantages in today’s swiftly evolving technological landscape. While algorithms may ultimately be outperformed, the academic cycle, once established, becomes an inexhaustible source of innovation.

The article’s author, Chenglin Wu (Alexander Wu), created MetaGPT, which also demonstrates a complete academic cycle. The MetaGPT team not only built the first multi-agent coding agent but has also published seven academic papers, ranking first and second in the LLM Agent field at ICLR 2024 and ICLR 2025, respectively.

The team has also launched a commercial product, MetaGPT X (mgx.dev), showcasing how academic cycles can be implemented in practice. It is the world’s first multi-agent coding-agent product and reached $1 million ARR just one month after its launch. This offers practical evidence of the academic cycle’s effectiveness.
