AI News · Leiphone · 3 hours ago

Anthropic and OpenAI Enter AI4S on the Same Day as the Giants' Ecosystem War Quietly Escalates

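Original title: Anthropic and OpenAI Make Their AI4S Moves on the Same Day; the Giants' Melee Shifts from "Competing on Models" to "Staking Out the Ecosystem"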

At a Glance

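Anthropic and OpenAI both entered the AI4S (AI for science) race on June 30. Anthropic built the Claude Science workbench on its existing models, emphasizing end-to-end research automation through toolchains and workflows. OpenAI released the GeneBench-Pro benchmark, on which even the strongest model passes only 28.7% of tasks, exposing a structural flaw it calls the “notice-act gap.” Both moves point to a ceiling on model capability, while Google DeepMind has built an ecosystem moat on foundational models such as AlphaFold. The three giants play differently: Anthropic goes wide to drive adoption, OpenAI goes narrow to define the standard, and Google goes deep on assets it owns. The moves mark a shift in AI for science from single-point model contests to competition over workflows and data sovereignty, and how scientists will choose remains an open question.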

AI In-Depth Analysis

Background

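On June 30, 2026, Anthropic and OpenAI both released new products in AI4S (AI for science), opening a new phase of competition among the giants in this field. Anthropic launched Claude Science, stating explicitly that it “does not rely on a new model” and instead integrates existing capabilities through workflows to cover scientists' day-to-day research. OpenAI launched GeneBench-Pro, an evaluation benchmark covering 10 fields including genomics and quantitative biology, which sits alongside its GPT-Rosalind model. Google DeepMind has worked in this area for years on the strength of foundational models such as AlphaFold, and its Gemini for Science platform ties proprietary assets tightly to scientific databases. With all three giants now active in the field, AI4S has reached a turning point: from “competing on models” to “staking out the ecosystem.”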

Core Content

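On June 30, Anthropic and OpenAI each placed their bets on AI4S. Anthropic released Claude Science, an agentic research workbench, and stated explicitly that it “does not rely on a new model”; instead, it integrates existing capabilities through workflows to take over scientists' day-to-day research processes.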

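OpenAI launched GeneBench-Pro, an evaluation benchmark covering 10 fields including genomics and quantitative biology. Its test data show that across 129 tasks drawn from real research workflows, even the strongest model, GPT-5.6 Sol, achieves an end-to-end pass rate of only 28.7%. The best-performing non-GPT model, Claude Opus 4.8, reaches just 16.0%. In its paper, OpenAI names this flaw the “notice-act gap”: a model can notice data anomalies and recognize local diagnostic signals, but it cannot turn that awareness into downstream decisions and carry out the correct analysis.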
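To make the two quantities in this paragraph concrete, the following is a minimal Python sketch of how an end-to-end pass rate and a notice-act gap might be computed from per-task results. Everything in it is an assumption made for illustration: the `TaskResult` fields, the function names, and the toy data are hypothetical, and the article does not describe how GeneBench-Pro actually scores tasks. As a rough sanity check on the reported figure, 28.7% of 129 tasks is about 37 tasks, assuming the rate comes from a single run.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskResult:
    """Hypothetical per-task record; these fields are illustrative assumptions."""
    task_id: str
    noticed_anomaly: bool   # the model flagged the data problem
    acted_on_it: bool       # the downstream analysis changed accordingly
    final_answer_ok: bool   # the final conclusion was graded correct


def end_to_end_pass_rate(results: List[TaskResult]) -> float:
    """Fraction of tasks whose final conclusion is correct."""
    return sum(r.final_answer_ok for r in results) / len(results)


def notice_act_gap(results: List[TaskResult]) -> float:
    """Among tasks where the anomaly was noticed, the fraction not acted on."""
    noticed = [r for r in results if r.noticed_anomaly]
    if not noticed:
        return 0.0
    return sum(not r.acted_on_it for r in noticed) / len(noticed)


# Toy data, not benchmark results.
SAMPLE = [
    TaskResult("t1", True, True, True),
    TaskResult("t2", True, False, False),
    TaskResult("t3", True, False, False),
    TaskResult("t4", False, False, False),
]

if __name__ == "__main__":
    print(f"end-to-end pass rate: {end_to_end_pass_rate(SAMPLE):.2f}")
    print(f"notice-act gap: {notice_act_gap(SAMPLE):.2f}")
    # Rough count implied by the reported rate, assuming a single run.
    print(f"tasks passed at 28.7% of 129: about {round(0.287 * 129)}")
```

In this toy sample the model notices the anomaly in three of four tasks but acts on it in only one, which is the pattern the term describes: detection without a matching change in the downstream analysis.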

The two companies appear to be heading in different directions, but both start from the same judgment: the bottleneck in AI4S is no longer that models are too weak, but that they cannot yet complete research tasks end to end. From that shared premise, Anthropic has chosen to package existing models into an extensible workbench, using toolchains and processes to compensate for the models' unreliability. OpenAI has chosen to be first to define “what counts as a completed scientific task,” anchoring its influence in the standard itself.

Google DeepMind already holds a significant advantage in AI for science through AlphaFold and other foundational models. Its Gemini for Science platform combines 30+ life science databases with proprietary assets, entering the market as an integrated platform.

The AI4S battlefield has quietly entered a stage of ecosystem confrontation among the giants, shifting from head-to-head comparisons of model capability to contests over ecosystem positioning and workflow integration.

Why are the three giants all moving into the underlying infrastructure of AI4S at this particular moment? Model capability has hit the ceiling that the “notice-act gap” describes, and the old path of piling on compute does not work in scientific research settings. Engineering integration, ecosystem positioning, and data sovereignty have become the more practical points of breakthrough. The three entered at the same time because they hit the same ceiling.

Anthropic’s approach is the most straightforward. Claude Science is essentially a dedicated workbench: the main AI assistant acts like a project manager, decomposing tasks, handing them to sub-assistants for execution, and having a fact-checking agent cross-verify the results. It connects to 60+ scientific databases and ships with pre-built toolkits for genomics, protein structure, chemistry, and other areas. Technically, it uses MCP (Model Context Protocol) to call external vertical models, such as scGPT for single-cell data and DNABERT for gene sequences, to run specific computations, while Claude itself handles only natural language understanding, task decomposition, and interpretation of results. This division of labor is why Anthropic does not need a new model: it avoids the high inference cost of having a general-purpose large model process biological matrices directly, and it lets vertical models iterate independently instead of waiting for the long update cycles of general models. More importantly, in the life sciences, where data compliance requirements are strict, sensitive data can be processed locally on the MCP server without being uploaded to the cloud.
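The division of labor described above (a coordinator that decomposes a request, specialist tools that do the computation, and a verifier that cross-checks results) can be sketched in plain Python. This is only an illustration of the pattern under stated assumptions, not Claude Science's implementation and not the MCP API: the tool functions, the keyword-based planner, and the reproducibility check are all hypothetical stand-ins, and no real model or MCP server is called.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical stand-ins for vertical models reached through tool calls.
def single_cell_tool(payload: str) -> str:
    return f"clusters({payload})"


def sequence_tool(payload: str) -> str:
    return f"embedding({payload})"


# Registry mapping a tool name to the callable that performs the computation.
TOOLS: Dict[str, Callable[[str], str]] = {
    "single_cell": single_cell_tool,
    "sequence": sequence_tool,
}


def decompose(request: str) -> List[Tuple[str, str]]:
    """Toy planner: turn keywords in the request into (tool, payload) subtasks."""
    subtasks: List[Tuple[str, str]] = []
    if "expression" in request:
        subtasks.append(("single_cell", "expression_matrix"))
    if "sequence" in request:
        subtasks.append(("sequence", "gene_sequence"))
    return subtasks


def verify(tool_name: str, payload: str, output: str) -> bool:
    """Toy verifier: re-run the tool and check that the output is reproducible."""
    return TOOLS[tool_name](payload) == output


def run(request: str) -> List[dict]:
    """Coordinator: plan subtasks, dispatch each to its tool, then cross-check."""
    report = []
    for tool_name, payload in decompose(request):
        output = TOOLS[tool_name](payload)
        report.append({
            "tool": tool_name,
            "output": output,
            "verified": verify(tool_name, payload, output),
        })
    return report


if __name__ == "__main__":
    for step in run("analyze expression and sequence data"):
        print(step)
```

The point of the sketch is that the coordinator never touches the raw data itself; it only routes work and reads back results. That is what would let the specialist tools be swapped or updated independently, or run on a local machine.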

If Anthropic’s approach amounts to claiming the entire track, OpenAI’s logic is to make GeneBench-Pro the referee that defines “what good AI4S looks like,” and then field its dedicated model, GPT-Rosalind, as the athlete chasing high scores. Before the newly released GeneBench-Pro, OpenAI had launched GPT-Rosalind four months earlier: a model fine-tuned specifically for biological reasoning and initially released in preview to eligible US enterprise customers after a security review.

Google DeepMind holds a card no one else has. It owns foundational scientific models such as AlphaFold and AlphaGenome as proprietary assets, deeply integrated with Gemini for Science alongside 30+ life science databases. The key difference is that other players can only reach such models by calling them as tools, whereas for Google they are the underlying infrastructure. Other vendors may build a better workbench or define a stricter benchmark, but the core capability of protein structure prediction remains in Google’s hands.

The three giants also differ in market positioning. Anthropic goes wide, distributing through subscriptions: Pro, Max, Team, and Enterprise users can all use Claude Science. Notably, Anthropic recently launched a $30,000 credit grant program for 50 postdoctoral and graduate student projects, with applications closing on July 15. The aim is to build the habit of using Claude Science among young scientists before they become independent PIs. OpenAI goes narrow: a public standard invites broad participation, while closed models gated behind enterprise access create the barrier. Google goes deep, building its moat from proprietary assets. There, the model is the platform, and the more deeply users rely on it, the harder it is to leave.

The three strategies reflect three different bets, each with its own risk. Anthropic bets that the ceiling will not be broken in the short term, so it uses engineering to establish workflows first. Its core risk is that if models break through sooner than expected, the workbench may be reduced to a mere orchestration tool. OpenAI bets that the ceiling will eventually be broken, so it claims the standard first and waits for model capability to catch up, but the scientific community may not accept a self-appointed referee. Google bets that there is another layer above the ceiling: whoever controls the source of foundational models holds a lasting trump card. Its moat is high, but its ecosystem is relatively closed.

Each of the three has its own chips and its own blind spots. None holds a winning hand, yet all have pushed their chips onto the table in the same window. For now the outcome is hard to predict; at the very least, no major customer has been locked in by any single company. Novo Nordisk appears both on Anthropic’s Claude Science customer list and on OpenAI’s list of early Rosalind partners. That one enterprise is trying several solutions at once suggests the market is still in open competition, and that no company’s toolchain is yet strong enough for scientists to move their complete workflow onto it.

The ultimate outcome of AI4S will probably not be decided by any single giant. The three players hit the ceiling together and entered together, but they share no consensus on how to break through. The real answer lies with scientists themselves: how they weigh data sovereignty, academic independence, and research efficiency, and where they cast their vote of trust. That answer may prove more decisive than any technical parameter.

Key Takeaways

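  • Anthropic's Claude Science workbench integrates existing models with 60+ scientific databases, decomposing tasks and handing them to sub-assistants for execution. It does not rely on a new model, and it emphasizes local data processing to meet life science compliance requirements.
  • OpenAI's GeneBench-Pro defines a strict end-to-end evaluation benchmark of 129 tasks that simulate real research workflows, from data cleaning through to conclusions. GPT-5.6 Sol reaches an end-to-end pass rate of only 28.7% and Claude Opus 4.8 reaches 16.0%, exposing the “notice-act gap.” OpenAI has also launched GPT-Rosalind as a dedicated biological reasoning model.
  • Google DeepMind deeply integrates its own foundational models, such as AlphaFold, with Gemini for Science and 30+ databases, forming a platform-level moat built on underlying infrastructure.
  • The core reason all three giants are entering AI4S together is that model capability has hit the “notice-act gap” ceiling, which more compute cannot break through. Turning to engineering, ecosystem positioning, and data sovereignty has become the pragmatic path.
  • Market positioning differs: Anthropic goes wide with subscription-based adoption (including a grant program for postdocs and graduate students), OpenAI goes narrow with a public standard plus enterprise gating, and Google goes deep with a moat of proprietary assets.
  • Pharmaceutical giants such as Novo Nordisk are trying several vendors' solutions at once, so the AI4S market is still in open competition. How scientists trade off data sovereignty, academic independence, and research efficiency will determine where it ends up.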

Significance and Impact

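The melee in AI4S marks the field's shift from a pure race on model capability to a broad contest over ecosystem positioning and workflow engineering. That all three giants chose this moment to enter reflects a shared recognition of the “notice-act gap” ceiling: piling on compute is hard to make work in research settings, so engineering integration, data compliance, and platform moats have become the more practical breakthrough points. Anthropic's Claude Science, OpenAI's GeneBench-Pro and GPT-Rosalind, and Google DeepMind's Gemini for Science will reshape the efficiency, data security, and collaboration models of scientific research.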

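This shift directly reduces research institutions' reliance on costly human experts (thousands of dollars per task), encourages a division of labor between vertical models and general-purpose large models, and protects sovereignty over sensitive data through techniques such as MCP and local processing, in line with the strict compliance standards of the life sciences. At the same time, the three platform strategies may produce different kinds of ecosystem lock-in: Anthropic's subscription-based adoption may accelerate habit formation in academia, OpenAI's definition of the standard may secure its influence over the field, and Google's proprietary assets build a high barrier to entry.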

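At the market level, pharmaceutical giants trying several solutions in parallel shows that competition is still open, and scientists' trade-offs among data sovereignty, academic independence, and research efficiency will determine where trust finally lands. The endgame of AI4S will not be decided by any single giant; it will be a dynamic balance shaped jointly by scientists and enterprises. This event marks a new strategic stage for AI for science, sets a clear path for the next generation of research tools and infrastructure, and opens an important window for industry participants to stake out their position in the ecosystem.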
