The Neuron Times, Tuesday 27 May 2026

Front Page · Models & Frontier

Claude Mythos solves a legendary Erdős problem, reigniting the debate over AI's mathematical abilities

Days after OpenAI refuted the Erdős unit-distance conjecture, Anthropic claims its Mythos model found an elegant proof of the original problem, a result that raises questions about the boundary between discovery and imitation.

From the newsroom, 27 May 2026

The battle of the mathematical models has taken an unexpected turn. Just after OpenAI published a refutation of the Erdős unit-distance conjecture, a problem open since 1946, The Decoder reports that Anthropic claims its Claude Mythos model solved the original problem "last weekend" with what engineer Sholto Douglas calls a "cute, simple proof."

This result, if confirmed, would be a strong signal of what Douglas calls a "serious overhang": a growing gap between the real capabilities of frontier models and what the scientific community actually measures. That a model could produce an elegant proof of a problem that resisted the best mathematicians for nearly eight decades raises fundamental questions about the nature of AI-assisted discovery.

The episode comes in a week crowded with model announcements. OpenRouter has listed several major new entries, including Alibaba's Qwen3.7-Max (1M context, priced at $1.25/M input tokens), xAI's Grok Build 0.1 (256K context, optimized for agentic software engineering), and Google's Gemini 3.5 Flash (multimodal, 1M context). Anthropic has also launched Claude Opus 4.7 Fast, a high-speed variant of its latest flagship model.

Major

China restricts travel for its top AI researchers

From the Wires, 26 May 2026

According to The Decoder, Beijing now requires AI researchers at private companies such as Alibaba and DeepSeek to obtain official authorization before leaving the country. The measure is motivated by fears of data leaks, technology theft, and talent poaching.

Security

Google Cloud calls for AI security to be brought to the boardroom

From the Wires, 26 May 2026

Google Cloud's COO, Francis de Souza, urges companies to build security into their AI strategy from the outset. His message: cybersecurity should no longer be relegated to technical teams but discussed at the highest level.

Society

AI-hallucinated citations are contaminating medical papers

From the Wires, 26 May 2026

An audit of 2.5 million biomedical articles finds that the rate of fabricated references has increased more than twelvefold since 2023. Researchers suspect a link to the widespread use of language models. 98% of the affected papers have received no response from their publishers.

Page 1: Front Page

I. Models & Frontier

Anthropic

Claude Mythos and the Erdős proof: genuine breakthrough or publicity stunt?

Anthropic claims its Mythos model has solved the Erdős unit-distance problem with a "cute, simple proof." The result, reported by engineer Sholto Douglas, comes just after OpenAI's published refutation. The mathematical community remains cautious, awaiting independent verification. Read the article.

OpenRouter

Qwen3.7-Max, Grok Build, and Gemini 3.5 Flash land on the API

The OpenRouter catalog gains three major models: Alibaba's Qwen3.7-Max (1M context, $1.25/M tokens), xAI's Grok Build 0.1 (256K context, geared toward agentic coding), and Google's Gemini 3.5 Flash (multimodal, 1M context, $1.50/M tokens). These entries confirm the accelerating pace of proprietary model releases. View the catalog.

Anthropic

Claude Opus 4.7 Fast: speed at a steep price

Anthropic launches Opus 4.7 Fast, a high-throughput variant of its latest flagship. It has the same capabilities as Opus 4.7 but faster output, with pricing six times higher ($30/M input tokens, $150/M output tokens). It is aimed at latency-sensitive professional applications. Pricing details.
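To put the quoted rates in concrete terms, the sketch below computes the cost of a single request at $30/M input and $150/M output tokens. The token counts are made-up illustration values, and the "implied standard" rate is only an inference obtained by dividing the quoted prices by six, as "six times higher" suggests; it is not a figure stated in this item.

```python
# Quoted Opus 4.7 Fast rates, in US dollars per million tokens.
FAST_INPUT_PER_M = 30.0
FAST_OUTPUT_PER_M = 150.0
PRICE_MULTIPLIER = 6  # "six times higher", per the item above


def request_cost(input_tokens: int, output_tokens: int,
                 input_per_m: float, output_per_m: float) -> float:
    """Return the dollar cost of one request at per-million-token rates."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


# Hypothetical workload: 20,000 prompt tokens and 2,000 completion tokens.
fast_cost = request_cost(20_000, 2_000, FAST_INPUT_PER_M, FAST_OUTPUT_PER_M)

# Implied standard rates (an inference from the multiplier, not a quoted price).
base_cost = request_cost(20_000, 2_000,
                         FAST_INPUT_PER_M / PRICE_MULTIPLIER,
                         FAST_OUTPUT_PER_M / PRICE_MULTIPLIER)

print(f"fast: ${fast_cost:.2f}, implied standard: ${base_cost:.2f}")
```

Because output tokens cost five times as much as input tokens at these rates, the total is dominated by completion length for generation-heavy workloads.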

xAI

Grok Build 0.1: xAI's agentic coding model enters the arena

xAI launches Grok Build 0.1, a fast model trained specifically for agentic software engineering workflows. It supports text and image input, offers 256K context, and is priced aggressively ($1/M input, $2/M output). The model is now available through the xAI API and OpenRouter. Read the announcement.

Google DeepMind

AlphaProof Nexus: nine Erdős problems solved autonomously

Google DeepMind's system has autonomously solved nine open Erdős problems, including two dating from 1970, at a cost of a few hundred dollars per problem. Unlike OpenAI's natural-language approach, AlphaProof Nexus uses the Lean compiler to formally verify each step. The overall success rate remains modest: 2.5%. Read the article.
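For readers unfamiliar with formal verification, here is a toy Lean 4 statement. It has nothing to do with the Erdős problems and is not taken from AlphaProof Nexus; it only illustrates the idea that a proof is accepted when, and only when, Lean can check every step.

```lean
-- Toy example: addition on natural numbers is commutative.
-- `Nat.add_comm` is a lemma from Lean 4's core library.
theorem toy_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

If the proof term were wrong or incomplete, the file would fail to compile, which is what separates this style of output from a natural-language argument that still needs human review.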

Anthropic

Chris Olah invited by Pope Leo XIV: do AI models show signs of introspection?

The Anthropic co-founder was invited to speak at the launch of Pope Leo XIV's encyclical "Magnifica Humanitas." Olah stated there that AI models show signs of introspection and emotional states. The papal document took a more cautious tone: "These systems merely mimic certain functions of human intelligence." Read the article.

Page 2: The Technical Notebook

II. Harnesses & Tooling

Claude Code

Claude Code v2.1.152: automatic code review and reloadable skills

The latest version of Claude Code introduces /code-review --fix, which automatically applies review suggestions to the working tree. The /simplify command now invokes /code-review --fix. Skills can declare disallowed tools in their frontmatter, and the /reload-skills command rescans the skill directories without restarting the session. Release notes.

OpenAI Codex

Codex CLI 0.134.0: conversation history search and unified profiles

OpenAI releases version 0.134.0 of Codex CLI with search over local conversation history (case-insensitive matching with previews), and makes --profile the primary selector across the CLI, TUI permissions, and sandbox flows. MCP setup improves with per-server environment targeting and OAuth options for streamable HTTP servers. Release notes.
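As a rough illustration of what "case-insensitive matching with previews" can mean, here is a minimal Python sketch. It is not Codex CLI's implementation: the `Match` class, the `search_history` function, the `context` parameter, and the in-memory dictionary of conversations are all hypothetical names chosen for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class Match:
    conversation_id: str
    preview: str  # the hit plus a little surrounding text


def search_history(conversations: dict[str, str], query: str,
                   context: int = 20) -> list[Match]:
    """Return one preview per conversation containing `query`, ignoring case."""
    # re.escape treats the query as literal text rather than a regex.
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    matches = []
    for conv_id, text in conversations.items():
        hit = pattern.search(text)
        if hit is None:
            continue
        # Clamp the preview window to the bounds of the conversation text.
        start = max(0, hit.start() - context)
        end = min(len(text), hit.end() + context)
        matches.append(Match(conv_id, text[start:end]))
    return matches


# Hypothetical local history, keyed by conversation id.
history = {
    "conv-1": "Refactor the Sandbox profile loader before release.",
    "conv-2": "Notes on OAuth setup for the MCP server.",
}
results = search_history(history, "sandbox", context=10)
for m in results:
    print(m.conversation_id, "->", m.preview)
```

Searching with a compiled, escaped pattern keeps match offsets aligned with the original text, so the preview can be sliced directly without lowercasing a copy first.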

OpenAI API

Workload Identity Federation and Secure MCP Tunnel arrive on the platform

OpenAI has deployed two major security features: workload identity federation (exchanging external tokens for OpenAI access tokens without storing long-lived API keys) and Secure MCP Tunnel for enterprise customers, which lets ChatGPT, Codex, and the Responses API connect to private MCP servers through a customer-hosted tunnel. Changelog.

llama.cpp

llama.cpp b9352: ZenDN backend fix and new binaries

The llama.cpp project published three builds in 24 hours (b9334, b9351, b9352). The latest fixes the naming of the matmul functions in the ZenDN backend. Binaries cover CUDA 12/13, Vulkan, ROCm 7.2, OpenVINO, Android ARM64, macOS (with and without KleidiAI), Windows CPU/CUDA/Vulkan/HIP, and Linux s390x. Downloads.

Page 3 · Research

III. Papers & Labs

Nous Research

Lighthouse Attention: hierarchical attention 17× faster at 512K context

Nous Research presents Lighthouse Attention, a selection-based hierarchical attention mechanism that runs 17× faster than standard attention at 512K context on a single B200 GPU. A promising approach for processing very long documents without sacrificing quality. Read the paper.

Nous Research

Contrastive Neuron Attribution: steering model behavior by targeting specific neurons

The contrastive neuron attribution method makes it possible to modify a model's behavior reliably, even at high intensity, while preserving output quality. Using refusal as a case study, the researchers show how to discover and ablate specific circuits of MLP neurons. Read the paper.
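To make the "discover, then ablate" idea concrete, here is a minimal sketch using a plain difference-of-means contrast followed by zero-ablation. It is not the paper's actual procedure, which is not described here. The activation values, the function names, and the choice of a difference-of-means score are all assumptions made for illustration.

```python
# Illustrative only: difference-of-means contrast + zero-ablation on toy data.
# All names and numbers are hypothetical, not taken from the paper.

def mean_activation(rows):
    """Per-neuron mean over a list of activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def contrastive_scores(acts_a, acts_b):
    """Score each neuron by how much more it fires on set A than on set B."""
    mean_a = mean_activation(acts_a)
    mean_b = mean_activation(acts_b)
    return [a - b for a, b in zip(mean_a, mean_b)]

def top_k_neurons(scores, k):
    """Indices of the k neurons with the largest absolute contrast."""
    return sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)[:k]

def ablate(activation, neuron_ids):
    """Zero out the selected neurons in one activation vector."""
    ids = set(neuron_ids)
    return [0.0 if i in ids else v for i, v in enumerate(activation)]

# Toy MLP activations (4 neurons) on prompts that trigger refusal vs. prompts that do not.
refusal_acts = [[0.1, 2.0, 0.3, 1.5], [0.2, 1.8, 0.2, 1.7]]
comply_acts = [[0.1, 0.2, 0.3, 0.1], [0.3, 0.0, 0.2, 0.3]]

scores = contrastive_scores(refusal_acts, comply_acts)
selected = top_k_neurons(scores, k=2)
ablated = ablate(refusal_acts[0], selected)
```

In a real model the ablation would be applied inside the forward pass (for example through a hook on the MLP layer) rather than on a stored vector.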

Together AI

OSCAR: 2-bit KV cache quantization for long contexts

Together AI open-sources OSCAR (Offline Spectral Covariance-Aware Rotation), an INT2 KV cache quantization method for long-context model serving. At 2.28 bits per KV element, OSCAR narrows the accuracy gap with BF16 to 3.78 points on Qwen3-4B-Thinking-2507, while improving throughput. Read the article.
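A quick back-of-the-envelope calculation shows what 2.28 bits per KV element means for memory compared with 16-bit BF16. The model shape below is a hypothetical example, not the real configuration of Qwen3-4B-Thinking-2507, and the helper name is an assumption. Only the two bit widths come from the item above.

```python
# Rough KV cache sizing; the model shape is hypothetical, only the bit widths
# (16 for BF16, 2.28 for OSCAR) come from the article.

def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bits_per_element):
    """Bytes needed to store keys and values for one sequence."""
    elements = 2 * num_layers * num_kv_heads * head_dim * seq_len  # 2 = keys + values
    return elements * bits_per_element / 8

shape = dict(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=131072)

bf16_bytes = kv_cache_bytes(**shape, bits_per_element=16)
oscar_bytes = kv_cache_bytes(**shape, bits_per_element=2.28)
ratio = bf16_bytes / oscar_bytes  # independent of the model shape: 16 / 2.28

print(f"BF16: {bf16_bytes / 2**30:.2f} GiB, 2.28-bit: {oscar_bytes / 2**30:.2f} GiB, ratio {ratio:.2f}x")
```

The ratio depends only on the bit widths, so the roughly 7× reduction holds for any model shape. Quantization metadata beyond the quoted 2.28 bits is not modeled here.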

Stability AI

Stable Audio 3: fast latent diffusion for music generation

Stability AI releases Stable Audio 3, a family of latent diffusion models for generating instrumental music and sound effects. The small variant (runs on a MacBook Pro M4 CPU) and the medium variant (8 GB of VRAM) both produce 44.1 kHz stereo and are trained in three stages. Open weights are available. Read the article.

ZeroEntropy

Zerank-2: a precision reranker for RAG pipelines

ZeroEntropy's Zerank-2 reranker, based on Qwen3 (4B), improves search quality in retrieve-and-rerank pipelines. This practical tutorial shows how to build a complete pipeline with a fast bi-encoder followed by fine-grained reranking. Read the tutorial.
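The two-stage shape of such a pipeline can be sketched with toy stand-ins. In the sketch below, a bag-of-words cosine similarity plays the role of the fast bi-encoder and a bigram-overlap score plays the role of the reranker. Neither is Zerank-2 or the tutorial's code, and every function name here is hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for a bi-encoder: bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, docs, k):
    """Stage 1: cheap scoring of every document, keep the top k candidates."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def rerank_score(query, doc):
    """Toy stand-in for a reranker: reads query and document together
    and counts shared word bigrams, so word order matters."""
    q, d = query.lower().split(), doc.lower().split()
    return len(set(zip(q, q[1:])) & set(zip(d, d[1:])))

def retrieve_and_rerank(query, docs, k=3, top_n=1):
    """Stage 2: apply the more expensive scorer only to the k candidates."""
    candidates = retrieve(query, docs, k)
    return sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)[:top_n]

docs = [
    "cache kv quantization notes",
    "kv cache quantization for long contexts",
    "quantization of audio codecs",
    "cooking pasta at home",
]
query = "kv cache quantization"

first_stage = retrieve(query, docs, k=3)
final = retrieve_and_rerank(query, docs, k=3, top_n=1)
```

On this toy corpus the first stage ranks the shorter, scrambled document first, and the second stage promotes the document that matches the query's word order. That illustrates why the expensive scorer is reserved for a small candidate set.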

WorkOS

auth.md: an open protocol for AI agent registration

WorkOS publishes auth.md, an OAuth-based protocol that lets applications publish a Markdown file telling AI agents which registration flows are supported, which scopes to request, and how to obtain credentials tied to a real user, all without a human-facing form. Read the article.

Page 4 · Community & Editorial

I. Lab Tour

Research

MagenticLite: an agentic framework optimized for small models

Microsoft Research · today's edition

Microsoft Research has published an article presenting MagenticLite, an agentic experiment designed to run efficiently on smaller models. The blog post, available on the Microsoft Research website, details this approach, which aims to make agent workflows less demanding in compute resources. Read the article.

Product

Grok Build enters early-access beta for SuperGrok subscribers

xAI · today's edition

xAI announces the beta launch of Grok Build, a coding agent that runs directly from the terminal, available to all SuperGrok and X Premium Plus subscribers. The tool was announced in late May and is xAI's answer to coding agents such as Claude Code and OpenAI's Codex. Read the announcement.

II. Ecosystem & Editorial

Security

Authentication platforms for AI agents: the 2026 guide

With over 97 million monthly downloads of the MCP SDK, agent authentication has become critical infrastructure. This guide compares eight platforms (WorkOS, Stytch, Auth0, Composio, Nango, Arcade, TrueFoundry, Cloudflare) on spec compliance, enterprise identity depth, and integration breadth. Read the guide.

Open Source

OmniVoice Studio: a local, open-source alternative to ElevenLabs

OmniVoice Studio offers voice cloning, video dubbing, real-time dictation, and speaker diarization, all running entirely on local hardware. No API key, no cloud, no subscription. The project supports 646 languages and exposes an MCP server for integration with Claude, Cursor, or any MCP client. Discover the project.

Community

Paul Graham ignores AI-written emails: "It feels like being lied to"

The founder of Y Combinator, one of OpenAI's earliest investors, makes no apologies for ignoring emails clearly written by AI. Studies suggest his reaction is far from isolated. It is a sign of growing distrust toward model-generated communication. Read the article.

Legal

AI floods US federal courts with generated complaints

A study from MIT and USC shows that lawsuits filed without a lawyer have nearly doubled since the arrival of ChatGPT. One in five contains AI-generated text. Judges are resorting to drastic measures to manage the deluge. Read the study.

Industry

George Hotz: coding agents will be "one of the most costly mistakes"

After six months of testing, programmer George Hotz is categorical: LLMs deliver rapid prototypes but fail on the details, producing bugs that are increasingly difficult to catch. His verdict illustrates the deep divide within the AI community over the role of language models. Read the article.

Research

AI models often give the right answer but cite the wrong sources

Researchers from Peking University document "attribution hallucination": even when the answer is correct, the cited passage does not support it. Their benchmark, CiteVQA, is the first to test this phenomenon systematically. The problem is particularly risky in regulated fields such as law and medicine. Read the study.

III. Editorial

Editorial

The week mathematics became a battlefield

In the space of a few days, OpenAI refuted an Erdős conjecture, DeepMind solved nine others, and Anthropic claimed Mythos had found the elegant proof. This cascade of spectacular mathematical results from language models raises an unsettling question: are we conflating the ability to find proofs with the ability to understand mathematics? Today's models excel at combinatorial exploration and pattern recognition, both valuable skills for mathematical discovery. But the value of a proof lies as much in its power to illuminate as in its formal verification. When Sholto Douglas speaks of a "serious overhang," he is touching on a real phenomenon: models are already doing things we do not yet know how to measure.

The risk is not that models will replace mathematicians, but that the race for mathematical performance, measured in problems solved rather than understanding produced, diverts attention from more fundamental questions: what does it mean to understand a proof? And what do these results tell us about the nature of artificial intelligence itself?