The Neuron Times — Thursday 4 June 2026

Front Page · Open Models

Google DeepMind launches Gemma 4 12B, a multimodal model that runs on an ordinary laptop

The new open-source model processes text, images and audio without a dedicated encoder and runs on a machine with 16 GB of RAM.

From the editorial desk — 4 June 2026

On Wednesday, Google DeepMind unveiled Gemma 4 12B, a dense multimodal model built on a so-called 'encoder-free' architecture: visual and audio data are fed directly into the large language model backbone rather than passing through separate vision or audio encoding modules. This approach, detailed in the developer guide published by Google, lets the model come close to the performance of its larger sibling, Gemma 4 26B, while consuming half the resources.

The model is available under the Apache 2.0 license, supports a context of up to 256,000 tokens and covers more than 140 languages. According to The Decoder, it is the first multimodal model of this size that can run on a consumer laptop with 16 GB of RAM, paving the way for agentic workflows and fully local data processing. Google has also released pre-trained and instruction-tuned variants, along with open weights.

Gemma 4 12B ships with practical tools for developers: the Google AI Edge Gallery framework runs the model on macOS with dynamic Python code execution, while the LiteRT-LM CLI adds a new serve command that creates a local endpoint compatible with industry standards. The r/LocalLLaMA community immediately welcomed the release, and rumors already point to an upcoming 120B variant of the model.

Partnership

Anthropic unveils Services Track and Partner Hub for its Claude network

From the Wires — 3 June 2026

Anthropic announced on Wednesday the launch of the Services Track and the Partner Hub within the Claude Partner Network, an initiative aimed at structuring the partner ecosystem around its models. The announcement comes during a busy week for the company, which also revealed that it has confidentially submitted a draft S-1 to the SEC and raised $65 billion in a Series H round at a post-money valuation of $965 billion.

Image Generation

Ideogram 4.0 released as open weights with native 2K resolution

From the Wires — 3 June 2026

Ideogram has released version 4.0 of its text-to-image model as open weights, with native 2K resolution, bounding-box control and improved text rendering. On the DesignArena leaderboard, the model ranks first among all open models; only the closed systems from OpenAI and Google score higher. Commercial use requires a paid license.

Physics

NVIDIA unveils Cosmos 3, a foundation model for physical AI

From the Wires — 3 June 2026

NVIDIA has introduced Cosmos 3, an 'omnimodal' foundation model for physical AI that pairs an autoregressive VLM reasoner with a diffusion generator. This two-tower Mixture-of-Transformers architecture unifies physical reasoning, world generation and action generation. The model was featured on HuggingFace Daily Papers, where it received 17 community votes.

Voice

xAI integrates Grok as the voice of Vapi agents

From the Wires — 3 June 2026

xAI has announced that Grok is becoming the voice of Vapi, bringing cutting-edge voice quality to millions of Vapi agents. The company also unveiled a preview of Grok Imagine 1.5, the new version of its image generation model.

Page 1 — Front Page

I. Models & Frontier

Google DeepMind

Gemma 4 12B: open-source multimodal within reach of a laptop

Google DeepMind has released Gemma 4 12B, a dense multimodal model that does without a dedicated visual or audio encoder. Available under the Apache 2.0 license, it supports 256K tokens of context and covers 140 languages. The developer guide published by Google details its novel architecture and its local agentic capabilities.

OpenAI

GPT-Rosalind gains advanced life sciences capabilities

OpenAI has unveiled new capabilities for GPT-Rosalind, its model dedicated to the life sciences. The improvements cover biological reasoning, medicinal chemistry expertise, genomic analysis and experimental workflows. Separately, OpenAI published a blueprint for democratic governance of frontier AI and a public policy agenda.

NVIDIA

Cosmos 3: a unified foundation model for physical AI

NVIDIA has introduced Cosmos 3, an 'omnimodal' foundation model that combines an autoregressive VLM reasoner with a diffusion generator. This two-tower Mixture-of-Transformers architecture unifies physical reasoning, world generation and action generation. The paper received 17 votes on HuggingFace Daily Papers.

xAI

Grok becomes the voice of Vapi agents and unveils Imagine 1.5

xAI has announced that Grok is becoming the voice of Vapi, integrating the model's cutting-edge voice quality into the voice agent platform. The company also released a preview of Grok Imagine 1.5, the new version of its image generator.

Page 2 — Tech Notebook

II. Harnesses & CLI

Claude Code

Version 2.1.162: better visibility into waiting sessions

The new version of Claude Code adds the waitingFor state to the JSON format of claude agents, indicating what is blocking a waiting session (e.g. a permission prompt). The Grep and Glob tools are now explicitly listable, and the /effort command confirms when the chosen level will persist as the default.

Codex CLI

OpenAI releases Codex 0.137.0 with extended TUI controls

OpenAI has shipped Codex 0.137.0 with new features: support for the F13-F24 keys in TUI controls, pasting into search menus, and a compact status for reasoning mode. Admin flows now display monthly credit limits and can apply cloud configuration bundles. A Python SDK, version 0.1.0b3, has also been released.

Gemini CLI

Version 0.46.0: transition to Gemini 3.5 Flash in auto mode

Google has released Gemini CLI 0.46.0-preview.0, which makes Gemini 3.5 Flash the default model in auto mode when the experimental flag is present. The release also fixes a PTY crash on resize and a spam loop that occurred when the preferred editor was invalid.

Cline

CLI v3.0.17 fixes a hang after Hub restart

The CLI v3.0.17 release of Cline fixes a regression where the interactive interface could remain stuck after stopping and restarting Cline Hub. The Telegram connector gains a --allowed-user-id flag to restrict which users are authorized to interact with the agent.
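
The effect of such an allowlist can be sketched in a few lines of Python. This is an illustration only, not Cline's implementation: the helper names `parse_allowed_ids` and `is_allowed` are hypothetical, and two behaviours are assumptions made for the example, namely that the flag can be repeated and that an empty allowlist means no restriction. Only the flag name `--allowed-user-id` comes from the release.

```python
import argparse


def parse_allowed_ids(argv):
    """Collect every --allowed-user-id value into a set of integer IDs."""
    parser = argparse.ArgumentParser()
    # Assumption: the flag may be given several times, once per user.
    parser.add_argument(
        "--allowed-user-id",
        action="append",
        type=int,
        default=None,
        dest="allowed_user_ids",
    )
    args = parser.parse_args(argv)
    return frozenset(args.allowed_user_ids or [])


def is_allowed(user_id, allowed_ids):
    """Return True if the sender may talk to the agent."""
    # Assumption: no flag at all leaves the connector unrestricted.
    if not allowed_ids:
        return True
    return user_id in allowed_ids


if __name__ == "__main__":
    allowed = parse_allowed_ids(["--allowed-user-id", "42"])
    for sender in (42, 1001):
        verdict = "accepted" if is_allowed(sender, allowed) else "ignored"
        print(sender, verdict)
```

Checking the sender before the message ever reaches the agent keeps the restriction in one place, whatever the agent is later asked to do.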

III. Engines & Inference

llama.cpp

Build b9496: floating point exception (FPE) fix for Gemma 4

The b9496 release of llama.cpp fixes a unified Floating Point Exception (FPE) issue for Gemma 4. The update is available for macOS (Apple Silicon and Intel), Linux, Windows, Android and iOS.

Ollama

v0.30.4: llama.cpp update and Windows fix

Ollama 0.30.4 updates the underlying llama.cpp engine and fixes a cleanup issue with the llama-server process during shutdown on Windows. A known bug causes a floating point exception with Gemma 4 12B.

MLX-VLM

v0.6.1: support for Gemma 4 Unified, MTP and Cohere MoE

The 0.6.1 release of MLX-VLM adds support for Gemma 4 with unified architecture and Multi-Token Prediction (MTP), the Cohere2 MoE model, and NVIDIA LocateAnything-3B. Several fixes address rollback handling and thought streaming.

oMLX

v0.4.1: memory stability and server lifecycle

oMLX 0.4.1 improves prefill memory management, introduces eviction of idle models before throttling, and exposes server lifecycle controls from the macOS app and CLI with the omlx start, stop and restart commands.
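
The idea of evicting idle models before throttling can be sketched as follows. This is a minimal illustration under stated assumptions, not oMLX's code: the `LoadedModel` record, the `relieve_pressure` function, the memory budget and the 300-second idle threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class LoadedModel:
    name: str
    size_gb: float
    last_used: float  # timestamp in seconds


def relieve_pressure(models, budget_gb, now, idle_after_s=300.0):
    """Evict idle models, oldest first, and throttle only as a last resort.

    Returns (kept, evicted, throttle).
    """
    kept = sorted(models, key=lambda m: m.last_used)  # least recently used first
    evicted = []
    used = sum(m.size_gb for m in kept)
    for model in list(kept):
        if used <= budget_gb:
            break  # back under budget, nothing more to free
        if now - model.last_used >= idle_after_s:
            kept.remove(model)
            evicted.append(model)
            used -= model.size_gb
    # Throttle only if evicting every idle model was not enough.
    throttle = used > budget_gb
    return kept, evicted, throttle
```

Ordering the two steps this way means active requests slow down only when no idle model is left to unload.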

Page 3 · Research

IV. Papers & Labs

OpenAI

Direct Preference Optimization available for GPT-4.1

OpenAI has added support for Direct Preference Optimization (DPO) for fine-tuning the gpt-4.1-2025-04-14, gpt-4.1-mini-2025-04-14 and gpt-4.1-nano-2025-04-14 models. This alignment technique allows models to be optimized directly from human preferences without going through a separate reward model.
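
For readers unfamiliar with the technique, the per-pair DPO loss fits in a few lines. The sketch below follows the objective from the original DPO paper; it says nothing about how OpenAI's fine-tuning service is implemented. The function name `dpo_loss`, its parameter names and the default `beta=0.1` are illustrative assumptions. Each argument is the summed log-probability of a full response, under either the model being trained (the policy) or the frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) preference pair."""
    # Implicit rewards: how far the policy has moved from the reference.
    chosen_shift = policy_chosen_logp - ref_chosen_logp
    rejected_shift = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_shift - rejected_shift)
    # loss = -log(sigmoid(margin)), computed in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # The policy favours the chosen answer more than the reference does,
    # so the loss falls below log(2), its value at zero margin.
    print(dpo_loss(-10.0, -14.0, -12.0, -12.0))
```

No reward model appears anywhere: the preference signal enters only through the log-probability gap between the chosen and the rejected response.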

Hugging Face

DPO beyond chatbots: a tutorial for specialized applications

An article published on the Hugging Face blog explores the application of Direct Preference Optimization beyond conversational chatbots, detailing how this alignment method can be used for specialized applications such as code generation, document summarization or recommendation systems.

Hugging Face

Adding MCP tools to the Reachy Mini robot

A tutorial published on Hugging Face details the integration of MCP (Model Context Protocol) tools into the open-source Reachy Mini robot, allowing language models to directly control the robot's actuators and sensors via standardized tool calls.

Wasmer

Wasmer uses Codex and GPT-5.5 to build an edge Node.js runtime

Wasmer used Codex with GPT-5.5 to build a Node.js runtime for the edge, speeding up development by a factor of 10 to 20 and delivering the project in weeks instead of months. It is a concrete use case for agentic coding in a constrained environment.

Page 4 · Community & Editorial

V. Community Signals

Nous Research

Hermes Desktop: a cross-platform open-source AI agent

Nous Research has released Hermes Desktop, a native graphical interface for Hermes Agent v0.15.2, under the MIT license. The application shares the same agent core, skills and memory as the CLI version, with streaming display of tool outputs.

Perplexity

A hybrid system decides local or cloud for each task

Perplexity has announced a hybrid orchestrator that combines locally run models with powerful cloud models, automatically deciding where to process each task based on its complexity and available resources.
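
A routing decision of this kind might look like the sketch below. Perplexity has not published its logic in the announcement summarized here, so everything in the example is an assumption: the `Task` and `Resources` records, the 0-to-1 complexity score, the `route` function and both thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    prompt: str
    complexity: float  # hypothetical score: 0.0 trivial, 1.0 hardest


@dataclass(frozen=True)
class Resources:
    free_ram_gb: float


def route(task, resources, max_local_complexity=0.6, min_free_ram_gb=8.0):
    """Return "local" or "cloud" for a single task."""
    # Not enough free memory to run a local model at all.
    if resources.free_ram_gb < min_free_ram_gb:
        return "cloud"
    # Simple tasks stay on the device; hard ones go to the larger model.
    if task.complexity <= max_local_complexity:
        return "local"
    return "cloud"


if __name__ == "__main__":
    machine = Resources(free_ram_gb=12.0)
    print(route(Task("Rename this variable", 0.2), machine))
    print(route(Task("Audit this codebase", 0.9), machine))
```

A real orchestrator would have to estimate complexity itself, for instance from prompt length or a small classifier; here the score is simply given as input.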

Regulation

UK requires Google to offer an opt-out from AI Search

The UK competition authority (CMA) has ordered Google to provide a tool that lets publishers opt out of the generative AI search features AI Overviews and AI Mode. The option will be tested in the UK first, ahead of a global rollout.

Meta's AI agent for WhatsApp Business available globally

Meta has made its AI agent for WhatsApp Business available globally. Billing is based on token consumption, paving the way for mass adoption by businesses using the messaging platform.

VI. Editorial

Editorial

Gemma 4 12B: the real turning point for local multimodal AI

With the release of Gemma 4 12B, Google DeepMind has crossed a threshold many were waiting for: a multimodal model capable of processing text, image and audio on a simple 16 GB RAM laptop. The encoder-free architecture is not just a technical feat — it redefines what 'local' means in AI. Until now, multimodal capabilities were the preserve of the cloud, requiring expensive GPUs or remote APIs. Gemma 4 12B changes the game by making these capabilities accessible to any developer equipped with a recent laptop. Under the Apache 2.0 license, the model opens up immediate prospects for local agentic applications, processing of sensitive data without cloud transfer, and experimentation on the go. It is also a strong signal for the open-source ecosystem: the research frontier is no longer confined to data centers.
