What's new at Article Factory and the latest from the world of generative AI

Space AI Data Centers Face Steep Economic Hurdles
Elon Musk and other tech leaders are planning to move artificial‑intelligence compute to orbit, envisioning satellite constellations that could host massive data‑center workloads. Early analyses, however, show that the cost of building and launching such orbital facilities far exceeds that of traditional ground‑based centers. High launch prices, expensive satellite manufacturing, thermal‑management challenges, radiation exposure, and limited solar‑panel lifespans all contribute to the unfavorable economics. While inference workloads may eventually find a niche in space, experts agree that significant technology breakthroughs and cost reductions are required before orbital AI becomes viable. Read more →

Inferact Secures $150M Seed Round to Commercialize vLLM
The creators of the open‑source inference engine vLLM have launched a venture‑backed startup called Inferact, raising $150 million in seed funding at an $800 million valuation. The round was co‑led by Andreessen Horowitz and Lightspeed Venture Partners. Inferact aims to bring the high‑performance vLLM technology, originally incubated at the UC Berkeley lab of Databricks co‑founder Ion Stoica, to enterprise customers. Early adopters include Amazon's cloud services and a major shopping app, signaling strong market interest as AI inference moves to the forefront of commercial deployment. Read more →

Huawei Ascend 950, Nvidia H200, and AMD Instinct MI300: Head‑to‑Head AI Chip Comparison
A side‑by‑side look at three leading AI accelerators: Huawei's Ascend 950 series, Nvidia's H200 (GH100 Hopper), and AMD's Instinct MI300 (Aqua Vanjaram). The comparison covers architecture, process technology, transistor count, die size, memory type and capacity, bandwidth, compute performance across FP8, FP16, FP32, and FP64, and target scenarios such as large‑scale LLM training, inference, and high‑performance computing. Availability timelines differ, with each vendor positioning its chip for data‑center and HPC workloads. Read more →

Google Unveils Ironwood TPU with Record 1.77PB Shared Memory
Google introduced its seventh‑generation Tensor Processing Unit, dubbed Ironwood, at a recent Hot Chips event. The dual‑die chip delivers 4,614 TFLOPS of FP8 performance and pairs the compute dies with eight stacks of HBM3e, providing 192 GB of memory per chip. When scaled to a 9,216‑chip pod, the system reaches 1.77 PB of directly addressable memory, the largest shared‑memory configuration ever recorded for a supercomputer. The architecture includes advanced reliability features, liquid‑cooling infrastructure, and AI‑assisted design optimizations, and is already being deployed in Google Cloud data centers for large‑scale inference workloads. Read more →
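
As a quick sanity check on the headline memory figure, here is a minimal back‑of‑the‑envelope sketch using only the per‑chip numbers quoted above (decimal units assumed, i.e. 1 PB = 10^6 GB; the pod‑level compute line is simply the per‑chip figure scaled by the chip count):

```python
# Back-of-the-envelope check of the Ironwood pod figures quoted above.
# Assumes decimal units: 1 PB = 1e6 GB, 1 EFLOPS = 1e6 TFLOPS.

chips_per_pod = 9_216          # chips in a full Ironwood pod
hbm_per_chip_gb = 192          # GB of HBM3e per chip
fp8_tflops_per_chip = 4_614    # FP8 TFLOPS per chip

pod_memory_pb = chips_per_pod * hbm_per_chip_gb / 1e6
pod_fp8_eflops = chips_per_pod * fp8_tflops_per_chip / 1e6

print(f"Pod HBM: {pod_memory_pb:.2f} PB")          # ~1.77 PB, matching the record claim
print(f"Pod FP8 compute: {pod_fp8_eflops:.1f} EFLOPS")
```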
