OpenAI just launched o3 and o4-mini, models that combine visual intelligence with deeper reasoning. For the first time, these AIs don't just see images – they think with them, manipulating photos to extract insights just as a human would zoom in or rotate a picture to understand it better.
The US-China chip war just hit Wall Street hard. Tech stocks tumbled after Nvidia lost $5.5 billion from US restrictions on AI chip sales to China. Meanwhile, Taiwan strengthened its grip as the world's essential chip supplier.
A new artificial intelligence system from China's Shandong First Medical University helps scientists understand how genes turn on and off. Called TRAPT, it maps gene control with record-breaking accuracy.
Amazon Launches Nova Sonic, Challenges OpenAI and Google in AI Voice Race
Amazon just unveiled Nova Sonic, an AI model that talks and listens like a human. The new system responds faster than comparable models from OpenAI and Google and costs far less than OpenAI's, while matching them on quality.
Nova Sonic combines speech recognition and generation into a single model. This marks a shift from traditional approaches that cobble together separate systems for listening, thinking, and speaking. The unified design helps Nova Sonic grasp context better and respond more naturally.
The model excels at the subtle dance of conversation. It waits for its turn to speak, handles interruptions gracefully, and picks up on those awkward pauses we humans love so much. It even adapts its tone to match the speaker's style – though hopefully not when dealing with angry customers.
Early test results look promising.
👉 Nova Sonic narrowly beat OpenAI's GPT-4o in head-to-head comparisons, winning 51% of conversations with its masculine voice and 50.9% with its feminine voice. The British accent version performed particularly well, winning 58.3% against OpenAI.
👉 Against Google's Gemini Flash 2.0, the margins were even wider: 69.7% with the masculine voice and 66.3% with the feminine voice.
The system shines at understanding different accents and handling noisy environments. Across English, French, Italian, German, and Spanish, its word error rate is 36.4% lower than that of OpenAI's model. In English specifically, it makes 24.2% fewer mistakes. In noisy conditions like meeting rooms, its word error rate is an impressive 46.7% lower than OpenAI's.
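For readers unfamiliar with the metric, word error rate is conventionally the number of word-level substitutions, deletions, and insertions needed to turn a transcript into the reference, divided by the number of reference words. The sketch below shows how a "percent lower" figure like the ones above is derived from two such rates. The sentences and function names are made up for illustration; they are not taken from Amazon's benchmark.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Illustrative transcripts only (hypothetical, not benchmark data).
reference = "book a flight to boston on friday morning"
transcript_a = "book a flight to austin on friday"           # 1 substitution, 1 deletion
transcript_b = "book a flight to boston on friday mourning"  # 1 substitution

wer_a = word_error_rate(reference, transcript_a)  # 2 errors / 8 words = 0.25
wer_b = word_error_rate(reference, transcript_b)  # 1 error / 8 words = 0.125
print(f"B's word error rate is {relative_reduction(wer_a, wer_b):.0%} lower than A's")
```

Note that a relative figure says nothing about the absolute error rates: a drop from 2% to 1% and a drop from 20% to 10% are both "50% lower."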
Speed matters in conversation, and Nova Sonic delivers. It responds in 1.09 seconds on average, compared to 1.18 seconds for OpenAI and 1.41 seconds for Google. Plus, it costs 80% less than OpenAI's offering.
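To put those latency numbers in perspective, here is the arithmetic as a quick sketch. The three averages are the ones quoted above; the helper name is just for illustration.

```python
# Average response times in seconds, as quoted in the article.
latency = {"Nova Sonic": 1.09, "OpenAI": 1.18, "Google": 1.41}


def percent_faster(ours: float, theirs: float) -> float:
    """How much shorter `ours` is than `theirs`, as a percentage of `theirs`."""
    return (theirs - ours) / theirs * 100


for rival in ("OpenAI", "Google"):
    gap_ms = round((latency[rival] - latency["Nova Sonic"]) * 1000)
    pct = percent_faster(latency["Nova Sonic"], latency[rival])
    print(f"vs {rival}: {gap_ms} ms quicker, about {pct:.1f}% faster")
```

That works out to roughly 90 milliseconds (about 7.6%) ahead of OpenAI and 320 milliseconds (about 22.7%) ahead of Google.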
Early adopters are already putting the system to work. ASAPP is using it to power customer service calls. Education First is helping students practice languages with it. Stats Perform is using it to generate sports commentary and analysis from live data.
The model currently speaks English in both American and British accents, with both masculine and feminine voice options. Amazon promises more languages and accents are coming soon.
Nova Sonic integrates with Amazon's Bedrock platform through a new streaming API. This means developers can build voice applications for everything from travel booking to healthcare services. The model can also use external tools and databases to ground its responses in real facts – no more making up flight times or hotel prices.
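What "grounding" looks like on the application side can be sketched in a few lines. This is a generic illustration of the tool-use pattern, not Bedrock's actual API: the JSON request shape, the `lookup_flight` tool, and the flight data are all hypothetical. The idea is that when the model asks for a fact, the application looks it up and hands back real data instead of letting the model guess.

```python
import json
from typing import Callable, Dict

# Hypothetical stand-in for a real booking database.
FLIGHTS = {"SEA-BOS": {"departs": "08:15", "arrives": "16:40"}}


def lookup_flight(route: str) -> dict:
    """Return schedule data for a route, or an explicit 'not found' result."""
    flight = FLIGHTS.get(route)
    if flight is None:
        return {"found": False, "route": route}
    return {"found": True, "route": route, **flight}


# Registry of tools the application is willing to run on the model's behalf.
TOOLS: Dict[str, Callable[..., dict]] = {"lookup_flight": lookup_flight}


def handle_tool_request(request_json: str) -> str:
    """Run the tool named in a (hypothetical) model request and return JSON."""
    request = json.loads(request_json)
    name = request["name"]
    tool = TOOLS.get(name)
    if tool is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    result = tool(**request.get("arguments", {}))
    return json.dumps(result)


# The model would emit something like this mid-conversation (assumed shape).
reply = handle_tool_request(
    '{"name": "lookup_flight", "arguments": {"route": "SEA-BOS"}}'
)
print(reply)
```

Returning an explicit "not found" result, rather than nothing, gives the model something truthful to say when the data is missing.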
Amazon emphasized its commitment to responsible AI development. The company has published AI Service Cards for Nova Sonic, detailing its capabilities, limitations, and safety measures.
Why this matters:
The unified model approach could finally deliver on the promise of natural voice AI. No more awkward pauses, robotic responses, or that feeling like you're talking to three different systems duct-taped together.
Amazon's aggressive pricing (80% cheaper than OpenAI) could accelerate adoption across industries, though perhaps "aggressive pricing" is just Amazon-speak for "We have AWS and can run things cheaper than anyone else."
Nvidia just launched its RTX 5060 Ti graphics card, and this time they've remembered that memory matters. The new GPU comes with a whopping 16GB of VRAM, double what previous budget cards offered. It's like they finally realized gamers need more than 8GB to run modern titles – shocking, we know.
OpenAI is developing a social network that could shake up the AI landscape. The project centers on ChatGPT's image generation and includes a social feed, according to sources close to the matter.