AI GUERRILLA /// DAILY AI INTELLIGENCE #002

Google's $0.25 Gemini 3.1 Flash-Lite Just Beat GPT-5 Mini

Monday, March 9, 2026

Google launched Gemini 3.1 Flash-Lite, a model priced at just $0.25 per million input tokens that beats GPT-5 mini and Claude 4.5 Haiku on 6 of 11 benchmarks and reaches its first token 2.5x faster than Gemini 2.5 Flash. Meanwhile, OpenAI faces internal fallout from its Pentagon deal, and the AI jobs debate intensifies as U.S. payrolls drop 92,000.

🔥 TOP STORIES

MODELS

Google Launches Gemini 3.1 Flash-Lite — Fastest and Cheapest in the Gemini 3 Family

Google released Gemini 3.1 Flash-Lite on March 3, its most cost-efficient model yet, priced at just $0.25 per million input tokens and $1.50 per million output tokens. That's a fraction of Gemini 3.1 Pro's $2/$18 pricing. But here's the kicker: it's not just cheap. It beat GPT-5 mini and Claude 4.5 Haiku on 6 out of 11 benchmarks, including an 86.9% score on GPQA Diamond (doctorate-level science questions). Compared with Gemini 2.5 Flash, it is 2.5x faster to first token and 45% faster overall, generating 363 tokens per second. The model ships with configurable "thinking levels": developers can dial reasoning up or down depending on the task, from bulk translation (minimal thinking) to dashboard generation (deep reasoning). Available now in preview via Google AI Studio and Vertex AI.

Price: $0.25/1M input • $1.50/1M output    Speed: 363 tokens/sec    Context: 1M tokens
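
What do those rates mean for a real bill? Here is a minimal back-of-envelope sketch in Python. It uses only the two prices quoted above. The workload (100,000 requests at 800 input and 200 output tokens each) and the helper name `estimate_cost` are made up for illustration, and real invoices may differ (for example, if thinking tokens are billed as output).

```python
from decimal import Decimal

# Preview prices quoted above, in USD per one million tokens.
INPUT_PRICE_PER_M = Decimal("0.25")
OUTPUT_PRICE_PER_M = Decimal("1.50")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of a workload at the quoted rates."""
    input_cost = Decimal(input_tokens) / TOKENS_PER_M * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_M * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Hypothetical workload: 100,000 requests, 800 input + 200 output tokens each.
requests = 100_000
total = estimate_cost(requests * 800, requests * 200)
print(f"${total:.2f}")  # prints $50.00
```

Note where the money goes: in this example the output tokens are only 20% of the traffic but 60% of the bill ($30 of $50), so trimming response length saves more than trimming prompts.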

Google Blog → | SiliconANGLE → | Benchmarks →

ETHICS

The Intercept: OpenAI Says 'Trust Us' on Surveillance — But Won't Show Proof

A deep investigation published today reveals that OpenAI has offered no verifiable evidence that its Pentagon contract actually prevents domestic surveillance or autonomous weapons use. The report highlights that Anthropic refused the same deal specifically because it couldn't get those protections in writing, while OpenAI claims it succeeded — but the contract terms remain classified. The piece also documents a pattern of credibility concerns around CEO Sam Altman, citing former employees who have described him as having "low integrity" in court filings.

Read on The Intercept →

JOBS

The 'AI Jobpocalypse' Week: 92K Jobs Lost, Benioff vs. Dorsey, and the Data Doesn't Lie

It was the worst week for AI-and-jobs headlines yet. U.S. payrolls dropped 92,000 in February, far worse than the 50,000 gain expected. Block's Jack Dorsey cut 40% of the company's workforce (4,000 jobs), explicitly blaming AI, and Block's stock surged 18%. Salesforce CEO Marc Benioff pushed back, calling the cuts "company-specific." Then it came out that Salesforce itself had replaced 4,000 support roles with AI agents. Morgan Stanley cut 2,500. Oracle is planning thousands more. The debate is no longer theoretical.

Benzinga → | Fortune →

🛠️ TOOL OF THE DAY

Google AI Studio

With Flash-Lite now live, Google AI Studio just became one of the most cost-effective ways to prototype AI apps. It's a free, browser-based IDE where you can test any Gemini model, build prompts, generate code, and export API calls — all without a credit card. If you're building high-volume AI features (chatbots, translation, content moderation), Flash-Lite through AI Studio is now cheaper than any OpenAI or Anthropic equivalent at scale.

For: Developers, AI builders, startups    Price: Free to start, pay-per-token at scale

Try Google AI Studio → | Flash-Lite Docs →

⚡ QUICK HITS

 Alibaba's tiny Qwen3.5-9B (runs on a laptop) is outperforming OpenAI's gpt-oss-120B on multiple benchmarks. The open-source efficiency revolution continues to close the gap with closed models.

Radical Data Science →

 Nearly 900 tech workers across Google and OpenAI signed an open letter titled "We Will Not Be Divided" calling for clearer limits on military AI use. Google has been silent as it reportedly negotiates its own Pentagon deal for Gemini.

CNBC →

 AI2 released Olmo Hybrid, a 7B-parameter open model that combines transformer attention with linear recurrent layers. It reaches the same accuracy as Olmo 3 using 49% fewer tokens — roughly 2x data efficiency.

Radical Data Science →

 Amazon's new cloud instances (C8), built on AMD's Zen 5 "Turin" chips, are up to 70% faster than their Zen 3 predecessors. The AI infrastructure arms race is driving real compute gains for developers.

Daily News Stuff →

 A Melbourne AI agency publicly broke up with ChatGPT, citing both military concerns and hallucination issues — switching to Anthropic's Claude for all client work.

HumAI Blog →

🔬 RESEARCH SPOTLIGHT

Anthropic's New Framework for Measuring AI's Labor Impact

Anthropic published a paper introducing a new way to measure how AI affects the labor market — moving beyond the simplistic "AI replaces jobs" framing. Instead, it tracks which specific tasks within jobs are being augmented or automated, and at what rate. The framework could become the standard for policymakers and economists trying to get an honest picture of AI's impact on employment. Timely, given this week's job loss numbers.

Read More →

💬 GUERRILLA TAKE

Google just quietly changed the game with Flash-Lite's pricing. At $0.25 per million input tokens, the cost floor for "good enough" AI just cratered. This matters more than any benchmark score. When inference becomes this cheap, every startup can afford to bake AI into every feature, and the moat shifts from "who has the best model" to "who builds the best product on top of commodity intelligence." OpenAI and Anthropic are fighting over Pentagon contracts while Google is making AI so cheap it becomes invisible infrastructure. Watch the builders, not the headlines.

Get AI Guerrilla in your inbox every morning at 8 AM.

Was this forwarded to you? Subscribe free →

AI GUERRILLA /// MARCH 9, 2026 /// NO FLUFF. NO FILLER. JUST SIGNAL.
