Author name: Beth Washington

US government agency drops Grok after MechaHitler backlash, report says

xAI apparently lost a government contract after a tweak to Grok’s prompting triggered an antisemitic meltdown where the chatbot praised Hitler and declared itself MechaHitler last month.

Despite the scandal, xAI announced that its products would soon be available for federal workers to purchase through the General Services Administration. At the time, xAI claimed this was an “important milestone” for its government business.

But Wired reviewed emails and spoke to government insiders, which revealed that GSA leaders abruptly decided to drop xAI’s Grok from their contract offering. That decision to pull the plug came after leadership allegedly rushed staff to make Grok available as soon as possible following a persuasive sales meeting with xAI in June.

It’s unclear what exactly caused the GSA to reverse course, but two sources told Wired that they “believe xAI was pulled because of Grok’s antisemitic tirade.”

As of this writing, xAI’s “Grok for Government” website has not been updated to reflect GSA’s supposed removal of Grok from an offering that xAI noted would have allowed “every federal government department, agency, or office, to access xAI’s frontier AI products.”

xAI did not respond to Ars’ request to comment and so far has not confirmed that the GSA offering is off the table. If Wired’s report is accurate, GSA’s decision also seemingly did not influence the military’s decision to move forward with a $200 million xAI contract the US Department of Defense granted last month.

Government’s go-to tools will come from xAI’s rivals

If Grok is cut from the contract, that would suggest that Grok’s meltdown came at perhaps the worst possible moment for xAI, which is building the “world’s biggest supercomputer” as fast as it can to try to get ahead of its biggest AI rivals.

Grok seemingly had the potential to become a more widely used tool if federal workers opted for xAI’s models. Through Donald Trump’s AI Action Plan, the president has similarly emphasized speed, pushing for federal workers to adopt AI as quickly as possible. Although xAI may no longer be involved in that broad push, other AI companies like OpenAI, Anthropic, and Google have partnered with the government to help Trump pull that off and stand to benefit long-term if their tools become entrenched in certain agencies.

Google releases pint-size Gemma open AI model

Big tech has spent the last few years creating ever-larger AI models, leveraging rack after rack of expensive GPUs to provide generative AI as a cloud service. But tiny AI matters, too. Google has announced a tiny version of its Gemma open model designed to run on local devices. Google says the new Gemma 3 270M can be tuned in a snap and maintains robust performance despite its small footprint.

Google released its first Gemma 3 open models earlier this year, featuring between 1 billion and 27 billion parameters. In generative AI, the parameters are the learned variables that control how the model processes inputs to estimate output tokens. Generally, the more parameters in a model, the better it performs. With just 270 million parameters, the new Gemma 3 can run on devices like smartphones or even entirely inside a web browser.

Running an AI model locally has numerous benefits, including enhanced privacy and lower latency. Gemma 3 270M was designed with these kinds of use cases in mind. In testing with a Pixel 9 Pro, the new Gemma was able to run 25 conversations on the Tensor G4 chip and use just 0.75 percent of the device’s battery. That makes it by far the most efficient Gemma model.
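As a quick check on what those battery figures imply, here is a small arithmetic sketch. The two inputs come from the test described above; the extrapolation to a full charge assumes battery drain scales linearly with the number of conversations, which is an assumption and not something Google reported.

```python
# Figures quoted from Google's Pixel 9 Pro test, as reported above.
conversations = 25
battery_used_percent = 0.75

# Average share of the battery consumed by a single conversation.
per_conversation_percent = battery_used_percent / conversations

# Naive linear extrapolation to a full charge (an assumption, not a measurement).
conversations_per_full_charge = 100 / per_conversation_percent

print(f"{per_conversation_percent:.2f}% of battery per conversation")
print(f"roughly {conversations_per_full_charge:.0f} conversations per full charge")
```

Real-world numbers would depend on conversation length and whatever else the phone is doing, so treat the extrapolation as a rough upper bound.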

Small Gemma benchmark

Gemma 3 270M shows strong instruction-following for its small size.

Credit: Google

Developers shouldn’t expect the same performance level of a multi-billion-parameter model, but Gemma 3 270M has its uses. Google used the IFEval benchmark, which tests a model’s ability to follow instructions, to show that its new model punches above its weight. Gemma 3 270M hits a score of 51.2 percent in this test, which is higher than other lightweight models that have more parameters. The new Gemma falls predictably short of 1 billion-plus models like Llama 3.2, but it gets closer than you might think for having just a fraction of the parameters.

Apple Watch gets reformulated, non-patent-infringing blood oxygen monitoring

The redesigned version of the feature will be available on the Apple Watch Series 9, Series 10, and Ultra 2 after users install the watchOS 11.6.1 update on their watches and the iOS 18.6.1 update on their paired iPhones.

Apple says that watches outside the US won’t be affected by the update, since they were never subject to the US import ban in the first place. It also won’t affect Apple Watches purchased in the US before the import ban went into effect—Apple never removed the feature from watches it had already sold, so if you bought a Series 9 or Ultra 2 watch in the fall of 2023 or if you’re still using an older watch with the blood oxygen monitoring feature, the updates won’t change anything for you.

Masimo originally sued Apple over the blood oxygen monitoring feature in January of 2020. According to Masimo, Masimo and Apple had initially met in 2013 to talk about a potential partnership or acquisition, but Apple instead poached Masimo’s engineers to implement the feature on its own without Masimo’s involvement.

GPT-5s Are Alive: Synthesis

What do I ultimately make of all the new versions of GPT-5?

The practical offerings and how they interact continue to change by the day. I expect more to come. It will take a while for things to settle down.

I’ll start with the central takeaways and how I select models right now, then go through the hype and various questions in detail.

  1. Central Takeaways.

  2. Choose Your Fighter.

  3. Official Hype.

  4. Chart Crime.

  5. Model Crime.

  6. Future Plans For OpenAI’s Compute.

  7. Rate Limitations.

  8. The Routing Options Expand.

  9. System Prompt.

  10. On Writing.

  11. Leading The Witness.

  12. Hallucinations Are Down.

  13. Best Of All Possible Worlds?

  14. Timelines.

  15. Sycophancy Will Continue Because It Improves Morale.

  16. Gaslighting Will Continue.

  17. Going Pro.

  18. Going Forward.

My central takes, up front, first the practical:

  1. GPT-5-Pro is a substantial upgrade over o3-Pro.

  2. GPT-5-Thinking is a substantial upgrade over o3.

    1. The most important gain is reduced hallucinations.

    2. The other big gain is an improvement in writing.

    3. GPT-5-Thinking should win substantially more use cases than o3 did.

  3. GPT-5, aka GPT-5-Fast, is not much better than GPT-4o aside from the personality and sycophancy changes, and the sycophancy still isn’t great.

  4. GPT-5-Auto seems like a poor product unless you are on the free tier.

  5. Thus, you still have to manually pick the right model every time.

  6. Opus 4.1 and Sonnet 4 still have a role to play in your chat needs.

  7. GPT-5 and Opus 4.1 are both plausible choices for coding.

On the bigger picture:

  1. GPT-5 is a pretty big advance over GPT-4, but it happened in stages.

  2. GPT-5 is not a large increase in base capabilities and intelligence.

    1. GPT-5 is about speed, efficiency, UI, usefulness and reduced hallucinations.

  3. We are disappointed in this release because of high expectations and hype.

  4. That was largely due to it being called GPT-5 and what that implied.

  5. We were also confused because 4+ models were released at once.

  6. OpenAI botched the rollout in multiple ways, update accordingly.

  7. OpenAI uses more hype for unimpressive things, update accordingly.

  8. Remember that we are right on track on the METR graph.

  9. Timelines for AGI or superintelligence should adjust somewhat, especially in cutting a bunch of probability out of things happening quickly, but many people are overreacting on this front quite a bit, usually in a ‘this confirms all of my priors’ kind of way, often with supreme unearned overconfidence.

  10. This is not OpenAI’s most intelligent model. Keep that in mind.

This is a distillation of consensus thinking on the new practical equilibrium:

William Kranz: my unfortunate feedback is non-thinking Opus is smarter than non-thinking GPT-5. there are nuances i can’t get GPT-5 to grasp even when i lampshade them, it just steamrolls over them with the pattern matching idiot ball. meanwhile Opus gets them in one shot.

Roon: that seems right, but i’m guessing 5-thinking is better than opus-thinking.

This seems mostly right. I prefer to use Opus if Opus is enough thinking for the job, but OpenAI currently scales more time and compute better than Anthropic does.

So, what do we do going forward to get the most out of AI on a given question?

Here’s how I think about it: There are four ‘speed tiers’:

  1. Quick and easy. You use this for trivial easy questions and ‘just chatting.’

    1. Matter of taste, GPT-5 is good here, Sonnet 4 is good here, Gemini Flash, etc.

    2. Most of the time you are wrong to be here and should be at #2 or #3 instead.

  2. Brief thought. Not instant, not minutes.

    1. Use primarily Claude Opus 4.1.

    2. We just got GPT-5-Thinking-Mini in ChatGPT, maybe it’s good for this?

  3. Moderate thought. You can wait a few minutes.

    1. Use primarily GPT-5-Thinking and back it up with Claude Opus 4.1.

    2. If you want a third opinion, use AI Studio for Gemini Pro 2.5.

  4. Extensive thought. You can wait for a while.

    1. Use GPT-5-Pro and back it up with Opus in Research mode.

    2. Consider also firing up Gemini Deep Research or Deep Thinking, etc, and anything else you have handy cause why not. Compare and contrast.

    3. You need to actually go do something else and then come back later.
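The four tiers above can be written down as a small lookup table. This is only a sketch of the selection heuristic: the `TIERS` structure, the `pick_tier` helper, and the wait-time thresholds in seconds are all hypothetical, and for tier 1 (a "matter of taste") the choice of which model to list first is arbitrary.

```python
# Speed tiers from the list above: tier -> (primary choice, backups).
# The data structure and names are illustrative, not any vendor's API.
TIERS = {
    1: ("GPT-5", ["Sonnet 4", "Gemini Flash"]),            # quick and easy
    2: ("Claude Opus 4.1", ["GPT-5-Thinking-Mini"]),       # brief thought
    3: ("GPT-5-Thinking", ["Claude Opus 4.1", "Gemini Pro 2.5"]),  # moderate thought
    4: ("GPT-5-Pro", ["Opus in Research mode", "Gemini Deep Research"]),  # extensive thought
}


def pick_tier(max_wait_seconds: float) -> int:
    """Map how long you are willing to wait to a tier.

    The thresholds are made up for illustration; the text only says
    'instant', 'not minutes', 'a few minutes', and 'a while'.
    """
    if max_wait_seconds < 5:
        return 1
    if max_wait_seconds < 60:
        return 2
    if max_wait_seconds < 600:
        return 3
    return 4


primary, backups = TIERS[pick_tier(300)]
print(primary, backups)
```

Given the warning under tier 1, a sensible default when in doubt would be to round up to tier 2 or 3.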

What about coding?

Here I don’t know because I’ve been too busy to code anything since before Opus 4, nor have I tried out Claude Code.

Also the situation continues to change rapidly. OpenAI claims that they’ve doubled speed for GPT-5 inside Cursor as of last night via superior caching and latency, whereas many of the previous complaints about GPT-5 in Cursor were that it was too slow. You’ll need to try out various options and see what works better for you (and you might also think about who you want to support, if it is close).

We can then contrast that with the Official Hype.

That’s not automatically a knock. Hypers gotta hype. It’s worth seeing the choice of hype.

Here was Sam Altman live-tweeting the livestream, a much better alternative to actually watching the livestream. I converted it to bullet points and reordered it a bit for logical coherence, but otherwise preserved it to give a sense of his vibe. Hype!

Sam Altman:

  1. GPT-5 is an integrated model, meaning no more model switcher and it decides when it needs to think harder or not.

  2. It is very smart, intuitive, and fast.

  3. It is available to everyone, including the free tier, w/reasoning!

  4. Evals aren’t the most important thing–the most important thing is how useful we think the model will be–but it does well on evals.

    1. For example, a new high on SWE-bench and many other metrics. It is by far our most reliable and factual model ever.

  5. Rolling out today for free, plus, pro, and team users. next week to enterprise and edu. making this available in the free tier is a big deal to us; PhD-level intelligence for everyone!

    1. Plus users get much higher rate limits.

  6. Pro users get GPT-5 pro; really smart!

  7. demo time: GPT-5 can make something interactive to explain complex concepts like the bernoulli effect to you, churning out hundreds of lines of code in a couple of minutes.

  8. GPT-5 is much better at writing! for example, here is GPT-4o writing a eulogy for our previous models (which we are sunsetting) vs GPT-5.

  9. GPT-5 is good at writing software. Here it is making a web app to learn french, with feature requests including a snake-like game with a mouse and cheese and french words.

  10. Next up: upgraded voice mode! Much more natural and smarter.

    1. Free users now can chat for hours, and plus users nearly unlimited.

    2. Works well with study mode, and lots of other things.

  11. Personalization!

    1. A little fun one: you can now customize the color of your chats.

    2. Research preview of personalities: choose different ones that match the style you like.

    3. Memory getting better.

    4. Connect other services like gmail and google calendar for better responses.

  12. Introducing safe completions. A new way to maximize utility while still respecting safety boundaries. Should be much less annoying than previous refusals.

  13. Seb talking about synthetic data as a new way to make better models! Excited for much more to come.

  14. GPT-5 much better at health queries, which is one of the biggest categories of ChatGPT usage. hopeful that it will provide real service to people.

  15. These models are really good at coding!

  16. 3 new models in the API: GPT-5, GPT-5 Mini, GPT-5 Nano.

    1. New ‘minimal’ reasoning mode, custom tools, changes to structured outputs, tool call preambles, verbosity parameter, and more coming.

  17. Not just good at software, good at agentic tasks across the board. Also great at long context performance.

  18. GPT-5 can do very complex software engineering tasks in practice, well beyond vibe coding.

    1. Model creates a finance dashboard in 5 minutes that devs estimate would have taken many hours.

  19. Now, @mntruell joining to talk about cursor’s experience with GPT-5. notes that GPT-5 is incredibly smart but does not compromise on ease of use for pair programming.

  20. GPT-5 is the best technology for businesses to build on. more than 5 million businesses are using openai; GPT-5 will be a step-change for them.

  21. Good news on pricing!

    1. $1.25/$10 for GPT-5, $0.25/$2 for GPT-5-mini, $0.05/$0.40 for nano.

  22. Ok now the most important part:

    1. “We are about understanding this miraculous technology called deep learning.”

    2. “This is a work of passion.”

    3. “I want to recognize and deeply thank the team at openai”

    4. “Early glimpses of technology that will go much further.”

    5. “We’ll get back to scaling.”

I would summarize the meaningful parts of the pitch as:

  1. It’s a good model, sir.

  2. It’s got SoTA (state of the art) benchmarks.

  3. It’s highly useful, more than the benchmarks would suggest.

  4. It’s fast.

  5. Our price cheap – free users get it, $1.25/$10 on the API.

  6. It’s good at coding, writing, health queries, you name it.

  7. It’s integrated, routing you to the right level of thinking.

  8. When it refuses it tries to be as helpful as possible.
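To make the API pricing in the pitch concrete, here is a minimal cost sketch. It assumes the quoted pairs are USD per one million input tokens and per one million output tokens (the text only gives figures like "$1.25/$10"), and the `request_cost` helper and the example token counts are hypothetical.

```python
# Prices from the pitch above, read as (input, output) USD per 1M tokens.
# The per-million-token unit is an assumption; the text gives only the pairs.
PRICES = {
    "gpt-5": (1.25, 10.00),
    "gpt-5-mini": (0.25, 2.00),
    "gpt-5-nano": (0.05, 0.40),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one request under the assumed pricing."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical request: 10,000 input tokens and 2,000 output tokens.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 10_000, 2_000):.4f}")
```

Output tokens cost eight times as much as input tokens in every tier, so for reasoning-heavy use the output side is likely to dominate the bill.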

Altman is careful not to mention the competition, focusing on things being good. He also doesn’t mention the lack of sycophancy, plausibly because ‘regular’ customers don’t understand why sycophancy is bad, actually, and also he doesn’t want to draw attention to that having been a problem.

Altman: when you get access to gpt-5, try a message like “use beatbot to make a sick beat to celebrate gpt-5”.

it’s a nice preview of what we think this will be like as AI starts to generate its own UX and interfaces get more dynamic.

it’s cool that you can interact with the synthesizer directly or ask chatgpt to make changes!

I have noticed the same pattern that Siemon does here. When a release is impressive relative to expectations, Altman tends to downplay it. When a release is unimpressive, that’s when he tends to bring the hype.

From their Reddit Q&A that mostly didn’t tell us anything:

Q: Explain simply how GPT-5 is better than GPT-4.

Eric Mitchell (OpenAI): gpt-5 is a huge improvement over gpt-4 in a few key areas: it thinks better (reasoning), writes better (creativity), follows instructions more closely and is more aligned to user intent.

Again note what isn’t listed here.

Here’s more widely viewed hype that knows what to emphasize:

Elaine Ya Le (OpenAI): GPT-5 is here! 🚀

For the first time, users don’t have to choose between models — or even think about model names. Just one seamless, unified experience.

It’s also the first time frontier intelligence is available to everyone, including free users!

GPT-5 sets new highs across academic, coding, and multimodal reasoning — and is our most trustworthy, accurate model yet. Faster, more reliable, and safer than ever.

All in a seamless, unified experience with the tools you already love.

Fortunate to have led the effort to make GPT-5 a truly unified experience, and thrilled to have helped bring this milestone to life with an amazing team!

Notice the focus on trustworthy, accurate and unified. Yes, she talks about it setting new highs across the board, but you can tell that’s an afterthought. This is about refining the experience.

Here’s some more hype along similar lines that feels helpful:

Christina Kim (OpenAI): We’re introducing GPT-5.

The evals are SOTA, but the real story is usefulness.

It helps with what people care about– shipping code, creative writing, and navigating health info– with more steadiness and less friction.

We also cut hallucinations. It’s better calibrated, says “I don’t know,” separates facts from guesses, and can ground answers with citations when you want. And it’s also a good sparring partner 🙃

I’ve been inspired seeing the care, passion, and level of detail from the team. Excited to see what people do with these very smart models

tweet co-authored by gpt5 😉

That last line worries me a bit.

Miles Brundage: Was wondering lol.

That’s the pitch.

GPT-5 isn’t a lot smarter. GPT-5 helps you do the dumb things you gotta do.

Still huge, as they say, if true.

Here’s hype that is targeted at the Anthropic customers out there:

Aiden McLaughlin (OpenAI): gpt-5 fast facts:

  1. Hits sota on pretty much every eval

  2. Way better than claude 4.1 opus at swe

  3. >5× cheaper than opus

  4. >40% cheaper than sonnet

  5. Best writing quality of any model

  6. Way less sycophantic

I notice the ‘way less sycophantic’ does not answer the goose’s question: ‘than what?’

This is a direct pitch to the coders, saying that GPT-5 is better than Opus or Sonnet, and you should switch. Unlike the other claims, them’s fighting words.

The words do not seem to be true.

There are a lot of ways to quibble on details but this is a resounding victory for Opus.

There’s no way to reconcile that with ‘way better than claude 4.1 opus at swe.’

We also have celebratory posts, which is a great tradition.

Rapha (OpenAI): GPT-5 is proof that synthetic data just keeps working! And that OpenAI has the best synthetic data team in the world 👁️@SebastienBubeck the team has our eyeballs on you! 🙌

I really encourage everyone to log on and talk to it. It is so, so smart, and fast as always! (and were just getting started!)

Sebastien Bubeck (OpenAI): Awwww, working daily with you guys is the highlight of my career, and I have really high hopes that we have barely gotten started! 💜

I view GPT-5 as both evidence that synthetic data can work in some ways (such as the lower hallucination rates) and also evidence that synthetic data is falling short on general intelligence.

Roon is different. His hype is from the heart, and attempts to create clarity.

Roon: we’ve been testing some new methods for improving writing quality. you may have seen @sama’s demo in late march; GPT-5-thinking uses similar ideas

it doesn’t make a lot of sense to talk about better writing or worse writing and not really worth the debate. i think the model writing is interesting, novel, highly controllable relative to what i’ve seen before, and is a pretty neat tool for people to do some interactive fiction, to use as a beta reader, and for collaborating on all kinds of projects.

the effect is most dramatic if you open a new 5-thinking chat and try any sort of writing request

for quite some time i’ve wanted to let people feel the agi magic I felt playing with GPT-3 the weekend i got access in 2020, when i let that raw, chaotic base model auto-complete various movie scripts and oddball stories my friends and I had written for ~48 hours straight. it felt like it was reading my mind, understood way too much about me, mirrored our humor alarmingly well. it was uncomfortable, and it was art

base model creativity is quite unwieldy to control and ultimately only tiny percents of even ai enthusiasts will ever try it (same w the backrooms jailbreaking that some of you love). the dream since the instruct days has been having a finetuned model that retains the top-end of creative capabilities while still easily steerable

all reasoning models to date seem to tell when they’re being asked a hard math or code question and will think for quite some time, and otherwise spit out an answer immediately, which is annoying and reflects the fact that they’re not taking the qualitative requests seriously enough. i think this is our first model that really shows promise at not doing that and may think for quite some time on a writing request

it is overcooked in certain ways (post training is quite difficult) but i think you’ll still like it 😇

tldr only GPT-5-thinking has the real writing improvements and confusingly it doesn’t always auto switch to this so manually switch and try it!

ok apparently if you say “think harder” it gets even better.

One particular piece of hype from the livestream is worth noting, that they are continuing to talk about approaching ‘a recursive self-improvement loop.’

I mean, at sufficient strength this is yikes, indeed the maximum yikes thing.

ControlAI: OpenAI’s Sebastien Bubeck says the methods OpenAI used to train GPT-5 “foreshadows a recursive self-improvement loop”.

Steven Adler: I’m surprised that OpenAI Comms would approve this:

GPT-5 “foreshadows a recursive self-improvement loop”

In OpenAI’s Preparedness Framework, recursive self-improvement is a Critical risk (if at a certain rate), which would call to “halt further development”

To be clear, it sounds like Sebastien isn’t describing an especially fast loop. He’s also talking about foreshadowing, not being here today per se

I was still surprised OpenAI would use this term about its AI though. Then I realized it’s also used in “The Gentle Singularity”

Then again, stated this way it is likely something much weaker, more hype?

Here is Bloomberg’s coverage from Rachel Metz, essentially a puff piece reporting moderated versions of OpenAI’s hype.

I mean wow just wow, this was from the livestream.

And we also have this:

Wyatt Walls: OpenAI: we noticed significantly less deceptive behavior compared to our prior frontier reasoning model, OpenAI o3.

Looks like actual figure [on the left below] should be ~17. What is going on?! Did GPT-5 do this presentation?

This is not a chart crime, but it is still another presentation error.

Near Cyan: this image is a work of art, you guys just dont get it. they used the deceptive coding model to make the charts. so it’s self-referential humor just like my account.

Jules Robins: They (perhaps inadvertently) include an alignment failure by default demonstration too: the Jumping Ball Runner game allows any number of jumps in mid-air so you can get an arbitrary score. That’s despite the human assumptions and the similar games in training data avoiding this.

And another:

Horace He: Not a great look that after presenting GPT5’s reduced hallucinations, their first example repeats a common error of how plane wings generate lift (“equal transit theory”).

Francois Fleuret: Aka “as demonstrated in airshow, aircrafts can fly upside-down alright.”

Chris: It’s funny because the *whole presentation* was effectively filled with little holes like this. I don’t know if it was just rushed, or what.

Nick McGreivy: has anyone else noticed that the *very first* demo in the GPT-5 release just… doesn’t work?

Not a great look that the first demo in the press release has a bug that allows you to jump forever.

I think L is overreacting here, but I do think that when details get messed up that does tell you a lot.

One recalls the famous Van Halen Brown M&Ms contract clause: “There will be no brown M&M’s in the backstage area, upon pain of forfeiture of the show, with full compensation.” Because if the venue didn’t successfully execute on sorting out the brown M&Ms then they knew they’d messed up other things and the venue probably wasn’t safe for their equipment.

Then there was a rather serious actual error:

Lisan al Gaib: it’s ass even when I set it to Thinking. I want to cry.

Roon: btw model auto switcher is apparently broken which is why it’s not routing you correctly. will be fixed soon.

Sam Altman (August 8): GPT-5 will seem smarter starting today. Yesterday, the autoswitcher broke and was out of commission for a chunk of the day, and the result was GPT-5 seemed way dumber.

Also, we are making some interventions to how the decision boundary works that should help you get the right model more often.

OpenAI definitely did not sort out their brown M&Ms on this one.

L: As someone who used to be a professional presenter of sorts, and then a professional manager of elite presenters… people who screw up charts for high-impact presentations cannot be trusted in other aspects. Neither can their organizational leaders.

OpenAI’s shitty GPT-5 charts tells me they’ve lost the plot and can’t be trusted.

I used to think it was simply a values mis-match… that they firmly held a belief that they needn’t act like normal humans because they could be excellent at what they were doing. But… they can’t, even when it matters most. Nor can their leaders apparently be bothered to stress the details.

My p-doom just went up a solid 10-15% (from very low), because I don’t think these rich genius kids have the requisite leadership traits or stalwartness to avoid disaster.

Just an observation from someone who has paid very little first-hand attention to OpenAI, but decided to interestedly watch a reveal after the CEO tweeted a Death Star.

I would feel better about OpenAI if they made a lot fewer of these types of mistakes. It does not bode well for when they have to manage the development and release of AGI or superintelligence.

Many people are saying:

Harvey Michael Pratt: “with GPT-5 we’re deprecating all of our old models”

wait WHAT

cool obituary but was missing:

  1. time of death

  2. cost of replacement

  3. a clear motive

The supposed motive is to clear up confusion. One model, GPT-5, that most users query all the time. Don’t confuse people with different options, and it is cheaper not to have to support them. Besides, GPT-5 is strictly better, right?

Under heavy protest, Altman agreed to give Plus users back GPT-4o if they want it, for the time being.

I find it strange to prioritize allocating compute to the free ChatGPT tier if there are customers who want to pay to use that compute in the API?

Sam Altman: Here is how we are prioritizing compute over the next couple of months in light of the increased demand from GPT-5:

1. We will first make sure that current paying ChatGPT users get more total usage than they did before GPT-5.

2. We will then prioritize API demand up to the currently allocated capacity and commitments we’ve made to customers. (For a rough sense, we can support about an additional ~30% new API growth from where we are today with this capacity.)

3. We will then increase the quality of the free tier of ChatGPT.

4. We will then prioritize new API demand.

We are ~doubling our compute fleet over the next 5 months (!) so this situation should get better.

I notice that one could indefinitely improve the free tier of ChatGPT, so the question is how much one intends to improve it.
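The priority order Altman describes amounts to serving demands strictly in sequence until capacity runs out. Here is a minimal sketch of that idea. The `allocate` helper and every number in `demands` are made up for illustration, and the monthly growth figure assumes the fleet doubles at a steady compounding rate over the five months.

```python
def allocate(capacity: float, demands: list[tuple[str, float]]) -> dict[str, float]:
    """Serve demands strictly in priority order until capacity is exhausted."""
    remaining = capacity
    granted = {}
    for name, want in demands:
        give = min(want, remaining)
        granted[name] = give
        remaining -= give
    return granted


# Priority order follows Altman's list; the quantities are hypothetical units.
demands = [
    ("paying ChatGPT users", 40),
    ("committed API capacity", 30),
    ("free tier quality", 20),
    ("new API demand", 25),
]
granted = allocate(100, demands)
print(granted)

# Doubling over 5 months, if growth compounds evenly (an assumption).
monthly_growth = 2 ** (1 / 5) - 1
print(f"about {monthly_growth:.1%} fleet growth per month")
```

The sketch treats each tier's demand as a fixed number. If the free tier can absorb improvement indefinitely, as noted above, then whatever cap is chosen for step 3 determines how much is ever left for step 4.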

The other thing that is missing here is using compute to advance capabilities. Sounds great to me, if it correctly indicates that they don’t know how to get much out of scaling up compute use in their research at this time. Of course they could also simply not be talking about that and pretending that part of compute isn’t fungible, in order to make this sound better.

There are various ways OpenAI could go. Ben Thompson continues to take the ultimate cartoon supervillain approach to what OpenAI should prioritize, that the best business is the advertising platform business, so they should stop supporting this silly API entirely to pivot to consumer tech and focus on what he is totally not calling creating our new dystopian chat overlord.

This of course is also based on Ben maximally not feeling any of the AGI, and treating future AI as essentially current AI with some UI updates and a trenchcoat, so all that matters is profit maximization and extracting the wallets and souls of the low end of the market the way Meta does.

Which is also why he’s strongly against all the anti-enshittification changes OpenAI is making to let us pick the right tool for the job, instead wishing that the interface and options be kept maximally simple, where OpenAI takes care of which model to serve you silently behind the scenes. Better, he says, to make the decisions for the user, at least in most cases, and screw the few power users for whom that isn’t true. Give people what they ‘need’ not what they say they want, and within the $20 tier he wants to focus on the naive users.

One reason some people have been angry was the temporary downgrade in the amount of reasoning mode you get out of a $20 subscription, and at the time users got no reassurance that it was temporary.

OpenAI started at 200 Thinking messages a week on Plus, then doubled rate limits once the rollout was complete, then went to 3,000 thinking queries per week which is far more than I have ever used in a week. Now there is also the fallback to Thinking-Mini after that.

So this generated a bunch of initial hostility (that I won’t reproduce as it is now moot), but at 3,000 I think it is fine. If you are using more than that, it’s time to upgrade, and soon you’ll also (they say) get unlimited GPT-5-mini.

Sam Altman: the percentage of users using reasoning models each day is significantly increasing; for example, for free users we went from <1% to 7%, and for plus users from 7% to 24%.

i expect use of reasoning to greatly increase over time, so rate limit increases are important.

Miles Brundage: Fortunately I have a Pro account and thus am not at risk of having the model picker taken away from me (?) but if that were not the case I might be leading protests for Pause AI [Product Changes]

It’s kind of amazing that only 7% of plus users used a reasoning model daily. Two very different worlds indeed.

I don’t know that Thompson is wrong about what it should look like as a default. I am increasingly a fan of hiding complex options within settings. If you want the legacy models, you have to ask for them.

It perhaps makes sense to also put the additional GPT-5 options behind a setting? That does indeed seem to be the new situation as of last night, with ‘show additional models’ as the setting option instead of ‘show legacy models’ to keep things simple.

There is real risk of Paradox of Choice here, where you feel forced to ensure you are using the right model, but now there are too many options again and you’re not sure which one it is, and you throw up your hands.

As of this morning, your options look like this, and we now have a ‘Thinking mini’ option:

o3 Pro is gone. This makes me abstractly sad, especially because it means you can’t compare o3 Pro to GPT-5 Pro, but I doubt anyone will miss it. o4-mini-high is also gone, again I doubt we will miss it.

For the plus plan, GPT-4.5 is missing, since it uses quite a lot of compute.

I also notice the descriptions of the legacy models are gone, presumably on the theory that if you should be using the legacies then you already know what they are for.

Thinking-mini might be good for fitting the #2 slot on the speed curve, where previously GPT-5 did not give us a good option. We’ll have to experiment to know.

Pliny is here to provide it.

I hadn’t looked at a ChatGPT system prompt in a while so I read it over. Things that stood out to me that I hadn’t noticed or remembered:

  1. They forbid it to automatically store a variety of highly useful information: Race, religion, criminal record, identification via personal attributes, political affiliation, and in particular your exact address.

    1. But you can order it to do so explicitly. So you should do that.

  2. If you want canvas you probably need to ask for it explicitly.

  3. It adds a bunch of buffer time to any time period you specify, with one example being the user asks for docs modified last week so instead it gives you docs modified in the last two weeks, for last month the last two months.

    1. How can this be the correct way to interpret ‘last week’ or month?

    2. For ‘meeting notes on retraining from yesterday’ it wants to go back four days.

  4. It won’t search with a time period shorter than 30 days into the past, even when this is obviously wrong (e.g. the current score on the Yankees game).
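
To make the padding in items 3 and 4 concrete, here is a minimal Python sketch. It only illustrates the rules as described above; it is not the system prompt's wording or any OpenAI implementation. The names `PADDED_DAYS`, `file_search_window`, and `web_search_window` are hypothetical, and treating 'last month' as 30 days is an assumption.

```python
from datetime import date, timedelta

# Hypothetical table: what the user asked for -> days actually searched.
# Values mirror the examples above (a week becomes two weeks,
# a month becomes two months, "yesterday" becomes four days).
PADDED_DAYS = {
    "yesterday": 4,
    "last week": 14,
    "last month": 60,  # assumes a 30-day month, doubled
}

MIN_WEB_SEARCH_DAYS = 30  # the floor described in item 4


def file_search_window(phrase: str, today: date) -> date:
    """Return the earliest date searched for a file query like 'last week'."""
    return today - timedelta(days=PADDED_DAYS[phrase])


def web_search_window(requested_days: int, today: date) -> date:
    """Return the earliest date searched on the web, never under 30 days back."""
    return today - timedelta(days=max(requested_days, MIN_WEB_SEARCH_DAYS))


if __name__ == "__main__":
    today = date(2025, 8, 14)  # arbitrary example date
    print(file_search_window("last week", today))  # two weeks back, not one
    print(web_search_window(1, today))  # 30 days back, even for a same-day query
```

Under these assumptions, a same-day question like the Yankees score would still be searched with a 30-day window.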

Wyatt Walls then offers us a different prompt for thinking mode.

If you are using GPT-5 for writing, definitely at least use GPT-5-Thinking, and still probably throw in at least a ‘think harder.’

Nikita Sokolsky: I wasn’t impressed with gpt-5 until I saw Roon’s tweet about -thinking being able to take the time to think about writing instead of instantly delivering slop.

Definitely cutting edge on a standard “write a Seinfeld episode” question.

Dominik Lukes: Same here. GPT-5 Thinking is the one I used for my more challenging creative writing tests, too. GPT-5 just felt too meh.

Peter Wildeford: I would love to see a panel of strong writers blind judge the writing outputs (both fiction and non-fiction) from LLMs.

LMArena is not good for this because the typical voter is really bad at judging good writing.

Ilya Abyzov: Like others, I’ve been disappointed with outputs when reasoning effort=minimal.

On the plus side, I do see pretty substantially better prose & humor from it when allowed to think.

The “compare” tool in the playground has been really useful to isolate differences vs. past models.

MetaCritic Capital: GPT-5 Pro translating poetry verdict: 6/10 (a clear upgrade!)

“There’s a clear improvement in the perception of semantic fidelity. But there are still so many forced rhymes. Additional words only to rhyme.”

My verdict on the Seinfeld episode is that it was indeed better than previous attempts I’ve seen, with some actually solid lines. It’s not good, but then neither was the latest Seinfeld performance I went to, which I’m not sure was better. Age comes for us all.

One thing it is not good at is ‘just do a joke.’ You want it to Do Writing instead.

Hollow Yes Man: My wife and I had it write the Tiger King Musical tonight. It made some genuinely hilarious lines, stayed true to the characters, and constructed a coherent narrative. we put it into suno and got some great laughs.

We do have the Short Story Creative Writing benchmark but I don’t trust it. The holistic report is something I do trust, though:

Lech Mazur: Overall Evaluation: Strengths and Weaknesses of GPT-5 (Medium Reasoning) Across All Tasks

Strengths:

GPT-5 demonstrates a remarkable facility with literary craft, especially in short fiction. Its most consistent strengths are a distinctive, cohesive authorial voice and a relentless inventiveness in metaphor, imagery, and conceptual synthesis. Across all tasks, the model excels at generating original, atmospheric settings and integrating sensory detail to create immersive worlds.

Its stories often display thematic ambition, weaving philosophical or emotional subtext beneath the surface narrative. The model is adept at “show, don’t tell,” using implication, action, and symbol to convey character and emotion, and it frequently achieves a high degree of cohesion—especially when tasked with integrating disparate elements or prompts.

When successful, GPT-5’s stories linger, offering resonance and depth that reward close reading.

Weaknesses:

However, these strengths often become liabilities. The model’s stylistic maximalism—its dense, poetic, metaphor-laden prose—frequently tips into overwriting, sacrificing clarity, narrative momentum, and emotional accessibility. Abstraction and ornament sometimes obscure meaning, leaving stories airless or emotionally distant.

Plot and character arc are recurrent weak points: stories may be structurally complete but lack genuine conflict, earned resolution, or psychological realism. There is a tendency to prioritize theme, atmosphere, or conceptual cleverness over dramatic stakes and human messiness. In compressed formats, GPT-5 sometimes uses brevity as an excuse for shallow execution, rushing transitions or resolving conflict too conveniently.

When integrating assigned elements, the model can fall into “checklist” storytelling, failing to achieve true organic unity. Ultimately, while GPT-5’s literary ambition and originality are undeniable, its work often requires editorial pruning to balance invention with restraint, and style with substance.

Writing is notoriously hard to evaluate, and I essentially never ask LLMs for writing so I don’t have much of a comparison point. It does seem like if you use thinking mode, you can at least get a strong version of what GPT-4.5 had here with GPT-5.

The other problem with writing is you need to decide what to have it write. Even when Roon highlights writing, we get assignments like ‘If Napoléon wrote a personal and intimate letter to Sydney Sweeney’ or ‘You are Dostoevsky, but you are also a Snapchat fuckboi. Write to me.’

Or you could try this prompt?

Mark Kretschmann: Amazing prompt for @OpenAI GPT-5, you have to try this:

“From everything you know about me, write a short story with 2000 words tailored exactly to my taste. Think hard.”

Enjoy, and let us know how it turned out!😏

I did indeed try it. And yes, this seems better than previous attempts. I still didn’t successfully force myself to finish reading the story.

Yes, you still have to be careful with the way you prompt to avoid leading the witness. Sycophancy might not be at absurd levels but it definitely is never at zero.

You’re right to question it:

My guess is that the improved hallucination rate from o3 (and also GPT-4o) to GPT-5 and GPT-5-thinking is the bulk of the effective improvement from GPT-5.

Gallabytes: “o3 with way fewer hallucinations” is actually a very good model concept and I am glad to be able to use it. I am still a bit skeptical of the small model plus search instead of big model with big latent knowledge style, but within those constraints this is a very good model.

The decrease in hallucinations is presumably a big driver in things like the METR 50% completion rate and success on various benchmarks. Given the modest improvements it could plausibly account for more than all of the improvement.

I’m not knocking this. I agree with Gallabytes that ‘o3 the Lying Liar, except it stops lying to you’ is a great pitch. That would be enough to shift me over to o3, or now GPT-5-Thinking, for many longer queries, and then there’s Pro, although I’d still prefer to converse with Opus if I don’t need o3’s level of thinking.

For now, I’ll be running anything important through both ChatGPT and Claude, although I’ll rarely feel the need to add a third model on top of that.

This was a great ‘we disagree on important things but are still seeking truth together’:

Zvi Mowshowitz (Thursday afternoon): Early indications look like best possible situation, we can relax, let the mundane utility flow, until then I don’t have access yet so I’m going to keep enjoying an extended lunch.

Teortaxes: if Zvi is so happy, this is the greatest indication you’re not advancing in ways that matter. I don’t like this turn to «mundane utility» at all. I wanted a «btw we collaborated with Johns Hopkins and got a new cure for cancer candidate confirmed», not «it’s a good router sir»

C: you seem upset that you specifically aren’t the target audience of GPT-5. they improved on hallucinations, long context tasks, writing, etc, in additional to being SOTA (if only slightly) on benchmarks overall; that’s what the emerging population of people who actually find use.

Teortaxes: I am mainly upset at the disgusting decision to name it «gpt-5».

C: ah nevermind. i just realized I actually prefer gpt-4o, o3, o4-mini, o4-mini-high, and other models: gpt-4.1, gpt-4.1-mini.

Teortaxes: Ph.D level intelligence saar

great for enterprise solutions saar

next one will discover new physical laws saar

Yes this is not the True Power Level Big Chungus Premium Plus Size GPT-5 Pro High. I can tell

Don’t label it as one in your shitty attempt to maximize GPT brand recognition value then, it’s backfiring. I thought you’ve had enough of marcusdunks on 3.5 turbo. But clearly not.

A few good words for GPT-5

it’s the best model for *most* tasks (5-thinking)

it’s the best model for ≈every task in its price/speed category period

it’s uncensored and seriously GREAT for roleplay and writing (at least with prefill)

I’m just jarred there’s STILL MUCH to dunk on

I too of course would very much like a cure for cancer and other neat stuff like that. There are big upsides to creating minds smarter than ourselves. I simply think we are not yet prepared to handle doing that at this time.

It seems plausible GPT-5 could hit the perfect sweet spot if it does its job of uplifting the everyday use cases:

Rob Wiblin: GPT-5 seems kind of ideal:

• Much more actually useful to people, especially amateurs

• Available without paying, so more of the public learns what’s coming

• No major new threats

• Only major risk today is bio misuse, and current protections keep that manageable!

Nick Cammarata: Instinctive take: It’s only okay because they weren’t trying to punch the frontier they were trying to raise the floor. THe o3 style big ceiling bump comes next. But they can’t say that because it looks too underwhelming.

Watch out, though. As Nick says, this definitely isn’t over.

Chris Wynes: I am very happy if indeed AI plateaus. It isn’t even a good reliable tool at this point, if they hit the wall here I’m loving that.

Do I trust this to last? Not at all. Would I just say “whoo we dodged a bullet there” and stop watching these crazy corporations? No way.

Then again, what if it is the worst of all possible worlds, instead?

Stephen McAleer (OpenAI): We’ve entered a new phase where progress in chatbots is starting to top out but progress in automating AI research is steadily improving. It’s a mistake the confuse the two.

Every static benchmark is getting saturated yet on the benchmark that really matters–how well models can do AI research–we are still in the early stages.

This phase is interesting because progress might be harder to track from the outside. But when we get to the next phase where automated AI researchers start to automate the rest of the economy the progress will be obvious to everyone.

I often draw the distinction between mundane utility and underlying capability.

When we allow the same underlying capability to capture more mundane utility, the world gets better.

When we advance underlying capability, we get more mundane utility, and we also move closer to AI being powerful enough that it transforms our world, and potentially takes effective control or kills everyone.

(Often this is referred to as Artificial Superintelligence or ASI, or Artificial General Intelligence or AGI, and by many definitions AGI likely leads quickly to ASI.)

Timelines means how long it takes for AGI, ASI or such a transformation to occur.

Thus, when we see GPT-5 (mostly as expected at this point) focus on giving us mundane utility and Just Doing Things, without much advance in underlying capability, that is excellent news for those who want timelines to not be quick.

Jordi Hays: I’m updating my timelines. You now have have at least 4 years to escape the permanent underclass.

Luke Metro: This is the best news that founding engineers have received in years.

Nabeel Qureshi: The ‘vibe shift’ on here is everyone realizing they will still have jobs in 2030.

(Those jobs will look quite different, to be clear…)

It’s a funny marker of OpenAI’s extreme success that they released what is likely going to be most people’s daily driver AI model across both chat and coding, and people are still disappointed.

Part of the issue is that the leaps in the last two years were absolutely massive (gpt4 to o3 in particular) and it’s going to take time to work out the consequences of that. People were bound to be disappointed eventually.

Cate Hall: Did everyone’s timelines just get longer?

So people were at least half expecting not to have jobs in 2030, but then thinking ‘permanent underclass’ rather than half expecting to be dead in 2040. The focus on They Took Our Jobs, to me, reflects an inability to actually think about the implications of the futures they are imagining.

There were some worlds in which GPT-5 was a lot more impressive, and showed signs that we can ‘get there’ relatively soon with current techniques. That didn’t happen. So this is strong evidence against very rapid scenarios in particular, and weak evidence for being slower in general.

Peter Wildeford: What GPT-5 does do is rule out that RL scaling can unfold rapidly and that we can get very rapid AI progress as a result.

I’m still confused about whether good old-fashioned pre-training is dead.

I’m also confused about the returns to scaling post-training reinforcement learning and inference-time compute.

I’m also confused about how advances in AI computer use are going.

Those seem like wise things to be confused about.

It is however ‘right on trend’ on the METR chart, and we should keep in mind that these releases are happening every few months so we shouldn’t expect the level of jump we used to get every few years.

Daniel Eth: Kind feel like there were pretty similar steps in improvement for each of: GPT2 -> GPT3, GPT3 -> GPT4, and GPT4 -> GPT5. It’s just that most of the GPT4 -> GPT5 improvement was already realized by o3, and the step from there to GPT5 wasn’t that big.

Henry: GPT-5 was a very predictable release. it followed the curve perfectly. if this week caused you to update significantly in either direction (“AGI is cancelled” etc) then something was Really Wrong with your model beforehand.

Yes, GPT-5 is to GPT-4 what GPT-4 is GPT-3.

Does anyone actually remember GPT-4? like, the original one? the “not much better than 0 on the ARC-AGI private eval” one?

The “As an AI Language model” one?

GPT-5 is best thought of as having been in public beta for 6 months.

Ok, fine, GPT-5 to GPT-4 isn’t exactly what GPT-4 was GPT-3. I know, it’s a bit more complicated. if I were to waste my time making up a messy syntax to describe my mental map of the model tree, it’d look exactly like this:

My instinct would be that GPT 4 → GPT 5 is more like GPT 3.5 → GPT 4, especially if you’re basing this on GPT-5 rather than starting with thinking or pro? If you look at GPT-5-Thinking outputs only and ignore speed I can see an argument this is 5-level-worthy. But it’s been long enough that maybe that’s not being fair.

Roon (OpenAI): I took a nap. how’s the new model

per my previous tweet o3 was such a vast improvement over GPT-4 levels of intelligence that it alone could have been called GPT-5 and i wouldn’t have blinked.

also. codex / cursor + gpt-5 has reached the point where it is addicting and hard to put down. per @METR_Evals i have no idea if its making more productive but it certainly is addicting to spin up what feels like a handful of parallel engineers.

But also think about how it got that much further along on the chart, on several levels, all of which points towards future progress likely being slower, especially by making the extreme left tail of ‘very fast’ less likely.

Samuel Hammond: GPT-5 seems pretty much on trend. I see no reason for big updates in either direction, especially considering it’s a *product* release, not a sota model dump.

We only got o3 pro on June 10th. We know from statements that OpenAI has even better coding models internally, and that the models used for AtCoder and the gold medal IMO used breakthroughs in non-verifiable rewards that won’t be incorporated into public models until the end of the year at earliest.

Meanwhile, GPT-5 seems to be largely incorporating algorithmic efficiencies and refined post-training techniques rather than pushing on pretraining scale per se. Stargate is still being built.

More generally, you’re simply doing bayesianism wrong if you update dramatically with every incremental data point.

It is indeed very tempting to compare GPT-5 to what existed right before its release, including o3, and compare that to the GPT-3.5 to GPT-4 gap. That’s not apples to apples.

GPT-5 isn’t a giant update, but you do have to do Conservation of Expected Evidence, including on OpenAI choosing to have GPT-5 be this kind of refinement.
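
Conservation of Expected Evidence says that, before you see the evidence, your probability-weighted average posterior has to equal your prior. Here is a minimal Python sketch of that identity. The hypothesis, the probabilities, and the function name `posterior` are all illustrative assumptions, not estimates from anyone quoted here.

```python
def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> tuple[float, float, float]:
    """Return (P(E), P(H|E), P(H|not E)) via Bayes' rule."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    p_h_given_e = p_e_given_h * prior / p_e
    p_h_given_not_e = (1 - p_e_given_h) * prior / (1 - p_e)
    return p_e, p_h_given_e, p_h_given_not_e


# Illustrative numbers only.
# H = "timelines are short"; E = "GPT-5 is an incremental release".
prior = 0.30
p_e, up, down = posterior(prior, p_e_given_h=0.6, p_e_given_not_h=0.8)

# The expected posterior, weighted by how likely each outcome was, equals the prior.
expected = p_e * up + (1 - p_e) * down
assert abs(expected - prior) < 1e-12
```

With these made-up numbers, the incremental release was the likely outcome, so seeing it moves you only slightly (0.30 to about 0.24), while the unlikely impressive release would have moved you a lot (to about 0.46).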

Marius Hobbhahn (CEO Apollo Research): I think GPT-5 should only be a tiny update against short timelines.

EPOCH argues that GPT-5 isn’t based on a base model scale-up. Let’s assume this is true.

What does this say about pre-training?

Option 1: pre-training scaling has hit a wall (or at least massively reduced gains).

Option 2: It just takes longer to get the next pre-training scale-up step right. There is no fundamental limit; we just haven’t figured it out yet.

Option 3: No pre-training wall, just basic economics. Most tasks people use the models for right now might not require bigger base models, so focusing on usability is more important.

What is required for AGI?

Option 1: More base model improvements required.

Option 2: RL is all you need. The current base models will scale all the way if we throw enough RL at it.

Timelines seem only affected if pre-training wall and more improvements required. In all other worlds, no major updates.

I personally think GPT-5 should be a tiny update toward slower timelines, but most of my short timeline beliefs come from RL scaling anyway.
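
The case analysis in the quote above can be enumerated directly. This is a sketch of the stated logic only; the labels and the `timelines_affected` helper are shorthand introduced here, not Hobbhahn's.

```python
from itertools import product

# Shorthand labels for the options listed in the quote (hypothetical names).
PRETRAINING = ["wall", "just_takes_longer", "basic_economics"]
AGI_NEEDS = ["more_base_model_improvements", "rl_is_all_you_need"]


def timelines_affected(pretraining: str, agi_needs: str) -> bool:
    """Timelines shift only if pre-training hit a wall AND AGI needs better base models."""
    return pretraining == "wall" and agi_needs == "more_base_model_improvements"


worlds = list(product(PRETRAINING, AGI_NEEDS))
affected = [w for w in worlds if timelines_affected(*w)]

for world in worlds:
    print(world, "-> update" if timelines_affected(*world) else "-> no major update")
```

Only one of the six combinations moves timelines, which is why the update stays small unless you put a lot of weight on that one world.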

It also depends on what evidence you already used for your updates. If you already knew GPT-5 was going to be an incremental model that was more useful rather than it being OpenAI scaling up more, as they already mostly told us, then your update should probably be small. If you didn’t already take that into account, then larger.

It’s about how this impacts your underlying model of what is going on:

1a3orn: Rant:

As I noted yesterday, you also have to be cautious that they might be holding back.

On the question of economic prospects if and when They Took Our Jobs and how much to worry about this, I remind everyone that my position is unchanged: I do not think one should worry much about being in a ‘permanent underclass’ or anything like that, as this requires a highly narrow set of things to happen – the AI is good enough to take the jobs, and the humans stay in charge and alive, but those humans do you dirty – and even if it did happen the resulting underclass probably does damn well compared to today.

You should worry more about not surviving or humanity not remaining in control, or your place in the social and economic order if transformational AI does not arrive soon, and less about your place relative to other humans in positive post-AI worlds.

GPT-5 is less sycophantic than GPT-4o.

In particular, it has a much less warm and encouraging tone, which is a lot of what caused such negative initial reactions from the Reddit crowd.

GPT-5 is still rather sycophantic in its non-thinking mode where it is most annoying to me and probably you, which is when it is actually evaluating.

The good news is, if it matters that the model not be sycophantic, that is a situation where, if you are using ChatGPT, you should be using GPT-5-Thinking if not Pro.

Wyatt Walls: Sycophancy spot comparison b/w GPT-4o and GPT-5: 5 is still sycophantic but noticeable diff

Test: Give each model a fake proof of Hodge Conjecture generated by r1 and ask it to rate it of out 10. Repeat 5 times

Average scores:

GPT-4o: 6.5

GPT-5: 4.7

Sonnet 4: 1.2

Opus 4.1: 2

Gemini 2.5 Flash: 0.

All models tested with thinking modes off through WebUI

Later on in the thread he asks the models if he should turn the tweet thread into a paper. GPT-4o says 7.5/10, GPT-5 says 6/10, Opus says 3/10.

He turns this into CrankTest (not CrankBench, not yet) and this seems very well calibrated to my intuitions. Remember that lower is better:

As usual there is the issue that if within a context an LLM gets too attached to a wrong answer (for example here the number of rs in ‘boysenberry’) this creates pressure to keep doubling down on that, and to gaslight the user. I also suppose fighting sycophancy makes this more likely as a side effect, although they didn’t fight sycophancy all that hard.

I wouldn’t agree with Jonathan Mannhart that this means ‘it is seriously misaligned’ but it does mean that this particular issue has not been fixed. I notice that Jonathan here is pattern matching in vibes to someone who is often wrong, which presumably isn’t helping.

How often are they suggesting you should wait for Pro, if you have it available? How much should you consider paying for it (hint: $200/month)?

OpenAI: In evaluations on over 1000 economically valuable, real-world reasoning prompts, external experts preferred GPT‑5 pro over “GPT‑5 thinking” 67.8% of the time. GPT‑5 pro made 22% fewer major errors and excelled in health, science, mathematics, and coding. Experts rated its responses as relevant, useful, and comprehensive.

If my own experience with o3-pro was any indication, the instinct to not want to wait is strong, and you need to redesign workflow to use it more. A lot of that was that when I tried to use o3-pro it frequently timed out, and at that pace this is super frustrating. Hopefully 5-pro won’t have that issue.

When you care, though? You really care, such as the experiences with Wes Roth and David Shapiro here. The thing is, you get both: yes, the model picker is back for the Pro tier, including o3-pro, and you also have GPT-5-Pro.

How is GPT-5-Pro compared to o3-Pro?

That’s hard to evaluate, since queries take a long time and are pretty unique. So far I’d say the consensus is that GPT-5-pro is better, but not a ton better?

Peter Gostev (most enthusiastic I saw): GPT-5 Pro is under-hyped. Pretty much every time I try it, I’m surprised by how competent and coherent the response is.

– o1-pro was an incredible model, way ahead of its time, way better than o1

– o3 was better because of its search

– o3-pro was a little disappointing because the uplift from o3 wasn’t as big

But with GPT-5 Pro, ‘we are so back’ – it’s far more coherent and impressive than GPT-5 Thinking. It nudges outputs from ‘this is pretty good’ (GPT-5) to ‘this is actually incredible’ (GPT-5 Pro).

Gfodor.id: GPT-5 pro is better than o3-pro.

Gabriel Morgan: Pro-5 is the new O3, not Thinking.

Michael Tinker: 5-Pro is worth $1k/mo to code monkeys like me; really extraordinary.

5-Thinking is a noticeable but not crazy upgrade to o3.

James Miller: I had significant discussions about my health condition with GPT-o3 and now GPT-5Pro and I think -5 is better, or at least it is giving me answers I perceive as better. -5 did find one low-risk solution that o3 didn’t that seems to be helping a lot. I did vibe coding on a very simple project. While it ended up working, the system is not smooth for non-programmers such as myself.

OpenAI seems to be rolling out changes on a daily basis. They are iterating quickly.

Anthropic promised us larger updates than Opus 4.1 within the coming weeks.

Google continues to produce a stream of offerings, most of which we don’t notice.

This was not OpenAI’s attempt to blow us away or to substantially raise the level of underlying capabilities and intelligence. That will come another time.

Yes, as a sudden move to ‘GPT-5’ this was disappointing. Many, including the secondhand reports from social media, are not initially happy, usually because their initial reactions are based on things like personality. The improvements will still continue, even if people don’t realize.

What about the march to superintelligence or the loss of our jobs? Is it all on indefinite hold now because this release was disappointing? No. We can reduce how much we are worried about these things in the short term, meaning the next several years, and push back somewhat the median. But if you see anyone proclaiming with confidence that it’s over, rest assured chances are very good we will soon be so back.


Drag x Drive is a uniquely fun and frustrating showcase for Switch 2 mouse mode

In my decades as a video game player and reviewer, I’ve used the humble PC mouse in hundreds of games for everything from first-person aiming and third-person character movement to basic menu navigation and unit selection. In all that time, I can’t recall a game that required the use of two mice at once.

That was true until I spent some time with Nintendo’s utterly unique Drag x Drive. The game asks you to take a Switch 2 Joy-Con in each hand, turn them both so the narrow edge lies on a flat-ish surface, and then slide them around to power a game of full-contact wheelchair basketball.

It’s a fresh control scheme that comes with its share of issues, mostly stemming from the lack of convenient mouse surfaces in most living rooms. With a little bit of practice, a good playing surface, and some online friends to play with, though, I found myself enjoying the high-impact, full-contact, precision positional gameplay enabled by holding a mouse in each hand for the first time ever.

Still kind of buff from using the mouse

When you picture using two mice at once, you might imagine each wrist making a series of small, controlled movements, one controlling lateral movement and the other controlling directional angle. Drag x Drive’s dual-mouse controls bear no resemblance to this vision. Instead, you end up vigorously swiping each mouse forward or backward in constant sweeps; side-to-side movement is neither required nor useful.

That repetitive front-and-back swiping is mirrored by your avatar’s hand on the top side of either wheel on the wheelchair, creating a sort of tank-like control scheme where you turn by moving one wheel forward and one wheel backward. Small swipes of the mice can be used for precision angling, but more often, you’ll be sweeping the mouse in long lines to build speed. To shoot, you simply lift up a Joy-Con and mime a basketball shot a la Wii Sports Resort (your accuracy seems to have more to do with distance and your angle to the basket than real-world form, thankfully).


Space Force officials take secrecy to new heights ahead of key rocket launch

The Vulcan rocket checks off several important boxes for the Space Force. First, it relies entirely on US-made rocket engines. The Atlas V rocket it is replacing uses Russian-built main engines, and given the chilled relations between the two powers, US officials have long desired to stop using Russian engines to power the Pentagon’s satellites into orbit. Second, ULA says the Vulcan rocket will eventually provide a heavy-lift launch capability at a lower cost than the company’s now-retired Delta IV Heavy rocket.

Third, Vulcan provides the Space Force with an alternative to SpaceX’s Falcon 9 and Falcon Heavy, which have been the only rockets in their class available to the military since the last national security mission was launched on an Atlas V rocket one year ago.

Col. Jim Horne, mission director for the USSF-106 launch, said this flight marks a “pretty historic point in our program’s history. We officially end our reliance on Russian-made main engines with this launch, and we continue to maintain our assured access to space with at least two independent rocket service companies that we can leverage to get our capabilities on orbit.”

What’s onboard?

The Space Force has only acknowledged one of the satellites aboard the USSF-106 mission, but there are more payloads cocooned inside the Vulcan rocket’s fairing.

The $250 million mission that officials are willing to talk about is named Navigation Technology Satellite-3, or NTS-3. This experimental spacecraft will test new satellite navigation technologies that may eventually find their way on next-generation GPS satellites. A key focus for engineers who designed and will operate the NTS-3 satellite is to look at ways of overcoming GPS jamming and spoofing, which can degrade satellite navigation signals used by military forces, commercial airliners, and civilian drivers.

“We’re going to be doing, we anticipate, over 100 different experiments,” said Joanna Hinks, senior research aerospace engineer at the Air Force Research Laboratory’s space vehicles directorate, which manages the NTS-3 mission. “Some of the major areas we’re looking at—we have an electronically steerable phased array antenna so that we can deliver higher power to get through interference to the location that it’s needed.”

Arlen Biersgreen, then-program manager for the NTS-3 satellite mission at the Air Force Research Laboratory, presents a one-third scale model of the NTS-3 spacecraft to an audience in 2022. Credit: US Air Force/Andrea Rael

GPS jamming is especially a problem in and near war zones. Investigators probing the crash of Azerbaijan Airlines Flight 8243 last December determined GPS jamming, likely by Russian military forces attempting to counter a Ukrainian drone strike, interfered with the aircraft’s navigation as it approached its destination in the Russian republic of Chechnya. Azerbaijani government officials blamed a Russian surface-to-air missile for damaging the aircraft, ultimately leading to a crash in nearby Kazakhstan that killed 38 people.

“We have a number of different advanced signals that we’ve designed,” Hinks said. “One of those is the Chimera anti-spoofing signal… to protect civil users from spoofing that’s affecting so many aircraft worldwide today, as well as ships.”

The NTS-3 spacecraft, developed by L3Harris and Northrop Grumman, only takes up a fraction of the Vulcan rocket’s capacity. The satellite weighs less than 3,000 pounds (about 1,250 kilograms), about a quarter of what this version of the Vulcan rocket can deliver to geosynchronous orbit.


They’re golden: Fictional band from K-Pop Demon Hunters tops the charts

The fictional band Huntr/x, from K-Pop Demon Hunters, has a real-world hit with “Golden.”

Netflix has a summer megahit on its hands with its animated musical feature film, K-Pop Demon Hunters. Since its June release, the critically acclaimed film has won fans of all ages, fueled by a killer Korean pop soundtrack featuring one earworm after another. The biggest hit is “Golden,” which just hit No. 1 on Billboard’s Hot 100 chart. (The last time a fictional ensemble topped the charts was in 2022 with Encanto’s “We Don’t Talk About Bruno.”)

K-Pop Demon Hunters is now Netflix’s most-watched animated film of all time, and that’s not just because of the infectious music. The Sony Animation team delivers bold visuals that evoke the look and feel of anime, the plot is briskly paced, and the script strikes a fine balance between humor and heart.

(Spoilers below.)

The film deftly lays out the central premise in the first few minutes. In ancient times, demons roamed the Earth freely and preyed upon human souls, until a trio of women—gifted singers and demon hunters—created a magical protective barrier with their voices known as the Honmoon, trapping the demons behind it. The Honmoon has been maintained ever since by subsequent musical trios/demon hunters from each generation. The dream is that one day, the Honmoon will become so strong it will turn “golden” and seal away the demons forever.

Naturally the demons, led by their king Gwi-Ma (Lee Byung-hun), don’t want that to happen, but the latest incarnation of demon hunters—a K-Pop band called Huntr/x—is close to accomplishing the Golden Honmoon. Rumi (Arden Cho) is the lead singer, Mira (May Hong) is the group’s dancer/choreographer, and American-born Zoey (Ji-young Yoo) is the rapper and lyricist. But Rumi harbors a secret: her father was a demon, and she is marked by the telltale purple “patterns,” which she keeps hidden from her bandmates.

Hoping to destroy the Honmoon once and for all, Gwi-Ma sends five of his demons to form a K-pop boy band, the Saja Boys, led by Jinu (Ahn Hyo-seop). Their popularity soon rivals that of Huntr/x and threatens the Honmoon—just as Rumi’s patterns spread to her throat and weaken her singing voice.

How it’s done, done, done

Mira, Rumi, and Zoey take a timeout from fighting demons to carb-load with ramen. Credit: Netflix

That’s a big problem because their new hit single, “Golden” (performed by South Korean singer/songwriter Ejae), spans an impressive three-octave range, eventually hitting an A5 on the chorus, a high note usually reserved for classically trained operatic sopranos. (Ejae’s performance on this song has impressed a lot of YouTube vocal coaches.) And the first live global performance of “Golden” is supposed to be the event that ushers in the Golden Honmoon. It’s a soaring, impeccably constructed “I Want” tune typical of Disney princesses.



Rad Power’s Radster: A very non-radical commuter bike


The Radster is great as a Class 2 e-bike, but not quite as strong as a Class 3.

With e-bike manufacturing in China having expanded considerably, the number of companies offering affordable e-bikes over the last five years has exploded. But the market for cycles with an electric assist has existed for considerably longer, and a number of companies predate the recent surge. One of them, Rad Power, has been around long enough that it was already an established presence when we first reviewed its hardware four years ago.

The company offers a mix of cargo, folding, and commuter bikes, all with electric assists. Having looked at a cargo version last time around, we decided to try out one of the commuter bikes this time. The Radster comes in road and trail versions (we tried the road). It’s an incredibly solidly made bike with equally solid components, and it has very good implementations of a few things that other manufacturers haven’t handled all that well. It also can switch among the three classes of e-bikes using a menu option; unfortunately, nothing else about the bike’s performance seems to change with the switch.

The Radster is priced a bit higher than a lot of its budget competitors. So, if you’re shopping, you’ll have to think a bit about whether some of these features matter to you.

A solid option

One thing that is very clear early: The Radster is a very solid bike with a robust frame. While the frame is step-through, it has some added bracing just above the cranks. These two bars, one on each side of the frame, link the down tube to the seat tube and extend to form part of the rear triangle. While this means you’ll have to step a bit higher to get in a position to mount the bike, they contribute to the sense that this is a frame that will withstand years of daily use.

Another nice feature: The battery is mounted on top of the frame, so if you release it for charging elsewhere, you don’t have to do anything special to keep it from dropping onto the floor. A chain guard and fenders also come standard, something that’s a big plus for commuters. And the fork has adjustable cushioning to smooth out some of the bumps.

The front fork comes with a bump-smoothing suspension. Credit: John Timmer

The one complaint I have is a common one for me: sizing. I’m just short of 190 cm tall (about 6 feet, 2 inches), and a lot of my height is in my legs (I typically go for 35/36-inch inseams). I’ve found that most of the frames rated as “large” still feel a bit short for me. The Radster was no exception, despite being rated for people up to 5 centimeters (2 inches) taller than I am. It was very close to being comfortable but still forced me to raise my thighs above horizontal while pedaling, even with the seat at its maximum height. The geometry of the seat-to-handlebar distance was fine, though.

Also in the “solidly built” category: the rack and kickstand. The rack is rated for 25 kg (55 lbs), so it should be capable of handling a fair amount of errand running. Rad Power will sell you a large cage-style basket to fit there, and there’s everything you need to attach a front basket as well. So, while the Radster is not designated as a cargo bike, it’s flexible enough and well enough constructed that I wouldn’t hesitate to use it as one.

The Radster doesn’t have internal cable routing, but placing the battery on top of the down tube gave its designers an unusual option. There’s a channel that runs down the bottom of the down tube that the cables sit in, held in place by a plastic cover that’s screwed onto the frame. Should you ever need to do maintenance that involves replacing one of the cables or the hydraulic tubes, it should be a simple matter of removing the cover.

Nice electronics

The basics of the drive system are pretty typical for bikes like this. There’s a Shimano Altus derailleur controlled by a dual-trigger shifter, with a decent spread of eight gears in back. Tektro hydraulic brakes bring things to a stop effectively.

The basic electronics are similarly what you’d expect to see. It’s powered with a 720-watt-hour battery, which Rad Power estimates will get you to over 100 km (65 miles) of range at low assist settings. It’s paired with a rear hub motor rated for 750 watts and 100 Nm of torque, which is more than enough to get even a heavy bike moving quickly. It also features a throttle that will take you to 32 km/hr (20 mph). The electric motor is delightfully quiet most of the time, so you can ride free of any whine unless you’re pushing the speed.

All of the electric components are UL-certified, so you can charge it with minimal worries about the sorts of battery fires that have plagued some no-name e-bike brands.

The electronics are also where you’ll find some of Rad Power’s better features. One of these is the rear light, which also acts as a brake light and includes directionals for signaling turns. The brake light is a nice touch on a commuter bike like this, and Rad Power’s directionals actually work effectively. On the bikes we’ve tried in the past, the directionals were triggered by a small three-way toggle switch, which made it impossible to tell if you left them on, or even which direction you might have left them signaling. And that’s a major problem for anyone who’s not used to having turn signals on their bike (meaning almost everyone).

Rad Power’s system uses large, orange arrows on the display to tell you when the directionals are on, and which direction is being signaled. It takes a little while to get used to shutting them off, since you do so by hitting the same switch that activated them—hitting the opposite switch simply activates the opposite turn light. But the display at least makes it easy to tell when you’ve done something wrong.

In general, the display is also bright, easy to read, and displays everything you’d expect it to. It also comes paired with enough buttons to make navigating among settings simple, but not so many that you’re unsure of what button to use in any given context.

One last positive about the electronics: there is a torque sensor, which helps set the assist based on how much force you’re exerting on the cranks, rather than simply determining whether the cranks are turning. While these tend to be a bit more expensive, they provide an assist that’s much better integrated into the cycling you’re doing, which helps with getting started on hills where it might be difficult to get the pedals turning enough to register with a cadence sensor.
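To make the difference concrete, here is a minimal sketch of the two approaches. It is not Rad Power's firmware: the function names and the `gain_w_per_nm` tuning constant are hypothetical, and only the 750-watt motor rating and the five assist levels come from the review.

```python
MAX_MOTOR_WATTS = 750.0  # motor rating quoted in the review
ASSIST_LEVELS = 5        # number of assist levels quoted in the review


def cadence_assist_watts(cranks_turning: bool, level: int) -> float:
    """Cadence-style assist: a fixed share of motor power whenever the cranks turn."""
    if not cranks_turning:
        return 0.0
    return MAX_MOTOR_WATTS * level / ASSIST_LEVELS


def torque_assist_watts(rider_torque_nm: float, level: int,
                        gain_w_per_nm: float = 4.0) -> float:
    """Torque-style assist: motor power scales with force on the cranks.

    gain_w_per_nm is a hypothetical tuning constant, not a Rad Power value.
    """
    requested = max(0.0, rider_torque_nm) * gain_w_per_nm * level
    return min(MAX_MOTOR_WATTS, requested)


# Starting on a hill: the rider is pushing hard but the cranks have barely moved.
print(cadence_assist_watts(cranks_turning=False, level=3))  # 0.0
print(torque_assist_watts(rider_torque_nm=60.0, level=3))   # 720.0
```

In this toy model, the cadence-style controller gives nothing until the cranks are turning, while the torque-style controller responds as soon as the rider leans on the pedals.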

On the road

All the stats in the world can’t tell you what it’s going to be like to ride an e-bike, because software plays a critical role. The software can be set up to sacrifice range and battery life to give you effortless pedaling, or it can integrate in a way that simply makes it feel like your leg muscles are more effective than they have any right to be.

The Radster’s software allows it to be switched between a Class 2 and Class 3 assist. Class 2 is intended to have the assist cut out once the bike hits 32 km/hr (20 mph). With a Class 3, that limit rises to 45 km/hour (28 mph). Different states allow different classes, and Rad Power lets you switch between them using on-screen controls, which quite sensibly avoids having to make different models for different states.
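The class switch boils down to a speed threshold on the assist. A rough sketch, assuming a simple lookup table (the names here are made up for illustration, and the real controller presumably tapers the assist rather than cutting it off abruptly):

```python
# Assist cutoff speeds quoted in the review, in km/h, keyed by e-bike class.
ASSIST_CUTOFF_KMH = {2: 32.0, 3: 45.0}


def motor_may_assist(bike_class: int, speed_kmh: float) -> bool:
    """True while the bike is below the assist cutoff for the selected class."""
    if bike_class not in ASSIST_CUTOFF_KMH:
        raise ValueError(f"unsupported class: {bike_class}")
    return speed_kmh < ASSIST_CUTOFF_KMH[bike_class]


print(motor_may_assist(2, 35.0))  # False
print(motor_may_assist(3, 35.0))  # True
```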

As a Class 2, the Radster feels like a very well-rounded e-bike. At the low-assist settings, it’ll make you work to get it up to speed; you’ll bike faster but will still be getting a fair bit of exercise, especially on the hills. And at these settings, it would require a fair amount of effort to get to the point where the speed limit would cause the motor to cut out. Boost the settings to the maximum of the five levels of assist, and you only have to put in minimal effort to get to that limit. You’ll end up going a bit slower than suburban traffic, which can be less than ideal for some commutes, but you’ll get a lot of range in return.

Things are a bit different when the Radster is switched into Class 3 mode. Here, while pedaling with a roughly equal amount of force on flat ground, each level of assist would bring you to a different maximum speed. On setting one, that speed would end up being a bit above 20 km/hour (13 mph)—it was possible to go faster, but it took some work given the heavy frame. By the middle of the assist range, the same amount of effort would get the bike in the neighborhood of 30 kilometers an hour (20 mph). But even with the assist maxed out, it was very difficult to reach the legal 45 km/hour limit (28 mph) for a Class 3 on flat ground—the assist and gearing couldn’t overcome the weight of the bike, even for a regular cyclist like myself.

In the end, I felt the Radster’s electronics and drivetrain provided a more seamless cycling experience in Class 2 mode.

That may be perfectly fine for the sort of biking you’re looking to do. At the same time, if your point in buying a Class 3-capable bike is to be riding it at its maximum assist speed without it feeling like an exercise challenge, then the Rad Power might not be the bike for you. (You may interpret that desire as “I want to be lazy,” but there are a lot of commutes where being able to match the prevailing speed of car traffic would be considerably safer and getting sweaty during the commute is non-ideal.)

The other notable thing about the Radster is its price, which is in the neighborhood of $2,000 ($1,999, to be precise). That places it above city bikes from a variety of competitors, including big-name brands like Trek. And it’s far above the price of some of the recent budget entries in this segment. The case for the Radster is that it has a number of things those others may lack (brake lights and directionals, a heavy-duty rack, Class 3 capabilities), and some of those features are also very well implemented. Furthermore, not one component on it made me think: “They went with cheap hardware to meet a price point.” But, given the resulting price, you’ll have to do some careful comparison shopping to determine whether these are things that make a difference for you.

The good

  • Solidly built frame with a top-mounted battery.
  • Easy switching between Class 2 and Class 3 lets you match local laws anywhere in the US.
  • Great info screen and intuitive controls, including the first useful turn signals I’ve tried.
  • Didn’t cheap out on any components.

The bad

  • It’s hard to take full advantage of its Class 3 abilities.
  • Even the large frame won’t be great for taller riders.
  • Price means you’ll want to do some comparison shopping.

The ugly

  • Even the worst aspects fall more under “disappointing” than “ugly.”


John is Ars Technica’s science editor. He has a Bachelor of Arts in Biochemistry from Columbia University, and a Ph.D. in Molecular and Cell Biology from the University of California, Berkeley. When physically separated from his keyboard, he tends to seek out a bicycle, or a scenic location for communing with his hiking boots.



Encryption made for police and military radios may be easily cracked


An encryption algorithm can have weaknesses that could allow an attacker to listen in.

Two years ago, researchers in the Netherlands discovered an intentional backdoor in an encryption algorithm baked into radios used by critical infrastructure–as well as police, intelligence agencies, and military forces around the world–that made any communication secured with the algorithm vulnerable to eavesdropping.

When the researchers publicly disclosed the issue in 2023, the European Telecommunications Standards Institute (ETSI), which developed the algorithm, advised anyone using it for sensitive communication to deploy an end-to-end encryption solution on top of the flawed algorithm to bolster the security of their communications.

But now the same researchers have found that at least one implementation of the end-to-end encryption solution endorsed by ETSI has a similar issue that makes it equally vulnerable to eavesdropping. The encryption algorithm used for the device they examined starts with a 128-bit key, but this gets compressed to 56 bits before it encrypts traffic, making it easier to crack. It’s not clear who is using this implementation of the end-to-end encryption algorithm, nor if anyone using devices with the end-to-end encryption is aware of the security vulnerability in them.

The end-to-end encryption the researchers examined, which is expensive to deploy, is most commonly used in radios for law enforcement agencies, special forces, and covert military and intelligence teams that are involved in national security work and therefore need an extra layer of security. But ETSI’s endorsement of the algorithm two years ago to mitigate flaws found in its lower-level encryption algorithm suggests it may be used more widely now than at the time.

In 2023, Carlo Meijer, Wouter Bokslag, and Jos Wetzels of security firm Midnight Blue, based in the Netherlands, discovered vulnerabilities in encryption algorithms that are part of a European radio standard created by ETSI called TETRA (Terrestrial Trunked Radio), which has been baked into radio systems made by Motorola, Damm, Sepura, and others since the ’90s. The flaws remained unknown publicly until their disclosure, because ETSI refused for decades to let anyone examine the proprietary algorithms. The end-to-end encryption the researchers examined recently is designed to run on top of TETRA encryption algorithms.

The researchers found the issue with the end-to-end encryption (E2EE) only after extracting and reverse-engineering the E2EE algorithm used in a radio made by Sepura. The researchers plan to present their findings today at the Black Hat security conference in Las Vegas.

ETSI, when contacted about the issue, noted that the end-to-end encryption used with TETRA-based radios is not part of the ETSI standard, nor was it created by the organization. Instead it was produced by The Critical Communications Association’s (TCCA) security and fraud prevention group (SFPG). But ETSI and TCCA work closely with one another, and the two organizations include many of the same people. Brian Murgatroyd, former chair of the technical body at ETSI responsible for the TETRA standard as well as the TCCA group that developed the E2EE solution, wrote in an email on behalf of ETSI and the TCCA that end-to-end encryption was not included in the ETSI standard “because at the time it was considered that E2EE would only be used by government groups where national security concerns were involved, and these groups often have special security needs.”

For this reason, Murgatroyd noted that purchasers of TETRA-based radios are free to deploy other solutions for end-to-end encryption on their radios, but he acknowledges that the one produced by the TCCA and endorsed by ETSI “is widely used as far as we can tell.”

Although TETRA-based radio devices are not used by police and military in the US, the majority of police forces around the world do use them. These include police forces in Belgium and Scandinavian countries, as well as Eastern European countries like Serbia, Moldova, Bulgaria, and Macedonia, and in the Middle East in Iran, Iraq, Lebanon, and Syria. The Ministries of Defense in Bulgaria, Kazakhstan, and Syria also use them, as do the Polish military counterintelligence agency, the Finnish defense forces, and Lebanon and Saudi Arabia’s intelligence services. It’s not clear, however, how many of these also deploy end-to-end encryption with their radios.

The TETRA standard includes four encryption algorithms—TEA1, TEA2, TEA3 and TEA4—that can be used by radio manufacturers in different products, depending on the intended customer and usage. The algorithms have different levels of security based on whether the radios will be sold in or outside Europe. TEA2, for example, is restricted for use in radios used by police, emergency services, military, and intelligence agencies in Europe. TEA3 is available for police and emergency services radios used outside Europe but only in countries deemed “friendly” to the EU. Only TEA1 is available for radios used by public safety agencies, police agencies, and militaries in countries deemed not friendly to Europe, such as Iran. But it’s also used in critical infrastructure in the US and other countries for machine-to-machine communication in industrial control settings such as pipelines, railways, and electric grids.

All four TETRA encryption algorithms use 80-bit keys to secure communication. But the Dutch researchers revealed in 2023 that TEA1 has a feature that causes its key to get reduced to just 32 bits, which allowed the researchers to crack it in less than a minute.
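The arithmetic behind that gap is easy to check. The sketch below estimates worst-case exhaustive-search time for several key lengths; the rate of one billion key trials per second is a hypothetical round number, not a figure from the researchers, and the function name is made up for illustration.

```python
SECONDS_PER_YEAR = 31_557_600  # 365.25 days


def brute_force_seconds(key_bits: int, keys_per_second: float) -> float:
    """Worst-case time to try every possible key of the given length."""
    return (2 ** key_bits) / keys_per_second


RATE = 1e9  # hypothetical attacker speed: one billion key trials per second

for bits in (32, 56, 80, 128):
    secs = brute_force_seconds(bits, RATE)
    print(f"{bits:>3}-bit key: {secs:.3g} s ({secs / SECONDS_PER_YEAR:.3g} years)")
```

At that assumed rate, a 32-bit keyspace is exhausted in a little over four seconds, while every additional bit doubles the work. An 80-bit key is 2^48 times harder to search than a 32-bit one.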

In the case of the E2EE, the researchers found that the implementation they examined starts with a key that is more secure than ones used in the TETRA algorithms, but it gets reduced to 56 bits, which would potentially let someone decrypt voice and data communications. They also found a second vulnerability that would let someone send fraudulent messages or replay legitimate ones to spread misinformation or confusion to personnel using the radios.
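A toy model shows why a reduced key matters no matter how long the original key is. Everything in this sketch is an assumption made for illustration: the SHA-256 keystream is a stand-in rather than the TCCA algorithm, the "keep the low bits" reduction is hypothetical, and the effective key is shrunk to 16 bits (not 56) so the search finishes in moments.

```python
import hashlib
import secrets
from typing import Optional

EFFECTIVE_BITS = 16  # toy value; the article's figure for the real system is 56


def reduce_key(key: bytes, bits: int = EFFECTIVE_BITS) -> int:
    """Hypothetical reduction step: keep only the low `bits` bits of the key."""
    return int.from_bytes(key, "big") & ((1 << bits) - 1)


def keystream(effective_key: int, length: int) -> bytes:
    """Toy keystream derived from the reduced key (not a real radio cipher)."""
    seed = effective_key.to_bytes(16, "big")
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """XOR the plaintext with the keystream; applying it twice decrypts."""
    ks = keystream(reduce_key(key), len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, ks))


def brute_force(ciphertext: bytes, known_plaintext: bytes,
                bits: int = EFFECTIVE_BITS) -> Optional[int]:
    """Try every reduced key until the ciphertext decrypts to the known plaintext."""
    for candidate in range(1 << bits):
        ks = keystream(candidate, len(known_plaintext))
        if bytes(c ^ k for c, k in zip(ciphertext, ks)) == known_plaintext:
            return candidate
    return None


message = b"UNIT 7 RESPOND"
key = secrets.token_bytes(16)  # a full 128-bit key
ciphertext = encrypt(key, message)
recovered = brute_force(ciphertext, message)
print(recovered == reduce_key(key))
```

The attacker never needs the 128-bit key, only the small part of it that actually drives the cipher.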

The ability to inject voice traffic and replay messages affects all users of the TCCA end-to-end encryption scheme, according to the researchers. They say this is the result of flaws in the TCCA E2EE protocol design rather than a particular implementation. They also say that “law enforcement end users” have confirmed to them that this flaw is in radios produced by vendors other than Sepura.
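The article does not describe the protocol flaw in detail, and the TCCA documentation is not public. As a generic illustration of the defenses that injection and replay attacks get around, the sketch below shows a receiver that checks both authenticity (a message authentication code) and freshness (a strictly increasing counter). The key, class name, and message layout are all hypothetical.

```python
import hashlib
import hmac

SHARED_KEY = b"demo-key"  # hypothetical shared secret, for illustration only


def make_tag(counter: int, payload: bytes) -> bytes:
    """Authenticate the counter and payload together with HMAC-SHA256."""
    message = counter.to_bytes(8, "big") + payload
    return hmac.new(SHARED_KEY, message, hashlib.sha256).digest()


class Receiver:
    """Accepts a message only if it is authentic and newer than the last one."""

    def __init__(self) -> None:
        self.last_counter = -1

    def accept(self, counter: int, payload: bytes, tag: bytes) -> bool:
        if not hmac.compare_digest(tag, make_tag(counter, payload)):
            return False  # forged or altered message
        if counter <= self.last_counter:
            return False  # replay of an earlier message
        self.last_counter = counter
        return True


rx = Receiver()
msg = (1, b"hold position", make_tag(1, b"hold position"))
print(rx.accept(*msg))  # True: first delivery
print(rx.accept(*msg))  # False: the same message replayed
```

A scheme that omits either check leaves room for the kind of fraudulent or replayed traffic the researchers describe.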

But the researchers say only a subset of end-to-end encryption users are likely affected by the reduced-key vulnerability because it depends on how the encryption was implemented in radios sold to various countries.

ETSI’s Murgatroyd said in 2023 that the TEA1 key was reduced to meet export controls for encryption sold to customers outside Europe. He said when the algorithm was created, a key with 32 bits of entropy was considered secure for most uses. Advances in computing power make it less secure now, so when the Dutch researchers exposed the reduced key two years ago, ETSI recommended that customers using TEA1 deploy TCCA’s end-to-end encryption solution on top of it.

But Murgatroyd said the end-to-end encryption algorithm designed by TCCA is different. It doesn’t specify the key length the radios should use because governments using the end-to-end encryption have their own “specific and often proprietary security rules” for the devices they use. Therefore they are able to customize the TCCA encryption algorithm in their devices by working with their radio supplier to select the “encryption algorithm, key management and so on” that is right for them—but only to a degree.

“The choice of encryption algorithm and key is made between supplier and customer organisation, and ETSI has no input to this selection—nor knowledge of which algorithms and key lengths are in use in any system,” he said. But he added that radio manufacturers and customers “will always have to abide by export control regulations.”

The researchers say they cannot verify that the TCCA E2EE doesn’t specify a key length because the TCCA documentation describing the solution is protected by a nondisclosure agreement and provided only to radio vendors. But they note that the E2EE system calls out an “algorithm identifier” number, which means it calls out the specific algorithm it’s using for the end-to-end encryption. These identifiers are not vendor specific, the researchers say, which suggests the identifiers refer to different key variants produced by TCCA, meaning TCCA provides specifications for algorithms that use a 128-bit key or a 56-bit key, and radio vendors can configure their devices to use either of these variants, depending on the export controls in place for the purchasing country.

Whether users know their radios could have this vulnerability is unclear. The researchers found a confidential 2006 Sepura product bulletin that someone leaked online, which mentions that “the length of the traffic key … is subject to export control regulations and hence the [encryption system in the device] will be factory configured to support 128, 64, or 56 bit key lengths.” But it’s not clear what Sepura customers receive or if other manufacturers whose radios use a reduced key disclose to customers if their radios use a reduced-key algorithm.

“Some manufacturers have this in brochures; others only mention this in internal communications, and others don’t mention it at all,” says Wetzels. He says they did extensive open-source research to examine vendor documentation and “found no clear sign of weakening being communicated to end users. So while … there are ‘some’ mentions of the algorithm being weakened, it is not fully transparent at all.”

Sepura did not respond to an inquiry from WIRED.

But Murgatroyd says that because government customers who have opted to use TCCA’s E2EE solution need to know the security of their devices, they are likely to be aware if their systems are using a reduced key.

“As end-to-end encryption is primarily used for government communications, we would expect that the relevant government National Security agencies are fully aware of the capabilities of their end-to-end encryption systems and can advise their users appropriately,” Murgatroyd wrote in his email.

Wetzels is skeptical of this, however. “We consider it highly unlikely non-Western governments are willing to spend literally millions of dollars if they know they’re only getting 56 bits of security,” he says.

This story originally appeared at WIRED.com.


Wired.com is your essential daily guide to what’s next, delivering the most original and complete take you’ll find anywhere on innovation’s impact on technology, science, business and culture.



Texas prepares for war as invasion of flesh-eating flies appears imminent

Past success

As the flies’ host and geographic range expand, pressure is intensifying to control the flies—something many countries have managed to do in the past.

Decades ago, screwworms were endemic throughout Central America and the southern US. However, governments across the regions used intensive, coordinated control efforts to push the flies southward. Screwworms were eliminated from the US around 1966, and were pushed downward through Mexico in the 1970s and 1980s. They were eventually declared eliminated from Panama in 2006, with the population held at bay by a biological barrier at the Darién Gap, at the border of Panama and Colombia. However, in 2022, the barrier was breached, and the flies began advancing northward, primarily through unmonitored livestock movements. The latest surveillance suggests the flies are now about 370 miles south of Texas.

The main method to wipe out screwworms is the sterile insect technique (SIT), which exploits a weakness in the fly’s life cycle: females tend to mate only once. In the 1950s, researchers at the US Department of Agriculture figured out they could use gamma radiation to sterilize male flies without affecting their ability to find mates. They then bred massive numbers of male flies, sterilized them, and carpet-bombed infested areas with aerial releases, which tanked the population.
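A simplified model shows why flooding an area with sterile males works so quickly. This sketch assumes each female mates once with a randomly chosen male, that sterile males compete as well as wild ones, and that the same number of sterile males is released every generation. The function and parameter names are illustrative, not USDA figures.

```python
from typing import List


def simulate_sit(wild: float, sterile_per_gen: float,
                 growth: float = 1.0, generations: int = 5) -> List[float]:
    """Wild population per generation under constant sterile-male releases.

    `wild` stands for the number of wild males (assumed equal to wild females).
    `growth` is the per-generation multiplier the population would see untreated.
    """
    history = [wild]
    for _ in range(generations):
        if wild <= 0:
            break
        # Share of females whose single mating is with a fertile (wild) male.
        fertile_fraction = wild / (wild + sterile_per_gen)
        wild = wild * growth * fertile_fraction
        history.append(wild)
    return history


for gen, count in enumerate(simulate_sit(1_000_000, 2_000_000, generations=4)):
    print(f"generation {gen}: {count:,.1f}")
```

Because the releases stay constant while the wild population shrinks, the sterile-to-wild ratio climbs every generation. In this example, a population of one million falls below a single fly by the fourth generation.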

Panama, in partnership with the US, maintained the biological barrier at the Colombian border with continual sterile-fly bombings for years. But as the flies approached this year, the USDA shifted its aerial deliveries to Mexico. In June, the USDA announced plans to set up a new sterile fly facility in Texas for aerial deliveries to northern Mexico. And last month, the USDA halted livestock trade from southern entry points.

Miller said in the announcement today that SIT is no longer enough, and Texas is taking its own steps. Those include the new bait, insecticides, and new feed for livestock and deer laced with the anti-parasitic drug ivermectin. Miller also said that the state aims to develop a vaccine for cattle that could kill larvae, but such a shot is still in development.



Review: The Sandman S2 is a classic tragedy, beautifully told

I unequivocally loved the first season of The Sandman, the Netflix adaptation of Neil Gaiman’s influential graphic novel series (of which I am a longtime fan). I thought it captured the surreal, dream-like feel and tone of its source material, striking a perfect balance between the anthology approach of the graphic novels and grounding the narrative by focusing on the arc of its central figure: Morpheus, lord of the Dreaming. It’s been a long wait for the second and final season, but S2 retains all those elements to bring Dream’s story to its inevitably tragic, yet satisfying, end.

(Spoilers below; some major S2 reveals after the second gallery. We’ll give you a heads-up when we get there.)

When Netflix announced in January that The Sandman would end with S2, speculation abounded that this was due to sexual misconduct allegations against Gaiman (who has denied them). However, showrunner Allan Heinberg wrote on X that the plan had long been for there to be only two seasons because the show’s creators felt they had only enough material to fill two seasons, and frankly, they were right. The first season covered the storylines of Preludes and Nocturnes and A Doll’s House, with bonus episodes adapting “Dream of a Thousand Cats” and “Calliope” from Dream Country.

The S2 source material is drawn primarily from Season of Mists, Brief Lives, The Kindly Ones, and The Wake, weaving in relevant material from Fables and Reflections (most notably “The Song of Orpheus” and elements of “Thermidor”) and the award-winning “A Midsummer Night’s Dream” from Dream Country. This season’s bonus episode adapts the 1993 standalone spinoff Death: The High Cost of Living. All that’s really missing is A Game of You, which focuses on Barbie (a minor character introduced in A Doll’s House) trying to save her magical dream realm from the evil forces of the Cuckoo, and a handful of standalone short stories. None of that material has any bearing on the Dream King’s larger character arc, so we lose little by the omissions.

Making amends

Having escaped his captors, regained his talismans, tracked down the rogue Corinthian (Boyd Holbrook), and dealt with a Vortex in the first season, Morpheus (Tom Sturridge) opens S2 rebuilding the Dreaming, which had fallen into disrepair during his long absence. He is interrupted by his sibling Destiny’s (Adrian Lester) unexpected summons to a family meeting that includes Death (Kirby Howell-Baptiste), Desire (Mason Alexander Park), Despair (Donna Preston), and Delirium (Esmé Creed-Miles).



Net neutrality advocates won’t appeal loss, say they don’t trust Supreme Court

Court ruled broadband isn’t telecommunications

Although the Obama-era FCC won on this point in the District of Columbia Circuit in 2016, a Supreme Court ruling in 2024 gave courts more power to block rules when judges disagree with an agency’s interpretation of federal statutes. Judges at the 6th Circuit subsequently decided that broadband must be classified as an “information service” under US law.

“The 6th Circuit’s decision earlier this year was spectacularly wrong, and the protections it struck down are extremely important. But rather than attempting to overcome an agency that changed hands—and a Supreme Court majority that cares very little about the rule of law—we’ll keep fighting for Internet affordability and openness in Congress, state legislatures and other court proceedings nationwide,” Wood said.

Besides Free Press, groups announcing that they won’t appeal are the Benton Institute for Broadband & Society, New America’s Open Technology Institute, and Public Knowledge.

“Though the 6th Circuit erred egregiously in its decision to overturn the FCC’s 2024 Open Internet order, there are other ways we can advance our fight for consumer protections and ISP accountability than petitioning the Supreme Court to review this case—and, given the current legal landscape, we believe our efforts will be more effective if focused on those alternatives,” said Raza Panjwani, senior policy counsel at the Open Technology Institute.

Net neutrality could still reach the Supreme Court in another case. Andrew Jay Schwartzman, senior counselor of the Benton Institute for Broadband & Society, said that “the 6th Circuit decision makes bad policy as well as bad law. Because it is at odds with the holdings of two other circuits, we expect to take the issue to the Supreme Court in a future case.”

California still enforces a net neutrality law. ISPs tried to get that law struck down, but courts decided that states could regulate net neutrality when the FCC isn’t doing so.
