Author name: 9u50fv

Honda and Nissan to merge, Honda will take the lead

In 2019, then-head of the alliance Carlos Ghosn was arrested by Japanese police on charges of financial misconduct. After three months under house arrest, Ghosn fled the country and Japan’s criminal justice system, which rarely returns a not guilty verdict.

What about Renault, Mitsubishi?

Last year, Nissan agreed to invest $663 million into Renault’s EV activities; at the same time Renault gave up the majority of its shares in Nissan, reducing the stake of each company owned by the other, down to 15 percent. That was meant to lead to “a broader range of EV products and powertrains,” said Uchida at the time. But evidently it was decided that this arrangement was not sufficient to improve Nissan’s electric vehicle portfolio.

For its part, Mitsubishi says it will monitor the situation and decide whether or not to join at a later date. Meanwhile Renault said in a statement that “as the main shareholder of Nissan, Renault Group will consider all options based on the best interest of the Group and its stakeholders. Renault Group continues to execute its strategy and to roll-out projects that create value for the Group, including projects already launched within the Alliance.”

Assuming nothing throws a spanner in these particular works, the deal will be finalized by next June, with the new holding company for the two OEMs created by August 2026. Honda will take the lead of the new enterprise, in large part thanks to its larger market capitalization. But Mibe said that any real payoff from the merger wouldn’t be realized until after 2030. The hope is that shared development costs and greater purchasing scale will help drive down costs, but both companies will also continue existing partnerships, such as the one between Honda and GM.

We can expect more shared vehicle platforms between Honda and Nissan, as well as deeper cooperation at the R&D stage. But there are also plans to optimize manufacturing, including facilities, as well as integrating supply chains and even sales financing to find cost savings and efficiencies.

Monthly Roundup #25: December 2024

I took a trip to San Francisco early in December.

Ever since then, things in the world of AI have been utterly insane.

Google and OpenAI released endless new products, including Google Flash 2.0 and o1.

Redwood Research and Anthropic put out the most important alignment paper of the year, on the heels of Apollo’s report on o1.

Then OpenAI announced o3. Like the rest of the media, this blog currently is horrendously lacking in o3 content. Unlike the rest of the media, it is not because I don’t realize that This Changes Everything. It is because I had so much in the queue, and am taking the time to figure out what to think about it.

That queue includes all the other, non-AI things that happened this past month.

So here we are, to kick off Christmas week.

John Wentworth reminds us that often people conflate a prediction of what is likely to happen with an assurance of what is going to happen, whereas these are two very different things. And often, whether or not they’re directly conflating the two, they will attempt to convert a prediction (‘I’ll probably come around 9pm’) to an assurance (‘cool can you pick me up on the way?’) in ways that are expensive without realizing they’re expensive.

Your periodic reminder that if you say you’ll make a ton of money and then pursue your dreams, or then advance the causes you care about, the vast majority of the time this does not actually happen. Not never, but, well, hardly ever.

Journalist combines two unrelated statements from Palmer Luckey into an implied larger statement to effectively fabricate a misleading quote. It does seem like journalists are violating the bounded distrust rules more and more often, which at some point means they’re moving the lines involved.

I feel like I’ve shared this graph before but seems worth sharing again (via MR):

An important note from Michael Vassar: People rarely see themselves or their group as ‘bad’ or ‘evil,’ but often they do view themselves as ‘winners’ rather than ‘good.’ Which is a very different morality, and you can guess what label I’d use for such folks.

Starbucks recycling, like much other recycling, isn’t actually a thing that happens.

Sam Knowlton: Recycling is a psyop to convince people that plastic can be used abundantly and sustainably without consequences.

Of all the recyclable #5 plasticware waste generated in the US only 1% is recycled.

As a clean (with respect to the things this blog cares about) example of the kind of accusations being thrown around by a certain type of person, that very much rhyme with certain accusations in other areas including AI, in case I want to point one later: I saw this example where Mario Nawfal got 15m+ views for saying ‘Biden paid Reuters $300m for targeting Elon’s companies’ based on Mike Benz (who also got 15m+ views and got Elon Musk to reply with a 100% sign and called this ‘lawfare’) stating the facts that:

  1. The government gave Reuters $300 million in total government contracts, mostly for various data analytics services to Thomson Reuters Government Contracts, a distinct subsidiary of Reuters.

  2. Reuters did investigations of Musk that were unkind, for which they won the Pulitzer Prize for National Reporting.

  3. No claim as to how #1 and #2 are related.

  4. Therefore conspiracy and government funded attacks on Elon Musk!

No, seriously, here is the full argument, with the entire comments section cheering on how awful and illegal all this was:

Mario Nawfal: The Biden administration gave $300 million in government contracts to Reuters while 11 federal agencies investigated Elon’s businesses—Tesla, SpaceX, and X.

During this time, Reuters received millions from these agencies and won a Pulitzer Prize for reporting on “misconduct at Elon’s companies.”

This means taxpayer dollars funded media attacks on Elon! It’s a coordinated effort to undermine one of America’s most innovative leaders.

Elon keeps building; they keep scheming.

It’s all one big government operation and conspiracy, man. Except not only don’t these posts have any evidence or causal story whatsoever here, not even a fig leaf of one, these contracts are not even larger under Biden than you would have expected from what they got under Trump. If you do a search on the very database he links to and extend it back another 4 years to include the first Trump administration, and sort by contract size, you get this:

As in, 7 of the 9 biggest contracts to Reuters in the past 8 years began under the Trump administration and the long tail looks similar. On so many levels there is absolutely nothing here. When I ask who seems more likely to put their finger on the scale of unrelated government contracts on the basis of news coverage, I think we all know the answer.

Career advice from Richard Ngo, aim to become the best at some broadly-leverageable thing, which can include being the best at the intersection of A, B and C.

Antonio Brown apologizes to Polymarket CEO Shayne Coplan for his role in Kalshi’s campaign to insinuate wrongdoing in the wake of the raid on Coplan’s apartment.

Tyler Cowen asks, should we try to bring back public hangouts? He says yes that would be good, but it seems impossible, and mostly looks at the reasons for the change.

Seriously, as I said on Twitter, I love economists, never stop economisting:

Tyler Cowen: A bigger change is that average walking speed rose by 15%. So the pace of American life has accelerated, at least in public spaces in the Northeast. Most economists would predict such a result, since the growth in wages has increased the opportunity cost of just walking around. Better to have a quick stroll and get back to your work desk.

I am tempted to reply with something wonky about marginal incentive effects not obviously pointing in that direction, or how the opportunity cost is mostly about substitution effects on leisure time instead, but mostly I just want to bask in it.

The biggest change in behavior was that lingering fell dramatically. The amount of time spent just hanging out dropped by about half across the measured locations.

The internet and mobile phones are likely driving this change in behavior.

I think faster walking, when you are alone, is mostly great. It doesn’t only get you where you are going faster, it’s better exercise. A slow walk alone can be nice but on average it’s mostly a skill issue. If you’re with someone else, then yeah, walk slow, have a chat.

As for outright lingering, yeah, I think this one is opportunity costs from better leisure options, the same as most everything else. Why would I linger at Boston’s Downtown Crossing, or another public square, and let serendipity happen, given the other options I have?

Did you know average cow milk yields are continuing to steadily rise and are about five times where they were in 1950? Which was already five times as productive as medieval cows?

Community Notes on Twitter extends to links. Now stop throttling them, please.

Elon Musk instead outright says ‘just write a description in the main post and put the link in the reply. This just stops lazy linking.’

Chris Prucha: Watching this ratio like it’s tyson vs paul🍿

As in, he’s putting a large tax on linking, since putting it in the reply will dramatically decrease the click-through rate. Which is the point.

The whole thing continues to be a giant middle finger to every Twitter user.

In rival news, BlueSky is rapidly on the rise, and has caught Threads.

Adam Thierer: In recent years, we’ve repeatedly been fed a bed of lies about supposedly unassailable “digital monopolies” when, in reality, competition is always developing in unexpected ways.

These days my head is spinning trying to figure out which social media platforms (X, Bluesky, LinkedIn, Threads, Mastodon, etc) I should be focused on. I’m trying to touch all these bases myself while also keeping up with all the other traditional platforms and outlets out there. It is completely overwhelming.

I suppose it’s only a matter of time before the pro-regulation crowd switches their argument and petitions government to end all this “ruinous competition” through interconnection / interop mandates. But, before that happens, let’s be clear that this splintering of social media is happening without any sort of unnatural external pressure from govt authorities. Once again, organic social and market forces worked their magic. The only problem is it works so well! Now we just have too many damn choices.

Now, excuse me while I go post this same rant on 5 other “digital monopolies.” 😂

BlueSky at that time was still less than 10% the size of Twitter. There are obvious parallels to what happened with Mastodon, which quickly fizzled out.

Yet this time feels different. With Mastodon, it felt like performative anger. People were announcing their moves like they were morally superior. Not this time. This time, the people moving are talking about it practically. They are serious.

The pattern is that the most progressive members of Twitter, and those who move in related circles, are mostly the ones splitting off into BlueSky. They claim Twitter is all MAGA now, and I can confirm many times over that many of them believe this, but it is all about how you use the site. I don’t encounter any of that, because I don’t interact with the relevant content.

Some people, it seems, think BlueSky is like ‘old Twitter’ and now has all the nerds and think the changes like getting rid of block and prioritizing video ruined it.

Also some sad stats:

From all reports, BlueSky is ‘like old Twitter’ in some ways, especially in the sense that Old Twitter was D+42, and in that it was largely a left wing echo chamber. Which in turn meant that other spaces did not have those people, and leaned further right, while the left wing echo chamber acted as an exclusionary rather than inclusionary force. Also, yes, more people looking to understand things or win at politics should be reading Tracing Woods, who I have met and is delightful.

This response to All Day TA is cited as an example of how this works.

Having this as a clientele puts BlueSky in a strange position, for example with its user base refusing to accept the idea that Jesse Singal might have an account and post with it, and reportedly pushing hard to have him banned for (essentially) being Jesse Singal.

My current view of BlueSky is that those who leave Twitter for BlueSky are usually improving both social networks. Everyone wins from them being distinct. Bubbles are not always a bad thing.

If you’re seeking ElonBucks, consider that you’ll get something on the order of $0.16-$0.24 per reply, with the bigger tweets giving you relatively low payments, as you get rewarded for engagement from blue checks whereas big Tweets bring out the bots.

Community Notes is a miracle of the modern age. Is it over?

Richard Hanania: “Readers added context: mask off moment.”

This is the end of the old Community Notes. Now it’s about editorializing. These things always start off targeting the least sympathetic before expanding. Shame.

I strongly say no, and believe this is an important principle. I often see people dismiss norms when they see even one clear instance of ‘getting away with it’ or non-enforcement, or the start of a potential slippery slope, as if these things must be absolute and stand up to rigid definitions, or they’re worthless, doomed or both. And that simply is not so. Lots of rules, including most laws and norms, constantly face this sort of pushing and pushback, and are muddling through, often for a very long time.

Should Community Notes call this a ‘mask off moment’? No, but Community Notes is just people, and occasionally they’re going to do that sort of thing in this kind of spot. To illustrate, after Hanania drew attention to this, the note was voted off this post.

DOGE will be aiming to target regulations using pauses followed by review and rescission. As Vivek Ramaswamy and Elon Musk point out, it is mostly about getting rid of regulations, not getting rid of headcount. Government union representatives had an op-ed response, and it is exactly what you would expect. Here is a report on the DOGE ramp-up attempt.

DOGE is looking for regulations to target. How do you tell them about this? It seems that you literally DM them on Twitter. That is literally what I have my ops person at Balsa doing for the next few weeks, gathering together properly formatted pitches to DOGE, starting with repealing the Jones Act and Dredge Act of course.

Jennifer Pahlka put out this widely praised post about how hard DOGE will have it when up against all the legal barriers, and how people like Musk willing to brazenly do things people say are illegal might be our best hope in spite of it, not someone like her who has studied the issues but would be too timid to act. Sounds like she should advise?

More than that, what this is saying is, the law has tied all this up in knots. So what we need, ultimately, is not DOGE. DOGE is potentially helpful but not good enough. What we need is new law, to get rid of old law. I realize this would be very difficult, but the first step is having it shovel ready. Is anyone actually writing the ‘make it so the government can do reasonable things without avalanches of lawsuits’ bill? The one that would actually work? At some point we might get an opening.

How many jobs will they cut? Market at time of writing this says 76k, but with a long tail and a 13% chance of over 1 million which means the mean is substantially higher.
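To see why the tail alone forces the mean up, here is a minimal sketch. It assumes the number of jobs cut cannot be negative, and it pretends the 13% sits exactly at the 1 million threshold with every other outcome at zero, which is a deliberately pessimistic floor rather than the market's actual distribution. Only the 76k and 13% figures come from the market quote above; the function name is made up for illustration.

```python
# Figures quoted in the text; everything else here is an illustrative assumption.
headline_cuts = 76_000       # what the market "says"
p_tail = 0.13                # chance of more than 1 million cuts
tail_threshold = 1_000_000

def mean_lower_bound(p_tail: float, tail_threshold: float) -> float:
    """Floor on the mean of a non-negative quantity X given P(X >= threshold).

    This is Markov's inequality rearranged: E[X] >= threshold * P(X >= threshold).
    """
    return p_tail * tail_threshold

floor = mean_lower_bound(p_tail, tail_threshold)
print(f"mean is at least {floor:,.0f}, versus a headline of {headline_cuts:,}")
```

That floor of roughly 130k is already well above the 76k headline number, before counting anything the other 87% of outcomes contribute.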

Also hopefully they’ll look at government hiring, now that Elon Musk has noticed that the process is unbelievably stupid? Also he’s now following Alec Stapp, which is pretty great.

For those who don’t know, from the above link: If you want to get hired for a government job, you need to literally cut and paste the exact language in the job description into your resume, then in your self-assessment fill out ‘master’ for everything, or you’re ngmi. So, I suppose you’ll want to do that.

Things are going to be interesting with Jim O’Neill backing up RFK Jr at HHS.

Someone explain to me how he intends to provide these ‘expedited permits’?

Doge Designer: Bullish on America 🚀

Donald Trump: Any person or company investing ONE BILLION DOLLARS, OR MORE, in the United States of America, will receive fully expedited approvals and permits, including, but in no way limited to, all Environmental approvals. GET READY TO ROCK!!!

Also, if you can do this at all, why not expedite all the permits? Rather than make the billionaires and mega corporations the only ones who can build anything, forcing everyone to partner with one of them?

And one might want to balance that bullishness. He’s studied automation and he’s coming out firmly against it:

Matt Parlmer: This is not going to make America more competitive.

Donald Trump: Just finished a meeting with the International Longshoremen’s Association and its President, Harold Daggett, and Executive VP, Dennis Daggett. There has been a lot of discussion having to do with “automation” on United States docks. I’ve studied automation, and know just about everything there is to know about it. The amount of money saved is nowhere near the distress, hurt, and harm it causes for American Workers, in this case, our Longshoremen.

Foreign companies have made a fortune in the U.S. by giving them access to our markets. They shouldn’t be looking for every last penny knowing how many families are hurt. They’ve got record profits, and I’d rather these foreign companies spend it on the great men and women on our docks, than machinery, which is expensive, and which will constantly have to be replaced. In the end, there’s no gain for them, and I hope that they will understand how important an issue this is for me.

For the great privilege of accessing our markets, these foreign companies should hire our incredible American Workers, instead of laying them off, and sending those profits back to foreign countries. It is time to put AMERICA FIRST!

Is it worse if he knows this is not how any of this works, or if he thinks this actually is how any of this works?

I predict that Trump’s statement opposing port automation was a substantial misstep. There is a certain crowd that really wants to be optimistic about making things work again, and this is a very clear negative signal to them.

Bending the knee to the dockworkers shows weakness, and has extremely bad vibes.

Homeland Security modernizes H-1B program effective January 17, 2025. From the summary, the big changes are expanded eligibility for founders with controlling interest in the petition, and nonprofits and research entities being exempt from the cap. That last one is a huge deal.

Agus: One implication of this rule is that it should allow a broader set of nonprofits in EA/AI safety to leverage cap-exempt status for H-1Bs, allowing research to be a “fundamental activity” (among many) rather than the org’s primary activity.

HS2 in the UK forced to spend 100 million on a bat tunnel despite no evidence of any way the trains in question interfere with bats. The details keep somehow making it worse.

Tesla to use Native American tribes to get around dealer requirements for auto sales. This falls under ‘why did this solution take so long to find’ and also ‘haha sickos.’ You love to see it.

Welcome to being a CEO in the EU with over 40m in revenue, now please report these 649 environmental and social indicators.

Hotels are still mostly failing to let you check in on your phone. Various replies say chains get close or work sometimes, Hilton seems to be ahead of the curve here where it usually works, with Marriott claiming to do it but mostly not working. Nate Silver reports the MGM hotels in Vegas do it, makes sense Vegas would be ahead of the curve. On my most recent hotel trip I was not tempted to try to check in online.

Google introduces Willow, an advancement in quantum computing. I frankly have no idea how impressed I should be, or in what ways I should update or what impacts I should expect, beyond a lot of people reporting being impressed.

Google has had ‘loss of pulse alerts’ working for months in Europe on its watches and it’s ready to go but the FDA keeps saying it’s better to let people die, instead. The lives saved number in the thread seems way too high, but I also don’t see the downside.

Joe Weisenthal: Riding in an Uber after a Waymo feels like going from an iPhone to a flip phone.

Whether Waymo can scale like the iPhone did. Obviously a totally separate question. But just as an experience, the difference is stark.

Having ridden in Waymos myself now, I do not want to go back.

And yes, they are everywhere in San Francisco, my eyes confirm this:

liz: prolly about 15-20% of all the cars i see on regular basis in sf are waymos now. rest of the country doesnt recognize how real this is.

Tyler Cowen points to a new working paper from Kevin Lang, that notices that under reasonable assumptions, it would take a t-score of 5.48 to reject the null hypothesis in an economics paper with 95% confidence, with 65% of narrowly rejected hypotheses and 41% of all rejected hypotheses remaining true. Notice that this is the optimistic conclusion that assumes everyone’s methodology is good and no fraud or large mistakes are involved, so it is much worse than this.
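The underlying logic is plain Bayes. Here is a minimal sketch, which is not Lang's actual model: the prior share of true nulls and the power figure below are invented for illustration, and the function name is hypothetical.

```python
def share_of_rejections_that_are_true_nulls(
    prior_null: float, alpha: float, power: float
) -> float:
    """P(null is true | the test rejected it), via Bayes' rule."""
    false_rejections = prior_null * alpha          # true nulls rejected by chance
    true_rejections = (1.0 - prior_null) * power   # false nulls correctly rejected
    return false_rejections / (false_rejections + true_rejections)

# Assumed inputs: 90% of tested nulls are true, 5% significance level, 50% power.
share = share_of_rejections_that_are_true_nulls(prior_null=0.9, alpha=0.05, power=0.5)
print(f"{share:.1%} of rejected nulls were actually true")
```

Tightening the threshold (a smaller alpha, which is what a higher required t-score amounts to) shrinks the false-rejection term, and that is the lever Lang's result is pulling on.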

Scott Aaronson responds to Google Willow’s advances in quantum computing. Basically, yes it’s a cool advance, but don’t get overexcited yet.

In case it needs to be said: You find a way to rebuild Notre Dame. It is in the 99th percentile of things people spend money on to rebuild Notre Dame. If your ethics and world impact modules suggest that the world should not rebuild Notre Dame, or that marginal ordinary ‘effective’ charity spending would be better than rebuilding Notre Dame, please go and fix your modules accordingly. Thank you.

No one has even heard of effective altruism in any meaningful way.

Rob Wiblin: Who has heard of effective altruism and can demonstrate they’re not confabulating?

Roughly nobody, even among people with advanced degrees.

(~1% of total population, ~3% of grad school finishers.)

If you go to ‘has heard of EA at all’ it’s 12%, but they mostly know nothing more.

Of the 1% who actually know what EA means, their attitudes are generally positive.

Sentiment is far more positive among those who don’t know what EA is, if an advocate tells them what EA is, but the issues with that measurement are obvious.

This is a good touching of grass for what regular people have even heard about:

This lack of knowing anything about EA caused EAs generally to greatly underestimate the reputational damage they took from FTX and SBF. As Oliver points out, this is a general point – most of the time most people don’t think about you at all, and most people haven’t heard of and don’t care about most things. So if you do a general population survey mostly all you detect are the vague vibes, but that is very different from what they would find if suddenly they did care, or what the people interacting with you will care about.

The AI situation is similar. Americans hate AI, don’t want AI, and support regulation of AI. The vibes are terrible. That doesn’t mean they actively care much yet, and it isn’t inherently that predictive of what their opinions will be once they do care.

Ever since some combination of FTX and the Battle of the Board at OpenAI, there have been systematic hyperstition attacks made against Effective Altruism (and also anyone else who wants to not die from AI) – attempts to lie about social reality and how everyone hates EAs and they are outcasts and low status and so on, in order to convince others to take those lies and make them true. Noah Smith is the latest to join this.

I suppose I am modestly disappointed by Noah Smith there, whereas I no longer know how to be disappointed by the hysterics of Marc Andreessen, such as those he is replying to here.

Here are some charts on how EA conferences are doing, with 2024 seeming to show declines. I don’t presume this is a good measure of how EA in general is doing.

If you own the business or can choose what it expenses, you probably could do a lot more expensing without taking on any substantial risk.

Fast (and free) shipping is truly beloved.

Ryan Peterson: Fast shipping can have 5x the sales impact of a super bowl ad.

This is another reason to highly value Jones Act repeal. If we speed up transit within the United States, that can have a big impact on reshoring production.

An unusually frank, self-aware and seemingly balanced view of the costs and benefits of meditation. If one takes this description seriously, and I do, meditation clearly has high opportunity costs and net negative story value. There are benefits, I believe those exist as well, but it made me more confident in my decision not to go seriously down that road. The key benefit that’s missing and might have sold me on it, given Sasha Chapin wrote this, is that this doesn’t let you marry your own Cate Hall.

Grim analysis of Russian economic outlooks, especially if the war is not halted. Things held up well for a while, but at some point the costs add up and the reserves run out, and things start to escalate. First slowly, then quickly.

Many say (here Robin Hanson and HatingOnGodot) that public speaking is easy if you don’t respect a single soul in the room, because they will read your disdain as confidence. You can also actually be confident or not care what they think, and those work as well.

A thread of polls that asks what it would take before you would let your trusted friends convince you to go to a doctor for what they say is a manic episode, despite you not seeing why any of your new behaviors should be concerning.

And when the doctor says you need meds and everyone around you agrees, a large majority won’t take the meds, although a majority of married people would if those warning them included their spouse. But as Paul says, that’s what being crazy will often look like when you’re crazy.

The cops additionally arresting you for a seemingly insane reason got a 60% majority to take the meds, but a lot of people still wouldn’t do it.

It seems rather obvious that people are wrong here. Your close friends all saying you need to see a doctor is rather strong evidence. The doctor then telling you they’re right and you need meds is very strong evidence you need meds. Yes, this means you can in theory be ‘hacked from the outside’ but that is supremely less likely than already being hacked from the inside (and if you’re delusional about all your friends telling you that you need meds, then you definitely need meds!).

The keys here are that almost no one agrees with you, and you don’t know why.

I don’t generally let it bother me much if a majority thinks I’m crazy or wrong.

I do let it bother me when it is essentially everyone, and I don’t have a damn good model of why they think I’m crazy or wrong. I probably am.

However, if I have a good model of exactly why they all think I’m crazy, then it might be time for ‘they all thought that I was crazy, but I’ll show them!’

Nate Silver makes his case against eliminating daylight savings time, saying it will cost daylight, and we should save the daylight instead. I say no, we should kill daylight savings time. If schools and companies and businesses then want to adjust their start times, then go ahead. There’s nothing stopping you. In particular I think Nate is being rather unfair in his assessment of the cost of the clock adjustments. Indeed, he proves too much – if clock adjustments are almost free, why not have more adjustments?

What makes a good Royal Navy Officer? Motivation. Motivation matters more for performance evaluations and advancement to leadership than general intelligence or personality traits. Does this mean intelligence is not so important? Perhaps for this particular job that is so, especially in peacetime and until a high level is reached. More than that, I would say it is a liability.

The question is indeed who wants to be a Royal Navy officer? Who wants to work hard at that for many years? Being intelligent is a highly double edged sword. If you are the Royal Navy, the highly motivated might not be the best talent, but they are the best talent you can hope to retain.

What does it take before you should trust someone else’s advice on what to do?

As always, some people need to hear this, some need to hear the opposite.

Daystar Eld: Your wants and preferences are not invalidated by smarter or more “rational” people’s preferences. What feels good or bad to someone is not a monocausal result of how smart or stupid they are.

The post is about one form of the Valley of Bad Rationality, where (as a summary of the post’s key points here) you think that you shouldn’t do ‘irrational’ things like eat ice cream (it’s a superstimulus!) or want to share housework (they earn more than you, their time is more valuable!), or feel hurt, or have different preferences than that of your community, and so on. And you definitely shouldn’t let someone bully you with logic into giving up your desires or preferences, even if they aren’t legible. Not everything you think and do and want and insist upon needs to pass a strict logical test all the time.

Beware requiring everything to be legible or logical, especially on every level at once.

You can absolutely take that principle too far. This here I think is simply wrong:

Daystar Eld: If someone else tells you that something you’re doing or thinking is irrational, they need to first demonstrate that they understand your goals, and second demonstrate that they have information you don’t, which may inform predictions of why your actions will fail to achieve those goals.

I need to understand your instrumental goals in context, and every little bit helps, but I absolutely do not need to understand your overall goals except insofar as they are relevant to the actions in question.

I also need some epistemic advantage – which often is actually ‘I understand what your goals are better than you do’ or yes sometimes ‘I am more skilled or smarter’ – but that need not take the form of information. If I have the same information you do, and we are both focused on the same goal, then yes one of us can plausibly be much better at figuring out what to do from there. That doesn’t mean you have to trust it.

First 20 seasons of Law & Order now on Hulu! Woo hoo! I’m not currently watching this on the elliptical, but it’s absolutely great for that.

I didn’t realize I was setting this up, but it turns out I was (2/5 stars):

So of course I was delighted that Bret Devereaux not only fully agreed with me (he was kinder on the action scenes than I was, I wasn’t impressed, we agree that Denzel Washington was by far the best part), he also decided to waste a lot of time with two long posts dedicated to nitpicking the film. I knew the film had historical accuracy issues, and I knew I didn’t know the half of it, but even accounting for not knowing the half of it… I definitely did not know the half of it. Wow. They Just Didn’t Care.

I hope to have a 2024 year-in-movies spectacular post, if I find the time. For now, I’ll say I still think The Fall Guy is my favorite movie of 2024, followed by Megalopolis, but I’m realistic and unless something blows me away from the end-of-year releases at the awards shows I will be rooting for Anora.

Tyler Cowen says India has the best food, with $5 meals there often better than Michelin star restaurants in Paris. I too am not a big fan of the Michelin stars. I do buy his case that ‘when everyone is a food critic’ standards rise, and I think the rise of online reviews is a lot of why food has been rapidly improving (and it has!). And I buy that India punches ‘far above its weight’ here and relative to its prices.

But I think the full claim mostly says something very particular about Tyler’s preferences (although I have never been to India so anything is possible). I think this also links in to Scott Alexander and the discussion on taste – Tyler is largely identifying a particular type of taste that he loves, that is highly present in India.

He also mentions that reservations are not a problem, ‘unlike in London or New York.’

Whereas my experience in New York is that reservations are only required at a handful of places, as long as you are not going at peak times on Friday or Saturday night, or to peak brunch, or trying for one of a handful of the hottest places, half of which will still let you sit at the bar if you show up early. My solution is simply that the few places that are hard to get into don’t exist unless someone else gets me a reservation.

Patrick McKenzie: I do not know what product manager at Google Docs decided that every time I see my own name I would prefer to be reminded by a fly-in card of who I am, what my schedule is like, that I am currently outside of my business hours, and options to email/etc myself, but I urgently want that individual to edit a transcript sometime while on deadline.

That “Was this helpful?” reminds me of Camellia from Wrath of the Righteous, whose catch phrase is “I am helpful, am I not?” and who is lawful good by comparison to the slow-moving interruptive doesn’t-actually-disable-it feedback form which pops if you thumbs down the card.

Had to serially select my name to perform editing of the transcript.

Patrick McKenzie points out that with notably rare exceptions essentially everyone prefers the chargeback system to the legal system, where the chargeback system is extremely punishing to anyone who gets chargebacks, which means that customers can explicitly break off their agreements and avoid cancellation fees and such if they ever feel like it, and only a few businesses (like many gyms) will find it worthwhile to fight back.

I realize living in Japan is part of it, but the rate at which things like ‘they think your wife’s name on all the forms must not be real so they decide to name her poochie’ remains off the charts high.

The ancient art of strongarming your suppliers and contractors in order to get them to do things in a reasonable time frame, which is the only way things get done within a reasonable time frame while coordinating suppliers.

“For the benefit of the recorded phone line” and “can you send that in an email so I can have a paper trail?”

Patrick McKenzie doesn’t go to the doctor.

Thread with notes on identity theft, in response to another thread about the pervasiveness of identity theft among poor people with extreme problems, with it being extremely difficult and costly to clean up the mess even once you know about it.

A contractor helps ensure that Patrick’s mother’s kitchen is set up to accommodate a potential future wheelchair. That’s a great contractor, also a key idea.

There are those who do not understand why Patrick cares so much about subtexts and being a Dangerous Professional, and those who don’t understand that some people need to be informed about this. Yes, the two should meet, it would be fun and also educational.

Promising early review from Ondrej Strasky of upcoming game The Bazaar. I’ll be checking it out at a later stage, but haven’t yet.

Balatro No Jokers challenge is indeed possible. Of course, the key is an insane amount of rerolling until you get the start you need.

Looking back at the Tempest handoff file, part 1, for those old enough to remember.

On the music of Sid Meier’s Civilization. I feel this. Sogno di Volare is on in the background right now, I’m not crying, you’re crying. What I think this undersells is the amount to which great games (and movies and shows) make the associated music great. Yes, there is correlation – if you’re doing great work in one area you do great work in another, and this music is great – but a lot of why we see it as great is that we associate it with the games and the rise of Civilization. Baba Yetu is otherwise not special, but it is Grammy-level because it is part of the game.

Customize famous retro gaming screens with your own text. Good times, man.

Magic’s latest banned and restricted announcement unbans Mox Opal, Faithless Looting, Green Sun’s Zenith and Splinter Twin in Modern while banning The One Ring, Amped Raptor and Jegantha, the Wellspring.

Here are two takes I am inclined to agree with, although my knowledge is rusty now.

Sam Black: The bans are very clear steps in the right direction that, as usual these days, almost certainly didn’t go far enough, but that’s because there is real value in taking things slow (I think I’d like ban updates to be a little more frequent so they could be slow but less slow).

Legacy is probably a Nadu ban away from playable, but I might play another legacy tournament now, where I didn’t even consider playing legacy at EW (despite being there) before. I’m actually happier about the bauble ban than the frog ban.

I wanted big unbans in Modern and I’m very happy they went that way. Also, it’s possible Mox Opal is the strongest card in Modern again, but I have no problem with its unban and it does make me curious to try Modern again.

The hate for Lantern is extremely strong, but at least there’s payoff for trying to make it work again, so I could see myself messing around with Modern Amulet at some point, however.

I’m noticing that I’m less likely to try Modern because I’m not excited about the opportunities to play paper Modern, which is interesting since it used to be the most played paper format. This might just be a bubble I’ve fallen into since I wasn’t interested, or it might be a result of Modern having been bad enough to fall off for awhile, like Standard did in the past, but I’m sure I’m not the only one who’s Modern curious after this update, and I hope event organizers respond by offering some nice Modern events soon.

Brian Kibler: Understandably lots of ban list chatter this morning. Just a reminder that the design philosophy of direct-to-Modern sets like Modern Horizons necessitates pushing the envelope of the most powerful cards in the history of the game and broken cards are absolutely inevitable.

The genie is out of the bottle, and the sets make tons of money, so they’re not going away. Modern is no longer a non-rotating format. It’s a format that effectively rotates whenever the next Horizons set comes out and creeps the power level of the entire game because it has to.

I completely understand the business case for Modern Horizons, but I think from a game design and balance perspective, they are *literally* the worst thing that has ever happened to paper Magic because of the constant upward pressure they put on power level.

Path of Exile 2 is in early access. I’ve barely had time to try it. So far, I like a lot of the choices, but it’s too early to tell. It is very hard early on compared to other similar games, especially for the wrong characters. We’re talking a several-minute fight (at least for my character) with potential one shot kills less than an hour into a Diablo-like, at level 4. And it is very visually dark.

New York Mets pay quite a lot to sign Juan Soto, $765 million for 15 years, or $805 million if they want to block the opt-out clause. Nate Silver thinks this is roughly market rate and the deal is good, actually, because his prospects are actually insanely great. Plus, one thing he doesn’t consider: If they do introduce the insane ‘golden at-bat’ or other such nonsense, then one god-tier player gets a lot more valuable.

Ultimately it comes down to whether baseball contracts will keep getting bigger, since the money is mostly far in the future. I would be sad about this signing if the Mets were effectively on a fixed budget set now, but Steve Cohen is one of a kind and if anything I bet this means he wants to spend more to ensure the money didn’t go to waste, and I expect salaries to rise over time.
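
To make the ‘money is mostly far in the future’ point concrete, here is a rough sketch. It assumes the $765 million is paid in 15 equal year-end installments, and it picks a 5 percent annual discount rate purely for illustration. Neither the payment schedule nor the rate comes from the actual contract.

```python
def present_value(total: float, years: int, rate: float) -> float:
    """Present value of `total` paid in equal installments at the end of each year.

    Assumption: equal annual payments. Real contracts can include signing
    bonuses, deferrals and opt-outs, none of which are modeled here.
    """
    payment = total / years
    return sum(payment / (1 + rate) ** t for t in range(1, years + 1))


NOMINAL_MILLIONS = 765.0   # headline figure from the text
YEARS = 15                 # contract length from the text
ASSUMED_RATE = 0.05        # hypothetical discount rate, chosen for illustration

pv_zero = present_value(NOMINAL_MILLIONS, YEARS, 0.0)           # no discounting
pv_five = present_value(NOMINAL_MILLIONS, YEARS, ASSUMED_RATE)  # discounted

print(f"Undiscounted: ${pv_zero:.0f}M")
print(f"Discounted at {ASSUMED_RATE:.0%}: ${pv_five:.0f}M")
```

Under those assumptions the discounted figure comes out to about $530 million, noticeably below the headline number. If salaries keep rising, the later payments look cheaper still relative to the market, which is the sense in which a deal that looks high now can end up looking like a bargain.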

So I’m happy about it.

Similarly, I expect Pete Alonso to be at least somewhat overpriced, but I’d be all for signing him as long as the price is only moderately unreasonable, because I don’t expect the Mets to then take that money away from the rest of their budget.

Also, for both cases, I think having star players in very long term contracts is great for fans and for the game. I want to root for my same guys for a decade, as much as possible. Alonso has to be much more valuable as a Met than anywhere else, but if we do it I want it to be a full-career contract. And again, that ultimately would look like a bargain if salaries keep rising, even if it looks high now.

I am extremely excited for the College Football Playoff. I was worried that it would harm the regular season, I was spectacularly wrong, it made it infinitely better, and now we get the playoff.

The talk of the town is the complaints about the seeding, that the conference champions should not get automatic byes. And the talk is now even louder after what happened in the first round.

I disagree, unless we are expanding to a full 16 teams, which we should probably do. The byes make conference championships matter. It makes them worth fighting for and caring about, effectively playoff games no matter what.

This also answers the question ‘why would you show up to your conference championship game?’ that everyone was so worried risked ruining conference championship games.

The answer is, ‘because a slot in the quarterfinals is a lot better than a slot in the first round.’ You would of course want to play for a first-round bye (and sometimes an automatic playoff slot that you wouldn’t otherwise have!) even at the risk of occasionally slipping out of the field.

Consider the SMU situation, the only team that was in danger of slipping out. If they beat Clemson, one of the weakest four teams in the field, they would have had a first round bye, so they’d have gotten to skip a much harder other game. So even they are mostly better off playing, and for no other team in contention was it even a question.

My expectation was that they wouldn’t much be punishing teams that lost conference championship games in any case, unless they were exposed as total frauds. That has been the pattern in the past, even when there weren’t stakes.

The last time a team under the existing system would have lost a slot due to a championship game was Oregon in 2021 after a blowout loss to Utah. Before that it was TCU in 2017, when they started on the bubble at #11 and took a blowout loss to Oklahoma. Both seem like very reasonable cuts.

So even if the committee isn’t consciously intervening here (until this year these decisions meant almost nothing) we are looking at about one drop out every four years, and most of them won’t be controversial.

I also thought that letting the #5 seed (aka the highest rated non-champion) have a presumptive easy quarterfinal was great design.

The future, however, is clearly in having more true home games. Everyone wants true home playoff games. So yes everyone wants a bye, but the ‘gains from trade’ are clear.

I do think this was a weird season, in that Alabama missed the playoff and could plausibly have won it all. Normally, there won’t be a bubble team like that. And if we expand to 16 teams, as we likely will and should, then the issue goes away – any team with even 3 losses that could plausibly win, should then make it.

My solution would be to expand to 16, and the top four conference champions are locked into first round home games. None of the four can be seeded lower than 8. Ideally I’d also allow the top seeds to draft their opponents, but we probably can’t have everything.

In terms of how we determine the rankings, this year made it clear we don’t put enough weight on strength of schedule and record, and especially on Nick Saban’s question: Who did you beat? I understand that you don’t set your conference schedule, and you don’t know who is going to be good, but let’s be real. The non-SEC mind really cannot comprehend an SEC schedule. But ultimately, if we go to 16 (and even now with 12) and you don’t get in, that’s still completely on you.

I certainly don’t agree that the playoff is a failure. Yes, the first four games were blowouts, but that’s still playoff football, and it was mostly not because of poor design. It turns out the home teams were very good, and the road teams weren’t. That won’t always be true. We should have had Alabama over SMU, true, but you can’t not include Clemson, Tennessee or Indiana.

On the question of gambling, things are rather grim in Brazil, with mobile gaming apps available and many paying credit card rates exceeding 400%.

Ezra Klein: Online gambling is going to be a fascinating dividing line between the NatCon coalition that sees itself as restoring virtue and the Barstool Conservative side. The evidence is overwhelming that a lot of people are getting hurt, and not just here.

Good Charles Lehman piece on this.

In general you don’t want to put a cap on interest rates, and it is good to give people access to even very expensive credit, but at 400%+ credit card rates I have to wonder. Steps being pondered, like banning advertising that claims gambling is ‘an investment,’ or not allowing funding directly via credit cards, seem likely to be wise.

The only way to (always!) win is not to (have to) play.

I demand free speech! Or, on second thought, maybe not in this case?

They really don’t like Ohio.

I have been convinced that both Claude and I were wrong, and that the Ohio thing is not actually about the well known villains that are the Ohio State Buckeyes. But I’m still going to headcanon and pretend that we were right anyway.

Monthly Roundup #25: December 2024 Read More »

human-versus-autonomous-car-race-ends-before-it-begins

Human versus autonomous car race ends before it begins


A2RL admits that this is a hard problem, and that’s refreshing.

A pair of open-wheel race cars parked on the main straight at Suzuka in Japan

A2RL chose the Super Formula chassis to install its autonomous driving tech. Recently, an A2RL car went to Suzuka in Japan to try and race against a human-driven version. Credit: Roberto Baldwin

TOKYO—Racing is hard. It’s hard on the teams, it’s hard on the owner’s bank account, it’s hard on the cars, and it’s especially hard on the drivers. Driving at the edge for a few hours in a vehicle cockpit that’s only slightly wider than your frame can take a toll.

The A2RL (Abu Dhabi Autonomous Racing League) removes one of those elements from its vehicles but, in doing so, creates a whole new list of complexities. Say goodbye to the human driver and hello to 95 kilograms of computers and a whole suite of sensors. That setup was poised to be part of a demonstration “race” against former F1 driver Daniil Kvyat at Suzuka Circuit in Japan during the Super Formula season finale.

But again, racing is hard, and replacing humans doesn’t change that. The people who run and participate in A2RL are aware of this, and while many organizations have made it a sport of overselling AI, A2RL is up-front about the limitations of the current state of the technology. One example of the technology’s current shortcomings: The vehicles can’t swerve back and forth to warm up the tires.

A team of people stand in front of a racing car and pose for a photograph

The A2RL team and former F1 racer Daniil Kvyat (center) smile for the media at Suzuka. Credit: Roberto Baldwin

Giovanni Pau, Team Principal of TII Racing, stated during a press briefing regarding the AI system built for racing, “We don’t have human intuition. So basically, that is one of the main challenges to drive this type of car. It’s impossible today to do a correct grip estimation. A thing my friend Daniil (Kvyat) can do in a nanosecond.”

Technology Innovation Institute (TII) develops the hardware and software stack for all the vehicles. Hardware-wise, the eight teams receive the same technology. When it comes to software, the teams need to build out their own system on TII’s software stack to get the vehicles to navigate the tracks.

Not quite learning but not quite not learning

In April, four teams raced on the track in Abu Dhabi. As we’ve noted before, how the vehicles navigate the tracks and world around them isn’t actually AI. It’s programmed responses to an environment; these vehicles are not learning on their own. Frankly, most of what is called “AI” in the real world is also not AI.

Vehicles driven by the systems still need years of research to come close to the effectiveness of a human behind the wheel. Kvyat has been working with A2RL since the beginning. In that time, the former F1 driver has been helping engineers understand how to bring the vehicles closer to their limit.

The speed continues to increase as the development progresses. Initially, the vehicles were three to five minutes slower than Kvyat around a lap; now, they are about eight seconds behind. That’s a lifetime in a real human-to-human race, but an impressive amount of development for vehicles with 90 kg of computer hardware crammed into the cockpit of a Super Formula car.

Credit: Roberto Baldwin

Currently, the vehicles are capable of recreating 90–95 percent of the speed of a human driver, according to Pau. Those capabilities are reduced when a human driver is also on the track, particularly for safety reasons. When asked by Ars what his biggest concern was being on the track with a vehicle that doesn’t have a human behind the wheel, Kvyat said he has to “try to follow the car first to see what line it chooses and to understand where it is safe to race it. Some places here [at Suzuka] are quite narrow—on the contrary from the Abu Dhabi track—and there are a lot of long corners. So I really need to be alert and give respect and space to the AI car,” Kvyat said.

Kvyat also noted that the AI car is traveling at a more respectable speed, so he really needs to know what’s going on.

The predictability of a human driver both on a track and in the real world is one of the issues surrounding AI. As we drive, walk, or bike around a city, we rely on eye contact from drivers, and there are certain behavioral expectations. It’s the behavioral outliers that cause issues. Examples include things like running a stop sign, weaving into a lane already occupied by another vehicle, or stopping in the middle of the road for no discernible reason. On the track, an autonomous vehicle might choose to deviate from the racing line around a corner because of a signal input that a human driver would ignore or fold into their driving based on their real-world experience. The context of the rest of a lived life is just as important as what’s learned on the track. Life and racing are hard and chaotic.

The “race”

On the Saturday of the race weekend, two A2RL vehicles raced around the circuit in a demonstration. The vehicles were moving quickly down the straight. The corners, though? We were told that they were still a bit tricky for the vehicles to navigate.

Down in the pits, the team watched a bank of monitors. Sensor data came in from the vehicles—zeros and ones representing the track translated into a sea of graphs. To help parse the data quickly, the system shows a green flag when everything is going well and red flags when the values are out of whack with what’s supposed to happen. In addition to how the vehicle is moving, information about fuel consumption, brake wear, and tire temperature is shared with the team.
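
As a rough illustration of that green-flag/red-flag idea, here is a minimal Python sketch. The channel names and acceptable ranges are invented for the example and are not A2RL's actual values or software. It only shows the general pattern of comparing each incoming reading against an expected band.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    """One telemetry channel with a hypothetical acceptable range."""
    name: str
    low: float   # lowest value still considered normal (made up)
    high: float  # highest value still considered normal (made up)


def flag(channel: Channel, value: float) -> str:
    """Return "green" if the value sits inside the expected band, else "red"."""
    return "green" if channel.low <= value <= channel.high else "red"


def flag_sample(channels: list[Channel], sample: dict[str, float]) -> dict[str, str]:
    """Flag every channel in one telemetry sample."""
    return {c.name: flag(c, sample[c.name]) for c in channels}


# Hypothetical channels and limits, for illustration only.
CHANNELS = [
    Channel("tire_temp_c", 70.0, 110.0),
    Channel("tire_pressure_bar", 1.2, 1.6),
]

# A made-up reading with a cold tire but normal pressure.
sample = {"tire_temp_c": 45.0, "tire_pressure_bar": 1.4}
flags = flag_sample(CHANNELS, sample)
print(flags)
```

A real system would presumably do far more than fixed thresholds, but the sketch shows why a simple color code helps: engineers scanning dozens of graphs only need to look closely at the channels that turn red.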

All of this data lets the team know how hard it is pushing the vehicle. If everything looks good, the team can push the vehicle to go a little quicker, to push a little harder for a better lap time. Humans elsewhere in the pits will soon tell their human drivers the same thing. Push harder, be quicker; the car can handle it. The data coming in predicts what will happen in the next few seconds.

Hopefully.

The individual teams will try to find the optimal line, just like the human team, but it doesn’t always follow what humans have done before on a track. They work to create an optimal line for the autonomous car instead of just copying what humans are doing.

This team has been at Suzuka for weeks ahead of this race. The HD map they bought from a third party was off by meters. In that time the team had to remap the track for the vehicles and teach them how to drive on a circuit that’s narrower than the track at Abu Dhabi.

The car is outfitted with Sony 4K cameras, radars, lidar, high-definition GPS, and other sensors. The electric steering can handle up to five Gs. The hydraulic brakes on each wheel could be triggered individually, but currently, they are not, according to Pau. However, Pau did note that enabling this function would open up new possibilities, especially in cornering.

A pair of racing cars on the grid at Suzuka before the start of a race. Photographers and engineers are fussing over the car

On the grid at Suzuka. Credit: Roberto Baldwin

Pau took a moment while walking us around the vehicle to point to the laser that measures the external temperature of the tire. That, along with the ability to track the tire’s pressure, is key to ensuring the vehicle stays on the track.

The next morning, the main event was gearing up. Man versus machine. A modern-day John Henry tale without the drama of the song about a steel-driving man. We all knew Kvyat would win. A2RL was very up-front that the system is not nearly as quick as a human. At least not yet. But it had decided to bring the race to Japan, a country known to be on the cutting edge of technology. The “race” was to be held ahead of the season finale of the Super Formula season.

It was cooler that morning than the previous day. The cars were pushed out to the grid. Kvyat was stationed behind the driverless vehicle. The time between leaving the pits and the race starting felt longer than the day before. The tires were cooling off.

The A2RL vehicle took off approximately 22 seconds ahead of Kvyat, but the race ended before the practice lap was completed. Cameras missed the event, but the A2RL car lost traction and ended up tail-first into a wall. A rather anticlimactic end to weeks of work by the team. In the pits, people gathered around the monitors trying to determine exactly what went wrong.

Khurram Hassan, commercial director of A2RL, told Ars that the cold tires on the cold track caused a loss of traction. A press release sent out later in the day noted that one of the rear tires suddenly lost pressure, causing the vehicle to lose traction and slide into the wall.

The cameras missed the spin, but caught the aftermath. Credit: Roberto Baldwin

Hassan reminded us that the vehicle does not know how to swerve back and forth yet to warm up its tires. But more importantly, he said that the gap between simulation and the real world is very real. “You could do things on a computer screen, but this is so important. Because you have to be on the track,” Hassan said.

The reality is that reality is chaos and always changing. When a company notes that it’s doing millions of miles of simulated testing, it’s vital to remember that a computer-generated world does not equal the one we inhabit.

Reality and intelligence

A2RL doesn’t want to replace human-to-human racing. It understands the emotional attachment humans have to watching other humans compete. It also realizes that as these vehicles improve, what the teams learn will not be directly pulled from the track and put on self-driving cars. But by pushing these vehicles to the limit and letting AI determine the best course of action to keep from slamming into a wall or other vehicle, that information could be used in the future as a safety feature in vehicles—a way to keep a collision from happening used in conjunction with other safety features.

The day before the human versus AI race, Super Formula had its penultimate race of the season. During that race, two cars left the pits only to have one of their rear wheels come off. Also, another two cars collided with each other. Racing is hard, and accidents happen.

For A2RL, failure is always an option. It may break the hearts of everyone in the pits who has prepped for weeks for an event, but it’s important to remember that it’s a controlled environment. A2RL seems to understand and talks about the complications of aiming for an AI-powered vehicle. It would be nice if those companies testing on our streets did the same.

louisiana-bars-health-dept.-from-promoting-flu,-covid,-mpox-vaccines:-report

Louisiana bars health dept. from promoting flu, COVID, mpox vaccines: Report

Louisiana’s health department has been barred from advertising or promoting vaccines for flu, COVID-19, and mpox, according to reporting by NPR, KFF Health News, and New Orleans Public Radio WWNO.

Their investigative report—based on interviews with multiple health department employees who spoke on the condition of anonymity for fear of retaliation—revealed that employees were told of the startling policy change in meetings in October and November and that the policy would be implemented quietly and not put into writing.

Ars Technica has contacted the health department for comment and will update this post with any new information.

The health department provided a statement to NPR saying that it has been “reevaluating both the state’s public health priorities as well as our messaging around vaccine promotion, especially for COVID-19 and influenza.” The statement described the change as a move “away from one-size-fits-all paternalistic guidance” to a stance in which “immunization for any vaccine, along with practices like mask wearing and social distancing, are an individual’s personal choice.”

According to employees, the new policy cancelled standard fall flu vaccination events this year and affects every other aspect of the health department’s work, as NPR explained:

“Employees could not send out press releases, give interviews, hold vaccine events, give presentations or create social media posts encouraging the public to get the vaccines. They also could not put up signs at the department’s clinics that COVID, flu or mpox vaccines were available on site.”

“We’re really talking about deaths”

The change comes amid a dangerous swell of anti-vaccine sentiment and misinformation in Louisiana and across the country. President-elect Trump has picked Robert F. Kennedy Jr.—a high-profile anti-vaccine advocate and one of the most prolific spreaders of vaccine misinformation—to head the US Department of Health and Human Services.

the-ai-war-between-google-and-openai-has-never-been-more-heated

The AI war between Google and OpenAI has never been more heated

Over the past month, we’ve seen a rapid cadence of notable AI-related announcements and releases from both Google and OpenAI, and it’s been making the AI community’s head spin. It has also poured fuel on the fire of the OpenAI-Google rivalry, an accelerating game of one-upmanship taking place unusually close to the Christmas holiday.

“How are people surviving with the firehose of AI updates that are coming out,” wrote one user last Friday on X, which is still a hotbed of AI-related conversation. “in the last <24 hours we got gemini flash 2.0 and chatGPT with screenshare, deep research, pika 2, sora, chatGPT projects, anthropic clio, wtf it never ends.”

Rumors travel quickly in the AI world, and people in the AI industry had been expecting OpenAI to ship some major products in December. Once OpenAI announced “12 days of OpenAI” earlier this month, Google jumped into gear and seemingly decided to try to one-up its rival on several counts. So far, the strategy appears to be working, but it’s coming at the cost of the rest of the world being able to absorb the implications of the new releases.

“12 Days of OpenAI has turned into like 50 new @GoogleAI releases,” wrote another X user on Monday. “This past week, OpenAI & Google have been releasing at the speed of a new born startup,” wrote a third X user on Tuesday. “Even their own users can’t keep up. Crazy time we’re living in.”

“Somebody told Google that they could just do things,” wrote a16z partner and AI influencer Justine Moore on X, referring to a common motivational meme telling people they “can just do stuff.”

The Google AI rush

OpenAI’s “12 Days of OpenAI” campaign has included releases of their full o1 model, an upgrade from o1-preview, alongside o1-pro for advanced “reasoning” tasks. The company also publicly launched Sora for video generation, added Projects functionality to ChatGPT, introduced Advanced Voice features with video streaming capabilities, and more.

automakers-excoriated-by-senators-for-fighting-right-to-repair

Automakers excoriated by Senators for fighting right-to-repair

Yesterday, US Senators Jeff Merkley (D-OR), Elizabeth Warren (D-MA), and Joshua Hawley (R-MO) sent letters to the heads of Ford, General Motors, and Tesla, as well as the US heads of Honda, Hyundai, Nissan, Stellantis, Subaru, Toyota, and Volkswagen, excoriating them over their opposition to the right-to-repair movement.

“We need to hit the brakes on automakers stealing your data and undermining your right-to-repair,” said Senator Merkley in a statement to Ars. “Time and again, these billionaire corporations have a double standard when it comes to your privacy and security: claiming that sharing vehicle data with repair shops poses cybersecurity risks while selling consumer data themselves. Oregon has one of the strongest right-to-repair laws in the nation, and that’s why I’m working across the aisle to advance efforts nationwide that protect consumer rights.”

Most repairs aren’t at dealerships

The Senators point out that 70 percent of car parts and services currently come from independent outlets, which are seen as trustworthy and providing good value for money, “while nearly all dealerships receive the worst possible rating for price.”

OEMs and their tier-one suppliers restricting the supply of car parts to within their franchised dealership networks also slows down the entire repair process for owners and increases the cost of getting one’s car fixed, the letter states.

As Ars noted recently, more than one in five automotive recalls are now fixed with software patches, and increasingly the right-to-repair fight has centered on things digital—access to diagnostics, firmware, and connected services. The percentage of non-hardware recall fixes will surely grow in the coming years as more and more automakers replace older models with software-defined vehicles.

not-to-be-outdone-by-openai,-google-releases-its-own-“reasoning”-ai-model

Not to be outdone by OpenAI, Google releases its own “reasoning” AI model

Google DeepMind’s chief scientist, Jeff Dean, says that the model receives extra computing power, writing on X, “we see promising results when we increase inference time computation!” The model works by pausing to consider multiple related prompts before providing what it determines to be the most accurate answer.

Since OpenAI’s jump into the “reasoning” field in September with o1-preview and o1-mini, several companies have been rushing to achieve feature parity with their own models. For example, DeepSeek launched DeepSeek-R1 in early November, while Alibaba’s Qwen team released its own “reasoning” model, QwQ, earlier this month.

Some claim that reasoning models can help solve complex mathematical or academic problems, but these models might not be for everybody. While they perform well on some benchmarks, questions remain about their actual usefulness and accuracy. Also, the high computing costs needed to run reasoning models have created some rumblings about their long-term viability. That high cost is why OpenAI’s ChatGPT Pro costs $200 a month, for example.

Still, it appears Google is serious about pursuing this particular AI technique. Logan Kilpatrick, a Google employee in its AI Studio, called it “the first step in our reasoning journey” in a post on X.

here’s-what-we-learned-driving-audi’s-new-q6-and-sq6-electric-suvs

Here’s what we learned driving Audi’s new Q6 and SQ6 electric SUVs

HEALDSBURG, Calif.—Earlier this summer, Ars got its first drive of Audi’s new Q6 e-tron on some very wet roads in Spain. Then, we were driving pre-production Q6s in Euro-spec. Now, the electric SUV is on sale in the US, with more power in the base model and six months more refinement for its software. But the venue change did not bring a change of weather—heavy rain was the order of the day, making me wonder if Audi is building its new electric vehicle on the site of an ancient rain god’s temple?

Compared with its rivals, Audi appears to have settled into a nomenclature for its vehicles that at least makes a little sense. Odd numbers are for internal combustion engines, even numbers for EVs, although it also appends “e-tron” on the end to make that entirely clear… and give francophones something to snicker about. (Yes, the e-tron GT does not fit into this schema, but nobody’s perfect.)

The Q6 e-tron is also the most advanced EV to wear Audi’s four rings. Built on a new architecture called PPE (premium platform electric), at its heart is an 800 V powertrain with a 100 kWh (94.4 kWh useable) lithium-ion battery pack that powers a permanently excited synchronous motor driving the rear wheels, and in the case of the quattro versions, an asynchronous motor. The electric motors have 30 percent less energy consumption than those used in the Q8 e-tron, and are smaller and lighter.

That makes it a lot more up to date than the Q8 e-tron, which uses a modified version of Audi’s venerable MLB Evo platform, or the smaller Q4 e-tron, a somewhat disappointing electric crossover that’s essentially a Volkswagen ID.4 with a glow-up. That goes for the Q6 e-tron’s electronics, too, which are a generation newer than those in the Q4 e-tron and more capable.

Audi is starting off US Q6 e-tron sales with a pair of models, the $65,800 Q6 e-tron quattro and the $72,900 SQ6 e-tron quattro. A $63,800 single-motor (not-quattro) Q6 e-tron will be available in time, with 302 hp (225 kW) and an EPA range of 321 miles (517 km), but we’ll have to wait a while before we get behind the wheel of that one.

louisiana-resident-in-critical-condition-with-h5n1-bird-flu

Louisiana resident in critical condition with H5N1 bird flu

The Louisiana resident infected with H5N1 bird flu is hospitalized in critical condition and suffering from severe respiratory symptoms, the Louisiana health department revealed Wednesday.

The health department had reported the presumptive positive case on Friday and noted the person was hospitalized, as Ars reported. But a spokesperson had, at the time, declined to provide Ars with the patient’s condition or further details, citing patient confidentiality and an ongoing public health investigation.

This morning, the Centers for Disease Control and Prevention announced that it had confirmed the state’s H5N1 testing and determined that the case “marks the first instance of severe illness linked to the virus in the United States.”

In a follow-up, the health department spokesperson Emma Herrock was able to release more information about the case. In addition to being in critical condition with severe respiratory symptoms, the person is reported to be over the age of 65 and has underlying health conditions.

Further, the CDC collected partial genetic data of the H5N1 strain infecting the patient, finding it to be of the D1.1 genotype, which has been detected in wild birds and some poultry in the US. Notably, it is the same genotype seen in a Canadian teenager who was also hospitalized in critical condition from the virus last month. The D1.1 genotype is not the same as the one circulating in US dairy cows, which is the B3.13 genotype.

the-$700-price-tag-isn’t-hurting-ps5-pro’s-early-sales

The $700 price tag isn’t hurting PS5 Pro’s early sales

When Sony revealed the PlayStation 5 Pro a few months ago, some wondered just how many people would be willing to spend $700 for a marginal upgrade to the already quite powerful graphical performance of the PS5. Now, initial sales reports suggest there’s still a substantial portion of the console market that’s willing to shell out serious money for top-of-the-line console graphics.

Circana analyst Mat Piscatella shared on Bluesky this morning that the PS5 Pro accounted for a full 19 percent of US PS5 sales in its launch month of November. That sales ratio puts initial upgrade interest in the PS5 Pro roughly in line with lifetime interest in the PS4 Pro, which recent reports suggest was responsible for about 20 percent of all PS4 sales following its launch in 2016.

That US sales ratio also lines up with international sales reports for the PS5 Pro launch. In the UK, GfK ChartTrack reports that the PS5 Pro was responsible for 26 percent of all console sales for November. And in Japan, Famitsu sales data suggests the PS5 Pro was responsible for a full 63 percent of the PS5’s November sales after selling an impressive 78,000 units in its launch week alone.

Shut up and take my money

In the US, raw unit sales for the PS5 Pro were down slightly (12 percent) compared to those for the PS4 Pro’s launch month in November 2016, Piscatella writes. But the PS5 Pro still managed to bring in 50 percent more total US revenue in its launch month, thanks to its much higher price; the PS4 Pro launched at a much more reasonable $400 (or $533 in 2024 dollars).
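Those two figures are easy to sanity-check. The back-of-the-envelope sketch below assumes every unit sold at its launch list price, which is our simplification and not something Circana's data confirms, and uses only the numbers quoted above.

```python
PS4_PRO_PRICE_USD = 400        # 2016 launch price
PS4_PRO_PRICE_2024_USD = 533   # the same price in 2024 dollars, as quoted
PS5_PRO_PRICE_USD = 700        # 2024 launch price
UNIT_CHANGE = -0.12            # PS5 Pro launch-month units vs. PS4 Pro


def revenue_ratio(unit_change: float, new_price: float, old_price: float) -> float:
    """Ratio of new to old revenue, assuming all units sell at list price."""
    return (1 + unit_change) * new_price / old_price


nominal = revenue_ratio(UNIT_CHANGE, PS5_PRO_PRICE_USD, PS4_PRO_PRICE_USD)
real = revenue_ratio(UNIT_CHANGE, PS5_PRO_PRICE_USD, PS4_PRO_PRICE_2024_USD)

print(f"Nominal revenue ratio: {nominal:.2f}")             # 1.54
print(f"Inflation-adjusted revenue ratio: {real:.2f}")     # 1.16
```

Under that assumption, 12 percent fewer units at $700 works out to roughly 54 percent more revenue in nominal terms, which is in the same ballpark as the reported 50 percent, and to about 16 percent more once the PS4 Pro's price is adjusted for inflation.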

nvidia-partners-leak-next-gen-rtx-50-series-gpus,-including-a-32gb-5090

Nvidia partners leak next-gen RTX 50-series GPUs, including a 32GB 5090

Rumors have suggested that Nvidia will be taking the wraps off of some next-generation RTX 50-series graphics cards at CES in January. And as we get closer to that date, Nvidia’s partners and some of the PC makers have begun to inadvertently leak details of the cards.

According to recent leaks from both Zotac and Acer, it looks like Nvidia is planning to announce four new GPUs next month, all at the high end of its lineup: The RTX 5090, RTX 5080, RTX 5070 Ti, and RTX 5070 were all briefly listed on Zotac’s website, as spotted by VideoCardz. There’s also an RTX 5090D variant for the Chinese market, which will presumably have its specs tweaked to conform with current US export restrictions on high-performance GPUs.

Though the website leak didn’t confirm many specs, it did list the RTX 5090 as including 32GB of GDDR7, an upgrade from the 4090’s 24GB of GDDR6X. An Acer spec sheet for new Predator Orion desktops also lists 32GB of GDDR7 for the 5090, as well as 16GB of GDDR7 for the RTX 5080. That 16GB is the same amount of RAM included with the RTX 4080 and 4080 Super.

The 5090 will be a big deal when it launches because no graphics card released since October 2022 has come close to beating the 4090’s performance. Nvidia’s early 2024 Super refresh for some 40-series cards didn’t include a 4090 Super, and AMD’s flagship RX 7900 XTX card is more comfortable competing with the likes of the 4080 and 4080 Super. The 5090 isn’t a card that most people are going to buy, but for the performance-obsessed, it’s the first high-end performance upgrade the GPU market has seen in more than two years.

the-second-gemini

The Second Gemini

  1. Trust the Chef.

  2. Do Not Trust the Marketing Department.

  3. Mark that Bench.

  4. Going Multimodal.

  5. The Art of Deep Research.

  6. Project Mariner the Web Agent.

  7. Project Astra the Universal Assistant.

  8. Project Jules the Code Agent.

  9. Gemini Will Aid You on Your Quest.

  10. Reactions to Gemini Flash 2.0.

Google has been cooking lately.

Gemini Flash 2.0 is the headline release, which will be the main topic today.

But there’s also Deep Research, where you can ask Gemini to take several minutes, check dozens of websites and compile a report for you. Think of it as a harder to direct, slower but vastly more robust version of Perplexity, that will improve with time and as we figure out how to use and prompt it.

NotebookLM added a call-in feature for podcasts, a Plus paid offering and a new interface that looks like a big step up.

Veo 2 is their new video generation model, and Imagen 3 is their new image model. There’s also Whisk, where you hand it a bunch of images and it combines them with some description for a new image. Superficially they all look pretty good.

They claim people in a survey chose Veo 2 generations over Sora Turbo by a wide margin. Note that the edges over the other options imply Sora was sub-par:

Here’s one comparison of both handling the same prompt. Here is Veo conquering the (Will Smith eating) spaghetti monster.

This is a strong endorsement from a source I find credible:

Nearcyan: I haven’t seen a model obliterate the competition as thoroughly as Veo2 is right now since Claude 3.5.

They took the concept I was barely high-IQ enough to try to articulate and actually put it in the model and got it to work at scale.

It really was two years from StyleGAN2 to Stable Diffusion, then two years from Stable Diffusion to Veo2. They were right. Again.

I wonder when the YouTubers are going to try to revolt.

There’s a new Realtime Multimodal API, Agentic Web Browsing from Project Mariner, Jules for automated code fixing, an image model upgrade, the ability to try Project Astra.

And they’re introducing Android XR (launches in 2025) as a new operating system and platform for ‘Gemini-era’ AR or VR glasses, which they’re pitching as something you wear all day with AI being the killer app, similar to a smart watch. One detail I appreciated was seamless integration with a mouse and keyboard. All the details I saw seem… right, if they can nail the execution. The Apple Vision Pro is too cumbersome, didn’t have AI involved and didn’t work out, but Google’s vision feels like the future.

Demis Hassabis is calling 2025 ‘the year of the AI agent.’

Gemini 2.0 is broadly available, via Google AI Studio, Vertex and its API, and Gemini’s app.

Gemini 2.0 finally has a functional code interpreter.

For developers, they offer this cookbook.

One big thing we do not know yet is the price. You can use a free preview, and that’s it.

If you want to join all the waitlists, and why wouldn’t you, go to Google Labs.

I mean, obviously, you never want to ‘trust the marketing department.’

But in this case, I mean something else: Do not trust them to do their jobs.

Google has been bizarrely bad about explaining all of this and what it can do. I very much do not want to hear excuses about ‘the Kleenex effect’ (especially since you could also call this ‘the Google effect’) or ‘first mover advantage.’ This is full-on not telling us what they are actually offering or giving us any reasonable way to find out beyond the ‘faround’ plan.

Even when I seek out their copy, it is awful.

For example, CEO Sundar Pichai’s note at the top of their central announcement is cringeworthy corporate-speak and tells you almost nothing. Nobody wants this.

On at least some benchmarks, Gemini Flash 2.0 outperforms Gemini 1.5 Pro.

That chart only compares to Gemini Pro 1.5, and only on their selection of benchmarks. But based on other reports, it seems likely that yes, this is an overall intelligence upgrade over Pro 1.5 while also being a lot faster and cheaper.

It tops something called the ‘Hallucination leaderboard’ along with Zhipu.

Chubby: Gemini 2.0 Flash on Hallucination leaderboard

Gemini shows its strength day by day

Claude is at 4.6%, and its hallucinations don’t bother me, but I do presume this is measuring something useful.

The performance on Arena is super impressive for a model of this size and speed, and a large improvement over the old Flash, which was already doing great for its size. It’s not quite at the top, but it’s remarkably close:

Gemini 2.0 is sufficiently lightweight, fast and capable that Google says it enables real time multimodal agentic output. It can remember, analyze and respond accordingly.

It also has native tool use, which is importantly missing from o1.

Google claims all this will enable their universal AI assistant, Project Astra, to be worth using. And also Project Mariner, asking it to act as a reasoning multi-step agent on the web on your behalf across domains.

Currently Astra and Mariner, and also their coding agent Jules, are in the experimental stage. This is very good. Projects like this should absolutely have extensive experimental stages first. It is relatively fine to rush Flash 2.0 into Search, but agents require a bit more caution, if only for not-shooting-self-in-foot practical purposes.

Project Astra now is fully multilingual including within a conversation, and has 10 minutes of in-session memory plus memory of earlier conversations, and less latency.

There’s a waiting list for Astra, but right now, you can use Google AI Studio to click Stream Realtime on your screen, which seems to be at least close to the same thing if you do it on mobile? There’s a button to use your webcam, another to talk to it.

On a computer you can then use a voice interface, and it will analyze things in real time, including analyzing code and suggesting bug fixes.

If we can bring this together with the rest of the IDE and the abilities of a Cursor, watch out, cause it would solve some of the bottlenecks.

Deep Research is a fantastic idea.

Ethan Mollick: Google has a knack for making non-chatbot interfaces for serious work with LLMs.

When I demo them, both NotebookLM & Deep Research are instantly understandable and fill real organizational needs. They represent a tiny range of AI capability, but they are easy for everyone to get

You type in your question, you tab out somewhere else for a while, then you check back later and presto, there’s a full report based on dozens of websites. Brilliant!

It is indeed a great modality and interface. But it has to work in practice.

In practice, well, there are some problems. As always, there’s basic accuracy, such as this output – I had it flat out copy benchmark numbers wrong, claim 40.4% was a higher score than 62.1% on GPQA (rather persistently, even!) and so on.

I also didn’t feel like the ‘research plan’ corresponded that well to the results.

The bigger issue is that it will give you ‘the report you vaguely asked for.’ It doesn’t, at least in my attempts so far, do the precision thing. Ask it a particular question, get a generic vaguely related answer. And if you try to challenge its mistakes, weird mostly unhelpful things happen.

That doesn’t mean it is useless.

If what you want matches Gemini’s inclinations about what a vaguely related report would look like, you’re golden.

If what you want is a subset of that but will be included in the report, you can, as someone suggested to me on Twitter, take the report, click a button to make it a Google Doc, then feed the Google Doc to Claude (or Gemini!) and have it pick out the information you want.

These were by far the most gung-ho reviews I’ve seen so far:

Dean Ball: Holy hell, Gemini’s deep research is unbelievable.

I just pulled information from about 100 websites and compiled a report on natural gas generation in minutes.

Perhaps my favorite AI product launch of the last three business days.

The first questions I ask language models are *always* research questions I myself have investigated in the recent past.

Gemini’s performance on the prompt in question was about 85%, for what it’s worth, but the significant point is that no other model could have gotten close.

It wasn’t factually inaccurate about anything I saw—most of the problem was the classic llm issue of not getting at what I *actually* wanted on certain sub-parts of the inquiry.

Especially useful for me since I very often am doing 50-state surveys of things.

Sid Bharath: Gemini Deep Research is absolutely incredible. It’s like having an analyst at your fingertips, working at inhuman speeds.

Need a list of fashion bloggers and their email addresses to promote your new clothing brand? Deep Research crawls through hundreds of sites to pull it all together into a spreadsheet, along with a description of that website and a personalized pitch, in minutes.

Analyzing a stock or startup pitch deck? Deep Research can write up a full investment memo with competitive analysis and market sizing, along with sources, while you brew your coffee.

Whenever you need to research something, whether it’s for an essay or blog, analyzing a business, building a product, promoting your brand, creating an outreach list, Deep Research can do it at a fraction of the time you or your best analyst can.

And it’s available on the Gemini app right now. Check it out and let me know what you think.

On reflection, Dean Ball’s use cases are a great fit for Deep Research. I still don’t see how he came away so enthused.

Sid Bharath again seems like he has a good use case with generating a list of contacts. I’m a lot more suspicious about some of the other tasks here, where I’d expect to have a bigger slop problem.

You can also view DR as a kind of ‘free action.’ You get a bunch of Deep Research reports on a wide variety of subjects. The ones that don’t help, you quickly discard. So it’s fine if the hit rate is not so high.

Another potential good use is to use this as a search engine for the sources, looking either at the ones in the final data set or the list of researched websites.

It will take time to figure out the right ways to take advantage of this, and doubtless Google can improve this experience a lot if it keeps cooking.

Jon Stokes sees Deep Research as Google ‘eating its seed corn,’ as in not only search but also the internet, because it generates hits to websites that bring them no potential customers.

Jon Stokes: Gemini is strip-mining the web. Not a one of the 563 websites being visited by Gemini in the above screencap is getting any benefit from this activity — in fact, they’re paying to serve this content to Google. It’s all cost, no benefit for rightsholders.

I don’t think it is true they get no benefit. I have clicked on a number of Deep Research’s sources and looked at them myself, and I doubt I am alone in this.

I encourage you to share your experiences.

Project Mariner scores a SotA 83.5% on the WebVoyager benchmark, going up to 90%+ if you give it access to tree search. They certainly are claiming it is damn impressive.

The research prototype can only use the active tab, stopping you from doing other things in the meantime. Might need multiple computers?

Here’s Olivia Moore using it to nail GeoGuessr. The example in question does seem like easy mode, there’s actual signs that give away the exact location, but very cool.

It is however still in early access, so we can’t try it out yet.

Shane Legg (Chief AGI Scientist, Google): Who’s starting to feel the AGI?

I was excited when I first saw the announcements for Project Astra, but we’re still waiting and haven’t seen much. They’re now giving us more details and claiming it has been upgraded, and is ready to go experimental. Mostly we get some early tester reports, a few minutes long each.

One tester points to the long-term memory as a key feature. That was one of the ones that made sense to me, along with translation and object identification. Some of the other ways the early testers used Astra, and their joy in some of the responses, seemed so weird to me. It’s cool that Astra can do these things, but why are these things you want Astra to be doing?

That shows how far we’ve come. I’ve stopped being impressed that it can do a thing, and started instead asking if I would want to do the thing in practice.

Astra will have at least some tool use, 10 minutes of in-context memory, a long-term memory for past conversations and real time voice interaction. The prototype glasses, they are also coming.

Here Roni Rahman goes over the low hanging fruit Astra use cases, and a similar thread from Min Choi.

My favorite use case so far is getting Gemini to watch the screen for when you slack off and yell at you to get back to work.

Jules is Google’s new code agent. Again, it isn’t available yet for us regular folk; they promise it for interested developers in early 2025.

How good is it? Impossible to know. All we know is Google’s getting into the game.

There’s also a data science agent scheduled for the first half of 2025.

With the multimodal Live API, Gemini 2.0 can be your assistant while playing games.

It can understand your screen, help you strategize in games, remember tasks, and search the web for background information, all in voice mode.

An excellent question:

High Minded Lowlife: I don’t play these games so I gotta ask. Are these actually good suggestions or just generic slop answers that sound good but really aren’t. If the former then this is pretty awesome.

That’s always the question, isn’t it? Are the suggestions any good?

I notice that if Gemini could put an arrow icon or even better pathways onto the screen, it would be that much more helpful here.

So we all know what that means.

We already know that no, Gemini can’t play Magic: The Gathering yet.

What is the right way to use this new power while gaming?

When do you look at the tier list, versus very carefully not looking at the tier list?

Now more than ever, you need to cultivate the gaming experience that you want. You want a challenge that is right for you, of the type that you enjoy. Sometimes you want the joy of organic discovery and exploration, and other times you want key information in advance, especially to avoid making large mistakes.

Here Sid Bharath uses Gemini to solve the New York Times Crossword, as presumably any other LLM could as well with a slightly worse interface. But it seems like mostly you want to not do this one?

Sully is a big fan.

Sully: This is insane.

Gemini Flash 2.0 is twice as fast and significantly smarter than before.

Guys, DeepMind is cooking.

From the benchmarks, it is better than 1.5 Pro.

Mbongeni Ndlovu: I’m loving Gemini 2.0 Flash so much right now.

Its video understanding is so much better and faster than 1.5 Pro.

The real-time streaming feature is pretty wild.

Sully: Spent the day using Gemini Flash 2.0, and I’m really impressed.

Basically, it is the same as GPT-4o and slightly worse than Claude, in my opinion.

Once it is generally available, I think all our “cheap” requests will go to Flash. Getting rid of GPT Mini plus Haiku (and some GPT-4o).

Bindu Reddy is a big fan.

Bindu Reddy: Gotta say Gemini 2.0 is a way bigger launch than whatever OpenAI has announced so far

Also love that Google made the API available for evals and experiments

Last but not the least, Gemini’s speed takes your breath away

Mostafa Dehghani notices that Gemini 2.0 can break down steps in the ‘draw the rest of the owl’ task.

What is my take so far?

Veo 2 seems great, but it’s not my area, and I notice I don’t care.

Deep Research is a great idea, and it has a place in your workflow even with all the frustrations, but it’s early days and it needs more time to cook. It’s probably a good idea to keep a few Gemini windows open for this, occasionally put in questions where it might do something interesting, and then quickly scan the results.

Gemini-1206 seems solid from what I can tell but I don’t notice any temptation to explore it more, or any use case where I expect it to be a superior tool to some combination of o1, GPT-4o with web search, Perplexity and Claude Sonnet.

Gemini Flash 2.0 seems like it is doing a remarkably good impression of models that are much larger and more expensive. I’d clearly never use it over Claude Sonnet where I had both options, but Flash opens up a bunch of new use cases, and I’m excited to see where those go.

Project Astra (or ‘streaming realtime’) in particular continues to seem fascinating, both the PC version with a shared screen and the camera version with your phone. I’m eager to put both to proper tests, even in their early forms, but have not yet found the time. Maybe I should just turn it on during my work at some point and see what happens.

Project Mariner I don’t have access to, so it’s impossible to know if it is anything yet.

For now I notice that I’m acting like most people who bounce off AI, and don’t properly explore it, and miss out. On a less dumb level, but I need to snap out of it.

The future is going to get increasingly AI, and increasingly weird. Let’s get that first uneven distribution.
