Professor sues Meta to allow release of feed-killing tool for Facebook

Ethan Zuckerman wants to release a tool that would allow Facebook users to control what appears in their newsfeeds. His privacy-friendly browser extension, Unfollow Everything 2.0, is designed to essentially give users a switch to turn the newsfeed on and off whenever they want, providing a way to eliminate or curate the feed.

Ethan Zuckerman, a professor at University of Massachusetts Amherst, is suing Meta to release a tool allowing Facebook users to “unfollow everything.” (Photo by Lorrie LeJeune)

The tool is nearly ready to be released, Zuckerman told Ars, but the University of Massachusetts Amherst associate professor is afraid that Facebook owner Meta might threaten legal action if he goes ahead. And his fears appear well-founded. In 2021, Meta sent a cease-and-desist letter to the creator of the original Unfollow Everything, Louis Barclay, leading that developer to shut down his tool after thousands of Facebook users had eagerly downloaded it.

Zuckerman is suing Meta, asking a US district court in California to invalidate Meta’s past arguments against developers like Barclay and rule that Meta would have no grounds to sue if he released his tool.

Zuckerman insists that he’s “suing Facebook to make it better.” In picking this unusual legal fight with Meta, the professor—seemingly for the first time ever—is attempting to tip Section 230’s shield away from Big Tech and instead protect third-party developers from giant social media platforms.

To do this, Zuckerman is asking the court to consider a novel Section 230 argument relating to an overlooked provision of the law that Zuckerman believes protects the development of third-party tools that allow users to curate their newsfeeds to avoid objectionable content. His complaint cited case law and argued:

Section 230(c)(2)(B) immunizes from legal liability “a provider of software or enabling tools that filter, screen, allow, or disallow content that the provider or user considers obscene, lewd, lascivious, filthy, excessively violent, harassing, or otherwise objectionable.” Through this provision, Congress intended to promote the development of filtering tools that enable users to curate their online experiences and avoid content they would rather not see.

Unfollow Everything 2.0 falls in this “safe harbor,” Zuckerman argues, partly because “the purpose of the tool is to allow users who find the newsfeed objectionable, or who find the specific sequencing of posts within their newsfeed objectionable, to effectively turn off the feed.”

Ramya Krishnan, a senior staff attorney at the Knight Institute who helped draft Zuckerman’s complaint, told Ars that some Facebook users are concerned that the newsfeed “prioritizes inflammatory and sensational speech,” and they “may not want to see that kind of content.” By turning off the feed, Facebook users could choose to use the platform the way it was originally designed, avoiding being served objectionable content by blanking the newsfeed and manually navigating to only the content they want to see.

“Users don’t have to accept Facebook as it’s given to them,” Krishnan said in a press release provided to Ars. “The same statute that immunizes Meta from liability for the speech of its users gives users the right to decide what they see on the platform.”

Zuckerman, who considers himself “old to the Internet,” uses Facebook daily and even reconnected with and began dating his now-wife on the platform. He has a “soft spot” in his heart for Facebook and still finds the platform useful to keep in touch with friends and family.

But while he’s “never been in the ‘burn it all down’ camp,” he has watched social media evolve to give users less control over their feeds and believes “that the dominance of a small number of social media companies tends to create the illusion that the business model adopted by them is inevitable,” his complaint said.

Critics question tech-heavy lineup of new Homeland Security AI safety board

Adventures in 21st century regulation —

CEO-heavy board to tackle elusive AI safety concept and apply it to US infrastructure.

On Friday, the US Department of Homeland Security announced the formation of an Artificial Intelligence Safety and Security Board that consists of 22 members pulled from the tech industry, government, academia, and civil rights organizations. But given the nebulous nature of the term “AI,” which can apply to a broad spectrum of computer technology, it’s unclear if this group will even be able to agree on what exactly they are safeguarding us from.

President Biden directed DHS Secretary Alejandro Mayorkas to establish the board, which will meet for the first time in early May and subsequently on a quarterly basis.

The fundamental assumption posed by the board’s existence, and reflected in Biden’s AI executive order from October, is that AI is an inherently risky technology and that American citizens and businesses need to be protected from its misuse. Along those lines, the goal of the group is to help guard against foreign adversaries using AI to disrupt US infrastructure; develop recommendations to ensure the safe adoption of AI tech into transportation, energy, and Internet services; foster cross-sector collaboration between government and businesses; and create a forum for AI leaders to share information on AI security risks with the DHS.

It’s worth noting that the ill-defined nature of the term “Artificial Intelligence” does the new board no favors regarding scope and focus. AI can mean many different things: It can power a chatbot, fly an airplane, control the ghosts in Pac-Man, regulate the temperature of a nuclear reactor, or play a great game of chess. It can be all those things and more, and since many of those applications of AI work very differently, there’s no guarantee any two people on the board will be thinking about the same type of AI.

This confusion is reflected in the quotes provided by the DHS press release from new board members, some of whom are already talking about different types of AI. While OpenAI, Microsoft, and Anthropic are monetizing generative AI systems like ChatGPT based on large language models (LLMs), Ed Bastian, the CEO of Delta Air Lines, refers to entirely different classes of machine learning when he says, “By driving innovative tools like crew resourcing and turbulence prediction, AI is already making significant contributions to the reliability of our nation’s air travel system.”

So, defining the scope of what AI exactly means—and which applications of AI are new or dangerous—might be one of the key challenges for the new board.

A roundtable of Big Tech CEOs attracts criticism

For the inaugural meeting of the AI Safety and Security Board, the DHS selected a tech industry-heavy group, populated with CEOs of four major AI vendors (Sam Altman of OpenAI, Satya Nadella of Microsoft, Sundar Pichai of Alphabet, and Dario Amodei of Anthropic), CEO Jensen Huang of top AI chipmaker Nvidia, and representatives from other major tech companies like IBM, Adobe, Amazon, Cisco, and AMD. There are also reps from big aerospace and aviation: Northrop Grumman and Delta Air Lines.

Upon reading the announcement, some critics took issue with the board’s composition. On LinkedIn, Timnit Gebru, founder of The Distributed AI Research Institute (DAIR), took particular aim at OpenAI’s presence and wrote, “I’ve now seen the full list and it is hilarious. Foxes guarding the hen house is an understatement.”

Customers say Meta’s ad-buying AI blows through budgets in a matter of hours

Spending money is just so hard … can’t a computer do it for me? —

Based on your point of view, the AI either doesn’t work or works too well.

AI is here to terminate your bank account.

Carolco Pictures

Give the AI access to your credit card, they said. It’ll be fine, they said. Users of Meta’s ad platform who followed that advice have been getting burned by an AI-powered ad purchasing system, according to The Verge. The idea was to use a Meta-developed AI to automatically set up ads and spend your ad budget, saving you the hassle of making decisions about your ad campaign. Apparently, the AI funnels money to Meta a little too well: Customers say it burns through what should be daily ad budgets in a matter of hours, and costs are inflated as much as 10-fold.

The AI-powered software in question is the “Advantage+ Shopping Campaign.” The system is supposed to automate a lot of ad setup for you, mixing and matching various creative elements and audience targets. The power of AI-powered advertising (Google has a similar product) is that the ad platform can get instant feedback on its generated ads via click-through rates. You give it a few guard rails, and it can try hundreds or thousands of combinations to find the most clickable ad at a speed and efficiency no human could match. That’s the theory, anyway.
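
The feedback loop described above, generating many ad variants and shifting spend toward whichever one clicks best, is essentially a multi-armed bandit problem. Here is a minimal sketch using Thompson sampling; the variant names and click rates are hypothetical, and this is a generic illustration of the technique, not Meta's actual Advantage+ algorithm.

```python
import random

# Toy Thompson-sampling bandit: each "arm" is one ad variant, and the system
# learns from click feedback which variant to show more often. Generic
# illustration only; not Meta's actual Advantage+ system.
def pick_variant(stats):
    """Sample a plausible click rate for each variant; pick the best draw."""
    best, best_draw = None, -1.0
    for variant, (clicks, misses) in stats.items():
        draw = random.betavariate(clicks + 1, misses + 1)
        if draw > best_draw:
            best, best_draw = variant, draw
    return best

def simulate(true_rates, impressions=5000, seed=42):
    random.seed(seed)
    stats = {v: [0, 0] for v in true_rates}  # [clicks, misses] per variant
    for _ in range(impressions):
        v = pick_variant(stats)
        clicked = random.random() < true_rates[v]  # instant click feedback
        stats[v][0 if clicked else 1] += 1
    return stats

# Three hypothetical creative combinations with different hidden click rates.
stats = simulate({"headline_a": 0.02, "headline_b": 0.05, "headline_c": 0.01})
for variant, (clicks, misses) in stats.items():
    print(variant, clicks + misses)  # impressions concentrate on the best variant
```

Run over thousands of impressions, spend concentrates on the best-clicking variant far faster than any manual A/B test could manage, which is the efficiency pitch of AI-driven ad buying.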

The Verge spoke to “several marketers and businesses” with similar stories of being hit by an AI-powered spending spree once they let Meta’s system take over a campaign. The description of one account says the AI “had blown through roughly 75 percent of the daily ad budgets for both clients in under a couple of hours” and that “the ads’ CPMs, or cost per impressions, were roughly 10 times higher than normal.” Meanwhile, the revenue earned from those AI-powered ads was “nearly zero.” The report says, “Small businesses have seen their ad dollars get wiped out and wasted as a result, and some have said the bouts of overspending are driving them from Meta’s platforms.”

Meta’s Advantage+ sales pitch promises to “Use machine learning to identify and aim for your highest value customers across all of Meta’s family of apps and services, with minimal input.” The service can “Automatically test up to 150 creative combinations and deliver the highest performing ads.” Meta promises that “on average, companies have seen a 17 percent reduction in cost per action [an action is typically a purchase, registration, or sign-up] and a 32 percent increase in return on ad spend.”

In response to the complaints, a Meta spokesperson told The Verge the company had fixed “a few technical issues” and that “Our ads system is working as expected for the vast majority of advertisers. We recently fixed a few technical issues and are researching a small amount of additional reports from advertisers to ensure the best possible results for businesses using our apps.” The Verge got that statement a few weeks ago, though, and advertisers are still having issues. The report describes the service as “unpredictable” and says what “other marketers thought was a one-time glitch by Advantage Plus ended up becoming a recurring incident for weeks.”

To make matters worse, layoffs in Meta’s customer service department mean it’s been difficult to get someone at Meta to deal with the AI’s spending sprees. Some accounts report receiving refunds after complaining, but it can take several tries to reach someone at customer service and upward of a month to receive a refund. Some customers quoted in the report have decided to return to the pre-AI, non-automated way of setting up a Meta ad campaign, which can take “an extra 10 to 20 minutes.”

Apple releases eight small AI language models aimed at on-device use

Inside the Apple core —

OpenELM mirrors efforts by Microsoft to make useful small AI language models that run locally.

An illustration of a robot hand tossing an apple to a human hand.

Getty Images

In the world of AI, what might be called “small language models” have been growing in popularity recently because they can be run on a local device instead of requiring data center-grade computers in the cloud. On Wednesday, Apple introduced a set of tiny source-available AI language models called OpenELM that are small enough to run directly on a smartphone. They’re mostly proof-of-concept research models for now, but they could form the basis of future on-device AI offerings from Apple.

Apple’s new AI models, collectively named OpenELM for “Open-source Efficient Language Models,” are currently available on the Hugging Face Hub under an Apple Sample Code License. Since there are some restrictions in the license, it may not fit the commonly accepted definition of “open source,” but the source code for OpenELM is available.

On Tuesday, we covered Microsoft’s Phi-3 models, which aim to achieve something similar: a useful level of language understanding and processing performance in small AI models that can run locally. Phi-3-mini features 3.8 billion parameters, but some of Apple’s OpenELM models are much smaller, ranging from 270 million to 3 billion parameters in eight distinct models.

In comparison, the largest model yet released in Meta’s Llama 3 family includes 70 billion parameters (with a 400 billion version on the way), and OpenAI’s GPT-3 from 2020 shipped with 175 billion parameters. Parameter count serves as a rough measure of AI model capability and complexity, but recent research has focused on making smaller AI language models as capable as larger ones were a few years ago.
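
A rough sense of why parameter count matters for on-device use: holding weights in 16-bit precision takes about 2 bytes per parameter, so a back-of-the-envelope calculation (ignoring activations, KV cache, and runtime overhead, so real requirements run higher) shows why a 270-million-parameter model fits comfortably on a phone while a 175-billion-parameter one does not.

```python
# Back-of-the-envelope memory needed just to hold model weights in 16-bit
# precision (2 bytes per parameter). Ignores activations, KV cache, and
# runtime overhead, so actual requirements are somewhat higher.
def weight_gb(params_billions, bytes_per_param=2):
    return params_billions * 1e9 * bytes_per_param / 1e9  # gigabytes

for name, size_b in [("OpenELM-270M", 0.27), ("OpenELM-3B", 3.0),
                     ("Llama 3 70B", 70.0), ("GPT-3 175B", 175.0)]:
    print(f"{name}: ~{weight_gb(size_b):.1f} GB of weights")
```

By this estimate, the smallest OpenELM needs only about half a gigabyte for its weights, well within a modern smartphone's memory, while GPT-3-class models need hundreds of gigabytes.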

The eight OpenELM models come in two flavors: four “pretrained” (basically a raw, next-token-prediction version of the model) and four instruction-tuned (fine-tuned for instruction following, which is better suited for developing AI assistants and chatbots).

OpenELM features a 2048-token maximum context window. The models were trained on the publicly available datasets RefinedWeb, a version of PILE with duplications removed, a subset of RedPajama, and a subset of Dolma v1.6, which Apple says totals around 1.8 trillion tokens of data. Tokens are fragmented representations of data used by AI language models for processing.

Apple says its approach with OpenELM includes a “layer-wise scaling strategy” that reportedly allocates parameters more efficiently across each layer, saving not only computational resources but also improving the model’s performance while being trained on fewer tokens. According to Apple’s released white paper, this strategy has enabled OpenELM to achieve a 2.36 percent improvement in accuracy over Allen AI’s OLMo 1B (another small language model) while requiring half as many pre-training tokens.

A table comparing OpenELM with other small AI language models in a similar class, taken from the OpenELM research paper by Apple.

Apple

Apple also released the code for CoreNet, a library it used to train OpenELM—and it also included reproducible training recipes that allow the weights (neural network files) to be replicated, which is so far unusual for a major tech company. As Apple says in its OpenELM paper abstract, transparency is a key goal for the company: “The reproducibility and transparency of large language models are crucial for advancing open research, ensuring the trustworthiness of results, and enabling investigations into data and model biases, as well as potential risks.”

By releasing the source code, model weights, and training materials, Apple says it aims to “empower and enrich the open research community.” However, it also cautions that since the models were trained on publicly sourced datasets, “there exists the possibility of these models producing outputs that are inaccurate, harmful, biased, or objectionable in response to user prompts.”

While Apple has not yet integrated this new wave of AI language model capabilities into its consumer devices, the upcoming iOS 18 update (expected to be revealed in June at WWDC) is rumored to include new AI features that utilize on-device processing to ensure user privacy—though the company may potentially hire Google or OpenAI to handle more complex, off-device AI processing to give Siri a long-overdue boost.

Meta debuts Horizon OS, with Asus, Lenovo, and Microsoft on board

Face Operating Systems —

Rivalry with Apple now mirrors the Android/iOS competition more than ever.

The Meta Quest Pro at a Best Buy demo station in October 2022.

Meta will open up the operating system that runs on its Quest mixed reality headsets to other technology companies, it announced today.

What was previously simply called Quest software will be called Horizon OS, and the goal will be to move beyond the general-use Quest devices to more purpose-specific devices, according to an Instagram video from Meta CEO Mark Zuckerberg.

There will be headsets focused purely on watching TV and movies on virtual screens, with the emphasis on high-end OLED displays. There will also be headsets that are designed to be as light as possible at the expense of performance for productivity and exercise uses. And there will be gaming-oriented ones.

The announcement named three partners to start. Asus will produce a gaming headset under its Republic of Gamers (ROG) brand, Lenovo will make general-purpose headsets with an emphasis on “productivity, learning, and entertainment,” and Xbox and Meta will team up to deliver a special edition of the Meta Quest bundled with an Xbox controller, Xbox Cloud Gaming, and Game Pass.

Users running Horizon OS devices from different manufacturers will be able to stay connected in the operating system’s social layer of “identities, avatars, social graphs, and friend groups” and will be able to enjoy shared virtual spaces together across devices.

The announcement comes after Meta established itself as an early leader in the relatively small but interesting consumer mixed reality space, though it now faces diminishing returns on new devices as the market saturates.

Further, Apple recently entered the fray with its Vision Pro headset. The Vision Pro is not really a direct competitor to Meta’s Quest devices today—it’s far more expensive and loaded with higher-end tech—but it may only be the opening volley in a long competition between the companies.

Meta’s decision to make Horizon OS a more open platform for partner OEMs, in contrast to Apple’s usual focus on owning and integrating as much of its devices’ software, hardware, and services as it can, mirrors the smartphone market. There, Google’s Android (on which Horizon OS is based) runs on a variety of devices from a wide range of companies, while Apple’s iOS runs only on Apple’s own iPhones.

Meta also says it is working on a new spatial app framework to make it easier for developers with experience on mobile to start making mixed reality apps for Horizon OS and that it will start “removing the barriers between the Meta Horizon Store and App Lab, which lets any developer who meets basic technical and content requirements release software on the platform.”

Pricing, specs, and release dates have not been announced for any of the new devices. Zuckerberg admitted it’s “probably going to take a couple of years” for this ecosystem of hardware devices to roll out.

LLMs keep leaping with Llama 3, Meta’s newest open-weights AI model

computer-powered word generator —

Zuckerberg says new AI model “was still learning” when Meta stopped training.

A group of pink llamas on a pixelated background.

On Thursday, Meta unveiled early versions of its Llama 3 open-weights AI model that can be used to power text composition, code generation, or chatbots. It also announced that its Meta AI Assistant is now available on a website and is going to be integrated into its major social media apps, intensifying the company’s efforts to position its products against other AI assistants like OpenAI’s ChatGPT, Microsoft’s Copilot, and Google’s Gemini.

Like its predecessor, Llama 2, Llama 3 is notable for being a freely available, open-weights large language model (LLM) provided by a major AI company. Llama 3 technically does not qualify as “open source” because that term has a specific meaning in software (as we have mentioned in other coverage), and the industry has not yet settled on terminology for AI model releases that ship either code or weights with restrictions (you can read Llama 3’s license here) or that ship without providing training data. We typically call these releases “open weights” instead.

At the moment, Llama 3 is available in two parameter sizes: 8 billion (8B) and 70 billion (70B), both of which are available as free downloads through Meta’s website with a sign-up. Llama 3 comes in two versions: pre-trained (basically the raw, next-token-prediction model) and instruction-tuned (fine-tuned to follow user instructions). Each has an 8,192-token context limit.

A screenshot of the Meta AI Assistant website on April 18, 2024.

Benj Edwards

Meta trained both models on two custom-built, 24,000-GPU clusters. In a podcast interview with Dwarkesh Patel, Meta CEO Mark Zuckerberg said that the company trained the 70B model with around 15 trillion tokens of data. Throughout the process, the model never reached “saturation” (that is, it never hit a wall in terms of capability increases). Eventually, Meta pulled the plug and moved on to training other models.

“I guess our prediction going in was that it was going to asymptote more, but even by the end it was still learning. We probably could have fed it more tokens, and it would have gotten somewhat better,” Zuckerberg said on the podcast.

Meta also announced that it is currently training a 400B parameter version of Llama 3, which some experts like Nvidia’s Jim Fan think may perform in the same league as GPT-4 Turbo, Claude 3 Opus, and Gemini Ultra on benchmarks like MMLU, GPQA, HumanEval, and MATH.

Speaking of benchmarks, we have devoted many words in the past to explaining how frustratingly imprecise benchmarks can be when applied to large language models due to issues like training contamination (that is, including benchmark test questions in the training dataset), cherry-picking on the part of vendors, and an inability to capture AI’s general usefulness in an interactive session with chat-tuned models.

But, as expected, Meta provided some benchmarks for Llama 3 that list results from MMLU (undergraduate level knowledge), GSM-8K (grade-school math), HumanEval (coding), GPQA (graduate-level questions), and MATH (math word problems). These show the 8B model performing well compared to open-weights models like Google’s Gemma 7B and Mistral 7B Instruct, and the 70B model also held its own against Gemini Pro 1.5 and Claude 3 Sonnet.

A chart of instruction-tuned Llama 3 8B and 70B benchmarks provided by Meta.

Meta says that the Llama 3 model has been enhanced with capabilities to understand coding (like Llama 2) and, for the first time, has been trained with both images and text—though it currently outputs only text. According to Reuters, Meta Chief Product Officer Chris Cox noted in an interview that more complex processing abilities (like executing multi-step plans) are expected in future updates to Llama 3, which will also support multimodal outputs—that is, both text and images.

Meta plans to host the Llama 3 models on a range of cloud platforms, making them accessible through AWS, Databricks, Google Cloud, and other major providers.

Also on Thursday, Meta announced that Llama 3 will become the new basis of the Meta AI virtual assistant, which the company first announced in September. The assistant will appear prominently in search features for Facebook, Instagram, WhatsApp, Messenger, and the aforementioned dedicated website that features a design similar to ChatGPT, including the ability to generate images in the same interface. The company also announced a partnership with Google to integrate real-time search results into the Meta AI assistant, adding to an existing partnership with Microsoft’s Bing.

Meta’s new $199 Quest 2 price is a steal for the VR-curious

Bargain basement —

Move comes as support winds down for the original Quest headset.

For just $199, you could be having as much fun as this paid model.

Meta has announced it’s permanently lowering the price of its aging Quest 2 headset to $199 for a 128GB base model, representing the company’s lowest price yet for a full-featured untethered VR headset.

The Quest 2, which launched in 2020 at $299, famously defied tech product convention by increasing its MSRP to $399 amid inflation and supply chain issues in mid-2022. Actual prices for the headset at retail have fallen since then, though; Best Buy offered new units for $299 as of last October and for $250 by the 2023 post-Thanksgiving shopping season, for instance.

And the Quest 2 is far from the company’s state-of-the-art headset at this point. Meta launched the surprisingly expensive Quest Pro in late 2022 before dropping that headset’s price from $1,499 to $999 less than five months later. And last year’s launch of the Quest 3 at a $499 starting price brought some significant improvements in resolution, processing power, thickness, and full-color passthrough images over the Quest 2.

But for how long?

Those looking to get the Quest 2 at its new bargain MSRP should keep in mind that Meta may not be planning to support the aging headset for the long haul. Meta is currently winding down support for the original Quest headset, which launched in 2019 and no longer has access to important online features, security updates, and even new apps. The Quest 2 is just 18 months younger than the original Quest, and the new price might represent an effort to clear out defunct stock in favor of newer, more powerful Quest options.

The Quest 2 (left) has a 40 percent thicker profile than the pancake-optics on the Quest 3 (right).

Meta

Then again, plenty of developers are still targeting apps and games at the comparatively large audience on the Quest 2, which sold an estimated 15 million units through mid-2022, roughly on par with the Xbox Series S|X in roughly the same time period. But there are some signs that Quest 2 software is selling more slowly than those hardware numbers might suggest amid reports that many Quest purchasers are no longer active users. And Meta continues to lose massive amounts of money on the VR segment, while Sony is reportedly halting production of the PS5-tethered PSVR2 headset amid weaker than expected demand.

The Quest 2’s new price is the first time Meta has offered a headset below the “$250 and 250 grams” target former Meta CTO John Carmack once envisioned for a “super cheap, super lightweight headset” that could bring in the mass market (the Quest 2 weighs in at 503 grams). The new price is also stunningly cheap when you consider that, just six or seven years ago, VR-curious consumers could easily end up paying $1,500 or more (in 2024 dollars) for a high-end tethered headset and the “VR-ready” computer needed to power it.

If you’ve waited this long to see what virtual reality gaming is all about, this price drop is the perfect opportunity to indulge your curiosity for a relative pittance. Heck, it might be worth it even if your headset ends up, like mine, a Beat Saber machine most of the time.

Words are flowing out like endless rain: Recapping a busy week of LLM news

many things frequently —

Gemini 1.5 Pro launch, new version of GPT-4 Turbo, new Mistral model, and more.

An image of a boy amazed by flying letters.

Some weeks in AI news are eerily quiet, but during others, getting a grip on the week’s events feels like trying to hold back the tide. This week has seen three notable large language model (LLM) releases: Google Gemini Pro 1.5 hit general availability with a free tier, OpenAI shipped a new version of GPT-4 Turbo, and Mistral released a new openly licensed LLM, Mixtral 8x22B. All three of those launches happened within 24 hours starting on Tuesday.

With the help of software engineer and independent AI researcher Simon Willison (who also wrote about this week’s hectic LLM launches on his own blog), we’ll briefly cover each of the three major events in roughly chronological order, then dig into some additional AI happenings this week.

Gemini Pro 1.5 general release

On Tuesday morning Pacific time, Google announced that its Gemini 1.5 Pro model (which we first covered in February) is now available in 180-plus countries, excluding Europe, via the Gemini API in a public preview. This is Google’s most powerful public LLM so far, and it’s available in a free tier that permits up to 50 requests a day.

It supports up to 1 million tokens of input context. As Willison notes in his blog, Gemini 1.5 Pro’s API price at $7/million input tokens and $21/million output tokens costs a little less than GPT-4 Turbo (priced at $10/million in and $30/million out) and more than Claude 3 Sonnet (Anthropic’s mid-tier LLM, priced at $3/million in and $15/million out).
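
To make those per-million-token prices concrete, here is a small sketch comparing the cost of a single hypothetical call, 10,000 input tokens and 2,000 output tokens, at the rates quoted above.

```python
# Cost of one hypothetical API call (10,000 input tokens, 2,000 output
# tokens) at the per-million-token prices quoted above.
PRICES = {  # model: (USD per 1M input tokens, USD per 1M output tokens)
    "Gemini 1.5 Pro": (7, 21),
    "GPT-4 Turbo": (10, 30),
    "Claude 3 Sonnet": (3, 15),
}

def call_cost(model, input_tokens, output_tokens):
    p_in, p_out = PRICES[model]
    return (input_tokens * p_in + output_tokens * p_out) / 1_000_000

for model in PRICES:
    print(f"{model}: ${call_cost(model, 10_000, 2_000):.3f}")
```

At these rates, the hypothetical call lands at roughly 11 cents on Gemini 1.5 Pro, 16 cents on GPT-4 Turbo, and 6 cents on Claude 3 Sonnet, matching the relative ordering described above.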

Notably, Gemini 1.5 Pro includes native audio (speech) input processing that allows users to upload audio or video prompts, a new File API for handling files, the ability to add custom system instructions (system prompts) for guiding model responses, and a JSON mode for structured data extraction.

“Majorly Improved” GPT-4 Turbo launch

A GPT-4 Turbo performance chart provided by OpenAI.

Just a bit later than Google’s 1.5 Pro launch on Tuesday, OpenAI announced that it was rolling out a “majorly improved” version of GPT-4 Turbo (a model family originally launched in November) called “gpt-4-turbo-2024-04-09.” It integrates multimodal GPT-4 Vision processing (recognizing the contents of images) directly into the model, and it initially launched through API access only.

Then on Thursday, OpenAI announced that the new GPT-4 Turbo model had just become available for paid ChatGPT users. OpenAI said that the new model improves “capabilities in writing, math, logical reasoning, and coding” and shared a chart that is not particularly useful in judging capabilities (and that it later updated). The company also provided an example of an alleged improvement, saying that when writing with ChatGPT, the AI assistant’s responses will be “more direct, less verbose, and use more conversational language.”

The vague nature of OpenAI’s GPT-4 Turbo announcements attracted some confusion and criticism online. On X, Willison wrote, “Who will be the first LLM provider to publish genuinely useful release notes?” In some ways, this is a case of “AI vibes” again, as we discussed in our lament about the poor state of LLM benchmarks during the debut of Claude 3. “I’ve not actually spotted any definite differences in quality [related to GPT-4 Turbo],” Willison told us directly in an interview.

The update also expanded GPT-4 Turbo’s knowledge cutoff to April 2024, although some people report that it achieves this through stealth web searches in the background, and others on social media have reported issues with date-related confabulations.

Mistral’s mysterious Mixtral 8x22B release

An illustration of a robot holding a French flag, figuratively reflecting the rise of AI in France due to Mistral. It's hard to draw a picture of an LLM, so a robot will have to do.

Not to be outdone, on Tuesday night, French AI company Mistral launched its latest openly licensed model, Mixtral 8x22B, by tweeting a torrent link devoid of any documentation or commentary, much like it has done with previous releases.

The new mixture-of-experts (MoE) release weighs in with a larger parameter count than Mistral’s previous most capable open model, Mixtral 8x7B, which we covered in December. It’s rumored to be as capable as GPT-4 (in what way, you ask? Vibes), but that remains to be seen.

“The evals are still rolling in, but the biggest open question right now is how well Mixtral 8x22B shapes up,” Willison told Ars. “If it’s in the same quality class as GPT-4 and Claude 3 Opus, then we will finally have an openly licensed model that’s not significantly behind the best proprietary ones.”

This release has Willison most excited, saying, “If that thing really is GPT-4 class, it’s wild, because you can run that on a (very expensive) laptop. I think you need 128GB of MacBook RAM for it, twice what I have.”

The new Mixtral is not listed on Chatbot Arena yet, Willison noted, because Mistral has not yet released a fine-tuned model for chatting. It’s still a raw, predict-the-next-token LLM. “There’s at least one community instruction tuned version floating around now though,” says Willison.

Chatbot Arena Leaderboard shake-ups

A Chatbot Arena Leaderboard screenshot taken on April 12, 2024.

Benj Edwards

This week’s LLM news isn’t limited to just the big names in the field. There have also been rumblings on social media about the rising performance of open-weights models like Cohere’s Command R+, which reached position 6 on the LMSYS Chatbot Arena Leaderboard—the highest-ever ranking for an open-weights model.

And for even more Chatbot Arena action, the new version of GPT-4 Turbo is proving competitive with Claude 3 Opus. The two are still in a statistical tie, but GPT-4 Turbo recently pulled ahead numerically. (In March, we reported that Claude 3 had first pulled ahead of GPT-4 Turbo numerically, which was then the first time another AI model had surpassed a member of the GPT-4 family on the leaderboard.)

Regarding this fierce competition among LLMs—of which most of the muggle world is unaware and likely never will be—Willison told Ars, “The past two months have been a whirlwind—we finally have not just one but several models that are competitive with GPT-4.” We’ll see whether OpenAI’s rumored release of GPT-5 later this year will restore the company’s technological lead, which once seemed insurmountable. But for now, Willison says, “OpenAI are no longer the undisputed leaders in LLMs.”

Meta relaxes “incoherent” policy requiring removal of AI videos

On Friday, Meta announced policy updates to stop censoring harmless AI-generated content and instead begin “labeling a wider range of video, audio, and image content as ‘Made with AI.'”

Meta’s policy updates came after the company decided not to remove a controversial post edited to show President Joe Biden seemingly inappropriately touching his granddaughter’s chest, with a caption calling Biden a “pedophile.” The Oversight Board had agreed with Meta’s decision to leave the post online while noting that Meta’s current manipulated media policy was too “narrow,” “incoherent,” and “confusing to users.”

Previously, Meta would only remove “videos that are created or altered by AI to make a person appear to say something they didn’t say.” The Oversight Board warned that this policy failed to address other manipulated media, including “cheap fakes,” manipulated audio, or content showing people doing things they’d never done.

“We agree with the Oversight Board’s argument that our existing approach is too narrow since it only covers videos that are created or altered by AI to make a person appear to say something they didn’t say,” Monika Bickert, Meta’s vice president of content policy, wrote in a blog. “As the Board noted, it’s equally important to address manipulation that shows a person doing something they didn’t do.”

Starting in May 2024, Meta will add “Made with AI” labels to any content detected as AI-generated, as well as to any content that users self-disclose as AI-generated.

The Oversight Board had also warned that by removing AI-generated videos that did not directly violate its community standards, Meta was threatening to “unnecessarily risk restricting freedom of expression.” Moving forward, Meta will stop censoring content that doesn’t violate community standards, agreeing that the “less restrictive” approach of labeling manipulated media is better.

“If we determine that digitally created or altered images, video, or audio create a particularly high risk of materially deceiving the public on a matter of importance, we may add a more prominent label so people have more information and context,” Bickert wrote. “This overall approach gives people more information about the content so they can better assess it and so they will have context if they see the same content elsewhere.”

Meta confirmed that, in July, it will stop censoring AI-generated content that doesn’t violate rules restricting things like voter interference, bullying and harassment, violence, and incitement.

“This timeline gives people time to understand the self-disclosure process before we stop removing the smaller subset of manipulated media,” Bickert explained in the blog.

Finally, Meta adopted the Oversight Board’s recommendation to “clearly define in a single unified Manipulated Media policy the harms it aims to prevent—beyond users being misled—such as preventing interference with the right to vote and to participate in the conduct of public affairs.”

The Oversight Board issued a statement provided to Ars, saying that members “are pleased that Meta will begin labeling a wider range of video, audio, and image content as ‘made with AI’ when they detect AI image indicators or when people indicate they have uploaded AI content.”

“This will provide people with greater context and transparency for more types of manipulated media, while also removing posts which violate Meta’s rules in other ways,” the Oversight Board said.

Facebook let Netflix see user DMs, quit streaming to keep Netflix happy: Lawsuit

A promotional image for Sorry for Your Loss, with Elizabeth Olsen, which was a Facebook Watch original scripted series.

Last April, Meta revealed that it would no longer support original shows, like Jada Pinkett Smith’s Red Table Talk, on Facebook Watch. Meta’s streaming business, once viewed as competition for the likes of YouTube and Netflix, is now effectively dead; Facebook doesn’t produce original series, and Facebook Watch is no longer available as a video-streaming app.

The streaming business’s demise seemed related to cost cuts at Meta that also included layoffs. However, recently unsealed court documents in an antitrust suit against Meta [PDF] claim that Meta squashed its streaming dreams to appease one of its biggest ad customers: Netflix.

Facebook allegedly gave Netflix creepy privileges

As spotted by Gizmodo, a letter was filed on April 14 in relation to a class-action antitrust suit brought by Meta customers, accusing Meta of anticompetitive practices that harm social media competition and consumers. The letter, made public Saturday, asks a court to have Reed Hastings, Netflix’s founder and former CEO, respond to a subpoena for documents that plaintiffs claim are relevant to the case. The original complaint filed in December 2020 [PDF] doesn’t mention Netflix beyond stating that Facebook “secretly signed Whitelist and Data sharing agreements” with Netflix, along with “dozens” of other third-party app developers. The case is still ongoing.

The letter alleges that Netflix’s relationship with Facebook was remarkably strong due to the former’s ad spend with the latter and that Hastings directed “negotiations to end competition in streaming video” from Facebook.

One of the first questions that may come to mind is why a company like Facebook would allow Netflix to influence such a major business decision. The litigation claims the companies formed a lucrative business relationship that included Facebook allegedly giving Netflix access to Facebook users’ private messages:

By 2013, Netflix had begun entering into a series of “Facebook Extended API” agreements, including a so-called “Inbox API” agreement that allowed Netflix programmatic access to Facebook’s users’ private message inboxes, in exchange for which Netflix would “provide to FB a written report every two weeks that shows daily counts of recommendation sends and recipient clicks by interface, initiation surface, and/or implementation variant (e.g., Facebook vs. non-Facebook recommendation recipients). … In August 2013, Facebook provided Netflix with access to its so-called “Titan API,” a private API that allowed a whitelisted partner to access, among other things, Facebook users’ “messaging app and non-app friends.”

Meta said it rolled out end-to-end encryption “for all personal chats and calls on Messenger and Facebook” in December. And in 2018, Facebook told Vox that it doesn’t use private messages for ad targeting. But a few months later, The New York Times, citing “hundreds of pages of Facebook documents,” reported that Facebook “gave Netflix and Spotify the ability to read Facebook users’ private messages.”

Meta didn’t respond to Ars Technica’s request for comment. The company told Gizmodo that it has standard agreements with Netflix currently but didn’t answer the publication’s specific questions.

Facebook secretly spied on Snapchat usage to confuse advertisers, court docs say

“I can’t think of a good argument for why this is okay” —

Zuckerberg told execs to “figure out” how to spy on encrypted Snapchat traffic.

Unsealed court documents have revealed more details about a secret Facebook project initially called “Ghostbusters,” designed to sneakily access encrypted Snapchat usage data to give Facebook a leg up on its rival, just when Snapchat was experiencing rapid growth in 2016.

The documents were filed in a class-action lawsuit from consumers and advertisers, accusing Meta of anticompetitive behavior that blocks rivals from competing in the social media ads market.

“Whenever someone asks a question about Snapchat, the answer is usually that because their traffic is encrypted, we have no analytics about them,” Facebook CEO Mark Zuckerberg (who has since rebranded his company as Meta) wrote in a 2016 email to Javier Olivan.

“Given how quickly they’re growing, it seems important to figure out a new way to get reliable analytics about them,” Zuckerberg continued. “Perhaps we need to do panels or write custom software. You should figure out how to do this.”

At the time, Olivan was Facebook’s head of growth, but now he’s Meta’s chief operating officer. He responded to Zuckerberg’s email saying that he would have the team from Onavo—a controversial traffic-analysis app acquired by Facebook in 2013—look into it.

Olivan told the Onavo team that he needed “out of the box thinking” to satisfy Zuckerberg’s request. He “suggested potentially paying users to ‘let us install a really heavy piece of software'” to intercept users’ Snapchat data, a court document shows.

What the Onavo team eventually came up with was a project internally known as “Ghostbusters,” an obvious reference to Snapchat’s logo featuring a white ghost. Later, as the project grew to include other Facebook rivals, including YouTube and Amazon, the project was called the “In-App Action Panel” (IAAP).

The IAAP program’s purpose was to gather granular insights into users’ engagement with rival apps to help Facebook develop products as needed to stay ahead of competitors. For example, two months after Zuckerberg’s 2016 email, Meta launched Stories, a Snapchat copycat feature, on Instagram, which the Motley Fool noted rapidly became a key ad revenue source for Meta.

In an email to Olivan, the Onavo team described the “technical solution” devised to help Zuckerberg figure out how to get reliable analytics about Snapchat users. It worked by “develop[ing] ‘kits’ that can be installed on iOS and Android that intercept traffic for specific sub-domains, allowing us to read what would otherwise be encrypted traffic so we can measure in-app usage,” the Onavo team said.

Olivan was told that these so-called “kits” used a “man-in-the-middle” attack, a technique typically employed by hackers to secretly intercept data passed between two parties. Users were recruited by third parties who distributed the kits “under their own branding” so that users wouldn’t connect the kits to Onavo unless they analyzed the traffic with a specialized tool like Wireshark. TechCrunch reported in 2019 that teens were sometimes paid to install these kits. After that report, Facebook promptly shut down the project.

This “man-in-the-middle” tactic, consumers and advertisers suing Meta have alleged, “was not merely anticompetitive, but criminal,” seemingly violating the Wiretap Act. It was used to snoop on Snapchat starting in 2016, on YouTube from 2017 to 2018, and on Amazon in 2018, relying on creating “fake digital certificates to impersonate trusted Snapchat, YouTube, and Amazon analytics servers to redirect and decrypt secure traffic from those apps for Facebook’s strategic analysis.”

Ars could not reach Snapchat, Google, or Amazon for comment.

Facebook allegedly sought to confuse advertisers

Not everyone at Facebook supported the IAAP program. “The company’s highest-level engineering executives thought the IAAP Program was a legal, technical, and security nightmare,” another court document said.

Pedro Canahuati, then-head of security engineering, warned that incentivizing users to install the kits did not necessarily mean that users understood what they were consenting to.

“I can’t think of a good argument for why this is okay,” Canahuati said. “No security person is ever comfortable with this, no matter what consent we get from the general public. The general public just doesn’t know how this stuff works.”

Mike Schroepfer, then-chief technology officer, argued that Facebook wouldn’t want rivals to employ a similar program analyzing their encrypted user data.

“If we ever found out that someone had figured out a way to break encryption on [WhatsApp] we would be really upset,” Schroepfer said.

While the unsealed emails detailing the project have recently raised eyebrows, Meta’s spokesperson told Ars that “there is nothing new here—this issue was reported on years ago. The plaintiffs’ claims are baseless and completely irrelevant to the case.”

According to Business Insider, the advertisers suing said that Meta never disclosed its use of Onavo “kits” to “intercept rivals’ analytics traffic.” This is seemingly relevant to their case alleging anticompetitive behavior in the social media ads market because Facebook’s conduct, allegedly breaking wiretapping laws, afforded the company an opportunity to raise its ad rates “beyond what it could have charged in a competitive market.”

Since the documents were unsealed, Meta has responded with a court filing that said: “Snapchat’s own witness on advertising confirmed that Snap cannot ‘identify a single ad sale that [it] lost from Meta’s use of user research products,’ does not know whether other competitors collected similar information, and does not know whether any of Meta’s research provided Meta with a competitive advantage.”

This conflicts with testimony from a Snapchat executive, who alleged that the project “hamper[ed] Snap’s ability to sell ads” by causing “advertisers to not have a clear narrative differentiating Snapchat from Facebook and Instagram.” Both internally and externally, “the intelligence Meta gleaned from this project was described” as “devastating to Snapchat’s ads business,” a court filing said.

Apple, Google, and Meta are failing DMA compliance, EU suspects

EU Commissioner for Internal Market Thierry Breton talks to media about non-compliance investigations against Google, Apple, and Meta under the Digital Markets Act (DMA).

Not even three weeks after the European Union’s Digital Markets Act (DMA) took effect, the European Commission (EC) announced Monday that it is already probing three out of six gatekeepers—Apple, Google, and Meta—for suspected non-compliance.

Apple will need to prove that its app store changes and its existing options for users to easily swap out default settings are sufficient to comply with the DMA.

Similarly, Google’s app store rules will be probed, as well as any potentially shady practices unfairly preferencing its own services—like Google Shopping and Hotels—in search results.

Finally, Meta’s “Subscription for No Ads” option—allowing Facebook and Instagram users to opt out of personalized ad targeting for a monthly fee—may not fly under the DMA. Even if Meta follows through on its recent offer to slash these fees by nearly 50 percent, the model could be deemed non-compliant.

“The DMA is very clear: gatekeepers must obtain users’ consent to use their personal data across different services,” the EC’s commissioner for internal market, Thierry Breton, said Monday. “And this consent must be free!”

In total, the EC announced five investigations: two against Apple, two against Google, and one against Meta.

“We suspect that the suggested solutions put forward by the three companies do not fully comply with the DMA,” antitrust chief Margrethe Vestager said, ordering companies to “retain certain documents” viewed as critical to assessing evidence in the probe.

The EC’s investigations are expected to conclude within one year. If tech companies are found non-compliant, they risk fines of up to 10 percent of total worldwide turnover. Any repeat violations could spike fines to 20 percent.

“Moreover, in case of systematic infringements, the Commission may also adopt additional remedies, such as obliging a gatekeeper to sell a business or parts of it or banning the gatekeeper from acquisitions of additional services related to the systemic non-compliance,” the EC’s announcement said.

In addition to probes into Apple, Google, and Meta, the EC will scrutinize Apple’s fee structure for app store alternatives and send retention orders to Amazon and Microsoft. That makes ByteDance the only gatekeeper so far to escape “investigatory steps” as the EU fights to enforce the DMA’s strict standards. (ByteDance continues to contest its gatekeeper status.)

“These are the cases where we already have concrete evidence of possible non-compliance,” Breton said. “And this in less than 20 days of DMA implementation. But our monitoring and investigative work of course doesn’t stop here,” he added. “We may have to open other non-compliance cases soon.”

Google and Apple have both issued statements defending their current plans for DMA compliance.

“To comply with the Digital Markets Act, we have made significant changes to the way our services operate in Europe,” Google’s competition director Oliver Bethell told Ars, promising to “continue to defend our approach in the coming months.”

“We’re confident our plan complies with the DMA, and we’ll continue to constructively engage with the European Commission as they conduct their investigations,” Apple’s spokesperson told Ars. “Teams across Apple have created a wide range of new developer capabilities, features, and tools to comply with the regulation. At the same time, we’ve introduced protections to help reduce new risks to the privacy, quality, and security of our EU users’ experience. Throughout, we’ve demonstrated flexibility and responsiveness to the European Commission and developers, listening and incorporating their feedback.”

A Meta spokesperson told Ars that Meta “designed Subscription for No Ads to address several overlapping regulatory obligations, including the DMA,” promising to comply with the DMA while arguing that “subscriptions as an alternative to advertising are a well-established business model across many industries.”

The EC’s announcement came after all designated gatekeepers were required to submit DMA compliance reports and scheduled public workshops to discuss DMA compliance. Those workshops conclude tomorrow with Microsoft and appear to be partly driving the EC’s decision to probe Apple, Google, and Meta.

“Stakeholders provided feedback on the compliance solutions offered,” Vestager said. “Their feedback tells us that certain compliance measures fail to achieve their objectives and fall short of expectations.”

Apple and Google app stores probed

Under the DMA, “gatekeepers can no longer prevent their business users from informing their users within the app about cheaper options outside the gatekeeper’s ecosystem,” Vestager said. “That is called anti-steering and is now forbidden by law.”

Stakeholders told the EC that Apple’s and Google’s fee structures appear to “go against” the DMA’s “free of charge” requirement, Vestager said, because companies “still charge various recurring fees and still limit steering.”

This feedback pushed the EC to launch its first two probes under the DMA against Apple and Google.

“We will investigate to what extent these fees and limitations defeat the purpose of the anti-steering provision and by that, limit consumer choice,” Vestager said.

These probes aren’t the end of Apple’s potential app store woes in the EU, either. Breton said that the EC has “many questions on Apple’s new business model” for the app store. These include “questions on the process that Apple used for granting and terminating membership of” its developer program, following a scandal where Epic Games’ account was briefly terminated.

“We also have questions on the fee structure and several other aspects of the business model,” Breton said, vowing to “check if they allow for real opportunities for app developers in line with the letter and the spirit of the DMA.”
