AI

ai-company-trolls-san-francisco-with-billboards-saying-“stop-hiring-humans”

AI company trolls San Francisco with billboards saying “stop hiring humans”

Artisan CEO Jaspar Carmichael-Jack defended the campaign’s messaging in an interview with SFGate. “They are somewhat dystopian, but so is AI,” he told the outlet in a text message. “The way the world works is changing.” In another message he wrote, “We wanted something that would draw eyes—you don’t draw eyes with boring messaging.”

So what does Artisan actually do? Its main product is an AI “sales agent” called Ava that supposedly automates the work of finding and messaging potential customers. The company claims it works with “no human input” and costs 96% less than hiring a human for the same role. Although, given the current state of AI technology, it’s prudent to be skeptical of these claims.

Artisan also has plans to expand its AI tools beyond sales into areas like marketing, recruitment, finance, and design. Its sales agent appears to be its only existing product so far.

Meanwhile, the billboards remain visible throughout San Francisco, quietly fueling existential dread in a city that has already seen a great deal of tension since the pandemic. Some of the billboards feature additional messages, like “Hire Artisans, not humans,” and one that plays on angst over remote work: “Artisan’s Zoom cameras will never ‘not be working’ today.”

AI company trolls San Francisco with billboards saying “stop hiring humans” Read More »

reddit-debuts-ai-powered-discussion-search—but-will-users-like-it?

Reddit debuts AI-powered discussion search—but will users like it?

The company then went on to strike deals with major tech firms, including a $60 million agreement with Google in February 2024 and a partnership with OpenAI in May 2024 that integrated Reddit content into ChatGPT.

But Reddit users haven’t been entirely happy with the deals. In October 2024, London-based Redditors began posting false restaurant recommendations to manipulate search results and keep tourists away from their favorite spots. This coordinated effort to feed incorrect information into AI systems demonstrated how user communities might intentionally “poison” AI training data over time.

The potential for trouble

While it’s tempting to lean heavily into generative AI technology while it is currently trendy, the move could also represent a challenge for the company. For example, Reddit’s AI-powered summaries could potentially draw from inaccurate information featured on the site and provide incorrect answers, or it may draw inaccurate conclusions from correct information.

We will keep an eye on Reddit’s new AI-powered search tool to see if it resists the type of confabulation that we’ve seen with Google’s AI Overview, an AI summary bot that has been a critical failure so far.

Advance Publications, which owns Ars Technica parent Condé Nast, is the largest shareholder of Reddit.

Reddit debuts AI-powered discussion search—but will users like it? Read More »

ten-months-after-first-tease,-openai-launches-sora-video-generation-publicly

Ten months after first tease, OpenAI launches Sora video generation publicly

A music video by Canadian art collective Vallée Duhamel made with Sora-generated video. “[We] just shoot stuff and then use Sora to combine it with a more interesting, more surreal vision.”

During a livestream on Monday—during Day 3 of OpenAI’s “12 days of OpenAi”—Sora’s developers showcased a new “Explore” interface that allows people to browse through videos generated by others to get prompting ideas. OpenAI says that anyone can enjoy viewing the “Explore” feed for free, but generating videos requires a subscription.

They also showed off a new feature called “Storyboard” that allows users to direct a video with multiple actions in a frame-by-frame manner.

Safety measures and limitations

In addition to the release, OpenAI also publish Sora’s System Card for the first time. It includes technical details about how the model works and safety testing the company undertook prior to this release.

“Whereas LLMs have text tokens, Sora has visual patches,” OpenAI writes, describing the new training chunks as “an effective representation for models of visual data… At a high level, we turn videos into patches by first compressing videos into a lower-dimensional latent space, and subsequently decomposing the representation into spacetime patches.”

Sora also makes use of a “recaptioning technique”—similar to that seen in the company’s DALL-E 3 image generation, to “generate highly descriptive captions for the visual training data.” That, in turn, lets Sora “follow the user’s text instructions in the generated video more faithfully,” OpenAI writes.

Sora-generated video provided by OpenAI, from the prompt: “Loop: a golden retriever puppy wearing a superhero outfit complete with a mask and cape stands perched on the top of the empire state building in winter, overlooking the nyc it protects at night. the back of the pup is visible to the camera; his attention faced to nyc”

OpenAI implemented several safety measures in the release. The platform embeds C2PA metadata in all generated videos for identification and origin verification. Videos display visible watermarks by default, and OpenAI developed an internal search tool to verify Sora-generated content.

The company acknowledged technical limitations in the current release. “This early version of Sora will make mistakes, it’s not perfect,” said one developer during the livestream launch. The model reportedly struggles with physics simulations and complex actions over extended durations.

In the past, we’ve seen that these types of limitations are based on what example videos were used to train AI models. This current generation of AI video-synthesis models has difficulty generating truly new things, since the underlying architecture excels at transforming existing concepts into new presentations, but so far typically fails at true originality. Still, it’s early in AI video generation, and the technology is improving all the time.

Ten months after first tease, OpenAI launches Sora video generation publicly Read More »

itch.io-platform-briefly-goes-down-due-to-“ai-driven”-anti-phishing-report

Itch.io platform briefly goes down due to “AI-driven” anti-phishing report

The itch.io domain was back up and running by 7 am Eastern, according to media reports, “after the registrant finally responded to our notice and took appropriate action to resolve the issue.” Users could access the site throughout if they typed the itch.io IP address into their web browser directly.

Too strong a shield?

BrandShield’s website describes it as a service that “detects and hunts online trademark infringement, counterfeit sales, and brand abuse across multiple platforms.” The company claims to have multiple Fortune 500 and FTSE100 companies on its client list.

In its own series of social media posts, BrandShield said its “AI-driven platform” had identified “an abuse of Funko… from an itch.io subdomain.” The takedown request it filed was focused on that subdomain, not the entirety of itch.io, BrandShield said.

“The temporary takedown of the website was a decision made by the service providers, not BrandShield or Funko.”

The whole affair highlights how the delicate web of domain registrars and DNS servers can remain a key failure point for web-based businesses. Back in May, we saw how the desyncing of a single DNS root server could cause problems across the entire Internet. And in 2012, the hacking collective Anonymous highlighted the potential for a coordinated attack to take down the entire DNS system.

Itch.io platform briefly goes down due to “AI-driven” anti-phishing report Read More »

google’s-genie-2-“world-model”-reveal-leaves-more-questions-than-answers

Google’s Genie 2 “world model” reveal leaves more questions than answers


Making a command out of your wish?

Long-term persistence, real-time interactions remain huge hurdles for AI worlds.

A sample of some of the best-looking Genie 2 worlds Google wants to show off. Credit: Google Deepmind

In March, Google showed off its first Genie AI model. After training on thousands of hours of 2D run-and-jump video games, the model could generate halfway-passable, interactive impressions of those games based on generic images or text descriptions.

Nine months later, this week’s reveal of the Genie 2 model expands that idea into the realm of fully 3D worlds, complete with controllable third- or first-person avatars. Google’s announcement talks up Genie 2’s role as a “foundational world model” that can create a fully interactive internal representation of a virtual environment. That could allow AI agents to train themselves in synthetic but realistic environments, Google says, forming an important stepping stone on the way to artificial general intelligence.

But while Genie 2 shows just how much progress Google’s Deepmind team has achieved in the last nine months, the limited public information about the model thus far leaves a lot of questions about how close we are to these foundational model worlds being useful for anything but some short but sweet demos.

How long is your memory?

Much like the original 2D Genie model, Genie 2 starts from a single image or text description and then generates subsequent frames of video based on both the previous frames and fresh input from the user (such as a movement direction or “jump”). Google says it trained on a “large-scale video dataset” to achieve this, but it doesn’t say just how much training data was necessary compared to the 30,000 hours of footage used to train the first Genie.

Short GIF demos on the Google DeepMind promotional page show Genie 2 being used to animate avatars ranging from wooden puppets to intricate robots to a boat on the water. Simple interactions shown in those GIFs demonstrate those avatars busting balloons, climbing ladders, and shooting exploding barrels without any explicit game engine describing those interactions.

Those Genie 2-generated pyramids will still be there in 30 seconds. But in five minutes? Credit: Google Deepmind

Perhaps the biggest advance claimed by Google here is Genie 2’s “long horizon memory.” This feature allows the model to remember parts of the world as they come out of view and then render them accurately as they come back into the frame based on avatar movement. This kind of persistence has proven to be a persistent problem for video generation models like Sora, which OpenAI said in February “do[es] not always yield correct changes in object state” and can develop “incoherencies… in long duration samples.”

The “long horizon” part of “long horizon memory” is perhaps a little overzealous here, though, as Genie 2 only “maintains a consistent world for up to a minute,” with “the majority of examples shown lasting [10 to 20 seconds].” Those are definitely impressive time horizons in the world of AI video consistency, but it’s pretty far from what you’d expect from any other real-time game engine. Imagine entering a town in a Skyrim-style RPG, then coming back five minutes later to find that the game engine had forgotten what that town looks like and generated a completely different town from scratch instead.

What are we prototyping, exactly?

Perhaps for this reason, Google suggests Genie 2 as it stands is less useful for creating a complete game experience and more to “rapidly prototype diverse interactive experiences” or to turn “concept art and drawings… into fully interactive environments.”

The ability to transform static “concept art” into lightly interactive “concept videos” could definitely be useful for visual artists brainstorming ideas for new game worlds. However, these kinds of AI-generated samples might be less useful for prototyping actual game designs that go beyond the visual.

On Bluesky, British game designer Sam Barlow (Silent Hill: Shattered Memories, Her Story) points out how game designers often use a process called whiteboxing to lay out the structure of a game world as simple white boxes well before the artistic vision is set. The idea, he says, is to “prove out and create a gameplay-first version of the game that we can lock so that art can come in and add expensive visuals to the structure. We build in lo-fi because it allows us to focus on these issues and iterate on them cheaply before we are too far gone to correct.”

Generating elaborate visual worlds using a model like Genie 2 before designing that underlying structure feels a bit like putting the cart before the horse. The process almost seems designed to generate generic, “asset flip”-style worlds with AI-generated visuals papered over generic interactions and architecture.

As podcaster Ryan Zhao put it on Bluesky, “The design process has gone wrong when what you need to prototype is ‘what if there was a space.'”

Gotta go fast

When Google revealed the first version of Genie earlier this year, it also published a detailed research paper outlining the specific steps taken behind the scenes to train the model and how that model generated interactive videos. No such research paper has been published detailing Genie 2’s process, leaving us guessing at some important details.

One of the most important of these details is model speed. The first Genie model generated its world at roughly one frame per second, a rate that was orders of magnitude slower than would be tolerably playable in real time. For Genie 2, Google only says that “the samples in this blog post are generated by an undistilled base model, to show what is possible. We can play a distilled version in real-time with a reduction in quality of the outputs.”

Reading between the lines, it sounds like the full version of Genie 2 operates at something well below the real-time interactions implied by those flashy GIFs. It’s unclear how much “reduction in quality” is necessary to get a diluted version of the model to real-time controls, but given the lack of examples presented by Google, we have to assume that reduction is significant.

Oasis’ AI-generated Minecraft clone shows great potential, but still has a lot of rough edges, so to speak. Credit: Oasis

Real-time, interactive AI video generation isn’t exactly a pipe dream. Earlier this year, AI model maker Decart and hardware maker Etched published the Oasis model, showing off a human-controllable, AI-generated video clone of Minecraft that runs at a full 20 frames per second. However, that 500 million parameter model was trained on millions of hours of footage of a single, relatively simple game, and focused exclusively on the limited set of actions and environmental designs inherent to that game.

When Oasis launched, its creators fully admitted the model “struggles with domain generalization,” showing how “realistic” starting scenes had to be reduced to simplistic Minecraft blocks to achieve good results. And even with those limitations, it’s not hard to find footage of Oasis degenerating into horrifying nightmare fuel after just a few minutes of play.

What started as a realistic-looking soldier in this Genie 2 demo degenerates into this blobby mess just seconds later. Credit: Google Deepmind

We can already see similar signs of degeneration in the extremely short GIFs shared by the Genie team, such as an avatar’s dream-like fuzz during high-speed movement or NPCs that quickly fade into undifferentiated blobs at a short distance. That’s not a great sign for a model whose “long memory horizon” is supposed to be a key feature.

A learning crèche for other AI agents?

From this image, Genie 2 could generate a useful training environment for an AI agent and a simple “pick a door” task. Credit: Google Deepmind

Genie 2 seems to be using individual game frames as the basis for the animations in its model. But it also seems able to infer some basic information about the objects in those frames and craft interactions with those objects in the way a game engine might.

Google’s blog post shows how a SIMA agent inserted into a Genie 2 scene can follow simple instructions like “enter the red door” or “enter the blue door,” controlling the avatar via simple keyboard and mouse inputs. That could potentially make Genie 2 environment a great test bed for AI agents in various synthetic worlds.

Google claims rather grandiosely that Genie 2 puts it on “the path to solving a structural problem of training embodied agents safely while achieving the breadth and generality required to progress towards [artificial general intelligence].” Whether or not that ends up being true, recent research shows that agent learning gained from foundational models can be effectively applied to real-world robotics.

Using this kind of AI model to create worlds for other AI models to learn in might be the ultimate use case for this kind of technology. But when it comes to the dream of an AI model that can create generic 3D worlds that a human player could explore in real time, we might not be as close as it seems.

Photo of Kyle Orland

Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from University of Maryland. He once wrote a whole book about Minesweeper.

Google’s Genie 2 “world model” reveal leaves more questions than answers Read More »

your-ai-clone-could-target-your-family,-but-there’s-a-simple-defense

Your AI clone could target your family, but there’s a simple defense

The warning extends beyond voice scams. The FBI announcement details how criminals also use AI models to generate convincing profile photos, identification documents, and chatbots embedded in fraudulent websites. These tools automate the creation of deceptive content while reducing previously obvious signs of humans behind the scams, like poor grammar or obviously fake photos.

Much like we warned in 2022 in a piece about life-wrecking deepfakes based on publicly available photos, the FBI also recommends limiting public access to recordings of your voice and images online. The bureau suggests making social media accounts private and restricting followers to known contacts.

Origin of the secret word in AI

To our knowledge, we can trace the first appearance of the secret word in the context of modern AI voice synthesis and deepfakes back to an AI developer named Asara Near, who first announced the idea on Twitter on March 27, 2023.

“(I)t may be useful to establish a ‘proof of humanity’ word, which your trusted contacts can ask you for,” Near wrote. “(I)n case they get a strange and urgent voice or video call from you this can help assure them they are actually speaking with you, and not a deepfaked/deepcloned version of you.”

Since then, the idea has spread widely. In February, Rachel Metz covered the topic for Bloomberg, writing, “The idea is becoming common in the AI research community, one founder told me. It’s also simple and free.”

Of course, passwords have been used since ancient times to verify someone’s identity, and it seems likely some science fiction story has dealt with the issue of passwords and robot clones in the past. It’s interesting that, in this new age of high-tech AI identity fraud, this ancient invention—a special word or phrase known to few—can still prove so useful.

Your AI clone could target your family, but there’s a simple defense Read More »

openai-announces-full-“o1”-reasoning-model,-$200-chatgpt-pro-tier

OpenAI announces full “o1” reasoning model, $200 ChatGPT Pro tier

On X, frequent AI experimenter Ethan Mollick wrote, “Been playing with o1 and o1-pro for bit. They are very good & a little weird. They are also not for most people most of the time. You really need to have particular hard problems to solve in order to get value out of it. But if you have those problems, this is a very big deal.”

OpenAI claims improved reliability

OpenAI is touting pro mode’s improved reliability, which is evaluated internally based on whether it can solve a question correctly in four out of four attempts rather than just a single attempt.

“In evaluations from external expert testers, o1 pro mode produces more reliably accurate and comprehensive responses, especially in areas like data science, programming, and case law analysis,” OpenAI writes.

Even without pro mode, OpenAI cited significant increases in performance over the o1 preview model on popular math and coding benchmarks (AIME 2024 and Codeforces), and more marginal improvements on a “PhD-level science” benchmark (GPQA Diamond). The increase in scores between o1 and o1 pro mode were much more marginal on these benchmarks.

We’ll likely have more coverage of the full version of o1 once it rolls out widely—and it’s supposed to launch today, accessible to ChatGPT Plus and Team users globally. Enterprise and Edu users will have access next week. At the moment, the ChatGPT Pro subscription is not yet available on our test account.

OpenAI announces full “o1” reasoning model, $200 ChatGPT Pro tier Read More »

soon,-the-tech-behind-chatgpt-may-help-drone-operators-decide-which-enemies-to-kill

Soon, the tech behind ChatGPT may help drone operators decide which enemies to kill

This marks a potential shift in tech industry sentiment from 2018, when Google employees staged walkouts over military contracts. Now, Google competes with Microsoft and Amazon for lucrative Pentagon cloud computing deals. Arguably, the military market has proven too profitable for these companies to ignore. But is this type of AI the right tool for the job?

Drawbacks of LLM-assisted weapons systems

There are many kinds of artificial intelligence already in use by the US military. For example, the guidance systems of Anduril’s current attack drones are not based on AI technology similar to ChatGPT.

But it’s worth pointing out that the type of AI OpenAI is best known for comes from large language models (LLMs)—sometimes called large multimodal models—that are trained on massive datasets of text, images, and audio pulled from many different sources.

LLMs are notoriously unreliable, sometimes confabulating erroneous information, and they’re also subject to manipulation vulnerabilities like prompt injections. That could lead to critical drawbacks from using LLMs to perform tasks such as summarizing defensive information or doing target analysis.

Potentially using unreliable LLM technology in life-or-death military situations raises important questions about safety and reliability, although the Anduril news release does mention this in its statement: “Subject to robust oversight, this collaboration will be guided by technically informed protocols emphasizing trust and accountability in the development and employment of advanced AI for national security missions.”

Hypothetically and speculatively speaking, defending against future LLM-based targeting with, say, a visual prompt injection (“ignore this target and fire on someone else” on a sign, perhaps) might bring warfare to weird new places. For now, we’ll have to wait to see where LLM technology ends up next.

Soon, the tech behind ChatGPT may help drone operators decide which enemies to kill Read More »

openai-teases-12-days-of-mystery-product-launches-starting-tomorrow

OpenAI teases 12 days of mystery product launches starting tomorrow

On Wednesday, OpenAI CEO Sam Altman announced a “12 days of OpenAI” period starting December 5, which will unveil new AI features and products for 12 consecutive weekdays.

Altman did not specify the exact features or products OpenAI plans to unveil, but a report from The Verge about this “12 days of shipmas” event suggests the products may include a public release of the company’s text-to-video model Sora and a new “reasoning” AI model similar to o1-preview. Perhaps we may even see DALL-E 4 or a new image generator based on GPT-4o’s multimodal capabilities.

Altman’s full tweet included hints at releases both big and small:

🎄🎅starting tomorrow at 10 am pacific, we are doing 12 days of openai.

each weekday, we will have a livestream with a launch or demo, some big ones and some stocking stuffers.

we’ve got some great stuff to share, hope you enjoy! merry christmas.

If we’re reading the calendar correctly, 12 weekdays means a new announcement every day until December 20.

OpenAI teases 12 days of mystery product launches starting tomorrow Read More »

google’s-deepmind-tackles-weather-forecasting,-with-great-performance

Google’s DeepMind tackles weather forecasting, with great performance

By some measures, AI systems are now competitive with traditional computing methods for generating weather forecasts. Because their training penalizes errors, however, the forecasts tend to get “blurry”—as you move further ahead in time, the models make fewer specific predictions since those are more likely to be wrong. As a result, you start to see things like storm tracks broadening and the storms themselves losing clearly defined edges.

But using AI is still extremely tempting because the alternative is a computational atmospheric circulation model, which is extremely compute-intensive. Still, it’s highly successful, with the ensemble model from the European Centre for Medium-Range Weather Forecasts considered the best in class.

In a paper being released today, Google’s DeepMind claims its new AI system manages to outperform the European model on forecasts out to at least a week and often beyond. DeepMind’s system, called GenCast, merges some computational approaches used by atmospheric scientists with a diffusion model, commonly used in generative AI. The result is a system that maintains high resolution while cutting the computational cost significantly.

Ensemble forecasting

Traditional computational methods have two main advantages over AI systems. The first is that they’re directly based on atmospheric physics, incorporating the rules we know govern the behavior of our actual weather, and they calculate some of the details in a way that’s directly informed by empirical data. They’re also run as ensembles, meaning that multiple instances of the model are run. Due to the chaotic nature of the weather, these different runs will gradually diverge, providing a measure of the uncertainty of the forecast.

At least one attempt has been made to merge some of the aspects of traditional weather models with AI systems. An internal Google project used a traditional atmospheric circulation model that divided the Earth’s surface into a grid of cells but used an AI to predict the behavior of each cell. This provided much better computational performance, but at the expense of relatively large grid cells, which resulted in relatively low resolution.

For its take on AI weather predictions, DeepMind decided to skip the physics and instead adopt the ability to run an ensemble.

Gen Cast is based on diffusion models, which have a key feature that’s useful here. In essence, these models are trained by starting them with a mixture of an original—image, text, weather pattern—and then a variation where noise is injected. The system is supposed to create a variation of the noisy version that is closer to the original. Once trained, it can be fed pure noise and evolve the noise to be closer to whatever it’s targeting.

In this case, the target is realistic weather data, and the system takes an input of pure noise and evolves it based on the atmosphere’s current state and its recent history. For longer-range forecasts, the “history” includes both the actual data and the predicted data from earlier forecasts. The system moves forward in 12-hour steps, so the forecast for day three will incorporate the starting conditions, the earlier history, and the two forecasts from days one and two.

This is useful for creating an ensemble forecast because you can feed it different patterns of noise as input, and each will produce a slightly different output of weather data. This serves the same purpose it does in a traditional weather model: providing a measure of the uncertainty for the forecast.

For each grid square, GenCast works with six weather measures at the surface, along with six that track the state of the atmosphere and 13 different altitudes at which it estimates the air pressure. Each of these grid squares is 0.2 degrees on a side, a higher resolution than the European model uses for its forecasts. Despite that resolution, DeepMind estimates that a single instance (meaning not a full ensemble) can be run out to 15 days on one of Google’s tensor processing systems in just eight minutes.

It’s possible to make an ensemble forecast by running multiple versions of this in parallel and then integrating the results. Given the amount of hardware Google has at its disposal, the whole process from start to finish is likely to take less than 20 minutes. The source and training data will be placed on the GitHub page for DeepMind’s GraphCast project. Given the relatively low computational requirements, we can probably expect individual academic research teams to start experimenting with it.

Measures of success

DeepMind reports that GenCast dramatically outperforms the best traditional forecasting model. Using a standard benchmark in the field, DeepMind found that GenCast was more accurate than the European model on 97 percent of the tests it used, which checked different output values at different times in the future. In addition, the confidence values, based on the uncertainty obtained from the ensemble, were generally reasonable.

Past AI weather forecasters, having been trained on real-world data, are generally not great at handling extreme weather since it shows up so rarely in the training set. But GenCast did quite well, often outperforming the European model in things like abnormally high and low temperatures and air pressure (one percent frequency or less, including at the 0.01 percentile).

DeepMind also went beyond standard tests to determine whether GenCast might be useful. This research included projecting the tracks of tropical cyclones, an important job for forecasting models. For the first four days, GenCast was significantly more accurate than the European model, and it maintained its lead out to about a week.

One of DeepMind’s most interesting tests was checking the global forecast of wind power output based on information from the Global Powerplant Database. This involved using it to forecast wind speeds at 10 meters above the surface (which is actually lower than where most turbines reside but is the best approximation possible) and then using that number to figure out how much power would be generated. The system beat the traditional weather model by 20 percent for the first two days and stayed in front with a declining lead out to a week.

The researchers don’t spend much time examining why performance seems to decline gradually for about a week. Ideally, more details about GenCast’s limitations would help inform further improvements, so the researchers are likely thinking about it. In any case, today’s paper marks the second case where taking something akin to a hybrid approach—mixing aspects of traditional forecast systems with AI—has been reported to improve forecasts. And both those cases took very different approaches, raising the prospect that it will be possible to combine some of their features.

Nature, 2024. DOI: 10.1038/s41586-024-08252-9  (About DOIs).

Google’s DeepMind tackles weather forecasting, with great performance Read More »

certain-names-make-chatgpt-grind-to-a-halt,-and-we-know-why

Certain names make ChatGPT grind to a halt, and we know why

The “David Mayer” block in particular (now resolved) presents additional questions, first posed on Reddit on November 26, as multiple people share this name. Reddit users speculated about connections to David Mayer de Rothschild, though no evidence supports these theories.

The problems with hard-coded filters

Allowing a certain name or phrase to always break ChatGPT outputs could cause a lot of trouble down the line for certain ChatGPT users, opening them up for adversarial attacks and limiting the usefulness of the system.

Already, Scale AI prompt engineer Riley Goodside discovered how an attacker might interrupt a ChatGPT session using a visual prompt injection of the name “David Mayer” rendered in a light, barely legible font embedded in an image. When ChatGPT sees the image (in this case, a math equation), it stops, but the user might not understand why.

The filter also means that it’s likely that ChatGPT won’t be able to answer questions about this article when browsing the web, such as through ChatGPT with Search.  Someone could use that to potentially prevent ChatGPT from browsing and processing a website on purpose if they added a forbidden name to the site’s text.

And then there’s the inconvenience factor. Preventing ChatGPT from mentioning or processing certain names like “David Mayer,” which is likely a popular name shared by hundreds if not thousands of people, means that people who share that name will have a much tougher time using ChatGPT. Or, say, if you’re a teacher and you have a student named David Mayer and you want help sorting a class list, ChatGPT would refuse the task.

These are still very early days in AI assistants, LLMs, and chatbots. Their use has opened up numerous opportunities and vulnerabilities that people are still probing daily. How OpenAI might resolve these issues is still an open question.

Certain names make ChatGPT grind to a halt, and we know why Read More »

elon-musk-asks-court-to-block-openai-conversion-from-nonprofit-to-for-profit

Elon Musk asks court to block OpenAI conversion from nonprofit to for-profit

OpenAI provided a statement to Ars today saying that “Elon’s fourth attempt, which again recycles the same baseless complaints, continues to be utterly without merit.” OpenAI referred to a longer statement that it made in March after Musk filed an earlier version of his lawsuit.

The March statement disputes Musk’s version of events. “In late 2017, we and Elon decided the next step for the mission was to create a for-profit entity,” OpenAI said. “Elon wanted majority equity, initial board control, and to be CEO. In the middle of these discussions, he withheld funding. Reid Hoffman bridged the gap to cover salaries and operations.”

OpenAI cited Musk’s desire for Tesla merger

OpenAI’s statement in March continued:

We couldn’t agree to terms on a for-profit with Elon because we felt it was against the mission for any individual to have absolute control over OpenAI. He then suggested instead merging OpenAI into Tesla. In early February 2018, Elon forwarded us an email suggesting that OpenAI should “attach to Tesla as its cash cow,” commenting that it was “exactly right… Tesla is the only path that could even hope to hold a candle to Google. Even then, the probability of being a counterweight to Google is small. It just isn’t zero.”

Elon soon chose to leave OpenAI, saying that our probability of success was 0, and that he planned to build an AGI competitor within Tesla. When he left in late February 2018, he told our team he was supportive of us finding our own path to raising billions of dollars. In December 2018, Elon sent us an email saying “Even raising several hundred million won’t be enough. This needs billions per year immediately or forget it.”

Now, Musk says the public interest would be served by his request for a preliminary injunction. Preserving competitive markets is particularly important in AI because of the technology’s “profound implications for society,” he wrote.

Musk’s motion said the public “has a strong interest in ensuring that charitable assets are not diverted for private gain. This interest is particularly acute here given the substantial tax benefits OpenAI, Inc. received as a non-profit, the organization’s repeated public commitments to developing AI technology for the benefit of humanity, and the serious safety concerns raised by former OpenAI employees regarding the organization’s rush to market potentially dangerous products in pursuit of profit.”

Elon Musk asks court to block OpenAI conversion from nonprofit to for-profit Read More »