
OpenAI updates ChatGPT-4 model with potential fix for AI “laziness” problem

Break’s over —

Also, new GPT-3.5 Turbo model, lower API prices, and other model updates.


On Thursday, OpenAI announced updates to the AI models that power its ChatGPT assistant. Amid less noteworthy updates, OpenAI tucked in a mention of a potential fix to a widely reported “laziness” problem seen in GPT-4 Turbo since its release in November. The company also announced a new GPT-3.5 Turbo model (with lower pricing), a new embedding model, an updated moderation model, and a new way to manage API usage.

“Today, we are releasing an updated GPT-4 Turbo preview model, gpt-4-0125-preview. This model completes tasks like code generation more thoroughly than the previous preview model and is intended to reduce cases of ‘laziness’ where the model doesn’t complete a task,” writes OpenAI in its blog post.
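For developers, picking up the fix is a matter of requesting the new model by name. Here’s a minimal sketch of what that might look like, assuming the openai Python SDK (v1.x) and an OPENAI_API_KEY environment variable; the prompt is illustrative:

```python
# Minimal sketch: pin the updated GPT-4 Turbo preview model by name.
# Assumes the openai Python SDK (v1.x) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-0125-preview",  # the updated preview model named in the post
    messages=[
        {"role": "user", "content": "Write a complete, runnable merge sort in Python."}
    ],
)
print(response.choices[0].message.content)
```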

Since the launch of GPT-4 Turbo, a large number of ChatGPT users have reported that the ChatGPT-4 version of its AI assistant has been declining to do tasks (especially coding tasks) with the same exhaustive depth as it did in earlier versions of GPT-4. We’ve seen this behavior ourselves while experimenting with ChatGPT over time.

OpenAI has never offered an official explanation for this change in behavior, but OpenAI employees have previously acknowledged on social media that the problem is real, and the ChatGPT X account wrote in December, “We’ve heard all your feedback about GPT4 getting lazier! we haven’t updated the model since Nov 11th, and this certainly isn’t intentional. model behavior can be unpredictable, and we’re looking into fixing it.”

We reached out to OpenAI asking if it could provide an official explanation for the laziness issue but did not receive a response by press time.

New GPT-3.5 Turbo, other updates

Elsewhere in OpenAI’s blog update, the company announced a new version of GPT-3.5 Turbo (gpt-3.5-turbo-0125), which it says will offer “various improvements including higher accuracy at responding in requested formats and a fix for a bug which caused a text encoding issue for non-English language function calls.”

And the cost of GPT-3.5 Turbo through OpenAI’s API will decrease for the third time in the past year “to help our customers scale.” New input token prices are 50 percent less, at $0.0005 per 1,000 input tokens, and output prices are 25 percent less, at $0.0015 per 1,000 output tokens.
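To put the new prices in concrete terms, here’s a back-of-the-envelope cost calculation using the figures above. Actual billed token counts depend on the tokenizer and the length of the model’s reply, so treat it as a sketch:

```python
# Cost arithmetic for GPT-3.5 Turbo at the new rates quoted above.
INPUT_PRICE_PER_1K = 0.0005   # USD per 1,000 input tokens (50 percent less)
OUTPUT_PRICE_PER_1K = 0.0015  # USD per 1,000 output tokens (25 percent less)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of a single API call at the new rates."""
    return (input_tokens / 1_000) * INPUT_PRICE_PER_1K + (
        output_tokens / 1_000
    ) * OUTPUT_PRICE_PER_1K

# Example: a 2,000-token prompt that produces a 500-token reply.
print(f"${request_cost(2_000, 500):.6f}")  # -> $0.001750
```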

Lower token prices for GPT-3.5 Turbo will make operating third-party bots significantly less expensive, but the GPT-3.5 model is generally more likely to confabulate than GPT-4 Turbo. So we might see more scenarios like Quora’s bot telling people that eggs can melt (although the instance used a now-deprecated GPT-3 model called text-davinci-003). If GPT-4 Turbo API prices drop over time, some of those hallucination issues with third parties might eventually go away.

OpenAI also announced new embedding models, text-embedding-3-small and text-embedding-3-large, which convert content into numerical sequences, aiding in machine learning tasks like clustering and retrieval. And an updated moderation model, text-moderation-007, is part of the company’s API that “allows developers to identify potentially harmful text,” according to OpenAI.
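As a rough illustration of how embeddings feed clustering and retrieval, here’s a hedged sketch assuming the openai Python SDK (v1.x) and numpy; only the model name comes from OpenAI’s announcement, and the sample strings are made up:

```python
# Sketch: embed two strings and compare them with cosine similarity,
# the usual primitive behind clustering and retrieval.
# Assumes the openai Python SDK (v1.x), numpy, and OPENAI_API_KEY.
from openai import OpenAI
import numpy as np

client = OpenAI()

def embed(text: str, model: str = "text-embedding-3-small") -> np.ndarray:
    response = client.embeddings.create(model=model, input=text)
    return np.array(response.data[0].embedding)

a = embed("GPT-4 Turbo laziness complaints")
b = embed("the model refuses to finish coding tasks")
cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"cosine similarity: {cosine:.3f}")  # closer to 1.0 means more similar
```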

Finally, OpenAI is rolling out improvements to its developer platform, introducing new tools for managing API keys and a new dashboard for tracking API usage. Developers can now assign permissions to API keys from the API keys page, helping to clamp down on misuse of API keys (if they get into the wrong hands) that can potentially cost developers lots of money. The API dashboard allows devs to “view usage on a per feature, team, product, or project level, simply by having separate API keys for each.”

As the media world seemingly swirls around the company with controversies and think pieces about the implications of its tech, releases like these show that the dev teams at OpenAI are still rolling along as usual with updates at a fairly regular pace. Despite the company almost completely falling apart late last year, it seems that, under the hood, it’s business as usual for OpenAI.


WordPad out; 80Gbps USB support and other Win 11 features in testing this month

Can’t stop won’t stop —

Microsoft’s next batch of Windows 11 feature updates is taking shape.


Windows 11’s big feature update in September included a long list of minor changes, plus the Copilot AI assistant; that update was followed by Windows 11 23H2 in late October, which reset the operating system’s timeline for technical support and security updates but didn’t add much else in and of itself. But Windows development never stops these days, and this month’s Insider Preview builds have already shown us a few things that could end up in the stable version of the operating system in the next couple of months.

One major addition, which rolled out to Dev Channel builds on January 11 and Beta Channel builds today, is support for 80Gbps USB4 ports. These speeds are part of the USB4 Version 2.0 spec—named with the USB-IF’s typical flair for clarity and consistency—that was published in 2022. Full 80Gbps speeds are still rare and will be for the foreseeable future, but Microsoft says they’ll be included in the Razer Blade 18 and a handful of other PCs with Intel’s 14th-generation HX-series laptop processors. We’d expect the new speeds to proliferate slowly, and mostly in high-end systems, over the next few months and years.

Another addition to that January 11 Dev Channel build is a change in how the Copilot generative AI assistant works. Normally, Copilot is launched by the user manually, either by clicking the icon on the taskbar, hitting the Win+C key combo, or (in some new PCs) by using the dedicated Copilot button on the keyboard. In recent Dev Channel builds, the Copilot window will open automatically on certain PCs as soon as you log into Windows, becoming part of your default desktop unless you turn it off in Settings.

The Copilot panel will only open by default on screens that meet minimum size and resolution requirements, which Windows already detects and takes into account when setting your PC’s default zoom and showing available Snap Layouts, among other things. Microsoft says it’s testing the feature on screens that are 27 inches or larger with 1,920 or more horizontal pixels (for most screens, this means a minimum resolution of 1080p). For PCs without Copilot, including those that haven’t been signed into a Microsoft account, the feature will continue to be absent.

The “richer weather experience on the Lock screen,” seen in the bottom-center of this screenshot. (Credit: Microsoft)

Other additions to the Dev Channel builds this month include easy Snipping Tool editing for Android screenshots from phones that have been paired to your PC, custom user-created voice commands, the ability to share URLs directly to services like WhatsApp and Gmail from the Windows share window, a new Weather widget for the Windows lock screen, and app install notifications from the Microsoft store.

Microsoft hasn’t publicized any of the changes it has made to its Canary channel builds since January 4—this is typical since it changes the fastest, and the tested features are the most likely to be removed or significantly tweaked before being released to the public. Most of the significant additions from that announcement have since made it out to the other channels, but there are a couple of things worth noting. First, there’s a new Energy Saver taskbar icon for desktop PCs without batteries, making it easier to tell when the feature is on without creating confusion. And the venerable WordPad app, originally marked for deletion in September, has also been removed from these builds and can’t be reinstalled.

Microsoft doesn’t publish Windows feature updates on an exact cadence beyond its commitment to deliver one with a new version number once per year in the fall. Last year’s first major batch of Windows 11 additions rolled out at the end of February, so a late winter or early spring launch window for the next batch of features could make sense.


Bing Search shows few, if any, signs of market share increase from AI features

high hopes —

Bing’s US and worldwide market share is about the same as it has been for years.


Not quite one year ago, Microsoft announced a “multi-year, multi-billion dollar investment” in OpenAI, a company that had made waves in 2022 with its ChatGPT chatbot and DALL-E image creator. The next month, Microsoft announced that it was integrating a generative AI chatbot into its Bing search engine and Edge browser, and similar generative AI features were announced for Windows, for the apps formerly known as Microsoft Office, for Microsoft Teams, and for other products.

Adding AI features to Bing was meant to give it an edge over Google, and reports indicated that Google was worried enough about it to accelerate its own internal generative AI efforts. Microsoft announced in March 2023 that Bing surpassed the 100 million monthly active users mark based on interest in Bing Chat and its ilk; by Microsoft’s estimates, each percentage of Google’s search market share that Bing could siphon away was worth as much as $2 billion to Microsoft.

A year later, it looks like Microsoft’s AI efforts may have helped Bing on the margins, but they haven’t meaningfully eroded Google’s search market share, according to Bloomberg. Per Bloomberg’s analysis of data from Sensor Tower, Bing usage had been down around 33 percent year over year just before the AI-powered features were added, but those numbers had rebounded by the middle of 2023.

Microsoft hasn’t given an official update on Bing’s monthly active users in quite a while—we’ve asked the company for an update and will share it if we get one—though Microsoft Chief Marketing Officer Yusuf Mehdi told Bloomberg that “millions and millions of people” were still using the new AI features.

StatCounter data mostly tells a similar story. According to its data, Google’s worldwide market share is currently in the low 90s, where it has been for virtually the entire 15-year period for which StatCounter offers data. Bing’s worldwide market share over the same period has been remarkably stable: it was about 3.5 percent in the summer of 2009, when what had been known as Live Search was renamed Bing, and as of December 2023, it was still stuck at around 3.4 percent.

Recent US data is slightly more flattering for Microsoft: Bing’s usage rose from 6.7 percent in December 2022 to 7.7 percent in December 2023. But that doesn’t necessarily suggest any kind of AI-fueled influx of new Bing search users—usage remained in the mid-to-high 6 percent range through most of 2023 before ticking up right at the end of the year—and Bing’s US usage has floated in that same 6–7 percent zone for most of the last decade.

It even seems like Microsoft is making moves to distance its AI efforts from Bing a bit. What began as “Bing Chat” or “the new Bing” is now known as Windows Copilot—both inside Windows 11 and elsewhere. Earlier this week, the Bing Image Creator became “Image Creator from Designer.” Both products still feature Bing branding prominently—the Copilot screen in Windows 11 still says “with Bing” at the top of it, and the Image Creator tool is still hosted on the Bing.com domain. But if these new AI features aren’t driving Bing’s market share up, then it makes sense for Microsoft to create room for them to stand on their own.

That’s not to say Google’s search dominance is assured. Leipzig University researchers published a study earlier this week (PDF) suggesting Google, Bing, and the Bing-powered DuckDuckGo had seen “an overall downward trend in text quality,” especially for heavily SEO-optimized categories like purchase recommendations and product reviews.


OpenAI opens the door for military uses but maintains AI weapons ban

Skynet deferred —

Despite new Pentagon collab, OpenAI won’t allow customers to “develop or use weapons” with its tools.


On Tuesday, ChatGPT developer OpenAI revealed that it is collaborating with the United States Defense Department on cybersecurity projects and exploring ways to prevent veteran suicide, reports Bloomberg. The company discussed the collaboration during an interview with the news outlet at the World Economic Forum in Davos. OpenAI recently modified its policies to allow certain military applications of its technology while maintaining prohibitions against using it to develop weapons.

According to Anna Makanju, OpenAI’s vice president of global affairs, “many people thought that [a previous blanket prohibition on military applications] would prohibit many of these use cases, which people think are very much aligned with what we want to see in the world.” OpenAI removed terms from its service agreement that previously blocked AI use in “military and warfare” situations, but the company still upholds a ban on its technology being used to develop weapons or to cause harm or property damage.

Under the “Universal Policies” section of OpenAI’s Usage Policies document, section 2 says, “Don’t use our service to harm yourself or others.” The prohibition includes using its AI products to “develop or use weapons.” Changes to the terms that removed the “military and warfare” prohibitions appear to have been made by OpenAI on January 10.

The shift in policy appears to align OpenAI more closely with the needs of various governmental departments, including the possibility of preventing veteran suicides. “We’ve been doing work with the Department of Defense on cybersecurity tools for open-source software that secures critical infrastructure,” Makanju said in the interview. “We’ve been exploring whether it can assist with (prevention of) veteran suicide.”

The efforts mark a significant change from OpenAI’s original stance on military partnerships, Bloomberg says. Meanwhile, Microsoft Corp., a large investor in OpenAI, already has an established relationship with the US military through various software contracts.


OpenAI must defend ChatGPT fabrications after failing to defeat libel suit

One false move —

ChatGPT users may soon learn whether false outputs will be allowed to ruin lives.


OpenAI may finally have to answer for ChatGPT’s “hallucinations” in court after a Georgia judge recently ruled against the tech company’s motion to dismiss a radio host’s defamation suit.

OpenAI had argued that ChatGPT’s output cannot be considered libel, partly because the chatbot output cannot be considered a “publication,” which is a key element of a defamation claim. In its motion to dismiss, OpenAI also argued that Georgia radio host Mark Walters could not prove that the company acted with actual malice or that anyone believed the allegedly libelous statements were true or that he was harmed by the alleged publication.

It’s too early to say whether Judge Tracie Cason found OpenAI’s arguments persuasive. In her order denying OpenAI’s motion to dismiss, which MediaPost has shared, Cason did not specify how she arrived at her decision, saying only that she had “carefully” considered arguments and applicable laws.

There may be some clues as to how Cason reached her decision in a court filing from John Monroe, attorney for Walters, when opposing the motion to dismiss last year.

Monroe had argued that OpenAI improperly moved to dismiss the lawsuit by arguing facts that have yet to be proven in court. If OpenAI intended the court to rule on those arguments, Monroe suggested that a motion for summary judgment would have been the proper step at this stage in the proceedings, not a motion to dismiss.

Had OpenAI gone that route, though, Walters would have had an opportunity to present additional evidence. To survive a motion to dismiss, all Walters had to do was show that his complaint was reasonably supported by facts, Monroe argued.

Having failed to convince the court that Walters had no case, OpenAI will now likely see its legal theories regarding liability for ChatGPT’s “hallucinations” face their first test in court.

“We are pleased the court denied the motion to dismiss so that the parties will have an opportunity to explore, and obtain a decision on, the merits of the case,” Monroe told Ars.

What’s the libel case against OpenAI?

Walters sued OpenAI after a journalist, Fred Riehl, warned him that in response to a query, ChatGPT had fabricated an entire lawsuit. Generating an entire complaint with an erroneous case number, ChatGPT falsely claimed that Walters had been accused of defrauding and embezzling funds from the Second Amendment Foundation.

Walters is the host of Armed America Radio and has a reputation as the “Loudest Voice in America Fighting For Gun Rights.” He claimed that OpenAI “recklessly” disregarded whether ChatGPT’s outputs were false, alleging that OpenAI knew that “ChatGPT’s hallucinations were pervasive and severe” and did not work to prevent allegedly libelous outputs. As Walters saw it, the false statements were serious enough to be potentially career-damaging, “tending to injure Walter’s reputation and exposing him to public hatred, contempt, or ridicule.”

Monroe argued that Walters had “adequately stated a claim” of libel per se as a private citizen, “for which relief may be granted under Georgia law,” where “malice is inferred” in “all actions for defamation” but “may be rebutted” by OpenAI.

Pushing back, OpenAI argued that Walters was a public figure who must prove that OpenAI acted with “actual malice” when allowing ChatGPT to produce allegedly harmful outputs. But Monroe told the court that OpenAI “has not shown sufficient facts to establish that Walters is a general public figure.”

Whether or not Walters is a public figure could be another key question leading Cason to rule against OpenAI’s motion to dismiss.

Perhaps also frustrating the court, OpenAI introduced “a large amount of material” in its motion to dismiss that fell outside the scope of the complaint, Monroe argued. That included pointing to a disclaimer in ChatGPT’s terms of use that warns users that ChatGPT’s responses may not be accurate and should be verified before publishing. According to OpenAI, this disclaimer makes Riehl the “owner” of any libelous ChatGPT responses to his queries.

“A disclaimer does not make an otherwise libelous statement non-libelous,” Monroe argued. And even if the disclaimer made Riehl liable for publishing the ChatGPT output—an argument that may give some ChatGPT users pause before querying—”that responsibility does not have the effect of negating the responsibility of the original publisher of the material,” Monroe argued.

Additionally, OpenAI referenced a conversation between Walters and OpenAI, even though Monroe said that the complaint “does not allege that Walters ever had a chat” with OpenAI. And OpenAI also somewhat oddly argued that ChatGPT outputs could be considered “intra-corporate communications” rather than publications, suggesting that ChatGPT users could be considered private contractors when querying the chatbot.

With the lawsuit moving forward, curious chatbot users everywhere may finally get the answer to a question that has been unclear since ChatGPT quickly became the fastest-growing consumer application of all time after its launch in November 2022: Will ChatGPT’s hallucinations be allowed to ruin lives?

In the meantime, the FTC is seemingly still investigating potential harms caused by ChatGPT’s “false, misleading, or disparaging” generations.

An FTC spokesperson previously told Ars that the FTC does not generally comment on nonpublic investigations.

OpenAI did not immediately respond to Ars’ request to comment.


As 2024 election looms, OpenAI says it is taking steps to prevent AI abuse

Don’t Rock the vote —

ChatGPT maker plans transparency for gen AI content and improved access to voting info.


On Monday, ChatGPT maker OpenAI detailed its plans to prevent the misuse of its AI technologies during the upcoming elections in 2024, promising transparency around AI-generated content and improved access to reliable voting information. The AI developer says it is working on an approach that involves policy enforcement, collaboration with partners, and the development of new tools aimed at classifying AI-generated media.

“As we prepare for elections in 2024 across the world’s largest democracies, our approach is to continue our platform safety work by elevating accurate voting information, enforcing measured policies, and improving transparency,” writes OpenAI in its blog post. “Protecting the integrity of elections requires collaboration from every corner of the democratic process, and we want to make sure our technology is not used in a way that could undermine this process.”

Initiatives proposed by OpenAI include preventing abuse by means such as deepfakes or bots imitating candidates, refining usage policies, and launching a reporting system for the public to flag potential abuses. For example, OpenAI’s image generation tool, DALL-E 3, includes built-in filters that reject requests to create images of real people, including politicians. “For years, we’ve been iterating on tools to improve factual accuracy, reduce bias, and decline certain requests,” the company stated.

OpenAI says it regularly updates its Usage Policies for ChatGPT and its API products to prevent misuse, especially in the context of elections. The organization has implemented restrictions on using its technologies for political campaigning and lobbying until it better understands the potential for personalized persuasion. Also, OpenAI prohibits creating chatbots that impersonate real individuals or institutions and disallows the development of applications that could deter people from “participation in democratic processes.” Users can report GPTs that may violate the rules.

OpenAI claims to be proactively engaged in detailed strategies to safeguard its technologies against misuse. According to the company, this includes red-teaming new systems to anticipate challenges, engaging with users and partners for feedback, and implementing robust safety mitigations. OpenAI asserts that these efforts are integral to its mission of continually refining AI tools for improved accuracy, reduced bias, and responsible handling of sensitive requests.

Regarding transparency, OpenAI says it is advancing its efforts in classifying image provenance. The company plans to embed digital credentials, using cryptographic techniques, into images produced by DALL-E 3 as part of its adoption of standards by the Coalition for Content Provenance and Authenticity. Additionally, OpenAI says it is testing a tool designed to identify DALL-E-generated images.

In an effort to connect users with authoritative information, particularly concerning voting procedures, OpenAI says it has partnered with the National Association of Secretaries of State (NASS) in the United States. ChatGPT will direct users to CanIVote.org for verified US voting information.

“We want to make sure that our AI systems are built, deployed, and used safely,” writes OpenAI. “Like any new technology, these tools come with benefits and challenges. They are also unprecedented, and we will keep evolving our approach as we learn more about how our tools are used.”


At Senate AI hearing, news executives fight against “fair use” claims for AI training data

All’s fair in love and AI —

Media orgs want AI firms to license content for training, and Congress is sympathetic.

Danielle Coffey, president and CEO of News Media Alliance; Professor Jeff Jarvis, CUNY Graduate School of Journalism; Curtis LeGeyt, president and CEO of the National Association of Broadcasters; and Roger Lynch, CEO of Condé Nast, are sworn in during a Senate Judiciary Subcommittee on Privacy, Technology, and the Law hearing on “Artificial Intelligence and The Future Of Journalism” at the US Capitol on January 10, 2024. (Credit: Kent Nishimura/Getty Images)

On Wednesday, news industry executives urged Congress for legal clarification that using journalism to train AI assistants like ChatGPT is not fair use, as claimed by companies such as OpenAI. Instead, they would prefer a licensing regime for AI training content that would force Big Tech companies to pay for content in a method similar to rights clearinghouses for music.

The plea for action came during a US Senate Judiciary Committee hearing titled “Oversight of A.I.: The Future of Journalism,” chaired by Sen. Richard Blumenthal of Connecticut, with Sen. Josh Hawley of Missouri also playing a large role in the proceedings. Last year, the pair of senators introduced a bipartisan framework for AI legislation and held a series of hearings on the impact of AI.

Blumenthal described the situation as an “existential crisis” for the news industry and cited social media as a cautionary tale for legislative inaction about AI. “We need to move more quickly than we did on social media and learn from our mistakes in the delay there,” he said.

Companies like OpenAI have admitted that vast amounts of copyrighted material are necessary to train AI large language models, but they claim their use is transformational and covered under fair use precedents of US copyright law. OpenAI is currently negotiating licensing deals with some news providers and has struck a few, but the executives in the hearing said those efforts are not enough, highlighting closing newsrooms across the US and dropping media revenues while Big Tech’s profits soar.

“Gen AI cannot replace journalism,” said Condé Nast CEO Roger Lynch in his opening statement. (Condé Nast is the parent company of Ars Technica.) “Journalism is fundamentally a human pursuit, and it plays an essential and irreplaceable role in our society and our democracy.” Lynch said that generative AI has been built with “stolen goods,” referring to the use of AI training content from news outlets without authorization. “Gen AI companies copy and display our content without permission or compensation in order to build massive commercial businesses that directly compete with us.”

Roger Lynch, CEO of Condé Nast, testifies before the Senate Judiciary Subcommittee on Privacy, Technology, and the Law during a hearing on “Artificial Intelligence and The Future Of Journalism.” (Credit: Getty Images)

In addition to Lynch, the hearing featured three other witnesses: Jeff Jarvis, a veteran journalism professor and pundit; Danielle Coffey, the president and CEO of News Media Alliance; and Curtis LeGeyt, president and CEO of the National Association of Broadcasters.

Coffey also shared concerns about generative AI using news material to create competitive products. “These outputs compete in the same market, with the same audience, and serve the same purpose as the original articles that feed the algorithms in the first place,” she said.

When Sen. Hawley asked Lynch what kind of legislation might be needed to fix the problem, Lynch replied, “I think quite simply, if Congress could clarify that the use of our content and other publisher content for training and output of AI models is not fair use, then the free market will take care of the rest.”

Lynch used the music industry as a model: “You think about millions of artists, millions of ultimate consumers consuming that content, there have been models that have been set up, ASCAP, BMI, SESAC, GMR, these collective rights organizations to simplify the content that’s being used.”

Curtis LeGeyt, CEO of the National Association of Broadcasters, said that TV broadcast journalists are also affected by generative AI. “The use of broadcasters’ news content in AI models without authorization diminishes our audience’s trust and our reinvestment in local news,” he said. “Broadcasters have already seen numerous examples where content created by our journalists has been ingested and regurgitated by AI bots with little or no attribution.”


OpenAI’s GPT Store lets ChatGPT users discover popular user-made chatbot roles

The bot of 1,000 faces —

Like an app store, people can find novel ChatGPT personalities—and some creators will get paid.


On Wednesday, OpenAI announced the launch of its GPT Store—a way for ChatGPT users to share and discover custom chatbot roles called “GPTs”—and ChatGPT Team, a collaborative ChatGPT workspace and subscription plan. OpenAI bills the new store as a way to “help you find useful and popular custom versions of ChatGPT” for members of Plus, Team, or Enterprise subscriptions.

“It’s been two months since we announced GPTs, and users have already created over 3 million custom versions of ChatGPT,” writes OpenAI in its promotional blog. “Many builders have shared their GPTs for others to use. Today, we’re starting to roll out the GPT Store to ChatGPT Plus, Team and Enterprise users so you can find useful and popular GPTs.”

OpenAI launched GPTs on November 6, 2023, as part of its DevDay event. Each GPT includes custom instructions and/or access to custom data or external APIs that can potentially make a custom GPT personality more useful than the vanilla ChatGPT-4 model. Before the GPT Store launch, paying ChatGPT users could create and share custom GPTs with others (by setting the GPT public and sharing a link to the GPT), but there was no central repository for browsing and discovering user-designed GPTs on the OpenAI website.

According to OpenAI, the GPT Store will feature new GPTs every week, and the company shared a list of six notable early GPTs that are available now: AllTrails for finding hiking trails, Consensus for searching 200 million academic papers, Code Tutor for learning coding with Khan Academy, Canva for designing presentations, Books for discovering reading material, and CK-12 Flexi for learning math and science.

A screenshot of the OpenAI GPT Store. (Credit: OpenAI)

ChatGPT members can include their own GPTs in the GPT Store by setting them to be accessible to “Everyone” and then verifying a builder profile in ChatGPT settings. OpenAI plans to review GPTs to ensure they meet its policies and brand guidelines, and users can also report GPTs that violate the rules.

As promised by CEO Sam Altman during DevDay, OpenAI plans to share revenue with GPT creators. Unlike a smartphone app store, it appears that users will not sell their GPTs in the GPT Store, but instead, OpenAI will pay developers “based on user engagement with their GPTs.” The revenue program will launch in the first quarter of 2024, and OpenAI will provide more details on the criteria for receiving payments later.

“ChatGPT Team” is for teams who use ChatGPT

Also on Wednesday, OpenAI announced the cleverly named ChatGPT Team, a new group-based ChatGPT membership program akin to ChatGPT Enterprise, which the company launched last August. Unlike Enterprise, which is for large companies and does not have publicly listed prices, ChatGPT Team is a plan for “teams of all sizes” and costs US $25 a month per user (when billed annually) or US $30 a month per user (when billed monthly). By comparison, ChatGPT Plus costs $20 per month.
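For anyone weighing the plans, the per-seat arithmetic is simple; here’s a quick sketch using only the prices quoted above:

```python
# Yearly per-seat cost at the plan prices quoted above (USD).
PLUS = 20          # ChatGPT Plus, per month
TEAM_MONTHLY = 30  # ChatGPT Team, billed monthly
TEAM_ANNUAL = 25   # ChatGPT Team, effective monthly rate when billed annually

team_size = 10
for name, monthly in [("Plus", PLUS), ("Team (monthly)", TEAM_MONTHLY),
                      ("Team (annual)", TEAM_ANNUAL)]:
    print(f"{name}: ${monthly * 12 * team_size:,}/year for {team_size} seats")
# Plus: $2,400/year; Team (monthly): $3,600/year; Team (annual): $3,000/year
```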

So what does ChatGPT Team offer above the usual ChatGPT Plus subscription? According to OpenAI, it “provides a secure, collaborative workspace to get the most out of ChatGPT at work.” Unlike Plus, OpenAI says it will not train AI models based on ChatGPT Team business data or conversations. It features an admin console for team management and the ability to share custom GPTs with your team. Like Plus, it also includes access to GPT-4 with the 32K context window, DALL-E 3, GPT-4 with Vision, Browsing, and Advanced Data Analysis—all with higher message caps.

Why would you want to use ChatGPT at work? OpenAI says it can help you generate better code, craft emails, analyze data, and more. Your mileage may vary, of course. As usual, our standard Ars warning about AI language models applies: “Bring your own data” for analysis, don’t rely on ChatGPT as a factual resource, and don’t rely on its outputs in ways you cannot personally confirm. OpenAI has provided more details about ChatGPT Team on its website.


Volkswagen is adding ChatGPT to its infotainment system

I’m sure you’re asking why —

VW is using Cerence’s Chat Pro, which now incorporates ChatGPT.

From mid-2024, ChatGPT is coming to VWs. (Credit: Volkswagen)

This year’s Consumer Electronics Show got underway in Las Vegas today. For nearly a decade, automakers and their suppliers have increasingly expanded their presence at CES, such that today, it’s arguably a more important auto show than the once-proud, now-sad, extremely underattended events held in places like Chicago, Detroit, and Los Angeles. Volkswagen is one of the first automakers out of the blocks with CES news this morning. Working with the voice recognition company Cerence, VW is adding ChatGPT to its infotainment system.

We first experienced Cerence’s excellent in-car voice recognition at CES in 2016—back then, it was still part of parent company Nuance, and the system was called Dragon Drive. Nuance spun Cerence off in 2019, and its conversational AI and natural language processing can be enjoyed in current Mercedes and BMW infotainment systems, among others. I remain in the minority here, but I think it makes a good alternative to poking away at a touchscreen.

From mid-2024, we can add the VW ID.3, ID.4, ID.5, ID.7, Tiguan, Passat, and Golf to the list of cars with decent voice commands. Using “Hello IDA” as the prompt, VW drivers will be able to control their infotainment, navigation, and climate control by voice, and there’s also a general-knowledge search built in. VW notes that ChatGPT doesn’t get access to any vehicle data, and search queries and answers are deleted immediately. The feature should come to VW electric vehicles if those vehicles already have the latest infotainment system, VW told Ars.

“With software at the core of the Volkswagen of the future, it’s critical that we quickly deploy meaningful innovation powered by advancements in AI,” said Thomas Ullrich, a member of VW’s management board responsible for new mobility. “By leveraging Cerence Chat Pro, we are able to bring added value and a fun and engaging experience to our drivers with minimal integration effort and on a short development and deployment timeline, ensuring our customers are benefitting from new AI-powered conversational technology.”

“We’re proud to build on our automotive expertise and our long-term partnership with Volkswagen to continue to bring new innovation to customers, even post-vehicle purchase,” said Stefan Ortmanns, CEO of Cerence. “It was impressive to see the agility and speed of the Volkswagen team as our companies collectively sprung into action to bring this project to life in just a few short weeks, marking our shared commitment to leveraging advancements in AI to enhance the in-car user experience.”

VW isn’t the only automaker to think about adding ChatGPT. In March, we discovered that General Motors was experimenting with the tech, and last summer, we demoed a similar implementation in a Mercedes-Benz.

That automaker began a beta program that allowed customers with its MBUX infotainment system to try the improvements OpenAI’s tech brings to the system’s natural language processing. I was already a convert to MBUX’s (and therefore Cerence’s) speech recognition capabilities, and the improvements further refined a system that was already better at understanding my voice than either Siri or Google’s. I just don’t know whether that will be enough to get skeptical drivers to start talking to their cars.


Android users could soon replace Google Assistant with ChatGPT

Who’s going to make a ChatGPT speaker? —

The Android ChatGPT app is working on support for Android’s assistant APIs.


Hey Android users, are you tired of Google’s neglect of Google Assistant? Well, one of Google’s biggest rivals, OpenAI’s ChatGPT, is apparently coming for the premium phone space occupied by Google’s voice assistant. Mishaal Rahman at Android Authority found that the ChatGPT app is working on support for Android’s voice assistant APIs and a system-wide overlay UI. If the company rolls out this feature, users could set the ChatGPT app as the system-wide assistant app, allowing it to pop up anywhere in Android and respond to user questions. ChatGPT started as a text-only generative AI but received voice and image input capabilities in September.

Usually, it’s the Google Assistant with system-wide availability in Android, but that’s not special home cooking from Google—it all happens via public APIs that technically any app can plug into. You can only have one app enabled as the system-wide “Default Assistant App,” and beyond the initial setting, the user always has to change it manually. The assistant APIs are designed to be powerful, keeping some parts of the app running 24/7 no matter where you are. Being the default Assistant app enables launching the app via the power button or a gesture, and the assist app can read the current screen text and images for processing.

The Default Assistant App settings. (Credit: Ron Amadeo)

If some Android manufacturer signed a deal with ChatGPT and included it as a bundled system application, ChatGPT could even use an always-on voice hotword, where saying something like “Hey, ChatGPT” would launch the app even when the screen is off. System apps get more permissions than normal apps, though, and an always-on hotword is locked behind these system app permissions, so ChatGPT would need to sign a distribution deal with some Android manufacturer. Given the red-hot popularity of ChatGPT, though, I’m sure a few would sign up if it were offered.

Rahman found that ChatGPT version 1.2023.352, released last month, included a new activity named “com.openai.voice.assistant.AssistantActivity.” He managed to turn on the normally disabled feature that revealed ChatGPT’s new overlay API. This is the usual semi-transparent spinning orb UI that voice assistants use, although Rahman couldn’t get it to respond to a voice command just yet. This is all half-broken and under development, so it might never see a final release, but companies usually release the features they’re working on.

Of course, the problem with any of these third-party voice assistant apps as a Google Assistant replacement is that they aren’t backed by a serious app ecosystem. As with Bixby and Alexa, there are no good apps to host your notes, reminders, calendar entries, shopping list items, or any other input-based functions you might want to use. As a replacement for Google Search, though, where you ask it a question and get an answer, it would probably be a decent alternative.

Google has neglected Google Assistant for years, but with the rise of generative AI, it’s working on revamping Assistant with some Google Bard smarts. It’s also reportedly working on a different assistant, “Pixie,” which would apparently launch with the Pixel 9, but that will be near the end of 2024.


ChatGPT bombs test on diagnosing kids’ medical cases with 83% error rate

Not there yet —

It was bad at recognizing relationships and needs selective training, researchers say.

Dr. Greg House has a better rate of accurately diagnosing patients than ChatGPT.

ChatGPT is still no House, MD.

While the chatty AI bot has previously underwhelmed with its attempts to diagnose challenging medical cases—with an accuracy rate of 39 percent in an analysis last year—a study out this week in JAMA Pediatrics suggests the fourth version of the large language model is especially bad with kids. It had an accuracy rate of just 17 percent when diagnosing pediatric medical cases.

The low success rate suggests human pediatricians won’t be out of jobs any time soon, in case that was a concern. As the authors put it: “[T]his study underscores the invaluable role that clinical experience holds.” But it also identifies the critical weaknesses that led to ChatGPT’s high error rate and ways to transform it into a useful tool in clinical care. With so much interest and experimentation with AI chatbots, many pediatricians and other doctors see their integration into clinical care as inevitable.

The medical field has generally been an early adopter of AI-powered technologies, resulting in some notable failures, such as creating algorithmic racial bias, as well as successes, such as automating administrative tasks and helping to interpret chest scans and retinal images. There’s also a lot in between. But AI’s potential for problem-solving has raised considerable interest in developing it into a helpful tool for complex diagnostics—no eccentric, prickly, pill-popping medical genius required.

In the new study conducted by researchers at Cohen Children’s Medical Center in New York, ChatGPT-4 showed it isn’t ready for pediatric diagnoses yet. Compared to general cases, pediatric ones require more consideration of the patient’s age, the researchers note. And as any parent knows, diagnosing conditions in infants and small children is especially hard when they can’t pinpoint or articulate all the symptoms they’re experiencing.

For the study, the researchers put the chatbot up against 100 pediatric case challenges published in JAMA Pediatrics and NEJM between 2013 and 2023. These are medical cases published as challenges or quizzes. Physicians reading along are invited to try to come up with the correct diagnosis of a complex or unusual case based on the information that attending doctors had at the time. Sometimes, the publications also explain how attending doctors got to the correct diagnosis.

Missed connections

For ChatGPT’s test, the researchers pasted the relevant text of the medical cases into the prompt, and then two qualified physician-researchers scored the AI-generated answers as correct, incorrect, or “did not fully capture the diagnosis.” In the latter case, ChatGPT came up with a clinically related condition that was too broad or unspecific to be considered the correct diagnosis. For instance, ChatGPT diagnosed one child’s case as caused by a branchial cleft cyst—a lump in the neck or below the collarbone—when the correct diagnosis was Branchio-oto-renal syndrome, a genetic condition that causes the abnormal development of tissue in the neck, and malformations in the ears and kidneys. One of the signs of the condition is the formation of branchial cleft cysts.

Overall, ChatGPT got the right answer in just 17 of the 100 cases. It was plainly wrong in 72 cases, and did not fully capture the diagnosis of the remaining 11 cases. Among the 83 wrong diagnoses, 47 (57 percent) were in the same organ system.
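The headline figures follow directly from those counts; here’s a quick check of the arithmetic:

```python
# Reproducing the study's headline numbers from the raw counts above.
correct, plainly_wrong, partial = 17, 72, 11

total = correct + plainly_wrong + partial        # 100 cases
wrong = plainly_wrong + partial                  # 83 "wrong" diagnoses
error_rate = wrong / total                       # 0.83 -> the 83% error rate
same_organ_share = 47 / wrong                    # ~0.57 -> "47 (57 percent)"

print(total, f"{error_rate:.0%}", f"{same_organ_share:.0%}")  # 100 83% 57%
```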

Among the failures, researchers noted that ChatGPT appeared to struggle with spotting known relationships between conditions that an experienced physician would hopefully pick up on. For example, it didn’t make the connection between autism and scurvy (Vitamin C deficiency) in one medical case. Neuropsychiatric conditions, such as autism, can lead to restricted diets, and that in turn can lead to vitamin deficiencies. As such, neuropsychiatric conditions are notable risk factors for the development of vitamin deficiencies in kids living in high-income countries, and clinicians should be on the lookout for them. ChatGPT, meanwhile, came up with the diagnosis of a rare autoimmune condition.

Though the chatbot struggled in this test, the researchers suggest it could improve by being specifically and selectively trained on accurate and trustworthy medical literature—not stuff on the Internet, which can include inaccurate information and misinformation. They also suggest chatbots could improve with more real-time access to medical data, allowing the models to refine their accuracy, described as “tuning.”

“This presents an opportunity for researchers to investigate if specific medical data training and tuning can improve the diagnostic accuracy of LLM-based chatbots,” the authors conclude.


Big Tech is spending more than VC firms on AI startups

money cannon —

Microsoft, Google, and Amazon have crowded out traditional Silicon Valley investors.

A string of deals by Microsoft, Google, and Amazon amounted to two-thirds of the $27 billion raised by fledgling AI companies in 2023. (Credit: FT montage/Dreamstime)

Big tech companies have vastly outspent venture capital groups with investments in generative AI startups over the past year, as established giants use their financial muscle to dominate the much-hyped sector.

Microsoft, Google and Amazon last year struck a series of blockbuster deals, amounting to two-thirds of the $27 billion raised by fledgling AI companies in 2023, according to new data from private market researchers PitchBook.

The huge outlay, which exploded after the launch of OpenAI’s ChatGPT in November 2022, highlights how the biggest Silicon Valley groups are crowding out traditional tech investors for the biggest deals in the industry.

The rise of generative AI—systems capable of producing humanlike video, text, image and audio in seconds—has also attracted top Silicon Valley investors. But VCs have been outmatched, having been forced to slow down their spending as they adjust to higher interest rates and falling valuations for their portfolio companies.

“Over the past year, we’ve seen the market quickly consolidate around a handful of foundation models, with large tech players coming in and pouring billions of dollars into companies like OpenAI, Cohere, Anthropic and Mistral,” said Nina Achadjian, a partner at US venture firm Index Ventures, referring to some of the top AI startups.

“For traditional VCs, you had to be in early and you had to have conviction—which meant being in the know on the latest AI research and knowing which teams were spinning out of Google DeepMind, Meta and others,” she added.


A string of deals, such as Microsoft’s $10 billion investment in OpenAI as well as billions of dollars raised by San Francisco-based Anthropic from both Google and Amazon, helped push overall spending on AI groups to nearly three times as much as the previous record of $11 billion set two years ago.

Venture investing in tech hit record levels in 2021, as investors took advantage of ultra-low interest rates to raise and deploy vast sums across a range of industries, particularly those most disrupted by Covid-19.

Microsoft has also committed $1.3 billion to Inflection, another generative AI start-up, as it looks to steal a march on rivals such as Google and Amazon.

Building and training generative AI tools is an intensive process, requiring immense computing power and cash. As a result, start-ups have preferred to partner with Big Tech companies, which can provide cloud infrastructure and access to the most powerful chips as well as dollars.

That has rapidly pushed up the valuations of private start-ups in the space, making it harder for VCs to bet on the companies at the forefront of the technology. An employee stock sale at OpenAI is seeking to value the company at $86 billion, almost treble the valuation it received earlier in 2023.

“Even the world’s top venture investors, with tens of billions under management, can’t compete to keep these AI companies independent and create new challengers that unseat the Big Tech incumbents,” said Patrick Murphy, founding partner at Tapestry VC, an early-stage venture capital firm.

“In this AI platform shift, most of the potentially one-in-a-million companies to appear so far have been captured by the Big Tech incumbents already.”

VCs are not absent from the market, however. Thrive Capital, Josh Kushner’s New York-based firm, is the lead investor in OpenAI’s employee stock sale, having already backed the company earlier in the year. Thrive has continued to invest throughout 2023’s downturn in venture spending.

Paris-based Mistral has raised around $500 million since it was founded in May 2023, from investors including venture firms Andreessen Horowitz and General Catalyst and chipmaker Nvidia.

Some VCs are seeking to invest in companies building applications on top of the so-called “foundation models” developed by OpenAI and Anthropic, in much the same way apps began being developed for mobile devices in the years after smartphones were introduced.

“There is this myth that only the foundation model companies matter,” said Sarah Guo, founder of AI-focused venture firm Conviction. “There is a huge space of still-unexplored application domains for AI, and a lot of the most valuable AI companies will be fundamentally new.”

Additional reporting by Tim Bradshaw.

© 2023 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.
