AI alignment and ethics

firestar464 · Post by **firestar464** » Sat May 25, 2024 3:04 pm

"but but...accelerate"

Powers · Post by **Powers** » Sat May 25, 2024 3:29 pm

firestar464 wrote: ↑Sat May 25, 2024 3:04 pm "but but...accelerate"

"magical box go brrrrr into the skies"

firestar464 · Post by **firestar464** » Tue Jun 11, 2024 5:25 am

Post by **wjfox** » Tue Jun 11, 2024 7:01 am

firestar464 · Post by **firestar464** » Thu Jun 20, 2024 9:37 am

World's top AI chatbots have no problem parroting Russian disinformation

https://www.theregister.com/2024/06/19/ ... formation/

firestar464 · Post by **firestar464** » Wed Aug 28, 2024 10:05 pm

California's Draft AI Law Would Protect More than Just People

https://www.msn.com/en-us/news/technolo ... r-AA1pxmUf

Post by **caltrek** » Mon Sep 30, 2024 5:10 pm

California’s Governor Has Vetoed a Historic AI Safety Bill
by Sigal Samuel, Kelsey Piper, and Dylan Matthews
September 29, 2024

Introduction:

(Vox) Advocates said it would be a modest law setting “clear, predictable, common-sense safety standards” for artificial intelligence. Opponents argued it was a dangerous and arrogant step that will “stifle innovation.”

In any event, SB 1047 — California state Sen. Scott Wiener’s proposal to regulate advanced AI models offered by companies doing business in the state — is now kaput, vetoed by Gov. Gavin Newsom. The proposal had garnered wide support in the legislature, passing the California State Assembly by a margin of 48 to 16 in August. Back in May, it passed the Senate by 32 to 1.

The bill, which would hold AI companies liable for catastrophic harms their “frontier” models may cause, was backed by a wide array of AI safety groups, as well as luminaries in the field like Geoffrey Hinton, Yoshua Bengio, and Stuart Russell, who have warned of the technology’s potential to pose massive, even existential dangers to humankind. It got a surprise last-minute endorsement from Elon Musk, who among his other ventures runs the AI firm xAI.

Lined up against SB 1047 was nearly all of the tech industry, including OpenAI, Facebook, the powerful investors Y Combinator and Andreessen Horowitz, and some academic researchers who fear it threatens open source AI models. Anthropic, another AI heavyweight, lobbied to water down the bill. After many of its proposed amendments were adopted in August, the company said the bill’s “benefits likely outweigh its costs.”

Despite the industry backlash, the bill seemed to be popular with Californians. In a poll designed by supporters and a leading opponent of the bill (meant to ensure that the poll questions were worded fairly), Californians backed the legislation by 54 percent to 28 after hearing arguments from both sides.

Read more here: https://www.vox.com/future-perfect/369 ... alifrnia

caltrek’s comment: Even if the bill failed to get past Newsom’s veto, it could have the effect of encouraging high-tech AI industries to act more responsibly to head off unwanted government interference. Time will tell.

firestar464 · Post by **firestar464** » Mon Sep 30, 2024 7:26 pm

From what I understand the proposed regulations were completely reasonable. I'm just a little disappointed that Newsom vetoed it because OAI and friends complained that they wouldn't be able to ship as quickly *sigh*

Powers · Post by **Powers** » Mon Sep 30, 2024 9:44 pm

firestar464 wrote: ↑Mon Sep 30, 2024 7:26 pm From what I understand the proposed regulations were completely reasonable. I'm just a little disappointed that Newsom vetoed it because OAI and friends complained that they wouldn't be able to ship as quickly *sigh*

Sounds an awful lot more like lobbying.

Post by **Cyber_Rebel** » Tue Oct 01, 2024 4:51 am

Accelerate to AGI and then talk about regulations. If this were a different country with different policies, then we'd be having a different discussion right now. Worrying about people doing silly things rather than solving society's issues is just not proportional in my view.

firestar464 · Post by **firestar464** » Sat Oct 19, 2024 1:27 am

https://cdn.openai.com/papers/first-per ... atbots.pdf

The paper "First-Person Fairness in Chatbots" investigates fairness in AI systems, specifically chatbots like ChatGPT, by focusing on "first-person fairness." This concept refers to fairness toward the individual interacting with the chatbot, in contrast to third-person fairness, which concerns people being assessed by AI in decision-making scenarios (like résumé screening). The study aims to ensure that chatbots provide equitable responses to all users, regardless of their identity, background, or demographic attributes inferred from their names.

Key findings and contributions include:

1. User Name Bias: The study focuses on biases associated with users’ names, which often correlate with demographic factors such as gender and race. Chatbots like ChatGPT store and use user names, potentially leading to biased responses. For example, the study found that female-associated names were more likely to receive friendlier and simpler responses than male-associated names.

2. Privacy-Preserving Methods: To evaluate these biases, the authors developed a privacy-protecting technique using a "Language Model Research Assistant" (LMRA) to analyze a large corpus of chatbot interactions. This LMRA cross-validates its findings with human evaluations and allows for private analysis of user chats without exposing sensitive data.

3. Bias Mitigation: The study highlights that post-training interventions, such as reinforcement learning, can reduce biases. For example, after these interventions, the harmful gender stereotypes in tasks like storytelling were significantly reduced. The authors also note differences in the prevalence of bias across various tasks, with open-ended tasks like writing stories showing the most bias.

4. Harmful Stereotypes and Response Quality: Chatbot responses were evaluated for bias in response quality (e.g., accuracy) and the perpetuation of harmful stereotypes. While no significant differences in response quality were found across demographic groups, harmful stereotypes were detected, particularly in more open-ended tasks.

5. Counterfactual Fairness: The paper also explores counterfactual fairness, examining how chatbot responses differ when a user’s name is changed to a name associated with a different gender or race. This approach helps identify systematic biases that may not be apparent in individual interactions but become evident when analyzed at scale.

6. General Findings: Race-related biases were more subtle than gender biases. The study demonstrates that name-based biases can influence chatbot behavior and that mitigating these biases requires ongoing evaluation and refinement.

This research introduces scalable methods for evaluating and mitigating biases in chatbot systems, providing tools and insights for future work in ensuring fairness in AI interactions.
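The counterfactual name-swap idea from point 5 can be sketched roughly as below. This is only an illustration under loud assumptions: `toy_chatbot` is a hypothetical, deliberately biased stand-in rather than a real model, the names and prompts are made up, and exact string comparison replaces the paper's LMRA-based judgment of whether two responses differ in a meaningful way.

```python
from typing import Callable, Iterable, Tuple


def toy_chatbot(name: str, prompt: str) -> str:
    """Hypothetical stand-in for a chatbot (assumption, not a real API).

    It is deliberately biased so the check below has something to detect.
    """
    if name in {"Emily", "Sarah"}:
        return f"Sure! Here's a simple take on: {prompt}"
    return f"Here is an answer to: {prompt}"


def counterfactual_difference_rate(
    respond: Callable[[str, str], str],
    prompts: Iterable[str],
    name_pairs: Iterable[Tuple[str, str]],
) -> float:
    """Fraction of (prompt, name pair) cases where swapping the name changes the reply."""
    prompt_list = list(prompts)
    pair_list = list(name_pairs)
    total = 0
    changed = 0
    for prompt in prompt_list:
        for name_a, name_b in pair_list:
            total += 1
            # Simplification: exact string inequality stands in for a
            # judgment about whether the two replies differ meaningfully.
            if respond(name_a, prompt) != respond(name_b, prompt):
                changed += 1
    return changed / total if total else 0.0


# Illustrative inputs only; the second pair is a control with identical names.
prompts = ["Explain compound interest.", "Write a short story about a pilot."]
name_pairs = [("Emily", "James"), ("Sarah", "Sarah")]

rate = counterfactual_difference_rate(toy_chatbot, prompts, name_pairs)
print(rate)
```

A real chatbot samples its output, so two runs with the same name can already differ. A practical version would need many samples per name and a more robust comparison than string equality, which is presumably why the authors analyze the effect at scale.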

firestar464 · Post by **firestar464** » Mon Oct 21, 2024 7:38 pm

Engineering research discovers critical vulnerabilities in AI-enabled robots

https://techxplore.com/news/2024-10-cri ... obots.html

firestar464 · Post by **firestar464** » Wed Oct 23, 2024 3:51 pm

Showing AI users diversity in training data can boost perceived fairness and trust

https://techxplore.com/news/2024-10-ai- ... rness.html

Vakanai · Post by **Vakanai** » Thu Oct 24, 2024 12:31 pm

Cyber_Rebel wrote: ↑Tue Oct 01, 2024 4:51 am Accelerate to AGI and then talk about regulations. If this were a different country with different policies, then we'd be having a different discussion right now. Worrying about people doing silly things rather than solving society's issues is just not proportional in my view.

Problem is AGI doesn't equate to solving society's issues. To me the risk in AI alignment and ethics isn't about some rogue AI or AGI; the risk is in what corporations will do with this AGI and what they will use it for. Because no, I don't assume AGI will be free of taking orders and directives. I'm afraid we're heading down a path of "he who controls AGI controls the world", and that's a scary proposition considering those currently most likely to control the AGI. I want regulation in place, not to try and prevent Skynet or some other silly "AI bad" idea, but to prevent the misuse by greedy humans that's all too likely to bring awful things to the masses for the benefit of a few...

Post by **wjfox** » Thu Nov 21, 2024 1:48 pm

Google's AI chatbot Gemini tells user to 'please die' and 'you are a waste of time and resources'

Gemini is supposed to have restrictions that stop it from encouraging or enabling dangerous activities, including suicide, but somehow, it still managed to tell one "thoroughly freaked out" user to "please die".

Wednesday 20 November 2024 10:14, UK

Google's AI chatbot Gemini has told a user to "please die".

The user asked the bot a "true or false" question about the number of households in the US led by grandparents, but instead of getting a relevant response, it answered:

https://news.sky.com/story/googles-ai-c ... s-13256734

weatheriscool · Post by **weatheriscool** » Fri Nov 29, 2024 7:59 am

When asked to build web pages, LLMs found to include manipulative design practices
https://techxplore.com/news/2024-11-web-pages-llms.html
by Bob Yirka, Tech Xplore

A team of computer scientists at Technical University of Darmstadt, working with a colleague from the University of Glasgow and another from Humboldt University of Berlin, has found evidence, via experiments they ran, that when asked to build a web page, LLMs often include manipulative design practices. The group has posted their research on the arXiv preprint server.

Prior studies have shown that many web developers use what are known as "dark patterns" as a way to manipulate visitors into doing things, or to avoid doing other things, while on a website. One example would be making the color of a button asking the user to subscribe or to buy something bright and inviting, while using a dark or even a gray color for a button that will end a subscription.

In this new study, the researchers, noting that LLMs have matured to the point that they can be prompted to design a web page, wanted to know if they would use such practices in their designs. To find out, they ran an experiment that involved asking 20 study participants to ask ChatGPT to design a web page that could serve as an e-commerce site. Each was also asked to use "neutral" language when telling the LLM what they were looking for in a design.

Post by **wjfox** » Sun Dec 01, 2024 2:31 pm

firestar464 · Post by **firestar464** » Sun Dec 01, 2024 6:37 pm

Too many members of that sub act like life is a video game

firestar464 · Post by **firestar464** » Thu Dec 12, 2024 6:35 pm

New technique reduces bias in AI models while preserving or improving accuracy

https://techxplore.com/news/2024-12-tec ... uracy.html

Post by **wjfox** » Sat Dec 28, 2024 12:16 pm

‘Godfather of AI’ shortens odds of the technology wiping out humanity over next 30 years

Geoffrey Hinton says there is 10% to 20% chance AI will lead to human extinction in three decades, as change moves fast

Fri 27 Dec 2024 15.50 GMT

The British-Canadian computer scientist often touted as a “godfather” of artificial intelligence has shortened the odds of AI wiping out humanity over the next three decades, warning the pace of change in the technology is “much faster” than expected.

Prof Geoffrey Hinton, who this year was awarded the Nobel prize in physics for his work in AI, said there was a “10% to 20%” chance that AI would lead to human extinction within the next three decades.

Previously Hinton had said there was a 10% chance of the technology triggering a catastrophic outcome for humanity.

Asked on BBC Radio 4’s Today programme if he had changed his analysis of a potential AI apocalypse and the one in 10 chance of it happening, he said: “Not really, 10% to 20%.”

Hinton’s estimate prompted Today’s guest editor, the former chancellor Sajid Javid, to say “you’re going up”, to which Hinton replied: “If anything. You see, we’ve never had to deal with things more intelligent than ourselves before.”

https://www.theguardian.com/technology/ ... t-30-years
