Anthropic, an AI start-up backed by Amazon, has launched a bug bounty program that will pay up to $15,000 for reports identifying critical weaknesses in its artificial intelligence systems. The initiative is one of the most extensive efforts by any company working with advanced language models to crowdsource security testing.
According to the company, the bounty targets “universal jailbreak” attacks: methods that can bypass AI safety measures in high-risk areas such as bioweapons and cyber threats. Before making its next-generation safety mitigation system available to the public, Anthropic plans to let ethical hackers test it in order to prevent potential misuse.
We're expanding our bug bounty program. This new initiative is focused on finding universal jailbreaks in our next-generation safety system.
We're offering rewards for novel vulnerabilities across a wide range of domains, including cybersecurity. https://t.co/OHNhrjUnwm
— Anthropic (@AnthropicAI) August 8, 2024
Starting as an invite-only initiative run in collaboration with HackerOne, Anthropic’s bug bounty program seeks cybersecurity researchers skilled in identifying and fixing vulnerabilities in its AI systems. The company plans to open the program more widely in the future, potentially offering a model for industry-wide cooperation on AI safety.
This comes as the UK’s Competition and Markets Authority (CMA) investigates Amazon’s $4bn investment into Anthropic over potential competition issues. Against this backdrop of increased regulatory scrutiny, focusing on safety could enhance Anthropic’s reputation and set it apart from rivals.
Anthropic sets new AI safety standards
While OpenAI and Google also run bug bounty programs, those mostly concentrate on traditional software vulnerabilities rather than flaws specific to artificial intelligence. Meta, meanwhile, has been criticized for what some regard as a relatively closed approach to AI safety research. By explicitly targeting AI-specific problems and inviting external scrutiny of them, Anthropic sets a precedent for openness within the sector.
However, there are doubts over whether bug bounties alone can address the full spectrum of concerns involved in securing advanced machine learning systems. While valuable for identifying and patching particular flaws, they may not tackle broader challenges around AI alignment and long-term safety. A more holistic strategy, involving extensive testing, improved interpretability, and potentially new governance structures, may be needed to ensure that AI systems remain aligned with human values as they grow more powerful.
