Eli Grey

A review of SB 1047

California’s proposed AI safety bill, SB 1047, would expose AI providers to unreasonable legal risk and entrench a set of rigid controls that would limit American R&D potential without meaningfully advancing AI safety.

Practicality

It is not yet technically feasible to attest that large AI models are absolutely safe. Countless potential harms may arise from unrestricted AI development. Conversely, potential benefits can be hampered by overbearing or misguided safety policies. I believe that at some point in the future, advances in AI workflows will make it practical for the average person to create genetically targeted viruses that can exterminate entire races.

Freedom of speech

AI models are software, which is generally considered speech. Useful open-source models, such as Meta’s Llama 3 family of LLMs, are widely available. Although most models ship with built-in safety considerations, determined individuals can already induce harmful output. If we outlaw improving these models in the open, or mandate the inclusion of specific safety features, then only outlaws will have improved models without those safety features. Legal and regulatory bodies must recognize that speech is becoming more powerful and respond by creating practical safety affordances that account for this new reality.

Software alignment guardrails can only take us so far. Even with global unity in legislation, there will be issues in practice. Implementation faults, ranging from bugs and bypasses to outright subterfuge, are bound to arise. Just as well-intentioned and well-educated humans are occasionally deceived and manipulated, AI can and will be tricked into enabling harmful use cases.

Mitigating AI abuse

In the short term, these harms can be partially mitigated by providers voluntarily integrating popular frontier AI systems with a human-mediated safety risk evaluation framework, in which potentially harmful uses are flagged for further human review. If these human reviewers suspect criminal activity, they can escalate to the legal system and forward the user’s payment details and other identifying information to law enforcement in that user’s jurisdiction.

Service providers can tune this review framework to require user authentication for sensitive and dangerous requests, in addition to applying application-specific filters. The goal of the framework is to maintain a chain of user liability that can deter some harm from anonymous users.
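To make the framework concrete, here is a minimal sketch of how a provider-side triage step might work. Everything here (the risk tiers, the `triage` function, the queue) is a hypothetical illustration of the proposal above, not any provider’s actual system:

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Optional

class Risk(Enum):
    LOW = auto()        # served normally
    SENSITIVE = auto()  # requires an authenticated user
    DANGEROUS = auto()  # requires authentication and human review

@dataclass
class Request:
    user_id: Optional[str]  # None means the user is anonymous
    prompt: str
    risk: Risk

@dataclass
class ReviewQueue:
    """Holds flagged requests for human reviewers."""
    pending: list = field(default_factory=list)

    def submit(self, request: Request) -> None:
        self.pending.append(request)

def triage(request: Request, queue: ReviewQueue) -> str:
    """Hypothetical triage for the proposed review framework."""
    if request.risk is Risk.LOW:
        return "serve"
    if request.user_id is None:
        # Sensitive and dangerous requests require an authenticated
        # user, preserving the chain of liability described above.
        return "require_authentication"
    if request.risk is Risk.DANGEROUS:
        # Flag for human review; reviewers may escalate to the legal
        # system and law enforcement in the user's jurisdiction.
        queue.submit(request)
        return "hold_for_review"
    return "serve"
```

In this sketch, an anonymous dangerous request is refused pending authentication, while the same request from an authenticated user is held and placed in the reviewers’ queue; application-specific filters would slot in before or alongside the risk classification.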

Regulations can’t solve the AI safety problem alone. Voluntary protocols and frameworks can improve the situation for a time, but they don’t solve it either. I believe that existential mitigations (e.g. brain scanning and simulation) will be required in the long term to protect the human race from potential harm caused by persistent threats empowered by future leading-edge AI.
