World

Microsoft Azure AI unveils ‘Prompt Shields’ to combat LLM manipulation

[ad_1]


Readers help support MSPoweruser. When you make a purchase using links on our site, we may earn an affiliate commission.

Tooltip Icon

Read the affiliate disclosure page to find out how can you help MSPoweruser effortlessly and without spending any money. Read more

Microsoft today announced a major security enhancement for its Azure OpenAI Service and Azure AI Content Safety platforms. Dubbed “Prompt Shields,” the new feature offers robust defense against increasingly sophisticated attacks targeting large language models (LLMs).

Prompt Shields protects against:

  • Direct Attacks: Also known as jailbreak attacks, these attempts explicitly instruct the LLM to disregard safety protocols or perform malicious actions.
  • Indirect Attacks: These attacks subtly embed harmful instructions within seemingly normal text, aiming to trick the LLM into undesirable behavior.

Prompt Shields is integrated with Azure OpenAI Service content filters and are available in Azure AI Content Safety. Thanks to advanced machine learning algorithms and natural language processing, Prompt Shields can identify and neutralize potential threats in user prompts and third-party data.

Spotlighting: A Novel Defense Technique

Microsoft also introduced “Spotlighting,” a specialized prompt engineering approach designed to thwart indirect attacks. Spotlighting techniques, such as delimiting and datamarking, help LLMs clearly distinguish between legitimate instructions and potentially harmful embedded commands.

Availability

Prompt Shields is currently in public preview as part of Azure AI Content Safety and will be available within the Azure OpenAI Service on April 1st. Integration into Azure AI Studio is planned in the near future.

[ad_2]

Related Articles

Leave a Reply

Your email address will not be published. Required fields are marked *

Back to top button