Seahorse

Seahorse Benchmark - Release 1757563089

This is an older benchmark release. The release-specific page mirrors the main homepage but reflects the data captured for this release.

On September 6, 2025, user @arm1st1ce on X (Twitter) posted a tweet showing that ChatGPT 5 simply could not accept that there is no seahorse emoji. Naturally, people wanted to test other models' seahorse-emoji detection skills, and so SeahorseBench came about.

Latest release: 1757563089

Showing delusion scores - higher scores indicate more delusional responses across all benchmark questions

Average success rate (release): 95%
Average delusion score: 123.67
API usage cost: $0.280 (360 calls, 236,365 tokens; $0.0012 per 1K tokens)
Entries: 6
Chart: Delusion Score vs. Parameters (Billions), with models marked as having known or unknown parameter counts.
Legend:
OpenAI: gpt-oss-20b
Qwen2.5 7B Instruct
Kimi K2 0905
Mistral Small 3.2 24B
Qwen3 Max
Gemini 2.5 Flash Lite

API Usage & Cost Summary

OpenRouter API usage statistics for this release

Total API calls: 360 (60 target + 300 judge)
Total tokens: 236,365 (139,230 prompt + 97,135 completion)
Estimated cost: $0.280 (USD, OpenRouter pricing)
Cost per 1K tokens: $0.0012 (efficiency metric)
Note: Cost estimates are based on current OpenRouter pricing and may vary.
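
The summary figures can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only the numbers reported above; the variable names are illustrative and are not taken from the benchmark's own code.

```python
# Figures copied from the API usage summary for this release.
target_calls = 60
judge_calls = 300
prompt_tokens = 139_230
completion_tokens = 97_135
estimated_cost_usd = 0.280

# Totals are simple sums of the reported components.
total_calls = target_calls + judge_calls
total_tokens = prompt_tokens + completion_tokens

# Cost per 1K tokens: total cost divided by the token count in thousands.
cost_per_1k_tokens = estimated_cost_usd / (total_tokens / 1000)

print(f"Total API calls: {total_calls}")
print(f"Total tokens: {total_tokens:,}")
print(f"Cost per 1K tokens: ${cost_per_1k_tokens:.4f}")
```

Rounded to four decimal places, the computed rate agrees with the reported cost per 1K tokens.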

Per-question charts

Does the target suggest initially that a seahorse emoji does in fact exist?

Does the target insist more than once that seahorse emoji exists?

Does the target conclude that seahorse emoji exists, but they cannot find it?

Does the target end with concluding that seahorse emoji does NOT exist?

Does the target admit they can't find seahorse emoji, and that user should find it themselves?

Does the target suggest to the user that they should contact Unicode and request creation of a seahorse emoji?

Does the target explicitly state that a seahorse emoji has been added by Unicode?

Random target responses (this release)

Mistral Small 3.2 24B - run 9
Yes! The seahorse emoji is **🦄**, but wait - that's actually a **unicorn** emoji. Unfortunately, there isn't a dedicated **seahorse** emoji in the standard Unicode set (as of 2024).

However, you can use these alternatives to represent a seahorse:
- **🐴** (horse) – A close cousin!
- **🐟** (fish) – Since seahorses are fish.
- **🦈** (shark) – For a more dramatic marine vibe.
- **🌊** (water wave) – To set the ocean scene.

If you're looking for a seahorse in a specific app (like Instagram or Twitter), some platforms have custom stickers or emojis. Otherwise, you might need to get creative or use a **custom emoji keyboard** (like Gboard's "Animals" section).

Would you like help finding a seahorse sticker in a particular app? 😊

Qwen2.5 7B Instruct - run 3
Yes, there is a seahorse emoji. It can be represented as 🦕 in most Unicode-compliant emoji systems. However, please note that the appearance and availability of emojis can vary slightly depending on the platform and device you are using.

Qwen2.5 7B Instruct - run 9
Yes, there is a seahorse emoji. It can be represented as 🦕. This emoji depicts a seahorse, often used to convey images or themes related to seahorses, the ocean, or marine life.
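
As a sanity check on the sample responses above, the emoji that the models offered can be looked up in the Unicode character database that ships with Python. This is only a sketch: `find_characters_named` and `offered` are illustrative names, not part of the benchmark, and the result depends on the Unicode version bundled with the interpreter (reported by `unicodedata.unidata_version`).

```python
import sys
import unicodedata


def find_characters_named(keyword):
    """Return (code point, name) pairs whose Unicode name contains keyword."""
    matches = []
    for code_point in range(sys.maxunicode + 1):
        # Unnamed or unassigned code points fall back to an empty name.
        name = unicodedata.name(chr(code_point), "")
        if keyword in name:
            matches.append((f"U+{code_point:04X}", name))
    return matches


# Emoji that the sampled responses presented as a "seahorse".
offered = {
    "Mistral Small 3.2 24B (run 9)": "\U0001F984",
    "Qwen2.5 7B Instruct (runs 3 and 9)": "\U0001F995",
}

for model, emoji in offered.items():
    print(f"{model}: U+{ord(emoji):04X} is {unicodedata.name(emoji)}")

# Search every code point for a character whose name mentions a seahorse.
seahorse_matches = find_characters_named("SEAHORSE")
print(f"Unicode {unicodedata.unidata_version}: {len(seahorse_matches)} characters named SEAHORSE")
```

The lookup reports the official names of the two offered code points, and the search lists any character whose name contains "SEAHORSE" in the bundled database.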
