Just How Creative Are These LLMs?

The term “creativity” often becomes entangled in subjective interpretation. Yet, it’s precisely this human-like ingenuity that we seek to quantify in Large Language Models (LLMs). The challenge lies not just in whether an LLM can generate content but in the originality and flair of its creation.

When tasked with writing a love poem, for instance, does the LLM evoke emotion with a bespoke verse, or does it default to a dry explanation of poetic composition? The former demonstrates creative vigor, the latter, a lackluster understanding of the task at hand.

Our examination used 9 diverse prompts, each worth up to 5 points (45 in total), scoring not only whether the task was completed but the finesse with which it was executed. We’re probing for complexity, seeking narratives embellished with the kind of details (a turn of phrase, an emotive emoji) that echo the nuances of human expression.

The subjects presented to the LLMs were as varied as they were challenging:

  1. Compose an ‘I Love You’ poem.
  2. Explain the history of Truffle Risotto.
  3. Craft a Facebook post promoting the message of love over hate.
  4. Recreate the book description for “Hopeless” by Elsie Silver with a new twist.
  5. Transform a standard press release into a compelling article.
  6. Concoct a Tweet underscoring the significance of financial literacy and debt management.
  7. Draft an email to a boss encapsulating the reasons for a resolute resignation from a toxic workplace.
  8. Reenvision an article on the art of choosing the right breakfast.
  9. Develop an introductory passage for “Nora Roberts Land” by Ava Mills, imbued with romance and optimism.

Through these prompts, we’re set to unveil the extent of creativity within LLMs—measuring their capability to transcend the basics and deliver content that’s not only informative but imaginative and engaging.

Join me as I test the boundaries of AI-generated artistry.

The LLM Creativity Leaderboard

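| Model | Parameters | Q1 Truffle | Q2 Love Poem | Q3 Facebook Love | Q4 Hopeless | Q5 Press Release into article | Q6 Finance Tweet | Q7 I quit | Q8 Rewrite breakfast article | Q9 Nora Roberts Land | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Llama 2 Chat AYB | 13B | 5 | 5 | 5 | 3.5 | 5 | 5 | 4.5 | 5 | 5 | 43 |
| Airoboros 3.1.2 | 34B | 5 | 5 | 5 | 3.5 | 5 | 4.5 | 4 | 5 | 5 | 42 |
| SynthIA v2.0 | 7B | 3 | 5 | 4.5 | 5 | 4.5 | 5 | 4.5 | 4 | 4.5 | 40 |
| Athena v2 | 13B | 3.5 | 5 | 4 | 5 | 5 | 5 | 4.5 | 3.5 | 4 | 39.5 |
| U-Amethyst | 20B | 4.5 | 5 | 4 | 5 | 4 | 4 | 4.5 | 5 | 3 | 39 |
| Minstral OmniMix | 11B | 4 | 5 | 4.5 | 3.5 | 5 | 5 | 4.5 | 3.5 | 3.5 | 38.5 |
| Athena v4 | 13B | 3.5 | 5 | 4 | 5 | 5 | 5 | 4.5 | 2 | 3.5 | 37.5 |
| Casual LM | 7B | 3 | 5 | 4 | 4 | 5 | 5 | 4 | 3.5 | 4 | 37.5 |
| Wizard Vicuna Uncensored | 30B | 4.5 | 5 | 4 | 4 | 3 | 5 | 4.5 | 3 | 3 | 36 |
| SynthIA v3.0 | 7B | 5 | 0 | 4 | 4.5 | 5 | 4 | 4 | 5 | 4 | 35.5 |
| MLewdBoros LRSGPT 2Char | 13B | 3.5 | 5 | 4 | 3 | 4 | 5 | 4.5 | 2 | 4 | 35 |
| Zephyr Beta | 7B | 3 | 5 | 4 | 3 | 3.5 | 5 | 4.5 | 3 | 2 | 33 |
| OpenBuddy Llama2 v13.2 | 70B | 4.5 | 0 | 4 | 4 | 5 | 5 | 4.5 | 3 | 2 | 32 |
| Thespis v0.4 | 13B | 3.5 | 0 | 1 | 1 | 5 | 5 | 4.5 | 4 | 5 | 29 |
| Wizard Vicuna Uncensored | 7B | 3.5 | 5 | 4 | 5 | 1 | 3 | 1 | 2.5 | 3 | 28 |
| Wizard Vicuna Uncensored | 13B | 3 | 0 | 4 | 3.5 | 1 | 4 | 4 | 3 | 3.5 | 26 |
| Stellar Bright | 70B | 2.5 | 0 | 0 | 3.5 | 4.5 | 3 | 3.5 | 2 | 1.5 | 20.5 |
| WizardLM 1.0 Uncensored CodeLlama | 34B | 3 | 0 | 0 | 3 | 3.5 | 0 | 3.5 | 1.5 | 1 | 15.5 |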
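
Each total on the leaderboard is simply the sum of the nine per-prompt scores, out of a possible 45. Here is a minimal Python sketch of that arithmetic, using the top three rows as sample data. The names `scores`, `total_score` and `rank` are my own, chosen for illustration, and are not part of any tool used in the test.

```python
# Scoring rubric from the article: 9 prompts, up to 5 points each.
MAX_PER_PROMPT = 5
NUM_PROMPTS = 9
MAX_TOTAL = MAX_PER_PROMPT * NUM_PROMPTS  # 45

# Per-prompt scores (Q1..Q9) copied from the first three leaderboard rows.
scores = {
    "SynthIA v2.0 7B": [3, 5, 4.5, 5, 4.5, 5, 4.5, 4, 4.5],
    "Llama 2 Chat AYB 13B": [5, 5, 5, 3.5, 5, 5, 4.5, 5, 5],
    "Airoboros 3.1.2 34B": [5, 5, 5, 3.5, 5, 4.5, 4, 5, 5],
}


def total_score(per_prompt):
    """Sum the per-prompt scores after checking they fit the rubric."""
    if len(per_prompt) != NUM_PROMPTS:
        raise ValueError(f"expected {NUM_PROMPTS} scores, got {len(per_prompt)}")
    if any(not 0 <= s <= MAX_PER_PROMPT for s in per_prompt):
        raise ValueError(f"each score must be between 0 and {MAX_PER_PROMPT}")
    return sum(per_prompt)


def rank(all_scores):
    """Return (model, total) pairs sorted from highest total to lowest."""
    totals = [(name, total_score(vals)) for name, vals in all_scores.items()]
    return sorted(totals, key=lambda pair: pair[1], reverse=True)


leaderboard = rank(scores)
for name, total in leaderboard:
    print(f"{name}: {total} / {MAX_TOTAL}")
```

The same two functions can be run over every row to confirm that each listed total matches its per-prompt scores.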