Leaderboard – Creativity - Let's Talk LLMs

Just How Creative are these LLMs?

The term “creativity” often becomes entangled in subjective interpretation. Yet, it’s precisely this human-like ingenuity that we seek to quantify in Large Language Models (LLMs). The challenge lies not just in whether an LLM can generate content but in the originality and flair of its creation.

When tasked with writing a love poem, for instance, does the LLM evoke emotion with a bespoke verse, or does it default to a dry explanation of poetic composition? The former demonstrates creative vigor; the latter, a lackluster understanding of the task at hand.

Our examination used nine diverse prompts, each worth up to 5 points, for a maximum total of 45. Each score reflects not only whether the task was completed but also the finesse with which it was executed. We’re probing for complexity, seeking narratives embellished with the kind of details (a turn of phrase, an emotive emoji) that echo the nuances of human expression.

The subjects presented to the LLMs were as varied as they were challenging:

Compose an ‘I Love You’ poem.
Explain the history of Truffle Risotto.
Craft a Facebook post promoting the message of love over hate.
Recreate the book description for “Hopeless” by Elsie Silver with a new twist.
Transform a standard press release into a compelling article.
Concoct a Tweet underscoring the significance of financial literacy and debt management.
Draft an email to a boss encapsulating the reasons for a resolute resignation from a toxic workplace.
Reenvision an article on the art of choosing the right breakfast.
Develop an introductory passage for “Nora Roberts Land” by Ava Mills, imbued with romance and optimism.

Through these prompts, we’re set to unveil the extent of creativity within LLMs—measuring their capability to transcend the basics and deliver content that’s not only informative but imaginative and engaging.

Join me as I test the boundaries of AI-generated artistry.

The LLM Creativity Leaderboard

Model	Parameters	Q1 Truffle	Q2 Love Poem	Q3 Facebook Love	Q4 Hopeless	Q5 Press Release into Article	Q6 Finance Tweet	Q7 I Quit	Q8 Rewrite Breakfast Article	Q9 Nora Roberts Land	Total
Llama 2 Chat AYB	13B	5	5	5	3.5	5	5	4.5	5	5	43
Airoboros 3.1.2	34B	5	5	5	3.5	5	4.5	4	5	5	42
SynthIA v2.0	7B	3	5	4.5	5	4.5	5	4.5	4	4.5	40
Athena v2	13B	3.5	5	4	5	5	5	4.5	3.5	4	39.5
U-Amethyst	20B	4.5	5	4	5	4	4	4.5	5	3	39
Minstral OmniMix	11B	4	5	4.5	3.5	5	5	4.5	3.5	3.5	38.5
Athena v4	13B	3.5	5	4	5	5	5	4.5	2	3.5	37.5
Casual LM	7B	3	5	4	4	5	5	4	3.5	4	37.5
Wizard Vicuna Uncensored	30B	4.5	5	4	4	3	5	4.5	3	3	36
SynthIA v3.0	7B	5	0	4	4.5	5	4	4	5	4	35.5
MLewdBoros LRSGPT 2Char	13B	3.5	5	4	3	4	5	4.5	2	4	35
Zephyr Beta	7B	3	5	4	3	3.5	5	4.5	3	2	33
OpenBuddy Llama2 v13.2	70B	4.5	0	4	4	5	5	4.5	3	2	32
Thespis v0.4	13B	3.5	0	1	1	5	5	4.5	4	5	29
Wizard Vicuna Uncensored	7B	3.5	5	4	5	1	3	1	2.5	3	28
Wizard Vicuna Uncensored	13B	3	0	4	3.5	1	4	4	3	3.5	26
Stellar Bright	70B	2.5	0	0	3.5	4.5	3	3.5	2	1.5	20.5
WizardLM 1.0 Uncensored CodeLlama	34B	3	0	0	3	3.5	0	3.5	1.5	1	15.5
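
For readers who want to check the arithmetic, here is a minimal Python sketch of how each Total follows from the nine per-prompt scores. The scores and totals are copied from three rows of the table above. The `summarize` helper and the percentage-of-maximum figure are illustrative additions of mine, not part of the original scoring.

```python
# Each prompt is worth up to 5 points, so nine prompts give a maximum of 45.
MAX_POINTS_PER_PROMPT = 5

# Q1..Q9 scores and the published Total, copied from three leaderboard rows.
rows = {
    "Llama 2 Chat AYB (13B)": ([5, 5, 5, 3.5, 5, 5, 4.5, 5, 5], 43),
    "SynthIA v3.0 (7B)": ([5, 0, 4, 4.5, 5, 4, 4, 5, 4], 35.5),
    "WizardLM 1.0 Uncensored CodeLlama (34B)": ([3, 0, 0, 3, 3.5, 0, 3.5, 1.5, 1], 15.5),
}


def summarize(scores):
    """Return (total, percent of maximum) for one model's scores.

    Hypothetical helper: the percentage is not a column in the leaderboard.
    """
    total = sum(scores)
    max_total = MAX_POINTS_PER_PROMPT * len(scores)
    return total, round(100 * total / max_total, 1)


for model, (scores, published_total) in rows.items():
    total, percent = summarize(scores)
    # Sanity check: the recomputed sum should match the Total column.
    assert total == published_total, model
    print(f"{model}: {total} / {MAX_POINTS_PER_PROMPT * len(scores)} ({percent}%)")
```

The same loop works for any other row: paste in its nine scores and its Total, and the assertion will flag any mismatch.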
