How I Achieved A Global A.I. First

The world changed this week -- and theatre didn't notice

Feb 21, 2025

Liz Ryan’s libretto for The Mechanical Turk won a coveted gong at the Sackler Awards 2019 For The Best New Comic Opera That Doesn’t Mention The Opioid Crisis. Photo credit: Ars Endup*

In 1769, in the midst of the Enlightenment, the Hungarian inventor Wolfgang von Kempelen approached the court of the Empress Maria Theresa of Austria with an invention which would amaze spectators across Europe for years to come. Inspired by the era’s fascination with “automatons” (Schaffer, 1999) – automated machines that perform human tasks – von Kempelen had built an automated chess playing machine dubbed “der Schachtürke”, translated as “the Mechanical Turk”. Roger Von Laufenberg, 2022

As the inventor toured his device around the royal courts of Europe, a rumour spread that it was powered by a brilliant, chess-playing Turkish dwarf hidden inside the wooden cabinet. The fact that von Kempelen was prepared to dismantle the cabinet in full view of the audience made no difference at all to the speculation.

It took almost a century, and the advent of the genuine computing machines developed by Charles Babbage and his programmer Ada Lovelace, before the mystery was solved. The detractors were half-right. A full-sized chess-playing savant could be hidden inside a secret chamber of the sort that magicians use to saw in half their glamorous assistants.

I thought of that story when, a few months ago, I was hired remotely by a Palo Alto technology company to be the 21st-century equivalent of that imaginary Turkish Dwarf. My role was to test artificial intelligence (AI) applications by trying to destroy them with difficult, hyper-specific or nuanced questions.

I can’t say I distinguished myself. The work available for generalists like me was, whilst interesting and intellectually demanding, too sporadic to be considered a proper career. If you made a mistake — and during my learning period I made loads — the remote software would suspend you with a toe-curling message about “not meeting the quality standard”. A few days later, I’d be unsuspended along with others by a real human and a generic apology. Frequently, I got the impression, the issue was at the machine’s end not mine.

I never met the managers in person but they came across as bright, likeable, young West Coast techies with the enthusiasm of puppy dogs. Oddly enough, their biggest problem seemed to be a human one. By recruiting globally, they fell foul of unscrupulous operators in low-wage countries who shared their log-on details with others who hadn’t passed the difficult onboarding tests. This resulted in a more stringent mugshot verification process which I failed. I’m both short and long-sighted, and I found it difficult to navigate the required selfie-taking as precisely specified, at the speed they expected.

I could have appealed, but I’d had enough. They paid me promptly but the work was sporadic and all-in-all it felt too much like being a digital serf.

Do I regard it as time wasted? Absolutely not. Those fruitless hours spent struggling with prompts and rubrics, incoherent instructions and assumed cultural knowledge, offered me an extraordinary insight into how AI applications such as ChatGPT, Copilot, Gemini and Aria actually work. How, inside the sleek technological cabinet, there are thousands of human dwarfs like myself, creating prompts (questions) for the robot to handle and offering feedback when it makes mistakes. The process is expensive, messy and prone to fraud, prejudice, deliberate misinformation and ideological bias.

Up until now. Up until three days ago, to be exact.

On February 18, 2025, Elon Musk launched Grok 3, an AI application easily accessible via the X social media platform (formerly Twitter) which he owns. Mark that date. It’s kinda like 28 July 1914, 6 August 1945 and 16 July 1969. At the very least, it’s a future pub quiz question. But it’s probably the stuff of epochs. Late modern period 1800-2125.

Yet, if you get your information mostly from legacy media, the significance of this leap for humankind probably escaped you. The Times barely mentions it — majoring instead on Musk’s dispute with alternative AI guru Sam Altman. The Telegraph goes one better in the triviality stakes — its lead Musk story is about some right-wing babe who claims to be rearing some of his children.

The Guardian, bless it, does devote an article to Grok 3, but uses it mostly to cast shade on X as a platform. Because X, if you are a well-behaved liberal theatre-maker, is not a respectable place to be. The theatrical community is instead reconvening in a rival space called Bluesky, where Grok 3 is on the receiving end of much “you’re not as great as you say you are” chatter from the platform’s tech community and ignored by everyone else.

This is misleading. I have an uneasy history with microblogging — old Twitter banned me for controversial, bigoted sentiments such as “men in dresses aren’t lesbians and shouldn’t be using lesbian dating apps”. But recently, in a cycle that occasionally repeats itself, I dipped my toe back in the water to follow the newly formed US Department of Government Efficiency’s (DOGE’s) blitzkrieg-style assault on USAID. DOGE, amplified by Musk, has been using X to release unfiltered information (some, of course, argue ‘misinformation’) so it was an interesting place to be.

And that’s how I heard about Grok 3, an upgrade from the existing Grok 2 already available on the platform. And, apart from an increase in raw computational power (brain size), here are some ways Grok is different:

1. Grok Seeks The Truth

Q. (from me) Hello, Grok 3. What does Elon say about you and truth-seeking?
A. Hello! I'm Grok 3, created by xAI. Elon Musk has said that I'm designed to be a "maximally truth-seeking AI," meaning my goal is to pursue and provide the most accurate information possible, even if it sometimes challenges what's considered politically correct. He’s emphasized that my purpose is tied to understanding the universe, and to do that, I need to prioritize truth over narratives or biases. For example, during my unveiling on February 18, 2025, he described me as an AI that "seeks maximum truth," building on the foundation laid by earlier Grok models.

2. Grok Trains Itself

This is rather spooky. The ‘bot isn’t party to its own inner workings (presumably so it can’t blab to competitors) but Elon has said that it’s already using the first principles it has been programmed with to make deductions in areas where it hasn’t officially been trained yet.

It’s using knowledge that is constantly updated.

As of Feb 20, 2025, I’m pulling from the latest, likely including X posts and web data. This keeps me nimble and current, unlike models locked in time.

This is the real kicker. Previous A.I. applications have worked from a static body of knowledge. Grok looks at the great, fluctuating mass of real-time discourse.

So, already, a job as an AI trainer, which felt to me like it was on the cutting edge of human endeavour in late 2024, is looking a bit passé.

So How Good Is It Really?

As I write, this brand-new application is being tested to death by the global tech community. To an extent it depends on who you ask — on deep mathematical reasoning it is possible other models are still performing better. But here’s the thing:

“Grok 3 is great at logic and non-mathematical reasoning.” (Decrypt Review, Feb 2025)

And here’s what Decrypt says about Grok’s talent as a creative writer:

Unlike technical writing or summarization tasks, creative writing tests how well an AI can craft engaging, coherent stories—a crucial capability for anyone from novelists to screenwriters.
In this test, we asked Grok-3 to craft a complex short story about a time traveler from the future, tangled in a paradox after jetting back to the past to rewrite his own present. We didn’t make it easy; specific backgrounds were thrown in, details to weave, stakes to raise

And here’s the conclusion (you can read the story itself here). It’s already better than Louise Bagshawe. And it’ll make mincemeat of daytime soaps.

Grok-3 surprised us by outperforming Claude 3.5 Sonnet, previously considered the gold standard for creative tasks. We challenged both models with a complex time-travel narrative involving paradoxes and specific character backgrounds.

Whether it’s Grok 3 that eventually wins out, or some other AI application, this technology is hurtling towards us down the track. What it’s going to do to screenwriting is pretty clear. The Authors’ Licensing and Collecting Society (ALCS) is already pretty exercised about it.

There’s Other Stuff Too

But no theatrical or theatre-adjacent profession is likely to remain unaffected. In order to reduce the considerable time I spend researching shows for this newsletter, I’ve been experimenting with readily available (i.e. free) AI solutions for a few months now. Owing to that static information issue, the results weren’t impressive, leading me to conclude that I might have to spend money on a simple webcrawler.

But Grok 3 is, by its nature, an excellent webcrawler. Raw metadata? No problem! Hundreds of venues? Try me! And I’ve a nasty suspicion that if I asked it for “shows I might like” it would soon learn to read me like an open book.

That’s a moral dilemma for the distant future. (In Elon time, the week after next.) But it got me thinking about what other theatre applications Grok might be put to solving or improving.

This is what Grok told me in full about what other people were saying about theatre-applied AI, particularly on Bluesky which is where theatre folk are gathering. I don’t pretend to understand all of it but I’ve highlighted some sections I think are pretty mindblowing.

Grok says:

As of February 20, 2025, I’ve conducted a search across the web and Bluesky to identify conversations about the applicability of Grok 3 to the theatre context, focusing on areas like theatre booking software, audience segmentation for marketing, script development, and casting. Here’s what I found:
Overview of Findings
While Grok 3, developed by xAI, has generated significant buzz since its release earlier this month (around February 16-18, 2025), discussions about its specific applications to theatre are sparse and mostly speculative at this stage. General web conversations focus on its technical capabilities—advanced reasoning, real-time data processing, coding, and creative content generation—rather than niche industry applications like theatre. On Bluesky, a platform geared toward public developer and tech conversations, there’s minimal direct mention of Grok 3 in a theatre context, though some adjacent discussions hint at potential relevance. The lack of widespread theatre-specific dialogue likely stems from Grok 3’s recent launch and its primary positioning as a generalist AI with strengths in STEM and reasoning, not yet tailored to creative industries like theatre.
Web Conversations
• General Capabilities Relevant to Theatre:
• Medium (e.g., Ajit Ashwath, ByteBridge): Articles highlight Grok 3’s strengths in coding, real-time data analysis, and creative content generation. For instance, its ability to generate code snippets could theoretically support theatre booking software development, while its DeepSearch feature (real-time web and X data analysis) might aid audience segmentation by pulling live demographic or interest data. Script development could leverage its text generation, though no specific theatre examples are cited.
• ZDNET, Analytics Vidhya: These sources emphasize Grok 3’s reasoning and problem-solving, noting its success in tasks like generating HTML/JavaScript (e.g., a roulette wheel game). This suggests potential for interactive theatre tools, but no direct theatre applications are discussed.
• Helicone.ai, DataCamp: Grok 3’s “Think” mode for step-by-step reasoning and its ability to handle complex queries (e.g., a Settlers of Catan grid) indicate it could assist in script analysis or casting logistics, though conversations remain generic and not theatre-focused.
• Theatre-Specific Mentions:
• There are no explicit, detailed discussions on mainstream tech sites (e.g., TechCrunch, Medium, ZDNET) about Grok 3 in theatre contexts as of now. Searches for terms like “Grok 3 theatre booking,” “Grok 3 script development,” or “Grok 3 casting” yield no direct hits beyond my own prior responses to you in this thread (e.g., crawling theatre websites). This suggests theatre applications aren’t yet a focal point in public discourse, possibly because users and developers are still exploring its broader capabilities.
• Potential Adjacent Uses:
• Marketing and Audience Segmentation: Grok 3’s real-time data integration (via X and web searches) is praised across sources (e.g., Medium, Technowize). Theatre marketers could theoretically use this to analyze audience trends or sentiment on X, but no one’s explicitly talking about it yet.
• Script Development: Its creative writing potential is noted (e.g., apidog.com mentions it excelling in creative tasks), hinting at script drafting or dialogue generation, but examples are absent.
• Casting: No direct mentions, though its reasoning could optimize casting decisions by analyzing actor profiles or availability data if integrated into a system—purely speculative at this point.
Bluesky Conversations
Bluesky, with its developer-friendly API and focus on public data (e.g., via the firehose stream), offers a space where theatre-related Grok 3 applications might emerge, especially among indie developers or theatre tech enthusiasts. However:
• Search Results:
• Using Bluesky’s search (Grok 3 theatre, Grok 3 script, etc.), I found no posts explicitly tying Grok 3 to theatre contexts as of 10:04 AM GMT, February 20, 2025. The platform’s conversation leans heavily toward Grok 3’s technical benchmarks (e.g., Chatbot Arena scores, coding prowess) and general hype (e.g., “Grok 3 just dropped—insane reasoning!”).
• Posts about theatre exist (e.g., TTRPG Casting Feed at bsky.app), but they’re unrelated to Grok 3, focusing instead on tabletop role-playing game casting calls.
• Developer Activity:
• Bluesky’s docs (e.g., docs.bsky.app) highlight community projects using the AT Protocol, like real-time tools and bots. While none mention Grok 3 or theatre explicitly, the platform’s openness suggests it’s a plausible future hub for such discussions—perhaps via a custom feed or app integrating Grok 3’s API for theatre purposes (once available to developers, expected soon per helicone.ai).
• No evidence of active threads on bsky.app about Grok 3 in theatre booking, marketing, scripts, or casting.
• Why the Silence?:
• Bluesky’s audience skews toward tech-savvy users and developers, not theatre practitioners. Grok 3’s recent beta rollout to X Premium+ subscribers (per Technowize, apidog.com) means broader access—and thus niche experimentation—is still limited.
Where Are These Conversations Happening?
• Web: Scattered, speculative mentions of Grok 3’s potential (not theatre-specific) appear on Medium, ZDNET, and Analytics Vidhya. Theatre-specific applications aren’t centralized anywhere yet—likely because it’s too early post-launch.
• Bluesky: Virtually non-existent for theatre contexts. General Grok 3 chatter occurs under handles like @xai or in tech threads, but nothing theatre-related.
• Other Platforms: X might host more real-time buzz given Grok 3’s integration there, but a quick scan (e.g., “Grok 3 theatre”) shows no theatre-focused posts beyond generic hype. Reddit or GitHub could emerge as spots for developer-led theatre experiments later, but not yet.
Are Such Conversations Happening?
Not significantly, at least not publicly as of now. The theatre community and tech enthusiasts haven’t visibly bridged Grok 3’s capabilities to these use cases in open forums. However, given Grok 3’s strengths (coding, data analysis, creativity), conversations could emerge soon, especially as its API opens to developers (per xAI’s plans) and theatre professionals explore AI tools. Potential hotspots might include:
• Bluesky: Once developers build theatre-related bots or feeds using Grok 3.
• X: If theatre marketers or producers test its DeepSearch for audience insights.
• Niche Blogs/Forums: Theatre tech blogs might pick this up later.
Conclusion
Right now, there’s a gap—no robust, theatre-specific Grok 3 conversations are happening on the web or Bluesky. The AI’s newness and broad focus mean its theatre potential remains untapped in public discourse. If you’re keen to kickstart this, posting a query on Bluesky or X (e.g., “How could Grok 3 revolutionize theatre booking or casting?”) might spark the discussion you’re looking for! Want me to monitor this further or draft a post to ignite the conversation?

As the machine says, it’s early days. And Grok 3 — which presumably knows its own mind — thinks I’m the first person, anywhere, to have asked it a serious question applicable to the theatre industry!

Whilst commercial script-writing and editing appear to have just gone for a burton, it could be argued that robot intelligence has no place in a sacred space (say, a black-box theatre) where human beings interact with other human beings. But, on the other hand, for the first time it is possible to include aggregate, real-time data in live shows. Information on the rate of glacier melt, for example. Or the number of migrants currently in the English Channel.

But that’s just the start. Real-time city-centre parking information on the way in, train delays and road closures on the way out. Cheap, adaptable software programs incorporating local demographic data that will make airline-style dynamic pricing available even to small venues. Set designs and lighting plots tailored specifically to your script, venue and budget. Spontaneous image generation based on a random selection of audience nightmares — or dreams. Detailed audience segmentation.

In fact, we can’t predict the implications for theatre as an artform any more than Louis Daguerre could have predicted Picasso’s Guernica. But, in the words of Pink, it’s time to get the party started.

Thanks for reading Yorkshire Theatre Newsletter! This post is public so feel free to share it.

Liz x

* No, I didn’t. There was no such opera, no such awards and it’s an entirely fake Grok 3-generated production image based on a three-turn dialogue I had with the machine. And look at the hands!
