This year I embarked on a project: a series of pragmatic evaluations using AI to document global holiday celebrations. Each month, I'll use a range of AI tools, including language models such as ChatGPT-4 and Bard and image generators such as Midjourney, to create content and imagery reflecting the essence of different international festivities.
As an AI enthusiast with relevant professional experience, my aim is to use these celebrations as case studies to examine the real-world applications and limitations of current AI technologies. This series will provide a grounded month-by-month analysis of how these AI models perform in practice, offering a clear view of their evolving capabilities and areas needing improvement.
Throughout the year, I'll share my findings on the efficacy of these AI tools in generating quality, relevant content and images related to each celebration. We'll explore not just the successes, but also the limitations and challenges faced, emphasizing the continued importance of human involvement in AI-generated outputs. This approach is not about overstating AI's capabilities but about realistically assessing and documenting its progress and practicality in a dynamic, real-world context.
January's Cultural Discoveries
Initial tests of five AI models for identifying and detailing country events showed variable output quality, favoring Claude for data collection and ChatGPT and LLaMA for content generation, despite their limitations. Further holiday content trials prompted refinements, such as developing MLK evaluation criteria with ChatGPT and Claude, but still exposed skewed results when context was missing. The January series showed promising AI capabilities amid persistent accuracy and reasoning constraints, with human guidance still required to achieve true reliability and value.
AI Trials: Data Collection and Deception
AI Trials: January Part 1: Initial experiments for holiday content creation.
AI Trials: January Part 2: Holiday data collection and comparison.
AI Trials: Historian Roles on History
AI Trials: January Part 3: Testing roles with varied specificity.
AI Trials: January Part 4: Article rating criteria creation.
February's Festival Insights
This series explores the use of ChatGPT-4 and Claude v2.1, along with the continued challenges presented by Gemini, in creating content. It underscores the importance of specialized roles and precise human oversight in guiding AI towards producing respectful and informative content. ChatGPT-4 shines with its depth, Claude v2.1 benefits significantly from detailed prompts, and working with Gemini reveals that it is little more than Bard under a new name. Through trials ranging from defining AI roles to refining templates, I highlight the crucial balance between leveraging AI's efficiency and ensuring content remains culturally nuanced.
AI Trials: Professionals to Specialists
AI Trials: February Pt 1: Defining specialized roles for AI in content creation.
AI Trials: February Pt 2: Enhancing templates and rating criteria.
AI Trials: February Pt 3: Challenges and triumphs in specialized roles.
AI Trials: February Pt 4: Validating role definitions for use with Claude.
AI Trials: Leveling the Playing Field
AI Trials: February Pt 5: Challenges in AI workflows.
AI Trials: February Pt 6: Structured prompts and roles.
AI Trials: February Pt 7: Prompts and response accuracy needs.
AI Trials: February Pt 8: Enhanced AI roles improve content creation.
March's Festival Chronicles
For March, I worked with Leonardo.ai, Midjourney, and Stable Diffusion to produce visuals for AI-authored articles related to cultural celebrations. This month's articles emphasize the importance of well-structured prompts, including the strategic use of negative prompts, in guiding the visual outcomes, and the need for prompt engineering to achieve satisfactory imagery.
AI Trials: Unveiling the Perils Within Beauty
AI Trials: March Pt 1: Unstructured experimentation.
AI Trials: March Pt 2: Improved prompts for acceptable imagery.
As an eternal tinkerer, my curiosity, passion, and sheer stubbornness fuel a relentless desire to experiment, learn, and share knowledge, which keeps my creative spirit ignited. I'm constantly looking for new areas to explore, driven by imagination to see where new and evolving technologies might take me.
Driven by passion, not profit, though a coffee is always welcome.
Disclaimer: The views and opinions expressed in this article are solely those of the author and do not reflect the official policy or position of Amazon Web Services (AWS). The author is a UX designer at Amazon Web Services (AWS) and has no involvement in, nor does their work pertain to, any collaborative agreements that AWS may have with Anthropic, the creators of Claude. The insights and analyses presented here are entirely independent and unrelated to any projects or initiatives between AWS and Anthropic. All content in this post is based on publicly available interfaces and is not influenced by the author's employer.