Concept to Code: On Training Character-Based AI

Category: Editorial

Date: Oct 30, 2024

Author: Avtr Shweta

Artificial Intelligence has completely redefined how we bring characters to life in the digital world. This first journal entry captures my journey in creating Avtr Shweta: a unique blend of DreamBooth fine-tunes, refined with specialized software tools.

The process began by selecting the ideal architecture for my AI model. I chose open-source foundations like Stable Diffusion XL and de-distilled versions of Flux. Their accessibility and adaptability made them a natural choice, with plenty of resources available online.

One of the early, pivotal decisions was choosing between a digital twin of myself or a fully AI-created character. After much thought, I chose to make Avtr Shweta my digital twin. Who better to capture my essence than me, right? Plus, the idea of seeing a digital version of myself interacting with the world was too intriguing to resist!

Using Kohya's GUI—a user-friendly interface for AI training—allowed me to handle complex machine learning tasks without deep technical expertise, whether on local machines or cloud GPUs. But as I delved deeper, I realized that refining the output demanded navigating a steep learning curve.

Challenges soon emerged. Fine-tuning was a balancing act: too much training led to overfitting, too little left the outputs generic. The learning rate (LR) was critical and needed meticulous adjustment. As the model grew in complexity, GPU resources and memory management became vital. No single model fully met my expectations: one excelled at clothing and environment variety, another nailed realism. Addressing these hurdles meant constantly adjusting input parameters in Kohya and endlessly testing outputs in SwarmUI.
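
To make the overfitting balance concrete, here is a minimal sketch of one way to choose among saved checkpoints. It assumes each checkpoint has been given a single numeric score, such as a held-out loss where lower is better. The function name and the numbers are hypothetical placeholders for illustration, not results from my training runs, and this is not how Kohya itself selects anything.

```python
# Hypothetical sketch: choose the checkpoint with the lowest held-out score.
# The step counts and loss values below are made-up placeholders.

def pick_checkpoint(val_loss_by_step):
    """Return the training step whose held-out loss is lowest."""
    if not val_loss_by_step:
        raise ValueError("no checkpoints to choose from")
    return min(val_loss_by_step, key=val_loss_by_step.get)


# Loss falls while the model is still learning, then rises once it overfits.
example_losses = {500: 0.21, 1000: 0.15, 1500: 0.12, 2000: 0.14, 2500: 0.19}
best_step = pick_checkpoint(example_losses)
print(best_step)
```

In practice a number like this only narrows the search; the final call still came from looking at test images.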

Then came a game-changer: model blending in ComfyUI! Suddenly, I unlocked a realm of new creative possibilities. It was only a matter of time before I crafted the perfect blend for my vision. Adding LLM prompting, SUPIR upscaling, SOTA image-to-video models, and DaVinci Resolve to my toolkit was the finishing touch that brought everything together seamlessly.
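
Conceptually, the simplest form of model blending is a weighted average of two models' weights. The sketch below illustrates that idea in plain Python, with small lists standing in for real tensors. The function name, the parameter names, and the toy values are all assumptions made for illustration; the merge nodes in ComfyUI may work differently under the hood.

```python
# Hypothetical sketch of weighted model blending (linear interpolation).
# Real checkpoints hold large tensors; short lists of floats stand in here.

def blend_weights(model_a, model_b, ratio):
    """Return ratio * model_a + (1 - ratio) * model_b for every parameter."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    if model_a.keys() != model_b.keys():
        raise ValueError("models must have the same parameter names")
    blended = {}
    for name, values_a in model_a.items():
        values_b = model_b[name]
        if len(values_a) != len(values_b):
            raise ValueError(f"shape mismatch for {name}")
        blended[name] = [
            ratio * a + (1.0 - ratio) * b for a, b in zip(values_a, values_b)
        ]
    return blended


# Toy stand-ins for the "variety" model and the "realism" model.
variety_model = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
realism_model = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
blended = blend_weights(variety_model, realism_model, ratio=0.5)
print(blended)
```

A ratio of 1.0 returns the first model unchanged and 0.0 returns the second, so finding a good blend comes down to trying values in between and comparing the outputs.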

In the end, I didn't just have an AI model. I had a complete solution, honed through trial and error, along with a wealth of experience for creating many more. Each lesson, adjustment, and breakthrough is now a product in itself, ready to help others bring their own AI creations to life.

And so, after countless hours of tweaking and testing, I'm thrilled to introduce Avtr Shweta, my very own AI character: a stunning fusion of DreamBooth fine-tunes brought to life by AI magic. She's stylish, classy, and ready to shine in the digital world. And as they say, the journey matters as much as the destination. So buckle up: this is just the beginning of an incredible adventure.