I mentioned ComfyUI in the description of several of my AI-generated deviations. For those who haven't looked it up yet - it's a StableDiffusion power tool: it's fairly complicated, but immensely powerful and can create several things the usual AI image generators can't. It also has plugins that allow for even crazier stuff.
So, first things first, you can download ComfyUI from GitHub. It comes with all the necessary dependencies and opens its GUI in your browser, much like the popular Automatic1111 frontend. Right off the bat, it does all the Automatic1111 stuff: using textual inversions/embeddings and LoRAs, inpainting, stitching the keywords, seeds and settings into PNG metadata (allowing you to load a generated image and retrieve the entire workflow), and then it does more Fun Stuff™. For example, you can rig it as a simple GAN-based upscaler with no AI generation involved whatsoever. Plug in any upscaler model from The Upscale Wiki, and you can blow 800x600 images from fifteen years ago up to four times their size with no noticeable quality loss - insanely handy if you want to use ancient stock photos from DA for a project today. Sure, the thing can't handle smaller images as well as commercial software like Topaz Gigapixel AI does, but it works with typical StableDiffusion output just fine. Other things you can do? Area composition, for example. What's that? Run a prompt for a landscape, halt it a quarter of the way in, run a smaller prompt for a character, halt that a quarter of the way in too, then combine the two latents by pointing at where you want the character placed and run a prompt containing both from where you stopped. Or ControlNets: want to recreate a pose from a photo or a movie poster without all the other things the AI might infer from an image2image prompt? Draw a quick scribble so the AI knows how and where to draw a firearm, for example? Render a proper hand with the gesture you want? It's possible.
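To give you an idea of what rigging ComfyUI as a pure upscaler looks like under the hood, here's a hedged sketch of that four-node graph in ComfyUI's API workflow format (the JSON you get from "Save (API Format)"). The node class names (LoadImage, UpscaleModelLoader, ImageUpscaleWithModel, SaveImage) are the stock ComfyUI nodes as I recall them; the file names are placeholders, and the model name stands in for whatever you grabbed from The Upscale Wiki.

```python
import json

# Each node is keyed by an arbitrary ID; inputs reference other nodes
# as [node_id, output_index]. File/model names below are placeholders.
workflow = {
    "1": {"class_type": "LoadImage",
          "inputs": {"image": "old_800x600_stock.png"}},
    "2": {"class_type": "UpscaleModelLoader",
          "inputs": {"model_name": "4x-UltraSharp.pth"}},  # any Upscale Wiki model
    "3": {"class_type": "ImageUpscaleWithModel",
          "inputs": {"upscale_model": ["2", 0], "image": ["1", 0]}},
    "4": {"class_type": "SaveImage",
          "inputs": {"images": ["3", 0], "filename_prefix": "upscaled"}},
}

# POSTing this payload to the local server's /prompt endpoint queues the job.
payload = json.dumps({"prompt": workflow})
```

No samplers, no text prompts, no checkpoint - just image in, GAN upscaler, image out, which is the whole trick.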
And then, there are plugins. I mostly use three sets of custom nodes:
Derfuu's Math and Modded Nodes for upscaling the latent images by ratio for a high-res fix,
WAS Node Suite for its text tools like string concatenation, converting string to prompt conditioning (so I can plug a line of additional detail requests into a prompt for the high-res fix, like face swapping) or adding a date and time to the output filename,
Fannovel16's ControlNet Preprocessors for generating ControlNet data from input images and plugging them into a new prompt in one go.
The WAS Node Suite can also do Derfuu's upscaling-by-ratio, just with extra steps (read the image size, convert it to numerical values, multiply those by 2 and feed the result to an upscaler node), and it can run Meta AI's SAM (Segment Anything Model), which I could use for automatic masking once I figure it out. The ControlNet preprocessors? I used them to help with inpainting on both "The Operator 2.1" and "Tiny Dancer".
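The "extra steps" version of upscaling-by-ratio boils down to simple arithmetic, which can be sketched in plain Python. One assumption I'm adding: StableDiffusion latents work at 1/8 pixel resolution, so it makes sense to snap the scaled size to a multiple of 8 - the snapping logic is mine, not necessarily what Derfuu's or WAS's nodes do internally.

```python
def latent_size_for_ratio(width: int, height: int, ratio: float = 2.0,
                          multiple: int = 8) -> tuple[int, int]:
    """Scale an image size by a ratio, snapped down to a multiple of 8.

    The snapping is an assumption on my part: SD latents are 1/8 of the
    pixel resolution, so dimensions should stay divisible by 8.
    """
    w = int(width * ratio) // multiple * multiple
    h = int(height * ratio) // multiple * multiple
    return w, h

# A 512x768 render doubled for a high-res fix comes out to 1024x1536.
print(latent_size_for_ratio(512, 768))
```

Reading the size off the image and plugging the numbers through nodes like this is exactly the "with extra steps" dance the WAS suite requires.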
For some reason, using the Protogen Infinity model for inpainting yielded weird results - for example, it consistently added only a small nub to the plate on Operator 2's sleeve that was left as an attachment point for an entire robot arm, forcing me to switch to image2image generation and re-render her as a full cyborg (with hilarious results like massive spotlights on the chest). So I used the preprocessor to read the pose from the base image for Operator 2.1 and feed it into the prompt along with an inpainting mask. With the additional data, Protogen Infinity properly drew a CyborgDiffusion-style left arm, along with that plate on the top and some skin matching the base image.
For Tiny Dancer's legs, I only masked off the legs themselves, which wasn't that great of an idea now that I think about it - a simpler trapezoid mask, coupled with a pose ControlNet taken from the base image, would probably have yielded a slightly cleaner result. Most of the robot legs I got from the inpaint were ridiculously thin, leaving the original thighs kind of hanging and forcing me to render about fifteen different versions, then bash two of them together in an image editor anyway. But, importantly, the ControlNet maintained the character's pose the entire time - it's also useful for averting the anatomical disasters common at low pass counts.
Another neat trick you can do with ComfyUI is the high-res fix: instead of rendering the latent image into a human-readable form, you can upscale it and feed it as the input for another render. This has two distinct advantages over simple upscaling, as you can see by comparing "The Boss of the Reformatory" (left), which was rendered in a single pass at 512x768 and then upscaled to 4x size with ESRGAN, and "Princess of Darkness" (right), which was rendered at 512x768, had its latent data upscaled to 2x, fed into a shorter, 20-pass render at the doubled size with a 50% balance between text prompt and input image (go below that and pixels start showing, go above and it goes off-model), and then upscaled to 2x the new size using BSRGAN. Not only is the image sharper, but the re-render at doubled size redraws all the details: just look at the tattoos. The left image has them blurry and blocky, while the right one looks straight out of a tattoo artist's Instagram. Sure, there's a bit of static in the corner of the model's lip that I initially wanted to photoshop into a side labret piercing (and I still might!), but other than that, it's perfect. However, a couple of times the anatomy went crazy even when the base image had it right, forcing me to ditch otherwise good renders that suddenly sprouted an unexpected hand somewhere.
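The high-res fix chain above can be sketched as a fragment of a ComfyUI API-format graph: one KSampler renders the base latent, LatentUpscale doubles it, and a second KSampler re-renders it at denoise 0.5 (the 50% prompt/image balance I mentioned). Node class names match stock ComfyUI as I recall them; the model, prompt conditioning and empty-latent nodes are elided, and all the IDs, seeds and step counts are placeholders.

```python
# Fragment only: a real graph also needs CheckpointLoaderSimple,
# CLIPTextEncode (positive/negative) and EmptyLatentImage nodes wired
# into each KSampler - elided here as placeholders.
hires_fix = {
    "base": {"class_type": "KSampler",
             "inputs": {"latent_image": ["empty", 0],   # placeholder ref
                        "steps": 30, "denoise": 1.0,    # full first pass
                        "seed": 42, "cfg": 7.0}},
    "upscale": {"class_type": "LatentUpscale",
                "inputs": {"samples": ["base", 0],
                           "upscale_method": "nearest-exact",
                           "width": 1024, "height": 1536,  # 2x of 512x768
                           "crop": "disabled"}},
    "refine": {"class_type": "KSampler",
               "inputs": {"latent_image": ["upscale", 0],
                          "steps": 20, "denoise": 0.5,  # the 50% balance
                          "seed": 42, "cfg": 7.0}},
}
```

The key knob is that second `denoise` value: below 0.5 the upscaling artifacts survive, above it the refiner pass drifts off-model.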
And best of all, it's free. What had me swear off online generators was that not only do most of them offer only the base SD1.5 and SD2.1 models, they also have you pay for generating images - which, combined, makes the entire enterprise look like a scam aimed at bilking suckers who have no idea what they're doing. Not only are the base models shit that can't generate anything half-decent even after 50 attempts at fairly high pass counts and guidance, there are also no tooltips, hints, references or anything of the sort at hand to help you. If you want to educate yourself about what's what, you need to go somewhere else entirely. Yeah, I get it, the computing power needed to run those things is immense and bills don't pay themselves, but getting the absolute basics for ten bucks a month kinda stinks. Meanwhile, I fire this thing up on my four-year-old gaming rig (i7 8700K and RTX 2070 Super) and not only do I get results faster, I can also achieve much better effects by plugging in additional datasets and models unavailable to website-based generators.
I was sure I'd named something on DA "Axing the Creative Block", given my love of questionably funny wordplay, and after an hour and a half I realized it's the gallery subcategory where I'd been putting execution-related art before.
Until now, it was populated with drawings, but I've been practicing something better for some time. Namely, photomanipulations mixing photos I take in my mini-studio and backgrounds coming from somewhere else.
And "somewhere else" turned out to be a Big Fucking Problem. If you've ever wandered into the wrong part of the internet, you might have seen people taking really low-res versions of pictures stolen from stock photo sites, pasting them together with really badly cut-out photos of themselves for a quick wank, and not caring about the quality at all. Which is exactly not how I intend to do it.
First and foremost, non-negotiably, I'm not going to work in low res. Any precision in selecting, erasing, matching elements and so on goes out the window in that case. I'd say useful stock starts somewhere between 6 and 12 MPix for the main scene, depending on quality - the sharper, the better. Sloppily edited-out (if not outright ignored) watermarks allowed me to trace some of the backgrounds to commercial stock photo sites like Alamy or Shutterstock. Okay, art requires sacrifice and expense, so I was prepared to cough up some dinero for pics of acceptable quality that took at least a bit of effort to frame, shoot and put up for sale.
Except the number of pictures and their quality turned out to be lower than the very grey-area alternatives, for three main reasons:
Wanderlust, curiosity and FOMO take people to very interesting places with rich history, and hiking route websites have a "user gallery" feature for exactly that reason.
The weird, dark fetishes of Germans apparently influence their work, at least as far as regional tourism departments go. Those crazy bastards maintain actual gallows (or faithful reproductions) out in the sticks and market them as "historical monuments" and "tourist attractions".
Computational photography can make anyone with minimal imagination a competent photographer, and everyone carries the hardware on them at all times.
I mean, fuck. Folks go on hikes and bike trips with their smartphones, snap photos of old gallows and scaffolds, and those are, on average, better than the shit with a $30 price tag. Because the "professionals" either insist on shooting against the sun (which shows in the shadows, not to mention that the side facing the photographer comes out darker) or insist on high ISO values, which makes the photos annoyingly grainy.
However, the very grey area means that practically, if I yoinked that pic out of some hiker's gallery and went to town with it, they wouldn't even know, much less care. Terms of service for websites like DeviantArt, however, hinge on theory, and theory states that in doing so, I violate the poor schmuck's intellectual property rights that he doesn't even know he has. Which is a Bad Thing™ that would result in me getting punted before even buying the Core Membership and putting a price tag on the whole set.
To make a photomanipulation, you need a photo, right? Right. So I decided to ask around whether any of my modeling acquaintances were in the mood for grim & evil. Even after sharing the specifics, I'm left with four potential models, and I'm thinking that aside from more straightforward material intended for manipulation (chromakey green background, even lighting), I might simplify things and do one of those minimalistic shoots: black background, more directed lighting.
The idea is there, without distractions, and I can play fast and loose with the angles instead of having to match whatever stock photo I intend to use. That, and each model will receive her photos considerably faster, because I won't have to bother with layering everything precisely - matching shadows, cutting into the original stock photos to put the model behind a grate, et cetera.
The process of assembling photos into one picture that makes sense and is presentable will be long and painful, but I'll manage. Like I mentioned before, I'm a bit limited as to what DeviantArt can accept as the base material, but luckily I found a couple of interesting stock photos (or ones whose creators don't expressly discourage others from using as such), either free or for spare change.
Like I said, I have four potential models for that one, which will make for a nice group shot of some kind. I also have a separate idea for one of the models that involves an entirely different set of backdrops - luckily one coming in its entirety from a single Wicasa-stock shoot.
How long have you been on DeviantArt?
My page says it's been eleven years. Whoa.
What does your username mean?
It's a derivative of the Internet handle I used back then. Maybe I should change it to SeriouslyMike or something similar.
Describe yourself in three words.
Insufferable genius, sarcastic.
Are you left or right handed?
Righty-ho, sir!
What was your first deviation?
A drawing of one of the characters from my fantasy stories.
What is your favourite type of art to create?
These days - props of all sorts. I aim for having some kind of physical interactions with those, actually - like, you press buttons, flip switches etc., and lights blink, bubbles rise from a water tank etc.
If you could instantly master a different art style, what would it be?
Digital painting. Much handier than colored pencils, from what I've already noticed.
What was your first favourite?
A speedpaint of a cat.
What type of art do you tend to favourite the most?
Hard to tell. My favorites are a mess.
Who is your all-time favourite deviant artist?
I'd assume it's goldseven and Saimain.
If you could meet anyone on DeviantArt in person, who would it be?
DaveAlvarez, I assume. I'm not really following DA much these days, I admit.
How has a fellow deviant impacted your life?
I can't really say that anyone did.
What are your preferred tools to create art?
That depends. For a time, it was a pencil, a pen and colored pencils; right now I'm relying on a Dremel, a set of files, a soldering iron and InkScape to design patterns for laser-cutting and embroidering. That, and spray paint.
What is the most inspirational place for you to create art?
Now that you mention it, I think I need some workshop space.
What is your favourite DeviantArt memory?
Oh, prepare for some ancient history. Back in 2009, Heidi and spyed dropped by for a DevMeet - which, considering that Poland is a pisshole west of hell, right beyond the borders of any civilization, was a pleasant surprise. And boy howdy, did we have fun there.