Make Playing on the Computer Fun Again
Last night, I stayed up late and burned credits on GPUs that didn't belong to me animating people that don't exist. And it was the most fun I've had in a long while "on the computer".
Remember when "playing on the computer" was a thing? Before working remote and having a computer glued to you, you'd go visit the computer and make it do fun things. I feel like we're in the golden era of that right now.
You can check out the video I made last night below. It's super immature and inspired by the comedy style of I Think You Should Leave:
How to Generate Movies with AI
If you'd like to make something like the video I made, or maybe even be more ambitious (space? medieval scene?), there are plenty of options for tooling. I used Kling 1.6, Hedra, ChatGPT / Sora, and ElevenLabs. I also used Pixabay for audio assets.
Other AI image-making tools:
- Fal.ai (various models)
- Kling
- Flux
- Midjourney V7 (just out as of this writing)
Other AI video-making tools:
- Veo2
- Pixabay
- Minimax
- Dreamina (LumaLabs)
- Higgsfield
- Pika Labs
AI Filmmaking Workflow
I had this stupid ketchup skit idea for a while after riffing with a friend on it months back. So I plugged in my headphones and talked to ChatGPT voice-mode about my idea and how I could produce it. I asked dumb questions like how "real" skits are made and if I should have a storyboard and how camera angles worked.
Initial Ideation
I made ChatGPT give me a rough listing of scenes and what each different camera view should be pointed at. I also made it write a script, but I wasn't too rigid with that. I'd riff with it like a friend, giving direction on the style of comedy I wanted, phrases I liked or didn't like, and dialogue I thought was funny and fit in well.
After that, I generated a "character sheet" of each of the characters with ChatGPT. I made ChatGPT come up with the character descriptions btw. Those came out pretty high-res and looked good.
Here's our main character, Ty:
Generating AI Scenes
I had one of these pics for each of the main characters + the waitress. These were the pictures I then plugged back into ChatGPT or Sora to get each camera angle or "scene".
For a majority of the scenes, I tried to focus on the "over-the-shoulder" shot and plugged in all the characters and gave it a prompt like this:
The camera is focused on this man with the dark hair and joker shirt. The woman is sitting to his left. They're in a dimly lit, cozy restaurant with warm, ambient lighting. Wooden table with a checkered table cloth. Over the shoulder shot where the red haired man and the blonde woman are the shoulder characters.
That gave me some pretty decent scenes for areas I needed dialogue. If I wanted to flip the camera, I just flipped the prompt and changed who was being focused on vs the foreground character.
This seemed to give me enough character consistency throughout the video.
I would have liked better scene consistency; for all of the areas inside the restaurant, I just reused the same description of the restaurant:
dimly lit, cozy restaurant with warm, ambient lighting. Wooden table with a checkered table cloth.
Maybe I should have generated an empty scene first, then placed the people in it.
Generating AI Dialogue
From here, I stuck with 3-4 voices in ElevenLabs and typed out what I wanted each line to be. I hit regenerate a few times on a lot of them until the spacing and pace sounded OK. For some lines, I added extra punctuation (!!!!) and stuff like "he said angrily" on the end, which seemed to give angrier outputs.
I later learned that ElevenLabs has an Audio to Audio mode that would be a lot less monotone. There's another tool called RVC for voice creation, but I haven't tried that yet either.
Tying the Audio to the Video
Once my audio sounded OK, I downloaded it as an MP3.
Then, I uploaded my still shot from ChatGPT/Sora to Kling 1.6 and gave it some direction as a prompt, usually something like: "He speaks and talks to his friend as she watches". I mostly got just one person to speak, although in a few buggy spots two characters' mouths move for a single piece of dialogue.
Once Kling does its thing, you're still not done. The video gets generated, but then you need to add "local dubbing": find your MP3 from ElevenLabs, drag it in, burn a few more credits, and Kling ties the audio and video together.
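By the way, for shots that don't need lip sync (establishing shots, background noise, music beds), you could probably skip the dubbing credits and glue the audio onto the clip locally with ffmpeg instead. Here's a minimal sketch, assuming ffmpeg is installed and using made-up file names; Kling's dubbing is still what does the actual lip sync, this just muxes sound onto a clip:

```python
import subprocess

# Hypothetical file names -- swap in your own Kling clip and ElevenLabs/Pixabay MP3.
subprocess.run([
    "ffmpeg",
    "-i", "kling_clip.mp4",    # silent video from Kling
    "-i", "line.mp3",          # dialogue or foley track
    "-map", "0:v",             # take video from the first input
    "-map", "1:a",             # take audio from the second input
    "-c:v", "copy",            # don't re-encode the video
    "-c:a", "aac",             # encode the MP3 as AAC for the MP4 container
    "-shortest",               # stop when the shorter stream ends
    "clip_with_audio.mp4",
], check=True)
```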
More Audio & Video with Hedra
Hedra Character 3 is the other main tool I used, usually for close-up shots of characters' faces when they talk. It's pretty simple: upload an MP3, upload a picture, hit go. I was pretty satisfied with the results.
Organize Your Stuff
I kept all my assets locally on my Mac in folders like ketchup_fart/characters, audio/jennifer, audio/generic, etc. Do whatever feels right here.
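If you want to scaffold that structure up front, here's a tiny sketch; the folder names are just the ones from my project, nothing any of the tools require:

```python
from pathlib import Path

# Example layout only -- rename these to whatever fits your project.
folders = [
    "ketchup_fart/characters",
    "audio/jennifer",
    "audio/generic",
]
for folder in folders:
    Path(folder).mkdir(parents=True, exist_ok=True)
```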
From here, I pulled the video clips I'd saved into iMovie, moved stuff around, and watched it once, twice, thrice, over and over again. I knew I was getting somewhere when I was making myself laugh at this stupid fucking video.
Pixabay is another great resource for what I'm pretty sure is called foley or "environmental noise" and sound effects. I got some pretty good restaurant background noise, car background noise, exhaust noises and a suspenseful music track for one of the character's serious moments.
Learn Your Camera Angles
It's helpful to learn a few different camera angles, so go Google those, use them in your prompts with Kling or whatever tool, and you should get a decent output.
I learned the name "over-the-shoulder", which I'm sure is day one of film school, but it became a core of the video and worked pretty well for dialogue.
The other fun one I had heard of was the "dolly zoom". Not sure where I saw it, but it's a pretty famous camera move used in Jaws and a bunch of other stuff. AI indeed pulled it off, and it was super creepy and cool.
BTW, on the camera subject, there's another model out called Higgsfield that makes some downright mind-blowing camera maneuvers. Worth checking out.
What Do AI Video Generation Tools Cost?
I paid:
- $0 for ElevenLabs (but they'll probably get me as I burn through credits)
- ~$26 for 3000-ish credits on Kling (I used around 300).
- $10 for Hedra (I used all credits, but some were for other projects)
- $20 for ChatGPT/Sora...but I use it for plenty of stuff besides this. Picture generation doesn't cost anything extra or use credits, but you can only do 1-2 at a time.
- $15 for Runway...but I didn't end up using it. So I guess I shouldn't have mentioned it here. 😛
So roughly $60-$80. I still have plenty of Kling credits left and I intend to use them. I don't think they roll over.
I'm not a big fan of the Dave & Buster's-esque credits system these companies use, but I guess that's the best there is right now. Some other AI video generation companies have an unlimited plan that allows slower generations but as many as you want. Read up on what credits cost and how many you'll need.
In my case, Kling gave me 3000 credits for ~$26. A 5-second video is 20 credits, or 35 for "premium", which is higher quality. Lip syncing, I think, was 5 credits. 10-second videos were roughly 40 credits.
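For a rough sense of per-clip cost, here's the back-of-the-envelope math using those numbers (Kling's real pricing may differ or change):

```python
# Rough per-clip costs from the credit counts above.
dollars, credits = 26, 3000
per_credit = dollars / credits          # ~$0.0087 per credit

costs = {
    "5s standard (20 credits)": 20 * per_credit,   # ~$0.17
    "5s premium (35 credits)":  35 * per_credit,   # ~$0.30
    "10s clip (40 credits)":    40 * per_credit,   # ~$0.35
    "lip sync (5 credits)":      5 * per_credit,   # ~$0.04
}
for label, cost in costs.items():
    print(f"{label}: ${cost:.2f}")
```

So even a "premium" 10-second clip is well under a dollar; the credits mostly disappear into regenerating takes you don't keep.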
I think all this stuff'll come down in price soon enough. I'm not happy to pay for it but I'm glad these revolutionary tools are available at a somewhat reasonable price.
What I'd Do Differently
Upscaling: Another tool I played with is Topaz Gigapixel. I only grabbed the trial; the full edition is $85. Gigapixel uses AI to upscale images up to 6x as big. It's absolutely crazy. I think a lot of people use it for printing stuff on posters or other large-format work. Anyway, I'm wondering if plugging my GPT images into Gigapixel and then Kling would have produced better results.
Voice Generation: As I mentioned before, the voices were monotone, which was sorta funny. I wanna check out the RVC tool and audio-to-audio on ElevenLabs.
Scene Creation: For better scene consistency, I'd like to have created a few angles of the restaurant with no one in it. That might have saved me from adding the description to the prompt every time; instead I'd just upload the reference image.
Script Writing: I'm pumped to learn more about comedy and sketch writing. I had a friend tell me to just ship funny stuff, and while I think that's true, I think you can learn the fundamentals and get better outcomes (true for every domain?). But yes, also ship and practice.
I enjoyed this video by Casually Explained on Comedy. Maybe I'll pick up some books too.
Learn More Camera Angles and Filmmaking Stuff: Again, knowing camera angles helped me out, even though the scene was simple. Now I'm watching Netflix and thinking intentionally about how they film things, visual elements, etc. Wild.
Useful Links for Making AI-Generated Videos
How I made a Time Travel Movie with AI - This guy had some helpful tips and I learned you could morph your face into an AI video and even change your voice. Good stuff.
Aze Alter - AGE OF BEYOND: If Humans & AI United
This artist, Aze Alter, makes absolutely INSANE videos with some of the AI tools I've discussed. This is actually what inspired me to make my dumb little video: "Wow, you can do THAT with AI?". Scroll through his other videos; there are some very creepy BioShock-like ones. They stuck with me.
Aze also has a Patreon with some behind-the-scenes looks at the process. I'll probably grab that.
If you don't think there is art, skill and creativity involved in what he's made, I don't think I could ever see eye-to-eye with you.
Career Advice For Filmmakers in a World After AI - I very much enjoyed this video. This guy explains how filmmakers' roles are going to change with AI. He isn't happy about it, but is accepting of it. He suggests filmmakers early in their careers find AI-resistant roles like documentary, live-event, and news filming, where the need for accuracy is high.
He also says that entry-level roles are going to disappear and that filmmakers should be creatives instead of technicians. All of the technical stuff will probably be controlled by AI.
I feel pretty similarly about AI and tech / software development (and other fields). Entry-level roles are gonna get wrecked (and maybe mid and senior too?). What pathway exists for people in those roles, or for n00bs who want to climb the ladder? I dunno. 🤷
Is AI Video Generation Ethical?
I posted my stupid video to 2 subreddits.
First, r/aivideos LOVED it. I was surprised; I wasn't sure anyone would laugh at this besides me.
Then I posted it to r/IThinkYouShouldLeave, the subreddit for the show that inspired the video. They absolutely hated it. The post got removed!
A moderator commented:
We don't want to hear from ChatGPT here or see some artwork made by a computer. Be a human and make stuff for other humans.
(Also, lmao, I know I'm in too deep writing this at all when I've let a reddit mod affect anything in my life. I guess I'm butthurt.)
Anyway: I have a very hard time with the anti-AI people. I believe we probably don't agree on a lot of things. I do get the sentiment, though. Here's me being as anti-AI as possible:
AI is just another tool for the corporate elite tech-bros to vacuum up art (read: steal) into their systems so that we erase real art, put artists out of a job, and put entry-level workers out of a job too. It is dystopian, and the GPUs burn so much energy that they should be outlawed. It's a soulless, disgusting tentacle of the boot holding me down.
Ok, that was maybe purposefully a bit hysterical to reinforce my own beliefs.
But I still don't agree. I think AI is fuckin' cool, and I think it gives more people cheaper access to being creative. Of course you can always "do art" cheaply, but I just see AI as another medium or tool to express yourself.
The benefits outweigh the side effects. Skill is no longer needed to generate an image or video. But maybe taste is needed to determine if you should keep it. Or what should go alongside it. Or how to get exactly the outcome you want.
I think you gotta have some conviction and think deeply about things, all while accepting you could be wrong. I'm fairly certain I've thought more deeply about AI art than most redditors. In the future, I'd like to find some good arguments against AI and research them to find the truth.
I am a human and I used a tool to make stuff for other humans. I spread a little piece of stupidity and 14-year-old humor through the internet via electrons, man. 🤯
What Now?
I can't end this post fired up about AI-hating Redditors.
Besides, the other subreddit was so excited that it made me want to create more stuff. They noticed things I didn't that were funny. Or pointed out things that could be improved. Or mentioned ITYSL, which was a welcome compliment.
Generative AI is moving so quickly that it is hard to keep up. I think a good way to keep up is to just go try stuff out.
Even better if you have a real application...even if that's just making "stupid" things that make you and your friends laugh or smile. (Sounds pretty close to the definition of art to me.)
I've got some more Kling Credits and I think I might burn them up expanding the storyline and lore of the Brisket Bandits. These guys need their story told.