AI Academy Ep 1: Getting Started with Vidu
- SORAY-AI

- Feb 3
So you're looking to make your very first AI video with Vidu? You've come to the right place! Have a seat and we'll get started. I'll try to make everything as beginner-friendly as possible so you can jump right into the fun of creating!
First of all, what is Vidu? If you want, you can check out their website before we dive in, just to see what it looks like: https://www.vidu.com/home/
Special Bonus: Use code VIDUVISION for bonus credits on sign-up! Limited time promotion for new members~
Vidu is a platform that I find to be very user-friendly and easy to pick up and learn, where you can create your own videos, images, references and more with a few clicks. It only takes a short while to sign up, and you can get going right away. There are some newbie tasks that you can complete if you want for some credits, but I won't be covering that in this article.
NOTE: This article does not cover the brand new models Vidu Q2 Reference to Video Pro or Vidu Q3, but the knowledge within is comprehensive and transferable. You may skip the final short section “Dubbing Mode” which explains a feature from Vidu Q2. I will be writing a new article this month (February 2026) to do a deep dive into the models, so stay tuned!
On the left side of the screen once you're signed in, you'll see a navigation panel. In this article, we'll be covering Reference, Image and Text to Video.
But first comes the obvious question: which one should you use, and what do they do?
So, let's go through them one by one:
—Text to Video—
Text to Video is the simplest mode for the most simple of tasks! You can type whatever you like into the text box and get a result pretty quickly. But what do you use this mode for?
Personally, I find that it works well for B-Roll clips, which are like "filler" clips in a video. Think of things like a short clip of the sky, or people walking in a busy marketplace.
When you're prompting, make sure you have an idea for what kind of style you want your video to be. Anime? Pixar? 3D? Cinematic realism? Just enter it in the box!
So, a simple prompt just to get us started: A short anime style scene of an ominous red moon hanging in the night sky as the clouds roll past and the stars twinkle brightly.
Don't worry about writing complex prompts yet; you can practice that after you get the hang of using the tool so you won't get overwhelmed. Just be creative and have fun with it!
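If it helps to see the pattern, here is a tiny Python sketch of how the starter prompt above is put together: a style plus a scene description. The `build_prompt` function is just something I made up for illustration. It is not part of Vidu, and you still paste the finished text into the prompt box yourself.

```python
def build_prompt(style: str, scene: str) -> str:
    """Combine a visual style and a scene description into one prompt.

    Hypothetical helper for illustration only; Vidu itself just takes
    plain text typed into the prompt box.
    """
    style = style.strip()
    # Drop a trailing period so we don't end up with two of them.
    scene = scene.strip().rstrip(".")
    return f"A short {style} style scene of {scene}."


prompt = build_prompt(
    "anime",
    "an ominous red moon hanging in the night sky as the clouds "
    "roll past and the stars twinkle brightly",
)
print(prompt)
```

Swapping "anime" for "Pixar" or "cinematic realism" gives you the same scene in a different style, which is a nice way to practice.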
Now once you've got your prompt, you can see the settings below.
First is Duration — you can choose a video length from 2 to 8 seconds. I recommend always using 5, since I think anything from 2 to 5 seconds costs the same. If you've got a subscription, 8 seconds is the ideal pick.
In the dropdown for Model, you will see a few different models you can select from. Vidu Q2 was the newest model when I wrote this section, so I generally suggest that you use that one. Q1 is also a solid model if you want to test it out.
I personally never change the Resolution & Encoding, so you should be fine just leaving them on default.
Aspect Ratio is just the size of the video. There's a helpful little icon next to each one so you can see what each value means. 16:9 is landscape, 9:16 is portrait, etc. If you're making something like a YouTube short, you should use 9:16. Otherwise, 16:9 is the default selection, but you should choose whatever calls to you.
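If the numbers in an aspect ratio feel abstract, this little Python sketch turns a ratio into a width and height so you can see which way the frame is stretched. The 1080 pixel short side is only an example value I picked, not Vidu's actual output resolution, and both function names are made up for this article.

```python
def frame_size(aspect_ratio: str, short_side: int = 1080) -> tuple[int, int]:
    """Return (width, height) in pixels for a ratio written like "16:9".

    short_side is an assumed example value, not a Vidu setting.
    """
    w, h = (int(part) for part in aspect_ratio.split(":"))
    # Scale the ratio so its smaller number lands on short_side.
    scale = short_side / min(w, h)
    return round(w * scale), round(h * scale)


def orientation(aspect_ratio: str) -> str:
    """Say whether a ratio is landscape, portrait, or square."""
    w, h = (int(part) for part in aspect_ratio.split(":"))
    if w > h:
        return "landscape"
    if w < h:
        return "portrait"
    return "square"


for ratio in ("16:9", "9:16", "1:1"):
    print(ratio, orientation(ratio), frame_size(ratio))
```

So 16:9 comes out wider than it is tall, and 9:16 is the same shape turned on its side, which is why it suits vertical content like YouTube Shorts.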
Amount just means how many videos you want to generate at once. I always just do one.
Now you can just hit the big Create button and wait for your masterpiece to generate!
Here's what it made if you're curious:
That wraps it up for the Text to Video section! Luckily, most of what we learned here in this section is applicable to the other sections, too! Let's move on…
—Image to Video—
This is the section where you can start making your videos more personalized.
You see where it says +First? That's where you can upload your image. I'm going to use a cool autumn fairy that I generated a while back for our test.
Here is the picture if you'd like to use it yourself, but I recommend using your own so it's more fun! https://i.postimg.cc/dDMDSWLC/90354701.png
So when you are using a picture like this, you should consider the whole picture and what you want to see in the animation. Since she's sitting in a forest and there are leaves everywhere, I'll take that into account when I write my prompt. Remember, no need to overthink it unless you really want to. The AI is pretty smart with figuring things out!
For my prompt I'll just write this:
A gorgeous autumn fairy sits on a large tree branch, looking at the viewer. Her wings flutter softly. There is a gentle breeze blowing over the forest that rustles her hair and the foliage around her. Autumn leaves slowly dance through the air around her. Subtle movements.
Now you probably have a few questions. What are these new things that popped up on my screen?
Well, Vidu's Image to Video mode supports multiple frames! So you can make one super long video by chaining images together. I'll get to that in a moment, but let's continue with just our one frame for now. You also most likely noticed that the Duration selection is right up above the prompt instead.
Again, I recommend just leaving it at 5s.
Next is Creation Mode.
Here you'll find two selections: Cinematic and Flash. Cinematic is the high-quality mode; think of it as a pro mode if that helps you remember. Flash is faster and, I think, cheaper as well.
Thankfully, Vidu is pretty generous so you should have a couple of free trial generations to use on some of these models so you can see how you like it!
I'll do a generation with both modes so we can see the difference together!
Flash: https://www.vidu.com/share/creation/3106506121505857/066758 — The Flash one was done super fast and it looks cute, but I don't really like the way it made the camera move. It honestly made me a little dizzy, but the quality is great. So if we wanted to generate this without that kind of camera movement, we could add "static camera" to the end of our prompt!
Cinematic: https://www.vidu.com/share/creation/3106505990882762/764681 — This one took a little longer to generate, but it was still quite fast. Right away I can see that the breeze in her hair is more pronounced and the push-in is smoother and more focused.
So which one do you prefer?
Now, let's try using more than one image so we can have multiple frames. Think of this as something like an animated storyboard mode. If you have a bunch of images that you made that follow a sequence, using this feature would work perfectly.
Since I don't, I'm going to see how it handles transitions between characters! I have 2 faeries, representing Spring and Summer. I'm curious how Vidu will handle it! Let's get into it now.
I wrote a prompt trying to describe what I wanted to see:
Make a beautiful transition showing the change from Spring (the first image) to Summer.
The first image is in a gorgeous flowery meadow and a pink fairy who represents the spirit of Spring is the focus. The scene gradually changes where petals and trees morph into the beautiful ocean and beach scenery, and the Spring Fairy transforms into the playful Summer fairy, dressed in a cute swimsuit and playfully holding a coconut.
Here's the result: https://www.vidu.com/share/creation/3106526173008634/186334 — it made a pretty transition between them. It might work better if I had images that were actually sequential, but the output is still nice. I really like how it transitioned the falling petals to red!
If you're just starting out and don't know how to use a video editor yet and want easy transitions, this is certainly a good feature for you!
Now, we're ready for the last section!
This one is more complex, but I'll just cover the basics so you can get started. I'll do a follow up article in the near future to explain it a little more in-depth, but don't worry, it's not too difficult!
—Reference to Video—
This is where your imagination is the only limit you have!
Wow, there's a lot of new stuff here! That's okay, let's break it down slowly.
—Images: This is where you can upload images you want to be referenced in the video. For example, a background or a scene where you want the video to take place. Character images work too.
To begin, I've uploaded a picture of the scenery I want (Lotus Pond) and a character I want to appear there (the moth). You can see they show up as little buttons in the prompt box.
Fun fact, you can drag these around in your prompt box!
And one more pro tip for this feature, you can even use @ to select any reference or image uploaded and insert it into your prompt easily. Try it out until you get the hang of it! This was also my final prompt!
Now for the result! https://www.vidu.com/share/creation/3106541094888785/149324
The moth is so adorable I'm going to cry. Vidu did an amazing job with this, right? It even captured the moth's pretty glow.
So I bet you're feeling pretty hyped right now with this cool feature, right? Well, in the words of the great Billy Mays, "But wait, there's more!"
We've only covered the Images half of the equation, but there are also References, and this is where it gets really exciting!
You need to click on the little + sign here to get started making your very first Reference.
This looks intimidating but I promise it's not too hard.
First, you have 3 places to upload pictures you want your Reference to use. Hopefully you have a character in mind that you wanna use, like an OC or your favorite game character.
Now when I make references I like to use images that are:
—High Quality: Looks good, no weird blurriness or "artifacts" that make an image look like static.
—Large Resolution: Big size, like over 1000 pixels. If you don't know what this means that's okay, just look at your image and make sure you don't need a microscope to see it.
—White Background: That way your videos won't be biased to adding in extra background items. Your reference has a couch in it? That couch is gonna sneak into every video lol.
—Your character should NOT be holding anything you don't want them to CONSTANTLY hold. This is super important. Eating a lollipop? I'm sorry but this is the new Tootsie Pop and you will never get to the center no matter how many videos you generate.
TLDR: Nice looking pictures, no random objects or background, character has nothing in their hands.
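For anyone who likes checklists, here is the same list written as a small Python sketch. You fill in the answers yourself after looking at your picture, since nothing here inspects the image file. The class and function names are made up for this article, and treating "over 1000 pixels" as applying to the longer side is my own assumption.

```python
from dataclasses import dataclass


@dataclass
class ReferenceImage:
    """Facts about a candidate reference picture, filled in by hand."""
    width: int
    height: int
    white_background: bool
    holding_item: bool
    looks_clean: bool = True  # no blurriness or artifacts


def checklist_problems(img: ReferenceImage, min_side: int = 1000) -> list[str]:
    """Return a list of warnings; an empty list means an ideal starter image."""
    problems = []
    if not img.looks_clean:
        problems.append("image is blurry or has artifacts")
    # Assumption: the "over 1000 pixels" guideline refers to the longer side.
    if max(img.width, img.height) < min_side:
        problems.append(f"longer side is under {min_side} pixels")
    if not img.white_background:
        problems.append("background items may sneak into every video")
    if img.holding_item:
        problems.append("character may hold this item in every video")
    return problems


ideal = ReferenceImage(width=1536, height=1024, white_background=True, holding_item=False)
risky = ReferenceImage(width=640, height=480, white_background=False, holding_item=True)
print(checklist_problems(ideal))
print(checklist_problems(risky))
```

These are warnings rather than hard rules: an image that trips one of them can still work, it just may need more prompting effort.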
You don't need to have 3 whole images if you don't have them. One is fine.
In another article I will tell you how you can make more images if you only have one and also how to remove backgrounds if you need to. If all you have is a picture with a background, don't worry. If your character is holding an item and that's all you have, it's okay.
I'm just telling you the ideal starter images, but anything can work with a little prompting effort.
Next, what is Timbre? Is that like wood? No silly, it's basically just a place to select a voice for your character. Voices are still in Beta so I normally don't use them and just select No Voice, but you can try it out if you're curious.
Now you can go to Next Step and the AI will generate a description of your character.
Double check everything written to make sure nothing is extremely off, but usually it does a pretty good job at describing everything. If there's a problem you can easily edit the text.
Now, rejoice! Your reference is complete!!
Now how do you get it into your prompt?
Option 1: Use @ like we discussed earlier.
Option 2: Select it from the References menu (checkbox) and click done. You can get back to the References menu just by clicking the button right above the prompt box.
Now you can use your new character in a video! Their name will show up in the prompt box just like image 1 and image 2 did in our earlier example.
Here's how the video turned out: https://www.vidu.com/share/creation/3106575608171512/073104
One final note on the remaining option we didn't cover, Dubbing Mode!
Here you can choose whether your video is generated with sound effects (SFX) or voice. If you haven't set a voice for your character, I'm not sure what will happen if you select the voice option, but if you like to live dangerously you can test it out and see. Otherwise, you can select SFX only, or No Dubbing, which just means your video will be silent.
I've tried it a few times and the sound/voice can be okay, but it costs extra credits so I generally don't use it. It's in Beta still, so I'm sure we'll see some big improvements this year. UPDATE: With Vidu Q3, sound is now generated natively!! How exciting! I’ll cover that in the next article!
Now, that wraps it up for all the video modes on Vidu! Even if I didn't go super in-depth with things, I think I covered everything you need to get a good start.
I plan to create videos for AI Academy in the future as well, for those who prefer to learn that way. They will be shorter and cover the essentials. I will not only be covering Vidu, but other platforms and means of creation as well!
Next time I'll be talking about Vidu's brand new models: Q2 Reference to Video Pro and Vidu Q3!
Stay tuned for that~
If you have any questions, feel free to leave a comment and I'd be happy to help!
Thanks for reading and have a fantastic day! :)


