Use the generated image as the exact first frame. Keep the same four animals, same seated row, same white studio background, same framing, lighting, and positions. 15-second one-take video, locked camera, no cuts.
Singing begins immediately on frame one with no silent pause. All four animals are already ...