Five and a half hours on scene 4. Four shots — reverse angle on the partner, slap in the "right" costume, the bandit waking up, and the partner's audio dialogue. Between them — a long, painful lesson that the model doesn't read the script.
"now we need a reverse angle on bandit2"
Went off without figuring out the geometry or the partner's face. Suggested an over-the-shoulder from the lying bandit's shoulder — the user caught it: "and where's the camera? under the floor?". Switched to "slightly low", generated in Seedream with a room ref and a face from the archive — and grabbed the partner's face that wasn't approved yet.
The result was predictable:
"your flow is weird, you didn't figure out who's lying where, what the partner looks like, the model is supposed to guess all that. of course it came out crap"
"garbage in - garbage out"
shot12_partner_reverse_v1.png — tracksuit with stripes (Adidas-like), even though the archive ref was in a jacket. File deleted.
"I'd like a bearded one, short beard, for contrast with the bald bandit"
Gemini 3 Pro Image, a series of edits on char_bandit_2_front.jpg:

The tracksuit came out wrong, zipper undone. Tried adding shot12_slap_setup_user.jpg as a second ref — Gemini returned the setup instead of the portrait:

Rewrote the prompt with explicit "edit the first image, the second is clothing reference only":

v3 approved as the canonical partner portrait.
"okay try it, partner is approved"
Generated with two refs (char_partner_v3 + shot1_room_no_bandit_v3):
shot12_partner_reverse_v2.png
Zipper undone, looking at the camera instead of down, posture "sitting on the couch". User:
"this is total shit. i don't think you understand the flow at all"
Missed the key thing: we already have shot12_slap_setup_user.jpg showing the partner from behind. The reverse angle has to match that 8-figure cut — pose, position, gaze direction. I was generating in a vacuum.
"you actually think the model invents everything for you and resolves everything you didn't describe?"
"I'd start by deciding what the backgrounds are and just generate the background"
Seedream × 2 with shot1_room_no_bandit_v3 as ref:
shot12_bg_low_v1.png — almost the same angle, "just rotated the photo a bit".
shot12_bg_low_v2.png — EXTREME LOW-ANGLE in the prompt, still frontal.
Both deleted. Seedream clings to the ref's composition for dear life — changing the camera angle from a single ref it can't manage.
"shot12_bg_user.png — here, I did everything myself. but I genuinely don't get what's wrong with you and why you can't"
The user cropped a part of the room ref — and that's the right background. No "3D regen" was needed.

Gemini edit, two refs (shot12_bg_user.png + char_partner_v3.png), adding a person to a ready background. The user deleted v3–v7 right after the final — from memory: v3 sat on the couch on the right ("he can't be on the right, has to be on the left, otherwise we cross the line, and bigger"), v4 leaned in from the left ("better, bigger, only chest-up"), v5 chest-up frontal ("only now he's not crouching"), v6 crouched, big — but Gemini drew an extra dark-haired head at the bottom (our bandit is bald), v7 with hard "do NOT add any other person, head, hair" — clean, but the user went back to v6. The final was v8 — a separate edit on v6: remove the foreign head + shift right:

"anyway you helped me, of course. i took your v8, cut off the background, asked it to change the lighting to the right one (window behind), then rotated 45 degrees, masked out the person again, pasted onto the right background, blurred the background, got shot12_partner_user.jpg"
The user built the final composite from scratch in Photoshop. Window light from behind + blurred background + colorful candy wrappers in the foreground (color callback from scene 2). Threw my generations into archive. This frame became the canonical partner ref for scene 4.
5s, slight forward-back lean, eye down on the lying body. shot12_partner_user.jpg as last frame:
v1 → 720p (v2 by file count, but same approach):
"he's not looking where he should, needs to be almost down, he's looking into the distance"
The problem isn't in the prompt — in shot12_partner_user.jpg the partner's eyes look forward, and Seedance has to land on that pose at the last frame. You can't override "look down" in the prompt — the last frame overrides everything.
"let's give it an action, like he's tapping the bandit on the cheek"
Animating a touch. Here I burned myself twice on cut logic:
as last frame, action at the front of the clip, then static" — user: "that's not my old one, that's your old one" and "I'll just literally change the costume myself" (about the slap, later). For the reverse angle:as first frame, match-cut on the strike" — user: "this shot is already the second one, the first is where the slaps happen. So I need a quick slap and then mostly watching"So the cut goes from the slap into the reverse angle, and on the reverse angle the strike is already in the past — need a retract phase from the first frame, then a long hold.
shot12_partner_reverse_anim_v3.mp4 — quick slap with no windup in the first second, then static:
"for some reason it drew a body in the frame that isn't there, and replace the strike with just a touch"
New rule: if an object is out of frame — don't mention it in the prompt. The model draws what's described. Removed cheek, bandit, lying man — left only "tap motion".
"the timings didn't work — he only pulls his hand back at the end of the 3rd second"
"i'm not sure it understands fractional timings, is that in the guide?"
It is — section 16.6 of the guide. But section 12 explicitly forbids over-fragmentation of sub-second markers. I shoved four in a row ([0.0s-0.3s], [0.3s-0.6s], [0.6s-1.0s], [1.0s-5.0s]) — exactly what the guide warns against.
"so why the hell did you do that if it's in common mistakes?"
v5: 7s, whole seconds, two blocks ([00:00-00:02] touch+return, [00:02-00:07] watch):
The cut into the reverse angle exposed: the costume in shot12_slap_setup_user.jpg (dark jacket) ≠ costume in shot12_bg_user.jpg (light gray tracksuit).
"just fucking great"
The user made shot12_slap_setup_new_costume.jpg — replaced the top. Prompt from the old log (v2), only ref different:
"he's doing it with the wrong hand, needs to be the right one"
v4: added with his RIGHT hand:
Three problems at once:
- right hand to left cheek (across the body)
- 15 strikes instead of 5–6
- hand doesn't retract
User's idea: v2 as motion reference (there the hand and rhythm were OK, only the costume was wrong):
"video ref didn't give it the precision needed, now the hand is glitching on rhythm"
v6: removed video-ref, explicit 5 to 6 light pats, with his right hand, on the cheek on the right side of the frame. Don't write retract — model handles it.
A separate lesson on spatial cheek description: I burned myself twice before realizing that if the partner is in the lower-left corner, then the bandit's right cheek (his own body's right) is the right side of the frame (far from the partner, across his body). User: "without a cross-body reach it'll look unnatural, need the far one".
Came in over the poll timeout (rule 20) — didn't restart, pulled the task_id from the log and polled directly. Worked.
5s of waking up with shot12_slap_setup_new_costume.jpg as first frame:
"what did you give as the ref? we got dust flying in the air for some reason, need to remove that"
v2: added The air is completely clean and still — no dust, no airborne particles:
"let's have him open his eyes immediately, first dazed, then he looks at the partner"
Asked for v3 — flew into content_policy_violation ("May violate third-party content rights"). Same ref, similar prompt, but Seedance's filter is flaky. Retry — second block, file never produced.
Softened the prompt: bald man instead of bald bandit, eyes open instead of snap open, removed looks around confused:
"tebya_suka.mp3, here's the audio — 'tebya, suka, nanyali tovar prinesti, a ne probovat ego, debil, davay sobiraysya, blyat, vse, poshli'"
Length 6.70s, Seedance at 7s. Ref shot12_bg_user_front.jpg (frontal partner portrait) + tebya_suka.mp3 as @Audio1 for lip-sync:
Poll died on SSL timeout again, pulled task_id and downloaded manually.
char_partner_v3.png — approved canonical partner portrait (gray zipped tracksuit, short beard).shot12_bg_user.jpg + shot12_partner_user.jpg — user assembled in Photoshop (crop of the ref, then 45° rotation of v8 with my partner + blur + candy). Faster and more accurate than any neural net from scratch.shot12_partner_reverse_anim_v5 — 7s, reverse angle with touch and hold (approved).shot12_slap_v6 — slap in the new costume, right hand, far cheek (approved).shot13_bandit_wakes_v4 — waking without dust, softened prompt (approved).shot14_partner_dialogue_v1 — first shot with Russian lip-sync.Lessons:
- If the object is out of frame — don't mention it in the prompt. Otherwise the model will draw it (body, head, hair).
- For as last frame in Seedance, the last frame defines the pose and the gaze. You can't override the final gaze direction with the prompt — you have to edit the image.
- Sub-second timecodes — the guide forbids over-fragmentation. Whole seconds by default, fractional only when really necessary and at most once.
- Video-ref for motion transfer in Seedance works unreliably — it can break the rhythm it's copying.
- Seedance content_policy is flaky — retry doesn't always help, but rephrasing more softly does (bald man instead of bald bandit, drop alarming words).
- Match-cut: the static phase of the clip must be on the side where there's no seam — the first clip ends in motion, the second begins in motion, then a long static.
- Spatial description ("right side of the frame") is cleaner than bodily ("right cheek") — the model doesn't confuse the coordinate system.