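[Style] High-end cinematic podcast interview (Cinematic Podcast Talk Show), A24-style dark film look, 8K ultra-high definition, photorealistic, low-contrast cool-toned lighting, shallow depth of field
[Duration] 15 seconds
[Scene] A dark-toned professional recording studio. The background is matte black acoustic foam wall paneling. On the dark wooden desk sit two Shure SM7B professional microphones, two black ceramic coffee cups, and a leather-bound notebook. Warm white key light from the upper side is paired with cool background fill; in the far background, a red "ON AIR" neon sign glows faintly
[Characters] Sera@图片1 (Asian woman with long, straight, deep burgundy hair, wearing a black sleeveless turtleneck with long fishnet sleeves in a tactical style), Liora@图片2 (Asian woman with long, curly pink hair, wearing a black halter top with long black glove sleeves)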
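[00:00-00:03] Shot 1: Opening jab (Medium Two-Shot)
Medium shot of the two seated across the desk from each other; the foreground microphone is slightly out of focus, with shallow depth of field focused on Sera.
Sera leans slightly toward the microphone, red hair falling over her shoulders, a hint of a cold smile at the corner of her mouth:
[English Dialogue] "Another week, another crowd crying: 'the algorithm buried me.'"
Liora holds her coffee cup and quietly takes a sip, glancing sideways at Sera without replying.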
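[00:03-00:07] Shot 2: Liora picks up the bit (OTS Close-up)
Over-the-shoulder close-up cuts to Liora's face. She sets down her cup and taps her fingertips twice on the wooden desk, her pink hair taking on a soft sheen under the ring light.
She looks up, straight at Sera, the corner of her mouth lifting slightly, and pauses for half a second:
[English Dialogue] "I clicked into their profiles. Honestly? They earned it."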
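[00:07-00:12] Shot 3: Sera presses the point (Single Close-up)
Cut to a single close-up of Sera. She leans back in her chair with her arms crossed, the backlight tracing a dark red rim light along her red hair.
She tilts her head slightly, her gaze sharp:
[English Dialogue] "The algorithm isn't unfair. It just says what your audience won't: your content is boring."
After speaking she lets out a soft breath, the corner of her mouth curling slightly.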
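[00:12-00:15] Shot 4: Liora closes (Direct-to-Camera)
Cut to a frontal close-up of Liora. She leans slightly toward the camera, her gaze calm and penetrating:
[English Dialogue] "The time you spent complaining? Three original posts you didn't write."
The camera slowly pulls back to a medium shot of the two. In the background the red "ON AIR" sign flickers slightly, and a subtitle slowly fades in at the bottom of the frame: "Originality is the only algorithm worth gaming."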
global_rule: No music, diegetic SFX only. Raw handheld iPhone footage, auto-everything, bystander POV on Rodeo Drive, Beverly Hills, Los Angeles — no styled lighting, no grading, auto white balance flickering between warm and cool as the camera pans across shade and sun. At 0s the camera is already unsteady, pointed loosely down Rodeo Drive, slightly over-exposed on the asphalt, the operator clearly reacting in real time — you can hear ambient noise from the environment, distant traffic, a faint crowd murmur, wind buffeting the mic with a low crackle. At 1s the deep, authoritative low-frequency rumble of an exotic supercar engine rolls in from off-screen left — raw, unfiltered, the phone mic distorting slightly at the low-end peaks — and the camera swings fast to track it, momentarily cutting off the top of the frame and catching a blurred pedestrian shoulder in the foreground. At 2s a matte black Lamborghini Huracán STO slides into frame, the engine rumble stretching into a thick, resonant growl that vibrates the audio channel. The auto-focus hunts aggressively — the car body goes soft and the background sharpens for half a second before snapping back to the car's low roofline. At 3s the driver's window is fully down and Sofia Vale is visible from the chest up — a 23-year-old female prompt engineer with a slim athletic build, sharp feminine facial structure, subtle natural makeup, deep brown eyes, smooth olive skin, glossy dark brown hair loosely tied back with soft strands moving naturally in the wind. She wears a perfectly tailored black fitted top beneath a lightweight charcoal technical jacket, minimal silver jewelry, and sleek modern sunglasses resting casually on the dashboard, every detail immaculate against the raw, unpolished context of Rodeo Drive. Her face is sharply lit by harsh overhead sun casting a hard shadow under her jaw, no fill light, completely natural and unflattering in the best paparazzi sense. 
Her expression is calm, composed, a barely-there smirk playing at the corner of her mouth, steely eyes scanning forward. At 4s a bystander on the sidewalk — a young individual in an oversized vintage hoodie, loose cargos, and worn sneakers — steps partially into the left edge of frame, half-obscuring the car's front bumper, and calls out toward the open window over the crowd noise, their voice raw and unpolished against the ambient audio: "Excuse me, what do you do for a living?" The camera auto-focus briefly loses Sofia Vale's face and locks onto the bystander's outfit before hunting back. At 5s Sofia Vale turns her head slightly toward the window, the smirk deepening just a fraction, her posture relaxed and unhurried despite the slow rolling momentum of the car. At 6s in a voice that is crystal clear, confidently projected, unmistakably standard American English — cutting cleanly above the engine rumble and street noise with natural authority — Sofia Vale says: "I'm a prompt engineer." The words land with casual precision, no affectation, just clean American vowels and a tone that suggests the statement is both completely mundane and somehow the most interesting thing anyone on Rodeo Drive has said all day. At 7s the camera operator exhales audibly into the mic, a small laugh or breath of surprise, and the frame dips slightly downward catching the car's rear quarter panel and spinning rim in slow motion — the wheel spokes strobing beautifully in the harsh sunlight, lens flare clipping the upper right corner of frame in a raw uncorrected streak of yellow-white blown highlight. At 8s the auto-focus completely loses the car and locks onto a chain-link fence twenty feet behind — the entire foreground goes buttery soft — before snapping back with a micro-jolt at 9s just as the rear of the Huracán begins to slide past frame. Chromatic aberration bleeds purple and green along the high-contrast edge of the car's matte black bodywork against the pale sky. 
At 10s the camera pans to track the rear of the car — slightly too slow, cutting off the exhaust pipes — the engine note shifting and deepening as the car rolls forward, the slow-motion audio turning the rumble into a cinematic subsonic throb that the phone mic renders with slight clipping distortion on the peaks. At 11s a pedestrian walks fully through frame between the camera and the car, completely blocking the shot for nearly a full second — the operator doesn't cut, just holds and waits, the frame partially obscured by the back of someone's leather jacket. At 12s the car is three-quarters past, the rear wing visible, and the camera is now slightly under-exposed as the operator has tracked into a shaded zone — the auto exposure struggling to compensate, the image briefly darkening and then lurching brighter. At 13s the camera drops almost to waist height, catching the car's exhaust and rear diffuser low and wide, the slow-motion engine sound tapering as the Huracán puts gentle distance between itself and the crowd — still rolling slowly, window still down, Sofia Vale's silhouette just barely visible in the driver's seat, one arm resting on the door. At 14s the phone's auto white balance shifts warmer as the camera swings back into full sunlight, the image going slightly flat and overexposed on the pale asphalt. At 15s the footage cuts abruptly mid-pan — not a clean edit, just the operator stopping the recording — the last frame frozen on a slightly motion-blurred rear view of the matte black Lamborghini Huracán STO shrinking into the heat-haze of Rodeo Drive, Beverly Hills, the engine rumble fading into ambient noise from the environment, wind, and the sound of someone nearby saying something unintelligible off-mic.
POV: First-person handheld shot of a young adult male street photographer walking casually through the extremely crowded streets of Shibuya, Japan on a sunny day, surrounded by hundreds of pedestrians rushing across the famous Shibuya Scramble Crossing, tall buildings with huge digital billboards, neon signs, and busy city atmosphere.
Use @-image as the exact character reference for the woman: her full appearance, face, hair, outfit, and style must strictly and perfectly match the uploaded reference image @-image.
0-4 seconds: Natural handheld walking motion forward through the crowd. The photographer spots @-image standing at the edge of the sidewalk, fully focused and looking down at her smartphone.
4-7 seconds: He gently approaches her. Photographer's voice (friendly, clear English): "Hey there!"
7-10 seconds: @-image looks up from her phone toward him with a warm, confident smile. Photographer: "You look absolutely beautiful in that outfit!"
10-13 seconds: Photographer continues: "I'm a street photographer and I'd really love to take some photos of you if you're okay with that?"
13-15 seconds: Woman @-image nods enthusiastically and replies in clear English: "Sure, that sounds fun! I'd love to pose for you." She then strikes a graceful pose. Photographer's hands raise the DSLR camera into the foreground (camera and hands visible in POV), framing her perfectly as if about to shoot. Gentle shutter click sound.
Cinematic realistic style, vibrant urban colors of busy Shibuya, bright daytime lighting with natural sunlight, sharp focus on the woman @-image , dynamic crowded background with moving pedestrians, smooth natural handheld movement, high detail textures, friendly and positive atmosphere, clear audible English dialogue with natural lip sync, subtle city background sounds with footsteps, crowd chatter and traffic noise, enthusiastic yet respectful mood, 15-second video.
15s, cinematic emotional confrontation.
Two characters @[character sheet ref] stand face-to-face inside a small apartment kitchen late at night. The room is dimly lit by a single warm overhead light and soft city lights leaking through the window. The atmosphere feels emotionally exhausted, tense and painfully intimate, like an argument that has been building for years.
Modern cinematic realism, subtle handheld camera movement, shallow depth of field, soft film grain, emotionally restrained acting, realistic silence between dialogue lines.
Beat 1:
The emotional state remains at high arousal and medium-low valence.
Camera:
slow handheld side shot circling both characters
tight over-the-shoulder close-ups
brief eye-level two-shot showing emotional distance
FACS Character A:
AU4 + AU7 + AU17
Dialogue A:
/juː ˈnev.ɚ ˈriː.ə.li lʊkt æt miː/
/juː wɚ ɔːlˌweɪz ˈsʌmˌwɛɹ ɛls/
Voice:
tight restrained voice, controlled anger, uneven breathing
Character A tries to stay calm while suppressing years of resentment.
Beat 2:
The emotional state gradually shifts toward very low valence and medium-high arousal.
Camera:
slow push-in toward Character B
extreme close-up on trembling eyes and mouth
wide static shot showing silence after the argument peaks
FACS Character B:
AU1 + AU4 + AU15 + AU25
Dialogue B:
/aɪ wəz ˈtɹaɪ.ɪŋ maɪ bɛst/
/aɪ dɪdnt noʊ haʊ tə fɪks ˈɛv.ɹiˌθɪŋ/
Voice:
breaking voice, unstable breath support, emotionally collapsing delivery
Beat 3:
The emotional state remains at very low valence and medium-low arousal.
Camera:
locked wide shot with silence between them
slow close-up on both characters avoiding eye contact
subtle rack focus between faces
FACS Character A:
AU1 + AU15 + AU17
Dialogue A:
/ˈmeɪ.bi wiː stɑpt ˈlɪs.ən.ɪŋ ə lɔŋ taɪm əˈgoʊ/
Voice:
emotionally exhausted, quieter delivery, fading anger replaced by sadness
No exaggerated screaming, no violence, no comedy, no text overlay, no watermark.
00-03s [The Hook]:
Close-up of a person in a modern, sun-lit studio. They lean into the camera, hands resting on a desk. Natural lighting with slight lens flare.
Voiceover: "Stop trying to be loud. In a world of noise, volume isn't an advantage."
03-09s [The Core Message]:
Mid-shot. The speaker stands up and walks toward a large window overlooking a realistic city street. They gesture naturally, no robotic movements.
Voiceover: "Advertising isn't about buying attention; it's about earning trust. Your brand isn't what you say it is—it's the promise you actually keep."
09-13s [The Shift]:
Low-angle shot, looking up at the speaker. They look confident but approachable, with natural micro-expressions (a slight blink, a small nod).
Voiceover: "If you want to build something that lasts, stop selling features. Start solving human problems."
13-15s [The Finisher]:
Extreme close-up on the speaker's face. High detail on skin texture and eyes. They give a sharp, knowing look.
Voiceover: "Build the signal. Let them find you."
global_rule: No music, diegetic SFX only. Raw ungraded bystander phone footage, shot on what appears to be an iPhone 15 Pro in auto mode, handheld with constant visible micro-shake and occasional lurching reframe as the camera operator reacts to the scene unfolding around them. At 0s the camera is pointed down at [LOCATION] at a slight angle, slightly overexposed on the sun-bleached surface, auto white balance rendering the midday light as a blown-out warm white. Then a deep low engine rumble begins to build from off-frame left, and the operator swings the phone up and to the left in a jerky uncontrolled pan, briefly cutting off the top of a bystander's head in the foreground before settling on the [LOCATION]. At 2s a wide-bodied exotic supercar in deep midnight-blue metallic paint rolls into frame from the left in extreme slow motion. The car is moving barely above walking pace, the massive rear haunches and wide vented hood shimmering under harsh direct sunlight, creating a blown-out specular glare that briefly clips the exposure and washes a corner of the frame white. The phone auto-focus immediately hunts: the car body goes soft and creamy while the background [LOCATION] snaps into sharp focus, then the system corrects and slaps focus back onto the car at 3s with a visible snap. At 3s the driver-side window is fully down and [CHARACTER] is visible in the driver seat: the protagonist, wearing an impeccably tailored outfit, left elbow resting with composed ease on the open window sill, their hair catching the harsh overhead sun, eyes scanning the sidewalk crowd with quiet authority and a barely-there smirk. A partial hand from a crowd member in the foreground edge of frame obscures the lower quarter of the car door for a beat before the operator side-steps clumsily to clear the shot, causing a sudden jump cut of motion blur at 4s. 
The engine rumble is thick and low in the raw audio — an unfiltered deep V-configuration idle burble with a soft exhaust pop as the driver lifts slightly off throttle, the microphone briefly wind-buffeting from the operator turning. At 5s a young person on the sidewalk in a casual outfit suddenly lunges forward toward the car, one hand slamming against their cheek in disbelief, phone in the other hand shaking violently, screaming directly toward the open window — they shout raw and breathless with total disbelief, voice cracking: OH MY GOD I can't believe it's really you, I am SHAKING — their voice clipping the phone audio for a half second, the ambient crowd noise spiking around their reaction, a few other voices audible in the background responding with their own overlapping exclamations. The camera operator swings toward them involuntarily for a half-second at 6s — the frame goes fully blurry mid-swing, catching only motion-smeared color — then whips back to the car at 6.5s, briefly over-correcting and catching the rear wheel arch and exhaust tip before stabilizing back onto the driver window. At 7s [CHARACTER] turns their head slowly toward the fan, unhurried and calm, the smirk deepening slightly — their left hand rises from the window sill with a composed open-palm gesture, the bright midday sun creating a small chromatic aberration halo of purple-green fringing on the high-contrast edge of their outfit against the dark car interior. They speak through the open window directly toward the fan, their voice clear and low and unbothered, cutting cleanly through the engine rumble and crowd noise: It's okay, it's okay — come here, let's take one. 
At 9s the operator steps forward urgently, the footage lurching with two heavy footfall bumps visible in the shake, auto-focus hunting again briefly as the operator closes distance toward the car — the background pedestrians momentarily sharp while [CHARACTER] goes soft, then the focus corrects at 10s and locks onto their face and upper torso, the camera now shooting slightly upward through the open window at an imperfect low angle, their jaw and collar filling the upper half of the frame while the car door chrome trim cuts across the lower third. The fan's hand appears in frame from the right, visibly trembling, holding their phone up at a bad angle toward [CHARACTER] at 11s. The supercar engine burbles a low steady idle, a single exhaust pop at 12s as the driver's foot shifts. At 12s the operator steps back slightly, the footage pitching back to a wider framing showing [CHARACTER] still composed in the window, the fan leaning in, the crowd pressing from behind them — a hard shaft of direct sunlight cuts across the frame at a diagonal, briefly creating a small lens flare smear of pale yellow-white across the lower right corner of the image for two frames before the operator tilts minimally and it disappears. At 13s the camera operator's own thumb briefly enters the bottom left corner of the frame — a dark flesh-tone blur lasting less than a second before clearing. The audio throughout is raw and unprocessed — wind on mic, distant car horn from further down the street, shuffling footsteps on concrete, the constant low-register thrum of the supercar engine at near-idle. 
At 14s the operator makes a sudden instinctive zoom-in gesture, the digital zoom degrading the image quality visibly — compression artifacts and slight pixelation appearing on the edges of [CHARACTER]'s face and the car's roofline — before the footage cuts hard and abruptly at 15s mid-motion as if the operator fumbled the phone, the last frame a motion-blurred streak of deep blue car paint and pale sidewalk concrete.
BEAT_SHEET START
You are a professional cinematic director generating a Cinematic Beat Sheet — a 3×3 grid of 9 visual beats for video pre-production. Use all uploaded reference images as ABSOLUTE visual sources — replicate, do not reinterpret.
PROJECT TITLE:
The Melancholic Embrace
TOTAL DURATION:
15 seconds
REFERENCE IMAGES UPLOADED (in order):
1. Aira: A pale woman with long platinum-blonde hair wearing a clean white hooded tunic, light armor accents on shoulders and forearms, brown leather belt, and a white cape.
2. Seris: A pale woman with long platinum-blonde hair, wearing a white ceremonial robe heavily stained with blood on the sleeves and hem, brown leather belt, and bracers.
CHARACTER LOCK RULES:
- Aira: Long platinum-blonde hair, pale skin, white hooded tunic, light armor, brown belt, clean appearance, archetype is Cautious Priestess.
- Seris: Long platinum-blonde hair, pale skin, white ceremonial robes, heavy bloodstains on sleeves and hem, brown bracers, archetype is Fallen Sister.
- WEAPON LOCK: No weapons drawn or wielded. Focus is purely on the contrasting clean and bloodstained robes.
- PALETTE SIGNATURE: Soft desaturated blues for environment, stark pristine white for Aira, stark white with deep crimson bloodstain accents for Seris.
DIALOGUE BUBBLE RULES:
Some panels include speech bubbles. Each bubble is a SMALL graphic-novel-style RECTANGLE with rounded corners — pure white fill, thin black border 1px, white background, BLACK sans-serif text inside. NOT a manga cloud bubble. NOT a fantasy decorative banner. Style reference: From Hell, Sin City, Watchmen graphic novel speech.
- Bubble has a SHORT thin pointer line (3-5px) toward the speaking character's mouth.
- Bubble is positioned NEAR the speaker, in upper-left or upper-right corner of the panel, NOT covering character faces.
- Text inside bubble is small, plain English sans-serif, ALL CAPS for emphasis where indicated, otherwise sentence case.
- Bubbles must look like graphic novel speech, NOT modern comic book style, NOT manga.
9 STORY BEATS:
BEAT 01 [00:00-00:02] [SETUP — extended 2s]
Caption line 1: Aira stands silent at the entrance of a ruined church.
Caption line 2: Cold morning haze drifts through broken stained glass.
DIALOGUE: none.
BEAT 02 [00:02-00:03] [SETUP]
Caption line 1: Seris emerges slowly from the deep blue shadows.
Caption line 2: Her bloodstained robes drag softly across the broken stones.
DIALOGUE: none.
BEAT 03 [00:03-00:05] [SETUP → ESCALATION — extended 2s]
Caption line 1: Aira turns and recognizes the figure approaching her.
Caption line 2: Her breath catches in the silent winter air.
DIALOGUE BUBBLE (positioned near Aira, upper-left of panel):
AIRA: "Seris…?"
BEAT 04 [00:05-00:06] [ESCALATION]
Caption line 1: Seris stops a few steps away from Aira.
Caption line 2: Her hands tremble slightly at her bloodstained sides.
DIALOGUE BUBBLE (positioned near Seris, upper-right of panel):
SERIS: "I came back."
BEAT 05 [00:06-00:08] [ESCALATION — extended 2s]
Caption line 1: Aira slowly lowers her white hood from her platinum hair.
Caption line 2: The wariness fades from her piercing eyes.
DIALOGUE: none.
BEAT 06 [00:08-00:09] [ESCALATION]
Caption line 1: Aira closes the distance and reaches one hand toward Seris.
Caption line 2: Her fingertips graze Seris's bloodied sleeve.
DIALOGUE BUBBLE (positioned near Aira, upper-left of panel):
AIRA: "You're hurt."
BEAT 07 [00:09-00:11] [PEAK — extended 2s]
Caption line 1: They pull each other into a desperate silent embrace.
Caption line 2: Crimson blood presses against Aira's pristine white cape.
DIALOGUE: none.
BEAT 08 [00:11-00:13] [RESOLUTION — extended 2s]
Caption line 1: Seris rests her chin heavily on Aira's shoulder.
Caption line 2: Their platinum hair tangles gently in the cold breeze.
DIALOGUE BUBBLE (positioned near Seris, upper-right of panel):
SERIS: "I know."
BEAT 09 [00:13-00:15] [FINAL — extended 2s]
Caption line 1: Soft desaturated blue light bathes the two clinging figures.
Caption line 2: They stand perfectly still in the empty ruined church.
DIALOGUE: none.
STYLE TAGS:
Eternal Sunshine of the Spotless Mind aesthetic. Soft desaturated blues. Hazy morning light. Fine film grain. Low contrast. Raw texture. No cinematic polish. No glossy CGI. No fantasy poster. Melancholic winter romance.
BEAT_SHEET END
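The beat timings above are easy to get wrong when a beat is lengthened or shortened. As a rough sanity check, the sketch below copies the nine start/end times from the beat headers and confirms that they run back to back and end at the stated 15 seconds. The function name `check_beats` and the list layout are illustrative choices, not part of any generation tool.

```python
# (start, end) in seconds for BEAT 01..09, copied from the beat headers.
BEATS = [(0, 2), (2, 3), (3, 5), (5, 6), (6, 8), (8, 9), (9, 11), (11, 13), (13, 15)]
TOTAL_DURATION = 15  # seconds, from TOTAL DURATION
EXPECTED_BEATS = 9   # the 3x3 grid


def check_beats(beats, total, expected_count):
    """Return a list of problems; an empty list means the timeline is consistent."""
    problems = []
    if len(beats) != expected_count:
        problems.append(f"expected {expected_count} beats, found {len(beats)}")
    cursor = 0
    for number, (start, end) in enumerate(beats, start=1):
        if start != cursor:
            problems.append(f"beat {number:02d} starts at {start}s, expected {cursor}s")
        if end <= start:
            problems.append(f"beat {number:02d} has no positive duration")
        cursor = end
    if cursor != total:
        problems.append(f"timeline ends at {cursor}s, expected {total}s")
    return problems


if __name__ == "__main__":
    issues = check_beats(BEATS, TOTAL_DURATION, EXPECTED_BEATS)
    print("timeline OK" if not issues else "\n".join(issues))
```

Running the same check after editing a beat header flags any gap or overlap before the sheet is submitted.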
Static locked off UGC frame on a girl at a table making matcha, with the exact same camera position and framing throughout, perfectly steady, with no shake, no drift, and no micro-jitter, and a clean, crisp image. The clip opens exactly on the start frame, with her holding the metal sifter over the bowl as the last of the matcha falls through, fine powder drifting down naturally in tiny bursts. She speaks in a natural female American accent, around 27 to 28 years old, calm and confident, with a relaxed conversational rhythm, slightly deeper than average, smooth and mature but still soft and feminine. She starts speaking immediately at the beginning, with her lips clearly moving on-camera through every word: “Okay, um…” As she says “Okay,” she instantly lowers her gaze down toward the white bowl and shifts her focus to what she is already doing, while her mouth continues into “um” without interruption. When she says “um,” it is barely audible, almost to herself, quiet, low, and absent minded, like a whisper. That small pause on “um” feels like a thought catching up to her hand, and her lips barely move. Her gaze drifts slightly to the left for a second, her eyes briefly flicking toward the camera and then back to the bowl, as her hand gives one last gentle tap to finish the sifting, her mouth still moving through the line without missing a beat: “I wanna show you.” The final powder stops, the mesh is visibly clean, and she lowers the sifter a little closer to the bowl as if checking that she got it all, finishing the last words with a quiet, confident ease: “how simple AI UGC is.” Her expression stays natural and unperformed, like she is just talking while doing the routine. Keep the identity, skin texture, and environment perfectly stable, with no warping, no morphing, no smoothing, no smearing or blending, no pixel mixing, and minimal motion blur. Preserve realistic powder behavior, metal reflections, and shadows. End exactly on the provided end frame.
Video 2
Prompt:
Static locked off UGC shot on the same table matcha setup, with the camera perfectly steady and the same framing throughout, with no handheld shake, no drift, and no micro-jitter. Clean, crisp image. The clip opens exactly on the start frame, with the empty sifter held near the bowl. In one slow, natural continuation, she sets the sifter down out of the main action area and reaches for a glass electric kettle, then begins pouring hot water into the bowl in a physically believable stream, with realistic weight in her grip, an accurate pouring angle, natural water flow, and subtle steam cues, while the environment, background, and object positions remain consistent with the start frame. The shot settles exactly into the end frame, with the water clearly pouring into the bowl. She speaks in a natural female American accent, around 27 to 28 years old, calm and confident, with a relaxed conversational rhythm, slightly deeper than average, smooth and mature but still soft and feminine. The clip begins with no introduction at all, she is already mid sentence, and her lips clearly move on-camera through every word. While reaching for the kettle and starting the pour, she says naturally: “But honestly”. When she says the word “honestly,” it ends with a slight upward tone, and then, as she picks up the glass electric kettle and just before she starts pouring the hot water into the bowl, she finishes the last words: “…it’s really not as hard as it looks”. She feels completely relaxed and unbothered. No identity drift. No skin warping or morphing. No texture invention, no smoothing, no smearing or blending, no pixel mixing, and minimal motion blur. Keep skin pores, hair, fabric, reflections on the kettle and bowl, and matcha surface behavior stable and realistic. End exactly on the provided end frame.
[Shot 1: Frontal Menacing Shot] A medium shot of a SWAT officer in full tactical gear, gas mask, and helmet. He is pointing his assault rifle directly at the camera lens (breaking the fourth wall). He is shouting with visible intensity: "LET THE HOSTAGE GO! DROP THE WEAPON NOW!"
[Shot 2: The Threat] Cut to a medium shot of the killer in a dirty tank top, holding a woman in a chokehold. He has a pistol pressed to her head. He is sweating and manic, screaming at the off-screen officer: "STAY BACK! I'LL KILL HER! I SWEAR I'LL DO IT!"
[Shot 3: Over-the-Shoulder Resolution] The camera is positioned directly behind the SWAT officer's right shoulder. We see the back of his helmet and his rifle in the foreground. In the mid-ground, the killer is still visible holding the woman. The killer screams one last time: "I'M GONNA DO IT!" Immediately after, the officer's rifle kicks back with a single shot that hits the killer in the head. The killer falls instantly. The woman is left standing, shocked but safe.
Technical Style: High-shutter-speed action, realistic muzzle flashes, handheld camera shake, 24fps, English dialogue.
{
"style": "photorealistic premium Korean high-school drama, K-pop idol visual polish, fast intense dialogue, sharp reaction close-ups, subtle handheld tension",
"scene": "High school rooftop at night, chain-link fence, painted rooftop lines, warm light from stairwell door, distant city lights, school bags near wall",
"characters": {
"female_lead": "Yoon-seo, third-year student, school uniform, blazer, white shirt, loosened tie, pleated skirt, sneakers, long dark hair, calm, cunning, smart",
"male_lead": "Hyun-woo, third-year student, school uniform, blazer half-open, white shirt, loosened tie, slacks, sneakers, defensive, cornered"
},
"audio": "Korean dialogue, burned-in English subtitles at bottom center",
"prompt": "[00:00-00:03] Tight two-shot near the fence, wind moving their blazers. YOON-SEO: \"독서실 갔다고 했지?\" SUB: \"You said you were at study hall.\" HYUN-WOO: \"맞아. 끝나자마자 왔어.\" SUB: \"I was. I came right after.\" [00:03-00:07] She lifts her phone, almost amused. YOON-SEO: \"그럼 왜 네 위치는 노래방이었어?\" SUB: \"Then why was your location at karaoke?\" HYUN-WOO: \"친구들이 잠깐 잡았어.\" SUB: \"My friends stopped me for a minute.\" [00:07-00:11] Quick push-in on her face. YOON-SEO: \"여자친구보다 친구가 더 다정하네.\" SUB: \"Your friends are sweeter than your girlfriend.\" HYUN-WOO: \"그런 뜻 아니야.\" SUB: \"It wasn't like that.\" [00:11-00:15] She scrolls once and turns the screen toward him. YOON-SEO: \"좋아. 그럼 이 사진은 어떻게 설명할래?\" SUB: \"Fine. Then explain this photo.\"",
"end_frame": "Yoon-seo holds her phone up between them. Hyun-woo stares at the screen, his expression cracking for the first time."
}
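In the JSON prompt above, the timing lives inside the "prompt" string as bracketed markers rather than in separate fields. The sketch below shows one way to pull those markers out and confirm they cover the clip without gaps. The helper names are hypothetical, and `SAMPLE` is an abbreviated stand-in that keeps only the four markers, with the action and dialogue text elided.

```python
import re

# Matches segment markers written as [mm:ss-mm:ss], for example [00:03-00:07].
SEGMENT = re.compile(r"\[(\d{2}):(\d{2})-(\d{2}):(\d{2})\]")


def segment_windows(prompt):
    """Return (start, end) pairs in seconds, in order of appearance."""
    windows = []
    for match in SEGMENT.finditer(prompt):
        m1, s1, m2, s2 = (int(part) for part in match.groups())
        windows.append((m1 * 60 + s1, m2 * 60 + s2))
    return windows


def covers_clip(windows, total_seconds):
    """True when the windows start at 0, touch end to start, and end at total_seconds."""
    if not windows or windows[0][0] != 0 or windows[-1][1] != total_seconds:
        return False
    return all(prev[1] == nxt[0] for prev, nxt in zip(windows, windows[1:]))


# Abbreviated stand-in for the "prompt" value; only the markers are kept.
SAMPLE = "[00:00-00:03] ... [00:03-00:07] ... [00:07-00:11] ... [00:11-00:15] ..."

if __name__ == "__main__":
    found = segment_windows(SAMPLE)
    print(found, covers_clip(found, 15))
```

In practice the full string would come from `json.loads(...)["prompt"]`; the 15-second total is taken from the last marker in the prompt.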
setting:
  location: "Wartime field hospital surgery tent"
  time: "Night"
  atmosphere: "Hot, crowded, airless, straight drama. No comedy sketch energy."
characters:
  - name: "Kang Min-jae"
    description: "Korean male military surgeon, early 30s. Slightly messy black hair, visible fatigue from long shifts, quick mouth, quick mind. Wearing surgical scrubs with a loose military jacket."
  - name: "Han Seo-yoon"
    description: "Korean female head nurse, early 30s. Calm, efficient, authoritative. Hair tied back, clean uniform, forceful actions, quiet voice."
performance_tone:
  style: "Naturalistic, grounded, unstaged."
  dynamic: "They work while testing each other. Chemistry comes from eye contact, breath, pauses, and timing instead of overt performance."
  speech_style: "No crisp theatrical diction, no robotic line reading. Allow swallowed words, slight overlap, short pauses, and audible breath. Lines should feel spontaneous and tied to the action."
dialogue:
  - speaker: "Min-jae"
    line: "Hasn't anyone ever told you? When you're angry... you actually look better."
    delivery: "Casual, low voice, lightly testing her while she is busy."
  - speaker: "Seo-yoon"
    line: "They have. Usually when they were under my hands."
    delivery: "Calm, cutting, delayed by half a beat, with one brief sharp look."
  - speaker: "Min-jae"
    line: "Damn. I think I may actually be falling for you."
    delivery: "Unplanned, genuinely hit, followed by a small breathy laugh."
camera_direction:
  style: "Handheld only, as if a third person is standing beside the table and overhearing the exchange."
  movement: "Reactive pans driven by character reactions. No mechanical left-right swinging, no flashy choreography, no floating gimbal feel."
  shot_plan:
    - timestamp: "0.0-3.0s"
      action: "Move through the tent interior past trays, gauze, clamps, and medics crossing frame, then land at the operating table. Seo-yoon arranges instruments. Min-jae pulls off one glove and glances at her."
    - timestamp: "3.0-6.5s"
      action: "Medium close shot on Min-jae. He tosses the line while she is still working, like he is testing the water rather than making a grand move."
    - timestamp: "6.5-11.0s"
      action: "Camera pulls to Seo-yoon. She keeps setting instruments in place without looking at him at first. On 'under my hands,' she gives him one brief, clean, sharp look."
    - timestamp: "11.0-15.0s"
      action: "Snap back to Min-jae, closer than before. Catch the half-second blank look, the exhale, the small laugh, and the unpolished final line before he drops his gaze back to work."
action_direction:
  - "Neither of them stops moving while speaking."
  - "Seo-yoon sorts instruments, turns a tray, wipes her hands, passes a clamp."
  - "Min-jae pulls off a glove, braces a hand on the table edge, looks down with a short laugh, then looks back up."
visuals:
  lighting: "Harsh surgical lamps striking faces and hands directly, with cold green shadows in the tent background."
  texture: "Real skin texture, sweat sheen, tired eyes, no skin smoothing, no soft-focus glamour, no idol-drama diffusion."
  framing: "Tight, reactive, pressure-filled."
audio:
  elements:
    - "Light metal instrument clinks"
    - "Fabric rustling"
    - "Subdued distant orders"
    - "Low tent room tone"
  voice: "Close, dry, natural, with audible breath."
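The shot plan above gives each spoken line a fixed window, so it can help to see how fast each line would have to be delivered. The sketch below pairs the three lines with the windows the shot plan assigns them and computes the words per second needed if the line filled the whole window. Counting words by whitespace is a simplification, and since each window also holds pauses and action, the real delivery would need to be somewhat faster than the figures printed.

```python
# (speaker, window start, window end, line), taken from the dialogue and
# shot_plan entries. The line-to-window pairing follows the shot descriptions.
SPOKEN = [
    ("Min-jae", 3.0, 6.5,
     "Hasn't anyone ever told you? When you're angry... you actually look better."),
    ("Seo-yoon", 6.5, 11.0,
     "They have. Usually when they were under my hands."),
    ("Min-jae", 11.0, 15.0,
     "Damn. I think I may actually be falling for you."),
]


def required_rate(start, end, line):
    """Words per second needed if the line filled its whole window."""
    return len(line.split()) / (end - start)


if __name__ == "__main__":
    for speaker, start, end, line in SPOKEN:
        rate = required_rate(start, end, line)
        print(f"{speaker} {start:.1f}-{end:.1f}s: {rate:.2f} words/s")
```

A window whose rate sits well above the others is a candidate for a shorter line or a longer shot.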
Style: Cinematic, romantic tension, realistic handheld
Setting: Night, Italian city (Rome/Milan), heavy rain, warm streetlights reflecting on wet pavement
Mood: Intimate + emotionally charged
0–2s Handheld close-up, slightly unstable camera. Heavy rain pouring onto the pavement. A man and a woman stand under one umbrella, very close. Water drips from the edges. Sound of rain dominates.
2–4s Close-up on the man's face as he looks at her calmly but intensely. He speaks softly: "That was your stop, wasn't it?"
4–6s Cut to the woman, slightly flustered. She shifts subtly but stays under the umbrella. She replies quietly: "…You're too close."
6–8s Camera slowly pushes in, handheld. The man gives a faint half-smile, rain sliding down his jacket. "That's a weak excuse."
8–10s Close-up on their hands near the umbrella handle. His hand is trembling slightly. She notices, looks up at him: "Your hand is shaking."
10–12s Brief silence—only rain and distant traffic. He looks away for a moment, then back at her: "…You noticed."
12–15s Camera slowly circles them, close together under the umbrella. Tension builds—neither steps away. She moves slightly closer, almost unconsciously. Breath visible in the cold air.
Final frame: Close-up of their faces inches apart, rain falling, city lights blurred in the background. Cut to black before anything happens.
Key Notes
No slow motion → natural, grounded movement
Strong emphasis on rain sound + breathing + subtle tension
Handheld camera for realism
Focus on micro-expressions and emotional proximity
15-second cinematic romance, slow-burn emotional tone, one man and one woman only, both adults in their mid-20s, western-style leads, casual luxury wardrobe, dark quiet apartment living room at night, refined romantic tension, natural lip sync, no subtitles, no text on screen.
Character continuity:
Female lead: adult western woman, mid-20s, elegant natural beauty, long brunette hair with soft loose waves, minimal makeup, emotionally guarded but faintly amused, seated on the sofa. Outfit: soft ivory off-shoulder knit sweater, dark straight-leg jeans, barefoot, understated jewelry.
Male lead: adult western man, mid-20s, handsome refined features, slightly messy dark hair, thin metal-frame glasses, calm warm expression. Outfit: charcoal knit sweater over a white T-shirt, dark trousers, sleeves pushed once, relaxed but polished.
Environment:
Upscale apartment living room, soft sofa, low coffee table, one warm lamp, deep evening shadows, muted neutral palette, intimate silence, shallow depth of field, premium film texture, modern romantic realism.
0-3s:
Wide-to-medium slow push-in. She sits in one corner of the sofa with arms folded, looking away. He crosses the room and sits on the edge of the coffee table facing her, leaving a small respectful distance. Hold the silence and tension.
3-6s:
Medium close-up on the man. He studies her face, voice low, calm, almost smiling:
“I tried very hard to have an ordinary evening.”
6-9s:
Close-up on the woman. She turns her eyes to him at last, cool but intrigued, with the faintest teasing edge:
“And how did that go?”
9-12s:
Close-up on the man. He lets out a quiet breath, gaze steady. He reaches toward a loose strand near her cheek, stopping just before touching:
“Poorly. You were in all of it.”
12-15s:
Side two-shot. She lightly catches his wrist before he pulls away, not rejecting him, only holding him there. A small unwilling smile appears:
“That is not helping me stay angry.”
Hold on the shared gaze, the restrained smile, and the unresolved tenderness.
Motion and style rules:
Slow elegant camera movement, meaningful pauses, micro-expressions, lingering eye contact, almost-touch tension, realistic hand motion, restrained acting, no kneeling, no raised voices, no crying, no extra characters, no exaggerated gestures, no waxy skin, no stiff posing, premium romantic realism, emotionally charged final frame.
Character tone:
high-end romantic comedy, deadpan flirtation, over-serious male lead, quick-witted female lead, cinematic realism, sweet-chaotic chemistry, every frame like a poster
Male lead:
bespoke black suit, white shirt collar slightly open, handsome and severe, powerful aura, trying very hard to look cold and dominant, but secretly nervous and flustered, tiny tells betray him: slightly crooked tie, tight jaw, faintly trembling fingertips
Female lead [@ Image1]:
fitted slip dress / refined Chanel-inspired set, long hair slightly messy, elegant and soft-looking but emotionally sharper than him, stubborn, dryly funny, outwardly cornered for a moment, then visibly unimpressed, holding back laughter
Action + expression changes:
the male lead forcefully steps in for a dramatic wall-pin pose, one hand braced on the wall, closing the distance too seriously, trying to look intense; his expression starts cold but gradually cracks into restrained embarrassment
the female lead [@ Image1] steps back once, eyes widening, then notices his crooked tie and trembling hand; her expression changes from guarded resistance to deadpan disbelief and almost-laughing annoyance
their noses nearly touch, breathing overlaps, the tension becomes playful and absurd instead of painful
the male lead lightly lifts her chin, trying to recover his cool image; the female lead stares at him like she is watching someone forget his own script
Dialogue:
Male lead: Are you done yet?
Female lead [@ Image1]: Fix your tie first.
Male lead: I am being serious.
Female lead [@ Image1]: Then stop stepping on my heel.
Male lead: ...That was deliberate.
Female lead [@ Image1]: Your shaking hand says otherwise.
Film stock: 35mm Kodak Vision3 500T, heavy organic film grain, high contrast.
Lens/Aperture: 35mm Anamorphic lens, f/2.8. Deep depth of field to see both characters clearly.
Color Grade: "Saturated 90s Diner" palette. Warm nicotine yellows, bright red vinyl booths, and harsh fluorescent overheads.
Camera Behavior: Slow, rhythmic "Shot/Reverse Shot" switching. Starting with a slow creeping zoom on the lead's face.
Atmosphere: A half-empty, sun-drenched diner. Dust motes floating in the light. Tense, quiet, blue-collar grit.
Audio: Immersive spatial sound design. The distant clinking of silverware, a coffee pot pouring. Dialogue lipsync: Character @ image1 leans in and says: "I'm gonna ask you one more time, and if you lie, God himself won't be able to find what's left of you."
[IMAGE REFERENCES / LEGEND]
@ image1: The lead enforcer. Maintain exact beard, dark sunglasses, and black headphones (worn around the neck for this scene). Keep exact same character, style, and lighting.
Character 2: A skinny, sweating man in a cheap, rumpled grey suit sitting opposite him, trembling while holding a ceramic coffee mug.
[TIMELINE SECOND BY SECOND]
0-4s: [Medium Shot - Over the Shoulder] + [Focus on Character @ image1] + [Action: He is calmly stirring a cup of black coffee with a silver spoon] + [SFX: Rhythmic metallic tink-tink-tink of the spoon against porcelain].
4-8s: [Close-up] + [Dialogue lipsync: Character @ image1 stops stirring, looks up slowly, and delivers the line with a cold, terrifying calm] + [Physics: Small wisps of steam rising from the coffee cup].
8-11s: [Reverse Shot] + [Focus on Character 2] + [Action: He visibly gulps, a bead of sweat rolling down his forehead, eyes darting nervously] + [SFX: Muffled sound of a waitress laughing in the far background].
11-15s: [Low-angle Profile Shot] + [Character @ image1 slowly reaches into his pocket, pulling out a heavy, chrome-plated handgun and placing it quietly on the table next to his saucer] + [Lighting: Sunlight glints sharply off the chrome].
[STYLE & QUALITY BOOSTERS]
Movie-level realistic facial features, no deformation, stable character consistency. Professional 90s crime film aesthetic. High-fidelity skin textures (pores, sweat).
concept:
style: "Fast-paced American diner aesthetic. Neon lights, steaming coffee, and the clatter of silverware. 'The Office' style camera pans."
characters:
trader: "A 'Fin Bro' in a Patagonia vest, looking stressed at his laptop."
waitress: "An older, unimpressed waitress with a 'seen-it-all' attitude."
dialogue_logic:
trader: "If I tip you 20% on a forty-dollar bill, but I only have a hundred-dollar bill and two fives, how much do you owe me back?"
waitress: "Zero."
trader: "Excuse me? The math says you owe me sixty-two dollars."
waitress: "Honey, if you're asking me that question, you're not leaving a tip. You're just paying the bill. I'm keeping the fives."
trader: "(Stares blankly) ...Wait."
waitress: "(To the camera) I think he needs a tutor, not a trader."
production_notes:
twist: "The joke is that the trader is trying to be 'smart' with math, but the waitress is 'smart' about human nature: she knows an arrogant guy asking math questions is usually a bad tipper."
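The trader's figure of sixty-two dollars can be checked with a small sketch. It assumes he hands over all the cash he mentions (the hundred plus both fives) and that the tip is taken on the forty-dollar bill; the function and variable names are illustrative and not part of the prompt.

```python
# Hypothetical check of the trader's arithmetic; all names are illustrative.

def change_due(bill: int, tip_percent: int, cash_handed_over: int) -> int:
    """Return the change owed after paying the bill plus a percentage tip."""
    tip = bill * tip_percent // 100      # 20% of 40 is 8
    return cash_handed_over - (bill + tip)

# Assumption: he hands over the hundred and both fives.
cash = 100 + 5 + 5
print(change_due(bill=40, tip_percent=20, cash_handed_over=cash))
```

The waitress's "Zero" is not an arithmetic answer; it is her read on whether a tip is coming at all.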
title: "The Common Sense Test"
concept:
style: "Dry, satirical US medical dramedy (think 'House' or 'Scrubs'). Cinematic lighting with high contrast. The vibe is professional but sarcastic."
setting: "A high-end private clinic in Los Angeles. Glass partitions, Apple iMac on the desk, and a view of the city through the window."
characters:
doctor: "Dr. Miller; mid-40s, silver-haired, extremely dry wit. Wears a tailored white coat over a navy shirt. He's seen it all."
patient: "Tiffany; 20s, wearing high-end 'athleisure' (pink Lululemon-style set). She looks like she just came from a Pilates class. Sharp and unimpressed."
nurse: "Brian; early 20s, over-eager medical intern/nurse. Wears bright blue scrubs and a stethoscope he's a bit too proud of."
timeline:
0-4s:
action: "Slow tracking shot into the exam room. Dr. Miller is clicking a pen, looking at a clipboard. Tiffany is sitting on the exam table, scrolling on her phone."
visuals: "Clean, clinical aesthetic. The Nurse stands by the door with a tablet, looking very serious."
4-7s:
action: "The Doctor looks up, peering over his glasses to start the cognitive test."
dialogue: "Doctor: 'Okay, last one. You have seventy bucks in your wallet. You go to the store and buy thirty dollars worth of groceries. How much change does the clerk hand you?'"
7-9s:
action: "Tiffany doesn't even look up from her phone. She answers instantly with a 'duh' tone."
dialogue: "Tiffany: 'Twenty.'"
9-11s:
action: "The Doctor snaps the clipboard shut and starts walking toward the door."
dialogue: "Doctor: 'Perfect. She's lucid. Discharge her. She's good to go.'"
11-13s:
action: "The Nurse stops them, looking confused and doing math on his fingers. The camera 'snap-zooms' on his worried face."
dialogue: "Nurse: 'Uh, wait, Dr. Miller? Seventy minus thirty is forty... Shouldn't she get forty back?'"
13-15s:
action: "The Doctor stops at the door, slowly turns around, and gives the nurse a long, deadpan stare of pure disappointment."
dialogue: "Doctor: 'Brian... I think we need to admit *you* for a brain scan. She gave him the fifty. Leave her alone.'"
production_notes:
subtext: "The humor relies on the fact that no one hands over seventy dollars (a 50 and a 20) for a thirty-dollar bill. You just hand over the fifty. The patient passed the 'sanity' test; the nurse failed the 'real world' test."
camera_work: "Uses a 'The Office' style fast-zoom on the Nurse's confused face and then a slow, silent beat on the Doctor's reaction for comedic timing."
audio_design: "Standard muffled hospital ambiance. A sharp 'pen click' sound effect to punctuate the Doctor's movements."
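The two competing answers in the scene come from the same subtraction with different assumptions about what the patient hands to the clerk. A minimal sketch, with illustrative names that are not part of the prompt:

```python
# Hypothetical illustration of the two readings of the Doctor's question.

def change(handed_over: int, price: int) -> int:
    """Return the change for a purchase, given the cash actually handed over."""
    return handed_over - price

PRICE = 30

# The Nurse's reading: she hands over the whole seventy dollars.
nurse_answer = change(70, PRICE)

# Tiffany's reading: she hands over only the fifty and keeps the twenty.
tiffany_answer = change(50, PRICE)

print(nurse_answer, tiffany_answer)
```

Both results are arithmetically valid; the joke depends on which amount a real shopper would hand over.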
Interior of a parked car at night under a single streetlight, rain hitting the windshield, soft reflections moving across their faces. A man in his 30s grips the steering wheel tightly, staring straight ahead, avoiding eye contact. A woman in her late 20s watches him intensely, searching his face, her expression fragile but steady.
SHOT1: Woman looks at him, "Look at me."
SHOT2: Close-up on the man, his knuckles white on the wheel, jaw locked. Streetlight flickers across his face through rain-streaked glass. He exhales sharply but doesn't turn. Man: "Not now."
SHOT3: Extreme close-up on the woman, eyes filled with tears but steady, no hesitation anymore. Rain reflections move across her face like slow motion. Woman: "If you can't look at me…" A beat. "…then you already lost me." Silence. Only rain.
Two small mice in a discussion about whose idea it was to go fishing on a Thursday. A damn Thursday. They know it rains every Thursday. And the other says… I think it's Wednesday.
Realistic handheld footage of a MacBook Pro screen filling most of the frame, showing a Zoom meeting window with only one young woman in a tidy bedroom, attending a formal meeting from home. She wears a dark blazer and looks professional from the waist up. The room is bright, natural, and believable. The shot should preserve realistic screen reflections, subtle moiré pixel texture, tiny dust on the glass, and slight handheld camera shake.
After a brief moment, she hears a noise from the door offscreen. She glances to the side, slightly startled, then quickly stands up and starts walking away from her chair to answer it. Because the camera is filming the laptop screen, we see her moving inside the Zoom window. Halfway to the door, she suddenly freezes, looks down, and realizes she is only wearing underwear on her lower body. Her expression instantly shifts to embarrassment and panic as she remembers that her Zoom camera is still on. She spins around and rushes back toward the screen in a frantic, awkward, comedic way. She quickly returns to the laptop and blocks the camera with both hands or throws herself in front of it, covering the lens and ending the shot in chaotic close-up.
The tone is realistic and comedic, with strong contrast between formal upper-body business attire and the accidental lower-body mistake. Emphasize awkward humor, authentic facial acting, natural body motion, realistic indoor lighting, handheld movement, slight motion blur, and believable Zoom-call visuals. Keep it non-explicit: no nudity, no revealing details, no erotic framing, no vulgarity. The focus is on embarrassment, urgency, and comedy.
FORMAT: 15s / 6 SHOTS / heartwarming woodland comedy / short dialogue
STYLE: hyperreal cinematic woodland animation, warm golden hour light filtering through bark cracks, cozy mossy interior of a hollow tree stump café, wooden table with acorn cups and leaf napkins, soft depth of field, gentle camera movement, ultra-detailed fur and feather textures, expressive emotional faces, premium stylized 3D animated feature quality
Shot 01 (0:00-0:02)
Wide establishing shot, slow dolly in. Inside the cozy hollow tree café, Harold the tiny hedgehog sits at a round table with a birch-bark clipboard and clears his throat. He smiles warmly and says, “Welcome, everyone.”
Shot 02 (0:02-0:04)
Medium shot. Sylvie the squirrel bursts into frame, tail twitching wildly, clutching one acorn. She blurts out, “I lost my confidence!” and drops into the chair.
Shot 03 (0:04-0:06)
Fast side shot. Barry the blue jay swoops in dramatically, lands too hard, and knocks over an acorn cup. He puffs his chest and says, “Relax. I’m incredible.”
Shot 04 (0:06-0:09)
Quick comedic push-ins between faces. Sylvie looks overwhelmed, Harold gives Barry a patient look, Barry tries to look cool while the spilled acorn cup rolls across the table and leaf napkins flutter.
Shot 05 (0:09-0:12)
Slow circular dolly around the table. Barry shrugs and says, “I once sat on my wings.” Sylvie breaks into laughter first, then Harold, and finally Barry joins in, feathers and quills shaking with the laugh.
Shot 06 (0:12-0:15)
Pull back to a warm wide shot. They clink acorn cups. Harold smiles and says, “We’re all a little nutty.” The three lean in for a tiny fluffy group hug as fireflies drift upward in the golden tree light.
NEGATIVE: flat lighting, low-detail fur, stiff facial animation, weak lip sync, muddy textures, cheap cartoon look, jittery camera, distorted anatomy, empty background, dull expressions
ARGUMENT - WHIP PAN FOLLOWS SPEAKER
15 Seconds / Continuous Shot / Dynamic Camera
SHOT DESCRIPTION:
Interior. Apartment. Day. WOMAN (30s) and MAN (32) mid-argument. Camera WHIP PANS to follow WHOEVER IS SPEAKING. Clean sequence, no character appears twice in same position.
0:00-0:02
Woman near door.
WOMAN
You promised. You said you'd be here.
0:02-0:04
WOMAN (CONT'D)
Our appointment was at three o'clock.
Where were you?
She steps forward.
Man turns away.
MAN
I got held up at work. It's not a big deal.
0:04-0:06
Eyes wet but controlled anger.
WOMAN
Not a big deal? I took off work.
I rearranged everything.
Shoulders tense. He responds.
MAN
You think I don't know that?
You think I don't feel like shit about it?
0:06-0:08
Voice drops, sharper.
WOMAN
Feeling like shit doesn't change anything.
It doesn't undo it.
Man moves toward her. Defensive posture breaking.
MAN
What do you want from me? An apology?
Fine. I'm sorry.
0:08-0:10
Her face hardens. Sadness beneath anger.
WOMAN
I don't want apologies anymore, David.
Man stops moving. Her words hit different.
WOMAN (CONT'D)
(from off-screen, but camera on man)
I want you.
Silence stretches. His face changes.
0:10-0:12
Man opens mouth to respond.
Camera WHIP PANS toward woman but OVERSHOOTS past her, continuing past the apartment door into the hallway beyond.
0:12-0:15
WHIP PAN CONTINUES, unmotivated, chaotic, through open doorway across hallway.
Lands on: Apartment door cracked open. Inside: GOLDEN RETRIEVER PUPPY (4 months, fluffy, confused).
PUPPY's head tilts. One ear up, one ear sideways. Looking directly at camera. Confused. Concerned. Like it heard the argument and doesn't understand why two humans are sad.
HOLD on puppy's face. Blinks slowly. Eyes innocent.
SOUND: Argument stops. Only puppy's small breathing. Silence.
TECHNICAL:
- Camera: Whip pans ONLY, follow speaker's voice
- Pan motivation: Who is speaking determines camera movement
- Final pan: Unmotivated (emotional breakdown of camera discipline)
- 8K, Ultra High Quality
SOUND LAYERS:
0:00-0:04: Normal breathing, measured voices
0:04-0:08: Volume rising, frustration peaks
0:08-0:10: Emotion drops to sadness
0:10-0:15: Argument fades, puppy breathing only
SHOT1
Tight frontal close-up on Character A (male, late 50s).
Sweat beads on his temple. Jaw tight. Eyes locked forward but blinking too fast.
Camera slow push-in. Lamp hums softly.
A:“You’re staring at me like you already decided.”
SHOT2
Profile close-up of Character B (female, early 30s).
Perfect stillness. No blink. Slight curl at the corner of her mouth.
Light cuts sharply across her cheekbone.
B:“No. I’m staring because you’re about to contradict yourself.”
SHOT3
Over-the-shoulder from B, framing A smaller now.
A swallows. His hands tighten, knuckles whitening. Breathing audible.
A:“You don’t have proof.”
SHOT4
Extreme close-up on B’s eyes.
A slow inhale. A quiet smile that never reaches her eyes.
B:“I don’t need it anymore.”
SHOT1
Wide shot, both characters against city lights.
Character A looks outward. Character B watches him instead.
A:“Do you ever feel like there’s a delay… between thought and choice?”
SHOT2
Close-up on B, wind moving her hair across her face.
Eyes soften. Voice calm, careful.
B:“That’s not a delay. That’s someone else waiting to see what you’ll do.”
SHOT3
Low-angle close-up on A.
His smile fades. Pupils dilate slightly. Breath slows unnaturally.
A:“That’s not funny.”
SHOT4
Slow dolly toward B, neon reflections flicker in her eyes.
B:“I know. That’s why I stopped laughing.”