Clone repository
!git clone https://github.com/AliaksandrSiarohin/first-order-model
Cloning into 'first-order-model'...
remote: Enumerating objects: 85, done.
remote: Total 85 (delta 0), reused 0 (delta 0), pack-reused 85
Unpacking objects: 100% (85/85), done.
cd first-order-model
/content/first-order-model
Mount your Google Drive folder in Colab
from google.colab import drive
drive.mount('/content/gdrive')
Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly Enter your authorization code: ·········· Mounted at /content/gdrive
Add the folder https://drive.google.com/drive/folders/1kZ1gCnpfU0BnpdU47pLM_TQ6RypDDqgw?usp=sharing to your Google Drive.
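Before running the later cells, it can help to confirm that the shared folder really is visible under the mounted drive. The sketch below is only an illustration: the helper name `missing_files` is hypothetical, and the folder path and file names are the ones used by the later cells of this notebook.

```python
import os

def missing_files(folder, names):
    """Return the entries of `names` that are not regular files inside `folder`."""
    return [name for name in names if not os.path.isfile(os.path.join(folder, name))]

if __name__ == "__main__":
    # Path and file names taken from the later cells of this notebook.
    folder = '/content/gdrive/My Drive/first-order-motion-model'
    needed = ['02.png', '04.mp4', 'vox-cpk.pth.tar']
    print('Missing files:', missing_files(folder, needed))
```

An empty list means all three files were found; anything else usually means the folder has not been added to the drive or the drive is not mounted.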
Load driving video and source image
import imageio
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from skimage.transform import resize
from IPython.display import HTML
import warnings
warnings.filterwarnings("ignore")
source_image = imageio.imread('/content/gdrive/My Drive/first-order-motion-model/02.png')
driving_video = imageio.mimread('/content/gdrive/My Drive/first-order-motion-model/04.mp4')
#Resize image and video to 256x256
source_image = resize(source_image, (256, 256))[..., :3]
driving_video = [resize(frame, (256, 256))[..., :3] for frame in driving_video]
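The two resize lines above produce 256x256 RGB arrays with the alpha channel dropped. A small sanity check like the following can catch a wrongly shaped or wrongly scaled input before it reaches the model. This is a sketch under assumptions: `check_model_inputs` is a hypothetical helper, and the expectation of float values in [0, 1] is based on what `skimage.transform.resize` normally returns for 8-bit images, not on a documented requirement of the model.

```python
import numpy as np

def check_model_inputs(source_image, driving_video, size=256):
    """Return a list of problems found; an empty list means the inputs look fine."""
    problems = []
    frames = [("source_image", source_image)]
    frames += [(f"driving_video[{i}]", f) for i, f in enumerate(driving_video)]
    for name, frame in frames:
        frame = np.asarray(frame)
        if frame.shape != (size, size, 3):
            # Wrong resolution or an extra alpha channel.
            problems.append(f"{name}: shape {frame.shape}, expected {(size, size, 3)}")
        elif not np.issubdtype(frame.dtype, np.floating):
            # For example a uint8 frame that was never passed through resize().
            problems.append(f"{name}: dtype {frame.dtype}, expected float")
        elif frame.min() < 0.0 or frame.max() > 1.0:
            problems.append(f"{name}: values outside [0, 1]")
    return problems
```

Calling `check_model_inputs(source_image, driving_video)` after the resize step should return an empty list.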
def display(source, driving, generated=None):
    fig = plt.figure(figsize=(8 + 4 * (generated is not None), 6))
    ims = []
    for i in range(len(driving)):
        cols = [source]
        cols.append(driving[i])
        if generated is not None:
            cols.append(generated[i])
        im = plt.imshow(np.concatenate(cols, axis=1), animated=True)
        plt.axis('off')
        ims.append([im])
    ani = animation.ArtistAnimation(fig, ims, interval=50, repeat_delay=1000)
    plt.close()
    return ani
HTML(display(source_image, driving_video).to_html5_video())
Create a model and load checkpoints
from demo import load_checkpoints
generator, kp_detector = load_checkpoints(config_path='config/vox-256.yaml',
                                          checkpoint_path='/content/gdrive/My Drive/first-order-motion-model/vox-cpk.pth.tar')
Perform image animation
from demo import make_animation
from skimage import img_as_ubyte
predictions = make_animation(source_image, driving_video, generator, kp_detector, relative=True)
#save resulting video
imageio.mimsave('../generated.mp4', [img_as_ubyte(frame) for frame in predictions])
#video can be downloaded from /content folder
HTML(display(source_image, driving_video, predictions).to_html5_video())
100%|██████████| 211/211 [00:07<00:00, 29.03it/s]
In the cell above we use relative keypoint displacement to animate the objects. We can use absolute coordinates instead, but then all the object proportions will be inherited from the driving video. For example, Putin's haircut will be extended to match Trump's haircut.
predictions = make_animation(source_image, driving_video, generator, kp_detector, relative=False, adapt_movement_scale=True)
HTML(display(source_image, driving_video, predictions).to_html5_video())
100%|██████████| 211/211 [00:07<00:00, 28.72it/s]
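The difference between the relative and absolute modes described above can be illustrated on bare keypoint coordinates. The following is a simplified sketch, not the code behind `make_animation`: the function name `transfer_keypoints` and the `scale` parameter are hypothetical, and only keypoint positions are handled.

```python
import numpy as np

def transfer_keypoints(kp_source, kp_driving, kp_driving_initial,
                       relative=True, scale=1.0):
    """Compute target keypoints for one driving frame.

    All inputs are arrays of shape (num_keypoints, 2).
    """
    kp_source = np.asarray(kp_source, dtype=float)
    kp_driving = np.asarray(kp_driving, dtype=float)
    kp_driving_initial = np.asarray(kp_driving_initial, dtype=float)
    if relative:
        # Relative mode: apply only the motion since the first driving frame,
        # so the source keeps its own proportions.
        return kp_source + (kp_driving - kp_driving_initial) * scale
    # Absolute mode: copy the driving keypoints directly, so the driving
    # object's proportions are inherited.
    return kp_driving.copy()
```

In relative mode a driving frame identical to the first one leaves the source keypoints unchanged, which is why the first driving frame should show a pose similar to the source image.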
First, we need to crop the face from both the source image and the video. A simple graphics editor such as Paint is enough for cropping the image. Cropping the video is more complicated; you can use ffmpeg for this.
!ffmpeg -i /content/gdrive/My\ Drive/first-order-motion-model/07.mkv -ss 00:08:57.50 -t 00:00:08 -filter:v "crop=600:600:760:50" -async 1 hinton.mp4
ffmpeg version 3.4.6-0ubuntu0.18.04.1 Copyright (c) 2000-2019 the FFmpeg developers
Input #0, matroska,webm, from '/content/gdrive/My Drive/first-order-motion-model/07.mkv':
  Duration: 00:14:59.73, start: 0.000000, bitrate: 2343 kb/s
  Stream #0:0(eng): Video: vp9 (Profile 0), yuv420p(tv, bt709), 1920x1080, SAR 1:1 DAR 16:9, 29.97 fps, 29.97 tbr, 1k tbn, 1k tbc (default)
  Stream #0:1(eng): Audio: aac (LC), 44100 Hz, stereo, fltp (default)
Stream mapping:
  Stream #0:0 -> #0:0 (vp9 (native) -> h264 (libx264))
  Stream #0:1 -> #0:1 (aac (native) -> aac (native))
Output #0, mp4, to 'hinton.mp4':
  Stream #0:0(eng): Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 600x600 [SAR 1:1 DAR 1:1], q=-1--1, 29.97 fps, 30k tbn, 29.97 tbc (default)
  Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 128 kb/s (default)
frame= 240 fps=2.9 q=-1.0 Lsize= 1301kB time=00:00:08.01 bitrate=1330.6kbits/s speed=0.0971x
video:1166kB audio:125kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.761764%
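In the ffmpeg command above, the `crop` filter takes its arguments in the order `width:height:x:y`, where `x` and `y` give the top-left corner of the cropped region. The sketch below builds the same kind of command from named parameters and checks that the region fits inside the frame. Both helper names are hypothetical, and the `-async 1` option from the original command is left out for brevity.

```python
import shlex

def crop_fits(frame_width, frame_height, width, height, x, y):
    """True if the crop rectangle lies entirely inside the frame."""
    return x >= 0 and y >= 0 and x + width <= frame_width and y + height <= frame_height

def build_crop_command(src, dst, start, duration, width, height, x, y):
    """Build an ffmpeg argument list that cuts a time range and crops a rectangle."""
    return [
        "ffmpeg", "-i", src,
        "-ss", start,        # start time of the clip
        "-t", duration,      # length of the clip
        "-filter:v", f"crop={width}:{height}:{x}:{y}",
        dst,
    ]

if __name__ == "__main__":
    cmd = build_crop_command(
        "/content/gdrive/My Drive/first-order-motion-model/07.mkv", "hinton.mp4",
        "00:08:57.50", "00:00:08", 600, 600, 760, 50)
    # The input above is 1920x1080, so this 600x600 region at (760, 50) fits.
    print(crop_fits(1920, 1080, 600, 600, 760, 50))
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed command can be pasted into a cell after `!`, or the list can be passed to `subprocess.run`.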
Another possibility is to use a screen recording tool or, if you need to crop many images at once, a face detector (https://github.com/1adrianb/face-alignment); see https://github.com/AliaksandrSiarohin/video-preprocessing for the preprocessing of VoxCeleb.
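When cropping many images with a face detector, a common step is to turn each detected bounding box into a square region with some margin around the face. The sketch below shows only that step. It assumes the box `(left, top, right, bottom)` has already been obtained from some detector; the helper name `square_crop` and the default margin are illustrative choices, not values taken from the preprocessing repository.

```python
import numpy as np

def square_crop(image, box, margin=0.3):
    """Crop a square region around a bounding box, enlarged by `margin`.

    The region is clipped at the image borders, so it can end up smaller
    or non-square when the face is close to an edge.
    """
    left, top, right, bottom = box
    center_x = (left + right) / 2
    center_y = (top + bottom) / 2
    # Half the side of the square: the larger box dimension plus the margin.
    half = max(right - left, bottom - top) * (1 + margin) / 2
    height, width = image.shape[:2]
    x0 = int(max(0, round(center_x - half)))
    x1 = int(min(width, round(center_x + half)))
    y0 = int(max(0, round(center_y - half)))
    y1 = int(min(height, round(center_y + half)))
    return image[y0:y1, x0:x1]
```

The result can then be passed through the same `resize(..., (256, 256))` step used for the other inputs.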
source_image = imageio.imread('/content/gdrive/My Drive/first-order-motion-model/09.png')
driving_video = imageio.mimread('hinton.mp4', memtest=False)
#Resize image and video to 256x256
source_image = resize(source_image, (256, 256))[..., :3]
driving_video = [resize(frame, (256, 256))[..., :3] for frame in driving_video]
predictions = make_animation(source_image, driving_video, generator, kp_detector, relative=True,
                             adapt_movement_scale=True)
HTML(display(source_image, driving_video, predictions).to_html5_video())
100%|██████████| 240/240 [00:08<00:00, 29.00it/s]