Self-Supervised Video Forensics by Audio-Visual Anomaly Detection

Chao Feng, Ziyang Chen, Andrew Owens
University of Michigan, Ann Arbor

CVPR 2023 (Highlight)

This repository contains the code for audio-visual video forensics described in the paper above.

Steps to run the Python code directly:

pip install -r requirements.txt

# 1. Test a single fake video (the video path must be a full path)
CUDA_VISIBLE_DEVICES=8 python detect.py --test_video_path /home/xxxx/test_video.mp4 --device cuda:0 --max-len 50 --n_workers 4 --bs 1 --lam 0 --output_dir /home/xxx/save
# 2. Test a list of fake videos (the .txt file path must be a full path, and the list must contain the full paths of the testing videos)
CUDA_VISIBLE_DEVICES=8 python detect.py --test_video_path /home/xxxx/fake_videos.txt --device cuda:0 --max-len 50 --n_workers 4 --bs 1 --lam 0 --output_dir /home/xxx/save
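
The list file for the second command can be generated with a short script. The following is a minimal sketch, not part of the repository: the helper name write_video_list, the one-path-per-line layout, and the .mp4 suffix filter are all assumptions, so check them against what detect.py actually parses.

```python
from pathlib import Path

# Assumption: testing videos are .mp4 files; extend this set if needed.
VIDEO_SUFFIXES = {".mp4"}


def write_video_list(video_dir, list_path, suffixes=VIDEO_SUFFIXES):
    """Write the full path of every video under video_dir to list_path.

    Hypothetical helper: one absolute path per line is an assumed layout.
    Returns the list of paths that were written.
    """
    video_dir = Path(video_dir).resolve()  # resolve() yields a full path
    paths = sorted(
        p for p in video_dir.rglob("*")
        if p.is_file() and p.suffix.lower() in suffixes
    )
    Path(list_path).write_text("".join(f"{p}\n" for p in paths))
    return paths
```

The resulting file can then be passed to --test_video_path by its full path.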

(lam is a tunable hyperparameter that controls how two scores described in the method section of the paper are combined: one from the distributions over delays and one from the audio-visual network activations. The default, lam=0, uses the distributions over delays only.)
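
To illustrate the role of lam, here is a minimal sketch that assumes the two per-video scores are combined as a weighted sum. The function name combine_scores and the exact formula are assumptions for illustration; the combination implemented in detect.py may differ. The only property taken from the description above is that lam=0 leaves the delay-distribution score unchanged.

```python
def combine_scores(delay_score, feature_score, lam=0.0):
    """Combine two anomaly scores for one video.

    Assumed form (not confirmed by the repository): a weighted sum in which
    lam scales the contribution of the audio-visual activation score.
    With lam=0 the result is the delay-distribution score alone.
    """
    return delay_score + lam * feature_score


# Example with made-up scores: raising lam adds more of the feature score.
delay_only = combine_scores(2.0, 5.0)          # lam defaults to 0
mixed = combine_scores(2.0, 5.0, lam=0.5)
```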

The audio-visual synchronization model checkpoint, sync_model.pth, can be downloaded from this link. Note that the audio-visual synchronization model consists of a video branch, an audio branch, and an audio-visual feature fusion transformer.

When the run finishes, an output.log file and a testing_score.npy file are written to output_dir; they record the scores for all the testing videos.
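
The saved scores can be inspected with NumPy. This is a minimal sketch: the helper name load_scores is hypothetical, and the shape and ordering of the array inside testing_score.npy are not documented here, so inspect them before relying on any particular layout. If the file turns out to hold a pickled object array, np.load would additionally need allow_pickle=True.

```python
from pathlib import Path

import numpy as np


def load_scores(output_dir):
    """Load testing_score.npy from the directory passed as --output_dir.

    Hypothetical helper; the array layout is an assumption to verify.
    """
    return np.load(Path(output_dir) / "testing_score.npy")
```

For example, scores = load_scores("/home/xxx/save") followed by print(scores.shape) shows how many scores were recorded.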

The audio-visual synchronization model code is based on vit-pytorch.

The decoder-only autoregressive model is partially based on memory-compressed-attention.

The visual encoder borrows heavily from action classification.

For any questions, please contact chfeng@umich.edu. I will try to respond as soon as possible; apologies for any earlier delays.

@inproceedings{feng2023self,
  title={Self-supervised video forensics by audio-visual anomaly detection},
  author={Feng, Chao and Chen, Ziyang and Owens, Andrew},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={10491--10503},
  year={2023}
}

