Trying out mseg-semantic (Semantic Segmentation)

What am I learning?

GitHub – mseg-dataset/mseg-semantic: An Official Repo of CVPR ’20 “MSeg: A Composite Dataset for Multi-Domain Segmentation”
I try out semantic segmentation (image segmentation) on a local Ubuntu machine.

Installation

Running

$ python3 -u mseg_semantic/tool/universal_demo.py --config=./mseg_semantic/config/test/default_config_360_ms.yaml model_name mseg-3m model_path ./mseg-3m.pth input_file ./bukit.mp4
$ python3 -u mseg_semantic/tool/universal_demo.py --config=./mseg_semantic/config/test/default_config_360_ms.yaml model_name mseg-3m model_path ./mseg-3m.pth input_file ./IMG_9082.mp4 
Namespace(config='./mseg_semantic/config/test/default_config_360_ms.yaml', file_save='default', opts=['model_name', 'mseg-3m', 'model_path', './mseg-3m.pth', 'input_file', './IMG_9082.mp4'])
arch: hrnet
base_size: 360
batch_size_val: 1
dataset: IMG_9082
has_prediction: False
ignore_label: 255
img_name_unique: False
index_start: 0
index_step: 0
input_file: ./IMG_9082.mp4
layers: 50
model_name: mseg-3m
model_path: ./mseg-3m.pth
network_name: None
save_folder: default
scales: [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
small: True
split: val
test_gpu: [0]
test_h: 713
test_w: 713
version: 4.0
vis_freq: 20
workers: 16
zoom_factor: 8
[2020-08-20 17:36:06,298 INFO universal_demo.py line 62 6262] arch: hrnet
base_size: 360
batch_size_val: 1
dataset: IMG_9082
has_prediction: False
ignore_label: 255
img_name_unique: True
index_start: 0
index_step: 0
input_file: ./IMG_9082.mp4
layers: 50
model_name: mseg-3m
model_path: ./mseg-3m.pth
network_name: None
print_freq: 10
save_folder: default
scales: [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
small: True
split: test
test_gpu: [0]
test_h: 713
test_w: 713
u_classes: ['backpack', 'umbrella', 'bag', 'tie', 'suitcase', 'case', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'animal_other', 'microwave', 'radiator', 'oven', 'toaster', 'storage_tank', 'conveyor_belt', 'sink', 'refrigerator', 'washer_dryer', 'fan', 'dishwasher', 'toilet', 'bathtub', 'shower', 'tunnel', 'bridge', 'pier_wharf', 'tent', 'building', 'ceiling', 'laptop', 'keyboard', 'mouse', 'remote', 'cell phone', 'television', 'floor', 'stage', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot_dog', 'pizza', 'donut', 'cake', 'fruit_other', 'food_other', 'chair_other', 'armchair', 'swivel_chair', 'stool', 'seat', 'couch', 'trash_can', 'potted_plant', 'nightstand', 'bed', 'table', 'pool_table', 'barrel', 'desk', 'ottoman', 'wardrobe', 'crib', 'basket', 'chest_of_drawers', 'bookshelf', 'counter_other', 'bathroom_counter', 'kitchen_island', 'door', 'light_other', 'lamp', 'sconce', 'chandelier', 'mirror', 'whiteboard', 'shelf', 'stairs', 'escalator', 'cabinet', 'fireplace', 'stove', 'arcade_machine', 'gravel', 'platform', 'playingfield', 'railroad', 'road', 'snow', 'sidewalk_pavement', 'runway', 'terrain', 'book', 'box', 'clock', 'vase', 'scissors', 'plaything_other', 'teddy_bear', 'hair_dryer', 'toothbrush', 'painting', 'poster', 'bulletin_board', 'bottle', 'cup', 'wine_glass', 'knife', 'fork', 'spoon', 'bowl', 'tray', 'range_hood', 'plate', 'person', 'rider_other', 'bicyclist', 'motorcyclist', 'paper', 'streetlight', 'road_barrier', 'mailbox', 'cctv_camera', 'junction_box', 'traffic_sign', 'traffic_light', 'fire_hydrant', 'parking_meter', 'bench', 'bike_rack', 'billboard', 'sky', 'pole', 'fence', 'railing_banister', 'guard_rail', 'mountain_hill', 'rock', 'frisbee', 'skis', 'snowboard', 'sports_ball', 'kite', 'baseball_bat', 'baseball_glove', 'skateboard', 'surfboard', 'tennis_racket', 'net', 'base', 'sculpture', 'column', 'fountain', 'awning', 'apparel', 'banner', 'flag', 'blanket', 'curtain_other', 
'shower_curtain', 'pillow', 'towel', 'rug_floormat', 'vegetation', 'bicycle', 'car', 'autorickshaw', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'trailer', 'boat_ship', 'slow_wheeled_object', 'river_lake', 'sea', 'water_other', 'swimming_pool', 'waterfall', 'wall', 'window', 'window_blind']
version: 4.0
vis_freq: 20
workers: 16
zoom_factor: 8
[2020-08-20 17:36:06,298 INFO universal_demo.py line 63 6262] => creating model ...
[2020-08-20 17:36:08,827 INFO inference_task.py line 308 6262] => loading checkpoint './mseg-3m.pth'
[2020-08-20 17:36:09,329 INFO inference_task.py line 314 6262] => loaded checkpoint './mseg-3m.pth'
[2020-08-20 17:36:09,335 INFO inference_task.py line 327 6262] >>>>>>>>>>>>>> Start inference task >>>>>>>>>>>>>
[2020-08-20 17:36:09,339 INFO inference_task.py line 457 6262] Write video to /home/max/src/mseg-semantic/temp_files/IMG_9082_mseg-3m_universal_scales_ms_base_sz_360.mp4
Video fps: 30.00 @ 1080x1920 resolution.
[2020-08-20 17:36:09,365 INFO inference_task.py line 462 6262] On image 0/2732
/home/max/.local/lib/python3.6/site-packages/torch/nn/functional.py:3121: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  "See the documentation of nn.Upsample for details.".format(mode))
/home/max/.local/lib/python3.6/site-packages/torch/nn/functional.py:2941: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
[2020-08-20 17:36:20,185 INFO inference_task.py line 462 6262] On image 1/2732

Processing takes roughly 9 to 10 s per frame.
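The per-frame time can also be read off the "On image N/M" timestamps in the log instead of eyeballing it. The following is a minimal sketch that assumes the log line format shown above; `frame_times` and `seconds_per_frame` are hypothetical helper names, not part of mseg-semantic.

```python
import re
from datetime import datetime

# Matches log lines such as:
# [2020-08-20 17:36:09,365 INFO inference_task.py line 462 6262] On image 0/2732
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) INFO .*?\] On image (\d+)/(\d+)"
)


def frame_times(log_text):
    """Return a list of (frame_index, timestamp) for every 'On image' line."""
    result = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # warnings and other log lines are skipped
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        stamp = stamp.replace(microsecond=int(match.group(2)) * 1000)
        result.append((int(match.group(3)), stamp))
    return result


def seconds_per_frame(log_text):
    """Average seconds per frame between the first and last logged frame."""
    times = frame_times(log_text)
    if len(times) < 2 or times[-1][0] == times[0][0]:
        return None  # not enough data to measure an interval
    (first_idx, first_ts), (last_idx, last_ts) = times[0], times[-1]
    return (last_ts - first_ts).total_seconds() / (last_idx - first_idx)


# The two 'On image' lines copied from the log above.
sample = """\
[2020-08-20 17:36:09,365 INFO inference_task.py line 462 6262] On image 0/2732
[2020-08-20 17:36:20,185 INFO inference_task.py line 462 6262] On image 1/2732
"""

if __name__ == "__main__":
    print(seconds_per_frame(sample))
```

For the two lines in the log above, the interval is 10.82 s. With a longer log, the same function averages over all logged frames.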

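Environment: Ryzen 7 (2700) + GTX 1070 Ti
2732 frames x 10 s = 27,320 s = about 455 min = about 7.59 h of processing time.
The video is 1m31s (91 s) long, so each second of video took about 5 minutes to process.
(However, the GPU does not seem to be doing much. I will tinker with the GPU setup later.)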
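The estimate above is simple arithmetic, sketched here so it can be redone for other videos. It assumes the frame count (2732) and frame rate (30 fps) reported in the log and a round 10 s per frame; `estimate_processing` is a hypothetical helper name.

```python
def estimate_processing(total_frames, sec_per_frame, fps):
    """Estimate total processing time for a video from per-frame cost."""
    total_seconds = total_frames * sec_per_frame
    video_seconds = total_frames / fps
    return {
        "total_seconds": total_seconds,
        "total_minutes": total_seconds / 60,
        "total_hours": total_seconds / 3600,
        "video_seconds": video_seconds,
        # Minutes of processing needed for one second of video.
        "minutes_per_video_second": total_seconds / 60 / video_seconds,
    }


# Values taken from the log: 2732 frames at 30 fps, assuming 10 s per frame.
est = estimate_processing(total_frames=2732, sec_per_frame=10.0, fps=30.0)

if __name__ == "__main__":
    for key, value in est.items():
        print(f"{key}: {value:.2f}")
```

At 30 fps and 10 s per frame, one second of video costs 300 s of processing, i.e. about 5 minutes, whatever the length of the clip.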

*Addendum
It does look like the GPU is being used after all. Is the processing really that heavy?

$ nvidia-smi -l
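Instead of watching the looping `nvidia-smi -l` output, GPU utilization can be polled from a script while the demo runs. This is a minimal sketch using the `--query-gpu` and `--format=csv,noheader,nounits` options of `nvidia-smi`; `parse_gpu_csv` and `query_gpus` are hypothetical helper names, and the sample values below are made up for illustration, not measured on this machine.

```python
import shutil
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Parse 'util, mem_used, mem_total' CSV lines (one per GPU) into dicts."""
    gpus = []
    for line in text.strip().splitlines():
        parts = [part.strip() for part in line.split(",")]
        if len(parts) != 3:
            continue
        try:
            util, used, total = (int(part) for part in parts)
        except ValueError:
            continue  # e.g. a field reported as "[N/A]"
        gpus.append(
            {"util_percent": util, "mem_used_mib": used, "mem_total_mib": total}
        )
    return gpus


def query_gpus():
    """Return per-GPU stats, or None if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            QUERY,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # works on Python 3.6 as used in this post
            check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return None
    return parse_gpu_csv(result.stdout)


# Made-up sample output for illustration only (not measured).
sample_output = "97, 7532, 8192\n"

if __name__ == "__main__":
    print(parse_gpu_csv(sample_output))
    print(query_gpus())
```

If utilization stays high while frames are being processed, the GPU is in use and the model itself is the bottleneck.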

Checking the results

Original video

Processed video

Posted by: yoshimax

Software Engineer #Unity #iOS