
Today I'll share how to use Darkflow for object detection. Darkflow combines YOLOv2 with TensorFlow 1. Many later versions of YOLO already exist: YOLOv3, YOLOv4, YOLOX, YOLOR …

1. Installation

Here I use TensorFlow 1.15 (GPU build) on Windows. It can also run on CPU.

First, clone the code from the author's GitHub repository (https://github.com/thtrieu/darkflow).

Then change into the folder you just downloaded and run the following command; the final result will look like the image below.

python setup.py build_ext --inplace

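Next, install darkflow. Either of the following commands works; the first installs it in editable (development) mode, so you only need to run one of them.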

pip install -e .
pip install .

2. Run the demo

Open cmd and run the following command.

python flow --imgdir sample_img/ --model cfg/tiny-yolo.cfg --load bin/tiny-yolo.weights --json

The author's guide seems to assume Linux, so its commands omit the leading python. On Windows you do need it.

Download the weights file here and place it at the path passed to --load (bin/tiny-yolo.weights in the command above).

3. Custom training

3.1 Prepare the dataset

I'll use a license plate detection dataset, which can be downloaded here. The dataset is not labeled, but it includes a location.txt file containing the coordinates of the license plate in each image, so we need to write a little code to generate the annotation files in Pascal VOC format. Keep the image folder and the annotation folder separate.

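import os

import cv2
from pascal_voc_writer import Writer  # pip install pascal_voc_writer

path_dir = "DATASET/GreenParking"
path_save = "output/annotations"

# Make sure the annotation folder exists before writing to it.
os.makedirs(path_save, exist_ok=True)

# Fields used from each line of location.txt:
# [0] image file name, [2] x, [3] y, [4] box width, [5] box height.
with open(os.path.join(path_dir, "location.txt"), "r") as data_coor:
    reader = data_coor.readlines()

for line in reader:
    data_line = line.split()
    if len(data_line) < 6:
        continue  # skip blank or incomplete lines

    file = data_line[0]
    x1 = int(data_line[2])
    y1 = int(data_line[3])
    x2 = x1 + int(data_line[4])
    y2 = y1 + int(data_line[5])

    full_path_file = os.path.join(path_dir, file)
    image = cv2.imread(full_path_file)
    if image is None:
        print("Cannot read", full_path_file)
        continue
    h, w, c = image.shape

    print(file)

    # Write one Pascal VOC XML file per image, named after the image.
    writer = Writer(path=file, width=w, height=h)
    writer.addObject("license plate", xmin=x1, ymin=y1, xmax=x2, ymax=y2)
    writer.save(os.path.join(path_save, os.path.splitext(file)[0] + ".xml"))

    # Draw the box only as a quick visual check of the coordinates.
    cv2.rectangle(image, (x1, y1), (x2, y2), 255, 1)
    cv2.imshow("image", image)
    cv2.waitKey(1)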
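The script above turns each (x, y, width, height) entry into the corner form that Pascal VOC expects. If some boxes in location.txt run past the image border, a small helper like the one below could clamp them first. This is only a sketch: the function name xywh_to_voc_box and the sample numbers are made up for illustration and are not part of the dataset or of darkflow.

```python
def xywh_to_voc_box(x, y, w, h, img_w, img_h):
    """Convert a top-left (x, y, width, height) box to Pascal VOC corners.

    The corners are clamped to the image bounds. None is returned when the
    clamped box has no area left.
    """
    xmin = max(0, min(x, img_w - 1))
    ymin = max(0, min(y, img_h - 1))
    xmax = max(0, min(x + w, img_w - 1))
    ymax = max(0, min(y + h, img_h - 1))
    if xmax <= xmin or ymax <= ymin:
        return None
    return xmin, ymin, xmax, ymax


# Hypothetical box inside a 472x303 image.
print(xywh_to_voc_box(265, 155, 119, 68, 472, 303))
```

Boxes for which the helper returns None can simply be skipped instead of being written to the XML file.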

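Next, edit the config file. In this post I use tiny-yolo-voc.cfg from the cfg folder of the repository downloaded at the start; the commands in section 4 refer to the edited copy as cfg/tiny-yolo-voc-license-plate.cfg.

Change subdivisions: on a weaker machine set it to 16, 32, … (a multiple of 8).
Set classes=1 in the [region] section, because the only class is the license plate.
Set filters = (classes + 5) * 5 in the [convolutional] layer closest to classes, that is, the last one before [region]. With classes=1 this gives filters=30. Leave every other filters value unchanged.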

[net]
batch=64
subdivisions=8
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
max_batches = 40100
policy=steps
steps=-1,100,20000,30000
scales=.1,10,.1,.1

[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=1

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

###########

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=30
activation=linear

[region]
anchors = 1.08,1.19,  3.42,4.41,  6.63,11.38,  9.42,5.11,  16.62,10.52
bias_match=1
classes=1
coords=4
num=5
softmax=1
jitter=.2
rescore=1

object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1

absolute=1
thresh = .5
random=1
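The filters value of the last convolutional layer can be checked with a few lines of arithmetic. The sketch below assumes the usual YOLOv2 layout, where each of the num anchor boxes predicts coords box values, one objectness score, and one score per class; the function name region_filters is only illustrative.

```python
def region_filters(classes, num=5, coords=4):
    """Filters needed by the convolutional layer right before [region]."""
    # Per anchor: `coords` box values + 1 objectness score + `classes` scores.
    return num * (classes + coords + 1)


print(region_filters(classes=1))   # 30, the value used in this config
print(region_filters(classes=20))  # 125, for a 20-class VOC model
```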

After that, create a file named labels.txt containing the names of the objects to detect, one per line. Each name must match the name tag in the annotation files.
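To avoid a mismatch between labels.txt and the annotations, the labels can be collected from the XML files themselves. This is a minimal sketch, assuming the annotations are Pascal VOC files with object/name elements like the ones written earlier; the function names are hypothetical helpers, not part of darkflow.

```python
import glob
import os
import xml.etree.ElementTree as ET


def collect_label_names(annotation_dir):
    """Return the sorted set of <name> values found in Pascal VOC XML files."""
    names = set()
    for xml_path in glob.glob(os.path.join(annotation_dir, "*.xml")):
        root = ET.parse(xml_path).getroot()
        for obj in root.iter("object"):
            name = obj.findtext("name")
            if name:
                names.add(name.strip())
    return sorted(names)


def write_labels(annotation_dir, labels_path="labels.txt"):
    """Write one label per line and return the labels that were written."""
    labels = collect_label_names(annotation_dir)
    with open(labels_path, "w") as f:
        f.write("\n".join(labels) + "\n")
    return labels
```

Calling write_labels("output/annotations") from the repository folder should then produce a labels.txt with the single line "license plate" for this dataset.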

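Finally, the moment we have been waiting for. Open cmd in the downloaded folder and run the training command as shown in the image below.

If the run exceeds what the GPU can handle, an error appears; simply lower the --gpu parameter (the fraction of GPU memory darkflow may use, from 0.0 to 1.0).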
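As a rough sketch of what such a training command can look like, the snippet below assembles one from flags listed in the darkflow README (--train, --annotation, --dataset, --gpu, --epoch, --batch). The starting weights path bin/tiny-yolo-voc.weights and the values for gpu, epoch, and batch are assumptions for illustration, not the settings used in this post.

```python
def build_train_command(model_cfg, weights, annotation_dir, image_dir,
                        gpu=0.8, epoch=100, batch=8):
    """Build the argument list for a darkflow training run."""
    return [
        "python", "flow",
        "--model", model_cfg,
        "--load", weights,
        "--train",
        "--annotation", annotation_dir,
        "--dataset", image_dir,
        "--gpu", str(gpu),      # lower this value if the GPU runs out of memory
        "--epoch", str(epoch),
        "--batch", str(batch),
    ]


cmd = build_train_command(
    model_cfg="cfg/tiny-yolo-voc-license-plate.cfg",
    weights="bin/tiny-yolo-voc.weights",  # assumed pretrained weights file
    annotation_dir="output/annotations",
    image_dir="DATASET/GreenParking",
)
print(" ".join(cmd))
```

The printed line can be pasted into cmd from the repository folder, or the list can be passed to subprocess.run(cmd, check=True).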

After that, it is just a matter of waiting.

Once training finishes, the ckpt folder will contain files like the following.

4. Testing

4.1 Convert to a protobuf (.pb) file

python flow --model cfg/tiny-yolo-voc-license-plate.cfg --load -1 --savepb

After the conversion, a built_graph folder appears containing two files, one with the .meta extension and one with the .pb extension.

4.2 Test code

from darkflow.net.build import TFNet
import cv2

# Files produced by --savepb. Prefix the paths with built_graph/ if the
# files are still in that folder.
options = {"pbLoad": "tiny-yolo-voc-license-plate.pb",
           "metaLoad": "tiny-yolo-voc-license-plate.meta",
           "threshold": 0.7}

tfnet = TFNet(options)

imgcv = cv2.imread("test_detect_license.jpg")
if imgcv is None:
    raise FileNotFoundError("test_detect_license.jpg")

# Each result is a dict with label, confidence, topleft and bottomright.
result = tfnet.return_predict(imgcv)

for rs in result:
    label = rs['label']
    confidence = rs['confidence']
    x1, y1 = rs['topleft']['x'], rs['topleft']['y']
    x2, y2 = rs['bottomright']['x'], rs['bottomright']['y']

    # Draw the bounding box and the "label:confidence" caption above it.
    cv2.rectangle(imgcv, (x1, y1), (x2, y2), (0, 255, 0), 2)

    cv2.putText(imgcv, "{}:{}".format(label, round(confidence, 2)), (x1, y1 - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5,
                (0, 255, 0), 2)

print(result)

cv2.imshow("image", imgcv)
cv2.waitKey()
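When an image holds only one plate, it can be handy to keep just the most confident detection. The helper below is a small sketch that works on the list of dicts used in the test code (keys label, confidence, topleft, bottomright); the name best_detection and the sample values are hypothetical and are not real model output.

```python
def best_detection(results, label=None):
    """Return the highest-confidence detection, optionally for one label."""
    candidates = [r for r in results if label is None or r["label"] == label]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["confidence"])


# Hand-written sample in the same shape as the prediction results.
sample = [
    {"label": "license plate", "confidence": 0.72,
     "topleft": {"x": 10, "y": 20}, "bottomright": {"x": 110, "y": 60}},
    {"label": "license plate", "confidence": 0.91,
     "topleft": {"x": 200, "y": 150}, "bottomright": {"x": 320, "y": 210}},
]

best = best_detection(sample, label="license plate")
print(best["topleft"], best["bottomright"])
```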

References

https://github.com/thtrieu/darkflow

https://www.youtube.com/watch?v=eFJOGsQ_YTA