A Worked Example of Adapting YOLOV8 and YOLOV9 to Ascend CANN

This article is shared from the Huawei Cloud community post "Ascend CANN YOLOV8 and YOLOV9 Adaptation", by jackwangcumt.

1 Overview

The Huawei Ascend CANN YOLOV8 inference sample (C++) is a YOLOV8 adaptation of the sampleYOLOV7 example from the official Ascend CANN Samples. The YOLOV7 model outputs a tensor of shape [1, 25200, 85], whereas the YOLOV8 model outputs [1, 84, 8400], so the post-processing part of sampleYOLOV7 must be modified to support YOLOV8/YOLOV9 models. For project development needs, our company purchased an Atlas 500 Pro intelligent edge server running Ubuntu 20.04 LTS Server, and installed Ascend-cann-toolkit_7.0.RC1_linux-aarch64.run and related software following the official documentation. For details, see my earlier post "Setting Up an Inference Environment on the Atlas 500 Pro Intelligent Edge Server"; that setup is not repeated here.
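
To make the layout change concrete, here is a minimal sketch, under the shapes stated above, of how one prediction value is addressed in each layout (the helper names V7At/V8At are illustrative, not from the sample code):

#include <cstddef>

// Illustrative sketch (not from the sample code): index math for reading one
// prediction value from each raw output buffer.

// YOLOV7 layout [1, 25200, 85]: box-major, the 85 values of one box are
// contiguous (k: 0..3 = x,y,w,h, 4 = objectness, 5..84 = class scores).
inline float V7At(const float *out, std::size_t box, std::size_t k)
{
    return out[box * 85 + k];
}

// YOLOV8/YOLOV9 layout [1, 84, 8400]: attribute-major, each attribute row
// spans all 8400 boxes (k: 0..3 = x,y,w,h, 4..83 = class scores, no objectness).
inline float V8At(const float *out, std::size_t box, std::size_t k)
{
    return out[k * 8400 + box];
}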

2 YOLOV8 Model Preparation

Before starting the YOLOV8 adaptation work, you first need the YOLOV8 model file. Taking the official YOLOV8n.pt model as an example, you can set up a YOLOV8 environment on Windows and run the following Python script (pth2onnx.py) to convert the .pt model to .onnx:

import argparse
from ultralytics import YOLO

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--pt', default="yolov8n", help='.pt file')
    args = parser.parse_args()
    model = YOLO(args.pt)
    onnx_model = model.export(format="onnx", dynamic=False, simplify=True, opset=11)

if __name__ == '__main__':
    main()

For the detailed YOLOV8 environment setup, see GitHub - ultralytics/ultralytics: NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite. On success, a yolov8n.onnx model is generated, with output like the following:

(base) I:\yolov8\Yolov8_for_PyTorch>python pth2onnx.py --pt=yolov8n.pt
Ultralytics YOLOv8.0.229 🚀 Python-3.11.5 torch-2.1.2 CPU (Intel Core(TM) i7-10700K 3.80GHz)
YOLOv8n summary (fused): 168 layers, 3151904 parameters, 0 gradients, 8.7 GFLOPs
PyTorch: starting from 'yolov8n.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 84, 8400) (6.2 MB)
ONNX: starting export with onnx 1.15.0 opset 11...
ONNX: simplifying with onnxsim 0.4.36...
ONNX: export success ✅ 1.0s, saved as 'yolov8n.onnx' (12.2 MB)
Export complete (3.2s)
Results saved to I:\yolov8\Yolov8_for_PyTorch
Predict: yolo predict task=detect model=yolov8n.onnx imgsz=640
Validate: yolo val task=detect model=yolov8n.onnx imgsz=640 data=coco.yaml
Visualize: https://netron.app

The output shows that the original yolov8n.pt model takes an input of shape (1, 3, 640, 640) in BCHW format and produces an output of shape (1, 84, 8400). For more details about the model, you can visualize it with the netron tool; after installing netron, run the following command to inspect the yolov8n.onnx network structure in a web view:

(base) I:\yolov8\Yolov8_for_PyTorch>netron yolov8n.onnx
Serving 'yolov8n.onnx' at http://localhost:8080

[Figure 1: netron web view of the yolov8n.onnx network structure]

As shown, the converted yolov8n.onnx model has an input node named images with input tensor shape [1,3,640,640]. After uploading yolov8n.onnx to the Atlas 500 Pro server, run the following command to convert the model:

atc --model=yolov8n.onnx --framework=5 --output=yolov8n --input_shape="images:1,3,640,640" --soc_version=Ascend310P3 --insert_op_conf=aipp.cfg

Where:

--soc_version=Ascend310P3: the chip version can be queried with the npu-smi info command. On my machine it reports 310P3, so --soc_version is the Ascend prefix plus 310P3, i.e. Ascend310P3.

--input_shape="images:1,3,640,640" specifies NCHW: batch size 1, 3 channels, and a 640x640 image, matching the input node of the ONNX model.

--insert_op_conf=aipp.cfg: the aipp.cfg file comes from the official sampleYOLOV7 example. Because the original input image may not match the required size, it is scaled to 640x640. The contents of aipp.cfg are as follows (a sketch of the color-conversion arithmetic it encodes follows the config):

aipp_op {
    aipp_mode : static
    input_format : YUV420SP_U8
    src_image_size_w : 640
    src_image_size_h : 640
    csc_switch : true
    rbuv_swap_switch : false
    matrix_r0c0 : 256
    matrix_r0c1 : 0
    matrix_r0c2 : 359
    matrix_r1c0 : 256
    matrix_r1c1 : -88
    matrix_r1c2 : -183
    matrix_r2c0 : 256
    matrix_r2c1 : 454
    matrix_r2c2 : 0
    input_bias_0 : 0
    input_bias_1 : 128
    input_bias_2 : 128
    crop : true
    load_start_pos_h : 0
    load_start_pos_w : 0
    crop_size_w : 640
    crop_size_h : 640
    min_chn_0 : 0
    min_chn_1 : 0
    min_chn_2 : 0
    var_reci_chn_0 : 0.0039215686274509803921568627451
    var_reci_chn_1 : 0.0039215686274509803921568627451
    var_reci_chn_2 : 0.0039215686274509803921568627451
}
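
For intuition about this config: var_reci_chn = 0.0039215686... is 1/255, which normalizes each channel to [0,1], and the csc matrix and bias values match a standard fixed-point BT.601 YUV-to-RGB conversion with coefficients scaled by 256. The following plain C++ sketch shows the arithmetic those coefficients appear to encode; it is an illustration of the math under that assumption, not the AIPP implementation itself (see the CANN AIPP documentation for exact fixed-point semantics):

#include <algorithm>
#include <cstdint>

// Assumption for illustration only: the csc matrix/bias values in aipp.cfg
// encode a fixed-point BT.601 YUV -> RGB conversion, coefficients scaled by 256.
static uint8_t Clamp8(int v) { return static_cast<uint8_t>(std::min(255, std::max(0, v))); }

void YuvToRgbPixel(uint8_t y, uint8_t u, uint8_t v,
                   uint8_t &r, uint8_t &g, uint8_t &b)
{
    int yy = y - 0;    // input_bias_0
    int uu = u - 128;  // input_bias_1
    int vv = v - 128;  // input_bias_2
    r = Clamp8((256 * yy +   0 * uu + 359 * vv) >> 8); // matrix_r0c0..r0c2
    g = Clamp8((256 * yy -  88 * uu - 183 * vv) >> 8); // matrix_r1c0..r1c2
    b = Clamp8((256 * yy + 454 * uu +   0 * vv) >> 8); // matrix_r2c0..r2c2
}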

When the command completes successfully, the offline model yolov8n.om is generated.

3 Adaptation Code

The YOLOV8 sample adapted from the official sampleYOLOV7 example is open source at https://gitee.com/cumt/ascend-yolov8-sample. The post-processing method GetResult in the core file sampleYOLOV8.cpp is:

Result SampleYOLOV8::GetResult(std::vector<InferenceOutput> &inferOutputs,
                               string imagePath, size_t imageIndex, bool release)
{
    uint32_t outputDataBufId = 0;
    float *classBuff = static_cast<float *>(inferOutputs[outputDataBufId].data.get());
    // confidence threshold
    float confidenceThreshold = 0.35;
    // class number
    size_t classNum = 80;
    // number of (x, y, width, height)
    size_t offset = 4;
    // total number of boxes, yolov8 [1,84,8400]
    size_t modelOutputBoxNum = 8400;
    // read source image from file
    cv::Mat srcImage = cv::imread(imagePath);
    int srcWidth = srcImage.cols;
    int srcHeight = srcImage.rows;

    // filter boxes by confidence threshold
    vector<BoundBox> boxes;
    size_t yIndex = 1;
    size_t widthIndex = 2;
    size_t heightIndex = 3;
    // size_t all_num = 1 * 84 * 8400; // 705,600
    for (size_t i = 0; i < modelOutputBoxNum; ++i) {
        float maxValue = 0;
        size_t maxIndex = 0;
        for (size_t j = 0; j < classNum; ++j) {
            float value = classBuff[(offset + j) * modelOutputBoxNum + i];
            if (value > maxValue) {
                // index of class
                maxIndex = j;
                maxValue = value;
            }
        }
        if (maxValue > confidenceThreshold) {
            BoundBox box;
            box.x = classBuff[i] * srcWidth / modelWidth_;
            box.y = classBuff[yIndex * modelOutputBoxNum + i] * srcHeight / modelHeight_;
            box.width = classBuff[widthIndex * modelOutputBoxNum + i] * srcWidth / modelWidth_;
            box.height = classBuff[heightIndex * modelOutputBoxNum + i] * srcHeight / modelHeight_;
            box.score = maxValue;
            box.classIndex = maxIndex;
            box.index = i;
            if (maxIndex < classNum) {
                boxes.push_back(box);
            }
        }
    }
    ACLLITE_LOG_INFO("filter boxes by confidence threshold > %f success, boxes size is %ld",
                     confidenceThreshold, boxes.size());

    // filter boxes by NMS
    vector<BoundBox> result;
    result.clear();
    float NMSThreshold = 0.45;
    int32_t maxLength = modelWidth_ > modelHeight_ ? modelWidth_ : modelHeight_;
    std::sort(boxes.begin(), boxes.end(), sortScore);
    BoundBox boxMax;
    BoundBox boxCompare;
    while (boxes.size() != 0) {
        size_t index = 1;
        result.push_back(boxes[0]);
        while (boxes.size() > index) {
            boxMax.score = boxes[0].score;
            boxMax.classIndex = boxes[0].classIndex;
            boxMax.index = boxes[0].index;
            // translate point by maxLength * boxes[0].classIndex to
            // avoid bumping into two boxes of different classes
            boxMax.x = boxes[0].x + maxLength * boxes[0].classIndex;
            boxMax.y = boxes[0].y + maxLength * boxes[0].classIndex;
            boxMax.width = boxes[0].width;
            boxMax.height = boxes[0].height;

            boxCompare.score = boxes[index].score;
            boxCompare.classIndex = boxes[index].classIndex;
            boxCompare.index = boxes[index].index;
            // translate point by maxLength * boxes[index].classIndex to
            // avoid bumping into two boxes of different classes
            boxCompare.x = boxes[index].x + boxes[index].classIndex * maxLength;
            boxCompare.y = boxes[index].y + boxes[index].classIndex * maxLength;
            boxCompare.width = boxes[index].width;
            boxCompare.height = boxes[index].height;

            // the overlapping part of the two boxes
            float xLeft = max(boxMax.x, boxCompare.x);
            float yTop = max(boxMax.y, boxCompare.y);
            float xRight = min(boxMax.x + boxMax.width, boxCompare.x + boxCompare.width);
            float yBottom = min(boxMax.y + boxMax.height, boxCompare.y + boxCompare.height);
            float width = max(0.0f, xRight - xLeft);
            float height = max(0.0f, yBottom - yTop);
            float area = width * height;
            float iou = area / (boxMax.width * boxMax.height +
                                boxCompare.width * boxCompare.height - area);
            // filter boxes by NMS threshold
            if (iou > NMSThreshold) {
                boxes.erase(boxes.begin() + index);
                continue;
            }
            ++index;
        }
        boxes.erase(boxes.begin());
    }
    ACLLITE_LOG_INFO("filter boxes by NMS threshold > %f success, result size is %ld",
                     NMSThreshold, result.size());

    // opencv draw label params
    const double fountScale = 0.5;
    const uint32_t lineSolid = 2;
    const uint32_t labelOffset = 11;
    const cv::Scalar fountColor(0, 0, 255); // BGR
    const vector<cv::Scalar> colors{
        cv::Scalar(255, 0, 0), cv::Scalar(0, 255, 0), cv::Scalar(0, 0, 255)};
    int half = 2;
    for (size_t i = 0; i < result.size(); ++i) {
        cv::Point leftUpPoint, rightBottomPoint;
        leftUpPoint.x = result[i].x - result[i].width / half;
        leftUpPoint.y = result[i].y - result[i].height / half;
        rightBottomPoint.x = result[i].x + result[i].width / half;
        rightBottomPoint.y = result[i].y + result[i].height / half;
        cv::rectangle(srcImage, leftUpPoint, rightBottomPoint,
                      colors[i % colors.size()], lineSolid);
        string className = label[result[i].classIndex];
        string markString = to_string(result[i].score) + ":" + className;
        ACLLITE_LOG_INFO("object detect [%s] success", markString.c_str());
        cv::putText(srcImage, markString,
                    cv::Point(leftUpPoint.x, leftUpPoint.y + labelOffset),
                    cv::FONT_HERSHEY_COMPLEX, fountScale, fountColor);
    }
    string savePath = "out_" + to_string(imageIndex) + ".jpg";
    cv::imwrite(savePath, srcImage);
    if (release) {
        free(classBuff);
        classBuff = nullptr;
    }
    return SUCCESS;
}

The YOLOV8 output shape is (1, 84, 8400), where 8400 is the number of raw candidate boxes the model predicts, represented in the code as size_t modelOutputBoxNum = 8400;. The 84 is 4 bounding-box values (x, y, w, h) plus 80 class scores, i.e. 84 = 4 + 80. Since the model output is a contiguous one-dimensional array in memory, the array must be indexed according to the actual meaning of the output shape to retrieve each value. YOLOV8 does not predict a separate objectness confidence; the maximum class probability serves as the confidence value. The 8400 boxes are the concatenation of the model's multi-scale output feature maps (for a 640x640 input: 80x80 + 40x40 + 20x20 = 8400), which normally needs no extra handling during inference. The figure below illustrates the mapping between the output shape and the in-memory array:

[Figure 2: mapping between the (1, 84, 8400) output tensor and the flat in-memory array]

That is, each of the 83 rows after the first is appended in turn to the end of the previous one, forming a one-dimensional array of 84 x 8400 elements. When traversing it, for each of the 8400 predictions the first offset = 4 entries hold the x, y, w, h box values; skipping those, the maximum of the 80 class scores and its index become the confidence and the class ID. For a custom model, only size_t classNum = 80; needs to change, following the ONNX output shape: [1,84,8400] gives classNum = 84 - 4 = 80, so a custom model with output [1,26,8400] would use size_t classNum = 22; (26 - 4).
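
As a compact restatement of this indexing, here is a minimal sketch that mirrors the traversal in GetResult above, with classNum made a parameter (the Pred struct and DecodeBox name are illustrative, not from the sample code):

#include <cstddef>

// Minimal sketch mirroring the indexing in GetResult: decode prediction i
// from the flat [1, 4 + classNum, boxNum] output buffer.
struct Pred { float x, y, w, h, score; std::size_t classIndex; };

Pred DecodeBox(const float *buf, std::size_t i,
               std::size_t classNum = 80, std::size_t boxNum = 8400)
{
    const std::size_t offset = 4; // x, y, w, h come first
    Pred p{};
    p.x = buf[0 * boxNum + i];
    p.y = buf[1 * boxNum + i];
    p.w = buf[2 * boxNum + i];
    p.h = buf[3 * boxNum + i];
    // YOLOV8 predicts no separate objectness: the best class score is the confidence.
    for (std::size_t c = 0; c < classNum; ++c) {
        float v = buf[(offset + c) * boxNum + i];
        if (v > p.score) { p.score = v; p.classIndex = c; }
    }
    return p;
}
// For a custom model with output [1, 26, 8400], call DecodeBox(buf, i, 22).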

4 Building and Running

Download the open-source code, upload it to the server, unzip it, and build it with the following commands:

unzip ascend-yolov8-sample-master.zip -d ./kztech
cd ascend-yolov8-sample-master/src
# in the src directory
cmake .
make
# if this succeeds, the main executable is generated in the ../out directory; run the sample from src:
../out/main
# if the following error is reported:
#   ../out/main: error while loading shared libraries: libswresample.so.3:
#   cannot open shared object file: No such file or directory
# try setting this environment variable and retry:
export LD_LIBRARY_PATH=/usr/local/Ascend/thirdpart/aarch64/lib:$LD_LIBRARY_PATH
# on success, an out_0.jpg file is generated in the current directory

When the run succeeds, the console prints the following:

root@atlas500ai:/home/kztech/ascend-yolov8-sample-master/src# ../out/main
[INFO] Acl init ok
[INFO] Open device 0 ok
[INFO] Use default context currently
[INFO] dvpp init resource ok
[INFO] Load model ../model/yolov8n.om success
[INFO] Create model description success
[INFO] Create model(../model/yolov8n.om) output success
[INFO] Init model ../model/yolov8n.om success
[INFO] filter boxes by confidence threshold > 0.350000 success, boxes size is 10
[INFO] filter boxes by NMS threshold > 0.450000 success, result size is 1
[INFO] object detect [0.878906:dog] success
[INFO] Inference elapsed time : 0.038817 s , fps is 25.761685
[INFO] Unload model ../model/yolov8n.om success
[INFO] destroy context ok
[INFO] Reset device 0 ok
[INFO] Finalize acl ok

[Figure 3: detection result image out_0.jpg]

5 Summary

Adapting the various YOLO series mostly comes down to handling changes in the input and output formats. Starting from YOLOV7, the YOLOV8 model can be adapted, and the same applies to YOLOV9. The current YOLOV9 output format is identical to YOLOV8's, so only the yolov9xx.om model needs to be generated. Converting the yolov9-c-converted.pt model (https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-c-converted.pt) works as follows: set up a YOLOV9 environment on Windows and run the following commands to convert the .pt model to .onnx:

# create a new yolov9 environment cloned from base
conda create -n yolov9 --clone base
# activate the yolov9 virtual environment
conda activate yolov9
# clone the yolov9 code
git clone https://github.com/WongKinYiu/yolov9
# install the yolov9 project dependencies
(yolov9) I:\yolov9-main>pip install -r requirements.txt
# export the model to onnx
(yolov9) I:\yolov9-main>python export.py --weights yolov9-c-converted.pt --include onnx
(yolov9) I:\yolov9-main>python export.py --weights yolov9-c-converted.pt --include onnx
export: data=I:\yolov9-main\data\coco.yaml, weights=['yolov9-c-converted.pt'], imgsz=[640, 640], batch_size=1, device=cpu, half=False, inplace=False, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=12, verbose=False, workspace=4, nms=False, agnostic_nms=False, topk_per_class=100, topk_all=100, iou_thres=0.45, conf_thres=0.25, include=['onnx']
YOLO 2024-3-13 Python-3.11.5 torch-2.1.2 CPU
Fusing layers...
gelan-c summary: 387 layers, 25288768 parameters, 64944 gradients, 102.1 GFLOPs
PyTorch: starting from yolov9-c-converted.pt with output shape (1, 84, 8400) (49.1 MB)
ONNX: starting export with onnx 1.15.0...
ONNX: export success 2.9s, saved as yolov9-c-converted.onnx (96.8 MB)
Export complete (4.4s)
Results saved to I:\yolov9-main
Detect: python detect.py --weights yolov9-c-converted.onnx
Validate: python val.py --weights yolov9-c-converted.onnx
PyTorch Hub: model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov9-c-converted.onnx')
Visualize: https://netron.app
Then convert the exported ONNX model into an offline .om model on the server with ATC, exactly as for YOLOV8:

atc --model=yolov9-c-converted.onnx --framework=5 --output=yolov9-c-converted --input_shape="images:1,3,640,640" --soc_version=Ascend310P3 --insert_op_conf=aipp.cfg

Click Follow to be the first to learn about Huawei Cloud's latest technologies~

