Capstone
Robotic-Arm Visual Grasping System
A six-axis robotic-arm grasping system that combines YOLOv7 and OpenCV to obtain an object's class, position, and angle at once.

Overview
Standard YOLO classifies and localizes objects but cannot estimate their rotation angle; OpenCV can recover the angle from contours but cannot classify. This project combines the two with depth imaging and six-axis arm control to achieve real-time tracking and automated grasping, for example in medication pick-and-place or factory defect inspection.
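The idea of restricting the angle estimate to a detector's box can be illustrated with a small, dependency-free sketch. It is not the project's code: the box is assumed to come from a detector such as YOLOv7, the mask is a hand-made binary grid, and the angle is computed from second-order image moments as a stand-in for OpenCV's minimum-area rectangle. The function name `orientation_in_box` is hypothetical.

```python
import math

def orientation_in_box(mask, box):
    """Return (cx, cy, area, angle_deg) for foreground pixels inside box.

    mask: 2D list of 0/1 values indexed as mask[y][x].
    box:  (x0, y0, x1, y1) half-open pixel bounds (assumed detector output).
    The angle is the principal-axis orientation in image coordinates
    (x to the right, y downward); +90 and -90 describe the same axis.
    """
    x0, y0, x1, y1 = box
    pts = [(x, y) for y in range(y0, y1) for x in range(x0, x1) if mask[y][x]]
    if not pts:
        return None  # nothing to grasp inside this box
    area = len(pts)
    cx = sum(x for x, _ in pts) / area
    cy = sum(y for _, y in pts) / area
    # Central second-order moments of the foreground pixels.
    mu20 = sum((x - cx) ** 2 for x, _ in pts)
    mu02 = sum((y - cy) ** 2 for _, y in pts)
    mu11 = sum((x - cx) * (y - cy) for x, y in pts)
    angle = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return cx, cy, area, math.degrees(angle)

# Two objects in one mask: a horizontal bar and a vertical bar.
mask = [[0] * 8 for _ in range(8)]
for x in range(1, 7):
    mask[3][x] = 1
for y in range(5, 8):
    mask[y][7] = 1

horizontal = orientation_in_box(mask, (0, 2, 8, 5))  # box around the first bar
vertical = orientation_in_box(mask, (7, 5, 8, 8))    # box around the second bar
```

Because each call only sees the pixels inside its box, the two bars are measured separately even though they share one mask, which is the benefit the fusion is after.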
Pipeline
- Capture: an Intel RealSense D435 provides aligned RGB and depth images.
- Detection: YOLOv7 (6 classes, 549 training images, 1000 epochs; precision ≈ 1.0 at IoU 0.5) outputs the class and bounding-box coordinates.
- Crop + mask + angle: crop the depth image to the box, binarize it, then use OpenCV findContours and the minimum-area bounding rectangle to get the center, area, and rotation angle.
- Calibration: the FOV and the pixel-to-world mapping are computed dynamically from the camera intrinsics and the depth at the center point, rather than using a fixed value.
- Grasp & control: proportional control on x / y / z / θ; the grasp triggers once the vertical distance drops below 15 cm.
- Six-axis arm: the AR3 is driven with G-code sent via pyserial; inverse kinematics solves the six joint angles.
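The calibration and control steps above can be sketched as follows. This is a minimal illustration under assumptions, not the project's implementation: it assumes an undistorted pinhole camera model, and the intrinsics (`fx`, `fy`, `cx`, `cy`), the gains, and all function names are hypothetical. Only the 15 cm grasp threshold comes from the description above.

```python
import math

GRASP_DISTANCE_M = 0.15  # grasp threshold from the pipeline description

def fov_deg(size_px, focal_px):
    """Field of view along one image axis under the pinhole model."""
    return math.degrees(2 * math.atan(size_px / (2 * focal_px)))

def pixel_to_camera(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) at the given depth to camera-frame metres."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return x, y, depth_m

def p_step(error, gains):
    """Proportional control: one command per axis, gain times error."""
    return tuple(k * e for k, e in zip(gains, error))

def should_grasp(vertical_distance_m):
    """Trigger the grasp once the gripper is closer than the threshold."""
    return vertical_distance_m < GRASP_DISTANCE_M

# Hypothetical intrinsics for a 640x480 image (not measured D435 values).
fx = fy = 600.0
cx, cy = 320.0, 240.0

# Object center detected at pixel (380, 240), 0.30 m from the camera.
x, y, z = pixel_to_camera(380, 240, 0.30, fx, fy, cx, cy)

# Error vector (x, y, z, theta) with an assumed 10 degree angle offset.
command = p_step((x, y, z, 10.0), gains=(0.5, 0.5, 0.5, 0.5))
```

Because the metric offset scales with the measured depth, the same pixel offset maps to a smaller physical distance as the camera approaches the object, which is why a fixed pixel-to-millimetre factor would not work here.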
Hardware & tools
- Arm: AR3 six-axis robotic arm (stepper-driven)
- Camera: Intel RealSense D435 (RGB + depth)
- Gripper / Controller: servo gripper · custom control box (drivers, encoder, PSU)
- Software: Python · PyTorch · OpenCV · pyrealsense2 · Tkinter
Results
The system tracks and grasps objects of varied class, size, and shape in real time, and handles multiple objects in sequence. Where OpenCV alone merges adjacent objects into one, YOLOv7 + OpenCV correctly separates adjacent or overlapping objects and attaches a class and confidence score to each.
Live Demo


