文章引用说明 更多>> (返回到该文章)

Collins, R., Lipton, A., Fujiyoshi, H. and Kanade, T. (2001) Algorithms for Cooperative Multisensor Surveillance. Proceedings of the IEEE, 89, 1456-1477.
http://dx.doi.org/10.1109/5.959341

被以下文章引用:

  • 标题: 基于TLD的物体自动识别系统The Automatic Object Detection System Based on TLD Framework

    作者: 李学彦, 王春南, 谢敏, 王昌栋

    关键字: 物体识别, 语音处理, TLD, DTWObject Detection, Voice Processing, Tracking-Learning-Detection, Dynamic Programming Algorithm

    期刊名称: 《Computer Science and Application》, Vol.6 No.4, 2016-04-29

    摘要: 大数据时代下随着计算机数据处理能力的提高,传感技术、音频技术、自动化控制技术得到不断地发展,视频帧和图像信息作为人类通过客观世界获得信息的主要来源之一更是得到了诸多的重视。如今计算机视觉作为当下研究的热潮之一,拥有诸如识别、运动、场景重建、图像恢复等众多技术挑战。其中又以物体识别最为重要。与此同时,众多的物体识别系统却仅仅侧重于物体识别的精度而缺乏其他辅助功能的实现,如何拥有更好的人机交互以及更广阔的市场前景的物体自动识别系统是当下众多开发者所探讨的。在本文中我们将物体识别与语音处理相结合,首先在物体识别算法Tracking-Learning-Detection (TLD)的基础上进行改进,以给定的一类物体的图片数据集为基础,训练出适合于识别该类物体的分类器,从而判断新的物体是否为目标物体,实现对指定一类物体的识别;同时该系统将以语音识别作为人机交互的基础,使用户可以利用语音将图片数据添加到训练集中并更新分类器,同时采用动态规划的方式(DTW)对语音特征进行匹配从而保证了语音识别的准确度。 With the enhancement of the data processing ability of computer, the technology on sensor, audio and automation control has been developed continuously, and the information in video frames and image has got a lot of attention, which is one of the main sources that human obtain information from the world. Computer vision, as one of the present research upsurges, has many technical challenges such as detection, motion, scene reconstruction and image restoration. Object detection is one of the most important challenges. Although there are plenty of object detection systems with high accuracy rate of detection in the market, they lack realization on auxiliary functions so that they provide poor experience on man-machine interaction. Therefore, many developers focus on the topic that how to design a better man-machine interaction of detection system for human so that the detection system can be accepted widely. In this paper, we propose a system framework which contains the technology on object detection and voice processing. Firstly, we make improvement on the algorithm of Tracking-Learning-Detection (TLD). We use the image sets of the object which we want to detect to get a suitable classifier by training algorithm. Then, we can use the classifier to determine whether the new object is the target object and get the aim of detecting the specified object. Then, the system contains the module of speech recognition for a better man- machine interaction so that the user can add the image data to the data set and update the classifier by voice. In order to guarantee the accuracy of speech recognition, we use the Dynamic Time Warping (DTW) to match the phonetic characteristics.

在线客服:
对外合作:
联系方式:400-6379-560
投诉建议:feedback@hanspub.org
客服号

人工客服,优惠资讯,稿件咨询
公众号

科技前沿与学术知识分享