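Discussion: removing noise in CAPTCHA recognition
Last edited by tryhi on 2013-2-3 12:43
Note: the software shown in the picture is the tool written by 疯子, a veteran member of this forum: http://www.autoitx.com/forum.php?mod=viewthread&tid=32094
I have never studied CAPTCHA recognition, so my questions may be a bit basic.
The points marked red and green in the picture are the ones that need to be removed. So far I have only worked out how to remove the red points; I have no good approach for the green ones.
My idea for the red points is this: inspect the 8 points surrounding a point, and if they are all blank, remove the point. If exactly one of the 8 is set, inspect the 8 points surrounding that neighbour; if it also has only one neighbour, remove both points. That is enough to get rid of the red points. The code is below; its efficiency, approach and style may all leave something to be desired.
How should groups of three connected points, or larger clusters, be removed? Note that the three points inside the faint red circle at the right leg of the R must not be removed; I am pointing this out so that nobody removes them by mistake.
Year-end is a busy time for me, so I may be slow to reply. Thanks for your patience.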
#include <Array.au3>
$string = " 0 "&@CRLF& _
" 0 00 "&@CRLF& _
" 00 0 0 0 "&@CRLF& _
" 0 0 0 0 0 "&@CRLF& _
" 0 000 0 "&@CRLF& _
" 0 0 "&@CRLF& _
" 0 0 0 "&@CRLF& _
" 0 0 00 "&@CRLF& _
" 0 0 "&@CRLF& _
" 00 00 0 0 "&@CRLF& _
" 00000000 0 0 "&@CRLF& _
" 00 0 0 0 0000000 "&@CRLF& _
" 00 0 0 00000 000 "&@CRLF& _
" 00 0 00 00 0 "&@CRLF& _
" 00 0 00 00"&@CRLF& _
" 0 00 000000000 0 00 00 "&@CRLF& _
"0 00 0 00 000 0000000000 00 00"&@CRLF& _
" 00 0 00 000 000 0 0 00 000 "&@CRLF& _
" 00 00 00 000 0 0 00 0 00"&@CRLF& _
" 00 00 0 00 00 0 00 00"&@CRLF& _
" 00 0 00 000 0 00 00 00 "&@CRLF& _
" 00 0 00 00 0000 000 00 "&@CRLF& _
" 00 00 0000 0 00 0 0 00 00 "&@CRLF& _
" 0 00 0000000 0 00 00 00 00 "&@CRLF& _
" 00 0000 0 0 00000000 0 0 000 00 "&@CRLF& _
" 00 0 0000 00 00 00 0 00 00"&@CRLF& _
" 00 00 00 000 0 00 0 "&@CRLF& _
" 00 00 000 00 0 00 00 "&@CRLF& _
" 00 000000 00 00 0"&@CRLF& _
" 00 00 0000 00 000 00"&@CRLF& _
" 00 00 00 00 0 00 000 "&@CRLF& _
" 0000 00 00 0 00 0 0000000 "&@CRLF& _
" 00000000 0 0000 000 00 000 "&@CRLF& _
" 00 0 000 0 0"&@CRLF& _
" 0 0 0 00 0 0 "&@CRLF
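; Split the picture into rows, then copy every character into a 2D array.
$hang = StringRegExp($string, '\V+', 3)
Local $array_s[UBound($hang)][97]
For $i = 0 To UBound($hang) - 1
    $lie = StringSplit($hang[$i], '', 2)
    For $j = 0 To 96
        ; Pad short rows with blanks so that every row has 97 columns.
        If $j < UBound($lie) Then
            $array_s[$i][$j] = $lie[$j]
        Else
            $array_s[$i][$j] = " "
        EndIf
    Next
Next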
_quchuzaodian($array_s)
_ArrayDisplay($array_s,'去除噪声之后')
Func _quchuzaodian(ByRef $array)
    ; Remove isolated points and isolated pairs of points from the grid.
    Local $sizhou, $sizhou2, $r, $c
    Local $rows = UBound($array, 1) - 1
    Local $cols = UBound($array, 2) - 1
    For $i = 0 To $rows
        For $j = 0 To $cols
            If $array[$i][$j] = "0" Then
                $sizhou = _sizhou($array, $i, $j)
                If $sizhou[0][0] = 0 Then $array[$i][$j] = " "
                If $sizhou[0][0] = 1 Then
                    ; Exactly one neighbour: check whether that neighbour is connected only to this point.
                    $r = $sizhou[1][0]
                    $c = $sizhou[1][1]
                    $sizhou2 = _sizhou($array, $r, $c)
                    If $sizhou2[0][0] = 1 Then
                        $array[$i][$j] = " "
                        $array[$r][$c] = " "
                    EndIf
                EndIf
            EndIf
        Next
    Next
EndFunc   ;==>_quchuzaodian
Func _sizhou($array, $1, $2, $front = '0')
    ; Return a 2D array: [0][0] is the number of neighbours equal to $front,
    ; rows 1..8 hold the [row, column] of each neighbour that was found.
    If Not IsArray($array) Then Return SetError(1)
    Local $a = UBound($array, 1) - 1, $b = UBound($array, 2) - 1
    Local $temp[9][2] = [[0, 0]]
    If $1 > 0 And $2 > 0 Then
        If $array[$1 - 1][$2 - 1] = $front Then $temp = _sizhou_udf($temp, $1, $2, 1)
    EndIf
    If $1 > 0 Then
        If $array[$1 - 1][$2 + 0] = $front Then $temp = _sizhou_udf($temp, $1, $2, 2)
    EndIf
    If $1 > 0 And $2 < $b Then
        If $array[$1 - 1][$2 + 1] = $front Then $temp = _sizhou_udf($temp, $1, $2, 3)
    EndIf
    If $2 < $b Then
        If $array[$1 + 0][$2 + 1] = $front Then $temp = _sizhou_udf($temp, $1, $2, 4)
    EndIf
    If $1 < $a And $2 < $b Then
        If $array[$1 + 1][$2 + 1] = $front Then $temp = _sizhou_udf($temp, $1, $2, 5)
    EndIf
    If $1 < $a Then
        If $array[$1 + 1][$2 + 0] = $front Then $temp = _sizhou_udf($temp, $1, $2, 6)
    EndIf
    If $1 < $a And $2 > 0 Then
        If $array[$1 + 1][$2 - 1] = $front Then $temp = _sizhou_udf($temp, $1, $2, 7)
    EndIf
    If $2 > 0 Then
        If $array[$1 + 0][$2 - 1] = $front Then $temp = _sizhou_udf($temp, $1, $2, 8)
    EndIf
    Return $temp
EndFunc   ;==>_sizhou
Func _sizhou_udf($array, $1, $2, $i)
    ; Append the coordinates of neighbour number $i of point ($1, $2) to the result array.
    ; Neighbours are numbered clockwise: 1 = upper left, 2 = up, ... 8 = left.
    Local $n = $array[0][0] + 1
    Switch $i
        Case 1
            $array[$n][0] = $1 - 1
            $array[$n][1] = $2 - 1
        Case 2
            $array[$n][0] = $1 - 1
            $array[$n][1] = $2 + 0
        Case 3
            $array[$n][0] = $1 - 1
            $array[$n][1] = $2 + 1
        Case 4
            $array[$n][0] = $1 + 0
            $array[$n][1] = $2 + 1
        Case 5
            $array[$n][0] = $1 + 1
            $array[$n][1] = $2 + 1
        Case 6
            $array[$n][0] = $1 + 1
            $array[$n][1] = $2 + 0
        Case 7
            $array[$n][0] = $1 + 1
            $array[$n][1] = $2 - 1
        Case 8
            $array[$n][0] = $1 + 0
            $array[$n][1] = $2 - 1
    EndSwitch
    ; Slot [0][0] is the running count of neighbours stored so far.
    $array[0][0] += 1
    Return $array
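EndFunc   ;==>_sizhou_udf

Impressive. Following this thread.

Very nice, great work!

Has anyone come up with a workable approach?

I think you can ignore the red and green points and simply collect the gray ones.
When you find a gray point, record which of its 8 surrounding points are also gray, then examine the surroundings of each of those points in turn (skipping points that have already been examined), and extract the gray points that way...
In short, the dot matrix of an English character is always contiguous, so this method is bound to work.
Also, I would like to know: once the gray points have been extracted, how do you recognize the dot matrix as a character?

Oh, so it cannot distinguish colors? Still, the noise does not appear to touch the characters, so the approach remains workable. Just make sure that a point already marked as searched is never searched again, otherwise you may end up in an infinite loop.

Stop searching when you hit the border or find only blanks, and search each point only once; that way there should be no infinite loop.

Reply to 7# unique009
Reply to 6# unique009
Reply to 5# unique009
That does not sound very feasible...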
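
For what it is worth, the "follow connected points and never visit a point twice" suggestion from the replies above can be sketched as a connected-component size filter. The sketch below is in Python rather than AutoIt purely so that it is easy to run and check; it is not code from this thread. The function name `remove_small_blobs`, the `min_size` threshold and the tiny test picture are all assumptions made for illustration. It relies on the observation made above that the noise does not touch the characters; a noise cluster that touches a stroke would survive, and a suitable threshold has to be found by trial on real images.

```python
from collections import deque


def remove_small_blobs(rows, min_size=4, ink="0", blank=" "):
    """Blank out every 8-connected group of ink cells smaller than min_size.

    rows is a list of strings such as the "0"/" " picture in the first post.
    min_size is a hypothetical threshold: 3 reproduces the "single point or
    isolated pair" rule, larger values also remove clusters of three or more.
    """
    grid = [list(r) for r in rows]
    width = max((len(r) for r in grid), default=0)
    for r in grid:
        # Pad ragged rows so every row has the same number of columns.
        r.extend(blank * (width - len(r)))
    seen = [[False] * width for _ in grid]

    for y in range(len(grid)):
        for x in range(width):
            if grid[y][x] != ink or seen[y][x]:
                continue
            # Breadth-first flood fill. Marking a cell as seen when it is
            # queued guarantees each cell is handled once, so no endless loop.
            seen[y][x] = True
            blob = [(y, x)]
            queue = deque(blob)
            while queue:
                cy, cx = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < len(grid) and 0 <= nx < width
                                and grid[ny][nx] == ink and not seen[ny][nx]):
                            seen[ny][nx] = True
                            blob.append((ny, nx))
                            queue.append((ny, nx))
            # Small groups are treated as noise and erased.
            if len(blob) < min_size:
                for by, bx in blob:
                    grid[by][bx] = blank
    return ["".join(r) for r in grid]


if __name__ == "__main__":
    picture = [
        "0     000 ",
        "      0 0 ",
        " 00   000 ",
        "         0",
    ]
    for line in remove_small_blobs(picture, min_size=4):
        print("|" + line + "|")
```

In this small picture the single point in the top-left corner and the pair in the third row are erased, while the ring and the point attached diagonally to it are kept because together they form one group of nine cells. Points that are attached to a character, like the three at the right leg of the R, are never removed by this rule, because they belong to the character's own group.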