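Is it possible to read only the first 100k bytes of a web page and then stop?
The page is fairly large and I only want to read its first 100k bytes, because those bytes already contain the elements I need.

From the help file:
Local $hDownload = InetGet("http://www.autoitscript.com/autoit3/files/beta/update.dat", @TempDir & "\update.dat", 1, 1)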
Do
    Sleep(250)
Until InetGetInfo($hDownload, 2) ; Check whether the download has completed.
Local $nBytes = InetGetInfo($hDownload, 0)
InetClose($hDownload) ; Close the handle to release resources.
MsgBox(0, "", "Bytes read: " & $nBytes)
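
Your example only checks the size of the received file after the download has finished. The OP wants to check while data is still arriving and end the session once 100 bytes have been received.
It should look like this: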
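Local $hDownload = InetGet("http://dl_dir.qq.com/qqfile/qq/QQ2011/QQ2011.exe", @TempDir & "\qq.exe", 1, 1)
Local $nBytes = 0
Do
    Sleep(10) ; Yield the CPU between polls.
    $nBytes = InetGetInfo($hDownload, 0)
Until $nBytes > 100 Or InetGetInfo($hDownload, 2) ; Stop once more than 100 bytes have arrived, or once the download has ended.
InetClose($hDownload) ; Close the handle to release resources.
MsgBox(0, "", "Bytes read: " & $nBytes)

Reply to #3 lanfengc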
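I replied within two minutes of the OP's post and simply copied the help file example. I assumed the OP would adapt it, but there has been no follow-up…

I don't know how to adapt it. I'll post part of my program in a moment so you can see how to change it; it should be a user-defined function.
Do
    $cookie = IniRead("config.ini", "settings", "cookie", "")
    $url1 = "www.baidu.com"
    $html1 = _send($url1, $cookie, "")
    $url2 = "www.sina.com.cn"
    $html2 = _send($url2, $cookie, "")
    $html = $html1 & @CRLF & $html2

Can this user-defined _send function stop automatically at 100K?

Each approach has its own merits.

The _send function is defined as follows; it sits at the end of the program.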
Func _send($url, $cookie = "", $moreheader = "")
    Local $MyOpen, $rContext, $sResult = ""
    $MyOpen = _WinHttpOpen()
    $moreheader = "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20100101 Firefox/5.0"
    $rContext = _WinHTTP_GetRespond($MyOpen, $url, 0 + 2 + 4, "", "", $cookie, $moreheader)
    ;~ _ArrayDisplay($rContext)
    If IsArray($rContext) Then $sResult = BinaryToString($rContext)
    _WinHttpCloseHandle($MyOpen) ; Close the session handle before returning; after a Return it would never run.
    Return $sResult
EndFunc ;==>_send

This is easy. Aren't you using WinHttp?
The WinHttp UDF for receiving data is: _WinHttpReadData($hRequest [, $iMode = Default [, $iNumberOfBytesToRead = Default ]])
Set $iNumberOfBytesToRead to the number of bytes you want to receive.

Learning this too. I was just looking for material on this, so thanks as well.

Still learning; this is worth studying.

Thanks for sharing. Supporting the OP.

Is this how it should be written? It throws an error, though.
Func _send($url, $cookie = "", $moreheader = "")
    Local $MyOpen, $rContext
    $MyOpen = _WinHttpOpen()
    $moreheader = "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20100101 Firefox/5.0"
    $rContext = _WinHTTP_GetRespond($MyOpen, $url, 0 + 2 + 4, "", "", $cookie, $moreheader)
    $rContext2 = _WinHttpReadData($MyOpen, ,,100)
    Local $nBytes = 0
    Do
        $nBytes = InetGetInfo($rContext2, 0)
    Until $nBytes>100 ; Check whether more than 100 bytes have been downloaded
    InetClose($rContext2) ; Close the handle to release resources.
    MsgBox(0, "", "Bytes read: " & $nBytes)
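
This post was last edited by 甲壳虫 on 2012-2-29 21:01
Reply to #14 sex123
1. If you only need a plain download (GET), the approach from post #3 is enough.
2. If you need to POST data, below is the method I used in MyChrome v2.6.1, slightly modified. This UDF uses only native AutoIt commands and returns Binary, which you must convert to text with BinaryToString according to the page encoding. It does not handle chunked transfer encoding and does not strip the chunk-size fields, so the page content may contain some characters that do not belong there. For reference only.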
; Example: read only the first 100 bytes of a web page
$var = TCPGetPost("http://paper.pubmed.cn", "GET", 100)
$var = BinaryToString($var, 4)
ConsoleWrite("Page content:" & @CRLF & $var & @CRLF)
; #FUNCTION# ;===============================================================================
; Name...........: TCPGetPost
; Description ...: TCP Post / Get
; Syntax.........: TCPGetPost($url, $method, $bytes = 0, $data = "", $Proxy = "", $ProxyPort = 8080)
; Parameters ....: $url - e.g. "http://hi.baidu.com/jdchenjian/blog/item/23114bf153aba5c47831aa60.html"
; $method - GET / POST
; $bytes - number of bytes to receive; 0 means receive everything
; $data - data to send
; $Proxy - proxy server, e.g. google.cn, 203.208.46.178, 127.0.0.1, localhost
; $ProxyPort - proxy server port, e.g. 80, 8080
; Return values .: Success - page content (binary)
; Failure - empty
; Author ........: 甲壳虫
;============================================================================================
Func TCPGetPost($url, $method, $bytes = 0, $data = "", $proxy = "", $ProxyPort = 8080)
    Local $array = HttpParseUrl($url) ; $array[0] - host, $array[1] - page, $array[2] - port
    ; And/Or share one precedence level in AutoIt, so the method test needs its own parentheses.
    If @error Or ($method <> "GET" And $method <> "POST") Then Return SetError(1, 0, "")
    Local $host = $array[0]
    Local $page = $array[1]
    Local $port = $array[2]
    Local $socket = -1, $command
    TCPStartup()
    If $proxy <> "" Then
        $socket = TCPConnect(TCPNameToIP($proxy), $ProxyPort)
        ;~ When a client uses a proxy, it typically sends all requests to that proxy,
        ;~ instead of to the servers in the URLs. Requests to a proxy differ from normal requests in one way:
        ;~ in the first line, they use the complete URL of the resource being requested,
        ;~ instead of just the path. For example: GET http://www.somehost.com/path/file.html HTTP/1.0
        ;~ That way, the proxy knows which server to forward the request to (though the proxy itself may use another proxy).
        $command = $method & " " & $url & " HTTP/1.1" & @CRLF
    Else
        $socket = TCPConnect(TCPNameToIP($host), $port)
        $command = $method & " " & $page & " HTTP/1.1" & @CRLF
    EndIf
    If $socket = -1 Then
        TCPShutdown()
        Return SetError(1, 0, "")
    EndIf
    If $method = "POST" Then $command &= "Content-Length: " & StringLen($data) & @CRLF
    $command &= "Host: " & $host & @CRLF
    $command &= "Connection: close" & @CRLF
    $command &= "" & @CRLF
    If $method = "POST" Then $command &= $data & @CRLF
    TCPSend($socket, $command)
    Local $recv, $sdata, $b = 1024, $ReceivedSize = 0
    Do
        $recv &= TCPRecv($socket, 1)
    Until @error Or StringInStr($recv, @CRLF & @CRLF)
    ; The loop above consumes the response header (the header ends with a blank line)
    If $bytes <> 0 Then $b = $bytes
    While 1
        $recv = TCPRecv($socket, $b, 1) ; receive in binary mode
        If @error Then ExitLoop
        $sdata = _WinHttpBinaryConcat($sdata, $recv)
        $ReceivedSize = BinaryLen($sdata)
        If $bytes <> 0 Then
            If $ReceivedSize >= $bytes Then ExitLoop
            $b = $bytes - $ReceivedSize ; request only the bytes still missing
        EndIf
    WEnd
    TCPCloseSocket($socket)
    TCPShutdown()
    Return SetError(0, 0, $sdata)
EndFunc ;==>TCPGetPost
; #FUNCTION# ;===============================================================================
; Name...........: HttpParseUrl
; Description ...: Parses an http URL
; Syntax.........: HttpParseUrl($url)
; Parameters ....: $url - URL, e.g. http://dl.google.com/chrome/install/912.12/chrome_installer.exe
; Return values .: Success - $Array[0] - host, e.g. dl.google.com
; $Array[1] - page, e.g. /chrome/install/912.12/chrome_installer.exe
; $Array[2] - port, e.g. 80
; Failure - Returns an empty result and sets @error
; Author ........: 甲壳虫
;============================================================================================
Func HttpParseUrl($url)
    Local $host, $page, $port, $aResults[3]
    Local $match = StringRegExp($url, '(?i)^https?://([^/]+)(/?.*)', 1)
    If @error Then Return SetError(1, 0, $aResults)
    $aResults[0] = $match[0] ; host
    $aResults[1] = $match[1] ; page
    If $aResults[1] = "" Then $aResults[1] = "/"
    If StringLeft($url, 5) = "https" Then
        $aResults[2] = 443
    Else
        $aResults[2] = 80
    EndIf
    Return SetError(0, 0, $aResults)
EndFunc ;==>HttpParseUrl
; #FUNCTION# ;===============================================================================
; Name...........: _WinHttpBinaryConcat
; Description ...: Concatenates two binary data returned by _WinHttpReadData() in binary mode.
; Syntax.........: _WinHttpBinaryConcat(ByRef $bBinary1, ByRef $bBinary2)
; Parameters ....: $bBinary1 - Binary data that is to be concatenated.
; $bBinary2 - Binary data to concatenate.
; Return values .: Success - Returns concatenated binary data.
; Failure - Returns empty binary and sets @error:
; |1 - Invalid input.
; Author ........: ProgAndy
; Modified.......: trancexx
; Remarks .......:
; Related .......: _WinHttpReadData
; Link ..........:
; Example .......:
;============================================================================================
Func _WinHttpBinaryConcat(ByRef $bBinary1, ByRef $bBinary2)
    Switch IsBinary($bBinary1) + 2 * IsBinary($bBinary2)
        Case 0
            Return SetError(1, 0, Binary(''))
        Case 1
            Return $bBinary1
        Case 2
            Return $bBinary2
    EndSwitch
    Local $tAuxiliary = DllStructCreate("byte[" & BinaryLen($bBinary1) & "];byte[" & BinaryLen($bBinary2) & "]")
    DllStructSetData($tAuxiliary, 1, $bBinary1)
    DllStructSetData($tAuxiliary, 2, $bBinary2)
    Local $tOutput = DllStructCreate("byte[" & DllStructGetSize($tAuxiliary) & "]", DllStructGetPtr($tAuxiliary))
    Return DllStructGetData($tOutput, 1)
EndFunc ;==>_WinHttpBinaryConcat
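3. If you use WinHTTP, follow the approach from my post #10; see the WinHTTP help file for the details. With it, things get much simpler:
#include "WinHttp.au3"
;~ Example: read only the first 100 bytes of a web page
$var = TCPGetPost("paper.pubmed.cn", "/", "GET", 100)
$var = BinaryToString($var, 4)
ConsoleWrite("Page content:" & @CRLF & $var & @CRLF)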
; #FUNCTION# ;===============================================================================
; Name...........: TCPGetPost
; Description ...: TCP Post / Get
; Syntax.........: TCPGetPost($host, $page, $method, $bytes = 0, $sdata = Default, $sHeader = Default)
; Parameters ....: $host - e.g. hi.baidu.com
; $page - e.g. "/jdchenjian/blog/item/23114bf153aba5c47831aa60.html"
; $method - GET / POST
; $bytes - number of bytes to receive; 0 means receive everything
; $sdata - data to send
; $sHeader - additional headers
; Return values .: Success - page content (binary, because _WinHttpReadData is called in mode 2)
; Failure - empty
;============================================================================================
Func TCPGetPost($host, $page, $method, $bytes = 0, $sdata = Default, $sHeader = Default)
    Local $hOpen = _WinHttpOpen(Default, $WINHTTP_ACCESS_TYPE_NO_PROXY) ; no proxy
    Local $hConnect = _WinHttpConnect($hOpen, $host, 80)
    Local $hRequest = _WinHttpSimpleSendRequest($hConnect, $method, $page, Default, $sdata, $sHeader)
    _WinHttpReceiveResponse($hRequest)
    Local $data = _WinHttpReadData($hRequest, 2, $bytes)
    _WinHttpCloseHandle($hRequest)
    _WinHttpCloseHandle($hConnect)
    _WinHttpCloseHandle($hOpen)
    Return $data
EndFunc
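
A single _WinHttpReadData call can return fewer bytes than requested, because the underlying WinHttpReadData only reads up to the given count. For a larger cap such as the 100k asked about in the first post, a loop is therefore safer. The following is only a rough sketch, not the code from this thread: _ReadFirstBytes is a made-up helper name, 100k is assumed to mean 100 * 1024 bytes, and the chunks are collected as hex text so that no binary-concatenation helper is needed.

```autoit
#include "WinHttp.au3"

; Hypothetical helper (not part of WinHttp.au3): returns at most $iLimit bytes
; of the response body as binary data.
Func _ReadFirstBytes($sHost, $sPath, $iLimit)
    Local $hOpen = _WinHttpOpen()
    Local $hConnect = _WinHttpConnect($hOpen, $sHost)
    Local $hRequest = _WinHttpOpenRequest($hConnect, "GET", $sPath)
    _WinHttpSendRequest($hRequest)
    _WinHttpReceiveResponse($hRequest)

    Local $sHex = "", $iTotal = 0, $bChunk
    While $iTotal < $iLimit
        ; Ask only for the bytes still missing; the call may return fewer.
        $bChunk = _WinHttpReadData($hRequest, 2, $iLimit - $iTotal)
        If @error Then ExitLoop ; read failure or no more data
        $sHex &= Hex($bChunk) ; collect the chunk as hex text
        $iTotal += BinaryLen($bChunk)
    WEnd

    _WinHttpCloseHandle($hRequest)
    _WinHttpCloseHandle($hConnect)
    _WinHttpCloseHandle($hOpen)
    If $sHex = "" Then Return Binary("")
    Return Binary("0x" & $sHex)
EndFunc   ;==>_ReadFirstBytes

; Assumption: "100k" means 100 * 1024 bytes.
Local $bPage = _ReadFirstBytes("paper.pubmed.cn", "/", 100 * 1024)
ConsoleWrite("Bytes read: " & BinaryLen($bPage) & @CRLF)
```

The handles are closed as soon as the cap is reached, so the rest of the page is never requested from the connection. Convert the result with BinaryToString using whatever encoding the page declares.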