How to extract strings
Last edited by neity on 2009-12-21 11:42
Extract from the text given below:
ImageX Tool for Windows
Copyright (C) Microsoft Corp. All rights reserved.
Version: 6.1.7600.16385
WIM Information:
----------------
Path: f:\7lite\s1\sources\install.wim
GUID: {bc90f44f-35aa-48fe-9fa0-0b6c63c6b4d7}
Image Count: 5
Compression: LZX
Part Number: 1/1
Attributes:0xc
Integrity info
Relative path junction
Available Image Choices:
------------------------
<WIM>
<TOTALBYTES>2241959924</TOTALBYTES>
<IMAGE INDEX="1">
<DIRCOUNT>9550</DIRCOUNT>
<FILECOUNT>47318</FILECOUNT>
<TOTALBYTES>7983637109</TOTALBYTES>
<CREATIONTIME>
<HIGHPART>0x01CA0443</HIGHPART>
<LOWPART>0x6568BDF8</LOWPART>
</CREATIONTIME>
<LASTMODIFICATIONTIME>
<HIGHPART>0x01CA0463</HIGHPART>
<LOWPART>0x76D5423C</LOWPART>
</LASTMODIFICATIONTIME>
<WINDOWS>
<ARCH>0</ARCH>
<PRODUCTNAME>Microsoft® Windows® Operating System</PRODUCTNAME>
<EDITIONID>Starter</EDITIONID>
<INSTALLATIONTYPE>Client</INSTALLATIONTYPE>
<HAL>acpiapic</HAL>
<PRODUCTTYPE>WinNT</PRODUCTTYPE>
<PRODUCTSUITE>Terminal Server</PRODUCTSUITE>
<LANGUAGES>
<LANGUAGE>zh-CN</LANGUAGE>
<DEFAULT>zh-CN</DEFAULT>
</LANGUAGES>
<VERSION>
<MAJOR>6</MAJOR>
<MINOR>1</MINOR>
<BUILD>7600</BUILD>
<SPBUILD>16385</SPBUILD>
<SPLEVEL>0</SPLEVEL>
</VERSION>
<SYSTEMROOT>WINDOWS</SYSTEMROOT>
</WINDOWS>
<NAME>Windows 7 STARTER</NAME>
<DESCRIPTION>Windows 7 STARTER</DESCRIPTION>
<FLAGS>Starter</FLAGS>
<HARDLINKBYTES>3045021372</HARDLINKBYTES>
<DISPLAYNAME>Windows 7 简易版</DISPLAYNAME>
<DISPLAYDESCRIPTION>Windows 7 简易版</DISPLAYDESCRIPTION>
</IMAGE>
<IMAGE INDEX="2">
<DIRCOUNT>9561</DIRCOUNT>
<FILECOUNT>47403</FILECOUNT>
<TOTALBYTES>8003795881</TOTALBYTES>
<CREATIONTIME>
<HIGHPART>0x01CA0443</HIGHPART>
<LOWPART>0x6568BDF8</LOWPART>
</CREATIONTIME>
<LASTMODIFICATIONTIME>
<HIGHPART>0x01CA0463</HIGHPART>
<LOWPART>0x929ACF4C</LOWPART>
</LASTMODIFICATIONTIME>
<WINDOWS>
<ARCH>0</ARCH>
<PRODUCTNAME>Microsoft® Windows® Operating System</PRODUCTNAME>
<EDITIONID>HomeBasic</EDITIONID>
<INSTALLATIONTYPE>Client</INSTALLATIONTYPE>
<HAL>acpiapic</HAL>
<PRODUCTTYPE>WinNT</PRODUCTTYPE>
<PRODUCTSUITE>Terminal Server</PRODUCTSUITE>
<LANGUAGES>
<LANGUAGE>zh-CN</LANGUAGE>
<DEFAULT>zh-CN</DEFAULT>
</LANGUAGES>
<VERSION>
<MAJOR>6</MAJOR>
<MINOR>1</MINOR>
<BUILD>7600</BUILD>
<SPBUILD>16385</SPBUILD>
<SPLEVEL>0</SPLEVEL>
</VERSION>
<SYSTEMROOT>WINDOWS</SYSTEMROOT>
</WINDOWS>
<NAME>Windows 7 HOMEBASIC</NAME>
<DESCRIPTION>Windows 7 HOMEBASIC</DESCRIPTION>
<FLAGS>HomeBasic</FLAGS>
<HARDLINKBYTES>3060203459</HARDLINKBYTES>
<DISPLAYNAME>Windows 7 家庭普通版</DISPLAYNAME>
<DISPLAYDESCRIPTION>Windows 7 家庭普通版</DISPLAYDESCRIPTION>
</IMAGE>
<IMAGE INDEX="3">
<DIRCOUNT>9779</DIRCOUNT>
<FILECOUNT>48416</FILECOUNT>
<TOTALBYTES>8445655979</TOTALBYTES>
<CREATIONTIME>
<HIGHPART>0x01CA0443</HIGHPART>
<LOWPART>0x6568BDF8</LOWPART>
</CREATIONTIME>
<LASTMODIFICATIONTIME>
<HIGHPART>0x01CA0463</HIGHPART>
<LOWPART>0xAE00FE9C</LOWPART>
</LASTMODIFICATIONTIME>
<WINDOWS>
<ARCH>0</ARCH>
<PRODUCTNAME>Microsoft® Windows® Operating System</PRODUCTNAME>
<EDITIONID>HomePremium</EDITIONID>
<INSTALLATIONTYPE>Client</INSTALLATIONTYPE>
<HAL>acpiapic</HAL>
<PRODUCTTYPE>WinNT</PRODUCTTYPE>
<PRODUCTSUITE>Terminal Server</PRODUCTSUITE>
<LANGUAGES>
<LANGUAGE>zh-CN</LANGUAGE>
<DEFAULT>zh-CN</DEFAULT>
</LANGUAGES>
<VERSION>
<MAJOR>6</MAJOR>
<MINOR>1</MINOR>
<BUILD>7600</BUILD>
<SPBUILD>16385</SPBUILD>
<SPLEVEL>0</SPLEVEL>
</VERSION>
<SYSTEMROOT>WINDOWS</SYSTEMROOT>
</WINDOWS>
<NAME>Windows 7 HOMEPREMIUM</NAME>
<DESCRIPTION>Windows 7 HOMEPREMIUM</DESCRIPTION>
<FLAGS>HomePremium</FLAGS>
<HARDLINKBYTES>3439427655</HARDLINKBYTES>
<DISPLAYNAME>Windows 7 家庭高级版</DISPLAYNAME>
<DISPLAYDESCRIPTION>Windows 7 家庭高级版</DISPLAYDESCRIPTION>
</IMAGE>
<IMAGE INDEX="4">
<DIRCOUNT>9836</DIRCOUNT>
<FILECOUNT>48866</FILECOUNT>
<TOTALBYTES>8326968857</TOTALBYTES>
<CREATIONTIME>
<HIGHPART>0x01CA0443</HIGHPART>
<LOWPART>0x6568BDF8</LOWPART>
</CREATIONTIME>
<LASTMODIFICATIONTIME>
<HIGHPART>0x01CA0463</HIGHPART>
<LOWPART>0xBC779674</LOWPART>
</LASTMODIFICATIONTIME>
<WINDOWS>
<ARCH>0</ARCH>
<PRODUCTNAME>Microsoft® Windows® Operating System</PRODUCTNAME>
<EDITIONID>Professional</EDITIONID>
<INSTALLATIONTYPE>Client</INSTALLATIONTYPE>
<HAL>acpiapic</HAL>
<PRODUCTTYPE>WinNT</PRODUCTTYPE>
<PRODUCTSUITE>Terminal Server</PRODUCTSUITE>
<LANGUAGES>
<LANGUAGE>zh-CN</LANGUAGE>
<DEFAULT>zh-CN</DEFAULT>
</LANGUAGES>
<VERSION>
<MAJOR>6</MAJOR>
<MINOR>1</MINOR>
<BUILD>7600</BUILD>
<SPBUILD>16385</SPBUILD>
<SPLEVEL>0</SPLEVEL>
</VERSION>
<SYSTEMROOT>WINDOWS</SYSTEMROOT>
</WINDOWS>
<NAME>Windows 7 PROFESSIONAL</NAME>
<DESCRIPTION>Windows 7 PROFESSIONAL</DESCRIPTION>
<FLAGS>Professional</FLAGS>
<HARDLINKBYTES>3305882953</HARDLINKBYTES>
<DISPLAYNAME>Windows 7 专业版</DISPLAYNAME>
<DISPLAYDESCRIPTION>Windows 7 专业版</DISPLAYDESCRIPTION>
</IMAGE>
<IMAGE INDEX="5">
<DIRCOUNT>9866</DIRCOUNT>
<FILECOUNT>49019</FILECOUNT>
<TOTALBYTES>8485352280</TOTALBYTES>
<CREATIONTIME>
<HIGHPART>0x01CA0443</HIGHPART>
<LOWPART>0x6568BDF8</LOWPART>
</CREATIONTIME>
<LASTMODIFICATIONTIME>
<HIGHPART>0x01CA0463</HIGHPART>
<LOWPART>0xCBBB37DC</LOWPART>
</LASTMODIFICATIONTIME>
<WINDOWS>
<ARCH>0</ARCH>
<PRODUCTNAME>Microsoft® Windows® Operating System</PRODUCTNAME>
<EDITIONID>Ultimate</EDITIONID>
<INSTALLATIONTYPE>Client</INSTALLATIONTYPE>
<HAL>acpiapic</HAL>
<PRODUCTTYPE>WinNT</PRODUCTTYPE>
<PRODUCTSUITE>Terminal Server</PRODUCTSUITE>
<LANGUAGES>
<LANGUAGE>zh-CN</LANGUAGE>
<DEFAULT>zh-CN</DEFAULT>
</LANGUAGES>
<VERSION>
<MAJOR>6</MAJOR>
<MINOR>1</MINOR>
<BUILD>7600</BUILD>
<SPBUILD>16385</SPBUILD>
<SPLEVEL>0</SPLEVEL>
</VERSION>
<SYSTEMROOT>WINDOWS</SYSTEMROOT>
</WINDOWS>
<NAME>Windows 7 ULTIMATE</NAME>
<DESCRIPTION>Windows 7 ULTIMATE</DESCRIPTION>
<FLAGS>Ultimate</FLAGS>
<HARDLINKBYTES>3463057728</HARDLINKBYTES>
<DISPLAYNAME>Windows 7 旗舰版</DISPLAYNAME>
<DISPLAYDESCRIPTION>Windows 7 旗舰版</DISPLAYDESCRIPTION>
</IMAGE>
</WIM>
The extracted strings should be displayed as:
Windows 7 简易版
Windows 7 家庭普通版
Windows 7 家庭高级版
Windows 7 专业版
Windows 7 旗舰版
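Do not display any other characters.
'<DISPLAYNAME>([^<]+)</DISPLAYNAME>'
A regular expression would be the better approach! I don't know how to write one myself, though; this is only a suggestion!
It would be easier to handle if the data were in an INI file or a list.
Thanks to the poster of reply #2; done with the regular expression.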
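For readers who want to try the pattern suggested above, here is a minimal sketch using AutoIt's built-in StringRegExp. The variable names and the shortened sample text ("Edition A", "Edition B") are placeholders invented for this example, not part of the original post; in practice $sText would hold the full ImageX output.

```autoit
; Minimal sketch: collect every <DISPLAYNAME> value with StringRegExp.
; $sText is a shortened, hypothetical stand-in for the ImageX output.
Local $sText = '<IMAGE INDEX="1"><DISPLAYNAME>Edition A</DISPLAYNAME></IMAGE>' & @CRLF & _
        '<IMAGE INDEX="2"><DISPLAYNAME>Edition B</DISPLAYNAME></IMAGE>'

; Flag 3 asks for an array holding the captured group of every match.
Local $aNames = StringRegExp($sText, '<DISPLAYNAME>([^<]+)</DISPLAYNAME>', 3)
If @error Then
    ConsoleWrite("No <DISPLAYNAME> entries found" & @CRLF)
Else
    ; Print one display name per line; MsgBox could be used here instead.
    For $i = 0 To UBound($aNames) - 1
        ConsoleWrite($aNames[$i] & @CRLF)
    Next
EndIf
```

Because the capture group is `[^<]+`, each match stops at the next `<`, so only the text inside the tag pair is returned.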
This reply is for those who are not comfortable with regular expressions; here is a fairly general-purpose method.
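Last edited by sanmoking on 2009-12-23 16:14
I wrote a func of my own. I'm not very good with regular expressions, so the func below can be used in most text-extraction tasks. Although the original poster's problem has been solved, newcomers who arrive later still have no general-purpose method. The code below solves the original poster's problem:
$file = FileOpen("111.txt", 0) ; open the downloaded file for reading
If $file = -1 Then ; stop here if the file could not be opened
    MsgBox(0, "Error", "Unable to open 111.txt")
    Exit
EndIf
$html = FileRead($file) ; read the file contents
FileClose($file) ; close the opened file
$aaa = ""
For $ti = 1 To amount($html, '<DISPLAYNAME>') ; count how many entries there are
    $ok = StringInStr($html, '<DISPLAYNAME>', 0, $ti) ; position where entry number $ti starts
    $aa = ies($html, "<DISPLAYNAME>", "</DISPLAYNAME>", $ok) ; get the text of the current entry
    $aaa = $aaa & $aa & @CRLF ; after each pass, append the result to the previous ones
Next
MsgBox(0, "Result", "Entries:" & @CRLF & "------------------------------" & @CRLF & $aaa) ; show the result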
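;~ Below are a few simple funcs that are quite useful; you can also write other custom funcs, for example to extract digits, remove the characters between < and >, strip whitespace, and so on.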
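Func ies($data, $a, $b, $s = 1, $c = 1) ; return the text between $a and $b. $a = leading keyword, $b = trailing keyword, $s = position to start searching from (default: the start of $data), $c = which occurrence of $a to use (default: 1)
    Local $start = StringInStr($data, $a, 0, $c, $s) ; find occurrence $c of $a, searching from position $s
    If $start = 0 Then Return "" ; $a was not found
    $start = $start + StringLen($a) ; position just past the end of $a
    Local $end = StringInStr($data, $b, 0, 1, $start) ; search for $b starting from the end of $a
    If $end = 0 Then Return "" ; $b was not found
    Local $txt = StringMid($data, $start, $end - $start) ; take the characters in between
    Return $txt ; return the result
EndFunc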
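Func amount($data, $txt) ; return how many times $txt occurs in $data; the maximum data size was not considered, and there may be a limit
    Local $am = 0, $ok
    While 1
        $ok = StringInStr($data, $txt, 0, $am + 1) ; look for occurrence number $am + 1
        If $ok > 0 Then
            $am = $am + 1
        Else
            ExitLoop
        EndIf
    WEnd
    Return $am
EndFunc
This func has also solved many similar problems; here is a [link].
I can't follow this yet... no idea how long it will be before I can.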
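As a point of comparison with the hand-written ies and amount funcs above, the String.au3 UDF library that ships with AutoIt includes _StringBetween, which returns every substring found between two markers. This is an alternative to the posted approach, not the poster's own method. The sketch below assumes a shortened sample text with placeholder names ("Edition A", "Edition B") invented for this example.

```autoit
#include <String.au3> ; standard UDF library that provides _StringBetween

; Shortened, hypothetical stand-in for the ImageX output.
Local $sText = '<DISPLAYNAME>Edition A</DISPLAYNAME>' & @CRLF & _
        '<DISPLAYNAME>Edition B</DISPLAYNAME>'

; _StringBetween returns a 0-based array with one element per marker pair found.
Local $aNames = _StringBetween($sText, '<DISPLAYNAME>', '</DISPLAYNAME>')
If @error Then
    ConsoleWrite("No entries found" & @CRLF)
Else
    ; Print one entry per line; MsgBox could be used here instead.
    For $i = 0 To UBound($aNames) - 1
        ConsoleWrite($aNames[$i] & @CRLF)
    Next
EndIf
```

Like ies, this needs no regular expression from the caller: only the leading and trailing keywords are passed in.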