The code is below; run it and you will see what it does. I hope it is of some help. The sample text comes from an English-language paper.
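Current limitations:
Hyphens inside a word cannot be removed, e.g. "auto-it".
A hyphen inside a word that is not at a line break splits the match in two, e.g. "auto-it" is treated as the two words "auto" and "it".
Some quotation marks are matched as a word on their own.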
; Note: this script was generated automatically by Au3.REHelper on 2010/12/31 19:24. Its correctness is not guaranteed, so please test it yourself.
#include <Array.au3>
Local $Str = _
'Gene slr1393 of the cyanobacterium Synechocystis sp.' & @CRLF & _
'PCC6803 encodes a red–green photoreversible cyanobacter-' & @CRLF & _
'iochrome. The full-length protein contains three GAF' & @CRLF & _
'domains, but GAF3 (aa 441–596) alone is capable of' & @CRLF & _
'autocatalytically binding PCB to cysteine-528.' & @CRLF & _
'[21]' & @CRLF & _
'Addition' & @CRLF & _
'of PCB to GA results in a reversibly photochromic chromo-' & @CRLF & _
'protein, termed RGS (red–green switchable protein): state Pr' & @CRLF & _
'(lmax =650 nm) is strongly fluorescent (FF =0.06); it is' & @CRLF & _
'reversibly converted by irradiation with red light into state' & @CRLF & _
'Pg (lmax =539 nm), which has reduced and strongly blue-' & @CRLF & _
'shifted fluorescence (Table 1, Figure 1a). Photoswitching can' & @CRLF & _
'be repeated many times; it is stable over a wide pH range, and' & @CRLF & _
'is retained after RGS is embedded into polyvinyl alcohol' & @CRLF & _
'(PVA) film (see Figures S1 and S2 in the Supporting' & @CRLF & _
'Information).'
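MsgBox(0, 'Original string', $Str)
; Flag 3 returns an array of all matches. A word is a run of letters or apostrophes,
; optionally continued on the next line after a hyphen followed by a line break.
Local $Test = StringRegExp($Str, "\b(?!'-)(?:[a-zA-Z']|-[\r\n]+[a-zA-Z']+)+", 3)
If Not @error Then
    MsgBox(0, 'Match count: ' & UBound($Test), 'Element [0] is: ' & $Test[0])
    _ArrayDisplay($Test, UBound($Test))
EndIf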
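
One possible way around the limitations listed at the top is to work in two passes: first delete every hyphen that is directly followed by a line break, then match words that may contain hyphens or apostrophes only between letters. The following is just a sketch, not a tested replacement for the script above. The function name _ExtractWords and the demo string are my own and purely hypothetical. Also note that the first pass cannot tell a typesetting hyphen from a real one, so "blue-" & @CRLF & "shifted" would come out as "blueshifted".

```autoit
#include <Array.au3>

; Hypothetical helper (not part of the original script).
; Returns an array of words, or sets @error to 1 if nothing matched.
Func _ExtractWords($sText)
    ; Pass 1: drop a hyphen plus the line break after it when both sides are letters,
    ; so "chromo-" & @CRLF & "protein" becomes "chromoprotein".
    Local $sJoined = StringRegExpReplace($sText, "(?<=[A-Za-z])-[\r\n]+(?=[A-Za-z])", "")
    ; Pass 2: a word starts and ends with letters; a hyphen or apostrophe is only
    ; accepted between two letter runs, so a stray quote never matches by itself.
    Local $aWords = StringRegExp($sJoined, "[A-Za-z]+(?:['-][A-Za-z]+)*", 3)
    If @error Then Return SetError(1, 0, 0)
    Return $aWords
EndFunc

; Made-up demo input covering the three cases from the limitations list.
Local $aDemo = _ExtractWords("auto-it's chromo-" & @CRLF & "protein 'quoted'")
If IsArray($aDemo) Then _ArrayDisplay($aDemo, 'Words: ' & UBound($aDemo))
```

With this pattern "auto-it's" stays a single word and the quotes around "quoted" are dropped. Digits are still ignored, just as in the original pattern.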