[Solved] Looking for a regex to extract data
Last edited by chamlien on 2022-4-7 17:21
The source data is as follows:
"NewsletterV3:44e3f26acd1a": {
"id": "44e3f26acd1a",
"__typename": "NewsletterV3",
"subscribersCount": 57
},
"Post:7bf5a183df09": {
"id": "7bf5a183df09",
"__typename": "Post",
"firstPublishedAt": 1648826488738,
"readingTime": 0.9886792452830189,
"createdAt": 1648826033137,
"mediumUrl": "https://lakshmanok.medium.com/why-i-left-google-7bf5a183df09",
"previewImage": {
"__ref": "ImageMetadata:1*AmbTCdFrLdvueJU0e3OmhQ.jpeg"
},
"title": "Why I left Google",
"collection": null,
"creator": {
"__ref": "User:247b0630b5d6"
},
"visibility": "PUBLIC",
"isProxyPost": false,
"isLocked": false,
"previewContent": {
"__typename": "PreviewContent",
"subtitle": "The obligatory manifesto that one needs to publish when leaving Google",
"isFullContent": false
},
"tags": [],
"allowResponses": true,
"statusForCollection": null,
"isPublished": true,
"clapCount": 866,
"pinnedAt": 0,
"pinnedByCreatorAt": 0,
"curationEligibleAt": 0,
"responseDistribution": "NOT_DISTRIBUTED",
"inResponseToPostResult": null,
"inResponseToCatalogResult": null,
"pendingCollection": null,
"isNewsletter": false,
"isAuthorNewsletter": true,
"voterCount": 217,
"recommenders": []
},
"ImageMetadata:1*_--RXc8V5WaMSH5F-sO3og.png": {
"id": "1*_--RXc8V5WaMSH5F-sO3og.png",
"__typename": "ImageMetadata"
},
"NewsletterV3:379e9e644de4": {
"id": "379e9e644de4",
"__typename": "NewsletterV3",
"subscribersCount": 455
},
"Post:b5a172672bd8": {
"id": "b5a172672bd8",
"__typename": "Post",
"firstPublishedAt": 1649066586515,
"readingTime": 4.816981132075472,
"createdAt": 1645141769978,
"mediumUrl": "https://medium.com/yardcouch-com/elon-musk-people-dont-realize-what-s-coming-b5a172672bd8",
"previewImage": {
"__ref": "ImageMetadata:1*_--RXc8V5WaMSH5F-sO3og.png"
},
"title": "Elon Musk: People Don’t Realize What’s Coming",
"collection": {
"__ref": "Collection:9da99a5586b0"
},
"creator": {
"__ref": "User:8f67d94fbbe9"
},
"visibility": "LOCKED",
"isProxyPost": false,
"isLocked": true,
"previewContent": {
"__typename": "PreviewContent",
"subtitle": "“Earth won’t exist in 12 years”",
"isFullContent": false
},
"tags": [
{
"__ref": "Tag:technology"
},
{
"__ref": "Tag:space"
},
{
"__ref": "Tag:politics"
},
{
"__ref": "Tag:science"
},
{
"__ref": "Tag:elon-musk"
}
],
"allowResponses": true,
"statusForCollection": "APPROVED",
"isPublished": true,
"clapCount": 1313,
"pinnedAt": 0,
"pinnedByCreatorAt": 0,
"curationEligibleAt": 1649034924969,
"responseDistribution": "NOT_DISTRIBUTED",
"inResponseToPostResult": null,
"inResponseToCatalogResult": null,
"pendingCollection": null,
"isNewsletter": false,
"isAuthorNewsletter": true,
"voterCount": 133,
"recommenders": []
},
"ImageMetadata:1*uZ_Em7p_1lPOGPwrYQysdA.jpeg": {
"id": "1*uZ_Em7p_1lPOGPwrYQysdA.jpeg",
"__typename": "ImageMetadata"
}
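From the data above I only want to extract the parts marked in blue, that is, the content inside the curly braces of each Post:xxx entry. How should I extract it?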
(?si)"Post:\w+":\h+({.+?},)\s+"Image
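As a rough illustration of how the pattern above behaves, here is a minimal Python sketch. It is not from the thread: the sample is a trimmed stand-in for the posted data, the name extract_post_blocks is made up for the example, and \h+ is swapped for [ \t]+ because Python's re module does not accept \h.

```python
import re

# Trimmed stand-in for the thread's data (most fields omitted for brevity).
SAMPLE = '''"NewsletterV3:44e3f26acd1a": {
"id": "44e3f26acd1a",
"subscribersCount": 57
},
"Post:7bf5a183df09": {
"id": "7bf5a183df09",
"creator": {
"__ref": "User:247b0630b5d6"
},
"recommenders": []
},
"ImageMetadata:1*_--RXc8V5WaMSH5F-sO3og.png": {
"id": "1*_--RXc8V5WaMSH5F-sO3og.png"
},
"Post:b5a172672bd8": {
"id": "b5a172672bd8",
"recommenders": []
},
"ImageMetadata:1*uZ_Em7p_1lPOGPwrYQysdA.jpeg": {
"id": "1*uZ_Em7p_1lPOGPwrYQysdA.jpeg"
}'''

# Same pattern as in the post, with \h+ rewritten as [ \t]+ for Python.
# (?s) lets . cross line breaks; the lazy .+? stops at the first "},"
# that is followed by whitespace and a key starting with "Image.
POST_BLOCK = re.compile(r'(?si)"Post:\w+":[ \t]+({.+?},)\s+"Image')


def extract_post_blocks(text):
    """Return the brace-delimited body (with trailing comma) of each Post entry."""
    return POST_BLOCK.findall(text)


blocks = extract_post_blocks(SAMPLE)
for block in blocks:
    print(block)
    print("-----")
```

Each captured block keeps its trailing comma, so the comma would have to be stripped before passing the text to a JSON parser. The match also relies on every Post entry being followed directly by an ImageMetadata entry, as in the posted data.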
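(?is)(?:"Post:\w+":)(.+?)(?=,\s*"ImageMetadata:)
A regex can do this, but parsing the data as JSON is a better fit.

In reply to afan (2022-4-7 16:10):
Thanks. If the data above had no line breaks, how should the regex be written?

In reply to zghwelcome (2022-4-7 16:11), who posted (?is)(?:"Post:\w+":)(.+?)(?=,\s*"ImageMetadata:):
If the last entry is not followed by an ImageMetadata entry and simply ends with a closing brace }, how should it be matched?

In reply to chamlien (2022-4-7 16:53), who asked how to write the regex if the data had no line breaks:
Try this: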
(?si)"Post:\w+":\h+({.+?},?)\s*(?="Image|$)
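To see how the optional comma and the (?="Image|$) lookahead cover both follow-up questions (no line breaks, and a last entry with nothing after it), here is a small Python sketch under stated assumptions. The one-line input is made up from the thread's data, and \h+ is replaced with [ \t]+ because Python's re module does not accept \h.

```python
import re

# Hypothetical single-line input: no line breaks, and the last Post entry
# is not followed by an ImageMetadata entry.
ONE_LINE = (
    '"Post:7bf5a183df09": {"id": "7bf5a183df09", '
    '"creator": {"__ref": "User:247b0630b5d6"}, "recommenders": []}, '
    '"ImageMetadata:1*_--RXc8V5WaMSH5F-sO3og.png": '
    '{"id": "1*_--RXc8V5WaMSH5F-sO3og.png"}, '
    '"Post:b5a172672bd8": {"id": "b5a172672bd8", "recommenders": []}'
)

# ",?" makes the trailing comma optional; the lookahead accepts either a
# following key that starts with "Image or the end of the input.
PATTERN = re.compile(r'(?si)"Post:\w+":[ \t]+({.+?},?)\s*(?="Image|$)')

blocks = PATTERN.findall(ONE_LINE)
for block in blocks:
    print(block)
```

One limitation worth keeping in mind: if a Post entry were followed by some other kind of entry (for example a NewsletterV3 entry), the lazy match would keep going until the next "Image key or the end of the input.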
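
In reply to afan (2022-4-7 16:57, "Try this"):
afan, this one doesn't work for me.

In reply to chamlien (2022-4-7 16:54), who asked how to match a last entry that ends with } and has no ImageMetadata after it:
I tweaked the regex a little: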
(?is)(?:"Post:\w+":)(.+?)(?=]\s*})
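Manually appending "]}" to the end of each match then also covers what I need.

In reply to chamlien (2022-4-7 17:07, "this one doesn't work for me"):
That can't be.

In reply to afan (2022-4-7 17:13, "That can't be"):
It turns out the symbols in the pattern have to be escaped in Java. This regex does work, thanks.

My catch-all approach: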
(?s)"Post:.*?": {(.*?)},\s*"ImageMetadata
It still works.
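
One reply in this thread notes that JSON is a better fit than a regex. A minimal Python sketch of that idea follows. It assumes the posted data is the inside of a single JSON object, so that wrapping it in braces gives valid JSON; the function name extract_posts and the trimmed sample are made up for the example.

```python
import json

# Trimmed stand-in for the thread's data (most fields omitted for brevity).
FRAGMENT = '''"NewsletterV3:44e3f26acd1a": {
"id": "44e3f26acd1a",
"__typename": "NewsletterV3",
"subscribersCount": 57
},
"Post:7bf5a183df09": {
"id": "7bf5a183df09",
"__typename": "Post",
"title": "Why I left Google",
"clapCount": 866,
"recommenders": []
},
"ImageMetadata:1*_--RXc8V5WaMSH5F-sO3og.png": {
"id": "1*_--RXc8V5WaMSH5F-sO3og.png",
"__typename": "ImageMetadata"
},
"Post:b5a172672bd8": {
"id": "b5a172672bd8",
"__typename": "Post",
"clapCount": 1313,
"recommenders": []
}'''


def extract_posts(fragment):
    """Parse the fragment as JSON and keep only the "Post:..." entries."""
    # Assumption: the fragment is a comma-separated run of object members.
    # Surrounding whitespace and a trailing comma are trimmed, then the
    # text is wrapped in braces to form one JSON object.
    data = json.loads("{" + fragment.strip().rstrip(",") + "}")
    return {key: value for key, value in data.items() if key.startswith("Post:")}


posts = extract_posts(FRAGMENT)
for key, post in posts.items():
    print(key, post["clapCount"])
```

With this approach it does not matter whether the data contains line breaks or what kind of entry follows a Post, since the parser handles the nesting of braces and brackets itself.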