
golang regex: match every matching substring

    wudaown · 2020-03-21 09:55:32 +08:00 · 2611 views

    I have the following string:

    text := "59-1124043-1053 - FLOATING JOINT4PC15-Feb-2020 Purchase PriceUSD 28.00112.00per 1 PCTaxable: N Resale: N<70.000KG DHL Express - Intl Shpts 962183650SCAC1:DHLCOver70.000KG Agility Logistics - Export StandardSCAC2:AGYC3 59-1124043-1055 - UNDER RAIL1PC15-Feb-2020 Purchase PriceUSD 138.00138.00per 1 PCTaxable: N Resale: N<70.000KG DHL Express - Intl Shpts 962183650SCAC1:DHLCOver70.000KG Agility Logistics - Export StandardSCAC2:AGYC4 59-1124043-1056 - UPPER RAIL1PC15-Feb-2020"

    I run it through this regular expression:

    PNExprV2 := `[^(-138)][\d]+-.*[\d][\s]+.*[\d]{1}[a-zA-Z]{2}[\d]{2}-[a-zA-Z]{3}-[\d]{4}`

    PNParser := regexp.MustCompile(PNExprV2)

    line1 := PNParser.FindAllStringSubmatch(text, 10)

    What comes back is still the whole original string. The result I expect is ["59-1124043-1053 - FLOATING JOINT4PC15-Feb-2020", ..., "59-1124043-1056 - UPPER RAIL1PC15-Feb-2020"]

    Could someone please take a look?

    Addendum 1  ·  2020-03-21 15:39:04 +08:00
    To extend the question a bit:
    what if the string is
    text := "611-0571271-20126 8PC03-Feb-2020DEBURR INSERTMaterial revision level: A Purchase PriceUSD 14.00112.00per 1 PCTaxable: N Resale: NRFQ# 170593<70.000KG DHL Express - Intl Shpts 962183650SCAC1:DHLCOver70.000KG Agility Logistics - Export StandardSCAC2:AGYC711-0571337-21531 5PC03-Feb-2020U-ING PUNCHMaterial revision level: B Purchase PriceUSD 41.00205.00per 1 PCTaxable: N Resale: NRFQ# 170593<70.000KG DHL Express - Intl Shpts 962183650SCAC1:DHLCOver70.000KG Agility Logistics - Export StandardSCAC2:AGYC811-0571378-5147B 7PC03-Feb-2020"

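    and I also want to get ["611-0571271-20126 8PC03-Feb-2020", ..., "811-0571378-5147B 7PC03-Feb-2020"]?
    Is there a general pattern that covers both?

    Thanks, everyone.
    Addendum 2  ·  2020-03-21 15:43:50 +08:00
    The shape is xxx-xxx-xxx 9PC date, where an arbitrary string may or may not appear in the middle.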
    7 replies  ·  2020-03-21 16:09:45 +08:00
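    #1 · buhuiqizi · 2020-03-21 10:30:45 +08:00
    The .* is greedy, so it has already consumed everything that follows in the string.
    [^(-138)][\d]+-[\d]+-[\d]+\s-\s[a-zA-Z]+\s[a-zA-Z0-9]+-[a-zA-Z]+-[\d]+
    This one matches, but I am a beginner too, so better wait for someone more experienced.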
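
To make the greedy behaviour described in the reply above concrete, here is a minimal Go sketch. The `sample` string is a shortened stand-in for the text in the question (an assumption, for brevity), and `findAll` is a hypothetical helper that is not part of the thread.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// sample is a shortened stand-in for the text in the question (assumption).
const sample = "59-1124043-1053 - FLOATING JOINT4PC15-Feb-2020 Purchase PriceUSD 28.00 SCAC2:AGYC3 59-1124043-1055 - UNDER RAIL1PC15-Feb-2020"

// greedy is the pattern from the question; its two ".*" can span several records.
const greedy = `[^(-138)][\d]+-.*[\d][\s]+.*[\d]{1}[a-zA-Z]{2}[\d]{2}-[a-zA-Z]{3}-[\d]{4}`

// explicit is the pattern from the reply above, which spells out each piece
// instead of relying on ".*".
const explicit = `[^(-138)][\d]+-[\d]+-[\d]+\s-\s[a-zA-Z]+\s[a-zA-Z0-9]+-[a-zA-Z]+-[\d]+`

// findAll (hypothetical helper) returns every non-overlapping match,
// trimmed of surrounding whitespace.
func findAll(pattern, s string) []string {
	re := regexp.MustCompile(pattern)
	matches := re.FindAllString(s, -1)
	for i, m := range matches {
		matches[i] = strings.TrimSpace(m)
	}
	return matches
}

func main() {
	// One match that covers the whole input.
	fmt.Printf("%q\n", findAll(greedy, sample))
	// One match per record.
	fmt.Printf("%q\n", findAll(explicit, sample))
}
```

The trimming is there because the leading negated class always consumes one character, and that character can be the space in front of a part number.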
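    #2 · Akiyu · 2020-03-21 10:34:12 +08:00
    Are there any special requirements?
    [^(-138)] looks like you want the match not to start with -, 1, 3, or 8? Then the parentheses are unnecessary; [^-138] is enough. Unless you meant that it should not start with the whole sequence -138.
    Other than that, I don't see anything notable.
    For a simple version, use "\d+-\d+-\d+ - \w+ \w+-\w{3}-\d{4}"

    PS: don't overuse .*

    You can test patterns on this site: https://regex101.com/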
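
A small Go sketch of the two points in the reply above: the simpler pattern, and the difference between the two character classes. The `sample` string is a shortened stand-in for the original text, and `classAccepts` is a hypothetical helper; both are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// sample is a shortened stand-in for the text in the question (assumption).
const sample = "59-1124043-1053 - FLOATING JOINT4PC15-Feb-2020 Purchase PriceUSD 28.00 SCAC2:AGYC3 59-1124043-1055 - UNDER RAIL1PC15-Feb-2020 SCAC2:AGYC4 59-1124043-1056 - UPPER RAIL1PC15-Feb-2020"

// simple is the pattern suggested in the reply above.
var simple = regexp.MustCompile(`\d+-\d+-\d+ - \w+ \w+-\w{3}-\d{4}`)

// classAccepts (hypothetical helper) reports whether a character class,
// written with its brackets, matches the single-character string s.
func classAccepts(class, s string) bool {
	return regexp.MustCompile("^" + class + "$").MatchString(s)
}

func main() {
	// Prints one line per record.
	for _, m := range simple.FindAllString(sample, -1) {
		fmt.Println(m)
	}

	// In Go, "(-1" inside brackets is parsed as a range from '(' to '1',
	// so the two classes differ on characters such as '0'.
	fmt.Println(classAccepts(`[^(-138)]`, "0")) // false
	fmt.Println(classAccepts(`[^-138]`, "0"))   // true
}
```

Because the pattern starts with `\d+` rather than a negated class, each match begins directly at the part number.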
    #3 · wudaown (OP) · 2020-03-21 10:47:20 +08:00
    @buhuiqizi
    @Akiyu Thank you both. This text matches now; I'll try it on the other texts. Thanks.
    #4 · wudaown (OP) · 2020-03-21 15:39:17 +08:00
    @buhuiqizi @Akiyu Could you take a look at the new case as well?
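    #5 · Akiyu · 2020-03-21 15:58:50 +08:00
    "\d+-\d+-\w+( - \w+)? \w+-\w{3}-\d{4}"

    The string "611-0571271-20126 8PC03-Feb-2020" lacks the " - FLOATING" part found in the earlier "59-1124043-1053 - FLOATING JOINT4PC15-Feb-2020", so making that part optional is enough.
    Also, "5147B" in "811-0571378-5147B 7PC03-Feb-2020" contains a letter, so match that group with \w instead of \d.

    I recommend the book <regular expression>. For something simpler, <正则表达式必知必会> is only a few dozen pages, and you can finish it on your commute. That is enough for most simple problems.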
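
A minimal Go sketch of the optional-group pattern from the reply above, run against both shapes of input. The `first` and `second` strings are shortened stand-ins for the two texts in the thread (an assumption, for brevity).

```go
package main

import (
	"fmt"
	"regexp"
)

// withOptional is the pattern from the reply above: the " - WORD" part is
// optional, and the third group uses \w so a trailing letter ("5147B") is accepted.
var withOptional = regexp.MustCompile(`\d+-\d+-\w+( - \w+)? \w+-\w{3}-\d{4}`)

// Shortened stand-ins for the two texts in the thread (assumption).
const (
	first  = "59-1124043-1053 - FLOATING JOINT4PC15-Feb-2020 Purchase PriceUSD 28.00 SCAC2:AGYC3 59-1124043-1055 - UNDER RAIL1PC15-Feb-2020"
	second = "611-0571271-20126 8PC03-Feb-2020DEBURR INSERT SCAC2:AGYC811-0571378-5147B 7PC03-Feb-2020"
)

func main() {
	// FindAllString with n = -1 returns every non-overlapping match.
	for _, text := range []string{first, second} {
		fmt.Printf("%q\n", withOptional.FindAllString(text, -1))
	}
}
```

The pattern still assumes three hyphen-separated groups and at most one word before the quantity, so descriptions with several words would need a looser middle section.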
    #6 · wudaown (OP) · 2020-03-21 16:09:11 +08:00
    @Akiyu Thanks.
    My case is more complicated, and the strings don't follow one general format.

    For example, these strings:
    ···
    611-0571271-20126 8PC03-Feb-2020DEBURR INSERTMaterial revision level: A Purchase PriceUSD 14.00112.00per 1 PCTaxable: N Resale: NRFQ# 170593<70.000KG DHL Express - Intl Shpts 962183650SCAC1:DHLCOver70.000KG Agility Logistics - Export StandardSCAC2:AGYC711-0571337-21531 5PC03-Feb-2020U-ING PUNCHMaterial revision level: B Purchase PriceUSD 41.00205.00per 1 PCTaxable: N Resale: NRFQ# 170593<70.000KG DHL Express - Intl Shpts 962183650SCAC1:DHLCOver70.000KG Agility Logistics - Export StandardSCAC2:AGYC811-0571378-5147B 7PC03-Feb-2020

    59-1124043-1053 - FLOATING JOINT4PC15-Feb-2020 Purchase PriceUSD 28.00112.00per 1 PCTaxable: N Resale: N<70.000KG DHL Express - Intl Shpts 962183650SCAC1:DHLCOver70.000KG Agility Logistics - Export StandardSCAC2:AGYC3 59-1124043-1055 - UNDER RAIL1PC15-Feb-2020 Purchase PriceUSD 138.00138.00per 1 PCTaxable: N Resale: N<70.000KG DHL Express - Intl Shpts 962183650SCAC1:DHLCOver70.000KG Agility Logistics - Export StandardSCAC2:AGYC4 59-1124043-1056 - UPPER RAIL1PC15-Feb-2020

    219-1895504 Die Sleeve, Floating Die Case6PC31-Jan-2020 Purchase PriceUSD 117.00702.00per 1 PCTaxable

    119-1895433-00008 Punch Cap6PC31-Jan-2020 Purchase PriceUSD 95.0

    are met for this material / product. This specification is available under Documents in the TE SupplierPortal at https://supplierportal.te.comGD40 - EACH LOT MUST INCLUDE AN INSPECTION REPORT, WITH 100% INSPECTION OF 10% OF THE PIECES, WITH A MINIMUM OF ONE PIECE INSPECTED AND A MAXIMUM OF 3 PIECES INSPECTED FOR EACH LOT.GD50 - EACH PIECE OF TOOLING TO BE CONSTRUCTED PER TE DRAWING AND SCRIBED IN A NON-FUNCTIONAL AREA WITH THE DETAIL NUMBER, REVISION LEVEL AND SUPPLIER NUMBER. SEE PURCHASE ORDER FOR SUPPLIER NUMBER. IF SIZE IS RESTRICTED, BAG AND TAG EACH PIECE WITH THE SAME INFORMATION.GE26 - THE VALUE OF THIS PO IS NOT CONSIDERED AN "ASSIST" BY U.S.CUSTOMS.119-1895433-00008 Punch Cap6PC31-Jan-2020 219-1895504 Die Sleeve, Floating Die Case6PC31-Jan-2020
    ···
    #7 · wudaown (OP) · 2020-03-21 16:09:45 +08:00
    In the end I used
    (\d+-\d+-[\d]+|\d+-\d+)(.*?)\d{1,2}\w{2}\d{2}-\w{3}-\d{4}
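
A minimal Go sketch of how this final pattern behaves on the different shapes seen in the thread. The `sample` string is a shortened mix of those shapes and `extract` is a hypothetical helper; both are assumptions, and the group descriptions in the comments are a reading of the data rather than something stated in the thread.

```go
package main

import (
	"fmt"
	"regexp"
)

// final is the pattern from the last reply. Group 1 is what looks like the
// part number (three or two hyphen-separated numeric groups), group 2 is the
// optional free text, and the tail matches text such as "8PC03-Feb-2020".
var final = regexp.MustCompile(`(\d+-\d+-[\d]+|\d+-\d+)(.*?)\d{1,2}\w{2}\d{2}-\w{3}-\d{4}`)

// sample is a shortened mix of the three shapes in the thread (assumption).
const sample = "611-0571271-20126 8PC03-Feb-2020DEBURR INSERT SCAC2:AGYC4 59-1124043-1056 - UPPER RAIL1PC15-Feb-2020 Purchase Price 219-1895504 Die Sleeve, Floating Die Case6PC31-Jan-2020"

// extract (hypothetical helper) returns the full match for each record.
func extract(text string) []string {
	var out []string
	for _, m := range final.FindAllStringSubmatch(text, -1) {
		out = append(out, m[0]) // m[1] is group 1, m[2] is the free text
	}
	return out
}

func main() {
	for _, rec := range extract(sample) {
		fmt.Println(rec)
	}
}
```

The lazy `(.*?)` stops at the first date-like tail, which keeps each match inside one record here. If a record had no such tail, the match could still run on into the next record, so the result is worth checking against real data.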