
Using a regex wildcard to get a label without the surrounding text

Go

慕運維8079593 2023-07-26 10:06:55

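I'm trying to get the "done" value below, which sits in a byte slice returned at the end of a chunked HTTP stream.

X-sync-status: done\r\n

This is the Go regex I have so far:

syncStatusRegex = regexp.MustCompile("(?i)X-sync-status:(.*)\r\n")

I just want it to return the (.*) part. This is the code that gets the status:

syncStatus := strings.TrimSpace(string(syncStatusRegex.Find(body)))
fmt.Println(syncStatus)

How can I make it return only "done", without the header?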


1 Answer

慕少森


What you want to do is access a capturing group. I prefer named capturing groups, and a very simple helper function takes care of that:

package main

import (
	"fmt"
	"regexp"
)

// Our example input
const input = "X-sync-status: done\r\n"

// We anchor the regex to the beginning of the input with "^".
// Then we have a fixed string until our capturing group begins.
// Within our capturing group, we want all consecutive word characters
// (letters, digits, underscore) that follow.
const regexString = `(?i)^X-sync-status: (?P<status>\w*)`

// We ensure our regexp is valid and can be used.
var syncStatusRegexp *regexp.Regexp = regexp.MustCompile(regexString)

// The helper function...
func namedResults(re *regexp.Regexp, in string) map[string]string {
	// ... does the matching
	match := re.FindStringSubmatch(in)
	result := make(map[string]string)
	if match == nil {
		// No match: return an empty map instead of indexing a nil slice.
		return result
	}
	// and puts the value for each named capturing group
	// into the result map
	for i, name := range re.SubexpNames() {
		if i != 0 && name != "" {
			result[name] = match[i]
		}
	}
	return result
}

func main() {
	fmt.Println(namedResults(syncStatusRegexp, input)["status"])
}
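The question works on a []byte holding the tail of a chunked response, so the header line will usually not be at the very start of the data. The following is a minimal sketch under that assumption: the helper name `syncStatusFromBody` and the sample body are made up for illustration. The `m` flag lets `^` match at the start of any line, and `FindSubmatch` returns the capture group straight from the byte slice.

```go
package main

import (
	"fmt"
	"regexp"
)

// (?i) makes the header name case-insensitive; (?m) lets ^ match at the
// start of every line, not only at the start of the whole input.
var syncStatusRe = regexp.MustCompile(`(?im)^X-sync-status:[ \t]*(\w+)`)

// syncStatusFromBody is a hypothetical helper: it returns the captured
// status, and false when the header is absent.
func syncStatusFromBody(body []byte) (string, bool) {
	m := syncStatusRe.FindSubmatch(body)
	if m == nil {
		return "", false
	}
	// m[0] is the whole match, m[1] is the first capture group.
	return string(m[1]), true
}

func main() {
	// Made-up sample: some payload followed by the trailing status line.
	body := []byte("some payload\r\nX-sync-status: done\r\n")
	status, ok := syncStatusFromBody(body)
	fmt.Println(status, ok)
}
```

Returning a boolean alongside the value keeps a missing header distinguishable from an empty status, which indexing the match slice directly would not.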


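Note that your current regex is slightly off, because it also captures the space after the colon. With your current regex, the result would be " done" rather than "done".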
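To make the leading space visible, here is a small sketch that runs the pattern from the question next to a variant that consumes optional blanks before the group. The variant and the helper name `firstGroup` are assumptions added for illustration; they are not part of the question or the answer.

```go
package main

import (
	"fmt"
	"regexp"
)

// The pattern from the question, unchanged: the group starts right after
// the colon, so it includes the space.
var original = regexp.MustCompile("(?i)X-sync-status:(.*)\r\n")

// A variant (an assumption) that consumes optional spaces and tabs
// before the group starts.
var trimmed = regexp.MustCompile("(?i)X-sync-status:[ \t]*(.*)\r\n")

// firstGroup returns capture group 1, or "" when the pattern does not match.
func firstGroup(re *regexp.Regexp, body []byte) string {
	m := re.FindSubmatch(body)
	if m == nil {
		return ""
	}
	return string(m[1])
}

func main() {
	body := []byte("X-sync-status: done\r\n")
	fmt.Printf("%q\n", firstGroup(original, body)) // " done"
	fmt.Printf("%q\n", firstGroup(trimmed, body))  // "done"
}
```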

Edit: of course, you can do this more cheaply without a regex:

fmt.Print(strings.Trim(strings.Split(input, ":")[1], " \r\n"))
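One caveat with the one-liner: `strings.Split(input, ":")[1]` panics when the line contains no colon, and it cuts off a value that itself contains a colon. A hedged alternative, assuming Go 1.18 or later for `strings.Cut`, could look like the sketch below; the helper name `statusByCut` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// statusByCut is a hypothetical helper. It splits the line at the first
// colon only, compares the header name case-insensitively, and trims
// surrounding whitespace (including the trailing "\r\n") from the value.
func statusByCut(line string) (string, bool) {
	name, value, found := strings.Cut(line, ":")
	if !found || !strings.EqualFold(strings.TrimSpace(name), "X-sync-status") {
		return "", false
	}
	return strings.TrimSpace(value), true
}

func main() {
	fmt.Println(statusByCut("X-sync-status: done\r\n"))
	fmt.Println(statusByCut("not a header line"))
}
```

Checking the header name costs one extra comparison, but it keeps the helper from returning the value of an unrelated line.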


Edit 2: I was curious how much cheaper the split approach is, so I put together a very rough comparison:

package main

import (
	"fmt"
	"log"
	"regexp"
	"strings"
)

// Our example input
const input = "X-sync-status: done\r\n"

// We anchor the regex to the beginning of the input with "^".
// Then we have a fixed string until our capturing group begins.
// Within our capturing group, we want all consecutive word characters
// (letters, digits, underscore) that follow.
const regexString = `(?i)^X-sync-status: (?P<status>\w*)`

// We ensure our regexp is valid and can be used.
var syncStatusRegexp *regexp.Regexp = regexp.MustCompile(regexString)

func statusBySplit(in string) string {
	// Use the parameter, not the package-level constant.
	return strings.Trim(strings.Split(in, ":")[1], " \r\n")
}

func statusByRegexp(re *regexp.Regexp, in string) string {
	return re.FindStringSubmatch(in)[1]
}

[...]

And a small benchmark:

package main

import "testing"

func BenchmarkRegexp(b *testing.B) {
	for i := 0; i < b.N; i++ {
		statusByRegexp(syncStatusRegexp, input)
	}
}

func BenchmarkSplit(b *testing.B) {
	for i := 0; i < b.N; i++ {
		statusBySplit(input)
	}
}

Then I ran them 5 times each with 1, 2, and 4 CPUs available. IMHO, the results are pretty conclusive:

go test -run=^$ -test.bench=. -test.benchmem -test.cpu 1,2,4 -test.count=5

goos: darwin
goarch: amd64
pkg: github.com/mwmahlberg/so-regex
BenchmarkRegexp      5000000    383 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp      5000000    382 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp      5000000    384 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp-2    5000000    384 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp-2    5000000    382 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp-2    5000000    384 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp-2    5000000    382 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp-4    5000000    382 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp-4    5000000    380 ns/op    32 B/op    1 allocs/op
BenchmarkRegexp-4    5000000    377 ns/op    32 B/op    1 allocs/op
BenchmarkSplit      10000000    161 ns/op    80 B/op    3 allocs/op
BenchmarkSplit      10000000    164 ns/op    80 B/op    3 allocs/op
BenchmarkSplit      10000000    165 ns/op    80 B/op    3 allocs/op
BenchmarkSplit      10000000    162 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-2    10000000    159 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-2    10000000    167 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-2    10000000    161 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-2    10000000    159 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-4    10000000    159 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-4    10000000    161 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-4    10000000    159 ns/op    80 B/op    3 allocs/op
BenchmarkSplit-4    10000000    160 ns/op    80 B/op    3 allocs/op
PASS
ok      github.com/mwmahlberg/so-regex    61.340s

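It clearly shows that, for extracting a label like this, split is more than twice as fast as the precompiled regex (roughly 160 ns/op versus 380 ns/op), although it allocates more (80 B/op in 3 allocations versus 32 B/op in 1). For your use case, I would clearly go with split.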

Answered 2023-07-26
