A Deep Dive into bufio.SplitFunc in Go

Preface

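The bufio package is part of Go's standard library. It implements buffered I/O, so that data can be read and written through a buffer. Several other standard-library packages that deal with I/O rely on it: net/http uses bufio to read and write network data, and archive/zip uses bufio when reading and writing file data.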

SplitFunc, defined in the bufio package, is important but also fairly hard to understand. This article uses simple examples to explain how SplitFunc works and how to implement one of your own.

An example

The bufio package defines some commonly used tools, such as Scanner. Suppose you need to read what a user types on standard input. For example, we can write an echo program that reads each line the user enters and prints it back:

package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	scanner := bufio.NewScanner(os.Stdin)
	scanner.Split(bufio.ScanLines)
	for scanner.Scan() {
		fmt.Println(scanner.Text())
	}
}

The program is simple. os.Stdin implements the io.Reader interface, so we create a scanner from that reader, set its split function to bufio.ScanLines, and then loop, printing the text each time a line is read. Small as it is, this program introduces the subject of this article: bufio.SplitFunc, which is defined like this:

package bufio
type SplitFunc func(data []byte, atEOF bool) (advance int, token []byte, err error)

The official Go documentation describes it as follows:

SplitFunc is the signature of the split function used to tokenize the input. The arguments are an initial substring of the remaining unprocessed data and a flag, atEOF, that reports whether the Reader has no more data to give. The return values are the number of bytes to advance the input and the next token to return to the user, if any, plus an error, if any.

Scanning stops if the function returns an error, in which case some of the input may be discarded.

Otherwise, the Scanner advances the input. If the token is not nil, the Scanner returns it to the user. If the token is nil, the Scanner reads more data and continues scanning; if there is no more data--if atEOF was true--the Scanner returns. If the data does not yet hold a complete token, for instance if it has no newline while scanning lines, a SplitFunc can return (0, nil, nil) to signal the Scanner to read more data into the slice and try again with a longer slice starting at the same point in the input.

The function is never called with an empty data slice unless atEOF is true. If atEOF is true, however, data may be non-empty and, as always, holds unprocessed text.

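It is in English! So many parameters! So many return values! What a pain! I wonder whether other readers feel the same way when they run into documentation like this. That is exactly why I decided to write an article explaining how SplitFunc actually works, in plain terms and with concrete examples. I hope it helps.
Enough chatter. Let's get to the point.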

How Scanner and SplitFunc work together

package bufio
type SplitFunc func(data []byte, atEOF bool) (advance int, token []byte, err error)

A Scanner is buffered: internally it maintains a slice that holds the data already read from the Reader. The Scanner calls the SplitFunc we set, passing it the buffered content (data) and a flag saying whether the input has ended (atEOF). The SplitFunc's job is to use those two arguments to return how many bytes the next Scan should advance (advance), the token that was split off (token), and an error (err).
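To make the three return values concrete, here is a minimal sketch of a split function that cuts the input into fixed-size chunks. The names splitChunks, chunks, and chunkSize are hypothetical and exist only for this illustration; they are not part of bufio.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// chunkSize is a hypothetical token width chosen only for this illustration.
const chunkSize = 4

// splitChunks is a hypothetical SplitFunc that emits fixed-size chunks.
func splitChunks(data []byte, atEOF bool) (advance int, token []byte, err error) {
	// A full chunk is buffered: consume it and hand it to the caller.
	if len(data) >= chunkSize {
		return chunkSize, data[:chunkSize], nil
	}
	// The input has ended: flush whatever is left as a final, shorter token.
	if atEOF && len(data) > 0 {
		return len(data), data, nil
	}
	// Not enough data yet, or nothing left: ask the Scanner to read more.
	return 0, nil, nil
}

// chunks collects every token the Scanner produces for input.
func chunks(input string) []string {
	scanner := bufio.NewScanner(strings.NewReader(input))
	scanner.Split(splitChunks)
	var out []string
	for scanner.Scan() {
		out = append(out, scanner.Text())
	}
	return out
}

func main() {
	fmt.Println(chunks("abcdefghij")) // prints [abcd efgh ij]
}
```

Each non-nil token ends one call to Scan, and advance tells the Scanner where the next call to the split function should start.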

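This is a two-way exchange. The Scanner tells our SplitFunc what data has been scanned so far and whether the end has been reached; our SplitFunc uses that information to hand back the split result and how far the next scan should advance. An example makes this clearer: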

package main
import (
 "bufio"
 "fmt"
 "strings"
)
func main() {
 input := "abcdefghijkl"
 scanner := bufio.NewScanner(strings.NewReader(input))
 split := func(data []byte, atEOF bool) (advance int, token []byte, err error) {
  fmt.Printf("%t\t%d\t%s\n", atEOF, len(data), data)
  return 0, nil, nil
 }
 scanner.Split(split)
 buf := make([]byte, 2)
 scanner.Buffer(buf, bufio.MaxScanTokenSize)
 for scanner.Scan() {
  fmt.Printf("%s\n", scanner.Text())
 }
}

Output

false 2 ab
false 4 abcd
false 8 abcdefgh
false 12 abcdefghijkl
true 12 abcdefghijkl

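Here the initial buffer size is set to 2. When the buffer is too small, the Scanner doubles it, up to the maximum passed to Buffer, which is bufio.MaxScanTokenSize in this case. The first read therefore fills the buffer with 2 bytes, the reader has not yet reached EOF, and the split function runs, printing: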

false 2 ab

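The function then returns 0, nil, nil. This tells the Scanner that there is not enough data yet: the read position advances by 0 bytes and more data must be read from the reader. Because the buffer is full, its capacity doubles to 2 * 2 = 4. The reader still has not reached EOF, so the output is: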

false 4 abcd

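These steps repeat. On the fourth call the buffer has grown to 16 bytes, but the input holds only 12, so len(data) is 12. Once all of the input has been read, atEOF becomes true: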

true 12 abcdefghijkl
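In the walkthrough above the buffer stops growing only because the input runs out. As a rough sketch of the other case, the following shows what happens when a token does not fit within the maximum passed to Buffer: Scan returns false and Err reports bufio.ErrTooLong. The helper scanWithLimit and the 8-byte limit are assumptions made for this illustration.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// scanWithLimit is a hypothetical helper. It scans input line by line with
// a buffer that starts at 2 bytes and may grow to at most maxSize bytes.
// It returns the lines it managed to read and the error that stopped the
// scan, if any.
func scanWithLimit(input string, maxSize int) ([]string, error) {
	scanner := bufio.NewScanner(strings.NewReader(input))
	scanner.Buffer(make([]byte, 2), maxSize)
	scanner.Split(bufio.ScanLines)
	var lines []string
	for scanner.Scan() {
		lines = append(lines, scanner.Text())
	}
	return lines, scanner.Err()
}

func main() {
	// The first line fits in 8 bytes; the second one does not.
	lines, err := scanWithLimit("short\nthis-line-is-too-long\n", 8)
	fmt.Println(lines, errors.Is(err, bufio.ErrTooLong)) // prints [short] true
}
```

Checking scanner.Err() after the loop is what distinguishes a clean end of input from a scan that stopped early.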

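After following the process above, does the way SplitFunc works make a little more sense? Go back and reread the official Go documentation and see whether it is easier to follow now. Below is the implementation of bufio.ScanLines; readers can study for themselves how the function works.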

ScanLines in the standard library

func ScanLines(data []byte, atEOF bool) (advance int, token []byte, err error) {
	// We have reached the end of the input and nothing is left to return.
	if atEOF && len(data) == 0 {
		return 0, nil, nil
	}
	// Find the position of '\n'.
	if i := bytes.IndexByte(data, '\n'); i >= 0 {
		// Move the next read position forward by i + 1 bytes.
		return i + 1, dropCR(data[0:i]), nil
	}
	// The reader is exhausted but data is not empty,
	// so return the remaining data as the final token.
	if atEOF {
		return len(data), dropCR(data), nil
	}
	// No split is possible yet; request more data from the Reader.
	return 0, nil, nil
}
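Following the same four steps as ScanLines, a custom split function might look like the sketch below. It splits on ';' instead of '\n' and does not strip carriage returns. The names scanSemicolons and fields are hypothetical and not part of bufio.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
)

// scanSemicolons is a hypothetical SplitFunc modeled on ScanLines.
// It returns each ';'-separated field without the delimiter.
func scanSemicolons(data []byte, atEOF bool) (advance int, token []byte, err error) {
	// End of input and nothing left to return.
	if atEOF && len(data) == 0 {
		return 0, nil, nil
	}
	// A delimiter is buffered: return the field before it and skip past it.
	if i := bytes.IndexByte(data, ';'); i >= 0 {
		return i + 1, data[:i], nil
	}
	// End of input with a final, unterminated field: return it as is.
	if atEOF {
		return len(data), data, nil
	}
	// No delimiter yet: request more data.
	return 0, nil, nil
}

// fields collects every token produced by scanSemicolons for input.
func fields(input string) []string {
	scanner := bufio.NewScanner(strings.NewReader(input))
	scanner.Split(scanSemicolons)
	var out []string
	for scanner.Scan() {
		out = append(out, scanner.Text())
	}
	return out
}

func main() {
	fmt.Printf("%q\n", fields("a=1;b=2;;c=3")) // prints ["a=1" "b=2" "" "c=3"]
}
```

As with ScanLines, a trailing delimiter does not produce an extra empty token, while two adjacent delimiters produce an empty one in between.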

Reference

In-depth introduction to bufio.Scanner in Golang

Summary

That is all for this article. I hope it serves as a useful reference for your study or work.
