Haskell Parsec whitespace parsing problem (maybe)

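I have written a few hundred lines to parse a toy language I am working on, and I thought I was starting to really understand Parsec. But now I am stuck on a seemingly very simple parsing task, so I am clearly missing some basic piece of understanding.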

I want to match inputs like these (not in my real code, but in this minimal example I do):

  • name end
  • name Foo end
  • name Fu end
  • etc.

I have reduced it to a contrived minimal example (in the file MinEx.hs):

module MinEx where

import Text.Parsec
import Text.Parsec.Token
import Data.Char (isSpace)
import Data.Maybe (fromMaybe)
import System.Environment (getArgs)
import System.IO.Unsafe (unsafePerformIO)

myDef :: LanguageDef st
myDef = LanguageDef
  { commentStart    = ""
  , commentEnd      = ""
  , commentLine     = "#"
  , nestedComments  = True
  , identStart      = letter
  , identLetter     = alphaNum
  , opStart         = opLetter myDef
  , opLetter        = oneOf ":!#$%&*+./<=>?@\\^|-~"
  , reservedOpNames = []
  , reservedNames   = []
  , caseSensitive   = True
  }

TokenParser{parens = myParens
           , identifier = myIdentifier
           , reservedOp = myReservedOp
           , reserved = myReserved
           , semiSep1 = mySemiSep1
           , whiteSpace = myWhiteSpace } = makeTokenParser myDef

simpleSpace :: Parsec String st ()
simpleSpace = skipMany1 (satisfy isSpace)

upperIdentifier :: Parsec String st String
upperIdentifier = lookAhead upper >> myIdentifier

x `uio` y = unsafePerformIO x `seq` y

nameThenEnd :: Parsec String st String
nameThenEnd = do
  print "at name" `uio` string "name"
  print "at spaces after name" `uio` simpleSpace
  maybeName <- print "at ident" `uio` optionMaybe upperIdentifier
  -- I only match this here and not above with `optionMaybe (upperIdentifier <* simpleSpace)` for debugging purposes.
  case maybeName of
    Nothing -> (print "at no spaces after no ident" >> print maybeName) `uio` return ()
    Just name -> (print "at spaces after ident" >> print maybeName) `uio` simpleSpace
  string "end"
  return (fromMaybe "" maybeName)

main :: IO ()
main = getArgs >>= \args -> print (parse (nameThenEnd <* eof) "" (args !! 0))

Running the example in a case where it works, with no identifier given:

> runhaskell MinEx.hs "name end"
"at name"
"at spaces after name"
"at ident"
"at no spaces after no ident"
Nothing
Right ""

The non-working example:

> runhaskell MinEx.hs "name Foo end"
"at name"
"at spaces after name"
"at ident"
"at spaces after ident"
Just "Foo"
Left (line 1, column 10):
unexpected "e"

Thanks, and apologies if I am missing something really obvious.

haskell
1 Answer

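The generated identifier parser is a lexeme parser, which means it eats any whitespace that follows the identifier. Your simpleSpace therefore fails: there is no whitespace left to consume, and because you defined it with skipMany1 it requires at least one whitespace character.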
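
The difference can be seen by inspecting the remaining input after each kind of parser. The following is a minimal sketch, not the asker's code: it assumes Parsec's stock emptyDef instead of the question's myDef, and the names lexer, identThenRest and rawIdentThenRest are made up for illustration.

```haskell
import Text.Parsec
import Text.Parsec.Language (emptyDef)
import Text.Parsec.String (Parser)
import qualified Text.Parsec.Token as Tok

-- Token parser built from emptyDef (an assumption made for brevity;
-- the question defines its own LanguageDef).
lexer :: Tok.TokenParser ()
lexer = Tok.makeTokenParser emptyDef

-- Lexeme-style identifier, followed by whatever input is still unconsumed.
identThenRest :: Parser (String, String)
identThenRest = do
  name <- Tok.identifier lexer
  rest <- getInput
  return (name, rest)

-- Hand-written, non-lexeme identifier: it stops right after the last letter.
rawIdentThenRest :: Parser (String, String)
rawIdentThenRest = do
  name <- (:) <$> letter <*> many alphaNum
  rest <- getInput
  return (name, rest)

lexemeResult, rawResult :: Either ParseError (String, String)
lexemeResult = parse identThenRest "" "Foo end"
rawResult = parse rawIdentThenRest "" "Foo end"

main :: IO ()
main = do
  print lexemeResult  -- the space after Foo has already been consumed
  print rawResult     -- the space after Foo is still in the input
```

In the first result the remaining input is "end", while in the second it is " end". A skipMany1-based whitespace parser placed after the lexeme identifier therefore has nothing to match, which is consistent with the error being reported at column 10, the "e" of "end".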

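If you use the parsers generated by makeTokenParser, you usually do not need to handle whitespace manually (for example, use symbol, reserved, or reservedOp instead of string). See the documentation for more information.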
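
One way the question's nameThenEnd could be rewritten along these lines is sketched below. It is an illustration under assumptions rather than a definitive fix: it uses emptyDef instead of the question's myDef, adds "name" and "end" to reservedNames, and introduces a hypothetical helper parseNameThenEnd.

```haskell
import Data.Maybe (fromMaybe)
import Text.Parsec
import Text.Parsec.Language (emptyDef)
import Text.Parsec.String (Parser)
import qualified Text.Parsec.Token as Tok

-- emptyDef with two reserved words (an assumption; the question's own
-- definition leaves reservedNames empty).
lexer :: Tok.TokenParser ()
lexer = Tok.makeTokenParser emptyDef
  { Tok.reservedNames = ["name", "end"] }

-- Identifier that must start with an upper-case letter, as in the question.
upperIdentifier :: Parser String
upperIdentifier = lookAhead upper >> Tok.identifier lexer

-- Every step is a lexeme parser, so no explicit whitespace parser is needed.
nameThenEnd :: Parser String
nameThenEnd = do
  Tok.reserved lexer "name"
  maybeName <- optionMaybe upperIdentifier
  Tok.reserved lexer "end"
  return (fromMaybe "" maybeName)

-- Hypothetical helper: skip leading whitespace, then require end of input.
parseNameThenEnd :: String -> Either ParseError String
parseNameThenEnd = parse (Tok.whiteSpace lexer *> nameThenEnd <* eof) ""

main :: IO ()
main = mapM_ (print . parseNameThenEnd) ["name end", "name Foo end"]
```

With this sketch both inputs from the question parse, giving Right "" and Right "Foo". Because reserved checks that the keyword is not followed by an identifier letter, an input such as "nameFoo end" is still rejected even though no whitespace parser is called explicitly.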
