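I read on Andrew Birkett's blog, in the posts about applicative arrows for XML and about getting back to purity, that arrows and applicative functors can be mixed.
I tried it myself, but did not get the result I expected. This is the result I want: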
[Scenario {scenario = "11111", origin = "333", alarm = "Sonde1"},
Scenario {scenario = "22222", origin = "444", alarm = "Sonde2"}]
But this is what I get:
[Scenario {scenario = "11111", origin = "333", alarm = "Sonde1"},
Scenario {scenario = "11111", origin = "333", alarm = "Sonde2"},
Scenario {scenario = "11111", origin = "444", alarm = "Sonde1"},
Scenario {scenario = "11111", origin = "444", alarm = "Sonde2"},
Scenario {scenario = "22222", origin = "333", alarm = "Sonde1"},
Scenario {scenario = "22222", origin = "333", alarm = "Sonde2"},
Scenario {scenario = "22222", origin = "444", alarm = "Sonde1"},
Scenario {scenario = "22222", origin = "444", alarm = "Sonde2"}]
I think something is wrong with my code, but I don't know where to look.
My code is below, in case anyone can help.
{-# LANGUAGE Arrows, NoMonomorphismRestriction #-}
import Text.XML.HXT.Core
import Control.Applicative
import Text.XML.HXT.Arrow.ReadDocument
import Data.Maybe
import Text.XML.HXT.XPath.Arrows
import Text.Printf
data Scenario = Scenario
{ scenario, origin, alarm :: String
}
deriving (Show, Eq)
xml= "<DATAS LANG='en'>\
\ <SCENARIO ID='11111'>\
\ <ORIGIN ID='333'>\
\ <SCENARIO_S ERR='0'></SCENARIO_S>\
\ <SCENARIO_S ERR='2'></SCENARIO_S>\
\ <ALARM_M NAME='Sonde1'></ALARM_M>\
\ </ORIGIN>\
\ </SCENARIO>\
\ <SCENARIO ID='22222'>\
\ <ORIGIN ID='444'>\
\ <SCENARIO_S ERR='10'></SCENARIO_S>\
\ <SCENARIO_S ERR='12'></SCENARIO_S>\
\ <ALARM_M NAME='Sonde2'></ALARM_M>\
\ </ORIGIN>\
\ </SCENARIO>\
\</DATAS>"
parseXML string = readString [ withValidate no
, withRemoveWS yes -- throw away formatting whitespace
] string
parseVal tag name = WrapArrow $ getXPathTrees (printf "/DATAS/%s" tag) >>> getAttrValue name
parseDatas = unwrapArrow $ Scenario <$> parseVal "SCENARIO" "ID"
<*> parseVal "SCENARIO/ORIGIN" "ID"
<*> parseVal "SCENARIO/ORIGIN/ALARM_M" "NAME"
testarr1= runX (parseXML xml >>> parseDatas)
As Rampion pointed out, the problem lies in how the list monad behaves as an applicative. Consider this:
λ *Main > (+) <$> [1,2,3] <*> [1,2,3]
[2,3,4,3,4,5,4,5,6]
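The result is (+) applied to the Cartesian product of [1,2,3] and [1,2,3]: the resulting list has 3 × 3 = 9 elements.
In your code,
parseVal "SCENARIO" "ID"
produces a list of 2 elements, and so do parseVal "SCENARIO/ORIGIN" "ID"
and parseVal "SCENARIO/ORIGIN/ALARM_M" "NAME".
Because each of the three queries runs against the whole document, the combined result has 2 × 2 × 2 = 8 elements.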
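To make the Cartesian-product behaviour concrete, here is a small sketch using plain lists only (no HXT). It shows that the list Applicative gives the same result as a list comprehension over both lists, and contrasts it with ZipList from Control.Applicative, which pairs elements by position. The names viaApplicative, viaComprehension and viaZipList are made up for this illustration, and ZipList is shown only for contrast; it is not something the HXT arrows offer.

```haskell
import Control.Applicative (ZipList (..))

-- The list Applicative combines every element of the first list
-- with every element of the second list (a Cartesian product).
viaApplicative :: [Int]
viaApplicative = (+) <$> [1, 2, 3] <*> [1, 2, 3]

-- The same computation written as a list comprehension.
viaComprehension :: [Int]
viaComprehension = [x + y | x <- [1, 2, 3], y <- [1, 2, 3]]

-- ZipList is a different Applicative for lists: it pairs elements
-- by position instead of forming all combinations.
viaZipList :: [Int]
viaZipList = getZipList ((+) <$> ZipList [1, 2, 3] <*> ZipList [1, 2, 3])
```

Loaded in GHCi, viaApplicative and viaComprehension both have 9 elements, while viaZipList has only 3.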
Instead, this is how I would change your code:
--- parse a generic tag
parseVal tag name = WrapArrow $ getXPathTrees (printf "%s" tag) >>> getAttrValue name
--- parse a "SCENARIO" xml element
parseScenario = unwrapArrow $ Scenario
<$> (WrapArrow $ getAttrValue "ID")
<*> (parseVal "SCENARIO/ORIGIN" "ID")
<*> (parseVal "SCENARIO/ORIGIN/ALARM_M" "NAME")
--- parse the XML, extract a list of SCENARIOS and, for each, apply parseScenario
testarr1= runX (parseXML xml >>> getXPathTrees (printf "/DATAS/SCENARIO" ) >>> parseScenario)
The result is what you wanted:
λ *Main > testarr1
[Scenario {scenario = "11111", origin = "333", alarm = "Sonde1"},Scenario {scenario = "22222", origin = "444", alarm = "Sonde2"}]
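The difference between the two versions can be modelled with plain lists, without HXT. The sketch below is only an analogy: ScenarioNode, document, wholeDocument and perScenario are hypothetical names, and each tuple stands in for one SCENARIO subtree of the sample XML. Querying the whole document yields three independent 2-element lists, which the Applicative multiplies out to 8 records; scoping the queries to one scenario at a time yields 1-element lists, so each scenario produces exactly one record.

```haskell
data Scenario = Scenario
  { scenario, origin, alarm :: String
  } deriving (Show, Eq)

-- Hypothetical stand-in for one SCENARIO subtree:
-- (scenario ID, origin IDs below it, alarm names below it).
type ScenarioNode = (String, [String], [String])

-- The same data as the sample XML in the question.
document :: [ScenarioNode]
document =
  [ ("11111", ["333"], ["Sonde1"])
  , ("22222", ["444"], ["Sonde2"])
  ]

-- Model of the question's code: every field is collected across the
-- whole document, so the three lists are combined in all possible ways.
wholeDocument :: [Scenario]
wholeDocument =
  Scenario <$> [i | (i, _, _) <- document]
           <*> concat [origins | (_, origins, _) <- document]
           <*> concat [alarms | (_, _, alarms) <- document]

-- Model of the answer's code: first select each scenario, then collect
-- the fields inside that scenario only.
perScenario :: [Scenario]
perScenario = concatMap build document
  where
    build (i, origins, alarms) = Scenario <$> [i] <*> origins <*> alarms
```

In GHCi, length wholeDocument is 8 and length perScenario is 2, matching the unwanted and the wanted output respectively.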