XML属性互值对的排序算法

问题描述 投票:4回答:3

前提:

我试图找到或者更确切地想出一个算法来解析几个XML文件并提取保存在FROM=XXTO=YY节点属性中的启动序列。

  • 有数百条记录,可以拆分成对
  • 每个人都有FROMTO
  • 每对的TO值是一个指标,表明有一对新的qzexswpoi值
  • 所以这些对将链并创建连续的FROM值。

棘手的部分是,配对可以分裂,分支,分成多个并在某一点连接。

XML:

FROM-TO

I can help to visualise that like the following diagram:

<FLOW>
    <TASK FROM ="B" TO ="C"/>
    <TASK FROM ="C" TO ="E1"/>
    <TASK FROM ="C" TO ="F1"/>
    <TASK FROM ="A1" TO ="A2"/>
    <TASK FROM ="A2" TO ="A3"/>
    <TASK FROM ="A3" TO ="C"/>
    <TASK FROM ="C" TO ="D1"/>
    <TASK FROM ="D2" TO ="D3"/>
    <TASK FROM ="D1" TO ="D2"/>
    <TASK FROM ="E1" TO ="E2"/>
    <TASK FROM ="Start" TO ="B"/>
    <TASK FROM ="E2" TO ="E3"/>
    <TASK FROM ="F1" TO ="F2"/>
    <TASK FROM ="F2" TO ="F3"/>
    <TASK FROM ="F3" TO ="G"/>
    <TASK FROM ="Start" TO ="A1"/>
    <TASK FROM ="G" TO ="Finish"/>
    <TASK FROM ="E3" TO ="G"/>
    <TASK FROM ="D3" TO ="G"/>
</FLOW>

期望的输出:

开始,A1,A2,A3,B,C,D1,D2,D3,E1,E2,E3,F1,F2,F3,G,完成

问题:

我有这个代码运行,但我不能让它以正确的顺序工作并克服分裂。

    Start
  /        \  
 [A1]       |
  |         |
 [A2]      [B]
  |         |
 [A3]       |
   \       /
      [C]
   /   |    \ 
[D1]  [E1]  [F1]      
 |     |     |
[D2]  [E2]  [F2]
 |     |     |
[D3]  [E3]  [F3]
 \     |    /
      [G]
       |
     Finish
java xml powershell sorting graph-algorithm
3个回答
1
投票

当然,我需要一段时间才能为您提供优化的解决方案。开始:

解决方案(图表!)

好的,你为什么不像在OOP语言中那样处理它:

<# 
    INIT Values, starting pair is always considered as combination of tasks where FROM is 'Start'
    All task are loaded in pairs, and the sequence begining is assembled.
#>

$Counter = [int]1
$Pairs = $File.SelectNodes('/FLOW/TASK[@FROM="Start"]')
$Tasks = $File.SelectNodes("/FLOW/TASK")

$Sequence = @()
ForEach ($Pair in $Pairs) {
    $Sequence += $Pair.TOTASK
}

<#
    Scan the tasks and find the matching pair for initial task pair, save it to the output array. 
#>

Do {
    ForEach ($Task in $Tasks) {
        ## Main loop counter, on matching pairs
        If ($Pairs.FROM -eq $Task.FROM) {
               $Counter++ 
        }
        ## Find current pair's TO in task and flag it as next pair 
        If ($Pairs.TO -eq $Task.FROM) {
            $NextTask = $Task.FROM
            $NextPairs = $File.SelectNodes("/FLOW/TASK[@FROM='$NextTask']")
            $Sequence += $Task.TO
        }
    }

    ## Set new pair to be evaluated
    $Pairs = $NextPairs
} 
While ($Counter -le $Tasks.Count)

测试证明!

我测试了它如下:

# Use a hashtable for O(1) lookup for a node by name
$Script:NodeTracker = @{}
class TaskNode {
    #==PROPS==================================================|
    [System.Collections.ArrayList]$then = @()
    [String] $Val
    [Bool]$Visited = $false
    [Collections.ArrayList]$Parent = @()
    #==CONSTRUCTORS===========================================|
    TaskNode([String]$Val) {$this.constructor($Val, $null)}
    TaskNode([String]$Val, [TaskNode]$Parent) {$this.constructor($Val, $Parent)}
    hidden constructor([String]$Val, [TaskNode]$Parent) {
        $This.Val = $Val
        if ($Parent) {$This.Parents.Add($Parent)}
        $Script:NodeTracker.$Val = $This
    }

    #==METHODS================================================|
    [TaskNode] To([String]$Val) {
        $Node = $Script:NodeTracker.$Val

        # If node doesn't exist, create and track
        if (!$Node) {$Node = New-Object TaskNode $Val}
        $This.then.Add($Node)
        $Node.Parent.Add($This)
        return $Node
    }
    [String] ToString() {return "$($this.val)-$(if($this.Visited){'✓'}else{'✘'})"}
    static [String] ReverseModdedBFS([Collections.Queue]$Queue) {
        $Output = ""
        [Collections.Queue]$NextQueue = @()
        do {
            while ($Queue.Count) {
                $Root = $Queue.Dequeue()
                # Write-Host "Root: $Root | Queue: [$Queue] | Next-Queue [$NextQueue] | Non-Visited [$($Root.then | {!$_.Visited})]"
                if ($Root.Visited) {continue}
                if ($Root.Then.Count -gt 1 -and ($Root.then | {!$_.Visited})) {$NextQueue.Enqueue($Root);continue}
                $Root.Visited = $true
                $Output += ','+$Root.Val
                $Root.Parent | % {
                    # Write-Host "    Enqueuing $_"
                    $Queue.Enqueue($_)
                }
            }
            If ($Queue.Count -eq 1 -and !$Queue.Peek().Parent.count) {break}
            $Queue = $NextQueue
            $NextQueue = @()
        } while($Queue.Count)
        $Out = $Output.Substring(1).split(',')
        [Array]::Reverse($Out)
        return $Out -join ','
    }
    #==STATICS=================================================|
    static [TaskNode] Fetch([String]$Val) {
        $Node = $Script:NodeTracker.$Val
        # If node doesn't exist, create and track
        if (!$Node) {return (New-Object TaskNode $Val)}
        return $Node
    }
    static [TaskNode[]] GetAll() {
        return @($Script:NodeTracker.Values)
    }
    static [TaskNode] GetStart() {
        $Nodes = [TaskNode]::GetAll() | {$_.Parent.count -eq 0}
        if ($Nodes.Count -gt 1) {throw 'There is more than one starting node!'}
        if (!$Nodes.Count) {throw 'There is no starting node!'}
        return @($Nodes)[0]
    }
    static [TaskNode[]] GetEnds() {
        $Nodes = [TaskNode]::GetAll() | {$_.Then.count -eq 0}
        if (!$Nodes.Count) {throw 'There is no ending node!'}
        return @($Nodes)
    }
}

function Parse-Edge($From, $To) {
    # Use the safe static accessor so that it will fetch the singleton instance of the node, or create and return one!
    [TaskNode]::Fetch($From).To($To)
}

function XML-Main([xml]$XML) {
    @($XML.Flow.Task) | % {Parse-Edge $_.From $_.To}
    [TaskNode]::ReverseModdedBFS([TaskNode]::GetEnds())
}

测试输出

#Create or Find root node 'O'
$Root = [TaskNode]::Fetch('O')

# Set up Chains! Please draw to test
$root.To('A').To('B').To('C').To('H').To('Z').To('M')
$Root.To('D').To('E').To('C').To('H').To('I').To('M')
$Root.To('F').To('G').To('C').To('H').To('J').To('M')
[TaskNode]::Fetch('C').To('K').To('L').To('M')

# Run BFS!
[TaskNode]::ReverseModdedBFS([TaskNode]::GetEnds())

Explanation, and Algo

我们使用Root: M-✘ | Queue: [] | Next-Queue [] | Non-Visited [] Enqueuing Z-✘ Enqueuing I-✘ Enqueuing J-✘ Enqueuing L-✘ Root: Z-✘ | Queue: [I-✘ J-✘ L-✘] | Next-Queue [] | Non-Visited [] Enqueuing H-✘ Root: I-✘ | Queue: [J-✘ L-✘ H-✘] | Next-Queue [] | Non-Visited [] Enqueuing H-✘ Root: J-✘ | Queue: [L-✘ H-✘ H-✘] | Next-Queue [] | Non-Visited [] Enqueuing H-✘ Root: L-✘ | Queue: [H-✘ H-✘ H-✘] | Next-Queue [] | Non-Visited [] Enqueuing K-✘ Root: H-✘ | Queue: [H-✘ H-✘ K-✘] | Next-Queue [] | Non-Visited [] Enqueuing C-✘ Enqueuing C-✘ Enqueuing C-✘ Root: H-✓ | Queue: [H-✓ K-✘ C-✘ C-✘ C-✘] | Next-Queue [] | Non-Visited [] Root: H-✓ | Queue: [K-✘ C-✘ C-✘ C-✘] | Next-Queue [] | Non-Visited [] Root: K-✘ | Queue: [C-✘ C-✘ C-✘] | Next-Queue [] | Non-Visited [] Enqueuing C-✘ Root: C-✘ | Queue: [C-✘ C-✘ C-✘] | Next-Queue [] | Non-Visited [] Enqueuing B-✘ Enqueuing E-✘ Enqueuing G-✘ Root: C-✓ | Queue: [C-✓ C-✓ B-✘ E-✘ G-✘] | Next-Queue [] | Non-Visited [] Root: C-✓ | Queue: [C-✓ B-✘ E-✘ G-✘] | Next-Queue [] | Non-Visited [] Root: C-✓ | Queue: [B-✘ E-✘ G-✘] | Next-Queue [] | Non-Visited [] Root: B-✘ | Queue: [E-✘ G-✘] | Next-Queue [] | Non-Visited [] Enqueuing A-✘ Root: E-✘ | Queue: [G-✘ A-✘] | Next-Queue [] | Non-Visited [] Enqueuing D-✘ Root: G-✘ | Queue: [A-✘ D-✘] | Next-Queue [] | Non-Visited [] Enqueuing F-✘ Root: A-✘ | Queue: [D-✘ F-✘] | Next-Queue [] | Non-Visited [] Enqueuing O-✘ Root: D-✘ | Queue: [F-✘ O-✘] | Next-Queue [] | Non-Visited [] Enqueuing O-✘ Root: F-✘ | Queue: [O-✘ O-✘] | Next-Queue [] | Non-Visited [] Enqueuing O-✘ Root: O-✘ | Queue: [O-✘ O-✘] | Next-Queue [] | Non-Visited [] Root: O-✓ | Queue: [O-✓] | Next-Queue [] | Non-Visited [] Root: O-✓ | Queue: [] | Next-Queue [] | Non-Visited [] O,F,D,A,G,E,B,C,K,H,L,J,I,Z,M 使用一些漂亮的OOP技巧来绘制彼此的所有边缘。然后我们从所有汇聚节点(即没有子节点的节点)反向遍历图形。我们继续做graph,直到我们遇到一个节点:

  • 有1个以上的孩子
  • AND,有超过0个未访问的后代
  • 如果是这样,请为下一轮BFS添加!

重复此操作,直到当前和将来的队列为空,在这种情况下,您的输出现在已完成。现在:

  • 用逗号分隔
  • 反向数组(因为我们进行了反向遍历)
  • 打印输出!

0
投票

此脚本将打印出您想要的内容。它是独立的,因此您可以将其全部复制并运行它。还有改进的余地!

一件事:这假设一次分裂,就像你的样本一样。如果可能出现这种情况,则需要更多逻辑:

BFS

 |     |       |
 |     |       |
[D3]  [E3]   [F3]
 \     | \    /
  \    | [H] /              - spit from C is not resolved, F1, F2 and F3 not handled
   \   | /  /  
      [G]
       |
     Finish

0
投票

您的数据似乎受$script:xml = [xml] @" <FLOW> <TASK FROM ="B" TO ="C"/> <TASK FROM ="C" TO ="E1"/> <TASK FROM ="C" TO ="F1"/> <TASK FROM ="A1" TO ="A2"/> <TASK FROM ="A2" TO ="A3"/> <TASK FROM ="A3" TO ="C"/> <TASK FROM ="C" TO ="D1"/> <TASK FROM ="D2" TO ="D3"/> <TASK FROM ="D1" TO ="D2"/> <TASK FROM ="E1" TO ="E2"/> <TASK FROM ="Start" TO ="B"/> <TASK FROM ="E2" TO ="E3"/> <TASK FROM ="F1" TO ="F2"/> <TASK FROM ="F2" TO ="F3"/> <TASK FROM ="F3" TO ="G"/> <TASK FROM ="Start" TO ="A1"/> <TASK FROM ="G" TO ="Finish"/> <TASK FROM ="E3" TO ="G"/> <TASK FROM ="D3" TO ="G"/> </FLOW> "@ Function GetNextNode { param($node) $nextNode = $xml.FLOW.TASK | Where-Object {$_.FROM -eq $node.TO} | Sort-Object TO return $nextNode } Function GetPrevNode { param($node) $nextNode = $xml.FLOW.TASK | Where-Object {$_.TO -eq $node.FROM} | Sort-Object TO return $nextNode } $startNode = $xml.FLOW.TASK | Where-Object {$_.FROM -eq "Start"} | Sort-Object TO do{ # deal with the start node as it's a special case if(-not [string]::IsNullOrEmpty($startNode)){ # start node is the start of the chain [array]$mainChain = $startNode[0] # if you have two or more nodes, store them for now. They will be referenced later if($startNode.Count -gt 1){ [array]$splitterNode = $startNode } # take the first start node and find out which node it leads to [array]$nextNode = GetNextNode -node $startNode[0] # add one of the nodes. set $oldNode for next iteration [array]$mainChain += $nextNode[0] [array]$oldNode = $nextNode[0] # this is only for the first node, don't run through again $startNode = $null continue } # get the next node AND previus nodes for that next node [array]$nextNode = GetNextNode -node $oldNode[0] [array]$prevNode = GetPrevNode -node $nextNode[0] if($prevNode.Count -ne 1){ # if there are multiple previous nodes, go back and deal with them # to do this, go back to the $splitterNode if(-not [string]::IsNullOrEmpty($splitterNode)){ # exclude anything already added [array]$remainingNodes = $splitterNode | Where-Object {$_ -notin $mainChain} # if that leaves only one node, others have been dealt with # there is no longer a split if($remainingNodes.Count -eq 1){ $splitterNode = $null } [array]$oldNode = $remainingNodes[0] $mainChain += $remainingNodes[0] continue }else{ # if there is no $splitterNode, all previous nodes should already be in the chain # check foreach($node in $prevNode){ if($node -notin $mainChain){ Write-Host "Broken chain" Exit } } } } # if this node is a splitter, set it so it can be referenced later if($nextNode.Count -gt 1){ $splitterNode = $nextNode } # add this node to the chain. [array]$mainChain += $nextNode[0] [array]$oldNode = $nextNode[0] }while($oldNode.To -ne "Finish") ($mainChain.FROM | Select-Object -Unique) + "Finish" | Out-Host 的影响,并且具有最大和最少的元素。你需要的是一个partial order

下面是一种方法(在C ++中)。与接受答案中的广度优先搜索相比,这是深度优先搜索。这可能更容易阅读。

topological sort
© www.soinside.com 2019 - 2024. All rights reserved.