How can I programmatically and efficiently generate a very large table of test data?


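Goal:

  • A .csv with 17 columns filled with values from 0 to 100, in increments of 5.
  • The .csv should record every row whose values sum to 100 (only, and all of, the combinations that total 100%).
  • For example, valid entries are: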

START

0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,100

0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,95

0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,90

...

0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,100,0

...

5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,20

...

100,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

END

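Background, if relevant:

I am currently using AutoIT to populate a .csv file for machine-learning tests on index-based portfolios. That is, I want to test a model, which I trained on some manually created training data, against every combination of 17 index funds.

I've come to realize that the script I'm using may take weeks (or months) to finish. I let it run overnight; it evaluated billions of combinations, recorded 880K rows, and got through only a third of the columns. I believe the growth is exponential, so this does not look like a viable approach.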
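
A rough count shows why the nested loops stall. Each of the 17 columns can take 21 values (0, 5, ..., 100), so the loops visit 21^17 candidates, while the number of rows that actually sum to 100 is the number of ways to split 20 units of 5% across 17 columns, which the stars-and-bars formula gives as C(36, 16). The sketch below is in Python rather than AutoIT, and the function names are illustrative only:

```python
from math import comb

STEP = 5      # increment, in percent
TOTAL = 100   # every row must sum to this
FUNDS = 17    # number of columns

UNITS = TOTAL // STEP   # 20 units of 5% to distribute
LEVELS = UNITS + 1      # 21 possible values per column


def brute_force_candidates(funds=FUNDS, levels=LEVELS):
    """Rows visited by fully nested loops: every value in every column."""
    return levels ** funds


def valid_rows(funds=FUNDS, units=UNITS):
    """Rows that sum to TOTAL: compositions of `units` into `funds` parts."""
    return comb(units + funds - 1, funds - 1)


if __name__ == "__main__":
    print("candidates tested by nested loops:", brute_force_candidates())
    print("rows that sum to 100:", valid_rows())
```

The valid rows come to about 7.3 billion, against roughly 3 × 10^22 loop iterations. Even the shortest row (sixteen zeros and 100) takes 36 bytes with its newline, so the complete file would be on the order of a few hundred gigabytes.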

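I don't mind using a different method to generate the .csv file, but I'm sure there's a better way to do this so that the file finishes sooner?

Optimizations to my AutoIT (SciTE) code or suggestions for better tools are both welcome. I'm not a professional coder and have only a basic understanding.

The code I'm currently using with AutoIT (SciTE editor) is here: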
HotKeySet("{ESC}", "Terminate") ; press ESC to stop script
Func Terminate()
   Exit
EndFunc

Local $testnum = 0

Local $minFund = 1
Local $maxFund = 17
Local $increment = 5; fund percentage of portfolio to increment

; total stock market
Local $minTSM = 0
Local $maxTSM = 100
Local $TSM

; large cap blend
Local $minLCB = 0
Local $maxLCB = 100
Local $LCB

; large cap value
Local $minLCV = 0
Local $maxLCV = 100
Local $LCV

; large cap growth
Local $minLCG = 0
Local $maxLCG = 100
Local $LCG

; mid cap blend
Local $minMCB = 0
Local $maxMCB = 100
Local $MCB

; mid cap value
Local $minMCV = 0
Local $maxMCV = 100
Local $MCV

; mid cap growth
Local $minMCG = 0
Local $maxMCG = 100
Local $MCG

; small cap blend
Local $minSCB = 0
Local $maxSCB = 100
Local $SCB

; small cap value
Local $minSCV = 0
Local $maxSCV = 100
Local $SCV

; small cap growth
Local $minSCG = 0
Local $maxSCG = 100
Local $SCG

; long term bond
Local $minLTB = 0
Local $maxLTB = 100
Local $LTB

; intermediate term bond
Local $minITB = 0
Local $maxITB = 100
Local $ITB

; short term bond
Local $minSTB = 0
Local $maxSTB = 100
Local $STB

; treasury bills
Local $minBIL = 0
Local $maxBIL = 100
Local $BIL

; real estate
Local $minREIT = 0
Local $maxREIT = 100
Local $REIT

; commodities
Local $minCOM = 0
Local $maxCOM = 100
Local $COM

; gold
Local $minGLD = 0
Local $maxGLD = 100
Local $GLD

$TSM = $minTSM
While $TSM <= $maxTSM
   $LCB = $minLCB
   While $LCB <= $maxLCB
      $LCV = $minLCV
      While $LCV <= $maxLCV
         $LCG = $minLCG
         While $LCG <= $maxLCG
            $MCB = $minMCB
            While $MCB <= $maxMCB
               $MCV = $minMCV
               While $MCV <= $maxMCV
                  $MCG = $minMCG
                  While $MCG <= $maxMCG
                     $SCB = $minSCB
                     While $SCB <= $maxSCB
                        $SCV = $minSCV
                        While $SCV <= $maxSCV
                           $SCG = $minSCG
                           While $SCG <= $maxSCG
                              $LTB = $minLTB
                              While $LTB <= $maxLTB
                                 $ITB = $minITB
                                 While $ITB <= $maxITB
                                    $STB = $minSTB
                                    While $STB <= $maxSTB
                                       $BIL = $minBIL
                                       While $BIL <= $maxBIL
                                          $REIT = $minREIT
                                          While $REIT <= $maxREIT
                                             $COM = $minCOM
                                             While $COM <= $maxCOM
                                                $GLD = $minGLD
                                                While $GLD <= $maxGLD
                                                   $testnum = $testnum + 1
                                                   If $TSM + $LCB + $LCV + $LCG + $MCB + $MCV + $MCG + $SCB + $SCV + $SCG + $LTB + $ITB + $STB + $BIL + $REIT + $COM + $GLD = 100 Then
                                                      FileWrite("C:\Users\MyName\Desktop\predict.csv",$testnum & "," & $TSM & "," & $LCB & "," & $LCV & "," & $LCG & "," & $MCB & "," & $MCV & "," & $MCG & "," & $SCB & "," & $SCV & "," & $SCG & "," & $LTB & "," & $ITB & "," & $STB & "," & $BIL & "," & $REIT & "," & $COM & "," & $GLD & "," & @HOUR & ":" & @MIN &  ":" & @SEC & @CRLF)
                                                   EndIf
                                                   $GLD = $GLD + $increment
                                                WEnd
                                                $COM = $COM + $increment
                                             WEnd
                                             $REIT = $REIT + $increment
                                          WEnd
                                          $BIL = $BIL + $increment
                                       WEnd
                                       $STB = $STB + $increment
                                    WEnd
                                    $ITB = $ITB + $increment
                                 WEnd
                                 $LTB = $LTB + $increment
                              WEnd
                              $SCG = $SCG + $increment
                           WEnd
                           $SCV = $SCV + $increment
                        WEnd
                        $SCB = $SCB + $increment
                     WEnd
                     $MCG = $MCG + $increment
                  WEnd
                  $MCV = $MCV + $increment
               WEnd
               $MCB = $MCB + $increment
            WEnd
            $LCG = $LCG + $increment
         WEnd
         $LCV = $LCV + $increment
      WEnd
      $LCB = $LCB + $increment
   WEnd
   $TSM = $TSM + $increment
WEnd

Send ("COMPLETED")


arrays csv bigdata
1 Answer

This prints all 7.3 billion rows in about 6 minutes on a Mac. I'm not sure how useful it will be to anyone :-)
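
The program this answer refers to is not reproduced here, so what follows is only a minimal sketch of one way to enumerate the rows directly, not the answerer's code. It assumes the idea of handing out the 20 units of 5% column by column and letting the last column take whatever remains, so every generated row already sums to 100 and no candidate is ever rejected. The names `compositions`, `format_row`, and `write_rows` are hypothetical, and Python is far slower than a compiled language, so the six-minute figure should not be expected from this version:

```python
import sys


def compositions(total_units, columns):
    """Yield every tuple of `columns` non-negative ints summing to `total_units`."""
    if columns == 1:
        # The last column has no freedom: it takes the remainder.
        yield (total_units,)
        return
    for first in range(total_units + 1):
        for rest in compositions(total_units - first, columns - 1):
            yield (first,) + rest


def format_row(combo, step=5):
    """Turn a tuple of unit counts into one CSV line of percentages."""
    return ",".join(str(units * step) for units in combo)


def write_rows(out, total=100, step=5, columns=17, limit=None):
    """Write rows to the file-like object `out`; stop after `limit` rows if given."""
    count = 0
    for combo in compositions(total // step, columns):
        if limit is not None and count >= limit:
            break
        out.write(format_row(combo, step) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Print only the first few rows; the full run produces billions of lines.
    write_rows(sys.stdout, limit=5)
```

The rows come out in the same order as the examples in the question, starting with sixteen zeros followed by 100. Passing an open file instead of `sys.stdout` and dropping `limit` would write the complete table.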
