awk: print all columns whose header contains a matching substring


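I want to print the columns whose header partially matches a string (Phytophthora). The file is comma-separated, and I would prefer to use awk, but all options are welcome :)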

Here are the first ten lines of my file:

head file
index,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_plurivora,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Pythiales;f__Pythiaceae;g__Pythium;s__Pythium_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_psychrophila,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Saprolegniales;f__Saprolegniaceae;g__unidentified;s__Saprolegniaceae_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Pythiales;f__Pythiaceae;g__Pythium;s__Pythium_grandisporangium,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytopythium;s__Phytopythium_citrinum,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Saprolegniales;f__Saprolegniaceae;g__Leptolegnia;s__Leptolegnia_sp,k__Viridiplantae;p__Anthophyta;c__Eudicotyledonae;o__Fagales;f__Fagaceae;g__Fagus;__,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Pythiales;f__Pythiaceae;g__Globisporangium;s__Globisporangium_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;__,Unassigned;__;__;__;__;__;__,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Saprolegniales;f__Saprolegniaceae;g__Aphanomyces;s__Aphanomyces_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Saprolegniales;f__Saprolegniaceae;g__Pythiopsis;s__Pythiopsis_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Saprolegniales;f__Saprolegniaceae;g__Saprolegnia;s__Saprolegnia_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Pythiales;f__unidentified;g__unidentified;s__Pythiales_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Pythiales;f__Pythiaceae;g__Globisporangium;s__Globisporangium_heterothallicum,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Pythiales;f__Pythiaceae;g__Globisporangium;__,k__Fungi;p__Ascomycota;c__Dothideomyc
etes;o__Pleosporales;f__Pleosporales_fam_Incertae_sedis;g__Cheiromyces;s__Cheiromyces_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporaceae;g__Peronospora;s__Peronospora_chrysosplenii,k__Fungi;p__Ascomycota;c__Sordariomycetes;o__Hypocreales;f__Hypocreaceae;g__Trichoderma;__,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporaceae;g__Hyaloperonospora;s__Hyaloperonospora_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_obscura,k__Stramenopila;p__Ochrophyta;c__unidentified;o__unidentified;f__unidentified;g__unidentified;s__Ochrophyta_sp,k__Fungi;p__Ascomycota;c__Orbiliomycetes;o__Orbiliales;f__Orbiliaceae;__;__,k__Stramenopila;__;__;__;__;__;__,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Saprolegniales;f__Saprolegniaceae;__;__,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Pythiales;f__Pythiaceae;g__Globisporangium;s__Globisporangium_rostratifingens,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Pythiales;f__Pythiaceae;__;__,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_castanetorum,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_aleatoria,k__Fungi;p__Ascomycota;c__Orbiliomycetes;o__Orbiliales;f__Orbiliaceae;g__Dactylella;s__Dactylella_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Pythiales;f__Pythiaceae;g__unidentified;s__Pythiaceae_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_vulcanica,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_europaea,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytopythium;s__Phytopythium_boreale,k__Fungi;p__Ascomycota;c__Dothideomycetes;o__Ple
osporales;f__Dictyosporiaceae;g__Dictyosporium;s__Dictyosporium_wuyiense,k__Stramenopila;p__Oomycota;c__Oomycetes;o__unidentified;f__unidentified;g__unidentified;s__Oomycetes_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;__;__;__;__,k__Stramenopila;p__unidentified;c__unidentified;o__unidentified;f__unidentified;g__unidentified;s__Stramenopila_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Pythiales;f__Pythiaceae;g__Pythium;s__Pythium_chondricola,k__Fungi;p__Ascomycota;c__Dothideomycetes;o__Pleosporales;f__Dictyosporiaceae;g__Digitodesmium;s__Digitodesmium_intermedium,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Saprolegniales;f__Saprolegniaceae;g__Aplanopsis;s__Aplanopsis_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Pythiales;f__Pythiaceae;g__Pythium;__,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_crassamura,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Pythiales;f__Pythiaceae;g__Pythium;s__Pythium_caudatum,k__Alveolata;p__Dinophyta;c__unidentified;o__unidentified;f__unidentified;g__unidentified;s__Dinophyta_sp,k__Viridiplantae;p__Anthophyta;c__Eudicotyledonae;o__Fagales;f__Fagaceae;g__Fagus;s__Fagus_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Pythiales;f__Pythiaceae;g__Pythium;s__Pythium_graminicola,k__Fungi;p__Ascomycota;c__Dothideomycetes;o__Pleosporales;f__Dictyosporiaceae;g__Pseudodictyosporium;s__Pseudodictyosporium_indicum,k__Amoebozoa;p__Variosea;c__unidentified;o__unidentified;f__unidentified;g__unidentified;s__Variosea_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporaceae;g__Peronospora;s__Peronospora_arvensis,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Oomycetes_ord_Incertae_sedis;f__Lagenaceae;g__Lagena;s__Lagena_sp,k__Viridiplantae;p__Anthophyta;c__Eudicotyledonae;o__Fagales;f__Fagaceae;g__Fagus;s__Fagus_sylvatica,k__Fungi;p__Ascomycota;c__Orbiliomycetes;o__Orbiliales;f__Orbiliaceae;g__Orbilia;s__Orbilia_auricolor,k__Fungi;p__Ascomycota;c__Dothi
deomycetes;o__Pleosporales;f__Cucurbitariaceae;g__Neocucurbitaria;s__Neocucurbitaria_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Pythiales;f__Pythiaceae;g__Globisporangium;s__Globisporangium_parvum,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Pythiales;__;__;__,k__Fungi;p__Mucoromycota;c__Umbelopsidomycetes;o__Umbelopsidales;f__Umbelopsidaceae;g__Umbelopsis;s__Umbelopsis_vinacea,absolute-filepath
11,0.0,34032.0,0.0,0.0,0.0,0.0,0.0,99.0,0.0,0.0,0.0,6742.0,0.0,3261.0,0.0,0.0,0.0,0.0,0.0,0.0,5793.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,608.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,11.reads.fastq.gz
12,2236.0,54019.0,0.0,0.0,8680.0,103.0,0.0,0.0,0.0,0.0,0.0,0.0,1345.0,0.0,0.0,0.0,1283.0,0.0,0.0,0.0,0.0,0.0,1071.0,0.0,0.0,0.0,1606.0,0.0,478.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,456.0,0.0,0.0,0.0,0.0,0.0,0.0,99.0,0.0,0.0,0.0,0.0,464.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,12.reads.fastq.gz
13,0.0,73152.0,0.0,0.0,0.0,0.0,0.0,131.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,13.reads.fastq.gz
14,0.0,23345.0,0.0,0.0,0.0,0.0,0.0,24.0,0.0,0.0,0.0,50.0,0.0,0.0,1468.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5041.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,98.0,0.0,0.0,0.0,0.0,0.0,0.0,707.0,0.0,0.0,0.0,0.0,81.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,14.reads.fastq.gz
15,0.0,42605.0,0.0,0.0,0.0,0.0,0.0,3590.0,0.0,0.0,0.0,12066.0,0.0,1729.0,12188.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,201.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1696.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,15.reads.fastq.gz
16,0.0,57495.0,0.0,0.0,2581.0,0.0,0.0,23.0,860.0,0.0,0.0,574.0,218.0,523.0,346.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3402.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,977.0,0.0,0.0,7.0,385.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,16.reads.fastq.gz
17,0.0,24558.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1807.0,0.0,0.0,68.0,145.0,0.0,9700.0,7040.0,0.0,0.0,0.0,0.0,0.0,0.0,5548.0,0.0,0.0,0.0,166.0,1292.0,0.0,2598.0,0.0,0.0,0.0,0.0,0.0,0.0,678.0,1563.0,0.0,0.0,0.0,0.0,0.0,0.0,157.0,0.0,0.0,33.0,0.0,0.0,0.0,295.0,0.0,0.0,0.0,0.0,165.0,0.0,17.reads.fastq.gz
18,0.0,55229.0,16003.0,0.0,10674.0,0.0,0.0,0.0,11626.0,485.0,0.0,0.0,841.0,0.0,0.0,20.0,84.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,779.0,905.0,0.0,0.0,0.0,0.0,0.0,0.0,168.0,0.0,24.0,120.0,0.0,0.0,0.0,0.0,0.0,588.0,0.0,0.0,0.0,101.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,18.reads.fastq.gz
1,98039.0,268.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2495.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.reads.fastq.gz

This is what I would like to see:

k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_plurivora,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_sp
0.0,0.0
2236.0,0.0
0.0,0.0
0.0,0.0
0.0,0.0
0.0,0.0
0.0,0.0
0.0,16003.0

I can print the last match, but not all of the matches:

awk -F',' -v word="Phytophthora" 'NR==1{for(i=1;i<=NF;i++){ if ($i ~ word) c=i } } c{print $c}' file

k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_crassamura
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
1567.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0

I also tried storing the matching columns in a variable and then passing them to a second awk command:

variable=$(awk -F ','  '/Phytophthora/ { for (i=1;i<=NF;i++) if ($i ~ "Phytophthora") print "$"i }' file| awk 'BEGIN { ORS = " " } { print }')
awk -v var=$variable '{print var}'
$2 $4 $5 $12 $24 $31 $32 $35 $36 $46 
$2 $4 $5 $12 $24 $31 $32 $35 $36 $46 
$2 $4 $5 $12 $24 $31 $32 $35 $36 $46 
$2 $4 $5 $12 $24 $31 $32 $35 $36 $46 
$2 $4 $5 $12 $24 $31 $32 $35 $36 $46 
awk match
1 Answer

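Since you may have more than one matching column, you need to modify the NR==1 processing to keep track of the column numbers (e.g., via an array).

One idea: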

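awk -v word="Phytophthora" '
BEGIN { FS = OFS = "," }
# Header line: remember the number of every column whose name matches word
NR==1 { for (i=1; i<=NF; i++)
            if ($i ~ word)
               col_list[++c]=i
      }
# Every line (header included): print only the remembered columns
      { for (i=1; i<=c; i++)
            printf "%s%s", $(col_list[i]), (i<c ? OFS : ORS)
      }
' file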

This generates:

k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_plurivora,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_sp,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_psychrophila,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;__,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_obscura,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_castanetorum,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_aleatoria,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_vulcanica,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_europaea,k__Stramenopila;p__Oomycota;c__Oomycetes;o__Peronosporales;f__Peronosporales_fam_Incertae_sedis;g__Phytophthora;s__Phytophthora_crassamura
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2236.0,0.0,0.0,0.0,1071.0,0.0,0.0,0.0,0.0,0.0
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
0.0,0.0,0.0,0.0,0.0,5041.0,0.0,0.0,0.0,0.0
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
0.0,0.0,0.0,0.0,0.0,0.0,2598.0,0.0,0.0,0.0
0.0,16003.0,0.0,0.0,0.0,905.0,0.0,0.0,0.0,0.0
98039.0,0.0,0.0,2495.0,0.0,0.0,0.0,0.0,0.0,0.0
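
To see why the first attempt in the question prints only one column, here is a minimal sketch on a made-up file. The file name sample.csv, its headers, and the search word "Phyto" are assumptions for illustration only, not the OP's data. A scalar c is overwritten by every matching header, so only the last match survives, while an array keeps each matching column number.

```shell
# Hypothetical toy input: two headers contain "Phyto", one does not.
cat > sample.csv <<'EOF'
index,g__Phyto_a,g__Pythium,g__Phyto_b,path
11,1.0,2.0,3.0,11.fq
12,4.0,5.0,6.0,12.fq
EOF

# Scalar version (as in the question): c ends up as the LAST matching column.
last_only=$(awk -F',' -v word="Phyto" '
NR==1 { for (i=1; i<=NF; i++) if ($i ~ word) c=i }
c     { print $c }
' sample.csv)

# Array version (as in the answer): every matching column number is kept.
all_cols=$(awk -v word="Phyto" '
BEGIN { FS = OFS = "," }
NR==1 { for (i=1; i<=NF; i++) if ($i ~ word) cols[++n]=i }
      { for (i=1; i<=n; i++) printf "%s%s", $(cols[i]), (i<n ? OFS : ORS) }
' sample.csv)

printf '%s\n---\n%s\n' "$last_only" "$all_cols"
```

Note that $i ~ word treats word as a regular expression. If the search string might contain regex metacharacters, index($i, word) > 0 is a literal-substring test that could be used in its place.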

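Notes:

  • This code finds 10 columns whose headers contain the string Phytophthora
  • This differs from OP's expected output of 2 columns
  • However, OP's last block of awk|awk code shows that there are 10 columns whose headers contain the string Phytophthora
  • At this point I have to assume that OP's expected output (2 columns) is incorrect and should actually consist of 10 columns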
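
On the second attempt in the question: a value passed with -v is just a string, so awk prints text such as $2 $4 literally and never evaluates it as field references. If a two-step approach is still preferred, one possible sketch is to pass the column numbers instead and split them into an array. The file sample2.csv, its headers, and the variable names below are hypothetical, chosen only for illustration.

```shell
# Hypothetical toy input (same shape as the OP's file, much smaller).
cat > sample2.csv <<'EOF'
index,g__Phyto_a,g__Pythium,g__Phyto_b,path
11,1.0,2.0,3.0,11.fq
12,4.0,5.0,6.0,12.fq
EOF

# Step 1: collect the matching column NUMBERS from the header, comma-joined.
cols=$(awk -F',' -v word="Phyto" '
NR==1 { for (i=1; i<=NF; i++) if ($i ~ word) s = s (s=="" ? "" : ",") i
        print s; exit }
' sample2.csv)

# Step 2: pass the list as a plain string and split it into an array.
picked=$(awk -v list="$cols" '
BEGIN { FS = OFS = ","; n = split(list, col, ",") }
      { for (i=1; i<=n; i++) printf "%s%s", $(col[i]), (i<n ? OFS : ORS) }
' sample2.csv)

# The same number list also works with cut, if awk is not required.
picked_cut=$(cut -d',' -f"$cols" sample2.csv)

printf '%s\n%s\n' "$cols" "$picked"
```

The single-pass array version in the answer avoids reading the file twice. The cut variant assumes at least one header matched, since cut rejects an empty field list.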