How to manipulate data to find the ratio of filled jobs to unique recruiters, broken down by fill date [duplicate]


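I am trying to find a simple way to break down the ratio of filled job IDs to unique recruiters by [Group], [geography], and [Filled]. This will eventually scale to tens of thousands of rows, but I can't work out the basic code for the task before scaling up.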

Example:

Sales in Europe for FY23_Q1 has 30 job IDs filled by 5 unique recruiters, so the ratio is 30 / 5 = 6.


Sample data:

structure(list(`Job ID` = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 
28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 
44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 
60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 
76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 
92, 93, 94, 95, 96, 97, 98), Group = c("Sales", "IT", "Sales", 
"HR", "Sales", "IT", "IT", "IT", "IT", "IT", "IT", "HR", "Product", 
"Product", "Product", "Sales", "IT", "Sales", "Finance", "IT", 
"IT", "IT", "IT", "Sales", "Sales", "Sales", "Sales", "Sales", 
"IT", "Sales", "Marketing", "Sales", "Sales", "Sales", "Sales", 
"Sales", "Marketing", "Sales", "Sales", "Sales", "Sales", "IT", 
"IT", "Sales", "Sales", "Sales", "Sales", "Sales", "Sales", "Sales", 
"HR", "HR", "Marketing", "IT", "IT", "Sales", "Sales", "IT", 
"Sales", "Sales", "Sales", "Product", "Finance", "IT", "IT", 
"IT", "IT", "IT", "Sales", "Sales", "Sales", "Sales", "Finance", 
"Sales", "Sales", "IT", "IT", "Sales", "Sales", "Sales", "IT", 
"Sales", "IT", "IT", "Sales", "IT", "Finance", "IT", "IT", "IT", 
"Sales", "Sales", "IT", "Sales", "Sales", "Sales", "IT", "Legal"
), geography = c("Europe", "US", "Europe", "US", "US", "US", 
"US", "Asia", "US", "US", "US", "US", "US", "US", "US", "US", 
"US", "US", "US", "Asia", "Asia", "Asia", "US", "Asia", "Asia", 
"Asia", "US", "US", "US", "Europe", "Europe", "Europe", "Asia", 
"US", "Europe", "US", "US", "Europe", "Europe", "US", "US", "US", 
"Asia", "Asia", "Asia", "Asia", "US", "US", "US", "US", "US", 
"US", "US", "US", "US", "US", "US", "US", "Europe", "US", "Europe", 
"US", "US", "US", "US", "US", "US", "US", "US", "US", "US", "US", 
"US", "Europe", "Europe", "US", "US", "Asia", "US", "US", "US", 
"Europe", "Asia", "US", "US", "US", "Asia", "US", "US", "US", 
"Europe", "US", "US", "US", "Asia", "Europe", "Asia", "US"), 
    recruiter = c("Alan Omalney", "Alex Andrew", "Aline Chu", 
    "Allison Jones", "Andrew Smith", "Andrew Wong", "Andrew Johnson", 
    "Anita watson", "Lisa Chu", "Lisa Chu", "Lisa Chu", "Steve Han", 
    "Steve Han", "Ben Austin", "Bill Kim", "Tom Murphy", "Jessica Navarro", 
    "Carson Bill", "Cathy Sanchez", "Abib Qu", "Abib Qu", "Abib Qu", 
    "Michael Wong", "Chris Lee", "Chris Lee", "Chris Lee", "Chris Ramirez", 
    "Chris Ramirez", "Chris Moon", "Eric Rodriguez", "Eric Rodriguez", 
    "Luke Jones", "Larry David", "Dino Martinez", "Tania Velasquez", 
    "Dom Ruli", "Liz Black", "Emily Crown", "Emily Crown", "Steven Johnson", 
    "George Michael", "Sage Young", "Jack Ma", "Jack Ma", "Jack Ma", 
    "Jack Ma", "Jim Buck", "Kavita Van", "Kareem Johnson", "Kareem Johnson", 
    "Kathy Johnson", "Kelly Sanchez", "Kirstin Apple", "Laura Collins", 
    "Laruren Chi", "Larry Tang", "Lee Corso", "Lena Michaels", 
    "Lex Pretty", "Lily Oh", "Lucy Young", "Maggy Green", "Mary Diaz", 
    "Megan Gekko", "Megan Gekko", "Megan Gekko", "Megan Gekko", 
    "Megan Gekko", "Megan Hoqser", "Megan Xie", "Molly Grey", 
    "Monica Black", "Naomi Okata", "Natalie Alex", "Nathalie Yusef", 
    "Nico Carden", "Nikki Flower", "Nira Nora", "Olivia Fives", 
    "Olivia Fives", "Paul Smith", "Risa Mowow", "Reggie Bush", 
    "Sam Cooke", "Sasha vivi", "Shawn Wong", "Shobit Alex", "Stacy Beck", 
    "Stacy Beck", "Sydeny Rapp", "Tate Scoops", "Derick Price", 
    "Taylor Hellam", "Trisitn Moore", "Val Mago", "Valencia Marcy", 
    "Victor Pico", "Jasime Q"), Filled = c("FY23_Q1", "FY22_Q4", 
    "FY21_Q3", "FY22_Q2", "FY22_Q2", "FY21_Q2", "FY22_Q4", "FY23_Q1", 
    "FY22_Q3", "FY20_Q2", "FY21_Q2", "FY24_Q1", "FY22_Q2", "FY22_Q4", 
    "FY22_Q2", "FY22_Q2", "FY20_Q1", "FY22_Q4", "FY22_Q2", "FY22_Q4", 
    "FY20_Q2", "FY22_Q3", "FY22_Q4", "FY21_Q4", "FY22_Q2", "FY22_Q1", 
    "FY21_Q3", "FY21_Q4", "FY22_Q1", "FY24_Q1", "FY24_Q3", "FY24_Q3", 
    "FY23_Q3", "FY22_Q2", "FY24_Q1", "FY21_Q3", "FY22_Q3", "FY24_Q1", 
    "FY23_Q4", "FY22_Q4", "FY21_Q4", "FY20_Q2", "FY22_Q4", "FY22_Q1", 
    "FY22_Q1", "FY21_Q4", "FY21_Q4", "FY22_Q1", "FY24_Q3", "FY22_Q3", 
    "FY22_Q3", "FY22_Q2", "FY24_Q1", "FY22_Q3", "FY22_Q2", "FY22_Q1", 
    "FY24_Q2", "FY22_Q1", "FY23_Q1", "FY21_Q4", "FY21_Q3", "FY24_Q2", 
    "FY21_Q4", "FY21_Q4", "FY22_Q2", "FY22_Q3", "FY22_Q3", "FY22_Q4", 
    "FY24_Q3", "FY21_Q4", "FY23_Q4", "FY22_Q3", "FY21_Q4", "FY24_Q3", 
    "FY22_Q2", "FY22_Q1", "FY24_Q2", "FY22_Q3", "FY22_Q3", "FY22_Q2", 
    "FY22_Q2", "FY23_Q2", "FY22_Q2", "FY23_Q2", "FY22_Q2", "FY20_Q1", 
    "FY22_Q4", "FY21_Q4", "FY21_Q4", "FY22_Q4", "FY22_Q3", "FY23_Q1", 
    "FY22_Q3", "FY24_Q1", "FY22_Q3", "FY22_Q4", "FY21_Q3", "FY22_Q4"
    )), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-98L))
r dataframe dplyr
1 Answer

Using the tidyverse package (or just dplyr):

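library(dplyr)

# df is assumed to hold the sample data from the question
out <- df %>%
  group_by(Group, geography, Filled) %>%
  summarize(
    # count distinct job IDs and distinct recruiters within each group
    no.JobID = length(unique(`Job ID`)),
    no.recruiter = length(unique(recruiter)),
    .groups = "drop"  # return an ungrouped result and silence the grouping message
  ) %>%
  mutate(
    # filled jobs per unique recruiter
    ratio = no.JobID / no.recruiter
  )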

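Line by line:

group_by groups the data by the variables you want.

summarise (spelled summarize in the code; the two are aliases) uses length(unique()) to count the unique job IDs and recruiters and stores the counts in new columns. Each row of the resulting data frame now holds the summary statistics for one of the groups specified above.

mutate computes a new column: the ratio of unique job IDs to unique recruiters.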
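To check the approach against the example in the question, here is a minimal sketch on made-up data. The `toy` tibble, its recruiter names, and the column names `n_jobs` and `n_recruiters` are assumptions for illustration only and are not part of the sample data. It also uses `n_distinct()`, dplyr's shorthand for `length(unique())`.

```r
library(dplyr)

# Hypothetical toy data (not from the question): 30 Sales jobs in Europe
# filled in FY23_Q1 by 5 recruiters, plus 4 IT jobs in the US filled by 2.
toy <- tibble(
  `Job ID`  = 1:34,
  Group     = c(rep("Sales", 30), rep("IT", 4)),
  geography = c(rep("Europe", 30), rep("US", 4)),
  recruiter = c(rep(paste("Recruiter", 1:5), times = 6),
                rep(c("A", "B"), times = 2)),
  Filled    = c(rep("FY23_Q1", 30), rep("FY22_Q4", 4))
)

ratios <- toy %>%
  group_by(Group, geography, Filled) %>%
  summarise(
    n_jobs       = n_distinct(`Job ID`),   # distinct job IDs per group
    n_recruiters = n_distinct(recruiter),  # distinct recruiters per group
    .groups = "drop"
  ) %>%
  mutate(ratio = n_jobs / n_recruiters)    # filled jobs per unique recruiter

print(ratios)
```

On this toy data the Sales / Europe / FY23_Q1 row has a ratio of 6 (30 jobs, 5 recruiters), matching the example in the question, and the IT / US / FY22_Q4 row has a ratio of 2.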
