My simple JavaScript RegExp for parsing food orders is mocking me


I'm stuck on this simple regex that I'm writing to parse food orders. I receive each order as a JSON object like this:

{"text": "order"}

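Rules

People order food in an almost uniform way, but not quite. So I decided to lay down some rules for what I consider a food order, so that I can parse it.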

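  • Baseline format: {"text": "restaurant name: food order"} [1-10,16,17]
  • The restaurant name is always at the start of text [1-10]
  • There is no whitespace before the restaurant name [1-10]
  • Sometimes the restaurant name is in bold (wrapped in *) [2,5,7,8]
  • Sometimes even the : is in bold [3,6,10]
  • Sometimes there are unicode characters in the restaurant name [4,8]
  • Sometimes there are unicode characters in the food order [1-10]
  • Sometimes the food order spans several lines [3-6,8-10]
  • Sometimes text contains several orders [8-10]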

Valid food orders

These are valid food orders that the regex should be able to parse:

 [1] {"text": "Xanacuk:   Ensalada de Espinacas + Crema del d\u00eda. :grin: "}
 [2] {"text": "*Xanacuk*:  Ensalada de Espinacas + Crema del d\u00eda"}
 [3] {"text": "*Xanacuk:*\nEnsalada de Espinacas + Crema del d\u00eda. Thanks! :sunglasses:  "} 
 [4] {"text": "pok\u00e9 restaurant:\n1- Crea tu Bowl: At\u00fan, Smoked Paprika,  Cebolla Roja\n1- Salm\u00f3n Wasabi Pomelo"}
 [5] {"text": "*POKE restaurant*:\n1- Crea tu Bowl: At\u00fan, Smoked Paprika,  Cebolla Roja\n1- Salm\u00f3n Wasabi Pomelo"}
 [6] {"text": "*POKE restaurant:*\n1- Crea tu Bowl: At\u00fan, Smoked Paprika,  Cebolla Roja\n1- Salm\u00f3n Wasabi Pomelo"}
 [7] {"text": "*Xanacuk Place*: Ensalada de dise\u00f1o peque\u00f1a (base espinacas + jam\u00f3n cocido + at\u00fan + aceite extra virgen)  + mollete malasa\u00f1a. Gracias!"}
 [8] {"text": "*Ohana Pok\u00e9 House*: Bowl - Arroz Negro Salvaje, At\u00fan, Tuna flakes, Zanahoria, Edamame , Wakame, Nori Furikake\n*Tierra Burrito*: Cookie Doble Chocolate"}
 [9] {"text": "Poke Bowl: Bowl , \nBaby Spinach , \nAt\u00fan, \nSiracha de Manzana, Tuna Flakes, Pepino, Edamame , Cacahuete, Granada\n*Tierra Burrito*: Cookie Doble Chocolate"}
[10] {"text": "*Poke:* Bowl , \nBaby Spinach , \nAt\u00fan, \nSiracha de Manzana, Tuna Flakes, Pepino, Edamame , Cacahuete, Granada\n*Tierra Burrito*: Cookie Doble Chocolate "}

Invalid food orders

These don't follow the rules, so the regex should return null:

[11] {"text": ":heart: everywhere"}
[12] {"text": "this is not a food order"}
[13] {"text": "Mike +1"}
[14] {"text": "It\u2019s hot in here :fire:"}
[15] {"text": "we need to talk"}

False positives

Of course, some false positives are unavoidable, and that's fine:

[16] {"text": "Hey: :heart:"}
[17] {"text": "Jim: come here"}

Parsing the food orders

So the expected results are:

 [1] restaurant: "Xanacuk"
     order: "Ensalada de Espinacas + Crema del día. :grin:"
 [2] restaurant: "Xanacuk"
     order: "Ensalada de Espinacas + Crema del día"
 [3] restaurant: "Xanacuk"
     order: "Ensalada de Espinacas + Crema del día. Thanks! :sunglasses:"
 [4] restaurant: "poké restaurant"
     order: "1- Crea tu Bowl: Atún, Smoked Paprika,  Cebolla Roja
             1- Salmón Wasabi Pomelo"
 [5] restaurant: "POKE restaurant"
     order: "1- Crea tu Bowl: Atún, Smoked Paprika,  Cebolla Roja
             1- Salmón Wasabi Pomelo"
 [6] restaurant: "POKE restaurant"
     order: "1- Crea tu Bowl: Atún, Smoked Paprika,  Cebolla Roja
             1- Salmón Wasabi Pomelo"
 [7] restaurant: "Xanacuk Place"
     order: "Ensalada de diseño pequeña (base espinacas + jamón cocido + atún + aceite extra virgen)  + mollete malasaña. Gracias!"
 [8] restaurant: "Ohana Poké House"
     order: "Bowl - Arroz Negro Salvaje, Atún, Tuna flakes, Zanahoria, Edamame , Wakame, Nori Furikake"
     restaurant: "Tierra Burrito"
     order: "Cookie Doble Chocolate"
 [9] restaurant: "Poke Bowl"
     order: "Bowl , 
             Baby Spinach , 
             Atún, 
             Siracha de Manzana, Tuna Flakes, Pepino, Edamame , Cacahuete, Granada"
     restaurant: "Tierra Burrito"
     order: "Cookie Doble Chocolate"
[10] restaurant: "Poke"
     order: "Bowl , 
             Baby Spinach , 
             Atún, 
             Siracha de Manzana, Tuna Flakes, Pepino, Edamame , Cacahuete, Granada"
     restaurant: "Tierra Burrito"
     order: "Cookie Doble Chocolate"
[11] null
[12] null
[13] null
[14] null
[15] null
[16] restaurant: "Hey"
     order: ":heart:" 
[17] restaurant: "Jim"
     order: "come here" 

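Note: restaurant and order are always trimmed, and any newlines at the start or end should be removed.

My solution

After calling JSON.parse(event), I apply my regex to event.text.

So far, I've been able to come up with this regex: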

/\*?([\w ]+)\*?:\*?(?:\s*)((\n|.)+)/gm
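
As a minimal sketch of how this pattern might be applied, assuming each event arrives as a raw JSON string: the helper name parseOrder is hypothetical, and the g flag is dropped here because String.prototype.match only returns capture groups for a non-global pattern.

```javascript
// Pattern from the question, with the g flag dropped so that
// String.prototype.match returns the capture groups.
const ORDER_RE = /\*?([\w ]+)\*?:\*?(?:\s*)((\n|.)+)/m;

// Hypothetical helper: takes the raw JSON string of one event and
// returns { restaurant, order }, or null when the pattern does not match.
function parseOrder(rawEvent) {
  const { text } = JSON.parse(rawEvent);
  const match = text.match(ORDER_RE);
  if (match === null) return null;
  return { restaurant: match[1].trim(), order: match[2].trim() };
}

// String.raw keeps the JSON escapes intact for JSON.parse.
const sample1 = String.raw`{"text": "Xanacuk:   Ensalada de Espinacas + Crema del d\u00eda. :grin: "}`;
const sample12 = String.raw`{"text": "this is not a food order"}`;

console.log(parseOrder(sample1));  // restaurant "Xanacuk", trimmed order
console.log(parseOrder(sample12)); // null
```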

The regex works for [1-3,5-7,11-17].

The regex doesn't work for:

  • [4] unicode in the restaurant name
  • [8-10] two orders in text
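
The failure on [4] comes from \w, which in JavaScript matches only ASCII letters, digits, and the underscore, so the é ends the restaurant name early. One option, sketched here on the assumption that the runtime supports Unicode property escapes (the u flag), is to replace \w with \p{L}\p{N}_. The variable names are illustrative.

```javascript
const name = "pok\u00e9 restaurant"; // "poké restaurant" from order [4]

// \w is ASCII-only, so the é is rejected.
const asciiOnly = /^[\w ]+$/;

// \p{L} = any Unicode letter, \p{N} = any Unicode number; needs the u flag.
const unicodeAware = /^[\p{L}\p{N}_ ]+$/u;

console.log(asciiOnly.test(name));    // false
console.log(unicodeAware.test(name)); // true
```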

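The regex creates a third capture group, which isn't aesthetically pleasing, but I just ignore it... :-)

Regex snippet: https://regex101.com/r/AbtLgm/13

I feel like I'm close. I just need a little push...

Thanks for your help!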
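
The extra group comes from (\n|.), a capturing group used only to let the match cross newlines. A possible way to avoid it, shown as a sketch and not as a drop-in fix, is the character class [\s\S], which matches any character, newlines included, without creating a group.

```javascript
const text = "*Xanacuk:*\nEnsalada de Espinacas";

// Original shape: (\n|.) adds a third capture group.
const withExtraGroup = /\*?([\w ]+)\*?:\*?(?:\s*)((\n|.)+)/m;

// [\s\S] matches any character, newlines included, without a group.
const withoutExtraGroup = /\*?([\w ]+)\*?:\*?\s*([\s\S]+)/m;

const a = text.match(withExtraGroup);
const b = text.match(withoutExtraGroup);

console.log(a.length);      // 4: full match plus three groups
console.log(b.length);      // 3: full match plus two groups
console.log(a[2] === b[2]); // true
```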

javascript regex
1 answer

It's not perfect, but this is what I ended up doing:

/^\*?([\w \u00a0-\u0200]+)\*? *:\*? *(.+)/gm

https://regex101.com/r/ymvYb1/4
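
A quick way to try this pattern outside regex101, assuming text holds real newline characters after JSON.parse, is String.prototype.matchAll. The helper name parseOrders is hypothetical.

```javascript
// Pattern from the answer above.
const ORDER_RE = /^\*?([\w \u00a0-\u0200]+)\*? *:\*? *(.+)/gm;

// Hypothetical helper: returns every { restaurant, order } found in text,
// or null when there is no match.
function parseOrders(text) {
  const orders = [...text.matchAll(ORDER_RE)].map((m) => ({
    restaurant: m[1].trim(),
    order: m[2].trim(),
  }));
  return orders.length > 0 ? orders : null;
}

// Orders [8] and [9] from the question; String.raw keeps the JSON escapes.
const text8 = JSON.parse(String.raw`{"text": "*Ohana Pok\u00e9 House*: Bowl - Arroz Negro Salvaje, At\u00fan, Tuna flakes, Zanahoria, Edamame , Wakame, Nori Furikake\n*Tierra Burrito*: Cookie Doble Chocolate"}`).text;
const text9 = JSON.parse(String.raw`{"text": "Poke Bowl: Bowl , \nBaby Spinach , \nAt\u00fan, \nSiracha de Manzana, Tuna Flakes, Pepino, Edamame , Cacahuete, Granada\n*Tierra Burrito*: Cookie Doble Chocolate"}`).text;

console.log(parseOrders(text8)); // two orders, one per restaurant
console.log(parseOrders(text9)); // first order stops at "Bowl ,"
```

Because . does not match a newline, the order captured for [9] ends at the first line break, so the rest of a multi-line order is not included.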

I may give it another go, but for now... it works! :-)
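
If multi-line orders need to survive when text contains real newline characters, one possible alternative is to go line by line: a line that looks like "name:" starts a new order, and every other line is appended to the current one. This is only a sketch under that assumption, not the approach used above; HEADER_RE and parseOrdersByLine are hypothetical names, and the pattern needs the u flag for \p{L} and \p{N}.

```javascript
// A line is treated as a header when it looks like "name:" or "*name*:".
// This pattern is an assumption for illustration, not the answer's regex.
const HEADER_RE = /^\*?([\p{L}\p{N}_ ]+)\*? *:\*?(.*)$/u;

// Hypothetical helper: returns an array of { restaurant, order }, or null.
function parseOrdersByLine(text) {
  const orders = [];
  for (const line of text.split("\n")) {
    const header = line.match(HEADER_RE);
    if (header !== null) {
      // Start a new order; keep whatever follows the colon on this line.
      orders.push({ restaurant: header[1].trim(), lines: [header[2]] });
    } else if (orders.length > 0) {
      // Continuation line: belongs to the most recent order.
      orders[orders.length - 1].lines.push(line);
    }
  }
  if (orders.length === 0) return null;
  return orders.map(({ restaurant, lines }) => ({
    restaurant,
    order: lines.join("\n").trim(),
  }));
}

// Order [3] from the question, with real newline characters.
const sample = "*Xanacuk:*\nEnsalada de Espinacas + Crema del d\u00eda. Thanks! :sunglasses:  ";
console.log(parseOrdersByLine(sample));
console.log(parseOrdersByLine("we need to talk")); // null
```

The trade-off is that a continuation line which itself looks like "word: something" would be taken for a new restaurant, which is the same kind of false positive the question already accepts.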
