I'm stuck on this simple regex that I'm writing to parse food orders. I receive each order as a JSON object, like this:
{"text": "order"}
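Rules
People order food in an almost uniform way, but not quite. So I decided to define some rules for what I consider a food order, so that I can parse it.
{"text": "restaurant name: food order"} [1-10,16,17]
restaurant name is always at the beginning of text [1-10]
there is no whitespace before restaurant name [1-10]
restaurant name may be bold (wrapped in *) [2,5,7,8]
the colon (:) may be inside the bold markers [3,6,10]
restaurant name may contain unicode characters [4,8]
food order may contain unicode characters [1-10]
text may contain newline characters [3-6,8-10]
text may contain several orders [8-10]
Valid food orders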
These are valid food orders that the regex should be able to parse:
[1] {"text": "Xanacuk: Ensalada de Espinacas + Crema del d\u00eda. :grin: "}
[2] {"text": "*Xanacuk*: Ensalada de Espinacas + Crema del d\u00eda"}
[3] {"text": "*Xanacuk:*\nEnsalada de Espinacas + Crema del d\u00eda. Thanks! :sunglasses: "}
[4] {"text": "pok\u00e9 restaurant:\n1- Crea tu Bowl: At\u00fan, Smoked Paprika, Cebolla Roja\n1- Salm\u00f3n Wasabi Pomelo"}
[5] {"text": "*POKE restaurant*:\n1- Crea tu Bowl: At\u00fan, Smoked Paprika, Cebolla Roja\n1- Salm\u00f3n Wasabi Pomelo"}
[6] {"text": "*POKE restaurant:*\n1- Crea tu Bowl: At\u00fan, Smoked Paprika, Cebolla Roja\n1- Salm\u00f3n Wasabi Pomelo"}
[7] {"text": "*Xanacuk Place*: Ensalada de dise\u00f1o peque\u00f1a (base espinacas + jam\u00f3n cocido + at\u00fan + aceite extra virgen) + mollete malasa\u00f1a. Gracias!"}
[8] {"text": "*Ohana Pok\u00e9 House*: Bowl - Arroz Negro Salvaje, At\u00fan, Tuna flakes, Zanahoria, Edamame , Wakame, Nori Furikake\n*Tierra Burrito*: Cookie Doble Chocolate"}
[9] {"text": "Poke Bowl: Bowl , \nBaby Spinach , \nAt\u00fan, \nSiracha de Manzana, Tuna Flakes, Pepino, Edamame , Cacahuete, Granada\n*Tierra Burrito*: Cookie Doble Chocolate"}
[10] {"text": "*Poke:* Bowl , \nBaby Spinach , \nAt\u00fan, \nSiracha de Manzana, Tuna Flakes, Pepino, Edamame , Cacahuete, Granada\n*Tierra Burrito*: Cookie Doble Chocolate "}
Invalid food orders
These do not follow the rules, so the regex should return null:
[11] {"text": ":heart: everywhere"}
[12] {"text": "this is not a food order"}
[13] {"text": "Mike +1"}
[14] {"text": "It\u2019s hot in here :fire:"}
[15] {"text": "we need to talk"}
False positives
Of course, some false positives are unavoidable, and that is OK:
[16] {"text": "Hey: :heart:"}
[17] {"text": "Jim: come here"}
Parsing the food orders
So the expected results are:
[1] restaurant: "Xanacuk"
order: "Ensalada de Espinacas + Crema del día"
[2] restaurant: "Xanacuk"
order: "Ensalada de Espinacas + Crema del día"
[3] restaurant: "Xanacuk"
order: "Ensalada de Espinacas + Crema del día. Thanks! :sunglasses:"
[4] restaurant: "poké restaurant"
order: "1- Crea tu Bowl: Atún, Smoked Paprika, Cebolla Roja
1- Salmón Wasabi Pomelo"
[5] restaurant: "POKE restaurant"
order: "1- Crea tu Bowl: Atún, Smoked Paprika, Cebolla Roja
1- Salmón Wasabi Pomelo"
[6] restaurant: "POKE restaurant"
order: "1- Crea tu Bowl: Atún, Smoked Paprika, Cebolla Roja
1- Salmón Wasabi Pomelo"
[7] restaurant: "Xanacuk Place"
order: "Ensalada de diseño pequeña (base espinacas + jamón cocido + atún + aceite extra virgen) + mollete malasaña. Gracias!"
[8] restaurant: "Ohana Poké House"
order: "Bowl - Arroz Negro Salvaje, Atún, Tuna flakes, Zanahoria, Edamame , Wakame, Nori Furikake"
restaurant: "Tierra Burrito"
order: "Cookie Doble Chocolate"
[9] restaurant: "Poke Bowl"
order: "Bowl ,
Baby Spinach ,
Atún,
Siracha de Manzana, Tuna Flakes, Pepino, Edamame , Cacahuete, Granada"
restaurant: "Tierra Burrito"
order: "Cookie Doble Chocolate"
[10] restaurant: "Poke"
order: "Bowl ,
Baby Spinach ,
Atún,
Siracha de Manzana, Tuna Flakes, Pepino, Edamame , Cacahuete, Granada"
restaurant: "Tierra Burrito"
order: "Cookie Doble Chocolate"
[11] null
[12] null
[13] null
[14] null
[15] null
[16] restaurant: "Hey"
order: ":heart:"
[17] restaurant: "Jim"
order: "come here"
Note: restaurant and order are always trimmed, and any leading or trailing newlines should be removed.
My solution
After calling JSON.parse(event), I apply my regex to event.text.
So far, I have been able to come up with this regex:
/\*?([\w ]+)\*?:\*?(?:\s*)((\n|.)+)/gm
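For illustration, here is a minimal sketch of how this regex might be applied to event.text, including the trimming described in the note above. The names rawEvent and parseOrder are hypothetical, and the g flag is dropped here (an assumption on my part) so that match() returns the capture groups instead of a list of full matches.

```javascript
// Hypothetical raw event, shaped like example [2].
const rawEvent = String.raw`{"text": "*Xanacuk*: Ensalada de Espinacas + Crema del d\u00eda"}`;
const event = JSON.parse(rawEvent);

// The regex from the question, minus the g flag (assumption: without g,
// String.prototype.match returns the capture groups of the first match).
const orderRe = /\*?([\w ]+)\*?:\*?(?:\s*)((\n|.)+)/m;

// Hypothetical helper: returns { restaurant, order } or null.
function parseOrder(text) {
  const m = text.match(orderRe);
  if (m === null) return null;
  // Group 3 is the unwanted inner group and is simply ignored.
  return { restaurant: m[1].trim(), order: m[2].trim() };
}

console.log(parseOrder(event.text));
console.log(parseOrder("this is not a food order")); // null
```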
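The regex works for [1-3,5-7,11-17].
The regex does not work for:
[4]: unicode in restaurant name
[8-10]: two orders in text
The regex also creates a third capture group, which is not aesthetically pleasing, but I just ignore it... :-)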
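The failure on [4] comes from \w being ASCII-only in JavaScript (letters, digits, and underscore), so it stops at the é. The sketch below shows the effect on a shortened version of [4], and one possible alternative using Unicode property escapes with the u flag. That alternative is my assumption, not something from the question, and it needs an engine with ES2018 support; the variable names are hypothetical.

```javascript
// Shortened version of example [4].
const text = "pok\u00e9 restaurant:\n1- Crea tu Bowl";

// Unanchored name pattern from the question: it skips past "poké"
// and captures only the part after the non-ASCII letter.
const original = /\*?([\w ]+)\*?:/;
console.log(JSON.stringify(original.exec(text)[1])); // " restaurant"

// Anchored at the start of text, the ASCII-only class cannot match at all.
const asciiName = /^\*?([\w ]+)\*?:/;
console.log(asciiName.test(text)); // false

// Possible alternative (assumption): \p{L} = any letter, \p{N} = any number.
const unicodeName = /^\*?([\p{L}\p{N}_ ]+)\*?:/u;
console.log(unicodeName.exec(text)[1]); // "poké restaurant"
```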
Regex snippet: https://regex101.com/r/AbtLgm/13
I feel like I'm close. I just need a little push...
Thanks for your help!
It's not perfect, but this is what I ended up doing:
/^\*?([\w \u00a0-\u0200]+)\*? *:\*? *(.+)/gm
https://regex101.com/r/ymvYb1/4
I might give it another go, but for now... it works! :-)
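As a rough sketch of how this regex could be used to pull several orders out of one text, the snippet below iterates over the matches with matchAll (which requires the g flag). The helper name parseOrders and the shortened sample text are hypothetical. Note that . does not match a newline, so an order that continues on the following lines is only captured up to the first line break, and a restaurant name followed directly by a newline does not match at all.

```javascript
// The regex from the answer, unchanged.
const finalRe = /^\*?([\w \u00a0-\u0200]+)\*? *:\*? *(.+)/gm;

// Hypothetical helper: returns an array of { restaurant, order }, or null.
function parseOrders(text) {
  const orders = [];
  // With the m flag, ^ matches at the start of every line,
  // so each "name: order" line yields one match.
  for (const m of text.matchAll(finalRe)) {
    orders.push({ restaurant: m[1].trim(), order: m[2].trim() });
  }
  return orders.length > 0 ? orders : null;
}

// Shortened version of example [8].
const sample = "*Ohana Pok\u00e9 House*: Bowl - Arroz Negro Salvaje\n*Tierra Burrito*: Cookie Doble Chocolate ";
console.log(parseOrders(sample));
console.log(parseOrders("this is not a food order")); // null
```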