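I'm writing a REST API with Flask and MongoDB (I use pymongo to talk to the database). I've run into a problem that appears to be a race condition, and I've been able to reproduce my use case with the minimal reproducible example shown below.
Consider the following collection: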
[
    {
        'name': 'Mike',
        'books': [
            {
                'title': 'Book 1',
                'minutes-spent-reading': 100,
                'date-bought': '2024-09-23'
            }
        ]
    }
]
After the request shown in the first code block below completes, the collection should look like the second code block.
PUT /name/Mike/books/Book 2
Content-Type: application/json
{'minutes-spent-reading': 40}
[
    {
        'name': 'Mike',
        'books': [
            {
                'title': 'Book 1',
                'minutes-spent-reading': 100,
                'date-bought': '2024-09-23'
            },
            {
                'title': 'Book 2',
                'minutes-spent-reading': 40
            }
        ]
    }
]
Then, after the request shown in the first code block below completes, the collection should look like the second code block.
PUT /name/Mike/books/Book 2
Content-Type: application/json
{'date-bought': '2024-09-26'}
[
    {
        'name': 'Mike',
        'books': [
            {
                'title': 'Book 1',
                'minutes-spent-reading': 100,
                'date-bought': '2024-09-23'
            },
            {
                'title': 'Book 2',
                'minutes-spent-reading': 40,
                'date-bought': '2024-09-26'
            }
        ]
    }
]
However, my REST API produces the following result instead:
[
    {
        'name': 'Mike',
        'books': [
            {
                'title': 'Book 1',
                'minutes-spent-reading': 100,
                'date-bought': '2024-09-23'
            },
            {
                'title': 'Book 2',
                'minutes-spent-reading': 40
            },
            {
                'title': 'Book 2',
                'date-bought': '2024-09-26'
            }
        ]
    }
]
I was able to reproduce the way my REST API performs the PUT operations with this minimal reproducible example:
import threading
import pymongo
import time
import json


def put_book_2_minutes_spent_reading(value):
    # Find book in the array of books
    book = db.experiment.find_one({
        'name': 'Mike',
        'books.title': 'Book 2'
    })
    # We use time.sleep(1) to simulate that, for some reason, the
    # computer took 1 second at this step.
    time.sleep(1)
    # If the book doesn't exist in the array of books, insert a new
    # object in the array.
    if book is None:
        db.experiment.find_one_and_update({
            'name': 'Mike',
        }, {
            '$push': {
                'books': {
                    'title': 'Book 2',
                    'minutes-spent-reading': value
                }
            }
        })
    else:
        # If the book exists in the array of books, set the key in the
        # existing book.
        db.experiment.find_one_and_update({
            'name': 'Mike'
        }, {
            '$set': {
                'books.$[book].minutes-spent-reading': value
            }
        }, array_filters=[{'book.title': 'Book 2'}])


def put_book_2_date_bought(value):
    # Find book in the array of books
    book = db.experiment.find_one({
        'name': 'Mike',
        'books.title': 'Book 2'
    })
    # We use time.sleep(1) to simulate that, for some reason, the
    # computer took 1 second at this step.
    time.sleep(1)
    # If the book doesn't exist in the array of books, insert a new
    # object in the array.
    if book is None:
        db.experiment.find_one_and_update({
            'name': 'Mike',
        }, {
            '$push': {
                'books': {
                    'title': 'Book 2',
                    'date-bought': value
                }
            }
        })
    else:
        # If the book exists in the array of books, set the key in the
        # existing book.
        db.experiment.find_one_and_update({
            'name': 'Mike'
        }, {
            '$set': {
                'books.$[book].date-bought': value
            }
        }, array_filters=[{'book.title': 'Book 2'}])


db = pymongo.MongoClient('localhost', 27017)['experiment']
db.experiment.delete_many({})
# Sample item
db.experiment.insert_one({
    'name': 'Mike',
    'books': [
        {
            'title': 'Book 1',
            'minutes-spent-reading': 100,
            'date-bought': '2024-09-23'
        }
    ]
})
# Run both PUT handlers concurrently, as two simultaneous requests would.
t1 = threading.Thread(target=put_book_2_minutes_spent_reading, args=[40])
t2 = threading.Thread(target=put_book_2_date_bought, args=['2024-01-20'])
t1.start()
t2.start()
t1.join()
t2.join()
print(json.dumps(list(db.experiment.find({}, {"_id": 0})), indent=2))
The Python code shown above prints the following:
[
  {
    "name": "Mike",
    "books": [
      {
        "title": "Book 1",
        "minutes-spent-reading": 100,
        "date-bought": "2024-09-23"
      },
      {
        "title": "Book 2",
        "minutes-spent-reading": 40
      },
      {
        "title": "Book 2",
        "date-bought": "2024-01-20"
      }
    ]
  }
]
I expected these results:
[
  {
    "name": "Mike",
    "books": [
      {
        "title": "Book 1",
        "minutes-spent-reading": 100,
        "date-bought": "2024-09-23"
      },
      {
        "title": "Book 2",
        "minutes-spent-reading": 40,
        "date-bought": "2024-01-20"
      }
    ]
  }
]
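My question is: how can I prevent MongoDB from creating two books with the same title when there may be concurrent requests that set different keys on the same book?
Please don't question the database schema of the example; its main purpose is to illustrate a scenario from a more complex schema.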
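The solution is to do the whole update as a single atomic operation in MongoDB, so that when two operations act on the same document, one runs to completion before the other starts. The second operation then always sees the result of the first, which avoids conflicts when a user triggers several writes only milliseconds apart.
Taken from https://www.mongodb.com/docs/manual/core/write-operations-atomicity/
"In MongoDB, a write operation is atomic on the level of a single document, even if the operation modifies multiple embedded documents within a single document."
Taken from https://www.mongodb.com/docs/manual/reference/glossary/#std-term-atomic-operation
"An atomic operation is a write operation which either completes entirely, or does not complete at all. For distributed transactions, which involve writes to multiple documents, all writes to each document must succeed for the transaction to succeed. Atomic operations cannot partially complete. See Atomicity and Transactions."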
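To see why a separate read followed by a write fails while a single atomic step does not, here is a small in-memory sketch that needs no database. It is only an illustration under stated assumptions: the names `racy_put`, `atomic_put`, and `run` are made up for this example, the `threading.Barrier` forces the unlucky interleaving that `time.sleep(1)` provoked in the question, and the `threading.Lock` merely stands in for MongoDB's per-document atomicity (it is not how MongoDB implements it).

```python
import threading


def racy_put(doc, title, fields, barrier):
    # Step 1: read, like find_one() in the question.
    exists = any(b['title'] == title for b in doc['books'])
    # Force both threads to finish reading before either one writes.
    barrier.wait()
    # Step 2: write, based on a read that may already be stale.
    if not exists:
        doc['books'].append({'title': title, **fields})
    else:
        for b in doc['books']:
            if b['title'] == title:
                b.update(fields)


def atomic_put(doc, title, fields, lock):
    # Check and write happen as one indivisible step (assumption: the lock
    # plays the role of MongoDB's single-document atomicity).
    with lock:
        for b in doc['books']:
            if b['title'] == title:
                b.update(fields)
                return
        doc['books'].append({'title': title, **fields})


def run(target, sync):
    # Start from the sample document and apply both PUTs concurrently.
    doc = {'name': 'Mike', 'books': [{'title': 'Book 1'}]}
    threads = [
        threading.Thread(target=target,
                         args=(doc, 'Book 2', {'minutes-spent-reading': 40}, sync)),
        threading.Thread(target=target,
                         args=(doc, 'Book 2', {'date-bought': '2024-01-20'}, sync)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Return only the entries for the contested title.
    return [b for b in doc['books'] if b['title'] == 'Book 2']


racy = run(racy_put, threading.Barrier(2))
atomic = run(atomic_put, threading.Lock())
print(len(racy), len(atomic))  # 2 1
```

In the racy version both threads see "no Book 2" and both append, which gives the duplicate from the question. In the atomic version whichever thread runs second finds the book already there and merges its field into it.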
The following code is based on this answer:
import threading
import pymongo
import time
import json


def put_book_2_minutes_spent_reading(value):
    name = 'Mike'
    title = 'Book 2'
    # A single update_one with an aggregation pipeline (MongoDB 4.2+):
    # the existence check and the write happen in one atomic operation.
    db.experiment.update_one(
        {'name': name},
        [
            {
                '$set': {
                    'books': {
                        '$cond': [
                            # Is there already a book with this title?
                            {'$in': [title, "$books.title"]},
                            # Yes: merge the new key into the matching book.
                            {
                                '$map': {
                                    'input': "$books",
                                    'in': {
                                        '$cond': [
                                            {'$eq': [title, "$$this.title"]},
                                            {
                                                '$mergeObjects': [
                                                    '$$this',
                                                    {'minutes-spent-reading': value}
                                                ]
                                            },
                                            "$$this"
                                        ]
                                    }
                                }
                            },
                            # No: append a new book to the array.
                            {
                                '$concatArrays': [
                                    "$books",
                                    [
                                        {
                                            'title': title,
                                            'minutes-spent-reading': value
                                        }
                                    ]
                                ]
                            }
                        ]
                    }
                }
            }
        ]
    )


def put_book_2_date_bought(value):
    name = 'Mike'
    title = 'Book 2'
    # Same pipeline as above, but setting 'date-bought'.
    db.experiment.update_one(
        {'name': name},
        [
            {
                '$set': {
                    'books': {
                        '$cond': [
                            {'$in': [title, "$books.title"]},
                            {
                                '$map': {
                                    'input': "$books",
                                    'in': {
                                        '$cond': [
                                            {'$eq': [title, "$$this.title"]},
                                            {
                                                '$mergeObjects': [
                                                    '$$this',
                                                    {'date-bought': value}
                                                ]
                                            },
                                            "$$this"
                                        ]
                                    }
                                }
                            },
                            {
                                '$concatArrays': [
                                    "$books",
                                    [
                                        {
                                            'title': title,
                                            'date-bought': value
                                        }
                                    ]
                                ]
                            }
                        ]
                    }
                }
            }
        ]
    )


db = pymongo.MongoClient('localhost', 27017)['experiment']
db.experiment.delete_many({})
# Sample item
db.experiment.insert_one({
    'name': 'Mike',
    'books': [
        {
            'title': 'Book 1',
            'minutes-spent-reading': 100,
            'date-bought': '2024-09-23'
        }
    ]
})
# Run both PUT handlers concurrently, as in the question.
t1 = threading.Thread(target=put_book_2_minutes_spent_reading, args=[40])
t2 = threading.Thread(target=put_book_2_date_bought, args=['2024-01-20'])
t1.start()
t2.start()
t1.join()
t2.join()
print(json.dumps(list(db.experiment.find({}, {"_id": 0})), indent=2))
This prints the expected result:
[
  {
    "name": "Mike",
    "books": [
      {
        "title": "Book 1",
        "minutes-spent-reading": 100,
        "date-bought": "2024-09-23"
      },
      {
        "title": "Book 2",
        "minutes-spent-reading": 40,
        "date-bought": "2024-01-20"
      }
    ]
  }
]
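The two functions above differ only in the key they set, so the pipeline could be produced by one helper. The following is a sketch rather than part of the original answer: `build_upsert_pipeline` and `simulate_upsert` are hypothetical names introduced here. The first only builds the update document (it does not touch the database), and the second mirrors the pipeline's logic in plain Python so that each operator's role is easier to follow.

```python
def build_upsert_pipeline(title, fields):
    """Build an aggregation-pipeline update that merges `fields` into the
    book with the given title, or appends a new book if none exists."""
    return [{
        '$set': {
            'books': {
                '$cond': [
                    # Condition: does any book already have this title?
                    {'$in': [title, '$books.title']},
                    # Then: rewrite the array, merging into the matching book.
                    {'$map': {
                        'input': '$books',
                        'in': {
                            '$cond': [
                                {'$eq': [title, '$$this.title']},
                                {'$mergeObjects': ['$$this', fields]},
                                '$$this',
                            ]
                        },
                    }},
                    # Else: append a new book carrying the given fields.
                    {'$concatArrays': ['$books', [{'title': title, **fields}]]},
                ]
            }
        }
    }]


def simulate_upsert(books, title, fields):
    """Plain-Python equivalent of the pipeline, for illustration only."""
    if title in [b.get('title') for b in books]:           # $in
        return [
            {**b, **fields} if b.get('title') == title else b   # $map + $mergeObjects
            for b in books
        ]
    return books + [{'title': title, **fields}]            # $concatArrays


pipeline = build_upsert_pipeline('Book 2', {'date-bought': '2024-01-20'})

# Apply both PUTs one after the other, which is the order atomicity guarantees.
books = [{'title': 'Book 1', 'minutes-spent-reading': 100, 'date-bought': '2024-09-23'}]
books = simulate_upsert(books, 'Book 2', {'minutes-spent-reading': 40})
books = simulate_upsert(books, 'Book 2', {'date-bought': '2024-01-20'})
print(books)
```

With a live connection, the helper would be used as `db.experiment.update_one({'name': 'Mike'}, build_upsert_pipeline('Book 2', {'date-bought': value}))`. One caveat, assuming values come from user input: inside an aggregation expression a string that starts with `$` is parsed as a field path, so such values may need to be wrapped in `$literal`.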