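I'm developing a Django web application that takes a PDF file and performs some image processing on each page of the PDF. I am given a PDF and need to save each page to my Google Cloud Storage bucket. I'm using pdf2image's convert_from_path() to generate a list of Pillow images, one for each page of the PDF. Now I want to save these images to Google Cloud Storage, but I can't figure out how.
I have successfully saved these Pillow images locally, but I don't know how to do the same in the cloud.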
fullURL = file.pdf.url
client = storage.Client()
bucket = client.get_bucket('name-of-my-bucket')
blob = bucket.blob(file.pdf.name[:-4] + '/')
blob.upload_from_string('', content_type='application/x-www-form-urlencoded;charset=UTF-8')
pages = convert_from_path(fullURL, 400)
for i, page in enumerate(pages):
    blob = bucket.blob(file.pdf.name[:-4] + '/' + str(i) + '.jpg')
    blob.upload_from_string('', content_type='image/jpeg')
    outfile = file.pdf.name[:-4] + '/' + str(i) + '.jpg'
    page.save(outfile)
    of = open(outfile, 'rb')
    blob.upload_from_file(of)
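The loop above builds each object name by slicing the last four characters off the PDF name. As a minimal sketch (page_blob_name is a hypothetical helper, not part of any library), the same names could be built a little more defensively by stripping whatever extension is present:

```python
import posixpath


def page_blob_name(pdf_name, page_index, ext="jpg"):
    """Build an object name such as 'uploads/report/0.jpg' for one page.

    Hypothetical helper: strips the extension with splitext instead of
    slicing off the last four characters, so a name without an extension
    is left unchanged.
    """
    stem, _ = posixpath.splitext(pdf_name)
    return "{}/{}.{}".format(stem, page_index, ext)
```

posixpath is used because Cloud Storage object names use '/' as the separator regardless of the local operating system.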
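So start by not using blobstore. They are trying to get rid of it and move people to Cloud Storage. First, set up Cloud Storage.
I use webapp2 rather than Django, but I'm sure you can figure it out. Also, I don't use Pillow images, so you will have to open the image you want to upload yourself. Then do something like this (assuming you are trying to POST the data):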
import cloudstorage as gcs
import io
import os
import StringIO
from google.appengine.api import app_identity

# Method of your webapp2 request handler class
def create_file(self, filename, Dacontents):
    write_retry_params = gcs.RetryParams(backoff_factor=1.1)
    gcs_file = gcs.open(filename,
                        'w',
                        content_type='image/jpeg',
                        options={'x-goog-meta-foo': 'foo',
                                 'x-goog-meta-bar': 'bar'},
                        retry_params=write_retry_params)
    gcs_file.write(Dacontents)
    gcs_file.close()
In your HTML:
<form action="/(whatever your url is)" method="post" enctype="multipart/form-data">
    <input type="file" name="orders"/>
    <input type="submit"/>
</form>
# Inside the handler method that receives the POST:
orders = self.request.POST.get('orders')  # this is for webapp2
bucket_name = os.environ.get('BUCKET_NAME', app_identity.get_default_gcs_bucket_name())
bucket = '/' + bucket_name
OpenOrders = orders.file.read()
if OpenOrders:
    filename = bucket + '/whateverYouWantToCallIt'
    self.create_file(filename, OpenOrders)
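Since you have already saved the files locally, they are available in the local directory where the web application is running.
What you can do is simply iterate over the files in that directory and upload them to Google Cloud Storage one by one.
Here is some sample code:
You will need this library:
google-cloud-storage
Python code: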
# Libraries
import os
from google.cloud import storage

# Public variable declarations:
bucket_name = "[BUCKET_NAME]"
local_directory = "local/directory/of/the/files/for/uploading/"
bucket_directory = "uploaded/files/"  # Where the files will be uploaded in the bucket

# Upload file from source to destination
def upload_blob(source_file_name, destination_blob_name):
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)
    blob.upload_from_filename(source_file_name)

# Iterate through all files in that directory and upload one by one using the same filename
def upload_files():
    for filename in os.listdir(local_directory):
        upload_blob(local_directory + filename, bucket_directory + filename)
    return "File uploaded!"

# Call this function in your code:
upload_files()
Note: I have tested this code in a Google App Engine web application and it worked for me. Understand how it works and modify it to suit your needs. I hope this helps.
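As a rough variation on the directory upload above, the sketch below reuses a single bucket object instead of creating a client per file, joins paths with os.path.join, and skips subdirectories. upload_directory is a hypothetical name, and the bucket argument is assumed to be a google.cloud.storage Bucket (or anything with a compatible blob(name).upload_from_filename(path)).

```python
import os


def upload_directory(bucket, local_directory, bucket_directory):
    """Upload every regular file in local_directory under bucket_directory.

    Assumption: `bucket` behaves like a google.cloud.storage Bucket.
    Returns the list of object names that were uploaded.
    """
    prefix = bucket_directory.strip("/")
    uploaded = []
    for filename in sorted(os.listdir(local_directory)):
        source_path = os.path.join(local_directory, filename)
        if not os.path.isfile(source_path):
            continue  # skip subdirectories and other non-files
        # Object names always use '/', whatever the local OS separator is.
        blob_name = prefix + "/" + filename if prefix else filename
        bucket.blob(blob_name).upload_from_filename(source_path)
        uploaded.append(blob_name)
    return uploaded
```

With the variables from the answer, it could be called as upload_directory(storage.Client().get_bucket(bucket_name), local_directory, bucket_directory).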
You don't need to save the image locally; you can write the image directly to the GCS bucket, as shown below:
import io
from PIL import Image
from google.cloud import storage
from pdf2image import convert_from_bytes

storage_client = storage.Client()

def convert_pil_image_to_byte_array(img):
    # Encode the Pillow image as JPEG into an in-memory buffer
    img_byte_array = io.BytesIO()
    img.save(img_byte_array, format='JPEG', subsampling=0, quality=100)
    img_byte_array = img_byte_array.getvalue()
    return img_byte_array

def write_to_gcs_bucket(bucket_name, source_prefix, target_prefix):
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.get_blob(source_prefix)
    contents = blob.download_as_string()
    # first_page=5 starts at page 5; drop it to convert every page
    images = convert_from_bytes(contents, first_page=5)
    for i in range(len(images)):
        object_byte = convert_pil_image_to_byte_array(images[i])
        file_name = 'slide' + str(i) + '.jpg'
        blob = bucket.blob(target_prefix + file_name)
        # Without content_type the object would be stored as text/plain
        blob.upload_from_string(object_byte, content_type='image/jpeg')
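Applied to the original question, the same in-memory idea might look like the sketch below. upload_pages is a hypothetical name; bucket is assumed to be a google.cloud.storage Bucket and pages the list of Pillow images returned by pdf2image. Each page is encoded into a BytesIO buffer and uploaded without touching the local disk.

```python
import io


def upload_pages(bucket, pages, prefix):
    """Encode each page as JPEG in memory and upload it as '<prefix>/<index>.jpg'.

    Assumptions: `bucket` behaves like a google.cloud.storage Bucket and each
    item in `pages` has a Pillow-style save(fp, format=...) method.
    Returns the list of object names that were written.
    """
    names = []
    for index, page in enumerate(pages):
        buffer = io.BytesIO()
        page.save(buffer, format="JPEG")  # Pillow writes into the buffer
        blob_name = "{}/{}.jpg".format(prefix.rstrip("/"), index)
        blob = bucket.blob(blob_name)
        blob.upload_from_string(buffer.getvalue(), content_type="image/jpeg")
        names.append(blob_name)
    return names
```

In the Django view from the question, pages would come from pdf2image (for example convert_from_bytes on the uploaded PDF's bytes) and prefix could be the PDF name without its extension; both are choices to adapt to your own models.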