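I want to convert a .tiff file to a .gpkg file so that every raster cell is taken into account.
When I convert the .tiff to an image array with numpy, the length I get back is 350403.
When I try the same thing with gpd.GeoDataFrame.from_features(img_src), I get 343003.
Is there a way to make sure the GeoDataFrame keeps all cells/pixels?
Here is my code; sorry, it is not the best.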
import numpy as np
import geopandas as gpd
import rasterio
from PIL import Image
from rasterio.features import shapes
from shapely.geometry import shape

def tiff_to_csv(tiff_file: str, csv_file: str):
    '''
    Description:
        - This function converts a TIFF file to a CSV file.
    Parameters:
        tiff_file:
            - the path to the TIFF file to be converted
        csv_file:
            - the path to save the CSV file
    Returns:
        None
    '''
    # Open the TIFF file
    tif_image = Image.open(tiff_file)
    # Convert my image to a numpy array
    image_array = np.array(tif_image)
    # Flatten the array to convert it to a 1D vector
    vector_data = image_array.flatten()
    print(len(vector_data))
    # Save the pixel values to a CSV file
    np.savetxt(csv_file, vector_data, delimiter=",")
    with rasterio.open(tiff_file) as src:
        band1 = src.read(1)
        no_data = src.nodata
        print(f'no_data: {no_data}')
        # Generate shapes (polygons) from the raster values
        results = [
            {'properties': {'raster_val': v}, 'geometry': shape(s)}
            for s, v in shapes(band1, transform=src.transform)
        ]
    # Convert my shapes to a GeoDataFrame (pass the features, not the raw band)
    gdf = gpd.GeoDataFrame.from_features(results)
    total_shapes = len(gdf)
    print(f"Number of shapes (including nodata): {total_shapes}")
Interestingly, there is no "no_data" value, yet the two lengths clearly do not match.
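When you vectorize with rasterio, adjacent pixels that share the same value are dissolved into a single polygon, which is why you get fewer polygons than raster pixels.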
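To see the effect on a small scale, here is a minimal sketch in plain Python that counts groups of edge-adjacent pixels with equal values. It is not rasterio's implementation; it only mimics the grouping rule under the assumption that `shapes` is called with its default `connectivity=4`. The function name `count_regions` and the toy grid are made up for illustration.

```python
from collections import deque

def count_regions(grid):
    """Count groups of edge-adjacent (4-connected) cells that share a value."""
    n_rows, n_cols = len(grid), len(grid[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    regions = 0
    for r in range(n_rows):
        for c in range(n_cols):
            if seen[r][c]:
                continue
            # Unvisited cell: it starts a new region, then flood-fill its neighbors
            regions += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    inside = 0 <= nr < n_rows and 0 <= nc < n_cols
                    if inside and not seen[nr][nc] and grid[nr][nc] == grid[cr][cc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
    return regions

# Hypothetical 3x3 raster: 9 pixels, but only 4 groups of equal neighbors
toy = [[1, 1, 2],
       [1, 3, 2],
       [4, 4, 4]]
n_pixels = sum(len(row) for row in toy)
n_regions = count_regions(toy)
print(n_pixels, n_regions)
```

The gap between the two printed numbers is the same kind of gap as 350403 versus 343003 in the question: every merged neighbor is one fewer row in the GeoDataFrame.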
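For a raster of integer values, you can add a random float between 0 and 1 to every value in the raster array so that each pixel value becomes unique (hopefully, unless you are very unlucky), then vectorize into a GeoPandas dataframe and floor the values back to the original values: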
import rasterio
from rasterio.features import shapes
import geopandas as gpd
import numpy as np

raster_file = r"C:\Users\bera\Desktop\gistest\random_raster.tif"

# Vectorize the raster into a GeoPandas dataframe
with rasterio.open(raster_file) as src:
    array = src.read(1)
    # Keep the georeferencing so it can be used after the file is closed
    transform = src.transform
    crs = src.crs

print(array.shape)
# (26, 33). There are 858 pixels in my raster

geoms = [
    {'properties': {'raster_val': v}, 'geometry': s}
    for s, v in shapes(source=array, mask=None, transform=transform)
]
df = gpd.GeoDataFrame.from_features(geoms, crs=crs)
print(df.shape)
# (679, 2)
# So 26*33 - 679 = 179 cells were merged into neighboring polygons
df.to_file(r"C:\gistest\vectorized.gpkg")

# Add a random float between 0-1 to each array value
random_floats = np.random.rand(*array.shape)
array_with_random_decimals = (array + random_floats).astype("float32")

# Vectorize
geoms = [
    {'properties': {'raster_val': v}, 'geometry': s}
    for s, v in shapes(source=array_with_random_decimals, mask=None, transform=transform)
]
df2 = gpd.GeoDataFrame.from_features(geoms, crs=crs)
print(df2.shape[0])
# 858. Which equals the number of values in the input raster.

# Floor the values back to the original values
df2["raster_val"] = np.floor(df2["raster_val"])
df2.to_file(r"C:\gistest\vectorized_with_random_addition.gpkg")
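If relying on random numbers feels fragile, a different route (not the method used above) is to skip `shapes` and build one square per pixel straight from the affine coefficients. The sketch below is plain Python and makes some assumptions: `coeffs` is the tuple (a, b, c, d, e, f) with x = a*col + b*row + c and y = d*col + e*row + f, which I believe matches the first six elements of rasterio's `src.transform`; `rows` is the band as nested lists (for example `array.tolist()`); and the helper names `pixel_corner` and `pixel_features` are made up for illustration.

```python
def pixel_corner(coeffs, col, row):
    """Map a (col, row) grid corner to map coordinates with an affine transform."""
    a, b, c, d, e, f = coeffs
    return (a * col + b * row + c, d * col + e * row + f)

def pixel_features(rows, coeffs):
    """Build one GeoJSON-like polygon feature per pixel, never merging neighbors."""
    features = []
    for r, row_values in enumerate(rows):
        for c, value in enumerate(row_values):
            # Closed ring around the pixel: upper-left, upper-right, lower-right, lower-left
            ring = [
                pixel_corner(coeffs, c, r),
                pixel_corner(coeffs, c + 1, r),
                pixel_corner(coeffs, c + 1, r + 1),
                pixel_corner(coeffs, c, r + 1),
                pixel_corner(coeffs, c, r),
            ]
            features.append({
                "type": "Feature",
                "properties": {"raster_val": value, "row": r, "col": c},
                "geometry": {"type": "Polygon", "coordinates": [ring]},
            })
    return features

# Hypothetical 2x3 raster with 10-unit pixels and upper-left corner at (100, 200)
grid = [[5, 5, 7],
        [5, 5, 5]]
coeffs = (10.0, 0.0, 100.0, 0.0, -10.0, 200.0)
features = pixel_features(grid, coeffs)
print(len(features))
```

The resulting list should be usable with `gpd.GeoDataFrame.from_features(features)` in the same way as the feature dictionaries above. It always gives exactly rows × columns polygons and leaves the pixel values untouched, so it also works for float rasters; the trade-off is a Python loop, which can be slow for large rasters.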