How do I get the S3 file path when loading into a physical table in Redshift?

Question

I use the following to get the S3 file path from an external table:

create or replace view raw_view as select *,"$path" as sourcefilename from raw_external_table WITH NO SCHEMA BINDING;

Now I would like to use the Redshift COPY command so that the source file name is also included in the physical table.

I can't use this:

COPY staging_table (cola, colb, colc, cold)
FROM 's3://your-bucket-name/path/to/files/'
IAM_ROLE 'arn:aws:iam::123456789012:role/YourRedshiftRole' -- Replace with your IAM role ARN
DELIMITER '|' -- Adjust delimiter as per your file format
IGNOREHEADER 1 -- If your files have a header row
CSV; -- If your files are in CSV format; CSV already handles quoted fields, and REMOVEQUOTES cannot be combined with it

-- Update the sourcefilename column in the staging table with the S3 file path
UPDATE staging_table
SET sourcefilename = 's3://your-bucket-name/path/to/files/';

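because I am loading from a directory that contains many files, so every row would get the directory path instead of its own file name. Is there any way to get the S3 file name into the physical table?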

Answer 1

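No. You either need to add a value to every record in the files themselves that identifies the data, or load the files in batches, possibly using a manifest file so that you know which files each batch brings in.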
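The batch approach could look roughly like the sketch below, which treats every file as its own batch. It reuses the table, bucket, and role names from the question; the file name part-0001.csv and the manifest location are hypothetical, and it assumes that sourcefilename is a nullable column with no default, so COPY leaves it NULL for freshly loaded rows.

```sql
-- Hypothetical manifest stored at
-- s3://your-bucket-name/manifests/part-0001.manifest, listing exactly one file:
--
--   {
--     "entries": [
--       {"url": "s3://your-bucket-name/path/to/files/part-0001.csv", "mandatory": true}
--     ]
--   }

BEGIN;

-- Load only the file named in the manifest. sourcefilename is not in the
-- column list, so (by assumption) it stays NULL for the rows loaded here.
COPY staging_table (cola, colb, colc, cold)
FROM 's3://your-bucket-name/manifests/part-0001.manifest'
IAM_ROLE 'arn:aws:iam::123456789012:role/YourRedshiftRole'
DELIMITER '|'
IGNOREHEADER 1
CSV
MANIFEST;

-- Tag only the rows that were just loaded; rows from earlier batches
-- already have a non-NULL sourcefilename and are left alone.
UPDATE staging_table
SET sourcefilename = 's3://your-bucket-name/path/to/files/part-0001.csv'
WHERE sourcefilename IS NULL;

COMMIT;
```

A manifest is used here rather than putting the file path directly in FROM because COPY treats a plain S3 path as a key prefix, so a path ending in part-0001.csv would also match any other object whose key starts with the same characters. The COPY and UPDATE pair has to be repeated once per file, which in practice means generating the statements from a listing of the directory. If a manifest lists several files, the UPDATE can only record which batch a row came from, not which individual file.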
