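I am deploying Loki in an EKS cluster, using an S3 bucket for storage. I have configured IAM Roles for Service Accounts (IRSA) to allow Loki to access the S3 bucket. However, I am getting the following error in the Loki logs: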
WebIdentityErr: failed to retrieve credentials\ncaused by: SerializationError: failed to unmarshal error message\n\tstatus code: 405
The full error looks like this:
level=error ts=2024-11-14T10:08:39.203006213Z caller=flush.go:261 component=ingester loop=1 org_id=1 msg="failed to flush" retries=2 err="failed to flush chunks: store put chunk: WebIdentityErr: failed to retrieve credentials
caused by: SerializationError: failed to unmarshal error message
status code: 405, request id:
caused by: UnmarshalError: failed to unmarshal error message
00000000 3c 3f 78 6d 6c 20 76 65 72 73 69 6f 6e 3d 22 31 |<?xml version="1|
... MethodNotAll|
00000040 6f 77 65 64 3c 2f 43 6f 64 65 3e 3c 4d 65 73 73 |owed</Code><Mess|
00000060 64 20 6d 65 74 68 6f 64 20 69 73 20 6e 6f 74 20 |d method is not |
00000070 61 6c 6c 6f 77 65 64 20 61 67 61 69 6e 73 74 20 |allowed against |
00000080 74 68 69 73 20 72 65 73 6f 75 72 63 65 2e 3c 2f |this resource.</|
000000a0 50 4f 53 54 3c 2f 4d 65 74 68 6f 64 3e 3c 52 65 |POST</Method><Re|
000000c0 43 45 3c 2f 52 65 73 6f 75 72 63 65 54 79 70 65 |CE</ResourceType|
000000e0 34 34 34 43 51 32 4e 30 35 45 52 4a 3c 2f 52 65 |444CQ2N05ERJ</Re|
00000100 70 4f 30 35 58 6d 42 4b 33 6d 45 36 30 43 32 66 |pO05XmBK3mE60C2f|
00000110 6c 56 74 36 76 65 45 36 37 41 6c 56 43 58 4a 4d |lVt6veE67AlVCXJM|
00000120 61 9c 66 39 65 50 36 67 56 53 7a 70 57 49 78 50 |aLf9eP6gVSzpWIxP|
00000130 57 32 32 63 58 76 51 51 73 6b 54 6b 35 78 64 6d |W22cXvQQskTk5xdm|
00000140 62 52 6b 36 35 73 48 33 4e 39 2f 54 6e 39 77 30 |bRk65sH3N9/Tn9w0|
00000150 46 72 31 2f 2f 75 61 55 34 6d 2b 78 64 6f 46 59 |Fr1//uaU4m+xdoFY|
00000160 71 7a 6e 47 79 86 35 51 4a 2b 59 3d 3c 2f 48 6f |qznGyF5QJ+Y=</Ho|
caused by: unknown error response tags, {{Error}[]}, num_chunks: 1, labels: {app="curl-test", container="curl-test", filename="/var/log/pods/default_curl-test_5cb0fb1b-922b-4a1c-b8e0-e6506bec3cee/curl-test/239.log", job="default/curl-test", namespace="default", node_name="ip-172-32-18-62.ap-southeast-1.compute.internal", pod="curl-test", service_name="curl-test", stream="stdout"}"
Here is my loki-values.yml:
loki:
  auth-enabled: false
  image:
    tag: 3.2.1
  schemaConfig:
    configs:
      - from: "2024-04-01"
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::469030223850:role/loki-test-role-again
  storage_config:
    aws:
      # s3: s3://ap-southeast-1/chunks-buck
      region: ap-southeast-1
      bucketnames: chunks-buck
      s3forcepathstyle: false
  pattern_ingester:
    enabled: true
  limits_config:
    allow_structured_metadata: true
    volume_enabled: true
    retention_period: 672h # 28 days retention
  querier:
    max_concurrent: 4
  storage:
    bucketNames:
      chunks: chunks-buck
      ruler: ruler-buck
      admin: admin-buck
    type: s3
    s3:
      s3: s3://ap-southeast-1/chunks-buck
      endpoint: s3.ap-southeast-1.amazonaws.com
      region: ap-southeast-1
      s3ForcePathStyle: false
  memcached:
    chunk_cache:
      resources:
        requests:
          cpu: 500m
          memory: 256Mi # Lower memory request
        limits:
          memory: 512Mi # Lower memory limit
deploymentMode: SimpleScalable
serviceAccount:
  # -- Specifies whether a ServiceAccount should be created
  create: true
  # -- The name of the ServiceAccount to use.
  # If not set and create is true, a name is generated using the fullname template
  name: loki-sa
  # -- Image pull secrets for the service account
  imagePullSecrets: []
  # -- Labels for the service account
  labels: {}
  # -- Annotations for the service account
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::469030223850:role/loki-test-role-again
  # -- Set this toggle to false to opt out of automounting API credentials for the service account
  automountServiceAccountToken: true
backend:
  replicas: 3
  persistence:
    volumeClaimsEnabled: false
read:
  replicas: 3
  persistence:
    storageClass: gp2
write:
  replicas: 3
  persistence:
    volumeClaimsEnabled: false
# -- Configuration for the read pod(s)
minio:
  enabled: false
singleBinary:
  persistence:
    storageClass: gp2
ingestor:
  resources:
    requests:
      memory: 512Mi
    limits:
      memory: 1Gi
  persistence:
    claims:
      - name: data
        storageClass: gp2
querier:
  persistence:
    storageClass: gp2
indexGateway:
  persistence:
    storageClass: gp2
compactor:
  persistence:
    storageClass: gp2
    claims:
      - name: data
        storageClass: gp2
bloomGateway:
  persistence:
    claims:
      - name: data
        storageClass: gp2
bloomPlanner:
  persistence:
    claims:
      - name: data
        storageClass: gp2
patternIngester:
  persistence:
    storageClass: gp2
    claims:
      - name: data
        storageClass: gp2
ruler:
  persistence:
    storageClass: gp2
resultsCache:
  persistence:
    storageClass: gp2
chunksCache:
  allocatedMemory: 1000
  persistence:
    storageClass: gp2
Chart name: grafana/loki. Chart version: 6.19.0.
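This solution worked for me: https://github.com/grafana/helm-charts/issues/1550. I commented out the endpoint in the values.yml file; you can also simply remove it. The 405 body in the error is an S3 MethodNotAllowed response to a POST, which suggests the credential request was being sent to the configured S3 endpoint.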
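As a rough sketch of that change, assuming the `storage` block sits under `loki:` as in the values file from the question, the S3 section might look like this. Only the `endpoint` line differs from the original; every other key is copied unchanged.

```yaml
loki:
  storage:
    bucketNames:
      chunks: chunks-buck
      ruler: ruler-buck
      admin: admin-buck
    type: s3
    s3:
      s3: s3://ap-southeast-1/chunks-buck
      # endpoint commented out (or delete the line entirely)
      # endpoint: s3.ap-southeast-1.amazonaws.com
      region: ap-southeast-1
      s3ForcePathStyle: false
```

With `region` still set, the client should be able to resolve the regional S3 endpoint on its own, so the explicit `endpoint` is not needed for a standard AWS bucket.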
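Also make sure the values.yml file specifies the correct role ARN in the service account annotation, and that the role has permission to access the bucket.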