Is there a way to use Python dependencies in the Kubeflow Spark Operator, packaged as a wheel, an .egg, or just plain .py files?
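The manifest I have in mind looks something like this, with the dependencies listed under either jars or files. I think files would make more sense: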
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi-python
  namespace: default
spec:
  type: Python
  pythonVersion: "3"
  mode: cluster
  image: spark:3.5.3
  imagePullPolicy: IfNotPresent
  mainApplicationFile: local:///path/to/my/python/script.py
  deps:
    jars:
      - local:///path/to/python/functions.py
    files:
      - gs://path/to/python/functions.py
  sparkVersion: 3.5.3
  driver:
    cores: 1
    memory: 512m
    serviceAccount: spark-operator-spark
  executor:
    instances: 1
    cores: 1
    memory: 512m
Yes, Python files can be used as dependencies. List them under deps.pyFiles rather than jars or files. This worked for me:
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: view-creator-test
  namespace: default
spec:
  type: Python
  pythonVersion: "3"
  mode: cluster
  image: spark:3.5.3
  imagePullPolicy: IfNotPresent
  mainApplicationFile: local:///path/to/my/python/script.py
  arguments: []
  sparkVersion: 3.5.3
  deps:
    pyFiles:
      - local:///mnt/spark/dependency_1.py
      - local:///mnt/spark/dependency_2.py
  driver:
    labels:
      version: 3.5.3
    cores: 1
    memory: 512m
    volumeMounts:
      - name: view-creator-volume
        mountPath: /mnt/spark
  executor:
    labels:
      version: 3.5.3
    instances: 1
    cores: 1
    memory: 512m
    volumeMounts:
      - name: view-creator-volume
        mountPath: /mnt/spark
  volumes:
    - name: view-creator-volume
      persistentVolumeClaim:
        claimName: view-creator-pvc
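If the number of dependency modules grows, one option is to bundle them into a single zip archive instead of listing each .py file. As far as I know, spark-submit's --py-files accepts .zip archives as well as .py and .egg files, and I am assuming deps.pyFiles is passed through to it unchanged. The sketch below uses only the Python standard library; the helper name bundle_modules and the archive name deps.zip are made up for illustration, and the demo at the bottom only shows that a module placed at the root of the archive becomes importable once the zip is on sys.path.

```python
import sys
import tempfile
import zipfile
from pathlib import Path


def bundle_modules(module_paths, archive_path):
    """Write the given .py files to the root of a zip archive.

    Hypothetical helper, not part of Spark or the Spark Operator.
    """
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in module_paths:
            path = Path(path)
            # arcname keeps each module at the archive root, so that
            # `import dependency_1` resolves once the zip is on sys.path.
            zf.write(path, arcname=path.name)
    return archive_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for a real dependency module.
        src = Path(tmp) / "dependency_1.py"
        src.write_text("def greet():\n    return 'hello from dependency_1'\n")

        archive = bundle_modules([src], Path(tmp) / "deps.zip")

        # Python can import directly from a zip on sys.path (zipimport).
        sys.path.insert(0, str(archive))
        import dependency_1

        print(dependency_1.greet())
```

The resulting archive would then be placed on the mounted volume and referenced as a single pyFiles entry, for example local:///mnt/spark/deps.zip (a hypothetical path). I have only tested the plain .py variant shown in the manifest above, so treat the zip route as something to verify in your own cluster.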