Google Cloud Platform

GCPCluster([projectid, zone, network, ...])

Cluster running on GCP VM Instances.

Overview

Authentication

To create clusters on GCP you need to set up your authentication credentials. You can do this with the gcloud command line tool.

$ gcloud auth login

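Alternatively, you can use a service account, which provides its credentials in a JSON file. You must set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of that JSON file.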

$ export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json
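Before launching a cluster it can help to confirm that the variable points at a readable service account key. The following is a minimal sketch using only the standard library; the helper name check_gcp_credentials is hypothetical and is not part of Dask Cloudprovider. It relies on the fact that service account key files contain "type" and "client_email" fields.

```python
import json
import os


def check_gcp_credentials(environ=os.environ):
    """Return the service account e-mail from the key file named by
    GOOGLE_APPLICATION_CREDENTIALS, or None when the variable is unset.

    Hypothetical helper for illustration; not part of Dask Cloudprovider.
    """
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        # Nothing to check: credentials may come from `gcloud auth login`.
        return None
    with open(path) as f:
        info = json.load(f)  # raises if the file is missing or not valid JSON
    if info.get("type") != "service_account":
        raise ValueError(f"{path} is not a service account key file")
    return info.get("client_email")
```

Calling check_gcp_credentials() with no arguments inspects the current process environment; passing a dict makes the check easy to exercise in isolation.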

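Project ID

To use Dask Cloudprovider with GCP you must also configure your Project ID. When you create a GCP account you will generally create a default project. Its ID can be found at the top of the GCP console.

Your Project ID must be added to your Dask config file.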

# ~/.config/dask/cloudprovider.yaml
cloudprovider:
  gcp:
    projectid: "YOUR PROJECT ID"

Or set it via an environment variable.

$ export DASK_CLOUDPROVIDER__GCP__PROJECTID="YOUR PROJECT ID"
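The environment variable and the YAML entry refer to the same setting: Dask reads variables that start with DASK_, lower-cases the remainder, and treats a double underscore as a nesting separator. The sketch below only illustrates that naming convention; dask_env_to_config_key is a hypothetical helper, not a Dask function.

```python
def dask_env_to_config_key(name):
    """Map an environment variable name to the dotted Dask config key it sets.

    Hypothetical helper that mirrors the naming convention only.
    """
    prefix = "DASK_"
    if not name.startswith(prefix):
        raise ValueError("Dask only reads variables that start with DASK_")
    # Drop the prefix, lower-case the rest, and turn "__" into a nesting dot.
    return name[len(prefix):].lower().replace("__", ".")


print(dask_env_to_config_key("DASK_CLOUDPROVIDER__GCP__PROJECTID"))
# prints: cloudprovider.gcp.projectid
```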

Google Cloud VM

class dask_cloudprovider.gcp.GCPCluster(projectid=None, zone=None, network=None, network_projectid=None, machine_type=None, on_host_maintenance=None, source_image=None, docker_image=None, ngpus=None, gpu_type=None, filesystem_size=None, disk_type=None, auto_shutdown=None, bootstrap=True, preemptible=None, debug=False, instance_labels=None, service_account=None, service_account_credentials: Optional[Dict[str, Any]] = None, **kwargs)

Cluster running on GCP VM Instances.

This cluster manager constructs a Dask cluster running on Google Cloud Platform VMs.

When configuring your cluster you may find it useful to install the gcloud tool for querying the GCP API for available options.

https://cloud.google.com/sdk/gcloud

Parameters

projectid: str

Your GCP project ID. This must be set either here or in your Dask config.

https://cloud.google.com/resource-manager/docs/creating-managing-projects

See the GCP docs page for more info.

https://cloudprovider.dask.org.cn/en/latest/gcp.html#project-id

zone: str

The GCP zone to launch your cluster in. A full list can be obtained with gcloud compute zones list.

network: str

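The GCP VPC network/subnetwork to use. The default is default. If using firewall rules, please ensure the following access is configured:

egress 0.0.0.0/0 on all ports, for downloading docker images and general data access

ingress 10.0.0.0/8 on all ports, for internal communication between workers

ingress 0.0.0.0/0 on ports 8786-8787, for external access to the dashboard/scheduler

(optional) ingress 0.0.0.0/0 on port 22, for ssh access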
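To make the effect of the required rules concrete, the sketch below encodes the three mandatory entries from the list above as data and checks a connection against them with the standard ipaddress module. It is only an illustration of the rule semantics: the RULES table and is_allowed helper are hypothetical, "all ports" is modelled as 0-65535, and nothing here talks to the GCP firewall API.

```python
import ipaddress

# (direction, CIDR, first port, last port), taken from the list above.
# The optional ssh rule is left out on purpose.
RULES = [
    ("egress", "0.0.0.0/0", 0, 65535),
    ("ingress", "10.0.0.0/8", 0, 65535),
    ("ingress", "0.0.0.0/0", 8786, 8787),
]


def is_allowed(direction, address, port, rules=RULES):
    """Return True if any rule permits this direction, address and port."""
    ip = ipaddress.ip_address(address)
    for rule_direction, cidr, first, last in rules:
        if (
            rule_direction == direction
            and ip in ipaddress.ip_network(cidr)
            and first <= port <= last
        ):
            return True
    return False


# A worker on the internal range may reach any port on another worker.
print(is_allowed("ingress", "10.142.0.39", 40000))
# An external client may reach the dashboard, but not ssh.
print(is_allowed("ingress", "34.75.60.62", 8787))
print(is_allowed("ingress", "34.75.60.62", 22))
```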

network_projectid: str

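The project ID of the GCP network. Defaults to projectid. In some cases (for example Shared VPC) network configuration from a different GCP project may be used.

machine_type: str

The VM machine_type. You can get a full list with gcloud compute machine-types list. The default is n1-standard-1, which has 3.75GB RAM and 1 vCPU.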

source_image: str

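The OS image to use for the VM. Dask Cloudprovider will bootstrap Ubuntu based images automatically. Other images require Docker and, for GPUs, the NVIDIA Drivers and NVIDIA Docker.

A list of available images can be found with gcloud compute images list.

Valid values are:

The short image name, provided it is in projectid.
The full image name projects/<projectid>/global/images/<source_image>.
The full image URI, such as those listed by gcloud compute images list --uri.

The default is projects/ubuntu-os-cloud/global/images/ubuntu-minimal-1804-bionic-v20201014.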
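The three accepted forms can be told apart by their prefix. The sketch below shows one way to expand a short name into the full image name while leaving the other two forms untouched. The helper full_image_name is hypothetical and "my-project" and "my-image" are placeholder values; this is not necessarily how Dask Cloudprovider resolves images internally.

```python
def full_image_name(source_image, projectid):
    """Expand a short image name; pass full names and URIs through unchanged.

    Hypothetical helper for illustration only.
    """
    if source_image.startswith(("https://", "http://")):
        # Full image URI, as printed by `gcloud compute images list --uri`.
        return source_image
    if source_image.startswith("projects/"):
        # Already a full image name.
        return source_image
    # Short name: assumed to live in the caller's own project.
    return f"projects/{projectid}/global/images/{source_image}"


print(full_image_name("my-image", "my-project"))
# prints: projects/my-project/global/images/my-image
```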

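docker_image: string (optional)

The Docker image to run on all instances.

This image must have a valid Python environment and have dask installed so that the dask-scheduler and dask-worker commands are available. It is recommended that the Python environment matches the local environment from which GCPCluster is created.

For GPU instance types the Docker image must have NVIDIA drivers and dask-cuda installed.

By default the daskdev/dask:latest image will be used.

docker_args: string (optional)

Extra command line arguments to pass to Docker.

extra_bootstrap: list[str] (optional)

Extra commands to be run during the bootstrap phase.

ngpus: int (optional)

The number of GPUs to attach to the instance. Default is 0.

gpu_type: str (optional)

The name of the GPU to use. This must be set if ngpus>0. You can see a list of GPUs available in each zone with gcloud compute accelerator-types list.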
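The dependency between ngpus and gpu_type can be checked before any VM is requested. This is a small sketch of such a pre-flight check; check_gpu_options is a hypothetical helper rather than part of the library, and nvidia-tesla-t4 is used only as an example accelerator name, which may not be available in your zone.

```python
def check_gpu_options(ngpus=0, gpu_type=None):
    """Validate the GPU arguments and return them as keyword arguments.

    Hypothetical pre-flight check; not part of Dask Cloudprovider.
    """
    if ngpus < 0:
        raise ValueError("ngpus must not be negative")
    if ngpus > 0 and not gpu_type:
        # Mirrors the documented requirement: gpu_type is needed when ngpus>0.
        raise ValueError("gpu_type must be set when ngpus > 0")
    return {"ngpus": ngpus, "gpu_type": gpu_type}


gpu_kwargs = check_gpu_options(ngpus=2, gpu_type="nvidia-tesla-t4")
print(gpu_kwargs)
```

The returned dict could then be unpacked into the constructor, for example GCPCluster(n_workers=1, **gpu_kwargs).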

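filesystem_size: int (optional)

The VM filesystem size in GB. Defaults to 50.

disk_type: str (optional)

Type of disk to use. Default is pd-standard. You can see a list of disk types available in each zone with gcloud compute disk-types list.

on_host_maintenance: str (optional)

The Host Maintenance GCP option. Defaults to TERMINATE.

n_workers: int (optional)

Number of workers to initialise the cluster with. Defaults to 0.

bootstrap: bool (optional)

Install Docker, and the NVIDIA drivers if ngpus>0. Set to False if you are using a custom source_image which already has these requirements. Defaults to True.

worker_class: str

The Python class to run for the worker. Defaults to dask.distributed.Nanny.

worker_options: dict (optional)

Params to be passed to the worker class. See distributed.worker.Worker for the default worker class. If you set worker_class, refer to the docstring of the custom worker class.

env_vars: dict (optional)

Environment variables to be passed to the worker.

scheduler_options: dict (optional)

Params to be passed to the scheduler class. See distributed.scheduler.Scheduler.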

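silence_logs: bool (optional)

Whether or not to silence logging when setting up the cluster.

asynchronous: bool (optional)

Set this if the cluster is intended to be used directly within an event loop with async/await.

security: Security or bool (optional)

Configures communication security in this cluster. Can be a Security object, or True. If True, temporary self-signed credentials will be created automatically. Default is True.

preemptible: bool (optional)

Whether to use preemptible instances for workers in this cluster. Defaults to False.

debug: bool (optional)

More information will be printed when constructing clusters to enable debugging.

instance_labels: dict (optional)

Labels to be applied to all GCP instances upon creation.

service_account: str

Service account that all VMs will run under. Defaults to the default Compute Engine service account for your GCP project.

service_account_credentials: Optional[Dict[str, Any]]

Service account credentials used to create the Compute Engine VMs.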

Examples

Create the cluster.

>>> from dask_cloudprovider.gcp import GCPCluster
>>> cluster = GCPCluster(n_workers=1)
Launching cluster with the following configuration:
Source Image: projects/ubuntu-os-cloud/global/images/ubuntu-minimal-1804-bionic-v20201014
Docker Image: daskdev/dask:latest
Machine Type: n1-standard-1
Filesystem Size: 50
N-GPU Type:
Zone: us-east1-c
Creating scheduler instance
dask-acc897b9-scheduler
        Internal IP: 10.142.0.37
        External IP: 34.75.60.62
Waiting for scheduler to run
Scheduler is running
Creating worker instance
dask-acc897b9-worker-bfbc94bc
        Internal IP: 10.142.0.39
        External IP: 34.73.245.220

Connect a client.

>>> from dask.distributed import Client
>>> client = Client(cluster)

Do some work.

>>> import dask.array as da
>>> arr = da.random.random((1000, 1000), chunks=(100, 100))
>>> arr.mean().compute()
0.5001550986751964

Close the cluster.

>>> cluster.close()
Closing Instance: dask-acc897b9-worker-bfbc94bc
Closing Instance: dask-acc897b9-scheduler

You can also do all of this in one go with context managers, which ensure the cluster is both created and cleaned up.

>>> with GCPCluster(n_workers=1) as cluster:
...     with Client(cluster) as client:
...         print(da.random.random((1000, 1000), chunks=(100, 100)).mean().compute())
Launching cluster with the following configuration:
Source Image: projects/ubuntu-os-cloud/global/images/ubuntu-minimal-1804-bionic-v20201014
Docker Image: daskdev/dask:latest
Machine Type: n1-standard-1
Filesystem Size: 50
N-GPU Type:
Zone: us-east1-c
Creating scheduler instance
dask-19352f29-scheduler
        Internal IP: 10.142.0.41
        External IP: 34.73.217.251
Waiting for scheduler to run
Scheduler is running
Creating worker instance
dask-19352f29-worker-91a6bfe0
        Internal IP: 10.142.0.48
        External IP: 34.73.245.220
0.5000812282861661
Closing Instance: dask-19352f29-worker-91a6bfe0
Closing Instance: dask-19352f29-scheduler

Attributes

asynchronous: Are we running in the event loop?
auto_shutdown
bootstrap
called_from_running_loop
command
dashboard_link
docker_image
gpu_instance
loop
name
observed
plan
requested
scheduler_address
scheduler_class
worker_class

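Methods

`adapt`([Adaptive, minimum, maximum, ...])	Turn on adaptivity
`call_async`(f, *args, **kwargs)	Run a blocking function in a thread as a coroutine.
`from_name`(name)	Create an instance of this class to represent an existing cluster by name.
`get_client`()	Return a client for the cluster
`get_logs`([cluster, scheduler, workers])	Return logs for the cluster, scheduler and workers
`get_tags`()	Generate tags to be applied to all resources.
`new_worker_spec`()	Return the name and spec for the next worker
`scale`([n, memory, cores])	Scale the cluster to n workers
`scale_up`([n, memory, cores])	Scale the cluster to n workers
`sync`(func, *args[, asynchronous, ...])	Call `func` with `args` synchronously or asynchronously depending on the calling context
`wait_for_workers`(n_workers[, timeout])	Blocking call to wait for n workers before continuing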

close
get_cloud_init
logs
render_cloud_init
render_process_cloud_init
scale_down
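Several of the methods listed above are commonly used together: request a number of workers, block until they have joined, then obtain a client. The sketch below chains scale, wait_for_workers and get_client on any cluster object that provides them. The wrapper scale_and_wait is hypothetical, the 600 default is an arbitrary choice, and the timeout is assumed to be in seconds.

```python
def scale_and_wait(cluster, n_workers, timeout=600):
    """Request n_workers, block until they have joined, then return a client.

    Hypothetical convenience wrapper; `cluster` is expected to expose the
    scale, wait_for_workers and get_client methods listed above.
    """
    cluster.scale(n_workers)
    # Blocks until n_workers are connected or the timeout expires.
    cluster.wait_for_workers(n_workers, timeout=timeout)
    return cluster.get_client()
```

With a running GCPCluster this might be used as client = scale_and_wait(cluster, 4). Because VM start-up takes time, waiting explicitly avoids submitting work before any worker is available.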

Dask Cloud Provider 2024.9.1+4.g7c0354d documentation