

Adds a URL to each user code classloader on all nodes in the cluster. The paths must specify a
protocol (e.g. file://) and be accessible on all nodes (e.g. by means of an NFS share). You can use
this option multiple times to specify more than one URL. The protocol must be supported by the
{@link}.

Specify the path of the Python interpreter used to execute the Python UDF worker
(e.g.: --pyExecutable /usr/local/bin/python3). The Python UDF worker depends on Python 3.5+,
Apache Beam (version == 2.15.0), Pip (version >= 7.1.0) and SetupTools (version >= 37.0.0).
Please ensure that the specified environment meets these requirements.

Flink supports several deployment modes, such as Local, Standalone, YARN and K8S. Most enterprise
big data platforms today use YARN as their resource manager, so the Flink on YARN mode is very
widely used in practice. The following describes how to use Flink on YARN.

bin/flink run -m node1:9999 ./examples
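The Python UDF worker requirements listed above (Python 3.5+, Apache Beam == 2.15.0, Pip >= 7.1.0,
SetupTools >= 37.0.0) can be checked with a short script before a job is submitted. The following
is a minimal sketch and not part of Flink: the helper names parse_version, satisfies and check are
hypothetical, and versions are compared as plain dotted integers, which ignores pre-release
suffixes.

```python
import operator
import re

# Requirements as stated above for the Python UDF worker.
REQUIREMENTS = {
    "python": (">=", "3.5"),
    "apache-beam": ("==", "2.15.0"),
    "pip": (">=", "7.1.0"),
    "setuptools": (">=", "37.0.0"),
}

_OPS = {">=": operator.ge, "==": operator.eq}


def parse_version(text):
    """Turn '2.15.0' into (2, 15, 0), keeping the leading digits of each piece."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break  # stop at the first piece that does not start with a digit
        parts.append(int(match.group()))
    return tuple(parts)


def satisfies(installed, op, required):
    """Compare two dotted versions after padding the shorter one with zeros."""
    have, want = parse_version(installed), parse_version(required)
    width = max(len(have), len(want))
    have += (0,) * (width - len(have))
    want += (0,) * (width - len(want))
    return _OPS[op](have, want)


def check(installed_versions):
    """Return the names that are missing or fail their requirement."""
    failures = []
    for name, (op, required) in REQUIREMENTS.items():
        installed = installed_versions.get(name)
        if installed is None or not satisfies(installed, op, required):
            failures.append(name)
    return failures
```

The installed_versions dictionary is assumed to be filled from the target interpreter, for example
from the output of its --version flag and of pip list. An empty result means every stated
requirement is met.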
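The Hadoop environment used here was installed beforehand. If Hadoop is not yet installed, install
the required environment first and start it.

If present, runs the job in detached mode (deprecated; use non-YARN specific option instead).

The location of the savepoint for the job (e.g. hdfs:///flink/savepoint-1537).

Address of the JobManager (master) to connect to. This option takes precedence over the JobManager
address specified in the configuration file.

If the job is submitted in attached mode, perform a best-effort cluster shutdown when the CLI is
terminated abruptly, e.g. in response to a user interrupt such as typing Ctrl + C.

Python module with the program entry point. This option must be used in conjunction with
--pyFiles.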
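To make the interaction between these options concrete, the following sketch assembles a flink run
command line as an argument list. It is illustrative only: the function build_flink_run and its
parameter names are hypothetical, and the flag spellings -m, --fromSavepoint,
--shutdownOnAttachedExit, --pyModule and --pyFiles are assumed to match the Flink CLI version in
use, so check the output of bin/flink run --help before relying on them.

```python
def build_flink_run(jobmanager=None, savepoint=None, shutdown_on_exit=False,
                    py_module=None, py_files=None, flink_bin="bin/flink"):
    """Assemble an argument list for `flink run`; nothing is executed."""
    if py_module and not py_files:
        # The text above states that the module entry point must be
        # used together with --pyFiles.
        raise ValueError("py_module requires py_files")
    args = [flink_bin, "run"]
    if jobmanager:
        # Takes precedence over the address in the configuration file.
        args += ["-m", jobmanager]
    if savepoint:
        args += ["--fromSavepoint", savepoint]
    if shutdown_on_exit:
        args.append("--shutdownOnAttachedExit")
    if py_files:
        # Multiple files are joined with commas.
        args += ["--pyFiles", ",".join(py_files)]
    if py_module:
        args += ["--pyModule", py_module]
    return args
```

The returned list can be handed to subprocess.run once the flags have been verified. Building a
list rather than a single string avoids shell quoting problems with paths such as
hdfs:///flink/savepoint-1537.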
Ship files in the specified directory (t for transfer).

Python script with the program entry point. Its dependent resources can be configured with the
--pyFiles option.

It is recommended to set this to the number of CPU cores on each machine. In general, the number
of vcores equals the number of processing slots (-s).

Add Python archive files for the job. The archive files will be extracted to the working
directory of the Python UDF worker. Currently only the zip format is supported. For each archive
file, a target directory can be specified. If the target directory name is specified, the archive
file will be extracted to a directory with the specified name. Otherwise, the archive file will be
extracted to a directory with the same name as the archive file. The files uploaded via this
option are accessible via relative paths. '#' can be used as the separator between the archive
file path and the target directory name. A comma (',') can be used as the separator to specify
multiple archive files. This option can be used to upload the virtual environment and the data
files used in a Python UDF (e.g.: --pyArchives file:///tmp/,file:///tmp/data.zip#data
--pyExecutable /python). The data files can then be accessed in the Python UDF, e.g.:
f = open('data/data.txt', 'r').
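The separator rules above can be illustrated with a small parser. This is a sketch of the
described convention, not Flink's own implementation: parse_py_archives is a hypothetical helper,
and venv.zip in the usage note is a made-up file name.

```python
import posixpath


def parse_py_archives(value):
    """Split a --pyArchives value into (archive_path, target_dir) pairs."""
    pairs = []
    for entry in value.split(","):  # ',' separates multiple archive files
        entry = entry.strip()
        if not entry:
            continue
        if "#" in entry:
            # '#' separates the archive path from the target directory name.
            path, target = entry.split("#", 1)
        else:
            # No explicit target: the directory is named after the archive file.
            path, target = entry, posixpath.basename(entry)
        pairs.append((path, target))
    return pairs
```

For instance, parse_py_archives("file:///tmp/venv.zip,file:///tmp/data.zip#data") returns
[("file:///tmp/venv.zip", "venv.zip"), ("file:///tmp/data.zip", "data")], which is why a UDF can
open data/data.txt relative to its working directory.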
In Per-Job mode, submitting a job is comparatively simple: there is no need to start a Flink
cluster in YARN beforehand. You submit the job directly and it runs as a Flink job. For example:

bin/flink run -m yarn-cluster -yjm 1024 -ytm 1024 ./examples/batch/WordCount.jar

Attach custom Python files for the job. These files will be added to the PYTHONPATH of both the
local client and the remote Python UDF worker. The standard Python resource file suffixes such as
.py/.egg/.zip, as well as directories, are all supported. A comma (',') can be used as the
separator to specify multiple files, e.g.: --pyFiles file:///tmp/,hdfs:///$namenode_address/

1. Environment

Node     OS version    Installed service    GreenPlum version
node1    CentOS 7.7    Master               6.7.1
node2    CentOS 7.7    Segment              6.7.1
node3    CentOS 7.7
