OpenCL: Difference between revisions

(Bot: removed 1 template: Wayback)
(我来啦, replaced: 連結 → 链接, 參考文獻 → 参考文献, 與 → 与 (2), 礦 → 矿, 處 → 处, 將 → 将, 發 → 发, 號 → 号 (3), 譯 → 译, 體 → 体, 圖 → 图, 區 → 区 (5), 為 → 为 (2), 於 → 于, 數 → 数 (4), 據 → 据, 責 → 责, 複 → 复, 運 → 运, 並 → 并, 產 → 产, 過 → 过, 較 → 较, 進 → 进 (2), 製 → 制, 現 → 现, 個 → 个, 兩 → 两 (2), 資 → 资 (9), 範 → 范 (2), 圍 → 围, 業 → 业 (2), 設 → 设 (2), 計 → 计 (2), 擴 → 扩, 結 → 结 (2), 網 → 网, 併 → 并 (7), 訊 → 讯, 轉 → 转 (2), 態 → 态 (3), 顯 → 显 (2), 執 → 执, 編 → 编 (3), 組 → 组 (5), 葉 → 叶 (3), 換 → 换 (6), 變 → 变 (4), 側 → 侧 (5))
Line 19:
}}
}}


'''OpenCL''' ('''Open''' '''C'''omputing '''L'''anguage) is a framework for writing programs that run on heterogeneous platforms, which may consist of [[CPU]]s, [[GPU]]s, [[讯号处理器|DSP]]s, [[FPGA]]s, or other kinds of processors and hardware accelerators. OpenCL consists of a language (based on [[C99]]) for writing kernels (functions that run on OpenCL devices) and a set of APIs for defining and controlling the platform. OpenCL provides [[并行计算|parallel computing]] mechanisms based on task partitioning and data partitioning.


OpenCL is analogous to two other open industry standards, [[OpenGL]] and [[OpenAL]], which address 3D graphics and computer audio respectively. OpenCL extends the GPU to uses beyond graphics generation. OpenCL is managed by the non-profit technology consortium [[Khronos Group]].


== History ==
Line 28:
On June 16, 2008, the Khronos Compute Working Group was formed.<ref>{{cite press release |url=http://www.khronos.org/news/press/releases/khronos_launches_heterogeneous_computing_initiative/ |title=Khronos Launches Heterogeneous Computing Initiative |accessdate=2008-06-18 |publisher=Khronos Group |date=2008-06-16 |deadurl=yes |archiveurl=https://web.archive.org/web/20080620123431/http://www.khronos.org/news/press/releases/khronos_launches_heterogeneous_computing_initiative/ |archivedate=2008-06-20 }}</ref> Five months later, on November 18, 2008, the working group completed the technical details of the OpenCL 1.0 specification.<ref name=macWorld>{{cite web | url=http://www.macworld.com/article/136921/2008/11/opencl.html?lsrc=top_2 | title=OpenCL gets touted in Texas | publisher=MacWorld | date=2008-11-20 | accessdate=2009-06-12 | archive-date=2009-02-18 | archive-url=https://web.archive.org/web/20090218165557/http://www.macworld.com/article/136921/2008/11/opencl.html?lsrc=top_2 | dead-url=no }}</ref> After review by Khronos members, the specification was published on December 8, 2008.<ref name=khronosGroup>{{cite press release | url=http://www.khronos.org/news/press/releases/the_khronos_group_releases_opencl_1.0_specification/ | title=The Khronos Group Releases OpenCL 1.0 Specification | publisher=Khronos Group | date=2008-12-08 | accessdate=2009-06-12 | deadurl=yes | archiveurl=https://web.archive.org/web/20100713014204/http://www.khronos.org/news/press/releases/the_khronos_group_releases_opencl_1.0_specification/ | archivedate=2010-07-13 }}</ref> OpenCL 1.1 was released on June 14, 2010.<ref>{{cite press release | url=http://www.khronos.org/news/press/releases/khronos-group-releases-opencl-1-1-parallel-computing-standard/ | title=Khronos Drives Momentum of Parallel Computing Standard with Release of OpenCL 1.1 Specification | publisher=Khronos Group | date=2010-06-14 | accessdate=2010-10-13 | deadurl=yes | archiveurl=https://web.archive.org/web/20100923101844/http://www.khronos.org/news/press/releases/khronos-group-releases-opencl-1-1-parallel-computing-standard | archivedate=2010-09-23 }}</ref>


== Examples ==
=== Fast Fourier transform ===
An expression of the [[快速傅立叶变换|fast Fourier transform]] (FFT):
<ref name=siggraph>{{cite web | url=http://s08.idav.ucdavis.edu/munshi-opencl.pdf | title=OpenCL | accessdate=2008-08-14 | publisher=SIGGRAPH2008 | date=2008-08-14 | archive-url=https://www.webcitation.org/66GmScoh5?url=http://s08.idav.ucdavis.edu/munshi-opencl.pdf | archive-date=2012-03-19 | dead-url=yes }}</ref>
<syntaxhighlight lang="c">
Line 64:
</syntaxhighlight>
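As a host-side reference for the transform named in this subsection, the following is a minimal sketch of a recursive radix-2 Cooley-Tukey FFT in plain Python, checked against a direct DFT. It is not the OpenCL code from the cited slides. The function names `fft` and `naive_dft` are illustrative assumptions, and the input length is assumed to be a power of two.

```python
import cmath


def naive_dft(x):
    """Direct O(n^2) DFT: X[k] = sum_j x[j] * exp(-2*pi*i*k*j/n)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * k * j / n) for j in range(n))
        for k in range(n)
    ]


def fft(x):
    """Recursive radix-2 decimation-in-time FFT (hypothetical helper).

    Assumes len(x) is a power of two; raises ValueError otherwise.
    """
    n = len(x)
    if n == 0:
        raise ValueError("input must not be empty")
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])  # transform of the even-indexed samples
    odd = fft(x[1::2])   # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size transforms with a twiddle factor.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

Comparing `fft(data)` with `naive_dft(data)` on a short list is a quick way to sanity-check the butterfly step before porting it to a kernel.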


The actual computation (based on [http://www.cs.berkeley.edu/~kubitron/courses/cs258-S08/projects/reports/project6_report.pdf Fitting FFT onto the G80 Architecture]):<ref name=VolkovKazianFFTG80>{{cite web | url=http://www.cs.berkeley.edu/~kubitron/courses/cs258-S08/projects/reports/project6_report.pdf | title=Fitting FFT onto G80 Architecture | accessdate=2008-11-14 | publisher=Vasily Volkov and Brian Kazian, UC Berkeley CS258 project report | format=PDF | date=May 2008 | archive-date=2012-03-19 | archive-url=https://www.webcitation.org/66GmTA1HM?url=http://www.cs.berkeley.edu/~kubitron/courses/cs258-S08/projects/reports/project6_report.pdf | dead-url=no }}</ref>
<syntaxhighlight lang="c">
// This kernel computes FFT of length 1024. The 1024 length FFT is decomposed into
Line 99:
}
</syntaxhighlight>
An FFT example is available on Apple's website.<ref name=AppleOpenCLFFT>{{cite web | url=https://developer.apple.com/mac/library/samplecode/OpenCL_FFT/index.html | title=OpenCL on FFT | accessdate=2009-12-07 | publisher=Apple | date=16 Nov 2009 | archive-date=2009-11-30 | archive-url=https://web.archive.org/web/20091130085543/http://developer.apple.com/mac/library/samplecode/OpenCL_FFT/index.html | dead-url=no }}</ref>


=== Parallel merge sort ===
Using Python 3.x with PyOpenCL and NumPy:
<div style="height:400px; overflow-y: scroll;">
<div style="height:400px; overflow-y: scroll;">
<syntaxhighlight lang="python3">
<syntaxhighlight lang="python3">
Line 112:

def dump_step(data, chunk_size):
    """Display the state of the sort at each step."""
    msg = io.StringIO('')
    div = io.StringIO('')
Line 134:


def cl_merge_sort_sbs(data_in):
    """Parallel merge sort."""
    # Source code of the OpenCL kernel function
    CL_CODE = '''
kernel void merge(int chunk_size, int size, global long* data, global long* buff) {
    // Get the group index
    const int gid = get_global_id(0);

    // Compute the range to process from the group index
    const int offset = gid * chunk_size;
    const int real_size = min(offset + chunk_size, size) - offset;
Line 147:
    global long* buff_part = buff + offset;

    // Set the initial state before merging
    int r_beg = chunk_size >> 1;
    int b_ptr = 0;
Line 153:
    int r_ptr = r_beg;

    // Perform the merge
    while (b_ptr < real_size) {
        if (r_ptr >= real_size) {
            // Right side is exhausted: push the left element into the buffer
            buff_part[b_ptr] = data_part[l_ptr++];
        } else if (l_ptr == r_beg) {
            // Left side is exhausted: push the right element into the buffer
            buff_part[b_ptr] = data_part[r_ptr++];
        } else {
            // Both sides have data: push the smaller element into the buffer
            if (data_part[l_ptr] < data_part[r_ptr]) {
                buff_part[b_ptr] = data_part[l_ptr++];
Line 174:
    '''


    # Allocate compute resources and build the OpenCL program
    ctx = cl.Context(dev_type=cl.device_type.GPU)
    prg = cl.Program(ctx, CL_CODE).build()
Line 180:
    mf = cl.mem_flags

    # Convert to NumPy form so the data can be turned into an OpenCL Buffer
    data_np = np.int64(data_in)
    buff_np = np.empty_like(data_np)

    # Create the buffers and copy the values into them
    data = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=data_np)
    buff = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=buff_np)

    # Set the initial state before merging
    data_len = np.int32(len(data_np))
    chunk_size = np.int32(1)
Line 194:
    dump_step(data_np, chunk_size)
    while chunk_size < data_len:
        # Update the chunk size; it doubles every round
        chunk_size <<= 1
        # Compute the number of parallel work groups
        group_size = ((data_len - 1) // chunk_size) + 1
        # Merge within each group
        prg.merge(queue, (group_size,), (1,), chunk_size, data_len, data, buff)
        # Use the merged result as the input of the next round
        temp = data
        data = buff
        buff = temp
        # Display the state after this round
        cl.enqueue_copy(queue, data_np, data)
        dump_step(data_np, chunk_size)
Line 224:
</div>
</div>


Result:
<syntaxhighlight lang="text">
--------------------------------------------------------------------------------------
Line 239:
</syntaxhighlight>
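The merge step that each work-item performs in the kernel above can be emulated on the CPU, which is handy for checking the indexing without an OpenCL device. The following is a minimal sketch under stated assumptions: `merge_pass` and `merge_sort_reference` are hypothetical names, a plain Python loop stands in for the parallel work-items, and the branch hidden by the diff is assumed to take the right-hand element when the left one is not smaller.

```python
def merge_pass(data, chunk_size):
    """Emulate one kernel launch: every chunk merges its two sorted halves."""
    size = len(data)
    buff = [None] * size
    groups = (size - 1) // chunk_size + 1
    for gid in range(groups):  # each iteration plays the role of one work-item
        offset = gid * chunk_size
        real_size = min(offset + chunk_size, size) - offset
        r_beg = chunk_size >> 1  # first index of the right half
        l_ptr, r_ptr = 0, r_beg
        for b_ptr in range(real_size):
            if r_ptr >= real_size:
                take_left = True   # right half exhausted
            elif l_ptr == r_beg:
                take_left = False  # left half exhausted
            else:
                # Assumed tie-breaking: left only when strictly smaller.
                take_left = data[offset + l_ptr] < data[offset + r_ptr]
            if take_left:
                buff[offset + b_ptr] = data[offset + l_ptr]
                l_ptr += 1
            else:
                buff[offset + b_ptr] = data[offset + r_ptr]
                r_ptr += 1
    return buff


def merge_sort_reference(values):
    """Bottom-up merge sort: double the chunk size until one chunk covers all."""
    data = list(values)
    chunk_size = 1
    while chunk_size < len(data):
        chunk_size <<= 1
        data = merge_pass(data, chunk_size)
    return data
```

Running `merge_sort_reference` on the same input as the PyOpenCL version and comparing the outputs is one way to validate the device results.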


== References ==
{{Reflist|30em}}


== External links ==
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl Products supporting OpenCL]
* [https://web.archive.org/web/20190122085224/http://www.opengpu.org/ Open-source GPU community]{{zh-cn}}


Line 251:
* [[DirectCompute]]
* [[C++ AMP]]
* [[比特币|Bitcoin]] mining


{{-}}