}}

'''OpenCL''' ('''Open''' '''C'''omputing '''L'''anguage) is a framework for writing programs that run on heterogeneous platforms, which may consist of [[CPU]]s, [[GPU]]s, [[Digital signal processor|DSP]]s, [[FPGA]]s, or other types of processors and hardware accelerators. OpenCL consists of a language (based on [[C99]]) for writing kernels (functions that run on OpenCL devices) and a set of APIs for defining and controlling the platform. OpenCL provides [[parallel computing]] based on both task parallelism and data parallelism.

OpenCL is similar to two other open industry standards, [[OpenGL]] and [[OpenAL]], which cover 3D graphics and computer audio respectively. OpenCL extends the use of the GPU beyond graphics rendering. It is managed by the non-profit technology consortium [[Khronos Group]].
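
The data-parallel model described above can be pictured as one kernel body that is run once per index. The following is only a rough sketch in plain Python, not OpenCL API code: the names <code>launch</code> and <code>vector_add_kernel</code> are hypothetical, and the loop runs sequentially, whereas an OpenCL device would typically execute the work-items concurrently and each one would obtain its index with <code>get_global_id(0)</code>.

```python
def vector_add_kernel(gid, a, b, out):
    """Body of one work-item: it handles only the element at index gid."""
    out[gid] = a[gid] + b[gid]


def launch(kernel, global_size, *args):
    """Sequential stand-in (hypothetical helper) for launching global_size work-items."""
    for gid in range(global_size):
        kernel(gid, *args)


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
out = [0] * len(a)
launch(vector_add_kernel, len(a), a, b, out)
print(out)  # [11, 22, 33, 44]
```

Because every work-item writes to a different element of <code>out</code>, the result does not depend on the order in which the indices are processed, which is what makes this kind of computation easy to spread over many processing elements.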

== History ==

<!-- ... -->
On June 16, 2008, Khronos formed its Compute Working Group.<ref>{{cite press release |url=http://www.khronos.org/news/press/releases/khronos_launches_heterogeneous_computing_initiative/ |title=Khronos Launches Heterogeneous Computing Initiative |accessdate=2008-06-18 |publisher=Khronos Group |date=2008-06-16 |deadurl=yes |archiveurl=https://web.archive.org/web/20080620123431/http://www.khronos.org/news/press/releases/khronos_launches_heterogeneous_computing_initiative/ |archivedate=2008-06-20 }}</ref> Five months later, on November 18, 2008, the working group finalized the technical details of the OpenCL 1.0 specification.<ref name=macWorld>{{cite web | url=http://www.macworld.com/article/136921/2008/11/opencl.html?lsrc=top_2 | title=OpenCL gets touted in Texas | publisher=MacWorld | date=2008-11-20 | accessdate=2009-06-12 | archive-date=2009-02-18 | archive-url=https://web.archive.org/web/20090218165557/http://www.macworld.com/article/136921/2008/11/opencl.html?lsrc=top_2 | dead-url=no }}</ref> After review by Khronos members, the specification was released publicly on December 8, 2008.<ref name=khronosGroup>{{cite press release | url=http://www.khronos.org/news/press/releases/the_khronos_group_releases_opencl_1.0_specification/ | title=The Khronos Group Releases OpenCL 1.0 Specification | publisher=Khronos Group | date=2008-12-08 | accessdate=2009-06-12 | deadurl=yes | archiveurl=https://web.archive.org/web/20100713014204/http://www.khronos.org/news/press/releases/the_khronos_group_releases_opencl_1.0_specification/ | archivedate=2010-07-13 }}</ref> OpenCL 1.1 was released on June 14, 2010.<ref>{{cite press release | url=http://www.khronos.org/news/press/releases/khronos-group-releases-opencl-1-1-parallel-computing-standard/ | title=Khronos Drives Momentum of Parallel Computing Standard with Release of OpenCL 1.1 Specification | publisher=Khronos Group | date=2010-06-14 | accessdate=2010-10-13 | deadurl=yes | archiveurl=https://web.archive.org/web/20100923101844/http://www.khronos.org/news/press/releases/khronos-group-releases-opencl-1-1-parallel-computing-standard | archivedate=2010-09-23 }}</ref>

== Examples ==

=== Fast Fourier transform ===

An example of a [[fast Fourier transform]] (FFT):<ref name=siggraph>{{cite web | url=http://s08.idav.ucdavis.edu/munshi-opencl.pdf | title=OpenCL | accessdate=2008-08-14 | publisher=SIGGRAPH2008 | date=2008-08-14 | archive-url=https://www.webcitation.org/66GmScoh5?url=http://s08.idav.ucdavis.edu/munshi-opencl.pdf | archive-date=2012-03-19 | dead-url=yes }}</ref>

<syntaxhighlight lang="c">
// ...
</syntaxhighlight>

The actual computation (based on [http://www.cs.berkeley.edu/~kubitron/courses/cs258-S08/projects/reports/project6_report.pdf Fitting FFT onto the G80 Architecture]):<ref name=VolkovKazianFFTG80>{{cite web | url=http://www.cs.berkeley.edu/~kubitron/courses/cs258-S08/projects/reports/project6_report.pdf | title=Fitting FFT onto G80 Architecture | accessdate=2008-11-14 | publisher=Vasily Volkov and Brian Kazian, UC Berkeley CS258 project report | format=PDF | date=May 2008 | archive-date=2012-03-19 | archive-url=https://www.webcitation.org/66GmTA1HM?url=http://www.cs.berkeley.edu/~kubitron/courses/cs258-S08/projects/reports/project6_report.pdf | dead-url=no }}</ref>

<syntaxhighlight lang="c">
// This kernel computes FFT of length 1024. The 1024 length FFT is decomposed into
// ...
}
</syntaxhighlight>

A Fourier transform example can be found on Apple's website.<ref name=AppleOpenCLFFT>{{cite web | url=https://developer.apple.com/mac/library/samplecode/OpenCL_FFT/index.html | title=OpenCL on FFT | accessdate=2009-12-07 | publisher=Apple | date=16 Nov 2009 | archive-date=2009-11-30 | archive-url=https://web.archive.org/web/20091130085543/http://developer.apple.com/mac/library/samplecode/OpenCL_FFT/index.html | dead-url=no }}</ref>
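
When experimenting with a device FFT, it can help to have a small CPU reference to compare results against. The following is a minimal sketch of a recursive radix-2 Cooley-Tukey FFT in plain Python. It is not the decomposition used by the kernel above and is not taken from the cited sources; the function names <code>fft</code> and <code>dft</code> are hypothetical, and the input length is assumed to be a power of two.

```python
import cmath


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])   # transform of the even-indexed samples
    odd = fft(x[1::2])    # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor that combines the two half-size transforms
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dft(x):
    """Direct O(n^2) discrete Fourier transform, used only to cross-check fft()."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]
```

For short inputs the two functions should agree up to floating-point rounding, so <code>dft</code> can serve as a check on <code>fft</code>, and <code>fft</code> in turn as a check on a 1024-point result copied back from a device.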

=== Parallel merge sort ===

Using Python 3.x with PyOpenCL and NumPy:

<div style="height:400px; overflow-y: scroll;">
<syntaxhighlight lang="python3">
# ...
def dump_step(data, chunk_size):
    """Display the sorting progress."""
    msg = io.StringIO('')
    div = io.StringIO('')
    # ...

def cl_merge_sort_sbs(data_in):
    """Parallel merge sort."""
    # OpenCL kernel source code
    CL_CODE = '''
    kernel void merge(int chunk_size, int size, global long* data, global long* buff) {
        // Get the group index
        const int gid = get_global_id(0);
        // Compute the range this group is responsible for from its index
        const int offset = gid * chunk_size;
        const int real_size = min(offset + chunk_size, size) - offset;
        // ...
        global long* buff_part = buff + offset;
        // Set the initial state before merging
        int r_beg = chunk_size >> 1;
        int b_ptr = 0;
        // ...
        int r_ptr = r_beg;
        // Perform the merge
        while (b_ptr < real_size) {
            if (r_ptr >= real_size) {
                // The right side has no data left: push the left-side item into the buffer
                buff_part[b_ptr] = data_part[l_ptr++];
            } else if (l_ptr == r_beg) {
                // The left side has no data left: push the right-side item into the buffer
                buff_part[b_ptr] = data_part[r_ptr++];
            } else {
                // Both sides have data: push the smaller item into the buffer
                if (data_part[l_ptr] < data_part[r_ptr]) {
                    buff_part[b_ptr] = data_part[l_ptr++];
                    // ...
    '''
    # Allocate compute resources and compile the OpenCL program
    ctx = cl.Context(dev_type=cl.device_type.GPU)
    prg = cl.Program(ctx, CL_CODE).build()
    # ...
    mf = cl.mem_flags
    # Convert the data to a NumPy array so that it can be turned into an OpenCL Buffer
    data_np = np.array(data_in, dtype=np.int64)
    buff_np = np.empty_like(data_np)
    # Create the buffers and copy the values into them
    data = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=data_np)
    buff = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=buff_np)
    # Set the initial state before merging
    data_len = np.int32(len(data_np))
    chunk_size = np.int32(1)
    # ...
    dump_step(data_np, chunk_size)
    while chunk_size < data_len:
        # Update the group size; it doubles every round
        # (kept as int32 to match the kernel's int argument)
        chunk_size = np.int32(chunk_size << 1)
        # Compute the number of groups to process in parallel
        group_size = ((data_len - 1) // chunk_size) + 1
        # Run the merge for every group
        prg.merge(queue, (group_size,), (1,), chunk_size, data_len, data, buff)
        # Use the merge result as the input of the next round
        data, buff = buff, data
        # Show the state after this round
        cl.enqueue_copy(queue, data_np, data)
        dump_step(data_np, chunk_size)

# ...
</syntaxhighlight>
</div>

Execution result:

<syntaxhighlight lang="text">
--------------------------------------------------------------------------------------
...
</syntaxhighlight>
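
The round structure of the program above can also be followed without a GPU. The sketch below is a plain-Python stand-in, assuming the same scheme as the kernel: each group merges the two sorted halves of its chunk into a second buffer, and the chunk size doubles every round. The names <code>merge_chunk</code> and <code>merge_sort_bottom_up</code> are hypothetical, and the inner <code>for gid</code> loop runs sequentially where the OpenCL version would process the groups in parallel.

```python
def merge_chunk(data, buff, offset, chunk_size):
    """Merge the two sorted halves of one chunk of data into buff."""
    real_size = min(offset + chunk_size, len(data)) - offset
    r_beg = chunk_size >> 1          # start of the right half within the chunk
    l_ptr, r_ptr = 0, r_beg
    for b_ptr in range(real_size):
        if r_ptr >= real_size:       # right half exhausted
            take_left = True
        elif l_ptr == r_beg:         # left half exhausted
            take_left = False
        else:                        # both halves have items: take the smaller
            take_left = data[offset + l_ptr] < data[offset + r_ptr]
        if take_left:
            buff[offset + b_ptr] = data[offset + l_ptr]
            l_ptr += 1
        else:
            buff[offset + b_ptr] = data[offset + r_ptr]
            r_ptr += 1


def merge_sort_bottom_up(values):
    """Bottom-up merge sort using two buffers that swap roles every round."""
    data = list(values)
    buff = [None] * len(data)
    chunk_size = 1
    while chunk_size < len(data):
        chunk_size <<= 1
        groups = (len(data) - 1) // chunk_size + 1
        for gid in range(groups):    # independent chunks; parallel on a device
            merge_chunk(data, buff, gid * chunk_size, chunk_size)
        data, buff = buff, data      # the merged result feeds the next round
    return data


print(merge_sort_bottom_up([5, 2, 9, 1, 5, 6, 3]))  # [1, 2, 3, 5, 5, 6, 9]
```

Since the chunks of one round never overlap, the groups do not need to synchronize with each other within a round; only the buffer swap between rounds has to wait for all of them.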

== References ==

{{Reflist|30em}}

== External links ==

* [https://www.khronos.org/conformance/adopters/conformant-products/opencl Products that support OpenCL]
* [https://web.archive.org/web/20190122085224/http://www.opengpu.org/ Open-source GPU community]{{zh-cn}}
<!-- ... -->
* [[DirectCompute]]
* [[C++ AMP]]
* [[Bitcoin]] mining
{{-}}