iOS: Metal vs GLSL CoreImage performance

In WWDC session 510, Apple engineers introduced support for writing CIKernels in Metal and claimed that they should run faster.

I put together a test project that implements motion blur in both Metal and GLSL (the code is similar to the code from session 510).

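Sometimes the Metal kernel is faster and sometimes the GLSL kernel is faster, but I definitely don't see the Metal kernel performing consistently and significantly better overall. Is this expected, or am I missing something?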

Note: the project will not run in the simulator; you need an A8-powered device.
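
For readers who want to reproduce per-iteration timings of this kind, here is a minimal sketch of a timing loop. It is not the code from the test project: the `measure` function, its parameters, and the dummy workload are hypothetical names used only for illustration. One caveat worth keeping in mind is that Core Image evaluates lazily, so the closure being timed has to force an actual render (for example through a `CIContext` render call) and wait for it to finish; otherwise only the cost of building the filter graph is measured.

```swift
import Foundation

/// Runs `work` `iterations` times and returns each run's wall-clock
/// duration in milliseconds. `measure` and `label` are illustrative
/// names, not part of any Apple API.
func measure(label: String, iterations: Int, work: () -> Void) -> [Double] {
    var timings: [Double] = []
    timings.reserveCapacity(iterations)
    for run in 0..<iterations {
        let start = DispatchTime.now().uptimeNanoseconds
        // Assumption: `work` blocks until the result is actually available.
        work()
        let end = DispatchTime.now().uptimeNanoseconds
        let milliseconds = Double(end - start) / 1_000_000
        timings.append(milliseconds)
        print("\(label) \(run + 1) took \(milliseconds)ms")
    }
    return timings
}

// Stand-in CPU workload so the sketch runs anywhere; in a real test this
// closure would render the filtered image instead.
let timings = measure(label: "dummy", iterations: 3) {
    var sum = 0
    for i in 0..<100_000 { sum &+= i }
    _ = sum
}
```

Returning the raw timings rather than only printing them makes it easier to compare the two kernels afterwards.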


It looks like some of this is hardware dependent. Here are my results on a 10.5-inch iPad Pro:

glsl 1 took 229.572057723999ms
glsl 2 took 49.1310358047485ms
glsl 3 took 46.7269420623779ms
glsl 4 took 53.08997631073ms
glsl 5 took 48.9979982376099ms
glsl 6 took 49.0390062332153ms
glsl 7 took 52.5139570236206ms
glsl 8 took 46.4930534362793ms
glsl 9 took 39.6310091018677ms
glsl 10 took 45.9860563278198ms
metal 1 took 77.7549743652344ms
metal 2 took 44.1800355911255ms
metal 3 took 46.0859537124634ms
metal 4 took 45.3709363937378ms
metal 5 took 43.5279607772827ms
metal 6 took 38.9848947525024ms
metal 7 took 37.1809005737305ms
metal 8 took 37.8340482711792ms
metal 9 took 37.6850366592407ms
metal 10 took 37.5720262527466ms

And my iPhone SE results:

glsl 1 took 394.147992134094ms
glsl 2 took 94.601035118103ms
glsl 3 took 81.4379453659058ms
glsl 4 took 76.9931077957153ms
glsl 5 took 77.0320892333984ms
glsl 6 took 75.8579969406128ms
glsl 7 took 76.9950151443481ms
glsl 8 took 77.8199434280396ms
glsl 9 took 79.7009468078613ms
glsl 10 took 79.4800519943237ms
metal 1 took 146.992921829224ms
metal 2 took 88.6669158935547ms
metal 3 took 81.8150043487549ms
metal 4 took 78.1329870223999ms
metal 5 took 79.5910358428955ms
metal 6 took 93.6589241027832ms
metal 7 took 94.8940515518188ms
metal 8 took 89.0530347824097ms
metal 9 took 84.3830108642578ms
metal 10 took 77.949047088623ms
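
In both result sets the first iteration of each kernel is several times slower than the rest, which would skew a plain average. The sketch below shows one way to compare steady-state numbers by dropping the first run before averaging. The function name `steadyStateMean` and the `warmUpRuns` parameter are hypothetical, and treating exactly one run as warm-up is an assumption rather than something measured here.

```swift
/// Mean of `timings` after discarding the first `warmUpRuns` entries.
/// Returns nil when nothing is left to average.
/// Assumption: `warmUpRuns` is non-negative.
func steadyStateMean(_ timings: [Double], warmUpRuns: Int = 1) -> Double? {
    let steady = timings.dropFirst(warmUpRuns)
    guard !steady.isEmpty else { return nil }
    return steady.reduce(0, +) / Double(steady.count)
}

// First four GLSL timings from the iPad Pro run above, rounded to one
// decimal place purely to keep the example short.
let glslSample: [Double] = [229.6, 49.1, 46.7, 53.1]
if let mean = steadyStateMean(glslSample) {
    print("steady-state mean: \(mean)ms")
}
```

Running this over all ten timings per kernel and per device gives a fairer basis for comparison than eyeballing individual rows.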

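One question and one thought:

  • What device produced your results?
  • I'd be curious whether a different type of filter, such as a color kernel, would perform differently.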