I tested the project “Toronto”, a 2D platformer developed with Unity3D, on an iPod Touch 1G. The performance was really poor. The average frame time was 105ms, and it rose to 400ms in the worst case.
These sub-systems were extremely slow:
- Physics: 13-30ms
- Update: 10-20ms
- Fixed-update: 8-17ms
- Render: 48-80ms
The following optimizations were applied and improved performance significantly:
- Time-slicing in Update & Fixed-Update. Update now costs 0.7-0.9ms, and Fixed-Update costs 3.7-6ms.
- Adjusting the layer collision matrix to minimize the number of colliding layer pairs. Physics now costs 4-6ms per frame.
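The time-slicing itself is not shown here, so the following is only a minimal sketch of one common form of the idea, written in plain Python rather than against the Unity API. The class `TimeSlicedUpdater` and its `slices` parameter are hypothetical names for illustration: each registered callback runs once every `slices` frames instead of every frame, which spreads the update cost evenly.

```python
class TimeSlicedUpdater:
    """Runs only a slice of the registered update callbacks each frame."""

    def __init__(self, slices):
        self.slices = slices    # frames needed to visit every callback once
        self.callbacks = []
        self.frame = 0

    def register(self, callback):
        self.callbacks.append(callback)

    def tick(self):
        """Call once per frame; returns how many callbacks ran."""
        ran = 0
        # Callback i runs only on frames where i % slices == frame % slices.
        first = self.frame % self.slices
        for i in range(first, len(self.callbacks), self.slices):
            self.callbacks[i]()
            ran += 1
        self.frame += 1
        return ran


# Usage: 8 objects spread over 4 slices, so 2 objects update per frame.
calls = []
updater = TimeSlicedUpdater(slices=4)
for n in range(8):
    updater.register(lambda n=n: calls.append(n))

per_frame = [updater.tick() for _ in range(4)]
print(per_frame, calls)
```

The trade-off is latency: with 4 slices, each object reacts up to 3 frames late, so this only suits logic that does not need to run every frame.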
On the rendering subsystem, the bottleneck seems to be the number of drawcalls. Currently there are 20-24 drawcalls in every frame. By merging textures, I expect the total number of drawcalls could be reduced to around 10. I hope that would cut the rendering cost by half.
It would still be quite challenging, if not impossible, to get 30fps on an iPod Touch 1G. But if everything is done right, 18-20fps would be a reasonable target. That would also guarantee a stable 30fps on any other iOS device (excluding iPhone 1G) and leave enough room for improving the graphics quality.
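As a rough sanity check on that target, the arithmetic can be sketched as follows. It assumes the four measured subsystems account for the whole frame and that the rendering cost really does halve; both are assumptions, and anything else running in the frame would add to the total.

```python
def fps(frame_ms):
    """Frames per second for a given frame time in milliseconds."""
    return 1000.0 / frame_ms


def frame_budget_ms(target_fps):
    """Frame time available, in milliseconds, at a target frame rate."""
    return 1000.0 / target_fps


# (low, high) per-frame costs in ms, taken from the measurements above.
physics = (4.0, 6.0)
update = (0.7, 0.9)
fixed_update = (3.7, 6.0)
render_now = (48.0, 80.0)
# Assumption: the hoped-for halving of the rendering cost.
render_halved = (render_now[0] / 2, render_now[1] / 2)

parts = (physics, update, fixed_update, render_halved)
best_ms = sum(p[0] for p in parts)
worst_ms = sum(p[1] for p in parts)

print(f"measured: {fps(105):.1f} fps average, {fps(400):.1f} fps worst case")
print(f"projected frame time: {best_ms:.1f}-{worst_ms:.1f} ms")
print(f"projected frame rate: {fps(worst_ms):.1f}-{fps(best_ms):.1f} fps")
print(f"budget at 30 fps: {frame_budget_ms(30):.1f} ms")
```

Under these assumptions the worst-case projection is about 53ms per frame, or roughly 19fps, which is in line with the 18-20fps target.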
I also tested the project on an iPod Touch 2G. In my understanding, there should be no big difference between the iPod Touch 1G and 2G. From what I saw on Wikipedia, the major difference is the CPU frequency, from the 1G’s 412MHz to the 2G’s 533MHz, and they share the same GPU chip. But the real performance of the iPod Touch 2G was much better: Physics was 2 times faster, and Render was 1.3x faster than on the 1G.
Update:
There was an unexpected performance loss. Previously I used the GUI objects provided by Unity to display some statistics on screen. To my surprise, it cost 20ms or more to display 6 labels with around 150 characters. By disabling the GUI objects, I finally got an acceptable FPS 🙂
nice doc!
Hi, I use time slicing to control A*. Running A* for one object takes 4ms, but when I run A* for 20 objects, the FPS drops to 2. I don't know how to use time slicing and multithreading to control the AI.
Assuming A* means A-star pathfinding: generally it’s unnecessary to run it for 20 objects in one frame. You could distribute the calculation over 20 frames, or any number of frames according to your performance budget.
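A minimal sketch of that distribution, in plain Python rather than Unity code: path requests go into a queue, and only a fixed number are served per frame. `PathRequestQueue`, `max_per_frame` and the stand-in `fake_path` function are all hypothetical names for illustration.

```python
from collections import deque


class PathRequestQueue:
    """Serves at most max_per_frame pathfinding requests per frame."""

    def __init__(self, max_per_frame=1):
        self.max_per_frame = max_per_frame
        self.pending = deque()

    def request(self, agent_id, find_path):
        # find_path is any zero-argument callable that returns a path.
        self.pending.append((agent_id, find_path))

    def tick(self):
        """Call once per frame; returns {agent_id: path} for served requests."""
        results = {}
        for _ in range(min(self.max_per_frame, len(self.pending))):
            agent_id, find_path = self.pending.popleft()
            results[agent_id] = find_path()
        return results


def fake_path(agent_id):
    # Stand-in for a real A* call that costs about 4ms.
    return [("start", agent_id), ("goal", agent_id)]


# Usage: 20 agents, one search per frame, so all paths arrive within 20 frames.
queue = PathRequestQueue(max_per_frame=1)
for agent in range(20):
    queue.request(agent, lambda agent=agent: fake_path(agent))

served_per_frame = []
all_paths = {}
for frame in range(20):
    results = queue.tick()
    served_per_frame.append(len(results))
    all_paths.update(results)
print(served_per_frame, len(all_paths))
```

With one 4ms search per frame, the pathfinding cost stays at about 4ms per frame instead of 80ms in a single frame; raising `max_per_frame` trades frame time for lower latency.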
If you can afford to cache the open/closed lists, a single A* search can also be distributed over multiple frames.
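Here is a minimal sketch of such a resumable search, assuming a 4-connected grid with unit step cost and a Manhattan heuristic. The class `IncrementalAStar` and its `step` method are hypothetical names; the point is only that the open and closed lists live on the object, so the search can stop after a fixed number of node expansions and continue in the next frame.

```python
import heapq


class IncrementalAStar:
    """A* on a 4-connected grid that can pause and resume between frames."""

    def __init__(self, grid, start, goal):
        self.grid = grid                    # list of strings, '#' is a wall
        self.start, self.goal = start, goal
        self.open = [(self._h(start), 0, start)]   # heap of (f, g, cell)
        self.closed = set()
        self.g = {start: 0}                 # best known cost per cell
        self.parent = {start: None}
        self.path = None                    # set once the goal is reached
        self.done = False

    def _h(self, cell):
        # Manhattan distance to the goal.
        return abs(cell[0] - self.goal[0]) + abs(cell[1] - self.goal[1])

    def _rebuild(self, cell):
        path = []
        while cell is not None:
            path.append(cell)
            cell = self.parent[cell]
        return path[::-1]

    def step(self, max_expansions):
        """Expand at most max_expansions nodes; returns True when finished."""
        expansions = 0
        while self.open and expansions < max_expansions and not self.done:
            _, g, cell = heapq.heappop(self.open)
            if cell in self.closed:
                continue                    # stale heap entry
            self.closed.add(cell)
            expansions += 1
            if cell == self.goal:
                self.path = self._rebuild(cell)
                self.done = True
                break
            r, c = cell
            for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                nr, nc = nxt
                if not (0 <= nr < len(self.grid) and 0 <= nc < len(self.grid[0])):
                    continue
                if self.grid[nr][nc] == '#' or nxt in self.closed:
                    continue
                ng = g + 1
                if ng < self.g.get(nxt, float('inf')):
                    self.g[nxt] = ng
                    self.parent[nxt] = cell
                    heapq.heappush(self.open, (ng + self._h(nxt), ng, nxt))
        if not self.open:
            self.done = True                # nothing left; path stays None if unreachable
        return self.done


# Usage: expand at most 2 nodes per "frame" until the search finishes.
grid = ["....",
        ".##.",
        "...."]
search = IncrementalAStar(grid, start=(0, 0), goal=(2, 3))
frames = 0
while not search.step(2):
    frames += 1
print(frames, search.path)
```

In a real game the per-frame limit would more likely be a time budget than a node count, and the cached state becomes invalid if the map changes while a search is in flight.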