I am looking for a tool that shows the cycle count of each assembler instruction right in the editor. Another nice feature would be summing up loops, so that you can easily see how many cycles a loop takes.
This could also be done after a run, if a profiler is included.
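To make the request concrete, here is a rough sketch of what I mean by per-instruction cycles and a loop sum. Everything in it is an assumption for illustration: the cycle numbers in `CYCLES` are made up (real costs depend on the CPU and on the operands), the `:label` syntax is just one common assembler convention, and a loop is simply taken to run from its label to the first later instruction that names that label again.

```python
# Hypothetical per-mnemonic cycle costs (illustrative values only).
CYCLES = {"SET": 1, "ADD": 2, "SUB": 2, "IFN": 2}


def code_part(line):
    """Strip a trailing ';' comment and surrounding whitespace."""
    return line.split(";", 1)[0].strip()


def annotate(lines):
    """Return (line, cycles) pairs; labels, comments and blank lines cost 0."""
    result = []
    for line in lines:
        code = code_part(line)
        if not code or code.startswith(":"):
            result.append((line, 0))
        else:
            mnemonic = code.split()[0].upper()
            result.append((line, CYCLES[mnemonic]))
    return result


def loop_cycles(lines, label):
    """Sum the cycles of one pass through the loop starting at ':label'."""
    annotated = annotate(lines)
    start = next(
        i for i, (text, _) in enumerate(annotated)
        if code_part(text) == ":" + label
    )
    total = 0
    for text, cycles in annotated[start + 1:]:
        total += cycles
        operands = code_part(text).replace(",", " ").split()[1:]
        if label in operands:  # the jump back closes the loop
            return total
    raise ValueError("no instruction jumps back to " + label)


PROGRAM = [
    "SET A, 0",
    ":loop",
    "ADD A, 1        ; increment counter",
    "IFN A, 10",
    "SET PC, loop",
]

for text, cycles in annotate(PROGRAM):
    print(f"{cycles:>2}  {text}")
print("cycles per pass through 'loop':", loop_cycles(PROGRAM, "loop"))
```

This is only a static count of one pass. It ignores things like extra cycles for skipped conditionals or operand-dependent costs, which a real tool would need to handle.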
Once the code gets more complex, I would like to see where most of the time is spent (typically in at most 10 percent of the code), so that I know where to put optimization effort, either for speed or for code size.
Code segments that are rarely executed would be optimized for size (e.g. more loops), while frequently executed parts would be optimized for speed, even at the cost of code size.
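The profiler part could be as simple as the sketch below. It assumes the emulator can hand over a trace of executed addresses (`trace`) and a per-address cycle cost (`cycles_at`); both names, and the 90 percent default threshold, are hypothetical and not taken from any existing tool.

```python
from collections import Counter


def hot_spots(trace, cycles_at, threshold=0.9):
    """Return the most expensive addresses that together cover
    at least `threshold` of all cycles spent, as (address, cycles) pairs."""
    spent = Counter()
    for pc in trace:
        spent[pc] += cycles_at[pc]
    total = sum(spent.values())

    hot, covered = [], 0
    for pc, cycles in spent.most_common():
        if covered >= threshold * total:
            break
        hot.append((pc, cycles))
        covered += cycles
    return hot


# Example: a two-instruction loop at addresses 2 and 3 that runs 50 times.
trace = [0, 1] + [2, 3] * 50 + [4]
cycles_at = {0: 1, 1: 1, 2: 2, 3: 2, 4: 1}

for pc, cycles in hot_spots(trace, cycles_at):
    print(f"address {pc}: {cycles} cycles")
```

Addresses in the returned list are the candidates for speed optimization; everything else could be squeezed for size.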
Do any of the existing assemblers/emulators/IDEs support that at the moment?
Would explain why C418 decided to go through with his 0x10c album, or he just wanted some extra $.