Views: 3162 | Replies: 9
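Speed difference between assembly and C on the C54 series
Assembly should be faster than C, but by how much? I have heard that the difference is large on the C6000 series. Why is that? I am optimizing a program right now, mainly for speed. Does anyone have advice? I once read a post saying that a "question-style" conditional statement is faster than an if...else statement. Which statement does that refer to?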
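The "question-style" conditional mentioned in the question presumably refers to C's conditional operator `?:`. This is a guess, since the post it came from is not identified. Below is a minimal sketch of the two forms; the function names are made up for illustration. Whether one form is faster depends on the compiler and the optimization level, so it is safer to compare the generated assembly than to assume a difference.

```c
#include <assert.h>

/* Upper clamp written with if...else. */
int clamp_if(int x, int limit)
{
    if (x > limit) {
        return limit;
    } else {
        return x;
    }
}

/* The same logic written with the conditional operator. */
int clamp_cond(int x, int limit)
{
    return (x > limit) ? limit : x;
}

/* Quick self-check that both forms agree over a small range. */
void check_clamp_forms(void)
{
    for (int x = -10; x <= 10; x++) {
        assert(clamp_if(x, 5) == clamp_cond(x, 5));
    }
}
```

Both functions return the same values, so any speed difference comes only from how the compiler translates them.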
|
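Reply #1
Posted on: 2003-05-19 14:39
I work on the 2000 series. I once disassembled a compiled C program, and my impression is that C is much less efficient than assembly.
Just my personal experience, offered to get the discussion started.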
|
Reply #2
Posted on: 2003-05-21 14:58
I always use C and only do the most critical parts in asm, such as filters, codecs... I also have the habit of keeping a test bed that runs both the C version and the asm version of the code on as many test vectors as possible, to make sure that the asm does what it is supposed to do.
Asm can improve performance dramatically in the following cases: DSP algorithms, parallel instructions, special loops, complicated conditions... For most code, it is not worth the effort.
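The test-bed habit described above can be sketched roughly as follows. This is only an illustration in portable C: `dot_ref`, `dot_opt` and `compare_versions` are hypothetical names, and `dot_opt` is a C stand-in for what would be the C-callable assembly routine on a real target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Function type shared by the reference and the optimized routine. */
typedef int64_t (*dot_fn)(const int16_t *a, const int16_t *b, size_t n);

/* Reference C version: plain multiply-accumulate loop. */
int64_t dot_ref(const int16_t *a, const int16_t *b, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)a[i] * b[i];
    }
    return acc;
}

/* Stand-in for the hand-optimized routine: the same sum, unrolled by two. */
int64_t dot_opt(const int16_t *a, const int16_t *b, size_t n)
{
    int64_t acc0 = 0, acc1 = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        acc0 += (int32_t)a[i] * b[i];
        acc1 += (int32_t)a[i + 1] * b[i + 1];
    }
    if (i < n) {
        acc0 += (int32_t)a[i] * b[i];   /* leftover element for odd n */
    }
    return acc0 + acc1;
}

/* Small deterministic generator so that failures are reproducible. */
static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Run both versions on `trials` random vectors (length 0..MAX_LEN) and
   return the number of mismatches; 0 means the two versions agree. */
int compare_versions(dot_fn ref, dot_fn opt, int trials, uint32_t seed)
{
    enum { MAX_LEN = 64 };
    int16_t a[MAX_LEN], b[MAX_LEN];
    int mismatches = 0;

    assert(trials > 0);
    for (int t = 0; t < trials; t++) {
        size_t n = (lcg_next(&seed) >> 16) % (MAX_LEN + 1);
        for (size_t i = 0; i < n; i++) {
            /* Map 0..65535 onto the full int16_t range. */
            a[i] = (int16_t)((int32_t)(lcg_next(&seed) >> 16) - 32768);
            b[i] = (int16_t)((int32_t)(lcg_next(&seed) >> 16) - 32768);
        }
        if (ref(a, b, n) != opt(a, b, n)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

A call such as `compare_versions(dot_ref, dot_opt, 1000, 12345u)` should return 0. On the target, the second argument would be the assembly routine, and hand-picked edge cases (all zeros, full-scale values) are worth adding to the random vectors.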
|
Reply #3
Posted on: 2003-05-21 15:03
By the way, to speed up your code, try some easier approaches first, such as setting your compiler to optimize for speed rather than space.
Also optimize your use of internal memory. If you put your most critical code in internal memory, it will run much faster. You might not need asm at all.
|
Reply #4
Posted on: 2003-05-21 17:01
"By the way, to speed up your code, try to look at some easier ways, such as optimizing your compiler for speed rather than space." Agreed! That is classic advice. In the previous post you wrote "For most code, it is not worth the effort." I am not sure whether "most code" refers to C or to assembly. My guess is C.
|
Reply #5
Posted on: 2003-05-22 22:36
What I wanted to say was that, for most code, it is not worth the effort to write assembly or to convert C to assembly. I tend to do the prototype or first round in C anyway if I do not have a library to call. Sorry for the ambiguity.
TI has a pretty comprehensive DSP lib for the C54. If you are interested in optimizing DSP algorithms, you can just use it. Just search TI DSP Village for it. It is buggy, though. I think its fir or iir does not have overflow checking and clipping, which caused problems for me.
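To illustrate the overflow check and clipping mentioned above, here is a minimal sketch in portable C of one Q15 FIR output sample with a wide accumulator and a final clip. It is not TI's DSP lib code, and the names `sat16` and `fir_q15_sat` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clip a wide value to the 16-bit signed range. */
int16_t sat16(int64_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* One FIR output sample: y = sum(h[k] * x[k]).
   Q15 * Q15 gives Q30; the sum is scaled back to Q15 and clipped. */
int16_t fir_q15_sat(const int16_t *h, const int16_t *x, size_t ntaps)
{
    int64_t acc = 0;

    assert(ntaps > 0);
    for (size_t k = 0; k < ntaps; k++) {
        acc += (int32_t)h[k] * x[k];
    }
    /* Assumes an arithmetic right shift for negative values,
       which is what common compilers provide. */
    return sat16(acc >> 15);
}
```

Without the final `sat16`, a sum that exceeds the 16-bit range would wrap around and show up as a large spike of the wrong sign in the output.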
|
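Reply #6
Posted on: 2003-05-24 18:57
I read the posts above and have a few questions. 1. "If you put your most critical code in internal memory": does this mean allocating DARAM and SARAM sensibly? 2. "TI has a pretty comprehensive DSP lib for C54": what does this refer to?
So far I have optimized by 40%, but I estimate that an assembly version would need only about 20% of what the C program needs, so I still have to optimize by another 40%. I feel I have run out of methods, though. I would guess the gap between assembly and C should be about 3:1, right?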
|
Reply #7
Posted on: 2003-05-25 00:58
I think it all depends on the code you are optimizing. For the code I am working on, I could get an improvement of 6~8X or even better. But my case is special: I work with audio data sampled at 8 kHz, and I am doing lots of filtering and FFTs, which have instruction-set-level support on the C54x.
Whether you can obtain a ratio of 3:1, I do not know. I think it depends on your assembly, your C compiler, and the code you are optimizing. It might just be a rule of thumb. What are you working on? Are you really that tight on MIPS? Sometimes it is much easier and more efficient to change the high-level design a little than to optimize all the code. I did a search and found the DSP lib I talked about. Just download the code and see whether anything you are doing now can be accomplished with it. The functions are all C-callable. I think it includes source code too. http://focus.ti.com/docs/tool/toolfolder.jhtml?PartNumber=SPRC099 Have fun.
|
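Reply #8
Posted on: 2003-05-29 17:23
Hello lyingying! I have learned a lot from your posts. Thank you very much!
The dsp.lib you told me about really is a good thing; at the very least I can use its log2 and fir. There are still some difficulties, though. For example, the data in my program is not in Q15 format; most of it is in Q4.6 format. I think overflow probably will not happen, but I want to keep the precision. Can't dsp.lib be used with the Q3.12 format? Could I get Q4.6 data by making some modifications? Also, which quantization method would be faster?
One more simple assembly question. For example, to add 1 to each element of the array x[5]:
    ld   #1, 16, B       ; B = 1 << 16 (1 in the high word)
    stm  #4, BRC         ; block-repeat count: 4 means 5 iterations
    stm  #x, AR4         ; AR4 points to x[0]
    rptb next-1          ; repeat the block that ends just before "next"
    add  *AR4, 16, B, A  ; A = (x[i] << 16) + B
    sth  A, *AR4+        ; store the high word of A to x[i], then advance AR4
next: ****
I would like to know why the shift by 16 and sth are used. Wouldn't stl work?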
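The assembly loop in the question can be modeled in C to see what the high-word form changes. This is a rough sketch under stated assumptions, not a cycle-accurate or authoritative model: it assumes sign extension is on (SXM = 1) and that the accumulator saturates at 32 bits (OVM = 1). The function names are hypothetical. Under those assumptions, one possible reason for working in the high word is that 32-bit accumulator saturation then clips the result at the 16-bit limit, while a low-word add stored with stl simply wraps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saturate to 32 bits, as the accumulator is assumed to do (OVM = 1). */
static int32_t sat32(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Model of: ld #1,16,B / add *AR4,16,B,A / sth A,*AR4+ */
void add_one_high(int16_t *x, size_t n)
{
    const int64_t b = 65536;                   /* B = 1 in the high word */
    for (size_t i = 0; i < n; i++) {
        int64_t a = (int64_t)x[i] * 65536 + b; /* A = (x << 16) + B */
        x[i] = (int16_t)(sat32(a) / 65536);    /* sth: keep the high word */
    }
}

/* Model of the alternative: add in the low word and store with stl.
   Only the low 16 bits are stored, so 32767 + 1 wraps to -32768. */
void add_one_low(int16_t *x, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int32_t a = (int32_t)x[i] + 1;         /* sign-extended load */
        uint16_t lo = (uint16_t)(a & 0xFFFF);  /* stl: keep the low word */
        x[i] = (lo >= 0x8000u) ? (int16_t)((int32_t)lo - 65536)
                               : (int16_t)lo;
    }
}
```

For ordinary values both versions give the same result. They differ only at the upper limit, where the high-word version yields 32767 and the low-word version yields -32768. With OVM = 0 the high-word version would wrap as well, so running both on the target is still the real test.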
|
Reply #9
Posted on: 2003-06-08 09:31
Sorry for the delay.
Q15 and Q12 are just for bookkeeping. You have to know what numbers you are working with. Saturation also depends on the taps you are accumulating. At least for fir, you know what the worst case is, right? As for the code, what is SXM set to? I think it does that to keep the sign. Also, shifting up and down does not cost cycles, right? :) If you are interested, just change the code and run some tests to see if you can use the lower word. Have fun.
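The bookkeeping point can be made concrete with a small sketch. The stored integer is the same in every Q format; only the assumed position of the binary point changes, so moving between formats is a scaling by a power of two. The helper names below are hypothetical, and the 6 fractional bits follow the Q4.6 format mentioned in the thread. The second function computes the worst-case FIR accumulator magnitude, sum(|h[k]|) times the largest input magnitude, which is one way to check whether saturation can occur at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Re-scale a fixed-point value from `from_frac` to `to_frac` fractional bits.
   Going to fewer fractional bits loses precision (truncation toward zero).
   The caller must make sure that scaling up does not overflow 32 bits. */
int32_t q_rescale(int32_t v, int from_frac, int to_frac)
{
    assert(from_frac >= 0 && from_frac < 31);
    assert(to_frac >= 0 && to_frac < 31);
    if (to_frac >= from_frac) {
        return v * ((int32_t)1 << (to_frac - from_frac));
    }
    return v / ((int32_t)1 << (from_frac - to_frac));
}

/* Worst-case FIR accumulator magnitude: sum(|h[k]|) * max|x|. */
int64_t fir_worst_case(const int16_t *h, size_t ntaps, int32_t x_max_abs)
{
    int64_t sum_abs = 0;
    for (size_t k = 0; k < ntaps; k++) {
        int32_t c = h[k];
        sum_abs += (c < 0) ? -c : c;
    }
    return sum_abs * x_max_abs;
}
```

For example, 1.0 is stored as 4096 with 12 fractional bits and as 64 with 6 fractional bits, so `q_rescale(4096, 12, 6)` gives 64. If `fir_worst_case` stays below the accumulator limit for the given coefficients and input range, the filter cannot overflow and clipping is not needed.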
|