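Views: 1573 Replies: 13
wowocock, are you around? I have a question
Whenever my machine runs an SSE-based memcpy, it fails with an error saying a privileged instruction was executed. Why would that happen? :(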
|
|
|
Floor 1#
Posted: 2004-03-30 12:59
Hi 肥肥虫! Long time no see! What have you been busy with lately? :D :D :D :D :D :D
|
|
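Floor 2#
Posted: 2004-03-30 13:36
Quote: Hi 肥肥虫! Long time no see! What have you been busy with lately? :D :D :D :D :D :D
Going to work, haha, earning money to support the family :D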
|
|
Floor 3#
Posted: 2004-03-30 13:50
What does your code look like? Post it so we can take a look.
|
|
|
Floor 4#
Posted: 2004-03-30 14:28
Haha! No wonder you are called 肥肥虫! Haha
|
|
Floor 5#
Posted: 2004-03-30 15:42
Quote: What does your code look like? Post it so we can take a look.

__asm {
    push esi
    push edi
    mov  esi, dword ptr [Src]
    mov  edi, dword ptr [Dest]
    mov  ecx, nBytes
    shr  ecx, 7                 ; number of 128-byte blocks
    align 4
MemcpySSE_Loop:
    ; movaps should be slightly more efficient
    ; as the data is 16-byte aligned
    movaps xmm0, [esi]
    movaps xmm1, [esi+16*1]
    movaps xmm2, [esi+16*2]
    movaps xmm3, [esi+16*3]
    movaps xmm4, [esi+16*4]
    movaps xmm5, [esi+16*5]
    movaps xmm6, [esi+16*6]
    movaps xmm7, [esi+16*7]
    movntps [edi],      xmm0
    movntps [edi+16*1], xmm1
    movntps [edi+16*2], xmm2
    movntps [edi+16*3], xmm3
    movntps [edi+16*4], xmm4
    movntps [edi+16*5], xmm5
    movntps [edi+16*6], xmm6
    movntps [edi+16*7], xmm7
    add  esi, 128
    add  edi, 128
    dec  ecx
    jnz  MemcpySSE_Loop
    mov  ecx, nBytes
    and  ecx, 127               ; remaining bytes after the 128-byte blocks
    cmp  ecx, 0
    je   MemcpySSE_End
    rep  movsb
MemcpySSE_End:
    pop  edi                    ; restore in reverse order of the pushes
    pop  esi
}
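The comment in the loop above takes 16-byte alignment for granted. movaps and movntps raise a fault when their memory operand is not a multiple of 16, so it can help to verify that assumption before entering the loop. A minimal sketch in C follows; `is_aligned16` and `can_use_aligned_sse` are hypothetical helper names, not from the thread, and nothing here confirms that alignment is what triggered the reported error.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 when p is a multiple of 16, else 0. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical guard: nonzero only when both pointers are safe
   operands for movaps (load) and movntps (store). */
static int can_use_aligned_sse(const void *dest, const void *src)
{
    return is_aligned16(dest) && is_aligned16(src);
}
```

A caller could fall back to the ordinary memcpy whenever `can_use_aligned_sse` returns 0.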
|
|
Floor 6#
Posted: 2004-03-30 17:40
I can't use SSE instructions with VC6, so I tested it with MASM32 and it runs without problems. Have a look at mine:

.686p
.XMM
.Model Flat, StdCall
Option Casemap :None    ; CASEMAP:NONE makes identifiers case-sensitive

include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includeLib \masm32\lib\user32.lib
includeLib \masm32\lib\kernel32.lib

.data
Src    db 256 dup("f")
Dest   db 256 dup(0)
nBytes dd 128*2

.code
START:
    push esi
    push edi
    lea  esi, Src
    lea  edi, Dest
    mov  ecx, nBytes
    shr  ecx, 7                 ; 128 bytes per iteration, so divide by 128 (was shr ecx, 8)
    align 4
MemcpySSE_Loop:
    ; movaps should be slightly more efficient
    ; as the data is 16-byte aligned
    movaps XMM0, [esi]
    movaps XMM1, [esi+16*1]
    movaps XMM2, [esi+16*2]
    movaps XMM3, [esi+16*3]
    movaps XMM4, [esi+16*4]
    movaps XMM5, [esi+16*5]
    movaps XMM6, [esi+16*6]
    movaps XMM7, [esi+16*7]
    movntps [edi],      XMM0
    movntps [edi+16*1], XMM1
    movntps [edi+16*2], XMM2
    movntps [edi+16*3], XMM3
    movntps [edi+16*4], XMM4
    movntps [edi+16*5], XMM5
    movntps [edi+16*6], XMM6
    movntps [edi+16*7], XMM7
    add  esi, 128
    add  edi, 128
    loop MemcpySSE_Loop
    mov  ecx, nBytes
    and  ecx, 127               ; remaining bytes after the 128-byte blocks
    cmp  ecx, 0
    je   MemcpySSE_End
    rep  movsb
MemcpySSE_End:
    pop  edi
    pop  esi
    invoke ExitProcess, 0
END START
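Each pass of the unrolled loop moves 8 XMM registers of 16 bytes, that is 128 bytes, so the iteration count is nBytes / 128 (a right shift by 7) and the leftover for rep movsb is nBytes & 127. The arithmetic can be sketched in C as follows; `BLOCK_BYTES`, `full_blocks` and `tail_bytes` are names made up for this illustration.

```c
#include <stddef.h>

enum { BLOCK_BYTES = 128 };   /* 8 XMM registers x 16 bytes each */

/* Number of full 128-byte iterations the unrolled loop should run. */
static size_t full_blocks(size_t n_bytes)
{
    return n_bytes >> 7;                  /* same as n_bytes / 128 */
}

/* Bytes left over for the trailing rep movsb. */
static size_t tail_bytes(size_t n_bytes)
{
    return n_bytes & (BLOCK_BYTES - 1);   /* same as n_bytes % 128 */
}
```

For nBytes = 256 this gives 2 blocks and no tail. When the block count is 0, the loop has to be skipped entirely: both `loop` and `dec ecx` / `jnz` decrement before testing, so entering with ecx = 0 wraps around to 0xFFFFFFFF.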
|
|
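Floor 7#
Posted: 2004-03-30 17:42
挑战者, people are discussing a technical problem here, and you barge in to post chit-chat???!!!
Huh? Oh, right, this seems to be the off-topic board, hehe. 肥肥虫, what is going on with you? :o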
|
|
Floor 8#
Posted: 2004-03-30 17:45
OK, thanks a lot, wowocock. I'll try again.
Giving you the points first, haha
|
|
Floor 9#
Posted: 2004-03-30 18:11
I downloaded the VC6 SP5 PROCESS patch (presumably the Visual C++ 6.0 Processor Pack), so SSE now works in VC6. I tested it and it still runs without problems.

void main( )
{
    char Src[256] = {0};
    char Dest[256];
    int nBytes = 128*2;
    //cout<<test()<<endl;
    __asm {
        push esi
        push edi
        lea  esi, Src
        lea  edi, Dest
        mov  ecx, nBytes
        shr  ecx, 7                 ; number of 128-byte blocks
        jecxz MemcpySSE_Tail        ; no full block: go straight to the byte copy
        align 4
    MemcpySSE_Loop:
        ; movaps should be slightly more efficient
        ; as the data is 16-byte aligned
        movaps xmm0, [esi]
        movaps xmm1, [esi+16*1]
        movaps xmm2, [esi+16*2]
        movaps xmm3, [esi+16*3]
        movaps xmm4, [esi+16*4]
        movaps xmm5, [esi+16*5]
        movaps xmm6, [esi+16*6]
        movaps xmm7, [esi+16*7]
        movntps [edi],      xmm0
        movntps [edi+16*1], xmm1
        movntps [edi+16*2], xmm2
        movntps [edi+16*3], xmm3
        movntps [edi+16*4], xmm4
        movntps [edi+16*5], xmm5
        movntps [edi+16*6], xmm6
        movntps [edi+16*7], xmm7
        add  esi, 128
        add  edi, 128
        loop MemcpySSE_Loop
    MemcpySSE_Tail:
        mov  ecx, nBytes
        and  ecx, 127               ; remaining bytes after the 128-byte blocks
        cmp  ecx, 0
        je   MemcpySSE_End
        rep  movsb
    MemcpySSE_End:
        pop  edi
        pop  esi
    }
}
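For comparison, roughly the same copy can be written with compiler intrinsics instead of inline assembly. The sketch below is not from the thread: `memcpy_sse_sketch` is a hypothetical name, it assumes a compiler that ships `<xmmintrin.h>` and non-overlapping buffers, and it falls back to the ordinary memcpy when either pointer is not 16-byte aligned or SSE is unavailable at build time.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE__) || defined(_M_X64) || defined(_M_IX86)
#include <xmmintrin.h>
#define HAVE_SSE_SKETCH 1
#else
#define HAVE_SSE_SKETCH 0
#endif

/* Hypothetical name; copies n bytes between non-overlapping buffers. */
static void *memcpy_sse_sketch(void *dest, const void *src, size_t n)
{
#if HAVE_SSE_SKETCH
    /* _mm_load_ps and _mm_stream_ps need 16-byte aligned addresses. */
    if ((((uintptr_t)dest | (uintptr_t)src) & 15u) == 0) {
        const float *s = (const float *)src;
        float *d = (float *)dest;
        size_t blocks = n >> 7;          /* full 128-byte blocks */
        size_t i, j;

        for (i = 0; i < blocks; ++i) {
            for (j = 0; j < 8; ++j) {
                __m128 v = _mm_load_ps(s + 4 * j);   /* movaps  */
                _mm_stream_ps(d + 4 * j, v);         /* movntps */
            }
            s += 32;                     /* 32 floats = 128 bytes */
            d += 32;
        }
        _mm_sfence();                    /* order the non-temporal stores */
        memcpy(d, s, n & 127);           /* tail, like rep movsb */
        return dest;
    }
#endif
    return memcpy(dest, src, n);         /* unaligned or no SSE */
}
```

Checking alignment at run time means a buffer that happens to be misaligned, such as a plain char array on some stack frame, takes the slower path instead of faulting.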
|
|
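Floor 10#
Posted: 2004-03-31 13:33
Calling it normally does indeed work without problems.
But when I call it from inside a thread function, it fails. How strange :(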
|
|
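Floor 11#
Posted: 2004-03-31 13:47
Quote: Calling it normally does indeed work without problems.
This forces me to use only the MMX-optimized memcpy inside threads for now :mad: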
|
|
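Floor 12#
Posted: 2004-03-31 14:37
Using it in a thread may require thinking about synchronization; the thread's CONTEXT does not seem to include the SSE registers.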
|
|
|
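Floor 13#
Posted: 2004-03-31 14:54
Quote: Using it in a thread may require thinking about synchronization; the thread's CONTEXT does not seem to include the SSE registers.
So that's how it is. Thanks a lot. :D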
|
|