38 Commits

Author SHA1 Message Date
Luca Barbato
9888c6ce82 Update proc macro2 (#455)
* Update to proc_macro2 0.4 and related

* Update to proc_macro2 0.4 and related

* Update to proc_macro2 0.4 and related

* Add proc_macro_gen feature

* Update to the new rustfmt cli

* A few proc-macro2 stylistic updates

* Disable RUST_BACKTRACE by default

* Allow rustfmt failure for now

* Disable proc-macro2 nightly feature in verify-x86

Currently this causes bugs on nightly due to upstream rustc bugs; this should
be temporary.

* Attempt to thwart mergefunc

* Use static relocation model on i686
2018-05-21 13:37:41 -05:00
gnzlbg
696eea0211 add run-time feature detection for powerpc (#452) 2018-05-16 15:19:13 -05:00
gnzlbg
9e797f2de1 bugfix: cfg(tests) should be cfg(test) (#450) 2018-05-16 13:59:28 -05:00
gnzlbg
8ea9bc53f1 Initial PowerPC altivec and VSX support (#447)
* add some powerpc/powerpc64 altivec/vsx intrinsics

* temporarily make IntoBits/FromBits inline(always)

* include powerpc64 module; use inline(always) from/into_bits only on powerpc
2018-05-16 12:10:19 -05:00
gnzlbg
c0bf5d9c42 Workarounds for all/any mask reductions on x86, armv7, and aarch64 (#425)
* Workarounds for LLVM6 code-gen bugs in all/any reductions

This commit adds workarounds for the mask reductions: `all` and `any`.
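
For reference, a minimal scalar model of what these reductions compute for an 8-lane mask such as `m8x8` (illustration only, not the crate's implementation):

```rust
// Scalar model of the portable mask reductions: each mask lane is either
// all ones (true) or all zeros (false).
fn all(mask: [bool; 8]) -> bool {
    mask.iter().all(|&lane| lane)
}

fn any(mask: [bool; 8]) -> bool {
    mask.iter().any(|&lane| lane)
}
```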

64-bit wide mask types (`m8x8`, `m16x4`, `m32x2`)

`x86_64` with `MMX` enabled

Before this PR:

```asm
all_8x8:
 push    rbp
 mov     rbp, rsp
 movzx   eax, byte, ptr, [rdi, +, 7]
 movd    xmm0, eax
 movzx   eax, byte, ptr, [rdi, +, 6]
 movd    xmm1, eax
 punpcklwd xmm1, xmm0
 movzx   eax, byte, ptr, [rdi, +, 5]
 movd    xmm0, eax
 movzx   eax, byte, ptr, [rdi, +, 4]
 movd    xmm2, eax
 punpcklwd xmm2, xmm0
 punpckldq xmm2, xmm1
 movzx   eax, byte, ptr, [rdi, +, 3]
 movd    xmm0, eax
 movzx   eax, byte, ptr, [rdi, +, 2]
 movd    xmm1, eax
 punpcklwd xmm1, xmm0
 movzx   eax, byte, ptr, [rdi, +, 1]
 movd    xmm0, eax
 movzx   eax, byte, ptr, [rdi]
 movd    xmm3, eax
 punpcklwd xmm3, xmm0
 punpckldq xmm3, xmm1
 punpcklqdq xmm3, xmm2
 movdqa  xmm0, xmmword, ptr, [rip, +, LCPI9_0]
 pand    xmm3, xmm0
 pcmpeqw xmm3, xmm0
 pshufd  xmm0, xmm3, 78
 pand    xmm0, xmm3
 pshufd  xmm1, xmm0, 229
 pand    xmm1, xmm0
 movdqa  xmm0, xmm1
 psrld   xmm0, 16
 pand    xmm0, xmm1
 movd    eax, xmm0
 and     al, 1
 pop     rbp
 ret
any_8x8:
 push    rbp
 mov     rbp, rsp
 movzx   eax, byte, ptr, [rdi, +, 7]
 movd    xmm0, eax
 movzx   eax, byte, ptr, [rdi, +, 6]
 movd    xmm1, eax
 punpcklwd xmm1, xmm0
 movzx   eax, byte, ptr, [rdi, +, 5]
 movd    xmm0, eax
 movzx   eax, byte, ptr, [rdi, +, 4]
 movd    xmm2, eax
 punpcklwd xmm2, xmm0
 punpckldq xmm2, xmm1
 movzx   eax, byte, ptr, [rdi, +, 3]
 movd    xmm0, eax
 movzx   eax, byte, ptr, [rdi, +, 2]
 movd    xmm1, eax
 punpcklwd xmm1, xmm0
 movzx   eax, byte, ptr, [rdi, +, 1]
 movd    xmm0, eax
 movzx   eax, byte, ptr, [rdi]
 movd    xmm3, eax
 punpcklwd xmm3, xmm0
 punpckldq xmm3, xmm1
 punpcklqdq xmm3, xmm2
 movdqa  xmm0, xmmword, ptr, [rip, +, LCPI8_0]
 pand    xmm3, xmm0
 pcmpeqw xmm3, xmm0
 pshufd  xmm0, xmm3, 78
 por     xmm0, xmm3
 pshufd  xmm1, xmm0, 229
 por     xmm1, xmm0
 movdqa  xmm0, xmm1
 psrld   xmm0, 16
 por     xmm0, xmm1
 movd    eax, xmm0
 and     al, 1
 pop     rbp
 ret
```

After this PR for `m8x8`, `m16x4`, `m32x2`:

```asm
all_8x8:
 push    rbp
 mov     rbp, rsp
 movq    mm0, qword, ptr, [rdi]
 pmovmskb eax, mm0
 cmp     eax, 255
 sete    al
 pop     rbp
 ret
any_8x8:
 push    rbp
 mov     rbp, rsp
 movq    mm0, qword, ptr, [rdi]
 pmovmskb eax, mm0
 test    eax, eax
 setne   al
 pop     rbp
 ret
```

`x86` with `MMX` enabled

Before this PR:

```asm
all_8x8:
 call    L9$pb
L9$pb:
 pop     eax
 mov     ecx, dword, ptr, [esp, +, 4]
 movzx   edx, byte, ptr, [ecx, +, 7]
 movd    xmm0, edx
 movzx   edx, byte, ptr, [ecx, +, 6]
 movd    xmm1, edx
 punpcklwd xmm1, xmm0
 movzx   edx, byte, ptr, [ecx, +, 5]
 movd    xmm0, edx
 movzx   edx, byte, ptr, [ecx, +, 4]
 movd    xmm2, edx
 punpcklwd xmm2, xmm0
 punpckldq xmm2, xmm1
 movzx   edx, byte, ptr, [ecx, +, 3]
 movd    xmm0, edx
 movzx   edx, byte, ptr, [ecx, +, 2]
 movd    xmm1, edx
 punpcklwd xmm1, xmm0
 movzx   edx, byte, ptr, [ecx, +, 1]
 movd    xmm0, edx
 movzx   ecx, byte, ptr, [ecx]
 movd    xmm3, ecx
 punpcklwd xmm3, xmm0
 punpckldq xmm3, xmm1
 punpcklqdq xmm3, xmm2
 movdqa  xmm0, xmmword, ptr, [eax, +, LCPI9_0-L9$pb]
 pand    xmm3, xmm0
 pcmpeqw xmm3, xmm0
 pshufd  xmm0, xmm3, 78
 pand    xmm0, xmm3
 pshufd  xmm1, xmm0, 229
 pand    xmm1, xmm0
 movdqa  xmm0, xmm1
 psrld   xmm0, 16
 pand    xmm0, xmm1
 movd    eax, xmm0
 and     al, 1
 ret
any_8x8:
 call    L8$pb
L8$pb:
 pop     eax
 mov     ecx, dword, ptr, [esp, +, 4]
 movzx   edx, byte, ptr, [ecx, +, 7]
 movd    xmm0, edx
 movzx   edx, byte, ptr, [ecx, +, 6]
 movd    xmm1, edx
 punpcklwd xmm1, xmm0
 movzx   edx, byte, ptr, [ecx, +, 5]
 movd    xmm0, edx
 movzx   edx, byte, ptr, [ecx, +, 4]
 movd    xmm2, edx
 punpcklwd xmm2, xmm0
 punpckldq xmm2, xmm1
 movzx   edx, byte, ptr, [ecx, +, 3]
 movd    xmm0, edx
 movzx   edx, byte, ptr, [ecx, +, 2]
 movd    xmm1, edx
 punpcklwd xmm1, xmm0
 movzx   edx, byte, ptr, [ecx, +, 1]
 movd    xmm0, edx
 movzx   ecx, byte, ptr, [ecx]
 movd    xmm3, ecx
 punpcklwd xmm3, xmm0
 punpckldq xmm3, xmm1
 punpcklqdq xmm3, xmm2
 movdqa  xmm0, xmmword, ptr, [eax, +, LCPI8_0-L8$pb]
 pand    xmm3, xmm0
 pcmpeqw xmm3, xmm0
 pshufd  xmm0, xmm3, 78
 por     xmm0, xmm3
 pshufd  xmm1, xmm0, 229
 por     xmm1, xmm0
 movdqa  xmm0, xmm1
 psrld   xmm0, 16
 por     xmm0, xmm1
 movd    eax, xmm0
 and     al, 1
 ret
```

After this PR:

```asm
all_8x8:
 mov     eax, dword, ptr, [esp, +, 4]
 movq    mm0, qword, ptr, [eax]
 pmovmskb eax, mm0
 cmp     eax, 255
 sete    al
 ret
any_8x8:
 mov     eax, dword, ptr, [esp, +, 4]
 movq    mm0, qword, ptr, [eax]
 pmovmskb eax, mm0
 test    eax, eax
 setne   al
 ret
```

`aarch64`

Before this PR:

```asm
all_8x8:
 ldr     d0, [x0]
 umov    w8, v0.b[0]
 umov    w9, v0.b[1]
 tst     w8, #0xff
 umov    w10, v0.b[2]
 cset    w8, ne
 tst     w9, #0xff
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[3]
 and     w8, w8, w9
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[4]
 and     w8, w9, w8
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[5]
 and     w8, w9, w8
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[6]
 and     w8, w9, w8
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[7]
 and     w8, w9, w8
 cset    w9, ne
 tst     w10, #0xff
 and     w8, w9, w8
 cset    w9, ne
 and     w0, w9, w8
 ret
any_8x8:
 ldr     d0, [x0]
 umov    w8, v0.b[0]
 umov    w9, v0.b[1]
 orr     w8, w8, w9
 umov    w9, v0.b[2]
 orr     w8, w8, w9
 umov    w9, v0.b[3]
 orr     w8, w8, w9
 umov    w9, v0.b[4]
 orr     w8, w8, w9
 umov    w9, v0.b[5]
 orr     w8, w8, w9
 umov    w9, v0.b[6]
 orr     w8, w8, w9
 umov    w9, v0.b[7]
 orr     w8, w8, w9
 tst     w8, #0xff
 cset    w0, ne
 ret
```

After this PR:

```asm
all_8x8:
 ldr     d0, [x0]
 mov     v0.d[1], v0.d[0]
 uminv   b0, v0.16b
 fmov    w8, s0
 tst     w8, #0xff
 cset    w0, ne
 ret
any_8x8:
 ldr     d0, [x0]
 mov     v0.d[1], v0.d[0]
 umaxv   b0, v0.16b
 fmov    w8, s0
 tst     w8, #0xff
 cset    w0, ne
 ret
```

`ARMv7` + `neon`

Before this PR:

```asm
all_8x8:
 vmov.i8 d0, #0x1
 vldr    d1, [r0]
 vtst.8  d0, d1, d0
 vext.8  d1, d0, d0, #4
 vand    d0, d0, d1
 vext.8  d1, d0, d0, #2
 vand    d0, d0, d1
 vdup.8  d1, d0[1]
 vand    d0, d0, d1
 vmov.u8 r0, d0[0]
 and     r0, r0, #1
 bx      lr
any_8x8:
 vmov.i8 d0, #0x1
 vldr    d1, [r0]
 vtst.8  d0, d1, d0
 vext.8  d1, d0, d0, #4
 vorr    d0, d0, d1
 vext.8  d1, d0, d0, #2
 vorr    d0, d0, d1
 vdup.8  d1, d0[1]
 vorr    d0, d0, d1
 vmov.u8 r0, d0[0]
 and     r0, r0, #1
 bx      lr
```

After this PR:

```asm
all_8x8:
 vldr    d0, [r0]
 b       <m8x8 as All>::all

<m8x8 as All>::all:
 vpmin.u8 d16, d0, d16
 vpmin.u8 d16, d16, d16
 vpmin.u8 d0, d16, d16
 b       m8x8::extract

any_8x8:
 vldr    d0, [r0]
 b       <m8x8 as Any>::any

<m8x8 as Any>::any:
 vpmax.u8 d16, d0, d16
 vpmax.u8 d16, d16, d16
 vpmax.u8 d0, d16, d16
 b       m8x8::extract
```

(note: inlining does not work properly on ARMv7)
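
The `vpmin.u8`/`vpmax.u8` sequences above are pairwise folds: each step combines adjacent lanes, so 8 lanes reduce in 3 steps with the result in lane 0. A scalar sketch of that shape (illustration only):

```rust
// log2(n)-step pairwise reduction, the shape NEON's vpmin/vpmax produce:
// 8 lanes -> 4 -> 2 -> 1, with the result ending up in lane 0.
fn pairwise_fold(mut v: [u8; 8], f: fn(u8, u8) -> u8) -> u8 {
    let mut n = v.len();
    while n > 1 {
        for i in 0..n / 2 {
            v[i] = f(v[2 * i], v[2 * i + 1]);
        }
        n /= 2;
    }
    v[0]
}

fn main() {
    let mask = [0xFF_u8; 8];
    assert_eq!(pairwise_fold(mask, std::cmp::min), 0xFF); // all: min of lanes
    assert_eq!(pairwise_fold(mask, std::cmp::max), 0xFF); // any: max of lanes
}
```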

128-bit wide mask types (`m8x16`, `m16x8`, `m32x4`, `m64x2`)

`x86_64` with SSE2 enabled

Before this PR:

```asm
all_8x16:
 push    rbp
 mov     rbp, rsp
 movdqa  xmm0, xmmword, ptr, [rip, +, LCPI9_0]
 movdqa  xmm1, xmmword, ptr, [rdi]
 pand    xmm1, xmm0
 pcmpeqb xmm1, xmm0
 pmovmskb eax, xmm1
 xor     ecx, ecx
 cmp     eax, 65535
 mov     eax, -1
 cmovne  eax, ecx
 and     al, 1
 pop     rbp
 ret
any_8x16:
 push    rbp
 mov     rbp, rsp
 movdqa  xmm0, xmmword, ptr, [rip, +, LCPI8_0]
 movdqa  xmm1, xmmword, ptr, [rdi]
 pand    xmm1, xmm0
 pcmpeqb xmm1, xmm0
 pmovmskb eax, xmm1
 neg     eax
 sbb     eax, eax
 and     al, 1
 pop     rbp
 ret
```

After this PR:

```asm
all_8x16:
 push    rbp
 mov     rbp, rsp
 movdqa  xmm0, xmmword, ptr, [rdi]
 pmovmskb eax, xmm0
 cmp     eax, 65535
 sete    al
 pop     rbp
 ret
any_8x16:
 push    rbp
 mov     rbp, rsp
 movdqa  xmm0, xmmword, ptr, [rdi]
 pmovmskb eax, xmm0
 test    eax, eax
 setne   al
 pop     rbp
 ret
```
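
The codegen above corresponds to a movemask-style reduction; a hedged sketch of the same idea using today's stable `std::arch` intrinsics (not necessarily the exact code the PR landed):

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{__m128i, _mm_movemask_epi8};

// `mask` lanes are all ones (true) or all zeros (false); movemask packs
// the top bit of each byte lane into one bit of the result.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse2")]
unsafe fn all_8x16(mask: __m128i) -> bool {
    _mm_movemask_epi8(mask) == 0xFFFF // all 16 lane bits set
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse2")]
unsafe fn any_8x16(mask: __m128i) -> bool {
    _mm_movemask_epi8(mask) != 0 // at least one lane bit set
}
```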

`aarch64`

Before this PR:

```asm
all_8x16:
 ldr     q0, [x0]
 umov    w8, v0.b[0]
 umov    w9, v0.b[1]
 tst     w8, #0xff
 umov    w10, v0.b[2]
 cset    w8, ne
 tst     w9, #0xff
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[3]
 and     w8, w8, w9
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[4]
 and     w8, w9, w8
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[5]
 and     w8, w9, w8
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[6]
 and     w8, w9, w8
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[7]
 and     w8, w9, w8
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[8]
 and     w8, w9, w8
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[9]
 and     w8, w9, w8
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[10]
 and     w8, w9, w8
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[11]
 and     w8, w9, w8
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[12]
 and     w8, w9, w8
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[13]
 and     w8, w9, w8
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[14]
 and     w8, w9, w8
 cset    w9, ne
 tst     w10, #0xff
 umov    w10, v0.b[15]
 and     w8, w9, w8
 cset    w9, ne
 tst     w10, #0xff
 and     w8, w9, w8
 cset    w9, ne
 and     w0, w9, w8
 ret
any_8x16:
 ldr     q0, [x0]
 umov    w8, v0.b[0]
 umov    w9, v0.b[1]
 orr     w8, w8, w9
 umov    w9, v0.b[2]
 orr     w8, w8, w9
 umov    w9, v0.b[3]
 orr     w8, w8, w9
 umov    w9, v0.b[4]
 orr     w8, w8, w9
 umov    w9, v0.b[5]
 orr     w8, w8, w9
 umov    w9, v0.b[6]
 orr     w8, w8, w9
 umov    w9, v0.b[7]
 orr     w8, w8, w9
 umov    w9, v0.b[8]
 orr     w8, w8, w9
 umov    w9, v0.b[9]
 orr     w8, w8, w9
 umov    w9, v0.b[10]
 orr     w8, w8, w9
 umov    w9, v0.b[11]
 orr     w8, w8, w9
 umov    w9, v0.b[12]
 orr     w8, w8, w9
 umov    w9, v0.b[13]
 orr     w8, w8, w9
 umov    w9, v0.b[14]
 orr     w8, w8, w9
 umov    w9, v0.b[15]
 orr     w8, w8, w9
 tst     w8, #0xff
 cset    w0, ne
 ret
```

After this PR:

```asm
all_8x16:
 ldr     q0, [x0]
 uminv   b0, v0.16b
 fmov    w8, s0
 tst     w8, #0xff
 cset    w0, ne
 ret
any_8x16:
 ldr     q0, [x0]
 umaxv   b0, v0.16b
 fmov    w8, s0
 tst     w8, #0xff
 cset    w0, ne
 ret
```
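
`uminv`/`umaxv` are horizontal reductions; a sketch with the stable `std::arch::aarch64` NEON intrinsics (assuming the mask is already a `uint8x16_t`):

```rust
#[cfg(target_arch = "aarch64")]
use std::arch::aarch64::{uint8x16_t, vmaxvq_u8, vminvq_u8};

// Lanes are 0x00 (false) or 0xFF (true), so a horizontal min/max
// collapses the whole mask into a single byte.
#[cfg(target_arch = "aarch64")]
#[target_feature(enable = "neon")]
unsafe fn all_8x16(mask: uint8x16_t) -> bool {
    vminvq_u8(mask) != 0 // non-zero only if every lane is set
}

#[cfg(target_arch = "aarch64")]
#[target_feature(enable = "neon")]
unsafe fn any_8x16(mask: uint8x16_t) -> bool {
    vmaxvq_u8(mask) != 0 // non-zero if at least one lane is set
}
```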

`ARMv7` + `neon`

Before this PR:

```asm
all_8x16:
 vmov.i8 q0, #0x1
 vld1.64 {d2, d3}, [r0]
 vtst.8  q0, q1, q0
 vext.8  q1, q0, q0, #8
 vand    q0, q0, q1
 vext.8  q1, q0, q0, #4
 vand    q0, q0, q1
 vext.8  q1, q0, q0, #2
 vand    q0, q0, q1
 vdup.8  q1, d0[1]
 vand    q0, q0, q1
 vmov.u8 r0, d0[0]
 and     r0, r0, #1
 bx      lr
any_8x16:
 vmov.i8 q0, #0x1
 vld1.64 {d2, d3}, [r0]
 vtst.8  q0, q1, q0
 vext.8  q1, q0, q0, #8
 vorr    q0, q0, q1
 vext.8  q1, q0, q0, #4
 vorr    q0, q0, q1
 vext.8  q1, q0, q0, #2
 vorr    q0, q0, q1
 vdup.8  q1, d0[1]
 vorr    q0, q0, q1
 vmov.u8 r0, d0[0]
 and     r0, r0, #1
 bx      lr
```

After this PR:

```asm
all_8x16:
 vld1.64 {d0, d1}, [r0]
 b       <m8x16 as All>::all

<m8x16 as All>::all:
 vpmin.u8 d0, d0, d1
 b       <m8x8 as All>::all
any_8x16:
 vld1.64 {d0, d1}, [r0]
 b       <m8x16 as Any>::any

<m8x16 as Any>::any:
 vpmax.u8 d0, d0, d1
 b       <m8x8 as Any>::any
```

The inlining problems are pretty bad on ARMv7 + NEON.

256-bit wide mask types (`m8x32`, `m16x16`, `m32x8`, `m64x4`)

`x86_64` with SSE2 enabled

Before this PR:

```asm
all_8x32:
 push    rbp
 mov     rbp, rsp
 movdqa  xmm0, xmmword, ptr, [rip, +, LCPI17_0]
 movdqa  xmm1, xmmword, ptr, [rdi]
 pand    xmm1, xmm0
 movdqa  xmm2, xmmword, ptr, [rdi, +, 16]
 pand    xmm2, xmm0
 pcmpeqb xmm2, xmm0
 pcmpeqb xmm1, xmm0
 pand    xmm1, xmm2
 pmovmskb eax, xmm1
 xor     ecx, ecx
 cmp     eax, 65535
 mov     eax, -1
 cmovne  eax, ecx
 and     al, 1
 pop     rbp
 ret
any_8x32:
 push    rbp
 mov     rbp, rsp
 movdqa  xmm0, xmmword, ptr, [rdi]
 por     xmm0, xmmword, ptr, [rdi, +, 16]
 movdqa  xmm1, xmmword, ptr, [rip, +, LCPI16_0]
 pand    xmm0, xmm1
 pcmpeqb xmm0, xmm1
 pmovmskb eax, xmm0
 neg     eax
 sbb     eax, eax
 and     al, 1
 pop     rbp
 ret
```

After this PR:

```asm
all_8x32:
 push    rbp
 mov     rbp, rsp
 movdqa  xmm0, xmmword, ptr, [rdi]
 pmovmskb eax, xmm0
 cmp     eax, 65535
 jne     LBB17_1
 movdqa  xmm0, xmmword, ptr, [rdi, +, 16]
 pmovmskb ecx, xmm0
 mov     al, 1
 cmp     ecx, 65535
 je      LBB17_3
LBB17_1:
 xor     eax, eax
LBB17_3:
 pop     rbp
 ret
any_8x32:
 push    rbp
 mov     rbp, rsp
 movdqa  xmm0, xmmword, ptr, [rdi]
 pmovmskb ecx, xmm0
 mov     al, 1
 test    ecx, ecx
 je      LBB16_1
 pop     rbp
 ret
LBB16_1:
 movdqa  xmm0, xmmword, ptr, [rdi, +, 16]
 pmovmskb eax, xmm0
 test    eax, eax
 setne   al
 pop     rbp
 ret
```
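
With only 128-bit registers, the 256-bit reduction checks one half at a time and short-circuits, matching the branchy code above; a hedged sketch:

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{__m128i, _mm_movemask_epi8};

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse2")]
unsafe fn all_8x32(lo: __m128i, hi: __m128i) -> bool {
    // Test the low half first and bail out early, as in the asm above.
    _mm_movemask_epi8(lo) == 0xFFFF && _mm_movemask_epi8(hi) == 0xFFFF
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse2")]
unsafe fn any_8x32(lo: __m128i, hi: __m128i) -> bool {
    _mm_movemask_epi8(lo) != 0 || _mm_movemask_epi8(hi) != 0
}
```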

`x86_64` with AVX enabled

Before this PR:

```asm
all_8x32:
 push    rbp
 mov     rbp, rsp
 vmovaps ymm0, ymmword, ptr, [rdi]
 vandps  ymm0, ymm0, ymmword, ptr, [rip, +, LCPI25_0]
 vextractf128 xmm1, ymm0, 1
 vpxor   xmm2, xmm2, xmm2
 vpcmpeqb xmm1, xmm1, xmm2
 vpcmpeqd xmm3, xmm3, xmm3
 vpxor   xmm1, xmm1, xmm3
 vpcmpeqb xmm0, xmm0, xmm2
 vpxor   xmm0, xmm0, xmm3
 vinsertf128 ymm0, ymm0, xmm1, 1
 vandps  ymm0, ymm0, ymm1
 vpermilps xmm1, xmm0, 78
 vandps  ymm0, ymm0, ymm1
 vpermilps xmm1, xmm0, 229
 vandps  ymm0, ymm0, ymm1
 vpsrld  xmm1, xmm0, 16
 vandps  ymm0, ymm0, ymm1
 vpsrlw  xmm1, xmm0, 8
 vandps  ymm0, ymm0, ymm1
 vpextrb eax, xmm0, 0
 and     al, 1
 pop     rbp
 vzeroupper
 ret
any_8x32:
 push    rbp
 mov     rbp, rsp
 vmovaps ymm0, ymmword, ptr, [rdi]
 vandps  ymm0, ymm0, ymmword, ptr, [rip, +, LCPI24_0]
 vextractf128 xmm1, ymm0, 1
 vpxor   xmm2, xmm2, xmm2
 vpcmpeqb xmm1, xmm1, xmm2
 vpcmpeqd xmm3, xmm3, xmm3
 vpxor   xmm1, xmm1, xmm3
 vpcmpeqb xmm0, xmm0, xmm2
 vpxor   xmm0, xmm0, xmm3
 vinsertf128 ymm0, ymm0, xmm1, 1
 vorps   ymm0, ymm0, ymm1
 vpermilps xmm1, xmm0, 78
 vorps   ymm0, ymm0, ymm1
 vpermilps xmm1, xmm0, 229
 vorps   ymm0, ymm0, ymm1
 vpsrld  xmm1, xmm0, 16
 vorps   ymm0, ymm0, ymm1
 vpsrlw  xmm1, xmm0, 8
 vorps   ymm0, ymm0, ymm1
 vpextrb eax, xmm0, 0
 and     al, 1
 pop     rbp
 vzeroupper
 ret
```

After this PR:

```asm
all_8x32:
 push    rbp
 mov     rbp, rsp
 vmovdqa ymm0, ymmword, ptr, [rdi]
 vxorps  xmm1, xmm1, xmm1
 vcmptrueps ymm1, ymm1, ymm1
 vptest  ymm0, ymm1
 setb    al
 pop     rbp
 vzeroupper
 ret
any_8x32:
 push    rbp
 mov     rbp, rsp
 vmovdqa ymm0, ymmword, ptr, [rdi]
 vptest  ymm0, ymm0
 setne   al
 pop     rbp
 vzeroupper
 ret
```
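
`vptest` sets CF when the first operand has all bits set relative to an all-ones second operand, and ZF when the AND of its operands is zero; a sketch with the stable AVX intrinsics:

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{
    __m256i, _mm256_set1_epi8, _mm256_testc_si256, _mm256_testz_si256,
};

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn all_8x32(mask: __m256i) -> bool {
    // CF := ((NOT mask) AND ones) == 0, i.e. mask is all ones (setb above).
    _mm256_testc_si256(mask, _mm256_set1_epi8(-1)) != 0
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn any_8x32(mask: __m256i) -> bool {
    // ZF := (mask AND mask) == 0; any set lane clears ZF (setne above).
    _mm256_testz_si256(mask, mask) == 0
}
```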

---

Closes #362.

* test avx on all x86 targets

* disable assert_instr on avx test

* enable all appropriate features

* disable assert_instr on x86+avx

* the fn_must_use is stable

* fix nbody example on armv7

* fixup

* fixup

* enable 64-bit wide mask MMX optimizations on x86_64 only

* remove coresimd dependency on cfg_if

* allow wasm to fail

* use an env variable to disable assert_instr tests

* disable m32x2 mask MMX optimization on macos

* move cfg_if to coresimd/macros.rs
2018-05-04 16:03:45 -05:00
gnzlbg
30962e58e6 fix errors/warnings from the stabilization of cfg_target_feature and target_feature (#432)
* fix build after stabilization of cfg_target_feature and target_feature

* fix doc tests

* fix spurious unused_attributes warning

* fix more unused attribute warnings

* More unnecessary target features

* Remove no longer needed trait imports

* Remove fixed upstream workarounds

* Fix parsing the #[assert_instr] macro

Following upstream proc_macro changes

* Fix form and parsing of #[simd_test]

* Don't use Cargo features for testing modes

Instead use RUSTFLAGS with `--cfg`. This'll help us be compatible with the
latest Cargo, where a tweak to workspaces and features made our previous
invocations invalid.

* Don't thread RUSTFLAGS through docker

* Re-gate on x86 verification

Closes #411
2018-04-26 21:54:15 -05:00
Alex Crichton
e18fa0baf6
Shuffle around stdsimd::arch::detect bits (#428)
Compile more code on more platforms, tweak imports, try to catch mistakes
sooner.
2018-04-15 10:53:38 -05:00
Alex Crichton
f650b93003
Stabilize x86/x86_64 intrinsics (#414)
This commit stabilizes all intrinsics in the `x86` and `x86_64` modules, namely
allowing stabilization of the `arch::x86` and `arch::x86_64` module in libstd.
Stabilizations here were applied in an automated fashion using [this
script][scr], and notably everything related to `__m64` was omitted from this
round of stabilization.

[scr]: https://gist.github.com/alexcrichton/5b456d495d6fe1df46a158754565c7a5
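
A sketch of the usage pattern this stabilization enables, with today's `std` paths (runtime detection plus an `unsafe` call into `arch::x86_64`):

```rust
#[cfg(target_arch = "x86_64")]
fn lane_sum() -> i32 {
    if is_x86_feature_detected!("sse2") {
        use std::arch::x86_64::{_mm_add_epi32, _mm_cvtsi128_si32, _mm_set1_epi32};
        // Only called behind the feature check above.
        unsafe {
            let v = _mm_add_epi32(_mm_set1_epi32(20), _mm_set1_epi32(22));
            _mm_cvtsi128_si32(v) // extract lane 0: 42
        }
    } else {
        0
    }
}
```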
2018-04-13 09:32:22 -05:00
gnzlbg
87ce896543
Documents arithmetic reduction semantics (#412)
* documents arithmetic reduction semantics
2018-04-05 19:36:04 +02:00
Alex Crichton
52f9198902 Fix compile errors in simd-test-macro 2018-04-03 07:34:02 -07:00
Alex Crichton
a3def97fc6 Bump dependencies on proc-macro2 2018-04-03 07:17:40 -07:00
gnzlbg
fa9a55105a upgrade formatting 2018-04-03 15:40:22 +02:00
Jason Davies
4c3eed5d2d i128 is now stable. 2018-03-27 16:09:03 +02:00
Jason Davies
f5503198b8 rustfmt 2018-03-27 16:09:03 +02:00
gnzlbg
273fc1c344 endian-dependent conversions to/from tuples tests (#400) 2018-03-23 14:12:59 -05:00
gnzlbg
6ce3b9bbba add test for arrays/unions (#399) 2018-03-23 10:49:20 -05:00
Alex Crichton
aafe6ebb75
Fix default cargo test experience (#397)
Turns out Cargo doesn't automatically set `TARGET` for rustc invocations, so
carry it forward manually from the build script over to the rustc invocation.
2018-03-22 17:40:44 -05:00
Jason Davies
de82d9d26b Add support for Intel SHA extensions. (#395) 2018-03-22 13:32:44 -05:00
gnzlbg
56d9a42a2f add tests for endian-dependent behavior (#394)
* add tests for endian-dependent behavior

* format
2018-03-22 11:09:01 -05:00
gnzlbg
ff53ec6cb2 add arm neon vector types (#384) 2018-03-20 09:11:50 -05:00
gnzlbg
68c53c1e55 Split portable vector types tests into multiple crates (#379)
* split the portable vector tests into separate crates

* use rustc reductions
2018-03-18 10:55:20 -05:00
Alex Crichton
44763c853d
Fix tests on nightly (#378) 2018-03-16 13:06:07 -05:00
gnzlbg
2762e2ca9a [mips/mips64: msa] add add_a_b intrinsic (#365)
* [mips64/msa] add add_a_b intrinsic

* add make/file to mips64el's Dockerfile

* add run-time detection support for mips64

* add mips64 build bot

* generate docs for mips64

* fix linux test

* cleanup rt-detection

* support mips64/mips64el in stdsimd-test

* support asserting instructions with `.` in their name

* better error msgs for the auxv_crate test

* debug auxv on mips64

* override run-time detection on mips msa tests

* remove unused #[macro_use]

* try another MIPS cpu

* detect default TARGET in simd-test-macro

* use mips64r2-generic

* disable unused function in mips tests

* move msa to mips

* remove mips from ci

* split into mips and mips64 modules

* add rt-detection for 32-bit mips

* fmt

* remove merge error

* add norun build bots for mips

* add -p to avoid changing the cwd

* fixup

* refactor run-time detection module
2018-03-10 12:22:54 -06:00
QuietMisdreavus
ef0d02d04b document all arches when part of std
Unfortunately, stdsimd's version of the documentation will be blanked
out in favor of coresimd's version, but coresimd (when re-exported in
libcore) will include all the arches
2018-03-10 00:04:01 +01:00
Alex Crichton
cb4a957efd
Add initial wasm memory grow/current intrinsics (#361)
This exposes access to the `grow_memory` and `current_memory` instructions
provided by wasm in what will hopefully be a stable interface (the stable part
being x86 first in theory).
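
These operations survive in today's stable `core::arch::wasm32` under later names; a hedged sketch using the current signatures (`memory_size`/`memory_grow`, renamed from the `current_memory`/`grow_memory` exposed here):

```rust
#[cfg(target_arch = "wasm32")]
fn grow_by(extra_pages: usize) -> Option<usize> {
    use core::arch::wasm32::{memory_grow, memory_size};
    // Memory index 0 is the module's default linear memory; units are
    // 64 KiB wasm pages. memory_grow returns usize::MAX on failure.
    let before = memory_size::<0>();
    if memory_grow::<0>(extra_pages) == usize::MAX {
        None
    } else {
        Some(before)
    }
}
```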
2018-03-09 09:21:08 -06:00
gnzlbg
afca7f8d16 Migrate to rustfmt-preview and require rustfmt builds to pass (#353)
* migrate to rustfmt-preview and require rustfmt to pass

* reformat with rustfmt-preview
2018-03-08 09:09:24 -06:00
Alex Crichton
d7b42faaa3
Add cfg! clauses to detection macro (#351)
This way if the feature is statically detected then it'll be expanded to `true`

Closes #349
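
A hypothetical, heavily simplified sketch of the shape of this change (`feature_detected_sketch` and `runtime_detect` are stand-ins for the real macro and its CPUID/auxv machinery):

```rust
// When the feature is enabled at compile time, cfg! folds to `true` and
// the runtime check is optimized away entirely.
macro_rules! feature_detected_sketch {
    ($feature:tt) => {
        cfg!(target_feature = $feature) || runtime_detect($feature)
    };
}

// Stand-in for the real runtime detection.
fn runtime_detect(_feature: &str) -> bool {
    false
}

fn main() {
    if feature_detected_sketch!("avx2") {
        println!("avx2 available");
    }
}
```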
2018-03-07 10:28:12 -06:00
Alex Crichton
56af498e9e
Rename is_target_feature_detected! (#346)
This commit renames the `is_target_feature_detected!` macro to have different
names depending on the platform. For example:

* `is_x86_feature_detected!`
* `is_arm_feature_detected!`
* `is_aarch64_feature_detected!`
* `is_powerpc64_feature_detected!`

Each macro already has a platform-specific albeit similar interface. Currently,
though, each macro takes a different set of strings, so the hope is that, as
with the name of the architecture in the module, we can signal the dangers of
using the macro in a platform-agnostic context.

One liberty taken, though, is that on both the x86 and x86_64 architectures
the macro is named `is_x86_feature_detected!` rather than there also being a
separate `is_x86_64_feature_detected!`. This mirrors how all the
intrinsics are named the same on x86/x86_64.
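
Typical usage after the rename (a minimal sketch, using today's `std` names):

```rust
fn dispatch() {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // take an AVX2-accelerated path here
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            // take a NEON-accelerated path here
        }
    }
}
```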
2018-03-07 09:46:16 -06:00
gnzlbg
548290b801 Prepare portable packed vector types for RFCs (#338)
* Prepare portable packed SIMD vector types for RFCs

This commit cleans up the implementation of the Portable Packed Vector Types
(PPVT), adds some new features, and makes some breaking changes.

The implementation has moved to `coresimd/src/ppvt` (the types are
still exposed via `coresimd::simd`).

As before, the vector types of a certain width are implemented in the `v{width}`
submodules. The `macros.rs` file has been rewritten as an `api` module that
exposes the macros to implement each API.

It should now be clear where each API is implemented, which types implement it,
and which APIs are tested and how.

New features:

- boolean vectors of the form `b{element_size}x{number_of_lanes}`.
- reductions: arithmetic, bitwise, min/max, and boolean (only the facade
  and a naive working implementation; these need to be implemented
  as `llvm.experimental.vector.reduction.{...}`, but that needs rustc support first).
- a `FromBits` trait, analogous to `{f32,f64}::from_bits`, that performs "safe" transmutes.
  Instead of writing `From::from`/`x.into()` (see the breaking changes below) you now write
  `FromBits::from_bits`/`x.into_bits()` (see the sketch after these lists).
- portable vector types implement `Default` and `Hash`
- tests for all portable vector types and all portable operations (~2000 new tests).
- (hopefully) comprehensive implementation of bitwise transmutes and lane-wise
  casts (before, `From` and the `.as_...` methods were implemented "when they were needed").
- documentation for PPVT (not great yet, but better than nothing)
- conversions/transmutes from/to x86 architecture-specific vector types

Breaking changes:

- `store/load` API has been replaced with `{store,load}_{aligned,unaligned}`
- `eq,ne,lt,le,gt,ge` APIs now return boolean vectors
- The `.as_{...}` methods have been removed. Lane-wise casts are now performed by `From`.
- `From` now performs casts (see above); it used to perform bitwise transmutes.
- `simd` vectors' `replace` method's result is now `#[must_use]`.
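
A self-contained sketch of the `FromBits` idea described above (stand-in types; the real trait lives in the crate's `simd` module):

```rust
// FromBits is a bitwise transmute between same-size vector types, while
// From/Into stay reserved for lane-wise casts.
trait FromBits<T>: Sized {
    fn from_bits(t: T) -> Self;
}

#[derive(Clone, Copy)]
struct U32x4([u32; 4]);
#[derive(Clone, Copy)]
struct F32x4([f32; 4]);

impl FromBits<F32x4> for U32x4 {
    fn from_bits(t: F32x4) -> Self {
        // Same total size, so the conversion preserves each lane's raw bits.
        U32x4(t.0.map(f32::to_bits))
    }
}

fn main() {
    let bits = U32x4::from_bits(F32x4([1.0; 4]));
    assert_eq!(bits.0, [0x3f80_0000; 4]); // bit pattern of 1.0f32
}
```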

* enable backtrace and nocapture

* unalign load/store fail test by 1 byte

* update arm and aarch64 neon modules

* fix arm example

* fmt

* clippy and read example that rustfmt swallowed

* reductions should take self

* rename add/mul -> sum/product; delete other arith reductions

* clean up fmt::LowerHex impl

* revert incorrect doc change

* make Hash equivalent to [T; lanes()]

* use travis_wait to increase timeout limit to 20 minutes

* remove travis_wait; did not help

* implement reductions on top of the llvm.experimental.vector.reduction intrinsics

* implement cmp for boolean vectors

* add missing eq impl file

* implement default

* rename llvm intrinsics

* fix aarch64 example error

* replace #[inline(always)] with #[inline]

* remove cargo clean from run.sh

* workaround broken product in aarch64

* make boolean vector constructors const fn

* fix more reductions on aarch64

* fix min/max reductions on aarch64

* remove whitespace

* remove all boolean vector types except for b8xN

* use a sum reduction fallback on aarch64

* disable llvm add reduction for aarch64

* rename the llvm intrinsics to use llvm names

* remove old macros.rs file
2018-03-05 14:32:35 -06:00
gnzlbg
f1d8a88267 Run-time feature detection for new AArch64 features (#339)
* aarch64 run-time feature detection for latest whitelisted features

* dump new aarch64 features in the run-time detection tests

* add some comments

* remove old code
2018-03-02 21:27:55 -06:00
Alex Crichton
708cc9d9b8 Rename bmi to bmi1
In accordance with rust-lang/rust#48565
2018-03-02 07:02:22 -08:00
Alex Crichton
94d8a193c4
Tweak doctests to pass in libstd as well (#335)
The boilerplate just gets more and more ugly...
2018-02-27 13:13:22 -06:00
Alex Crichton
217f89bc4f
Reorganize the x86/x86_64 intrinsic folders (#334)
The public API isn't changing in this commit but the internal organization is
being rejiggered. Instead of `x86/$subtarget/$feature.rs` the folders are
changed to `coresimd/x86/$feature.rs` and `coresimd/x86_64/$feature.rs`. The
`arch::x86_64` then reexports both the contents of the `x86` module and the
`x86_64` module.
2018-02-27 08:41:07 -06:00
Artyom Pavlov
aa4cef7723 Implemented rdrand and rdseed intrinsics (#326)
* implemented rdrand and rdseed intrinsics

* added "unsigned short*" case

* moved rdrand from i686 to x86_64

* 64 bit rdrand functions in x86_64, 16 and 32 in i686
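
A sketch of using the resulting intrinsics behind runtime detection, with today's `std::arch` paths:

```rust
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "rdrand")]
unsafe fn try_rdrand() -> Option<u64> {
    use std::arch::x86_64::_rdrand64_step;
    let mut v: u64 = 0;
    // _rdrand64_step returns 1 on success, 0 if no value was ready.
    if _rdrand64_step(&mut v) == 1 {
        Some(v)
    } else {
        None
    }
}

#[cfg(target_arch = "x86_64")]
fn hw_random_u64() -> Option<u64> {
    if is_x86_feature_detected!("rdrand") {
        // The feature check above guarantees RDRAND is available.
        unsafe { try_rdrand() }
    } else {
        None
    }
}
```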
2018-02-27 07:58:08 -06:00
Alex Crichton
560fe20b61
Beef up documentation of arch module (#331)
This commit reorganizes some documentation for inclusion into the standard
library, moving the bulk of the docs to the `arch` module and away from the
crate root, which won't actually be the end-user interface.
2018-02-27 07:24:59 -06:00
Alex Crichton
746ab07521
Compile examples on CI (#329)
Make sure the top-level `examples` folder is registered with the
`stdsimd` crate!
2018-02-25 12:37:08 -06:00
Alex Crichton
db5648e0e4
Start adding stability attributes (#327)
To integrate into the standard library this crate needs *at least* a
stability attribute on the macro itself, but this commit also begins by
adding unstable attributes to the exported modules as well. This should
help everything be unstable-by-default and we can start iterating from
there in the standard library.

This commit also does away with the `coresimd::vendor` module internal
implementation detail, instead directly creating the `arch` module to
allow easily documenting it in this crate and having the docs show up in
rust-lang/rust.
2018-02-24 14:11:09 +09:00
Alex Crichton
39b5ec91ae
Reorganize and refactor source tree (#324)
With RFC 2325 looking close to being accepted, I took a crack at
reorganizing this repository to being more amenable for inclusion in
libstd/libcore. My current plan is to add stdsimd as a submodule in
rust-lang/rust and then use `#[path]` to include the modules directly
into libstd/libcore.

Before this commit, however, the source code of coresimd/stdsimd
themselves were not quite ready for this. Imports wouldn't compile for
one reason or another, and the organization was also different than the
RFC itself!

In addition to moving a lot of files around, this commit has the
following major changes:

* The `cfg_feature_enabled!` macro is now renamed to
  `is_target_feature_detected!`
* The `vendor` module is now called `arch`.
* Under the `arch` module is a suite of modules like `x86`, `x86_64`,
  etc. One per `cfg!(target_arch)`.
* The `is_target_feature_detected!` macro was removed from coresimd.
  Unfortunately libcore has no ability to export unstable macros, so for
  now all feature detection is canonicalized in stdsimd.

The `coresimd` and `stdsimd` crates have been updated to the planned
organization in RFC 2325 as well. The runtime bits saw the largest
amount of refactoring, seeing a good deal of simplification without the
core/std split.
2018-02-18 10:07:35 +09:00