ivanthedugtrio / veclib
Vector library for porting SSE2 instructions to other architectures
License: MIT License
I've been working on a similar project called SIMDe which is also MIT licensed, and is also an attempt to allow code written for one set of SIMD instructions to run on machines without them.
We're both working on implementing x86/x86_64 ISA extensions right now, but SIMDe is using portable fallbacks (with hints to encourage the compiler to vectorize what it can) instead of POWER intrinsics. I have been planning to create an AltiVec/VMX/VSX backend for SIMDe eventually, but so far I've been focusing on getting the portable version in place. Eventually I also intend to go in the other direction with SIMDe: AltiVec/VMX/VSX (and others) to SSE (and everything else).
I'm wondering if you would be interested in merging the two projects. I think it would be great for both projects; it would increase the number of functions supported (SIMDe already fully supports all of MMX and SSE1, as well as partial support for several others), and of course from SIMDe's perspective it would greatly improve performance on POWER machines. I think veclib would also benefit from SIMDe's infrastructure; we have a pretty decent test suite, and continuous integration.
The big caveat, as far as I'm concerned, is that I'm not comfortable using powerveclib due to the license. I intend to reach out to the author about this issue, but given that it's an IBM project I don't hold out a lot of hope for getting a more flexibly-licensed version. I'm not exactly sure where the line between AltiVec/VMX/VSX and powerveclib is (I've never used the POWER intrinsics before), but I'm guessing this may reduce the number of instructions which are accelerated in the short term.
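To make the "portable fallback" idea above concrete, here is a minimal sketch of what such a fallback might look like for a 32-bit lane-wise add (the operation `_mm_add_epi32` performs in SSE2). The names `sketch_m128i` and `sketch_add_epi32` are hypothetical and are not taken from SIMDe or veclib; the struct is only a stand-in for a real 128-bit vector type. A vectorization hint on the loop is omitted here.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit integer vector: four 32-bit lanes. */
typedef struct { int32_t i32[4]; } sketch_m128i;

/* Portable lane-wise 32-bit add written as a plain loop,
 * which a compiler is free to vectorize on any target. */
static inline sketch_m128i sketch_add_epi32(sketch_m128i a, sketch_m128i b)
{
    sketch_m128i r;
    for (int i = 0; i < 4; i++) {
        /* Add as unsigned to avoid signed-overflow undefined behavior;
         * the sum wraps modulo 2^32 before being converted back. */
        r.i32[i] = (int32_t)((uint32_t)a.i32[i] + (uint32_t)b.i32[i]);
    }
    return r;
}
```

Because the loop has a fixed trip count and no dependencies between iterations, it is the kind of code an optimizing compiler can often turn into a single vector instruction, though that is not guaranteed.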
What is the rationale for not having everything fully inlined?
I expect a huge overhead in calling library functions for each vector instruction.
I am trying to build the ixgbe device driver in DPDK, but I got the following error message:
/home/ioa/david/dpdk-16.11.port/drivers/net/ixgbe/vec128int.h: In function 'vec_load1q':
/home/ioa/david/dpdk-16.11.port/drivers/net/ixgbe/vec128int.h:30:31: warning: cast discards 'const' qualifier from pointer target type [-Wcast-qual]
return (__m128i) vec_ld (0, (vector unsigned char*) address);
^
/home/ioa/david/dpdk-16.11.port/drivers/net/ixgbe/vec128int.h: In function 'vec_loadu1q':
/home/ioa/david/dpdk-16.11.port/drivers/net/ixgbe/vec128int.h:37:20: warning: cast discards 'const' qualifier from pointer target type [-Wcast-qual]
src1 = vec_ld(0, (vector unsigned char*) address);
^
/home/ioa/david/dpdk-16.11.port/drivers/net/ixgbe/vec128int.h:38:21: warning: cast discards 'const' qualifier from pointer target type [-Wcast-qual]
src2 = vec_ld(16, (vector unsigned char*) address);
^
/home/ioa/david/dpdk-16.11.port/drivers/net/ixgbe/vec128int.h:43:23: warning: cast discards 'const' qualifier from pointer target type [-Wcast-qual]
return vec_xl (0, (unsigned char *) address);
^
/home/ioa/david/dpdk-16.11.port/drivers/net/ixgbe/vec128int.h:43:5: error: invalid parameter combination for AltiVec intrinsic
return vec_xl (0, (unsigned char *) address);
Could you help?
First, thank you for the amazing job of converting these functions to AltiVec. I have one request: could someone help me convert one more? I'm looking at converting the following: _mm_cvtsi128_si32
Would anybody help with this one?
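For reference, `_mm_cvtsi128_si32` returns the lowest 32 bits of a 128-bit integer vector as a scalar. The sketch below only shows that semantics in portable C; `sketch_m128i` and `sketch_cvtsi128_si32` are hypothetical names, and the struct is a stand-in rather than veclib's actual `__m128i`. On POWER, a real version would presumably extract a lane with an intrinsic such as `vec_extract`, and which lane index holds the low 32 bits depends on endianness, so that part is deliberately not shown.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit integer vector: four 32-bit lanes.
 * Lane 0 is assumed to hold the lowest 32 bits, as on little-endian x86. */
typedef struct { int32_t i32[4]; } sketch_m128i;

/* Return the lowest 32-bit lane as a scalar; the upper lanes are ignored. */
static inline int32_t sketch_cvtsi128_si32(sketch_m128i a)
{
    return a.i32[0];
}
```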
I noticed you currently have all your function implementations in a .h file. While this may seem simpler, it can cause big issues for anyone depending on your project.
If two different .c files include your header (even though you have #pragma once), you'll doubly define your function. This is not allowed in either C or C++.
The standard way of getting around this is to move your implementations into a .c file. You'll keep your declarations in the .h. So it'll look a bit like this:
// -------------------------------
// in your .h file
#pragma once
// your includes go here
__m128i _mm_load_si128 (__m128i const* address);
__m128i _mm_loadu_si128 (__m128i const* address);
// etc.
// -------------------------------
// in your .c file
// your includes go here (don't forget your .h file!)
__m128i _mm_load_si128 (__m128i const* address)
{
    return vec_load1q (address);
}
__m128i _mm_loadu_si128 (__m128i const* address)
{
    return vec_loadu1q (address);
}
In C++, you get an extra option to work around the issue: inline your functions. This doesn't really fix anything, but compilers ignore the problem when you do this. C has support for inlining, but I tried it and it didn't seem to avoid the issue.
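One variant not mentioned above is `static inline`, which is valid in both C99 and C++. Each .c file that includes the header then gets its own private copy of the function, so the linker never sees two external definitions. Plain `inline` in C behaves differently: it additionally needs one external definition somewhere, which may explain the result reported above. The following is only a sketch with hypothetical names (`sketch_m128i`, `sketch_loadu_si128`), using `memcpy` as a portable stand-in for an unaligned vector load; it is not veclib's code.

```c
/* Contents of a hypothetical header, e.g. veclib_sketch.h */
#ifndef VECLIB_SKETCH_H
#define VECLIB_SKETCH_H

#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit integer vector. */
typedef struct { int32_t i32[4]; } sketch_m128i;

/* static gives the function internal linkage, so every translation unit
 * that includes this header has its own copy and no symbols collide. */
static inline sketch_m128i sketch_loadu_si128(const void *address)
{
    sketch_m128i r;
    /* memcpy copies 16 bytes with no alignment requirement on address
     * and does not cast away const. */
    memcpy(&r, address, sizeof r);
    return r;
}

#endif /* VECLIB_SKETCH_H */
```

The trade-off is possible code duplication if the compiler chooses not to inline a call, but for one-instruction wrappers like these that cost is usually small.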