ivanthedugtrio / veclib

14.0 14.0 3.0 35 KB

Vector library for porting SSE2 instructions to other architectures

License: MIT License

C 100.00%

veclib's Issues

Consider merging with SIMDe project

I've been working on a similar project called SIMDe which is also MIT licensed, and is also an attempt to allow code written for one set of SIMD instructions to run on machines without them.

We're both working on implementing x86/x86_64 ISA extensions right now, but SIMDe is using portable fallbacks (with hints to encourage the compiler to vectorize what it can) instead of POWER intrinsics. I have been planning to create an AltiVec/VMX/VSX backend for SIMDe eventually, but so far I've been focusing on getting the portable version in place. Eventually I also intend to go in the other direction with SIMDe: AltiVec/VMX/VSX (and others) to SSE (and everything else).

I'm wondering if you would be interested in merging the two projects. I think it would be great for both projects; it would increase the number of functions supported (SIMDe already fully supports all of MMX and SSE1, and partially supports several others), and of course from SIMDe's perspective it would greatly improve performance on POWER machines. I think veclib would also benefit from SIMDe's infrastructure; we have a pretty decent test suite and continuous integration.

The big caveat, as far as I'm concerned, is that I'm not comfortable using powerveclib due to the license. I intend to reach out to the author about this issue, but given that it's an IBM project I don't hold out a lot of hope for getting a more flexibly-licensed version. I'm not exactly sure where the line between AltiVec/VMX/VSX and powerveclib is (I've never used the POWER intrinsics before), but I'm guessing this may reduce the number of instructions which are accelerated in the short term.

why not fully inline?

What is the rationale for not having everything fully inlined?
I expect a huge overhead from calling a library function for each vector instruction.

invalid parameter combination for AltiVec intrinsic

I am trying to build the ixgbe device driver in DPDK, but I got the following error message:
/home/ioa/david/dpdk-16.11.port/drivers/net/ixgbe/vec128int.h: In function 'vec_load1q':
/home/ioa/david/dpdk-16.11.port/drivers/net/ixgbe/vec128int.h:30:31: warning: cast discards 'const' qualifier from pointer target type [-Wcast-qual]
return (__m128i) vec_ld (0, (vector unsigned char*) address);
^
/home/ioa/david/dpdk-16.11.port/drivers/net/ixgbe/vec128int.h: In function 'vec_loadu1q':
/home/ioa/david/dpdk-16.11.port/drivers/net/ixgbe/vec128int.h:37:20: warning: cast discards 'const' qualifier from pointer target type [-Wcast-qual]
src1 = vec_ld(0, (vector unsigned char*) address);
^
/home/ioa/david/dpdk-16.11.port/drivers/net/ixgbe/vec128int.h:38:21: warning: cast discards 'const' qualifier from pointer target type [-Wcast-qual]
src2 = vec_ld(16, (vector unsigned char*) address);
^
/home/ioa/david/dpdk-16.11.port/drivers/net/ixgbe/vec128int.h:43:23: warning: cast discards 'const' qualifier from pointer target type [-Wcast-qual]
return vec_xl (0, (unsigned char *) address);
^
/home/ioa/david/dpdk-16.11.port/drivers/net/ixgbe/vec128int.h:43:5: error: invalid parameter combination for AltiVec intrinsic
return vec_xl (0, (unsigned char *) address);

Could you help?

_mm_cvtsi128_si32 convert

First, thank you for the amazing job of converting these functions to AltiVec. I have one request: I'm looking at converting _mm_cvtsi128_si32.
Would anybody be able to help with this one?
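
For reference, here is a minimal portable sketch of what _mm_cvtsi128_si32 computes: it returns the lowest 32-bit lane of the 128-bit value. This is not veclib code. The type v128_sketch and the function cvtsi128_si32_sketch are hypothetical names, and the sketch assumes the 16 bytes are held in SSE (little-endian) memory order. An AltiVec version would need to pick the matching lane, which may be numbered differently on big-endian POWER.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __m128i: 16 bytes in SSE (little-endian) memory order. */
typedef struct { uint8_t bytes[16]; } v128_sketch;

/* Sketch of _mm_cvtsi128_si32: return the lowest 32-bit lane as a signed int. */
static inline int32_t cvtsi128_si32_sketch(v128_sketch v)
{
    /* Assemble the lane byte by byte so the result does not depend on host endianness. */
    uint32_t lo = (uint32_t)v.bytes[0]
                | ((uint32_t)v.bytes[1] << 8)
                | ((uint32_t)v.bytes[2] << 16)
                | ((uint32_t)v.bytes[3] << 24);

    /* Reinterpret the bit pattern as signed without relying on conversion rules. */
    int32_t out;
    memcpy(&out, &lo, sizeof out);
    return out;
}
```

The upper 96 bits are simply ignored, which matches the behaviour of the SSE2 intrinsic.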

Either inline your functions, move them in a class, or move them to a .c file

I noticed you currently have all your function implementations in a .h file. While this may seem simpler, this can cause big issues when depending on your project.

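If two different .c files include your header, each one gets its own definition of every function, so you'll doubly define your functions at link time (#pragma once only prevents double inclusion within a single .c file). This is not allowed in either C or C++.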

The standard way of getting around this is to move your implementations into a .c file. You'll keep your declarations in the .h. So it'll look a bit like this:

// -------------------------------
// in your .h file
#pragma once

// your includes go here

__m128i _mm_load_si128 (__m128i const* address);
__m128i _mm_loadu_si128 (__m128i const* address);
// etc.

// -------------------------------
// in your .c file

// your includes go here (don't forget your .h file!)

__m128i _mm_load_si128 (__m128i const* address)
{
    return vec_load1q (address);
}

__m128i _mm_loadu_si128 (__m128i const* address)
{
    return vec_loadu1q (address);
}

In C++, you get some extra options to work around the issue:

1. Inline your functions. This doesn't really fix anything, but compilers ignore the problem when you do this. C has support for inlining, but I tried it and it didn't seem to avoid the issue.
2. Move your functions into a class declaration, and implement them within the class body. This actually inlines your functions, so it's really the same as #1.
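
On the C side, one likely reason plain inline did not help is that in C99 an inline definition in a header does not provide an external definition, so the linker can still complain. A common workaround is static inline, which gives every .c file its own private copy. Below is a minimal sketch of a header written that way. The names vec4u_sketch and add4u_sketch are hypothetical and only stand in for the real AltiVec wrappers.

```c
/* veclib_sketch.h (hypothetical header-only layout) */
#ifndef VECLIB_SKETCH_H
#define VECLIB_SKETCH_H

#include <stdint.h>

/* Hypothetical four-lane vector type standing in for __m128i. */
typedef struct { uint32_t lane[4]; } vec4u_sketch;

/* static inline: internal linkage, so including this header from several
   .c files does not produce duplicate external definitions. */
static inline vec4u_sketch add4u_sketch(vec4u_sketch a, vec4u_sketch b)
{
    vec4u_sketch r;
    for (int i = 0; i < 4; ++i) {
        r.lane[i] = a.lane[i] + b.lane[i]; /* unsigned add wraps on overflow */
    }
    return r;
}

#endif /* VECLIB_SKETCH_H */
```

This keeps the library header-only and lets the compiler inline each wrapper at the call site, at the cost of a separate copy per .c file wherever the call is not inlined.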
