dmlc / halideir Goto Github PK

Symbolic Expression and Statement Module for new DSLs

License: Other

Makefile 0.16% C++ 99.84%

halideir's Introduction

Distributed Machine Learning Common Codebase

DMLC-Core is the backbone library that supports all DMLC projects. It offers the building blocks for efficient and scalable distributed machine learning libraries.

Developer Channel

What's New

Note on Parameter Module for Machine Learning

Known Issues

The RecordIO format is not portable across processors with different endianness. A RecordIO file saved on an x86 machine cannot be loaded on a SPARC machine, because x86 is little-endian while SPARC is big-endian.
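To illustrate why raw binary data is tied to the host byte order, here is a minimal sketch. It is not RecordIO's actual layout or API; the helper names (store_le32, load_le32, load_native32, host_is_little_endian) are hypothetical and exist only for this example. It contrasts reading bytes in a fixed little-endian order with copying them straight into a host integer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: write a 32-bit value in a fixed little-endian
// byte order, regardless of the host's native order.
inline void store_le32(uint32_t v, unsigned char out[4]) {
  out[0] = static_cast<unsigned char>(v & 0xFF);
  out[1] = static_cast<unsigned char>((v >> 8) & 0xFF);
  out[2] = static_cast<unsigned char>((v >> 16) & 0xFF);
  out[3] = static_cast<unsigned char>((v >> 24) & 0xFF);
}

// Hypothetical helper: decode the same fixed little-endian layout.
// Works identically on little-endian and big-endian hosts.
inline uint32_t load_le32(const unsigned char in[4]) {
  return static_cast<uint32_t>(in[0]) |
         (static_cast<uint32_t>(in[1]) << 8) |
         (static_cast<uint32_t>(in[2]) << 16) |
         (static_cast<uint32_t>(in[3]) << 24);
}

// Copy the raw bytes into a host integer. The result depends on the
// host byte order, which is the portability problem described above.
inline uint32_t load_native32(const unsigned char in[4]) {
  uint32_t v;
  std::memcpy(&v, in, sizeof(v));
  return v;
}

// True when the least significant byte is stored first in memory.
inline bool host_is_little_endian() {
  const uint32_t one = 1;
  unsigned char first;
  std::memcpy(&first, &one, 1);
  return first == 1;
}

// Small self-check: the fixed-order round trip is host independent.
inline void check_roundtrip() {
  unsigned char buf[4];
  store_le32(0xDEADBEEFu, buf);
  assert(load_le32(buf) == 0xDEADBEEFu);
}
```

On a little-endian host load_native32 and load_le32 agree; on a big-endian host load_native32 would return the byte-swapped value, which is the kind of mismatch the known issue describes.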

Contributing

Contributions to dmlc-core are welcome! dmlc-core follows Google's C++ style guide. If you are interested in contributing, take a look at the feature wishlist and open a new issue if you would like to add something.

DMLC-Core uses the C++11 standard. Ensure that your C++ compiler supports C++11.
Introduce as few dependencies as possible.

Checklist before submitting code

Type make lint and fix all the style problems.
Type make doc and fix all the warnings.

NOTE

deps:

libcurl4-openssl-dev

halideir's Issues

Simplify single point variables

Currently, if a variable can take only a single constant value (e.g., the iteration variable of a loop with a count of 1), Simplify does not replace it with that constant.

Version tag

Please create version tags so that the code can be downloaded as tarballs.

Tracking changes in upstream

halide/Halide#1604

TVM and HalideIR circular dependency

It appears that as of cf6090a it is no longer possible to build HalideIR and TVM separately, because there is an unresolved circular dependency. The two projects can only be built when combined in one tree, so it seems logical to merge the two code bases.

size_t seems to differ from uint64_t on macOS (64-bit)

When compiling the latest tvm, the build failed on topi.cc.o:

FAILED: CMakeFiles/tvm_topi.dir/topi/src/topi.cc.o 
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++  -DDMLC_USE_FOPEN64=0 -DGLFW_DLL -DTVM_CUDA_RUNTIME=0 -DTVM_LLVM_VERSION=50 -DTVM_METAL_RUNTIME=1 -DTVM_OPENCL_RUNTIME=1 -DTVM_OPENGL_RUNTIME=1 -Dtvm_topi_EXPORTS -I../include -I../HalideIR/src -I../dlpack/include -I../topi/include -I/usr/local/opt/llvm/include -I../dmlc-core/include -isystem /usr/local/include -O3 -Wall -fPIC -std=c++11 -fPIC   -DLLVM_BUILD_GLOBAL_ISEL -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -MD -MT CMakeFiles/tvm_topi.dir/topi/src/topi.cc.o -MF CMakeFiles/tvm_topi.dir/topi/src/topi.cc.o.d -o CMakeFiles/tvm_topi.dir/topi/src/topi.cc.o -c ../topi/src/topi.cc
In file included from ../topi/src/topi.cc:12:
In file included from ../topi/include/topi/broadcast.h:11:
../topi/include/topi/detail/broadcast.h:58:10: warning: explicitly assigning value of variable of type 'int' to itself [-Wself-assign]
  for (i = i; i <= max_size; ++i) {
       ~ ^ ~
In file included from ../topi/src/topi.cc:31:
In file included from ../topi/include/topi/cuda/dense.h:14:
In file included from ../topi/include/topi/contrib/cublas.h:10:
../topi/include/topi/detail/extern.h:113:5: error: call to 'make_const' is ambiguous
    make_const(Int(32), buf->shape.size()),
    ^~~~~~~~~~
../HalideIR/src/ir/IROperator.h:81:13: note: candidate function
EXPORT Expr make_const(Type t, int64_t val);
            ^
../HalideIR/src/ir/IROperator.h:82:13: note: candidate function
EXPORT Expr make_const(Type t, uint64_t val);
            ^
../HalideIR/src/ir/IROperator.h:83:13: note: candidate function
EXPORT Expr make_const(Type t, double val);
            ^
../HalideIR/src/ir/IROperator.h:84:13: note: candidate function
inline Expr make_const(Type t, int32_t val)   {return make_const(t, (int64_t)val);}
            ^
../HalideIR/src/ir/IROperator.h:85:13: note: candidate function
inline Expr make_const(Type t, uint32_t val)  {return make_const(t, (uint64_t)val);}
            ^
../HalideIR/src/ir/IROperator.h:86:13: note: candidate function
inline Expr make_const(Type t, int16_t val)   {return make_const(t, (int64_t)val);}
            ^
../HalideIR/src/ir/IROperator.h:87:13: note: candidate function
inline Expr make_const(Type t, uint16_t val)  {return make_const(t, (uint64_t)val);}
            ^
../HalideIR/src/ir/IROperator.h:88:13: note: candidate function
inline Expr make_const(Type t, int8_t val)    {return make_const(t, (int64_t)val);}
            ^
../HalideIR/src/ir/IROperator.h:89:13: note: candidate function
inline Expr make_const(Type t, uint8_t val)   {return make_const(t, (uint64_t)val);}
            ^
../HalideIR/src/ir/IROperator.h:90:13: note: candidate function
inline Expr make_const(Type t, bool val)      {return make_const(t, (uint64_t)val);}
            ^
../HalideIR/src/ir/IROperator.h:91:13: note: candidate function
inline Expr make_const(Type t, float val)     {return make_const(t, (double)val);}
            ^
1 warning and 1 error generated.

The return type of buf->shape.size() is size_t, which appears to be a different type from uint64_t with the clang shipped in my Xcode.

size_t is defined in stddef.h (/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/9.0.0/include/stddef.h:54):

#if defined(__need_size_t)
#if !defined(_SIZE_T) || __has_feature(modules)
/* Always define size_t when modules are available. */
#if !__has_feature(modules)
#define _SIZE_T
#endif
typedef __SIZE_TYPE__ size_t;
#endif
#undef __need_size_t
#endif /*defined(__need_size_t) */

According to this answer and the reference, the definition of size_t is implementation-dependent.

I added inline Expr make_const(Type t, size_t val) {return make_const(t, (uint64_t)val);} and it works, but I don't think it is a proper solution: on some systems size_t and uint64_t are the same type, in which case the extra overload would be a redefinition.

Why don't we use a template here?
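A minimal sketch of what a template-based dispatch could look like, assuming nothing about how HalideIR would actually adopt it. Expr here is a hypothetical stand-in that only records which path was taken, the Type parameter is omitted for brevity, and make_const_int, make_const_uint, and make_const_float are placeholder names for the three non-inline overloads. The idea is to select the path by type category (signed, unsigned, floating point) instead of by exact type. Any integral type, including size_t, then matches exactly one template, whether or not it is the same type as uint64_t on a given platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for HalideIR's Expr: records which path ran.
struct Expr {
  enum Kind { kInt, kUInt, kFloat } kind;
  double value;
};

// Placeholders for the int64_t / uint64_t / double overloads.
inline Expr make_const_int(int64_t v) {
  return {Expr::kInt, static_cast<double>(v)};
}
inline Expr make_const_uint(uint64_t v) {
  return {Expr::kUInt, static_cast<double>(v)};
}
inline Expr make_const_float(double v) {
  return {Expr::kFloat, v};
}

// Signed integral types widen to int64_t.
template <typename T,
          typename std::enable_if<std::is_integral<T>::value &&
                                  std::is_signed<T>::value, int>::type = 0>
inline Expr make_const(T v) {
  return make_const_int(static_cast<int64_t>(v));
}

// Unsigned integral types (including bool and size_t) widen to uint64_t.
template <typename T,
          typename std::enable_if<std::is_integral<T>::value &&
                                  std::is_unsigned<T>::value, int>::type = 0>
inline Expr make_const(T v) {
  return make_const_uint(static_cast<uint64_t>(v));
}

// Floating-point types widen to double.
template <typename T,
          typename std::enable_if<std::is_floating_point<T>::value,
                                  int>::type = 0>
inline Expr make_const(T v) {
  return make_const_float(static_cast<double>(v));
}

// Small self-check: a size_t argument resolves without ambiguity.
inline void check_dispatch() {
  const std::size_t n = 3;
  assert(make_const(n).kind == Expr::kUInt);
}
```

On 64-bit macOS, size_t is unsigned long while uint64_t is unsigned long long, so with the plain overload set a size_t argument has no exact match and every candidate needs a conversion, which is why the call is reported as ambiguous. The trade-off of the template approach is that arguments with user-defined conversions to a numeric type no longer match.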

How can I get serialized HalideIR in tvm through C++?

Hello, and thanks for these important fundamental projects.
I want to get the serialized HalideIR in textual format and compare it to another IR.
Could you give me some ideas on how to do that?
Thanks

Incorrect constant folding for division and mod for negative integers

(Pasting the bug report from https://discuss.tvm.ai/t/integer-constant-folding-for-division-and-mod/2008) My understanding is that TVM/HalideIR uses truncated division/mod, unlike Halide, which uses Euclidean division/mod. Nevertheless, TVM uses div_imp (implementation below) to do constant folding for division. This implementation seems to come from the Halide source code.

template<typename T>
inline T div_imp(T a, T b) {
    Type t = type_of<T>();
    if (t.is_int()) {
        int64_t q = a / b;
        int64_t r = a - q * b;
        int64_t bs = b >> (t.bits() - 1);
        int64_t rs = r >> (t.bits() - 1);
        return (T) (q - (rs & bs) + (rs & ~bs));
    } else {
        return a / b;
    }
}
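To make the difference concrete, here is a small standalone sketch that contrasts truncated division (what C++ `/` and `%` do) with Euclidean division. The helper names are hypothetical and this is not the HalideIR code. It uses an explicit sign test rather than the shift-based formula quoted above, which relies on arithmetic right shift of negative values.

```cpp
#include <cassert>
#include <cstdint>

// Truncated division: the quotient rounds toward zero (C++ built-in).
inline int64_t trunc_div(int64_t a, int64_t b) {
  assert(b != 0);
  return a / b;
}
inline int64_t trunc_mod(int64_t a, int64_t b) {
  assert(b != 0);
  return a % b;
}

// Euclidean division: the remainder is always in [0, |b|).
// Not guarded against overflow for INT64_MIN operands.
inline int64_t euclid_div(int64_t a, int64_t b) {
  assert(b != 0);
  int64_t q = a / b;
  const int64_t r = a % b;
  // A negative remainder means the truncated quotient must be moved
  // one step away from zero in the direction given by the sign of b.
  if (r < 0) q += (b > 0) ? -1 : 1;
  return q;
}
inline int64_t euclid_mod(int64_t a, int64_t b) {
  assert(b != 0);
  int64_t r = a % b;
  if (r < 0) r += (b > 0) ? b : -b;
  return r;
}
```

For example, trunc_div(-7, 2) is -3 with remainder -1, whereas euclid_div(-7, 2) is -4 with remainder 1. Tracing the quoted div_imp by hand for a = -7, b = 2 (assuming arithmetic right shift) also gives -4, which is consistent with the report that the folded constant disagrees with truncated runtime semantics whenever the truncated remainder is negative.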

Substitute makes the expression more complicated

import tvm

def register_mem(scope_tb, max_bits):
    #Register mem
    @tvm.register_func("tvm.info.mem.%s" % scope_tb)
    def mem_info_inp_buffer():
        return tvm.make.node("MemoryInfo",
                        unit_bits= 16,
                        max_simd_bits=32,
                        max_num_bits=max_bits,
                        head_address=None)


def test():
    scope_tb = "local.L0v"
    max_bits = 1024 * 1024 * 1024

    ib = tvm.ir_builder.create()
    A = ib.allocate("int32", 200, name="A", scope=scope_tb)
    with ib.for_range(0, 10, name="i") as i:
        with ib.for_range(0, 10, name="j") as j:
            A[i*10+j] = 1

    B = ib.allocate("int32", 200, name="B", scope=scope_tb)
    with ib.for_range(0, 10, name="i") as i:
        with ib.for_range(0, 10, name="j") as j:
            with ib.if_scope(j == A[i]):
                B[i*10+j] = 2

    body = ib.get()
    print(tvm.ir_pass.Simplify(body))

test()

Before:

// attr [A] storage_scope = "local.L0v"
allocate A[int32 * 200]
for (i, 0, 10) {
  for (j, 0, 10) {
    A[((i*10) + j)] = 1
  }
}
// attr [B] storage_scope = "local.L0v"
allocate B[int32 * 200]
for (j, 0, 10) {
  for (j, 0, 10) {
    if ((j == A[j])) {
      B[((j*10) + j)] = 2
    }
  }
}

After, I got B[((j*10) + A[j])], which is a more complicated expression:

// attr [A] storage_scope = "local.L0v"
allocate A[int32 * 200]
for (i, 0, 10) {
  for (j, 0, 10) {
    A[((i*10) + j)] = 1
  }
}
// attr [B] storage_scope = "local.L0v"
allocate B[int32 * 200]
for (j, 0, 10) {
  for (j, 0, 10) {
    if ((j == A[j])) {
      B[((j*10) + A[j])] = 2
    }
  }
}
