At the time being, the (C++) SimpleArray class templa

Single Python wrapper for SimpleArray, about solvcon/modmesh

Comments (24)

tigercosmos commented on June 30, 2024 1

I am now working on this.

yungyuc commented on June 30, 2024 1

@tigercosmos no, for construction of SimpleArray from Python, casting may happen during runtime. The Python-side construction already happens at runtime, not compile-time.

SimpleArray::value_type is still determined at compile-time. This does not change.

@terrychan999 How about creating a SimpleArrayFacade to dispatch everything about SimpleArray between Python and C++, instead of just the constructor? Would that work?

terrychan999 commented on June 30, 2024 1

Though I planned to add a base class for WrapSimpleArray, I found it quite complicated.
Instead, I came up with the idea of writing a Python class to achieve the goal of a single wrapper.

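import numpy as np

# SimpleArrayBool, SimpleArrayInt8, ... are the existing typed wrappers in modmesh.


class PySimpleArray: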
    def __init__(self, *args, **kwargs):
        dtype_to_class = {
            "bool": SimpleArrayBool,
            "int8": SimpleArrayInt8,
            "int16": SimpleArrayInt16,
            "int32": SimpleArrayInt32,
            "int64": SimpleArrayInt64,
            "uint8": SimpleArrayUint8,
            "uint16": SimpleArrayUint16,
            "uint32": SimpleArrayUint32,
            "uint64": SimpleArrayUint64,
            "float32": SimpleArrayFloat32,
            "float64": SimpleArrayFloat64,
        }

        # Check if the first argument is a numpy array
        if args and isinstance(args[0], np.ndarray):
            ndarray = args[0]
            dtype = ndarray.dtype.name
            self._arr = dtype_to_class[dtype](array=ndarray)
            return

        shape, value, dtype = self._extract_args(args, kwargs)

        if dtype not in dtype_to_class:
            raise ValueError(f"Unsupported dtype: {dtype}")

        if value is not None:
            self._arr = dtype_to_class[dtype](shape, value)
        else:
            self._arr = dtype_to_class[dtype](shape)

    def _extract_args(self, args, kwargs):
        shape = dtype = value = None

        # Extract shape
        if args:
            shape = args[0]
        else:
            shape = kwargs.get("shape")

        # The second argument could be value or dtype
        if len(args) == 2:
            if "dtype" in kwargs:
                value = args[1]
            else:
                dtype = args[1]
        elif len(args) == 3:
            value = args[1]
            dtype = args[2]

        # Compare against None so that falsy values such as 0 or False survive.
        if dtype is None:
            dtype = kwargs.get("dtype")

        if value is None:
            value = kwargs.get("value")

        return shape, value, dtype

    def __getattr__(self, attr):
        return getattr(self._arr, attr)

    def __getitem__(self, key):
        return self._arr[key]

    def __setitem__(self, key, value):
        self._arr[key] = value

In this way, we can make SimpleArray easy to use from Python like the following example:

import modmesh as mm
import numpy as np


# case 1: init with shape and dtype
sarr1 = mm.PySimpleArray((2, 3, 4), dtype="float64")
assert sarr1.ndarray.dtype == np.float64
print(type(sarr1), sarr1.shape)
# Expected: <class 'modmesh.simplearray.PySimpleArray'> (2, 3, 4)

# case 2: init with shape, value and dtype
sarr2 = mm.PySimpleArray(shape=(2, 3, 4), value=10.0, dtype="float64")
assert sarr2[0, 0, 0] == 10.0
print(sarr2[0,0,0])
# Expected: 10.0

# case 3: init with a numpy array
ndarr4 = np.array([[1, 2, 3, 4], [5, 6, 7, 8]], dtype="int8")
sarr3 = mm.PySimpleArray(ndarr4) # no need to specify dtype
assert sarr3.ndarray.dtype == np.int8
assert np.array_equal(sarr3.ndarray, ndarr4)
print(sarr3.ndarray.dtype)
# Expected: int8

yungyuc commented on June 30, 2024 1

@terrychan999 To allow the compile-time check, perhaps you may make a C++ facade class like:

class SimpleArrayPlex
{
    enum ValueType
    {
        BOOL_ = 0,
        BYTE_,
        // ...
    };

    wrapper_to_SimpleArray_member_function();

    void * m_ptr;
    ValueType m_vtype;
}; /* end class SimpleArrayPlex */

The enum ValueType works in a similar way to, and uses the same names and values as, the constants in pybind11:

struct npy_api {
    enum constants {
        NPY_ARRAY_C_CONTIGUOUS_ = 0x0001,
        NPY_ARRAY_F_CONTIGUOUS_ = 0x0002,
        NPY_ARRAY_OWNDATA_ = 0x0004,
        NPY_ARRAY_FORCECAST_ = 0x0010,
        NPY_ARRAY_ENSUREARRAY_ = 0x0040,
        NPY_ARRAY_ALIGNED_ = 0x0100,
        NPY_ARRAY_WRITEABLE_ = 0x0400,
        NPY_BOOL_ = 0,
        NPY_BYTE_,
        NPY_UBYTE_,
        NPY_SHORT_,
        NPY_USHORT_,
        NPY_INT_,
// ...

tigercosmos commented on June 30, 2024

The following code is my idea to achieve the goal:

#include <iostream>
#include <pybind11/pybind11.h>

namespace py = pybind11;

struct MyClassBase{

};

template <typename T>
struct MyClass : public MyClassBase
{
    MyClass(size_t size)
    {
        ptr = new T[size];
        for (size_t i = 0; i < size; ++i)
        {
            ptr[i] = 9999;
        }
    }

    T at(size_t idx)
    {
        return ptr[idx];
    }

    T * ptr;
};

MyClassBase* create(int type)
{
    if (type == 1)
    {
        return new MyClass<int>(10);
    }

    return new MyClass<double>(10);
}

PYBIND11_MODULE(example, m)
{
    py::class_<MyClassBase*>(m, "MyClass")
        .def(py::init(&create))
        .def("at", [](MyClass<int> * self, int & size)
             { return self->at(size); })
        .def("at", [](MyClass<double> * self, int & size)
             { return self->at(size); });
}

from example import *

m = MyClass(1) # type == 1, should return MyClass<int>

print(m)

print(m.at(5))

and I got the result:

$ python3 test.py
<example.MyClass object at 0x7f68c9a5c4b0>
Traceback (most recent call last):
  File "test.py", line 7, in <module>
    print(m.at(5))
TypeError: at(): incompatible function arguments. The following argument types are supported:
    1. (self: MyClass<int>, arg0: int) -> int
    2. (self: MyClass<double>, arg0: int) -> float

Invoked with: <example.MyClass object at 0x7f68c9a5c4b0>, 5

I don't know what I did wrong. Shouldn't m be a MyClass<T> and match the argument type?

yungyuc commented on June 30, 2024

MyClassBase, MyClass<int>, and MyClass<double> are three different C++ classes that all need pybind11 wrappers, but you only provide the wrapper for the first class.

tigercosmos commented on June 30, 2024

I thought Python class MyClass is equal to C++ class MyClassBase*, am I right?

If so, in the following code, MyClassDouble should be equal to MyClass<double> *?

    py::class_<MyClass<double> *>(m, "MyClassDouble")
        .def(py::init([]()
                      { return new MyClass<double>(10); }))
        .def("at", [](MyClass<double> * self, int & size)
             { return self->at(size); });

m = MyClassDouble()
m.at(5)

  File "test.py", line 5, in <module>
    print(m.at(5))
TypeError: at(): incompatible function arguments. The following argument types are supported:
    1. (self: MyClass<double>, arg0: int) -> float

Invoked with: <example.MyClassDouble object at 0x7fe150ae95b0>, 5

The types still mismatch. I think I am making some really basic mistake; could you give me some hints?

yungyuc commented on June 30, 2024

This is interesting. I didn't try to wrap a pointer type myself. Could you try the following and see what happens?

py::class_<MyClass<double>>(m, "MyClassDouble")
  // ...
;

tigercosmos commented on June 30, 2024

@yungyuc This is my proposal. I think the same mechanism will also work for SimpleArray.

#include <iostream>
#include <variant>
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>

namespace py = pybind11;

struct MyClassBase
{
    int dtype;
    void * ptr;
};

template <typename T>
struct MyClass : public MyClassBase
{
    MyClass(size_t size)
    {
        T * ptr_ = new T[size];
        for (size_t i = 0; i < size; ++i)
        {
            ptr_[i] = 100.555 + i;
        }

        ptr = static_cast<void *>(ptr_);
    }

    T at(size_t idx)
    {
        return *(reinterpret_cast<T *>(ptr) + idx);
    }
};

MyClassBase create(int type)
{
    if (type == 1)
    {
        auto m = MyClass<int>(10);
        m.dtype = 1;
        return m;
    }

    auto m = MyClass<double>(10);
    m.dtype = 2;
    return m;
}

using dtypes = std::variant<double, int>;

dtypes base_at(MyClassBase & self, int & size)
{
    if (self.dtype == 1)
    {
        auto m = reinterpret_cast<MyClass<int> &>(self);
        return m.at(size);
    }
    else
    {
        auto m = reinterpret_cast<MyClass<double> &>(self);
        return m.at(size);
    }
}

PYBIND11_MODULE(example, m)
{

    py::class_<MyClass<int>>(m, "MyClassInt")
        .def(py::init([](int size)
                      { return MyClass<int>(size); }));

    py::class_<MyClass<double>>(m, "MyClassDouble")
        .def(py::init([](int size)
                      { return MyClass<double>(size); }));

    py::class_<MyClassBase>(m, "MyClass")
        .def(py::init(&create))
        .def("at", &base_at);
}

from example import *

m = MyClass(1) # type == 1, should call MyClass<int>
print(m.at(5))
# 105

m = MyClass(2) # type == 2, should call MyClass<double>
print(m.at(6))
# 106.555

yungyuc commented on June 30, 2024

In your design, when there is a wrapper

m.def("get_element", [](MyClass<int> const & c, size_t i) { return c.at(i); });

What will happen with the Python code

c = MyClass(1)
print(m.get_element(c, 0))

tigercosmos commented on June 30, 2024

What will happen with the Python code

c = MyClass(1)
print(m.get_element(c, 0))

TypeError: get_element(): incompatible function arguments. The following argument types are supported:
    1. (arg0: example.MyClassInt, arg1: int) -> int

Invoked with: <example.MyClass object at 0x7f189a2a3970>, 0

There is an error since the types are mismatched.

Why don't we make get_element take MyClassBase, as I do for the function base_at?

m.def("get_element", [](MyClassBase const & c, size_t i) { ...

I think I can still make it work when the argument is a derived class, as long as I define a type_caster for MyClass<int> that accepts MyClassBase. Is this what you want, and what is the difference?

tigercosmos commented on June 30, 2024

To answer the question from #52,

how it manages memory buffer:

There is a void pointer in the base class that stores the buffer, and the base class remembers the dtype. When the derived typed class accesses the buffer, the data is cast to the correct type according to the dtype information. The idea is much like pybind11::array and pybind11::array_t, so we can have SimpleArrayBase and SimpleArray_T.

I made the simplified example work, as in the code above. The only thing I am worried about is how to modify the current code based on this concept, since it may not be easy to cast SimpleArrayBase to SimpleArray_T when there is a ConcreteBuffer class inside. I guess it will change the whole architecture a lot and will be a big engineering effort.
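
The "untyped buffer plus dtype tag" idea can be sketched in pure Python, which may make the mechanism easier to see before touching the C++ classes. This is only an illustration under assumptions: the class name ArrayBase, the method typed_view, and the two dtype names are hypothetical and do not come from modmesh; memoryview.cast plays the role of the C++ cast from void *.

```python
import struct


class ArrayBase:
    """Type-erased storage: raw bytes plus a tag naming the element type."""

    # dtype name -> format code understood by struct and memoryview.cast()
    FORMATS = {"int": "i", "double": "d"}

    def __init__(self, size, dtype):
        if dtype not in self.FORMATS:
            raise ValueError(f"unsupported dtype: {dtype}")
        self.dtype = dtype
        self.size = size
        # The buffer itself knows nothing about the element type.
        self.raw = bytearray(size * struct.calcsize(self.FORMATS[dtype]))

    def typed_view(self):
        """Reinterpret the raw bytes as elements of the remembered dtype."""
        return memoryview(self.raw).cast(self.FORMATS[self.dtype])


ints = ArrayBase(4, "int")
ints.typed_view()[2] = 7

doubles = ArrayBase(4, "double")
doubles.typed_view()[1] = 2.5

print(ints.dtype, list(ints.typed_view()))        # int [0, 0, 7, 0]
print(doubles.dtype, list(doubles.typed_view()))  # double [0.0, 2.5, 0.0, 0.0]
```

The typed view is created on demand from the tag, so the storage class stays the same for every element type; only the code that reads or writes elements needs to know the type.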

yungyuc commented on June 30, 2024

Why don't we make get_element take MyClassBase, as I do for the function base_at?
m.def("get_element", [](MyClassBase const & c, size_t i) { ...
I think I can still make it work when the argument is a derived class, as long as I define a type_caster for MyClass<int> that accepts MyClassBase. Is this what you want, and what is the difference?

In C++ there will be code that accepts MyClass<type> instead of MyClassBase. Python must use dynamic typing, but C++ usually prefers static types for speed. We must not sacrifice the speed that C++ gains from resolving types at compile time.

There is a void pointer in the base class that stores the buffer, and the base class remembers the dtype. When the derived typed class accesses the buffer, the data is cast to the correct type according to the dtype information. The idea is much like pybind11::array and pybind11::array_t, so we can have SimpleArrayBase and SimpleArray_T.

This sounds like a plan. For SimpleArray we already have ConcreteBuffer to support type erasure, so you do not really need to use void *.

tigercosmos commented on June 30, 2024

@yungyuc I changed the sample code: #27 (comment)

If I move void * ptr from MyClassBase to MyClass (the derived class), the sample code fails with a segmentation fault when I call m.at().
Is there an obvious reason why the memory pointer void * ptr needs to be in the base class rather than the derived class?

I ask because I am wondering whether it is possible to avoid putting the memory buffer in the base class (say SimpleArrayBase) and let the base class hold only the dtype; otherwise I will need to modify SimpleArray a lot.

yungyuc commented on June 30, 2024

Unassigned for lack of activity.

terrychan999 commented on June 30, 2024

I have come up with an idea to just use a function to wrap WrapSimpleArray with a type argument.

For example:

template <typename T>
class MODMESH_PYTHON_WRAPPER_VISIBILITY WrapSimpleArray
    : public WrapBase<WrapSimpleArray<T>, SimpleArray<T>>
{
    // TL;DR

public:
    static SimpleArray<T> init(py::tuple shape) {
        return wrapped_type(make_shape(shape));
    }
};

pybind11::object createSimpleArray(pybind11::tuple shape, const std::string & type)
{
    namespace py = pybind11;

    if (type == "bool")
        return py::cast(WrapSimpleArray<bool>::init(shape));
    if (type == "int8")
        return py::cast(WrapSimpleArray<int8_t>::init(shape));
    if (type == "int16")
        return py::cast(WrapSimpleArray<int16_t>::init(shape));
    if (type == "int32")
        return py::cast(WrapSimpleArray<int32_t>::init(shape));
    if (type == "int64")
        return py::cast(WrapSimpleArray<int64_t>::init(shape));
    if (type == "uint8")
        return py::cast(WrapSimpleArray<uint8_t>::init(shape));
    if (type == "uint16")
        return py::cast(WrapSimpleArray<uint16_t>::init(shape));
    if (type == "uint32")
        return py::cast(WrapSimpleArray<uint32_t>::init(shape));
    if (type == "uint64")
        return py::cast(WrapSimpleArray<uint64_t>::init(shape));
    if (type == "float32")
        return py::cast(WrapSimpleArray<float>::init(shape));
    if (type == "float64")
        return py::cast(WrapSimpleArray<double>::init(shape));
    throw std::runtime_error("unsupported type.");
}

void wrap_SimpleArray(pybind11::module & mod)
{
    // TL;DR
    mod.def("SimpleArray", &createSimpleArray, pybind11::arg("shape"), pybind11::arg("dtype"));
}

Taking a step further, it could also take pybind11::args and pybind11::kwargs to handle not only the shape argument but also the value and array arguments.
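
To illustrate what such an args/kwargs factory has to do, here is a minimal pure-Python sketch. It is not the modmesh implementation: create_array, the stand-in classes ArrayInt32 and ArrayFloat64, and the positional order (shape, dtype, value) are all assumptions made for the example.

```python
from math import prod


class _TypedArray:
    """Hypothetical stand-in for one typed wrapper class."""

    pytype = float

    def __init__(self, shape, value=0):
        self.shape = tuple(shape)
        self.data = [self.pytype(value)] * prod(self.shape)


class ArrayInt32(_TypedArray):
    pytype = int


class ArrayFloat64(_TypedArray):
    pytype = float


_FACTORIES = {"int32": ArrayInt32, "float64": ArrayFloat64}
_NAMES = ("shape", "dtype", "value")  # assumed positional order


def create_array(*args, **kwargs):
    """Merge positional and keyword arguments, then dispatch on dtype."""
    if len(args) > len(_NAMES):
        raise TypeError("too many positional arguments")
    params = dict(zip(_NAMES, args))
    for key, val in kwargs.items():
        if key not in _NAMES:
            raise TypeError(f"unexpected argument: {key}")
        if key in params:
            raise TypeError(f"duplicate argument: {key}")
        params[key] = val
    if "shape" not in params or "dtype" not in params:
        raise TypeError("shape and dtype are required")
    if params["dtype"] not in _FACTORIES:
        raise ValueError(f"unsupported dtype: {params['dtype']}")
    cls = _FACTORIES[params["dtype"]]
    # Test for presence of the key, so that an explicit value of 0 is kept.
    if "value" in params:
        return cls(params["shape"], params["value"])
    return cls(params["shape"])


a = create_array((2, 3), "int32")
b = create_array(shape=(2,), dtype="float64", value=0)
print(type(a).__name__, a.shape, a.data)  # ArrayInt32 (2, 3) [0, 0, 0, 0, 0, 0]
print(type(b).__name__, b.data)           # ArrayFloat64 [0.0, 0.0]
```

Note that the returned objects still have different Python types, which is the limitation of the factory approach: it unifies construction, not the class.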

tigercosmos commented on June 30, 2024

@terrychan999 I think we want compile-time casting; otherwise it is too slow for numerical computation.

terrychan999 commented on June 30, 2024

@tigercosmos @yungyuc thanks for the comments!

I don't know much about the Facade Pattern.
Does that mean creating another class that operates at the same level as WrapSimpleArray but can handle type implicitly to replace the latter's functionality?

yungyuc commented on June 30, 2024

Yes

yungyuc commented on June 30, 2024

@terrychan999 If the single class is a pure Python class we cannot do compile-time checks. It is important to have compile-time checks.

I understand that it is complex to implement the single class at the pybind11 level, but it is what we need. Could you please elaborate on what makes it complex?

tigercosmos commented on June 30, 2024

Finally... I think I know how to finish this issue, I will start to work on this.

Sample code:

#include <iostream>
#include <variant>
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>
#include <cstring>
#include <stdexcept>
namespace py = pybind11;

template <typename Derive>
struct ArrayBase
{
    void init(size_t size)
    {
        static_cast<Derive *>(this)->init(size);
    }

    decltype(auto) at(size_t idx)
    {
        return static_cast<Derive *>(this)->at(idx);
    }
};

template <typename T>
struct Array : public ArrayBase<Array<T>>
{
    void init(size_t size)
    {
        ptr_ = new T[size];
        for (size_t i = 0; i < size; ++i)
        {
            ptr_[i] = 3.33;
        }
        m_size = size;
    }

    T at(size_t idx)
    {
        return ptr_[idx];
    }

    T * ptr_;
    size_t m_size;
};

using ArrayInt = Array<int>;
using ArrayDouble = Array<double>;

struct ArrayPlex
{
    enum ValueType
    {
        Npy_Int,
        Npy_Double,
    };

    ArrayPlex(size_t size, const char * dtype)
    {
        if (std::strcmp(dtype, "int") == 0)
        {
            m_data_type = Npy_Int;
            ArrayInt * int_array = new ArrayInt();
            int_array->init(size);
            m_ptr = (void *)int_array;
        }
        else if (std::strcmp(dtype, "double") == 0)
        {
            m_data_type = Npy_Double;
            ArrayDouble * double_array = new ArrayDouble();
            double_array->init(size);
            m_ptr = (void *)double_array;
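        }
        else
        {
            // Reject unknown dtype strings instead of leaving m_ptr uninitialized.
            throw std::invalid_argument("unsupported dtype");
        }
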
        m_size = size;
    }

    ValueType get_type()
    {
        return m_data_type;
    }

    void * get_ptr()
    {
        return m_ptr;
    }

    void * m_ptr;
    size_t m_size;
    ValueType m_data_type;
};

// Define the Pybind11 caster
namespace pybind11
{
namespace detail
{

template <>
struct type_caster<ArrayInt>
{
public:
    PYBIND11_TYPE_CASTER(ArrayInt, _("ArrayInt"));

    // Conversion from Python object to C++
    bool load(py::handle src, bool convert)
    {
        // Check if the source object is a valid ArrayPlex
        if (!py::isinstance<ArrayPlex>(src))
        {
            return false;
        }

        // Get the ArrayPlex object from the source handle
        ArrayPlex arrayPlex = src.cast<ArrayPlex>();

        // Check if the data type is "int"
        if (arrayPlex.m_data_type != ArrayPlex::ValueType::Npy_Int)
        {
            return false;
        }

        // Share ptr_ and m_size of the stored ArrayInt with the caster's value (no copy)
        ArrayInt * arr = (ArrayInt *)(arrayPlex.m_ptr);
        value.m_size = arr->m_size;
        value.ptr_ = arr->ptr_;
        return true;
    }

    // Conversion from C++ to Python object
    static py::handle cast(const ArrayInt & src, py::return_value_policy, py::handle)
    {
        // Create a new ArrayPlex with the same size and data type "int"
        ArrayPlex arrayPlex(src.m_size, "int");

        // Copy the elements from src.ptr_ into the new array held by arrayPlex
        ArrayInt * arr = (ArrayInt *)(arrayPlex.m_ptr);

        std::memcpy(arr->ptr_, src.ptr_, src.m_size * sizeof(int));

        // Return the Python object representing the converted ArrayPlex
        return py::cast(arrayPlex, py::return_value_policy::move);
    }
};

template <>
struct type_caster<ArrayDouble>
{
public:
    PYBIND11_TYPE_CASTER(ArrayDouble, _("ArrayDouble"));

    // Conversion from Python object to C++
    bool load(py::handle src, bool convert)
    {
        // Check if the source object is a valid ArrayPlex
        if (!py::isinstance<ArrayPlex>(src))
        {
            return false;
        }

        // Get the ArrayPlex object from the source handle
        ArrayPlex arrayPlex = src.cast<ArrayPlex>();

        // Check if the data type is "double"
        if (arrayPlex.m_data_type != ArrayPlex::ValueType::Npy_Double)
        {
            return false;
        }

        // Share ptr_ and m_size of the stored ArrayDouble with the caster's value (no copy)
        ArrayDouble * arr = (ArrayDouble *)(arrayPlex.m_ptr);
        value.m_size = arr->m_size;
        value.ptr_ = arr->ptr_;
        return true;
    }

    // Conversion from C++ to Python object
    static py::handle cast(const ArrayDouble & src, py::return_value_policy, py::handle)
    {
        // Create a new ArrayPlex with the same size and data type "double"
        ArrayPlex arrayPlex(src.m_size, "double");

        // Copy the elements from src.ptr_ into the new array held by arrayPlex
        ArrayDouble * arr = (ArrayDouble *)(arrayPlex.m_ptr);

        std::memcpy(arr->ptr_, src.ptr_, src.m_size * sizeof(double));

        // Return the Python object representing the converted ArrayPlex
        return py::cast(arrayPlex, py::return_value_policy::move);
    }
};

} // namespace detail
} // namespace pybind11

PYBIND11_MODULE(example, m)
{

    py::class_<ArrayPlex>(m, "Array")
        .def(py::init([](size_t size, const char * dtype)
                      { return ArrayPlex(size, dtype); }),
             py::arg("size"),
             py::arg("dtype"))
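        .def("at", [](ArrayPlex & self, size_t idx) -> std::variant<int, double>
             {
                 // Dispatch on the runtime type tag to the typed array.
                 if (self.get_type() == ArrayPlex::ValueType::Npy_Int)
                 {
                     auto * arr = (ArrayInt *)self.get_ptr();
                     return arr->at(idx);
                 }
                 if (self.get_type() == ArrayPlex::ValueType::Npy_Double)
                 {
                     auto * arr = (ArrayDouble *)self.get_ptr();
                     return arr->at(idx);
                 }
                 // Every path must return or throw; falling off the end is undefined behavior.
                 throw std::runtime_error("unsupported dtype");
             });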

    py::class_<ArrayInt>(m, "ArrayInt");
    py::class_<ArrayDouble>(m, "ArrayDouble");

    m.def("get_element", [](ArrayInt & arr, size_t i)
          { return arr.at(i); });
    m.def("get_element", [](ArrayDouble & arr, size_t i)
          { return arr.at(i); });

    // Register the type caster
    py::implicitly_convertible<ArrayPlex, ArrayInt>();
    py::implicitly_convertible<ArrayPlex, ArrayDouble>();
}

from example import *

m1 = Array(10, "int")
print(m1.at(5)) # 3


m2 = Array(10, "double")
print(m2.at(6)) # 3.33

assert type(m1) is type(m2) # pass


print(get_element(m1, 3)) # 3
print(get_element(m2, 4)) # 3.33
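
The dispatch that the sample relies on can be modelled in pure Python, as a rough sketch of the control flow only. Everything here is hypothetical and simplified: load_as stands in for a caster's load() accepting or refusing an object based on its type tag, and OVERLOADS stands in for the two registered get_element overloads, tried in registration order.

```python
from array import array


class ArrayPlex:
    """One Python-visible class; the element type is a runtime tag."""

    TYPECODES = {"int": "i", "double": "d"}

    def __init__(self, size, dtype):
        if dtype not in self.TYPECODES:
            raise ValueError(f"unsupported dtype: {dtype}")
        self.dtype = dtype
        self.data = array(self.TYPECODES[dtype], [0] * size)


def load_as(plex, dtype):
    """Mimic a caster's load(): return the typed payload, or None to refuse."""
    return plex.data if plex.dtype == dtype else None


def get_element_int(arr, i):
    return arr[i]


def get_element_double(arr, i):
    return arr[i]


# Candidate overloads, tried in registration order.
OVERLOADS = [("int", get_element_int), ("double", get_element_double)]


def get_element(plex, i):
    for dtype, func in OVERLOADS:
        typed = load_as(plex, dtype)
        if typed is not None:
            return func(typed, i)
    raise TypeError("get_element(): incompatible function arguments")


m1 = ArrayPlex(10, "int")
m2 = ArrayPlex(10, "double")
m1.data[3] = 3
m2.data[4] = 3.33

assert type(m1) is type(m2)
print(get_element(m1, 3))  # 3
print(get_element(m2, 4))  # 3.33
```

The typed functions never see the tag; only the loading step does, which mirrors how the typed C++ functions keep their static signatures while Python sees a single class.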

yungyuc commented on June 30, 2024

Finally... I think I know how to finish this issue, I will start to work on this.
.. snip ...

Yes, I think this is a good way to start!

Once this is working, we will have follow-up work to turn part of the class hierarchy polymorphic. But it needs to preserve the compile-time optimization, so it is not obvious how to do it at this point. We should get the static-dynamic type conversion working before involving polymorphism.

tigercosmos commented on June 30, 2024

Done with #266 #291 #297 #299 #300 #303 #310 #313

@yungyuc Please close this.

yungyuc commented on June 30, 2024

Thanks @tigercosmos. Yes, I think it's pretty much there. Let's close this one and use new issues for enhancements.
