apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing

Home Page: https://arrow.apache.org/

License: Apache License 2.0

Makefile 0.06% C++ 53.41% C 2.27% Shell 0.85% Ruby 3.34% Batchfile 0.06% CMake 1.43% Python 6.17% Java 15.43% FreeMarker 0.01% JavaScript 0.26% HTML 0.01% TypeScript 2.06% Lua 0.02% Go 11.01% Awk 0.01% Meson 0.09% Dockerfile 0.27% Thrift 0.07% R 3.17%

arrow's Introduction

Apache Arrow

Powering In-Memory Analytics

Apache Arrow is a development platform for in-memory analytics. It contains a set of technologies that enable big data systems to process and move data fast.

Major components of the project include:

Arrow is an Apache Software Foundation project. Learn more at arrow.apache.org.

What's in the Arrow libraries?

The reference Arrow libraries contain many distinct software components:

  • Columnar vector and table-like containers (similar to data frames) supporting flat or nested types
  • Fast, language-agnostic metadata messaging layer (using Google's FlatBuffers library)
  • Reference-counted off-heap buffer memory management, for zero-copy memory sharing and handling memory-mapped files
  • IO interfaces to local and remote filesystems
  • Self-describing binary wire formats (streaming and batch/file-like) for remote procedure calls (RPC) and interprocess communication (IPC)
  • Integration tests for verifying binary compatibility between the implementations (e.g. sending data from Java to C++)
  • Conversions to and from other in-memory data structures
  • Readers and writers for various widely-used file formats (such as Parquet, CSV)

Implementation status

The official Arrow libraries in this repository are in different stages of implementing the Arrow format and related features. See our current feature matrix on git main.

How to Contribute

Please read our latest project contribution guide.

Getting involved

Even if you do not plan to contribute to Apache Arrow itself or Arrow integrations in other projects, we'd be happy to have you involved:

arrow's People

Contributors

alamb, alenkaf, andygrove, assignuser, bkietz, cyb70289, dependabot[bot], domoritz, emkornfield, fsaintjacques, jonkeane, jorgecarleitao, jorisvandenbossche, kou, kszucs, lidavidm, liyafan82, maplefu, nealrichardson, nevi-me, paleolimbot, pcmoritz, pitrou, raulcd, thisisnic, tianchen92, wesm, westonpace, xhochy, zeroshade

arrow's Issues

[C++][Parquet] minor compilation issue

I found a few very minor issues when compiling the reader in my environment, caused by namespace clashes.
For example, shared_ptr and unordered_map also exist in the C++11 std namespace, and some compilers reject the resulting ambiguity.
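
The clash described here can be reproduced in a small sketch. The `fake_boost` namespace is a hypothetical stand-in for Boost so that the example compiles on its own, and `Decoder` and `MakeDecoder` are illustrative names, not the actual parquet-cpp code. Qualifying the name is one possible fix; the fork linked in this issue may resolve it differently.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for Boost, so the sketch needs no external library.
namespace fake_boost {
template <typename T>
class shared_ptr {};
}  // namespace fake_boost

// Illustrative type; not the real parquet-cpp Decoder.
struct Decoder {
  std::string name;
};

// With both using-directives active, an unqualified `shared_ptr` names two
// different templates and the compiler cannot choose between them.
using namespace std;
using namespace fake_boost;

std::shared_ptr<Decoder> MakeDecoder(const std::string& name) {
  // shared_ptr<Decoder> d;  // error: reference to 'shared_ptr' is ambiguous
  // Qualifying the name removes the ambiguity.
  std::shared_ptr<Decoder> d = std::make_shared<Decoder>();
  d->name = name;
  return d;
}
```

Calling `MakeDecoder("plain")` returns a `std::shared_ptr<Decoder>` whose `name` is `"plain"`; uncommenting the unqualified declaration brings back the ambiguity error.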

I also found that, with the test files I am reading, the reader dereferences a null pointer: if the field is required, definition_level_decoder_ is null.

I've made a fork with the change: https://github.com/ffabbri4/incubator-parquet-cpp/tree/candidate

Reporter: Fabrizio Fabbri / @ffabbri4
Assignee: Fabrizio Fabbri / @ffabbri4

Note: This issue was originally created as PARQUET-232. Please see the migration documentation for further details.

[C++][Parquet] Error handling: C++ exceptions or Status

This library currently throws C++ exceptions. I would very much prefer to use Google's convention of using Status objects to communicate errors and force explicit action to be taken on the part of the developer if an error occurs in a particular function call. It will also make it much easier to incorporate libparquet into other libraries that do not use C++ exceptions, and also to provide an ANSI C API wrapper.
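
A minimal sketch of the Status convention proposed above. The `Status` class and the `ParseBitWidth` function are hypothetical names chosen for illustration; they are not the parquet-cpp API.

```cpp
#include <string>
#include <utility>

// Hypothetical minimal Status type (an assumption, not the library's class).
class Status {
 public:
  static Status OK() { return Status(true, ""); }
  static Status Invalid(std::string msg) { return Status(false, std::move(msg)); }

  bool ok() const { return ok_; }
  const std::string& message() const { return message_; }

 private:
  Status(bool ok, std::string msg) : ok_(ok), message_(std::move(msg)) {}
  bool ok_;
  std::string message_;
};

// Instead of throwing, the function reports failure through its return value
// and writes the result through an out-parameter only on success.
Status ParseBitWidth(int value, int* out) {
  if (value < 0 || value > 32) {
    return Status::Invalid("bit width out of range");
  }
  *out = value;
  return Status::OK();
}
```

A caller checks `ok()` after every call, which makes the error path explicit and keeps the function usable from code compiled without exception support or from a C wrapper.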

Reporter: Wes McKinney / @wesm

Note: This issue was originally created as PARQUET-440. Please see the migration documentation for further details.

[C++][Parquet] C++11, cpplint cleanup, package target and header installation

I'm planning to work on building out parquet-cpp with columnar data structures (see Arrow proposal) for materialized in-memory data and feature-complete readers/writers so that native-code consumers like Python can finally read and write Parquet files at native speeds. It would be great to have all this officially a part of Apache Parquet.

This adds minimal support for installing the resulting libparquet.so and its header files, enough to enable development of downstream C++ and Python projects that depend on it. It also builds in C++11 mode and passes Google's cpplint.

Reporter: Wes McKinney / @wesm
Assignee: Wes McKinney / @wesm

Related issues:

Externally tracked issue: apache/parquet-cpp#14

Note: This issue was originally created as PARQUET-416. Please see the migration documentation for further details.

[C++][Parquet] Add a RowGroup reader interface class

Currently the logic for interacting with row group metadata and constructing column decoders is embedded in the parquet_reader.cc executable here:

https://github.com/apache/parquet-cpp/blob/master/example/parquet_reader.cc

With PARQUET-434, we have a file reader container, which can then provide a row group reader container, something like

RowGroupReader* group_reader = file_reader->row_group(i);
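
One way the container relationship could look, as a sketch only. `RowGroupMetaData`, `FileReader`, and every member shown are hypothetical; the issue specifies only that a file reader hands out a row group reader through `row_group(i)`.

```cpp
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical per-row-group metadata.
struct RowGroupMetaData {
  int64_t num_rows;
  int num_columns;
};

// Hypothetical row group reader: a thin view over metadata owned by the file reader.
class RowGroupReader {
 public:
  explicit RowGroupReader(const RowGroupMetaData* metadata) : metadata_(metadata) {}
  int64_t num_rows() const { return metadata_->num_rows; }
  int num_columns() const { return metadata_->num_columns; }

 private:
  const RowGroupMetaData* metadata_;  // not owned
};

// Hypothetical file reader that owns the metadata and the row group readers.
class FileReader {
 public:
  explicit FileReader(std::vector<RowGroupMetaData> groups) : groups_(std::move(groups)) {
    for (const RowGroupMetaData& g : groups_) {
      readers_.emplace_back(new RowGroupReader(&g));
    }
  }
  int num_row_groups() const { return static_cast<int>(groups_.size()); }
  // The returned pointer stays valid for the lifetime of the FileReader.
  RowGroupReader* row_group(int i) { return readers_[i].get(); }

 private:
  std::vector<RowGroupMetaData> groups_;
  std::vector<std::unique_ptr<RowGroupReader>> readers_;
};
```

With this shape, the executable only iterates over `num_row_groups()` and asks each `RowGroupReader` for what it needs, so the metadata handling no longer lives in parquet_reader.cc.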

Reporter: Wes McKinney / @wesm
Assignee: Wes McKinney / @wesm

Note: This issue was originally created as PARQUET-451. Please see the migration documentation for further details.

[C++][Parquet] Hide thrift dependency in parquet-cpp

Pulling in Thrift-compiled headers tends to pull in a lot of other things. It would be nice not to expose them in the parquet library (the application should be able to use a different version of Thrift, etc.).

We can also see if it is practical to not depend on Thrift at all and replicate the logic we need. Thrift is fairly stable at this point, so this might be feasible. This would allow us to do things like not rely on Boost.
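
One common way to keep generated headers out of a public interface is the pimpl (opaque pointer) idiom, sketched below in a single file. This is an assumption about how the hiding could be done, not the approach the project settled on; `FileMetaData` and `ThriftFileMetaData` are hypothetical names, and the latter stands in for a Thrift-generated struct.

```cpp
#include <cstdint>
#include <memory>

// ---- Public header part: no Thrift headers are included here ----
class FileMetaData {
 public:
  FileMetaData(int32_t version, int64_t num_rows);
  ~FileMetaData();
  int32_t version() const;
  int64_t num_rows() const;

 private:
  struct Impl;                  // defined only in the implementation file
  std::unique_ptr<Impl> impl_;  // opaque pointer to the hidden representation
};

// ---- Implementation part (.cc): generated headers would be included here ----
// Hypothetical stand-in for a Thrift-generated struct.
struct ThriftFileMetaData {
  int32_t version;
  int64_t num_rows;
};

struct FileMetaData::Impl {
  ThriftFileMetaData thrift;
};

FileMetaData::FileMetaData(int32_t version, int64_t num_rows)
    : impl_(new Impl{{version, num_rows}}) {}
// Defined here, where Impl is complete, so unique_ptr can delete it.
FileMetaData::~FileMetaData() = default;
int32_t FileMetaData::version() const { return impl_->thrift.version; }
int64_t FileMetaData::num_rows() const { return impl_->thrift.num_rows; }
```

Applications that include only the public part never see the generated types, so they are free to link against a different Thrift version.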

Reporter: Nong Li / @nongli
Assignee: Wes McKinney / @wesm

Note: This issue was originally created as PARQUET-446. Please see the migration documentation for further details.

[C++][Parquet] Metadata generation: Nested physical schema builder

The idea here is to define a simple API for creating logical schemas, which will then be automatically flattened in DFS order to a vector of SchemaElement structs. This will spare users from having to implement their own flattening / unflattening code.
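
The flattening step can be sketched as a pre-order depth-first traversal. The `Node` type and this reduced `SchemaElement` (name and child count only) are simplified, hypothetical stand-ins; the real Thrift struct carries more fields. The recursive `std::vector<Node>` member relies on `std::vector` accepting an incomplete element type, which is standard since C++17.

```cpp
#include <string>
#include <vector>

// Hypothetical logical schema node; a leaf has no children.
struct Node {
  std::string name;
  std::vector<Node> children;
};

// Simplified stand-in for a flattened schema element.
struct SchemaElement {
  std::string name;
  int num_children;
};

// Pre-order DFS: emit the node itself, then each child subtree in order.
void Flatten(const Node& node, std::vector<SchemaElement>* out) {
  out->push_back(SchemaElement{node.name, static_cast<int>(node.children.size())});
  for (const Node& child : node.children) {
    Flatten(child, out);
  }
}
```

Because every element records its child count, a reader can rebuild the tree from the flat vector by consuming `num_children` subtrees after each element.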

Reporter: Wes McKinney / @wesm
Assignee: Wes McKinney / @wesm

Note: This issue was originally created as PARQUET-444. Please see the migration documentation for further details.

[C++][Parquet] Implement and test BIT_PACKED level encoding / decoding

While RLE is the preferred encoding format (and BIT_PACKED is deprecated in Parquet 2.0), we will need to support this encoding format for legacy Parquet files that use it. As part of this JIRA we will verify round-tripping levels to this encoding format.
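
A round-trip check of this kind could look like the sketch below. `PackLevels` and `UnpackLevels` are hypothetical helpers, and the most-significant-bit-first order is an assumption based on how the Parquet format documentation describes the deprecated BIT_PACKED encoding; it should be verified against the specification before being relied on.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack each level into `bit_width` bits, most significant bit first.
std::vector<uint8_t> PackLevels(const std::vector<int>& levels, int bit_width) {
  std::vector<uint8_t> out((levels.size() * bit_width + 7) / 8, 0);
  size_t bit_pos = 0;  // next output bit, counted from the MSB of byte 0
  for (int level : levels) {
    for (int b = bit_width - 1; b >= 0; --b, ++bit_pos) {
      if ((level >> b) & 1) {
        out[bit_pos / 8] |= static_cast<uint8_t>(0x80u >> (bit_pos % 8));
      }
    }
  }
  return out;
}

// Inverse of PackLevels: read `count` values of `bit_width` bits each.
std::vector<int> UnpackLevels(const std::vector<uint8_t>& data, size_t count,
                              int bit_width) {
  std::vector<int> levels(count, 0);
  size_t bit_pos = 0;
  for (size_t i = 0; i < count; ++i) {
    for (int b = 0; b < bit_width; ++b, ++bit_pos) {
      int bit = (data[bit_pos / 8] >> (7 - bit_pos % 8)) & 1;
      levels[i] = (levels[i] << 1) | bit;
    }
  }
  return levels;
}
```

Packing the levels 0 through 7 with a bit width of 3 produces the three bytes 0x05, 0x39, 0x77, and unpacking those bytes returns the original levels.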

See also PARQUET-462

Reporter: Wes McKinney / @wesm
Assignee: Deepak Majeti / @majetideepak

Note: This issue was originally created as PARQUET-467. Please see the migration documentation for further details.

[C++][Parquet] Support Travis CI in parquet-cpp

Having a continuous build env helps ensure that pull requests compile and pass tests. It provides valuable feedback for ensuring various environments support desired changes.

Pull request that gets Travis CI - GitHub integration up and running for parquet-cpp:
apache/parquet-cpp#9

Reporter: Kalon Mills / @kalaxy
Assignee: Kalon Mills / @kalaxy

Note: This issue was originally created as PARQUET-259. Please see the migration documentation for further details.

[C++][Parquet] Unable to Install C++ Driver - reference to 'shared_ptr' is ambiguous

The install commands worked up until the make command.

Aarons-MBP:parquet-cpp Aaron$ make
Scanning dependencies of target ThriftParquet
[ 12%] Building CXX object generated/gen-cpp/CMakeFiles/ThriftParquet.dir/parquet_constants.cpp.o
[ 25%] Building CXX object generated/gen-cpp/CMakeFiles/ThriftParquet.dir/parquet_types.cpp.o
Linking CXX static library ../../build/libThriftParquet.a
[ 25%] Built target ThriftParquet
Scanning dependencies of target Parquet
[ 37%] Building CXX object src/CMakeFiles/Parquet.dir/parquet.cc.o
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:79:5: warning: variable 'value_byte_size'
is used uninitialized whenever switch default is taken [-Wsometimes-uninitialized]
default:
^~~~~~~
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:94:46: note: uninitialized use occurs here
values_buffer_.resize(config_.batch_size * value_byte_size);
^~~~~~~~~~~~~~~
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:59:22: note: initialize the variable
'value_byte_size' to silence this warning
int value_byte_size;
^
= 0
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:167:37: error: reference to 'shared_ptr' is
ambiguous
unordered_map<Encoding::type, shared_ptr >::iterator it =
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:717:36: note: candidate found by name
lookup is 'boost::shared_ptr'
template friend class shared_ptr;
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/memory:3750:29: note:
candidate found by name lookup is 'std::1::shared_ptr'
class LIBCPP_TYPE_VIS_ONLY shared_ptr
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:167:48: error: 'Decoder' does not refer to
a value
unordered_map<Encoding::type, shared_ptr >::iterator it =
^
/Users/Aaron/myProgs/parquet-cpp/src/encodings/encodings.h:27:7: note: declared here
class Decoder {
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:169:11: error: use of undeclared identifier
'it'
if (it != decoders
.end()) {
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:176:7: error: reference to 'shared_ptr' is
ambiguous
shared_ptr decoder(new DictionaryDecoder(schema
->type, &dictionary));
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:717:36: note: candidate found by name
lookup is 'boost::shared_ptr'
template friend class shared_ptr;
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/memory:3750:29: note:
candidate found by name lookup is 'std::1::shared_ptr'
class LIBCPP_TYPE_VIS_ONLY shared_ptr
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:177:45: error: use of undeclared identifier
'decoder'; did you mean 'decoders
'?
decoders
[Encoding::RLE_DICTIONARY] = decoder;
^~~~~~~
decoders

/Users/Aaron/myProgs/parquet-cpp/src/parquet/parquet.h:152:78: note: 'decoders' declared
here
boost::unordered_map<parquet::Encoding::type, boost::shared_ptr > decoders_;
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:177:43: error: no viable overloaded '='
decoders_[Encoding::RLE_DICTIONARY] = decoder;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:500:18: note: candidate function not
viable: no known conversion from 'boost::unordered_map<parquet::Encoding::type,
boost::shared_ptr >' to 'const boost::shared_ptr<parquet_cpp::Decoder>' for
1st argument
shared_ptr & operator=( shared_ptr const & r ) BOOST_NOEXCEPT
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:509:18: note: candidate template ignored:
could not match 'shared_ptr' against 'unordered_map'
shared_ptr & operator=(shared_ptr const & r) BOOST_NOEXCEPT
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:520:18: note: candidate template ignored:
could not match 'auto_ptr' against 'unordered_map'
shared_ptr & operator=( std::auto_ptr & r )
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:538:77: note: candidate template ignored:
substitution failure [with Ap =
boost::unordered::unordered_map<parquet::Encoding::type,
boost::shared_ptr<parquet_cpp::Decoder>, boost::hashparquet::Encoding::type,
std::__1::equal_toparquet::Encoding::type, std::__1::allocator<std::__1::pair<const
parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder> > > >]: no type
named 'type' in
'boost::detail::sp_enable_if_auto_ptr<boost::unordered::unordered_map<parquet::Encoding::type,
boost::shared_ptr<parquet_cpp::Decoder>, boost::hashparquet::Encoding::type,
std::__1::equal_toparquet::Encoding::type, std::__1::allocator<std::__1::pair<const
parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder> > > >,
boost::shared_ptr<parquet_cpp::Decoder> &>'
typename boost::detail::sp_enable_if_auto_ptr< Ap, shared_ptr & >::type operato...
~~~~ ^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:202:37: error: reference to 'shared_ptr' is
ambiguous
unordered_map<Encoding::type, shared_ptr >::iterator it =
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:717:36: note: candidate found by name
lookup is 'boost::shared_ptr'
template friend class shared_ptr;
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/memory:3750:29: note:
candidate found by name lookup is 'std::1::shared_ptr'
class LIBCPP_TYPE_VIS_ONLY shared_ptr
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:202:48: error: 'Decoder' does not refer to
a value
unordered_map<Encoding::type, shared_ptr >::iterator it =
^
/Users/Aaron/myProgs/parquet-cpp/src/encodings/encodings.h:27:7: note: declared here
class Decoder {
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:204:11: error: use of undeclared identifier
'it'
if (it != decoders
.end()) {
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:205:28: error: use of undeclared identifier
'it'
current_decoder
= it->second.get();
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:209:13: error: reference to 'shared_ptr' is
ambiguous
shared_ptr decoder;
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:717:36: note: candidate found by name
lookup is 'boost::shared_ptr'
template friend class shared_ptr;
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/memory:3750:29: note:
candidate found by name lookup is 'std::1::shared_ptr'
class LIBCPP_TYPE_VIS_ONLY shared_ptr
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:211:15: error: use of undeclared identifier
'decoder'; did you mean 'decoders
'?
decoder.reset(new BoolDecoder());
^~~~~~~
decoders

/Users/Aaron/myProgs/parquet-cpp/src/parquet/parquet.h:152:78: note: 'decoders
' declared
here
boost::unordered_map<parquet::Encoding::type, boost::shared_ptr > decoders;
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:211:23: error: no member named 'reset' in
'boost::unordered::unordered_map<parquet::Encoding::type,
boost::shared_ptr<parquet_cpp::Decoder>, boost::hashparquet::Encoding::type,
std::1::equal_toparquet::Encoding::type, std::1::allocator<std::1::pair<const
parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder> > > >'
decoder.reset(new BoolDecoder());
~~~~~~~ ^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:213:15: error: use of undeclared identifier
'decoder'; did you mean 'decoders
'?
decoder.reset(new PlainDecoder(schema
->type));
^~~~~~~
decoders

/Users/Aaron/myProgs/parquet-cpp/src/parquet/parquet.h:152:78: note: 'decoders
' declared
here
boost::unordered_map<parquet::Encoding::type, boost::shared_ptr > decoders;
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:213:23: error: no member named 'reset' in
'boost::unordered::unordered_map<parquet::Encoding::type,
boost::shared_ptr<parquet_cpp::Decoder>, boost::hashparquet::Encoding::type,
std::1::equal_toparquet::Encoding::type, std::1::allocator<std::1::pair<const
parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder> > > >'
decoder.reset(new PlainDecoder(schema
->type));
~~~~~~~ ^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:215:35: error: use of undeclared identifier
'decoder'; did you mean 'decoders
'?
decoders
[encoding] = decoder;
^~~~~~~
decoders

/Users/Aaron/myProgs/parquet-cpp/src/parquet/parquet.h:152:78: note: 'decoders
' declared
here
boost::unordered_map<parquet::Encoding::type, boost::shared_ptr > decoders;
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:215:33: error: no viable overloaded '='
decoders[encoding] = decoder;
~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:500:18: note: candidate function not
viable: no known conversion from 'boost::unordered_map<parquet::Encoding::type,
boost::shared_ptr >' to 'const boost::shared_ptr<parquet_cpp::Decoder>' for
1st argument
shared_ptr & operator=( shared_ptr const & r ) BOOST_NOEXCEPT
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:509:18: note: candidate template ignored:
could not match 'shared_ptr' against 'unordered_map'
shared_ptr & operator=(shared_ptr const & r) BOOST_NOEXCEPT
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:520:18: note: candidate template ignored:
could not match 'auto_ptr' against 'unordered_map'
shared_ptr & operator=( std::auto_ptr & r )
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:538:77: note: candidate template ignored:
substitution failure [with Ap =
boost::unordered::unordered_map<parquet::Encoding::type,
boost::shared_ptr<parquet_cpp::Decoder>, boost::hashparquet::Encoding::type,
std::__1::equal_toparquet::Encoding::type, std::__1::allocator<std::__1::pair<const
parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder> > > >]: no type
named 'type' in
'boost::detail::sp_enable_if_auto_ptr<boost::unordered::unordered_map<parquet::Encoding::type,
boost::shared_ptr<parquet_cpp::Decoder>, boost::hashparquet::Encoding::type,
std::_1::equal_toparquet::Encoding::type, std::1::allocator<std::1::pair<const
parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder> > > >,
boost::shared_ptr<parquet_cpp::Decoder> &>'
typename boost::detail::sp_enable_if_auto_ptr< Ap, shared_ptr & >::type operato...
~~~~ ^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:216:32: error: use of undeclared identifier
'decoder'; did you mean 'decoders
'?
current_decoder
= decoder.get();
^~~~~~~
decoders

/Users/Aaron/myProgs/parquet-cpp/src/parquet/parquet.h:152:78: note: 'decoders
' declared
here
boost::unordered_map<parquet::Encoding::type, boost::shared_ptr > decoders;
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:216:40: error: no member named 'get' in
'boost::unordered::unordered_map<parquet::Encoding::type,
boost::shared_ptr<parquet_cpp::Decoder>, boost::hashparquet::Encoding::type,
std::_1::equal_toparquet::Encoding::type, std::1::allocator<std::1::pair<const
parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder> > > >'
current_decoder
= decoder.get();
~~~~~~~ ^
1 warning and 19 errors generated.
make[2]: *** [src/CMakeFiles/Parquet.dir/parquet.cc.o] Error 1
make[1]: *** [src/CMakeFiles/Parquet.dir/all] Error 2
make: *** [all] Error 2
Aarons-MBP:parquet-cpp Aaron$
Aarons-MBP:parquet-cpp Aaron$ git pull
Already up-to-date.
Aarons-MBP:parquet-cpp Aaron$ make
[ 25%] Built target ThriftParquet
[ 37%] Building CXX object src/CMakeFiles/Parquet.dir/parquet.cc.o
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:79:5: warning: variable 'value_byte_size' is used uninitialized whenever switch default is
taken [-Wsometimes-uninitialized]
default:
^~~~~~~
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:94:46: note: uninitialized use occurs here
values_buffer
.resize(config
.batch_size * value_byte_size);
^~~~~~~~~~~~~~~
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:59:22: note: initialize the variable 'value_byte_size' to silence this warning
int value_byte_size;
^
= 0
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:167:37: error: reference to 'shared_ptr' is ambiguous
unordered_map<Encoding::type, shared_ptr >::iterator it =
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:717:36: note: candidate found by name lookup is 'boost::shared_ptr'
template friend class shared_ptr;
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/memory:3750:29: note: candidate
found by name lookup is 'std::1::shared_ptr'
class LIBCPP_TYPE_VIS_ONLY shared_ptr
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:167:48: error: 'Decoder' does not refer to a value
unordered_map<Encoding::type, shared_ptr >::iterator it =
^
/Users/Aaron/myProgs/parquet-cpp/src/encodings/encodings.h:27:7: note: declared here
class Decoder {
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:169:11: error: use of undeclared identifier 'it'
if (it != decoders
.end()) {
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:176:7: error: reference to 'shared_ptr' is ambiguous
shared_ptr decoder(new DictionaryDecoder(schema
->type, &dictionary));
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:717:36: note: candidate found by name lookup is 'boost::shared_ptr'
template friend class shared_ptr;
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/memory:3750:29: note: candidate
found by name lookup is 'std::1::shared_ptr'
class LIBCPP_TYPE_VIS_ONLY shared_ptr
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:177:45: error: use of undeclared identifier 'decoder'; did you mean 'decoders
'?
decoders
[Encoding::RLE_DICTIONARY] = decoder;
^~~~~~~
decoders

/Users/Aaron/myProgs/parquet-cpp/src/parquet/parquet.h:152:78: note: 'decoders' declared here
boost::unordered_map<parquet::Encoding::type, boost::shared_ptr > decoders;
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:177:43: error: no viable overloaded '='
decoders[Encoding::RLE_DICTIONARY] = decoder;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:500:18: note: candidate function not viable: no known conversion from
'boost::unordered_map<parquet::Encoding::type, boost::shared_ptr >' to 'const boost::shared_ptr<parquet_cpp::Decoder>' for 1st
argument
shared_ptr & operator=( shared_ptr const & r ) BOOST_NOEXCEPT
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:509:18: note: candidate template ignored: could not match 'shared_ptr' against
'unordered_map'
shared_ptr & operator=(shared_ptr const & r) BOOST_NOEXCEPT
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:520:18: note: candidate template ignored: could not match 'auto_ptr' against
'unordered_map'
shared_ptr & operator=( std::auto_ptr & r )
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:538:77: note: candidate template ignored: substitution failure [with Ap =
boost::unordered::unordered_map<parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder>, boost::hashparquet::Encoding::type,
std::__1::equal_toparquet::Encoding::type, std::__1::allocator<std::__1::pair<const parquet::Encoding::type,
boost::shared_ptr<parquet_cpp::Decoder> > > >]: no type named 'type' in
'boost::detail::sp_enable_if_auto_ptr<boost::unordered::unordered_map<parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder>,
boost::hashparquet::Encoding::type, std::__1::equal_toparquet::Encoding::type, std::__1::allocator<std::__1::pair<const
parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder> > > >, boost::shared_ptr<parquet_cpp::Decoder> &>'
typename boost::detail::sp_enable_if_auto_ptr< Ap, shared_ptr & >::type operator=( Ap r )
~~~~ ^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:202:37: error: reference to 'shared_ptr' is ambiguous
unordered_map<Encoding::type, shared_ptr >::iterator it =
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:717:36: note: candidate found by name lookup is 'boost::shared_ptr'
template friend class shared_ptr;
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/memory:3750:29: note: candidate
found by name lookup is 'std::1::shared_ptr'
class LIBCPP_TYPE_VIS_ONLY shared_ptr
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:202:48: error: 'Decoder' does not refer to a value
unordered_map<Encoding::type, shared_ptr >::iterator it =
^
/Users/Aaron/myProgs/parquet-cpp/src/encodings/encodings.h:27:7: note: declared here
class Decoder {
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:204:11: error: use of undeclared identifier 'it'
if (it != decoders
.end()) {
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:205:28: error: use of undeclared identifier 'it'
current_decoder
= it->second.get();
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:209:13: error: reference to 'shared_ptr' is ambiguous
shared_ptr decoder;
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:717:36: note: candidate found by name lookup is 'boost::shared_ptr'
template friend class shared_ptr;
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/memory:3750:29: note: candidate
found by name lookup is 'std::1::shared_ptr'
class LIBCPP_TYPE_VIS_ONLY shared_ptr
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:211:15: error: use of undeclared identifier 'decoder'; did you mean 'decoders
'?
decoder.reset(new BoolDecoder());
^~~~~~~
decoders

/Users/Aaron/myProgs/parquet-cpp/src/parquet/parquet.h:152:78: note: 'decoders
' declared here
boost::unordered_map<parquet::Encoding::type, boost::shared_ptr > decoders;
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:211:23: error: no member named 'reset' in
'boost::unordered::unordered_map<parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder>,
boost::hashparquet::Encoding::type, std::1::equal_toparquet::Encoding::type, std::1::allocator<std::1::pair<const
parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder> > > >'
decoder.reset(new BoolDecoder());
~~~~~~~ ^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:213:15: error: use of undeclared identifier 'decoder'; did you mean 'decoders
'?
decoder.reset(new PlainDecoder(schema
->type));
^~~~~~~
decoders

/Users/Aaron/myProgs/parquet-cpp/src/parquet/parquet.h:152:78: note: 'decoders
' declared here
boost::unordered_map<parquet::Encoding::type, boost::shared_ptr > decoders;
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:213:23: error: no member named 'reset' in
'boost::unordered::unordered_map<parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder>,
boost::hashparquet::Encoding::type, std::1::equal_toparquet::Encoding::type, std::1::allocator<std::1::pair<const
parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder> > > >'
decoder.reset(new PlainDecoder(schema
->type));
~~~~~~~ ^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:215:35: error: use of undeclared identifier 'decoder'; did you mean 'decoders
'?
decoders
[encoding] = decoder;
^~~~~~~
decoders

/Users/Aaron/myProgs/parquet-cpp/src/parquet/parquet.h:152:78: note: 'decoders
' declared here
boost::unordered_map<parquet::Encoding::type, boost::shared_ptr > decoders;
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:215:33: error: no viable overloaded '='
decoders[encoding] = decoder;
~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:500:18: note: candidate function not viable: no known conversion from
'boost::unordered_map<parquet::Encoding::type, boost::shared_ptr >' to 'const boost::shared_ptr<parquet_cpp::Decoder>' for 1st
argument
shared_ptr & operator=( shared_ptr const & r ) BOOST_NOEXCEPT
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:509:18: note: candidate template ignored: could not match 'shared_ptr' against
'unordered_map'
shared_ptr & operator=(shared_ptr const & r) BOOST_NOEXCEPT
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:520:18: note: candidate template ignored: could not match 'auto_ptr' against
'unordered_map'
shared_ptr & operator=( std::auto_ptr & r )
^
/usr/local/include/boost/smart_ptr/shared_ptr.hpp:538:77: note: candidate template ignored: substitution failure [with Ap =
boost::unordered::unordered_map<parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder>, boost::hashparquet::Encoding::type,
std::__1::equal_toparquet::Encoding::type, std::__1::allocator<std::__1::pair<const parquet::Encoding::type,
boost::shared_ptr<parquet_cpp::Decoder> > > >]: no type named 'type' in
'boost::detail::sp_enable_if_auto_ptr<boost::unordered::unordered_map<parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder>,
boost::hashparquet::Encoding::type, std::_1::equal_toparquet::Encoding::type, std::1::allocator<std::1::pair<const
parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder> > > >, boost::shared_ptr<parquet_cpp::Decoder> &>'
typename boost::detail::sp_enable_if_auto_ptr< Ap, shared_ptr & >::type operator=( Ap r )
~~~~ ^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:216:32: error: use of undeclared identifier 'decoder'; did you mean 'decoders
'?
current_decoder
= decoder.get();
^~~~~~~
decoders

/Users/Aaron/myProgs/parquet-cpp/src/parquet/parquet.h:152:78: note: 'decoders
' declared here
boost::unordered_map<parquet::Encoding::type, boost::shared_ptr > decoders;
^
/Users/Aaron/myProgs/parquet-cpp/src/parquet.cc:216:40: error: no member named 'get' in
'boost::unordered::unordered_map<parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder>,
boost::hashparquet::Encoding::type, std::__1::equal_toparquet::Encoding::type, std::__1::allocator<std::_1::pair<const
parquet::Encoding::type, boost::shared_ptr<parquet_cpp::Decoder> > > >'
current_decoder
= decoder.get();

Environment: Mac Mavericks
Reporter: Aaron Benz

Note: This issue was originally created as PARQUET-238. Please see the migration documentation for further details.

[C++][Parquet] Thrift 0.9.3 cannot be used in conjunction with googletest and C++11 on Linux

Thrift 0.9.3 introduces an #include <thrift/cxxfunctional.h> directive, which causes tr1/functional to be included. This conflicts at compile time with googletest, which has its own portability macros surrounding its use of std::tr1::tuple. I spent a bunch of time twiddling compiler flags to try to resolve this conflict, but wasn't able to figure it out.

If this is a Thrift bug, we should report it to Thrift. If it's fixable by compiler flags, then we should figure that out and track the issue here; otherwise, users with the latest version of Thrift will be unable to compile the parquet-cpp test suite.

Reporter: Wes McKinney / @wesm
Assignee: Wes McKinney / @wesm

Note: This issue was originally created as PARQUET-470. Please see the migration documentation for further details.

[C++][Parquet] Add a ParquetFileReader class to encapsulate some low-level details of interacting with Parquet files

This is also related to PARQUET-418. I'm beginning work on an adapter between Parquet and in-memory C++ data structures, and it would be helpful for the moment to encapsulate various details like metadata deserialization.

This class can be expanded to include other features (such as yielding column readers) in future patches.
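
As a rough illustration of the kind of low-level detail such a class could hide, the sketch below locates the serialized file metadata inside a Parquet file that is already in memory. It relies only on the documented file layout (the file starts and ends with the 4-byte magic "PAR1", and the trailing magic is preceded by a 4-byte little-endian metadata length). The names FooterLocation and LocateFooter are hypothetical and are not part of parquet-cpp; Thrift deserialization of the metadata itself is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper types; not the parquet-cpp API.
struct FooterLocation {
  std::size_t offset;    // byte offset where the Thrift-encoded metadata starts
  std::uint32_t length;  // metadata size in bytes
};

// A Parquet file ends with: <metadata> <4-byte little-endian length> "PAR1".
// Returns false if the buffer does not look like a complete Parquet file.
bool LocateFooter(const std::vector<std::uint8_t>& file, FooterLocation* out) {
  assert(out != nullptr);
  static const char kMagic[4] = {'P', 'A', 'R', '1'};
  const std::size_t kTrailer = 8;  // length field plus trailing magic

  // Smallest plausible file: leading magic plus trailer.
  if (file.size() < 4 + kTrailer) return false;
  if (std::memcmp(file.data(), kMagic, 4) != 0) return false;
  if (std::memcmp(file.data() + file.size() - 4, kMagic, 4) != 0) return false;

  // Decode the little-endian length byte by byte to stay endian-neutral.
  const std::uint8_t* p = file.data() + file.size() - kTrailer;
  const std::uint32_t length = static_cast<std::uint32_t>(p[0]) |
                               (static_cast<std::uint32_t>(p[1]) << 8) |
                               (static_cast<std::uint32_t>(p[2]) << 16) |
                               (static_cast<std::uint32_t>(p[3]) << 24);

  // The metadata must fit between the leading magic and the trailer.
  if (length > file.size() - 4 - kTrailer) return false;

  out->offset = file.size() - kTrailer - length;
  out->length = length;
  return true;
}
```

A reader class could call a helper like this once when a file is opened and hand the located byte range to the metadata deserializer, so callers never deal with offsets directly.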

I've inspected the patch in apache/parquet-cpp#18 and expect there to be little overlap. @nongli if you can have a look at that and let us know how to proceed, that would be great.

Reporter: Wes McKinney / @wesm
Assignee: Wes McKinney / @wesm

Note: This issue was originally created as PARQUET-434. Please see the migration documentation for further details.

[C++][Parquet] Detach thirdparty code from build configuration.

The existing repo has the source code of its third-party dependencies checked in, and the build system expects those dependencies in a fixed location. This forces the built library to use exactly those dependencies, with no room for customization.

Third-party dependencies are better managed through the build environment. That gives whoever builds the library more flexibility over dependency versions and locations, and it removes the third-party code from the repo.

Reporter: Kalon Mills / @kalaxy
Assignee: Kalon Mills / @kalaxy

Related issues:

Externally tracked issue: apache/parquet-cpp#16

Note: This issue was originally created as PARQUET-267. Please see the migration documentation for further details.

[C++][Parquet] Add a utility to print contents of a Parquet file to stdout

To improve the usability and testability of parquet-cpp, the library needs a utility to print the contents of a Parquet file. incubator-parquet-cpp used to have a parquet_reader utility, but a) it was not ported to the Apache repository, and b) it had memory leaks, mismanaged file handles, and needed a lot of improvement.

Using parquet_reader as a starting point, I will build a utility for printing the contents of a Parquet file.

Reporter: Aliaksei Sandryhaila / @asandryh

Note: This issue was originally created as PARQUET-418. Please see the migration documentation for further details.

[C++][Parquet] Develop external predicate pushdown API for column readers

This will happen significantly downstream of where we are at right now, but we should be planning ahead to facilitate scanning Parquet files with externally-defined predicates as a primary use case.

I suggest that the most general (and high performance) predicate will be batch-oriented; i.e. the predicate will be passed a batch of materialized values from one or more columns, and it returns an array of booleans indicating whether or not the predicate is true. We can also develop a row-by-row "scalar" predicate API if users need that.
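
One possible shape for the batch-oriented predicate described above is sketched below, together with an adapter that lifts a row-by-row "scalar" predicate into the batch form. This is only an illustration under simplifying assumptions: it handles a single int64 column, and every name in it (Int64Batch, SelectionMask, BatchPredicate, FilterBatch, FromScalar) is hypothetical rather than taken from parquet-cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical types for illustration; none of these names come from parquet-cpp.
using Int64Batch = std::vector<std::int64_t>;
// One byte per row: 1 if the row passes the predicate, 0 otherwise.
using SelectionMask = std::vector<std::uint8_t>;
using BatchPredicate = std::function<SelectionMask(const Int64Batch&)>;

// Batch-oriented filtering: evaluate the predicate once per batch,
// then keep the rows whose mask entry is set.
Int64Batch FilterBatch(const Int64Batch& values, const BatchPredicate& pred) {
  const SelectionMask mask = pred(values);
  assert(mask.size() == values.size());  // the predicate must answer for every row
  Int64Batch kept;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (mask[i]) kept.push_back(values[i]);
  }
  return kept;
}

// Adapter that lifts a row-by-row "scalar" predicate into the batch API.
BatchPredicate FromScalar(std::function<bool(std::int64_t)> scalar) {
  return [scalar](const Int64Batch& values) -> SelectionMask {
    SelectionMask mask(values.size());
    for (std::size_t i = 0; i < values.size(); ++i) {
      mask[i] = scalar(values[i]) ? 1 : 0;
    }
    return mask;
  };
}
```

With this shape the scalar API does not need a separate code path in the reader: FromScalar([](std::int64_t v) { return v > 10; }) yields a BatchPredicate that FilterBatch can consume like any other.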

Reporter: Wes McKinney / @wesm

Note: This issue was originally created as PARQUET-473. Please see the migration documentation for further details.

[C++][Parquet] Improve handling of null values

Currently, a NULL value is returned as the default value of its type, which is incorrect because it cannot be told apart from a real value.
This JIRA will identify NULL values correctly by means of an additional variable that is set whenever the value is NULL.
This feature depends on reading the repetition level (PARQUET-169).
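
The idea of an additional variable that flags NULLs could look roughly like the sketch below. It assumes the simplest case, a flat optional int32 column whose levels have already been decoded, where a level of 0 marks a NULL row and only non-null rows have a stored value. The class name OptionalInt32Reader and its interface are hypothetical and are not the parquet-cpp API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical reader for a flat optional int32 column; not the parquet-cpp API.
class OptionalInt32Reader {
 public:
  // levels has one entry per row; values holds only the non-null rows.
  OptionalInt32Reader(std::vector<std::int16_t> levels,
                      std::vector<std::int32_t> values)
      : levels_(std::move(levels)), values_(std::move(values)) {}

  bool HasNext() const { return row_ < levels_.size(); }

  // Returns the next value and sets *is_null. For a NULL row the returned
  // value is a placeholder that the caller must ignore.
  std::int32_t Next(bool* is_null) {
    assert(is_null != nullptr && HasNext());
    // Assumption: the maximum level is 1, so a level of 0 means NULL.
    *is_null = levels_[row_++] < 1;
    if (*is_null) return 0;
    assert(value_idx_ < values_.size());
    return values_[value_idx_++];
  }

 private:
  std::vector<std::int16_t> levels_;
  std::vector<std::int32_t> values_;
  std::size_t row_ = 0;        // next row to read
  std::size_t value_idx_ = 0;  // next stored (non-null) value
};
```

The flag lets a caller distinguish a NULL row from a row that genuinely stores 0, which the default-value behaviour described above cannot do.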

Reporter: Deepak Majeti / @majetideepak
Assignee: Deepak Majeti / @majetideepak

Note: This issue was originally created as PARQUET-459. Please see the migration documentation for further details.
