monix / monix-connect

A set of connectors for Monix. 🔛

Home Page: https://connect.monix.io

License: Apache License 2.0

Scala 99.31% Shell 0.02% JavaScript 0.59% CSS 0.08%
monix connectors s3 hdfs parquet redis scala reactive-streams aws dynamodb google-cloud-storage mongodb workflow sqs elasticsearch

monix-connect's Introduction

Monix Connect

โš ๏ธ Mind that the project isn't yet stable, so binary compatibility is not guaranteedโ—

Monix Connect is an initiative to implement stream integrations for Monix.

Learn more on how to get started in the documentation page.

Please drop a ⭐ to support this project if you find it interesting! Reach out to us on Gitter or submit an issue if you see room for improvement.

Connectors

The list below comprises the currently available connectors:

  1. Apache Parquet
  2. AWS DynamoDB
  3. AWS S3
  4. AWS SQS
  5. Elasticsearch
  6. Google Cloud Storage
  7. HDFS
  8. MongoDB
  9. Redis

Contributing

The Monix Connect project welcomes contributions from anybody wishing to participate. All code or documentation that is provided must be licensed with the same license that Monix Connect is licensed with (Apache 2.0, see LICENCE).

People are expected to follow the Scala Code of Conduct when discussing Monix on GitHub, Gitter channel, or other venues.

Feel free to open an issue if you notice a bug, you have a question about the code, an idea for an existing connector or even for adding a new one. Pull requests are also gladly accepted. For more information, check out the contributor guide.

Credits

Monix Connect was inspired, in essence, by Akka's Alpakka project, and it takes its name from the similarly popular Kafka Connect.

License

All code in this repository is licensed under the Apache License, Version 2.0. See LICENCE.

monix-connect's People

Contributors

alexandru, avasil, borissmidt, cwgroppe, gkhotyan, livelxw, mattszm, paualarco, rfkm, scala-steward, t1707, tapan-stb, tmkontra

monix-connect's Issues

Parquet writer unsafe

@Avasil would it make sense to align the naming convention of parquet writer with the reader one and have:

def toWriterUnsafe[T](writer: ParquetWriter[T])

So then in the future also have:

def writer[T](writer: ParquetWriter[T])

and

def read[T](reader: Task[ParquetReader[T]])

?

Redis - Remove lettuce types dependencies

Currently, the Monix Redis connector exposes lettuce-specific types that are needed to perform Redis commands.
It would be great to study whether we can define our own types, with conversions back and forth.

As a result, the user would not have to deal with these types, either when passing them as parameters to functions or when receiving them as results.

The aim is to minimize the number of external (non-Monix) types the user has to import, and to make the API more Scala-like (no builders, use companion objects, etc.).

Some examples of these types can be found across the different APIs:

import io.lettuce.core.{KeyValue, Limit, Range, ScoredValue, ScoredValueScanCursor, KeyScanCursor, StreamMessage}
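As a rough illustration of the "no builders, use companion objects" idea, the sketch below defines Scala-native types that a connector could expose instead of the Java ones. All names here (ScoredMember, Bound, ValueRange) are hypothetical and not part of monix-connect or lettuce; the conversion to and from the lettuce types would live inside the connector and is deliberately left out.

```scala
// Hypothetical Scala-native counterpart of a Java-style score/value pair.
final case class ScoredMember[V](score: Double, value: V)

// Hypothetical ADT describing one end of a range.
sealed trait Bound[+A]
object Bound {
  case object Unbounded extends Bound[Nothing]
  final case class Inclusive[A](value: A) extends Bound[A]
  final case class Exclusive[A](value: A) extends Bound[A]
}

// Hypothetical range type; instances are created through the companion
// object rather than through a builder.
final case class ValueRange[A](lower: Bound[A], upper: Bound[A])

object ValueRange {
  def unbounded[A]: ValueRange[A] =
    ValueRange(Bound.Unbounded, Bound.Unbounded)

  def closed[A](from: A, to: A): ValueRange[A] =
    ValueRange(Bound.Inclusive(from), Bound.Inclusive(to))

  def atLeast[A](from: A): ValueRange[A] =
    ValueRange(Bound.Inclusive(from), Bound.Unbounded)
}
```

With types like these, a caller would write something such as ValueRange.closed(0.0, 10.0) and would never need to import anything from io.lettuce.core.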

Type safe MongoDB results

Some of the operations of MongoOp directly return the results received from the underlying MongoDB reactive streams interface. We should create conversions to specific Scala classes that provide safe access to these results.

Example:
When performing replaces or updates, it returns an optional com.mongodb.client.result.UpdateResult.
This object has three fields, two of which are nullable, which might cause a NullPointerException when trying to access them.

@Nullable final Long modifiedCount,
@Nullable final BsonValue upsertedId
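A minimal sketch of such a conversion follows. RawUpdateResult is only a stand-in so that the example runs without the MongoDB driver: it is not the real com.mongodb.client.result.UpdateResult, the matchedCount field is an assumption about the third field, and a String replaces BsonValue. SafeUpdateResult is likewise a hypothetical name.

```scala
// Stand-in for the Java driver's result type (NOT the real UpdateResult).
final class RawUpdateResult(
  val matchedCount: Long,            // assumed to be the non-nullable field
  val modifiedCount: java.lang.Long, // may be null
  val upsertedId: String             // may be null (a BsonValue in the driver)
)

// Hypothetical null-free counterpart exposed to Scala users.
final case class SafeUpdateResult(
  matchedCount: Long,
  modifiedCount: Option[Long],
  upsertedId: Option[String]
)

object SafeUpdateResult {
  // Option(x) evaluates to None when x is null, so nulls never reach the caller.
  def fromRaw(raw: RawUpdateResult): SafeUpdateResult =
    SafeUpdateResult(
      matchedCount = raw.matchedCount,
      modifiedCount = Option(raw.modifiedCount).map(_.longValue),
      upsertedId = Option(raw.upsertedId)
    )
}
```

Callers can then pattern match on modifiedCount and upsertedId instead of checking for null by hand.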

AWS DynamoDB Benchmark

Create AWS DynamoDB benchmarks and compare them with other existing Scala integrations.

Google Cloud Storage

I am opening this issue to track its progress; I will have an initial PR up for review within the week.

Refine DynamoDb connector

  • Fix DynamoDb integration tests failing in the pipeline (but working locally)
  • Add unit tests
  • Update / improve documentation with examples
  • Implementation of pending DynamoDb request/response operations

Redis Benchmarks

Create Redis benchmarks and compare them with other existing Scala Redis integrations.
See redis4cats, laserdisc, zio-redis...

Redis - Revise and correct methods that potentially can return `null` values

monix-redis is built on top of lettuce, and some of its methods can return null instead of the expected value, which makes them unsafe and unreliable.

An example of this is hget from the Hash API. This one is already corrected, but there are more; the idea is to identify the methods that can potentially return null and wrap their results in Option.

Currently the methods are unit tested, and there are also functional tests that run against a Redis Docker container, but these cover just a few examples.

A way of ensuring that no null values are returned would be to add functional tests that handle different scenarios.
In the case of hget, the underlying call returns null when invoked with a key and field that do not exist.

A good way to find the methods that can return null is to check the underlying API documentation directly.
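The wrapping described above can be sketched as follows. To keep the example runnable without Redis or lettuce, NullReturningHashStore is a hypothetical stand-in backed by java.util.HashMap, whose get method really does return null for a missing key; SafeHash is also a made-up name, not the connector's API.

```scala
import java.util.{HashMap => JHashMap}

// Hypothetical stand-in for a null-returning Java client.
final class NullReturningHashStore {
  private val hashes = new JHashMap[String, JHashMap[String, String]]()

  def hset(key: String, field: String, value: String): Unit = {
    var hash = hashes.get(key)
    if (hash == null) {
      hash = new JHashMap[String, String]()
      hashes.put(key, hash)
    }
    hash.put(field, value)
    ()
  }

  // Returns null when either the key or the field does not exist.
  def hget(key: String, field: String): String = {
    val hash = hashes.get(key)
    if (hash == null) null else hash.get(field)
  }
}

object SafeHash {
  // Option(...) turns a null result into None, so callers never see null.
  def hget(store: NullReturningHashStore, key: String, field: String): Option[String] =
    Option(store.hget(key, field))
}
```

A functional test for the real connector would follow the same shape: call the method with a key and field that do not exist, and assert that the result is None rather than null.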

Improve test coverage of Parquet connector

The parquet connector implements two main signatures (writer and reader).

We would like to see more test cases for those, like:

  • Testing scenario where the stream does not emit any records.
  • Consuming from a failed observable.
    ...

GCS Benchmark

Create Google Cloud Storage benchmarks and compare them with other existing integrations.

Failed Redis Integration test

should allow to compose nice for comprehensions *** FAILED ***
List("1684", "1779", "2465", "2910", "3066", "3921", "4441", "6315", "6865", "8213", "8941", "9138", "9977") did not contain the same elements as List("2465", "2910", "3066", "1684", "9977", "4441", "3921", "8213", "9138", "6315", "1779", "8941", "9977", "6865") (RedisIntegrationTest.scala:177)

Refine S3 connector

  • Multipart update needs to support more than one element.
  • Add test coverage for all of the S3 connector.
  • Compare the Monix S3 API with the S3AsyncClient one to see if there are any methods missing.

Fix Apache Parquet API docs

When adding the Apache Parquet submodule to the list of projects for which API docs are generated:
unidocProjectFilter in (ScalaUnidoc, unidoc) := inProjects(akka, dynamodb, hdfs, redis, s3, **parquet**),

Docs generation (sbt docs/docusaurusCreateSite) then fails with the following error:

monix-connect/parquet/target/scala-2.12/src_managed/main/monix/connect/parquet/test/user/ProtoDoc.scala:109:29: value readStringRequireUtf8 is not a member of com.google.protobuf.CodedInputStream
[error] __name = _input__.readStringRequireUtf8()
[error] (docs / Scalaunidoc / doc) Scaladoc generation failed

Fix Flaky DynamoDb integration test

A sample pipeline failure: https://github.com/monix/monix-connect/runs/942214521?check_suite_focus=true

[info] monix.connect.dynamodb.DynamoDbOp$@124d9fc6 exposes a create method that
[info] - defines the execution of any DynamoDb request *** FAILED ***
[info] software.amazon.awssdk.services.dynamodb.model.ResourceNotFoundException: Cannot do operations on a non-existent table (Service: DynamoDb, Status Code: 400, Request ID: null)
[info] at software.amazon.awssdk.services.dynamodb.model.ResourceNotFoundException$BuilderImpl.build(ResourceNotFoundException.java:118)
[info] at software.amazon.awssdk.services.dynamodb.model.ResourceNotFoundException$BuilderImpl.build(ResourceNotFoundException.java:78)
[info] at software.amazon.awssdk.protocols.json.internal.unmarshall.AwsJsonProtocolErrorUnmarshaller.unmarshall(AwsJsonProtocolErrorUnmarshaller.java:88)
[info] at software.amazon.awssdk.protocols.json.internal.unmarshall.AwsJsonProtocolErrorUnmarshaller.handle(AwsJsonProtocolErrorUnmarshaller.java:63)
[info] at software.amazon.awssdk.protocols.json.internal.unmarshall.AwsJsonProtocolErrorUnmarshaller.handle(AwsJsonProtocolErrorUnmarshaller.java:42)
[info] at software.amazon.awssdk.core.internal.http.async.AsyncResponseHandler.lambda$prepare$0(AsyncResponseHandler.java:88)
[info] at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)
[info] at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
[info] at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
[info] at software.amazon.awssdk.core.internal.http.async.AsyncResponseHandler$BaosSubscriber.onComplete(AsyncResponseHandler.java:129)

Refine Redis connector

  • Split the Redis API into smaller groups (Hash, List, Sets, Streams...)
  • Create at least one integration test for each of the modules mentioned above
  • Unit tests for all of the Redis operations
  • Update documentation with an example for each of the Redis APIs

GCS integration tests

This issue aims to add local integration tests for the Google Cloud Storage connector.

We have not found a good way to do so apart from mocking the Google services; we would rather use a Docker image or a more reliable emulator.

There are some examples in Python:
https://github.com/googleapis/python-storage
googleapis/google-auth-library-python#206 (Python Anonymous Auth)
https://github.com/fsouza/fake-gcs-server/blob/main/examples/python/python.py (Example in python)

Other related links:
googleapis/google-cloud-java#7184
https://github.com/googleapis/java-storage
googleapis/google-auth-library-java#449 (Posted issue)

Remove Common Submodule

This submodule currently only has an implementation of a Monix transformer, which is only used by the DynamoDb connector.
Part of this issue is to consider moving that signature back into the Monix project.
See current status

Parquet - HDFS and S3 Parquet tests and examples

This is a low-priority issue, but it would be useful for users to have examples of how to write Parquet files to HDFS and S3 using the Parquet connector. That would imply writing functional tests that cover those cases.

Web documentation

Migrate the current documentation from the project's README to a web version at monix.io/connect/.

AWS S3 Benchmarks

Create AWS S3 benchmarks and compare them with other existing Scala integrations.

Add release process

Add the sbt configuration needed to start releasing artifacts to the Sonatype Nexus Repository, and add a git version tag for each release.
The project's version would remain "0.x" until the project is stable and this feature has been tested a couple of times to make sure deployment is done correctly.
