shapeless-datatype's Introduction

shapeless-datatype

Shapeless utilities for common data types. Also see Magnolify for a simpler and faster alternative based on Magnolia.

Modules

This library includes the following modules.

  • shapeless-datatype-core
  • shapeless-datatype-avro
  • shapeless-datatype-bigquery
  • shapeless-datatype-datastore
  • shapeless-datatype-tensorflow
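
To use them, add the modules you need to your build. A minimal build.sbt sketch; the "me.lyh" organization and the placeholder version are assumptions, so check Maven Central for the exact artifact names (the Datastore artifact in particular has carried a client-version suffix in some releases) and the latest release.

// build.sbt -- a sketch; organization "me.lyh" and the version are assumptions,
// verify the exact coordinates and latest version on Maven Central
val shapelessDatatypeVersion = "x.y.z" // replace with the latest release

libraryDependencies ++= Seq(
  "me.lyh" %% "shapeless-datatype-core" % shapelessDatatypeVersion,
  "me.lyh" %% "shapeless-datatype-avro" % shapelessDatatypeVersion,
  "me.lyh" %% "shapeless-datatype-bigquery" % shapelessDatatypeVersion,
  "me.lyh" %% "shapeless-datatype-datastore" % shapelessDatatypeVersion,
  "me.lyh" %% "shapeless-datatype-tensorflow" % shapelessDatatypeVersion
)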

Core

Core includes the following components.

  • A MappableType for generic conversion between case classes and other data types, used by the BigQuery and Datastore modules.
  • A RecordMapper for generic conversion between case class types.
  • A RecordMatcher for generic type-based equality checks between case classes.
  • A LensMatcher for generic lens-based equality checks between case classes.

RecordMapper

RecordMapper[A, B] maps between instances of case classes A and B whose corresponding fields may have different types.

import shapeless._
import shapeless.datatype.record._
import scala.language.implicitConversions

// records with same field names but different types
case class Point1(x: Double, y: Double, label: String)
case class Point2(x: Float, y: Float, label: String)

// implicit conversions between fields of different types
implicit def f2d(x: Float): Double = x.toDouble
implicit def d2f(x: Double): Float = x.toFloat

val m = RecordMapper[Point1, Point2]
m.to(Point1(0.5, -0.5, "a"))  // Point2(0.5,-0.5,a)
m.from(Point2(0.5, -0.5, "a")) // Point1(0.5,-0.5,a)

RecordMatcher

RecordMatcher[T] compares instances of case class T for equality, with custom logic selected by field type.

import shapeless.datatype.record._

case class Record(id: String, name: String, value: Int)

// custom comparator for String type
implicit def compareStrings(x: String, y: String): Boolean = x.toLowerCase == y.toLowerCase

val m = RecordMatcher[Record]
Record("a", "RecordA", 10) == Record("A", "RECORDA", 10)  // false

// compareStrings is applied to all String fields
m(Record("a", "RecordA", 10), Record("A", "RECORDA", 10))  // true

LensMatcher

LensMatcher[T] compares instances of case class T for equality, with custom logic applied to specific fields via lenses.

import shapeless.datatype.record._

case class Record(id: String, name: String, value: Int)

// compare String fields id and name with different logic
val m = LensMatcher[Record]
  .on(_ >> 'id)(_.toLowerCase == _.toLowerCase)
  .on(_ >> 'name)(_.length == _.length)

Record("a", "foo", 10) == Record("A", "bar", 10)  // false
m(Record("a", "foo", 10), Record("A", "bar", 10))  // true

AvroType

AvroType[T] maps between case class T and Avro GenericRecord. AvroSchema[T] generates an Avro schema for case class T.

import shapeless.datatype.avro._

case class City(name: String, code: String, lat: Double, long: Double)

val t = AvroType[City]
val r = t.toGenericRecord(City("New York", "NYC", 40.730610, -73.935242))
val c = t.fromGenericRecord(r)

AvroSchema[City]

Custom types are also supported.

import shapeless.datatype.avro._
import java.net.URI
import org.apache.avro.Schema

implicit val uriAvroType = AvroType.at[URI](Schema.Type.STRING)(v => URI.create(v.toString), _.toString)

case class Page(uri: URI, rank: Int)

val t = AvroType[Page]
val r = t.toGenericRecord(Page(URI.create("www.google.com"), 42))
val c = t.fromGenericRecord(r)

AvroSchema[Page]

BigQueryType

BigQueryType[T] maps between case class T and BigQuery TableRow. BigQuerySchema[T] generates a table schema for case class T.

import shapeless.datatype.bigquery._

case class City(name: String, code: String, lat: Double, long: Double)

val t = BigQueryType[City]
val r = t.toTableRow(City("New York", "NYC", 40.730610, -73.935242))
val c = t.fromTableRow(r)

BigQuerySchema[City]

Custom types are also supported.

import shapeless.datatype.bigquery._
import java.net.URI

implicit val uriBigQueryType = BigQueryType.at[URI]("STRING")(v => URI.create(v.toString), _.toString)

case class Page(uri: URI, rank: Int)

val t = BigQueryType[Page]
val r = t.toTableRow(Page(URI.create("www.google.com"), 42))
val c = t.fromTableRow(r)

BigQuerySchema[Page]

DatastoreType

DatastoreType[T] maps between case class T and Cloud Datastore Entity or Entity.Builder Protobuf types.

import shapeless.datatype.datastore._

case class City(name: String, code: String, lat: Double, long: Double)

val t = DatastoreType[City]
val r = t.toEntity(City("New York", "NYC", 40.730610, -73.935242))
val c = t.fromEntity(r)
val b = t.toEntityBuilder(City("New York", "NYC", 40.730610, -73.935242))
val d = t.fromEntityBuilder(b)

Custom types are also supported.

import shapeless.datatype.datastore._
import com.google.datastore.v1.client.DatastoreHelper._
import java.net.URI

implicit val uriDatastoreType = DatastoreType.at[URI](
  v => URI.create(v.getStringValue),
  u => makeValue(u.toString).build())

case class Page(uri: URI, rank: Int)

val t = DatastoreType[Page]
val r = t.toEntity(Page(URI.create("www.google.com"), 42))
val c = t.fromEntity(r)
val b = t.toEntityBuilder(Page(URI.create("www.google.com"), 42))
val d = t.fromEntityBuilder(b)

TensorFlowType

TensorFlowType[T] maps between case class T and TensorFlow Example or Example.Builder Protobuf types.

import shapeless.datatype.tensorflow._

case class Data(floats: Array[Float], longs: Array[Long], strings: List[String], label: String)

val t = TensorFlowType[Data]
val r = t.toExample(Data(Array(1.5f, 2.5f), Array(1L, 2L), List("a", "b"), "x"))
val c = t.fromExample(r)
val b = t.toExampleBuilder(Data(Array(1.5f, 2.5f), Array(1L, 2L), List("a", "b"), "x"))
val d = t.fromExampleBuilder(b)

Custom types are also supported.

import shapeless.datatype.tensorflow._
import java.net.URI

implicit val uriTensorFlowType = TensorFlowType.at[URI](
  TensorFlowType.toStrings(_).map(URI.create),
  xs => TensorFlowType.fromStrings(xs.map(_.toString)))

case class Page(uri: URI, rank: Int)

val t = TensorFlowType[Page]
val r = t.toExample(Page(URI.create("www.google.com"), 42))
val c = t.fromExample(r)
val b = t.toExampleBuilder(Page(URI.create("www.google.com"), 42))
val d = t.fromExampleBuilder(b)

License

Copyright 2016 Neville Li.

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0

shapeless-datatype's People

Contributors

franbh, hansencc, nevillelyh, regadas, scala-steward

shapeless-datatype's Issues

RecordMappers involving transformations on Maps do not compile

Maps that need to be mapped via an implicit conversion (e.g. String => Array[Byte]), or whose values are different case classes (even if they share the same structure, e.g. mapping Map[String, X] to Map[String, Y] where X(a: Int) and Y(a: Int)), do not compile.
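
A minimal sketch of the failing shape described above (names are illustrative, not from the report); the derivation is left commented out since it is the part that does not compile:

import shapeless.datatype.record._

case class X(a: Int)
case class Y(a: Int)

case class RecordA(m: Map[String, X])
case class RecordB(m: Map[String, Y])

// per this report, the derivation below does not compile even though
// X and Y are structurally identical
// val mapper = RecordMapper[RecordA, RecordB]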

[datastore]: StackOverflow for Case Class with over 22 fields

I have 4 converters where I'm calling toEntityBuilder, and if I run sbt clean compile I consistently get the following StackOverflowError - https://gist.github.com/grahamar/f9100091d61f0313327a07e6aba06435#file-stack-txt

But if I don't clean, it works, which makes me think it's a race condition... All my code looks like this - https://gist.github.com/grahamar/f9100091d61f0313327a07e6aba06435#file-authorconverter-scala - nothing complex.

My types are all very straightforward too:

case class Author(
  id: Long = 0L,
  source_id: Option[Long] = None,
  name: Option[String] = None,
  enabled: Option[Boolean] = None,
  avatar_url: Option[String] = None,
  extracted_name: Option[String] = None,
  item_name: Option[String] = None,
  unique_id: Option[String] = None,
  deleted_at: Option[String] = None,
  created_at: Option[String] = None,
  updated_at: Option[String] = None
)

Any help is appreciated; I don't have enough knowledge of shapeless to know where to look either...

Deriving logic prefers derived case class mapper over custom provided

When a case class contains another case class, the inner case class is normally interpreted as a RECORD. However, if a custom mapper is provided for it (as in the documentation example), the schema generator treats it as a field of the custom type, but the derived ToMappable still treats it as a RECORD.

This test (as per documentation) succeeds:

import shapeless.datatype.bigquery._
import java.net.URI

implicit val uriBigQueryType = BigQueryType.at[URI]("STRING")(v => URI.create(v.toString), _.toString)

case class Page(uri: URI, rank: Int)

val t = BigQueryType[Page]
val r = t.toTableRow(Page(URI.create("www.google.com"), 42))

// this is as expected
r.get("uri") ==== "www.google.com"

while this fails:

import shapeless.datatype.bigquery._

case class Yuri(ss: String)
implicit val uriBigQueryType = BigQueryType.at[Yuri]("STRING")(v => new Yuri(v.toString), _.ss)

case class Page(uri: Yuri, rank: Int)

val t = BigQueryType[Page]
val r = t.toTableRow(Page(new Yuri("www.google.com"), 42))

// this FAILS: instead of a string, it contains a linked hash map
r.get("uri") ==== "www.google.com"

Note that the only difference here is that instead of the (non-case) URI class, a custom case class Yuri is used (and if I switch it from a case class to a normal class, the problem goes away).

I think the reason is that the ToMappable.hconsToMappable0 implicit (which derives a mapper from the internal structure of the case class) has higher priority than ToMappable.hconsToMappable1 (which uses the externally provided mapper).

Creating a generic AvroType

Is it possible to have a generic AvroType? For instance, something like the following class:

class AvroDeserializationSchema[IN: ClassTag : TypeTag] {
   val t = AvroType[IN]
}

I'm struggling with this since I want to make a generic Avro (de)serializer.
Thanks in advance!

BigQueryType.toTableRow does not serialize properly in lambda when there are custom types

This serializes, but BigQuerySchema does not know about the custom types. Note: releases is an SCollection[Release].

      val bqType: BigQueryType[Release] = BigQueryType[Release]
      releases
        .map { a =>
          implicit val enumType
            : BaseBigQueryMappableType[MyEnum] =
            BigQueryType.at[MyEnum]("STRING")(
              a => MyEnum.fromName(a.toString).orNull,
              _.name)
          bqType.toTableRow(a)
        }
        .saveAsBigQuery(
          table,
          schema = BigQuerySchema[Release]
        )

This does not serialize:

      implicit val enumType
        : BaseBigQueryMappableType[MyEnum] =
        BigQueryType.at[MyEnum]("STRING")(
          a => MyEnum.fromName(a.toString).orNull,
          _.name)
      val bqType: BigQueryType[Release] = BigQueryType[Release]
      val schema = BigQuerySchema[Release]

      releases
        .map { a =>
          bqType.toTableRow(a)
        }
        .saveAsBigQuery(
          table,
          schema = schema
        )
Exception in thread "main" java.lang.IllegalArgumentException: unable to serialize anonymous function map ...
Caused by: java.io.NotSerializableException ...

RecordMappers: improvements on Option support

Right now, a RecordMapper only supports Option[X] => Option[Y] mappings. It'd be nice to enhance that by:

  1. Supporting X => Option[Y]
  2. Supporting Option[Y] => X.

I know this is controversial, as it could lead to runtime exceptions when trying to extract None. I'd suggest providing this as an optional feature, enabled manually by the user only when deemed appropriate.

A bit of background: we need this in order to map from our domain model to the classes generated by ScalaPB and vice versa. Since we are using Protobuf 3 syntax, ScalaPB automatically generates Options for all message references inside messages. This Option enhancement would help us solve the impedance mismatch between the domain model and the ScalaPB model.
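
For illustration, a hypothetical sketch of the requested behavior (the class names are made up, and the commented-out derivation is the part that is not supported today):

import shapeless.datatype.record._

case class Address(street: String)

// hand-written domain model
case class DomainUser(name: String, address: Address)
// ScalaPB-style generated class: message references become Options
case class ProtoUser(name: String, address: Option[Address])

// today only Option[X] => Option[Y] is supported; the requested feature would
// wrap address on the way in and unwrap it on the way out (failing on None)
// val m = RecordMapper[DomainUser, ProtoUser]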

I'll provide you with a PR later.

Avro support Maps

Hi,

I don't think this library currently supports Maps for Avro. Will this be supported in the (near) future, or is it possible to add it as a custom type?
Thanks
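
For reference, a sketch of the shape being asked about (illustrative names); per this issue the derivation is expected not to resolve, so it is left commented out:

import shapeless.datatype.avro._

case class Counts(values: Map[String, Int])

// per this issue, Map fields are not currently supported by the Avro module
// val t = AvroType[Counts]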

Fail nested type for TensorFlowType at compile time

Right now TensorFlowType uses the same traits as BigQueryType and DatastoreType but doesn't implement methods that handle nested types. We should either duplicate the implicit chain logic or figure out a way to fail it at compile time.
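
A sketch of the problematic shape (illustrative names): an Example holds a flat set of features, so a nested case class field has no natural encoding, and ideally the derivation below would be rejected at compile time.

import shapeless.datatype.tensorflow._

case class Inner(value: Float)
case class Outer(label: String, inner: Inner)

// nested Inner has no flat feature encoding; per this issue the derivation
// should fail at compile time, so it is left commented out here
// val t = TensorFlowType[Outer]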

Support Sets in FromMappable and ToMappable

This can be achieved the same way as in #11, which would add support for Maps too. However, I found that in order to use a Set I also needed to define a CanBuild[T, Set[T]], so I guess the same would be true for Map.
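
For illustration, a sketch of the kind of field this is about, using the BigQuery module as one consumer of FromMappable/ToMappable (the derivation is commented out since Set support is what is being requested):

import shapeless.datatype.bigquery._

case class Tags(name: String, tags: Set[String])

// Set[String] is not yet supported by FromMappable/ToMappable; per this issue
// it needs something like a CanBuild[String, Set[String]] to be derivable
// val t = BigQueryType[Tags]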
