finagle / roc

A Modern Finagle-Postgresql Client

Home Page: http://finagle.github.io/roc/docs/

License: BSD 3-Clause "New" or "Revised" License

Scala 98.39% DTrace 1.61%

roc's People

Contributors

penland365

roc's Issues

Remove Postgresql Types

For the time being, let's yank out Postgres Types, since it's unknown whether they should reside in core or not. Columns can simply return the OID integer, and we will trust clients to know what they are asking for (for right now at least).

Cleaner README

This is probably an ongoing task, but it's here nonetheless.

Initial Decoders

Decoders for Standard Value types such as:

  • Option
  • String
  • Short
  • Int
  • Long
  • Float
  • Double
  • Defined JSON type
  • java.time.LocalTime
  • java.time.LocalDate
  • java.time.ZonedDateTime
  • Boolean
  • Char

See java.sql.Types for base Java information and Postgresql Types for information about Postgresql Types.
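
As a rough illustration of what such decoders might look like, here is a minimal type class sketch. The names `Decoder`, `fromParser`, and `decodeColumn` are hypothetical and are not taken from roc; the sketch also assumes that column values arrive in Postgres text format (where booleans are sent as `t` / `f`) and that SQL NULL is modelled as `None`.

```scala
import java.time.LocalDate

object DecoderSketch {
  // Hypothetical type class: decodes the text form of a single column value.
  trait Decoder[A] {
    def decode(text: String): Either[String, A]
  }

  object Decoder {
    // Turns a parser that may throw into a Decoder that reports failures as Left.
    def fromParser[A](typeName: String)(parse: String => A): Decoder[A] =
      new Decoder[A] {
        def decode(text: String): Either[String, A] =
          try Right(parse(text))
          catch { case _: Exception => Left(s"cannot decode '$text' as $typeName") }
      }

    implicit val stringDecoder: Decoder[String] = fromParser[String]("String")(s => s)
    implicit val intDecoder: Decoder[Int] = fromParser[Int]("Int")(s => s.toInt)
    implicit val longDecoder: Decoder[Long] = fromParser[Long]("Long")(s => s.toLong)
    implicit val localDateDecoder: Decoder[LocalDate] =
      fromParser[LocalDate]("LocalDate")(s => LocalDate.parse(s))

    // Assumes the Postgres text representation of booleans.
    implicit val booleanDecoder: Decoder[Boolean] = new Decoder[Boolean] {
      def decode(text: String): Either[String, Boolean] = text match {
        case "t" | "true"  => Right(true)
        case "f" | "false" => Right(false)
        case other         => Left(s"cannot decode '$other' as Boolean")
      }
    }
  }

  // A raw value of None stands for SQL NULL, which decodes to None for any type.
  def decodeColumn[A](raw: Option[String])(implicit d: Decoder[A]): Either[String, Option[A]] =
    raw match {
      case None => Right(None)
      case Some(text) =>
        d.decode(text) match {
          case Right(a)  => Right(Some(a))
          case Left(err) => Left(err)
        }
    }
}
```

With this shape, `DecoderSketch.decodeColumn[Int](Some("42"))` yields `Right(Some(42))`, and a malformed value yields a `Left` rather than an exception.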

Flesh out Authentication State

Currently we can authenticate with No Password, Plain Text Password, and MD5 encrypted Password, while every other Authentication request just throws an unhelpful Exception.

We need to check against all possible Authentication types and introduce a new Type that describes each one, so that a failure can explain why the Authentication failed.
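
One possible shape for that new Type is sketched below. The integer codes are the ones the Postgres frontend/backend protocol uses in its Authentication messages; the Scala names (`AuthenticationRequest`, `UnsupportedAuthentication`, and so on) are hypothetical, and the continuation codes (8, 11, 12) are deliberately left to the catch-all case to keep the sketch short.

```scala
object AuthenticationSketch {
  // Hypothetical ADT describing what the server asked for.
  sealed trait AuthenticationRequest
  case object AuthenticationOk extends AuthenticationRequest
  case object KerberosV5 extends AuthenticationRequest
  case object CleartextPassword extends AuthenticationRequest
  case object Md5Password extends AuthenticationRequest
  case object ScmCredential extends AuthenticationRequest
  case object Gss extends AuthenticationRequest
  case object Sspi extends AuthenticationRequest
  case object Sasl extends AuthenticationRequest
  final case class Unrecognized(code: Int) extends AuthenticationRequest

  // Maps the Int32 code carried by an Authentication message to the ADT.
  def fromCode(code: Int): AuthenticationRequest = code match {
    case 0     => AuthenticationOk
    case 2     => KerberosV5
    case 3     => CleartextPassword
    case 5     => Md5Password
    case 6     => ScmCredential
    case 7     => Gss
    case 9     => Sspi
    case 10    => Sasl
    case other => Unrecognized(other)
  }

  // A failure that names the request we could not satisfy.
  final case class UnsupportedAuthentication(request: AuthenticationRequest)
      extends Exception(s"unsupported authentication request: $request")

  // Only the three mechanisms mentioned above are treated as supported here.
  def check(request: AuthenticationRequest): Either[UnsupportedAuthentication, AuthenticationRequest] =
    request match {
      case AuthenticationOk | CleartextPassword | Md5Password => Right(request)
      case other                                              => Left(UnsupportedAuthentication(other))
    }
}
```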

Move Cats Version to 0.9.0

Related to #79, we want to update Cats to 0.9.0 while still supporting Scala 2.11.8, in preparation for a big Finagle version increase.

Result Type

The current Result type is a grab bag of random things thrown back up from a server. Time to iron it out.

Supporting additional Charsets

From #72, @jeremyrsmith raises a question about supporting additional Charsets for Postgres. Currently we are locked in to StandardCharsets.UTF_8 throughout the code base.

Postgres gives us the current encoding of the Server through a Parameter Status update which can come at any point during the connection.

The question I have going forward - is this something we should automatically detect from Postgres and attempt to make it transparent to a user? Or should we set the encoding through Stack.Param configuration and immediately close the connection if the encoding ever changes from the Server?
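
To make the second option concrete, here is a small sketch of a fail-fast check. It assumes the encoding name arrives as the value of a ParameterStatus entry such as `client_encoding`; the object and function names are hypothetical, and the lookup table only covers two encodings on purpose.

```scala
import java.nio.charset.{Charset, StandardCharsets}
import java.util.Locale

object EncodingSketch {
  // Deliberately small table from Postgres encoding names to JVM charsets.
  private val known: Map[String, Charset] = Map(
    "UTF8"   -> StandardCharsets.UTF_8,
    "LATIN1" -> StandardCharsets.ISO_8859_1
  )

  def charsetFor(pgEncoding: String): Option[Charset] =
    known.get(pgEncoding.toUpperCase(Locale.ROOT))

  // Right when the reported encoding matches the configured one,
  // Left with a reason when the connection should be closed.
  def verify(configured: Charset, reported: String): Either[String, Charset] =
    charsetFor(reported) match {
      case Some(cs) if cs == configured => Right(cs)
      case Some(cs) => Left(s"server switched to $cs, but the client is configured for $configured")
      case None     => Left(s"unsupported encoding reported by server: $reported")
    }
}
```

The automatic-detection option would instead store the result of `charsetFor` on the connection and use it for every subsequent decode.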

Text rows not UTF-8 decoded when read from server

Rows that contain Unicode strings come back garbled. Where the string in the database is "Empeñada", I get back "Empeᅢᄆada".

Here's an example method:

def getFood(): Future[String] = {
  // A plain one-line query needs neither string interpolation nor stripMargin
  val req = new Request("SELECT 'Empeñada' AS course")
  pgClient.query(req).map { result =>
    result.map(_.get('course).as[String]).head
  }
}

Here's how I called it:

scala> getFood().get
res5: String = Empeᅢᄆada

Which we can verify is not just my terminal:

scala> food.foreach(char => println("%02X" format char.hashCode))
45
6D
70
65
FFC3
FFB1
61
64
61

@penland365 believes the issue is in Result.rows where this:

val strValue = bytes.map(_.toChar).mkString
Text(column.name, column.columnType, strValue)

...should be something like this:

val strValue = new String(bytes, StandardCharsets.UTF_8)
Text(column.name, column.columnType, strValue)
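
The difference between the two conversions can be reproduced without a database. The sketch below is standalone (the object name is made up for illustration): converting each byte with `toChar` sign-extends the negative bytes 0xC3 and 0xB1 into the code units FFC3 and FFB1 seen in the hex dump above, while decoding the whole array as UTF-8 recovers the single character ñ (U+00F1).

```scala
import java.nio.charset.StandardCharsets

object Utf8DecodeSketch {
  // "Empeñada" as UTF-8; ñ (U+00F1) is encoded as the two bytes 0xC3 0xB1.
  val bytes: Array[Byte] = "Empe\u00F1ada".getBytes(StandardCharsets.UTF_8)

  // Per-byte conversion: each negative byte is sign-extended into a bogus char.
  val perByte: String = bytes.map(_.toChar).mkString

  // Decoding the byte array as a whole interprets multi-byte sequences correctly.
  val decoded: String = new String(bytes, StandardCharsets.UTF_8)

  def hex(s: String): String = s.map(c => "%02X".format(c.toInt)).mkString(" ")

  def main(args: Array[String]): Unit = {
    println(hex(perByte)) // 45 6D 70 65 FFC3 FFB1 61 64 61
    println(hex(decoded)) // 45 6D 70 65 F1 61 64 61
  }
}
```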

Requests returning results from previous queries

We've been experiencing some occasional "Could not find element X in row" exceptions when mapping over queries. In each case where this happens, it's implausible that the column would be missing from that query.

The majority of queries are free from this issue, so it's been difficult to reproduce (the issue vanished after my first attempt at logging with four separate log statements, whether by coincidence or because the logging skewed a race condition).

At @penland365's suggestion, I added logging for result.completedCommand, result.columns and result.rows. I replaced calls to pgClient.query with a call to this method:

  private[this] def pgQuery(req: Request): Future[Result] =
    pgClient.query(req).foreach { result =>
      logger.info(
        s""" - req: $req
           | - res.completedCommand: ${result.completedCommand}
           | - res.columns: ${result.columns.map(_.name)}
           | - res.rows: ${result.mkString(",")}""".stripMargin)
    }

I've since been able to identify specific queries where this exception occurs, along with the associated result contents. Here's one request that resulted in an error. I've formatted and anonymized it a bit (e.g., the rows were on a single line, but I've split them across lines for readability).

One such error is roc.postgresql.failures$ElementNotFoundFailure: Could not find element 'report_count in Row, and the corresponding requests and results are these:

- timestamp: 2016-06-29T02:17:24.471Z
- request: Request(
    select * from question_reports where
    topic_slug = '_f8507801-d9d8-4863-a6ac-499fd35efdaf' and lifecycle = 'new'
    and report_type = 'inappropriate')
- result
  - completedCommand: SELECT 0
  - columns: [Lscala.Symbol;@460039e4
  - rows: []

(Unfortunately, it looks like something is wrong with how I'm printing the columns.)
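
The `[Lscala.Symbol;@...` output is what the JVM prints for an array's default `toString`, which suggests the column names were held in an `Array[Symbol]`. A standalone sketch of the difference, using made-up column names and not touching roc itself:

```scala
object ArrayPrintSketch {
  // Stand-in for the column names; the real values come from the query result.
  val columns: Array[Symbol] = Array(Symbol("topic_slug"), Symbol("lifecycle"))

  // Interpolating the array directly prints its JVM identity, e.g. [Lscala.Symbol;@460039e4
  val asIdentity: String = s"$columns"

  // Rendering each element explicitly gives a readable list.
  val readable: String = columns.map(_.name).mkString("[", ", ", "]")
}
```

Here `readable` is `[topic_slug, lifecycle]`, so a `mkString` call in the logging helper would likely have shown the actual column names.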

In the following query, ~900ms later, it appears that the result from the previous query is returned:

- timestamp: 2016-06-29T02:18:25.344Z
- request: Request(
    SELECT COUNT(player_id) AS report_count
    FROM player_reports WHERE player_id = '22222222')
- result
  - completedCommand: SELECT 1
  - columns: [Lscala.Symbol;@35247c44
  - rows: [
    Text('topic_slug,1043,_f8507801-d9d8-4863-a6ac-499fd35efdaf)
    Text('reporter_id,1043,11111111)
    Text('language,1043,en)
    Text('reason,1043,other)
    Text('message,1043,being a meanie)
    Text('lifecycle,1043,new)
    Text('created,1114,2016-05-30 17:12:41.816803)
  ]

This is an internal, low-volume service, so the next query came ~1 minute later: an INSERT query whose result had completedCommand as "SELECT 0" and no rows.

~2 minutes later, we finally get the response from the count query in the result of an INSERT query:

- timestamp: 2016-06-29T02:20:50.026Z
- request: Request(
    insert into topic_reports
    (topic_slug, reporter_id, reason, language, message)
    values (
      '_5c1068a1-5780-45ff-a7c6-01ee7e3887b8',
      '333333333',
      'boring_topic',
      'en',
      'I do not care for this topic'
    ))
- result
  - completedCommand: SELECT 1
  - columns: [Lscala.Symbol;@6caec085
  - rows: [Text('report_count,20,9)]

Looking through the logs, this kind of mixup appears to happen quite a bit. In many cases it's benign since we're not processing the result (e.g. for insert queries).

If you have any hunches about what might be causing this, I might be able to contribute a fix.

Yank superfluous code.

There's a lot of extraneous code from the early parts of this project, particularly around failures and decoders. Just yank it all out man.

NoticeResponse Implementation

For now, we will be throwing these responses on the floor. However, they still need to be decoded now that PostgresqlMessage has made its way into the code base.

Return Multiple Error Messages

Under the current Postgres Protocol, a single ErrorResponse may contain multiple ErrorMessages within it. Currently, we are retrieving the first ErrorMessage from the response, then throwing the rest on the floor.

This needs to be fixed before 0.1.0, but returning multiple messages needs to wait until we have a better handle on the Dispatcher Architecture.

SQL Escape Capability

Users should be able to have SQL queries and parameters automatically SQL escaped.
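
As a starting point for discussion, the sketch below shows the standard SQL quoting rules: a single quote inside a string literal is doubled, and a double quote inside an identifier is doubled. The function names are hypothetical, and the literal escaping assumes the server has `standard_conforming_strings` turned on so that backslashes are not treated as escape characters.

```scala
object SqlEscapeSketch {
  // Wraps a value as a SQL string literal, doubling any embedded single quotes.
  def quoteLiteral(value: String): String =
    "'" + value.replace("'", "''") + "'"

  // Wraps a name as a quoted identifier, doubling any embedded double quotes.
  def quoteIdentifier(name: String): String =
    "\"" + name.replace("\"", "\"\"") + "\""
}
```

For example, `SqlEscapeSketch.quoteLiteral("O'Brien")` produces `'O''Brien'`. Sending parameters separately from the query text would avoid escaping altogether, but that is a larger change than this issue describes.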

Preferred Roc Error Handling?

Problem

What is the appropriate way to handle errors in Roc? Several important factors weigh on this decision:

  1. We are not bound by the JDBC style of error handling.
  2. Postgresql has explicit documentation around "error classes" that allows us to model a great many Server errors without the need to create Exceptions.
  3. From point 2, there are 278 explicit types of errors Postgresql can emit, and 41 "error classes" that these 278 fall into.
  4. From point 3, these 41 Error Class codes can be further classified as "Error", "Fatal", "Panic", "Warning", "Notice", "Debug", "Info", "Log" to denote the expected action of a client.
  5. "Adhere to the style of the original" suggests that we should not stray too far from the Future.exception style of handling used with Twitter Futures.
  6. The Result type is more or less undefined at this point. Currently it just holds some information returned from the DB, but this isn't firm in the slightest and I expect it to change dramatically as the code continues to reveal itself.
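
For context on points 2 and 3: every Postgresql error carries a five-character SQLSTATE code, and its first two characters identify the error class. A minimal sketch of that grouping follows; the object and function names are hypothetical, and only three of the classes are listed.

```scala
object SqlStateSketch {
  // The error class is the first two characters of a five-character SQLSTATE.
  def errorClass(sqlState: String): Option[String] =
    if (sqlState.length == 5) Some(sqlState.take(2)) else None

  // A small excerpt of the class table from the Postgresql error code appendix.
  private val classNames: Map[String, String] = Map(
    "42" -> "Syntax Error or Access Rule Violation",
    "53" -> "Insufficient Resources",
    "57" -> "Operator Intervention"
  )

  def describe(sqlState: String): String =
    errorClass(sqlState).flatMap(classNames.get).getOrElse("unknown error class")
}
```

For example, `42703` (undefined_column) falls into class `42`, while `57P01` (admin_shutdown) falls into class `57`.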

Currently, we are modeling any Errors that happen in the client (for example, decoding failures) as Failures, and any Exceptions that occur on the postgresql side as PostgresqlErrors.

abstract class Failure extends Exception // Client side problems
abstract class PostgresqlError  // Server side problems

Possible Solutions

The "kill -9"

Perhaps the easiest of these: we create a single type to model any error from a Postgresql database. Upon receipt of any error, we immediately close the connection and return a failed Future:

case class PostgresqlError() extends Exception // a case class needs a parameter list
Future.exception(new PostgresqlError())

The "all. the. errors." approach

sealed abstract class PostgresqlError // does NOT extend Exception
case class UndefinedColumn() extends PostgresqlError // one case per Postgresql error type

Under this approach, we explicitly model each Postgresql Error type. As the errors are a closed set (i.e., there is a well defined set of errors and a well defined behavior if, for some reason, an error is returned that does not appear in this set), this is certainly a possibility. The connection would not be closed, and the Result type would have a way to model a possible error (perhaps as a disjunction?).

However, the enormous number of error types makes this challenging and error-prone for any user.

The "adhere to JDBC style" approach

In this scenario, we would map Postgresql Errors to their corresponding JDBC Exceptions, and then simply bail out in the client code:

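// some decoding has occurred above
val error = InvalidColumnReference // the decoded Postgresql error
// SQLException takes a String reason, so pass a description of the error
Future.exception(new java.sql.SQLException(error.toString))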

The "let's just cut the baby in half" approach

In this scenario, we would create Error Types based off the severity level. Errors that do not affect the connection state would simply be errors, while errors that do affect the connection state would also be Exceptions, i.e.

sealed trait PostgresqlError // a trait, so it can be mixed into Exception subclasses
case class Error() extends PostgresqlError // example, "no table named FOO exists"
case class Fatal() extends Exception with PostgresqlError // example, admin shutdown initiated
case class Panic() extends Exception with PostgresqlError // example, OOM
case class Debug() extends PostgresqlError

Non-exception types would be returned as part of the Result type (still not sure how though), while any connection-closing errors would close the connection, and then bail out via

Future.exception(new Fatal())
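
One way the severity split could be expressed is with a disjunction, sketched below using the standard library `Either` so that it runs without Finagle. All names here (`Recoverable`, `ConnectionFatal`, `classify`) are hypothetical, and the sketch assumes that only the FATAL and PANIC severities affect the connection state.

```scala
object SeveritySplitSketch {
  sealed trait PostgresqlError {
    def severity: String
    def message: String
  }

  // Returned to the caller as a plain value; the connection stays open.
  final case class Recoverable(severity: String, message: String) extends PostgresqlError

  // Also an Exception, so it can be used to fail a Future after closing the connection.
  final case class ConnectionFatal(severity: String, message: String)
      extends Exception(message) with PostgresqlError

  // Left means "close the connection and fail"; Right means "report as part of the result".
  def classify(severity: String, message: String): Either[ConnectionFatal, Recoverable] =
    severity match {
      case "FATAL" | "PANIC" => Left(ConnectionFatal(severity, message))
      case _                 => Right(Recoverable(severity, message))
    }
}
```

A dispatcher could then close the connection and fail the Future on a `Left`, and attach the `Right` value to whatever the Result type ends up looking like.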

Thoughts on this? I'm kind of torn between the "shove everything into a Future.exception" and "model all error types".

Hide extraneous types.

This one isn't too bad - we need to go through and add

private[roc]

to any types that clients don't need to know about.

Migrate to Finagle-4

A long-term feature that would be great to have, but it's way down the list of priorities right now.

DataRow Generator Fails too often

The generator used for testing Data Row Packet Decoders is failing too often right now because ScalaCheck is giving up after too many discarded values.

I've labeled this Pending for now but we need to look at the generator to see why this is happening.
