
bryan_cb_sf_2020_talk's Introduction

theme: Simple, 1
autoscale: true
slidenumbers: true
footer: © Erlang Solutions 2020

[fit] Customer retention

[fit] and how to avoid double billing

Bryan Hunt (ESL)


About me

  • Slinging code for 20 years. [^1]
  • Writing Elixir for about 5 years now (and loving it).
  • Built 52 slides so we'll have to move fast.


Why I like Elixir

  • Elixir has lovely syntax and UX 💃
  • Elixir makes hard things easy ✅
  • Elixir also makes easy things easy ✅
  • Elixir lacks drama ✅
  • Elixir is stable/boring ✅
  • Elixir doesn't crash 〽 [^2]


Talk objectives

  1. Opportunity to complain about being double charged when booking a flight.
  2. An analysis of why their system double charged the customer.
  3. Illustrate techniques to:
    • Build a more reliable system
    • Not double charge customers
    • Proactively monitor for system malfunction

Opening scene [^u owe me]

[.column]

  1. Urgently need to fly from Dublin to London the next day.
  2. The website warns me there are only 3 seats remaining for the flight.
  3. Better book fast!
    • Select flight.
    • Select any seat.
    • Enter my credit card details.
    • Deal with a couple of session timeouts.

[.column]

  4. The Aer Lingus website:
    • Insists I sign up for their loyalty scheme (the sweet, sweet irony).
    • Crashes.
    • Locks me out.
  5. I wait 30 minutes and don't receive any email confirmation.
  6. Panic 😧.

[^u owe me]: This happened nearly 4 years ago but I still want my £95.99 back (with interest).


Like a desperate fool

  1. Fresh browser.
  2. Start the booking process from scratch.
  3. Decline the Aer Lingus loyalty scheme 😱.
  4. Use the same name, email, and credit card.
  5. On the second attempt, the booking succeeds.
  6. Nervously await booking confirmation.

Success!

Receive a booking confirmation at 8:03 PM - you will be flying to London!


Actually Fail!

  • Another booking confirmation, this time at 8:15 PM.
  • Check my bank account - billed twice.
  • Activate the online chat - it times out.
  • Call the website helpdesk... no answer.
  • Maybe I'm not the only one having issues.

Good luck with that!

I can guarantee there's nobody listening on either of these two numbers -- Me

[.column]

Website Helpdesk 0333 006 6920 Mon-Fri 08:00-06:00 Sat-Sun & Bank Holidays: 09:00-06:00

[.column]

Reservations 0333 004 5000 Mon-Fri: 08:00-6:00 Sat-Sun & Bank Holidays: 09:00-06:00


Aftermath

  • Whine about Aer Lingus on social media/LinkedIn
  • Ponder what went wrong
  • Could I implement a better system in Elixir? [^3]

^ 10 minutes in


What could we improve?

  1. Error detection/notification
  2. Fault tolerance
  3. Session storage
  4. Prevent double billing

^Error detection - the customer shouldn't have to chase a system error.
^Tolerate failure - improve reliability of network calls (database/microservices).
^Session storage - Handle more customers (sessions) at once with fewer physical resources.
^Session storage - Allow a user session to live longer (5-minute timeouts are too short).
^Prevent double billing - Prevent duplicate writes to a relational database.
^Prevent double billing - Prevent duplicate writes to anything.



Error detection/notification

  • Add bugsnag-elixir as a dependency and sign up for the bugsnag software as a service (GDPR?).
  • Connect WombatOAM to the node [^4] and use it to detect errors.
  • Write your own global error handler...

Global error handler implementation


defmodule Global.Handler do
  require Logger
  @behaviour :gen_event

  def init([]), do: {:ok, []}

  def handle_call({:configure, new_keys}, _state) do
    {:ok, :ok, new_keys}
  end

  def handle_event({:error_report, gl, {_pid, _type, [message | _]}}, state)
      when is_list(message) and node(gl) == node() do
    # Or maybe use the new (Elixir 1.10.1) structured logger?
    Logger.error("Global error handler: #{inspect(message, pretty: true)}")
    {:ok, state}
  end

  def handle_event({_level, _gl, _event}, state) do
    {:ok, state}
  end
end

^ Rather than Logger.error - do something useful? Your choice.


Using the global error handler

Add it:

  :error_logger.add_report_handler(Global.Handler)

Trap exit signals:

  Process.flag(:trap_exit, true)

Spawn a process which will raise an exception/terminate:

  Task.async(fn -> raise "hell" end)

^ Trap exit signals (don't kill iex):
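To make the effect of `:trap_exit` concrete, here is a minimal sketch using only the standard library. `TrapExitSketch` and `observe_crash/1` are hypothetical names invented for this illustration; it uses `spawn_link/1` rather than `Task.async/1` to keep the example small, so it shows the general mechanism, not the exact demo from the talk.

```elixir
defmodule TrapExitSketch do
  # Hypothetical helper: spawns a linked process that exits with `reason`
  # and returns what the calling process observes.
  def observe_crash(reason) do
    # With :trap_exit set, exit signals from linked processes arrive as
    # {:EXIT, pid, reason} messages instead of terminating the caller.
    previous = Process.flag(:trap_exit, true)
    pid = spawn_link(fn -> exit(reason) end)

    result =
      receive do
        {:EXIT, ^pid, observed} -> {:crashed, observed}
      after
        1_000 -> :timeout
      end

    # Restore the previous flag so the caller's behaviour is unchanged.
    Process.flag(:trap_exit, previous)
    result
  end
end
```

Without the flag, the exit signal from the linked process would take the caller (for example the iex shell) down with it.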


Fault tolerance

Let's concentrate on handling calls to another system which is temporarily failing.

We need an intelligent way to retry failed operations.

  • Code your own retry handling logic?
  • Luke! Use the (open) source!

Options:

^Retry is more recently updated and I'm currently using it on a project, so we'll use it for this example


Using Retry library (safwank/ElixirRetry)

use Retry

retry with: exponential_backoff(1000, 2) |> jitter() |> Enum.take(5) do
  countdown = Process.get(:countdown, 0)
  IO.puts("counter: #{countdown}, #{DateTime.utc_now()}")

  if countdown < 4 do
    Process.put(:countdown, countdown + 1)
    raise "countdown too low - trying again..."
  else
    :ok
  end
after
  result -> result
else
  error -> error
end

^ Ships with various backoff options - exponential, linear - and can also be configured to only handle certain exceptions.
^ Recognises a tuple starting with :error as an error (this can't be overridden, but you can configure it to recognise other atoms as well).
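To show what a delay stream such as `exponential_backoff(1000, 2) |> jitter() |> Enum.take(5)` boils down to, here is a rough standard-library sketch. `BackoffSketch` and its functions are hypothetical names; this mirrors the idea only and is not the Retry library's implementation.

```elixir
defmodule BackoffSketch do
  # Infinite stream of delays in milliseconds:
  # initial, initial * factor, initial * factor^2, ...
  def exponential(initial, factor) do
    Stream.iterate(initial, &(&1 * factor))
  end

  # Replaces each delay with a random integer between 1 and that delay,
  # so that many clients failing together do not retry in lockstep.
  def jitter(delays) do
    Stream.map(delays, fn delay -> :rand.uniform(delay) end)
  end
end

BackoffSketch.exponential(1000, 2) |> Enum.take(5)
# => [1000, 2000, 4000, 8000, 16000]
```

Because the delays are a lazy stream, `Enum.take/2` is what bounds the number of retries.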


Quick shout out to the Elixir macro overlords [^java (™)]

cat retry4j/src/**/*.java | wc -l 
    3178
cat deps/retry/lib/**/*.ex | wc -l 
     464

[^java (™)]: And I'm so grateful not to be coding Java...

^The thing is, implementing something like this in Elixir is very easy


Session storage

"Your session has timed out after 5 minutes of inactivity, please start again and wade through the 20 screens the marketing "people" insisted on adding to the booking flow..." -- Every Java/.Net website ever

^Ever notice how when the session times out - airline websites always set the travel dates to two weeks in the future - don't they know what cookies are for?

We need to talk about session storage

Issues:

^ Every session is stored in memory (webserver or some distributed system).
^ Store the session data in a datastore.


Session storage in Plug/Phoenix



How do we configure session storage in Phoenix/Plug?

endpoint.ex

plug Plug.Session,
  store: :cookie,
  key: "_chat_key",
  signing_salt: "cKjB7sPT",
  max_age: 24 * 60 * 60 * 30  # 30 days

Trivial

^The Plug.Session module has a built-in option to set the expiration of a cookie using the max_age key.
^The session content can also be encrypted.
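Why does a cookie store need no server-side memory? The session travels with the client, and the server only checks a signature. The following is a simplified sketch of that idea using HMAC from Erlang's `:crypto` module (it assumes OTP 22.1 or later for `:crypto.mac/4`). `SignedCookieSketch` and its functions are hypothetical; Plug's real cookie store works differently in detail (key derivation from the salt, constant-time comparison, optional encryption).

```elixir
defmodule SignedCookieSketch do
  # Hypothetical helper: returns "<payload>.<mac>", both URL-safe Base64.
  def sign(payload, secret) when is_binary(payload) and is_binary(secret) do
    Base.url_encode64(payload, padding: false) <> "." <> mac(payload, secret)
  end

  # Returns {:ok, payload} only if the MAC matches, otherwise :error.
  def verify(cookie, secret) when is_binary(cookie) and is_binary(secret) do
    with [encoded, given_mac] <- String.split(cookie, "."),
         {:ok, payload} <- Base.url_decode64(encoded, padding: false),
         # NOTE: a plain == is not constant-time; fine for a sketch only.
         true <- mac(payload, secret) == given_mac do
      {:ok, payload}
    else
      _ -> :error
    end
  end

  defp mac(payload, secret) do
    :crypto.mac(:hmac, :sha256, secret, payload)
    |> Base.url_encode64(padding: false)
  end
end
```

A tampered cookie, or one signed with a different secret, fails verification, so the server can trust the contents without keeping any per-user state.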


Prevent double billing


Unique database constraints


Ecto (Elixir persistence framework)

Basic concepts

  • Migration
  • Changeset
  • Repo
  • Conveniences

^ Schema - where you define the database structure
^ Changeset - represents changes - you feed this into the Ecto.Repo
^ Repo - provides functions for storing and retrieving data


Migration

[.column]

defmodule Airline.Repo.Migrations.CreateFlightBookings do
  use Ecto.Migration

  def change do
    create table(:flight_bookings) do
      add :name, :string
      add :surname, :string
      add :cc_hash, :string
      add :flight_number, :string
      add :minute, :string
      add :hour, :string
      add :day, :string
      add :month, :string
      add :year, :string

      timestamps()
    end
.....

[.column]

    create unique_index(
             :flight_bookings,
             [:name,
              :surname,
              :cc_hash,
              :flight_number,
              :minute,
              :hour,
              :day,
              :month,
              :year ],
             name: :unique_traveller_index)
  end
end

^ we hash the cc to prevent fraud


Changeset

[.column]

defmodule Airline.Flight.Booking do
  use Ecto.Schema
  import Ecto.Changeset

  schema "flight_bookings" do
    field :cc_hash, :string
    field :day, :string
    field :flight_number, :string
    field :hour, :string
    field :minute, :string
    field :month, :string
    field :name, :string
    field :surname, :string
    field :year, :string

    timestamps()
  end

[.column]

  @doc false
  def changeset(booking, %{} = attrs) do
    booking |> cast(attrs,[ 
                    :name,
                    :surname,
                    :cc_hash,
                    :flight_number,
                    :minute,
                    :hour,
                    :day,
                    :month,
                    :year ])
    |> validate_required([ 
                    :name,
                    :surname,
                    :cc_hash,
                    :flight_number,
                    :minute,
                    :hour,
                    :day,
                    :month,
                    :year])
    |> unique_constraint(:unique_booking_constraint,
      name: :unique_traveller_index
    )
  end

end

^ cast - Applies the given params as changes for the given data according to the given set of permitted keys. Returns a changeset.
^ validate required - ensures required values are set.
^ unique constraint - works by relying on the database to check if the unique constraint has been violated or not and, if so, Ecto converts it into a changeset error.
^ naive implementation - indexes are not free - they slow down writes.


Repo

A repository maps to an underlying data store, controlled by the adapter. For example, Ecto ships with a Postgres adapter that stores data into a PostgreSQL database.


Convenience

[.column] Generated most of the prior code with the following command:

[.column]

mix phx.gen.schema \
  Flight.Booking \
  flight_bookings \
  name \
  surname \
  cc_hash \
  flight_number \
  minute \
  hour \
  day \
  month \
  year

Use the Ecto changeset to validate input without touching the database

iex(8)> Airline.Flight.Booking.changeset(%Airline.Flight.Booking{}, %{})                                                            
#Ecto.Changeset<
  action: nil,
  changes: %{},
  errors: [
    name: {"can't be blank", [validation: :required]},
    surname: {"can't be blank", [validation: :required]},
    cc_hash: {"can't be blank", [validation: :required]},
    flight_number: {"can't be blank", [validation: :required]},
    minute: {"can't be blank", [validation: :required]},
    hour: {"can't be blank", [validation: :required]},
    day: {"can't be blank", [validation: :required]},
    month: {"can't be blank", [validation: :required]},
    year: {"can't be blank", [validation: :required]}
  ],
  data: #Airline.Flight.Booking<>,
  valid?: false
>

Generate a valid changeset

cc_num_hash = :crypto.hash(:sha256,"5105105105105100") |> Base.encode64
 

input = %{
  name: "davey",
  surname: "jones",
  cc_hash: cc_num_hash,
  flight_number: "flight_number",
  minute: "minute",
  hour: "hour",
  day: "day",
  month: "month",
  year: "year"
}

valid_changeset = %Ecto.Changeset{valid?: true} = Airline.Flight.Booking.changeset(%Airline.Flight.Booking{}, input)


Insert valid data

iex(7)> Airline.Repo.insert(valid_changeset)                                                                             
[debug] QUERY OK db=3.4ms decode=1.4ms queue=2.2ms idle=9906.6ms
INSERT INTO "flight_bookings" ("cc_hash","day", SNIP...
{:ok,
 %Airline.Flight.Booking{
   __meta__: #Ecto.Schema.Metadata<:loaded, "flight_bookings">,
   cc_hash: "cc_hash",
   day: "day",
   flight_number: "flight_number",
   hour: "hour",
   id: 1,
   inserted_at: ~N[2020-02-28 22:20:54],
   minute: "minute",
   month: "month",
   name: "name",
   surname: "surname",
   updated_at: ~N[2020-02-28 22:20:54],
   year: "year"
 }}

Insert duplicate data

iex(8)> Airline.Repo.insert(valid_changeset)

[debug] QUERY ERROR db=7.4ms queue=1.9ms idle=9324.1ms
INSERT INTO "flight_bookings" ("cc_hash","day", SNIP...
{:error,
 #Ecto.Changeset<
   action: :insert,
   changes: %{
     cc_hash: "cc_hash",
     day: "day",
     flight_number: "flight_number",
     hour: "hour",
     minute: "minute",
     month: "month",
     name: "name",
     surname: "surname",
     year: "year"
   },
   errors: [
     unique_booking_constraint: {"has already been taken", [constraint: :unique, constraint_name: "unique_traveller_index"]}
   ],
   data: #Airline.Flight.Booking<>,
   valid?: false
 >}

^ That was useful - we seem to be relatively safe - but that's a unique index spanning 9 columns - writes are going to get slow
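Calling code usually needs to tell "this booking already exists" apart from ordinary validation failures. A minimal sketch, assuming the error list has the shape shown in the failed insert above; `DuplicateSketch.duplicate?/1` is a hypothetical helper that works on the plain `errors` keyword list, so it needs no Ecto dependency.

```elixir
defmodule DuplicateSketch do
  # Returns true when a changeset-style error list contains a
  # unique-constraint violation, false otherwise.
  def duplicate?(errors) when is_list(errors) do
    Enum.any?(errors, fn {_field, {_message, opts}} ->
      Keyword.get(opts, :constraint) == :unique
    end)
  end
end

errors = [
  unique_booking_constraint:
    {"has already been taken",
     [constraint: :unique, constraint_name: "unique_traveller_index"]}
]

DuplicateSketch.duplicate?(errors)
# => true
```

With a real changeset you would pass `changeset.errors` and, on `true`, skip the payment step instead of charging the card again.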


Let's try something a little more efficient

^ we don't necessarily want random access to all of those columns but we do want to prevent duplicates.
^ we could generate a checksum in the changeset function and make it unique instead.


We add an :entity_hash column to the Booking module

[.column]

defmodule Airline.Flight.Booking do
  use Ecto.Schema
  import Ecto.Changeset

  @required_attrs [:name, :surname, :cc_hash, :entity_hash, :flight_number, :minute, :hour, :day, :month, :year]

  # :entity_hash is derived from the other fields, so it is not hashed itself.
  @hash_attrs @required_attrs -- [:entity_hash]

  schema "flight_bookings" do
    field :cc_hash, :string
    field :entity_hash, :string
    field :day, :string
    field :flight_number, :string
    field :hour, :string
    field :minute, :string
    field :month, :string
    field :name, :string
    field :surname, :string
    field :year, :string

    timestamps()
  end


And modify the changeset function to calculate the hash of the unique fields before we store a booking to the database

  @doc false
  def changeset(booking, %{} = attrs) do
    entity_hash =
      :crypto.hash(:sha256, inspect(Map.to_list(attrs |> Map.take(@hash_attrs))))
      |> Base.encode64()

    augmented_attrs = Map.put(attrs, :entity_hash, entity_hash)

    booking
    |> cast(
      augmented_attrs,
      @required_attrs
    )
    |> validate_required(@required_attrs)
    |> unique_constraint(:unique_booking_constraint, name: :unique_traveller_index)
  end
end
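The hash only prevents duplicates if the same booking always produces the same digest. Here is a standalone sketch of that property using only `:crypto`; `EntityHashSketch` is a hypothetical module, and it walks a fixed key list instead of relying on map ordering, which is a design choice of this sketch rather than what the changeset above does.

```elixir
defmodule EntityHashSketch do
  @hash_attrs [:name, :surname, :cc_hash, :flight_number,
               :minute, :hour, :day, :month, :year]

  # Builds a keyword list in a fixed key order, then hashes its
  # textual representation. Keys outside @hash_attrs are ignored.
  def entity_hash(attrs) when is_map(attrs) do
    canonical =
      @hash_attrs
      |> Enum.map(fn key -> {key, Map.get(attrs, key)} end)
      |> inspect(limit: :infinity)

    :crypto.hash(:sha256, canonical) |> Base.encode64()
  end
end
```

Two maps with the same booking fields give the same hash even if one carries extra keys, while changing any hashed field gives a different one. Note the sketch assumes atom keys, as in the `input` map used earlier.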

The schema/migration now becomes the much more reasonable:

defmodule Airline.Repo.Migrations.CreateFlightBookings do
  use Ecto.Migration

  def change do
    create table(:flight_bookings) do
      add :name, :string
      add :surname, :string
      add :cc_hash, :string
      add :entity_hash, :string
      add :flight_number, :string
      add :minute, :string
      add :hour, :string
      add :day, :string
      add :month, :string
      add :year, :string
      timestamps()
    end

    create unique_index( :flight_bookings, [ :entity_hash ], name: :unique_traveller_index)
  end
end

^ Audience challenge - compare the relative insert performance for a table with a 9-column unique index vs. a single-column one



Can we be even more efficient?

Can we know if a booking has already passed through the system without touching the database?


Bloom filter [^5]

Used as an optimization in many data stores to avoid hitting index files to check if an element exists - some examples:

  • Cassandra
  • Riak

^Used in Riak/Cassandra to check if a file definitely doesn't contain a record.
^A bloom filter can tell if something is definitely not present (has NOT been seen).
^It cannot tell for certain that something has been seen/exists - a positive answer may be a false positive.
^Typically used in Sorted String Table datastores to avoid searching for objects in files.
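To see where the "definitely not present" guarantee comes from, here is a toy Bloom filter in plain Elixir. `BloomSketch` is a hypothetical module with a fixed size and three hash positions derived from `:erlang.phash2/2`; it is far simpler than a scalable filter such as Bloomex and is meant only to illustrate the mechanism.

```elixir
defmodule BloomSketch do
  import Bitwise

  @bits 1024
  @seeds [1, 2, 3]

  # The filter is a single integer used as a bit field; 0 means empty.
  def new, do: 0

  # Sets one bit per seed for the given element.
  def add(filter, element) do
    Enum.reduce(positions(element), filter, fn pos, acc ->
      acc ||| (1 <<< pos)
    end)
  end

  # true  -> possibly seen (all bits set, may be a false positive)
  # false -> definitely never added (at least one bit is unset)
  def member?(filter, element) do
    Enum.all?(positions(element), fn pos ->
      (filter &&& (1 <<< pos)) != 0
    end)
  end

  defp positions(element) do
    Enum.map(@seeds, fn seed -> :erlang.phash2({seed, element}, @bits) end)
  end
end
```

Adding an element only ever sets bits, so an unset bit proves the element was never added. Different elements can share bits, which is why a `true` answer is only "possibly seen".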


Using a bloom filter (gmcabrita/bloomex)

[.column]

defmodule Bloomer do
  use GenServer

  def start_link(_) do
    GenServer.start_link(__MODULE__, nil, name: __MODULE__)
  end

  def add(element) do
    GenServer.cast( __MODULE__, {:add, element})
  end

  def exists(element) do
    GenServer.call( __MODULE__, {:exists, element})
  end

[.column]

  @impl true
  def init(_) do
    {:ok, Bloomex.scalable(1000, 0.1, 0.1, 2) }
  end

  @impl true
  def handle_call({:exists,element} , _from, state) do
    exists = Bloomex.member?(state, element)
    {:reply, exists, state}
  end

  @impl true
  def handle_cast({:add, element}, state) do
    {:noreply, Bloomex.add(state, element) }
  end
end

Add the GenServer to the supervision tree of your application module

defmodule Airline.Application do
  # See https://hexdocs.pm/elixir/Application.html
  # for more information on OTP Applications
  @moduledoc false

  use Application

  def start(_type, _args) do
    # List all child processes to be supervised
    children = [
      Bloomer,
      Airline.Repo


Add the bloom filter into the storage module

changeset = Airline.Flight.Booking.changeset(%Airline.Flight.Booking{}, booking)
if Bloomer.exists {:booking, changeset.changes.entity_hash}  do
  Logger.warn("Possible duplicate booking #{inspect(booking)}")
end
Bloomer.add {:booking, changeset.changes.entity_hash}

^So if something unusual was happening, the logfiles would indicate a problem.
^Slight issue - the bloom filter is local to the node, so if the user is sending work through another node it won't be picked up.
^Single point of failure - so use it as an indicator, and remember: it can only tell you if something has definitely NOT already been seen.


What about the database (or other service) being temporarily unavailable?

^ how can we handle intermittent database failures on the critical path?


Remember Retry?

We can use Retry to retry database inserts


defmodule Bookings do
  use Retry

  alias Airline.Repo
  alias Airline.Flight.Booking

  # Pattern match on the required keys so malformed input fails fast.
  def insert_booking_with_retry(
        %{name: _, surname: _, cc_hash: _, flight_number: _,
          minute: _, hour: _, day: _, month: _, year: _} = booking
      ) do
    # Only connection failures are rescued and retried.
    retry with: exponential_backoff() |> Enum.take(10),
          rescue_only: [DBConnection.ConnectionError] do
      IO.puts("attempting to insert changeset - #{DateTime.utc_now()}")
      changeset = Booking.changeset(%Booking{}, booking)

      case Repo.insert(changeset) do
        # Re-tag invalid changesets so Retry does not treat the
        # {:error, _} tuple as a transient failure and retry it.
        {:error, %{valid?: false} = changeset} -> {:invalid_changeset, changeset}
        other -> other
      end
    after
      result -> result
    else
      error -> error
    end
  end
end
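The re-tagging of `{:error, changeset}` matters because a retry loop cannot tell a transient failure from a permanent one unless the result says so. A hand-rolled sketch of that decision, with no dependencies: `RetrySketch.with_retries/2` is a hypothetical function that retries only on `{:error, _}` results and returns anything else immediately. It is not the Retry library's implementation.

```elixir
defmodule RetrySketch do
  # Calls `fun` until it returns something other than {:error, _} or the
  # list of delays (in milliseconds) is exhausted.
  def with_retries(fun, delays) when is_function(fun, 0) and is_list(delays) do
    case fun.() do
      {:error, _reason} = error ->
        case delays do
          [delay | rest] ->
            Process.sleep(delay)
            with_retries(fun, rest)

          [] ->
            # Out of attempts: hand the last error back to the caller.
            error
        end

      other ->
        # Success, or a permanent failure such as {:invalid_changeset, _}.
        other
    end
  end
end

# A simulated flaky call that fails twice and then succeeds.
flaky = fn ->
  attempt = Process.get(:attempt, 1)
  Process.put(:attempt, attempt + 1)
  if attempt < 3, do: {:error, :unavailable}, else: {:ok, attempt}
end

RetrySketch.with_retries(flaky, [1, 2, 4])
# => {:ok, 3}
```

An invalid booking tagged as `{:invalid_changeset, changeset}` comes straight back on the first attempt, so the system does not spend ten backoff cycles on input that can never succeed.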


What could we improve?

  1. Error detection/notification ✅
  2. Fault tolerance ✅
  3. Session storage ✅
  4. Prevent double billing ✅

Thanks for listening!

Slide (markdown) content can be found at

https://github.com/esl/bryan_cb_sf_2020_talk

Thank you to :

  • Erlang team
  • Elixir team
  • The open source community
  • Erlang Solutions for flying me out to sunny USA

Footnotes

[^1]: Perl, VB, C, C++, PHP, Python, Java, Scala, Javascript, Actionscript, Erlang, Shell, Ansible, Zsh, AWK, Sed, etc, 😴

[^2]: Unlike £

[^3]: HTTP header leakage reveals it's running on Java/JBOSS - so of course we can..

[^4]: ESL product

[^5]: A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set https://en.wikipedia.org/wiki/Bloom_filter
