
Comments (11)

mfridman commented on August 25, 2024

I was looking into riverqueue/river last night for my own use case / general curiosity (having built a similar bespoke thing ages ago), and also happen to maintain the goose library.

This project looks AWESOME! Really well done.

Pretty much found myself here for the same reason, where I wouldn't be using the CLI and instead prefer to call a method or function at application startup.

I copied over the SQL statements verbatim from the second migration and let goose handle the river migrations, but it'd be ideal if this project exposed said functionality so it could be invoked in the same place as my other migrations.

Also, if you do add a migrate.Up(1), migrate.Up(2) or a catch-all migrate.All(), would it be safe for concurrent use?

Migration 2 (SQL)
-- +goose Up
CREATE TYPE river_job_state AS ENUM(
'available',
'cancelled',
'completed',
'discarded',
'retryable',
'running',
'scheduled'
);

CREATE TABLE river_job(
-- 8 bytes
id bigserial PRIMARY KEY,

-- 8 bytes (4 bytes + 2 bytes + 2 bytes)
--
-- `state` is kept near the top of the table for operator convenience -- when
-- looking at jobs with `SELECT *` it'll appear first after ID. The other two
-- fields aren't as important but are kept adjacent to `state` for alignment
-- to get an 8-byte block.
state river_job_state NOT NULL DEFAULT 'available' ::river_job_state,
attempt smallint NOT NULL DEFAULT 0,
max_attempts smallint NOT NULL,

-- 8 bytes each (no alignment needed)
attempted_at timestamptz,
created_at timestamptz NOT NULL DEFAULT NOW(),
finalized_at timestamptz,
scheduled_at timestamptz NOT NULL DEFAULT NOW(),

-- 2 bytes (some wasted padding probably)
priority smallint NOT NULL DEFAULT 1,

-- types stored out-of-band
args jsonb,
attempted_by text[],
errors jsonb[],
kind text NOT NULL,
metadata jsonb NOT NULL DEFAULT '{}' ::jsonb,
queue text NOT NULL DEFAULT 'default' ::text,
tags varchar(255)[],

CONSTRAINT finalized_or_finalized_at_null CHECK ((state IN ('cancelled', 'completed', 'discarded') AND finalized_at IS NOT NULL) OR finalized_at IS NULL),
CONSTRAINT max_attempts_is_positive CHECK (max_attempts > 0),
CONSTRAINT priority_in_range CHECK (priority >= 1 AND priority <= 4),
CONSTRAINT queue_length CHECK (char_length(queue) > 0 AND char_length(queue) < 128),
CONSTRAINT kind_length CHECK (char_length(kind) > 0 AND char_length(kind) < 128)
);

-- We may want to consider adding another property here after `kind` if it seems
-- like it'd be useful for something.
CREATE INDEX river_job_kind ON river_job USING btree(kind);

CREATE INDEX river_job_state_and_finalized_at_index ON river_job USING btree(state, finalized_at) WHERE finalized_at IS NOT NULL;

CREATE INDEX river_job_prioritized_fetching_index ON river_job USING btree(state, queue, priority, scheduled_at, id);

CREATE INDEX river_job_args_index ON river_job USING GIN(args);

CREATE INDEX river_job_metadata_index ON river_job USING GIN(metadata);

-- +goose StatementBegin
CREATE OR REPLACE FUNCTION river_job_notify()
RETURNS TRIGGER
AS $$
DECLARE
payload json;
BEGIN
IF NEW.state = 'available' THEN
  -- Notify will coalesce duplicate notifications within a transaction, so
  -- keep these payloads generalized:
  payload = json_build_object('queue', NEW.queue);
  PERFORM
    pg_notify('river_insert', payload::text);
END IF;
RETURN NULL;
END;
$$
LANGUAGE plpgsql;
-- +goose StatementEnd

CREATE TRIGGER river_notify
AFTER INSERT ON river_job
FOR EACH ROW
EXECUTE PROCEDURE river_job_notify();

CREATE UNLOGGED TABLE river_leader(
-- 8 bytes each (no alignment needed)
elected_at timestamptz NOT NULL,
expires_at timestamptz NOT NULL,

-- types stored out-of-band
leader_id text NOT NULL,
name text PRIMARY KEY,

CONSTRAINT name_length CHECK (char_length(name) > 0 AND char_length(name) < 128),
CONSTRAINT leader_id_length CHECK (char_length(leader_id) > 0 AND char_length(leader_id) < 128)
);

-- +goose Down
DROP TABLE river_job;
DROP FUNCTION river_job_notify;
DROP TYPE river_job_state;

DROP TABLE river_leader;

from river.

bgentry commented on August 25, 2024

I had a feeling this would come up quickly 😅 Is your thought that we could export the migrations as SQL string constants? The issue I'm seeing with that is that Goose Go migrations appear to only work with individual SQL statements, whereas River's migrations consist of multiple statements.

Or maybe I'm misreading that?


bgentry commented on August 25, 2024

Another option is to expose this as a Go API which takes a pgx pool or driver wrapper and on which you can then call something like migrate.Up(1) and migrate.Up(2) to step through the migrations in sequence. This would allow them to still manipulate multiple tables as needed and cleanly encapsulate several statements.
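A rough sketch of what such a stepwise API could look like. Note that the Migrator type, the Up method, and the version numbering here are hypothetical illustrations for this discussion, not River's actual interface; a real implementation would execute the statements against a pgx pool inside a transaction rather than returning them:

```go
package main

import "fmt"

// Migration bundles the several SQL statements a single version needs,
// which is the property a string-constant export can't easily give Goose.
type Migration struct {
	Version    int
	Statements []string
}

// Migrator tracks which versions have been applied. Hypothetical type
// invented for this sketch.
type Migrator struct {
	migrations []Migration // assumed ordered by Version
	current    int         // highest applied version
}

// Up applies all pending migrations up to and including target and
// returns the statements that would be executed. Calling it again with
// the same target is a no-op, which is what makes stepping safe.
func (m *Migrator) Up(target int) []string {
	var applied []string
	for _, mig := range m.migrations {
		if mig.Version > m.current && mig.Version <= target {
			applied = append(applied, mig.Statements...)
			m.current = mig.Version
		}
	}
	return applied
}

func main() {
	m := &Migrator{migrations: []Migration{
		{Version: 1, Statements: []string{"CREATE TABLE river_migration(...)"}},
		{Version: 2, Statements: []string{"CREATE TYPE river_job_state ...", "CREATE TABLE river_job(...)"}},
	}}
	fmt.Println(len(m.Up(2))) // applies both versions: 3 statements
}
```

A migrate.All() catch-all, as suggested below in the thread, would just be Up with the highest known version as the target.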


mradile commented on August 25, 2024

Another option is to expose this as a Go API which takes a pgx pool or driver wrapper and on which you can then call something like migrate.Up(1) and migrate.Up(2) to step through the migrations in sequence. This would allow them to still manipulate multiple tables as needed and cleanly encapsulate several statements.

This is a great idea. Maybe it would also be possible to just run a method like migrate.All() to apply all needed migrations.


giautm commented on August 25, 2024

Hey, did you try ariga/atlas for versioned migration? https://atlasgo.io/versioned/intro


brandur commented on August 25, 2024

This project looks AWESOME! Really well done.

Hah, thanks Michael! And nice to see you here.

I opened #67 which exposes a new rivermigrate package and should get us most of the way towards solving this issue.

That won't be quite enough to get Goose working though, because Goose exposes an sql.DB instead of a pgx conn/pool, but what I think I can do afterward is add an incomplete riversql driver whose only use (for now at least) would be to make migrations work with Goose. That should get us to a resolution for this one.

Also, if you do add a migrate.Up(1), migrate.Up(2) or a catch-all migrate.All(), would it be safe for concurrent use?

Yep, or at least insofar as it's safe to be modifying database DDL concurrently. #67 should already have that covered, and it exposes a MigrateTx function so that migrations can be run inside a transaction that can be rolled back if desired.


brandur commented on August 25, 2024

Alright, we just added a riverdatabasesql driver that lets River be used with an sql.DB/sql.Tx for the purposes of migrations, as with Goose's Go migrations. Docs added here:

https://riverqueue.com/docs/migrations#migrating-river-with-goose

Going to close this issue out as done, but let me know if there are any other limitations that people are finding.


mradile commented on August 25, 2024

You can use SQL files with goose and separate the up/down migrations.
Example:
SQL file:

-- +goose Up
-- +goose StatementBegin

STATEMENT_1;
STATEMENT_2;
STATEMENT_N;

-- +goose StatementEnd

-- +goose Down
-- +goose StatementBegin

DOWN_STATEMENT_1;
DOWN_STATEMENT_2;
DOWN_STATEMENT_N;

-- +goose StatementEnd

Embed the whole directory with all SQL migration files:

//go:embed sql/migrations/*.sql
var EmbedMigrations embed.FS

Run the migrations (SetDialect and Up both return errors worth checking):

var db *sql.DB // your existing database/sql connection
goose.SetBaseFS(EmbedMigrations)
if err := goose.SetDialect("postgres"); err != nil {
	panic(err)
}
if err := goose.Up(db, "sql/migrations"); err != nil {
	panic(err)
}

But to answer your initial question: I think you could pass whole SQL files with multiple statements to tx.Exec().


mfridman commented on August 25, 2024

I opened #67 which exposes a new rivermigrate package and should get us most of the way towards solving this issue.

Nice! Thanks for putting this together.

I could be wrong, but I think it's less about making things work with goose (or any other migration framework), and more about how to make riverqueue/river interoperable (without the CLI) with existing projects that use the pgx driver and database/sql, namely *sql.DB. I suspect as the popularity of riverqueue/river grows, more people will want to use it with existing projects that use database/sql and don't have a *pgxpool.Pool.

It's a nice property that rivermigrate encapsulates and owns all the logic. My only point was that I'd run whatever knobs river exposes alongside my application migrations.

I'm a big fan of pgxpool and the work Jack has been doing; it appears to be the de facto driver nowadays, but there's a bit of a disconnect between the database/sql and pgxpool APIs.

EDIT: maybe it's sufficient for users to go from *sql.DB -> *pgx.Conn (which you can already do today) and use that as the interface to rivermigrate, in addition to or instead of *pgxpool.Pool?

Nvm, I see y'all already covered this in the Limited driver support documentation. rivermigrate is sufficient for my needs. Sorry for the noise.


brandur commented on August 25, 2024

No worries. I'm not actually against putting in a driver for core's sql package (despite what that page says), as long as it's reasonably feasible. But yeah, I suspect that most hardcore Postgres users will be shifting to pgx over time.


brandur commented on August 25, 2024

We released River 0.0.12, which adds a new rivermigrate package providing a Go API that runs migrations. See the docs and examples there.

Goose is still not supported because of the database/sql versus pgx/v5 discrepancy, but I'm working on #98, which will add a minimal driver for database/sql with just enough functionality to support rivermigrate/Goose, though it won't be usable with the full River client.
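To illustrate why a minimal driver is enough for migrations, here's a toy sketch of the idea. The Executor interface, fakeExec type, and migrateUp function are invented for this example; River's real riverdriver interface is considerably larger. The point is that a migrator only needs "execute this SQL", so both a pgxpool.Pool and a database/sql DB can sit behind the same small interface via thin wrappers:

```go
package main

import "fmt"

// Executor is a hypothetical, pared-down driver abstraction: the one
// capability a migrator actually requires.
type Executor interface {
	Exec(sql string) error
}

// fakeExec records statements instead of hitting a database, standing in
// here for either a pgx-backed or database/sql-backed wrapper.
type fakeExec struct{ statements []string }

func (f *fakeExec) Exec(sql string) error {
	f.statements = append(f.statements, sql)
	return nil
}

// migrateUp applies each migration's SQL through whatever driver the
// caller supplies, stopping at the first error.
func migrateUp(db Executor, migrations []string) error {
	for _, stmt := range migrations {
		if err := db.Exec(stmt); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	db := &fakeExec{}
	if err := migrateUp(db, []string{
		"CREATE TYPE river_job_state AS ENUM (...)",
		"CREATE TABLE river_job (...)",
	}); err != nil {
		panic(err)
	}
	fmt.Println(len(db.statements)) // 2 statements recorded
}
```

A full client needs much more from a driver (listen/notify, row scanning, batch fetching), which is why a migration-only driver can stay minimal.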

