canner / vulcan-sql Goto Github PK

View Code? Open in Web Editor NEW

593.0 15.0 24.0 72.28 MB

Data API Framework for AI Agents and Data Apps

Home Page: https://vulcansql.com

License: Apache License 2.0

TypeScript 86.36% JavaScript 1.05% HTML 0.03% Makefile 0.19% Dockerfile 0.04% CSS 0.68% Less 0.07% MDX 11.59%

api-builder data-lake data-warehouse database sql analytics reporting spreadsheet bigquery duckdb

vulcan-sql's Introduction

CannerCMS has pivoted to CannerData - The Data Access Layer for Analytics Data Applications.

Canner now provides solution designs for data teams so that data owners, stewards, and consumers can easily collaborate in one interface. It is an ease-of-use, scalable, secure platform for querying and analyzing information across multiple databases, ﬁle systems, and other storage formats. Unlike current solutions that data is often manually integrated and transferred across different tools and sources, resulting in time-consuming and error-prone processes.

vulcan-sql's People

Contributors

Stargazers

Watchers

vulcan-sql's Issues

Package CLI tools with Docker

What’s the problem you're trying to solve

Most data analysts haven't installed NodeJS in their environment, we can provide some alternative solutions for them.

Describe the solution you’d like

Package our CLI and NPM to a Docker image, then we can use docker run -it vulcan/cli <some-command> directly, e.g. docker run -it -v ${PWD}:/usr/app -p 3000:3000 run -it vulcan/cli init.

Users can use an alias the avoid duplicated commands

alias vulcan="docker run -it -v ${PWD}:/usr/app -p 3000:3000 run -it vulcan/cli"
vulcan init

Additional context

New driver: BigQuery - statement

What’s the problem you're trying to solve

BigQuery is one of the most commonly used warehouses, we should add driver support.

Describe the solution you’d like

Provide BigQuery driver as an individual extension.

Additional context

We only support executing queries and preparing statements now, so the driver can ignore the builder commands (ExecuteOptions.operations)

Parse `cache` config from API Schema YAML

What’s the problem you're trying to solve

In #150, In order to enhance our query performance after users send the API request to run our data endpoint to get the result from the data source. We need to provide a Caching (pre-loading) Layer with the duckDB to enhance query performance.

Describe the solution you’d like

In the part, only focus on parsing the YAML by the above cache format.

# API schema YAML
cache:
  # "order" table keep in duckDB
  - cacheTableName: "order"
      sql: "select * from order",
      # optional
      profile: pg
      # optional
      refreshTime:
        every: "5m"
      # optional
      refreshExpression:
        expression: "MAX(created_at)"
        every: "5m"
      # optional
      indexes:
        'idx_a': 'col_a'
        'idx_b': 'col_b'

  # "product" table  keep in duckDB
  - cacheTableName: "product"
       sql: "select * from product",

If the refreshExpression and refreshTime are both setup, throw the error when the user type vulcan build.

Additional Context

Because we will solve the Caching (pre-loading) Layer with the duckDB by multiple issues, we need to define an interface to interact with the solution of other issues.

Here, make the APISchema like below, we add the cache!: Array<CacheLayerInfo>; in APISchema and define the CacheLayerInfo interface.

// packages/core/src/models/artifact.ts

export class CacheLayerInfo {
  cacheTableName!: string;
  sql!: string;
  refreshTime?: Record<string, string>;
  refreshExpression?: Record<string, string>;
  // the used profile to query the data from data source
  profile!: string;
  // index key name -> index column
  indexes?: Record<string, string>;
}

export class APISchema {
  // graphql operation name
  operationName!: string;
  // restful url path
  urlPath!: string;
  // template, could be name or path
  templateSource!: string;
  @Type(() => RequestSchema)
  request!: Array<RequestSchema>;
  @Type(() => ErrorInfo)
  errors!: Array<ErrorInfo>;
  @Type(() => ResponseProperty)
  response!: Array<ResponseProperty>;
  description?: string;
  // The pagination strategy that do paginate when querying
  // If not set pagination, then API request not provide the field to do it
  pagination?: PaginationSchema;
  sample?: Sample;
  profiles!: Array<string>;
  cache!: Array<CacheLayerInfo>;
}

Otherwise, when parsing the cache table name in YAML, please check whether the metadata which exists cache.vulcan.com key name or not:

if existed, but does not define the cache setting in the schema.yaml or not, throw an error, because it means your SQL will use the cache feature, not YAML defined.
add the schema cache info to RawAPISchema if metadata exists cache.vulcan.com key name.
If the user defines the same cacheTableName in the cache, it will cause our cache (duckDB) could cover the original one table data, so we should check if it throws the error message
Check the cacheTableName has value when getting the schemas from parsed result from YAML.
Check the value format of cacheTableName, because the table column often has a pattern, here suggest you follow the rule: `begin with a letter, not end in an underscore, and should contain only alphanumeric characters ( reference link, It's better if you discover the DuckDB column pattern and check it. )
Check the "sql" has the value
Check the "expression" in the "refreshExpression" field has the value that the user set it.
Get the first profile name from "profiles" field, the reason is that to make sure to use a default profile be the query data source when user set multiple "profiles".
Check the "indexes" has value and format is not key-value format.
Check that each key index name pattern in "indexes", the same suggest rule to follow "begin with a letter, not end in an underscore, and should contain only alphanumeric characters" ( reference link, It's better if you discover the DuckDB column pattern and check it. )

The server returns json format for requests with `.csv` file extension

Environment

dev

What's wrong?

The server always returns data in JSON format even if we send the requests env with .csv. e.g. http://localhost:3000/api/artist/1/works.csv

What's the correct behavior?

It should returns csv format.

Reproducing steps

Start server with playground
curl http://localhost:3000/api/artist/1/works.csv

Bigquery connection failing

I have been trying to connect VulcanSQL with BigQuery but receive an error 'Not found'.

Also, not getting any endpoints in /localhost/doc

The credentials file is working fine as I tested the connection with DBeaver.

profiles, vulcan, sql and yaml files are at the below link.

Files: https://drive.google.com/drive/folders/1g1lyEG32x63hAzrIo2JK8vFAk5GpNC1U?usp=sharing

API Data Compression

What’s the problem you're trying to solve

Support gzip compression to compress the data result ( e.g: JSON, CSV ).

Describe the solution you’d like

Support three parameters - always, enable, disable for setting compression gzip in each API schema to control compression by Accept-Encoding or ending word of API URL

# schema.yaml

gzip: enable

The `enable` mode ( control by request )

The enable means do the Gzip compression feature according to you pass Accept-Encoding: gzip in the header, or indicating the .gz in the URL ending or not, like below:

Do Gzip compression JSON result - /xxxx.json.gz ,
Do Gzip compression from JSON result - /xxxx.json.gz ,
Do Gzip compression from CSV result - /xxxx.csv.gz

The priority is detecting the URL ending word .gz first, then detecting the HTTP header Accept-Encoding: gzip second, below is samples:

If you don't pass Accept-Encoding: gzip, but you add the .gz on the URL ending word, then we will do the Gzip compression.
If you pass Accept-Encoding: gzip, but you do not add the .gz on the URL ending word, then we will do the Gzip compression too.

The `always` mode

The always means always do the Gzip compression, whether you specify the URL ending word or give it by HTTP header Accept-Encoding: gzip.

The `disable` mode

The disable means never do the Gzip compression feature, so even if you pass .gz in the URL ending word or Accept-Encoding: gzip in the header, it still does not do the compression.

Additional context

Documnet error : vulcan need install in global

What's wrong?

What's the correct behavior?

follow document, install command is npm install @vulcan-sql/cli, but it will install in local place.

But usage command is vulcan init, if modules install in local node_modules, command is npx vulcan init or write usage command in package.json.

And this project look like a sub-project control tool. Suggest install command modify to npm install -g @vulcan-sql/cli, it will make usage command right.

Dynamic data masking - Custom String

What’s the problem you're trying to solve

We can generate different queries depending on the user attribute, so it is easy to mask some columns manually like below:

SELECT
  {% if context.user.attr.group == 'admin' %}
    id
  {% else %}
    CONCAT(SUBSTR(id, 0, 4), 'xxxxxx')
  {% endif %} as id,
  age
FROM users;

But manually masking each column might be annoying and unpredictable, we should provide "masking" tag with some masking functions.

Describe the solution you’d like

Inspired by MSSQL DDM, we can add a built-in tag masking with syntax:

{% masking <column-name> <masking-function> %}

We can implement the most commonly used masking function partial first.
For example, assuming that we have a user table:

id	name
AB1234567CD	Ivan
EF2345678GH	Freda

SELECT
  {% masking id partial(2, 'xxxxxxx', 2) %} as id,
  name
FROM user;

We'll get the result below:

id	name
ABxxxxxxxCD	Ivan
EFxxxxxxxGH	Freda

Additional context

Developer document

What’s the problem you're trying to solve

We haven't had any guide for developing extensions.

Describe the solution you’d like

Add a new document series for Vulcan developers:
(TBD, feel free to edit the content below)

Overview
How does Vulcan extension work?
Extensions list (20 extension)
Description of each extension.
Testing
How to test extension using unit test package.
How to do local test.
Publish
Export classes and publish to NPM

Additional context

CLI —verbose support

What’s the problem you're trying to solve

We haven't had a proper way to set log levels and control what components to show (e.g. file path, function name ...etc.)

Describe the solution you’d like

Provide a global flag for CLI --verbose, -v. Set the log level to silly or debug and show file paths and function names, otherwise set the log level to info and hide file paths and function names.

Additional context

CLI error : "--name" not find

What's wrong?

What's the correct behavior?

follow document, this command must vulcan init --name <project-name>, but actually need use vulcan init -p <project-name>.

Data source profiles

What’s the problem you're trying to solve

We need to load data source connection information to execute queries, e.g. host, port, dbname, dbuser, dbpassword ... for Postgres.

Describe the solution you’d like

Users can set profiles property in their project config file (vulcan.yaml), it contains all the profiles' paths, in order of priority, from top to bottom.
```
profiles:
  - some/path/to/profiles.yaml
  - another/path/to/profiles.yaml
```
Two default paths will append at the end automatically: profiles.yaml and ~/.vulcan/profiles.yaml. For example:
```
profiles:
  - some/path/to/profiles.yaml
  - another/path/to/profiles.yaml
```
We'll try to find profiles from some/path/to/profiles.yaml first, then another/path/to/profiles.yaml, profiles.yaml, and ~/.vulcan/profiles.yaml
Profiles are duplicated by their names.

This is an example of a profile file:

- name: pg-admin # required. the name of this profile
   type: pg # required. which driver we should use
   connection: # required. connection info, which depends on drivers
      host: example.com
      username: vulcan
      password: xxxx
- name: duck
  type: duck
  connection:
      path: ./path/to/db

We'll need to set which profile we want to use at endpoints' schemas:
```
urlPath: /users
profile: pg-admin
```

Additional context

Switching profiles by environment (production, dev ...) might be implemented in the future.
We'll implement authorization feature soon, we'll set multiple profiles for one endpoint at that time.

Production build with binary files

What’s the problem you're trying to solve

We should provide a single executable file to serve Vulcan server, with no dependencies or docker env required,

Describe the solution you’d like

Implement commands like vulcan build --output binary --platform x86 to built executable files.

Additional context

Parameterize the output of custom tags

What’s the problem you're trying to solve

We've parameterized all the input data, but if the queries were generated by custom tags (Tag Extensions), we are not able to handle them.

Extension

createTagExtension('test', async ({ args, sql }) =>
    `${args['arg']} ${sql}`
);

Template

{% test arg=context.params.id %}
{{ context.params.id }}
{% endtest %}

Result (with id = 'some injection')

some injection $1

Describe the solution you’d like

TBD

Additional context

We'd parameterize the output of filters, so custom filters are not affected.

Improper parameterizing of variables without values

Environment

Version:

cli version: 0.2.1
core version: 0.2.1
build version: 0.2.1
serve version: 0.2.1

What's wrong?

VulcanSQL tried to parameterize the variables without values, e.g.

{% set someArr = [] %}
{{ someArr.push(1) }}

select {{ someArr[0] }} as a;

Generated query:

$1  --- This should be removed
select $1 as a;

What's the correct behavior?

We should ignore the variables without values:

select $1 as a;

Workaround

We can use expressions instead of variables to mitigate this issue.

{% set someArr = [] %}
{% set temp =  someArr.push(1) %}

select {{ someArr[0] }} as a;

Sync with catalog server APIs

What’s the problem you're trying to solve

Catalog server APIs might need some additional information about our endpoints that are not built into artifacts, e.g. request constraints. We should add the missing metadata.

Describe the solution you’d like

Serialize/Deserialize request constraints to artifacts. #119

Additional context

Issue / PR templates

What’s the problem you're trying to solve

We should have issue / PR templates for the community.

Describe the solution you’d like

Create PR template.
Create issue: feature request template.
Create issue: bug report template.

Additional context

Update the documentation

What’s the problem you're trying to solve

With the increase of users and version updates, the current documentation is no longer sufficient to meet the user's needs and requires an update.

Describe the solution you’d like

The set-sampler section does not mention that incomplete parameter settings can cause the server to fail during the startup phase, for example, if we would like to check data from context.req.xxx like below:

-- We use the req in sql file
select  
*,  
{{ context.req.method }} as "req-method",
{{ context.req.url }} as "req-url"
from "artists" where 
ConstituentID = {{ context.params.id }}
and
{{ context.req.method }} = 'GET'

# API Schema
urlPath: /artist/:id
request:
  - fieldName: id
    fieldIn: path
    description: constituent id
    validators:
      - required
sample:
  req:
    method: GET
    url: 'localhost:3000'
  parameters:
    id: '1'
  profile: duck
profile: duck

Then we should also set the req in the sample.

Otherwise the install-vulcansql-cli section has different instructions for Windows users, etc.

Syntax sugars for parameter validators

What’s the problem you're trying to solve

We need to define our parameters in YAML file when we want to add validators or change parameter sources (path, query, or header). But it might be a hassle for data engineers to define parameters in another file.

Describe the solution you’d like

Provide some syntax sugars in SQL:

Some possible solutions:

Functions

select * from user limit {{ context.params.limit.number({max: 10, min: 0}).required() }}

Filters

select * from user limit {{ context.params.limit | number(max=10, min=0) | required }}

We can add a prefix to validators toprevent collision.

select * from user limit {{ context.params.limit | validate.number(max=10, min=0) | validate.required }}

Additional context

Export the interfaces of Nunjucks

What’s the problem you're trying to solve

We've defined lots of interfaces for nunjucks here, but it was not exported, so extension developer won't able to use it.

Describe the solution you’d like

Re-export Nunjucks with correct interfaces or publish the declaration files.

Additional context

Passing route information to template engine

What’s the problem you're trying to solve

The route information like host, path ...etc. are useful while rendering SQL queries.

Describe the solution you’d like

Pass host name and url path into built-in context.
For example: https://example.com/v1/users
{{ context.host }} -> example.com
{{ context.path }} -> /v1/users
or
{{ context.req.host }} ...

Additional context

Update our official documentation

What’s the problem you're trying to solve

We need clear documentation on our official website before the first release.

Describe the solution you’d like

Additional context

Rename repo from vulcan to vulcan-sql

What’s the problem you're trying to solve

Our project name is confused with other projects.

Describe the solution you’d like

Rename the repository from vulcan to vulcan-sql. Github will automatically redirect the URLs to the new one. Following services might be impacted:

Circle CI
Codecov
Badges (readme)

Additional context

New driver: Postgres - statement

What’s the problem you're trying to solve

Postgres is one of the most commonly used warehouses, we should add driver support.

Describe the solution you’d like

Provide Postgres driver as an individual extension.

Additional context

We only support executing queries and preparing statements now, so the driver can ignore the builder commands (ExecuteOptions.operations)

Production build with Node and Docker

What’s the problem you're trying to solve

We need a production build that only includes necessary context, e.g. artifacts, production dependency, serve package ...etc.

Describe the solution you’d like

Provide package command to do production builds, and use --output flag to indicate the output type. We can implement node and docker output first:

Node:
Build: vulcan package --output node
Execute: node ./dist/index.js
Docker
Build: vulcan package --output node && docker build ./dist
Execute docker run <image

Additional context

We should prepare the resources e.g. dockerfile and all the contents then print the "docker build" commands to users instead of calling docker API directly. Because we might not able to call docker API from Vulcan CLI, e.g. using CLI in docker #49

Update CLI initialized files to align the new design

What’s the problem you're trying to solve

The files generated by CLI haven't been updated for the new design, we can't start the server with them.

Describe the solution you’d like

Update these SQL files, schemas, and config.

Additional context

Improve the stabilities of PostgreSQL and DuckDB drivers

What’s the problem you're trying to solve

We found some issues with PostgreSQL and DuckDB drivers, we should fix them.

Describe the solution you’d like

Handle all events emitted from node-postgres (#109 have handled some events)
Bump DuckDB to stable version.

Additional context

Authorization

What’s the problem you're trying to solve

After #71, we have profiles for data sources, we want to do access control for them. For example, admin users can access pg-admin profile, but non-admin users can only access pg-no-admin profile.

Describe the solution you’d like

We can implement a sample attribute-based access control (ABAC) for our profiles.

For each profile, we can set the allow property to indicate what attributes users should have.
For example, only the user with name admin can access this profile:
```
- name: 'pg-admin'
  driver: 'pg'
  connection: xx
  allow:
    - name: admin
```

Attribute filters are also supported. For example, user should have an attribute group and its value should be admin:

- name: 'pg-admin'
  driver: 'pg'
  connection: xx
  allow:
    - attribute:
          name: group
          value: admin

Wildcards are supported.

- name: 'pg-admin'
  driver: 'pg'
  connection: xx
  allow:
    - name: admin-*

Single allow condition are combined with AND logic
Only the user who has name admin and has group attribute with value admin can access this profile.

- name: 'pg-admin'
 driver: 'pg'
 connection: xx
 allow:
   - name: admin
     attribute:
         name: group
         value: admin

Multiple allow conditions are combined with OR logic.
admin, someoneelse, and those who have group attribute with value admin can access this profile.

- name: 'pg-admin'
 driver: 'pg'
 connection: xx
 allow:
   - name: admin
   - name: someoneelse
   - attribute:
         name: group
         value: admin

We can specify multiple profiles on a single schema now, from top to bottom, users use the first qualified profile. If users can't use any of them, 403 error should be thrown.

Additional context

Provide the information about pagination in API responses

What’s the problem you're trying to solve

We've implemented offset-based pagination #104, but our API response didn't give extra information about it, e.g. total count.

Describe the solution you’d like

We can wrap our result to "results" property, and add extra information about pagination like below:

{
    "_links": {
        "next": "/api/users?limit=20&offset=20",
    },
    "offset": 0
    "limit": 20,
    "results": [
        // Data gose here
    ],
    "total_count": 100,
}

Additional context

Implement offset and limit of data sources

What’s the problem you're trying to solve

Catalog UI, which is our incoming feature, needs to preview the result of each endpoint, it needs a simple pagination implementation for doing so.

Describe the solution you’d like

We can implement .limit() and .offset() of query builder. Query builder has the following operation:

export interface SQLClauseOperation {
  // record select clause operations, null means select *
  select: SelectClauseOperation | null;
  // record where clause operations
  where: Array<WhereClauseOperation>;
  // record join clause operations => ok
  join: Array<JoinClauseOperation>;
  // record groupBy clause operations, array is column name
  groupBy: GroupByClauseOperations;
  // recode having clause operations
  having: Array<HavingClauseOperation>;
  // record orderBy operations
  orderBy: Array<OrderByClauseOperation>;
  // null means not set the value
  limit: number | null;
  // null means not set the value
  offset: number | null;
}

For a temporary solution, we can implement limit and offset only,
In DataSource:

  public async execute({
    statement: sql,
    operations,
  }: ExecuteOptions): Promise<DataResult> {
    sql = `select * from (${sql}) as q limit ${limit} offset ${offset}`  // Should handle sql injection
    ....

Additional context

It's a quick implementation for incoming features, it should be improved in the future.

Missing start square bracket for empty data

Environment

dev

What's wrong?

Server only prints end square bracket when there is no data.

What's the correct behavior?

It should be [].

Reproducing steps

Start server with playground
curl http://localhost:3000/api/artist/12345678 (with invalid id)

Demo environment online

What’s the problem you're trying to solve

We have to install Vulcan CLI and download the required dependencies if we want to try Vulcan, it might be a barry that prevents new users to try Vulcan.

Describe the solution you’d like

Provide an online environment like Codesandbox that uses in-memory databases, as a playground for new users.

Additional context

Snowflake driver prints logs with console

Environment

Version:

cli version: 0.3.0
core version: 0.3.0
build version: 0.3.0
serve version: 0.3.0

What's wrong?

Snowflake driver prints logs with console:

2022-10-28 03:26:55.957  INFO  [CLI] Starting server... 
2022-10-28 03:26:56.166  DEBUG [SERVE] Initializing data source: mock 
2022-10-28 03:26:56.166  DEBUG [SERVE] Data source mock initialized
2022-10-28 03:26:56.167  DEBUG [SERVE] Initializing data source: snowflake
2022-10-28 03:26:56.169  DEBUG [CORE] Initializing profile: snow using snowflake driver
Unable to connect: Incorrect username or password was specified.
Unable to connect: Incorrect username or password was specified.
Unable to connect: Incorrect username or password was specified

What's the correct behavior?

It should print logs through our logger (tslog).

Additional context

Snowflake driver used winston as their logger and not allow us to modify it.

https://github.com/snowflakedb/snowflake-connector-nodejs/blob/master/lib/snowflake.js#L23

High level APIs to create tags and filters

What’s the problem you're trying to solve

The components: TagRunner, TagBuilder, FIlterRunner, and FilterBuilder are designed for low-level operations, that is, accessing the compiler's parser directly. It's difficult for most users, we should wrap this API and provide more delightful components.

Describe the solution you’d like

Wrap parser API with high-level components.

Examples:

import { createTagExtension, FunctionalTag } from '@vulcan-sql/core';

const MaskTag: FunctionalTag = async ({ sql, args }) => {
  return `CONCAT(SUBSTR(${sql}, 0, ${args['len'] + 1}), 'xxxx')`;
};

export default createTagExtension('mask', MaskTag);

select 
{% mask len=5 %} DisplayName {% endmask %} as name
from "artists"
limit 10

The name of this kind of extensions:

Functional Extension (FE) FunctionalTag, FunctionalFilter
~~Higher-Order Extension (HOE) TagHOE, FilterHOE~~

Additional context

New driver: DuckDB - statement

What’s the problem you're trying to solve

DuckDB is an in-process OLAP database management system that can be used in experimental/demo environments.

Describe the solution you’d like

Provide duckdb driver as an individual extension.

Additional context

We only support executing queries and preparing statements now, so the driver can ignore the builder commands (ExecuteOptions.operations)

Support SQL syntax of `{% cache %} ... {% endcache %}` tag

What’s the problem you're trying to solve

Describe the solution you’d like

Define the tag scope of {% cache %} ... {% endcache %} to parse the scope of {% cache %} ... {% endcache %} tag and send the SQL to duckDB.

-- The "cache" scope means the SQL statement will send the query to duckDB cache data source
{% cache %}
select * from order
  where type = {{ context.params.type }}
  and where exists (
    select * from product  where price >= {{ context.params.price }}
    and order.product_id = product.id
  )
{% endcache %}

Additional Context

In the build time, define a CacheTagBuilder to parse {% cache %} … {% endcache %} to the AST tree, set the cache metadata name cache.vulcan.com to make schema parser #151 to work, then keep to the artifact.

Vercel sponsor program

What’s the problem you're trying to solve

We hosted our document on Vercel with a trial Pro plan, we need to get sponsorship from them to continue using their services.

Describe the solution you’d like

Follow this requirement:
https://vercel.com/guides/can-vercel-sponsor-my-open-source-project

Additional context

We need to ensure Canner policy allows us to do this. cc @wwwy3y3

Get response from Vercel team.

Fix format error when url is not found

Environment

dev

What's wrong?

Vulcan server throws format errors when the URL is not found.

Error: Url ending format not matched in "formats" options

What's the correct behavior?

It should do nothing with URLs that are not found.

Reproducing steps

Start server.
curl localhost:3000/some/uirl

Catch all errors to prevent server crashes

What’s the problem you're trying to solve

When some uncatched errors are thrown from downstream of Vulcan server, e.g. data source, Vulcan sometimes crashes and exits.

Describe the solution you’d like

We should catch all errors and keep the server listening.

Additional context

PostgreSQL data source throws uncached errors when databases disconnected, this issue will be fixed in the future
.

[Feature Request] ClickHouse support

What’s the problem you're trying to solve

ClickHouse blazing fast analytics database
I would like to support easy analytics API creation with your great project

Describe the solution you’d like

https://clickhouse.com/docs/en/integrations/language-clients/nodejs

CI for nightly builds

What’s the problem you're trying to solve

Before every release, we need some published packages to do some manual tests, but we only publish our packages while releasing.

Describe the solution you’d like

Build the updated packages (use nx affected) at 00:00 CST every day with pre-release tags in dev channel. e.g. @vulcan-sql/core:0.1.0-dev.20220907

Additional context

CLI —watch support

What’s the problem you're trying to solve

We need to restart the server manually after updating SQL or schemas now, it'll be better if Vulcan can watch these files and reload services automatically.

Describe the solution you’d like

Provide a flag --watch, -w for start command, which lets the server watch SQL and schemas changes, and restart or reload the server when needed.

Additional context

Server error handling for restful APIs

What’s the problem you're trying to solve

We haven't had a proper way to handle errors for restful API, that is, we'll always throw "500 internal server error" for any errors.

Describe the solution you’d like

Catch these errors and set the proper HTTP status code. We can wrap our errors when they occurred and set the desired HTTP code there.

Additional context

Support GraphQL API

What’s the problem you're trying to solve

Provide new API style - GraphQL, the query language API for making users more choices.

Describe the solution you’d like

Support GraphQL API type, you could activate it by appending the GRAPHQL in types of project config:

# vulcan.yaml

types:
 - RESTFUL
 - GRAPHQL

Additional context

Get Internal Error when not setting "formats" field in "reponse-format" options, and send API request not specify the format in the URL

Environment

Version:
- cli version: 0.2.1
- core version: 0.2.1
- build version: 0.2.1
- serve version: 0.2.1
OS: macOS

What's wrong?

When not setting the formats field in reponse-format options of the vulcan.yaml, the API request will throw an internal error when API URL does not contain any format at the end of the URL.

What's the correct behavior?

It should get the default format.

Reproducing steps

Don't set the formats field in the reponse-format options of the vulcan.yaml.

# vulcan.yaml
response-format:
  enabled: true
  options:
    default: json

Use a browser to send HTTP GET requests which not has format in the end of the URL
e.g: http://localhost:3000/api/orders
Got the Internal Error and shows an error message in the terminal:

Error Response:

{
"message": "An internal error occurred",
"requestId": "d8d5d068-ef49-4e5b-bc1f-92d97a9308fb"
}

Termal Shows Error:
TypeError Cannot read properties of undefined (reading 'formatToResponse')

Wrong types of TagRunnerOptions.args

Environment

Version:

cli version: 0.3.0
core version: 0.2.1
build version: 0.2.1
serve version: 0.2.1

What's wrong?

The args property of TagRunnerOptions was constianted to be string, number, or boolean.

export declare type TagExtensionArgTypes = string | number | boolean;
export interface TagRunnerOptions {
    context: nunjucks.Context;
    args: TagExtensionArgTypes[];
    contentArgs: TagExtensionContentArgGetter[];
    metadata: NunjucksExecutionMetadata;
}

But it can be object if inputs are keyword arguments.

{% tagname foo="bar" %}

What's the correct behavior?

It should be allowed to be Record too.

Reproducing steps

CLI error : "--watch" not find

What's wrong?

What's the correct behavior?

follow document, this command must vulcan start --watch.

Parse context.params instead of params for parameter metadata

Environment

dev

What's wrong?

When we add parameters in templates but not in schemas, the document doesn't get these parameters.

What's the correct behavior?

It should get these parameters with default validators.

Reproducing steps

Add some parameters like {{ context.params.id }} but not in schema definitions.
Build Vulcan and check the document.

Epic Request: Caching (pre-loading) Layer with the duckDB

What’s the problem you're trying to solve

In order to enhance our query performance after users send the API request to run our data endpoint to get the result from the data source. We need to provide a Caching (pre-loading) Layer with the duckDB to enhance query performance.

Describe the solution you’d like

Define the cache layer in YAML by cache field, and define each cacheTableName and its config. For example we define an sql select * from order query the result from data source pg and keep to duckDB by order table name.

Otherwise, the refreshTime and refreshExpression means when should we need to refresh the query and keep the new result in the cache table in duckDB. The index is a duckDB indexes feature to index like the original database and make query efficient.

# API schema YAML
cache:
  # "order" table keep in duckDB
  - cacheTableName: "order"
      sql: "select * from order",
      # optional
      profile: pg
      # optional
      refreshTime:
        every: "5m"	
      # optional
      refreshExpression:
        expression: "MAX(created_at)"
        every: "5m"
      # optional
      indexes:
        'idx_a': 'col_a'
        'idx_b': 'col_b'

  # "product" table  keep in duckDB
  - cacheTableName: "product"
       sql: "select * from product",

# used data source
profiles: pg

Here is our SQL file, which makes our query send to duckDB to get results. In the scop of {% cache %} ... {% endcache %} , when we send the query to duckDB, we will search the each table keep in duckDB or not, so if your YAML not defined the same cache table name, it will make the query failed.

-- The "cache" scope means the SQL statement will send the query to duckDB cache data source
{% cache %}
select * from order
  where type = {{ context.params.type }}
  and where exists (
    select * from product  where price >= {{ context.params.price }}
    and order.product_id = product.id
  )
{% endcache %}

Breakdown

The #157 not contains in the epic, we will arrange to develop it in the future.

New driver: Snowflake - statement

What’s the problem you're trying to solve

Snowflake is one of the most commonly used warehouses, we should add driver support.

Describe the solution you’d like

Provide Snowflake driver as an individual extension.

Additional context

We only support executing queries and preparing statements now, so the driver can ignore the builder commands (ExecuteOptions.operations)

canner / vulcan-sql Goto Github PK

vulcan-sql's Introduction

vulcan-sql's People

Contributors

Stargazers

Watchers

Forkers

vulcan-sql's Issues

What’s the problem you're trying to solve

Describe the solution you’d like

Additional context

What’s the problem you're trying to solve

Describe the solution you’d like

Additional context

What’s the problem you're trying to solve

Describe the solution you’d like

Additional Context

Environment

What's wrong?

What's the correct behavior?

Reproducing steps

What’s the problem you're trying to solve

Describe the solution you’d like

The enable mode ( control by request )

The always mode

The disable mode

Additional context

What's wrong?

What's the correct behavior?

What’s the problem you're trying to solve

Describe the solution you’d like

Additional context

What’s the problem you're trying to solve

Describe the solution you’d like

Additional context

What’s the problem you're trying to solve

Describe the solution you’d like

Additional context

What's wrong?

What's the correct behavior?

What’s the problem you're trying to solve

Describe the solution you’d like

Additional context

What’s the problem you're trying to solve

Describe the solution you’d like

Additional context

What’s the problem you're trying to solve

Describe the solution you’d like

Additional context

Environment

What's wrong?

What's the correct behavior?

Workaround

What’s the problem you're trying to solve

Describe the solution you’d like

Additional context

What’s the problem you're trying to solve

Describe the solution you’d like

Additional context

What’s the problem you're trying to solve

Describe the solution you’d like

What’s the problem you're trying to solve

Describe the solution you’d like

Additional context

What’s the problem you're trying to solve

Describe the solution you’d like

Additional context

What’s the problem you're trying to solve

Describe the solution you’d like

Additional context

What’s the problem you're trying to solve

Describe the solution you’d like

Additional context

What’s the problem you're trying to solve

Describe the solution you’d like

Additional context

What’s the problem you're trying to solve

Describe the solution you’d like

Additional context

What’s the problem you're trying to solve

The `enable` mode ( control by request )

The `always` mode

The `disable` mode