tangata_local's Introduction

tāngata_local

“Hutia te rito o te harakeke
Kei whea to kōmako e kō?
Ki mai ki ahau
He aha te mea nui o te Ao?
Maku e kī atu,
he tāngata, he tāngata, he tāngata...”

If the heart of the harakeke (flax plant) was removed,
where would the bellbird sing?
If I was asked what was the most important thing in the world
I would be compelled to reply,
it is people, it is people, it is people.
Ngaroto

In te ao Māori (the Māori world view), Tāngata (TAHNG-uh-tuh) describes something much larger than an addressed group of people: it describes whakapapa, the surrounding network of ancestors and descendants we are connected to.
With this work we intend to follow these principles to put our people first: not just the data & analytics engineers, but those around our workplaces that know the deep details of how our businesses actually run.
These people are the lifeblood of what we do - and to keep moving forward, we need their context far more than ever.

Current Functionality

Tāngata is an editable Data Catalog, describing a dbt_ repository.
It interfaces with dbt_ itself, git, and other sources to compile metadata in one place; and allows a non-technical user to understand what's been built, and contribute metadata to the sources & models within.
With descriptive metadata, edit history, lineage, and SQL code all available in one place, this should become the default search engine for an organisation's data users. Specific attention has been paid to runtime speed, so it should be an enjoyable place to work regardless of a user's technical background.

The complexity of Git and engineering practice in general can be difficult to approach. With Tāngata, this sits behind the scenes - giving comfort to those who are important, while maintaining a strong, secure foundation for our most critical metadata.

Future Functionality

This project started as a graphical SQL interface - and still contains some of the pieces behind the scenes.
While the pivot to an editable catalog has taken over for some time, the dream is still bigger: what if a non-technical user could design a straightforward dbt_ model using just drag and drop?
This approach may take some time, but it is not out of reach - SQL is structured by definition, it just takes the right interface.

Installation Instructions

pip install tangata

To Serve Tāngata:

  • Navigate to your dbt project folder
  • Run ./tangata [--skipcompile]
  • Tangata will be served at http://localhost:8080

Where's the React code?

This project was initially created as an npm-backed React app. The JavaScript/React code for the front end is now located in https://github.com/ciejer/tangata_client

I have an idea!

Please log all feedback in the GitHub Issues - your feedback is crucial to make this a useful tool for the community.

Change Log

0.2.0

  • Added config option to sort .yml files alphabetically
  • Fixed model ordering issue in db tree view

0.1.19

  • Added left-click context menus to tests & promotions, with tooltips
  • Added Promoted Models to catalog landing page.

0.1.18

  • Fixed issue #45: initial run was breaking on pip upgrade, where new config options were not found in tangata_config.json
  • Resolved #49: now has config for +tags in dbt_project.yml. Behaviour respects existing tags where they exist, but all new keys will use configured choice.
  • Resolved #48: now uses preserve_quotes for all ruamel.yaml calls.

tangata_local's People

Contributors

dougscc

Forkers

davelindley

tangata_local's Issues

Add config options for "promoted" and "hidden" tags

In the absence of dbt-labs/dbt-core#1671 being completed, dbt is missing the ability to promote / discourage models in docs.
In the Config page, add a place to designate special tags that promote the use of some models, while clearly marking others as "not for use".
These tags should be hidden from the master tags list, but stored in dbt_project.yml as any other would be.

This will open up some possibilities for #15 with a "Promoted Models" list on the landing page.

Add page routing

Refresh, back button, links to page, and browser tabs should all cleanly send a user to the same page.

Error when clicking on "Refresh dbt_ catalog"

Hi,

When I click on "Refresh dbt_ catalog", the following error appears in the console.
FYI, I am running tangata in WSL, in a conda virtual env where I have both dbt and tangata installed.

reloading dbt_...
sh: 1: cd: can't cd to . 
complete
512
dbt_ update failed.
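
The `512` in the log above is consistent with the wait status that `os.system` returns on POSIX when a shell command exits with code 2, although the source does not show how Tāngata actually invokes dbt. As a minimal sketch under that assumption, one way to avoid a shell `cd` altogether is to pass the folder through `subprocess.run(..., cwd=...)`. The helper name `run_in_project` is hypothetical and not part of tangata.

```python
import subprocess
import sys
from pathlib import Path


def run_in_project(command, project_dir):
    """Run `command` (a list of arguments) inside `project_dir` without a shell `cd`.

    Hypothetical helper for illustration; not tangata's actual code.
    Returns the process exit code (0 on success).
    """
    # Strip stray whitespace and resolve to an absolute path before use.
    folder = Path(str(project_dir).strip()).resolve()
    if not folder.is_dir():
        raise FileNotFoundError(f"dbt project folder not found: {folder}")
    completed = subprocess.run(command, cwd=folder)
    return completed.returncode


if __name__ == "__main__":
    # Stand-in command; a real call might pass ["dbt", "compile"] instead.
    print(run_in_project([sys.executable, "-c", "print('ok')"], ". "))
```

Passing the folder as `cwd` keeps the path out of the shell string, so trailing spaces or quoting in the configured path cannot change how the command is parsed.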

Code Change History is empty

When I select models (even ones that have been changed many times), there is no information under "Code Change History". There is no error showing up in the logs though.

Error when changing the description of a column

Hi. I have been testing 0.1.10 a bit and get an error whenever I try to update the description of a column:

[2021-06-18 09:10:55,507] ERROR in app: Exception on /api/v1/update_metadata [POST]
Traceback (most recent call last):
  File "/home/xxx/envs/dbt19/lib/python3.8/site-packages/flask/app.py", line 2070, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/xxx/envs/dbt19/lib/python3.8/site-packages/flask/app.py", line 1515, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/xxx/envs/dbt19/lib/python3.8/site-packages/flask/app.py", line 1513, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/xxx/envs/dbt19/lib/python3.8/site-packages/flask/app.py", line 1499, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/home/xxx/envs/dbt19/lib/python3.8/site-packages/tangata/tangata.py", line 55, in update_metadata
    return tangata_api.update_metadata(request.json)
  File "/home/xxx/envs/dbt19/lib/python3.8/site-packages/tangata/tangata_api.py", line 354, in update_metadata
    currentSchemaYMLModelColumn = list(filter(lambda d: d['name'] == jsonBody['column'], currentSchemaYMLModel['columns']))[0]
  File "/home/xxx/envs/dbt19/lib/python3.8/site-packages/tangata/tangata_api.py", line 354, in <lambda>
    currentSchemaYMLModelColumn = list(filter(lambda d: d['name'] == jsonBody['column'], currentSchemaYMLModel['columns']))[0]
TypeError: string indices must be integers
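
The traceback shows `d['name']` being evaluated on a string, which would happen if iterating `columns` yields strings (for example a mapping, or a list of bare column names) rather than a list of mappings. The actual shape of the reporter's schema file is not shown, so the following is only a sketch of a defensive lookup. `find_column` is a hypothetical helper, not tangata's code.

```python
def find_column(model, column_name):
    """Return the column mapping named `column_name` from a schema model entry, or None.

    Hypothetical helper; skips entries that are not mappings instead of indexing them.
    """
    columns = model.get("columns") or []
    for column in columns:
        # Iterating a dict (or a list of bare strings) yields str items, and
        # indexing a str with "name" raises the TypeError seen above.
        if isinstance(column, dict) and column.get("name") == column_name:
            return column
    return None


well_formed = {"name": "orders", "columns": [{"name": "id", "description": "key"}]}
malformed = {"name": "orders", "columns": ["id", "status"]}

print(find_column(well_formed, "id"))
print(find_column(malformed, "id"))
```

Returning `None` lets the caller decide whether to create the column entry or report a readable error, instead of failing with an index error.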

v0.1.16 - KeyError: 'promotion_tag'

Hi.

Just got this issue when running 0.1.16

Job "tangata.<locals>.run_first_load (trigger: date[2021-06-27 21:29:07 AEST], next run at: 2021-06-27 21:29:07 AEST)" raised an exception
Traceback (most recent call last):
  File "/home/xxx/envs/dbt19/lib/python3.8/site-packages/apscheduler/executors/base.py", line 125, in run_job
    retval = job.func(*job.args, **job.kwargs)
  File "/home/xxx/envs/dbt19/lib/python3.8/site-packages/tangata/tangata.py", line 108, in run_first_load
    tangata_api.reload_dbt(sendToast)
  File "/home/xxx/envs/dbt19/lib/python3.8/site-packages/tangata/tangata_api.py", line 446, in reload_dbt
    refreshMetadata(sendToast)
  File "/home/xxx/envs/dbt19/lib/python3.8/site-packages/tangata/tangata_api.py", line 66, in refreshMetadata
    assemblingFullCatalog = tangata_catalog_compile.compileCatalogNodes()
  File "/home/xxx/envs/dbt19/lib/python3.8/site-packages/tangata/tangata_catalog_compile.py", line 89, in compileCatalogNodes
    tempCatalogNodes[key] = populateFullCatalogNode(catalog['nodes'][key], "node", catalog, manifest)
  File "/home/xxx/envs/dbt19/lib/python3.8/site-packages/tangata/tangata_catalog_compile.py", line 31, in populateFullCatalogNode
    if len(tangataConfig["promotion_tag"]) > 0 and tangataConfig["promotion_tag"] in manifestNode['tags']:
KeyError: 'promotion_tag'
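
The `KeyError` above arises when a config written by an older version lacks a newer key. A minimal sketch of one common remedy is to merge the stored config over a set of defaults before reading it. Only the key name `promotion_tag` comes from the traceback; the default value, `load_config`, and `is_promoted` are assumptions for illustration.

```python
import json

# Hypothetical defaults; only the key name "promotion_tag" is taken from the traceback.
DEFAULT_CONFIG = {"promotion_tag": ""}


def load_config(raw_json, defaults=DEFAULT_CONFIG):
    """Parse a config JSON string and fill in any keys missing from older files."""
    stored = json.loads(raw_json) if raw_json.strip() else {}
    # Defaults first, then stored values, so user settings win where present.
    return {**defaults, **stored}


def is_promoted(config, tags):
    """True when a non-empty promotion tag is configured and present in `tags`."""
    tag = config.get("promotion_tag", "")
    return len(tag) > 0 and tag in tags


old_config = load_config('{"some_older_option": true}')
print(is_promoted(old_config, ["finance"]))
```

With this shape, a config file that predates an option behaves as if the option were set to its default, rather than raising at first load.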

Add tags to search

Allow multiple search terms, phrases within quotes, name: "modelname" etc
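
As a rough illustration of the syntax requested above, the sketch below uses the standard library's `shlex.split` to keep quoted phrases together and treats `field:value` or `field: "value"` as a filter. The filter syntax and the `parse_search` name are assumptions, not an existing tangata feature.

```python
import shlex


def parse_search(query):
    """Split a search string into free-text terms and field filters.

    Hypothetical syntax: quoted phrases stay together, and `field:value`
    (or `field: "value"`) becomes a filter on that field.
    """
    terms, filters = [], {}
    tokens = shlex.split(query)
    i = 0
    while i < len(tokens):
        token = tokens[i]
        field, sep, value = token.partition(":")
        if sep and field and not value and i + 1 < len(tokens):
            # `name: "modelname"` arrives as two tokens; join them.
            value = tokens[i + 1]
            i += 1
        if sep and field and value:
            filters.setdefault(field, []).append(value)
        else:
            terms.append(token)
        i += 1
    return terms, filters


terms, filters = parse_search('revenue "customer id" name: "dim_customer" tag:finance')
print(terms, filters)
```

Note that `shlex.split` raises `ValueError` on an unbalanced quote, so a real search box would need to catch that and fall back to a plain whitespace split.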

tangata changes the indentation of existing yml files

After modifying a model that was already part of a yml file, tangata correctly added the new piece of configuration, but also changed the entire indentation of my file in the process.

The new configuration file still works after dbt test but the issue is that every single line of the yml is now showing up as a change in the git commit.

Error in editing tags

Error when removing a tag:

[2021-06-19 22:58:07,787] ERROR in app: Exception on /api/v1/update_metadata [POST]
Traceback (most recent call last):
  File "c:\python36\lib\site-packages\flask\app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "c:\python36\lib\site-packages\flask\app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "c:\python36\lib\site-packages\flask\app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "c:\python36\lib\site-packages\flask\_compat.py", line 39, in reraise
    raise value
  File "c:\python36\lib\site-packages\flask\app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "c:\python36\lib\site-packages\flask\app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "c:\python36\lib\site-packages\tangata\tangata.py", line 58, in update_metadata
    updateResult = tangata_api.update_metadata(request.json, sendToast)
  File "c:\python36\lib\site-packages\tangata\tangata_api.py", line 311, in update_metadata
    newDBTProjectYML = merge(dbtProjectYML, jsonToInsert)
  File "c:\python36\lib\site-packages\tangata\tangata_api.py", line 271, in merge
    merge(a[key], b[key], path + [str(key)])
  File "c:\python36\lib\site-packages\tangata\tangata_api.py", line 271, in merge
    merge(a[key], b[key], path + [str(key)])
  File "c:\python36\lib\site-packages\tangata\tangata_api.py", line 271, in merge
    merge(a[key], b[key], path + [str(key)])
  [Previous line repeated 1 more time]
  File "c:\python36\lib\site-packages\tangata\tangata_api.py", line 275, in merge
    raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
Exception: Conflict at models.my_new_project.analytics.customer_film_categories.tags
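
The exception above comes from a recursive merge that raises when both sides hold different non-dict values at the same key, which is what an edited tag list looks like. A minimal sketch of an alternative policy, where the incoming value simply replaces the existing one at the leaves, follows. `merge_overwrite` is a hypothetical variant and not the `merge` function in `tangata_api.py`.

```python
def merge_overwrite(base, update):
    """Recursively merge `update` into `base`, letting `update` win at the leaves.

    Hypothetical variant for illustration: nested dicts are merged key by key,
    while any other value (including a list of tags) is replaced outright
    instead of raising a conflict.
    """
    for key, value in update.items():
        if isinstance(base.get(key), dict) and isinstance(value, dict):
            merge_overwrite(base[key], value)
        else:
            base[key] = value
    return base


project = {
    "models": {
        "my_new_project": {
            "analytics": {
                "materialized": "table",
                "customer_film_categories": {"tags": ["finance", "draft"]},
            }
        }
    }
}
# Removing the "draft" tag means sending the shorter list as the new value.
edit = {"models": {"my_new_project": {"analytics": {"customer_film_categories": {"tags": ["finance"]}}}}}

merge_overwrite(project, edit)
print(project["models"]["my_new_project"]["analytics"])
```

Sibling keys such as `materialized` are left untouched, because only the keys present in the update are visited.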

Feature - UI - Sorting models alphabetically

I tested tangata with a dbt project having several dozen models and couldn't understand how the models were sorted on the left pane. Having the models sorted alphabetically, or giving the option to sort them alphabetically, would make it easier to find models without using the search bar.

Error: reduce() of empty sequence with no initial value

Hi,

Just trying tangata and I am getting some errors in the console. The website is loading but the catalog is empty.
FYI, I am running it in Windows Subsystem for Linux.

[2021-06-17 12:13:57,087] ERROR in app: Exception on /api/v1/model_tree [GET]
Traceback (most recent call last):
  File "/home/xxxx/envs/dbt19/lib/python3.8/site-packages/flask/app.py", line 2070, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/xxxx/envs/dbt19/lib/python3.8/site-packages/flask/app.py", line 1515, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/xxxx/envs/dbt19/lib/python3.8/site-packages/flask/app.py", line 1513, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/xxxx/envs/dbt19/lib/python3.8/site-packages/flask/app.py", line 1499, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/home/xxxx/envs/dbt19/lib/python3.8/site-packages/tangata/tangata.py", line 45, in model_tree
    return tangata_api.get_model_tree()
  File "/home/xxxx/envs/dbt19/lib/python3.8/site-packages/tangata/tangata_api.py", line 130, in get_model_tree
    resultObject = reduce(merge_models, split_models)
TypeError: reduce() of empty sequence with no initial value
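
`functools.reduce` raises this `TypeError` whenever the sequence is empty and no initial value is given. The names `merge_models` and `split_models` appear in the traceback, but their real behaviour is not shown in the source, so the versions below are stand-ins that only illustrate passing an initial value.

```python
from functools import reduce


def merge_models(tree, path_parts):
    """Fold one model path, e.g. ["marts", "orders"], into a nested dict tree.

    Stand-in for illustration; the real merge_models is not shown in the source.
    """
    node = tree
    for part in path_parts:
        node = node.setdefault(part, {})
    return tree


def build_model_tree(split_models):
    # The third argument is the initial value, so an empty list yields {}.
    return reduce(merge_models, split_models, {})


print(build_model_tree([]))
print(build_model_tree([["marts", "orders"], ["marts", "customers"]]))
```

With an initial value in place, an empty catalog produces an empty tree instead of an exception, and the caller can show a "no models found" message.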

Feature - Save the configuration with models in alphabetical order

It would be good for tangata to provide refactoring options.

It could change the existing yml files to sort them by model name and save new models in the correct alphabetical order. Having models in that order makes it much easier to edit the yml files manually.
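
A minimal sketch of the sorting step, assuming the yml file has already been parsed into a mapping with a `models` list of entries that each carry a `name`. `sort_models` is a hypothetical helper; reading and writing the file is left out.

```python
def sort_models(schema):
    """Return a shallow copy of a parsed schema mapping with `models` sorted by name.

    Hypothetical helper; the sort is case-insensitive and leaves other keys alone.
    """
    if "models" not in schema:
        return dict(schema)
    ordered = sorted(schema["models"], key=lambda m: str(m.get("name", "")).lower())
    return {**schema, "models": ordered}


schema = {
    "version": 2,
    "models": [
        {"name": "orders"},
        {"name": "Customers"},
        {"name": "film_categories"},
    ],
}

print([m["name"] for m in sort_models(schema)["models"]])
```

Because `sorted` is stable, entries that compare equal keep their existing relative order, which keeps repeated saves from reshuffling the file.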

Use `preserve_quotes = True` in ruamel.yaml

Would it be possible to add preserve_quotes = True or make it possible to configure the value in the configuration? More info on the flag here:
https://stackoverflow.com/questions/39262556/preserve-quotes-and-also-add-data-with-quotes-in-ruamel

In my case, without this flag the first change to dbt_project.yml modifies all the lines about source-paths, macro-paths, etc., removing the quotes. Not a big deal, but the git history becomes less clean that way.

Feature - Use existing yml file in folder instead of schema.yml

In my dbt project, I like to list my tables/column descriptions and tests in a file called <current_folder>.yml. For example, in my marts folder, the file is called marts.yml and not schema.yml. I find it much easier when looking for yml files for all of them to have different names rather than having many schema.yml in different folders.

If I have already added info about a specific table, tangata finds the correct yml file and adds the new info in the correct place. But if, for a model in the same folder (e.g. marts), I add info about a table that is not in my existing yml file, it adds it to a new schema.yml file. Ideally, I'd like this info to be added to any other existing yml file in the current folder rather than creating a new one.
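
The selection rule requested above could be sketched as follows, using only `pathlib`. The order of preference (a file named after the folder, then any other existing YAML file, then `schema.yml`) is an assumption about what the request implies, and `pick_schema_file` is a hypothetical name.

```python
from pathlib import Path


def pick_schema_file(folder, default_name="schema.yml"):
    """Choose which YAML file in an existing `folder` should receive a new model entry.

    Hypothetical rule: prefer `<folder name>.yml`, then the alphabetically first
    existing .yml/.yaml file, and only then fall back to `default_name`.
    """
    folder = Path(folder)
    preferred = folder / f"{folder.name}.yml"
    if preferred.is_file():
        return preferred
    existing = sorted(
        p for p in folder.iterdir() if p.is_file() and p.suffix in {".yml", ".yaml"}
    )
    return existing[0] if existing else folder / default_name
```

For a `marts` folder that already contains `marts.yml`, this returns that file, so a new `schema.yml` is only created when the folder holds no YAML file at all.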
