GithubHelp home page GithubHelp logo

Comments (15)

MingNiu avatar MingNiu commented on August 15, 2024 1

@hubertdeng123 Aren't you going to provide an official version of the SQL upgrade script?

from self-hosted.

hubertdeng123 avatar hubertdeng123 commented on August 15, 2024

For users that are attempting to upgrade from the alpine image to the latest 24.6.0 or above, all indexes involving columns of type text, varchar, char, and citext should be reindexed before the instance is put into production (source).

For users that are attempting to fix duplicate rows of data, the process may be pretty manual 🥲. It will likely involve removing the duplicate rows of data, along with models with foreign key relations.

from self-hosted.

candux avatar candux commented on August 15, 2024

I had the same problems and fixed them in the database with the follwing commands!
THIS is dangerous and will delete data! Use at your own risk! Make a backup before

ALTER TABLE ONLY public.sentry_commitfilechange
    DROP CONSTRAINT sentry_commitfilechange_commit_id_4c6f7ec25af34227_uniq;
DELETE from public.sentry_commitfilechange ou where (SELECT count(*) from public.sentry_commitfilechange inr where inr.commit_id = ou.commit_id and inr.filename = ou.filename) > 1;
ALTER TABLE ONLY public.sentry_commitfilechange
    ADD CONSTRAINT sentry_commitfilechange_commit_id_4c6f7ec25af34227_uniq UNIQUE (commit_id, filename);
    

DROP INDEX sentry_release_organization_id_f0a7ec9ba96de76;
DELETE from sentry_release_project where release_id in (SELECT id from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1);

DELETE from sentry_releaseprojectenvironment where release_id in (SELECT id from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1);

Delete from sentry_grouphash where group_id in (SELECT id from sentry_groupedmessage where first_release_id in (SELECT id from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1));

DELETE from sentry_groupedmessage where first_release_id in (SELECT id from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1);

DELETE from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1;

CREATE UNIQUE INDEX sentry_release_organization_id_f0a7ec9ba96de76 ON public.sentry_release USING btree (organization_id, version);

REINDEX DATABASE postgres;
REINDEX system postgres;

In case there are more duplicates the reindex will fail and you have to delete thes duplicates in a similar way.

from self-hosted.

hubertdeng123 avatar hubertdeng123 commented on August 15, 2024

@candux Thanks for the input!

@MingNiu We're trying to figure out the best way to go about this. Different folks will have different ways they'd like to address the potential duplicate environments/releases/etc in their self-hosted instance. Merging data in postgres is extremely dangerous and it's not something that we think should be done as a general solution by us. For some people, they'd like to maybe keep the latest data while others would like to keep the data up until the point where data corruption occurred.

from self-hosted.

MingNiu avatar MingNiu commented on August 15, 2024

@hubertdeng123 ok.

If I only want to keep data such as projects, members, configurations, etc., and clearing the issue record is acceptable. So can you provide the official upgrade script?

We are not familiar with the table structure and it is too difficult for us to manually fix the data.

from self-hosted.

hubertdeng123 avatar hubertdeng123 commented on August 15, 2024

We will not be providing an official upgrade script at this time. What kind of duplicates are you seeing? That will help determine what rows to delete.

from self-hosted.

shkarface avatar shkarface commented on August 15, 2024

We're also affected by this issue, environments are now duplicated and workers cannot process tasks because of:

worker-1  | Traceback (most recent call last):
worker-1  |   File "/.venv/lib/python3.11/site-packages/celery/app/trace.py", line 477, in trace_task
worker-1  |     R = retval = fun(*args, **kwargs)
worker-1  |                  ^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/.venv/lib/python3.11/site-packages/sentry_sdk/utils.py", line 1711, in runner
worker-1  |     return sentry_patched_function(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/.venv/lib/python3.11/site-packages/sentry_sdk/integrations/celery/__init__.py", line 380, in _inner
worker-1  |     reraise(*exc_info)
worker-1  |   File "/.venv/lib/python3.11/site-packages/sentry_sdk/utils.py", line 1640, in reraise
worker-1  |     raise value
worker-1  |   File "/.venv/lib/python3.11/site-packages/sentry_sdk/integrations/celery/__init__.py", line 375, in _inner
worker-1  |     return f(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^
worker-1  |   File "/.venv/lib/python3.11/site-packages/celery/app/trace.py", line 760, in __protected_call__
worker-1  |     return self.run(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/silo/base.py", line 146, in override
worker-1  |     return original_method(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/tasks/base.py", line 138, in _wrapped
worker-1  |     result = func(*args, **kwargs)
worker-1  |              ^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/tasks/store.py", line 741, in save_event_transaction
worker-1  |     _do_save_event(cache_key, data, start_time, event_id, project_id, **kwargs)
worker-1  |   File "/usr/src/sentry/src/sentry/tasks/store.py", line 617, in _do_save_event
worker-1  |     manager.save(
worker-1  |   File "/.venv/lib/python3.11/site-packages/sentry_sdk/tracing_utils.py", line 669, in func_with_tracing
worker-1  |     return func(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/event_manager.py", line 471, in save
worker-1  |     jobs = save_transaction_events([job], projects)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/.venv/lib/python3.11/site-packages/sentry_sdk/tracing_utils.py", line 669, in func_with_tracing
worker-1  |     return func(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/event_manager.py", line 3034, in save_transaction_events
worker-1  |     _get_or_create_environment_many(jobs, projects)
worker-1  |   File "/.venv/lib/python3.11/site-packages/sentry_sdk/tracing_utils.py", line 669, in func_with_tracing
worker-1  |     return func(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/event_manager.py", line 941, in _get_or_create_environment_many
worker-1  |     job["environment"] = Environment.get_or_create(
worker-1  |                          ^^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/models/environment.py", line 98, in get_or_create
worker-1  |     env = cls.objects.get_or_create(name=name, organization_id=project.organization_id)[
worker-1  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/silo/base.py", line 146, in override
worker-1  |     return original_method(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/.venv/lib/python3.11/site-packages/django/db/models/manager.py", line 87, in manager_method
worker-1  |     return getattr(self.get_queryset(), name)(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/silo/base.py", line 146, in override
worker-1  |     return original_method(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/.venv/lib/python3.11/site-packages/django/db/models/query.py", line 948, in get_or_create
worker-1  |     return self.get(**kwargs), False
worker-1  |            ^^^^^^^^^^^^^^^^^^
worker-1  |   File "/.venv/lib/python3.11/site-packages/django/db/models/query.py", line 652, in get
worker-1  |     raise self.model.MultipleObjectsReturned(
worker-1  | sentry.models.environment.Environment.MultipleObjectsReturned: get() returned more than one Environment -- it returned 2!
worker-1  | 22:41:45 [ERROR] celery.app.trace: Task sentry.tasks.store.save_event_transaction[e8aaa026-f203-4119-bf1a-917a74e70462] raised unexpected: MultipleObjectsReturned('get() returned more than one Environment -- it returned 2!') (data={'hostname': 'celery@ee7e0490a110', 'id': 'e8aaa026-f203-4119-bf1a-917a74e70462', 'name': 'sentry.tasks.store.save_event_transaction', 'exc': "MultipleObjectsReturned('get() returned more than one Environment -- it returned 2!')", 'traceback': 'Traceback (most recent call last):\n  File "/.venv/lib/python3.11/site-packages/celery/app/trace.py", line 477, in trace_task\n    R = retval = fun(*args, **kwargs)\n                 ^^^^^^^^^^^^^^^^^^^^\n  File "/.venv/lib/python3.11/site-packages/sentry_sdk/utils.py", line 1711, in runner\n    return sentry_patched_function(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File "/.venv/lib/python3.11/site-packages/sentry_sdk/integrations/celery/__init__.py", line 380, in _inner\n    reraise(*exc_info)\n  File "/.venv/lib/python3.11/site-packages/sentry_sdk/utils.py", line 1640, in reraise\n    raise value\n  File "/.venv/lib/python3.11/site-packages/sentry_sdk/integrations/celery/__init__.py", line 375, in _inner\n    return f(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^\n  File "/.venv/lib/python3.11/site-packages/celery/app/trace.py", line 760, in __protected_call__\n    return self.run(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/silo/base.py", line 146, in override\n    return original_method(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/tasks/base.py", line 138, in _wrapped\n    result = func(*args, **kwargs)\n             ^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/tasks/store.py", line 741, in save_event_transaction\n    _do_save_event(cache_key, data, start_time, event_id, project_id, **kwargs)\n  File "/usr/src/sentry/src/sentry/tasks/store.py", line 617, in _do_save_event\n    manager.save(\n  File "/.venv/lib/python3.11/site-packages/sentry_sdk/tracing_utils.py", line 669, in func_with_tracing\n    return func(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/event_manager.py", line 471, in save\n    jobs = save_transaction_events([job], projects)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File "/.venv/lib/python3.11/site-packages/sentry_sdk/tracing_utils.py", line 669, in func_with_tracing\n    return func(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/event_manager.py", line 3034, in save_transaction_events\n    _get_or_create_environment_many(jobs, projects)\n  File "/.venv/lib/python3.11/site-packages/sentry_sdk/tracing_utils.py", line 669, in func_with_tracing\n    return func(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/event_manager.py", line 941, in _get_or_create_environment_many\n    job["environment"] = Environment.get_or_create(\n                         ^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/models/environment.py", line 98, in get_or_create\n    env = cls.objects.get_or_create(name=name, organization_id=project.organization_id)[\n          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/silo/base.py", line 146, in override\n    return original_method(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File "/.venv/lib/python3.11/site-packages/django/db/models/manager.py", line 87, in manager_method\n    return getattr(self.get_queryset(), name)(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/silo/base.py", line 146, in override\n    return original_method(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File "/.venv/lib/python3.11/site-packages/django/db/models/query.py", line 948, in get_or_create\n    return self.get(**kwargs), False\n           ^^^^^^^^^^^^^^^^^^\n  File "/.venv/lib/python3.11/site-packages/django/db/models/query.py", line 652, in get\n    raise self.model.MultipleObjectsReturned(\nsentry.models.environment.Environment.MultipleObjectsReturned: get() returned more than one Environment -- it returned 2!\n', 'args': '()', 'kwargs': "{'cache_key': 'e:7dacbb33dd984f41ba98253510335ef1:60', 'data': None, 'start_time': 1720132852.0, 'event_id': '7dacbb33dd984f41ba98253510335ef1', 'project_id': 60, '__start_time': 1720132852.387123}", 'description': 'raised unexpected', 'internal': False})

See:

worker-1  | sentry.models.environment.Environment.MultipleObjectsReturned: get() returned more than one Environment -- it returned 2!

from self-hosted.

hubertdeng123 avatar hubertdeng123 commented on August 15, 2024

Since it looks like most folks are experiencing issues with environments/releases, I wanted to post a manual guide to attempt to fix this data corruption.
This is dangerous and can result in data loss. Please use at your own discretion!

For users looking to delete duplicate data as a result of data corruption on 24.4.2, 24.5.0, and 24.5.1:

We'd highly recommend removing the duplicates, then upgrading to 24.6.0+ as soon as possible to avoid using the postgres alpine image which may cause painpoints when upgrading in the future. Reindexing the database with corrupt data will result in errors relating to duplicates. So, we'll need to delete duplicates before attempting to fix the indexes.

  1. docker compose up -d postgres
  2. docker compose exec postgres bash
  3. psql -U postgres
  4. Run below commands

Cleaning up environment:

DROP INDEX sentry_environment_organization_id_name_95a37dc7_uniq;
DELETE from sentry_deploy WHERE environment_id in (SELECT id from public.sentry_environment ou where (SELECT count(*) from public.sentry_environment inr where inr.organization_id = ou.organization_id and inr.name = ou.name) > 1);
DELETE from sentry_latestrelease WHERE environment_id in (SELECT id from public.sentry_environment ou where (SELECT count(*) from public.sentry_environment inr where inr.organization_id = ou.organization_id and inr.name = ou.name) > 1);
DELETE from sentry_rule WHERE environment_id in (SELECT id from public.sentry_environment ou where (SELECT count(*) from public.sentry_environment inr where inr.organization_id = ou.organization_id and inr.name = ou.name) > 1);;
DELETE from sentry_userreport WHERE environment_id in (SELECT id from public.sentry_environment ou where (SELECT count(*) from public.sentry_environment inr where inr.organization_id = ou.organization_id and inr.name = ou.name) > 1);;

DELETE from sentry_environmentproject WHERE environment_id in (SELECT id from public.sentry_environment ou where (SELECT count(*) from public.sentry_environment inr where inr.organization_id = ou.organization_id and inr.name = ou.name) > 1);;
DELETE from sentry_releaseprojectenvironment WHERE environment_id in (SELECT id from public.sentry_environment ou where (SELECT count(*) from public.sentry_environment inr where inr.organization_id = ou.organization_id and inr.name = ou.name) > 1);;
DELETE from sentry_environment WHERE id in (SELECT id from public.sentry_environment ou where (SELECT count(*) from public.sentry_environment inr where inr.organization_id = ou.organization_id and inr.name = ou.name) > 1);;
CREATE UNIQUE INDEX sentry_environment_organization_id_name_95a37dc7_uniq ON public.sentry_environment USING btree (organization_id, name);

Cleaning up releases (credit to @candux):

DROP INDEX sentry_release_organization_id_version_65da5843_uniq;
DELETE from sentry_release_project where release_id in (SELECT id from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1);

DELETE from sentry_releaseprojectenvironment where release_id in (SELECT id from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1);

DELETE from sentry_grouphash where group_id in (SELECT id from sentry_groupedmessage where first_release_id in (SELECT id from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1));

DELETE from sentry_groupedmessage where first_release_id in (SELECT id from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1);

DELETE from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1;

CREATE UNIQUE INDEX sentry_release_organization_id_version_65da5843_uniq ON public.sentry_release USING btree (organization_id, version);

For those users with clean data on versions 24.4.2, 24.5.0, 24.5.1 to 24.6.0+

We'll need to reindex the entire database to ensure data corruption will not occur in the future. According to the postgres docs, reindex is safe to be used in all cases.

Steps to reindex:

  1. git checkout 24.6.0+
  2. ./install.sh
  3. docker compose up -d postgres
  4. docker compose exec postgres bash
  5. psql -U postgres
  6. REINDEX DATABASE postgres;

from self-hosted.

JiffsMaverick avatar JiffsMaverick commented on August 15, 2024

@hubertdeng123 Thank you for your instructions. I would like to clarify one point: when executing the REINDEX DATABASE postgres command, I get an error:

postgres=# REINDEX DATABASE postgres;
ERROR:  could not create unique index "sentry_commitauthor_organization_id_3cdc85e9f09bf3f3_uniq"
DETAIL:  Key (organization_id, external_id)=(1, github:user3) is duplicated.

Is this related to this problem? What can be done about it? Here is what this table looks like:

postgres=# SELECT organization_id, external_id, COUNT(*)
FROM sentry_commitauthor
GROUP BY organization_id, external_id
HAVING COUNT(*) > 1;
 organization_id |     external_id      | count
-----------------+----------------------+-------
               1 | github:user1         |     2
               1 | github:user2         |     2
               1 | github:user3         |     2
               1 | github:user4         |     2
               1 |                      |    12
(5 rows)

from self-hosted.

feshchenkod avatar feshchenkod commented on August 15, 2024

We installed a fresh sentry 24.5.1 a few weeks ago and are now experiencing difficulties updating to version 24.6.0 or 24.7.0. Any guide on resolving this error?

Running hooks in /etc/ca-certificates/update.d...
done.
Usage: sentry upgrade [OPTIONS]
Try 'sentry upgrade --help' for help.

Error: No such option: --create-kafka-topics
Error in install/set-up-and-migrate-database.sh:30.
'$dcr web upgrade --create-kafka-topics' exited with status 2
-> ./install.sh:main:38
--> install/set-up-and-migrate-database.sh:source:30

Cleaning up...

from self-hosted.

JiffsMaverick avatar JiffsMaverick commented on August 15, 2024

@feshchenkod It looks like you forgot to update the versions in your .env.custom file.

from self-hosted.

feshchenkod avatar feshchenkod commented on August 15, 2024

@feshchenkod It looks like you forgot to update the versions in your .env.custom file.

Got it, I overlooked that part. Thanks a lot for your help!

from self-hosted.

hubertdeng123 avatar hubertdeng123 commented on August 15, 2024

@JiffsMaverick It looks like you have duplicate entries in sentry_commitauthor table. You'll need to clean those up before trying to reindex your table again.

from self-hosted.

JiffsMaverick avatar JiffsMaverick commented on August 15, 2024

@hubertdeng123 These records are not completely duplicated. There is an issue with the duplication of only two fields external_id and organization_id. For example:

 id | organization_id |           name           |                    email                     |           external_id
----+-----------------+--------------------------+----------------------------------------------+---------------------------------
  1 |               1 | User1                    | [email protected]                            | github:gituser1
 12 |               1 | User1                    | [email protected]                      | github:gituser1
  3 |               1 | User2                    | [email protected]                            | github:gituser2
 18 |               1 | User2-alt                | [email protected]                      | github:gituser2

Here is the code responsible for this index:

    class Meta:
        app_label = "sentry"
        db_table = "sentry_commitauthor"
        unique_together = (("organization_id", "email"), ("organization_id", "external_id"))

Is deleting records in this table safe? How would you recommend choosing which record to delete? Or can I delete anything and it will restore itself later?

from self-hosted.

hubertdeng123 avatar hubertdeng123 commented on August 15, 2024

Hmmm, if that is the case, could you try changing the external_id in this case? If these are truly not duplicates maybe what happened was the broken indexes resulted in the external_id not being incremented properly.

Is deleting records in this table safe? How would you recommend choosing which record to delete? Or can I delete anything and it will restore itself later?

Deleting records is never truly safe, it depends on whether or not the data in particular is something you are ok with keeping. Which record to delete would be up to you but let's first try to change the external_id.

@JiffsMaverick

from self-hosted.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.