
Comments (11)

natea avatar natea commented on June 12, 2024

I'm wondering if I can just run the slackdump command again and it will pick up where it left off, or if it's going to do everything all over again.

I guess I could run the command again and exclude all the channels that it already downloaded, but then will they be added to channels.json so I can open up the entire workspace in slack-export-viewer?

from slackdump.

rusq avatar rusq commented on June 12, 2024

Hey @natea , if you run the export of this particular isolated thread: CAHFFCHPG:1560791995.077000 will you get this error?

slackdump CAHFFCHPG:1560791995.077000

Also, this is quite strange:

> curl 34.204.109.226:443
curl: (56) Recv failure: Connection reset by peer

> nslookup 34.204.109.226
Non-authoritative answer:
226.109.204.34.in-addr.arpa	name = ec2-34-204-109-226.compute-1.amazonaws.com.

Looks like an EC2 instance, and now I'm wondering why it was trying to fetch from there. It may have been an intermittent failure.


natea avatar natea commented on June 12, 2024

@rusq I ran that command and didn't get an error:

$ ./slackdump CAHFFCHPG:1560791995.077000
Slackdump 2.4.1 (commit: be0e57febccf8d705f43da2474e07c384414fe25) built on: 2023-08-15T09:48:09Z
2023/09/19 15:53:07 > checking user cache...
2023/09/19 15:53:07   cache expired: it will be recreated.
2023/09/19 15:53:07   thread request #    1, fetched:   17, total:       17, process results:  (speed: 203.38/sec, avg: 203.38/sec)
2023/09/19 15:53:07   thread fetch complete, total: 17
2023/09/19 15:53:07 dumped 1 item(s)
2023/09/19 15:53:07 completed, time taken: 425.293417ms

"looks like an EC2 instance, and now I'm wondering why it was trying to fetch from there, it may have been an intermittent failure"

Are you saying that it was trying to retrieve a file that was hosted on an EC2 instance, that perhaps that EC2 instance is no longer running, which is why the file retrieval failed?

Does Slackdump have a timeout mechanism so if it's unable to retrieve a file from a remote site after a particular duration, it will skip it and go on with the next one?

What do you suggest I do at this point? Do I try running the slackdump command again, or will that reset from the beginning and re-download everything?


rusq avatar rusq commented on June 12, 2024

Most likely at that moment there was an internet issue, or Slack was doing something obscene with its cluster nodes.

I suspect if you retry the full export it will complete successfully.

Let me know how it goes?


natea avatar natea commented on June 12, 2024

@rusq I ran a full export again last night, and got another similar error but on a different channel:

2023/09/20 22:30:56 messages request #   63, fetched:  200 (threads: 3, files: 3), total:    12600 (speed: 186.90/sec, avg:  15.84/sec)
2023/09/20 22:34:35 file "FME1BQXD5-Screenshot 2019-08-14 at 14.34.22.png" saved to dashboard/attachments: 32828 bytes written
2023/09/20 22:34:39 file "FM8HMDQQZ-Screen Shot 2019-08-16 at 9.02.19 AM.png" saved to dashboard/attachments: 73300 bytes written
2023/09/20 22:39:30 file "FM0JN02SF-OnPaste.20190814-111620.png" saved to dashboard/attachments: 95585 bytes written
2023/09/20 22:39:30 application error: export error: channels: error: error exporting conversation C0DLX9U86: failed to dump "dashboard" (C0DLX9U86): callback error: failed to dump channel C0DLX9U86: read tcp 192.168.4.78:60563->34.205.195.66:443: read: connection reset by peer

Rather than do a full export, would it be better to explicitly name all the channels so that I have more control over which ones are downloaded? Then, if there's an error, I can remove the completed channels from the list and re-run the export with just the ones that haven't been downloaded already.

I'm concerned that I could do this export multiple times and it will continue to get hung up on a channel, requiring me to do a full export again. The current export directory is approximately 12GB, so it's a non-trivial amount of data to download, especially with all the attachments.

Alternatively, could some error handling be added to the script that when encountering an error like this, it would skip over that channel or attachment, and try again later, and not abort the entire operation?


rusq avatar rusq commented on June 12, 2024

Hey Nate, I have introduced retry logic on network errors in #235; you can check v2.4.2 on the Releases page. By default it retries 3 times with an exponential backoff of 1, 2, 4 seconds. Let me know if it works for you.
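
For readers unfamiliar with the pattern, here is a minimal Python sketch of retrying with exponential backoff. It is not the code from #235 (slackdump itself is not written in Python); the function name `retry_with_backoff` and its parameters are hypothetical and only mirror the "3 retries, waiting 1, 2, 4 seconds" behaviour described above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation; on ConnectionError wait 1, 2, 4, ... seconds and retry.

    Hypothetical helper for illustration only. ConnectionResetError
    ("connection reset by peer") is a subclass of ConnectionError.
    """
    for attempt in range(retries):
        try:
            return operation()
        except ConnectionError:
            # The delay doubles on every failed attempt.
            sleep(base_delay * 2 ** attempt)
    # Last retry: if this one fails too, the error propagates to the caller.
    return operation()
```

Passing `sleep` as a parameter is only a convenience for trying the sketch out without actually waiting; any error other than a connection error is raised immediately.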

Note to self: ported to cli-remake too


natea avatar natea commented on June 12, 2024

That worked! It appears to have downloaded all 615 channels.

2023/09/24 09:21:07 channels request #    9, fetched:   56, total:      615 (speed:   0.02/sec, avg:   0.02/sec)
2023/09/24 09:21:07 channels fetch complete, total: 615 channels
2023/09/24 09:21:07   out of which exported:  615

It looks like it got the private channels too, and the multi-person DMs (signified by mpdm- prefix?).

Any idea why slack-export-viewer has no problem displaying the multi-person DMs and the private 1-1 DMs, but SlackLogViewer does not show any Direct messages or Group messages?


natea avatar natea commented on June 12, 2024

btw, is there a way to speed up the export, perhaps by increasing the number of download threads? I'm on a 1 Gigabit connection if that makes a difference.

I'm going to do another export using -r text so i have the conversations in a plaintext format, and won't bother to download the attachments this time, so i'm wondering if i can speed up the downloads of the Slack conversations.


natea avatar natea commented on June 12, 2024

Strangely, the Mattermost export took 9 hrs while the normal export took 8 hrs.

2023/09/25 08:50:23 channels fetch complete, total: 615 channels
2023/09/25 08:50:23   out of which exported:  615
2023/09/25 08:50:23 completed, time taken: 9h12m35.023321709s


rusq avatar rusq commented on June 12, 2024

Hey Nate, thanks for the feedback, glad to hear that it worked. It must have been Slack server latency/rate limiting, because the Mattermost and standard export formats are exactly the same in the way that they treat messages and threads. The only difference is where they put the file attachments.

Regarding the connection speed, short answer: you can experiment with the rate limiting in the slackdump CLI. By default it's set to safe values so as to avoid hitting the rate limit error from Slack.

Long answer: there are several factors that affect that, from "affects the most" to "affects the least":

  1. Rate limiting. There are four throttling tiers in the Slack API. If one exceeds the rate limit defined for a particular endpoint, the client receives a 429 error and has to wait the number of seconds returned by the server before retrying. If one adheres to the limits, the number of 429 errors is lower. Slack allows for short bursts, but it's really a gamble. On short dumps, one might get away with exceeding these limits for a short period of time.
  2. Number of threads in the channel. Imagine we download a chunk of 100 messages. If none of these messages contain threads, we move on to the next chunk. If every message in the chunk has a thread attached to it, this will result in 100+ additional calls to the API to fetch the thread contents, basically reducing the speed 100+ times.
  3. Requesting 100 messages in a chunk does not guarantee that Slack will return 100 messages in response to the API call. Depending on their internal logic, it might return fewer. When testing on a huge channel which had messages from 2015, I saw it returning just 1 message instead of the requested 100 per batch, so instead of, say, 10 requests to fetch 1000 messages with a chunk size of 100, slackdump had to make 1000 requests, because Slack was being unreasonable.
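
The arithmetic behind these three factors can be sketched in a few lines of Python. This is a back-of-the-envelope illustration, not slackdump's code: `estimate_requests` and `seconds_to_wait` are hypothetical names, one API call per thread is a lower bound (long threads need more), and reading the wait time from a `Retry-After` header value is an assumption based on how Slack documents its 429 responses.

```python
import math
from typing import Optional


def estimate_requests(
    total_messages: int,
    returned_per_request: int,
    threaded_messages: int = 0,
) -> int:
    """Rough lower bound on API calls needed to dump one channel.

    History pages are counted by how many messages Slack actually
    returns per call, plus at least one extra call per message that
    has a thread attached.
    """
    history_calls = math.ceil(total_messages / returned_per_request)
    return history_calls + threaded_messages


def seconds_to_wait(status_code: int, retry_after: Optional[str]) -> int:
    """How long to pause before retrying; zero unless the reply is a 429."""
    if status_code != 429 or retry_after is None:
        return 0
    return int(retry_after)


# 1000 messages, 100 returned per call: 10 requests.
print(estimate_requests(1000, 100))
# Same channel, but Slack returns 1 message per call: 1000 requests.
print(estimate_requests(1000, 1))
# One chunk of 100 messages where every message has a thread: 1 + 100 calls.
print(estimate_requests(100, 100, threaded_messages=100))
```

The estimate ignores pauses caused by rate limiting, so real wall-clock time grows faster than the request count suggests.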


rusq avatar rusq commented on June 12, 2024

Re SlackLogViewer, I saw you opened an issue with it (thayakawa-gh/SlackLogViewer#19). That's exactly what I'd do.
