Comments (11)
I'm wondering if I can just run the slackdump command again and it will pick up where it left off, or if it's going to do everything all over again.
I guess I could run the command again and exclude all the channels that it already downloaded, but will they then be added to channels.json so I can open up the entire workspace in slack-export-viewer?
from slackdump.
Hey @natea, if you export just this one isolated thread, CAHFFCHPG:1560791995.077000, do you get this error?
slackdump CAHFFCHPG:1560791995.077000
Also, this is quite strange:
> curl 34.204.109.226:443
curl: (56) Recv failure: Connection reset by peer
> nslookup 34.204.109.226
Non-authoritative answer:
226.109.204.34.in-addr.arpa name = ec2-34-204-109-226.compute-1.amazonaws.com.
Looks like an EC2 instance, and now I'm wondering why it was trying to fetch from there. It may have been an intermittent failure.
@rusq I ran that command and didn't get an error:
$ ./slackdump CAHFFCHPG:1560791995.077000
Slackdump 2.4.1 (commit: be0e57febccf8d705f43da2474e07c384414fe25) built on: 2023-08-15T09:48:09Z
2023/09/19 15:53:07 > checking user cache...
2023/09/19 15:53:07 cache expired: it will be recreated.
2023/09/19 15:53:07 thread request # 1, fetched: 17, total: 17, process results: (speed: 203.38/sec, avg: 203.38/sec)
2023/09/19 15:53:07 thread fetch complete, total: 17
2023/09/19 15:53:07 dumped 1 item(s)
2023/09/19 15:53:07 completed, time taken: 425.293417ms
"Looks like an EC2 instance, and now I'm wondering why it was trying to fetch from there. It may have been an intermittent failure."
Are you saying that it was trying to retrieve a file hosted on an EC2 instance, and that the retrieval failed because that instance is perhaps no longer running?
Does Slackdump have a timeout mechanism so if it's unable to retrieve a file from a remote site after a particular duration, it will skip it and go on with the next one?
What do you suggest I do at this point? Do I try running the slackdump command again, or will that reset from the beginning and re-download everything?
Most likely there was an internet issue at that moment, or Slack was doing something obscene with its cluster nodes.
I suspect if you retry the full export it will complete successfully.
Let me know how it goes.
@rusq I ran a full export again last night, and got another similar error but on a different channel:
2023/09/20 22:30:56 messages request # 63, fetched: 200 (threads: 3, files: 3), total: 12600 (speed: 186.90/sec, avg: 15.84/sec)
2023/09/20 22:34:35 file "FME1BQXD5-Screenshot 2019-08-14 at 14.34.22.png" saved to dashboard/attachments: 32828 bytes written
2023/09/20 22:34:39 file "FM8HMDQQZ-Screen Shot 2019-08-16 at 9.02.19 AM.png" saved to dashboard/attachments: 73300 bytes written
2023/09/20 22:39:30 file "FM0JN02SF-OnPaste.20190814-111620.png" saved to dashboard/attachments: 95585 bytes written
2023/09/20 22:39:30 application error: export error: channels: error: error exporting conversation C0DLX9U86: failed to dump "dashboard" (C0DLX9U86): callback error: failed to dump channel C0DLX9U86: read tcp 192.168.4.78:60563->34.205.195.66:443: read: connection reset by peer
Rather than do a full export, would it be better to name all the channels explicitly, so that I have more control over which ones are downloaded? Then, if there's an error, I can remove the completed channels from the list and re-run the export with just the ones that haven't been downloaded yet.
I'm concerned that I could do this export multiple times and it will continue to get hung up on a channel, requiring me to do a full export again. The current export directory is approximately 12GB, so it's a non-trivial amount of data to download, especially with all the attachments.
Alternatively, could some error handling be added to the script so that, when it encounters an error like this, it skips that channel or attachment and tries again later instead of aborting the entire operation?
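The "remove the completed channels from the list" idea above amounts to a set difference. Here is a minimal sketch of it, assuming the export directory holds one sub-directory per channel, named after the channel (as `dashboard/attachments` in the log suggests). The function name `remaining_channels` is hypothetical, and the directory layout is an assumption rather than something slackdump guarantees.

```python
from pathlib import Path


def remaining_channels(all_channels, export_dir):
    """Return the channels from all_channels that have no directory yet.

    Assumption: the export directory contains one sub-directory per
    exported channel, named after the channel.
    """
    root = Path(export_dir)
    if not root.is_dir():
        # Nothing has been exported yet, so everything is still pending.
        return list(all_channels)
    done = {entry.name for entry in root.iterdir() if entry.is_dir()}
    # Preserve the caller's ordering of the channel list.
    return [name for name in all_channels if name not in done]
```

A channel whose export was aborted midway would still have a directory, so the channel named in the error message would need to be added back to the list by hand.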
Hey Nate, I have introduced retry logic on network errors in #235; you can check v2.4.2 on the Releases page. By default it retries 3 times with an exponential backoff of 1, 2, and 4 seconds. Let me know if it works for you.
Note to self: ported to cli-remake too.
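For readers unfamiliar with the pattern, the retry behaviour described above (3 retries, waiting 1, 2, then 4 seconds) can be sketched roughly as follows. This is an illustration in Python, not slackdump's actual code. The name `retry_with_backoff`, its parameters, and the choice of which exceptions count as network errors are all assumptions.

```python
import time


def retry_with_backoff(operation, retries=3, base_delay=1.0,
                       sleep=time.sleep,
                       retry_on=(ConnectionError, TimeoutError)):
    """Call operation(); on a network error wait 1, 2, 4... seconds and retry.

    `retries` counts the attempts made after the first call, so the
    default performs at most four calls in total (an assumption about
    how "retry 3 times" should be read).
    """
    for attempt in range(retries + 1):
        try:
            return operation()
        except retry_on:
            if attempt == retries:
                raise  # retries exhausted: surface the last error
            # Delay doubles each time: base_delay * 1, 2, 4, ...
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. A "connection reset by peer" failure maps to `ConnectionResetError` in Python, which is a subclass of `ConnectionError` and is therefore retried.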
That worked! It appears to have downloaded all 615 channels.
2023/09/24 09:21:07 channels request # 9, fetched: 56, total: 615 (speed: 0.02/sec, avg: 0.02/sec)
2023/09/24 09:21:07 channels fetch complete, total: 615 channels
2023/09/24 09:21:07 out of which exported: 615
It looks like it got the private channels too, and the multi-person DMs (signified by the mpdm- prefix?).
Any idea why slack-export-viewer has no problem displaying the multi-person DMs and the private 1-1 DMs, but SlackLogViewer does not show any Direct messages or Group messages?
By the way, is there a way to speed up the export, perhaps by increasing the number of download threads? I'm on a 1 Gigabit connection, if that makes a difference.
I'm going to do another export using -r text so I have the conversations in a plaintext format. I won't bother to download the attachments this time, so I'm wondering if I can speed up the download of the Slack conversations.
Strangely, the Mattermost export took 9 hours while the normal export took 8 hours.
2023/09/25 08:50:23 channels fetch complete, total: 615 channels
2023/09/25 08:50:23 out of which exported: 615
2023/09/25 08:50:23 completed, time taken: 9h12m35.023321709s
Hey Nate, thanks for the feedback, glad to hear that it worked. It must have been Slack server latency or rate limiting, because the Mattermost and standard export formats treat messages and threads in exactly the same way. The only difference is where they put the file attachments.
Regarding the connection speed, the short answer: you can experiment with the rate limiting settings in the slackdump CLI. By default they are set to safe values, so as to avoid hitting the rate limit error from Slack.
Long answer: there are several factors that affect that, from "affects the most" to "affects the least":
- Rate limiting. There are four throttling tiers in the Slack API. If one exceeds the rate limit defined for a particular endpoint, the client receives a 429 error and has to wait the number of seconds returned by the server before retrying. If one adheres to the limits, the number of 429 errors is lower. Slack allows for short bursts, but it's really a gamble. On short dumps, one might get away with exceeding these limits for a short period of time.
- Number of threads in the channel. Imagine we download a chunk of 100 messages. If none of these messages contain threads, we move on to the next chunk. If every message in the chunk has a thread attached to it, this will result in 100+ additional calls to the API to fetch the thread contents, basically reducing the speed 100+ times.
- Requesting 100 messages in a chunk does not guarantee that Slack will return 100 messages in response to the API call. Depending on their internal logic, it might return fewer. When testing on a huge channel which had messages from 2015, I saw it returning just 1 message per batch instead of the requested 100. So instead of, say, 10 requests to fetch 1000 messages with a chunk size of 100, slackdump had to make 1000 requests, because Slack was being unreasonable.
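The "wait the number of seconds returned by the server" step from the rate limiting point can be sketched as below. This is a rough Python illustration, not slackdump's implementation. `send` is a hypothetical callable standing in for one API request and returning `(status_code, headers, body)`, and `max_waits` is an assumed safety cap. Slack reports the wait time in the `Retry-After` response header.

```python
import time


def call_with_rate_limit(send, max_waits=5, sleep=time.sleep):
    """Repeat send() while the server answers 429 Too Many Requests.

    send() is a hypothetical stand-in for one API request; it must
    return a (status_code, headers, body) tuple.
    """
    for attempt in range(max_waits + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_waits:
            # Wait exactly as long as the server asks before retrying;
            # fall back to 1 second if the header is missing.
            sleep(int(headers.get("Retry-After", "1")))
    raise RuntimeError("still rate limited after %d waits" % max_waits)
```

Every 429 costs a full server-dictated pause, which is why staying under the limit with conservative defaults tends to finish sooner than bursting and being throttled.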
Re SlackLogViewer, I saw you opened an issue with it (thayakawa-gh/SlackLogViewer#19). That's exactly what I'd do.