Comments (3)
ArchiveBox doesn't go more than 1 hop deep when using depth=1, so unless the links to the PDFs are directly on the page you're passing it, it's not going to recursively keep following links beyond the first hop.
You can do multiple passes if you need it to go deeper into a folder structure:
archivebox add --depth=1 'https://download.eversberg.eu'
archivebox list --csv=url --filter-type=domain download.eversberg.eu | archivebox add --depth=1
archivebox list --csv=url --filter-type=domain download.eversberg.eu | archivebox add --depth=1
archivebox list --csv=url --filter-type=domain download.eversberg.eu | archivebox add --depth=1
...
# repeat for however many levels deep you want to go
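If typing the same pipeline repeatedly gets tedious, the passes above could be wrapped in a small loop. This is only a rough sketch: the `archive_passes` function and the `ARCHIVEBOX_BIN` override are hypothetical names made up for illustration, not part of ArchiveBox, and the commands inside are the same ones shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of ArchiveBox): run the initial add, then
# repeat the list | add pipeline a fixed number of times.
# ARCHIVEBOX_BIN is an assumed override so the loop can be dry-run with a stub.
ARCHIVEBOX_BIN="${ARCHIVEBOX_BIN:-archivebox}"

archive_passes() {
    url="$1"      # starting URL, e.g. 'https://download.eversberg.eu'
    domain="$2"   # domain used to select already-archived URLs
    passes="$3"   # how many extra levels to descend

    # First pass: the starting page plus everything it links to directly.
    "$ARCHIVEBOX_BIN" add --depth=1 "$url" || return 1

    # Each further pass feeds the URLs archived so far back in at depth=1.
    i=1
    while [ "$i" -le "$passes" ]; do
        "$ARCHIVEBOX_BIN" list --csv=url --filter-type=domain "$domain" \
            | "$ARCHIVEBOX_BIN" add --depth=1 || return 1
        i=$((i + 1))
    done
}
```

For example, `archive_passes 'https://download.eversberg.eu' download.eversberg.eu 3` would run the first add followed by three more passes.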
I recommend SiteSucker or wget's recursive feature; they may be better suited to this use case.
See this issue for more info: #191
from archivebox.
I'm sorry, my request was terribly worded. I'm looking to archive the site "download.eversberg.eu", which is essentially just a file host running on Apache2. I managed to solve my issue by running individual archive tasks for every folder with the depth setting at 1. I'm essentially just looking for a convenient way to archive below depth 1.
from archivebox.
No worries, I figured out what you meant after I visited the domain, edited my comment above 👍
from archivebox.