ifarchive-ifmap-py's Issues

Include the full pathname in the RSS entries.

It might be nice to include the full pathname of the added entry in the new-additions RSS feed. Currently the title is just the bare filename, and the body is the index description (if any). We could put the path in the title, or maybe as the first line of the description.
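A rough sketch of the two options, assuming the feed entries are built from an archive-relative path and an optional description. The function name `rss_item` and the `path_in_title` flag are made up for illustration; they are not taken from ifmap.py.

```python
from xml.sax.saxutils import escape

def rss_item(path, description="", path_in_title=True):
    """Build one RSS <item> string for a newly added file.

    Hypothetical helper: `path` is assumed to be the archive-relative
    pathname, e.g. "if-archive/games/foo.z5".
    """
    filename = path.rsplit("/", 1)[-1]
    if path_in_title:
        # Option 1: the full pathname replaces the bare filename in the title.
        title, body = path, description
    else:
        # Option 2: keep the bare filename as the title and put the
        # pathname on the first line of the description.
        title = filename
        body = path + "\n" + description if description else path
    return "<item><title>%s</title><description>%s</description></item>" % (
        escape(title),
        escape(body),
    )
```

Either way the path goes through `escape()`, so a name containing & stays valid XML.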

Support files and directories with Unicode characters in the name

We're good on files with URL-escapable characters. (E.g. "Brain_Guzzlers_from_Beyond!.gblorb".) Those work.

However, I'm not sure we've tested:

  • Files with & in the name
  • Directories with space, &, ! in the name
  • Anything with Unicode characters in the name

That third one is a minefield. Linux and MacOS may not agree on how Unicode is stored in the filesystem; Apache may have its own ideas. I hear that MacOS 10.13 is changing the filesystem again, too. So that will require very careful testing.

But the first two are definitely achievable.
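For the link-generation side, a minimal sketch using only the standard library. `url_path` is a hypothetical name, and normalizing to NFC is an assumption: it gives one canonical spelling for the URL, but whether the server then finds the file on disk depends on how the filesystem stored the name, which is exactly the part that needs testing.

```python
import unicodedata
from urllib.parse import quote

def url_path(name):
    """Percent-encode a file or directory path for use in a link.

    Hypothetical helper. Slashes are kept as path separators; everything
    else outside the URL-safe set (space, &, !, non-ASCII) is escaped
    as UTF-8 bytes.
    """
    # Assumption: NFC is the form we want in URLs. A precomposed "é" and
    # "e" + combining accent look the same but encode differently.
    nfc = unicodedata.normalize("NFC", name)
    return quote(nfc, safe="/")
```

With this, a directory like "Games & Stuff!" comes out as "Games%20%26%20Stuff%21", and the two spellings of "café" produce the same URL.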

Correct plurals

The index template can generate output like "1 Items" and "1 Subdirectories". Fix.

We could either add hacky map properties like "count1" and "subdircount1" (bools), or build a general tag query syntax {?count=1}.
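A third possibility is to do the work on the Python side and hand the template a finished label. This is only a sketch; `plural` is a hypothetical helper, not something the template system has today.

```python
def plural(count, singular, plural_form=None):
    """Return a label such as "1 Item" or "3 Items".

    Hypothetical helper. `plural_form` covers words where adding "s"
    is wrong, e.g. "Subdirectory" -> "Subdirectories".
    """
    if plural_form is None:
        plural_form = singular + "s"
    word = singular if count == 1 else plural_form
    return f"{count} {word}"
```

For example, `plural(1, "Subdirectory", "Subdirectories")` gives "1 Subdirectory", and the result could be stored as an ordinary map property.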

Support multiple file upload

The upload.py script was hacked out of somebody's example that supported uploading several files at once. I suspect that was implemented with some JS code to create new <input type="file"> entries in the form. This is worth re-creating, I think.

Support subdir files

If you write an Index entry that looks like

# Games/Dr Ludwig and the Devil.zip

...then ifmap chokes with an error: "Index entry without file". The file exists but ifmap doesn't know to look down the directory tree.

This should be valid, and generate an index page with the correct link.

Note that subdirectory entries with slashes don't have this problem. (See https://ifarchive.org/if-archive/games/competition97/Index .)
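A sketch of the lookup, assuming the check happens against the directory that contains the Index file. `find_index_entry` is a hypothetical name; the real code in ifmap may be organized differently.

```python
import os

def find_index_entry(dirpath, entry):
    """Return the filesystem path for an Index entry, or None if missing.

    Hypothetical helper. `entry` is the name after "# " in the Index
    file and may contain slashes, e.g. "Games/Dr Ludwig and the Devil.zip".
    """
    parts = [p for p in entry.split("/") if p not in ("", ".")]
    if ".." in parts:
        # Refuse entries that try to climb out of the directory.
        return None
    # Join each component so the lookup descends the directory tree
    # instead of treating the whole entry as a bare filename.
    path = os.path.join(dirpath, *parts)
    return path if os.path.exists(path) else None
```

The same `parts` list could then be used to build the link on the generated index page.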

Legacy -X-ified pages should be redirects

Generating two sets of index pages with different directory structures is a bad idea. It makes it hard to write sensible links in Markdown descriptions.

We can make the X-pages into external redirects at this point. I hope.

(It may be easiest to do this with an Apache rewrite config line.)

Can <wbr> tags be used in the index?

Could <wbr> tags be added after slashes in the indexes, to avoid breaking words across lines? It would also avoid the awkward "if-" being on a line by itself.

[Image: mobile screenshot showing suboptimal formatting]
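A minimal sketch of what the generator could do, assuming paths are escaped at the same point. `path_with_wbr` is a hypothetical name. Note that this only adds break opportunities after slashes; whether it stops a browser from breaking after the hyphen in "if-archive" would need checking on a real phone.

```python
from html import escape

def path_with_wbr(path):
    """HTML-escape a path and add a <wbr> break opportunity after each slash.

    Hypothetical helper: each component is escaped separately so the
    inserted <wbr> tags are not themselves escaped.
    """
    return "/<wbr>".join(escape(part) for part in path.split("/"))
```

For "if-archive/games/competition2016" this yields "if-archive/<wbr>games/<wbr>competition2016".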

Correctly escape & in Index files

Currently the Index files have Unicode characters in HTML-escaped form -- &oslash;, &amp;, etc. (But not consistently; there are some bare & as well.)

I would like to change these to literal Unicode characters, declaring that the Index files (and Master-Index) are all UTF-8. Then we can have ifmap.py escape them consistently when generating HTML and XML output.

(I think we're already serving plain text files with a UTF-8 content type header. Check this.)
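Both halves can be sketched with the standard `html` module. The function names are hypothetical: the first would be a one-time conversion of the existing Index files, the second is what the generator would call on output.

```python
import html

def index_text_to_unicode(text):
    """One-time cleanup: turn HTML entities into literal characters.

    Hypothetical helper. A bare "&" followed by a space is left alone,
    but a bare "&" directly followed by something that looks like an
    entity name would be converted too, so the result needs a review.
    """
    return html.unescape(text)

def to_html(text):
    """Escape literal text (&, <, >) when writing HTML or XML output."""
    return html.escape(text, quote=False)
```

After conversion, "S&oslash;ren &amp; Co" is stored as "Søren & Co" and comes back out of `to_html()` as "Søren &amp; Co".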

Increase CloudFlare cache times?

(Not actually an issue with this repo, but I have to file it somewhere.)

We get 40-50% cache coverage from CloudFlare. We can probably improve that by turning up the cache lifetime.

I originally tried to balance the cache lifetime against the problem of old cache data lingering after a file was replaced. But now the admin tool has a "clear cache" button so this problem is much reduced.

Footnote: should we also add index pages to the cache now? (With a shorter lifetime, say 24 hours.) It would be fairly simple to cache-bump selected index pages when we update them. On the other hand, it would be a giant pain to cache-bump all of them, which we occasionally have to do. (E.g. when changing a web page template.)

More optimization

We could shave another 1.4 seconds (25% or so) on Index-only updates if we skipped the date_X.html pages. (They have no metadata or description lists, so they don't need to be touched if only Index files have changed. In particular, the every-file list at date.html is enormous.)

Index-only updates are unfortunately not easy to detect.

Various template cleanups

Rename {name} to {htmlname} and {rawname} to {name}.

Conditional-check on {desc} (in File-List-Entry) so we don't have to create blank {desc} entries.

Smart rewrite

We don't have to rewrite all 8000-odd index.html files every time we rebuild. Some smart date-checking could reduce that to a few dozen.

(And 16000 metadata files, don't forget those.)

We'd have to check the timestamp of all Index files, as well as all data files. Easy enough as a plan.
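The date check itself could be as small as this sketch. `needs_rebuild` is a hypothetical name, and which files count as sources for a given output page is an assumption left to the caller.

```python
import os

def needs_rebuild(output_path, source_paths):
    """Return True if the output is missing or older than any source.

    Hypothetical helper. `source_paths` would be the Index file plus the
    data files that feed one generated page.
    """
    try:
        out_mtime = os.path.getmtime(output_path)
    except OSError:
        # No output yet, so it has to be generated.
        return True
    return any(os.path.getmtime(src) > out_mtime for src in source_paths)
```

A deleted file leaves no timestamp of its own, so the containing directory probably needs to be in the source list as well; removing a file updates the directory's modification time.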

Switch to Jinja

My hacked-up template system wants to be Jinja, and the admin tool has Jinja as a requirement anyhow.

Jinja is probably faster although we should test that.

Get rid of the X-slash convention

Nobody likes URLs like http://ifarchive.org/indexes/if-archiveXgamesXcompetition2016.html. Change these everywhere to http://ifarchive.org/indexes/if-archive/games/competition2016.html, creating a shadow directory tree in /indexes with the same structure as the main tree.

It should also create symlinks in the old locations to preserve old links. (Except where the old and new locations are identical, e.g. http://ifarchive.org/indexes/if-archive.html.)

There are commands in build-indexes which chmod/chgrp all the index files after creation. These should walk the trees correctly, but I'll have to test that.
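The name mapping can be sketched by starting from the directory path and deriving both names, which avoids guessing which X in an old name was a separator. All three function names are hypothetical.

```python
def legacy_index_name(dirpath):
    """Old flat name: slashes in the directory path become X."""
    return dirpath.replace("/", "X") + ".html"

def new_index_name(dirpath):
    """New name: same structure as the main tree, under /indexes."""
    return dirpath + ".html"

def needs_legacy_link(dirpath):
    """True when the old and new names differ, so a symlink is needed."""
    return legacy_index_name(dirpath) != new_index_name(dirpath)
```

For "if-archive/games/competition2016" the old name is "if-archiveXgamesXcompetition2016.html" and the new one is "if-archive/games/competition2016.html"; for the top-level "if-archive" the two are identical and no link is created.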

Begin using a CDN

CloudFlare? CloudFront (Amazon)? See what's easy to set up.

We would deprecate the old mirror network. Mirrors could continue to operate, but we'd drop the list from the front page and make mirror.ifarchive.org a synonym of the main site.

Then remove robots.txt and see if everything survives. :)

make-master-index.py should walk the tree

Relying on ls -lR is an old hack. It should just walk the directory tree, looking for Index files.

Then we could get rid of the LC_COLLATE hack, too. (Sort in Python code rather than using ls order.)
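A sketch of the walk, using `os.walk` and a plain Python sort. `find_index_files` is a hypothetical name, and the assumption is that every index file is literally named "Index".

```python
import os

def find_index_files(root):
    """Return the paths of all files named "Index" under root, sorted.

    Hypothetical helper. Python compares strings by code point, so the
    order does not depend on the locale or on LC_COLLATE.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if "Index" in filenames:
            found.append(os.path.join(dirpath, "Index"))
    found.sort()
    return found
```

Code-point order puts uppercase names before lowercase ones, so if the current output relies on a different ordering, a sort key would have to be added.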
