nytimes / library

A collaborative documentation site, powered by Google Docs.

Home Page: https://nyt-library-demo.herokuapp.com/

License: Apache License 2.0

Dockerfile 0.10% Shell 0.37% JavaScript 67.87% SCSS 18.36% EJS 13.26% Procfile 0.03%

library's Issues

Display a warning when multiple documents resolve to the same path

Problem Description

Google Drive allows multiple documents with the same name to exist in the same folder. Library currently only allows a single item to occupy a given path.

Feature

When this situation happens in a Google Drive or shared folder, Library should detect it and emit a warning that says only one document will be accessible through Library.
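One way the detection could work, as a minimal sketch. The `{ id, path }` shape and both function names are assumptions made for illustration, not Library's actual data model:

```javascript
// Sketch only: `docs` is assumed to be an array of { id, path } objects.
function findPathCollisions (docs) {
  const idsByPath = new Map()
  for (const { id, path } of docs) {
    const ids = idsByPath.get(path) || []
    ids.push(id)
    idsByPath.set(path, ids)
  }
  // Keep only the paths claimed by more than one document.
  return Array.from(idsByPath).filter(([, ids]) => ids.length > 1)
}

// Emits one warning per contested path and returns how many were found.
function warnOnCollisions (docs, warn = console.warn) {
  const collisions = findPathCollisions(docs)
  for (const [path, ids] of collisions) {
    warn(`${ids.length} documents resolve to ${path}; only one will be accessible (ids: ${ids.join(', ')})`)
  }
  return collisions.length
}
```

Passing the logger in as `warn` keeps the check independent of whichever logging setup an installation uses.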

"no results" search should not suggest a PR in this git repository

Context (Environment)

Not context dependent -- seen in the Knight Lab Library installation as well as the demo site

Expected Behavior

Perhaps a link to a repository specified by the party running a Library instance, or perhaps entirely custom per-installation text.

Actual Behavior

A search with no results, such as https://nyt-library-demo.herokuapp.com/search?q=foobar shows this:

"a PR on GitHub" links to this very repository, even for Library instances for which NYT has no responsibility.

To Reproduce

Try this link: https://nyt-library-demo.herokuapp.com/search?q=foobar or any other no-results search in a Library instance.

Additional Information

Possible Solution

For many folks running a Library instance, a GitHub PR wouldn't be the best way for their community to request a new page. But even if it is, the Git repo URL should be configurable.

Increased styling customization options

Problem Description

@NUKnightLab has its own style guide, and we would like to use it to style our Library instance and then use select default Library styles such as the grid system alongside the custom styles.

Right now, a way to achieve this would be to create an empty file for each file in the styles directory in the custom directory (ex: custom/styles/core/_furniture.scss to override styles/core/_furniture.scss) and then selectively copy-paste Library styles we want to keep into those custom files.

Feature

It would be nice to have an easier way to customize Library beyond fonts and colors. One way to do this would be to create more modular SCSS files per feature/component that can be selectively imported. Happy to discuss other implementations in this thread!

Large document trees in shared folders exceed API limitations

As reported by @pfhayes, in large document trees inside a shared folder, the list of parentIds may get too long to complete in a single request. We should partition this into a maximum number of parentIds to maintain parallelization most of the time but resolve the issue that @pfhayes observed.

Originally posted by @pfhayes in #15
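A rough sketch of the partitioning idea. `fetchChildren` stands in for whatever call lists the children of a set of parent IDs, and the batch size of 50 is an arbitrary placeholder rather than a documented API limit:

```javascript
// Hypothetical helper: split a long list into fixed-size batches.
function chunk (items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer')
  }
  const batches = []
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size))
  }
  return batches
}

// Runs one request per batch in parallel and merges the results in order.
// `fetchChildren(ids)` is assumed to return a promise of an array.
async function fetchAllChildren (parentIds, fetchChildren, maxPerRequest = 50) {
  const batches = chunk(parentIds, maxPerRequest)
  const results = await Promise.all(batches.map((ids) => fetchChildren(ids)))
  return results.reduce((all, part) => all.concat(part), [])
}
```

Small trees still go out as a single request, so the common case keeps its current behavior.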

EJS template strings in Google Docs are not escaped correctly when rendering

Expected Behavior

In Google Doc:

```
<%- include('./../default/partials/pagination', { pages }) %>
```

The code snippet should render with the contents:

<%- include('./../default/partials/pagination', { pages }) %>

Actual Behavior

The code snippet is rendered empty:

Possible Solution

Is there a conflict between the EJS templating and EJS template strings?
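If the document text does pass through an EJS render step (an assumption; this issue does not confirm it), one option would be to rewrite the delimiters first. EJS treats `<%%` and `%%>` as literal `<%` and `%>`. The helper name below is hypothetical:

```javascript
// Hypothetical helper: turn EJS delimiters in document text into their
// literal forms so a later EJS render outputs them instead of running them.
function escapeEjsDelimiters (text) {
  return text.replace(/<%/g, '<%%').replace(/%>/g, '%%>')
}
```

For the snippet above, this would turn `<%- include(...) %>` into `<%%- include(...) %%>` before rendering.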

Read Access Restrictions

Problem Description

We’re interested in being able to restrict access to selected folders, pages and docs. We’d like to use Library for collaborative reporting projects, which would require restricting access to some sensitive materials. Is that a feature that can be added?

Feature

Restricted access to selected files.

Additional Information

Doubled search results

Context (Environment)

Knight Lab's installation of Library

Expected Behavior

To see each relevant page only once

Actual Behavior

To Reproduce

I've tried this with other searches and this is the only one I can cause it with.

Additional Information

The search term "design resources" is also the name of the folder containing all of the doubled documents. But searching for other folder names doesn't result in doubled results.

The "overview" page includes links to the other doubled pages, so maybe it has something to do with the doubling? I tried a few changes and can't seem to recreate it with any similar cases.

Possible Solution

/cc @maxine

Problems setting up Library

Hello,

I tried to set up Library for my organisation, but I’m facing an error that I don’t understand.

Here are the steps I took :

Create a shared folder under my account (managed by my organisation)
Create a project under my organisation
Operations on GCP :
1. Enable Google Drive API
2. Enable Cloud Datastore in Datastore mode (nothing about mode is specified in README.md)
3. Create a service account with Cloud Datastore User role (and get cred JSON file)
4. Create oAuth credentials with consent screen
Operations on local environment :
1. Cloned repo
2. Installed dependencies (npm i --no-optional)
3. Copy oAuth cred JSON file to server/.auth.json
4. Setup .env file with :
  - NODE_ENV : development
  - GOOGLE_CLIENT_ID : client id from oAuth cred
  - GOOGLE_CLIENT_SECRET : client secret from oAuth cred
  - GCP_PROJECT_ID : created project ID
  - APPROVED_DOMAINS : same as approved domains set for oAuth credentials
  - SESSION_SECRET : generated with node -e "console.log(require('crypto').randomBytes(256).toString('base64'));"
  - DRIVE_TYPE : set to folder
  - DRIVE_ID : created folder ID

After I took those steps, I launched the project with npm run build && npm run watch and navigated to http://localhost:3000.

I directly got a 500 error and this is logged in terminal :

{ message:
[0]    '9 FAILED_PRECONDITION: no matching index found. recommended index is:\n- kind: LibraryViewTeam\n  properties:\n  - name: userId\n  - name: lastViewedAt\n    direction: desc\n',
[0]   stack:
[0]    'Error: 9 FAILED_PRECONDITION: no matching index found. recommended index is:\n- kind: LibraryViewTeam\n  properties:\n  - name: userId\n  - name: lastViewedAt\n    direction: desc\n\n    at Object.exports.createStatusError (/home/nicolas/dev/yellow/library/node_modules/grpc/src/common.js:91:15)\n    at Object.onReceiveStatus (/home/nicolas/dev/yellow/library/node_modules/grpc/src/client_interceptors.js:1204:28)\n    at InterceptingListener._callNext (/home/nicolas/dev/yellow/library/node_modules/grpc/src/client_interceptors.js:568:42)\n    at InterceptingListener.onReceiveStatus (/home/nicolas/dev/yellow/library/node_modules/grpc/src/client_interceptors.js:618:8)\n    at callback (/home/nicolas/dev/yellow/library/node_modules/grpc/src/client_interceptors.js:845:24)',
[0]   code: 9,
[0]   metadata: { _internal_repr: {} },
[0]   details:
[0]    'no matching index found. recommended index is:\n- kind: LibraryViewTeam\n  properties:\n  - name: userId\n  - name: lastViewedAt\n    direction: desc\n',
[0]   note:
[0]    'Exception occurred in retry method that was not classified as transient' }

I can’t find where I’m wrong setting up Library... Can you please help me ?

Thanks in advance.

Docs should make clear an HTTPS clone URL is required for customization repo config variable

If I set CUSTOMIZATION_GIT_REPO to git@github.com:datadesk/library-customization.git and reboot my dyno I get:

2019-11-11T14:54:45.360804+00:00 heroku[web.1]: Starting process with command `./bin/install_customizations && npm run build && npm start`
2019-11-11T14:54:47.912752+00:00 app[web.1]: Checking for CUSTOMIZATION_GIT_REPO environment variable...
2019-11-11T14:54:47.912854+00:00 app[web.1]: Cloning custom repo...
2019-11-11T14:54:47.915816+00:00 app[web.1]: Cloning into 'custom'...
2019-11-11T14:54:47.991172+00:00 app[web.1]: Host key verification failed.
2019-11-11T14:54:47.991908+00:00 app[web.1]: fatal: Could not read from remote repository.
2019-11-11T14:54:47.991911+00:00 app[web.1]: 
2019-11-11T14:54:47.991913+00:00 app[web.1]: Please make sure you have the correct access rights
2019-11-11T14:54:47.991916+00:00 app[web.1]: and the repository exists.

If I switch it to the HTTPS version of https://github.com/datadesk/library-customization.git, it works fine.

The screenshot in the docs uses the HTTPS version, but I would propose updating the text to make clear you must use HTTPS. The SSH link is the default on GitHub and I could imagine others, like me, thoughtlessly pasting it in.

History localStorage does not function correctly

Expected Behavior

History should be stored locally in localStorage using a user-specific key, and this history should be used within an hour of it being stored.

Actual Behavior

Because an assignment is used where a conditional comparison should be, along with some other issues, the item in localStorage is never used.

Possible Solution

A few small tweaks to the client-side logic would fix this — see #51 for another example of localStorage being used with a time-based expiry.
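For reference, a minimal sketch of the intended behavior. The key format and stored shape are assumptions for illustration, and `storage` is anything with `getItem`/`setItem` (`window.localStorage` in a browser):

```javascript
const ONE_HOUR_MS = 60 * 60 * 1000

// Assumed key format: one entry per user.
function historyKey (userId) {
  return `libraryHistory:${userId}`
}

function saveHistory (storage, userId, history, now = Date.now()) {
  storage.setItem(historyKey(userId), JSON.stringify({ savedAt: now, history }))
}

// Returns the stored history, or null if it is missing, corrupt, or stale.
function loadHistory (storage, userId, now = Date.now()) {
  const raw = storage.getItem(historyKey(userId))
  if (raw === null) return null
  let entry
  try {
    entry = JSON.parse(raw)
  } catch (err) {
    return null // corrupt entry: treat it as a cache miss
  }
  if (!entry || typeof entry.savedAt !== 'number') return null
  // A comparison, not an assignment: use the entry only while under an hour old.
  return now - entry.savedAt < ONE_HOUR_MS ? entry.history : null
}
```

Taking `storage` and `now` as parameters makes the expiry logic testable outside a browser.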

Abstract homepage modules into custom plugins

Problem Description

Currently, the "Find by Team" section and "Useful Docs" modules are not customizable. We should allow this behavior to be changed via the custom deploy pattern demonstrated in nytimes/library-customization-example.

Feature

Core Library code should be updated to register plugins for that particular space, and generically lay out a variable number of modules which contain their own logic. We should update the docs to demonstrate how those sections can be customized, and update the demo site to use this feature.

Set a max height for folder views with scroll

Problem Description

Currently there is no max height on the boxes that list the contents of folders in the /categories view. This can hinder usability and look goofy when there are top-level folders with many items, especially when next to folders containing few items.

Feature

Add a reasonable max-height to the children-container class, and set overflow: scroll. Additionally, add some indication that the element is scrollable, e.g. a fade-out at the bottom or a permanent scroll bar.

This maximum height should be small enough that sections can be bypassed on mobile devices without having to scroll to the bottom of the scroll view first. (This likely needs a spec.)

Document creation timestamps

Problem Description

An issue discussed on the October community call was that it is often difficult to determine if documentation is outdated. A good first step to provide some additional clarity here would be to include creation dates on document pages.

Feature

Maybe something like this?

Not sure if an exact date or a relative timestamp with alt-text showing the exact date would be more useful in this scenario.
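A rough sketch of the relative-timestamp option, returning both the label and the exact date for the alt-text. The function names and the day, month and year thresholds are assumptions, not an agreed spec:

```javascript
const DAY_MS = 24 * 60 * 60 * 1000

function plural (count, unit) {
  return `${count} ${unit}${count === 1 ? '' : 's'} ago`
}

// Returns a relative label plus the exact date (YYYY-MM-DD, UTC) for alt-text.
function describeCreation (createdAt, now = new Date()) {
  const created = new Date(createdAt)
  const days = Math.floor((now - created) / DAY_MS)
  let label
  if (days < 1) label = 'today'
  else if (days < 30) label = plural(days, 'day')
  else if (days < 365) label = plural(Math.floor(days / 30), 'month') // approximate months
  else label = plural(Math.floor(days / 365), 'year') // approximate years
  return { label, exact: created.toISOString().slice(0, 10) }
}
```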

Unexpected token in JSON error on current master commit d8a2c447

I followed the detailed instructions in the getting started pages (and looked at issue #12). APIs are enabled, oAuth2 tokens created, datastore indexes created, service account, etc.

I have one file only in my drive folder, and the drive ID is correct. When I run locally with npm run watch, I get the following:

[0] debug: updating tree...
[0] warn: failed updating tree
[0] { message: 'Unexpected token ' in JSON at position 0',
[0] stack: 'SyntaxError: Unexpected token ' in JSON at position 0\nSyntaxError: Unexpected token ' in JSON at position 0\n at JSON.parse ()\n at ReadStream.inputStream.setEncoding.on.on.on (/Users/mark/ibis/devel/src/internal-documentation/library/node_modules/google-auth-library/build/src/auth/googleauth.js:346:39)\n at ReadStream.emit (events.js:164:20)\n at endReadableNT (_stream_readable.js:1062:12)\n at process._tickCallback (internal/process/next_tick.js:152:19)\nFrom previous event:\n at /Users/mark/ibis/devel/src/internal-documentation/library/node_modules/promise-inflight/inflight.js:29:16\n at _inflight (/Users/mark/ibis/devel/src/internal-documentation/library/node_modules/promise-inflight/inflight.js:28:25)\n at /Users/mark/ibis/devel/src/internal-documentation/library/node_modules/promise-inflight/inflight.js:22:14\n at runCallback (timers.js:773:18)\n at tryOnImmediate (timers.js:734:5)\n at processImmediate [as _immediateCallback] (timers.js:711:5)\nFrom previous event:\n at inflight (/Users/mark/ibis/devel/src/internal-documentation/library/node_modules/promise-inflight/inflight.js:14:40)\n at updateTree (/Users/mark/ibis/devel/src/internal-documentation/library/server/list.js:66:10)\n at startTreeRefresh (/Users/mark/ibis/devel/src/internal-documentation/library/server/list.js:372:11)\n at Timeout.setTimeout [as _onTimeout] (/Users/mark/ibis/devel/src/internal-documentation/library/server/list.js:378:22)\n at ontimeout (timers.js:466:11)\n at tryOnTimeout (timers.js:304:5)\n at Timer.listOnTimeout (timers.js:264:5)',
[0] stackCleaned: true }

Is this a known issue in the code, or an indication that I set up authentication wrong? Please help!

Audit script load locations, inline code

As Isaac said in #51, we could likely move many (if not all) of our script tags out of the document <head>.

It looks like all the other Library scripts are in the head (and we just haven't audited this in a while). [...] I think it would be worthwhile for us to be consistent about where scripts are included. [...] What do you think about moving the other includes so they can all be together? My hunch is that this wouldn't break anything but would be curious to see what you find.

Originally posted by @isaacwhite in #51

Additionally, as seen in #283, our inline client-side code could probably use some cleanup as well.

Support Node 12

Problem Description

Library currently officially supports LTS Node versions 8 and 10. Node 8 is in maintenance mode, with end-of-life planned for April 2020.

Feature

Since Node 12 is the current LTS version, we should support it. We should:

Update any dependencies that rely on libraries not compatible with Node 12
Enable a Node 12 environment in Travis

Additional Information

Attempting to run the app on Node 12 on my macOS 10.14 machine, I had to rebuild node-sass before the app would start, and it would then eventually fail because of the gRPC dependency. Rebuilding gRPC fails on my machine.

Service Account Access

Hello! I've been working on getting an instance of Library running locally prior to deploying in a test environment and think I have it mostly working except for one part. At the bottom of this page when attempting to share the folder with the service account address, I don't have that option. Instead, "Share" is disabled and grayed out.

After reading a bit, I found this Google support answer which says Note: You may not share folders stored in Team Drives.

We attempted to give the service account read only access to the team drive even though we're using a sub-folder, but no luck.

Here's the log output when attempting to load localhost:3000:

app_1  | debug: searching for files > 0
app_1  | debug: Current file count in drive: 0
app_1  | debug: tree updated.
app_1  | error: Serving an error page for /
app_1  | { message: 'Cannot convert undefined or null to object',
app_1  |   stack:
app_1  |    'TypeError: Cannot convert undefined or null to object\n    at Function.keys (<anonymous>)\n    at buildDisplayCategories (/usr/src/app/server/routes/pages.js:56:29)\n    at handlePage (/usr/src/app/server/routes/pages.js:47:24)' }

It may be unrelated since we aren't customizing any of the behavior, but I see the following debug errors during boot:

debug: Failed pulling in custom file cache/store @ /usr/src/app/custom/cache/store
debug: Failed pulling in custom file userAuth @ /usr/src/app/custom/userAuth
debug: Failed pulling in custom file csp @ /usr/src/app/custom/csp

Maybe we need to specify these? Is there a different way to give the service account access to a subfolder of our team drive that we might be missing? Thanks in advance!

Server error 500 after deploying site to Google App Engine

Context (Environment)

Google App Engine Deployment
Node v9.0.0

Expected Behavior

After running gcloud app deploy --project=**my_project_id**
and then
gcloud app browse --project=**my_project_id**
I should be able to browse my library site.

Actual Behavior

After running gcloud app deploy --project=**my_project_id**
and then
gcloud app browse --project=**my_project_id**
I'm asked to sign into my library site. After signing in with my account, I'm presented with the following error:

To Reproduce

Follow steps as written on the following guides in the following order:

Additional Information

Seems to be the same problem as explained here: #17. Full disclosure, I'm not experienced in doing this sort of thing, and am unsure of even where to start troubleshooting this problem! I have noticed some small discrepancies between the steps I followed (from the sample NYT Library here) and the steps as they are listed here on GitHub: https://github.com/nytimes/library#deploying-the-app (e.g. the NODE_ENV=development # node environment line in step 4 of Development Workflow).

Possible Solution

I've been through all the steps a few times, maybe I just need to start again and stick to the instructions as they appear on GitHub?

Never-edited docs show undefined last editor in search results

Context (Environment)

MacOS/Linux, Node 8 - can be seen locally and in production

Expected Behavior

When searching for documents, a list should be shown with each article's folder and last editor.

Actual Behavior

If an article is new and has never been edited/only edited once, undefined is shown as the last editor.

To Reproduce

Create a new page in your drive, and search for its title from within Library.

Possible Solution

If an article has not been edited, there is currently undefined behavior, but I would advocate either

show nothing, or
show the author name instead
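Both options can be covered by one small fallback, sketched below. The field names `lastEditor` and `author` are assumptions about the document metadata, not Library's actual property names:

```javascript
// Hypothetical sketch: pick a byline for a search result without ever
// printing "undefined".
function editorLabel (doc) {
  if (doc.lastEditor) return `Last edited by ${doc.lastEditor}`
  if (doc.author) return `Created by ${doc.author}`
  return '' // nothing known: show nothing
}
```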

Preserve deep links to other documents

Expected Behavior

When one library document links to another using the docs.google.com url, the resulting rendered version of that link should preserve deep links such as those to headings in the document: .../edit#heading=h.h1dswxgykb74.

Actual Behavior

The rendered link drops the hash from the original link. This is probably because the original link is used only to get the document ID, and the canonical URL for that document (without any deep links) is used directly:

library/server/formatter.js

Line 85 in 5c08c23

const {path: libraryPath} = isDoc ? list.getMeta(docId) || {} : {}

Possible Solution

The hash component of document urls could also be parsed out, and appended to the canonical url.
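A minimal sketch of that approach, using the standard `URL` class. `withDeepLink` is a hypothetical helper, and whether the rendered Library page exposes anchors matching Google's heading IDs is a separate question this sketch does not answer:

```javascript
// Hypothetical helper: `libraryPath` is the canonical path already looked up
// for the linked document (falsy when the link is not a Library doc).
function withDeepLink (originalHref, libraryPath) {
  if (!libraryPath) return originalHref
  let hash = ''
  try {
    hash = new URL(originalHref).hash // e.g. "#heading=h.h1dswxgykb74"
  } catch (err) {
    // Not an absolute URL: fall through without a hash.
  }
  return libraryPath + hash
}
```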

Use Google Drive for checking permissions instead of e-mail address domain

After looking at the source code and some testing, my understanding is that a user is allowed to view library pages (Google docs) based solely on the domain associated with their e-mail address. This authorization check is made here:

library\server\userAuth.js:61

if (isDev || (authenticated && domains.has(userDomain))) {
  setUserInfo(req)
  return next()
}

This means that users who don't have permission through Google's sharing settings can still view Google documents through the library app because the library app is accessing the content through the service account (which does have permission). If the user's e-mail domain is included in the domains list specified by the APPROVED_DOMAINS environment variable, they can view all the documents in the library.

This approach to authorization is far too coarse for a large organization.

Ideally, the authorization should be controlled by the current user's permission to the folder/files through the Google API (https://developers.google.com/drive/api/v3/reference/permissions) to avoid a back door to accessing sensitive Google docs.

Please consider adding this feature to make the library app more secure.

incorrect logged in user in nav bar

Context (Environment)

Knight Lab's installation of Library

Expected Behavior

To see my name/email in the top right corner of the navbar when I am logged in. I know that I am logged in because I can see the "edit page" link on pages, and when I click on them, they open, and I can, in fact, edit them.

Actual Behavior

To Reproduce

I don't have access to another installation of Library to find out if this is common.

Additional Information

Knight Lab's Library uses Google Drive folder, not Team Drive

I have multiple Google identities active in this browser; generally, my gmail identity is the default, but for Library, my u.northwestern.edu (GSuite for Education) account is required.

Possible Solution

Maybe the folder, instead of Team Drive, results in confusion over my current identity?

Maybe the fact that I have multiple active Google logins in this session causes confusion with the identity?

cc @maxine

Redirect URI incorrect when running behind an HTTPS proxy

Context (Environment)

I'm running the Library application as a Fargate task fronted by an ALB that handles HTTPS offloading. Access is restricted to our corporate domain.

Expected Behavior

When I attempt to login, the redirect uri provided should use the same https protocol that I am using to access the library site.

Actual Behavior

When I initially visit the library website, I am redirected to google for authentication. This redirect URI that is sent with the request is for the correct domain but http instead of https.

To Reproduce

Offload HTTPS encryption outside of the application.

Possible Solution

This appears to be caused by the application not detecting that there is https offloading going on. I've found a few possible options that might explain the behavior:

app.enable("trust proxy");
https://stackoverflow.com/questions/20739744/passportjs-callback-switch-between-http-and-https

app.use(forceSsl);
https://stackoverflow.com/questions/7185074/heroku-nodejs-http-to-https-ssl-forced-redirect

I'll be testing out the first option shortly. It seems the most direct solution that wouldn't impact others who aren't using https. The latter option seems like it would need to be gated behind an environment variable, so that only people who want it enabled are affected.
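For context, here is a dependency-free sketch of what trusting the proxy amounts to: taking the protocol from the X-Forwarded-Proto header rather than from the local socket. The function names and the `/auth/redirect` callback path are placeholders, not Library's actual code:

```javascript
// `req` is assumed to look like a Node http request: { headers, connection }.
function externalProtocol (req, trustProxy) {
  const forwarded = req.headers['x-forwarded-proto']
  if (trustProxy && forwarded) {
    // Chained proxies may append values; the first one is the client-facing protocol.
    return forwarded.split(',')[0].trim()
  }
  return req.connection && req.connection.encrypted ? 'https' : 'http'
}

// Builds the OAuth redirect URI using the protocol the client actually used.
function redirectUri (req, trustProxy, callbackPath = '/auth/redirect') {
  return `${externalProtocol(req, trustProxy)}://${req.headers.host}${callbackPath}`
}
```

Leaving `trustProxy` off by default matters: a forwarded header is only trustworthy when the app can be reached solely through the proxy.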

Links on images in Google Docs

Does Library recognize and render links that are applied to images within Google Docs? I did some testing and noticed links attached to images were stripped after being rendered in Library.

Is there a workaround for this or did I miss something in the docs? Just curious.

Server error serving main page

Context (Environment)

Heroku deployment

Expected Behavior

Open app front page

Actual Behavior

Server error reported. Heroku log showing:

error: Serving an error page for / 
2019-04-15T18:45:58.064071+00:00 app[web.1]: { message: 'Cannot convert undefined or null to object',
2019-04-15T18:45:58.064074+00:00 app[web.1]:   stack:
2019-04-15T18:45:58.064084+00:00 app[web.1]:    'TypeError: Cannot convert undefined or null to object\n    at Function.keys (<anonymous>)\n    at buildDisplayCategories (/app/server/routes/pages.js:56:29)\n    at handlePage (/app/server/routes/pages.js:47:24)' }

To Reproduce

Navigate to front page.

Additional Information

I've tried my best to follow the steps for Heroku deployment. I got as far as gcloud datastore indexes create ./index.yaml --project your-gcp-project-here which is only documented in the yaml file itself.

Happy to supply config info on Google Cloud and/or Heroku as needed.

Template Customization

Problem Description

For users who want to use the underlying Library infrastructure, but be able to give the site their own look and feel, this aims to allow template/layout customization using the existing custom folder.

Feature

One idea on how to implement stems from a previous idea that allowed different sections to use different pages. We could start here: https://github.com/nytimes/library/blob/master/server/routes/categories.js#L28-L29 and potentially extend this to also check the layouts within the custom folder as well.

This is something that I'll start digging into.

Styles sometimes broken on 403 error pages

Expected Behavior

The error pages should display with the same styles regardless of authentication status.

Actual Behavior

The 403 page displays without any styles sometimes, because the stylesheets are behind the authentication layer in express.

To Reproduce

Run Library with the included OAuth 2.0 integration, selecting a Google account that does not have access based on the APPROVED_DOMAINS env var.

Possible Solution

We could inline a selection of base styles, or compile a separate stylesheet for error pages that is not authenticated.

Explore possibility of multiple Library integrations

Problem Description

During the October 15 community call, a number of people mentioned their orgs were already using other documentation tools, or were using multiple tools in concert with Library. Some of the tools mentioned were GitHub wikis, Confluence, and MkDocs. Library is currently limited to one Google Drive folder/shared drive, but could Library be a more general knowledge aggregator?

Feature

There has also been conversation about including support for multiple drives/folders in Library, see #40. In parallel with that work, it may be worth thinking about how non-google drive sources could be supported by Library. In the future, it would be nice to make it easy for anyone to write a "plug-in" that could provide some type of crawlable API for library to read files from.

I'd be curious to hear what integrations the community would be interested in, and if anyone has opinions as to how supporting multiple sources should be engineered.

Application not listing files on drive

Context (Environment)

Operating System : Ubuntu 18.04
node version : v10.15.1

Expected Behavior

Web UI should list the files on the shared drive folder.

Actual Behavior

Web UI is not listing the files on the shared drive folder.

To Reproduce

Followed the installation steps mentioned on https://nyt-library-demo.herokuapp.com/get-started

Additional Information

I can see the application logs and it shows

[0] debug: searching for files > 3
[0] debug: Current file count in drive: 2

And whenever I create a new file it shows info: CACHE PURGE /first-demo FROM CHANGE AT /first-demo

Possible Solution

Support Multiple Team Drives or Shared Folders

Problem Description

At large organizations, it would be useful to connect Library to multiple team drives or shared folders to power site content. This would allow for sharing a single Library site with viewing permissions for everyone while keeping the write permissions more granular.

Support for iframe videos

Problem Description

Library blocks video embed codes from rendering on the page, even when using the custom power user markup feature.

Feature

Additional Information

Here's a screenshot of the error in production:

Here's a screenshot of the Google Doc:

I installed... now what? There is nothing.

Fully installed using google cloud platform and heroku following the setup guide.
After the - very elaborate and well documented - setup guide there is hardly any documentation on how to proceed.

Expected Behavior

I expected to see some app where you can browse and make pages and folders or something. Similar to the documentation pages but then with edit functionality.

Actual Behavior

I first got a 500 error. Then I tried clicking around, but there is nothing except a search bar that turns up nothing (because there is nothing).
"get started" link gives a 400 error.

Additional Information

According to documentation, there should be an add page button, but there is not.
What am I missing?
I also tried adding a page and a folder to the google drive folder to see if it shows up in the app but it does not.
I looked on the Google group, but it seems nothing has ever been posted there.

Add autocomplete for search

Problem Description

As the number of documents in Library increases, it may be hard to surface individual documents via search if you don't know exactly what you're looking for. Additionally, content may not be shown due to spelling errors (see example).

Nearly correct spelling doesn't show result:

Exact spelling does show desired result:

Feature

Create a dropdown options list for search results as you type in the input box:

Additional Information

Search Returns 500 Error

Thanks for building this and providing it to the world for free.

Context (Environment)

Heroku

Expected Behavior

Get some search results

Actual Behavior

We get the standard app 500 page.

The logs just say:

 error: Error when searching for test, Error: <HTML>
 <HEAD>
 <TITLE>Bad Request</TITLE>
 </HEAD>
 <BODY BGCOLOR="#FFFFFF" TEXT="#000000">
 <H1>Bad Request</H1>
 <H2>Error 400</H2>
 </BODY>
 </HTML>

To Reproduce

You'd have to login to our site, but I'm happy to do any debugging you suggest.

Additional Information

We use a shared folder, not a team drive.
Everything else seems to work.
I didn't set up the indexes (the optional step)
I searched around and couldn't find any info anywhere about this.

Trying to get the project running

Hello! 👋

I'm an engineer on the documentation team at GitHub. My colleague @sarahs and I have been watching this project for a while, and now I'm finally trying it out.

I've got this set up:

App is running on Node.js 10, macOS
Created a Google API project with Drive and Cloud Store access
Created a service account
Created server/.auth.json with credentials file created by Google
Created OAuth 2 client id
Created .env with all values
Created a new folder in Drive using my personal Google account and updated related values in .env
Created a new Google Doc in that folder with some filler content

Unfortunately, I'm still seeing a lot of errors in my console. I added them to a gist here: https://gist.github.com/zeke/c2572650c5211985b33aa42106f1ec02

Some highlights from that output:

'Cannot find module '/Users/z/git/forks/library/custom/cache/store'
warn: GOOGLE_APPLICATION_CREDENTIALS was undefined, using default ./.auth.json credentials file...
Cannot find module \'/Users/z/git/forks/library/custom/userAuth\''
Cannot find module \'/Users/z/git/forks/library/custom/csp\

Context (Environment)

$ node --version
v10.16.2

$ uname -a
Darwin calvisitor-10-105-181-69.calvisitor.1918.berkeley.edu 18.7.0 Darwin Kernel Version 18.7.0: Sat Oct 12 00:02:19 PDT 2019; root:xnu-4903.278.12~1/RELEASE_X86_64 x86_64

Expected Behavior

When running npm run watch, I expect to see a local running server.

Actual Behavior

I see a bunch of errors: https://gist.github.com/zeke/c2572650c5211985b33aa42106f1ec02

To Reproduce

Follow the checklist above. :)

Possible Solution

It looks like there are some undocumented expectations in the code about certain local files that need to exist.

Maybe the Google App doesn't have access to the Google Drive folder I created?

I'd be happy to jump on a video call with someone to pair on this. Once I've got a successfully running installation of library I'd be happy to follow up with updates to the documentation to make the setup process more clear for newcomers.

Thanks for reading!

Add an environment variable to allow app to be mounted below root

The library app (as I understand the source code) assumes it is mounted at the root ('/') path of a website since all the redirects and auth callbacks start at '/' (e.g., '/login').

I would like to mount the app on a path instead. For example:
https://www.example.com/wiki

This will be done using an NGINX reverse proxy mapped to the '/wiki' location.

Ideally, there would be an optional environment variable that could be used to specify the mount point, for example LIBRARY_MOUNT_PATH='/wiki'.
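To illustrate the request, here is a minimal sketch of how a mount point could be normalized and applied to app-relative routes such as '/login'. LIBRARY_MOUNT_PATH is the variable proposed in this issue, not an existing setting, and the helper names `normalizeMountPath` and `withMount` are hypothetical.

```javascript
// Hypothetical helpers (assumption): LIBRARY_MOUNT_PATH does not exist in Library today.
function normalizeMountPath(raw) {
  // Strip leading and trailing slashes, then add back exactly one leading slash.
  // Unset, empty, or "/" all mean "mounted at the root".
  const trimmed = (raw || '').replace(/^\/+|\/+$/g, '');
  return trimmed ? '/' + trimmed : '';
}

function withMount(mountPath, route) {
  // Prefix an app-relative route such as "/login" with the mount point.
  return normalizeMountPath(mountPath) + route;
}

// Example: build the login redirect target from the environment.
console.log(withMount(process.env.LIBRARY_MOUNT_PATH, '/login'));
```

In an Express app the normalized value could also be passed to `app.use(path, router)` so that every route is served under the prefix, while redirects and auth callback URLs are built with a helper like `withMount`.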

Hidden subfolders appear on "View All" page

Expected Behavior

Hidden subfolders should not appear.

Actual Behavior

Hidden subfolders do appear.

To Reproduce

Create a folder within a folder with a name ending in " | hidden".

Slack OAuth

This is the biggest feature request I can think of. Feel free to blow it off. The reason I raise it is that our newsroom, like many, uses a mix of Microsoft Active Directory and Slack for its authentication, but not Google. If you can get this app to interface with Slack for sign-ins, I think you'd open it up to a wider range of users. Though of course if you don't have the time or the interest, that would be totally understandable.

Document titles ending in `home` are interpreted as homepages when not tagged

Expected Behavior

The word "home" should be preserved at the end of a document name and not used as a tag unless it follows a | character.

Actual Behavior

home is stripped from the end of document titles, and the document becomes the homepage for the folder it is in.

To Reproduce

See: https://nyt-library-demo.herokuapp.com/semi-secret/deeply-nested-folder

Possible Solution

Update the tags regex and investigate homepage selection logic
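As a rough illustration of the expected behavior, the sketch below treats text as tags only when it follows a | character. It is not Library's current parsing code: the function name `parseTitle` is hypothetical, and the assumption that multiple tags are comma-separated is mine.

```javascript
// Hypothetical parser (assumption): tags are whatever follows the first "|",
// separated by commas. A trailing word such as "home" without a "|" stays in the title.
function parseTitle(name) {
  const pipeIndex = name.indexOf('|');
  if (pipeIndex === -1) {
    return { title: name.trim(), tags: [] };
  }
  const title = name.slice(0, pipeIndex).trim();
  const tags = name
    .slice(pipeIndex + 1)
    .split(',')
    .map((tag) => tag.trim().toLowerCase())
    .filter(Boolean); // drop empty entries such as a dangling comma
  return { title, tags };
}

console.log(parseTitle('Welcome home'));         // no pipe, so no tags
console.log(parseTitle('Team handbook | home')); // tagged as a homepage
```

With this approach, homepage selection would check `tags.includes('home')` rather than matching the word at the end of the name.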

Make the Recently/Most Viewed Stories dropdown optional based on Datastore config

Dropdown with Datastore Indexes set up:

Context (Environment)

A Library instance without Datastore Indexes set up.

Expected Behavior

Since setting up Datastore Indexes is optional, there should not be a dropdown if the indexes aren't set up.

Actual Behavior

Even without the indexes set up, an empty window still drops down.

Possible Solution

Enable/disable the dropdown depending on whether the Datastore API/indexes are set up on the GCP project.

Library doesn't handle "number-only" folder names

Context (Environment)

Used "Deploy to Heroku" button with no customization

Expected Behavior

Folders whose names consist of only numbers should be rendered, indexed, and slugged correctly.

Actual Behavior

Folders whose names consist of only numbers are displayed as empty strings in browsing (see screenshot) and are dropped from slugs (e.g. exec-agendas/2018/exec-agenda -> exec-agendas/exec-agenda).

To Reproduce

Create a folder whose name consists of only numbers (e.g. "2018") and attempt to navigate to it or any file in it in Library.
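For reference, here is a minimal sketch of slug generation that keeps digits, so a folder named "2018" produces its own path segment. It is an illustration of the expected behavior, not Library's slug code; `slugify` and `toPath` are hypothetical names.

```javascript
// Hypothetical slug helpers (assumption): digits are kept, so "2018" stays "2018".
function slugify(name) {
  return String(name) // accept numbers as well as strings
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, '-') // collapse runs of other characters into a hyphen
    .replace(/^-+|-+$/g, '');    // strip leading and trailing hyphens
}

function toPath(segments) {
  // Drop segments that slug to an empty string, then join the rest.
  return segments.map(slugify).filter(Boolean).join('/');
}

console.log(toPath(['Exec Agendas', '2018', 'Exec Agenda']));
```

Coercing with `String(name)` matters if a name ever arrives as a number rather than a string. Note that this sketch only handles ASCII letters and digits; names in other scripts would need a different character class.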

Caching logic rewrite

Problem Description

We've observed that when Library is configured with a more persistent caching layer (e.g. Redis), incorrect behavior can sometimes occur when redirect entries and cached HTML are stored at the same location. Additionally, a lot of extra log noise is generated around extra cache purge requests, which are fired for every item in the Library tree and sometimes ignored by the caching layer.

Feature

It would be great to revisit and simplify our current caching approach to reduce the logging noise and ensure consistent behavior. Our current caching tests should also be made more extensive with this rewrite to catch these issues.

Proposed Changes

Cache only the processed document body, not the entire page
- Move cache logic earlier in document rendering pipeline
- Move byline insertion earlier in rendering pipeline
Change cache keys to be document IDs, not paths
Remove current redirect logic, replace after cache keys are changed
Investigate current purge logic, remove now unnecessary cases
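The idea of keying the cache by document ID instead of path can be sketched as follows. This is an in-memory illustration under stated assumptions, not the planned implementation: the class name `DocumentCache` and its methods are hypothetical, and redirect handling is deliberately left out.

```javascript
// Hypothetical in-memory cache (assumption): processed bodies are stored by
// document ID, with a separate lookup from the current path to that ID.
class DocumentCache {
  constructor() {
    this.entries = new Map();  // document ID -> { path, html }
    this.pathToId = new Map(); // current path -> document ID
  }

  set(id, path, html) {
    const previous = this.entries.get(id);
    // If the document moved, drop the stale path lookup (only if it is still ours).
    if (previous && previous.path !== path && this.pathToId.get(previous.path) === id) {
      this.pathToId.delete(previous.path);
    }
    this.entries.set(id, { path, html });
    this.pathToId.set(path, id);
  }

  getByPath(path) {
    const id = this.pathToId.get(path);
    return id === undefined ? undefined : this.entries.get(id).html;
  }

  purge(id) {
    const entry = this.entries.get(id);
    if (!entry) return false; // nothing cached, so nothing to purge or log
    if (this.pathToId.get(entry.path) === id) this.pathToId.delete(entry.path);
    this.entries.delete(id);
    return true;
  }
}
```

Because a rename only changes the path lookup, cached HTML and redirect data never share a key, and `purge` returning `false` for unknown IDs gives callers a cheap way to skip noisy log lines.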

Add site access options for increased viewing permission granularity

Problem Description

Currently, Library utilizes domains to control who can access the site. However, this prohibits access on a person-by-person basis and increases the amount of maintenance needed when different aliases of the same domain are often used (ex: [email protected] versus [email protected]).

Feature

To address these issues, Library should support:

individual email address access via an environment variable and OAuth configuration
the use of regex for domain matching

Additional Information

Library accesses a user's email domain via the user's passport session and checks it against the APPROVED_DOMAINS config var.
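A minimal sketch of the combined check might look like the following. APPROVED_DOMAINS is the existing config var mentioned above; the per-address list, the option names, and the function name `isApproved` are hypothetical.

```javascript
// Hypothetical access check (assumption): `domains` may hold exact strings or
// RegExp objects, and `emails` is a proposed list of individually approved addresses.
function isApproved(email, { domains = [], emails = [] } = {}) {
  const address = String(email).trim().toLowerCase();
  const domain = address.slice(address.lastIndexOf('@') + 1);

  // Individual addresses are allowed regardless of their domain.
  if (emails.some((allowed) => allowed.toLowerCase() === address)) return true;

  // Domains match either exactly or by pattern.
  return domains.some((rule) =>
    rule instanceof RegExp ? rule.test(domain) : rule.toLowerCase() === domain
  );
}

// Example: one pattern covers a domain and all of its subdomains.
const rules = { domains: [/(^|\.)example\.com$/], emails: ['guest@partner.org'] };
console.log(isApproved('editor@mail.example.com', rules));
```

Patterns should be created without the `g` flag, since `RegExp.prototype.test` is stateful on global regexes.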

How to deploy customization forks

I've successfully deployed the standard app to Heroku. Everything is a-okay. Now I want to begin to customize the vanilla deploy.

When I first created the Heroku app with the magic button, I input a CUSTOMIZATION_GIT_REPO variable that points to [email protected]:datadesk/library-customization.git, where I have forked the example customization library.

My hope is that I can simply push changes to my fork and somehow trigger them to build on Heroku, but it's unclear to me from the docs how to make this happen. Anyone have advice?

Better error messages to make it easier to get started with Library

Several different people have gotten tripped up on the following issues:

Documents in a new Library site do not show up on the homepage by default
Loading the Library homepage when it is empty can result in a 500
Clicking the default "Get started" link goes to a 404.

To help people better work around these issues, we could add some better error messages to each of these scenarios. A couple ideas:

When no documents have been tagged, display a special box on the homepage (instead of the homepage modules) that lists recent documents added
Display a custom error message when there are no documents in the site at all on the homepage, rather than a 500
Add some custom language to the 404 page for the "Get started" link, explaining that you should write your own page about how to get started with your Library site.

via #62

New issues from user input

Is it possible to warn if a user sets the type to drive when it's actually a folder, and vice versa? Via April Browning on the Google Group.

Add Google Cloud Run button

Feature

Google has recently introduced a Cloud Run button that seems to act much like the Heroku one-click deploy button. If possible, it would be great to support this and include it in our README.

Additional Information

The button supports buildpacks, a concept initially created by Heroku.

Slack integration with Library

Problem Description

Conversations sometimes happen in Slack that document important decisions or features, but over time they can get lost due to scrollback trim or difficulty searching. It would be great if there were an easy way to connect a Slack conversation to a page in Library that would be easily searchable and streamline the preservation of institutional knowledge discussed in Slack.

Feature

One approach for an MVP could be a plugin that adds a slash command creating a thread in Slack linked to a Library page. Each comment and the person adding it could be piped into the Library page, streamlining the preservation of knowledge discussed in Slack and making it easier to search archived conversations.

Additional Information

The Google Drive API has an upload endpoint that, when given an HTML payload with a Google Docs mimetype as the target, converts the uploaded body to a Google Doc. That might be helpful here if working with the Google Docs API itself is difficult.

Improvement to Dockerfile

Problem Description

I had quite the time trying to get this deployed using Docker/ECR/ECS. I couldn't get it deployed using the nytimes/library image, but I did get something deployed when I forked the repo and used the included Dockerfile.

Feature

This is the current Dockerfile:

FROM node:10

RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app

# Install NPMs
COPY package.json* package-lock.json* /usr/src/app/
RUN npm i --production

COPY . /usr/src/app
RUN npm run build

This Dockerfile successfully builds the image, but never actually runs the start command before putting it in an ECS cluster.

Adding CMD ["npm", "run", "start"] after running the build made the app work for me.

Since it's only one line, I figured it's not worth a pull request, especially since I've made other changes to my fork. Thought it would be worth giving you guys the heads up.

Additional Information

Once I finalize the details for our deployment, would there be interest in adding additional documentation on deploying with Docker? I'd be happy to write that if you feel it's necessary.

Playlist support: caching

Problem Description

Library contains basic support for a designated list of documents, called playlists. To make a list, you paste a list of document URLs (that exist inside the team drive or shared folder) into a spreadsheet, which is also in the team drive. The documents then render at a nested route inside the path of the spreadsheet. However, the full feature set for this functionality is incomplete, primarily around the caching module.

Feature

Currently playlists are not cached in order to avoid the above constraints. At the conclusion of this work, the base Library code should support the following:

cache playlist pages
when doc content changes, cache bust playlist as well
augment document meta with array of paths for thorough cache purging
handle when a document is renamed and its path in a playlist changes (and needs a redirect)
handle when the contents of a playlist change and a doc is added to or removed from a playlist
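The "array of paths" item above could be sketched like this: given a document ID, collect every path that would need a cache purge when its content changes. The data shapes, the function name `pathsToPurge`, and the assumption that a playlist route is the spreadsheet path plus the document's last path segment are all hypothetical.

```javascript
// Hypothetical purge helper (assumption): docPaths maps document ID -> canonical
// path, and each playlist is { path, docIds } with documents nested under its path.
function pathsToPurge(docId, docPaths, playlists) {
  const canonical = docPaths[docId];
  if (!canonical) return []; // unknown document, nothing to purge

  const slug = canonical.split('/').filter(Boolean).pop();
  const paths = new Set([canonical]); // a Set avoids duplicate purge requests

  for (const playlist of playlists) {
    if (!playlist.docIds.includes(docId)) continue;
    paths.add(playlist.path);              // the playlist page lists the doc
    paths.add(`${playlist.path}/${slug}`); // the doc rendered inside the playlist
  }
  return [...paths];
}

const docPaths = { d1: '/guides/style', d2: '/guides/tools' };
const playlists = [{ path: '/onboarding/reading-list', docIds: ['d1'] }];
console.log(pathsToPurge('d1', docPaths, playlists));
```

Storing the result on each document's meta would let a single content change purge the canonical page and every playlist copy in one pass.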

Additional Information

A few previous notes:

This might require refactoring the getTree() function in list.js, to not rewrite docsInfo but to extend it with each tree update.

We could potentially try to use the Drive metadata api for this.

Update node-sass when available

npm audit is currently unhappy because of https://hackerone.com/reports/344595.

The node-sass dependency chain specifies node-gyp ^3.8.0, which in turn specifies tar ^2. We'll need to await the outcome of sass/node-sass#2639 and isaacs/node-tar#212, or rely on a fork of node-sass in the meantime.