
Underlying code of https://wikipediaviews.org with sensitive parts redacted

Home Page: https://wikipediaviews.org

License: Other

PHP 94.38% Python 2.69% NASL 2.93%

wikipediaviews

GitHub URL: https://github.com/vipulnaik/wikipediaviews

WARNING: If you just get the code from GitHub and try to run it as is, it will fail. You also need to set up the database and add a file at backend/globalVariables/passwordFile.inc containing the credentials for accessing the database. Model passwordFile.inc on the existing backend/globalVariables/dummyPasswordFile.inc.

Instructions for database setup are at sql/table-creations.sql.
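
As a rough pre-flight check, the sketch below reports which of the files named above are missing from a checkout. The two paths come from this README; the function name and the idea of running such a check are assumptions for illustration and are not part of the repository.

```python
from pathlib import Path

# Paths named in this README, relative to the repository root.
REQUIRED_FILES = [
    "backend/globalVariables/passwordFile.inc",
    "sql/table-creations.sql",
]


def missing_setup_files(repo_root):
    """Return the required files that do not exist under repo_root.

    Hypothetical helper: it only checks that the files are present,
    not that the credentials or the database schema are valid.
    """
    root = Path(repo_root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    for rel in missing_setup_files("."):
        print("missing:", rel)
```

An empty result only means the files exist; the database itself still has to be created from sql/table-creations.sql.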

License

This code is released to the public domain. Any referenced or linked code or libraries used may be subject to their own copyright and licensing restrictions.

File structure

All publicly accessible files are in the home directory.

There are three home directory files that offer starting points:

  • index.php (Home)

  • multiplemonths.php (Multiple months)

  • multipleyears.php (Multiple years)

Each file includes these two files from the style subdirectory:

  • head.inc controls the header and any site-wide messages

  • toggler.inc includes (currently clumsy) JavaScript that allows for show/hide features in the HTML display.

In addition, each file includes a corresponding data entry file from the inputDisplay subdirectory:

  • index.php includes onemonthdataentry.inc

  • multiplemonths.php includes multiplemonthsdataentry.inc

  • multipleyears.php includes multipleyearsdataentry.inc

After we click the submit button on any of the three starting points, we get sent to a display page. There are three display pages:

  • displayviewsforonemonth.php is the target display page from onemonthdataentry.inc. In addition to the display, it includes multiplemonthsdataentry.inc to facilitate continued data entry. By design, we transition automatically from one month to multiple months after the first data screen, so if you submit the form again, you go to displayviewsformultiplemonths.php.

  • displayviewsformultiplemonths.php is the target display page from multiplemonthsdataentry.inc, included in multiplemonths.php. In addition to the display, it includes multiplemonthsdataentry.inc to facilitate continued data entry.

  • displayviewsformultipleyears.php is the target display page for multipleyearsdataentry.inc. In addition to the display, it includes multipleyearsdataentry.inc to facilitate continued data entry.

All these PHP files also include head.inc and toggler.inc from the style subdirectory.
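
The flow between starting points, data entry files, and display pages can be summarized as three lookup tables. The following is a minimal sketch of that description in Python, not code from the repository; the file names are the ones listed above, while the dictionary and function names are made up for illustration.

```python
# Starting page -> data entry file it includes.
ENTRY_FORMS = {
    "index.php": "onemonthdataentry.inc",
    "multiplemonths.php": "multiplemonthsdataentry.inc",
    "multipleyears.php": "multipleyearsdataentry.inc",
}

# Data entry file -> display page its form submits to.
FORM_TARGETS = {
    "onemonthdataentry.inc": "displayviewsforonemonth.php",
    "multiplemonthsdataentry.inc": "displayviewsformultiplemonths.php",
    "multipleyearsdataentry.inc": "displayviewsformultipleyears.php",
}

# Display page -> data entry file it includes for continued data entry.
FOLLOW_UP_FORMS = {
    "displayviewsforonemonth.php": "multiplemonthsdataentry.inc",
    "displayviewsformultiplemonths.php": "multiplemonthsdataentry.inc",
    "displayviewsformultipleyears.php": "multipleyearsdataentry.inc",
}


def pages_visited(entry_page, submissions):
    """List the pages seen when starting at entry_page and submitting
    the form the given number of times."""
    visited = [entry_page]
    form = ENTRY_FORMS[entry_page]
    for _ in range(submissions):
        display = FORM_TARGETS[form]
        visited.append(display)
        form = FOLLOW_UP_FORMS[display]
    return visited


print(pages_visited("index.php", 2))
```

Starting from index.php, the second submission lands on displayviewsformultiplemonths.php, which matches the one-month to multiple-months transition described above.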

A closer look at the data entry files and the inputDisplay subdirectory

The data entry files themselves include other files, because they share a common structure. The included files are also in the inputDisplay folder (not publicly accessible over the web), which gives a two-level hierarchy of files.

  • pageListEntry.inc: This file has the code for the text area for entering the list of pages, plus instructions on top of that text area. It pre-populates the text area with the list of pages from a previous GET or POST request if any, otherwise leaves it blank. On a follow-up page, it is collapsed when the pages were specified through an alternate page specification method.

    Included in: onemonthdataentry.inc, multiplemonthsdataentry.inc, multipleyearsdataentry.inc

    Associated retrieval file: retrieval/pagelistretrieval.inc

  • alternatePageSpecificationMethods.inc: This file has the HTML for ways of providing lists of pages without explicitly typing them in. There are many methods, each with its own inc file. The decision of whether to show or collapse this is based on the form history and an automatically generated recommendation.

    Included in: onemonthdataentry.inc, multiplemonthsdataentry.inc, multipleyearsdataentry.inc

    Associated retrieval file: retrieval/pagelistretrieval.inc

  • alternateMonthSpecificationMethods.inc: This file has the HTML for ways of providing lists of months without having to check boxes.

    Included in: multiplemonthsdataentry.inc

    Associated retrieval file: retrieval/monthlistretrieval.inc

  • advancedOptions.inc: This file has the HTML for advanced options, mostly relating to how much querying to do and how to display the results.

    Included in: onemonthdataentry.inc, multiplemonthsdataentry.inc, multipleyearsdataentry.inc

    Associated retrieval file: retrieval/advancedoptionretrieval.inc
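
The list above amounts to a table from each shared component to the data entry files that include it and the retrieval file that reads its input. Here is a small sketch of that table in Python, assuming nothing beyond the names given above; the structure and the helper name are illustrative only.

```python
ALL_FORMS = (
    "onemonthdataentry.inc",
    "multiplemonthsdataentry.inc",
    "multipleyearsdataentry.inc",
)

# Shared component -> where it is included and which retrieval file reads it.
COMPONENTS = {
    "pageListEntry.inc": {
        "included_in": ALL_FORMS,
        "retrieval": "retrieval/pagelistretrieval.inc",
    },
    "alternatePageSpecificationMethods.inc": {
        "included_in": ALL_FORMS,
        "retrieval": "retrieval/pagelistretrieval.inc",
    },
    "alternateMonthSpecificationMethods.inc": {
        "included_in": ("multiplemonthsdataentry.inc",),
        "retrieval": "retrieval/monthlistretrieval.inc",
    },
    "advancedOptions.inc": {
        "included_in": ALL_FORMS,
        "retrieval": "retrieval/advancedoptionretrieval.inc",
    },
}


def components_for(form):
    """Return the shared components included by a data entry file."""
    return [name for name, info in COMPONENTS.items()
            if form in info["included_in"]]


print(components_for("multiplemonthsdataentry.inc"))
```

Only multiplemonthsdataentry.inc pulls in all four components; the other two data entry files skip the month specification methods.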

wikipediaviews's People

Contributors

riceissa, vipulnaik

Forkers

riceissa

wikipediaviews's Issues

Switch function, variable names to camel case

This should require a 2-3 day concentrated stretch to make sure there is no break in dependencies/compatibility, but it will benefit the codebase by making it more human-readable.

Fix quote escaping issues

Single quotes aren't properly escaped when making SQL insertions, causing some inconsistent behavior. Don't HTML-encode, just escape.
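
To illustrate the kind of fix meant here, the sketch below stores a title containing a single quote without HTML-encoding it. It uses Python's built-in sqlite3 module with bound parameters as a stand-in; the site's actual code is PHP, and the table and column names are hypothetical.

```python
import sqlite3


def insert_page(conn, title):
    """Insert a page title using a bound parameter, so quotes in the
    title are handled by the database driver rather than by hand."""
    conn.execute("INSERT INTO pages (title) VALUES (?)", (title,))


# In-memory database with a hypothetical one-column table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (title TEXT)")

insert_page(conn, "Schrödinger's cat")
stored = conn.execute("SELECT title FROM pages").fetchone()[0]
print(stored)
```

The stored value keeps the literal single quote, with no HTML entity in its place.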

"There are no pages"

Using the "Alternative page specification", only the "tag" method works. The "category", "user" and "linking page" methods return "There are no pages for the . . .-language combination."

ETA: I only tested this on http://wikipediaviews.org/

Add normalization option for non-HTML output

There is currently a normalization option, "Daily average (for days in the month when stats are available)", for HTML output. This option should be available for other output formats as well.
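
Going by the option's label, the normalization presumably divides a month's total views by the number of days in that month with available stats. The sketch below assumes that reading; the exact formula the site uses is not stated here, and the function name is made up.

```python
def daily_average(total_views, days_with_stats):
    """Average views per day over the days that have stats.

    Assumption: normalization = total views / days with stats.
    Returns None when no day in the month has stats.
    """
    if days_with_stats <= 0:
        return None
    return total_views / days_with_stats


print(daily_average(3000, 30))
```

A function like this does not depend on the output format, so the same value could be used for HTML and non-HTML output alike.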

Detect redirects (Cumulative Facebook shares)

When submitting "Cumulative Facebook shares", the numbers for redirects are the same as those of the destination pages because Facebook merges redirects with the actual articles. This is confusing unless you can tell that a page is a redirect.

For instance submit the following pages:

Quora
Timeline of Quora

at http://wikipediaviews.org/multiplemonths.php (I would post a link but Wikipedia Views can't do this currently).

Notice that the "Cumulative Facebook shares" are the same because the timeline page redirects to the main page.

I would suggest coloring redirects in rgb(255, 137, 33) (#FF8921), which is the color Wikipedia uses in the "Display links to disambiguation pages in orange" gadget.

Redirects can be detected using the MediaWiki API. Compare https://en.wikipedia.org/w/api.php?action=query&titles=Timeline%20of%20Quora&redirects&format=jsonfm

{
    "batchcomplete": "",
    "query": {
        "redirects": [
            {
                "from": "Timeline of Quora",
                "to": "Quora",
                "tofragment": "Timeline"
            }
        ],
        "pages": {
            "26749224": {
                "pageid": 26749224,
                "ns": 0,
                "title": "Quora"
            }
        }
    }
}

with https://en.wikipedia.org/w/api.php?action=query&titles=Quora&redirects&format=jsonfm:

{
    "batchcomplete": "",
    "query": {
        "pages": {
            "26749224": {
                "pageid": 26749224,
                "ns": 0,
                "title": "Quora"
            }
        }
    }
}

See https://www.mediawiki.org/wiki/API:Query#Resolving_redirects for more.
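
Here is a minimal sketch of how the two responses above could be told apart in code. It builds the same query with format=json (jsonfm is the HTML-formatted variant meant for reading in a browser) and reads the "redirects" list from the decoded response. The function names are illustrative, and no network request is made.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def build_redirect_query(titles):
    """Build a MediaWiki API URL that resolves redirects for titles."""
    params = {
        "action": "query",
        "titles": "|".join(titles),  # multiple titles are separated by "|"
        "redirects": "",             # the parameter only needs to be present
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def redirect_targets(response):
    """Map each redirecting title to its target, given a decoded
    JSON response from the query above. Non-redirects are absent."""
    redirects = response.get("query", {}).get("redirects", [])
    return {entry["from"]: entry["to"] for entry in redirects}


# The first response shown above, as a Python dict.
sample = {
    "batchcomplete": "",
    "query": {
        "redirects": [
            {"from": "Timeline of Quora", "to": "Quora", "tofragment": "Timeline"}
        ],
        "pages": {"26749224": {"pageid": 26749224, "ns": 0, "title": "Quora"}},
    },
}

print(redirect_targets(sample))
```

A submitted title that appears as a key in the result is a redirect and could be given the suggested color.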

Switch to more hierarchical file inclusion to make dependencies between files clearer

Currently we have pretty much a giant pool of functions split across many files, and it's often not clear what file a given function being called belongs to. This is okay for the current codebase size but is not good software engineering practice. Figure out how to fix this within PHP, otherwise just add comments identifying function sources.
