
yaronkoren / miga


A JavaScript application (with some PHP) for viewing and browsing arbitrary structured data

License: GNU General Public License v3.0

JavaScript 70.22% PHP 26.63% CSS 3.15%

miga's People

Contributors

jqnatividad, yaronkoren


miga's Issues

SemanticMediaWikiImporter.php can use bigger chunks

Currently this importer uses limit=100. A limit of 500 works very well and cuts the number of queries to one-fifth, making imports much faster for large data sets.

-$askURL .= "&p%5Bformat%5D=csv&p%5Bheaders%5D=hide&p%5Blimit%5D=100";
+$askURL .= "&p%5Bformat%5D=csv&p%5Bheaders%5D=hide&p%5Blimit%5D=500";

Loading interrupted by request for more storage

I'm not sure if this is something that can be coded around, but on the iPhone, if Safari prompts the user that the site they are visiting needs more storage, the loading process freezes and requires a refresh of the page.

For Wikinosh, if you delete the local storage data and reload, the prompt triggers at 5 MB, freezing the loading. A refresh starts it again, at which point the prompt triggers at 10 MB, stopping the load again. A final refresh then loads the complete data set.

Number of search results per page should be dynamic

I am not sure if this is possible now, but there should be a way to make the number of search results per page dynamic. If there are 651 results, we shouldn't show 500 on the first page and only 151 on the second; we should spread them out a little.

651 is a relatively small result set, and 500 per page is too many for it. I would have preferred a display of at most 100 per page for a result set of this size.
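
A minimal sketch of the idea in plain JavaScript (the function name and the cap are hypothetical, not existing Miga code):

function pageSizeFor(totalResults, maxPerPage) {
  // spread results evenly across pages instead of filling each page to the cap
  if (totalResults <= maxPerPage) {
    return totalResults;
  }
  var numPages = Math.ceil(totalResults / maxPerPage);
  return Math.ceil(totalResults / numPages);
}

pageSizeFor(651, 500); // => 326: two pages of 326 and 325, not 500 and 151
pageSizeFor(651, 100); // => 93: seven even pages of 93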

Search autocompletion

The search interface would be so much better if it had autocompletion, or at least showed a real-time list of matches in some way.
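
A rough sketch of the idea, assuming the item names are already loaded in memory (allNames and showSuggestions are hypothetical stand-ins, not actual Miga functions):

var searchBox = document.getElementById('searchInput'); // element id assumed
searchBox.addEventListener('input', function () {
  var query = this.value.toLowerCase();
  if (query.length < 2) { return; }
  // naive substring match over the in-memory list of item names
  var matches = allNames.filter(function (name) {
    return name.toLowerCase().indexOf(query) !== -1;
  }).slice(0, 10);
  showSuggestions(matches); // render the real-time match list
});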

Get unformatted values in SemanticMediaWikiImporter.php

Currently, when importing numeric data, the CSV file gets numbers with comma separators. See the last column in this line:

"40% Bran Flakes Cereal, Kellogg's",93,5,0.54,0,0,220,22.15,3.976,5.112,3.58,,"1,599"

I tried simply adding "#-" to the Ask query like so:

-       $askURL .= urlencode( '?' . $propertyName . "\n" );
+       $askURL .= urlencode( '?' . $propertyName . "#-\n" );

However, the commas persist.
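
If the Ask output can't be fixed on the wiki side, one fallback is to strip the thousands separators on the client before treating a field as numeric. A sketch, not existing Miga code:

function parseNumericField(value) {
  // "1,599" -> 1599; anything non-numeric falls through as null
  var cleaned = String(value).replace(/,/g, '');
  return cleaned !== '' && !isNaN(cleaned) ? Number(cleaned) : null;
}

parseNumericField('1,599'); // => 1599
parseNumericField('3.976'); // => 3.976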

Revisit IndexedDB?

It now appears that IndexedDB is supported across all major browsers (http://caniuse.com/#search=indexeddb).

For the most part, one installation we had was very happy with MigaDV. Except, as you well know, most enterprise environments are still on Windows with Internet Explorer, which has no WebSQL support. We created workarounds (detecting the browser and redirecting to a static page; creating a user-friendly way to install Chrome with an extended installer; etc.), but it was the main thing they had an issue with.

I realize it will be non-trivial, though, as WebSQL is relational and IndexedDB is a key-value store, so all the dynamic SQL generation for faceted searching will have to be re-implemented.
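
For a sense of the shape of that rewrite, here is what one faceted lookup might become (the database, store, and index names are invented for this sketch):

var open = indexedDB.open('MigaDB', 1);
open.onupgradeneeded = function (e) {
  var db = e.target.result;
  // one object store for the data set, with an index per facet
  var store = db.createObjectStore('entities', { keyPath: 'id' });
  store.createIndex('byCategory', 'category', { unique: false });
};
open.onsuccess = function (e) {
  var db = e.target.result;
  var index = db.transaction('entities')
    .objectStore('entities').index('byCategory');
  // the IndexedDB equivalent of: SELECT * FROM entities WHERE category = ?
  index.getAll('Dessert').onsuccess = function (ev) {
    console.log(ev.target.result.length + ' matches');
  };
};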

Consider bypassing DataFileReader.php to allow much faster loading of large files

So when I was working on #1 and #4, I started to get a CSV file that is fairly large: 22,000 rows at 1.2 MB of data. This actually wouldn't be much of a problem at all if nginx gzipped it in transit.

% curl -I http://wikinosh.com/miga/apps/wikinosh/Food.csv
HTTP/1.1 200 OK
Server: nginx/1.4.1
Date: Fri, 26 Jul 2013 03:51:44 GMT
Content-Type: application/octet-stream
Content-Length: 1219451
Last-Modified: Fri, 26 Jul 2013 03:46:21 GMT
Connection: keep-alive
ETag: "51f1f10d-129b7b"
Accept-Ranges: bytes

That's big, and probably a problem. However, if I pull it directly with gzip via nginx, it is only 99 KB! Note the Content-Length header:

% curl -I -H 'Accept-Encoding: gzip,deflate' http://wikinosh.com/miga/apps/wikinosh/Food.csv
HTTP/1.1 200 OK
Server: nginx/1.4.1
Date: Fri, 26 Jul 2013 03:54:01 GMT
Content-Type: application/octet-stream
Content-Length: 99558
Last-Modified: Fri, 26 Jul 2013 03:53:03 GMT
Connection: keep-alive
Vary: Accept-Encoding
ETag: "51f1f29f-184e6"
Content-Encoding: gzip

If you configure your web server to serve the gzipped file like this, it is super slick.

However, having DataFileReader.php and PHP in the middle kills this approach.

Suggestion: allow the front end to pull the CSV directly. jQuery will then use compression when it's available, and very large data sets will no longer present a challenge in transmission.
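
Sketched out, the front end would request the static file itself and hand the text to the existing parsing code (parseCSV is a stand-in for whatever Miga's real entry point is):

jQuery.ajax({
  url: 'apps/wikinosh/Food.csv', // served directly by nginx, gzipped in transit
  dataType: 'text'
}).done(function (csvText) {
  parseCSV(csvText); // stand-in for Miga's existing CSV-parsing code
});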

Duplication of data in local storage?

I'm not sure how to debug this, but after reloading the data set for Wikinosh many times, the local storage in use was over 34 MB. When I delete the data and start fresh, it is only around 18 MB. It seems that the more times I completely reload the data, the larger it gets. Perhaps some records are not being purged on a reload?
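
If stale rows are the cause, one defensive fix is to drop and recreate each table at the start of a reload instead of relying on row-by-row cleanup. A sketch only; the database and table names are invented:

var db = openDatabase('MigaDB', '1.0', 'Miga data', 34 * 1024 * 1024);
db.transaction(function (tx) {
  // start from a clean slate so old rows can't accumulate across reloads
  tx.executeSql('DROP TABLE IF EXISTS entities');
  tx.executeSql('CREATE TABLE entities (id, name, category)');
});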

Pure JavaScript

This is awesome, and just what I was looking for, but the need for a PHP server broke my legs =(, so I can't use it. Would it be possible to have a pure JavaScript/HTML version, with the data stored anywhere: Dropbox, GitHub Pages, locally, etc.?

SemanticMediaWikiImporter.php not ending when all data is exported

Now that issue #1 is closed, I reran the importer to generate data. It should have stopped at 21,516 rows (http://wikinosh.com/wiki/Category:Food), but the file had reached 22,900 rows by the time I Ctrl-C'd the task.

http://wikinosh.com/miga/apps/wikinosh/Food.csv

Looking at the contents of that file, it seems not to be entirely sequential. In fact, doing a

grep "Agar Seaweed" Food.csv

on that CSV shows the same data over and over. Ugh. I'm not sure what is causing this; it's likely some SMW issue. The net result is that the importer never stops and runs forever.

If you want to test it yourself, these are my import settings:

<?php
$gImportFileName = "Food.csv";
$gImportSpecialAskURL = "http://wikinosh.com/wiki/Special:Ask";
$gImportCategoryName = "Category:Food";
$gImportFields = array(
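        // each entry maps a CSV column header to an SMW property name;
        // '_name' presumably designates the page name rather than a property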
        'Name' => '_name',
        'Calories' => 'Has calories',
        'Fat' => 'Has fat',
        'Carbohydrates' => 'Has carbohydrates',
        'Protein' => 'Has protein'
);

Make MigaDV more SEO-friendly

MigaDV aims to make data publishing easier. And since people find content through search engines, perhaps MigaDV should also make finding the published data easier.

However, JavaScript sites are not normally indexable by search engines. Existing workarounds include creating sitemaps and HTML snapshots.

Maybe, during "compilation", a simpler, SEO-friendly static version of the site could be generated. Each simple page could then redirect to the "real" page, and robots.txt could even be told to point crawlers at the static site.
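
As a very rough sketch of that "compilation" step (Node-style JavaScript; the paths and URL scheme are invented, and the naive comma split would need a real CSV parser):

var fs = require('fs');
if (!fs.existsSync('static')) { fs.mkdirSync('static'); }
var lines = fs.readFileSync('apps/wikinosh/Food.csv', 'utf8').split('\n');
lines.slice(1).forEach(function (line, i) { // skip the header row
  if (!line.trim()) { return; }
  var name = line.split(',')[0].replace(/"/g, ''); // naive field extraction
  fs.writeFileSync('static/item-' + i + '.html',
    '<html><head><title>' + name + '</title>' +
    '<link rel="canonical" href="/#item-' + i + '"></head>' +
    '<body><h1>' + name + '</h1></body></html>');
});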

Be more careful when building a new CSV

While running the importer, Miga should take more care with the existing CSV. Ideally:

  1. Leave the existing CSV alone.
  2. Create the new file under a temporary filename.
  3. If the build fails, leave the temp file for debugging and report an error.
  4. If it succeeds, move the original CSV to an archive name and swap in the new CSV.

Currently, while building a large data set, users could get partial results. Also, a failed rebuild destroys your currently working data.
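
The pattern, sketched in Node-style JavaScript for brevity (the importer itself is PHP, so this only illustrates the build-then-swap idea):

var fs = require('fs');

function rebuildCsv(finalPath, writeData) {
  var tmpPath = finalPath + '.tmp';
  try {
    writeData(tmpPath); // build the new file off to the side
  } catch (err) {
    // leave the temp file in place for debugging and report the failure
    console.error('Rebuild failed; temp file kept at ' + tmpPath, err);
    return;
  }
  if (fs.existsSync(finalPath)) {
    fs.renameSync(finalPath, finalPath + '.bak'); // archive the old CSV
  }
  fs.renameSync(tmpPath, finalPath); // swap in the new one
}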
