learningequality / ricecooker

Python library for creating Kolibri channels and uploading to Studio

Home Page: https://ricecooker.readthedocs.io/

License: MIT License

Python 98.97% Makefile 0.38% Batchfile 0.26% Shell 0.23% HTML 0.15% JavaScript 0.01%

ricecooker's People

Contributors

antonio-cortes-perez, aronasorman, atkristijan, benjaoming, bjester, bkarnow, dependabot[bot], divad12, dragondave, elaeon, indirectlylit, intelliant01, ivanistheone, jamalex, jayoshih, jredrejo, kollivier, lsolesen, miraziz, misrob, nucleogenesis, radinamatic, ralphiee22, richard-dinh, rtibbles, sairina, theditor, tonioc987, vkweb, wenyuzhang1992

ricecooker's Issues

Double-printing logging

  • ricecooker version: 0.6.10
  • Python version: 3.5.3 (/data/sushi-chef-tessa/venv/bin/python)
  • Operating System: GNU/Linux Ubuntu 14.04.5 LTS

Description

When running a chef on vader, everything seems to double-print, i.e., the logger lines appear twice.

Use shorter playlists for youtube tests

The playlists used in the tests
https://www.youtube.com/playlist?list=PLBO8M-O_dTPE51ymDUgilf8DclGAEg9_A
and https://www.youtube.com/playlist?list=PL7m903CwFUgntbjkVMwts89fZq0INCtVS
are quite large so downloading testing videos takes a long time (e.g. the video oP4I9LKANew is 256.88MB).

We should replace them with test playlists created by LE that contain fewer, smaller videos (since the main purpose is to check the license).

The same goes for the individual test videos with subs (the test just broke today because new subs were added, so I'll have to replace those test youtube videos with ones uploaded by us that we know won't change).

Mandatory 'nomonitor' breaks souschef

  • ricecooker version: 0.6.16 (was fine in .13, I think)
  • Python version: 3.5.2
  • Operating System: Ubuntu 16.04

Description

I tried to run an existing souschef (https://github.com/learningequality/sushi-chef-artsedge) and after upgrading it gave a KeyError for 'nomonitor' on

config.SUSHI_BAR_CLIENT = SushiBarClient(channel, username, token, nomonitor=kwargs['nomonitor'])

This causes calling .run on a SushiChef instance to crash if no nomonitor argument is provided.

What I Did

I can work around this by adding nomonitor=False to the invocation -- this isn't ideal since every souschef file will need to be updated.

I recommend we change kwargs['nomonitor'] to kwargs.get('nomonitor') which will return None (assuming that's acceptable behavior) or explicitly set a default value with kwargs.get('nomonitor', True) or similarly for False.
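A minimal sketch of the suggested change, outside of ricecooker itself. The helper name read_nomonitor is hypothetical, and False is used as the default only for illustration; the right default is the open question in this issue.

```python
def read_nomonitor(**kwargs):
    # kwargs["nomonitor"] raises KeyError when the caller omits the option;
    # .get() with an explicit default keeps older souschef scripts working.
    # The default of False is an assumption made for this sketch.
    return kwargs.get("nomonitor", False)


print(read_nomonitor())                  # option omitted by an older script
print(read_nomonitor(nomonitor=True))    # option passed explicitly
```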

@ivanistheone -- pinging since you'll probably have the best idea what the default should be.

File list passed into a ContentNode must consist of more than just paths

So far, we've gotten away with passing in a list of simple string paths as files when instantiating ContentNode subclasses. E.g. Document(files=["http://mysite.com/somedoc.pdf"]). It uses the class' default_preset to determine the preset for the file(s) in the list.

This won't be enough for many contexts. For example, once we start adding in subtitles, we'll need to indicate, for each of the subtitle files, what language it's for. We may also want to pass in options regarding caching, preprocessing, etc for each file.

I would propose having a File class, which takes a path (URL or file path), as well as an optional preset (or let it autodetect that, if desired). If appropriate to the preset, settings like language could also be passed in.

There could then also be subclasses of File that provide additional options, e.g. VideoFile that encapsulates the logic for doing resizing and other preprocessing.
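The proposal could look roughly like the sketch below. All class names, the normalize_files helper, and the "video_subtitle" preset string are placeholders for illustration, not ricecooker's actual API.

```python
class File:
    """Hypothetical wrapper around one file attached to a ContentNode."""

    def __init__(self, path, preset=None, language=None):
        self.path = path          # URL or local file path
        self.preset = preset      # None means: fall back to the node's default_preset
        self.language = language  # only meaningful for some presets, e.g. subtitles


class SubtitleFile(File):
    """Hypothetical subclass: a subtitle file always needs a language."""

    def __init__(self, path, language, preset="video_subtitle"):
        if not language:
            raise ValueError("SubtitleFile requires a language code")
        super().__init__(path, preset=preset, language=language)


def normalize_files(files):
    # Keep accepting plain string paths so existing chefs do not break.
    return [f if isinstance(f, File) else File(f) for f in files]
```

With this shape, Document(files=["http://mysite.com/somedoc.pdf"]) could keep working while new code passes File objects with per-file options.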

Provide more meaningful message when user is not authorized to edit channel

  • ricecooker version: 0.5.3
  • Python version: 2.5.2
  • Operating System: Ubuntu 16.04

Description

Ran sushi chef, turned out I wasn't an editor on channel, but then only message shown was:

400 Client Error: Bad Request for url: https://contentworkshop.learningequality.org/api/internal/create_channel
Traceback (most recent call last):
  File "./chef.py", line 124, in <module>
    PhETSushiChef().main()
  File "/home/jamalex/fle/ricecooker/ricecooker/chefs.py", line 270, in main
    self.run(args, options)
  File "/home/jamalex/fle/ricecooker/ricecooker/chefs.py", line 260, in run
    uploadchannel_wrapper(self, args, options)
  File "/home/jamalex/fle/ricecooker/ricecooker/commands.py", line 33, in uploadchannel_wrapper
    uploadchannel(chef, **args_and_options)
  File "/home/jamalex/fle/ricecooker/ricecooker/commands.py", line 149, in uploadchannel
    config.PROGRESS_MANAGER.set_channel_created(*create_tree(tree))
  File "/home/jamalex/fle/ricecooker/ricecooker/commands.py", line 277, in create_tree
    channel_id, channel_link = tree.upload_tree()
  File "/home/jamalex/fle/ricecooker/ricecooker/managers/tree.py", line 152, in upload_tree
    root, channel_id = self.add_channel()
  File "/home/jamalex/fle/ricecooker/ricecooker/managers/tree.py", line 204, in add_channel
    response.raise_for_status()
  File "/home/jamalex/.virtualenvs/sushi-chef-phet/lib/python3.5/site-packages/requests/models.py", line 937, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://contentworkshop.learningequality.org/api/internal/create_channel

Only by checking the GCloud error dashboard could I see the cause:
image

Feature idea: predictably-named zip files in `chefdata` dir

Description

Currently chefs that generate HTML5Zip files create new zip files even if a zip file with the same content already exists.

PROBLEM: chefs scheduled as cron jobs will generate a lot of zip files and re-upload gigabytes to Studio. That is OK for one-off manual chef runs, but not for continuously running ones.

It would be nice to not regenerate zip archives for content that hasn't changed. Last-modified time check and record?

We could also use hash of content, but might not be reliable since small differences could cause hashes to be different.
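As a rough illustration of the content-hash option, the sketch below derives a stable name from the relative paths and bytes of every file in a directory. The function name directory_content_hash is hypothetical. As noted above, any byte-level difference in the source files changes the result.

```python
import hashlib
import os


def directory_content_hash(dirpath):
    """Return an md5 hex digest over the relative paths and bytes of all files."""
    md5 = hashlib.md5()
    for root, dirs, files in os.walk(dirpath):
        dirs.sort()  # sort in place so os.walk visits subdirectories in a fixed order
        for name in sorted(files):
            full = os.path.join(root, name)
            rel = os.path.relpath(full, dirpath).replace(os.sep, "/")
            md5.update(rel.encode("utf-8"))
            with open(full, "rb") as f:
                for chunk in iter(lambda: f.read(65536), b""):
                    md5.update(chunk)
    return md5.hexdigest()
```

A chef could then name the archive after the digest and skip zipping when a file with that name already exists in chefdata.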

Return logic/docstring for `validate` methods on ContentNodes

The validate methods include the docstring Returns: boolean indicating if video is valid, but they don't seem to ever return False, they just throw exceptions if there's a problem. Seems like they could just return nothing on success, and throw exceptions on error -- I believe that's a standard pattern in Python/Django for validation methods.
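The pattern described here, sketched with stand-in classes (InvalidNodeError and NodeSketch are hypothetical names, not ricecooker classes): validate returns nothing on success and raises on failure, so the docstring no longer promises a boolean.

```python
class InvalidNodeError(Exception):
    """Raised when a node fails validation (stand-in exception for this sketch)."""


class NodeSketch:
    def __init__(self, title, files):
        self.title = title
        self.files = files

    def validate(self):
        """Raise InvalidNodeError if the node is invalid; return None otherwise."""
        if not self.title:
            raise InvalidNodeError("node must have a title")
        if not self.files:
            raise InvalidNodeError("node must have at least one file")
```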

502 Bad Gateway for url: https://contentworkshop.learningequality.org/api/internal/file_diff

  • ricecooker version: 0.5.4
  • Python version: 3.5.2
  • Operating System: Mac OS X El Capitan

Description

I am trying to update the magogenie channel on the content curation server. While uploading magogenie content using ricecooker, the API https://contentworkshop.learningequality.org/api/internal/file_diff
fails with 502 Bad Gateway.
Ricecooker successfully downloads all the content, but the file_diff check returns 502 Bad Gateway.
A screenshot of the stack trace is attached below.
screen shot 2017-04-14 at 3 56 07 pm

create_predictable_zip does not close the temp file

  • ricecooker version: 0.6.22
  • Python version: 3.6
  • Operating System: Mac OSX

Description

When using create_predictable_zip to create zip files for HTML5Apps, the temp file created in the function create_predictable_zip is not closed after use.
So after creating many zip files, the program fails with the error below

What I Did

[Errno 24] Too many open files: '/var/folders/w6/97k87rrd7z30njcc74049z800000gn/T/tmpn5zdtpbz.zip'
Traceback (most recent call last):
  File "sushichef.py", line 223, in <module>
    chef.main()
  File "/Users/lingyiwang/.venvs/sushichef-pd/lib/python3.6/site-packages/ricecooker/chefs.py", line 296, in main
    self.run(args, options)
  File "/Users/lingyiwang/.venvs/sushichef-pd/lib/python3.6/site-packages/ricecooker/chefs.py", line 287, in run
    uploadchannel_wrapper(self, args, options)
  File "/Users/lingyiwang/.venvs/sushichef-pd/lib/python3.6/site-packages/ricecooker/commands.py", line 33, in uploadchannel_wrapper
    uploadchannel(chef, **args_and_options)
  File "/Users/lingyiwang/.venvs/sushichef-pd/lib/python3.6/site-packages/ricecooker/commands.py", line 130, in uploadchannel
    channel = chef.construct_channel(**kwargs)
  File "sushichef.py", line 105, in construct_channel
    self.download_subject(topic[0], topic[1])
  File "sushichef.py", line 168, in download_subject
    self.download_content(age_topic, link, page_params, selected_category, i*20)
  File "sushichef.py", line 203, in download_content
    zip_path = create_predictable_zip(filepath)
  File "/Users/lingyiwang/.venvs/sushichef-pd/lib/python3.6/site-packages/ricecooker/utils/zip.py", line 33, in create_predictable_zip
    with zipfile.ZipFile(zippath, "w") as outputzip:
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/zipfile.py", line 1090, in __init__
    self.fp = io.open(file, filemode)
OSError: [Errno 24] Too many open files: '/var/folders/w6/97k87rrd7z30njcc74049z800000gn/T/tmpn5zdtpbz.zip'
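One common cause of this error is that tempfile.mkstemp returns an already-open OS-level file descriptor along with the path, and that descriptor stays open unless it is closed explicitly. Whether create_predictable_zip uses mkstemp this way is an assumption; the sketch below (with the hypothetical helper make_temp_zip) only shows the pattern that avoids the leak.

```python
import os
import tempfile
import zipfile


def make_temp_zip(members):
    """Write a dict of {archive_name: data} to a temp zip and return its path."""
    fd, zippath = tempfile.mkstemp(suffix=".zip")
    os.close(fd)  # release the descriptor mkstemp opened; ZipFile reopens by path
    with zipfile.ZipFile(zippath, "w") as outputzip:
        for name, data in members.items():
            outputzip.writestr(name, data)
    return zippath
```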

Chefs using WebVideoFile re-download files every time chef runs

  • ricecooker version: 0.6.22
  • Python version: 3.6
  • Operating System: mac os

Description

Chefs using WebVideoFile re-download files every time chef runs.

During a chef run we see..

	Downloading chefdata/transcripts/ar/Premiers secours- nourrisson : Réanimation cardio pulmonaire.pdf
	--- Downloaded 3980a3df44d95dcc07bcf8d5d0adf6b4.pdf
	--- Downloaded e4b7a8b440a5bbee4d26ba5072839f64.jpg

we get to the point where we're creating a WebVideoFile from youtube id S8AVPLg7krg
that looks like this


[youtube] S8AVPLg7krg: Downloading webpage
[youtube] S8AVPLg7krg: Downloading video info webpage
[youtube] S8AVPLg7krg: Extracting video information
[youtube] S8AVPLg7krg: Downloading MPD manifest
[youtube] S8AVPLg7krg: Downloading MPD manifest
[download] Destination: /tmp/c948b043899ffc70bec380dfdac46c9f.f135.mp4
[download] Destination: /tmp/c948b043899ffc70bec380dfdac46c9f.mp4.f140
[ffmpeg] Merging formats into "/tmp/c948b043899ffc70bec380dfdac46c9f.mp4"
Deleting original file /tmp/c948b043899ffc70bec380dfdac46c9f.f135.mp4 (pass -k to keep)
Deleting original file /tmp/c948b043899ffc70bec380dfdac46c9f.mp4.f140 (pass -k to keep)

so we're re-downloading the video from youtube every time the chef runs.

e.g. https://github.com/learningequality/sushi-chef-sikana/blob/master/sushichef.py#L205-L209

Required solution

Use some sort of cache logic for downloads of WebVideoFiles -- keyed on url and download settings.
This way chef scripts like Sikana won't re-download the content every time they run.
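A cache key "keyed on url and download settings" could be computed as in the sketch below. The function name download_cache_key and the shape of the settings dict are assumptions; this says nothing about how .ricecookerfilecache works today.

```python
import hashlib
import json


def download_cache_key(url, settings=None):
    """Return a stable key for a (url, download settings) pair."""
    # sort_keys makes the key independent of the order the settings were given in.
    payload = json.dumps({"url": url, "settings": settings or {}}, sort_keys=True)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()
```

Before downloading, a chef would look the key up in its cache and only fetch the video when no entry exists.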

How does the .ricecookerfilecache work?

Cannot use certain URLs for thumbnails

  • ricecooker version: 0.6.36
  • Python version: 3.5.3
  • Operating System: Ubuntu (vader)

Description

If you import a thumbnail via HTML5AppNode(..., thumbnail='.jpg?itok=8gl2hL7q2'); the thumbnail file is rejected, and the crawler crashes with a ricecooker.exceptions.InvalidNodeException: Invalid node (ThumbnailFile must have one of the following extensions: ['jpg', 'jpeg', 'png'].

This means that importing arbitrary URLs isn't a reasonable approach; it should probably succeed (it is probably a jpg file after all) or fail non-fatally.
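One way to make such URLs pass the extension check is to look only at the path component of the URL, ignoring the query string. This is a sketch using the standard library; the helper name extension_from_url and the example.com URL are made up for illustration.

```python
import os
from urllib.parse import urlparse


def extension_from_url(url):
    """Return the lowercase file extension of a URL path, without the dot."""
    path = urlparse(url).path  # drops "?itok=..." and any fragment
    return os.path.splitext(path)[1].lstrip(".").lower()


print(extension_from_url("http://example.com/images/thumb.jpg?itok=8gl2hL7q2"))
```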

What I Did

  File "/data/py35/lib/python3.5/site-packages/ricecooker/classes/nodes.py", line 195, in validate_tree
    assert child.validate_tree()
  File "/data/py35/lib/python3.5/site-packages/ricecooker/classes/nodes.py", line 193, in validate_tree
    self.validate()
  File "/data/py35/lib/python3.5/site-packages/ricecooker/classes/nodes.py", line 701, in validate
    raise InvalidNodeException("Invalid node ({}): {} - {}".format(ae.args[0], self.title, self.__dict__))

Daemonized chefs don't exit cleanly

  • Python version: 3.6
  • Operating System: Mac OS

Description

daemonized chef runs don't stop cleanly

What I Did

  • Started chef in daemonmode
  • ran it a few times
  • pressed CTRL + C to stop it
^CTraceback (most recent call last):
  File "./sample_program.py", line 507, in <module>
    chef.main()
  File "/Users/ivan/Projects/FLECode/ricecooker/ricecooker/chefs.py", line 291, in main
    self.daemon_mode(args, options)
  File "/Users/ivan/Projects/FLECode/ricecooker/ricecooker/chefs.py", line 268, in daemon_mode
    cws.join()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", line 1056, in join
    self._wait_for_tstate_lock()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", line 1072, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt
^CException ignored in: <module 'threading' from '/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py'>
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", line 1294, in _shutdown
    t.join()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", line 1056, in join
    self._wait_for_tstate_lock()
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", line 1072, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt

Need to check why this traceback occurs and make sure the chef stops cleanly.

Make Python 2 explicitly disallowed

Description

People are spending time running Ricecooker in Python 2, which doesn't work, and then they have strange errors. Recall that both pip and python commands default to 2.7 on most systems, so it's very easy to fall into this trap.

Python 2.7 is not in the tox.ini matrix anyways.

We should throw a RuntimeError("Cannot run this in Python 2") in ricecooker/__init__.py
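A sketch of what that guard might look like. The function wrapper and its version_info parameter are added here only so the check can be exercised; in ricecooker/__init__.py the test would presumably run at import time.

```python
import sys


def require_python3(version_info=sys.version_info):
    # version_info[0] is the major version (2 or 3).
    if version_info[0] < 3:
        raise RuntimeError("Cannot run this in Python 2")


require_python3()  # no-op on Python 3
```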

Previous discussion on it here:

#5 (comment)

Is there any compelling reason that Ricecooker should support Python 2.7 ?

No way to resize exercise images

Studio currently handles the ![alt-text](filepath =widthxheight) syntax, so it would be nice to be able to set it in ricecooker as well.

Truncated exception message: "{} must have one of the following"

  • ricecooker version: 0.6.30
  • Python version: 3.6.5
  • Operating System: Linux

I was trying to make a chef and got an exception, but the output was truncated:

  File "/home/dragon/chef/py3k/lib/python3.6/site-packages/ricecooker/classes/files.py", line 357, in validate
    assert ext in self.allowed_formats, "{} must have one of the following"
AssertionError: {} must have one of the following

https://github.com/learningequality/ricecooker/blob/master/ricecooker/classes/files.py#L357 should instead use a multiline string rather than splitting the string into separate chunks.
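The traceback suggests the rest of the message and the .format call sit on a following line that is not part of the assert statement; that reading is an assumption. Wrapping the whole message in parentheses keeps it in one expression. The function check_extension and its label parameter are hypothetical.

```python
def check_extension(ext, allowed_formats, label="ThumbnailFile"):
    # Parentheses make the two string pieces and .format() one expression,
    # so the full, formatted message is attached to the AssertionError.
    assert ext in allowed_formats, (
        "{} must have one of the following "
        "extensions: {}".format(label, allowed_formats)
    )
```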

Bootstrapping python packages

So I see this project is still pretty bare-bones :)

It's worth getting things on the right track from Day 1 because experience proves that things get harder and less prioritized as projects grow.

I have a couple of questions to start out with:

  • Should it be py2+3?
  • Should dependencies be bundled in as with Kolibri?
  • What kind of dependency policy do we otherwise impose (like pure-python etc)

Otherwise, what I'm looking for is:

  • Packaging (setup.py)
  • CI
  • Code coverage
  • Docs w/ sphinx-apidocs

The tool we can use is called cookiecutter (the link is for the specific cookie cutter template). It's a good idea to check out the forks of cookiecutter-pypackage because people sometimes come up with cool variations that don't get merged upstream because the maintainer doesn't agree.

Thread doesn't exit on vader

  • ricecooker version: 0.6.10
  • Python version: 3.5.3 (/data/sushi-chef-tessa/venv/bin/python)
  • Operating System: GNU/Linux Ubuntu 14.04.5 LTS

Description

When running a chef on vader, it doesn't exit at the end; it just hangs around. The issue seems to be related to the use of a separate thread to report progress to sushibar.

Here is what I see when the chef is running with remote reporting:

run_with_sushibar_reporting_double_thread

If I disable sushibar reporting by specifying a non-functional SUSHIBAR_URL export SUSHIBAR_URL="http://localhost:8011" then ricecooker will not try to do the remote reporting and so we see just one thread:

run_with_sushibar_reporting_thread_disabled

and it exits cleanly.

Where is the code?

Here is where the websocket reporting thread is started:

Here is the place where sushibar client is supposed to join the thread before exiting:

self.log_ws.join()
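For reference, a background thread that can always be joined usually needs an explicit stop signal; join() alone blocks forever if the thread's loop never ends. The sketch below is a generic pattern with a hypothetical ReportingThread class, not the sushibar client's actual code.

```python
import threading


class ReportingThread(threading.Thread):
    """Generic sketch of a worker thread with a cooperative stop signal."""

    def __init__(self):
        super().__init__(daemon=True)  # daemon: never blocks interpreter exit
        self._stop_requested = threading.Event()
        self.ticks = 0

    def run(self):
        # wait() returns True once stop() has set the event, ending the loop.
        while not self._stop_requested.wait(timeout=0.01):
            self.ticks += 1  # placeholder for "send a progress report"

    def stop(self, timeout=1.0):
        self._stop_requested.set()
        self.join(timeout)
```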

Images missing in Perseus exercise

  • ricecooker version:
  • Python version:
  • Operating System:

Description

Describe what you were trying to get done.
Tell us what happened, what went wrong, and what you expected to happen.

What I Did

Paste the command(s) you ran and the output.
If there was a crash, please include the traceback here.

Exercise node validation logic doesn't seem to be correct

or file_formats.PERSEUS means this will always be true.

            files_valid = len(self.files) == 0
            for f in self.files:
                files_valid = files_valid or file_formats.PERSEUS
            assert files_valid , "Assumption Failed: Exercise does not have a .perseus file attached"

Also, the logic here (checking all questions are valid) doesn't seem to match the assert text (in fact, if there are no questions, then it specifically won't display that message):

            # Check if questions are correct
            questions_valid = True
            for q in self.questions:
                questions_valid = questions_valid and q.validate()
            assert questions_valid, "Assumption Failed: Exercise does not have a question"
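A sketch of what the two checks may have been meant to do. The intended rules are assumptions read off the assert messages: files are acceptable if there are none or at least one is a .perseus file, and an exercise needs at least one question, all of which must be valid. PERSEUS stands in for file_formats.PERSEUS, and both function names are hypothetical.

```python
PERSEUS = "perseus"  # stand-in for file_formats.PERSEUS


def files_are_valid(extensions):
    # The quoted loop computes `files_valid or file_formats.PERSEUS`, which is
    # truthy for any non-empty file list because a non-empty string is truthy.
    # Assumed intent: no files at all, or at least one .perseus file.
    return len(extensions) == 0 or any(ext == PERSEUS for ext in extensions)


def check_questions(question_results):
    # question_results: one boolean per question, e.g. from q.validate().
    assert len(question_results) > 0, "Exercise does not have a question"
    assert all(question_results), "Exercise has an invalid question"
```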

Need to address selenium deprecation

Description

Getting a lot of errors using selenium + phantomjs where I can't even load the page. We'll want to update the utils/downloader.py code to use something else (possibly pyppeteer?)

Ricecooker allows upload of very large files to Studio

Description

Currently ricecooker allows chef authors to upload large files (see this issue for example)

Problem

We don't want that.

Proposed solution

Ricecooker should produce a warning when uploading files greater than X MB and fail to validate the channel at Y MB.

Policy decision

X = 100MB
Y = 200MB
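The proposed policy could be sketched as follows, using the X and Y values above. The function name check_file_size and the "ok"/"warn" return values are illustrative; a chef would typically pass in os.path.getsize(path).

```python
import warnings

MB = 1024 * 1024
WARN_BYTES = 100 * MB  # X from the proposed policy
FAIL_BYTES = 200 * MB  # Y from the proposed policy


def check_file_size(size_bytes):
    """Return "ok" or "warn"; raise ValueError above the hard limit."""
    if size_bytes > FAIL_BYTES:
        raise ValueError("file is {:.0f} MB, over the 200 MB limit".format(size_bytes / MB))
    if size_bytes > WARN_BYTES:
        warnings.warn("file is {:.0f} MB, over 100 MB".format(size_bytes / MB))
        return "warn"
    return "ok"
```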

@jamalex @radinamatic Does this make sense?

To give you an idea, some of the MIT blossoms channels were 50MB, but I can see how longer lessons/lectures could become large. Maybe a hard limit at 500MB?

Need to have an attribute to consider Units for the MagoGenie Questions.<Enhancement>

Description
Need to have an attribute to consider Units for the MagoGenie Questions.

In MagoGenie the units are handled so that they appear next to the answers. This is not the case with CC and Kolibri.

Enhancement
Hence please help in handling the units for the MagoGenie questions.

Note:
This is not a defect but an important enhancement when importing MagoGenie content.

Screenshot:
Units in MG:
screenshot_2
units in mg

Units not handled in CC:
cc

provider, tags not assignable in __init__ on content nodes

  • ricecooker version: 0.6.36
  • Python version: 3.5
  • Operating System: Ubuntu [Vader]

Description

Attempted to pass "provider" when creating a new content node with create_node().
Failed: create_node() got an unexpected keyword argument 'provider'.
The documentation for the individual content node classes implies passing provider is valid.
Inspection of the code reveals TreeNodes can take a provider when created via __init__.
https://github.com/learningequality/ricecooker/blob/master/ricecooker/classes/nodes.py#L310

This also applies to 'tags'.

Workaround

set node.provider = "Provider Company" after initialising the node.
Likewise: node.tags = ['algebra', 'calculus']

Recommendation

Modify ContentNode.__init__() to permit provider attribute.
https://github.com/learningequality/ricecooker/blob/master/ricecooker/classes/nodes.py#L447

Check that validation is performed on it, etc.

Ricecooker does not clean up temp files and directories

Description

Forgot to file this as an issue when I ran into it, and just made a mental note to look into it later. Basically, ricecooker (and hence chefs by default) do not clean up temp files and directories, which can cause them to bloat considerably over time.

In particular, create_predictable_zip creates a zip file in the temp folder that is never deleted. What makes this particularly problematic is that these temp files remain until the OS cleans them up, which on Mac appears to be upon reboot. While debugging a PraDigi issue, I ended up filling up my HDD space after a couple runs because each run would create several GB of temp files and leave them on my drive.

The simplest solution is to have ricecooker create a root temp directory on start, and provide access to this directory to chefs (e.g. ricecooker.get_temp_dir()) so that they can store all the temp files they create inside it. Then, even if the chef errors out, we wipe out this temp directory before exiting. (Maybe an @atexit handler?)
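A minimal sketch of that idea with the standard library. The name get_temp_dir follows the suggestion above but does not exist in ricecooker; note that atexit handlers run on normal exit and unhandled exceptions, but not if the process is killed by a signal or calls os._exit.

```python
import atexit
import shutil
import tempfile

_TEMP_ROOT = None


def get_temp_dir():
    """Create the shared temp directory on first use and schedule its removal."""
    global _TEMP_ROOT
    if _TEMP_ROOT is None:
        _TEMP_ROOT = tempfile.mkdtemp(prefix="ricecooker-")
        # Remove the whole tree when the interpreter exits.
        atexit.register(shutil.rmtree, _TEMP_ROOT, ignore_errors=True)
    return _TEMP_ROOT
```

Chefs would then create their zip files and other scratch data under get_temp_dir() instead of the system temp folder.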

What I Did

Ran the PraDigi chef locally on OS X, then inspected my temp files directory.

Allow users to choose which phantomjs they use

Description

I ran into problems while running https://github.com/learningequality/sushi-chef-teachengineering
because the phantomjs version on vader is 1.9.x
but the chef script depends on phantomjs 2.1.x

Workaround

Download phantomjs2.1.1 to chef dir

cd /data/sushi-chef-teachengineering
wget https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-2.1.1-linux-x86_64.tar.bz2
tar xvjf phantomjs-2.1.1-linux-x86_64.tar.bz2

then monkey patched venv/lib/python3.5/site-packages/ricecooker/utils/downloader.py from

    try:
        if loadjs:                                              # Wait until js loads then return contents
            driver = driver or webdriver.PhantomJS()

to

    try:
        if loadjs:                                              # Wait until js loads then return contents
            driver = driver or webdriver.PhantomJS('/data/sushi-chef-teachengineering/phantomjs-2.1.1-linux-x86_64/bin/phantomjs')

Feature spec

The ricecooker utils downloader could use an environment variable like PHANTOM_JS_BINARY to optionally specify a particular binary we want to use.
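The lookup itself could be as small as the sketch below. The helper name phantomjs_binary is hypothetical, and falling back to the bare name "phantomjs" (resolved via PATH) is an assumption about the desired default.

```python
import os


def phantomjs_binary(environ=os.environ):
    """Return the phantomjs path from PHANTOM_JS_BINARY, or the bare command name."""
    return environ.get("PHANTOM_JS_BINARY", "phantomjs")


# The result would be passed where the workaround above hard-codes the path,
# i.e. as the argument to webdriver.PhantomJS(...).
print(phantomjs_binary())
```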

Require a channel ID before sushi chef can run

  • ricecooker version:
  • Python version:
  • Operating System:

Description

The generation of channel IDs from channel information can lead to issues when that information changes, or when two channels have the same source fields but are different. (e.g. cheffer A doesn't know about chef B's source info and uses the same thinking it is not taken.) When this happens, it leads to weird errors like a new chef run creating a whole new channel, or permissions errors when trying to create a channel. For an example, see:

https://sentry.io/organizations/learningequality/issues/943392969/?environment=master&project=1252819&referrer=alert_email&statsPeriod=14d

An alternative that I think will fix this is to, when creating a channel, always generate a fully random UUID, and the chef cannot run until it is given this UUID. The UUID should be committed in the chef's code so that all chef runs will use it. In the case that a chef with a blank ID is run, it will spit out a UUID and ask the user to paste that into their chef code.

This ensures that no two chefs can ever try to update each other accidentally. We can keep source ID and source domain for now, but I would recommend a plan to migrate to always using the UUID as the source of truth for pointing to the channel.
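The blank-ID behaviour described above could be sketched like this. MissingChannelID and resolve_channel_id are hypothetical names, and the exact message is only an example.

```python
import uuid


class MissingChannelID(RuntimeError):
    """Raised when a chef is run without a committed channel UUID (sketch only)."""


def resolve_channel_id(configured_id):
    if not configured_id:
        suggestion = uuid.uuid4().hex  # fully random, not derived from source fields
        raise MissingChannelID(
            "No channel ID set. Paste this into your chef code: " + suggestion
        )
    # uuid.UUID raises ValueError for malformed IDs, which catches copy/paste errors.
    return uuid.UUID(configured_id).hex
```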

Images missing from certain Perseus exercises translations

  • ricecooker version: 0.6.20
  • Python version: all
  • Operating System: all

Description

The method process_image_field of the PerseusQuestion class assumes that all images present in the content will also be catalogued in the sibling images attribute, and fails to create image files when this is not the case.

assumption:

        {
            "content": "md string including imgs like ![](URL-key) and ![](URL-key2)",
            "images": {
                "URL-key":  {"width": 425, "height": 425},
                "URL-key2": {"width": 425, "height": 425}
            }
        }

actual:

        {
            "content": "md string including imgs like ![](URL-key) and ![](URL-key2)",
            "images": {}
        }

(see http://www.khanacademy.org/api/v1/assessment_items/x43bbec76d5f14f88?lang=bg )

What I Did

  • Two weeks ago during Q/A the BG team noted many images were missing from the Bulgarian translation of the KA channel
  • Last week I was trying to debug missing images and found this bug
  • This week I'm going to try to fix it by looking for images in content in addition to the images
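Looking for images in content in addition to images could be done with a regular expression over the markdown, as in this sketch. The function image_urls and the pattern are illustrative and are not the actual process_image_field implementation.

```python
import re

# Matches markdown images like ![alt](URL) and captures the URL part;
# anything after whitespace inside the parentheses (e.g. a size hint) is skipped.
IMAGE_RE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)[^)]*\)")


def image_urls(item):
    """Return image URLs from both the images dict and the markdown content."""
    urls = list(item.get("images", {}))
    for url in IMAGE_RE.findall(item.get("content", "")):
        if url not in urls:
            urls.append(url)
    return urls
```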

Updated content not being recognised by csv import process

  • ricecooker version: v0.6.29
  • Python version: 3.5.2
  • Operating System: Ubuntu 16.04

Description

Trying to update content in a channel that I had imported a few months ago.

The new content, which replaces some existing content, is not being imported, and the old content persists.

What I Did

$ ./linecook.py -vv --reset   --token="../../credentials/studiotoken.txt"   --channeldir='./content/Gujarati'
Logged in with username [email protected]
Ricecooker v0.6.29 is up-to-date.
Running get_channel...
run_id: 65a5968938b84839982ea4a87851e0b3


***** Starting channel build process *****


Calling construct_channel...
   Setting up initial channel structure...
   Validating channel structure...
      સીએસપાઠશાળા - ગુજરાતી (Import) (ChannelNode): 120 descendants
         સરળ સૂચનાઓ [01-DA-01] (TopicNode): 3 descendants
            સરળ સૂચનાઓ - પાઠ આયોજન (DocumentNode): 1 file
            સરળ સૂચનાઓ - પાઠ (HTML5AppNode): 1 file
            સરળ સૂચનાઓ - કાર્યપત્રક (DocumentNode): 1 file
         દૈનિક જીવનમાં ખૂબ સરળ પેટર્ન [01-IPP-01] (TopicNode): 3 descendants
            દૈનિક જીવનમાં ખૂબ સરળ પેટર્ન - પાઠ આયોજન (DocumentNode): 1 file
            દૈનિક જીવનમાં ખૂબ સરળ પેટર્ન - પાઠ (HTML5AppNode): 1 file
            દૈનિક જીવનમાં ખૂબ સરળ પેટર્ન - કાર્યપત્રક (DocumentNode): 1 file
         પદ્ધતિસરની ગણતરી [01-LCR-01] (TopicNode): 3 descendants
            પદ્ધતિસરની ગણતરી - પાઠ આયોજન (DocumentNode): 1 file
            પદ્ધતિસરની ગણતરી - પાઠ (HTML5AppNode): 1 file
            પદ્ધતિસરની ગણતરી - કાર્યપત્રક (DocumentNode): 1 file
         પદ્ધતિસરની ગણતરી [01-LCR-02] (TopicNode): 3 descendants
            પદ્ધતિસરની ગણતરી - પાઠ આયોજન (DocumentNode): 1 file
            પદ્ધતિસરની ગણતરી - પાઠ (HTML5AppNode): 1 file
            પદ્ધતિસરની ગણતરી - કાર્યપત્રક (DocumentNode): 1 file
         સમસ્યા ઉકેલવી: શૂન્ય ચોકડી [01-PS-06] (TopicNode): 3 descendants
            સમસ્યા ઉકેલવી: શૂન્ય ચોકડી - પાઠ આયોજન (DocumentNode): 1 file
            સમસ્યા ઉકેલવી: શૂન્ય ચોકડી - પાઠ (HTML5AppNode): 1 file
            સમસ્યા ઉકેલવી: શૂન્ય ચોકડી - કાર્યપત્રક (DocumentNode): 1 file
         હેપી નકશાઓ 1 [01-PS-11] (TopicNode): 3 descendants
            હેપી નકશાઓ 1 - પાઠ આયોજન (DocumentNode): 1 file
            હેપી નકશાઓ 1 - પાઠ (HTML5AppNode): 1 file
            હેપી નકશાઓ 1 - કાર્યપત્રક (DocumentNode): 1 file
         હેપી નકશાઓ 2 [01-PS-12] (TopicNode): 3 descendants
            હેપી નકશાઓ 2 - પાઠ આયોજન (DocumentNode): 1 file
            હેપી નકશાઓ 2 - પાઠ (HTML5AppNode): 1 file
            હેપી નકશાઓ 2 - કાર્યપત્રક (DocumentNode): 1 file
         લૂપનો પરિચય [01-PS-14] (TopicNode): 3 descendants
            લૂપનો પરિચય - પાઠ આયોજન (DocumentNode): 1 file
            લૂપનો પરિચય - પાઠ (HTML5AppNode): 1 file
            લૂપનો પરિચય - કાર્યપત્રક (DocumentNode): 1 file
         બિંદુઓ ને જોડો [01-PS-24] (TopicNode): 3 descendants
            બિંદુઓ ને જોડો - પાઠ આયોજન (DocumentNode): 1 file
            બિંદુઓ ને જોડો - પાઠ (HTML5AppNode): 1 file
            બિંદુઓ ને જોડો - કાર્યપત્રક (DocumentNode): 1 file
         તર્કશાસ્ત્ર કોયડાઓ [01-PS-33] (TopicNode): 3 descendants
            તર્કશાસ્ત્ર કોયડાઓ - પાઠ આયોજન (DocumentNode): 1 file
            તર્કશાસ્ત્ર કોયડાઓ - પાઠ (HTML5AppNode): 1 file
            તર્કશાસ્ત્ર કોયડાઓ - કાર્યપત્રક (DocumentNode): 1 file
         સૂચનોનો સમૂહ અમલમાં મૂકવો [02-DA-01] (TopicNode): 3 descendants
            સૂચનોનો સમૂહ અમલમાં મૂકવો - પાઠ આયોજન (DocumentNode): 1 file
            સૂચનોનો સમૂહ અમલમાં મૂકવો - પાઠ (HTML5AppNode): 1 file
            સૂચનોનો સમૂહ અમલમાં મૂકવો - કાર્યપત્રક (DocumentNode): 1 file
         પુનરાવર્તિત પ્રક્રિયાઓ [02-IPP-01] (TopicNode): 3 descendants
            પુનરાવર્તિત પ્રક્રિયાઓ - પાઠ આયોજન (DocumentNode): 1 file
            પુનરાવર્તિત પ્રક્રિયાઓ - પાઠ (HTML5AppNode): 1 file
            પુનરાવર્તિત પ્રક્રિયાઓ - કાર્યપત્રક (DocumentNode): 1 file
         પુનરાવર્તિત પ્રક્રિયાઓ [02-IPP-02] (TopicNode): 3 descendants
            પુનરાવર્તિત પ્રક્રિયાઓ - પાઠ આયોજન (DocumentNode): 1 file
            પુનરાવર્તિત પ્રક્રિયાઓ - પાઠ (HTML5AppNode): 1 file
            પુનરાવર્તિત પ્રક્રિયાઓ - કાર્યપત્રક (DocumentNode): 1 file
         પેટર્ન શોધવાનો પરિચય [02-PS-15] (TopicNode): 3 descendants
            પેટર્ન શોધવાનો પરિચય - પાઠ આયોજન (DocumentNode): 1 file
            પેટર્ન શોધવાનો પરિચય - પાઠ (HTML5AppNode): 1 file
            પેટર્ન શોધવાનો પરિચય - કાર્યપત્રક (DocumentNode): 1 file
         તર્કશાસ્ત્ર કોયડા [02-PS-33] (TopicNode): 2 descendants
            તર્કશાસ્ત્ર કોયડા - પાઠ આયોજન (DocumentNode): 1 file
            તર્કશાસ્ત્ર કોયડા - કાર્યપત્રક (DocumentNode): 1 file
         નાના જાણીતા કાર્યોમાં મોટી સમસ્યાઓનું વિભાજન [03-DA-01] (TopicNode): 3 descendants
            નાના જાણીતા કાર્યોમાં મોટી સમસ્યાઓનું વિભાજન - પાઠ આયોજન (DocumentNode): 1 file
            નાના જાણીતા કાર્યોમાં મોટી સમસ્યાઓનું વિભાજન - પાઠ (HTML5AppNode): 1 file
            નાના જાણીતા કાર્યોમાં મોટી સમસ્યાઓનું વિભાજન - કાર્યપત્રક (DocumentNode): 1 file
         મોટી સમસ્યાઓનું નાના કાર્યોમાં વિભાજન [03-DA-02] (TopicNode): 3 descendants
            મોટી સમસ્યાઓનું નાના કાર્યોમાં વિભાજન - પાઠ આયોજન (DocumentNode): 1 file
            મોટી સમસ્યાઓનું નાના કાર્યોમાં વિભાજન - પાઠ (HTML5AppNode): 1 file
            મોટી સમસ્યાઓનું નાના કાર્યોમાં વિભાજન - કાર્યપત્રક (DocumentNode): 1 file
         કોયડા [03-DM-02] (TopicNode): 2 descendants
            કોયડા - પાઠ (DocumentNode): 1 file
            કોયડા - પાઠ આયોજન (HTML5AppNode): 1 file
         ગોઠવણી અને વિશ્લેષણ [03-IP-01] (TopicNode): 3 descendants
            ગોઠવણી અને વિશ્લેષણ - પાઠ (DocumentNode): 1 file
            ગોઠવણી અને વિશ્લેષણ - પાઠ આયોજન (HTML5AppNode): 1 file
            ગોઠવણી અને વિશ્લેષણ - કાર્યપત્રક (DocumentNode): 1 file
         પુનરાવર્તિત પેટર્ન [03-IPP-01] (TopicNode): 3 descendants
            પુનરાવર્તિત પેટર્ન - પાઠ આયોજન (DocumentNode): 1 file
            પુનરાવર્તિત પેટર્ન - પાઠ (HTML5AppNode): 1 file
            પુનરાવર્તિત પેટર્ન - કાર્યપત્રક (DocumentNode): 1 file
         ગણતરી અને સૂચિ [03-LCR-01] (TopicNode): 3 descendants
            ગણતરી અને સૂચિ - પાઠ આયોજન (DocumentNode): 1 file
            ગણતરી અને સૂચિ - પાઠ (HTML5AppNode): 1 file
            ગણતરી અને સૂચિ - કાર્યપત્રક (DocumentNode): 1 file
         સંયોજનોની ગણતરી [03-LCR-02] (TopicNode): 4 descendants
            સંયોજનોની ગણતરી - પાઠ આયોજન (DocumentNode): 1 file
            સંયોજનોની ગણતરી - પાઠ (HTML5AppNode): 1 file
            સંયોજનોની ગણતરી - જવાબોની કૂંજી (DocumentNode): 1 file
            સંયોજનોની ગણતરી - કાર્યપત્રક (DocumentNode): 1 file
         અલ્ગોરિધમ [04-ALG-01] (TopicNode): 3 descendants
            અલ્ગોરિધમ - પાઠ આયોજન (DocumentNode): 1 file
            અલ્ગોરિધમ - પાઠ (HTML5AppNode): 1 file
            અલ્ગોરિધમ - કાર્યપત્રક (DocumentNode): 1 file
         જોડીયા સંબંધો [04-DM-01] (TopicNode): 2 descendants
            જોડીયા સંબંધો - પાઠ આયોજન (DocumentNode): 1 file
            જોડીયા સંબંધો - પાઠ (HTML5AppNode): 1 file
         ગ્રાફ પાથ [04-DM-02] (TopicNode): 3 descendants
            ગ્રાફ પાથ - પાઠ આયોજન (DocumentNode): 1 file
            ગ્રાફ પાથ - પાઠ (HTML5AppNode): 1 file
            ગ્રાફ પાથ - કાર્યપત્રક (DocumentNode): 1 file
         નાના કરવાની (સંકોચનની) (કમ્પ્રેસન) સરળ પ્રવૃત્તિઓ [04-IP-01] (TopicNode): 3 descendants
            નાના કરવાની (સંકોચનની) (કમ્પ્રેસન) સરળ પ્રવૃત્તિઓ - પાઠ આયોજન (DocumentNode): 1 file
            નાના કરવાની (સંકોચનની) (કમ્પ્રેસન) સરળ પ્રવૃત્તિઓ - પાઠ (HTML5AppNode): 1 file
            નાના કરવાની (સંકોચનની) (કમ્પ્રેસન) સરળ પ્રવૃત્તિઓ - કાર્યપત્રક (DocumentNode): 1 file
         પુનરાવર્તન રચના અને પ્રક્રિયાઓ [04-IPP-01] (TopicNode): 4 descendants
            પુનરાવર્તન રચના અને પ્રક્રિયાઓ - પાઠ આયોજન (DocumentNode): 1 file
            પુનરાવર્તન રચના અને પ્રક્રિયાઓ - પાઠ (HTML5AppNode): 1 file
            પુનરાવર્તન રચના અને પ્રક્રિયાઓ - જવાબોની કૂંજી (DocumentNode): 1 file
            પુનરાવર્તન રચના અને પ્રક્રિયાઓ - કાર્યપત્રક (DocumentNode): 1 file
         સુડોકુ [04-LCR-01] (TopicNode): 4 descendants
            સુડોકુ - પાઠ આયોજન (DocumentNode): 1 file
            સુડોકુ - પાઠ (HTML5AppNode): 1 file
            સુડોકુ - જવાબોની કૂંજી (DocumentNode): 1 file
            સુડોકુ - કાર્યપત્રક (DocumentNode): 1 file
         પુનરાવર્તિત પેટર્ન અને પ્રક્રિયાઓ / સપ્રમાણતા [05-IPP-01] (TopicNode): 3 descendants
            પુનરાવર્તિત પેટર્ન અને પ્રક્રિયાઓ / સપ્રમાણતા - પાઠ આયોજન (DocumentNode): 1 file
            પુનરાવર્તિત પેટર્ન અને પ્રક્રિયાઓ / સપ્રમાણતા - પાઠ (HTML5AppNode): 1 file
            પુનરાવર્તિત પેટર્ન અને પ્રક્રિયાઓ / સપ્રમાણતા - કાર્યપત્રક (DocumentNode): 1 file
         પુનરાવર્તિત પદ્ધતિઓ અને પ્રક્રિયાઓ / ગણિતમાં સરળ પેટર્ન [05-IPP-02] (TopicNode): 3 descendants
            પુનરાવર્તિત પદ્ધતિઓ અને પ્રક્રિયાઓ / ગણિતમાં સરળ પેટર્ન - પાઠ આયોજન (DocumentNode): 1 file
            પુનરાવર્તિત પદ્ધતિઓ અને પ્રક્રિયાઓ / ગણિતમાં સરળ પેટર્ન - પાઠ (HTML5AppNode): 1 file
            પુનરાવર્તિત પદ્ધતિઓ અને પ્રક્રિયાઓ / ગણિતમાં સરળ પેટર્ન - કાર્યપત્રક (DocumentNode): 1 file
   Tree is valid

Downloading files...
Processing content...
        --- Downloaded 54e0fdb2db9a05b1274be436d304bf8c.pdf
        --- Downloaded b0ce026ca8435e62da9410efef320abe.zip
        --- Downloaded e916c537597540e9a57b6bb8e9cca833.pdf
        --- Downloaded e92c61960d7341d2315022a995fe8053.jpg
        --- Downloaded f8cfc263272d739053ca93836f1e2def.pdf
        --- Downloaded 3b68b8b385abaff2628cde57ea5b8a16.zip
        --- Downloaded e22f7ef8009da9558bfa8b4bae23b020.pdf
        --- Downloaded e7b0d52ca5e47055feac9aa227841089.jpg
        --- Downloaded d8e09d178a32d5c8ba0e73ebf455b93f.pdf
        --- Downloaded 3a80ca602021d2ee523c1d2e4ec4492c.zip
        --- Downloaded 60b8d3356d4bcd851a378364c55dc647.pdf
        --- Downloaded f8569fe0157724a354de7d6ed5199f21.jpg
        --- Downloaded 6257b3574d1b2c19e39baf4f6de26d97.pdf
        --- Downloaded feadd96d227d7fa4895ecae9bdef60b3.zip
        --- Downloaded 65bcd3b37c27fecafabbde579e688761.pdf
        --- Downloaded f0150165ad7e23730a4033c60140d9a9.jpg
        --- Downloaded ada9944298e6fc5c4b89d50dcfeac23f.pdf
        --- Downloaded 2cd82a24d12d0f52b9ca8d8d1e3847cb.zip
        --- Downloaded 8ece41f772cb54642b0934775b4ca162.pdf
        --- Downloaded c8fed91f6e7bca6f550faa1dc544891a.jpg
        --- Downloaded 2aa6f17f91f80d7e1fa8d07ce8e587b1.pdf
        --- Downloaded f9dcf9b09a320bb186cff9cbc46a1351.zip
        --- Downloaded 64fd278ddb13fee2197d1516ad624280.pdf
        --- Downloaded 508ac8a989cb764cb4655910f4a11c47.jpg
        --- Downloaded 56db994dba38b96c81e8be45a0de0304.pdf
        --- Downloaded db44f38a967516c4b43c7b9bf2f09220.zip
        --- Downloaded b466b28e3e860a7e11e73ddc8e5b1a49.pdf
        --- Downloaded 42bbdfb3d97292cb984aacbcb008836c.jpg
        --- Downloaded 77f62e481d42be61d03574454318cb89.pdf
        --- Downloaded a4490002606d4c4ab152bb5add14c60d.zip
        --- Downloaded 6c0bbda708ae21b1e9f97a7b48ccf458.pdf
        --- Downloaded 733c63354feb86cbbf257c929b76d1a2.jpg
        --- Downloaded 56e1873271216510789e15d91e00e489.pdf
        --- Downloaded 22a6f73cbbb55d6cd0c0c4d3691efe97.zip
        --- Downloaded 8a8fac550e4ac99ed847b02bc4a27e85.pdf
        --- Downloaded 3cf40b36d808247b47a45c23372b7546.jpg
        --- Downloaded cd19cb589154595119a4b2c7e3bad7df.pdf
        --- Downloaded 1b0bec19620f45371c8af47023dab9d5.zip
        --- Downloaded f18b1c2d753a72150dd708e3841d870f.pdf
        --- Downloaded 6ea93e9b2dc06340b808712387f8d559.jpg
        --- Downloaded 3b6b271d776b2546b099bc65e4784296.pdf
        --- Downloaded f852df8bef9063717dd38ae9d69988cd.zip
        --- Downloaded 32c5f89619c0ecc0b43d579b6320ee57.pdf
        Downloading ./content/Gujarati/02-DA-01.jpg
        --- Downloaded 97ddcc3eb2baefdc4e45efc3983dfcae.pdf
        --- Downloaded a8e330fb2cf9f77375cab2abea83d3f7.zip
        --- Downloaded 983150d88910eb94a23210285fecfc01.pdf
        Downloading ./content/Gujarati/02-IPP-01.jpg
        --- Downloaded 3e17d55c2a453b77c1380da168bfe63f.pdf
        --- Downloaded bb9faf0a14b9b0ece395b53ee276e28d.zip
        --- Downloaded 8dda544d491ec095857055bb86dc08ec.pdf
        Downloading ./content/Gujarati/02-IPP-02.jpg
        --- Downloaded 8385d5201c4664085b04c4d8c4e9ac50.pdf
        --- Downloaded cb974a1aee4487ef36b37e1148c0bdd0.zip
        --- Downloaded 0e033bfffccf43618b2fc29fea7bc6be.pdf
        --- Downloaded 8b2783ad49ff2fb04ea8e07277df0db6.jpg
        --- Downloaded a8e912da389b349a84e2b028b35c7927.pdf
        --- Downloaded d31fbf1b2f089b3ffe959f4809d210d5.pdf
        --- Downloaded b622fb96814ba5fd6ab505a118eadc0e.png
        --- Downloaded 00dce690db285b2e8b635a095048700e.pdf
        --- Downloaded ecb5625a62229cfc3258c49e708e1cee.zip
        --- Downloaded 3ea391b667e8c2d0d7a053211c7c757a.pdf
        --- Downloaded 97edf2582a9dfb2a22c7d83ea04eb8e2.jpg
        --- Downloaded 33a73b14cc1ba6d301ff15dd516f2cd5.pdf
        --- Downloaded 2b43f15a5ab10d335a373a4900be815f.zip
        --- Downloaded b3626431ff22dd9167d803f740666a67.pdf
        --- Downloaded 227bccd82521df996a36233386b3e7ab.jpg
        --- Downloaded af721ba6244d7985e931d54a665b780e.pdf
        --- Downloaded 4067fed7c79dcd3ffd115d5296e07f9e.zip
        Downloading ./content/Gujarati/03-DM-02.jpg
        --- Downloaded 3ab1e4c42631201dbc6cb8686e924640.pdf
        --- Downloaded b8ee42a1aea014a4a583783ba0911b38.zip
        --- Downloaded 9e718b0fc3b6d195d5d40ab2b79d00ab.pdf
        Downloading ./content/Gujarati/03-IP-01.jpg
        --- Downloaded 9d3add5236377594d9514df889dd8aa1.pdf
        --- Downloaded 721b69d1acff3f8ffdf7e1e5d966dbc1.zip
        --- Downloaded 5de84d9a6a947981bbcefda06a6bb8d1.pdf
        --- Downloaded 8dcded7d0c3029e46d40aac786510baf.jpg
        --- Downloaded 7cde7c84a4cee6cda4a882ca4f5dabaf.pdf
        --- Downloaded 716c6ff8d88803582d401a5819ac02c5.zip
        --- Downloaded 5e9235b4c6515b6619b1c755a0e4cc80.pdf
        --- Downloaded 22cebb59410d0f2120a8c945f1685140.jpg
        --- Downloaded b5b149c39ea27b9dc0260d506b5787bc.pdf
        --- Downloaded bcd2d957a3ba9d42f52ea02f505918d6.zip
        --- Downloaded d153d254f259a5357ff699d96b6d3662.pdf
        --- Downloaded e5c5beebda7b15cd3c9fe1a400da71ff.pdf
        --- Downloaded ec8ec537b95a038f52fa5498f5583943.jpg
        --- Downloaded f6b1f53e2c5054e3d4ee57d5d3fd67c5.pdf
        --- Downloaded bae6d41086e397a100cd17951ecaacc2.zip
        --- Downloaded 4998d787ed483b87ed95559a2ab9dd54.pdf
        --- Downloaded 262ce5bae9daaf88eda71f6fc009e11c.jpg
        --- Downloaded 720a438dc487f54a0931532b0009e57f.pdf
        --- Downloaded b4a998743f3735e0e219492e68c7f22a.zip
        Downloading ./content/Gujarati/04-DM-01.jpg
        --- Downloaded 01ecd6d4c43f806875f2fdb25570a67d.pdf
        --- Downloaded 8cf0eccc9ba365bb11d6c478bcd3f585.zip
        --- Downloaded db9fb581fb70573c670d2c28445a2d1b.pdf
        --- Downloaded 6f979f3f235cf1238cb0fefab21913f1.jpg
        --- Downloaded e162ade8470031f68d3baea2d227fbe0.pdf
        --- Downloaded 982f397abd2b43197d3599001360656a.zip
        --- Downloaded 02397414f56dcb001a85929448979e0a.pdf
        --- Downloaded e1a2d5a8582de381461cb93aa844dac1.png
        --- Downloaded 2af773383c08cb53a23431d5bd37555d.pdf
        --- Downloaded 7d6e312e34efd84eced4d7b526c9ef0d.zip
        --- Downloaded 16abdbddce3c6bac43cfbe5dd300550a.pdf
        --- Downloaded ce6b18621819b46a43e88ba52c2bcf53.pdf
        --- Downloaded 58ce9bf326cf935fa04704fdf97a2752.jpg
        --- Downloaded a650846bc86ce59cec60d418a37c6823.pdf
        --- Downloaded a0fd818330b4efda84a0fda938eb0c57.zip
        --- Downloaded 277b7f07f9bdc6dd0d926528ea658eeb.pdf
        --- Downloaded 2d526409aac0a67be5de47cc93571107.pdf
        --- Downloaded 1c81a4cc6d4a2dae634a9bad43e278c1.jpg
        --- Downloaded a8d7480e0f1c501c26b2a755a7860393.pdf
        --- Downloaded 054e6ea8c35e9cc95a79272664e6eecc.zip
        --- Downloaded 194fd586deeb69fdb5dc1d70a0a2616d.pdf
        --- Downloaded aeb9394ead00a97d5337b761dc3869c6.jpg
        --- Downloaded 76333d6c79800e01fda7d321f2f6d2ee.pdf
        --- Downloaded 094847f1cf8e562c76a41afcbe3159be.zip
        --- Downloaded f1bdca4037a5bae48357d85c2ac31c79.pdf
        Downloading ./content/Gujarati/05-IPP-02.jpg
        --- Downloaded e4f10f3192217072efa6a52c174a3c9c.png
   7 file(s) have failed to download
        Topic sourceid:./content/Gujarati/02-DA-01: ./content/Gujarati/02-DA-01.jpg
           [Errno 2] No such file or directory: './content/Gujarati/02-DA-01.jpg'
        Topic sourceid:./content/Gujarati/02-IPP-01: ./content/Gujarati/02-IPP-01.jpg
           [Errno 2] No such file or directory: './content/Gujarati/02-IPP-01.jpg'
        Topic sourceid:./content/Gujarati/02-IPP-02: ./content/Gujarati/02-IPP-02.jpg
           [Errno 2] No such file or directory: './content/Gujarati/02-IPP-02.jpg'
        Topic sourceid:./content/Gujarati/03-DM-02: ./content/Gujarati/03-DM-02.jpg
           [Errno 2] No such file or directory: './content/Gujarati/03-DM-02.jpg'
        Topic sourceid:./content/Gujarati/03-IP-01: ./content/Gujarati/03-IP-01.jpg
           [Errno 2] No such file or directory: './content/Gujarati/03-IP-01.jpg'
        Topic sourceid:./content/Gujarati/04-DM-01: ./content/Gujarati/04-DM-01.jpg
           [Errno 2] No such file or directory: './content/Gujarati/04-DM-01.jpg'
        Topic sourceid:./content/Gujarati/05-IPP-02: ./content/Gujarati/05-IPP-02.jpg
           [Errno 2] No such file or directory: './content/Gujarati/05-IPP-02.jpg'
Getting file diff...

Checking if files exist on Kolibri Studio...
        Got file diff for 114 out of 114 files
Uploading files...

Uploading 0 new file(s) to Kolibri Studio...
Creating channel...

Creating tree on Kolibri Studio...
   Creating channel સીએસપાઠશાળા - ગુજરાતી (Import)
        Preparing fields...
(0 of 120 uploaded)    Processing સીએસપાઠશાળા - ગુજરાતી (Import) (ChannelNode)
(10 of 120 uploaded)       Processing સરળ સૂચનાઓ [01-DA-01] (TopicNode)
(13 of 120 uploaded)       Processing દૈનિક જીવનમાં ખૂબ સરળ પેટર્ન [01-IPP-01] (TopicNode)
(16 of 120 uploaded)       Processing પદ્ધતિસરની ગણતરી [01-LCR-01] (TopicNode)
(19 of 120 uploaded)       Processing પદ્ધતિસરની ગણતરી [01-LCR-02] (TopicNode)
(22 of 120 uploaded)       Processing સમસ્યા ઉકેલવી: શૂન્ય ચોકડી [01-PS-06] (TopicNode)
(25 of 120 uploaded)       Processing હેપી નકશાઓ 1 [01-PS-11] (TopicNode)
(28 of 120 uploaded)       Processing હેપી નકશાઓ 2 [01-PS-12] (TopicNode)
(31 of 120 uploaded)       Processing લૂપનો પરિચય [01-PS-14] (TopicNode)
(34 of 120 uploaded)       Processing બિંદુઓ ને જોડો [01-PS-24] (TopicNode)
(37 of 120 uploaded)       Processing તર્કશાસ્ત્ર કોયડાઓ [01-PS-33] (TopicNode)
(50 of 120 uploaded)       Processing સૂચનોનો સમૂહ અમલમાં મૂકવો [02-DA-01] (TopicNode)
(53 of 120 uploaded)       Processing પુનરાવર્તિત પ્રક્રિયાઓ [02-IPP-01] (TopicNode)
(56 of 120 uploaded)       Processing પુનરાવર્તિત પ્રક્રિયાઓ [02-IPP-02] (TopicNode)
(59 of 120 uploaded)       Processing પેટર્ન શોધવાનો પરિચય [02-PS-15] (TopicNode)
(62 of 120 uploaded)       Processing તર્કશાસ્ત્ર કોયડા [02-PS-33] (TopicNode)
(64 of 120 uploaded)       Processing નાના જાણીતા કાર્યોમાં મોટી સમસ્યાઓનું વિભાજન [03-DA-01] (TopicNode)
(67 of 120 uploaded)       Processing મોટી સમસ્યાઓનું નાના કાર્યોમાં વિભાજન [03-DA-02] (TopicNode)
(70 of 120 uploaded)       Processing કોયડા [03-DM-02] (TopicNode)
(72 of 120 uploaded)       Processing ગોઠવણી અને વિશ્લેષણ [03-IP-01] (TopicNode)
(75 of 120 uploaded)       Processing પુનરાવર્તિત પેટર્ન [03-IPP-01] (TopicNode)
(88 of 120 uploaded)       Processing ગણતરી અને સૂચિ [03-LCR-01] (TopicNode)
(91 of 120 uploaded)       Processing સંયોજનોની ગણતરી [03-LCR-02] (TopicNode)
(95 of 120 uploaded)       Processing અલ્ગોરિધમ [04-ALG-01] (TopicNode)
(98 of 120 uploaded)       Processing જોડીયા સંબંધો [04-DM-01] (TopicNode)
(100 of 120 uploaded)       Processing ગ્રાફ પાથ [04-DM-02] (TopicNode)
(103 of 120 uploaded)       Processing નાના કરવાની (સંકોચનની) (કમ્પ્રેસન) સરળ પ્રવૃત્તિઓ [04-IP-01] (TopicNode)
(106 of 120 uploaded)       Processing પુનરાવર્તન રચના અને પ્રક્રિયાઓ [04-IPP-01] (TopicNode)
(110 of 120 uploaded)       Processing સુડોકુ [04-LCR-01] (TopicNode)
(114 of 120 uploaded)       Processing પુનરાવર્તિત પેટર્ન અને પ્રક્રિયાઓ / સપ્રમાણતા [05-IPP-01] (TopicNode)
(117 of 120 uploaded)       Processing પુનરાવર્તિત પદ્ધતિઓ અને પ્રક્રિયાઓ / ગણિતમાં સરળ પેટર્ન [05-IPP-02] (TopicNode)
   All nodes were created successfully.
Upload time: 32.484305s


DONE: Channel created at https://api.studio.learningequality.org/channels/168f0b25964b56d7a2f95b9e4f881026/edit

More than 20 files have changed, and their md5sums differ as well. Two sample files, old and new, are attached.

01-LCR-01-PPT-Gujarati_old.zip
01-LCR-01-PPT-Gujarati_new.zip

Possible to create invalid ThumbnailFile

Description

Ricecooker will upload an invalid image file, e.g. the HTML content of a "Page not found" error page.

see https://sentry.io/learningequality/studio/issues/755780740/activity/

Solution

Add a source validation check somewhere here:
https://github.com/learningequality/ricecooker/blob/master/ricecooker/classes/files.py#L348-L350

Levels of check:

  1. check status code 200
  2. check mime-type is one of jpg/png/jpeg
  3. try to open downloaded file with PIL and make sure it is an image.

Dave is looking into this for the Nokkta chef, but I'm opening an issue so we can fix it ricecooker-wide.
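The three levels of check listed above could be sketched roughly as follows. This is a stdlib-only illustration, not ricecooker's code: the function name `looks_like_thumbnail` and its arguments are hypothetical, and the third level is approximated by comparing file signatures (magic bytes) instead of opening the file with PIL.

```python
# Hypothetical helper; a signature check stands in for the PIL-based check.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"   # first 8 bytes of every PNG file
JPEG_MAGIC = b"\xff\xd8\xff"       # first 3 bytes of every JPEG file
ALLOWED_MIME_TYPES = {"image/jpeg", "image/jpg", "image/png"}


def looks_like_thumbnail(status_code, content_type, data):
    """Return True only if the response plausibly carries a JPEG or PNG image."""
    # Level 1: the request must have succeeded.
    if status_code != 200:
        return False
    # Level 2: the declared MIME type must be an accepted image type.
    mime = (content_type or "").split(";")[0].strip().lower()
    if mime not in ALLOWED_MIME_TYPES:
        return False
    # Level 3 (approximation): the payload must start with an image signature.
    return data.startswith(PNG_MAGIC) or data.startswith(JPEG_MAGIC)
```

An HTML error page served with status 200 and a wrong `image/png` header is rejected by the last step, which is the case described in this issue.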

Require every file to have a language

I was trying to parse the lang field of a file in Kolibri, and it was set to "None", which is not terribly helpful. All files have a language associated with them that describes the language the content is in.

Incremental cheffing

Description

Ability to add content to an existing chef_tree on studio incrementally.

Use case (Alejandro)

The chem libretexts has 4 main categories: 'Courses', 'Bookshelves', 'Homework' and 'Ancillary Materials'. The Courses category is really huge; it takes 2 days to scrape and upload the materials to Studio. If something goes wrong while scraping the Homework category, I have to wait again for work that was already done.

We could add an option like --merge, for example.

I would like to run

./sushichef --merge --category='Courses'

then next day

./sushichef --merge --category='Bookshelves'

and again

./sushichef --merge --category='Homework'

and not overwrite the previously uploaded content.

(or merge the json_tree file generated after scraping)
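The idea of merging JSON trees could look something like the sketch below. It assumes, purely for illustration, that each node is a dict with a `source_id` key and an optional `children` list; the function name `merge_trees` is hypothetical, and this is not an existing ricecooker feature.

```python
def merge_trees(existing, new):
    """Merge `new` into `existing` in place, matching children by source_id.

    Children present only in `new` are appended; children present in both
    are merged recursively, so earlier content is kept, not overwritten.
    """
    new_children = new.get("children", [])
    if not new_children:
        return existing
    children = existing.setdefault("children", [])
    by_id = {child["source_id"]: child for child in children}
    for child in new_children:
        match = by_id.get(child["source_id"])
        if match is None:
            children.append(child)
            by_id[child["source_id"]] = child
        else:
            merge_trees(match, child)
    return existing
```

With this shape, a run that scraped only 'Bookshelves' can be merged into a tree that already holds 'Courses' without touching the existing 'Courses' subtree.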

ffprobe fails hard on corrupted VideoFile

  • ricecooker version: 0.6.3
  • Python version: 3.4 and 3.6
  • Operating System: Debian and Mac OS X

Description

If a bad VideoFile is added to a tree, e.g. a zero-bytes sized file, then ricecooker crashes when trying to get the video info. I believe the crash happens in guess_video_preset_by_resolution :

return self.preset or guess_video_preset_by_resolution(config.get_storage_path(self.filename))

which calls
https://github.com/learningequality/pressurecooker/blob/master/pressurecooker/videos.py#L9

What I Did

Tried to download all the languages for MIT Blossoms, which include a corrupt mp4 file (zero bytes). The chef cannot test for empty files since it's ricecooker that does the downloading.

~/Projects/FLECode/ricecooker/ricecooker/classes/files.py in get_preset(self)
    378         config.LOGGER.info('In get_preset for VideoFile ' + str(self.path))
    379         video_path = config.get_storage_path(self.filename)
--> 380         guessed_preset = guess_video_preset_by_resolution(video_path)
    381         return self.preset or guessed_preset
    382 

~/Projects/FLECode/ricecooker/venv/lib/python3.6/site-packages/pressurecooker/videos.py in guess_video_preset_by_resolution(videopath)
     10     try:
     11         result = subprocess.check_output(['ffprobe', '-v', 'error', '-print_format', 'json', '-show_entries',
---> 12                                       'stream=width,height', '-of', 'default=noprint_wrappers=1', str(videopath)])
     13     except OSError:
     14         return format_presets.VIDEO_HIGH_RES

/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py in check_output(timeout, *popenargs, **kwargs)
    334 
    335     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
--> 336                **kwargs).stdout
    337 
    338 

/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py in run(input, timeout, check, *popenargs, **kwargs)
    416         if check and retcode:
    417             raise CalledProcessError(retcode, process.args,
--> 418                                      output=stdout, stderr=stderr)
    419     return CompletedProcess(process.args, retcode, stdout, stderr)
    420 

CalledProcessError: Command '['ffprobe', '-v', 'error', '-print_format', 'json', '-show_entries', 'stream=width,height', '-of', 'default=noprint_wrappers=1', 'storage/4/3/43e933b2681769128ff503427253c514.mp4']' returned non-zero exit status 1.

Proposed solution

Add error handling in ricecooker for empty or corrupt video files.
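A minimal sketch of such error handling, assuming a hypothetical wrapper named `safe_ffprobe` (not part of ricecooker or pressurecooker). It reuses the ffprobe arguments shown in the traceback above and returns None instead of raising when the file is missing, empty, or rejected by ffprobe.

```python
import os
import subprocess


def safe_ffprobe(videopath):
    """Return ffprobe's raw output for `videopath`, or None if it cannot be probed."""
    # Zero-byte or missing files can be rejected without running ffprobe at all.
    if not os.path.isfile(videopath) or os.path.getsize(videopath) == 0:
        return None
    cmd = ['ffprobe', '-v', 'error', '-print_format', 'json', '-show_entries',
           'stream=width,height', '-of', 'default=noprint_wrappers=1', str(videopath)]
    try:
        return subprocess.check_output(cmd)
    except (OSError, subprocess.CalledProcessError):
        # OSError: ffprobe is not installed; CalledProcessError: non-zero exit status.
        return None
```

The caller can then decide whether a None result means skipping the file, falling back to a default preset, or reporting the file as a failed download.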

Failed to create descendants for 3 node(s) for magogenie contents.

  • ricecooker version: 0.5.4
  • Python version: 3.5.2
  • Operating System: Mac OS X El Capitan

Description

I am facing an issue while uploading magogenie content using ricecooker: "Failed to create descendants for 3 node(s)."

While uploading magogenie content, ricecooker shows "Tree is valid", and each subtopic (topic node) has its descendants with a question count. But when the topic nodes are processed, I get the warning "Internal server error". See the Processed_tree screenshot.

I ran the same script last week, and at that time all the nodes were created with their descendants, but today some of the topic nodes failed while uploading content. Was there any recent change made to the content curation server?

Screenshot

Build_tree:
image

Processed_tree:
image
image

What I Did

I ran the sushi chef script using the command below:
python -m ricecooker uploadchannel magogenie.py --token=bcea45e4eb100507afe9249772a1b22493c46620 -u -v --warn --prompt

Need to update examples/sample_program.py -- missing resources

I noticed this today:

	Audio aaaa4d: https://upload.wikimedia.org/wikipedia/commons/b/ba/Rice_grains_(IRRI)
	   404 Client Error: Not Found for url: https://upload.wikimedia.org/wikipedia/commons/b/ba/Rice_grains_(IRRI)
	Question ddddd: ka-perseus-graphie.s3.amazonaws.com/907dec1b45fb177f0937fa521b7af03fb837f0bd
	   [Errno 2] No such file or directory: 'ka-perseus-graphie.s3.amazonaws.com/907dec1b45fb177f0937fa521b7af03fb837f0bd.svg'

Buried type error in exception case in sushi_bar_client

Whilst trying to get a ricecooker up and running, I hit a chain of issues which might be worth fixing.

self.log_ws, self.log_handler = self.__config_logger()

assumes .__config_logger() returns two items. But it returns None if .run_id is falsey.

def __config_logger(self):
    if not self.run_id:
        return None

.run_id is the return value of .__create_channel_run

self.run_id = self.__create_channel_run(channel, username, token)

which has a broad sweeping exception which will silence the exception and return None.

(The exception in this case was a 400 for the post 'https://sushibar.learningequality.org/api/channelruns' -- will dig deeper separately; suspect API key wrong.)

I suggest .__create_channel_run possibly shouldn't silence the exception at all since it's probably impossible to recover gracefully from, and it should just crash ASAP to avoid later type errors. (It might be useful elsewhere).
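The failure mode described here can be reproduced in isolation. The classes below are hypothetical stand-ins, not the real sushi_bar_client code: one returns a bare None (which cannot be unpacked into two names), and the other returns a pair so the caller's unpacking always works.

```python
class BrokenClient:
    """Stand-in for the reported behaviour: returns None when run_id is falsy."""

    def __init__(self, run_id):
        self.run_id = run_id

    def config_logger(self):
        if not self.run_id:
            return None  # unpacking this into two names raises TypeError
        return "log_ws", "log_handler"  # placeholder values for the sketch


class SaferClient(BrokenClient):
    """One possible fix: always return a 2-tuple."""

    def config_logger(self):
        if not self.run_id:
            return None, None
        return "log_ws", "log_handler"


def unpack(client):
    """Mimic the caller, which always unpacks two values."""
    log_ws, log_handler = client.config_logger()
    return log_ws, log_handler
```

Returning a pair only removes the TypeError; whether the earlier exception should be silenced at all is the separate question raised in this issue.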

Reduce duplication on ContentNode subclasses

There's a fair bit of duplication on the subclasses of ContentNode; e.g.:

  • the Attributes list in the docstring (perhaps it can just list the extra attributes specific to this kind, and then refer to the shared attributes from the docstring for ContentNode)
  • the contents of the to_dict method are duplicated on a number of subclasses; perhaps it can just call to_dict on the parent and then add in its own fields to that dict
  • the arguments to __init__ are repeated on subclasses; instead, they could be captured in **kwargs and then passed along to the super via **kwargs.
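The last two suggestions could be sketched as below. The class names `ContentNodeSketch` and `VideoNodeSketch` and their fields are simplified, hypothetical stand-ins chosen for illustration; they do not mirror the real ricecooker classes.

```python
class ContentNodeSketch:
    """Hypothetical base class holding the shared fields."""

    kind = "content"

    def __init__(self, source_id, title, description=""):
        self.source_id = source_id
        self.title = title
        self.description = description

    def to_dict(self):
        # Shared fields are serialized once, in the parent.
        return {
            "kind": self.kind,
            "source_id": self.source_id,
            "title": self.title,
            "description": self.description,
        }


class VideoNodeSketch(ContentNodeSketch):
    """Hypothetical subclass that only declares what is specific to it."""

    kind = "video"

    def __init__(self, source_id, title, subtitle_language=None, **kwargs):
        # Shared arguments pass through **kwargs instead of being repeated.
        super().__init__(source_id, title, **kwargs)
        self.subtitle_language = subtitle_language

    def to_dict(self):
        data = super().to_dict()  # start from the parent's dict
        data["subtitle_language"] = self.subtitle_language
        return data
```

A new shared attribute then only needs to be added to the parent's `__init__` and `to_dict`.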

Drop content nodes of YouTube videos that have been removed

  • ricecooker version: 0.6.0
  • Python version: 3.6.1
  • Operating System: MacOS Sierra (10.12.2)

Description

On the work-in-progress te-sushi-chef, 9 YouTube videos failed to download because they were removed from the YouTube channel. Links to these videos still exist on the scraped website, so they get scraped, but the sushi chef doesn't know that the video doesn't exist till the ricecooker attempts to download it (via

It would be good if ricecooker can just drop that content node, so that the content doesn't get linked to in the topic tree on Kolibri (currently, for non-existing videos, a black thumbnail is shown, and clicking into the video shows an endless Kolibri logo spinner).
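Dropping such nodes could be sketched as a simple tree-pruning pass. The dict-based tree, the `source_id` key, and the function name `prune_failed` are assumptions made for this illustration; ricecooker's actual node objects are not modelled here.

```python
def prune_failed(node, failed_ids):
    """Remove, in place, every descendant whose source_id is in `failed_ids`."""
    if "children" not in node:
        return node
    kept = []
    for child in node["children"]:
        if child["source_id"] in failed_ids:
            continue  # drop the node whose download failed
        prune_failed(child, failed_ids)
        kept.append(child)
    node["children"] = kept
    return node
```

The set of failed ids would come from whatever the download step reports as failed, so removed videos never appear in the topic tree.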

What I Did

In the te-sushi-chef repository with PR 2 checked out, ran:

./te_chef.py -v --reset --token=... --stage

... and got the following on upload:

   9 file(s) have failed to download
        Video 33435: {'web_url': 'http://www.youtube.com/watch?v=mp2XNLBSW1s', 'download_settings': {'format': 'bestvideo[height<=720][ext=mp4]+bestaudio[ext=m4a]/best[height<=720][ext=mp4]', 'outtmpl': '/var/folders/z8/9d4m1_l55wl7q0ff6shs8c8w0000gn/T/f1da5bf042d42c0679ca8c7fbbd68edf.mp4'}, 'preset': None, 'language': None, 'default_ext': None, 'source_url': None, 'node': <ricecooker.classes.nodes.VideoNode object at 0x107480240>, 'error': '\x1b[0;31mERROR:\x1b[0m mp2XNLBSW1s: YouTube said: This video has been removed by the user.'}
           ERROR: mp2XNLBSW1s: YouTube said: This video has been removed by the user.
        Video 33340: {'web_url': 'http://www.youtube.com/watch?v=uRRco6mJp5I', 'download_settings': {'format': 'bestvideo[height<=720][ext=mp4]+bestaudio[ext=m4a]/best[height<=720][ext=mp4]', 'outtmpl': '/var/folders/z8/9d4m1_l55wl7q0ff6shs8c8w0000gn/T/45f20b0dc6561137fc020d2bd9151529.mp4'}, 'preset': None, 'language': None, 'default_ext': None, 'source_url': None, 'node': <ricecooker.classes.nodes.VideoNode object at 0x107659eb8>, 'error': '\x1b[0;31mERROR:\x1b[0m uRRco6mJp5I: YouTube said: This video has been removed by the user.'}
           ERROR: uRRco6mJp5I: YouTube said: This video has been removed by the user.
        Video 33370: {'web_url': 'http://www.youtube.com/watch?v=hZnXTC4ehkA', 'download_settings': {'format': 'bestvideo[height<=720][ext=mp4]+bestaudio[ext=m4a]/best[height<=720][ext=mp4]', 'outtmpl': '/var/folders/z8/9d4m1_l55wl7q0ff6shs8c8w0000gn/T/2ff4c3a31ef744d9163fbf277d5f8775.mp4'}, 'preset': None, 'language': None, 'default_ext': None, 'source_url': None, 'node': <ricecooker.classes.nodes.VideoNode object at 0x10793e0b8>, 'error': '\x1b[0;31mERROR:\x1b[0m hZnXTC4ehkA: YouTube said: This video has been removed by the user.'}
           ERROR: hZnXTC4ehkA: YouTube said: This video has been removed by the user.
        Video 33382: {'web_url': 'http://www.youtube.com/watch?v=VAsIBkbyoGY', 'download_settings': {'format': 'bestvideo[height<=720][ext=mp4]+bestaudio[ext=m4a]/best[height<=720][ext=mp4]', 'outtmpl': '/var/folders/z8/9d4m1_l55wl7q0ff6shs8c8w0000gn/T/e4652972c584c2f1c2782aab744b694e.mp4'}, 'preset': None, 'language': None, 'default_ext': None, 'source_url': None, 'node': <ricecooker.classes.nodes.VideoNode object at 0x1076f26a0>, 'error': '\x1b[0;31mERROR:\x1b[0m VAsIBkbyoGY: YouTube said: This video has been removed by the user.'}
           ERROR: VAsIBkbyoGY: YouTube said: This video has been removed by the user.
        Video 33202: {'web_url': 'http://www.youtube.com/watch?v=3xfeX4Aqrxc', 'download_settings': {'format': 'bestvideo[height<=720][ext=mp4]+bestaudio[ext=m4a]/best[height<=720][ext=mp4]', 'outtmpl': '/var/folders/z8/9d4m1_l55wl7q0ff6shs8c8w0000gn/T/c07f48a3cd3876593a7a10c3e46ba671.mp4'}, 'preset': None, 'language': None, 'default_ext': None, 'source_url': None, 'node': <ricecooker.classes.nodes.VideoNode object at 0x1076737b8>, 'error': '\x1b[0;31mERROR:\x1b[0m 3xfeX4Aqrxc: YouTube said: This video does not exist.'}
           ERROR: 3xfeX4Aqrxc: YouTube said: This video does not exist.
        Video 33254: {'web_url': 'http://www.youtube.com/watch?v=KcH0mouzBj4', 'download_settings': {'format': 'bestvideo[height<=720][ext=mp4]+bestaudio[ext=m4a]/best[height<=720][ext=mp4]', 'outtmpl': '/var/folders/z8/9d4m1_l55wl7q0ff6shs8c8w0000gn/T/8694c3f1caf47d07acdbc1b8f3a48158.mp4'}, 'preset': None, 'language': None, 'default_ext': None, 'source_url': None, 'node': <ricecooker.classes.nodes.VideoNode object at 0x1074e9c88>, 'error': '\x1b[0;31mERROR:\x1b[0m KcH0mouzBj4: YouTube said: This video does not exist.'}
           ERROR: KcH0mouzBj4: YouTube said: This video does not exist.
        Video 33284: {'web_url': 'http://www.youtube.com/watch?v=jSoMnywqXj0', 'download_settings': {'format': 'bestvideo[height<=720][ext=mp4]+bestaudio[ext=m4a]/best[height<=720][ext=mp4]', 'outtmpl': '/var/folders/z8/9d4m1_l55wl7q0ff6shs8c8w0000gn/T/a09bc155a4e1af1ef7cc8fbe36dbb95c.mp4'}, 'preset': None, 'language': None, 'default_ext': None, 'source_url': None, 'node': <ricecooker.classes.nodes.VideoNode object at 0x107064c18>, 'error': '\x1b[0;31mERROR:\x1b[0m jSoMnywqXj0: YouTube said: This video has been removed by the user.'}
           ERROR: jSoMnywqXj0: YouTube said: This video has been removed by the user.
        Video 33273: {'web_url': 'http://www.youtube.com/watch?v=k8747TB3Re8', 'download_settings': {'format': 'bestvideo[height<=720][ext=mp4]+bestaudio[ext=m4a]/best[height<=720][ext=mp4]', 'outtmpl': '/var/folders/z8/9d4m1_l55wl7q0ff6shs8c8w0000gn/T/329619ec38e973ba9e08bd2789b3fe7a.mp4'}, 'preset': None, 'language': None, 'default_ext': None, 'source_url': None, 'node': <ricecooker.classes.nodes.VideoNode object at 0x10776fb38>, 'error': '\x1b[0;31mERROR:\x1b[0m k8747TB3Re8: YouTube said: This video does not exist.'}
           ERROR: k8747TB3Re8: YouTube said: This video does not exist.
        Video 33315: {'web_url': 'http://www.youtube.com/watch?v=1Iv3HPkDjtQ', 'download_settings': {'format': 'bestvideo[height<=720][ext=mp4]+bestaudio[ext=m4a]/best[height<=720][ext=mp4]', 'outtmpl': '/var/folders/z8/9d4m1_l55wl7q0ff6shs8c8w0000gn/T/f3b344098e031e8c9e1a95970d05f05c.mp4'}, 'preset': None, 'language': None, 'default_ext': None, 'source_url': None, 'node': <ricecooker.classes.nodes.VideoNode object at 0x107161f28>, 'error': '\x1b[0;31mERROR:\x1b[0m 1Iv3HPkDjtQ: YouTube said: This video does not exist.'}
           ERROR: 1Iv3HPkDjtQ: YouTube said: This video does not exist.

More flexible add_folder and add_file for souschefs

  • ricecooker version: 0.6.11
  • Python version: all
  • Operating System: all

Description

The current DataWriter API assumes the display name and filesystem name for files and folders are identical. This is unnecessary coupling and leads to complications:

  • We might want to use display names with spaces / special characters that would not be valid filenames
  • We might want to use special filenames in the archive to encode a particular order (since file order is determined by os.walk, which is alphabetical)

Proposed change 1

Change the signature of add_folder from

def add_folder(self, path, title, ...

to

def add_folder(self, path, dirname, title=None, ...

where dirname will be the dir name written to the zip file and the optional title argument (str) will be written to the Title * column in the CSV.

Proposed change 2

Change the signature of add_file from

def add_file(self, path, title, download_url, write_data=True, 

to

def add_file(self, path, filename, download_url, title=None, write_data=True, 

where filename will be the filename written within zip and title will be written to the Title * column in CSV file.

Comments

I think the above changes should be backward compatible with existing souschef scripts (assuming all calls to add_folder and add_file were done with keyword arguments and do not depend on the order of kwargs).
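Proposed change 1 could be sketched as follows. `DataWriterSketch` is a hypothetical stand-in that only records CSV rows in memory; the "Path *" column name and the treatment of `path` as the parent folder are assumptions of this sketch, while "Title *" comes from the description above.

```python
class DataWriterSketch:
    """Hypothetical stand-in that collects CSV rows instead of writing a zip."""

    def __init__(self):
        self.rows = []

    def add_folder(self, path, dirname, title=None):
        # dirname is the name used on disk / in the archive; title is what
        # is displayed, falling back to dirname when it is not given.
        self.rows.append({
            "Path *": "{}/{}".format(path, dirname),
            "Title *": title if title is not None else dirname,
        })
        return self.rows[-1]
```

In this sketch a two-argument positional call keeps working, because the second argument then serves as both the directory name and the title.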

Error with Python 3.4 and subprocess.run

  • ricecooker version: 0.5.8
  • Python version: 3.4
  • Operating System: Ubuntu 14.04

Description

After importing
from ricecooker.utils.html import download_file

and trying to run a chef, I got:
from subprocess import run
ImportError: cannot import name 'run'

The python doc says: "The run() function was added in Python 3.5; if you need to retain compatibility with older versions, see the Older high-level API section."
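One way to stay compatible with Python 3.4 is to use the older high-level API that the documentation refers to. The sketch below uses `subprocess.check_output`, which predates `run()`; the wrapper name `run_and_capture` is hypothetical, and this is not necessarily how ricecooker resolved the issue.

```python
import subprocess
import sys


def run_and_capture(args):
    """Run a command and return its stdout as bytes.

    check_output raises CalledProcessError on a non-zero exit status,
    so callers can handle failures explicitly.
    """
    return subprocess.check_output(args)


if __name__ == "__main__":
    # Example: run the current interpreter and capture what it prints.
    print(run_and_capture([sys.executable, "-c", "print('ok')"]))
```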
