haroldmills / vesper
Open source software for acoustic monitoring of nocturnal bird migration.
License: MIT License
The archiver --stations, --excluded-stations, and --dates options each precede a list of items, and if such an option is last it consumes the rest of the command line items, including the input and output directories. When this happens, the archiver currently prints the unhelpful error message "too few arguments". We should detect when one of the above options occurs last and show a more helpful error message.
The viewer is currently not very robust to preferences errors. For example, it requires that an initial station (and detector and clip class) be specified, and quits with an exception if one isn't. It should just log a message and choose an arbitrary station (e.g. the first in the combo box) instead. We should review where preferences are used and behave more gracefully when they are not present or have unusable values.
The user should be able to undo and redo manual classifications in the viewer. This will make it easier to correct mistakes, and for new users to experiment.
A user should be able to export data from archives for import into other programs, such as Excel and R. Clips should be exportable at least to WAV files, perhaps with user-specified name formats and hierarchical directory structures. Clip metadata (e.g. station, detector, start time, clip class) and clip counts should also be exportable, at least to CSV files.
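A minimal sketch of the CSV part of this export, using Python's csv module. The four fields come from the description above; the Clip tuple, the export_clip_metadata function, and the sample values are hypothetical and do not reflect the existing archive API.

```python
import csv
import io
from collections import namedtuple

# Hypothetical clip record; the real archive clip type may differ.
Clip = namedtuple('Clip', ['station', 'detector', 'start_time', 'clip_class'])


def export_clip_metadata(clips, csv_file):
    """Write a header row followed by one CSV row per clip."""
    writer = csv.writer(csv_file, lineterminator='\n')
    writer.writerow(Clip._fields)
    for clip in clips:
        writer.writerow(clip)


# Made-up sample data, for illustration only.
clips = [
    Clip('Station A', 'Tseep', '2014-09-02 21:15:03', 'Call.AMRE'),
    Clip('Station A', 'Tseep', '2014-09-02 21:17:40', 'Noise'),
]

buffer = io.StringIO()
export_clip_metadata(clips, buffer)
```

Writing to any file-like object keeps the function usable both for files on disk and for in-memory buffers in tests.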
The viewer clips window currently navigates to the next page when the user presses the space key. To this we should add the following commands:
Bill Evans recently showed me that the viewer clips window becomes nonfunctional if one looks at CLC noise clips for the night of 2014-09-02 in an example data set he is preparing. I was unable to reproduce this problem on Mac OS X, but I do see it on Windows 8 (Bill uses some version of Windows).
The problem appears to be related to zero-length spectrograms. The CLC clip with time 2014-09-03 01:02:08 has only 113 samples (the .wav file has 270 bytes), and so its spectrogram has zero spectra for the default window size of 128 samples. When I try to display such a spectrogram in the viewer (on Windows, but not on Mac OS X), an exception is raised from within Matplotlib and I see the following stack trace:
Traceback (most recent call last):
  File "C:\Users\Harold\Anaconda\lib\site-packages\matplotlib\backends\backend_qt5.py", line 338, in resizeEvent
    self.draw()
  File "C:\Users\Harold\Anaconda\lib\site-packages\matplotlib\backends\backend_qt5agg.py", line 148, in draw
    FigureCanvasAgg.draw(self)
  File "C:\Users\Harold\Anaconda\lib\site-packages\matplotlib\backends\backend_agg.py", line 456, in draw
    self.renderer = self.get_renderer(cleared=True)
  File "C:\Users\Harold\Anaconda\lib\site-packages\matplotlib\backends\backend_agg.py", line 473, in get_renderer
    self.renderer = RendererAgg(w, h, self.figure.dpi)
  File "C:\Users\Harold\Anaconda\lib\site-packages\matplotlib\backends\backend_agg.py", line 94, in __init__
    self._renderer = _RendererAgg(int(width), int(height), dpi, debug=False)
ValueError: width and height must each be below 32768
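One possible guard, sketched under assumptions: count the complete analysis windows that fit in a clip and skip spectrogram display when the count is zero. The window size of 128 samples is from the description above; the hop size of 64 samples and both function names are assumptions for illustration, not the viewer's actual code.

```python
def num_spectra(num_samples, window_size, hop_size):
    """Return the number of complete spectrogram windows in a clip."""
    if num_samples < window_size:
        return 0
    return 1 + (num_samples - window_size) // hop_size


def can_show_spectrogram(num_samples, window_size=128, hop_size=64):
    """Return True if the clip yields at least one spectrum.

    The hop size default is an assumption, not the viewer's setting.
    """
    return num_spectra(num_samples, window_size, hop_size) > 0
```

For the 113-sample clip described above, num_spectra returns zero for a 128-sample window, so the viewer could show a placeholder instead of handing an empty image to Matplotlib.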
The Old Bird NFC Wrangler should collect the paths of clip files in directories whose dates are inconsistent with the times indicated in the file names and display the file paths after all files have been inspected.
This enhancement will make it more efficient to browse and classify large numbers of clips with the viewer, since the space key is both larger than the page down key and closer to the keys that are pressed to classify clips.
While the user is classifying clips on one viewer clips window page, the viewer should prepare the next page for display so that the transition to that page can be faster if and when it comes.
The NFC tools should create persistent logs to which they write error, warning, and status messages. Using logs will allow us to record messages that we might not necessarily want to interrupt the user's work with, though there should be some sort of indication within a GUI that messages have been logged, and perhaps UI for viewing them. The persistence will help with debugging and support. The Python logging module will probably be useful.
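A minimal sketch of a persistent log built on the standard logging module. The logger name, file name, size limit, and message format are assumptions chosen for illustration; only the use of the logging module comes from the text above.

```python
import logging
import logging.handlers
import os
import tempfile


def create_logger(name, log_file_path):
    """Create a logger that appends messages to a rotating log file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Keep up to five old log files of about 1 MB each (assumed limits).
    handler = logging.handlers.RotatingFileHandler(
        log_file_path, maxBytes=1000000, backupCount=5)
    handler.setFormatter(logging.Formatter(
        '%(asctime)s %(levelname)s %(name)s: %(message)s'))
    logger.addHandler(handler)
    return logger


# A temporary directory stands in for a real log location here.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, 'nfc_viewer.log')
logger = create_logger('nfc_viewer', log_path)
logger.warning('No initial station preference found; using the first station.')
```

A GUI could attach a second handler to the same logger to drive an indicator that messages have been logged.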
If the archiver --dates or --stations option is specified last on the command line, before the required archive paths, the argument parser attempts to process the paths as dates or station names and fails. Perhaps instead of --dates and --stations options we should offer --date and --station options, each of which can be specified multiple times, and each occurrence of which adds one date or station?
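A sketch of the repeatable-option idea using argparse's append action. The --station and --date option names are from the proposal above; the program name and the positional argument names input_dir and output_dir are assumptions.

```python
import argparse


def create_parser():
    """Build a parser in which each --station or --date adds one item."""
    parser = argparse.ArgumentParser(prog='nfc_archiver')
    parser.add_argument(
        '--station', action='append', default=[], dest='stations',
        help='station name; may be given more than once')
    parser.add_argument(
        '--date', action='append', default=[], dest='dates',
        help='date; may be given more than once')
    # Positional names are hypothetical.
    parser.add_argument('input_dir')
    parser.add_argument('output_dir')
    return parser


args = create_parser().parse_args(
    ['--station', 'A', '--station', 'B', '--date', '2014-09-02', 'in', 'out'])
```

Because each option occurrence takes exactly one value, an option placed last before the directories can no longer swallow them.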
We've been carrying around two unused main window count display types, namely the month bar chart and the month calendar, for a while now. These should be deleted to simplify our code. The old code will still be in Git if we want it later.
The main NFC Viewer window title should include the name of the archive that is currently open.
The content of the viewer clips window currently has no minimum width. This can cause exceptions (related to #30) on Windows if the window is sized too small by the user. We should give the content a minimum width and height.
After navigation from one page of clips to another in the clips window, one or more play buttons are sometimes visible that should not be. This is apparently because the ClipFigurePlayButton._on_figure_enter method is sometimes called unexpectedly.
The calendars in the main viewer window currently expand and contract with the window. They should have a fixed size, and probably be in a scroll area.
For consistency with GlassOFire, we currently support control modifiers in classification commands. These should not be allowed, however, because of the potential for collisions with menu item keyboard accelerators. I have discussed this change with Bill Evans and he said it would not present a problem for him.
The _parse_date and parse_time methods in particular need tests. Today while working on the time_utils module I happened to realize that the regular expressions those functions use were missing an initial "^" and final "$".
The only navigation options currently offered by the viewer clips window are the page down and page up commands. It should be possible to navigate to an arbitrary place in the sequence of pages. Add a rug plot to the clips window that shows a tick for each clip, and make a click on the plot navigate to the page that contains the time corresponding to the click location. In addition to improving the available navigation options, the plot will provide a useful overview of a night's calling activity.
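A sketch of the click-to-page lookup, assuming the clip times are sorted and that every page holds a fixed number of clips. The second assumption is a simplification, since the viewer lays out clips in rows of a given duration, and the function name is hypothetical.

```python
from bisect import bisect_left


def page_index_for_time(clip_times, click_time, page_size):
    """Return the index of the page holding the first clip at or after
    click_time, or the last page if click_time follows all clips.

    clip_times must be sorted in ascending order.
    """
    if not clip_times:
        return None
    i = bisect_left(clip_times, click_time)
    i = min(i, len(clip_times) - 1)
    return i // page_size
```

The times can be any mutually comparable values, for example datetime objects or seconds since the start of the night.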
On both Windows and OS X, if one scrolls the archive calendar by flicking one's finger on an Apple Magic Mouse, scrolling stops as soon as the mouse is over one of the month calendars. Scrolling should not stop since the month calendars don't do anything with scroll events.
The station and detector menus in the main window of the NFC Viewer currently show all of the stations and detectors of an archive, regardless of whether or not there are clips for a station. It should instead show only those stations and detector combinations for which the archive contains clips.
Bill Evans and I found on Friday that the archiver did not appear to be updating the classifications of clips that it first encountered as calls and later as something more specific, like AMRE. We discovered this when we ran the archiver on some data and were surprised that while there were calls in the resulting archive, none of them were classified to species. The problem appears to be in the _visit_clip_file_aux method of the _OldBirdSourceDirectoryVisitor class of the nfc_archiver script, which detects when a clip is reclassified to a more specific class but does not update the classification in the archive accordingly. The bug is a regression, having been introduced in version e960784 of 2014-06-27.
On OS X (I have not tested on Windows yet), when the old_bird_nfc_wrangler script is run in non-dry-run mode, it processes clips at an acceptable rate for a while (between about 100,000 and 200,000 clips), but for some reason processing slows considerably after that. CPU utilization typically drops from tens of percent to just a few percent. Processing does not appear to stop, but it is so slow that the script is not useful for an archive containing millions of clips.
I have spent some time characterizing this issue, and determined that the problem does not occur if the wrangler is modified so that it either copies clip files or adds clips to the SQLite database (where the database is stored in a disk file), but the problem does occur when the wrangler does both. For the time being I have implemented a workaround in which the wrangler adds clips to an SQLite database stored in memory rather than on disk, and writes the database to disk only at the end of processing. This works fine for the 1,800,000 or so clips in the Old Bird summer and fall 2012 data set. But I would still like to understand why it doesn't work to use the SQLite database stored on disk.
It might be helpful to try to pare down the wrangler to a very simple script that writes dummy clip files and creates a row for each one in a single-table SQLite database and see if the problem is still present.
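Such a pared-down script might look like the following sketch. It writes dummy clip files, inserts one row per file into a single-table SQLite database on disk, and records the time taken by each batch so that a slowdown would show up as growing batch times. The table layout, file names, file size, and batch size are all assumptions.

```python
import os
import sqlite3
import time


def run_trial(dir_path, num_clips, batch_size=1000):
    """Write dummy clip files and database rows, returning batch times."""
    db_path = os.path.join(dir_path, 'Test.db')
    conn = sqlite3.connect(db_path)
    conn.execute(
        'create table Clip (id integer primary key, file_name text)')
    batch_times = []
    start = time.time()
    for i in range(num_clips):
        file_name = 'clip_{:07d}.wav'.format(i)
        # Dummy file contents; not a valid WAV file.
        with open(os.path.join(dir_path, file_name), 'wb') as f:
            f.write(b'\0' * 270)
        conn.execute(
            'insert into Clip (file_name) values (?)', (file_name,))
        if (i + 1) % batch_size == 0:
            conn.commit()
            now = time.time()
            batch_times.append(now - start)
            start = now
    conn.commit()
    conn.close()
    return batch_times
```

Running it with a few hundred thousand clips and plotting the returned batch times would show whether the slowdown appears without the rest of the wrangler.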
I believe that I did not encounter this problem when I created the original live test archive in 2013.
We should offer minimum and maximum power preferences, and spectrograms should be rendered with the color palette stretched so that its lower and upper ends (currently white and black) are assigned to the minimum and maximum powers, respectively.
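The stretch can be sketched as a linear map from power to gray level, with powers outside the preferred range clipped to the ends of the palette. The function name and the 8-bit gray scale are assumptions; the viewer may represent its palette differently.

```python
def power_to_gray_level(power, min_power, max_power):
    """Map a power to a gray level from 255 (white) to 0 (black).

    Powers at or below min_power map to white, and powers at or above
    max_power map to black.
    """
    if max_power <= min_power:
        raise ValueError('max_power must exceed min_power')
    fraction = (power - min_power) / (max_power - min_power)
    fraction = min(max(fraction, 0.0), 1.0)
    return int(round(255 * (1.0 - fraction)))
```

Clipping rather than rescaling keeps the mapping the same from clip to clip, so gray levels remain comparable across spectrograms.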
Sometimes, at least, if I scroll the calendar in the main viewer window with an Apple Magic Mouse and then move the pointer to a date on the calendar for which there are clips, one or more (often several) clips windows open for that date, even though I did not click on the date. No such windows should open: a clips window should open if and only if the user clicks the mouse on a date for which there are clips. The problem happens on both Mac OS X 10.9.5 and Windows 8.1 running in Parallels 9.
Station, detector, and clip class data are currently hard-coded into the archiver. As another step toward more generally useful archiving tools, these data should be removed. It might be a good idea to change the current archiver to perform only incremental archiving, and offer a separate means of creating archives and editing their stations, detectors, and clip classes. These could be separate scripts, tools that run in a web browser, and/or tools incorporated into the viewer.
The clip class Call.WTSP.Songtype was used in Bill Evans's 2012 data, but he does not recognize it and says that it should be deleted.
clip_window could be better organized to separate the concern of clip layout. See the design of iOS collection views, for example.
The original implementation of the viewer clip window showed a spectrogram in a plain wx.Panel rather than in a Matplotlib figure canvas, and as I recall it was considerably faster than the Matplotlib version. For users who don't require the flexibility of Matplotlib plots (probably most users), it would be good to implement a PyQt version of the old clip window so they can benefit from the higher performance. It might be best to tackle this issue after the refactoring of issue #3.
There is currently no Archive API for creating, deleting, and modifying stations, detectors, and clip classes. We should provide an API for this purpose. We should also provide a command line program and/or a GUI program for editing archives that uses the API. Bill already needs this, for example to add new stations and clip classes.
Up to this point the clips that the NFC software has had to deal with have all been stamped with local, DST-adjusted (when appropriate) times. This will change in the future to avoid ambiguities when DST ends in the fall (clips recorded in the hour preceding the time change and the hour after the time change are currently both stamped with a time in [01:00:00, 02:00:00)). Accordingly, I suspect that it would be best to stamp archived clips with UTC times rather than local times, and for the NFC software to convert the UTC times to local times as appropriate in UI code. That is, the internal representation of time should be UTC, with conversion to and from local time as needed for the UI.
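The ambiguity can be illustrated with fixed UTC offsets. The sketch assumes a station in the US Eastern time zone, where DST ended at 06:00 UTC on 2014-11-02; the time zone is an assumption for illustration, not something stated about any particular station.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset zones for Eastern daylight and standard time (assumed).
EDT = timezone(timedelta(hours=-4), 'EDT')
EST = timezone(timedelta(hours=-5), 'EST')

# Two distinct instants, one hour apart, on either side of the change.
before = datetime(2014, 11, 2, 5, 30, tzinfo=timezone.utc)
after = datetime(2014, 11, 2, 6, 30, tzinfo=timezone.utc)

# Both convert to the same local wall-clock time of 01:30.
local_before = before.astimezone(EDT)
local_after = after.astimezone(EST)
same_wall_clock = (
    local_before.replace(tzinfo=None) == local_after.replace(tzinfo=None))
```

The UTC values stay distinct and ordered, which is why storing UTC internally and converting only in UI code removes the ambiguity.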
When Bill Evans is classifying with the viewer and classifies the last clip on a page, he would like for the viewer to automatically navigate to the next page and select the first clip on the new page. He would also like for the first clip in a brand new clips window to automatically be selected.
The number of spectrogram rows and the row width in seconds are currently hard-coded into the viewer. The user should be able to specify them.
Currently, the only way for the user to control which classification command set is used in a viewer clips window is via a preference, and the set is the same for all clips windows. There should be a menu or a combo box in each clips window for selecting the command set.
We currently set the Matplotlib backend.qt4 rc parameter in more than one NFC viewer source file. We should specify it in no more than one source file, probably nfc_viewer.py, if indeed it has to be set at all. Not setting the parameter appears to work on my Mac OS X installation. If we do not set the parameter and the user does not have a matplotlibrc file, how does Matplotlib choose a back end? Perhaps it has a way of dynamically figuring out what UI toolkit is in use?
The viewer should allow the user to switch archives, and not display only a single archive chosen at startup. This could be accomplished either via an item in a file menu, which seems more natural for a traditional desktop application, or via a button that brings up a file chooser. It would also be good to include a list of favorite archives specified via the preferences file, and perhaps a list of recent archives.
The efficiency of Bill Evans's classification workflow would be improved if he could classify all clips of a night with a single keystroke. It may be that the only classification he would want to apply in this way is the noise classification, but it might still be a good idea to provide a general means of specifying in a classification command set that a command should apply to all clips of a night. Note that we already provide a means to specify that a command should apply to all clips of a page, so the enhancement should take this into consideration.
The Old Bird NFC wrangler should accept command line options to restrict its operation to a single date or a range of dates. For example, -s and -e options might allow specification of start and end dates, respectively, and a -d option might allow specification of a single date.
In the play buttons that appear over spectrograms in the viewer, the area in which the user must position the hotspot of the mouse cursor for a mouse press to result in a button press is too small. It should be made larger.
To provide a quick fix for #30, I modified the archive class to zero pad short clips to a minimum duration. This is unsatisfactory, however, for two reasons:
We should implement a proper fix, which will involve modifying the clips window to handle short clips gracefully, including when the window is small.
The font used in viewer controls by Canopy looks good, but the one used by Anaconda does not. Try to figure out what font is used by Canopy and how to use it in Anaconda. I think the nice-looking font might just be the default system font. If so, why is it not used in Anaconda?
Because clips collected with the Old Bird NFC detector Tseep have DST-adjusted times, the hour immediately preceding and the hour immediately following the end of DST are indistinguishable from the times. The Old Bird NFC Wrangler should warn of clips with ambiguous times. To do this it will need to know which stations observed DST and when DST ended for the relevant year.
Currently, the viewer calendar is formatted in a way that works fine for some archives but not for others. In particular, the format works fine for archives that have data for consecutive months, for example for a single migration season. It does not work so well for archives that have data over a longer period, including many months for which there are no data. For example, it does not work well for an archive that spans several years but only includes data for the months of the spring and fall migrations. In this case the calendar includes all months, even though half or more of them have no data.
To address this problem, I would like to support a type of preset that specifies a subset of months to be included in the calendar. Such a preset will specify periods, with each period comprising a consecutive sequence of months to be displayed. A month that is not in one of the specified periods will be excluded from the calendar.
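A sketch of how such a preset might be expanded into the list of months to display. The representation of a period as a pair of inclusive (year, month) endpoints and the function name are assumptions; the preset format itself is not yet defined.

```python
def months_in_periods(periods):
    """Return the (year, month) pairs covered by the given periods.

    Each period is ((start_year, start_month), (end_year, end_month)),
    with both endpoints included.
    """
    months = []
    for (start_year, start_month), (end_year, end_month) in periods:
        year, month = start_year, start_month
        while (year, month) <= (end_year, end_month):
            months.append((year, month))
            month += 1
            if month > 12:
                year, month = year + 1, 1
    return months


# Hypothetical preset: spring and fall migration periods of one year.
periods = [((2013, 4), (2013, 6)), ((2013, 8), (2013, 11))]
months = months_in_periods(periods)
```

Any month not returned, such as July in this example, would be left out of the calendar.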
For several months now I have written commit messages for my GitHub projects in the imperative mood, as recommended by the GitHub folks. I was initially reluctant to adopt this convention, since I had found it confusing when I encountered such messages in other people's repositories ("Huh?", I thought, "Whom are you telling to do this? Aren't you describing something that's already done?"), and the past tense of the indicative mood is the obvious choice grammatically. I figured I would try to conform, and see if the initial awkwardness would wear off. Well, it hasn't at all, so I'm giving the GitHub convention notice. If after another month I haven't warmed to it, I will revert to the English convention.
There are several modules, mostly in the nfc.archive package, devoted to walking the directory hierarchies that contain archive clips. These modules include archive_walker, archive_visitor, and several modules containing subclasses of ArchiveVisitor. The levels of the directory hierarchy are fixed in the code, which is both inflexible and unnecessarily complicated. Refactor the code so that the levels of the directory hierarchy are not fixed.
In order to accelerate database operations, the archiver operates on an in-memory copy of the clips database, which it writes out to disk at the end of its operation. So that the original database file ClipDatabase.db remains if something goes wrong with the write, the new database is first written to a file called ClipDatabase.db new, and then the old database is deleted and the new one renamed to ClipDatabase.db if and only if the write succeeds.
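The write-then-rename scheme can be sketched as follows. This is not the archiver's code: it uses sqlite3's Connection.backup (Python 3.7 and later) in place of the archiver's own copy routine, it uses os.replace to combine the delete and rename into one step, and it assumes that a leftover temporary file from an earlier run is stale and safe to discard. The save_db name is hypothetical.

```python
import os
import sqlite3


def save_db(memory_conn, db_path):
    """Write an in-memory database to db_path via a temporary file."""
    new_path = db_path + ' new'
    # Assumption: a leftover temporary file is stale and can be removed.
    if os.path.exists(new_path):
        os.remove(new_path)
    file_conn = sqlite3.connect(new_path)
    try:
        # Copy the whole in-memory database to the temporary file.
        memory_conn.backup(file_conn)
    finally:
        file_conn.close()
    # Reached only if the copy succeeded; replaces any existing database.
    os.replace(new_path, db_path)
```

If the copy raises an exception, the original database file is left untouched and the exception propagates to the caller, which can then report the failure.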
Bill Evans uncovered a problem with this scheme. Somehow he wound up with both ClipDatabase.db and ClipDatabase.db new in an archive instead of just ClipDatabase.db, and when he tried to archive a new day's data incrementally the creation of the new database failed. Here is a stack trace of the error:
Traceback (most recent call last):
  File "/Users/Harold/Documents/Code/Python/NFC/scripts/nfc_archiver.py", line 1238, in <module>
    _main()
  File "/Users/Harold/Documents/Code/Python/NFC/scripts/nfc_archiver.py", line 140, in _main
    _close_archive(archive, logger)
  File "/Users/Harold/Documents/Code/Python/NFC/scripts/nfc_archiver.py", line 363, in _close_archive
    archive.close()
  File "/Users/Harold/Documents/Code/Python/NFC/src/nfc/archive/archive.py", line 187, in close
    self._close_db()
  File "/Users/Harold/Documents/Code/Python/NFC/src/nfc/archive/archive.py", line 198, in _close_db
    _copy_db(self._conn, file_conn)
  File "/Users/Harold/Documents/Code/Python/NFC/src/nfc/archive/archive.py", line 750, in _copy_db
    _create_db_tables(to_conn)
  File "/Users/Harold/Documents/Code/Python/NFC/src/nfc/archive/archive.py", line 794, in _create_db_tables
    cursor.execute(_CREATE_STATION_TABLE_SQL)
sqlite3.OperationalError: table Station already exists
Two points about this:
The ClipDatabase.db new file would have been deleted by the previous archiving operation, so presumably something went wrong with it. We should try to ensure that when something goes wrong we provide an informative error message to the user and recover more gracefully.
In the _close_db method we should test whether ClipDatabase.db new exists before we open it and behave more gracefully if it does.
The viewer spectrogram parameters are currently hard-coded. The user should be able to change them without editing the viewer source code.
The Old Bird NFC wrangler currently ignores clip files containing relative (to the beginning of recording) times rather than absolute times. It should accept a list of (station, date, start time) recording start times and use those start times to convert relative times to absolute ones whenever possible so that the clips with relative times can be included in archives.
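A sketch of the conversion, assuming the recording start times are held in a dictionary keyed by (station, date) and that a relative time is a number of seconds from the start of recording. The function name, the dictionary layout, and the sample station are hypothetical.

```python
from datetime import date, datetime, timedelta


def to_absolute_time(station, night, relative_seconds, recording_starts):
    """Convert a relative clip time to an absolute one, if possible.

    Returns None when no recording start time is known for the station
    and date, in which case the clip would still have to be skipped.
    """
    start = recording_starts.get((station, night))
    if start is None:
        return None
    return start + timedelta(seconds=relative_seconds)


# Made-up start time, for illustration only.
recording_starts = {
    ('Station A', date(2012, 9, 1)): datetime(2012, 9, 1, 20, 0, 0),
}
clip_time = to_absolute_time(
    'Station A', date(2012, 9, 1), 3725.5, recording_starts)
```

Adding a timedelta carries clips past midnight onto the next calendar day without any special handling.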
Currently, the viewer's archive calendar is formatted in a way that works fine for some archives but not for others. In particular, the format works fine for archives that have data for consecutive months, for example for a single migration season. It does not work so well for archives that have data over a longer period, including many months for which there are no data. For example, it does not work well for an archive that spans several years but only includes data for the months of the spring and fall migrations. In this case the calendar includes all months, even though half or more of them have no data. The archive calendar needs to be enhanced to work better out of the box with a wider variety of archives. It should also be customizable so that if the out-of-the-box layout isn't what the user wants they can improve it.