peshkira / c3po
Clever, Crafty Content Profiling of Objects
Home Page: http://peshkira.github.io/c3po
License: Apache License 2.0
Hi,
I have installed c3po and let it process some XML files from FITS; I get a nice CSV file.
Now I tried to use the web interface, so I downloaded Play and ran the command: ../../play/play-2.0.4/play clean compile stage
I get the following error; I hope that somebody can help:
....
[info] downloading http://repository-play-war.forge.cloudbees.com/release/com/github/play2war/play2-war-core-common_2.9.1/0.8/play2-war-core-common_2.9.1-0.8.jar ...
[info] [SUCCESSFUL ] com.github.play2war#play2-war-core-common_2.9.1;0.8!play2-war-core-common_2.9.1.jar (1731ms)
[info] Done updating.
[error] {file:/c3po/c3po/c3po-webapi/}c3po/compile:sources: java.lang.ExceptionInInitializerError
[error] Total time: 100 s, completed 11.03.2013 21:56:51
EDIT: I get the same error with: ../../play/play-2.0.4/play package
Add a post processing step that filters empty values for different properties provided by fits.
These will be handled by the framework as unknown anyway.
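A minimal sketch of such a post-processing step, assuming the parsed FITS properties are available as a simple name-to-value map. The class `EmptyValueFilter` and its method are hypothetical and not part of the c3po code base.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not an existing c3po class.
final class EmptyValueFilter {

    // Returns a copy of the property map without null or blank values.
    // Surviving values are trimmed; insertion order is preserved.
    static Map<String, String> dropEmpty(Map<String, String> properties) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            String value = entry.getValue();
            if (value != null && !value.trim().isEmpty()) {
                result.put(entry.getKey(), value.trim());
            }
        }
        return result;
    }
}
```

Dropped properties simply do not appear in the result, so the framework would treat them as unknown, as described above.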
The sample selection algorithm Size'o'Matic seems to ignore the given number of samples.
(This works for the other algorithms.)
Browse the representatives for a collection (and filter based on partition filter)
Abstract the Persistence Layer interface from the Mongo Driver to allow implementation of different backends.
It seems that when a filter is applied for a property with the value "conflicted" or "unknown", the distributions do not get drawn (or maybe are not even calculated). Check the map reduce jobs and draw all other diagrams where possible. If it is not possible to draw them, visualise the conflicts somehow.
Enhance the controller so that it requires the adapters to return the data (instead of storing it directly, as they do now) and performs a check and consolidation (if necessary) before storing it.
Increment version to 0.2, deploy and release.
This includes POMs and Constants
The new extended FITS sometimes provides false tags for HTML, which should be filtered out by the FITS adaptor. They usually have the form xxxTagOccurrences, where xxx is a string of gibberish.
Check during parsing and keep the property only if a valid HTML tag is found.
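One way the check could look, as a sketch only: strip the `TagOccurrences` suffix and keep the property if the remainder is a known HTML element name. The class name and the (deliberately incomplete) tag list below are assumptions made for illustration; a real filter would need the full list of HTML elements.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical filter for illustration; not the actual FITS adaptor code.
final class HtmlTagPropertyFilter {

    private static final String SUFFIX = "TagOccurrences";

    // Assumption: a small sample of valid HTML element names.
    private static final Set<String> KNOWN_TAGS = new HashSet<>(Arrays.asList(
            "a", "body", "div", "head", "html", "img", "p", "span", "table", "title"));

    // Returns true if the property should be kept.
    static boolean keep(String propertyName) {
        if (!propertyName.endsWith(SUFFIX)) {
            return true; // not a tag-occurrence property, leave it alone
        }
        String tag = propertyName.substring(0, propertyName.length() - SUFFIX.length());
        return KNOWN_TAGS.contains(tag.toLowerCase(Locale.ROOT));
    }
}
```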
In the Samples View it is possible to choose between different algorithms for sample selection, and to define the desired number of representative samples.
These settings are not used for data export.
There is a problem between the internal logging of Play (Logback) and the logging of the core module (slf4j). It should be fixed so that the logs of the core module are also displayed.
The feedback form always shows at the top of the screen. If you have scrolled down and click on the feedback bar, only the overlay is shown; the popup is not visible and the user has to scroll to the top. Either scroll automatically to the top of the screen (bad usability, as it takes the user out of context), or
display the popup at the current location on the screen.
Because of a bug in the click event handler (the event object is missing),
the algorithm description is not shown in the samples view under Firefox.
Release an official package of the new core and CLI (start using Bintray) and write a blog post on the OPF.
The SizeRepresentativeGeneratorTest fails if the test DB is empty.
Set up some elements before executing the test and remove them at the end.
When a new filter is created, frequent page refreshes prevent the selection of a value for the filter from the drop down.
This is hard to reproduce, as it only happens "sometimes"
When a new diagram is created, the property is chosen via a select element. The selection is rather hard, as the UI fires a change event upon a click on the list. This forces the user to hold the mouse button until the correct property is selected and then release it.
Investigate why this happens and fix it.
Hi,
I tried to compile c3po with "mvn install", but I get the following error:
WARNING: emptying DBPortPool to localhost:27017 b/c of error
java.io.IOException: couldn't connect to [/:27017] bc:java.net.ConnectException: Verbindungsaufbau abgelehnt
Which service is needed to compile c3po?
Greetings,
grego
The profile filter has to be checked for differences before the cached version of the profile is returned.
Filtering on integer properties throws an index-out-of-bounds exception after the last changes.
Create a simple REST interface to allow the submission of
Numeric Property filtering does not work, because it requires some more input and some special Map Reduce Jobs.
Add information about the system that produced the feedback message.
FITS returned "300 300 300" for "xSamplingFrequency" on a Nikon NEF RAW file, which c3po tries to save as a float. Normally the value is a single integer for JPEG, but I think for RAW this metadata entry is saved separately for every RGB color channel. This leads to an exception:
11:18:18,153 WARN [FloatValue] The passed string '300 300 300' is not a valid float. Setting value to null.
11:18:18,159 WARN [IntegerValue] The passed string '3 1 1' is not a valid integer. Setting value to null.
11:18:18,318 ERROR [LocalTransactionalDAO] c3po caught an error: Validation failed for classes [com.petpet.c3po.datamodel.FloatValue] during persist time for groups [javax.validation.groups.Default, ]
List of constraint violations:[
ConstraintViolationImpl{interpolatedMessage='kann nicht null sein', propertyPath=fValue, rootBeanClass=class com.petpet.c3po.datamodel.FloatValue, messageTemplate='{javax.validation.constraints.NotNull.message}'}
]
11:18:18,365 WARN [JDBCExceptionReporter] SQL Error: 0, SQLState: null
11:18:18,365 ERROR [JDBCExceptionReporter] Batch-Eintrag 0 insert into Value (element_id, measuredAt, property_id, reliability, source_id, status, value, id) values (118818, 0, 10, 0, 2, CONFLICT, 6.0, 3276800) wurde abgebrochen. Rufen Sie 'getNextException' auf, um die Ursache zu erfahren.
11:18:18,366 WARN [JDBCExceptionReporter] SQL Error: 0, SQLState: 23503
11:18:18,366 ERROR [JDBCExceptionReporter] ERROR: insert or update on table "value" violates foreign key constraint "fk4e9a151b474df4d"
Detail: Key (element_id)=(118818) is not present in table "element".
11:18:18,366 ERROR [LocalTransactionalDAO] c3po caught an error: org.hibernate.exception.ConstraintViolationException: Could not execute JDBC batch update, object /Users/philipp/Pictures/Photo-Library/2011/06_Donauinselfest/DSC_3666.NEF could not be persisted
11:18:18,366 WARN [LocalTransactionalDAO] Transaction is still active, rolling it back manually
11:18:18,368 INFO [LocalTransactionalDAO] Flushing session
Here is a file generated by FITS:
<?xml version="1.0" encoding="UTF-8"?>
<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://hul.harvard.edu/ois/xml/ns/fits/fits_output http://hul.harvard.edu/ois/xml/xsd/fits/fits_output.xsd" version="0.6.0" timestamp="22.04.12 16:05">
<identification status="CONFLICT">
<identity format="Tagged Image File Format" mimetype="image/tiff" toolname="FITS" toolversion="0.6.0">
<tool toolname="Jhove" toolversion="1.5" />
<tool toolname="file utility" toolversion="5.03" />
<tool toolname="Droid" toolversion="3.0" />
<tool toolname="NLNZ Metadata Extractor" toolversion="3.4GA" />
<tool toolname="ffident" toolversion="0.2" />
<version toolname="Jhove" toolversion="1.5" status="CONFLICT">6.0</version>
<version toolname="Droid" toolversion="3.0" status="CONFLICT">3</version>
<version toolname="Droid" toolversion="3.0" status="CONFLICT">4</version>
<version toolname="Droid" toolversion="3.0" status="CONFLICT">5</version>
<externalIdentifier toolname="Droid" toolversion="3.0" type="puid">fmt/7</externalIdentifier>
<externalIdentifier toolname="Droid" toolversion="3.0" type="puid">fmt/8</externalIdentifier>
<externalIdentifier toolname="Droid" toolversion="3.0" type="puid">fmt/9</externalIdentifier>
<externalIdentifier toolname="Droid" toolversion="3.0" type="puid">fmt/10</externalIdentifier>
</identity>
<identity format="NEF EXIF" mimetype="image/x-raw" toolname="FITS" toolversion="0.6.0">
<tool toolname="Exiftool" toolversion="7.74" />
</identity>
</identification>
<fileinfo>
<size toolname="Jhove" toolversion="1.5">12842376</size>
<creatingApplicationName toolname="Jhove" toolversion="1.5">Ver.1.02</creatingApplicationName>
<lastmodified toolname="Exiftool" toolversion="7.74" status="SINGLE_RESULT">2011:06:25 14:22:56+02:00</lastmodified>
<created toolname="Exiftool" toolversion="7.74" status="SINGLE_RESULT">2011:06:25 14:22:57</created>
<filepath toolname="OIS File Information" toolversion="0.1" status="SINGLE_RESULT">/Users/philipp/Pictures/Photo-Library/2011/06_Donauinselfest/DSC_3667.NEF</filepath>
<filename toolname="OIS File Information" toolversion="0.1" status="SINGLE_RESULT">/Users/philipp/Pictures/Photo-Library/2011/06_Donauinselfest/DSC_3667.NEF</filename>
<md5checksum toolname="OIS File Information" toolversion="0.1" status="SINGLE_RESULT">5aa8e24a47a3bab9e763af026c9930e8</md5checksum>
<fslastmodified toolname="OIS File Information" toolversion="0.1" status="SINGLE_RESULT">1309004576000</fslastmodified>
</fileinfo>
<filestatus>
<well-formed toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">true</well-formed>
<valid toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">false</valid>
<message toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">PhotometricInterpretation not defined</message>
<message toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">ImageWidth not defined</message>
<message toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">ImageLength not defined</message>
<message toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">Neither strips nor tiles defined</message>
<message toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">JPEGProc not defined for JPEG compression</message>
</filestatus>
<metadata>
<image>
<byteOrder toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">big endian</byteOrder>
<byteOrder toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">big endian</byteOrder>
<byteOrder toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">big endian</byteOrder>
<compressionScheme toolname="Jhove" toolversion="1.5" status="CONFLICT">Uncompressed</compressionScheme>
<compressionScheme toolname="NLNZ Metadata Extractor" toolversion="3.4GA" status="CONFLICT">65536</compressionScheme>
<imageWidth toolname="Jhove" toolversion="1.5">160</imageWidth>
<imageHeight toolname="Jhove" toolversion="1.5">120</imageHeight>
<colorSpace toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">RGB</colorSpace>
<referenceBlackWhite toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">0.0 255.0 0.0 255.0 0.0 255.0</referenceBlackWhite>
<YCbCrPositioning toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">Co-sited</YCbCrPositioning>
<orientation toolname="Jhove" toolversion="1.5" status="CONFLICT">normal*</orientation>
<orientation toolname="NLNZ Metadata Extractor" toolversion="3.4GA" status="CONFLICT">65536</orientation>
<samplingFrequencyUnit toolname="Jhove" toolversion="1.5">in.</samplingFrequencyUnit>
<xSamplingFrequency toolname="Jhove" toolversion="1.5" status="CONFLICT">300 300 300</xSamplingFrequency>
<xSamplingFrequency toolname="NLNZ Metadata Extractor" toolversion="3.4GA" status="CONFLICT">44.00390625</xSamplingFrequency>
<ySamplingFrequency toolname="Jhove" toolversion="1.5" status="CONFLICT">300 300 300</ySamplingFrequency>
<ySamplingFrequency toolname="NLNZ Metadata Extractor" toolversion="3.4GA" status="CONFLICT">44.00390625</ySamplingFrequency>
<bitsPerSample toolname="Jhove" toolversion="1.5">8 8 8</bitsPerSample>
<bitsPerSample toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">1</bitsPerSample>
<bitsPerSample toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">12</bitsPerSample>
<samplesPerPixel toolname="Jhove" toolversion="1.5" status="CONFLICT">3 1 1</samplesPerPixel>
<samplesPerPixel toolname="NLNZ Metadata Extractor" toolversion="3.4GA" status="CONFLICT">196608</samplesPerPixel>
<imageProducer toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT" />
<scanningSoftwareName toolname="Jhove" toolversion="1.5" status="SINGLE_RESULT">Ver.1.02</scanningSoftwareName>
</image>
</metadata>
</fits>
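For values such as "300 300 300" in the report above, one possible handling is sketched below. It assumes that a per-channel value can be collapsed to a single number when all channels agree, and that anything else (for example "3 1 1") is reported as absent so the caller can skip persisting it instead of storing null. Both the policy and the class `LenientFloatParser` are assumptions, not the behaviour of c3po's `FloatValue`.

```java
import java.util.Optional;

// Hypothetical parser for illustration; not part of the c3po data model.
final class LenientFloatParser {

    // Parses a single number, or a whitespace-separated list of identical
    // numbers (e.g. one value per colour channel). Returns empty otherwise.
    static Optional<Double> parse(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String[] tokens = raw.trim().split("\\s+");
        if (tokens[0].isEmpty()) {
            return Optional.empty(); // blank input
        }
        for (String token : tokens) {
            if (!token.equals(tokens[0])) {
                return Optional.empty(); // channels disagree, no single value
            }
        }
        try {
            return Optional.of(Double.parseDouble(tokens[0]));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }
}
```

A caller would persist the value only when the returned `Optional` is present, which avoids the not-null validation failure shown in the log.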
Browse the objects and their meta data.
Filter based on collections (and mimetype/format)
When files are removed from a collection, there is no method on the command line tool to remove the collection before giving the new set of FITS files, or to define this new list of FITS files as the updated list (removing previous ones).
This is necessary when monitoring the content of a repository or folder, where files can be added, updated or removed.
It seems that a filter for a collection is application wide and it should be session wide.
Take a look at Play's caching mechanism
All cached profiles of collections that were updated should be removed after the update.
Introduce a new mongo collection with the date of the last profile generation and the date of the last update. Remove all cached profiles for this collection if the update is after the profile generation.
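The staleness rule described here can be sketched independently of Mongo. The class and method names are hypothetical; the two timestamps stand for the values that would be read from the proposed collection.

```java
import java.time.Instant;

// Hypothetical policy class for illustration; storage access is left out.
final class ProfileCachePolicy {

    // A cached profile is stale if the collection was updated after the
    // profile was generated, or if no profile was ever generated.
    static boolean isStale(Instant lastProfileGeneration, Instant lastCollectionUpdate) {
        if (lastProfileGeneration == null) {
            return true;  // nothing cached yet
        }
        if (lastCollectionUpdate == null) {
            return false; // collection never updated since creation
        }
        return lastCollectionUpdate.isAfter(lastProfileGeneration);
    }
}
```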
The format histogram is the same in all partitions of the collection profile. Each partition should include only the format histogram of the files present in it.
Investigate whether this bug also occurs for other properties besides format.
Both XML and CSV export should be functional again after the change to Mongo.
Close the diagram settings popup immediately after Apply is clicked and start the spinner.
Fix the log on error during parsing and output the correct file path/identifier, so that it is clear which objects were not processed. The problem is caused by the change to streams: the log now prints the address of the InputStream object instead of the filename.
The controller should take care of this.
[FITSAdaptor] An exception occurred for file 'java.io.FileInputStream@24164d75': null
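A sketch of one possible fix: pass the identifier along with the stream, so the log message never has to rely on `InputStream.toString()`. `NamedStream` and `describeFailure` are hypothetical names, and the message format only mirrors the log line above.

```java
import java.io.InputStream;

// Hypothetical wrapper for illustration: pairs a stream with the
// path or identifier it was opened from.
final class NamedStream {

    final String identifier;
    final InputStream stream;

    NamedStream(String identifier, InputStream stream) {
        this.identifier = identifier;
        this.stream = stream;
    }

    // Builds the log message from the identifier instead of the stream object.
    static String describeFailure(NamedStream source, Exception error) {
        return "An exception occurred for file '" + source.identifier + "': "
                + error.getMessage();
    }
}
```

With this shape, the controller would create the `NamedStream` when it opens the file, and the adaptor would only read from `stream`.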
Show a web page with the aggregation statistics per collection (and/or partition).
Use http://www.jqplot.com/ or similar for the graphs.
Due to time limitations at the review, we need only 3 samples in a profile output. 5 samples are too many.
Provide the ability to cache the map reduce job results and first check whether they exist before submitting new jobs.
Provide the ability to remove them.
Remove the cached job results upon web app shutdown.
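The three requirements (reuse existing results, remove them, clear them on shutdown) can be sketched with an in-memory cache. This is only an illustration: `JobResultCache` is a hypothetical class, the job key format is an assumption, and the actual results might instead be kept in Mongo.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical in-memory cache for illustration.
final class JobResultCache<R> {

    private final Map<String, R> results = new ConcurrentHashMap<>();

    // Returns the cached result for the key, running the job only if
    // no result exists yet.
    R getOrRun(String jobKey, Supplier<R> job) {
        return results.computeIfAbsent(jobKey, key -> job.get());
    }

    // Removes a single cached result; returns true if one was present.
    boolean remove(String jobKey) {
        return results.remove(jobKey) != null;
    }

    // Drops all cached results, e.g. from a web app shutdown hook.
    void clear() {
        results.clear();
    }

    int size() {
        return results.size();
    }
}
```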
Show the properties only for the current collection. In the web app if a collection is chosen, only the properties present in this collection should be shown in the dropdown list.
The new profile format generator includes the partition filter property in the partitions.
It should be skipped as it is redundant. E.g. if you partition based on the mime type, there is no need to include the mime type count in each partition.
Filtering should be revisited and enhanced. Create a convention that filters on different properties are logically bound with an AND (as of now), but if the same property name is provided twice or more in a filter, its values are bound with a logical OR. This will allow for more flexible filters, e.g.
collection == 'test_collection' AND
mimetype == 'text/html' OR mimetype == 'text/xhtml'
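The proposed convention can be sketched as follows: conditions are grouped by property name, values within a group are alternatives (OR), and every group must match (AND). `FilterCondition` and `FilterMatcher` are hypothetical names, and elements are modelled as plain property maps for the sake of the example.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical condition type: "property == value".
final class FilterCondition {
    final String property;
    final String value;

    FilterCondition(String property, String value) {
        this.property = property;
        this.value = value;
    }
}

// Hypothetical matcher implementing the AND/OR convention.
final class FilterMatcher {

    static boolean matches(Map<String, String> element, List<FilterCondition> conditions) {
        // Same property given several times: collect the values as alternatives (OR).
        Map<String, Set<String>> allowed = new LinkedHashMap<>();
        for (FilterCondition c : conditions) {
            allowed.computeIfAbsent(c.property, key -> new HashSet<>()).add(c.value);
        }
        // Different properties: every group must be satisfied (AND).
        for (Map.Entry<String, Set<String>> group : allowed.entrySet()) {
            String actual = element.get(group.getKey());
            if (actual == null || !group.getValue().contains(actual)) {
                return false;
            }
        }
        return true;
    }
}
```

With the filter from the example above, an element in `test_collection` with mimetype `text/xhtml` matches, while one with `application/pdf` does not.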
Set up the size of samples in the profile via an optional switch in the command line. Use 5 or similar as the default if nothing is provided.
Refactor the command line.
Consider making everything via a -Dproperty=value switch instead of adding a new switch each time a new feature is requested. This will make the extension of the CLI much easier.
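A minimal sketch of the `-Dproperty=value` idea, assuming options are passed as plain program arguments. The class `CliOptions` and the option name `samples` are hypothetical; the fallback of 5 follows the default suggested for the sample size.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical option parser for illustration; not the existing c3po CLI.
final class CliOptions {

    // Collects all arguments of the form -Dname=value; everything else is ignored.
    static Map<String, String> parse(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        for (String arg : args) {
            if (!arg.startsWith("-D")) {
                continue;
            }
            int eq = arg.indexOf('=');
            if (eq <= 2) {
                continue; // no '=' or empty option name
            }
            options.put(arg.substring(2, eq), arg.substring(eq + 1));
        }
        return options;
    }

    // Reads an integer option, returning the fallback if it is missing or invalid.
    static int intOption(Map<String, String> options, String name, int fallback) {
        String raw = options.get(name);
        if (raw == null) {
            return fallback;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }
}
```

A new feature then only needs a new option name rather than a new switch, for example `CliOptions.intOption(options, "samples", 5)`.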
The current implementation always includes a list of elements in the profile.
This was implemented due to integration with Plato and later with the repositories. However, it does not scale at all for medium-sized and larger collections.
Thus, allow an opt-in/opt-out parameter in the profiler (REST API) to include the element identifiers.
The list will be sufficient for demo purposes. Later a next iteration of the profile should include a specific query that selects exactly these elements and is encoded in a way that is understood at least by the connector API as defined in SCAPE.
Create a Dev Guide (in the wiki?). It should describe the common components and provide examples for the implementation of the backend and an adaptor. Also include some code guidelines for committers.
Show the spinner when a filter is removed to give feedback that something is going on.
Also when a new filter is added (while figuring out which distinct values can be filtered)
When an algorithm is selected, input fields for the parameters are added.
If a different algorithm is chosen, the input fields of the previous one are not removed, so it's not clear which fields to use.
The data export to csv and xml from the web app should be done.
Also try to force the browser to download the files.