buda-base / asset-manager

Asset Manager and audit tool

License: The Unlicense

Java 22.62% Shell 0.01% Python 1.76% JavaScript 75.58% CSS 0.01% HTML 0.02%

asset-manager's Introduction

Vagrant scripts for BUDA platform instantiation

The base platform is built using Vagrant and VirtualBox:

  1. Install Vagrant and VirtualBox.
  2. Download or git clone this repository.
  3. cd into the unzipped directory or the cloned repository.
  4. Install the VirtualBox guest additions plugin with vagrant plugin install vagrant-vbguest.
  5. Run vagrant up to summon a local instance.

Or for an AWS EC2 instance:

  1. Install the vbguest plugin: vagrant plugin install vagrant-vbguest
  2. Run the command: vagrant up, or rename Vagrantfile.aws to Vagrantfile and run vagrant up --provider=aws

This will grind for a while, installing all the dependencies of the BUDA platform.

Once the initial install has completed, the command vagrant ssh will connect to the instance, where development, customization of the environment and so on can be performed as on any headless server.

Similarly, the jena-fuseki server will be listening on:

http://localhost:13180/fuseki

The lds-pdi application is accessible at:

http://localhost:13280/

(see https://github.com/buda-base/lds-pdi/blob/master/README.md for details about using these REST services)

The command: vagrant halt will shut the instance down. After halting (or suspending the instance) a further: vagrant up will simply boot the instance without further downloads, and vagrant destroy will completely remove the instance.

If running an AWS instance, after provisioning, access the instance via ssh -p 15345, delete the Port 22 line from /etc/ssh/sshd_config, and run sudo systemctl restart sshd. This further secures the instance against attacks on port 22.

asset-manager's People

Contributors

dependabot[bot], jimk-bdrc, marcagate


asset-manager's Issues

Packaging is missing shell.properties

At start-up, the launcher complains that it can't find shell.properties:

"2019-03-31 20.13.45" DEBUG io.bdrc.am.audit.iaudit.FilePropertyManager:27 - Load Properties
"2019-03-31 20.13.45" ERROR io.bdrc.am.audit.iaudit.FilePropertyManager:35 - Cant open resource shell.properties
"2019-03-31 20.13.45" ERROR io.bdrc.am.audit.shell.shell:134 - java.lang.ClassNotFoundException:

Reduce number of log entries

A test on a folder may result in three lines of output:

2019-08-30 11.36.42 [main] INFO  Invoking Web Image Attributes. Params :/Users/jimk/mnt/Incoming/scans/ReadyToProcess/1.Miscellaneous-backlog/W8LS30939:
2019-08-30 11.36.42 [main] ERROR 110:Image file /Users/jimk/mnt/Incoming/scans/ReadyToProcess/1.Miscellaneous-backlog/W8LS30939/images/W8LS30939-I8LS30941/W8LS30939 goldstein_eng_tib_dic.pdf has no suitable reader.
2019-08-30 11.36.42 [main] ERROR Test WebImageAttributes result Failed

At best, it should have only two:
the result of the folder and test, and the details.
There's no benefit in showing that a test has started.

Create history table

On processing RDS, add a history table that captures the sum of image counts for each day, state, and project id.

When a required directory does not exist, test should not fail

Describe the bug
If the user specifies a test which works on a specific image group parent (e.g. File size test tests the images in the children of "images" ) and that folder doesn't exist, the test fails.

To Reproduce

  1. Take any work tree, and remove the images tree.
  2. Run audittool.sh against the work.
  3. Open the summary log, which will show failures for Web Image Attributes and File Size test. You will see lines similar to:
W1KG26108,Web Image Attributes,Failed,,,/Users/glurm/prod/W1KG26108
,,,101,Path Some folders under root /Users/jimk/glurm/W1KG26108 : images  is not a directory or does not exist.,/Users/jimk/prod/W1KG26108
...
W1KG26108,File Size Test,Failed,,,/Users/glurm/prod/W1KG26108
,,,101,Path /Users/glurm/prod/W1KG26108/images is not a directory or does not exist.,/Users/jimk/prod/W1KG26108

Expected behavior
The test itself should neither pass nor fail. The log should indicate to the user that some tests were not run because there was no input found to test. The prefix of test run logs should be neither PASS nor FAIL, but SKIPPED.
This is how unit test frameworks report tests that were not run.
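
The expected behavior above can be sketched as a three-valued outcome. This is a minimal Python illustration, not the audit tool's Java code: the names Outcome and run_file_size_test, the default parent "images", and the 400 * 1024 byte limit are assumptions made for the example.

```python
from enum import Enum
from pathlib import Path


class Outcome(Enum):
    PASSED = "Passed"
    FAILED = "Failed"
    SKIPPED = "Skipped"


def run_file_size_test(work_dir, parent="images", max_bytes=400 * 1024):
    """Return SKIPPED when the image group parent is absent, instead of FAILED."""
    parent_dir = Path(work_dir) / parent
    if not parent_dir.is_dir():
        # Nothing to test: report "not run" rather than a failure.
        return Outcome.SKIPPED
    too_big = [
        p for p in parent_dir.rglob("*")
        if p.is_file() and p.stat().st_size > max_bytes
    ]
    return Outcome.FAILED if too_big else Outcome.PASSED
```

A summary writer can then print the SKIPPED label for such works, so a missing folder is not counted as a test failure.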

Some images break audittool

Describe the bug
A method is not found in a sub-library of one of my components when it is invoked from a certain shell.

To Reproduce
sattva: service: depositIa.sh /mnt/rs3Archive/W8LS66803

INFO  Passed    /mnt/rs3Archive/W8LS66803               No Files in Root Folder
Exception in thread "main" java.lang.NoSuchMethodError: 'java.lang.Object com.twelvemonkeys.lang.Validate.isTrue(boolean, java.lang.Object, java.lang.String)'
        at com.twelvemonkeys.imageio.util.ImageTypeSpecifiers.createPackedGrayscale(ImageTypeSpecifiers.java:144)
        at com.twelvemonkeys.imageio.plugins.tiff.TIFFImageReader.getRawImageType(TIFFImageReader.java:498)
        at com.twelvemonkeys.imageio.plugins.tiff.TIFFImageReader.getImageTypes(TIFFImageReader.java:886)
        at io.bdrc.am.audit.audittests.ImageAttributeTests$ImageAttributeTestOperation.TestImages(ImageAttributeTests.java:193)
        at io.bdrc.am.audit.audittests.ImageAttributeTests$ImageAttributeTestOperation.run(ImageAttributeTests.java:55)
        at io.bdrc.am.audit.audittests.AuditTestBase.TestWrapper(AuditTestBase.java:103)
        at io.bdrc.am.audit.audittests.ImageAttributeTests.LaunchTest(ImageAttributeTests.java:346)
        at io.bdrc.am.audit.shell.shell.RunTest(shell.java:299)
        at io.bdrc.am.audit.shell.shell.TestOnDirPassed(shell.java:212)
        at io.bdrc.am.audit.shell.shell.RunTestsOnDir(shell.java:194)
        at io.bdrc.am.audit.shell.shell.main(shell.java:95)

but

sattva:service: auditool.sh /mnt/rs3Archive/W8LS66803
service@sattva:~/tmp$ audittool.sh -l . /mnt/rs3Archive/W8LS66803
starting -l . /mnt/rs3Archive/W8LS66803
INFO  Passed    /mnt/rs3Archive/W8LS66803               No Files in Root Folder
INFO  Passed    /mnt/rs3Archive/W8LS66803               Web Image Attributes
INFO  Passed    /mnt/rs3Archive/W8LS66803               No folders allowed in Image Group folders
INFO  Passed    /mnt/rs3Archive/W8LS66803               File Sequence Test

Debian 10

Logs to CSVs

Is your feature request related to a problem? Please describe.
Existing log files are not useful. They contain time stamps and free form texts.

Describe the solution you'd like
We want an ordinary user to understand

  • the work which failed or passed
  • the test which failed or passed

in a csv file, which the user can import into their favorite spreadsheet. There will be a summary CSV, named after the date and time, and a detail CSV, as with the straight logs.
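
A rough sketch of the summary and detail CSV pair, written in Python for brevity. The function name write_audit_csvs, the shape of the results list, and the column names are assumptions for illustration; the real tool is Java and may lay the files out differently.

```python
import csv
import os
from datetime import datetime


def write_audit_csvs(results, out_dir=".", now=None):
    """results: dicts with keys outcome, path, test_name and an optional
    errors list of (error_number, error_text) pairs."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d-%H-%M")
    summary_path = os.path.join(out_dir, f"SUMMARY-{stamp}.csv")
    detail_path = os.path.join(out_dir, f"DETAIL-{stamp}.csv")
    with open(summary_path, "w", newline="") as s, open(detail_path, "w", newline="") as d:
        summary, detail = csv.writer(s), csv.writer(d)
        summary.writerow(["outcome", "path", "test_name"])
        detail.writerow(["error_number", "error_text", "path"])
        for r in results:
            # One summary row per (work, test); one detail row per error.
            summary.writerow([r["outcome"], r["path"], r["test_name"]])
            for number, text in r.get("errors", []):
                detail.writerow([number, text, r["path"]])
    return summary_path, detail_path
```

Using the csv module rather than string joins keeps commas inside error messages quoted, so the files import cleanly into a spreadsheet.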

Describe alternatives you've considered
Discussed writing directly to the Google Sheets or Excel Java API, except that Google is not available in CN, and Excel is licensed.

Allow worklist input

Is your feature request related to a problem? Please describe.
We have a large backlog of folders to process. The existing implementation takes a semicolon-delimited list of paths as its input. I'd like it to also accept a file which contains pathnames.

Describe the solution you'd like
Add a -f --file [path] parameter which specifies a file containing a list of pathnames
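
One way the proposed option could behave, sketched in Python with argparse. The function name collect_paths and the one-path-per-line file format are assumptions; the request above only says the file contains a list of pathnames.

```python
import argparse


def collect_paths(argv):
    """Merge the semicolon-delimited positional list with paths read from -f/--file."""
    parser = argparse.ArgumentParser(prog="audittool")
    parser.add_argument("-f", "--file", help="file containing one pathname per line")
    parser.add_argument("paths", nargs="?", default="",
                        help="semicolon-delimited list of paths")
    args = parser.parse_args(argv)
    paths = [p.strip() for p in args.paths.split(";") if p.strip()]
    if args.file:
        with open(args.file, encoding="utf-8") as worklist:
            # Skip blank lines so a trailing newline does not add an empty path.
            paths += [line.strip() for line in worklist if line.strip()]
    return paths
```

Because both inputs end up in one list, a single invocation can still produce one set of log files for the whole worklist.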

Describe alternatives you've considered
Considered just calling audittool repeatedly with one parameter, but that results in four log files per invocation, which is not as useful as a fewer number of larger logfiles.


Image groups must be consistent

Summary

If a work folder has parents of image groups (see shell.properties, e.g. sources, archive, images) the image groups under each parent must:

  • have the same number and names of image group folders
  • have the same number of files in each image group as in the matching image group under the other parents.

Example of structure which passes

/wXXXX/
  archive/
    IG1/
      IG10001.TIF
      IG10002.TIF
    IG2/
      dontCare0001.dc
      dontCare0002.dc
  images/
    IG1/
      xxx0001.yyy
      xxx0002.yyy
    IG2/
      xxx0001.zzz
      xxx0002.zzz

Other tests care about the semantics of non-sequence elements (the ones represented by dontcare and dc and xxx, yyy, zzz )
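
The rule above can be sketched as a comparison of per-parent maps from image group name to file count. This is an illustrative Python sketch; the function names and the default parent list ("archive", "images") are assumptions, since the real parents come from shell.properties.

```python
from pathlib import Path


def image_group_counts(parent_dir):
    """Map each image group folder name to the number of files it contains."""
    return {
        ig.name: sum(1 for f in ig.iterdir() if f.is_file())
        for ig in Path(parent_dir).iterdir()
        if ig.is_dir()
    }


def image_groups_consistent(work_dir, parents=("archive", "images")):
    """True when every existing parent has the same image groups and file counts."""
    counts = [
        image_group_counts(Path(work_dir) / p)
        for p in parents
        if (Path(work_dir) / p).is_dir()
    ]
    # Dict equality checks both the folder names and the per-folder counts.
    return all(c == counts[0] for c in counts[1:])
```

File names and extensions are deliberately ignored here, matching the note that other tests care about those.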

Dont count directories in File Sequence Error

In Incoming/scans/ReadyToProcess/ACIP-Mongolia-Texts/W1KG15898/images/W1KG15898-I1KG86833
the folder contains

drwx------ 1 jimk staff 16384 Mar 20  2018 "1-'JAM_DBYANGS_BZHAD_PA'I_RDO_RJE_KA_PO_TI"
-rwx------ 1 jimk staff 14806 Mar 28  2018  I1KG868330001.tif
-rwx------ 1 jimk staff   754 Mar 28  2018  I1KG868330002.tif

The error log for this folder correctly has

2019-08-30 14.26.47 ERROR 103:Image group folder /Users/jimk/mnt/Incoming/scans/ReadyToProcess/ACIP-Mongolia-Texts/W1KG15898/images/W1KG15898-I1KG86845  contains directory 13- 'JAM_DBYANGS _ZHAD_PA'I_RDO_RJE_PA_PO_TI
2019-08-30 14.26.47 ERROR 104:Image group folder /Users/jimk/mnt/Incoming/scans/ReadyToProcess/ACIP-Mongolia-Texts/W1KG15898/images/W1KG15898-I1KG86846  fails files only test.
2019-08-30 14.26.47 ERROR 103:Image group folder /Users/jimk/mnt/Incoming/scans/ReadyToProcess/ACIP-Mongolia-Texts/W1KG15898/images/W1KG15898-I1KG86846  contains directory 14-'JAM_DBYANGS_BZHAD_PA'I_RDO_RJE_PHA_PO_TI
2019-08-30 14.26.47 ERROR Test :No folders allowed in Image Group folders: result :Failed: folder :/Users/jimk/mnt/Incoming/scans/ReadyToProcess/ACIP-Mongolia-Texts/W1KG15898:

But it also fails the sequence test:

AuditTestShell-DETAIL-2019-08-30-14-26-46.log:2019-08-30 14.26.47 ERROR 106:Folder /Users/jimk/mnt/Incoming/scans/ReadyToProcess/ACIP-Mongolia-Texts/W1KG15898/images/W1KG15898-I1KG86833 fails sequence test.
2019-08-30 14.26.47 ERROR 105:Sequence File 1-'JAM_DBYANGS_BZHAD_PA'I_RDO_RJE_KA_PO_TI does not end in an integer: ends with  not found

I don't want the directory name to fail the sequence test.
Exclude directories from sequence testing.
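
A simplified sketch of a sequence check that skips directories, in Python. It is not the tool's actual algorithm: the function name sequence_errors, the error wording, and the fixed four-digit suffix are assumptions made for the example.

```python
import re
from pathlib import Path


def sequence_errors(image_group_dir):
    """Check that file names end in 0001..N, ignoring subdirectories entirely."""
    errors, seen = [], set()
    for entry in sorted(Path(image_group_dir).iterdir()):
        if entry.is_dir():
            continue  # directories are reported by the "no folders" test, not here
        match = re.search(r"(\d{4})$", entry.stem)
        if match is None:
            errors.append(f"{entry.name} does not end in an integer")
        else:
            seen.add(int(match.group(1)))
    # With N distinct numbers seen, a gap-free sequence is exactly 1..N.
    for missing in sorted(set(range(1, len(seen) + 1)) - seen):
        errors.append(f"sequence {missing} missing")
    return errors
```

With this filter, a stray subfolder produces only the "contains directory" error and no longer a second, misleading sequence failure.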

Dont have publish state on command line

Is your feature request related to a problem? Please describe.
#0c912a0 added a publish state to the command line. That's wrong: publish state is discovered by the internal examination of a work.

Describe the solution you'd like

  1. Have a specialized StateWriter examine internal state of published works and pass the result to SQL.
  2. Have a new SQL process for publishing

Describe alternatives you've considered

Additional context

Java util exception

Describe the bug
Some configurations throw ,,,3,Test java.util.NoSuchElementException threw exception null.,/Users/jimk/prod/W1KG26108

To Reproduce
Steps to reproduce the behavior:

  1. In audittool's shell.properties, comment out the "DerivedImageGroupParent" value (as if the work you're examining has no images directory).
  2. Start audit tool on a work that has no 'images' folder.
  3. You see the above referenced error in the FileSize test.

FAIL-W1KG26108-2020-06-30.17.45.csv.txt

Data
See Windhorse:/Volumes/SSD_HD/W1KG26108

Expected behavior
When there is no input appropriate for a test (e.g., the File Size test works on "images", or whatever the value of "DerivedImageGroupParent" is), the test should pass (or just emit an error in the ignored group).

Actual Behavior
The test is recorded as a failure, even when there are no images to fail.

Test throws exception

Describe the bug
On debian 10, sattva, a test throws an exception, instead of passing or failing.

See the DETAIL log attached.
To Reproduce
use 0.9-SNAPSHOT-1

  1. Mount RS3: Archive on ~/mnt/rs3Archive
  2. audittool.sh ~/mnt/rs3Archive/W2KG208014

Expected behavior
The system should run tests on the images/ and archives/ subfolders

** Logs **
AuditTestShell-DETAIL-2019-11-13-08-27-28.txt
AuditTestShell-SUMMARY-2019-11-13-08-27-28.txt

Desktop (please complete the following information):

  • OS: Debian 10, Java 8

Results repository

Is your feature request related to a problem? Please describe.
We need a central repository for audit tool results. This is because we have several different audiences:

  • FRs in China
  • AO staff in WHQ

We want to run audittool as part of the last stage of publication. This generates a lot of small result files.

Describe the solution you'd like
On each machine which runs audittool, we could have a repository for each work. A Jetty app could be a frontend for the repository.
On the China front, users might not be sophisticated enough to parse lots of csv and log files. But putting them in a SQL db wouldn't necessarily help.

Describe alternatives you've considered
log4j into a sqlite file. This is clunky, because, unlike the log4j csv, the SQL appenders pass along the %m field only. One wishes they would take the varargs like the CSV.

Additional context
Do we want one central repo? Is there a point to that?
Could we partition the repo into stages:

  • receiving
  • pre-processing
  • post-processing
  • Review for publication

Summary and detail loggers

Audit tool has two levels of failure:

  • Report that a test has failed
  • Report each file which failed

These are logged to the same stream. Users found this confusing and noisy.
Log test outcome to one logger, details to another.
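
The split can be illustrated with Python's standard logging module. This is only an analogy to the Java logging setup: the logger names audit.summary and audit.detail and the function make_loggers are made up for the example.

```python
import logging


def make_loggers(summary_path, detail_path):
    """Create two independent loggers, each writing to its own file."""
    loggers = []
    for name, path in (("audit.summary", summary_path), ("audit.detail", detail_path)):
        logger = logging.getLogger(name)
        logger.setLevel(logging.INFO)
        logger.propagate = False  # keep the two streams from merging at the root logger
        logger.handlers.clear()
        logger.addHandler(logging.FileHandler(path, mode="w", encoding="utf-8"))
        loggers.append(logger)
    return tuple(loggers)
```

A test wrapper would then send one outcome line per test to the first logger and one line per failing file to the second.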

Make log names more grained

If audit-tool is running multiple instances which start in the same second, their log files interleave.

Version
0.8-SNAPSHOT-2
To Reproduce

  1. create a list of test paths
  2. pipe them into parallel: `cat list | parallel -j6 audittool.sh`

Data
See attached log file. It has only two lines, containing tests from different folders.

Expected behavior
It should have only test results for one test subject folder, since audittool was only invoked for one folder.

Screenshots or Logs
AuditTestShell-SUMMARY-2019-05-10-10-47-52.log (https://github.com/buda-base/asset-manager/files/3167008/AuditTestShell-SUMMARY-2019-05-10-10-47-52.log)

Desktop (please complete the following information):

  • OS: OSX Mojave
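
One possible way to keep parallel instances from sharing a log file is to add sub-second time and the process id to the name. The Python sketch below is an assumption about a workable naming scheme, not the fix that was adopted; unique_log_name is a hypothetical helper.

```python
import os
from datetime import datetime


def unique_log_name(kind, now=None, pid=None):
    """Build a log name that differs between processes started in the same second."""
    now = now or datetime.now()
    pid = os.getpid() if pid is None else pid
    # %f is the microsecond field; the pid separates processes with equal timestamps.
    return f"AuditTestShell-{kind}-{now:%Y-%m-%d-%H-%M-%S}-{now:%f}-{pid}.log"
```

For example, unique_log_name("SUMMARY") called in two processes launched by parallel yields two different file names even when both start in the same second.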

Support spaces in folder list on command line

Audit tool supports a list of multiple folders on its command line:

audittool.ps1 y:\W1FPL2251 , y:\CastleHarfang

or

audittool.ps1 /SomewhereW1FPL2251 , /Somewhere/CastleHarfang

However, if there are spaces in between the arguments, they are not parsed correctly

To Reproduce

  1. Create a couple of folders ~/tmp/f1 and ~/tmp/f2
  2. Run audittool.sh or audittool.ps1 with the command line audittool.sh ~/tmp/f1 , ~/tmp/f2

Expected behavior
Audit tool should run all its tests for each folder on the command line.

** Actual behavior **
Only one argument is processed, or other parsing errors occur (on Windows PowerShell).

Desktop (please complete the following information):

  • OS: All
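
A small sketch of one tolerant parsing approach: rejoin whatever the shell split apart, then split on commas. It is written in Python for illustration, the name split_folder_args is hypothetical, and it assumes folder names never contain commas.

```python
def split_folder_args(argv):
    """Rejoin shell-split arguments, then split the result on commas."""
    joined = " ".join(argv)
    # Stripping each part removes the spaces that surrounded the commas.
    return [part.strip() for part in joined.split(",") if part.strip()]
```

With this, the argument vectors for "a , b", "a, b" and "a,b" all yield the same two folders.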

change name to bdrc-test-FolderName-SUMMARY-date-time

Is your feature request related to a problem? Please describe.
Users can't easily locate the results for a certain test. The log files are created for each run, and our users aren't big find and grep fans.

Describe the solution you'd like
Have each folder tested generate a separate log file, identifiable by the work.
Have the file contain summary and detailed texts

https://www.codota.com/code/java/methods/org.apache.log4j.FileAppender/setName

https://stackoverflow.com/questions/15441477/how-to-add-log4j2-appenders-at-runtime-programmatically . This one has all the namespaces spelled out.

And also
https://www.programcreek.com/java-api-examples/?class=org.apache.logging.log4j.core.Logger&method=addAppender

Describe alternatives you've considered
Cloud alerts, database status logging


Dont count json files in image groups

Is your feature request related to a problem? Please describe.
JSON files are usually not problematic, and are needed to support other operations.

Describe the solution you'd like
All tests on image groups should pass when the image groups contain folders with json files in them

Describe alternatives you've considered
Not doing anything. This just decreases confidence. It also limits the test sets, and the operations we can do on media.


Remove support for .sh scripts from Windows release

Describe the bug
It's a programming burden to support both .ps1 and .sh versions of scripts. The .ps1 versions are up to date, and are what NT requested.

Move the .sh scripts to a folder 'unsupported' so that users can maintain them if they want to.

In file sequence test, failing folder not logged

"2019-04-19 15:01:00,471" ERROR i.b.a.a.s.shell (shell.java:108) [main] FILE_SEQUENCE:Sequence folder :/Users/tbrc/staging/in-progressA/W3CN2612/images/W3CN2612-I3CN2614: Last file index is 2612, but there are 3 files in the folder . not found

Create desktop and UI audit tool

PRB: command line install and operation.

Create a desktop application which can use the same sources as a web server version.

Python windowed - not portable, not installable, not scalable

Sub tasks:

Cant process some TIF image files

Describe the bug
Many older TIF files in images/ fail because ImageIO doesn't have a reader for them.

To Reproduce
audittool.sh RS3://Archive/W00EGS1016294


** Data **
RS3://Archive/W00EGS1016294

Expected behavior
Audit tool should read the images and process them.

What happens is the files show up with error 110 'No suitable reader found'

see W00EGS1016294-2019-12-02.15.32.csv.txt


Deployment Pom

Figure out what we want from Maven deployment - to Github? Can we get that?

Detail log file without Web Image attributes details

When the Web Image Attributes test fails, the DETAIL log file does not list any of its failures, just the column headers. The DETAIL log still lists failures for other tests (such as sequence breaks), but not for the Web Image Attributes failures.

SUMMARY-2020-02-20-11-54
outcome,path,test_name
Passed,/Users/timb/staging/QA-Review/W8LS31241,No Files in Root Folder
Failed,/Users/timb/staging/QA-Review/W8LS31241,Web Image Attributes
Passed,/Users/timb/staging/QA-Review/W8LS31241,No folders allowed in Image Group folders
Failed,/Users/timb/staging/QA-Review/W8LS31241,File Sequence Test

DETAIL-2020-02-20-11-54
error_number,error_test,path
109,"Folder /Users/timb/staging/QA-Review/W8LS31241/archive/W8LS31241-I8LS31246 expected 885 files in folder , found 883",/Users/timb/staging/QA-Review/W8LS31241
105,Sequence File Sequence 1 missing not found,/Users/timb/staging/QA-Review/W8LS31241
105,Sequence File Sequence 2 missing not found,/Users/timb/staging/QA-Review/W8LS31241

SUMMARY-2020-02-20-10-23
outcome,path,test_name
Passed,/Users/timb/staging/QA-Review/W8LS31064,No Files in Root Folder
Failed,/Users/timb/staging/QA-Review/W8LS31064,Web Image Attributes
Passed,/Users/timb/staging/QA-Review/W8LS31064,No folders allowed in Image Group folders
Passed,/Users/timb/staging/QA-Review/W8LS31064,File Sequence Test

DETAIL-2020-02-20-10-23
error_number,error_test,path

Image groups should not have any subfolders

Image group folders shall have no subfolders.
Some tests can take parameters which define which folders in a work's structure are parents of image groups. All children of that parent are defined as image groups, which must pass this test.

Move sequence length to global property

In v0.9, the property which determines sequence length is embedded in the audit-test-lib jar file's resources:

io.bdrc.am.audit.audittests.FileSequence.SequenceLength=4

Think about moving it into shell property space.
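
A sketch of the lookup order this would imply: use the value from the shell properties when present, otherwise fall back to the library default. The Python below is illustrative only; load_properties is a deliberately simple parser, not java.util.Properties, and sequence_length is a hypothetical helper.

```python
def load_properties(text):
    """Parse simple key=value lines; '#' and '!' start comment lines."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!" or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


SEQUENCE_LENGTH_KEY = "io.bdrc.am.audit.audittests.FileSequence.SequenceLength"


def sequence_length(shell_props, library_default=4):
    """Prefer the shell-level setting; fall back to the value embedded in the library."""
    return int(shell_props.get(SEQUENCE_LENGTH_KEY, library_default))
```

Keeping the embedded value as the default means existing installations behave the same until someone sets the property.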

Fails first

Put test summaries first, details after.
Possibly separator between summary & detail?

Have the file name contain the test result

Users run audit tool on a batch of files. They don't want to have to "grep" for success or failure in the results.

AuditTool already has to rename the file to contain the work number; extend that to include the status.

In addition, to accommodate languages, have the text of the fail or success label be configurable in shell.properties
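
A minimal sketch of such a name builder in Python. The property names PassLabel and FailLabel are hypothetical, as is the function result_file_name; the date layout is only modeled on result file names that appear elsewhere on this page.

```python
from datetime import datetime


def result_file_name(work, passed, props=None, now=None):
    """Prefix the result file with a configurable pass or fail label."""
    props = props or {}
    # "PassLabel" and "FailLabel" are hypothetical shell.properties keys.
    label = props.get("PassLabel", "PASS") if passed else props.get("FailLabel", "FAIL")
    stamp = (now or datetime.now()).strftime("%Y-%m-%d.%H.%M")
    return f"{label}-{work}-{stamp}.csv"
```

Putting the label first lets users sort a folder of results by outcome without opening any file.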

Define a way to allow tests to have errors, but pass

As a Digital Archivist, I want to be able to have audittool log errors, but still pass a test. We use audittool to check old, existing works for some errors, but we don't care so much about the occasional "can't find a suitable reader" for the file.

Provide a shell.property which says which errors can occur and still have a test pass.

ErrorsAsWarning=n,n,n

The errors numbers are found in the log:
/home/service/audit-test-logs/W8LS66843-2019-12-10.09.45.csv:,,,110,Image file /mnt/rs3Archive/W8LS66843/images/W8LS66843-I4CZ355054/I4CZ3550540339.tif has no suitable reader.,/mnt/rs3Archive/W8LS66843
In this example, the error number is 110, and the filter for it would be:

ErrorsAsWarning=110
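
How the property might be applied, sketched in Python. The helper names parse_errors_as_warning and outcome_passes are made up for this example; only the ErrorsAsWarning=n,n,n format comes from the request above.

```python
def parse_errors_as_warning(value):
    """Turn 'n,n,n' into a set of error numbers; blanks are ignored."""
    return {int(n) for n in value.split(",") if n.strip()}


def outcome_passes(error_numbers, errors_as_warning):
    """A test passes when every logged error number is in the warning set."""
    return all(n in errors_as_warning for n in error_numbers)
```

With ErrorsAsWarning=110, a test that logged only error 110 would pass, while a test that logged 110 and 105 would still fail. The errors themselves stay in the detail log either way.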

Send default logging to a file, not console

audittool start-up is a little noisy:

"2019-03-31 20.13.45" DEBUG io.bdrc.am.audit.iaudit.FilePropertyManager:27 - Load Properties
"2019-03-31 20.13.45" ERROR io.bdrc.am.audit.iaudit.FilePropertyManager:35 - Cant open resource shell.properties
"2019-03-31 20.13.45" ERROR io.bdrc.am.audit.shell.shell:134 - java.lang.ClassNotFoundException:
"2019-03-31 20.13.45" ERROR io.bdrc.am.audit.shell.ArgParser:44 - Failed to parse Missing required option: p

Image Size test does not consider property

Describe the bug
The Image Size test does not properly evaluate the value of the "MaximumFileSize" property in shell.properties

To Reproduce

  1. Access /Volumes/Incoming
  2. audittool.sh /Volumes/Incoming/scans/Inbox/DDD/Delivery01/BDRC/W1FEMC030005

** Data **
The shell.properties file which is in effect during this run, has the entry:
MaximumFileSize=450K

Expected behavior
The size given in the detailed error message should reflect the maximum of 450*1024 = 460800, not 409600
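
The arithmetic can be checked with a small size parser. This Python sketch assumes binary multiples (K = 1024) and accepts an optional trailing B; which suffixes the real tool accepts is not stated here, and parse_size is a hypothetical name.

```python
import re

_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(value):
    """Convert strings such as '450K' into a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*([KMG]?)B?\s*", value.upper())
    if match is None:
        raise ValueError(f"unrecognized size: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]
```

Here parse_size("450K") gives 460800 and parse_size("400K") gives 409600, so the logged maximum of 409600 is consistent with a 400K limit being applied instead of the configured 450K.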

Screenshots or Logs
You will see on the console
ERROR Failed /Volumes/Incoming/scans/Inbox/DDD/Delivery01/BDRC/W1FEMC030005 File Size Test
and in the detailed log:

,,,112,Image file /Volumes/Incoming/scans/Inbox/DDD/Delivery01/BDRC/W1FEMC030005/images/W1FEMC030005-I1FEMC030005/I1FEMC030005_0002.jpg size 732176 exceeds maximum of 409600,/Volumes/Incoming/scans/Inbox/DDD/Delivery01/BDRC/W1FEMC030005
,,,112,Image file /Volumes/Incoming/scans/Inbox/DDD/Delivery01/BDRC/W1FEMC030005/images/W1FEMC030005-I1FEMC030005/I1FEMC030005_0001.jpg size 813094 exceeds maximum of 409600,/Volumes/Incoming/scans/Inbox/DDD/Delivery01/BDRC/W1FEMC030005
,,,112,Image file /Volumes/Incoming/scans/Inbox/DDD/Delivery01/BDRC/W1FEMC030005/images/W1FEMC030005-I1FEMC030005/I1FEMC030005_0004.jpg size 877327 exceeds maximum of 409600,/Volumes/Incoming/scans/Inbox/DDD/Delivery01/BDRC/W1FEMC030005

Automated processing

@eroux writes
After my semi-success (depending if you see the glass half full or
half empty) with the cropping of the Fanfoyan images, I tried to
summarize my thoughts on the possibility of an alternative way of
importing images that could be one of the components of speeding up
the backlog processing:

Automated doc proposal

these are just my 2c, any comment is appreciated!

Not creating the "audit-test-log" folder

1) Command Prompt (CMD)

a) Downloaded this file from the release:

[image: 001]

b) The configuration looks fine, as seen in the screenshot:

[image]

c) The command appears to run fine:

[image: 003]

but it is not able to create the "audit-test-log" folder.

2) Windows System Information

[image: 004]

Some windows systems cant load tests

Describe the bug
On some Windows systems, the tests fail to run.

To Reproduce

  1. Install and configure the product on a Windows 10 system.
  2. Prepare a file containing path names. The paths need not exist.
  3. Run audittool.sh
  4. You might see a 'ClassNotFoundError'

** Data **
Any

Expected behavior
The tests should run or fail.


Desktop (please complete the following information):

  • OS: Windows 10
  • Shell: Powershell 1.0 (x.0?)
  • 0.9 pre-release

Additional context

columnize visual logs

Have the screen and file logs swap the order of 'folder' and 'result', so that 'result' comes first, and pad 'folder' so that the columns align.
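
A sketch of the padded layout using Python format specifiers. The function name format_results, the eight-character result column, and the two-space gap are arbitrary choices for the example.

```python
def format_results(rows):
    """rows: (result, folder, test_name) tuples; returns aligned lines."""
    width = max((len(folder) for _, folder, _ in rows), default=0)
    # Result first, then the folder padded to the longest folder in this run.
    return [f"{result:<8}{folder:<{width}}  {test}" for result, folder, test in rows]
```

Padding to the longest folder in the run, rather than to a fixed width, keeps the columns aligned for long paths without wasting space on short ones.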

Create a filter to exclude files from tests

We're adding and removing files from image group directories. Mostly these are JSON files, which IIIF needs for delivering media.
They get in the way of audit tool tests, because they are counted in sequences, and the image processing tests attempt to read them.

Describe the solution you'd like
Build a filter in shell.properties which specifies a list of regexps to ignore.
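
A possible shape for such a filter, sketched in Python. The comma-separated property format and the name compile_ignore_filter are assumptions; a comma separator would also rule out patterns that themselves contain commas, so a real design might pick a different delimiter.

```python
import re


def compile_ignore_filter(property_value):
    """Build a predicate from a comma-separated list of regular expressions."""
    patterns = [re.compile(p.strip()) for p in property_value.split(",") if p.strip()]
    # A file is ignored when its whole name matches any one pattern.
    return lambda name: any(p.fullmatch(name) for p in patterns)
```

Each test would apply the predicate to a file name before counting or reading the file, for example ignore = compile_ignore_filter(r".*\.json") followed by skipping every name for which ignore(name) is true.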

Describe alternatives you've considered
The best alternative is to take these files out of "images" and put them somewhere else. We've discussed inverting the way we store all the different versions of our media:
Now it's work major, with Work > {sources|raw|images|archive}/ImageGroup/{img}{*}|{something}{*}.


not reading some parameters

Tested Audit Tool with following parameters.

  • file format: .tiff, .png, .jpg
  • file sequence: double file name, 008 instead of 0008
  • file size: above 400 KB (webpage)

The results are in the following CSV files:

output csv.zip

It is able to detect a wrong file format and a double file name, but it seems unable to detect files larger than 400 KB. Also, is it able to detect a file name that is one zero short, like 008 instead of 0008? In the output file, it seems unable to detect this issue.

Audit tool cant read Some TIF files

Describe the bug
Many older TIF files in images/ fail because ImageIO doesn't have a reader for them.

To Reproduce
audittool.sh RS3://Archive/W00EGS1016294


Color TIF image fails test which it should pass

Describe the bug
A TIF file is failing the 'binarytiffnotgroup4' test, although its bit depth is 24 (ie it's not a binary image)

To Reproduce
audittool.sh /mnt/rs3Archive/W00EGS1017169

** Data **

Image file /mnt/rs3Archive/W00EGS1017196/images/W00EGS1017196-I00EGS1017198/I00EGS10171980003.TIF is invalid TIFF. Reasons: binarytif-tiffnotgroup4 :bd:24: itn :0: comp :None:
--
Image file /mnt/rs3Archive/W00EGS1017196/images/W00EGS1017196-I00EGS1017200/I00EGS10172000003.TIF is invalid TIFF. Reasons: binarytif-tiffnotgroup4 :bd:24: itn :0: comp :None:
Image file /mnt/rs3Archive/W00EGS1017196/images/W00EGS1017196-I00EGS1017199/I00EGS10171990003.TIF is invalid TIFF. Reasons: binarytif-tiffnotgroup4 :bd:24: itn :0: comp :None:

Expected behavior
These files should pass
Screenshots or Logs
W00EGS1017196-2019-11-27.15.13.csv.txt

Desktop (please complete the following information):

  • Debian 9, v.09 snapshot 1
