macro_tagger's Issues
Make image_upload.py sequentially scan folders for new pictures
Have image_upload.py check MongoDB to see whether an image has already been uploaded.
Two suggestions I have:
- Pull all the documents from each Mongo collection and compare the paths. Example: folder: 230701, filename: 0001.jpg. Then list all files by iterating over all folders and running something like `if '0001.jpg' in listdir('230701'): return True`.
- Store a hash of each image in the DB, then compare a candidate image's hash against the stored hashes to see whether it already exists.
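The path comparison in the first suggestion could look roughly like the sketch below. It is only an illustration: `find_new_images`, the `known_paths` set, and the extension filter are hypothetical names, and the set stands in for whatever (folder, filename) pairs would actually be pulled from the Mongo collections.

```python
import os
import tempfile

def find_new_images(root_dir, known_paths, extensions=('.jpg', '.jpeg', '.png')):
    """Return (folder, filename) pairs under root_dir that are not in known_paths."""
    new_images = []
    # Sort so that date-named folders such as 230701, 230702 are scanned in order
    for folder in sorted(os.listdir(root_dir)):
        folder_path = os.path.join(root_dir, folder)
        if not os.path.isdir(folder_path):
            continue
        for filename in sorted(os.listdir(folder_path)):
            if not filename.lower().endswith(extensions):
                continue
            if (folder, filename) not in known_paths:
                new_images.append((folder, filename))
    return new_images

# Demo on a throwaway directory tree
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, '230701'))
    for name in ('0001.jpg', '0002.jpg'):
        open(os.path.join(root, '230701', name), 'wb').close()
    # Assumption: stand-in for the paths already stored in MongoDB
    known_paths = {('230701', '0001.jpg')}
    demo_result = find_new_images(root, known_paths)
    print(demo_result)
```

Keeping the known paths in a set makes each lookup constant time, which matters more than the directory listing once the number of stored documents grows.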
Examples

File hash: hashes the raw bytes of a file, regardless of whether it is an image. If the file changes in any way, including its metadata, the hash changes. This is the easiest approach.

```python
import hashlib

def calculate_hash(file_path):
    # Read the entire file as bytes and hash it with SHA-256
    with open(file_path, 'rb') as file:
        data = file.read()
    return hashlib.sha256(data).hexdigest()

file_path = '/path/to/your/image.jpg'  # replace with your file path
print(f'The SHA256 hash of the file is: {calculate_hash(file_path)}')
```
Standardized hash: separates the actual image from the metadata, standardizes it, then hashes it. Upside: changing the metadata does not alter the hash. Downsides: you must use the same process every time you check a file's hash; if the standardization is too coarse, visually similar images could produce the same hash and be treated as duplicates; and it takes longer to execute, which could matter when checking many files.

```python
from PIL import Image
import io
import hashlib

def calculate_image_hash(image_path):
    # Open the image file
    with Image.open(image_path) as img:
        # Convert image to RGB and resize
        # (Image.LANCZOS replaces Image.ANTIALIAS, which was removed in Pillow 10)
        img = img.convert('RGB').resize((8, 8), Image.LANCZOS)
        # Save resized image to a BytesIO object to get rid of any original metadata
        with io.BytesIO() as temp_file:
            img.save(temp_file, format='JPEG')
            # Calculate the hash on the bytes of the standardized image
            image_hash = hashlib.sha256(temp_file.getvalue()).hexdigest()
    return image_hash

file_path = '/path/to/your/image.jpg'  # replace with your file path
print(f'The SHA256 hash of the image content is: {calculate_image_hash(file_path)}')
```
A combination of the two might be good: scan for newly added folders, then check the image hash to be sure a new folder is not just a copy of another one. It could also help if whoever is taking the pictures accidentally takes a duplicate photo.
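A minimal sketch of that combined check, assuming a plain dict (`known_hashes`, hypothetical) stands in for the hashes stored in MongoDB. `file_sha256` and `scan_for_uploads` are illustrative names, not part of image_upload.py. It uses the raw file hash, so it catches copied files but not re-shot photos of the same subject.

```python
import hashlib
import os
import tempfile

def file_sha256(path, chunk_size=65536):
    """Hash a file in chunks so large images are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

def scan_for_uploads(root_dir, known_hashes):
    """Walk folders in sorted order; return (to_upload, duplicates).

    known_hashes maps hash -> path of the first file seen with that hash
    and is updated in place as new files are found.
    """
    to_upload, duplicates = [], []
    for folder in sorted(os.listdir(root_dir)):
        folder_path = os.path.join(root_dir, folder)
        if not os.path.isdir(folder_path):
            continue
        for filename in sorted(os.listdir(folder_path)):
            path = os.path.join(folder_path, filename)
            if not os.path.isfile(path):
                continue
            file_hash = file_sha256(path)
            if file_hash in known_hashes:
                duplicates.append((path, known_hashes[file_hash]))
            else:
                known_hashes[file_hash] = path
                to_upload.append(path)
    return to_upload, duplicates

# Demo: folder 230702 contains a byte-for-byte copy of a file from 230701
with tempfile.TemporaryDirectory() as root:
    files = {('230701', '0001.jpg'): b'first',
             ('230702', '0001.jpg'): b'first',
             ('230702', '0002.jpg'): b'second'}
    for (folder, name), content in files.items():
        os.makedirs(os.path.join(root, folder), exist_ok=True)
        with open(os.path.join(root, folder, name), 'wb') as f:
            f.write(content)
    to_upload, duplicates = scan_for_uploads(root, {})
    demo_uploads = [os.path.relpath(p, root) for p in to_upload]
    demo_duplicates = [os.path.relpath(p, root) for p, _ in duplicates]
    print(demo_uploads, demo_duplicates)
```

Swapping `file_sha256` for a standardized image hash would trade speed for tolerance to metadata changes.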
Create Schema for sample collection and identifying macros
What I would suggest is to create a standard schema for sample collection and identification.
Schema collections:
- macro_image: image info
- sample_box: info about where the sample was placed, when it was collected, when it was deployed, etc. Steph may fill this out ahead of time before deploying in the future.
- species: info about the species, e.g. scud. May be kept more general so we can use it with other projects, e.g. wildlife tracking.
- individual_species: references `macro_image._id`, which user ID'd it, and `species._id`.
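To make the relationships concrete, here is a rough sketch of what one document per collection might look like, written as plain Python dicts. Only the collection names and the `macro_image._id` / `species._id` references come from the proposal above. Every other field name, and the string IDs used in place of MongoDB ObjectIds, are placeholders for illustration.

```python
# Hypothetical example documents; field names other than _id are assumptions
macro_image = {
    '_id': 'img-0001',
    'folder': '230701',
    'filename': '0001.jpg',
    'sample_box_id': 'box-01',   # assumed link back to the sample box
}

sample_box = {
    '_id': 'box-01',
    'location': None,            # where the sample was placed
    'deployed_at': None,         # could be filled in ahead of deployment
    'collected_at': None,
}

species = {
    '_id': 'sp-scud',
    'common_name': 'scud',
}

individual_species = {
    '_id': 'ident-0001',
    'macro_image_id': macro_image['_id'],  # reference to macro_image._id
    'identified_by': 'user-01',            # which user ID'd it
    'species_id': species['_id'],          # reference to species._id
}

def references_resolve(identification, images, species_docs):
    """Check that an identification points at an existing image and species."""
    image_ids = {doc['_id'] for doc in images}
    species_ids = {doc['_id'] for doc in species_docs}
    return (identification['macro_image_id'] in image_ids
            and identification['species_id'] in species_ids)

print(references_resolve(individual_species, [macro_image], [species]))
```

Keeping identifications in their own collection means one image can carry several identifications from different users without changing the image document.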