This project organizes and indexes files in a Content Addressable Storage (CAS) system. Metadata is stored in separate index files, keyed by each file's SHA-256 hash, which is always written in uppercase and truncated to 16 characters. The metadata includes current and previous locations, resolution, audio codec, video codec, MD5 hash mapping (also uppercase and truncated to 16 characters), preferences (a ranking from 1 to 9), source URLs, modification times (mtime), file extensions (validated against the MIME type), bitrate, MIME types as reported by the file command, deleted files, lost files, and tags. Tags are divided into user tags (utags), derived from directory names, and file tags (ftags), derived from filenames.
Store all hash-keyed metadata index files in a structured directory under /index/sha256. The people.txt file sits directly under /index.
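The extension check against the MIME type could be sketched as follows. This is a minimal sketch, assuming the file utility is available on the PATH and that Python's standard mimetypes table is an acceptable mapping from extension to MIME type. The function names detect_mime and ext_matches_mime are hypothetical and not part of the project.

```python
import mimetypes
import subprocess


def detect_mime(path: str) -> str:
    """Return the MIME type reported by the `file` command for `path`."""
    # -b omits the filename prefix; --mime-type prints only e.g. "video/mp4".
    result = subprocess.run(
        ["file", "-b", "--mime-type", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def ext_matches_mime(ext: str, mime: str) -> bool:
    """Check whether an extension (without the dot) is consistent with a MIME type."""
    # Assumption: the stdlib mimetypes table decides what "consistent" means.
    guessed, _encoding = mimetypes.guess_type("x." + ext.lower())
    return guessed == mime
```

With the sample values from the index listings, ext_matches_mime("mp4", "video/mp4") accepts the pair, while a jpg extension paired with video/mp4 is rejected.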
Directory Structure:
/index
  /sha256
    /vid
      vcodec.txt
      acodec.txt
      vres.txt
    /img
      vres.txt
    /aud
      acodec.txt
      bitrate.txt
    loc_current.txt
    loc_previous.txt
    md5.txt
    pref.txt
    source_url.txt
    mtime.txt
    ext.txt
    mime.txt
    deleted.txt
    lost.txt
    utags.txt
    ftags.txt
  people.txt
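The layout above can be created in one pass. The following is a minimal sketch, assuming the index root is passed in by the caller (for example a Path pointing at /index) and that empty files are an acceptable starting state. INDEX_FILES and create_index_layout are hypothetical names.

```python
from pathlib import Path

# Relative paths taken from the directory structure listed above.
INDEX_FILES = [
    "sha256/vid/vcodec.txt", "sha256/vid/acodec.txt", "sha256/vid/vres.txt",
    "sha256/img/vres.txt",
    "sha256/aud/acodec.txt", "sha256/aud/bitrate.txt",
    "sha256/loc_current.txt", "sha256/loc_previous.txt", "sha256/md5.txt",
    "sha256/pref.txt", "sha256/source_url.txt", "sha256/mtime.txt",
    "sha256/ext.txt", "sha256/mime.txt", "sha256/deleted.txt",
    "sha256/lost.txt", "sha256/utags.txt", "sha256/ftags.txt",
    "people.txt",
]


def create_index_layout(root: Path) -> list[Path]:
    """Create every index file under `root`, leaving existing files untouched."""
    created = []
    for rel in INDEX_FILES:
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch(exist_ok=True)  # does not truncate an existing file
        created.append(path)
    return created
```

Calling create_index_layout(Path("/index")) would reproduce the tree shown above. Running it a second time is harmless because neither mkdir nor touch overwrites existing content.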
- /index/sha256/vid/vcodec.txt
ABCDEF1234567890 H.264
1234567890ABCDEF MP4
- /index/sha256/vid/acodec.txt
ABCDEF1234567890 AAC
1234567890ABCDEF MP3
- /index/sha256/vid/vres.txt
ABCDEF1234567890 1920x1080=2073600
1234567890ABCDEF 1280x720=921600
- /index/sha256/img/vres.txt
ABCDEF1234567890 4000x3000=12000000
1234567890ABCDEF 1920x1080=2073600
- /index/sha256/aud/acodec.txt
ABCDEF1234567890 AAC
1234567890ABCDEF MP3
- /index/sha256/aud/bitrate.txt
ABCDEF1234567890 320kbps
1234567890ABCDEF 128kbps
- /index/sha256/loc_current.txt
ABCDEF1234567890 /directory/to/file...ABCDEF1234567890...txt
1234567890ABCDEF /another/path/to/file...1234567890ABCDEF...txt
- /index/sha256/loc_previous.txt
ABCDEF1234567890 /old/directory/to/file...ABCDEF1234567890...txt
1234567890ABCDEF /previous/path/to/file...1234567890ABCDEF...txt
ABCDEF1234567890 /another/old/path/to/file...ABCDEF1234567890...txt
- /index/sha256/md5.txt
ABCDEF1234567890 ABCDEF1234567890
1234567890ABCDEF 1234567890ABCDEF
- /index/sha256/pref.txt
ABCDEF1234567890 9
1234567890ABCDEF 1
- /index/sha256/source_url.txt
ABCDEF1234567890 http://example.com/source1
1234567890ABCDEF http://example.com/source2
- /index/sha256/mtime.txt
ABCDEF1234567890 20230710123000
ABCDEF1234567890 20230711143000
1234567890ABCDEF 20230710123000
1234567890ABCDEF 20230712123000
- /index/sha256/ext.txt
ABCDEF1234567890 mp4
1234567890ABCDEF jpg
- /index/sha256/mime.txt
ABCDEF1234567890 video/mp4
1234567890ABCDEF image/jpeg
- /index/sha256/deleted.txt
ABCDEF1234567890 20230715123000
1234567890ABCDEF 20230716143000
- /index/sha256/lost.txt
ABCDEF1234567890 20230720123000
1234567890ABCDEF 20230721143000
- /index/sha256/utags.txt
ABCDEF1234567890 DIRECTORY
1234567890ABCDEF ANOTHER
1234567890ABCDEF DIRECTORY
- /index/sha256/ftags.txt
ABCDEF1234567890 this
ABCDEF1234567890 file
1234567890ABCDEF name
1234567890ABCDEF file
- /index/people.txt
John Doe, dob=19800101, occupation=actor
Jane Smith, dob=19750101, occupation=actress
Alice Johnson, dob=19900101, occupation=director
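The listings above suggest a simple line format for the files under /index/sha256: a 16-character key, one space, then the value, with a key allowed to repeat (as in mtime.txt and ftags.txt). The following is a minimal sketch of a reader, assuming that format holds for every hash-keyed file. parse_index and parse_resolution are hypothetical helper names.

```python
from collections import defaultdict


def parse_index(text: str) -> dict[str, list[str]]:
    """Map each hash key to the list of values recorded for it, in file order."""
    entries: dict[str, list[str]] = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first space only, since values such as paths may contain spaces.
        key, _, value = line.partition(" ")
        entries[key].append(value)
    return dict(entries)


def parse_resolution(value: str) -> tuple[int, int, int]:
    """Split a vres value like '1920x1080=2073600' into (width, height, pixels)."""
    dims, _, pixels = value.partition("=")
    width, _, height = dims.partition("x")
    w, h, p = int(width), int(height), int(pixels)
    if w * h != p:
        raise ValueError(f"pixel count does not match dimensions: {value}")
    return w, h, p
```

Keeping a list per key preserves repeated entries, so a file with two mtime lines yields both timestamps rather than only the last one.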
SHA-256 Hash: abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890
Truncated SHA-256 Hash: ABCDEF1234567890
MD5 Hash: abcdef1234567890abcdef1234567890
Truncated MD5 Hash: ABCDEF1234567890
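The truncation shown above (first 16 hex characters, uppercased) can be sketched with Python's hashlib. The function names truncate_hex and truncated_digest are hypothetical, and the sketch assumes that MD5 is enabled in the local Python build.

```python
import hashlib


def truncate_hex(hex_digest: str, length: int = 16) -> str:
    """Keep the first `length` hex characters and uppercase them."""
    return hex_digest[:length].upper()


def truncated_digest(data: bytes, algorithm: str = "sha256", length: int = 16) -> str:
    """Hash `data` with the named algorithm and return the truncated, uppercase form."""
    return truncate_hex(hashlib.new(algorithm, data).hexdigest(), length)
```

For example, truncate_hex applied to the full SHA-256 value above gives ABCDEF1234567890, and truncated_digest(data, "md5") gives the value that would go into md5.txt.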
Original: Example Document.txt
Normalized: example.document.txt
Final Filename: example.document...ABCDEF1234567890...txt
CAS Path: /files/sha256/A/B/C/ABCDEF1234567890.txt
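The naming steps above can be sketched as follows. The only normalization rules visible in the example are lowercasing and replacing the space with a dot, so anything beyond that is an assumption. Likewise, the A/B/C directories are assumed to be the first three characters of the truncated hash. normalize_name, final_filename, and cas_path are hypothetical names.

```python
import posixpath


def normalize_name(original: str) -> str:
    """Lowercase the name and replace each run of whitespace with a single dot."""
    return ".".join(original.lower().split())


def final_filename(original: str, short_hash: str) -> str:
    """Insert the truncated hash between the normalized stem and the extension."""
    stem, ext = posixpath.splitext(normalize_name(original))
    return f"{stem}...{short_hash}...{ext.lstrip('.')}"


def cas_path(short_hash: str, ext: str, root: str = "/files/sha256") -> str:
    """Build the storage path, fanning out on the first three hash characters."""
    a, b, c = short_hash[:3]
    return posixpath.join(root, a, b, c, f"{short_hash}.{ext}")
```

Using posixpath keeps the forward-slash paths identical on every platform, which matches the paths written in the index files.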
Use tools like MediaInfo or ExifTool to extract metadata and manually update index files:
/index/sha256/vid/vcodec.txt:
ABCDEF1234567890 H.264
/index/sha256/vid/acodec.txt:
ABCDEF1234567890 AAC
/index/sha256/vid/vres.txt:
ABCDEF1234567890 1920x1080=2073600
/index/sha256/img/vres.txt:
ABCDEF1234567890 4000x3000=12000000
/index/sha256/aud/acodec.txt:
ABCDEF1234567890 AAC
/index/sha256/aud/bitrate.txt:
ABCDEF1234567890 320kbps
/index/sha256/loc_current.txt:
ABCDEF1234567890 /directory/to/file...ABCDEF1234567890...txt
/index/sha256/loc_previous.txt:
ABCDEF1234567890 /old/directory/to/file...ABCDEF1234567890...txt
/index/sha256/md5.txt:
ABCDEF1234567890 ABCDEF1234567890
/index/sha256/pref.txt:
ABCDEF1234567890 9
/index/sha256/source_url.txt:
ABCDEF1234567890 http://example.com/source1
/index/sha256/mtime.txt:
ABCDEF1234567890 20230710123000
ABCDEF1234567890 20230711143000
/index/sha256/ext.txt:
ABCDEF1234567890 mp4
/index/sha256/mime.txt:
ABCDEF1234567890 video/mp4
/index/sha256/deleted.txt:
ABCDEF1234567890 20230715123000
/index/sha256/lost.txt:
ABCDEF1234567890 20230720123000
/index/sha256/utags.txt:
ABCDEF1234567890 DIRECTORY
/index/sha256/ftags.txt:
ABCDEF1234567890 this
ABCDEF1234567890 file
/index/people.txt:
John Doe, dob=19800101, occupation=actor
Jane Smith, dob=19750101, occupation=actress
Alice Johnson, dob=19900101, occupation=director
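If the manual update is later scripted, appending an entry could look like the sketch below. It assumes the "KEY VALUE" line format shown above and that exact duplicate lines are unwanted. append_entry is a hypothetical helper, and the values themselves (codec, resolution, and so on) would still come from MediaInfo or ExifTool, which are not invoked here.

```python
from pathlib import Path


def append_entry(index_file: Path, key: str, value: str) -> bool:
    """Append 'KEY VALUE' to index_file unless that exact line already exists.

    Returns True if a line was written, False if it was already present.
    """
    line = f"{key} {value}"
    text = index_file.read_text(encoding="utf-8") if index_file.exists() else ""
    if line in text.splitlines():
        return False
    index_file.parent.mkdir(parents=True, exist_ok=True)
    with index_file.open("a", encoding="utf-8") as fh:
        # Guard against a file whose last line has no trailing newline.
        if text and not text.endswith("\n"):
            fh.write("\n")
        fh.write(line + "\n")
    return True
```

Because only exact duplicates are skipped, a second mtime for the same hash is still recorded as a new line, which matches the repeated keys in mtime.txt.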