Topic: deduplication
Something interesting about deduplication
deduplication,Make it easier to compare and cross-reference the names of companies and people by applying strong normalisation.
Organization: alephdata
deduplication,Find duplicate files
User: arsenetar
Home Page: https://dupeguru.voltaicideas.net
deduplication,Deduplicating archiver with compression and authenticated encryption.
Organization: borgbackup
Home Page: https://www.borgbackup.org/
deduplication,Simple, configuration-driven backup software for servers and workstations
Organization: borgmatic-collective
Home Page: https://torsion.org/borgmatic/
deduplication,Productivity improvements for Rust ecosystem: warnings are skipped until errors are fixed, LSP-independent Neovim integration, etc.
Organization: cargo-limit
deduplication,Config driven, easy backup cli for restic.
User: cupcakearmy
Home Page: https://autorestic.vercel.app/
deduplication,Quick and dirty backup tool benchmark with reproducible results
User: deajan
deduplication,A kernel module that provides a pool of deduplicated and/or compressed block storage.
Organization: dm-vdo
deduplication,Userspace tools for managing VDO volumes.
Organization: dm-vdo
deduplication,Data deduplication engine, supporting optional compression and public key encryption.
User: dpc
deduplication,Benji Backup: block-based deduplicating backup software for Ceph RBD images, iSCSI targets, image files and block devices
User: elemental-lf
Home Page: https://benji-backup.me
deduplication,Quickly detect already witnessed data.
User: f483
Home Page: https://f483.github.io
deduplication,Tool for managing data-deduplication within extant compressed archive files, along with a relatively performant BK tree implementation for fuzzy image searching.
User: fake-name
deduplication,Deduplicating archiver with encryption and paranoid-level tests. Swiss army knife for the serious backup and disaster recovery manager. Ransomware neutralizer. Win/Linux/Unix
User: fcorbelli
deduplication,UniSim is a package for efficient similarity computation, fuzzy matching, and clustering of data.
Organization: google
deduplication,Open source data preparation project for builders of LLM applications
Organization: ibm
Home Page: https://ibm.github.io/data-prep-kit/
deduplication,A list of free data matching and record linkage software.
User: j535d165
deduplication,A powerful and modular toolkit for record linkage and duplicate detection in Python
User: j535d165
Home Page: http://recordlinkage.readthedocs.io/
deduplication,Idempotent, deduplicating message consumer for RocketMQ; supports MySQL or Redis as the idempotency table and works out of the box
User: jaskey
deduplication,CLI utility to find duplicate files
User: jvirkki
Home Page: http://www.virkki.com/dupd
deduplication,Cross-platform backup tool for Windows, macOS & Linux with fast, incremental backups, client-side end-to-end encryption, compression and data deduplication. CLI and GUI included.
Organization: kopia
Home Page: https://kopia.io
deduplication,Fast block-level out-of-band BTRFS deduplication tool.
User: lakshmipathi
deduplication,[UNMAINTAINED] A transactional and deduplicating virtual file system
User: lostatc
deduplication,CLI utility to find near duplicate images and remove all but the best copy.
User: markusressel
deduplication,Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents
User: mattilyra
deduplication,A fast high compression read-only file system for Linux, Windows and macOS
User: mhx
deduplication,Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
Organization: moj-analytical-services
Home Page: https://moj-analytical-services.github.io/splink/
deduplication,A secure and efficient file backup solution that fits both system administrators (CLI) and end users (GUI)
Organization: netinvent
deduplication,FastCDC implementation in Rust
User: nlfiedler
Home Page: https://crates.io/crates/fastcdc
deduplication,Scalable data pre-processing and curation toolkit for LLMs
Organization: nvidia
deduplication,Generate duplex/single consensus reads to reduce sequencing noise and remove duplicates
Organization: opengene
deduplication,Framework and command-line tools for integrating FollowTheMoney data streams from multiple sources
Organization: opensanctions
deduplication,A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
Organization: openvenues
deduplication,Duplicates Detector is a cross-platform GUI utility for finding duplicate files, allowing you to delete or link them to save space. Duplicate files are displayed and processed on two synchronized panels for efficient and convenient operation.
User: pjdude
Home Page: https://pjdude.github.io/dude/
deduplication,Prometheus Alertmanager
Organization: prometheus
Home Page: https://prometheus.io
deduplication,Fast, secure, efficient backup program
Organization: restic
Home Page: https://restic.net
deduplication,A collection of ready-made SQL queries for PostgreSQL covering common tasks (retrieving and modifying data, speeding up queries, database maintenance)
User: rin-nas
deduplication,Resources for tackling record linkage / deduplication / data matching problems
User: ropeladder
deduplication,rustic - fast, encrypted, and deduplicated backups powered by Rust
Organization: rustic-rs
Home Page: https://rustic.cli.rs
deduplication,Extremely fast tool to remove duplicates and other lint from your filesystem
User: sahib
Home Page: http://rmlint.rtfd.org
deduplication,Filter, Sort & Delete Duplicate Files Recursively
User: sreedevk
deduplication,Your personal database. Mirror of https://git.sr.ht/~tsileo/blobstash
User: tsileo
deduplication,Automation for the serious data hoarder that wants to have their data and use it
User: unreadablewxy
deduplication,Record Linkage ToolKit (Find and link entities)
Organization: usc-isi-i2
deduplication,PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.
Organization: vintasoftware
Home Page: https://entity-embed.readthedocs.io/en/latest/
deduplication,Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.
User: yomguithereal
Home Page: https://yomguithereal.github.io/talisman/
deduplication,A batch manager that deduplicates and batches requests for a given data type made within a time window. Useful for batching requests made from multiple React components that use react-query
User: yornaath
Home Page: https://batshit-example.vercel.app/
deduplication,Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Organization: zinggai
deduplication,Spark RDD with Lucene's query and entity linkage capabilities
User: zouzias