Abdelrahman Shaker's Projects
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
An efficient modular implementation of Associating Objects with Transformers for Video Object Segmentation in PyTorch
[CVPR 2024 Highlight] Putting the Object Back Into Video Object Segmentation
Convolutional Neural Networks
Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications
EfficientFormerV2 & EfficientFormer (NeurIPS 2022)
Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
Official implementation of the paper "GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model"
[WACV 2025] Efficient Video Object Segmentation via Modulated Cross-Attention Memory
[ICCV'23] Official repository of the paper "SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications"
[IEEE TMI-2024] UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Weka package for the Deeplearning4j Java library
[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.
A simple but robust PyTorch implementation of RetNet from "Retentive Network: A Successor to Transformer for Large Language Models" (https://arxiv.org/pdf/2307.08621.pdf)