Name: Xidong Wang
Type: User
Company: PhD @ The Chinese University of Hong Kong, Shenzhen; BA @ Beijing Institute of Technology
Bio: Towards (Medical) LLMs’ interpretability and interactivity
Location: [email protected]
Blog: https://scholar.google.com/citations?user=WJeSzQMAAAAJ&hl=en
Xidong Wang's Projects
Repository for the ACL 2023 conference website
Basic Linear Algebra Subprograms testbench
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
Repository containing the website for the EMNLP 2023 conference
Firefly (流萤): a Chinese conversational large language model (full-parameter fine-tuning + QLoRA), supporting fine-tuning of Baichuan2, CodeLlama, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya, Bloom, and other large models
Fast and memory-efficient exact attention
Efficient implementations of state-of-the-art linear attention models in Pytorch and Triton
Inference code for Mistral and Mixtral hacked up into original Llama implementation
Port of Facebook's LLaMA model in C/C++
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Scripts and code for various SFT acceleration frameworks
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Best practice for training LLaMA models in Megatron-LM
NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day
Homework and Notes of CS224N
Use the OpenAI API stably and quickly
OpenCompass is an LLM evaluation platform, supporting a wide range of models (LLaMA, Llama 2, ChatGLM2, ChatGPT, Claude, etc.) over 50+ datasets.
A Ray-based High-performance RLHF framework (for 7B on RTX4090 and 34B on A100)
Optimized LLM .cpp codebases (llama.cpp, bloomz.cpp, whisper.cpp) with matrix multiplication implemented via BLIS
Memory management for AI applications and AI agents
NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for inference applications.
The repository for the code of the UltraFastBERT paper