I share information related to NLP topics that I am interested in.
- modified: 2024-01-11
- Ko-Ethical-QnA: Korean ethical/unethical responses
- KorQuAD: Korean question answering
- KLUE: a benchmark dataset for evaluating Korean language understanding
- KorNLI: Korean natural language inference
- Do Prompt-Based Models Really Understand the Meaning of Their Prompts? (NAACL'22)
- AnyText: Multilingual Visual Text Generation And Editing (ICLR'24)
- SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling (arXiv'23)
- Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models (NeurIPS'23)
- Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks (TMLR'23)
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (NeurIPS'22)
- Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning (NeurIPS'22)
- LoRA: Low-Rank Adaptation of Large Language Models (ICLR'22)
- Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer (ICLR'17)
- Sequence to Sequence Learning with Neural Networks (NIPS'14)
- Effective Approaches to Attention-based Neural Machine Translation (EMNLP'15)
- Sparse is Enough in Scaling Transformers (NeurIPS'21)
- Improve Transformer Models with Better Relative Position Embeddings (EMNLP'20)
- Train short, test long: Attention with linear biases enables input length extrapolation (ICLR'22)
- RoFormer: Enhanced transformer with rotary position embedding (Neurocomputing'23)
- A Survey on Aspect-Based Sentiment Analysis: Tasks, Methods, and Challenges
- AMMUS: A Survey of Transformer-based Pretrained Models in Natural Language Processing
- Mamba (ICLR'24): Mamba: Linear-Time Sequence Modeling with Selective State Spaces
- LittleBird (EMNLP'22): LittleBird: Efficient Faster & Longer Transformer for Question Answering
- ALiBi (ICLR'22): Train short, test long: Attention with linear biases enables input length extrapolation
- RoFormer (Neurocomputing'21): RoFormer: Enhanced transformer with rotary position embedding
- SimCSE (EMNLP'21): SimCSE: Simple Contrastive Learning of Sentence Embeddings
- GPT-3 (NeurIPS'20): Language Models are Few-Shot Learners
- ELECTRA (ICLR'20): ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
- BigBird (NeurIPS'20): Big Bird: Transformers for Longer Sequences
- Reformer (ICLR'20): Reformer: The Efficient Transformer
- Meena (arXiv'20): Towards a Human-like Open-Domain Chatbot
- T5 (JMLR'20): Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
- BERT (NAACL'19): BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- XLNet (NeurIPS'19): XLNet: Generalized Autoregressive Pretraining for Language Understanding
- GPT-2 (2019): Language Models are Unsupervised Multitask Learners
- XLM (2019): Cross-lingual Language Model Pretraining
- BART (2019): BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
- BERT-E2E-ABSA (2019): Exploiting BERT for End-to-End Aspect-based Sentiment Analysis
- RoBERTa (2019): RoBERTa: A Robustly Optimized BERT Pretraining Approach
- CTRL (2019): CTRL: A Conditional Transformer Language Model for Controllable Generation
- ALBERT (2019): ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
- SBERT (2019): Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
- DistilBERT (2019): DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
- ELMo (2018): Deep contextualized word representations
- GPT (2018): Improving Language Understanding by Generative Pre-Training
- Transformer (NIPS'17): Attention Is All You Need
- Seq2Seq (NIPS'14): Sequence to Sequence Learning with Neural Networks
- GloVe (EMNLP'14): GloVe: Global Vectors for Word Representation
- Word2Vec (NIPS'13): Distributed Representations of Words and Phrases and their Compositionality
- COALS (ACM'06): An improved model of semantic similarity based on lexical co-occurrence
- NNLM (NIPS'00): A Neural Probabilistic Language Model
- Persona: Learning to Memorize Entailment and Discourse Relations for Persona-Consistent Dialogues (AAAI'23)
- Fine-Tuning: Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping (arXiv'20)
- Data Augmentation: EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks (EMNLP-IJCNLP'19)
- eXplainable AI: Visualizing and Understanding Recurrent Networks (ICLR'16)