
nlp_paper_summaries's Introduction

NLP Paper Summaries

This repository contains a list of NLP paper summaries intended to make NLP techniques and topics more approachable and accessible. We have identified and listed several important papers, each with a summary or TL;DR, and we invite the whole community to contribute their own perspectives and approachable explanations of these works to help democratize NLP research. The objective is to provide readers with a reliable resource that can serve as an entry point to the field of NLP.

Work in progress!

Join our Slack community to find out more about this and other ongoing projects, or send me an email at [email protected] and I will send you an invite.

Slack channel: #paper_summaries

How to contribute

If you have blogged about an NLP paper or technique, or have found an interesting read out there, I encourage you to share it with the wider community. To add your blog posts, summaries, or TL;DRs to this list, just click the edit button (✏️) on the README.md file inside the corresponding folder. You can then add your entry by modifying the readme file and submitting a PR, which will be reviewed before going live.
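As a minimal, hypothetical sketch (the exact column layout varies by folder, and the paper title and links below are placeholders), adding an entry amounts to appending one row to the folder's markdown table:

```markdown
<!-- hypothetical entry: append your row at the bottom of the folder's table -->
| Title | Summary | Source | TL;DR |
| ----- | ------- | ------ | ----- |
| [Paper Title](https://arxiv.org/abs/XXXX.XXXXX) | [Summary](https://your-blog.example.com/summary) | [Source](https://arxiv.org/abs/XXXX.XXXXX) | [TL;DR](https://your-blog.example.com/tldr) |
```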

Alternatively, we can work on transferring or writing your summaries or TL;DRs directly in this repo so as to make them more accessible. To do this, go inside any of the folders and you will find a "Contribute ✍️" link, which is essentially a request for your contribution. Click on that link and it will take you to a window where you can start writing your summary or TL;DR. Once you have submitted the PR, we will review it and add the entry to the corresponding table.

If you would like to contribute by blogging about an NLP paper/technique, you can check out our suggestions and guidance in this issue.

And if you need ideas on how else to contribute to this repo, take a look at the issues section. We are in need of maintainers.

For now, I have adopted a few tracks from ACL for the categorization of the summaries but this can change based on the granularity of grouping that is needed. Open to ideas here.

Note that we currently link to the source where each summary originated. We are working with a few authors to migrate their content directly to this repo so that the summaries are centralized and easily accessible. This also simplifies the way others can contribute to this project. When a summary is fully available in this repo, we will tag it as "GitHub" in the "Summary" column of the table of summaries to identify it easily.

We are including an extra TL;DR section wherever applicable. This is not meant as a full-fledged summary but rather covers the key points of each paper and serves as a refresher for those who have previously encountered the paper or want to get a quick idea of the concepts being discussed.

This video 📹 demonstrates how to add an entry to any of the folders in this repository.

This next video 📹 demonstrates how to add a summary or TL;DR in the form of a pull request to the repo.

If you are facing any issues submitting your PR, just send me an email at [email protected] or DM me on Twitter.



nlp_paper_summaries's Issues

Summarize paper: Language Models are Few-Shot Learners

Language Models are Few-Shot Learners

https://arxiv.org/abs/2005.14165

Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.

adding oral presentations?

Someone on Twitter suggested adding oral presentations of these papers, where available. Your thoughts?

CONTRIBUTING.md

Currently, we have different variants for naming the hyperlinks in the summary tables:

| Title | Summary | Source | TL;DR |
| ----- | ------- | ------ | ----- |
| Title | Summary Paper | Source | TL;DR |
| Title | Summary 1 | Source | TL;DR |
| Title | Medium | Source | TL;DR |
| Title | Firstname Lastname | Source | TL;DR |
| Title | Summary 1, Summary 2 | Source | TL;DR |
| Title | Medium, Summary | Source | TL;DR |

The question is, which of these would make the most sense for the repository at the moment?
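For instance, a hypothetical row using the "Summary 1, Summary 2" variant (reusing the GPT-3 paper listed in the issue above; the summary and TL;DR links are placeholders) would look like this in the table's markdown source:

```markdown
| Title | Summary | Source | TL;DR |
| ----- | ------- | ------ | ----- |
| Language Models are Few-Shot Learners | [Summary 1](https://example.com/summary-1), [Summary 2](https://example.com/summary-2) | [Source](https://arxiv.org/abs/2005.14165) | [TL;DR](https://example.com/tldr) |
```

Whichever variant we pick, consistent link labels keep the tables scannable when a paper has more than one summary.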

Critique-of-Taylor-s-Law-for-Human-Linguistic-Sequences [To Publish]

Got permission from @jadevaibhav to publish the following article on dair.ai's main website.

Article:
https://github.com/jadevaibhav/Critique-of-Taylor-s-Law-for-Human-Linguistic-Sequences/blob/master/README.md

Tasks:

  • Migrate the article to our main website repo: https://github.com/dair-ai/dair-ai.github.io/tree/master/_posts
  • Proofread, review, and make any necessary changes
  • Add author's information
  • Publish on the main website (dair.ai)
  • Add the entry to the NLP Paper Summaries repo (let's figure out under which folder we can add it)
  • Share on social media and our Slack group

Author's information:

```yaml
jadevaibhav:
  name: Vaibhav Jade
  github: jadevaibhav
```

looking for maintainers

This is going to require a huge effort to maintain due to the nature of the project. I have already received very positive feedback on this idea and it would be nice to get volunteers to help maintain it. If you are interested, please email me at [email protected] or DM me on Twitter.

Add your name to the contributors list

Please say hi and add your name below if you wish to contribute to this project. Make sure to link your GitHub account so that I can add you to this project.

paper summary recipes

Hi all,

I have this idea to use the paper summaries here to help students. I have noticed that a lot of courses, both online and at universities, recommend papers to students, but this can be intimidating and even discouraging for some. What if we created "recipes" that recommend a journey through the paper summaries to read before jumping into the actual corresponding papers? Paper summaries are more approachable and friendly, and can help guide students before they dive into paper reading. This could be a nice addition as we keep expanding the list. Thoughts?

Expand to other domains

@omarsar Thanks for starting this initiative.

I was wondering if we could expand this from nlp_paper_summaries to paper_summaries and group papers by domain: "NLP", "Computer Vision", etc. There is no existing platform that curates such paper summaries in one place, so paper_summaries could fill that gap. Just a thought.
