
Hi there 👋

I'm Runsong Zhao (赵润松), an AI major working toward creating the kind of AI imagined in the ACGN world!

Now working on:

  • Reproducing the experiments from "Attention Is All You Need" (training now)
  • Reading papers about LLMs
  • Improving LLM design in terms of architecture, data, and training strategy
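For reference while reproducing the paper, a minimal NumPy sketch of scaled dot-product attention, the core operation of "Attention Is All You Need"; the function name and tensor shapes here are illustrative, not from any particular codebase:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: (seq_len, d_k) arrays; returns (seq_len, d_k)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (seq_len, seq_len)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # weighted sum of values

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

The full model stacks this inside multi-head attention with learned projections, residual connections, and layer norm, but this single function is the piece the paper's title refers to.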

Future work:

  • Implement GPT-2 (117M)
  • Read papers about reinforcement learning, memory networks, and multimodality
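As a starting point for the GPT-2 (117M) implementation, a back-of-the-envelope sketch of the small model's configuration. The hyperparameters are the published ones; the parameter-count formula is my own rough approximation (it ignores biases and layer norms, and yields a figure somewhat above the paper's headline 117M):

```python
# GPT-2 "small" hyperparameters, as published for the 117M model.
gpt2_small = {
    "n_layer": 12,        # transformer blocks
    "n_head": 12,         # attention heads per block
    "d_model": 768,       # hidden size
    "n_ctx": 1024,        # context length
    "vocab_size": 50257,  # BPE vocabulary
}

d = gpt2_small["d_model"]
# Token + position embeddings.
embed = gpt2_small["vocab_size"] * d + gpt2_small["n_ctx"] * d
# Per block: ~4*d^2 for attention (Q, K, V, output projections)
# plus ~8*d^2 for the MLP (d -> 4d -> d), biases/norms ignored.
per_block = 12 * d ** 2
total = embed + gpt2_small["n_layer"] * per_block
print(f"~{total / 1e6:.0f}M parameters")
```

Running this gives a count in the low hundreds of millions, which is a useful sanity check when wiring up the layers: if your implementation's parameter count lands far from this, a shape is probably wrong.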

Have read:

Survey

  • Open-domain Dialogue Generation: What We Can Do, Cannot Do, And Should Do Next
  • (2020) Towards Unified Dialogue System Evaluation
  • (2022) A Survey for In-context Learning
  • Emergent Abilities of Large Language Models

Dialogue systems

  • (2015) Neural Responding Machine for Short-Text Conversation
  • (2015) A Neural Conversational Model
  • (2018) Extending Neural Generative Conversational Model using External Knowledge Sources
  • (2016) A Persona-Based Neural Conversation Model
  • (2018b) Personalizing Dialogue Agents: I have a dog, do you have pets too?
  • (2018b) [Note] Personalizing Dialogue Agents: I have a dog, do you have pets too?
  • (2018) An Auto-Encoder Matching Model for Learning Utterance-Level Semantic Dependency in Dialogue Generation
  • (2020) Enhancing Dialogue Generation via Multi-Level Contrastive Learning
  • (2020a, Zhang) Modeling Topical Relevance for Multi-Turn Dialogue Generation
  • (2017, Ghazvininejad) A Knowledge-Grounded Neural Conversation Model
  • (2022, Ju) Learning to Improve Persona Consistency in Multi-party Dialogue

LLMs

  • (2018, Radford) Improving Language Understanding by Generative Pre-Training
  • (GPT-2) Language Models are Unsupervised Multitask Learners
  • (GPT-3) Language Models are Few-Shot Learners
  • (InstructGPT) Training Language Models to Follow Instructions with Human Feedback
  • (Codex) Evaluating Large Language Models Trained on Code
  • GPT-4 Technical Report

Pre-trained models (PTM)

  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  • (ALBERT) A Lite BERT for Self-supervised Learning of Language Representations
  • (RoBERTa) A Robustly Optimized BERT Pretraining Approach
  • (T5) Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Reinforcement learning

  • (PPO) Proximal Policy Optimization Algorithms

Other

  • (2002) BLEU: a Method for Automatic Evaluation of Machine Translation
  • (2019, Feng) Answer-guided and Semantic Coherent Question Generation
  • (2020, Feng) A Co-Attention Neural Network Model for Emotion Cause Analysis with Emotional Context Awareness
  • (Transformer) Attention Is All You Need
  • (Latent Diffusion) High-Resolution Image Synthesis with Latent Diffusion Models

Personal experience

  When I graduated from high school, I wanted to build the AI depicted in so many works of fantasy (Atri, for example), so I listed the Artificial Intelligence major at Northeastern University (China) as my first choice. My thinking at the time was: as long as such an AI appears, it doesn't matter whether I am the one who creates it. Back then I only knew terms like "neural network" and "reinforcement learning" without knowing what they meant.

  In my freshman year, I read an introduction to AI, but only picked up broad concepts and history such as "symbolism", "connectionism", and "AGI", and knew nothing about the concrete techniques. So I went looking for material on neural networks. Unfortunately, lacking the mathematical background, my study did not go smoothly, though I did grasp that a neural network influences its decisions through "weights". At that point my AI studies stalled, and during this time I joined the school's ACM team.

  In the first semester of my sophomore year, I joined a Baidu AI training course through a school channel and learned about the two directions of CV and NLP. I felt that NLP touches the soul more, so I chose NLP.

  In the second semester of my sophomore year, after finishing the optimization course, learning deep learning no longer posed major obstacles, but my knowledge was not broad enough. Believing I had found the right direction, I chose to quit the ACM team (you can only compete in the ACM-ICPC starting in the third year; it was a pity, but I could no longer afford to spend time there) and put my focus on NLP. Over the summer I completed most of the projects in the Baidu AI course, confirmed that among all the NLP techniques, "open-domain dialogue systems" are the closest to the AI I want, and started reading the related papers. From that point until now, ideas about AI have kept occurring to me, and I write each one down in a note (in my later studies, I discovered that many of these ideas had already been proven effective).

  Also starting that summer, various AI technologies began to be applied to the ACG world. First speech synthesis: Tacotron2, TTS, and VITS (August); then dialogue systems: Caiyun Xiaomeng 2.0 (August); and finally image synthesis: NovelAI (released in August, went viral in October).

  In the first semester of my junior year, the major courses began, I worked out the underlying principles and logic, and ideas would occasionally pop up during class. Under the pressure of graduate admissions, I also started grinding my GPA. During the winter break I got in touch with Prof. Feng Shi at my university; after learning about my situation, he asked me to survey the current dialogue systems. This was an important turning point. I tried most of the promising dialogue systems available at the time (January); unfortunately, most of them were "artificial stupidity". Luckily, three products were not: ChatGPT, Character.AI, and Glow (Neuro-sama was still banned and MOSS had not yet been released, so I won't comment on them). Glow is a domestic Chinese product; I had almost lost hope in domestic products, and finding Glow restored my confidence. Since ChatGPT was the only one of the three that had disclosed technical details, I read the papers related to ChatGPT. From then on, I shifted from dialogue systems to LLMs, and I believe LLMs have the greatest potential to realize the AI I envisioned in high school.

  Now, in the second semester of my junior year, I am no longer satisfied with merely seeing such an AI appear; I want to take part in building it.

Future vision

LLMs + Memory + Reinforcement learning + Multimodal = AGI
