1. Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville (2016)
This is a classic textbook of the field (a book, not a paper), covering the fundamentals of deep neural networks and the techniques used to train them.
2. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova (2018)
This NLP paper introduces BERT (Bidirectional Encoder Representations from Transformers), a pretrained language model that achieved state-of-the-art results across a wide range of NLP tasks.
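The core idea behind BERT's pretraining can be sketched in a toy form (this is an illustrative snippet, not the paper's code; only the ~15% masking rate is taken from the paper, the tokenizer-free word list and helper name are made up here):

```python
import random

# Toy sketch of BERT's masked-language-modeling setup: randomly replace
# ~15% of tokens with a [MASK] token; the model is then trained to predict
# the original tokens from BOTH left and right context, which is what
# "bidirectional" refers to. (Illustrative only; real BERT also sometimes
# keeps or randomly swaps the chosen tokens.)
def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=1):
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    masked, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(mask_token)
            targets.append(tok)    # loss is computed only on these positions
        else:
            masked.append(tok)
            targets.append(None)   # no prediction target here
    return masked, targets

sentence = "the cat sat on the mat".split()
masked, targets = mask_tokens(sentence)
```

The `targets` list records which original tokens the model has to recover; unmasked positions contribute nothing to the training loss.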
3. Language Models are Unsupervised Multitask Learners by Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei and Ilya Sutskever (2019)
This is the GPT-2 paper: a generatively pretrained Transformer language model that produces coherent text and performs well on many NLP tasks without task-specific training.
4. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale by Alexey Dosovitskiy et al. (2020)
This paper introduces the Vision Transformer (ViT), which applies the Transformer architecture to image recognition and matches or exceeds convolutional neural networks on standard benchmarks.
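ViT's input pipeline, splitting an image into fixed-size patches that are then treated as a token sequence, can be sketched as follows (a minimal NumPy illustration; the 16x16 patch size and 224x224 input follow the paper, while the embedding dimension and random projection here are placeholder assumptions):

```python
import numpy as np

# Sketch of ViT's patch embedding: cut the image into non-overlapping
# 16x16 patches, flatten each patch, and linearly project it. The
# resulting sequence of patch embeddings (plus position embeddings and a
# class token, omitted here) is fed to a standard Transformer encoder.
def patchify(image, patch=16):
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    x = image.reshape(h // patch, patch, w // patch, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)              # group the two patch axes
    return x.reshape(-1, patch * patch * c)     # (num_patches, patch_dim)

rng = np.random.default_rng(0)
img = rng.standard_normal((224, 224, 3))        # one 224x224 RGB image
tokens = patchify(img)                          # (196, 768) patch vectors
proj = rng.standard_normal((768, 512))          # placeholder projection matrix
embeddings = tokens @ proj                      # (196, 512) token embeddings
```

A 224x224 image with 16x16 patches yields 14 x 14 = 196 tokens, which is where the paper's "an image is worth 16x16 words" framing comes from.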
5. Zero-Shot Text-to-Image Generation by Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen and Ilya Sutskever (2021)
This paper introduces DALL·E, a model that generates high-quality images from natural-language text descriptions.