AI Reading List: Technologies: Language & Speech

Below is a portion of my informal list of readings related to Artificial Intelligence (AI). It started out as a very short list created for use in conjunction with an academic presentation and has since grown much larger. The list is very idiosyncratic and not meant to be comprehensive. Please let me know if you have any corrections, additions, or suggestions, and feel free to share it with others.

Artificial Intelligence (AI) Reading List, by Philip Rubin

Technologies — Language & Speech:

Matthew Zucca. Thanks to AI, universal translators aren't just the stuff of science fiction anymore. Android Police, Oct. 10, 2023.

Steven Piantadosi. Modern language models refute Chomsky’s approach. lingbuzz/007180, March 2023.

Stephen Ornes. The Unpredictable Abilities Emerging From Large AI Models. Quanta Magazine, Mar. 16, 2023.

Ian Roberts, Noam Chomsky, and Jeffrey Watumull. Noam Chomsky: The False Promise of ChatGPT. The New York Times, Mar. 8, 2023.

Danny Driess, Fei Xia, Mehdi S. M. Sajjadi, Corey Lynch, Aakanksha Chowdhery, Brian Ichter, Ayzaan Wahid, Jonathan Tompson, Quan Vuong, Tianhe Yu, Wenlong Huang, Yevgen Chebotar, Pierre Sermanet, Daniel Duckworth, Sergey Levine, Vincent Vanhoucke, Karol Hausman, Marc Toussaint, Klaus Greff, Andy Zeng, Igor Mordatch and Pete Florence. PaLM-E: An Embodied Multimodal Language Model. Google Research, March 6, 2023.

OpenAI. Forecasting Potential Misuses of Language Models for Disinformation Campaigns—and How to Reduce Risk. OpenAI, Jan. 11, 2023.

Leyland Cecco. Death of the narrator? Apple unveils suite of AI-voiced audiobooks. The Guardian, Jan. 4, 2023.

Carlos Baquero. Is Having AI Generate Text Cheating? Communications of the ACM, December 2022, Vol. 65 No. 12, 6-7.

Nathan Lambert, Louis Castricato, Leandro von Werra, and Alex Havrilla. Illustrating Reinforcement Learning from Human Feedback (RLHF). Hugging Face, Dec. 9, 2022.

Gary Marcus. How come smart assistants have virtually no ability to converse, despite all the spectacular progress with large language models? Nov. 22, 2022.

Ron Amadeo. Amazon Alexa is a “colossal failure,” on pace to lose $10 billion this year. Ars Technica, Nov. 21, 2022.

Rishi Bommasani, Percy Liang, and Tony Lee. Language Models are Changing AI. We Need to Understand Them. HAI, Stanford University, Nov. 17, 2022.

Ingrid Fadelli. Study assesses the quality of AI literary translations by comparing them with human translations. Tech Xplore, Nov. 8, 2022.

Matthew Hutson. Could AI help you to write your next paper? Nature, Oct. 31, 2022.

Gabriel Synnaeve, Yossef Mordechay Adi, Jade Copet, and Alexandre Défossez. Using AI to compress audio files for quick and easy sharing. Meta AI, Oct. 25, 2022.

Kyt Dotson. Meta builds AI-powered speech translation for Hokkien to understand unwritten languages. siliconANGLE, Oct. 19, 2022.

Nitasha Tiku. ‘Chat’ with Musk, Trump or Xi: Ex-Googlers want to give the public AI. The Washington Post, Oct. 7, 2022.

A. Tarantola. AI is already better at lip-reading than we are. engadget, Sep. 29, 2022.

Martin Curtis. Why humanity is needed to propel conversational AI. VentureBeat, Sep. 16, 2022.

Daniel Bashir. Christopher Manning: Linguistics and the Development of NLP. The Gradient podcast, Sep. 8, 2022.

Meta AI. Using AI to decode speech from brain activity. Meta AI, Aug. 31, 2022.

Emily Anthes. The Animal Translators. The New York Times, Aug. 30, 2022.

Gary Marcus. Learning Language is Harder Than You Think. garymarcus.substack.com, July 20, 2022.

Melissa Heikkilä. Inside a radical new project to democratize AI. MIT Technology Review, July 12, 2022.

Kyle Wiggers. A year in the making, BigScience’s AI language model is finally available. TechCrunch.com, July 12, 2022.

BigScience Blog. BLOOM: Introducing The World’s Largest Open Multilingual Language Model. July 12, 2022.

NLLB team. No Language Left Behind: Scaling Human-Centered Machine Translation. arXiv:2207.04672, July 11, 2022.

Grégoire Delétang, Anian Ruoss, Jordi Grau-Moya, Tim Genewein, Li Kevin Wenliang, Elliot Catt, Marcus Hutter, Shane Legg, and Pedro A. Ortega. Neural Networks and the Chomsky Hierarchy. arXiv:2207.02098, July 5, 2022.

Gary Marcus. Three ideas from linguistics that everyone in AI should know. garymarcus.substack.com, June 22, 2022.

Timnit Gebru and Margaret Mitchell. We warned Google that people might believe AI was sentient. Now it’s happening. The Washington Post, June 17, 2022.

Gary Marcus. Nonsense on Stilts: No, LaMDA is not sentient. Not even slightly. garymarcus.substack.com, June 12, 2022.

Nitasha Tiku. The Google engineer who thinks the company’s AI has come to life. The Washington Post, June 11, 2022.

Gary Marcus. Noam Chomsky and GPT-3. garymarcus.substack.com, May 21, 2022.

Matthew Hutson. Taught to the Test. Science, issue 6593, May 6, 2022.

Bhaskar Ammu. GPT-3: All you need to know about the AI language model. Sigmoid.com, April 2022.

Steven Johnson. A.I. Is Mastering Language. Should We Trust What It Says? The New York Times, April 15, 2022.

Avalon Nuovo. Researchers Gain New Understanding From Simple AI. QuantaMagazine.org, Apr. 14, 2022.

Sergios Karagiannakos. Speech synthesis: A review of the best text to speech architectures with Deep Learning. AI Summer, May 13, 2021.

Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, March 2021, 610–623.

Matthew Hutson. Robo-writers: the rise and risks of language-generating AI. Nature.com, March 3, 2021.

Xu Tan, Tao Qin, Frank Soong, and Tie-Yan Liu. A Survey on Neural Speech Synthesis. arXiv:2106.15561, 2021.

Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin, and Sameer Singh. Beyond Accuracy: Behavioral Testing of NLP models with CheckList. Association for Computational Linguistics (ACL), 2020.

James L. McClelland, Felix Hill, Maja Rudolph, Jason Baldridge, and Hinrich Schütze. Placing language in an integrated understanding system: Next steps toward human-level performance in neural language models. Proceedings of the National Academy of Sciences, 117(42), 25966-25974, 2020. DOI: 10.1073/pnas.1910416117.

Lewis, Jason Edward, ed. Indigenous Protocol and Artificial Intelligence Position Paper. Honolulu, Hawaiʻi: The Initiative for Indigenous Futures and the Canadian Institute for Advanced Research (CIFAR). 2020. DOI: 10.11573/spectrum.library.concordia.ca.00986506.

Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. arXiv:1804.07461. 2018. (See, also: GLUE benchmark.)

Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, and Oyvind Tafjord. Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge. arXiv:1803.05457. 2018.

Alexandre Gonfalonieri. How Amazon Alexa works? Your guide to Natural Language Processing (AI). Towards Data Science, Nov. 21, 2018.

Awni Hannun. Speech Recognition is Not Solved. Github, Oct. 11, 2017.

Christopher D. Manning. Last Words: Computational Linguistics and Deep Learning: A look at the importance of Natural Language Processing. Nautilus, March 16, 2017.

Steven Levy. The iBrain Is Here—and It’s Already Inside Your Phone: An exclusive inside look at how artificial intelligence and machine learning work at Apple. Wired.com, Aug. 24, 2016.

John H. L. Hansen and Taufiq Hasan. Speaker Recognition by Machines and Humans: A tutorial review. IEEE Signal Processing Magazine, 32, 74-99. 2015.

G. Hinton, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition. IEEE Signal Processing Magazine, vol. 29, no. 6, 82-97, Nov. 2012. (See PDF.)

Vikramjit Mitra, Hosung Nam, Carol Y. Espy-Wilson, Elliot Saltzman, and Louis Goldstein. Retrieving Tract Variables From Acoustics: A Comparison of Different Machine Learning Strategies. IEEE Journal of Selected Topics in Signal Processing, Vol. 4, #6, 1027-1045, Dec. 2010.

Takaaki Kuratate, Kevin G. Munhall, Philip E. Rubin, Eric Vatikiotis-Bateson, and Hania Yehia. Audio-Visual Synthesis of Talking Faces From Speech Production Correlates. Sixth European Conference on Speech Communication and Technology, EuroSpeech 1999, Paper K013, Budapest, Hungary, September 5-9, 1999.

John Hogden, Elliot Saltzman, and Philip Rubin. Unsupervised neural networks that use a continuity constraint to track articulators. The Journal of the Acoustical Society of America, 92, 2477, 1992.

Terrence J. Sejnowski and Charles R. Rosenberg. NETtalk: A parallel network that learns to read aloud. JHU/EECS-86/01: The Johns Hopkins University, Electrical Engineering and Computer Science Department, 1986.

D. E. Rumelhart and J. L. McClelland. On Learning the Past Tenses of English Verbs. Chapter 18 in David E. Rumelhart, James L. McClelland and PDP Research Group. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, 216-268, 1987.

Wikipedia. Amazon Alexa.

Wikipedia. Deep learning-based synthesis.

Wikipedia. ChatGPT.

Wikipedia. GPT-3.

Wikipedia. Siri.

Wikipedia. Speech recognition.

(See, also, the ChatGPT section in this AI Reading List.)
