【LangAI KickOff #3】 Tohoku University Language AI Research Center Opening Symposium, Invited Commemorative Lecture 1: Llion Jones (Sakana AI)

22 Mar 2024 · 38:55

TL;DR: Llion Jones, co-author of the influential 'Attention is All You Need' paper and co-founder of Sakana AI, shares his journey from a small Welsh village to leading AI innovations. He discusses the Transformer model's impact on AI and his belief in character-level language modeling, emphasizing its potential, especially for languages like Japanese. Jones also highlights the ability of language models to perform complex tasks despite limited explicit training, suggesting a future where character-level models could prevail, possibly extending to byte-level language modeling, and notes the challenges of multilingual text processing.


  • 🎉 Llion Jones is a co-author of the influential Transformer paper and co-founder of Sakana AI.
  • 🏆 Llion Jones started his career at Google, working on YouTube before moving into AI research.
  • 🤖 His work has focused on character-level language modeling, which he believes is a promising direction for AI.
  • 🧠 His research has shown that character-level modeling can be more effective, especially for languages with rich morphology like Japanese.
  • 🌐 He highlighted the limitations of word-level models, such as out-of-vocabulary issues, and the advantages of character-level models.
  • 📚 He discussed his experience working on Google Maps to improve pronunciations using character-level Transformers.
  • 🔍 He shared insights on the ability of language models to perform tasks like spelling and understanding nuances despite not being explicitly trained on them.
  • 🌟 The 'Attention is All You Need' paper significantly impacted the field of AI, and Jones's work has been influential in the development of deep learning models.
  • 💡 His presentation emphasized the potential of character-level language models for tasks like question answering and understanding place names.
  • 🔧 He suggested that future research could explore adaptive computation to optimize character-level models and possibly move towards byte-level or even audio-level modeling.
  • 🌍 He is interested in multilingual capabilities and the representation of high-level concepts in language models, especially for character-level processing.

Q & A

  • Who is the guest speaker at the Tohoku University Language AI Research Center's symposium?

    -The guest speaker is Llion Jones from Sakana AI.

  • What is Llion Jones known for in the AI field?

    -Llion Jones is known for co-authoring the famous Transformer paper, which had a significant impact on the AI field.

  • What is the significance of the Transformer model in AI?

    -The Transformer model introduced the concept of attention mechanisms, which has become a fundamental part of many AI models, including those used in natural language processing like GPT.

  • What is the meaning behind the name and logo of Sakana AI?

    -The name Sakana AI ('sakana' is Japanese for 'fish') and its logo represent the idea of swimming away from the norm and doing something different, inspired by nature's collective intelligence; the logo also alludes to the children's story 'Swimmy', well known in Japan.

  • Why did Llion Jones choose to work on character-level modeling?

    -Llion Jones chose to work on character-level modeling to avoid out-of-vocabulary problems and to simplify the language modeling process, which he believes is particularly beneficial for languages with rich morphology like Japanese.

  • What was the issue Llion Jones faced when working on the Wiki Reading project?

    -The issue was that the models at the time were word-level and struggled with out-of-vocabulary words, which led Llion Jones to explore character-level modeling as a solution.

  • How did Llion Jones address the problem of out-of-vocabulary words in the Wiki Reading project?

    -He used pre-trained language models to handle out-of-vocabulary words more effectively, which involved freezing a pre-trained RNN language model and training another recurrent neural network on top of it.
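    The freeze-and-stack pattern described here can be illustrated in miniature. The sketch below is invented for illustration and is not the talk's actual system: it uses toy sizes, a hand-rolled recurrent cell in place of a pre-trained RNN language model, and a linear readout in place of the second recurrent network, but it shows the key idea of training a new component on top of frozen pre-trained states.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # "Pre-trained" recurrent cell: these weights are frozen (never updated).
    W_in = rng.standard_normal((8, 16)) * 0.3
    W_rec = rng.standard_normal((16, 16)) * 0.3

    def frozen_states(seq):
        # Run the frozen RNN over a sequence of 8-dim inputs, return final state.
        h = np.zeros(16)
        for x in seq:
            h = np.tanh(x @ W_in + h @ W_rec)
        return h

    # Trainable readout stacked on top of the frozen states.
    W_out = np.zeros(16)

    def train_step(seq, target, lr=0.1):
        global W_out
        h = frozen_states(seq)              # gradients never touch W_in / W_rec
        pred = h @ W_out
        W_out -= lr * (pred - target) * h   # squared-error gradient step
        return (pred - target) ** 2

    seq = rng.standard_normal((5, 8))
    losses = [train_step(seq, target=1.0) for _ in range(50)]
    print(losses[0] > losses[-1])  # readout improves while the RNN stays frozen
    ```

    The point is that updates touch only `W_out`; the frozen weights act purely as a feature extractor, which is how a pre-trained language model can be reused for a downstream task.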

  • What is the advantage of character-level language models over word-level models?

    -Character-level language models can handle any vocabulary, including rare and new words, since they process text at the character level, thus avoiding out-of-vocabulary issues.
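    This advantage is easy to see in code. A minimal sketch, with toy vocabularies invented for illustration:

    ```python
    def word_ids(text, vocab):
        # Word-level: any unseen word collapses to a single <UNK> id.
        return [vocab.get(w, vocab["<UNK>"]) for w in text.split()]

    def char_ids(text, alphabet):
        # Character-level: every word decomposes into known characters.
        return [alphabet[c] for c in text]

    word_vocab = {"<UNK>": 0, "the": 1, "cat": 2}
    alphabet = {c: i for i, c in enumerate("abcdefghijklmnopqrstuvwxyz ")}

    print(word_ids("the zorble", word_vocab))  # the made-up word becomes <UNK>
    print(char_ids("zorble", alphabet))        # fully representable
    ```

    The unseen word "zorble" loses its identity under the word-level scheme but is fully recoverable from its characters, which is the out-of-vocabulary argument in miniature.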

  • What was Llion Jones' role in improving Google Maps' pronunciations for place names in Japan?

    -Llion Jones worked on a character-level Transformer model that analyzed place names written in kanji and the surrounding context to improve the pronunciation accuracy in Google Maps.

  • Why is Llion Jones interested in exploring character-level language modeling further?

    -Llion Jones is interested in character-level language modeling because it offers a more natural and flexible approach to language processing, especially for languages like Japanese that have complex morphological structures.

  • What is the potential future of character-level language models according to Llion Jones?

    -Llion Jones believes that character-level language models will eventually become the standard due to their simplicity and effectiveness, and he is also open to the possibility of moving directly to audio-level language models if computational resources allow.



🎤 Welcoming Llion Jones to the Stage

Llion Jones, a former Google engineer and co-author of the influential Transformer paper, is introduced as the special guest speaker at a university event. He is recognized for his significant contributions to the AI field and his recent venture as a co-founder and the Chief Technology Officer of Sakana AI. The speaker expresses excitement about meeting Llion Jones in person and looks forward to a shared lunch, followed by Jones's talk, in which he promises to discuss his background, the Transformer model, and his advocacy for character-level modeling in AI.


🏞 Llion Jones' Background and Journey to AI

Llion Jones shares his personal background, starting from his Welsh roots in a small village and his initial employment at Google's YouTube branch. He recounts his transition from YouTube to Google Research during the rise of deep learning in 2015. Despite the challenges of moving to California and later to Japan just before the pandemic, Jones details his decision to leave Google after a decade to establish Sakana AI. He also discusses the inspiration behind the company's name and logo, emphasizing the desire to explore alternative approaches to AI beyond large language models.


📜 The Impact of the 'Attention is All You Need' Paper

Llion Jones reflects on the creation and impact of the 'Attention is All You Need' paper, which introduced the Transformer model. He describes the process of developing a visualization tool to demonstrate the attention layer's capabilities, highlighting a breakthrough moment in AI where models could perform common sense reasoning without explicit programming. The title's origin story is shared, revealing how the now-iconic phrase came to be and its widespread adoption in the AI community.


🔠 Pioneering Character-Level Language Modeling

The speaker delves into his early work with character-level language modeling, motivated by the limitations of word-level models and the desire to avoid out-of-vocabulary issues. He recounts the development of a pre-trained RNN model for question answering and the surprising effectiveness of character-level models across languages, especially those with rich morphology. The discussion underscores the benefits of character-level modeling and the speaker's ongoing advocacy for this approach.


🌐 Addressing Pronunciation Challenges with Character-Level Models

Llion Jones discusses his work on improving the pronunciation of place names in Google Maps, leveraging character-level Transformers to analyze and correct the pronunciation based on neighboring data. He emphasizes the importance of direct access to characters for such tasks and suggests that character-level language modeling is a natural fit for various applications, including those specific to the Japanese language.


🤖 The Limitations and Potential of Current Language Models

The speaker examines the current state of language models, focusing on their ability to spell and perform tasks despite not being explicitly trained on character-level information. He uses examples of image generation and language model failures to argue for the power and potential of character-level language models. Jones suggests that character-level models could resolve issues with spelling and improve performance on specific tasks, emphasizing the importance of research in this area.


🌟 The Future of Character-Level Language Modeling

In his concluding thoughts, Llion Jones expresses his belief in the inevitability of character-level language modeling due to its simplicity and effectiveness. He anticipates that advances in computation will make it the standard and suggests that it may even evolve into byte-level language modeling. Jones also addresses the potential of character-level models for Japanese and other languages, hinting at future research directions, including work in his native Welsh and possibly aiding low-resource languages.


🤝 Engaging with the Audience and Envisioning Multilingual Models

The session concludes with a Q&A segment where Llion Jones addresses various questions about character-level language models, including their potential for handling multiple languages, the challenges of building word-level meanings, and the possibility of incorporating phonetic information. He also considers the future of language models, contemplating the impact of audio-based models and their ability to convey nuances like emotion and stress.



💡Llion Jones

Llion Jones is a notable figure in the field of artificial intelligence, particularly recognized for his work as an engineer at Google and his contribution to the 'Attention is All You Need' paper, which introduced the Transformer model. His role in the script is that of the main speaker, sharing his experiences and insights into AI development. He is also a co-founder of Sakana AI, where he serves as the Chief Technology Officer, indicating his ongoing influence in the AI industry.


💡Transformers

In the context of this video, Transformers refer to a type of deep learning architecture that is pivotal in the field of natural language processing. The model was introduced in the paper co-authored by Llion Jones and has significantly impacted AI by enabling better understanding and generation of human-like text. The script discusses the Transformers' role in character-level modeling and their evolution from the early days of deep learning.

💡Sakana AI

Sakana AI is a startup company co-founded by Llion Jones, where he holds the position of Chief Technology Officer (CTO). The company's name and logo are highlighted in the script as symbols of innovation and a departure from conventional approaches in AI. Sakana AI is positioned as a company that seeks to explore alternative paths in AI development, such as character-level modeling, rather than focusing solely on scaling up existing models.

💡Character-level modeling

Character-level modeling is a concept within natural language processing where language models operate at the character level rather than the word or sub-word level. This approach is discussed in the script as a potential solution to out-of-vocabulary issues and is highlighted as a focus area for Llion Jones and Sakana AI. The script mentions that character-level modeling could be particularly beneficial for languages with rich morphology or extensive use of unique characters.


💡Google

Google is a multinational technology company that has been a significant player in the development of AI technologies. In the script, Llion Jones mentions his tenure at Google, where he worked on YouTube and later in Google Research, contributing to the advancement of AI, particularly in the development of the Transformer model. His experience at Google is part of his professional background that informs his current work and perspectives.

💡Attention mechanism

The attention mechanism is a key component of the Transformer model, allowing the model to weigh the importance of different parts of the input data when making predictions. The script describes a visualization technique developed by Llion Jones to demonstrate how the attention mechanism works, showing which words the model focuses on to understand sentence meaning. This mechanism is crucial for the model's ability to perform tasks like translation and understanding context.
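The core computation behind this mechanism is small enough to write out. Below is a minimal NumPy sketch of scaled dot-product attention, with toy sizes and none of the learned projections or multiple heads of a full Transformer layer:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of each query to each key
    weights = softmax(scores, axis=-1)   # rows sum to 1: where to "attend"
    return weights @ V, weights

# 3 tokens, 4-dimensional representations (self-attention: Q = K = V)
rng = np.random.default_rng(0)
Q = K = V = rng.standard_normal((3, 4))
out, w = attention(Q, K, V)
print(w.sum(axis=-1))  # each row of the attention weights sums to 1
```

Each row of `weights` is a distribution over the input tokens, showing how strongly one position attends to the others; this is the quantity that attention visualizations display.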

💡Co-reference resolution

Co-reference resolution is a natural language processing task that involves identifying when two or more words or phrases in a text refer to the same entity. In the script, Llion Jones discusses how the Transformer model, through its attention mechanism, was able to perform co-reference resolution automatically, which was previously a complex problem in AI that required explicit programming.

💡Deep learning revolution

The deep learning revolution refers to a period in the development of AI where deep learning techniques, particularly neural networks, began to achieve unprecedented success in various tasks, including image and speech recognition, and natural language processing. The script mentions this period as the backdrop against which Llion Jones moved from YouTube to Google Research and started working on projects that led to the development of the Transformer model.

💡Language model pre-training

Language model pre-training is a technique where a language model is first trained on a large corpus of text to learn the patterns and structures of a language before being fine-tuned for specific tasks. In the script, Llion Jones talks about his early work on character-level pre-training of RNNs and how this approach became more effective with the advent of deep Transformers, leading to significant advancements in language modeling.

💡Morphological languages

Morphological languages are those in which words are formed by combining smaller meaningful units called morphemes. The script suggests that character-level modeling might be particularly advantageous for such languages because they tend to have a large number of words formed through various morphological processes. Llion Jones speculates that languages like Japanese, with its extensive use of conjugation, might benefit from character-level modeling.


Llion Jones, co-author of the influential Transformer paper, delivers a keynote speech at the Tohoku University Language AI Research Center symposium.

Llion Jones is a co-founder of Sakana AI, where he serves as the Chief Technology Officer, focusing on character-level modeling in AI.

Jones discusses his background, from being an engineer at Google to his current role at Sakana AI.

The Transformer model, co-introduced by Jones, had a significant impact on the field of AI, particularly in natural language processing.

Jones shares his personal journey from a small Welsh village to working at Google and eventually founding Sakana AI.

The importance of character-level modeling is emphasized, as it can potentially offer more flexibility and power than word-level approaches.

Sakana AI's approach to AI contrasts with the mainstream focus on scaling up language models, advocating for a nature-inspired, collective intelligence.

In the work behind the 'Attention is All You Need' paper, simplifying the model by removing convolutions unexpectedly improved performance.

Jones' work on character-level language modeling showed promising results, even in languages with rich morphology like Japanese.

The title 'Attention is All You Need' became iconic in the AI community, and Jones shares the story behind its creation.

Character-level language models can address out-of-vocabulary issues and may be more suitable for languages with complex scripts.

Jones' research at Google Japan focused on improving the pronunciation of place names in Google Maps using character-level Transformers.

The potential of character-level language models to improve tasks like image generation and spelling is highlighted.

Jones envisions character-level or even byte-level language modeling as the future of AI, despite current dominance of word-level models.

The Q&A session explores the challenges and opportunities of character-level modeling for multilingual support and incorporating paralinguistic information.

Llion Jones concludes by emphasizing the need for further research into character-level language modeling, especially for Japanese and other languages.