University of Yamanashi Electronic Syllabus - Course Data

Course title: Natural Language and Image Media Processing (言語・画像メディア処理特論)
Instructors: Fumiyo Fukumoto / Takahiko Furuya

Timetable number: GTK511
Credits / Course / Year of study: (not registered)
Term: First semester
Day: Wednesday
Period: III

[Overview and Objectives]

[Natural Language and Image Media Processing]
This course covers fundamental topics in Natural Language Processing (NLP) and image media processing. The course is split into two parts and taught by two instructors (Fumiyo Fukumoto and Takahiko Furuya).

The first half of the course will focus on text analysis and generation, providing an overview of methods based on deep learning, which has become the mainstream approach in recent years. After an introduction to the basics of text processing, students will learn the fundamentals of Python and deep learning, which will be used in the course exercises. The course will then cover distributed representations of words, specifically Word2Vec, explaining why they are needed and how they work through hands-on exercises.
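
As a rough illustration of the mechanism behind Word2Vec, the skip-gram variant is trained on (center word, context word) pairs drawn from a sliding window over the text. The sketch below only generates such pairs; it does not train a model. The function name `skipgram_pairs`, the window size, and the example sentence are assumptions made for illustration and are not taken from the course materials.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical example sentence, split on whitespace.
sentence = "deep learning changed natural language processing".split()
pairs = skipgram_pairs(sentence, window=1)
print(pairs[:3])
# [('deep', 'learning'), ('learning', 'deep'), ('learning', 'changed')]
```

A skip-gram model would then learn word vectors such that each center word predicts its context words well, which is why words appearing in similar contexts end up with similar vectors.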

The latter half of the course will focus on the analysis of visual information, such as 2D images and 3D shapes. Specifically, students will learn techniques for extracting features from 2D image data using machine learning (primarily deep learning) and analyzing images based on these features. Additionally, the course will cover multimodal learning techniques that associate images with text, incorporating hands-on exercises.
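
To make "extracting features from 2D image data" concrete, the sketch below slides a small kernel over a toy grayscale image, which is the basic operation inside a convolutional layer. It is a minimal pure-Python illustration under stated assumptions: the helper `conv2d_valid`, the toy image, and the hand-written vertical-edge kernel are all hypothetical. In a deep network the kernel values would be learned from data rather than written by hand.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map.

    As in most deep learning frameworks, the kernel is not flipped,
    so strictly speaking this is cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            acc = 0
            for dy in range(kh):
                for dx in range(kw):
                    acc += image[y + dy][x + dx] * kernel[dy][dx]
            row.append(acc)
        out.append(row)
    return out


# Toy 4x5 grayscale image: dark on the left (0), bright on the right (9).
image = [
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
]
# Hand-written kernel that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
features = conv2d_valid(image, kernel)
print(features)  # [[27, 27, 0], [27, 27, 0]]
```

The response is large where the window straddles the edge and zero in the uniformly bright region, so the output map can be read as a simple "edge" feature.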

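This course focuses on the analysis of text and 2D images, and teaches how to implement such analysis using deep learning, image processing, and related techniques.
(Omnibus format: 15 sessions in total; Fumiyo Fukumoto, 7 sessions; Takahiko Furuya, 8 sessions)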

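Note that this course corresponds to the specialized knowledge and skills (C3) defined in the diploma policy of the Computer Science and Engineering Course.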

[Learning Goals]

1. Be able to understand and explain methods for text analysis.
2. Be able to understand, explain, and implement image analysis and classification techniques using deep learning.

[Prerequisites and Preparation]

Knowledge of mathematics such as linear algebra and calculus, programming skills, and an understanding of algorithms and data structures are required. Basic knowledge of image representation and fundamental filtering techniques is also desirable. Additionally, a foundational understanding of machine learning concepts such as clustering, support vector machines, and neural networks is beneficial.
Python will be used as the programming language for the exercises. If you have no prior experience with Python, it is recommended to review basic syntax and how to handle classes in advance. Google Colab will be used for exercises. Therefore, if you do not already have a Google account, please create one.

Even without prior Python experience, students with programming skills in languages such as Java or C++ should be able to learn the level of Python used in this course through self-study.

[Grading Criteria]

No.	Assessment item	Weight	Assessment criteria
1	Exam: end of term	50 %	Sessions 1-7 (the first half of the course) will be evaluated through reports (50%).
2	Exam: midterm	50 %	Sessions 8-15 (the latter half of the course) will be evaluated through reports (50%).

[Textbooks]

(not registered)

[Reference Books]

(not registered)

[Lecture Topics]

[First Half: Fukumoto]
1. Introduction, Deep Learning and Natural Language Processing
2. Introduction to Colab and Python
3. Classification Using Neural Networks (Exercises)
4. Retrieval Using the Vector Space Model (Exercises)
5. Word Embedding Representations
6. Word Embedding Representations (Exercises)
7. BERT Model
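
As a pointer to what session 4 involves, retrieval with the vector space model can be sketched as follows: documents and the query are turned into TF-IDF weighted term vectors, and documents are ranked by cosine similarity to the query. This is a minimal pure-Python sketch, not the exercise code used in class. The function names, the whitespace tokenizer, the `log(N/df) + 1` IDF variant, and the three example documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of document strings."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vecs = [{t: c * idf[t] for t, c in Counter(toks).items()} for toks in tokenized]
    return vecs, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def search(query, docs):
    """Return (score, doc_index) pairs sorted from most to least similar."""
    vecs, idf = tfidf_vectors(docs)
    q_counts = Counter(query.lower().split())
    # Query terms that never occur in the collection are ignored.
    q_vec = {t: c * idf[t] for t, c in q_counts.items() if t in idf}
    scores = [(cosine(q_vec, v), i) for i, v in enumerate(vecs)]
    return sorted(scores, reverse=True)


# Hypothetical three-document collection.
docs = [
    "word embeddings represent words as dense vectors",
    "convolutional networks extract features from images",
    "multimodal models associate images with text",
]
ranking = search("features from images", docs)
print([i for _, i in ranking])  # [1, 2, 0]
```

The second document shares all three query terms and ranks first, the third shares only "images", and the first shares none and scores zero.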

[Second Half: Furuya]
8. Human Vision and 2D Image Data Representation
9. Deep Neural Network Architectures for 2D Image Analysis
10. Effective Training of Deep Neural Networks for 2D Image Analysis
11. Advances in 2D Image Analysis Techniques (Self-Supervised Learning, Multimodal Learning)
12. Deep Neural Networks for 3D Shape Analysis
13. Exercises Using Multimodal Foundation Models
14. Exercises Using Multimodal Foundation Models
15. Exercises, Summary and Review

[Response to Improvement Requests for the Previous Year's Course]

Change of instructor.