Grant-in-Aid for Scientific Research (KAKENHI): Fund for the Promotion of Joint International Research (Fostering Joint International Research (B)) 2021-2026
Kyoto University

Events

International Workshop

Exploring the Potential of Utilizing Data from Sanskrit Literature

19-20 March 2024

Lecture Room 2, Faculty of Letters, Kyoto University (Yoshida Main Campus)

in person and online

Download Flyer: PDF 4MB

In recent years, digital techniques such as the digitization of materials and cross-referential database search have become commonplace in literary studies. In the field of Sanskrit studies, however, numerous challenges remain in creating digital data and conducting data analysis, owing to the unique characteristics of the language and materials. In this workshop, we will discuss, from the perspectives of both digital humanities and literary studies, effective ways of organizing and utilizing data that are relevant to studies in Sanskrit literature. Our aim is to explore the potential for future research that opens up new perspectives and new frontiers in this field.

Please register using the Google Form: https://forms.gle/NcDrsim8psM5xgfd9
Please choose between on-site and online participation. (A Zoom meeting link will be sent via email.)

Participation is free of charge.

After the conclusion of the first day on the 19th, a reception will be held in Lecture Room 1 of the Faculty of Letters (next to the workshop venue). (We will collect a fee from those who attend.)
We encourage you to join the reception as well.

Registration will be accepted until the end of the workshop, but if you wish to attend the reception, please apply by March 5th.

Program

March 19 (Tue) 10:30-18:30

10:30-10:40
Opening Remarks: Kyoko Amano (Kyoto University)

Development and Utilization of OCR for Indian Scripts

10:40-11:20
The Paleographic Database Indoskript – Design and Future Applications
Oliver Hellwig (University of Zurich)
11:20-11:45
break
11:45-12:45
Development of Devanagari OCR: From Typeset to Handwritten Letters
Takahiro Kato (University of Tokyo)
Yuki Tomonari (University of Tokyo)
12:45-13:15
Discussion
13:15-15:00
lunch break

Creating and Utilizing Data of Sanskrit Texts and Non-Textual Data

15:00-16:00
Efficiency in Text Reading through OCR and Text-mining
Yoichi Iwasaki (Nagoya University)
16:00-17:00
Creating Vedic Texts and the Challenge of Identifying the Same Words
Yuzuki Tsukagoshi (University of Tokyo)
17:00-17:15
break
17:15-18:00
Visual Analytics of Intertextual Relationship Using a Mantra Index
Hiroaki Natsukawa (Osaka Seikei University)
Kyoko Amano (Kyoto University)
18:00-18:30
Discussion (Comments by Kiyonori Nagasaki (International Institute for Digital Humanities))
18:45-21:00
reception (Faculty of Letters, Lecture Room 1)

March 20 (Wed) 9:30-16:30

Analysis of Internal Structure and Chronology Using Vedic Corpus, and its Visualization

9:30-10:30
Dating the Vedic Corpus
Oliver Hellwig (University of Zurich)
10:30-10:45
break

Analysis of Similarity in Yajurvedic Texts

10:45-11:00
Background of Similarity Analysis in Yajurvedic Texts
Kyoko Amano (Kyoto University)
11:00-11:45
A Corpus Linguistic Analysis of Intertextuality in Vedic Literature using TRACER and Stylo
So Miyagawa (NINJAL)
11:45-12:30
Reassessment of Similarity Measures for Sanskrit: Word2Vec and Transformers
Yuki Kyogoku (Leipzig University)
12:30-12:40
Evaluation of Analysis Results
Kyoko Amano (Kyoto University)
12:40-13:10
Discussion
13:10-15:00
lunch break
15:00-16:00
VL2: Visualization of Linguistic Layers in Vedic Literature
Hiroaki Natsukawa (Osaka Seikei University)
Kyoko Amano (Kyoto University)
16:00-16:20
Discussion
16:20-16:30
Closing Remarks

 

The 2nd and concluding Workshop

Ancient India meets Data-Science

2022/2/11 (Friday, Japanese National Day)

16:00-19:00 JST (= 8:00-11:00 CET)

SPIRITS project “Chronological and Geographical Features of Ancient Indian Literature Explored by Data-Driven Science”

This workshop also serves as the kick-off for the Joint International Research project “A Study of Language Layers in Vedic Literature for the Development of a Program for Age-Estimation”

Flyer Download (PDF 601KB)

16:00-16:30 JST (= 8:00-8:30 CET)
The Result of the Two-Year SPIRITS Project and Our Vision for the Next Research.
Kyoko Amano (Kyoto University, Hakubi Center / Institute for Research in Humanities)
16:30-17:00 JST (= 8:30-9:00 CET)
Visualization meets Ancient India: Mapping the Structure of Vedic Texts
Hiroaki Natsukawa (Kyoto University, Academic Center for Computing and Media Studies)
17:00-17:30 JST (= 9:00-9:30 CET)
“One Step Further: Assessing Semantic Similarity in Sanskrit Using Word Embeddings with a Weighting Factor”
Yuki Kyogoku (Leipzig University, Indology)
17:30-17:45 JST (= 9:30-9:45 CET)
Break
17:45-18:15 JST (= 9:45-10:15 CET)
“Computational Stylometric Analysis on Intertextuality in Historical Written Languages: A Case Study of Coptic”
So Miyagawa (Kyoto University, Graduate School of Letters / Center for Cultural Heritage Studies and Inter Humanities)
18:15-18:45 JST (= 10:15-10:45 CET)
Dependency parsing of Vedic Sanskrit – Algorithms and linguistic conclusions
Oliver Hellwig, Sebastian Nehrdich, Sven Sellmer (Dusseldorf University, Institute for Language and Information)
18:45-19:00 JST (= 10:45-11:00 CET)
Discussion and Concluding Remarks: Oliver Hellwig

Please register using the Google Form. The Zoom Meeting ID and password will be sent to you by e-mail.
*** Closed ***

Registration is available until the end of the workshop.
No registrant limit. No registration fee.

Organizer: SPIRITS Project “Chronological and Geographical Features of Ancient Indian Literature Explored by Data-Driven Science” (Kyoko Amano, Hiroaki Natsukawa, Oliver Hellwig, Yuki Kyogoku); Fund for the Promotion of Joint International Research (Fostering Joint International Research (B)) of KAKENHI “A Study of Language Layers in Vedic Literature for the Development of a Program for Age-Estimation” (Kyoko Amano, Hiroaki Natsukawa, So Miyagawa, Oliver Hellwig, Yuki Kyogoku)

Co-Organizer: Kyoto University, Academic Center for Computing and Media Studies; Kyoko Amano Hakubi Project “Language and Social-Cultural Background of the Ancient Indian Ritual Literature”; Grant-in-Aid for Challenging Research (Exploratory) “Constructing a Database for Quantitative Analysis of Style Toward Elucidation of the Formation Process of Ancient Indian Texts” (Representative researcher: Kyoko Amano, 20K20697)

 

The 1st Workshop

Dynamism of Social Context Deciphered by a Linguistic Analysis of Ancient Literature

Friday, February 12, 2021, 14:00 – 19:10 (JST)

The workshop will be held online.

The first workshop of the SPIRITS project “Chronological and Geographical Features of Ancient Indian Literature Explored by Data-Driven Science”

2020-2021 Interdisciplinary type project, in the priority area of humanities and social sciences SPIRITS: Supporting Program for Interaction-based Initiative Team Studies

Flyer Download (PDF 1.6MB)

Part 1

14:00 – 14:30 Opening:
Problems in the Formation of the Vedas, Ancient Indian Religious Texts
Kyoko Amano (Kyoto University, Institute for Research in Humanities / Hakubi Center)
14:30 – 15:10
The Possibility of Information Visualization and Data Analysis for Ancient India Literature
Hiroaki Natsukawa (Kyoto University, Academic Center for Computing and Media Studies)
15:10 – 15:50
Relationship Among Vedic Schools Deciphered by the Visualization of Mantra Collocation
Kyoko Amano (Kyoto University, Institute for Research in Humanities / Hakubi Center)
15:50 – 16:30
Citation Prediction Using Academic Paper Data and Application for Surveys
Shun Hamachi (Kyoto University, Graduate School of Engineering)

Part 2

16:50 – 17:30
Measuring the Semantic Similarity between the Chapters of Taittirīya Saṃhitā using a Vector Space Model
Yuki Kyogoku (Leipzig University, Indology)
17:30 – 18:10
Dating Vedic texts with computational models: Algorithmic considerations and data selection
Oliver Hellwig (University of Zurich, Department of Comparative Language Science)
18:10 – 18:50
morogram: Background, History, and Purpose of a Tool for East Asian Text Analysis
Shigeki Moro (Hanazono University, Faculty of Letters)
18:50 – 19:10
Discussion (Moderator: Hiroaki Natsukawa)

Understanding the social background of text formation is a basic requirement for understanding documents accurately. However, the background of ancient societies is often shrouded in mystery, which makes it difficult to understand the process of text formation. The Vedas, the religious texts of ancient India, are among these documents. In this workshop, we will seek to decipher the social movements, geographical mobility, and changes in the spheres of influence in ancient India through a linguistic analysis of the Vedic texts. The discussion will include how data science can be applied to this field.

Please register using the following Google Form. The Zoom Meeting ID and password will be sent to you by e-mail.
*** Closed ***

Registration is available until the end of the workshop.
No registrant limit. No registration fee.

Organizer: SPIRITS project “Chronological and Geographical Features of Ancient Indian Literature Explored by Data-Driven Science” (Kyoko Amano, Hiroaki Natsukawa, Oliver Hellwig, Yuki Kyogoku)

Co-Organizer: Kyoto University, Academic Center for Computing and Media Studies; Kyoko Amano Hakubi Project “Language and Social-Cultural Background of the Ancient Indian Ritual Literature”; Grant-in-Aid for Challenging Research (Exploratory) “Constructing a Database for Quantitative Analysis of Style Toward Elucidation of the Formation Process of Ancient Indian Texts” (Representative researcher: Kyoko Amano, 20K20697)

Oliver Hellwig

Title: Dating Vedic texts with computational models: Algorithmic considerations and data selection

Text: In spite of over 150 years of scholarly research, the chronology of the Vedic corpus is still far from well understood, as external historical evidence is largely missing and post-Rigvedic Sanskrit shows only minor developments at the levels of phonetics and morphosyntax.

This presentation discusses mathematical models that can be used for dating (Vedic) texts based on the linguistic evidence they provide. It also addresses the important questions of how to select expressive linguistic features, i.e. those whose distribution is coupled with the time of composition; and how to interpret the parameters of the resulting models in a linguistic context. The discussions are exemplified by a corpus of classical and medieval Latin texts which show comparable linguistic developments, but can, in contrast, be dated exactly, thereby facilitating model evaluation.