NLP Projects

Natural Language Processing

PROJECT भ्रमण: Tracing the Footsteps of Mystics

Introduction

Why has travel been a significant part of so many journeys to enlightenment and self-realization? Why is travel still a way to find relief from the mundane? I approach these questions through an exploratory lens, aiming to visualize where these travelers of the past went and whether there are significant locations that many of the mystics visited.

To begin, I would like to quote Paramahansa Yogananda, the renowned author of Autobiography of a Yogi. In his early understanding of God, he believed that the Universal Consciousness resides solely within, and he felt no need to bow at external places of worship or travel to holy sites. This belief persisted until he lost his way while seeking a mystic saint. When he finally found his way back, the saint asked, "Tell me, where do you think God is?"

"Why, He is within me and everywhere," Yogananda replied, feeling bewildered.

"All-pervading, eh?" The saint chuckled. "Then why, young sir, did you fail to bow before the Infinite in the stone symbol at the Tarakeswar temple yesterday?"

This ancient dilemma of seeking the Creator in the cosmos or within oneself is elegantly resolved in these words from the Autobiography.

Through this project, I want to create a collection of travel maps from the travelogs and accounts of mystic saints. For the current work, I selected six accounts, in the form of autobiographies and biographies, from different time periods. For instance, Autobiography of a Yogi was published in 1946, whereas the travels of Adi Shankaracharya date to the first half of the 8th century, those of Swami Narayan to the 18th century, those of Guru Nanak to the 15th century, and those of the Buddha to the 6th century BCE.

Why भ्रमण?

Why did I name the project भ्रमण? This is my own interpretation of the word and its contextual meaning. I would like to place the word in the context of three related words:

Brahma: The ultimate higher consciousness.
Brahmaand: The universe.
Brahman: The ‘seeker’ looking for Brahma, the ultimate liberator.
भ्रमण: The journey undertaken by the seeker for ultimate liberation/enlightenment.

Geolocations from the book Following in Buddha's Footsteps.
Geolocations from the book Walking with Nanak.
Geolocations from the biography of Adi Shankaracharya.
Geolocations from the book Lost Year of Jesus.
Geolocations from the book Nilakantha Charita.
Geolocations from Autobiography of a Yogi.

Methodology

The methodology for this project is based on Natural Language Processing and Natural Language Understanding: locations are extracted from the text and then visualized. The key steps are as follows:

  • Researched and selected travel accounts of different mystics and saints.

  • Downloaded PDF files from Libgen.

  • Converted the PDFs to .txt format with a Python script.

  • Selected the most efficient named-entity recognition (NER) model available for English to extract locations.

  • Wrote Python code to extract locations.

  • Customized the pre-processing steps for each text.

  • Removed duplicate locations through several rounds of filtering in code, and then manually with a stopword list once code alone could filter no further.

  • Used the Wikipedia API to get coordinates for the locations.

  • Used Google Maps to find coordinates for locations not found by the Wiki API.

  • Saved the locations in a CSV file for the visualization step.

  • Used the Folium library to visualize the locations.

  • Visualized the locations as simple line paths on a map as well as a heatmap.

  • Calculated location frequencies by combining the location CSVs from all the texts, to check whether any locations were frequently traveled to by the mystics.

Books/Sources

The key sources for this work are autobiographies, biographies, and accounts of travelers.

The six books selected for the project are:

Screenshot 2024-07-02 at 17.26.56.png

It took extensive research to find these books. While there is a wealth of literature on each of these spiritual figures, I sought key accounts with significant insights into their travels. Despite the abundance of available material, finding literature specifically dedicated to their journeys proved to be a challenging task.

Each of these masters was renowned for traveling the world primarily on foot. Their journeys answered many of their questions, yet some mysteries remain. For instance, Shankaracharya, the mystic saint of the 8th century, traversed the country on foot. It is believed he possessed 'pad siddhi' (with 'pad' meaning feet and 'siddhi' referring to a mystical power attained through advanced yoga techniques), enabling him to travel from one place to another almost instantaneously, as if vanishing into thin air.

In-depth Analysis

In this section I walk through the project pipeline step by step, reflecting on what worked and what failed on the way to the final results.

Locations in a travelog pose several layers of problems. I explain each one in detail here:

A location selected by the NER might not be a place the traveler actually visited. It might be a figure of speech, like ‘mecca of all religion’, or just a reference to a place, like ‘I wanted to travel to ___ but couldn’t’. Because language is slippery, it is not easy to know the context in which a word is used.

In the two examples below, I had to go back to the actual text to check whether the list was correct, because I knew that the author did not visit Mecca. In the first, Mecca is a figure of speech; in the second, the place is mentioned but the author never visited it. Distant reading therefore cannot replace close reading: it was close reading of this text that allowed me to check what was correct and what was not.
Screenshot 2024-07-02 at 17.32.15.png
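
One way to support this kind of close reading is to pull out every sentence in which a candidate location occurs, so that each NER hit can be checked in context before it is kept. The sketch below is only an illustration and not the code used in the project: the function name `mentions_in_context`, the naive sentence splitter, and the sample text are all assumptions.

```python
import re

def mentions_in_context(text, place):
    """Return every sentence in `text` that mentions `place` (case-insensitive)."""
    # Naive split on ., ! or ? followed by whitespace; enough for manual review.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(r"\b" + re.escape(place) + r"\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]

# Hypothetical sample text, not a quotation from any of the books.
sample = (
    "The town became a mecca of all religions. "
    "I wanted to travel to Mecca but could not. "
    "We reached Benares at dawn."
)

# Print each sentence so a reader can decide whether the place was visited.
for sentence in mentions_in_context(sample, "Mecca"):
    print(sentence)
```

The code does not decide anything by itself; it only shortens the trip back to the text, and the judgment about whether a place was actually visited stays with the reader.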

Place names carry the baggage of historical renaming. While working with ancient travelers like Adi Shankaracharya and Neelkanth Varni, I discovered that most of the places couldn't be identified by the Wiki API. I had to go to Google Maps to make sense of what was an actual location and what wasn’t. For instance, the old name ‘Vartal’, which the Wiki API couldn’t locate, turned out to be ‘Vadtal’, in present-day Gujarat.
Screenshot 2024-07-02 at 17.34.38.png
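
The lookup-with-fallback pattern described here can be sketched as follows. The text does not say which Wikipedia client was used, so this sketch assumes the public MediaWiki query endpoint with `prop=coordinates`; the function names, the `User-Agent` string, and the empty `MANUAL_COORDINATES` table (to be filled in by hand, for example from Google Maps) are hypothetical.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"

def parse_coordinates(response):
    """Return (lat, lon) from a MediaWiki query response, or None if absent."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        coords = page.get("coordinates")
        if coords:
            return coords[0]["lat"], coords[0]["lon"]
    return None

def fetch_coordinates(title):
    """Ask Wikipedia for a page's coordinates (needs network access)."""
    params = urllib.parse.urlencode({
        "action": "query", "prop": "coordinates",
        "titles": title, "redirects": 1, "format": "json",
    })
    request = urllib.request.Request(
        f"{API_URL}?{params}",
        headers={"User-Agent": "bhraman-sketch/0.1 (example)"},  # placeholder
    )
    with urllib.request.urlopen(request, timeout=10) as reply:
        return parse_coordinates(json.load(reply))

# Hypothetical manual overrides: name -> (lat, lon), entered by hand.
MANUAL_COORDINATES = {}

def locate(title, fetch=fetch_coordinates):
    """Try Wikipedia first, then fall back to the manual table."""
    return fetch(title) or MANUAL_COORDINATES.get(title)
```

A name that comes back as `None` from `locate` is one that still needs a manual lookup, which keeps the list of unresolved places explicit.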

Some places were temple names that couldn’t be found, so I had to add their locations manually.
Screenshot 2024-07-02 at 17.34.43.png

Other places, like ‘Darbhasayanam’, an ancient place mentioned in the Ramayana, are known today by a different name. It is now ‘Tiruppullaani’, which I substituted for ‘Darbhasayanam’ because the latter couldn't be geolocated by the Wiki API.
Screenshot 2024-07-02 at 17.34.49.png

Ancient names like ‘Shrirang Kshetra’ correspond to present-day ‘Srirangam’, so I had to replace them by looking up their current names.
Screenshot 2024-07-02 at 17.34.53.png

Some places, like ‘Himagiri’, the name of an ancient mountain range, are no longer locatable on a map; the name now belongs to more ‘commercial’ establishments.
Screenshot 2024-07-02 at 17.34.59.png

A very good example of how language is culturally rooted is the major river of India, the Ganga, which is considered a goddess and is referred to as ‘Ganga ji’ rather than Ganga. The NER extracted it as ‘Gangaji’, for which no geolocation could be found.
Screenshot 2024-07-02 at 17.35.05.png

Some places had to be found indirectly. I was looking for ‘Vikramshila’, which couldn’t be located, so I used the geo coordinates for ‘Antichak’ instead.
Screenshot 2024-07-02 at 17.35.10.png
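
The renaming cases in this section can be collected in a small lookup table that is applied before geocoding. This is a minimal sketch using only the name pairs mentioned above; the table name `MODERN_NAMES`, the helper `modern_name`, and the choice to map the honorific form ‘Gangaji’ to ‘Ganga’ are assumptions for illustration.

```python
# Name as found in the text (lowercased) -> present-day name,
# using only the examples given in this section.
MODERN_NAMES = {
    "vartal": "Vadtal",
    "darbhasayanam": "Tiruppullaani",
    "shrirang kshetra": "Srirangam",
    "vikramshila": "Antichak",
    "gangaji": "Ganga",  # assumed mapping for the honorific form
}

def modern_name(name):
    """Map a name found in the text to its present-day name, if one is known."""
    key = " ".join(name.lower().split())  # ignore case and extra whitespace
    return MODERN_NAMES.get(key, name.strip())

print(modern_name("Vartal"))     # a known alias is replaced
print(modern_name("Srirangam"))  # unknown names pass through unchanged
```

Keeping the substitutions in one table, rather than editing the extracted lists by hand, leaves a record of every rename that was made.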

The key strategy for mapping names, which remained central throughout, was to display the map first, check where the coordinates of a location seemed wrong, and then go back and replace them with the correct coordinates. Here is how my initial visualization for Autobiography of a Yogi looked, and how it looked after later analysis, which is the final heatmap.

Screenshot 2024-07-02 at 17.37.19.png

For the first book, I tried mapping locations, but it was challenging to clean up the messy data. Even after applying the .set method to remove duplicate locations, I couldn’t get it to work properly. Ultimately, the .unique method proved to be the most effective in helping me obtain a filtered list of locations.

Screenshot 2024-07-02 at 17.37.26.png
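
The text does not say why the set-based attempt fell short. One plausible cause, assumed here, is that near-duplicates differing only in case or spacing survive an exact-match comparison, and that a plain set also loses the order of first appearance, which matters when drawing a route. The standard-library sketch below normalizes names and keeps first-seen order, similar in spirit to what pandas `unique` does for order; the function name and sample list are hypothetical.

```python
def unique_locations(names):
    """Drop repeated names, ignoring case and spacing, keeping first-seen order."""
    seen = {}  # dicts preserve insertion order
    for name in names:
        cleaned = " ".join(name.split())  # collapse stray whitespace
        if not cleaned:
            continue  # skip empty entries
        seen.setdefault(cleaned.casefold(), cleaned)
    return list(seen.values())

# Hypothetical extracted list with case and whitespace variants.
raw = ["Benares", "benares ", "Calcutta", "Benares", "  Calcutta", "Puri"]
print(unique_locations(raw))  # ['Benares', 'Calcutta', 'Puri']
```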

The Mystical Dilemma:

Places in the mind? Astral spheres?

Words like ‘Samadhi’, ‘Harinya (imperishable) loka’, ‘Shambala’, and ‘Heaven’: are these real places, imagined ones, or experiences beyond human intellect?
These unlocatable words make me think about how mystics could travel to such places and come back to the present sphere. In Buddhism, for instance, when I was looking for the place ‘Chambhalla’, I came across ‘Shambhala’, which holds significance as a city of the mystics; many people went looking for it but could not find it. There even exists a painting of how enlightened masters are hierarchically placed there, which is shown below.
Screenshot 2024-07-02 at 17.37.33.png
Screenshot 2024-07-02 at 17.37.49.png

Another example is ‘Trayastrimsa’, the second of the six heavens in Buddhist cosmology.
Screenshot 2024-07-02 at 17.37.54.png

The last example comes from Hindu cosmology, where Vishnu is depicted holding the different lokas and spheres in his body.

Screenshot 2024-07-02 at 17.38.00.png

Vishvarupa of Vishnu as the Cosmic Man with the three realms: heaven - Satya to Bhuvar loka (head to belly), earth - Bhu loka (groin), underworld - Atala to Patala loka (legs).

While researching different spheres or lokas, I discovered that these realms can be accessed only by individuals who can tune themselves to a specific energy frequency or waveform, enabling them to enter that sphere of consciousness, so maybe these are real places indeed.

Screenshot 2024-07-02 at 17.38.06.png

The final step of my project involved identifying the most frequently visited locations across all the travelogues. Topping the list was Mount Kailash, renowned as the 'Abode of Shiva', which holds immense spiritual significance in both Hinduism and Buddhism. Following closely were the Himalayas, second on the list and long revered as a sanctuary for mystics seeking solitude and introspection. Interestingly, all of the top five places are situated in hilly terrain, indicating that mystics often journeyed to the mountains in search of seclusion and self-realization.

Screenshot 2024-07-02 at 17.46.13.jpeg
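
The frequency step can be sketched with a counter over the per-book location lists. The text does not state whether each mention or each book was counted; this sketch assumes a location is counted once per book, so that the result shows how many travelers a place is associated with. The function name and the sample lists are hypothetical and are not the project's data.

```python
from collections import Counter

def location_frequency(locations_by_book):
    """Count in how many books each location appears (once per book)."""
    counts = Counter()
    for places in locations_by_book.values():
        counts.update(set(places))  # set() ignores repeats within one book
    return counts

# Hypothetical lists for illustration only.
books = {
    "book_a": ["Kailash", "Benares", "Kailash"],
    "book_b": ["Kailash", "Puri"],
    "book_c": ["Benares", "Kailash"],
}

# Show the two locations shared by the most books.
for place, n in location_frequency(books).most_common(2):
    print(place, n)
```

Counting raw mentions instead would only require dropping the `set()` call, at the cost of letting one long book dominate the ranking.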

Conclusion:

To conclude, the most revealing aspect of this project was my surprise at how much these mystics traveled with so little help from transportation. For instance, while tracing locations for Jesus Christ, it felt as though he had traveled throughout the world, which is astonishing and almost unbelievable. Earlier, when I thought about mystics, I used to imagine them meditating in a cave to find answers to the questions of existence. Now that I can see for myself how they traveled, meeting numerous other intellectuals, disciples, and mystics, I see them, as well as the idea of spiritual pilgrimage, in a completely different light. I hope this project helps many other seekers like me to find a way to follow in the footsteps of the mystics.