Topic maps, similarity metrics and study recommendation

Jon Fernquest, Mae Fah Luang University, IT Department, 5/11/2021


Similarity measures or metrics are an essential part of recommendation systems. 

In essence, they help the system find items to recommend to a user: either items similar to those the user already uses, or items used by other, similar users.

In a typical recommendation system, the similarity metric is applied to, for instance, user ratings of items in order to find users who rate items similarly. Similarity in user ratings then becomes the basis for recommendations.
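As a rough illustration of the user-based comparison just described, here is a minimal sketch using cosine similarity, which is only one of several possible metrics. The users, items and ratings are made up for the example.

```python
import math

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}
ratings = {
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5, "C": 2},
    "cat": {"A": 1, "B": 2, "C": 5},
}

def cosine_similarity(u, v):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)

def most_similar_user(name):
    """Return the other user whose ratings are closest to those of `name`."""
    others = (o for o in ratings if o != name)
    return max(others, key=lambda o: cosine_similarity(ratings[name], ratings[o]))
```

Items rated highly by the most similar user, and not yet seen by the current user, would then be candidates for recommendation.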

Item similarity works by gauging the similarity between an item a customer has purchased and other items. A very similar item might be a good recommendation for a future purchase.
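Item similarity can be sketched in much the same way. The example below assumes that two items are similar when largely the same customers bought them, and measures this with the Jaccard index. The purchase data is invented and the choice of metric is an assumption, not something prescribed here.

```python
# Hypothetical purchase histories: item -> set of customers who bought it
buyers = {
    "toothbrush": {"u1", "u2", "u3"},
    "toothpaste": {"u1", "u2", "u3", "u4"},
    "floss": {"u2", "u3"},
    "notebook": {"u4", "u5"},
}

def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similar_items(item, top_n=2):
    """Rank the other items by how much their buyers overlap with `item`."""
    scores = [
        (jaccard(buyers[item], buyers[other]), other)
        for other in buyers
        if other != item
    ]
    # Highest score first; ties are broken alphabetically by item name
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [name for _, name in scores[:top_n]]
```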

This is the high-level logic at work in recommendation systems that tends to get lost in the algorithms and mathematics. 

Text Similarity

However, there is another way to calculate similarity: text similarity.

This is used, for instance, to recommend academic articles relevant to a user's research, based on keywords that the user provides in a search or in a user profile, or on articles the user has previously read.

This can apply to any text, for instance, recommending a book to read that might be relevant to a user's interests. The traditional toy example for recommendation systems is movie recommendation.
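To make text similarity concrete, here is a minimal sketch that ranks a few article titles against a user's keywords using cosine similarity over raw word counts. The titles and the query are hypothetical, and a real system would probably add stop-word removal, stemming or TF-IDF weighting.

```python
import math
import re
from collections import Counter

# Hypothetical article titles keyed by an article id
articles = {
    "a1": "Topic models for linking textbook sections",
    "a2": "Enamel thickness in human molar teeth",
    "a3": "Recommendation with probabilistic topic models",
}

def tokenize(text):
    """Lowercase the text and keep only alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def cosine(a, b):
    """Cosine similarity between the word-count vectors of two texts."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm = math.sqrt(sum(c * c for c in ca.values())) * math.sqrt(
        sum(c * c for c in cb.values())
    )
    return dot / norm if norm else 0.0

def rank_articles(query):
    """Return article ids ordered from most to least similar to the query."""
    return sorted(articles, key=lambda k: cosine(query, articles[k]), reverse=True)
```

With the query "topic models textbook", the first title shares three words with the query, the third shares two, and the second shares none, so they are ranked in that order.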

Topic Similarity within Textbooks

In this case, we have a student studying a subject, namely dental anatomy, who works through a textbook step by step as the course progresses.

As they review for the course, the question arises: what do I review next?

We're going through the course step by step and have already covered quite a lot of material, so the student needs help deciding what to review next in this mass of material.

Similarity metrics can help here. For instance, when a student is studying a topic that is similar to other topics already studied, it might be a good time to review those similar topics again. This helps the student recognize repeated patterns in the material and thus simplifies and strengthens their memory of them.

Or the student may still not understand certain points and may not yet have mastered a topic, so they must go back and study it again.

There may also be related topics earlier in the textbook, for instance prerequisites to the current topic, that are worth reviewing, as well as later topics that rely heavily on the current topic and are worth reviewing again for better understanding once the student has mastered it.

Topic Model of a Textbook & Flashcards

One can make a topic model of a textbook by calculating the within-textbook (intra-textbook) similarity of the topics presented in it (Dawar et al. 2019; Guerra et al. 2013; Huang et al. 2016).
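A very simple version of such an intra-textbook comparison might look like the sketch below. Note that it is not a probabilistic topic model of the kind used in the cited papers. It only compares sections pairwise using TF-IDF weighted cosine similarity, and the section names and sentences are illustrative placeholders, not quotations from any textbook.

```python
import math
import re
from collections import Counter

# Placeholder sections: topic name -> text of that section
sections = {
    "maxillary central incisor": "the crown has a single root and an incisal edge",
    "mandibular central incisor": "the crown is narrow with a single root and an incisal edge",
    "maxillary first molar": "the crown has four cusps and three roots",
}

def tfidf_vectors(docs):
    """Weight each word by term frequency times a smoothed inverse document frequency."""
    tokens = {name: re.findall(r"[a-z]+", text.lower()) for name, text in docs.items()}
    df = Counter(w for words in tokens.values() for w in set(words))
    n = len(docs)
    return {
        name: {w: tf * (math.log(n / df[w]) + 1.0) for w, tf in Counter(words).items()}
        for name, words in tokens.items()
    }

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0

def build_topic_map(docs):
    """Map each topic to its other topics, sorted from most to least similar."""
    vectors = tfidf_vectors(docs)
    return {
        a: sorted(((cosine(vectors[a], vectors[b]), b) for b in docs if b != a), reverse=True)
        for a in docs
    }

topic_map = build_topic_map(sections)
```

Here the two incisor sections come out as each other's nearest neighbor because they share most of their vocabulary, which is the kind of relation a topic map is meant to surface.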


Dental anatomy textbooks, like textbooks in many other fields, are highly factual, consisting essentially of a collection of facts.

Each sentence is basically a fact, so one can actually take all the sentences of the textbook and create matching flashcards out of them systematically.

Sometimes a sentence actually reduces to two or three flashcards (facts, relations) to be memorized.

To create flashcards from a fact-intensive textbook, one first splits the textbook into individual sentences.

One then locates a key word or phrase within each sentence and blanks it out.

Finally, one groups these blanked-out sentences, five sentences to a group for instance, and puts the removed words or phrases into a separate pool from which the student selects the match for each blanked-out sentence.

One then has a matching question to add to a Moodle question bank or to the H5P JavaScript e-learning component library, for instance.

This is the way either a human or an automated algorithm can create matching questions from the sentences in a textbook with a lot of facts to memorize, such as that of dental anatomy. 
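The steps above can be sketched in a few lines of code. This is only a minimal illustration under stated assumptions: the key terms come from a hand-made glossary (`key_terms` is hypothetical), sentences are split naively on end punctuation, and the sample text is invented for the example.

```python
import random
import re

# Invented sample text and a hypothetical glossary of key terms
text = (
    "Enamel covers the crown of the tooth. "
    "Cementum covers the root of the tooth. "
    "The pulp contains nerves and blood vessels."
)
key_terms = ["enamel", "cementum", "pulp"]

def split_sentences(passage):
    """Naive split on ., ! or ? followed by whitespace (abbreviations will break it)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]

def make_subquestion(sentence, terms):
    """Blank out the first glossary term found; return (blanked sentence, answer)."""
    for term in terms:
        match = re.search(r"\b" + re.escape(term) + r"\b", sentence, re.IGNORECASE)
        if match:
            blanked = sentence[: match.start()] + "_____" + sentence[match.end():]
            return blanked, match.group(0)
    return None  # no key term in this sentence

def make_matching_questions(passage, terms, group_size=5, seed=0):
    """Group sub-questions and give each group a shuffled word bank."""
    subs = []
    for sentence in split_sentences(passage):
        sub = make_subquestion(sentence, terms)
        if sub is not None:
            subs.append(sub)
    rng = random.Random(seed)
    questions = []
    for i in range(0, len(subs), group_size):
        group = subs[i : i + group_size]
        bank = [answer for _, answer in group]
        rng.shuffle(bank)
        questions.append({"subquestions": group, "word_bank": bank})
    return questions
```

Sentences without a glossary term are simply skipped here. A human editor, or a smarter term extractor, would be needed to decide what to blank out in those.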

From Matching Test Questions to Flashcards

Matching questions in a test or quiz also have a nice property.

Namely, you can take the blanked-out sentences together with their matching words or phrases. We can call these pairs ‘sub-questions’, as Moodle does, because a matching question consists of a group of five to ten of them.

Each of those sub-questions can be turned into a flashcard, with the blanked-out sentence on the front and the removed word on the back.

Then you look at the front, test yourself, and then flip the card and self-evaluate.

What is your knowledge? To what degree have you mastered this subject on a scale from 1 to 5? This is the question that the student asks themselves in self-evaluation. This is the way the popular Brainscape flashcard app works.
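The conversion from sub-question to flashcard, together with the 1 to 5 self-evaluation, might be represented as in the sketch below. The `Flashcard` class and its field names are assumptions made for illustration and are not taken from Moodle or Brainscape.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Flashcard:
    front: str                        # sentence with the key word blanked out
    back: str                         # the word or phrase that was removed
    confidence: Optional[int] = None  # self-rating from 1 to 5, None if unrated

    def rate(self, score):
        """Record the student's self-evaluation after flipping the card."""
        if not 1 <= score <= 5:
            raise ValueError("confidence must be between 1 and 5")
        self.confidence = score

def to_flashcards(subquestions):
    """Turn (blanked sentence, answer) pairs into unrated flashcards."""
    return [Flashcard(front=blanked, back=answer) for blanked, answer in subquestions]

# Example: one sub-question becomes one card, which the student then rates
cards = to_flashcards([("_____ covers the crown of the tooth.", "Enamel")])
cards[0].rate(4)
```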

From Topic Map to Recommendations for Review 

To recapitulate, as a preliminary step to creating a study recommendation system we can create a topic map of the textbook showing how different parts of the textbook are interrelated via a similarity metric.

This is useful later because the study recommendation system will use this metric, and the intra-textbook relations it identifies between topics (the topic map), to recommend further items to study.
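As a sketch of how a topic map could drive review recommendations, the example below scores each already-studied topic by its similarity to the current topic multiplied by how weak the student's self-rating is. The scoring rule, the similarity values and the topic names are all assumptions for illustration; many other rules are possible.

```python
# Hypothetical topic map: current topic -> {other topic: similarity in [0, 1]}
topic_map = {
    "canines": {"incisors": 0.8, "premolars": 0.5, "molars": 0.2},
}

# Hypothetical self-ratings (1 = not mastered, 5 = mastered) for topics studied so far
confidence = {"incisors": 2, "premolars": 5, "molars": 1}

def recommend_review(current, topic_map, confidence, top_n=2):
    """Rank studied topics by similarity to `current`, weighted by weakness."""
    scores = {}
    for topic, similarity in topic_map.get(current, {}).items():
        if topic not in confidence:
            continue  # not studied yet, so nothing to review
        weakness = (6 - confidence[topic]) / 5  # rating 1 -> 1.0, rating 5 -> 0.2
        scores[topic] = similarity * weakness
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

With these numbers, "incisors" scores 0.8 × 0.8 = 0.64, "molars" scores 0.2 × 1.0 = 0.2, and "premolars" scores 0.5 × 0.2 = 0.1, so a student currently studying canines would be pointed back to incisors first.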

This is machine learning or data mining, and the similarity metric is a fundamental part of mining these texts. It's also very similar to an unsupervised clustering, nearest-neighbor or self-organizing map (SOM) approach.

This is also important in language learning. When studying a language, particularly at an advanced level, the question frequently arises of what to study next: for instance, other texts with similar vocabulary or grammatical forms that reinforce what one has already studied.

In the case of dental anatomy, we are essentially data mining for similar anatomical patterns on other teeth that we can inter-relate to material we have already learned.

Vocabulary acquisition expert Nation (2001) likens this to finding additional hooks in our memory to hang the word on, thus enhancing our memory-based model of the word.

References

Dawar, Kanika, Ashwanth J. Samuel, and Raf Alvarado. "Comparing topic modeling and named entity recognition techniques for the semantic indexing of a landscape architecture textbook." In 2019 Systems and Information Engineering Design Symposium (SIEDS), pp. 1-6. IEEE, 2019.

Guerra, Julio, Sergey Sosnovsky, and Peter Brusilovsky. "When one textbook is not enough: Linking multiple textbooks using probabilistic topic models." In European Conference on Technology Enhanced Learning, pp. 125-138. Springer, Berlin, Heidelberg, 2013.

Huang, Yun, Michael Yudelson, Shuguang Han, Daqing He, and Peter Brusilovsky. "A framework for dynamic knowledge modeling in textbook-based learning." In Proceedings of the 2016 conference on user modeling adaptation and personalization, pp. 141-150. 2016.

Nation, I.S.P. Learning vocabulary in another language. Cambridge University Press, 2001.


