Document Similarity
A Python script, built with the tkinter library, that reads through a given directory of text files and uses the Jaccard similarity measure to determine which files are most similar, disregarding common stop words.
Perhaps you want to know just how alike two essays are, how similar the topics in your favourite novels are, or even how much your favourite restaurants' menus overlap. To run this similarity-measuring script on a directory of text files:
From the project directory, set up a virtual environment and run the script:
$ python -m venv venv  # note: on Apple silicon you may need Python 3.11
$ source venv/bin/activate
$ pip install tk
$ python main.py
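For readers curious about the measure itself: the Jaccard similarity of two sets A and B is the size of their intersection divided by the size of their union, J(A, B) = |A ∩ B| / |A ∪ B|. The sketch below shows one way that idea could be applied to a directory of text files. It is not the code in main.py: the function names, the tiny stop-word list, the word-splitting rule and the restriction to .txt files are all assumptions made for illustration.

```python
from itertools import combinations
from pathlib import Path
import re

# Tiny illustrative stop-word list (an assumption; the script's real list is not shown here).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "on"}


def tokenize(text):
    """Lowercase the text and return its set of words, minus stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; treated as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(directory):
    """Return (score, name_a, name_b) for the most similar pair of .txt files, or None."""
    docs = {
        p.name: tokenize(p.read_text(encoding="utf-8"))
        for p in sorted(Path(directory).glob("*.txt"))
    }
    pairs = [(jaccard(docs[x], docs[y]), x, y) for x, y in combinations(docs, 2)]
    return max(pairs, key=lambda t: t[0]) if pairs else None
```

Because each document is reduced to a set of words, word order and repetition are ignored; two files score 1.0 when they use exactly the same vocabulary after stop words are removed, and 0.0 when they share none.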