Lexical Similarity Analysis

Do transgender people communicate more like their birth gender or their preferred gender?

This paper attempts to answer that question. I scraped nearly half a million Reddit comments from four cohorts: cisgender males, cisgender females, transgender males, and transgender females. I then used WordScore and Jensen-Shannon divergence to measure the distance between the four corpora. In addition, I split the data into training and test sets and tested how useful a Random Forest was for classification.
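To give a feel for the distance measure, here is a minimal sketch of Jensen-Shannon divergence between two corpora. It is not the code used in the paper: the function names `word_distribution` and `jensen_shannon_divergence` are hypothetical, the whitespace tokenization is an assumption, and the toy comments are made up for illustration.

```python
import math
from collections import Counter


def word_distribution(texts):
    """Turn a list of comments into a word -> relative frequency mapping.

    Assumption: lowercase + whitespace tokenization; the paper may
    preprocess text differently.
    """
    counts = Counter(word for text in texts for word in text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two word distributions.

    The result lies in [0, 1]: 0 for identical distributions,
    1 for distributions that share no words.
    """
    vocab = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2 over the joint vocabulary.
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}

    def kl(a):
        # KL(A || M); words with zero probability in A contribute nothing.
        return sum(a[w] * math.log2(a[w] / m[w]) for w in a if a[w] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


# Toy corpora, invented purely for illustration.
corpus_a = ["the game was great", "great game tonight"]
corpus_b = ["the show was great", "what a show"]

dist_a = word_distribution(corpus_a)
dist_b = word_distribution(corpus_b)
print(jensen_shannon_divergence(dist_a, dist_b))
```

A lower value means the two corpora use words in more similar proportions, so comparing the value for each pair of cohorts indicates which groups write most alike.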

Check out the Paper