Lexical Similarity Analysis
Do transgender people communicate more like their birth gender or their preferred gender?
This paper attempts to answer that question. I scraped nearly half a million Reddit comments from four cohorts: cisgender males, cisgender females, transgender males, and transgender females. I then used WordScore and Jensen-Shannon Divergence to measure the distance between the four corpora. In addition, I split the data into training and test sets and tested how useful Random Forest was as a classifier.
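To make the distance measure concrete, here is a minimal sketch of Jensen-Shannon Divergence between two tokenized corpora. It is an illustration under stated assumptions, not the code used in this paper: the function names (`word_distribution`, `kl_divergence`, `jensen_shannon_divergence`) are hypothetical, each corpus is assumed to be a non-empty list of word tokens, and the distributions are plain unigram relative frequencies with a base-2 logarithm, so the result falls between 0 (identical word distributions) and 1 (no shared vocabulary).

```python
import math
from collections import Counter


def word_distribution(tokens, vocabulary):
    """Relative frequency of each vocabulary word in a non-empty token list."""
    counts = Counter(tokens)
    total = sum(counts[word] for word in vocabulary)
    return [counts[word] / total for word in vocabulary]


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(tokens_a, tokens_b):
    """Jensen-Shannon Divergence between the unigram distributions of two corpora."""
    # Use the union of both vocabularies so the two distributions are aligned.
    vocabulary = sorted(set(tokens_a) | set(tokens_b))
    p = word_distribution(tokens_a, vocabulary)
    q = word_distribution(tokens_b, vocabulary)
    # The mixture m is nonzero wherever p or q is nonzero, so the KL terms are defined.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    # Toy token lists, invented purely for illustration.
    corpus_a = "i really love this so much".split()
    corpus_b = "i think this is fine".split()
    print(jensen_shannon_divergence(corpus_a, corpus_b))
```

With four corpora, the same function could be applied to each of the six pairs to build a distance matrix. Unlike plain KL divergence, this measure is symmetric and stays finite when a word appears in only one corpus, which is why it suits comparing vocabularies that only partly overlap.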