TY - GEN
T1 - Simple neologism based domain independent models to predict year of authorship
AU - Kulkarni, Vivek
AU - Tian, Yingtao
AU - Dandiwala, Parth
AU - Skiena, Steven
N1 - Publisher Copyright: © 2018 COLING 2018 - 27th International Conference on Computational Linguistics, Proceedings. All rights reserved.
PY - 2018
Y1 - 2018
N2 - We present domain independent models to date documents based only on neologism usage patterns. Our models capture patterns of neologism usage over time to date texts, provide insights into temporal locality of word usage over a span of 150 years, and generalize to various domains like News, Fiction, and Non-Fiction with competitive performance. Quite intriguingly, we show that by modeling only the distribution of usage counts over neologisms (the model being agnostic of the particular words themselves), we achieve competitive performance using several orders of magnitude fewer features (only 200 input features) compared to state of the art models some of which use 200K features.
UR - https://www.scopus.com/pages/publications/85064940521
M3 - Conference contribution
AN - SCOPUS:85064940521
T3 - COLING 2018 - 27th International Conference on Computational Linguistics, Proceedings
SP - 202
EP - 212
BT - COLING 2018 - 27th International Conference on Computational Linguistics, Proceedings
A2 - Bender, Emily M.
A2 - Derczynski, Leon
A2 - Isabelle, Pierre
PB - Association for Computational Linguistics (ACL)
T2 - 27th International Conference on Computational Linguistics, COLING 2018
Y2 - 20 August 2018 through 26 August 2018
ER -