4 There’s an interesting discussion to be had as to whether it is better to have a few longer texts or many complete short texts. Short-form messaging analysis (e.g., SMS, Twitter) can take advantage of the fact that hundreds of complete messages can be collected even if the overall number of words in an analysis may be small (see, e.g., Grant, 2012, for a discussion of such analyses).
5 An n-gram is a sequence of n adjacent items, where n is, for example, two, three, or four. Thus, word 3-grams from the opening clause of this paragraph are: “JG also took,” “also took a,” “took a second,” and “a second approach”.
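The word 3-grams listed in note 5 can be reproduced with a few lines of code. The following is a minimal sketch only: the function name `word_ngrams` is hypothetical, the clause is reassembled from the four 3-grams given in the note, and splitting on whitespace is an assumed simplification that ignores punctuation and case.

```python
def word_ngrams(text, n):
    """Return the contiguous word n-grams of `text` as a list of strings."""
    # Assumption: naive whitespace tokenisation; a real analysis would
    # need to decide how to treat punctuation, case, and similar details.
    tokens = text.split()
    # Slide a window of n tokens across the text, one token at a time.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


clause = "JG also took a second approach"
for gram in word_ngrams(clause, 3):
    print(gram)
```

A text with fewer than n words yields no n-grams at all, which is one reason very short texts contribute little to this kind of analysis.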
FURTHER READING
1 Argamon, S. (2018). Computational forensic authorship analysis: Promises and pitfalls. Language and Law/Linguagem e Direito, 5(2), 7–37.
2 Dror, I. E., Peron, A. E., Hind, S. L., & Charlton, D. (2005). When emotions get the better of us: The effect of contextual top-down processing on matching fingerprints. Applied Cognitive Psychology, 19(6), 799–809.
3 Grant, T. (2020). Text messaging forensics: Txt 4n6: Idiolect free authorship analysis? In M. Coulthard, A. May, & R. Sousa-Silva (eds.), The Routledge handbook of forensic linguistics (2nd ed.). Routledge.
4 Grant, T. (2012). TXT 4N6: Method, consistency, and distinctiveness in the analysis of SMS text messages. Journal of Law & Policy, 21, 467.
5 Grant, T. (2021). The idea of progress in forensic authorship analysis. Cambridge University Press.
6 Grieve, J., & Woodfield, H. (2020). Investigative linguistics. In M. Coulthard, A. May, & R. Sousa-Silva (eds.), The Routledge handbook of forensic linguistics (2nd ed.). Routledge.
SUGGESTED RESEARCH QUESTIONS
The susceptibility of authorship analysis to contextual bias could be explored experimentally, for example by giving authorship analysts alternative background stories for a set problem and seeing whether they reach different conclusions. This could be applied to both stylistic and computational approaches to authorship analysis.
It would be useful to explore ways in which computational and more qualitative approaches to authorship analysis might be combined, for example by using heavily computational methods to elicit a large set of features and then examining those features to rule out any that lack a linguistic explanation.
REFERENCES
1 Argamon, S. (2018). Computational forensic authorship analysis: Promises and pitfalls. Language and Law/Linguagem e Direito, 5(2), 7–37.
2 Biber, D. (1995). Dimensions of register variation. Cambridge University Press.
3 Coulthard, M. (2004). Author identification, idiolect, and linguistic uniqueness. Applied Linguistics, 25(4), 431–447.
4 Dror, I. E., Peron, A. E., Hind, S. L., & Charlton, D. (2005). When emotions get the better of us: The effect of contextual top-down processing on matching fingerprints. Applied Cognitive Psychology, 19(6), 799–809.
5 Dror, I. E., Charlton, D., & Péron, A. E. (2006). Contextual information renders experts vulnerable to making erroneous identifications. Forensic Science International, 156, 74–78.
6 Dror, I. E., & Hampikian, G. (2011). Subjectivity and bias in forensic DNA mixture interpretation. Science & Justice, 51(4), 204–208.
7 Forensic Regulator. (2015). Cognitive bias effects relevant to forensic science examinations. Home Office.
8 Grant, T. (2012). TXT 4N6: Method, consistency, and distinctiveness in the analysis of SMS text messages. Journal of Law & Policy, 21, 467.
9 Grant, T. (2020). Text messaging forensics: Txt 4n6: Idiolect free authorship analysis? In M. Coulthard, A. May, & R. Sousa-Silva (eds.), The Routledge handbook of forensic linguistics (2nd ed.). Routledge.
10 Grant, T., & Baker, K. (2001). Identifying reliable, valid markers of authorship: A response to Chaski. Forensic Linguistics, 8, 66–79.
11 Grant, T., & MacLeod, N. (2020). Language and online identities: The undercover policing of internet sexual crime. Cambridge University Press.
12 Grieve, J. (2007). Quantitative authorship attribution: An evaluation of techniques. Literary and Linguistic Computing, 22(3), 251–270.
13 Grieve, J., Clarke, I., Chiang, E., Gideon, H., Heini, A., Nini, A., & Waibel, E. (2019). Attributing the Bixby Letter using n-gram tracing. Digital Scholarship in the Humanities, 34(3), 493–512.
14 Grieve, J., & Woodfield, H. (2020). Investigative linguistics. In M. Coulthard, A. May, & R. Sousa-Silva (eds.), The Routledge handbook of forensic linguistics (2nd ed.). Routledge.
15 Luyckx, K., & Daelemans, W. (2011). The effect of author set size and data size in authorship attribution. Literary and Linguistic Computing, 26(1), 35–55.
16 Wagner, S. E. (2012). Age grading in sociolinguistic theory. Language and Linguistics Compass, 6(6), 371–382.
17 Wright, D. (2017). Using word n-grams to identify authors and idiolects: A corpus approach to a forensic linguistic problem. International Journal of Corpus Linguistics, 22(2), 212–241.