Abstract
This article explores the intersection of technology, particularly artificial intelligence, and anti-human trafficking and child sexual exploitation (CSE) initiatives. Despite the promises of technology, its overall effectiveness in addressing these crimes is limited. The authors argue that a key factor contributing to this limitation is the insufficient understanding of the crimes, rooted in the absence of meaningful engagement with victims. To address these challenges, the authors propose two strategies. First, they advocate for data-sharing partnerships to access extensive textual data held by organizations. Second, they highlight the use of natural language processing (NLP) to systematically analyse victim narratives without causing further interview fatigue. Embracing NLP is seen as a way to expedite research, enhance scalability and provide a fresh perspective on combating human trafficking and CSE. The article is structured to define key terminology, make a case for a victim-led response, demonstrate the application of NLP and conclude with insights from in-house research. The authors urge stakeholders to consider this new approach in their efforts to combat human trafficking and CSE.
Introduction
In recent years, there has been a noticeable surge in initiatives against human trafficking and child sexual exploitation (CSE), from the adoption of new national and regional legislation and updates to existing legislation (e.g., the EU Directive 2011/36/EU on Preventing and Combating Trafficking in Human Beings and Protecting Its Victims) to numerous awareness-raising campaigns. Perhaps one of the areas where we have seen the greatest surge is in the development and uptake of technology.
Some hail technology, which is now made more extensive by the availability of artificial intelligence (AI), as a possible solution, whether for understanding the crime at the very least or, at the other end of the spectrum, for capturing perpetrators or preventing the crimes in the first place. Law enforcement officers are, for example, increasingly using data analytics and machine learning algorithms to identify patterns and trends associated with human trafficking and CSE. They have also been leveraging facial recognition technology to help identify victims and traffickers by matching images with databases of missing persons or known criminals. Elsewhere they monitor social media platforms to detect recruitment efforts and potential victims. Non-governmental organizations (NGOs) are using apps to ask the public to record suspected instances of human trafficking.
Technology does hold promise, and indeed this paper will advocate for the use of the latest innovation. However, to date, despite the adoption of all sorts of tools, the overall effectiveness in eliminating this crime remains limited. The fundamental premise of this paper is that a contributing factor lies in our insufficient comprehension of the crimes. In turn, this deficiency in understanding stems from a lack of meaningful engagement with individuals who possess first-hand experience, who indeed constitute the genuine experts on the subject: the victims.
Despite its widely recognized significance, the inclusion of victim voices into approaches against human trafficking and CSE at the national, regional and international level, whether policy, law enforcement, safeguarding or other, remains absent at worst or a challenge at best. The omission of victim voices and narratives in research and subsequently in policy can be attributed to several factors. First, victims often experience research fatigue, as they may have already participated in numerous interviews, sharing their traumatic experiences. Consequently, there is a practical limit to the number of interviews they can reasonably engage in. Second, researchers face constraints in their capacity to conduct interviews, stemming from considerations of time, financial resources and emotional well-being. Engaging with victims can, at times, expose researchers to secondary trauma, adding a layer of complexity to the research process. Lastly, whilst the wealth of victim stories is frequently documented in various forms, including NGO case notes, police reports and witness statements, this means it is embedded in extensive volumes of textual data, held by organizations that may not want to share it. Even if that data were made available for research, analysing and coding these extensive archives are labour-intensive endeavours that can be overwhelming within the constraints of limited research resources and time.
In light of these formidable challenges, one promising avenue that the authors have been diligently exploring involves, first, the establishment of data-sharing partnerships and, second, the utilization of natural language processing (NLP) capabilities. This use of technology emerges as a pivotal strategy. NLP, a subfield of AI, holds the potential to systematically analyse and extract meaningful insights from amalgamated textual data. This technology can identify recurrent themes, patterns and sentiments within the narratives, facilitating a deeper understanding of the experiences and perspectives of victims without subjecting them to further interview fatigue. Furthermore, harnessing NLP in this context can expedite the research process and enhance its scalability, as it offers the capacity to process vast amounts of text efficiently.
To that end, the aim of this paper is to persuade stakeholders working in the anti-human trafficking domain that it is time for a new approach, a new interpretation of the way we address human trafficking and CSE. We do this by explaining how the application of NLP techniques on victim narratives can facilitate a deeper understanding of the crime they experienced and their consequent—and genuine—needs. Globally, as researchers and stakeholders involved in fighting crimes, we have stared at the chessboard of solving the human trafficking and CSE problem for too long, with our opponent (the criminal) being always one step ahead. But perhaps using AI to enable learning at scale from those who experienced the crime could be our checkmate move.
As to the structure of the paper, first the authors define key terminology used in this paper, to help the reader navigate the content. Second, we make the case for why stakeholders’ response to the crimes of human trafficking and CSE should be victim led. In the third section, we show how this can be done, using NLP, based on some of the work that the authors of this paper did together with their colleagues at Trilateral Research, an ethical AI development company. Subsequently, we provide the reader with a discussion of what our in-house research concluded.
Terminology
At the outset, it is helpful to define some of the technical and subject terminology used in this paper.
First, human trafficking and CSE are phenomena that can be examined from multiple perspectives, including morality, criminality, public safety, human rights, labour issues and migration. Various stakeholders often emphasize distinct aspects of these complex issues, and in this paper, we acknowledge that human trafficking and CSE unfold in a myriad of ways, hence the importance of listening to victim stories.
As to definition, human trafficking is defined by the United Nations in the Protocol to Prevent, Suppress and Punish Trafficking in Persons, Especially Women and Children, 2 which is often referred to as the ‘Palermo Protocol’. This protocol supplements the United Nations Convention Against Transnational Organized Crime. The UN definition reads:
Trafficking in Persons shall mean the recruitment, transportation, transfer, harbouring or receipt of persons, by means of the threat or use of force or other forms of coercion, of abduction, of fraud, of deception, of the abuse of power or of a position of vulnerability or of the giving or receiving of payments or benefits to achieve the consent of a person having control over another person, for the purpose of exploitation. Exploitation shall include, at a minimum, the exploitation of the prostitution of others or other forms of sexual exploitation, forced labour or services, slavery or practices similar to slavery, servitude or the removal of organs.
In summary, according to the UN definition, human trafficking involves actions such as recruitment, transportation, harbouring or receipt of individuals through various means of coercion, deception or abuse of power for the purpose of exploitation, which can include sexual exploitation, forced labour, slavery, servitude or organ removal. That said, exploitation can take any form and is limited only by the ‘creativity’ of the exploiters.
CSE is a type of sexual abuse. When a child or young person is exploited, they are given things such as gifts, drugs, money, status and affection, in exchange for performing sexual activities. Children and young people are often tricked into believing they are in a loving and consensual relationship. This is called grooming. They may trust their abuser and not understand that they are being abused. CSE can occur in person or online.
This paper also uses the term ‘victim’, although it is acknowledged by the authors that there has been a notable shift from the use of the term ‘victim’ to describe those who have experienced exploitation, to the term ‘survivor’ or ‘person with lived experience’. There are several reasons why the term ‘victim’ is being rejected by stakeholders. It could be argued that the word ‘victim’ can encourage stereotypes and discrimination and comes with stigma. This stigma is often furthered by the way ‘victims’ are portrayed through the media, film industry, news cycles and the judicial system. These all affect the public’s perception of what it means to be a victim of human trafficking or CSE. Arguably, the term implies a passive role and suggests that the person has no agency or ability to escape their situation. 3
However, because the law and general practice still use the term victim, this paper follows suit. Here victim refers to a person who has been harmed or injured by another, often in the context of a crime. The 2012 EU Directive on victims’ rights, 4 Article 2(1)(a), defines a victim as:
a natural person who has suffered harm, including physical, mental or emotional harm or economic loss, which was directly caused by a criminal offence; family members of a person whose death was directly caused by a criminal offence and who have suffered harm as a result of that person’s death.
AI refers to the ability of computers to perform tasks which would traditionally have required human intelligence. This could include things such as understanding human language, visual perception and performing human-like reasoning to make decisions. This generally comes in the form of mathematical models that allow computers to process and learn from historical data in order to improve their performance on a task. NLP is a subfield of AI, which specifically focuses on the ability of computers to understand and process human language, which may come in the form of text or speech.
The Case for Leveraging Victims’ Voices
In this section, we turn our attention to making the case for the importance of using victims’ voices, drawing on what stakeholders and the literature have said about the use of victim stories and victim voices in addressing crimes.
In 2022, the United Nations ‘World Day Against Trafficking in Persons’ was centred around victims’ voices, highlighting ‘the importance of listening to and learning from survivors of human trafficking’. 5 Scholarly discourse and practitioner advocacy have also underscored the significance of collaborative strategies and the enhanced incorporation of survivor narratives as potential avenues for strengthening anti-trafficking endeavours. 6 Funding opportunities also embrace the importance of capturing victim stories, with the inclusion of individuals with lived experience being a priority for policy-related research funded by the UK Modern Slavery Policy Evidence Centre. Survivor expertise is more prominent within the anti-human trafficking domain than in CSE, exemplified by organizations like the Human Trafficking Foundation in the United Kingdom, and it is progressively gaining prominence. Increasingly, survivors assume leadership roles, engage in advocacy efforts and participate in activism. 7
This is not surprising: victims can provide first-hand accounts of a crime, including human trafficking and CSE, allowing researchers and insight users to properly understand the driving factors behind these phenomena and what happens during and in the immediate aftermath. Who better to understand the nuances of why a false job advertisement is appealing, even if some level of risk is known, than the victim who took that risk? Likewise, who better to understand the impact of control exerted on someone to cultivate cannabis than the person who was forced to do it? In these cases, where a victim was compelled to commit a crime as a result of their human-trafficking situation, explaining to a prosecutor or to a jury that the victim really felt they had no choice is a challenge that defence lawyers continue to grapple with, but the victim’s own words can help. Likewise, we can continue to assume, using our decades of experience, what victims need for recovery, but only they truly know. The authors of this paper discovered, by reviewing the analysis of victim stories, that alongside evident needs such as shelter and access to food, a large proportion of victims struggle with accessing public transport, predominantly due to costs. It is perhaps an obvious point but one often overlooked: for stakeholders delivering recovery and assistance programmes to have a significant and lasting influence on a victim’s recovery, it is crucial for them to have a deeper comprehension of the difficulties encountered by victims and to understand their actual needs. The victims’ stories and experiences are a source of true and reliable data.
Incorporating individuals who have personally lived through human trafficking or CSE also has the potential to address the current demographic disparity within research. For example, research often focuses on female victims of sexual exploitation. This leads to legislation, policy and interventions which ignore the reality that boys can also be trafficking victims and can also be victims of CSE. As such, victim viewpoints can offer more precise and efficient guidance for shaping anti-trafficking policies and programme strategies. Scholars note that ‘inclusion of survivor perspectives helps ensure that anti-trafficking efforts are culturally appropriate, and serves as a reality check to guide outreach, advocacy, service provision, research, and policy-making efforts’. 8
The imperative for the inclusion of individuals with lived experience, particularly those most affected by policy changes, extends into every aspect of personal crime, and not just human trafficking and CSE. Indeed, there is much to learn from other domains. This expectation is evidenced in literature by Gardner and Brindis. 9 A comprehensive review of mental health, substance abuse and domestic violence research literature reveals widespread support for involving service users (SUs) in research, policy formulation, programme planning and implementation. Despite Abayneh et al.’s 10 observation of an overall deficiency in sustained efforts and mechanisms for involving SU voices in mental health research, they affirm that the informal knowledge derived from lived experiences can complement formal theorizing, enhance peer-reviewed literature reviews and hold potential for advancing mental health systems.
However, as Lockyer writes, ‘there are very few studies exploring the perspectives of survivors’. This is for a myriad of reasons. Certainly, as already mentioned in the Introduction, it is a challenging undertaking. Already the first step is hard, namely having access to ‘research subjects’. Often researchers will be able to interview only a handful of victims; rarely do we see studies with a sample size of over 100. For example, a study by Lim et al. 11 looked to qualitatively examine the factors that impact the recovery and reintegration of Asian sex trafficking survivors in the United States. Ten interviews were conducted, with only three Korean survivors of sex trafficking and seven frontline service providers who had worked or were currently working with East Asian survivors.
Even when researchers secure the ‘research subjects’, the data collection is hard emotionally on both the victim and the researcher. Consequently, research ethics need to form a central part of any engagement, and here getting ethical clearance can also be hard. Indeed, there are numerous ethical fears, encompassing concerns such as securing informed consent from participants, safeguarding their confidentiality, ensuring their safety during the interview process and addressing sensitive or emotionally charged topics. Managing these ethical and practical aspects requires careful planning and adherence to ethical guidelines to protect both researchers and participants.
The act of gathering data, whether through interviews, surveys or focus groups, imposes significant demands on researchers in terms of time, resources and expertise. Researchers are required to meticulously craft, conduct, transcribe, scrutinize and interpret interviews, which can be resource-intensive and time-consuming. The data-gathering process often yields copious amounts of data, which can pose substantial challenges in terms of data management, organization and coding. Researchers grapple with the complexities of handling and extracting meaningful insights from the wealth of information generated through interviews.
As mentioned above, if these interactions with victims are undertaken with ethical considerations, including the minimization of the risk of traumatization, there can be benefits for the victims who participate. The authors of this paper are experienced researchers, some of whom have provided direct support to victims of human trafficking in safe houses, and we have in our time interviewed numerous victims. In our roles, we have observed the increase in calls for victim presence in anti-trafficking efforts. We have talked with persons with lived experience who want their perspectives to be considered by all stakeholders, in particular service providers and policymakers. Additionally, we have witnessed the discord between intervention strategies and the realities on the ground in the realm of human trafficking.
Whilst we support the notion that intervention in a crime ought to be led by the victim’s experience, we also acknowledge that one victim is not an expert on all matters of human trafficking or CSE. What does a child soldier in Sudan know about the experience of a Vietnamese man forced into labour to repay his late father’s debts to a gang of loan sharks? Probably not much. In turn, the said Vietnamese man does not know what technology is used to recruit women from Kosovo to have their organs stolen from them. Likewise, one girl’s experience of CSE from a family member will be different from another girl’s story, even if the exploitation was also at the hands of a family member. As such, it is not enough to collect data via interviews from 5 or 20 victims. What we need to truly understand the nuances of the crimes, and how they are related to other crimes, is mass amounts of data. Moreover, we need not just data from one point in time but data that are collected continuously, so that we can monitor changes in the modus operandi of the criminals. A case in point is generative AI: a few years ago, few perpetrators were developing their own child explicit material, but now there is an increasing trend of this phenomenon (see a report by the Internet Watch Foundation 12 ). What is needed, thus, is vast amounts of continuous data that probably only a computer could efficiently analyse.
The latter point is crucial. There is no point collecting victim stories if we will not use them. Regrettably, that is the status quo. Think how many interviews a victim of human trafficking has to go through: a police interview, a review to enter a national referral mechanism, an interview by the NGO that will provide them with assistance and an examination in court, and that is all before the researchers come knocking. Although these interview transcripts may initially serve specific purposes, such as aiding a safe house in determining the victim’s required support or contributing to a risk assessment, they often languish in databases, forgotten. Periodically, someone may make a simple request, like inquiring about the number of safe house clients needing mental health support. The case worker then grapples with revisiting numerous case notes or, if fortunate, extracting a statistic from a structured dataset. Challenges intensify with nuanced requests, such as understanding the intricacies of victims’ movements from one country to another and identifying emerging patterns. This detailed information is seldom captured in the structured data of a customer relationship management (CRM) system and demands exhaustive scrutiny of each narrative, an impractical task for an overwhelmed case worker.
Taking note of all the above motivated the authors of this paper, and the colleagues they work with, to devise innovative methodologies for amplifying the voices of victims, all while sparing them from enduring protracted interviews and safeguarding researchers from the emotional toll of reading countless distressing accounts. We wanted to find a way to bring to light victims’ voices at scale to allow a true understanding of the crime. The next section provides a layman’s description of our proposed methods for including victim stories in counter-trafficking and CSE efforts.
NLP as a Method for Bringing Survivors’ Voices to the Front
Below, we illustrate how we employed NLP models as a tool to analyse the various needs articulated by survivors of human trafficking who receive support from an NGO. Please note that whilst in this instance we focused on needs, the same method can be devised for understanding other aspects of the crimes. For example, we could equally aim to understand recruitment methods or control methods in CSE or transport routes in international trafficking.
Our research utilizes a dataset comprising nearly 40,000 human-typed notes generated by NGO staff dedicated to aiding human-trafficking victims. These notes provide comprehensive insights into the topics discussed by victims during their routine support calls and meetings. Primarily, the notes delve into the victims’ historical experiences, current situations and ongoing needs, all of which the NGO is actively addressing and supporting.
Rather than undertake the labour-intensive task of reading through 40,000 notes, as mentioned, we employed NLP to automatically detect the support requirements discussed within the dataset’s notes. Subsequently, we consolidated this information to generate insights into the predominant support needs at specific times or for particular demographic groups.
In its broadest sense, NLP describes any computational method which facilitates the understanding and processing of human language. We refer to these computational methods as ‘models’; however, in their simplest form, these models may simply consist of a set of rules which identify the presence of specific keywords which may be indicative of the topic being discussed within an extract of text. Consider the sentence,
Joe Bloggs was criminally exploited in the United Kingdom.
If we were to develop an NLP model capable of identifying whether criminal exploitation was being discussed in a text extract, it may be enough to simply search for the words ‘criminally exploited’, and all other derivatives of the term. If the key term appears in the text we would classify the sentence as one which discusses criminal exploitation, and if it does not, we would not. However, if instead we wanted to determine who was being criminally exploited, where they were being criminally exploited, why they were being exploited and any other details of their experience being discussed, the problem very quickly becomes too large and complex to use this type of simple rules-based approach. Instead, we require sophisticated mathematical models which are capable of reading through and understanding text in a similar way to how a human reader would do so.
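The simple rules-based approach described above can be illustrated with a minimal sketch. The pattern and the function name `mentions_criminal_exploitation` are hypothetical and chosen only for illustration; they are not the rules used in the authors' work.

```python
import re

# Hypothetical pattern covering 'criminally exploited' and close derivatives
# (e.g. 'criminal exploitation', 'criminally exploiting').
CRIMINAL_EXPLOITATION = re.compile(
    r"\bcriminal(?:ly)?\s+exploit(?:ed|ation|ing|s)?\b",
    re.IGNORECASE,
)


def mentions_criminal_exploitation(text: str) -> bool:
    """Return True if the key term, or a derivative of it, appears in the text."""
    return CRIMINAL_EXPLOITATION.search(text) is not None


example = "Joe Bloggs was criminally exploited in the United Kingdom."
is_match = mentions_criminal_exploitation(example)
```

A rule of this kind answers only whether the term is present. It says nothing about who was exploited, where or why, and it would also flag a sentence such as ‘he was not criminally exploited’, which is one reason a context-aware model is needed for the richer questions.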
Recently, large language models (LLMs) have emerged as a promising development in the field of NLP. LLMs are generally based on the transformer model architecture of Vaswani et al. 13 and are the basis of many recent high-profile breakthroughs in the field of AI, such as OpenAI’s release of ChatGPT. They work by breaking down a sentence into ‘tokens’, which are the individual words or sub-words within a piece of text. For each token, the model considers its relationship with every other token in the text in an attempt to understand any relevant context which may give meaning to it. In the example sentence above, the presence of the words ‘in the’ provides an entirely different context to ‘United Kingdom’ than if they instead said, ‘by the’. Hence, much like a human would do, an LLM will read a piece of text and consider the relationship between each token and every one of its neighbouring tokens in order to build its interpretation of the text’s overall meaning.
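To make the idea of tokens and their pairwise relationships more concrete, the following is a toy sketch of the scaled dot-product attention weights used in transformer models. Everything here is an illustrative assumption: the whitespace tokenizer, the `toy_embedding` vectors (derived from character codes rather than learned) and the absence of learned query, key and value projections. A real LLM learns all of these from data.

```python
import math
from typing import List


def tokenize(text: str) -> List[str]:
    """Toy whitespace tokenizer; real LLMs use learned sub-word vocabularies."""
    return text.lower().replace(".", "").split()


def toy_embedding(token: str, dim: int = 4) -> List[float]:
    """Deterministic stand-in vector built from character codes (not learned)."""
    code_sum = sum(ord(c) for c in token)
    return [math.sin(code_sum * (i + 1)) for i in range(dim)]


def attention_weights(tokens: List[str]) -> List[List[float]]:
    """For each token, softmax of its scaled dot products with every token."""
    vectors = [toy_embedding(t) for t in tokens]
    d = len(vectors[0])
    weights = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        peak = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


tokens = tokenize("Joe Bloggs was criminally exploited in the United Kingdom.")
weights = attention_weights(tokens)
```

Each row of `weights` describes how strongly one token attends to every token in the sentence, and each row sums to one. With toy vectors the numbers carry no linguistic meaning; the sketch only shows the mechanics of relating every token to every other token.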
For the work that the authors carried out in proof-of-concept projects, we fine-tuned an LLM to perform the classification of the topic being discussed in a given piece of text, which is also known as topic classification. We aimed to understand the different types of support needs being discussed by each victim of exploitation, which may be anything from their mental and physical health to their financial or legal support needs. The need for an LLM arises from the complexity and the nuance involved when interpreting each victim’s story, as we not only need to identify the topic of each text (e.g., mental health), but we also need to understand whether an actual support need is being discussed, the grammatical tense of the support need, who the support need is in regard to, and so on. Hence, a thorough understanding of the text’s context and meaning is required to deliver accurate insights.
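The several judgements described above (the topic, whether a genuine support need is being discussed, its tense and who it concerns) can be pictured as one structured label per text extract. The sketch below is hypothetical: the field names, the `Tense` values and the `current_needs` helper are assumptions made for illustration and do not reproduce the authors' classification schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Tense(Enum):
    PAST = "past"
    PRESENT = "present"
    FUTURE = "future"


@dataclass(frozen=True)
class NoteLabel:
    topic: str             # e.g. "mental health" (hypothetical category name)
    is_support_need: bool  # True only if an actual need is discussed
    tense: Tense           # when the need applies
    subject: str           # who the need relates to, e.g. "service user"


def current_needs(labels: List[NoteLabel], subject: str = "service user") -> List[str]:
    """Keep topics that are genuine, present-tense needs of the given subject."""
    return [
        label.topic
        for label in labels
        if label.is_support_need
        and label.tense is Tense.PRESENT
        and label.subject == subject
    ]


labels = [
    NoteLabel("mental health", True, Tense.PRESENT, "service user"),
    NoteLabel("legal", True, Tense.PAST, "service user"),
    NoteLabel("housing", False, Tense.PRESENT, "service user"),
    NoteLabel("financial", True, Tense.PRESENT, "family member"),
]
needs = current_needs(labels)
```

Separating these dimensions means a note that merely mentions a topic, or describes a need that has already been met or belongs to someone else, is not counted as a current need of the person being supported.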
To our knowledge, this is the first attempt to apply LLMs in the domain of understanding human-trafficking victim needs; hence, there was no off-the-shelf model capable of performing this task. Instead, we had to develop a bespoke classification schema of support needs relevant to human-trafficking victims and fine-tune an LLM to understand what each of these support needs means, specifically in the context of human trafficking. Amongst other things, this required the model to develop an understanding of the various terms, names of organizations or names of initiatives which appear in the context of human trafficking but may not appear much elsewhere in the English language. To that end, we also heavily relied on the input of subject-matter experts (SMEs) when training the models and defining our classification schema, which ensures that the different types of support needs we have encoded our model to understand are accurate to their real-world meaning.
Including SMEs in our work is part of what we call a socio-tech approach. This approach integrates the insights and expertise of numerous individuals who possess a deep understanding of the complex issues surrounding the problem we are working on, from victimologists, sociologists and legal advocates to ethicists and experts in privacy and data protection. By collaboratively engaging with these SMEs, our NLP development aims to bridge the gap between technological innovation, the nuanced realities faced by victims and the challenges posed by working with sensitive data. Through this synergistic approach, we create solutions that not only leverage cutting-edge technology but also reflect a comprehensive understanding of the social, legal and victimological dimensions of human trafficking. In doing so, we develop NLP tools that are not only efficient and accurate but also ethically sound and attuned to the unique needs of those affected by this crime.
Once trained, the LLM was deployed on a corpus of text, which can theoretically be of unlimited size, identifying all mentions of each type of support need we have trained the model to understand. As this classification can be performed efficiently at scale, we were able to extract, within a matter of hours, insights about the trends of different support needs being discussed that would otherwise have taken a human analyst weeks or months to produce manually. We then calculated the prevalence of each support need type over time, in addition to identifying insights such as the different groups of people or the different locations which have a high prevalence of each support need type. Hence, these results help organizations that support victims of human trafficking to reallocate resources and implement measures to support victims more effectively.
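The aggregation step described above can be sketched as follows. This assumes that the classifier output has already been reduced to records of (month, person identifier, support need type), and that prevalence means the share of people with at least one classified note in a month who mentioned a given need. Both the record layout and that definition are assumptions for illustration, not a description of the authors' pipeline.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical record layout: (month as "YYYY-MM", person identifier, need type).
Record = Tuple[str, str, str]


def prevalence_by_month(records: Iterable[Record]) -> Dict[str, Dict[str, float]]:
    """Share of people seen in each month who mentioned each support need type."""
    people_per_month: Dict[str, Set[str]] = defaultdict(set)
    people_per_need: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for month, person, need in records:
        people_per_month[month].add(person)
        people_per_need[(month, need)].add(person)

    result: Dict[str, Dict[str, float]] = defaultdict(dict)
    for (month, need), people in people_per_need.items():
        result[month][need] = len(people) / len(people_per_month[month])
    return dict(result)


records = [
    ("2023-01", "a", "legal"),
    ("2023-01", "b", "legal"),
    ("2023-01", "b", "mental health"),
    ("2023-02", "a", "mental health"),
]
prevalence = prevalence_by_month(records)
```

Counting distinct people rather than individual notes keeps one frequently contacted person from dominating the figures. The same grouping could be keyed on demographic group or location instead of month.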
So What…
The implementation of our NLP-based approach to understanding the needs of human-trafficking victims serves as a compelling proof of concept, showcasing the transformative potential of technology in addressing intricate social issues. Through rigorous testing and real-world application, our system demonstrated its efficacy in extracting valuable insights from victim narratives, shedding light on hidden dimensions that might otherwise remain overlooked. By way of an example, we noted that most (96%) victims focused on needing legal support. Of the 515 service users who talked about legal support needs, almost half (249 individuals, 48% of those 515) also mentioned ‘asylum’. Looking at another need, in total 442 individuals discussed mental health support needs, accounting for 83% of the whole population. The split by mental health condition tends to be similar by gender for anxiety (29% overall) and for PTSD and trauma (13% overall). The figures vary for depression (22% of women, 19% of men), suicide (14% of women, 17% of men) and panic attacks (5% of women, whilst men did not mention panic attacks).
Whilst it is not the purpose of this article to present the insights emerging from our work, we do wish to convey that the results underscore the significance of employing advanced linguistic models to unravel complex and multifaceted stories, enabling a more nuanced understanding of victim experiences. Before this, no one had gone through all the data to understand patterns and trends in victim needs.
Moreover, this proof of concept not only validates the feasibility of our NLP approach in the context of human trafficking but also advocates for its broader application. By leveraging similar methodologies, we can extend our socio-tech approach to unravel the intricate threads of various societal challenges, ranging from developing an accurate view of group-based CSE and/or domestic abuse to developing a picture of the locations where criminal exploitation takes place.
The adaptability of NLP technologies holds the promise of uncovering obscured narratives, contributing to a more comprehensive understanding of a crime and its impacts. Unlike traditional approaches that heavily rely on structured data, which often prove rigid and may not fully capture the nuanced and evolving nature of victims’ experiences, NLP excels in handling unstructured textual data. Structured data, while of course valuable in certain contexts, impose limitations when applied to the complex and dynamic narratives surrounding human-trafficking victims. These stories often defy conventional categorizations and evolve over time, making them challenging to capture through rigid data structures. NLP, by contrast, thrives on the richness of unstructured text. By analysing the language used in victims’ narratives, it can discern patterns, sentiments and hidden meanings that might elude more structured approaches.
The true power, we thus found, lies in NLP’s capacity to navigate the intricate tapestry of victims’ stories, extracting subtle nuances, even emotions, and contextual information that are essential for a holistic understanding and subsequent interventions. For example, by knowing there is a continued increase in victims’ needs for legal support, NGOs can ensure their future budget and resources accommodate this.
Conclusion
Within the vast array of applications for NLP, aiding law enforcement agencies, NGOs and policymakers in comprehending and addressing human trafficking and CSE stands out as a particularly recent and promising development. In this paper, we developed and tested NLP models to extract insights from copious amounts of data representing the stories victims had told to their case workers. The results are promising. To understand needs surrounding, for example, mental health, we did not have to interview victims, hence avoiding causing any further trauma. Instead, we used what they had already said. In turn, we can now take our insights to advocate for change, for instance, by asking budget holders for further investment in therapy for men to address suicidal thoughts.
Tackling the complex issues of human trafficking and CSE demands a paradigm shift in our approach. Relying less on speculative insights derived from limited interviews and embracing the power of NLP to analyse vast data sets can provide a more comprehensive understanding of these crimes. However, this transformative shift requires substantial investment. We know that good, ethical technology does not come cheap. As governments formulate strategies to combat these acts, it is imperative to allocate budgetary resources towards ethical AI initiatives. By prioritizing investment in advanced technologies, we pave the way for a more effective and data-driven approach to combating human trafficking and CSE, ultimately striving for a safer and more secure future for all.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
