Abstract
Technical and professional communicators need to continue to interrogate how the seemingly mundane documents they create, such as terms of service (ToS) and community guidelines, and the systems those documents become a part of can oppress, exclude, and affect marginalized and hypermarginalized communities. This article presents an adapted corpus-assisted discourse analysis of how “sexual content” is defined across a corpus of 176 ToS and community guidelines from 118 social media sites. The findings show how ToS and community guidelines can work together to complicate our understanding of how values are intentionally and unintentionally embedded in these documents in order to uphold power or to meet emancipatory ends.
“I know it when I see it.” Justice Potter Stewart, describing obscenity (Jacobellis v. Ohio, 1964)
ToS are legal agreements between an entity providing a service and the person using that service. For social media, ToS are legal agreements that outline multiple rules for participating in the online community and using the platform as well as the obligations of the platform to the user. In contrast, community guidelines are not legal agreements but are policies about how to behave on a platform and usually include content rules. Although many ToS require adherence to community guidelines as part of the agreement, not all do. Also, both ToS and community guidelines are complicated by their constantly changing nature. These documents can change sporadically, without warning, and in ways that sometimes purposefully obfuscate how they have changed. Most platforms are required to notify users of ToS changes, but these notifications often come in the form of e-mails or short pop-up notices that are easily discarded.
TPC scholars study content moderation more broadly (Edwards, 2018; Potts & Trice, 2022) and occasionally help guide forum moderation (Frith, 2014; Swarts, 2015), privacy policy (Markel, 2005, 2010), and ToS (Hope, 2021; Pigg, 2014; Ramler, 2020). But the application of ethics to studying and analyzing ToS and their relationship with community guidelines and content moderation, particularly with concerns for hypermarginalized groups, is still much needed. In this study, I continue work started by Moeggenberg et al. (2022) about how specific documents harm and oppress particular groups. I do so by calling attention to the documents influencing, impacting, and informing content-moderation decisions: ToS and community guidelines.
The Application of Ethics to ToS and Community Guidelines
ToS and community guidelines create (effectively or not) the ethical parameters for social media sites that are then upheld (effectively or not) through content moderation. For example, Colton and Holmes (2018) shared how a social media platform has a “publicly available legal page, which includes its terms of service for elements such as intellectual property and copyrights, which enforce ethical conduct” (p. 26). Not all ToS and community guidelines effectively enforce ethical conduct in practice, but they can be a starting point for analysis. By beginning with how ToS and community guidelines define “sexual content,” I will show the values that are already laden in these documents.
Although most ToS are “designed to protect the company's legal interest” (Suzor, 2019, p. 11), scholars have included discussions about ToS as part of their larger work on digital spaces and their effect on community identity, values, and practices. For example, Lingel (2017) expressed that “shared physical spaces, rules, and practices, including the terms of service (TOS), are a practical means for members to draw together as a group” (p. 49). Lingel posited that ToS build a community, setting boundaries for who belongs in the community and the consequences of violating those boundaries. Her study makes clear that members of a counterculture, such as body modification enthusiasts, punk rockers, and drag queens, read ToS carefully. In line with Lingel's view, I argue that ToS and community guidelines together become governing documents imbued with values. We need to examine more closely how such documents complicate our understanding of the ways that values and morals are embedded in them, intentionally or unintentionally, and used to uphold power through the processes that the documents establish.
Other TPC scholars have begun conversations engaging ethics and content moderation, particularly focusing on “rules” and ToS. For example, Potts et al. (2019) discussed platform rules and moderation, concluding that users are able to avoid creating and maintaining positive values while still, on paper, following the rules set out by the platform (pp. 361–362). Although they focused on “amoral groups” and “communities of harm,” their analysis sets a solid foundation for studying the connection between ethics and platform governance. I aim to build on this argument here by discussing how ToS and community guidelines, as part of content moderation, can create, uphold, or express certain ethical frameworks in online communities, particularly on social media.
By centering my discussion on sexual content, with its multitude of definitions and applications—from sexual health education to queer content, sex work, fan art, erotica, and even child sexual abuse materials, and more—I aim to discuss ethical parameters beyond harassment and abuse. Although Trice et al. (2020) argued that “many digital communities have descended into an ethos of aggrievement and harm” (p. 48), I argue that the same tactics used to comply with platform rules and skirt them at the same time can be used to protect certain communities and that ToS and community guidelines, and the content-moderation practices that stem from these documents, can be used to help both moderators and users to flourish. In doing so, this study extends other ethics studies in TPC, such as Walwema et al.'s (2024) work on the ethics of exclusion, Colton et al.'s (2017) work connecting ethics and tactical technical communication, and Edenfield et al.'s (2021) work on digital research ethics.
My focus on sexual content is important because of the impact sexual content moderation has on hypermarginalized users, such as queer people and sex workers. Here I am extending work on gender in TPC, such as Frost's (2023) call for feminist technical communicators to be “devoted to decentering traditional centers of power in favor of radical, inclusive, and diverse feminist praxes” (p. 123). Focusing on sexual content, something that centers of power continually try to regulate, opens up ways to understand content moderation's impact on marginalized users and is in line with feminist TPC work. For example, flagging, tagging, content removal, and deplatforming queer users are all tactics that stem from upholding ToS and community guidelines. These tactics, rules, laws, processes, flagging systems, and more that disproportionately remove queer content are recreating past power hierarchies aimed at controlling public sexual expression (Waldman, 2022). Markers such as “NSFW” (not safe for work) or the term “SafeSearch” for searches that exclude sexual materials indicate that the “practice of conflating safety with the filtering of sexual content both builds on and bolsters an understanding of sex and sexuality as inherently risky, potentially harmful, and best hidden away and left unmentioned” (Paasonen et al., 2019, p. 2). Content moderation, and its documents, can be particularly detrimental for LGBTQIA+ communities and criminalized groups, such as sex workers.
Like Itchuaqiyaq et al. (2022), I want to center on hypermarginalized communities—communities with intersections of multiple marginalizations such as “identities derived from their criminalized work, race, gender, ability, and other identities” (p. 2). The term “hypermarginalized,” then, calls attention to the ways that marginalizations can intersect in various and diverse ways across different institutions, contexts, and publics, as well as over time.
Therefore, in this article, I examine the following research questions by conducting an adapted corpus-assisted discourse analysis (CADA) of ToS and community guidelines from a wide range of social media platforms to see how they define “sexual content”:
To what extent are ethical frameworks or values apparent in ToS and community guidelines through their definitions of sexual content?
In what ways (if any) can ToS and community guidelines be emancipatory?
I created a corpus of 176 unique ToS and community guidelines documents from 118 unique social media sites. I begin by discussing CADA's place in TPC and why I consider my method to be an adapted CADA. Then I explain how I conducted my adapted CADA, including selecting the artifacts, defining “social media,” and creating the final corpus. I conclude by sharing the implications of my study and pointing out the future research that is necessary for considering how “mundane texts” impact users online and how engaging ethical frameworks can strengthen TPC social justice work, particularly for emancipatory aims.
CADA and TPC
CADA focuses specifically on “frequency, keyword, collocation, and concordances to find out the discursive patterns in language” (Ding & Kong, 2019, p. 94) and has been used in a variety of ways for more than 20 years, especially in TPC. To name a few examples, Graham et al. (2015) used “big data methodologies” to conduct a statistical genre analysis of Food and Drug Administration (FDA) meetings; Gallagher et al. (2019) used what they termed “big data audience analysis” to analyze a corpus of user comments from The New York Times; Anesa and Pellón (2019) conducted a keyword analysis on a corpus of patents to inform technical communication courses; and Carradini (2020) used a collocation analysis to study research topics in TPC and business communication.
More recently, Lang et al. (2023) traced the history of corpus analysis in TPC, creating a guide for TPC scholars and practitioners to practice CADA. They highlighted how TPC scholars have used corpus analysis “to examine textual practices in a wide variety of organizational and cultural settings” (p. 95) and how this work has been published across TPC's major journals.
Using Lang et al. (2023) and Itchuaqiyaq et al. (2022) as a model, I conducted an adapted CADA. Itchuaqiyaq et al. (2022), in their work on risk communication and safety guides for street-based sex workers, used an adapted CADA technique to create a custom keywords list for keyword analysis. They argued that adapting CADA techniques to create a keyword list helped “confirm whether our read observations were supported by data” because “corpus analysis techniques can be a potential corrective to such preexisting biases by grounding [the] analysis in data rather than impressions” (p. 16). Of course, as a social media user myself, I had preexisting ideas about ToS and community guidelines and my own experiences with navigating them, so as I built the corpus, these preexisting ideas and experiences affected my initial observations about the language I was noticing. By using an adapted CADA method, like Itchuaqiyaq et al. (2022) did, I can see if my initial impressions are grounded in the research. Because my goal was to examine how sexual content is defined and used in ToS and community guidelines, I used an adapted CADA method to collect keywords for sexual content in order to “take both an overview of a body of work and a detailed, contextual view of keywords and phrases in a text” (Itchuaqiyaq et al., 2022, p. 15). I was concerned not only with how often sexual content and related keywords appeared in these documents (frequency) but also with analyzing the contexts and patterns of the content and keywords.
Corpus analysis helps us “see patterns across collections of documents with the aid of computer applications … those patterns are interpreted through close reading that allows inferences that are beyond machine capability” (Lang et al., 2023, p. 94). Conducting a CADA can be like looking at a large city (the corpus) and “zooming in” on the individual residents (keyword count), houses on a block (clusters), and various neighborhoods (collocations). Although I did not conduct a traditional corpus analysis comparing large corpora to one another, Itchuaqiyaq et al. (2022) set the foundation for adapting CADA to fit the needs of TPC researchers looking at a small corpus of related documents.
Also, Itchuaqiyaq et al. (2021) stressed the importance of a hybrid approach to “assign tedious, repetitive tasks to computational algorithms, leaving tasks of contextual inquiry to human coders” (p. 11). Thus, one way I used an adapted CADA is through a hybrid approach of both computer-assisted analysis and my own analysis. The “tedious, repetitive tasks” that I left to a computer program were to count keywords and identify clusters and collocations.
CADA and Definitional Work
Doing definitional work as the first stage was important because “‘defining’ work sets policy and, therefore, particularly around expression, implicates core public values” (Bamberger & Mulligan, 2022, p. 1126). Because ToS and community guidelines are used in content moderation to form a basis or use as evidence for moderation, it is important to understand how sexual content is being defined. An adapted CADA enabled me to find the underlying patterns of how sexual content is defined, which then informed my use of ethical frameworks to understand how the way it is defined upholds power and authority and could make room for emancipatory ends.
CADA can also be used to explore practices dealing with “nonnormative” issues such as socially just citation (Itchuaqiyaq et al., 2021) and sexual normativity (Motschenbacher, 2022). This scholarship moves beyond the corpora usually studied in linguistic analysis to show how TPC scholars can use methods derived from CADA in their own textual and discourse analyses.
Study Design
Following Lang et al.'s (2023) tutorial for TPC scholars and practitioners to conduct a CADA along with the adapted methods from Itchuaqiyaq et al.'s (2022) study, I conducted an adapted CADA of 176 ToS and community guidelines documents from 118 social media sites. I focused on one snapshot in time. That is, I collected the corpus in January 2024, and the definitional content reflects that specific time period. Since then, there have been numerous changes to these sites’ ToS and community guidelines, particularly since the 2024 presidential election.
For my corpus, I use Potts and Trice's (2023) definition of social media as a “diverse, and not always cohesive, array of platforms wherein participants can interact with each other in digital spaces meant to communicate across space and time” (p. 273). This definition includes popular sites such as Facebook, X, and Reddit as well as smaller communal sites such as The StoryGraph (a book sharing, recommendation, and social media site) or Dreamwidth (a journaling social media site). Gillespie et al. (2020) have called for researchers to make sure that their research and policy work extend beyond the largest platforms because innovative and interesting work is being done on a wide range of platforms.
Including a diverse range of platforms is also important for evaluating how the definitions of sexual content vary depending on their purpose, users, audience, and owners. Because impacted communities (e.g., sex workers) use a variety of social media sites that are not necessarily focused on their work, such as Meta platforms (e.g., Instagram) to share “safe for work” content or X to conduct activism or even to advertise, my corpus needed to capture a wide range of sites. Therefore, after solidifying my definition of a social media platform, I decided that the ToS and community guidelines for my corpus must each be
the official ToS or community guidelines for the platform
from a platform whose main purpose is social media or social networking (i.e., not dating sites, pornography sites, or wikis)
from a different company (i.e., only one artifact from Meta) to ensure a diverse range of artifacts
from an active platform (i.e., not archived ToS/community guidelines of inactive sites)
owned by U.S.-based companies (This project is not set up to engage with global technical communication scholarship.)
The corpus was also limited to the ToS and community guidelines that I could access without signing up for an account and could find through links from the home page or app store description. With the rate of new app creation, changes, and deletions of social media, my goal was not to have the ToS and community guidelines from every social media platform but instead to have a representative sample. Although traditional corpus research includes large corpora of millions to billions of words or tokens, the TPC studies that I am using as a foundation (i.e., Itchuaqiyaq et al., 2022; Lang et al., 2023) use smaller corpora. My goal was to achieve a corpus of at least 100 documents from social media sites because this number would be a large expansion of my initial pilot study (Jordan, 2023), would offer a range of ToS and community guidelines, and could be achieved with my criteria.
Locating the Corpus
Rather than web scraping, I conducted a manual search due to ethical concerns. I did not want a web scraper to pull any information that could exploit or harm the groups that I am advocating for here, that is, criminalized populations such as sex workers, who are on “platforms hostile to them, using payment platforms that make their income precarious, and needing additional protection measures in many other aspects of their lives” (Bhalerao et al., 2022, p. 549). Also, a manual search allowed me to locate ToS and community guidelines that were named with different terms (e.g., “Rules” or “Community Conduct”).
My manual search resulted in 362 unique sites. Next, I removed any sites that were based in a country outside of the United States (as determined through their Wikipedia pages and websites) or otherwise did not meet my criteria, as well as any sites that ended up on the list but were not social-media focused. After these cuts, I had 118 unique sites (see Appendix at Google Docs). Then I accessed and downloaded the ToS and community guidelines of all 118 sites by going to their webpages or app information in Google Play, retrieving 176 unique documents for my corpus. There are more documents than sites because I pulled both ToS and community guidelines. In some instances, ToS and community guidelines were combined; in others, they were separate documents or community guidelines did not exist. All the ToS and community guidelines were downloaded from January 9 to January 11, 2024.
Unlike Lang et al. (2023) and Itchuaqiyaq et al. (2022), I did not organize my files into batches. Because my focus was on patterns (not comparisons), I did not need to create subgenres or groups for the documents. I wanted to take this corpus as a whole and use AntConc (Anthony, 2024) to assist in counting keywords and identifying clusters and collocations. In the future, it would be useful to have a more nuanced analysis of the documents, comparing them to one another, but in this first stage of the study, I aimed to analyze these documents overall in order to access patterns. Again, we can view the corpus as the larger “city” that we want to learn more about by analyzing the patterns that exist across neighborhoods (collocations), houses on a block (clusters), and individual residents (keywords) and how the “neighborhood watch” (content moderators) may use these documents to surveil, impact, or affect that particular city.
Corpus Analysis Software
I chose AntConc (Anthony, 2024) as my corpus analysis software. Inspired by Itchuaqiyaq et al.'s (2022) study, I have used AntConc in the past to conduct a pilot study on four platforms’ ToS documents (Jordan, 2023). One strength of AntConc is its ability to identify collocation data (the words surrounding the keywords). Collocation data can help researchers locate concordances, that is, instances of a keyword shown in their surrounding context. Concordance analysis “may indeed make a useful contribution to linking quantitative findings to the wider social context” (Motschenbacher, 2022, p. 40), which is important when discussing ethical frameworks for analyzing these documents.
Corpus Results and Analysis
Because my focus was to see how ToS and community guidelines define sexual content and the contexts in which sexual content is discussed within these documents, I created a keyword list to search the corpus based on my earlier pilot study (Jordan, 2023). Although a traditional CADA would use a reference corpus to generate a keyword list to use as a comparison, I did not identify a reference text that would work for my study's parameters. If I were studying the ToS and community guidelines as a genre, then I would have been able to find representative corpora for comparison, but my focus specifically on sexual content is atypical. As Itchuaqiyaq et al. (2022) highlighted, TPC “has struggled to compile a suitable reference corpus” (p. 17) for research dealing with sex work documents. Therefore, to analyze the nuance, diversity, and constant development of policies and guidelines around sexual content, I needed to create my own keyword list rather than pull one from a reference corpus.
I wanted to make sure to include keywords that sex workers and other hypermarginalized users had already identified for understanding and deciphering ToS. There is a history of sharing this information on social media, such as X, to communally understand ToS and the changes and updates to ToS that may impact a user’s ability to fully participate on a particular site. For example, one user shared on X the Google Doc “Adult Selling Site Restrictions update 20 January 2023,” a user-created document that lists what content is or is not allowed on sites that allow users to sell adult content (Ladder, 2023). To create my keyword list, then, I combined keywords generated by sex workers on X with the keywords that I gleaned from searching “sex work and terms of service” and synonyms for “sexually explicit material” on Google, resulting in 18 keywords:
adult, explicit, inappropriate, intimate, mature, nudity, obscene, obscenity, porn, pornography, sex, sexual, sexuality, sensitive, solicitation, soliciting, trafficking, vulgar
Keyword Count and Sexual Content Policies
In our corpus-as-city analogy, we can view the keywords as residents in the city. The keyword count shows us how often an individual resident goes to other people's houses and how many houses that resident visits. The 18 keywords in my study returned 1,801 hits from the 176 documents (with a total of 839,692 tokens, or words). “Sexual” was the most frequent keyword (with 567 instances), occurring across 82 of the 176 documents. In other words, the keyword “sexual” was our busiest “resident,” who went to other people's houses 567 times but visited only 82 of the 176 houses in the city. “Inappropriate” also occurred across 82 documents but appeared only 150 times. Table 1 shows each keyword with its rank, frequency (how many times the word appeared across all documents), and range (how many of the 176 documents the word appeared in).
Table 1. AntConc's Results for Keyword Rank, Frequency, and Range.
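The difference between frequency and range can be illustrated with a minimal Python sketch. It is not AntConc's implementation. The tokenizer, the function name `keyword_stats`, and the three-document toy corpus are all assumptions made for illustration, and the toy sentences are invented rather than drawn from the study's corpus.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a document and split it into word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def keyword_stats(documents, keywords):
    """Return {keyword: (frequency, range)} for a dict of {document name: text}.

    frequency = total hits across all documents
    range     = number of documents containing at least one hit
    """
    frequency = Counter()
    doc_range = Counter()
    for text in documents.values():
        counts = Counter(tokenize(text))
        for kw in keywords:
            if counts[kw]:
                frequency[kw] += counts[kw]
                doc_range[kw] += 1
    return {kw: (frequency[kw], doc_range[kw]) for kw in keywords}

# Hypothetical toy corpus; these sentences are invented for illustration.
toy_corpus = {
    "site_a_tos": "Sexual content is not allowed. Sexual exploitation is prohibited.",
    "site_b_guidelines": "Do not post obscene or inappropriate content.",
    "site_c_tos": "We may remove inappropriate or sexual material.",
}

stats = keyword_stats(toy_corpus, ["sexual", "inappropriate", "obscene", "trafficking"])
for kw, (freq, rng) in sorted(stats.items(), key=lambda item: -item[1][0]):
    print(f"{kw:15} frequency={freq} range={rng}")
```

In this toy corpus, "sexual" and "inappropriate" have the same range but different frequencies, which mirrors the pattern reported above. Because the sketch matches whole tokens, a search for "sex" would not count hits for "sexual."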
One particularly interesting finding from my initial keyword query is that “sex” or “sexual” did not appear in every document. This finding points to the numerous ways that ToS and community guidelines may handle issues around sexual content, especially on smaller and newer platforms. All the major social media platforms (e.g., Meta) do have sexual content policies, but other platforms house their sexual content policies in different ways, such as under “adult” (ranked 6th) or “sensitive” (ranked 10th) content policies. For example, the site BAND, an app focused on group communication, organization, and connection for sports teams, work teams, and families, has an activity policy that functions as community guidelines. This policy does not mention “sexual content” at all (see Figure 1). Instead, it has policies for “inappropriate” content and states that content containing pornography is not allowed. The site Academia.edu has one line under its general prohibitions policy stating that content that “is defamatory, obscene, pornographic, vulgar or offensive” is not allowed.

Figure 1. Screenshot of BAND's activity policy.
But some sites are explicit about allowing “not safe for work” content. For example, in its first line, the ToS for BDSMlr.com, a blogging site for people interested in bondage, domination, submission, and masochism (BDSM), states that “while our site is meant for BDSM, NSFW material and porn, we do not allow the following content in any circumstances” (see Figure 2). It then goes on to highlight that content cannot contain minors, “zoophilia,” “revenge porn,” or any identifying information of individuals.

Figure 2. Screenshot of BDSMlr's terms of service.
My corpus shows a wide range in the ways that the sites handle sexual content definitions and policies, from general guidance about “sensitive” content to explicit allowance of sexual materials. Many of the sites without sexual content policies are sites in which, compared to other sites, the community in general might not be as motivated to post sexual material. For example, the ToS of Catster, a platform in which users share and connect with others about their cats, only outlines that it “may” remove at its “sole discretion” any content that is “defamatory, obscene or otherwise objectionable” (see Figure 3). Here the keyword “obscene” covers a broad range of possible content. Also, the term “sole discretion” makes clear that content moderators can remove any content that they believe falls under these descriptions, and they are given plenty of leeway to enact moderation with the term “otherwise objectionable.” This leeway, however, offers an initial insight into these documents’ emancipatory potential. Although the term “obscene” can be used subjectively to encompass more content under the umbrella of “sexual content,” its loosely defined nature can also be used to include more content on a site. There are tactical ways that unclear or general policies can be used for inclusion rather than exclusion.

Figure 3. Screenshot of Catster's terms of service.
Other sites that would seem to deal with sexual content also do not have sexual content policies. For example, Clubhouse, a social media app based on voice notes, has one line in its ToS that states, “You understand and agree that you may be exposed to User Content that is inaccurate, objectionable, inappropriate for children or otherwise unsuited to your purpose” (see Figure 4). Rather than create a sexual content policy, Clubhouse has users agree that they may be exposed to content that they will not like. Clubhouse's policy is particularly interesting because it states that there may be content “inappropriate for children,” language that echoes current legislation around platform content aimed at “protecting children” (Kids Online Safety Act, 2025). One way to view this type of policy is that it absolves a platform of its legal responsibility but enables users to cite this policy when defending their own content against unfair censorship and removal. Policies can be used, then, for both enforcement and evasion, aiding in their emancipatory potential.

Figure 4. Screenshot of Clubhouse's terms of service.
Although Clubhouse has a much more open approach to content in its ToS—rather than restricting users to certain types of content, it has users agree that they may be exposed to content that the user finds personally problematic—its community guidelines have “Nudity and Sexual Content” and “Sexual Exploitation” sections that state, “We don’t allow sexual imagery or nude photographs. Adult-themed conversations should not violate our other policies around child safety and sexual exploitation.” Clubhouse is a strong example of how community guidelines can be used to specify ideas that are not included in the ToS, creating more “wiggle room” in policies that users can manipulate to be able to use the site. Clubhouse's use of ToS in conjunction with its community guidelines exemplifies how intertwined these documents can be on many platforms.
Last, “trafficking” is a fairly uncommon word from the keyword list in the ToS documents, ranking 11th in the corpus (and appearing in just 15 of the 176 documents). Although much of the discourse surrounding sexual content and the laws governing sexual content online focuses on trafficking, the appearance of that term in these documents is minimal.
Keyword Clusters and Inherent Values
AntConc will also create clusters, or words that are found not only together but specifically in sequence. Clusters, then, are like houses all in a row on one street. A cluster shows a close relationship between the words. Because my study concerns definitional work, these clusters are useful to see how the keywords are put into phrases. For example, the most common cluster was “sexual orientation,” with 62 hits across 48 documents. Here, we can see that “sexual” lives on the same street as “orientation,” “activity,” “exploitation,” and “abuse.” Table 2 shows the rank, frequency, and range of the 25 most frequent clusters that AntConc created.
Table 2. AntConc's Results for the 25 Most Frequent Keyword Clusters.
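As a rough illustration of what a cluster count involves, the sketch below counts contiguous two-word sequences that begin with a keyword. It is a simplification and does not reproduce AntConc's cluster tool. The function name `keyword_clusters`, the fixed keyword-on-the-left position, and the toy documents are assumptions, and the tokenizer ignores sentence boundaries.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a document and split it into word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())

def keyword_clusters(documents, keyword, size=2):
    """Count contiguous sequences of `size` tokens that begin with `keyword`.

    Returns (frequency, doc_range): total hits per cluster and the number
    of documents in which each cluster appears at least once.
    """
    frequency = Counter()
    doc_range = Counter()
    for text in documents.values():
        tokens = tokenize(text)
        seen_in_doc = set()
        for i in range(len(tokens) - size + 1):
            if tokens[i] == keyword:
                cluster = " ".join(tokens[i:i + size])
                frequency[cluster] += 1
                seen_in_doc.add(cluster)
        doc_range.update(seen_in_doc)  # each cluster counts once per document
    return frequency, doc_range

# Hypothetical toy corpus; these sentences are invented for illustration.
toy_corpus = {
    "doc_1": "No discrimination based on sexual orientation. Sexual exploitation is banned.",
    "doc_2": "We protect sexual orientation and gender identity.",
    "doc_3": "Sexual abuse and sexual exploitation are prohibited.",
}

frequency, doc_range = keyword_clusters(toy_corpus, "sexual")
for cluster, hits in frequency.most_common():
    print(f"{cluster:22} frequency={hits} range={doc_range[cluster]}")
```

Even in this small example, the same keyword sits in both protective phrasing ("sexual orientation") and prohibitive phrasing ("sexual exploitation"), which is why the clusters still require close reading in context.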
As Table 2 shows, the cluster “sexual orientation” is prevalent in a range of documents. Because “sex” can be used to discuss sex assigned at birth or gender identity, and “sexual” can be used to discuss orientation (e.g., lesbian, gay, or bisexual), some of the hits for these keywords are not necessarily directed at sexual content dealing with sexual expression, sex acts, or nudity. But in our current political climate, these terms and their definitions are under attack. For example, historical arguments against the transgender community are resurfacing as “transgender” is being conflated with deviant sexual behavior. Rhetoric around people in the LGBTQIA+ community, particularly transgender people, has accused them of “grooming” children through their mere existence. Thus, in platform governance, users’ identities and personhoods are sometimes moderated and even erased.
So it is not surprising that these cluster findings show how “sexual content” terms are often paired with words such as “inappropriate,” “vulgar,” and “obscene.” Further analysis of these collocations can help show how these terms are used to “other” marginalized users, especially when LGBTQIA+ content (whether or not sexual in nature) gets tagged as “graphic” or “offensive.” Clusters help to show how these keywords are paired and can be useful for understanding the larger definitional framework for how they get applied across platforms, including the ways that these definitional frameworks can be used in larger patterns of marginalization. For example, it makes sense that keyword clusters containing “obscene” are frequent because U.S. legislation still uses Miller v. California (1973), an obscenity case, to understand the difference between obscenity and free speech. This case decided that “obscene” is the legal term used to describe anything that meets three criteria: (a) whether “the average person, applying contemporary community standards” would find that the work, taken as a whole, appeals to the prurient interest; (b) whether the work depicts or describes, in a patently offensive way, sexual conduct specifically defined by the applicable state law; and (c) whether the work, taken as a whole, lacks serious literary, artistic, political, or scientific value. (p. 15)
Collocation and Inherent Value
AntConc can also show the keywords’ collocation to other relevant terms within documents. Collocations are slightly more general than clusters because collocations highlight other words that a keyword appears near to but not necessarily in sequence with. We can think of a collocation, then, as a neighborhood for that particular word. Collocations can add to the context for how these keywords are being used, point to their possible different uses, and enhance our analysis of the document. That is, understanding what other terms and phrases that a keyword exists near can illuminate the fuller context of that keyword's inclusion in a document. Table 3 shows the 25 most frequent collocations for my keyword list including their rank, the frequency in which they occur near both the left and the right of the keywords, and the range of documents they appear in.
Table 3. AntConc's Results for the 25 Most Frequent Collocations.
Note. This table reflects the way that AntConc compiled results. I copied the table that AntConc created directly and used its rankings in my analysis. AntConc's default is to order its rankings by “statistical strength” rather than frequency. I had not removed “the,” “or,” and other filler words, an oversight I will correct in future studies. Also, I did not remove keywords because, as the table shows, many of the keywords are collocated with one another.
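To make the idea of a keyword “neighborhood” concrete, the following is a minimal sketch of window-based collocate counting. It is not AntConc's implementation: AntConc ranks collocates by a statistical measure, whereas this sketch reports only raw left and right frequencies and document range. The function names, the window size, the sample sentences, and the short filler-word list are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def collocates(docs, keywords, window=4, stopwords=frozenset()):
    """Count words that occur within `window` tokens of any keyword.

    Returns {word: {"left": n, "right": n, "range": n}}, where "range"
    is the number of documents in which the word appears near a keyword.
    Keywords themselves are kept as possible collocates of one another.
    """
    left, right = Counter(), Counter()
    doc_range = defaultdict(set)
    for doc_id, text in enumerate(docs):
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok not in keywords:
                continue
            start = max(0, i - window)
            stop = min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j == i:
                    continue  # skip the keyword occurrence itself
                word = tokens[j]
                if word in stopwords:
                    continue  # filler words are dropped after windowing
                (left if j < i else right)[word] += 1
                doc_range[word].add(doc_id)
    return {
        w: {"left": left[w], "right": right[w], "range": len(ids)}
        for w, ids in doc_range.items()
    }


# Hypothetical sample sentences, not drawn from the study's corpus.
sample_docs = [
    "No nudity or sexual content is allowed.",
    "Sexual exploitation of a child is prohibited.",
    "Sexual content involving nudity is removed.",
]
filler = frozenset({"the", "or", "of", "a", "is"})
table = collocates(sample_docs, {"sexual"}, window=4, stopwords=filler)
for word, row in sorted(table.items(), key=lambda kv: -(kv[1]["left"] + kv[1]["right"])):
    print(word, row)
```

Because filler words are dropped only after the window is drawn, removing them does not pull more distant words into the neighborhood; it only keeps them out of the resulting table.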
“Exploitation” is high on the collocation list, which indicates that many platforms are rightly concerned about sexual exploitation occurring on their platforms. “Child” is also high on this list, suggesting that platforms are likewise concerned about child sexual abuse materials occurring on their platforms. Other collocations also show attempted protections within ToS and community guidelines. For example, collocations of “gender” and “orientation” appear in the antidiscrimination portions of ToS and community guidelines. In the “Civility and Respect” section of its community guidelines, Twitch (2024), a site that streams video games, states that Twitch does not permit behavior that is motivated by hatred, prejudice, or intolerance, including behavior that promotes or encourages discrimination, denigration, harassment, or violence based on the following protected characteristics: race, ethnicity, color, caste, national origin, immigration status, religion, sex, gender, gender identity, sexual orientation, disability, serious medical condition, and veteran status. We also provide certain protections for age, which are expressly noted in the examples.
“Nudity” is the fourth-ranking collocation. Many policies concerning content moderation deal with sexual imagery, with nudity being a main concern. For example, in the community guidelines of Spoutible (2023), a Twitter-like platform but with a zero tolerance for hate speech, the “Adult Nudity and Sexual Content” section states that “you may not post any media that is pornographic or intended to cause sexual arousal including, full or partial nudity, and simulated sexual acts. Exceptions may be made for breastfeeding, medical, health, educational or artistic content.” Unlike Grindr (a social media app for LGBTQIA+ individuals), Spoutible does not allow partial nudity but lists exceptions to this rule (e.g., for a breastfeeding photo that might show partial nudity). Spoutible also, as in the “obscenity” clause from Miller v. California (1973), allows exceptions for “artistic content.” Spoutible's community guidelines, then, are yet another example of how platforms can use “gray areas” and subjective guidelines to make exceptions for certain types of content. Analysis of the collocations, or keyword “neighborhoods,” can also reveal pathways, alleys, and blurred boundary lines, like a row of houses without fences demarcating each property. Within these blurred boundaries, users can find space to exist.
The Definitional Work of ToS and Community Guidelines
Because most sites include access to their former ToS and community guidelines, a whole study could revolve around changes to sexual content policies over time. For example, YouTube's (2023) ToS includes other lists of possible content that may violate its policy, and it used to have a note at the bottom stating that “this list is not complete,” which has since been removed. It did, however, keep other language that supports Joseph's (2019) report about companies using wide, general, or loose definitions within their ToS and community guidelines. For example, YouTube cites many examples of what would violate its community guidelines about sexual content but, at the end of these examples, states that “the above are just some examples. If you think content might violate this policy, do not post it.” Such statements give users room to then cite sexual content that is not included in the list. Although such general definitions can be used tactically to defend users’ choices by pointing to the unclear boundaries, they can also be used tactically for the opposite reason—to provide “room” to silence or ban content for violating unclear ToS or community guidelines. For example, the YouTube policy states that “explicit content meant to be sexually gratifying is not allowed on YouTube.” But it does not offer a specific definition of “explicit content” or “sexually gratifying.” Its community guidelines also mention pornography without giving a definition of pornography.
Although initially I viewed these loose definitions negatively, such vague definitions do have affordances. Opaque definitions can also be used to allow content that might be considered objectionable in other situations. General definitions can be used as reasons to not remove content flagged by other users. For example, Dreamwidth (2013) opens its ToS with the line “we hate legalese, so we've tried to make ours readable.” Dreamwidth goes against the assumption that ToS are unread by explicitly addressing readability. Dreamwidth is also transparent in saying that “we do not pre-screen Content. However, you acknowledge that we have the right (but not the obligation), in our sole discretion, to remove or refuse to remove any Content from the service.” By stating that it has the “right (but not the obligation),” Dreamwidth is expressing that users are acknowledging and agreeing to the possibility that they may be exposed to content they do not agree with or find objectionable and that Dreamwidth is not responsible for shielding them from that content. In this way, Dreamwidth is moving away from a utilitarian approach to content moderation. That is, it is not prescreening content to generate the “best outcome” for a majority of users; instead, it is relying on the users themselves to handle varying content across the platform.
The definitional work apparent in the corpus analysis results became the foundation for contextualizing and more closely analyzing the ethical frameworks at work within the ToS and community guidelines documents. My adapted CADA afforded a wide-lens view of patterns across the corpus, which further informed my close readings of ToS and community guidelines in the corpus. I will now provide examples of specific ethical frameworks within certain documents from the corpus. In this way, I move from the bird's-eye view of a city, its neighborhoods, alleyways, and streets to focus on how the patterns I identified exist (or do not exist) within one house.
Ethical Implications and Emancipatory Potential
Content moderation is one possible source for communal ethical practices. Because content moderation is part of an infrastructure that can regulate, uphold, and constitute cisnormative and heteronormative, White patriarchal values within online communities, we cannot discuss content moderation without relying on ethics. Using ethical frameworks rather than relying only on social justice to study the infrastructure and practices of content moderation is useful because “ethical frameworks help us understand how to enact justice and identify the behaviors, actions, and policies that should be considered just or unjust” (Walwema et al., 2022, p. 259). The combination of behaviors, actions, and policies is paramount.
For example, even though it would seem that ToS would be purely utilitarian, some sites are beginning to develop documents that are more in line with Colton and Holmes's (2018) description of Aristotle's ethics as “much more interested in the development of hexeis that lead to practical and ethical decision making in real social situations” (p. 34). Content moderation is based on decision-making, and because ToS and community guidelines are used to help guide, inform, and instruct those decisions, certain platforms have begun to move away from guidelines that outline “an appropriately fixed action in all circumstances” toward a virtue ethics framework whose goal is communal flourishing (eudemonia).
Other platforms and communities write their community guidelines based on an ethics of care (Gilligan, 2014), especially on sites that focus on bringing certain communities, identities, and populations together. Although the concept of “rules” or “guidelines” may seem to conflict with an ethics of care, community guidelines can be used to create suggestions or parameters for how to be in relationships in an online community. For example, the first three community standards of Gays.com (2024), an LGBTQIA+ social networking service, are to “Be friendly,” “Be considerate,” and “Respect each other's differences” (p. 1). Unlike other sites in which the ToS and community guidelines primarily concern legal compliance, Gays.com's community standards articulate the platform's values: We’re a diverse bunch with a wide range of ethnic backgrounds, races, genders, cultures, sexual orientations, religions, ages and other preferences which form our identity and the way we live. You don’t have to agree with someone else's views but it's important to respect that everyone has the right to express them without prejudice, discrimination or abuse. (p. 1)
These community standards, which call for respecting and creating space for everyone in the community to be able to share their views, set the foundation for enacting what Gilligan (2014) suggested: “to hear and encourage the full range of voices within and around us by becoming a society of listeners” (p. 104). Rather than relying on users (community members) to naturally care about one another, Gays.com's (2024) community standards create a framework in which users, even though they may not agree, must approach interactions from a caring position. Even the “Safety” section of these community standards does not relate to protecting the platform but instead gives advice for how users can protect themselves. This section includes advice to “Be Smart—Don’t give out personal information straight away and never send money” and “Be Sure—Get to know someone first” and “Be Safe—If you decide to meet any Gays.com members, try to meet in a busy place with other people around” (p. 1). Unlike larger, corporate platforms (e.g., Meta's Facebook or Instagram), community standards like these center on the user's, rather than the platform's, safety.
But other sites, particularly larger platforms, rely on a utilitarian approach to content moderation. For example, Meta's (2024) community standards are not one document but a main document that links to multiple pages of definitions, descriptions, and examples to create 90 pages of information. These standards cover everything from “Violence and Incitement” to “Dangerous Organizations and Individuals” to “Adult Sexual Exploitation” and more. Meta states that it “also allow[s] for the discussion of sex worker rights advocacy and sex work regulation. We draw the line, however, when content facilitates sexual encounters or commercial sexual services between adults.” But it also states that it “restrict[s] sexually-explicit language that may lead to sexual solicitation because some audiences within our global community may be sensitive to this type of content, and it may impede the ability for people to connect with their friends and the broader community” (Meta, 2024). Through these detailed definitions, examples, and rationale, Meta is complying with federal legislation while still creating enough room (e.g., by stating “sexually-explicit language that may [emphasis added] lead to sexual solicitation”) to allow it to remove content and deplatform users. Such statements can also help users take advantage of Facebook's flagging system. Reports have shown that trolls and other aggrieved online users will disproportionately flag LGBTQIA+ and other queer content under these sexual content guidelines (Harmon, 2018; Valens, 2018).
These examples show that we need to move beyond using ethical frameworks to assess whether a technology and its outcomes are ethical. Although other scholars have argued that ethics is useful for after-the-fact reflection on right versus wrong (see, e.g., Markel, 1997, for a discussion of the Challenger disaster), ethical frameworks can also be used to examine our present and look to the future. For example, Verbeek (2017) argued that, based on technology, we need to develop new ways of “doing ethics.” He called for specifically looking at technological development, implementation, and use. My analysis extends this conversation to how technology is then mediated and maintained, specifically through ToS and community guidelines.
There are also interesting outliers for the ways that ToS and community guidelines create values and ethics. One example is the now defunct right-wing social media app, FrankSpeech. The FrankSpeech website now reroutes to LindellTV, and the FrankSpeech social media site was replaced with VOCL social media, which requires a login to access the community guidelines. But in 2024, the first line of FrankSpeech's community standards announced that “The Community Standards for Frank are Based Upon America's Constitutional Republic and the Laws of Nature and Nature's God That Are Its Foundation” (FrankSpeech, 2024). These community standards specified that “Frank is a place for discussion by those who recognize our freedoms are gifts, not from government, but from God.” Rather than lay out guidelines, FrankSpeech's community standards made an argument that the United States was founded as a Judeo-Christian republic, not a democracy: Frank is grounded in certain eternal truths, including respect for the rights and human dignity of others. In keeping with our legal and Constitutional history as a nation, Frank will enforce and enjoy community standards that are consistent with the legal and constitutional standards embodied in the laws of nature and nature's God that are the basis of our federal and state Constitutional Republics.
Although FrankSpeech is clear about its overall values, it does not include particular actions or rules for content in its community guidelines. Other sites, however, are explicit about the values they want their community to uphold, including, for certain sites, their emancipatory aims. For example, Grindr's (2024) community guidelines state that “Grindr is committed to creating a safe and authentic environment in which diversity, mutual respect, and sex-positivity thrive.” Within this one sentence, users can see what the site values: authenticity, “diversity, mutual respect, and sex-positivity.” These terms signal to the user that anyone who is not aligned with these values will not find the site useful and may not even be welcome on it. Through this statement, Grindr is documenting its goal to create a space in which users are free to be themselves. Rather than using its community guidelines to restrict, Grindr is using these guidelines to create more room, space, and emancipatory potential.
Grindr (2024) is also specific about when its values come into tension with the platform itself. Its community guidelines share that although the platform is sex-positive, it “can’t allow pornography or sexually explicit nudity in public profiles. Our user experience has to be acceptable to the chaste eyes of Apple and Google's app store decision-makers, lest we get kicked off their platforms.” Although Grindr sets up this clear boundary, it also shares ways to work around this boundary by adapting its community guidelines. These guidelines exemplify how to relate ethics to social justice work because they express particular protective values that the platform is attempting to uphold. Although Grindr may not be able to change its ToS, or the laws governing that policy, it can adapt its community guidelines to help users work around or with the boundaries set in the ToS. By doing so, Grindr is protecting its users from removal. For example, the community guidelines outline that users can “totally send photos privately, but always make sure that you have consent from the recipient first.” These suggestions are specific examples of how users can enact values while on the platform through their behavior: They must ask for and receive consent. Grindr, then, is working from a care-based ethic that upholds interdependency and healthy relationships between users to enable communal flourishing on the site, and it shows how ToS and community guidelines can be used toward these aims.
Like Grindr, other sites have attempted to create a “safe space” for marginalized users. For example, there are communities created by and for sex workers that include gatekeeping mechanisms, such as the requirement that prospective members verify that they are sex workers before being allowed into the space. Although gatekeeping is used to exclude a certain population (non–sex workers), this exclusion is necessary to protect community members. These processes—verifying users and flagging and removing any non–sex workers who make it past the verification stage—have a utilitarian foundation. In other words, the community members (sex workers) have deemed that the consequence of an action (banning non–sex workers from their space) is beneficial to their majority (sex workers). Put differently, the ends (having a safe place for sex workers to talk, discuss, and organize) justify the means (excluding a certain group of people). Thus, on such sites, community members’ happiness is more important than allowing unlimited free speech or being a fully inclusive space, so the sites use a form of qualified utilitarianism. Although utilitarianism is not often invoked when discussing justice, these sites take a utilitarian approach (toward the greatest good for the majority of people) that is in direct opposition to an unjust practice: stigmatizing and oppressing sex workers. Their processes, then, including their ToS and community guidelines, are used for emancipatory purposes.
This analysis has shown that combining CADA methods with a close reading of the corpus texts can lead to rich insights. Using an adapted CADA to locate patterns within a corpus that become the foundation for further close readings can offer multiple modes of data analysis. Through this adapted CADA, I found language patterns and usage within the corpus as a whole that showed me how specific ethical frameworks were at work in various documents across the corpus. As a result, this mixed-methods approach to discourse analysis was useful in studying the complicated building work of ethical frameworks within these documents.
Limitations
Of course, there are limitations to doing keyword counts and finding clusters and collocations. Other researchers may create a social media corpus differently, including making alterations to my keyword list. For example, the term “clean” was prominent throughout my corpus, with references to clean images or clean behavior, using “clean” as “nonsexual.” But “clean” was not a part of my keyword list. Another limitation of CADA is that it only looks at words and texts and cannot necessarily speak to larger sociocultural issues and contexts. This limitation is why researcher analysis is so important and why I make the case that these documents do project or include certain values and uphold certain ethical frameworks.
My adapted CADA is just one way to analyze this ToS and community guidelines corpus. I wanted to uncover some underlying patterns in order to argue that the discourse in ToS and community guidelines contains or creates specific ethics. In many cases, the rules and guidance outlined in ToS and community guidelines documents do have ethical meaning, especially when taken as part of the content moderation of their site as a whole. A rule may seem simple but could be difficult to enact in practice. For example, a common phrase used in various ToS is that “adult” or “sexual” content is “any media that is pornographic and/or may be intended to cause sexual arousal” (CounterSocial, 2024). Content “intended to cause sexual arousal” seems like a simple descriptor, but what may cause sexual arousal for one person might not cause it for another. And sexual arousal might even be caused by something innocuous that would not traditionally be defined as “sexual content.” These blurred areas and complicated tensions are why I was so interested in exploring not only what is written in these documents but also, eventually, how these guidelines get (or do not get) enacted in practice.
Future Implications
Opaque terms are used throughout the corpus, which is one reason why I conducted a separate study interviewing content moderators to see how these terms and guidelines get applied. For example, it is important to understand how “obscene” is used in different contexts by different content moderators and how they decide what kinds of sexual content are “objectionable” or even how they use their policies to allow content. Although doing a corpus analysis of the documents was an important step, seeing how these documents are used in practice as part of a larger content-moderation infrastructure would be an important next step. Findings from early interviews show how content moderators can, and do, affect user flourishing on any given site and have the potential to be used toward activist aims (Jordan & Holmes, in press).
Also, I cut 135 sites from this study because they were not created in or based in the United States. There is ample room to study with a cross-cultural and global lens the ToS and community guidelines (along with connected documents such as FAQs and privacy policies) from non–U.S.-based social media sites. These sites are also useful for comparative analyses, especially as similar laws are being enacted across the world regarding what users should be allowed to post and engage with on social media.
A longitudinal study, particularly around “sexual content,” of ToS and community guidelines both before and after Donald Trump's election to a second term as president of the United States would be interesting. Since Trump's election, the American Civil Liberties Union (2025) has “tracked 616 anti-LGBTQ bills” in the United States, and sex worker legislation has been introduced that particularly targets these groups in online spaces, such as the Interstate Obscenity Definition Act, which effectively makes pornography a federal offense (Mike Lee US Senator for Utah, 2025). Examining how platforms have changed their policies already, with many complying in advance, would be an important extension of this study.
Overall, conducting an adapted CADA allowed me to answer my initial research question: To what extent are ethical frameworks or values apparent in ToS and community guidelines through their definitions of sexual content? I found that the definitional work that ToS and community guidelines do sets up sexual content in a variety of ways depending on the purposes, goals, and concerns of the platform or online community. By arguing for ToS and community guidelines as value-laden documents, I am participating in one goal of TPC: to challenge value neutrality in technology. The documents themselves create both affordances and constraints in enacting the values of a platform, particularly when the platform's values are at odds with its users’ values. As a TPC scholar and teacher invested in how technological infrastructure impacts marginalized and oppressed groups and how we can intervene, I found the compliance documents we all encounter to be an ample starting point. Studying how language is used in these mundane texts also opens up our ability to see their emancipatory potential. Although it may not seem that mundane texts like ToS or community guidelines are opportunities for transformations, they can be access points with potential for change.
Although Katz (1992), one of the most widely cited scholars for ethics in TPC, argued for a broader “humanitarian ethic” by critiquing expediency as an ethical end without considering humanitarian concerns, and Dombrowski (2007) stated that “we only have ourselves” (p. 317) to handle human relations, my analysis affirms that we need an ethics that can account for ethical protections and exclusions (see Walwema et al., 2024). And even though a site such as Grindr, with its ethics of care, seems more inclusive, its rules of inclusivity still protect certain values and communities.
Ethical decision-making has a long history in TPC, which has envisioned ethics in broad strokes, making judgments about the values that guide actions and what theories may help us understand those values (Dombrowski, 2000) and even using ethical decision-making processes to better understand the rhetorical situation (Markel, 2001). Although these engagements have been fruitful, helping to show us how ethics and praxis can be bridged for the technical communicator, this type of ethics scholarship is just one of many types that we need. For example, Rice (2014) argued that those types of ethical models “emphasize an individual's strict presentation of ethical information, and not their rhetorical engagement with it” (p. 2). My analysis shows that one way that we can have fuller conversations about rhetorical engagements with ethics is by studying content moderation as infrastructure—from the value-laden documents, interfaces, and technologies to the content moderators themselves.
Using ethics to view whole systems helps us understand ethics beyond an individual's choices. Freedom and autonomy form a continuum that is complicated in the ToS and community guidelines in my corpus. In those documents, we can see a form of qualified utilitarianism at work to support social justice ends such as the inclusion of sex workers or the flourishing of transgender users. How utilitarianism can be used as a counterweight to the unethical exclusionary practices of the majority, particularly through content moderation, is an area that TPC scholars can engage with further. With rapidly developing legislation and new laws impacting free speech, especially sexual free speech, on the Internet, concerns about content moderation are becoming even more pressing. Our field has made clear that TPC texts, documents, and practices are not neutral; they are value-laden and coconstitute knowledge and meaning. Now we can use the language of ethics to further interrogate how various TPC contexts use, navigate, and conflict with various ethical frameworks and how those frameworks all work to protect, sometimes through exclusion, certain values, actions, and ethical outcomes.
Footnotes
Author's Note
Although I am now an assistant professor at the University of South Carolina, I conducted this research while I was a graduate student at Texas Tech University and a full-time lecturer at California State University Channel Islands.
Funding
The author discloses receipt of the following financial support for the research, authorship, and/or publication of this article: California State University Channel Islands’ Arts and Sciences Research Seed Grant and Research, Scholarly, and Creative Activities Grant; Texas Tech University's Summer Dissertation Completion Fellowship.
Declaration of Conflicting Interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
