Of 'Batter'ed by rage, Misinforming misinformation and fact-checking by the crowds
MisDisMal-Information Edition 49
What is this? MisDisMal-Information (Misinformation, Disinformation and Malinformation) aims to track information disorder and the information ecosystem largely from an Indian perspective. It will also look at some global campaigns and research.
What is this not? A fact-check newsletter. There are organisations like Altnews, Boomlive, etc., that already do some great work. It may feature some of their fact-checks periodically.
Welcome to Edition 49 of MisDisMal-Information
In this edition:
‘Batter’ed by disinformation: Cybersocial attacks on corporates, disinformation-for-hire, everyone is a stakeholder in the information ecosystem, and Twitter activity of Fortune 500 CEOs in India and the U.S.
Misinforming Misinformation studies: Researchers discovered anomalies in the U.S. data Facebook provided to Social Science One.
Crowdsourced fact-checking: Can aggregated, low-effort fact-checking by non-professional fact-checkers match the performance of professional fact-checkers?
Note: MisDisMal-Information 49 won’t be published on 21st September. Hopefully, very little crazy stuff will happen this week 🤞.
Update 2021-10-14: Clarified phrasing of research findings from comparative study of Indian and US CEO tweets with respect to SDG 16.
Update 2021-09-30: Included link to comparative study of Indian and US CEO tweets.
‘Batter’ed by Rage
iD - known for its Idly Dosa Batter - found itself at the wrong end of an outrage cycle last week. The claim was that it uses animal products (in this case, ‘cow bones and calf rennet’) [TheNewsMinute]. It has reportedly filed a complaint with the cybercrime cell of the Bengaluru Police [Telegraph]. Cadbury’s found itself in a similar position in July, based on information from its Australian subsidiary’s website [Livemint]. And just in the last few weeks, we have had:
Myntra: An outrage campaign that recycled something from 2016 [News18], which seems to pop up every few years [Twitter Search showing a few instances back in Oct 2020 as well]. Related: Read Karthik Srinivasan’s post on Myntra’s (lack of) response to the 2021 rage cycle [BeastofTraal].
Times of India: There was some activity on the hashtag ‘ShameonTOI’ [GetDayTrends], based on an article from … September 2018.
Infosys: though the origin of this rage cycle seems different from the examples listed earlier, i.e. a cover story in a weekly magazine published by the RSS rather than something that “surfaced” on Digital Communication Networks (DCNs).
With, perhaps, the exception of the Infosys one, one common thread (there are possibly more) that runs through them is the expression of ‘iD’entity, in some shape or form. These are just the recent ones I can recall off the top of my head. I’m sure if we were to scour resources like Trendinalia or GetDayTrends, there would be many more that just don’t make it to the news cycle. Now, we can dislike it, but we can’t wish it away. But I want to draw your attention to the other common thread for now:
In July 2021, Ina Fried wrote, “disinformation is coming for your business” [Axios].
Graphika Labs referred to such events as ‘cybersocial attacks’ in a whitepaper titled Weaponised Social Media (again, I should point out the incentives at play, because the paper was clearly pitching a service to potential clients):
A new reputational threat has emerged across social platforms: cybersocial attacks. The rise of mis- and disinformation, conspiracy theories, and coordinated social attacks requires communicators to become adept at navigating social spaces and the audiences that matter most to their brand, with greater situational awareness than ever before.
Now pause for a second and think back to the countless ‘ban XYZ’ or ‘censor ABC’ hashtags and/or 1-star rating sprees we’ve seen over the years in India. While it is tempting to explain it all away using the identity frame, I think it is important to add more layers to this model. One such layer is, of course, the role of incentives. And when we’re doing that, financial incentives play an important role. A lot has been written about the role of DCN firms in creating attention markets. For now, let’s look at the world of ‘disinformation for hire’.
Disinformation-for-hire: On more than one occasion, Facebook has linked coordinated inauthentic behaviour to marketing/public relations services firms (two examples from India are SilverTouch and aRepGlobal). In fact, Nathaniel Gleicher (Head of Security Policy at Facebook) testified about the rising trend of “disinfo-for-hire” in front of the Select Committee on Foreign Interference Through Social Media in Australia [Asha Barbaschow - ZDNet]. In 2018, a report based on interviews with various participants in the Philippines’ disinformation ecosystem developed three personas/portraits [Jonathan Corpus Ong and Jason Vincent A. Cabañes]:
Chief Architects: Ad and PR strategists
At the top level of networked disinformation campaigns are ad and PR executives who take on the role of high-level political operators. Usually they occupy leadership roles in “boutique agencies,” and handle a portfolio of corporate brands while maintaining consultancies with political clients
…
While many chief architects are very savvy with digital technology, they are actually wary of emerging techniques in global disinformation campaigns such as using automated software like bots. They would much rather rely on the labor of savvy creative writers with knowledge of popular vernaculars who can mobilize populist public sentiment. As one chief architect remarked about bots, “Bots are like the white walkers in Game of Thrones. They’re stupid and obvious and easily killed. They can’t inspire engagement.”
Anonymous Digital Influencers and Key Opinion Leaders
Community-level fake account operators
These workers are tasked to follow what we call script-based disinformation work, which consists of posting written and/or visual content previously designed by the strategists on a predetermined schedule, as well as affirming and amplifying key messages by strategists and influencers through likes and shares, thus creating “illusions of engagement”. Community-level fake account operators are tasked to post a prescribed number of posts or comments on Facebook community groups, news sites, or rival politicians’ pages per day.
…
Community-level fake account operators’ motivation is primarily financial. We found out that some of their fake accounts on Facebook or Twitter had prior histories before their political trolling work, used as part of pyramid marketing schemes. These “networking” schemes required them to visually display groups of friends; fake accounts were one way to artificially manufacture group support.
Now, this may not map exactly to the situation in India, and I also think it is fair to say that our understanding has been refined since 2018 (thanks to work like this) - but there are parallels. It is also clear that our attention is largely consumed by the ‘digital influencer’ and ‘account operator’ rungs and their activities. We eventually need to move up to the chief architects.
For businesses, there are two more aspects to explore.
Becoming a target of ‘cybersocial attacks’ is a question of ‘when’ not ‘if’, and there is no control over ‘why’. The consequences may not always be limited to 1-star reviews or hashtags.
Lamenting that American businesses have ceded space, allowing the Republican Party to drift further and further towards the extreme, Mark S. Mizruchi writes [Niskanen Center]:
Regardless of which approach the corporate elite takes, it is essential for the group to exercise the kind of strong leadership that would free today’s conservatives from the grip of the fringe elements that traffic in conspiracy theories, reject science, deny facts, and subvert the norms on which a democratic society depends.
Aside: Niskanen Center also did a 4-part webinar series on businesses and depolarisation.
This brings me to the second aspect.
Everyone is a stakeholder in public discourse and the information ecosystem. So a strategic silence on issues that affect basic democratic values can come back sooner or later to haunt everyone.
Which is why I read a forthcoming paper by Arshia Arya, Shehla Rashid Shora and Joyojeet Pal with great interest. It compared Twitter activity by CEOs in India and the U.S. across a range of topics centred around the United Nations Sustainable Development Goals (SDGs) (as far as I know, the paper has not yet been published, so I am not including links or quotes). Two things stood out to me.
Overall, across the 17 SDGs, engagement with various topics in the India and the U.S. datasets was similar. SDG 16 (Peace, Justice and Strong Institutions) saw the highest engagement from the Indian CEO dataset. The authors identified ‘democracy’ and ‘governance’ as keywords related to this particular SDG and observed that CEOs in the India dataset did not appear to engage with issues “related to democratic rights or governance”. Note that engagement in this context means they ‘engaged’ with a topic at all, not how much engagement their tweets received (a hypothetical sketch of what such keyword-based tagging might look like follows after these points).
While tweets from the U.S. CEO dataset engaged with topics like the Black Lives Matter movement and hate crimes against Asians, there appeared to be no such content in the India CEO dataset with regard to protest movements or crimes against minorities.
(It is important to point out that any engagement with these topics outside Twitter - by CEOs without Twitter accounts or who just use another platform, or by senior management members other than the CEO - is unlikely to have been recorded.)
Update 2021-09-30: The paper is now published.
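As an aside on what ‘engaged with a topic’ could mean operationally, here is a minimal, hypothetical keyword-matching sketch in Python. The keyword list and tweets are invented and this is not the authors’ actual method; it simply illustrates flagging whether a tweet mentions a topic at all, independent of the likes or retweets it receives:

```python
# Hypothetical sketch of keyword-based topic tagging (not the paper's pipeline).
# "Engagement" here means a CEO tweeted about the topic at all, not how much
# engagement (likes/retweets) the tweet received.
SDG16_KEYWORDS = {"democracy", "governance"}  # illustrative keywords only


def engages_with_sdg16(tweet_text: str) -> bool:
    """True if the tweet mentions any SDG 16 keyword (case-insensitive)."""
    words = set(tweet_text.lower().split())
    return bool(words & SDG16_KEYWORDS)


sample_tweets = [
    "Proud of our sustainability report this quarter",                    # invented
    "Strong institutions and good governance matter for every business",  # invented
]
print([engages_with_sdg16(t) for t in sample_tweets])  # [False, True]
```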
I’m not exactly advocating ‘performative’ Twitter activism (though one cannot discount the potential signalling effects in such cases when you’re starting from such a low base). I’ll go back to Mizruchi’s paper:
… corporate elite needs to organize itself the way its counterparts in the mid-20th century did, with a focus on the well-being of the entire society. Without such a thrust, the entire system, and the benefits that business draws from it, may be in jeopardy. Yet organizing, and regaining a strong centrist presence, represent only part of the solution. The corporate elite needs to refocus its orientation toward its long-term viability rather than short-term gain.
Misinforming Misinformation Studies
Oops, we did it again!
It turns out that Social Science One, “a consortium founded in 2018 that Facebook hails as a model for collaboration with academics”, was given imperfect data by Facebook [Craig Timberg - Washington Post]
So, what was wrong with this data?
The error resulted from Facebook accidentally excluding data from U.S. users who had no detectable political leanings — a group that amounted to roughly half of all of Facebook’s users in the United States. Data from users in other countries was not affected.
It was caused by a ‘technical error’ as per a quote from a Facebook spokesperson in the article that first reported the story [Davey Alba - New York Times].
What exactly was the technical error? We don’t know. Sol Messing believes it could have something to do with the way the data was combined.
This raises lots of questions:
1) How many papers have been affected and will be retracted as a result? As the reporting points out, this probably also affects a number of papers that were in the process of being written. And, though only the U.S. dataset was affected, I also wonder whether it will affect papers that undertook comparative analyses across countries (or any other dimension) that may have relied on this data.
2) Big questions arise about the reliability of such ‘big data’ and the sustainability of a model in which academics rely on firms operating DCNs for data. It is also important to view this in light of Facebook cutting off New York University’s Ad Observatory, AlgorithmWatch’s claim that Facebook bullied them into halting their Instagram research, Facebook’s semi-feud with Kevin Roose over CrowdTangle engagement data, and reports of a subsequent restructuring of CrowdTangle that are being looked at with concern.
Aside: The error was discovered by Fabio Giglietto, who spotted anomalies based on the Q1 (shelved and then released) Widely Viewed Content Report - which was meant to be a way to clarify that view/reach data would paint a different picture than CrowdTangle’s interaction data. You can’t make this stuff up.
3) Will Facebook be held to account for an error that has affected several researchers and research papers? Or will they be able to shrug and move on like they typically do (Related: The Backlash Against Big Tech Is Mostly in the Minds of the Media - Bloomberg)? I’ve seen suggestions that the researchers should get together and sue Facebook, but that’s probably a lot harder than it sounds.
4) It is also worth pointing out that there’s limited evidence to suggest that this was deliberate or that it would have helped Facebook’s image management efforts. Dean Eckles believes that this error, arising out of omission, would have made the dataset seem more polarised than it actually was (a toy sketch of why appears after this list).
5) The problem likely isn’t that there was an error in the dataset (that’s not that uncommon); the scale is what sets this one apart.
6) A broader point, not just limited to disinformation research: it exposes a big weakness in mechanisms that rely on purely discretionary participation/action.
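To make Eckles’ point in (4) concrete, here is a toy sketch in Python. The numbers are entirely made up and have nothing to do with the actual Social Science One pipeline or its data; it only shows why dropping users with no detectable political leaning mechanically inflates spread-based polarisation measures:

```python
# Hypothetical illustration: if roughly half the users (those with no detectable
# political leaning) are dropped, any spread-based polarisation measure computed
# on the remainder looks inflated. Toy scores: -1 = left, 0 = no leaning, +1 = right.
from statistics import pstdev

full_population = [-1] * 250 + [0] * 500 + [1] * 250      # made-up population
flawed_extract = [s for s in full_population if s != 0]   # middle accidentally excluded


def share_at_extremes(scores):
    """Fraction of users sitting at either ideological pole."""
    return sum(1 for s in scores if abs(s) == 1) / len(scores)


print("std dev, full population    :", round(pstdev(full_population), 3))   # ~0.707
print("std dev, middle excluded    :", round(pstdev(flawed_extract), 3))    # 1.0
print("share at extremes, full     :", share_at_extremes(full_population))  # 0.5
print("share at extremes, excluded :", share_at_extremes(flawed_extract))   # 1.0
```

On both toy measures, the omission alone makes the remaining data look more polarised than the underlying population, which is the direction of bias Eckles describes.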
Ultimately, though, this will exacerbate Facebook’s trust problem. There are two components you can score this on - malice and competence. I put this one in the competence bucket.
Fact-checking accuracy of the crowds
In Edition 37 - Har(dly)vard(th) Fact checking? - I wrote about necessity v/s necessity + sufficiency of fact-checking:
None of this is new. It would also be a stretch to label this as anything but motivated. And fact-checkers have tirelessly debunked several such items in the past, yet they keep on coming. These items also have a specific audience in mind.
Note: The paragraphs that follow are not meant to imply that fact-checking is ineffective. They are to examine the gap(s) between whether it is necessary, or necessary and sufficient. It certainly has signalling benefits. It can also stem the flow of people into influence circles that rely on motivated information. It can even arm community-led corrective efforts with information - how these communities choose to intervene is a different matter.
…
< Context: In April, there were reports of two studies about the UP government, one from Harvard and the other from Johns Hopkins University. The Harvard study supposedly praised its handling of the migrant crisis, while the Johns Hopkins University study was portrayed as claiming that the state government was among the best managers of pandemic responses around the world >
A recently published paper by Jennifer Allen, Antonio A. Arechar, Gordon Pennycook and David G. Rand investigated the possibility of scaling fact-checking using the ‘wisdom of crowds’. Now, even before I went into the paper, I wondered about two things:
1) How you would solve for brigading and swarming. As an early review of Twitter’s crowdsourced fact-checking experiment suggested [Factually - Poynter]:
The results so far aren’t encouraging, as I found blatant misinformation receiving “not misleading” notes, context that reveals political bias and a small number of voices — with dubious Twitter feeds of their own — dominating Birdwatch activity.
2) Given how intensive/high-effort some fact-checks are, why would people make an effort?
For the purpose of this study, they got around 1) by recruiting crowd fact-checkers from ‘an online labour market’ (their words, not mine).
However, this danger is largely eliminated by using a rating system in which random users are invited to provide their opinions about a specific piece of content (as in, for example, election polling), or simply hiring laypeople to rate content (as is done with content moderation). When the crowd is recruited in this manner, it is much more difficult for the mechanism to be infiltrated by a coordinated attack, as the attackers would have to be invited in large numbers to participate and suspicious accounts could be screened out when selecting which users to invite to rate content.
And what about 2)? (emphasis added)
Examining 207 news articles flagged for fact-checking by Facebook algorithms, we compare accuracy ratings of three professional fact-checkers who researched each article to those of 1128 Americans from Amazon Mechanical Turk who rated each article’s headline and lede.
And what did they find:
The average ratings of small, politically balanced crowds of laypeople
(i) correlate with the average fact-checker ratings as well as the fact-checkers’ ratings correlate with each other and
(ii) predict whether the majority of fact-checkers rated a headline as “true” with high accuracy
And, while there are shades of Superforecasting [Goodreads] here, I am a little concerned that the crowd relied on just the headline and the lede. They do address this (quicker + most people don’t read past the headlines/lede). Still, I remain concerned about this (this also diverges from the Superforecasting analogy since the ‘amateur forecasters’ were still undertaking significant research):
Directly rating headlines and ledes is much quicker than reading full articles and doing research. Second, this approach protects against articles that have inaccurate or sensational headlines but accurate texts. Given that most users do not read past the headline of articles on social media (24), it is the accuracy of the headline (rather than the full article) that is most important for preventing exposure to misinformation online.
Going into this paper, I was also curious to see if they defined any metrics to rate the fact-checks produced by the professional fact-checkers who participated. Ultimately, they looked at how near or far the aggregate accuracy ratings from the crowdsourced participants were from the ratings provided by the professional fact-checkers.
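To make that comparison concrete, here is a minimal sketch in Python. The ratings are invented, on a toy 1-7 scale, and the threshold of 4 is an arbitrary cut-off; this is not the paper’s data, code, or scoring, just an illustration of the two checks described above (correlating the crowd’s per-article mean with the fact-checkers’ mean, and checking how often the crowd’s average matches the fact-checkers’ majority verdict):

```python
# Toy comparison of layperson-crowd averages against professional fact-checker
# averages. All numbers are invented; the threshold of 4 is an arbitrary cut-off.
from statistics import mean, correlation  # statistics.correlation needs Python 3.10+

# Per-article accuracy ratings (1 = inaccurate ... 7 = accurate) for five toy articles.
fact_checker_ratings = [[6, 7, 6], [2, 1, 2], [5, 6, 5], [1, 2, 1], [7, 6, 7]]
crowd_ratings = [[5, 6, 7, 6], [2, 2, 1, 3], [4, 5, 6, 5], [2, 1, 2, 2], [6, 7, 7, 6]]

fc_means = [mean(r) for r in fact_checker_ratings]
crowd_means = [mean(r) for r in crowd_ratings]

# (i) How closely does the crowd average track the fact-checker average?
print("crowd vs fact-checker correlation:", round(correlation(crowd_means, fc_means), 3))

# (ii) Does the crowd average predict the fact-checkers' majority verdict?
fc_verdicts = [m > 4 for m in fc_means]        # treat mean > 4 as "rated true"
crowd_verdicts = [m > 4 for m in crowd_means]
agreement = sum(c == f for c, f in zip(crowd_verdicts, fc_verdicts)) / len(fc_verdicts)
print("agreement with fact-checker majority:", agreement)
```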