Ethics and Reproducibility

Author: Sudesh Bhagat

Published: February 23, 2023


[Frontmatter check badge]

Prompt:

In May 2015, Science retracted, without the consent of the lead author, a paper on how canvassers can sway people’s opinions about gay marriage (see also: http://www.sciencemag.org/news/2015/05/science-retracts-gay-marriage-paper-without-agreement-lead-author-lacour). The Science Editor-in-Chief cited as reasons for the retraction that the original survey data was not made available for independent reproduction of the results, that survey incentives were misrepresented, and that statements made about sponsorships turned out to be incorrect.
The investigation resulting in the retraction was triggered by two Berkeley grad students who attempted to replicate the study and discovered that the data must have been faked.

FiveThirtyEight has published an article with more details on the two Berkeley students’ work.

Malicious changes to the data, such as in the LaCour case, are hard to prevent, but more rigorous checks should be built into the scientific publishing system. All too often, papers have to be retracted for unintended reasons. Retraction Watch is a database that keeps track of retracted papers (see the related Science magazine publication).

Read the paper Ten Simple Rules for Reproducible Computational Research by Sandve et al.

Write a blog post addressing the questions:

  1. Pick one of the papers from Retraction Watch that were retracted because of errors in the paper (you might want to pick a paper from the set of featured papers, because there are usually more details available). Describe what went wrong. Would any of the rules by Sandve et al. have helped in this situation?

[What went wrong]: This paper was published in The Lancet on February 28, 1998, by Andrew Wakefield and 12 other authors. It claimed to have established a link between the measles, mumps, and rubella (MMR) vaccine and autism after conducting research on 12 participants. However, the paper was retracted in 2010 due to data manipulation. The issue came to light when several researchers tried to replicate the results of the study and failed. In the retraction, some of its authors stated that the original data had been misinterpreted and that, in reality, the data was insufficient to reveal any causal link between the MMR vaccine and autism. As the study had created a worldwide stir, the retraction caught the attention of the media. Journalists investigated the authenticity of the study and, upon interviewing the participants, found that their medical details did not match those stated in the article. Eventually, the vested interests of the researchers were exposed in this rare case of outright fraud.

However, in similar studies with incorrect data handling but without fraudulent intent, implementing Sandve et al.’s rules 1 and 2 would have helped. According to Sandve et al., good research practice involves keeping track of how every result was produced. In this study, the reported data did not match the actual data, generating incorrect results. If the data itself had been cross-checked, and how each result was produced had been properly documented at every step, such discrepancies could have been caught early.
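To make rule 1 concrete, here is a minimal sketch of what “keeping track of how every result was produced” can look like in code. The file names and the helper function are invented for illustration; the point is only that each result file gets a machine-readable record of the exact data and code version that produced it:

import hashlib
import json
import subprocess
import sys
from datetime import datetime, timezone

def provenance_record(input_path, script_path, note=""):
    """Summarize how a result was produced: input hash, code version, time."""
    with open(input_path, "rb") as f:
        input_sha256 = hashlib.sha256(f.read()).hexdigest()
    try:
        # Record the exact version of the analysis code (the git commit).
        commit = subprocess.check_output(
            ["git", "rev-parse", "HEAD"], text=True).strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        commit = "not under version control"
    return {
        "input_file": input_path,
        "input_sha256": input_sha256,
        "script": script_path,
        "git_commit": commit,
        "python_version": sys.version.split()[0],
        "produced_at": datetime.now(timezone.utc).isoformat(),
        "note": note,
    }

# Hypothetical usage: save the record next to the result it documents.
record = provenance_record("survey_data.csv", "analysis.py",
                           note="Table 2: effect estimate")
with open("table2_provenance.json", "w") as f:
    json.dump(record, f, indent=2)

Had something like this accompanied every figure and table, a mismatch between the reported data and the actual data would have been detectable without interviewing the participants.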

  2. After reading the paper by Sandve et al., describe which rule you are most likely to follow and why, and which rule you find the hardest to follow and will likely not (be able to) follow in your future projects.

After reading the Sandve et al. paper, all of the rules described in it appear to be prerequisites for producing exemplary research. However, if I had to choose the one I am most likely to follow, it would be rule 9, Connect Textual Statements to Underlying Results. This is because I have often looked at the results of my research projects and wondered how they should be interpreted in the context of the research. Linking the results to the statements in my notes and emails would make it much easier to write the results section of any research project I take up in the future. On the other hand, rule 8, Generate Hierarchical Analysis Output, Allowing Layers of Increasing Detail to Be Inspected, appears the hardest to follow. Data analysis is itself a substantial task, entailing writing and running code that may need to be closely examined, rewritten, and rerun. Remembering, and finding the time, to accommodate rule 8 on top of that is easier said than done. As a result, following rule 8 in the near future appears to be more of a challenge than the remaining rules.
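That said, rule 8 may be lighter in practice than it sounds. Below is a rough sketch, with an invented toy table and directory layout, of writing analysis output in layers so that a headline number can be traced down to the per-record data behind it; this also makes rule 9 easy, since each textual statement can cite a specific file:

import json
from pathlib import Path

import pandas as pd

# Invented per-participant scores standing in for real analysis data.
df = pd.DataFrame({
    "participant": range(1, 13),
    "score": [3.1, 2.8, 3.5, 2.9, 3.2, 3.0, 2.7, 3.4, 3.3, 2.6, 3.1, 3.0],
})

out = Path("results")
(out / "detail").mkdir(parents=True, exist_ok=True)

# Top layer: the single number a sentence in the write-up would cite.
(out / "summary.json").write_text(
    json.dumps({"mean_score": round(df["score"].mean(), 3), "n": len(df)}))

# Middle layer: aggregates a skeptical reader might want to inspect.
df["score"].describe().to_csv(out / "detail" / "score_distribution.csv")

# Bottom layer: the full per-record table behind every number above.
df.to_csv(out / "detail" / "per_participant.csv", index=False)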

Submission

  1. Push your changes to your repository.

  2. You are ready to call it good once all your GitHub Actions pass without an error. You can check on that by selecting ‘Actions’ in the menu and ensuring that the last item has a green checkmark. The action for this repository checks the YAML of your contribution for the existence of the author name, a title, a date, and categories. Don’t forget the space after the colon! Once the action passes, the badge along the top will also change its color accordingly. As of right now, the status for the YAML front matter is:

[Frontmatter check badge]

---
author: "Sudesh Bhagat"
title: "Ethics and Reproducibility"
date: "2023-02-23"
categories: "Ethics and Reproducibility..."
---
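For reference, a check like the one the action performs could be approximated in a few lines of Python. This is only a guess at the logic (the actual workflow for this repository is not shown here); the required field names are taken from the block above:

import sys

import yaml  # pip install pyyaml

REQUIRED = ("author", "title", "date", "categories")

def check_frontmatter(path):
    """Fail when the YAML front matter is missing a required field."""
    lines = open(path).read().splitlines()
    if not lines or lines[0].strip() != "---":
        sys.exit(f"{path}: no YAML front matter found")
    end = lines.index("---", 1)  # closing delimiter of the front matter
    meta = yaml.safe_load("\n".join(lines[1:end]))
    if not isinstance(meta, dict):
        # e.g. a missing space after a colon turns the mapping into a string
        sys.exit(f"{path}: front matter is not a valid YAML mapping")
    missing = [field for field in REQUIRED if not meta.get(field)]
    if missing:
        sys.exit(f"{path}: missing front matter fields: {', '.join(missing)}")
    print(f"{path}: front matter OK")

if __name__ == "__main__":
    check_frontmatter(sys.argv[1])

A workflow step could run a script like this over each post; any missing field fails the job, and the badge turns red accordingly.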