Stat 585 - Ethics and Reproducibility

Frontmatter check

Prompt:

In May 2015, Science retracted, without the consent of the lead author, a paper on how canvassers can sway people’s opinions about gay marriage (see http://www.sciencemag.org/news/2015/05/science-retracts-gay-marriage-paper-without-agreement-lead-author-lacour). The Science Editor-in-Chief cited three reasons for the retraction: the original survey data were not made available for independent reproduction of the results, survey incentives were misrepresented, and statements made about sponsorships turned out to be incorrect.
The investigation resulting in the retraction was triggered by two Berkeley grad students who attempted to replicate the study and discovered that the data must have been faked.

FiveThirtyEight has published an article with more details on the two Berkeley students’ work.

Malicious changes to the data, such as in the LaCour case, are hard to prevent, but more rigorous checks should be built into the scientific publishing system. All too often, papers have to be retracted because of unintentional errors. Retraction Watch is a database that keeps track of retracted papers (see the related Science magazine publication).

Read the paper Ten Simple Rules for Reproducible Computational Research by Sandve et al.

Write a blog post addressing the questions:

Pick one of the papers from Retraction Watch that were retracted because of errors in the paper (you might want to pick a paper from the set of featured papers, because there are usually more details available). Describe what went wrong. Would any of the rules by Sandve et al. have helped in this situation?

Solution:

I chose “Stimulus-triggered fate conversion of somatic cells into pluripotency”, which was retracted after multiple instances of image manipulation and duplication were discovered. The authors duplicated some images within an experiment and presented images from different experiments as if they came from the same experiment. These problems raise serious concerns about the reproducibility and validity of the reported results.

I think that following the ten rules for reproducible computational research by Sandve et al. could have helped the authors prevent the issues in this paper. For example, Rule 5, “Record All Intermediate Results, When Possible in Standardized Formats”, would have helped ensure that any changes to images were documented and justified. Likewise, Rule 2, “Avoid Manual Data Manipulation Steps”, would have ensured that any changes to the images and the image directory were documented. I think that if the ten rules had been followed, such a retraction could have been avoided.
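As a rough illustration of how Rule 5 could surface duplicated images, the sketch below records a SHA-256 checksum for every file in a results directory and reports files whose contents are identical. The function names and the directory layout are hypothetical and are not taken from the retracted paper or from Sandve et al. Hashing only catches byte-identical copies; cropped, rotated, or re-scaled images would need other methods.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_checksum(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_exact_duplicates(directory):
    """Group files under `directory` by checksum.

    Returns a list of groups (lists of relative paths) that share
    identical content; files with unique content are omitted.
    """
    root = Path(directory)
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_checksum(path)].append(path.relative_to(root).as_posix())
    return [names for names in groups.values() if len(names) > 1]
```

Calling `find_exact_duplicates("figures")` on a (hypothetical) figures directory before submission would list any image files that are exact copies of each other, so that each reuse can be checked and justified.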

References: https://www.researchgate.net/publication/259984904_Stimulus-triggered_fate_conversion_of_somatic_cells_into_pluripotency

After reading the paper by Sandve et al. describe which rule you are most likely to follow and why, and which rule you find the hardest to follow and will likely not (be able to) follow in your future projects.

Solution:

Sandve et al. present ten rules for reproducible computational research. I find all of them interesting, but the one I like most is Rule 1: For Every Result, Keep Track of How It Was Produced. I think this rule is very important because it means keeping a detailed record of every step taken during the research process, including data cleaning, preprocessing, and analysis. When I keep track of how every result is produced, it becomes easier to reproduce the results, which in turn helps ensure the validity of the research outputs. My second favorite is Rule 4: Version Control All Custom Scripts, as it helps me avoid losing work and keeps a history of each step of the code during research development.
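One possible way to put Rule 1 into practice is to store, next to every result file, a small record of how it was produced. The sketch below is only an illustration under my own assumptions: the function names, the sidecar file naming (`.provenance.json`), and the recorded fields are hypothetical choices, not something prescribed by Sandve et al.

```python
import hashlib
import json
import sys
from datetime import datetime, timezone
from pathlib import Path


def provenance_record(input_path, parameters):
    """Describe how a result was produced: input checksum, parameters, script, time."""
    data = Path(input_path).read_bytes()
    return {
        "input_file": str(input_path),
        "input_sha256": hashlib.sha256(data).hexdigest(),
        "parameters": parameters,
        "script": sys.argv[0],
        "python_version": sys.version.split()[0],
        "created_utc": datetime.now(timezone.utc).isoformat(),
    }


def save_with_provenance(result, output_path, input_path, parameters):
    """Write `result` as JSON plus a sidecar JSON file recording its provenance."""
    output_path = Path(output_path)
    output_path.write_text(json.dumps(result, indent=2))
    sidecar = output_path.with_name(output_path.name + ".provenance.json")
    record = provenance_record(input_path, parameters)
    sidecar.write_text(json.dumps(record, indent=2))
    return sidecar
```

Keeping the scripts themselves under version control (Rule 4) complements this record, since the sidecar only names the script and does not store its contents.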

However, the rule I find hardest to follow is Rule 9: Connect Textual Statements to Underlying Results. I will not always have access to all the tools and resources, especially when dealing with huge datasets or with data that belong to a particular organization, so it will be difficult to reproduce the results. In practice, the rule is hard to follow because of time and resource requirements, technical difficulties, and organizations’ own concerns about their data. Still, it is a very good rule that helps increase the transparency and reliability of the work.

I think this rule will be quite difficult to follow in my current research setting.

Submission

Push your changes to your repository.
You are ready to call it good once all your GitHub actions pass without an error. You can check this by selecting ‘Actions’ in the menu and making sure that the last item has a green checkmark. The action for this repository checks the YAML of your contribution for the existence of the author name, a title, a date, and categories. Don’t forget the space after the colon! Once the action passes, the badge along the top will also change its color accordingly. As of right now, the status for the YAML front matter is:

Frontmatter check

---
author: "Kundan Kumar"
title: "Ethics and Reproducibility"
date: "2023-02-23"
categories: "Ethics and Reproducibility..."
---
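
The front matter check described in the Submission section can be approximated locally before pushing. The sketch below is a simplified stand-in, not the actual GitHub action (whose implementation is not shown here): it uses plain string handling rather than a YAML parser, the function name `check_front_matter` is hypothetical, and it only handles simple one-line `key: value` fields.

```python
REQUIRED_KEYS = ("author", "title", "date", "categories")


def check_front_matter(text):
    """Return a list of problems found in the YAML front matter of `text`."""
    lines = text.strip().splitlines()
    if len(lines) < 2 or lines[0].strip() != "---":
        return ["front matter must start with '---'"]
    try:
        # Index of the closing '---' delimiter.
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        return ["front matter is missing its closing '---'"]

    problems, found = [], set()
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            continue  # not a simple "key: value" line; ignored by this sketch
        if value and not value.startswith(" "):
            problems.append(f"missing space after the colon in '{key.strip()}'")
        if value.strip():
            found.add(key.strip())
    problems += [f"missing or empty field: {k}" for k in REQUIRED_KEYS
                 if k not in found]
    return problems
```

An empty list means all four fields are present and non-empty; otherwise each entry names one problem to fix before the action is likely to pass.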