“You never forget your first… function”. Yes, I know this isn’t the right expression, but it still works.
Today, I’m here to show my first function, written during my thesis internship.
The aim of this post is not to analyze step by step the code of the function, but to understand why it was written, its use, the main feautures and how it could be improved.
Disclaimer: data shown in the table are fake.
Aim
As for all the functions, it was written because I needed to save time in copying and pasting from an Excel file to another. In particular, during my thesis internship in the Viral Pathogenesis and Biosafety Unit, I used to work with PCR techniques to analyze transcriptional levels in Zika Virus-infected macrophages. PCR machine returned an Excel file like this:Block Type | 96fast | …3 | …4 | …5 | …6 |
---|---|---|---|---|---|
Chemistry | SYBR_GREEN | NA | NA | NA | NA |
Experiment File Name | E:-USER-998.eds | NA | NA | NA | NA |
Experiment Run End Time | 2020-09-16 17:36:48 PM CEST | NA | NA | NA | NA |
Instrument Type | sds7500fast | NA | NA | NA | NA |
Passive Reference | ROX | NA | NA | NA | NA |
NA | NA | NA | NA | NA | NA |
Sample Name | Target Name | Cт | Cт Mean | Cт SD | Ct Threshold |
CTRL Nil 6h | GAPDH | 19.640373229980469 | 19.780765533447266 | 0.19854335486888885 | 0.33915683988368872 |
CTRL Nil 6h | GAPDH | 19.92115592956543 | 19.780765533447266 | 0.19854335486888885 | 0.33915683988368872 |
CTRL Nil 6h | tXBP1 | 25.290117263793945 | 24.952022552490234 | 0.47813811898231506 | 0.52222729036036653 |
CTRL Nil 6h | tXBP1 | 24.613927841186523 | 24.952022552490234 | 0.47813811898231506 | 0.52222729036036653 |
CTRL Nil 6h | sXBP1 | 30.254941940307617 | 30.524925231933594 | 0.3818126916885376 | 0.5662700331258208 |
CTRL Nil 6h | sXBP1 | 30.794906616210938 | 30.524925231933594 | 0.3818126916885376 | 0.5662700331258208 |
CTRL Nil 6h | ATF4 | 26.92424201965332 | 27.035743713378906 | 0.15768586099147797 | 0.58827107235415099 |
CTRL Nil 6h | ATF4 | 27.147243499755859 | 27.035743713378906 | 0.15768586099147797 | 0.58827107235415099 |
CTRL Nil 6h | CHOP | 26.493007659912109 | 26.470455169677734 | 0.031894039362668991 | 0.23654008298589471 |
CTRL Nil 6h | CHOP | 26.447902679443359 | 26.470455169677734 | 0.031894039362668991 | 0.23654008298589471 |
CTRL Nil 6h | BiP | 24.844993591308594 | 24.112155914306641 | 1.0363876819610596 | 0.35245788740176781 |
CTRL Nil 6h | BiP | 23.37932014465332 | 24.112155914306641 | 1.0363876819610596 | 0.35245788740176781 |
CTRL CDC 6h | GAPDH | 19.987825393676758 | 19.965446472167969 | 0.031648572534322739 | 0.33915683988368872 |
CTRL CDC 6h | GAPDH | 19.94306755065918 | 19.965446472167969 | 0.031648572534322739 | 0.33915683988368872 |
CTRL CDC 6h | tXBP1 | 25.751449584960938 | 25.805307388305664 | 0.076166436076164246 | 0.52222729036036653 |
CTRL CDC 6h | tXBP1 | 25.859165191650391 | 25.805307388305664 | 0.076166436076164246 | 0.52222729036036653 |
CTRL CDC 6h | sXBP1 | 30.767412185668945 | 30.850719451904297 | 0.11781291663646698 | 0.5662700331258208 |
CTRL CDC 6h | sXBP1 | 30.934024810791016 | 30.850719451904297 | 0.11781291663646698 | 0.5662700331258208 |
CTRL CDC 6h | ATF4 | 26.766836166381836 | 26.863870620727539 | 0.13722744584083557 | 0.58827107235415099 |
CTRL CDC 6h | ATF4 | 26.960905075073242 | 26.863870620727539 | 0.13722744584083557 | 0.58827107235415099 |
CTRL CDC 6h | CHOP | 26.424724578857422 | 26.374011993408203 | 0.07171977311372757 | 0.23654008298589471 |
CTRL CDC 6h | CHOP | 26.323297500610352 | 26.374011993408203 | 0.07171977311372757 | 0.23654008298589471 |
CTRL CDC 6h | BiP | 25.402254104614258 | 25.564586639404297 | 0.22957152128219604 | 0.35245788740176781 |
CTRL CDC 6h | BiP | 25.726917266845703 | 25.564586639404297 | 0.22957152128219604 | 0.35245788740176781 |
M1 Nil 6h | GAPDH | 20.320545196533203 | 20.252937316894531 | 0.095610633492469788 | 0.33915683988368872 |
M1 Nil 6h | GAPDH | 20.185331344604492 | 20.252937316894531 | 0.095610633492469788 | 0.33915683988368872 |
M1 Nil 6h | tXBP1 | 25.388547897338867 | 25.423015594482422 | 0.048746034502983093 | 0.52222729036036653 |
M1 Nil 6h | tXBP1 | 25.457485198974609 | 25.423015594482422 | 0.048746034502983093 | 0.52222729036036653 |
M1 Nil 6h | sXBP1 | 30.424283981323242 | 30.483283996582031 | 0.08343861997127533 | 0.5662700331258208 |
M1 Nil 6h | sXBP1 | 30.54228401184082 | 30.483283996582031 | 0.08343861997127533 | 0.5662700331258208 |
M1 Nil 6h | ATF4 | 27.50013542175293 | 27.068460464477539 | 0.61048060655593872 | 0.58827107235415099 |
M1 Nil 6h | ATF4 | 26.636785507202148 | 27.068460464477539 | 0.61048060655593872 | 0.58827107235415099 |
M1 Nil 6h | CHOP | 25.918970108032227 | 26.166193008422852 | 0.34962597489356995 | 0.23654008298589471 |
M1 Nil 6h | CHOP | 26.413415908813477 | 26.166193008422852 | 0.34962597489356995 | 0.23654008298589471 |
M1 Nil 6h | BiP | 25.549455642700195 | 25.439476013183594 | 0.1555333286523819 | 0.35245788740176781 |
M1 Nil 6h | BiP | 25.329498291015625 | 25.439476013183594 | 0.1555333286523819 | 0.35245788740176781 |
M1 CDC 6h | GAPDH | 20.08906364440918 | 20.097148895263672 | 0.011432922445237637 | 0.33915683988368872 |
M1 CDC 6h | GAPDH | 20.105232238769531 | 20.097148895263672 | 0.011432922445237637 | 0.33915683988368872 |
M1 CDC 6h | tXBP1 | 25.849300384521484 | 25.935159683227539 | 0.1214233860373497 | 0.52222729036036653 |
M1 CDC 6h | tXBP1 | 26.021018981933594 | 25.935159683227539 | 0.1214233860373497 | 0.52222729036036653 |
M1 CDC 6h | sXBP1 | 31.350503921508789 | 31.301733016967773 | 0.068972475826740265 | 0.5662700331258208 |
M1 CDC 6h | sXBP1 | 31.252962112426758 | 31.301733016967773 | 0.068972475826740265 | 0.5662700331258208 |
M1 CDC 6h | ATF4 | 27.383247375488281 | 27.352087020874023 | 0.044067397713661194 | 0.58827107235415099 |
M1 CDC 6h | ATF4 | 27.320926666259766 | 27.352087020874023 | 0.044067397713661194 | 0.58827107235415099 |
M1 CDC 6h | CHOP | 26.69500732421875 | 26.680751800537109 | 0.020160354673862457 | 0.23654008298589471 |
M1 CDC 6h | CHOP | 26.666496276855469 | 26.680751800537109 | 0.020160354673862457 | 0.23654008298589471 |
M1 CDC 6h | BiP | 25.671287536621094 | 25.666770935058594 | 0.0063887881115078926 | 0.35245788740176781 |
M1 CDC 6h | BiP | 25.662252426147461 | 25.666770935058594 | 0.0063887881115078926 | 0.35245788740176781 |
CTRL Nil D1 | GAPDH | 20.042705535888672 | 20.082698822021484 | 0.056559048593044281 | 0.33915683988368872 |
CTRL Nil D1 | GAPDH | 20.122692108154297 | 20.082698822021484 | 0.056559048593044281 | 0.33915683988368872 |
CTRL Nil D1 | tXBP1 | 25.8763427734375 | 25.880035400390625 | 0.0052208146080374718 | 0.52222729036036653 |
CTRL Nil D1 | tXBP1 | 25.883726119995117 | 25.880035400390625 | 0.0052208146080374718 | 0.52222729036036653 |
CTRL Nil D1 | sXBP1 | 31.229412078857422 | 31.258998870849609 | 0.041843391954898834 | 0.5662700331258208 |
CTRL Nil D1 | sXBP1 | 31.28858757019043 | 31.258998870849609 | 0.041843391954898834 | 0.5662700331258208 |
CTRL Nil D1 | ATF4 | 27.159099578857422 | 27.170570373535156 | 0.016223501414060593 | 0.58827107235415099 |
CTRL Nil D1 | ATF4 | 27.182043075561523 | 27.170570373535156 | 0.016223501414060593 | 0.58827107235415099 |
CTRL Nil D1 | CHOP | 26.582208633422852 | 26.690837860107422 | 0.15362358093261719 | 0.23654008298589471 |
CTRL Nil D1 | CHOP | 26.799465179443359 | 26.690837860107422 | 0.15362358093261719 | 0.23654008298589471 |
CTRL Nil D1 | BiP | 25.77208137512207 | 25.828418731689453 | 0.079671703279018402 | 0.35245788740176781 |
CTRL Nil D1 | BiP | 25.884754180908203 | 25.828418731689453 | 0.079671703279018402 | 0.35245788740176781 |
CTRL CDC D1 | GAPDH | 19.895290374755859 | 19.918468475341797 | 0.032777436077594757 | 0.33915683988368872 |
CTRL CDC D1 | GAPDH | 19.941644668579102 | 19.918468475341797 | 0.032777436077594757 | 0.33915683988368872 |
CTRL CDC D1 | tXBP1 | 25.520210266113281 | 25.508262634277344 | 0.016897851601243019 | 0.52222729036036653 |
CTRL CDC D1 | tXBP1 | 25.496313095092773 | 25.508262634277344 | 0.016897851601243019 | 0.52222729036036653 |
CTRL CDC D1 | sXBP1 | 30.057857513427734 | 30.062736511230469 | 0.0068999449722468853 | 0.5662700331258208 |
CTRL CDC D1 | sXBP1 | 30.067615509033203 | 30.062736511230469 | 0.0068999449722468853 | 0.5662700331258208 |
CTRL CDC D1 | ATF4 | 27.420682907104492 | 27.437673568725586 | 0.024028424173593521 | 0.58827107235415099 |
CTRL CDC D1 | ATF4 | 27.45466423034668 | 27.437673568725586 | 0.024028424173593521 | 0.58827107235415099 |
CTRL CDC D1 | CHOP | 26.260444641113281 | 26.347702026367188 | 0.12340057641267776 | 0.23654008298589471 |
CTRL CDC D1 | CHOP | 26.434959411621094 | 26.347702026367188 | 0.12340057641267776 | 0.23654008298589471 |
CTRL CDC D1 | BiP | 25.359561920166016 | 25.348148345947266 | 0.016142580658197403 | 0.35245788740176781 |
CTRL CDC D1 | BiP | 25.336732864379883 | 25.348148345947266 | 0.016142580658197403 | 0.35245788740176781 |
M1 Nil D1 | GAPDH | 19.990224838256836 | 19.982322692871094 | 0.011173972859978676 | 0.33915683988368872 |
M1 Nil D1 | GAPDH | 19.974422454833984 | 19.982322692871094 | 0.011173972859978676 | 0.33915683988368872 |
M1 Nil D1 | tXBP1 | 25.831789016723633 | 25.692073822021484 | 0.19758577644824982 | 0.52222729036036653 |
M1 Nil D1 | tXBP1 | 25.552360534667969 | 25.692073822021484 | 0.19758577644824982 | 0.52222729036036653 |
M1 Nil D1 | sXBP1 | 30.719028472900391 | 30.761308670043945 | 0.059793226420879364 | 0.5662700331258208 |
M1 Nil D1 | sXBP1 | 30.8035888671875 | 30.761308670043945 | 0.059793226420879364 | 0.5662700331258208 |
M1 Nil D1 | ATF4 | 26.930835723876953 | 26.874160766601562 | 0.080151841044425964 | 0.58827107235415099 |
M1 Nil D1 | ATF4 | 26.817483901977539 | 26.874160766601562 | 0.080151841044425964 | 0.58827107235415099 |
M1 Nil D1 | CHOP | 26.15199089050293 | 26.175567626953125 | 0.033342540264129639 | 0.23654008298589471 |
M1 Nil D1 | CHOP | 26.19914436340332 | 26.175567626953125 | 0.033342540264129639 | 0.23654008298589471 |
M1 Nil D1 | BiP | 25.138555526733398 | 25.358840942382812 | 0.31153196096420288 | 0.35245788740176781 |
M1 Nil D1 | BiP | 25.579128265380859 | 25.358840942382812 | 0.31153196096420288 | 0.35245788740176781 |
M1 CDC D1 | GAPDH | 19.693199157714844 | 19.692794799804688 | 0.00057319714687764645 | 0.33915683988368872 |
M1 CDC D1 | GAPDH | 19.692388534545898 | 19.692794799804688 | 0.00057319714687764645 | 0.33915683988368872 |
M1 CDC D1 | tXBP1 | 25.209577560424805 | 25.069972991943359 | 0.19742932915687561 | 0.52222729036036653 |
M1 CDC D1 | tXBP1 | 24.930370330810547 | 25.069972991943359 | 0.19742932915687561 | 0.52222729036036653 |
M1 CDC D1 | sXBP1 | 30.504255294799805 | 30.463470458984375 | 0.057679817080497742 | 0.5662700331258208 |
M1 CDC D1 | sXBP1 | 30.422683715820312 | 30.463470458984375 | 0.057679817080497742 | 0.5662700331258208 |
M1 CDC D1 | ATF4 | 27.132936477661133 | 27.047580718994141 | 0.12071262300014496 | 0.58827107235415099 |
M1 CDC D1 | ATF4 | 26.962223052978516 | 27.047580718994141 | 0.12071262300014496 | 0.58827107235415099 |
M1 CDC D1 | CHOP | 25.797645568847656 | 26.164039611816406 | 0.51816076040267944 | 0.23654008298589471 |
M1 CDC D1 | CHOP | 26.530435562133789 | 26.164039611816406 | 0.51816076040267944 | 0.23654008298589471 |
M1 CDC D1 | BiP | 24.931297302246094 | 24.782367706298828 | 0.21061959862709045 | 0.35245788740176781 |
M1 CDC D1 | BiP | 24.63343620300293 | 24.782367706298828 | 0.21061959862709045 | 0.35245788740176781 |
NA | NA | NA | NA | NA | NA |
Analysis Type | Singleplex | NA | NA | NA | NA |
Endogenous Control | GAPDH | NA | NA | NA | NA |
RQ Min/Max Confidence Level | 95.0 | NA | NA | NA | NA |
Reference Sample | CTRL Nil 6h | NA | NA | NA | NA |
What I had to do was copying ang pasting some cells into another Excel file (used by other lab members), in which for each gene I had the following table:
Sample.Name | Cт.Mean | Sign | Cт.SD |
---|---|---|---|
CTRL Nil 6h | 19.78077 | ± | 0.1985434 |
CTRL CDC 6h | 19.96545 | ± | 0.0316486 |
M1 Nil 6h | 20.25294 | ± | 0.0956106 |
M1 CDC 6h | 20.09715 | ± | 0.0114329 |
CTRL Nil D1 | 20.08270 | ± | 0.0565590 |
CTRL CDC D1 | 19.91847 | ± | 0.0327774 |
M1 Nil D1 | 19.98232 | ± | 0.0111740 |
M1 CDC D1 | 19.69279 | ± | 0.0005732 |
As you can see by comparing the tables, the needed work is time-consuming and error-prone, since the rows are organized by the name of the sample and not by gene and each sample/gene combination is in duplicate.
This means that every time I had to filter out by gene, carefully select the right cells, and paste them in the other file; you can clearly imagine how long it used take and how many mistakes had been made.
For all these reasons, one night I decided to write a function that takes the PCR file and returns an Excel file with one gene table per sheet, all in ~20 seconds (and you will now see why it takes so much time).
Function command and required packages
To launch the function, you just need to type the name of the function, with no input required; however, the following packages must be already installed in the machine:Main steps
Let’s see roughly how the function is organized, without going into the details of the code, as this is not the aim of this post.
Since the function has no input, you have to somehow tell it which file to take: for this reason, the first line of code asks the user to choose the file of interest (via file.choose ()
) and save the name/path into a character variable.
The table is then imported into R; as you have noticed before, the table contains header rows and final rows that are not needed and that must be removed in such a way as to obtain an ordered table, with values in all the cells and with right column names: this is the first operation that the function implements on the table.
Subsequently, the user is asked to indicate which columns to keep (Sample, Target, Ct mean and Ct sd), in order to extrapolate them and work on a smaller dataset, including only the required data. In this regard, since I only need Ct mean and sd, one duplicate for each gene/target combination is deleted.
At this point everything is ready, thanks to a for loop
in which one gene is taken into account in each iteration, the final Excel file is created and saved in the same folder as the original file.
Possible improvements
Although in the eyes of the less experienced it may seem a well-done function (and I was convinced of it when I created it), now that I look at it with a more “trained” vision, I can guarantee that it can be improved in several points. In particular, there are several limits:
This is an example of how working more and more with R, you can notice inaccuracies and rookie “mistakes” - which are normal - and already have an idea of how you can improve your work, but this will be the content of a future post.