This page looks best with JavaScript enabled

My first function

 ·  🕘 8 min read  ·  🤖 Matteo Miotto

“You never forget your first… function”. Yes, I know this isn’t the right expression, but it still works.
Today, I’m here to show my first function, written during my thesis internship.
The aim of this post is not to analyze step by step the code of the function, but to understand why it was written, its use, the main feautures and how it could be improved.
Disclaimer: data shown in the table are fake.

Aim

As for all the functions, it was written because I needed to save time in copying and pasting from an Excel file to another. In particular, during my thesis internship in the Viral Pathogenesis and Biosafety Unit, I used to work with PCR techniques to analyze transcriptional levels in Zika Virus-infected macrophages. PCR machine returned an Excel file like this:
Block Type 96fast …3 …4 …5 …6
Chemistry SYBR_GREEN NA NA NA NA
Experiment File Name E:-USER-998.eds NA NA NA NA
Experiment Run End Time 2020-09-16 17:36:48 PM CEST NA NA NA NA
Instrument Type sds7500fast NA NA NA NA
Passive Reference ROX NA NA NA NA
NA NA NA NA NA NA
Sample Name Target Name Cт Mean Cт SD Ct Threshold
CTRL Nil 6h GAPDH 19.640373229980469 19.780765533447266 0.19854335486888885 0.33915683988368872
CTRL Nil 6h GAPDH 19.92115592956543 19.780765533447266 0.19854335486888885 0.33915683988368872
CTRL Nil 6h tXBP1 25.290117263793945 24.952022552490234 0.47813811898231506 0.52222729036036653
CTRL Nil 6h tXBP1 24.613927841186523 24.952022552490234 0.47813811898231506 0.52222729036036653
CTRL Nil 6h sXBP1 30.254941940307617 30.524925231933594 0.3818126916885376 0.5662700331258208
CTRL Nil 6h sXBP1 30.794906616210938 30.524925231933594 0.3818126916885376 0.5662700331258208
CTRL Nil 6h ATF4 26.92424201965332 27.035743713378906 0.15768586099147797 0.58827107235415099
CTRL Nil 6h ATF4 27.147243499755859 27.035743713378906 0.15768586099147797 0.58827107235415099
CTRL Nil 6h CHOP 26.493007659912109 26.470455169677734 0.031894039362668991 0.23654008298589471
CTRL Nil 6h CHOP 26.447902679443359 26.470455169677734 0.031894039362668991 0.23654008298589471
CTRL Nil 6h BiP 24.844993591308594 24.112155914306641 1.0363876819610596 0.35245788740176781
CTRL Nil 6h BiP 23.37932014465332 24.112155914306641 1.0363876819610596 0.35245788740176781
CTRL CDC 6h GAPDH 19.987825393676758 19.965446472167969 0.031648572534322739 0.33915683988368872
CTRL CDC 6h GAPDH 19.94306755065918 19.965446472167969 0.031648572534322739 0.33915683988368872
CTRL CDC 6h tXBP1 25.751449584960938 25.805307388305664 0.076166436076164246 0.52222729036036653
CTRL CDC 6h tXBP1 25.859165191650391 25.805307388305664 0.076166436076164246 0.52222729036036653
CTRL CDC 6h sXBP1 30.767412185668945 30.850719451904297 0.11781291663646698 0.5662700331258208
CTRL CDC 6h sXBP1 30.934024810791016 30.850719451904297 0.11781291663646698 0.5662700331258208
CTRL CDC 6h ATF4 26.766836166381836 26.863870620727539 0.13722744584083557 0.58827107235415099
CTRL CDC 6h ATF4 26.960905075073242 26.863870620727539 0.13722744584083557 0.58827107235415099
CTRL CDC 6h CHOP 26.424724578857422 26.374011993408203 0.07171977311372757 0.23654008298589471
CTRL CDC 6h CHOP 26.323297500610352 26.374011993408203 0.07171977311372757 0.23654008298589471
CTRL CDC 6h BiP 25.402254104614258 25.564586639404297 0.22957152128219604 0.35245788740176781
CTRL CDC 6h BiP 25.726917266845703 25.564586639404297 0.22957152128219604 0.35245788740176781
M1 Nil 6h GAPDH 20.320545196533203 20.252937316894531 0.095610633492469788 0.33915683988368872
M1 Nil 6h GAPDH 20.185331344604492 20.252937316894531 0.095610633492469788 0.33915683988368872
M1 Nil 6h tXBP1 25.388547897338867 25.423015594482422 0.048746034502983093 0.52222729036036653
M1 Nil 6h tXBP1 25.457485198974609 25.423015594482422 0.048746034502983093 0.52222729036036653
M1 Nil 6h sXBP1 30.424283981323242 30.483283996582031 0.08343861997127533 0.5662700331258208
M1 Nil 6h sXBP1 30.54228401184082 30.483283996582031 0.08343861997127533 0.5662700331258208
M1 Nil 6h ATF4 27.50013542175293 27.068460464477539 0.61048060655593872 0.58827107235415099
M1 Nil 6h ATF4 26.636785507202148 27.068460464477539 0.61048060655593872 0.58827107235415099
M1 Nil 6h CHOP 25.918970108032227 26.166193008422852 0.34962597489356995 0.23654008298589471
M1 Nil 6h CHOP 26.413415908813477 26.166193008422852 0.34962597489356995 0.23654008298589471
M1 Nil 6h BiP 25.549455642700195 25.439476013183594 0.1555333286523819 0.35245788740176781
M1 Nil 6h BiP 25.329498291015625 25.439476013183594 0.1555333286523819 0.35245788740176781
M1 CDC 6h GAPDH 20.08906364440918 20.097148895263672 0.011432922445237637 0.33915683988368872
M1 CDC 6h GAPDH 20.105232238769531 20.097148895263672 0.011432922445237637 0.33915683988368872
M1 CDC 6h tXBP1 25.849300384521484 25.935159683227539 0.1214233860373497 0.52222729036036653
M1 CDC 6h tXBP1 26.021018981933594 25.935159683227539 0.1214233860373497 0.52222729036036653
M1 CDC 6h sXBP1 31.350503921508789 31.301733016967773 0.068972475826740265 0.5662700331258208
M1 CDC 6h sXBP1 31.252962112426758 31.301733016967773 0.068972475826740265 0.5662700331258208
M1 CDC 6h ATF4 27.383247375488281 27.352087020874023 0.044067397713661194 0.58827107235415099
M1 CDC 6h ATF4 27.320926666259766 27.352087020874023 0.044067397713661194 0.58827107235415099
M1 CDC 6h CHOP 26.69500732421875 26.680751800537109 0.020160354673862457 0.23654008298589471
M1 CDC 6h CHOP 26.666496276855469 26.680751800537109 0.020160354673862457 0.23654008298589471
M1 CDC 6h BiP 25.671287536621094 25.666770935058594 0.0063887881115078926 0.35245788740176781
M1 CDC 6h BiP 25.662252426147461 25.666770935058594 0.0063887881115078926 0.35245788740176781
CTRL Nil D1 GAPDH 20.042705535888672 20.082698822021484 0.056559048593044281 0.33915683988368872
CTRL Nil D1 GAPDH 20.122692108154297 20.082698822021484 0.056559048593044281 0.33915683988368872
CTRL Nil D1 tXBP1 25.8763427734375 25.880035400390625 0.0052208146080374718 0.52222729036036653
CTRL Nil D1 tXBP1 25.883726119995117 25.880035400390625 0.0052208146080374718 0.52222729036036653
CTRL Nil D1 sXBP1 31.229412078857422 31.258998870849609 0.041843391954898834 0.5662700331258208
CTRL Nil D1 sXBP1 31.28858757019043 31.258998870849609 0.041843391954898834 0.5662700331258208
CTRL Nil D1 ATF4 27.159099578857422 27.170570373535156 0.016223501414060593 0.58827107235415099
CTRL Nil D1 ATF4 27.182043075561523 27.170570373535156 0.016223501414060593 0.58827107235415099
CTRL Nil D1 CHOP 26.582208633422852 26.690837860107422 0.15362358093261719 0.23654008298589471
CTRL Nil D1 CHOP 26.799465179443359 26.690837860107422 0.15362358093261719 0.23654008298589471
CTRL Nil D1 BiP 25.77208137512207 25.828418731689453 0.079671703279018402 0.35245788740176781
CTRL Nil D1 BiP 25.884754180908203 25.828418731689453 0.079671703279018402 0.35245788740176781
CTRL CDC D1 GAPDH 19.895290374755859 19.918468475341797 0.032777436077594757 0.33915683988368872
CTRL CDC D1 GAPDH 19.941644668579102 19.918468475341797 0.032777436077594757 0.33915683988368872
CTRL CDC D1 tXBP1 25.520210266113281 25.508262634277344 0.016897851601243019 0.52222729036036653
CTRL CDC D1 tXBP1 25.496313095092773 25.508262634277344 0.016897851601243019 0.52222729036036653
CTRL CDC D1 sXBP1 30.057857513427734 30.062736511230469 0.0068999449722468853 0.5662700331258208
CTRL CDC D1 sXBP1 30.067615509033203 30.062736511230469 0.0068999449722468853 0.5662700331258208
CTRL CDC D1 ATF4 27.420682907104492 27.437673568725586 0.024028424173593521 0.58827107235415099
CTRL CDC D1 ATF4 27.45466423034668 27.437673568725586 0.024028424173593521 0.58827107235415099
CTRL CDC D1 CHOP 26.260444641113281 26.347702026367188 0.12340057641267776 0.23654008298589471
CTRL CDC D1 CHOP 26.434959411621094 26.347702026367188 0.12340057641267776 0.23654008298589471
CTRL CDC D1 BiP 25.359561920166016 25.348148345947266 0.016142580658197403 0.35245788740176781
CTRL CDC D1 BiP 25.336732864379883 25.348148345947266 0.016142580658197403 0.35245788740176781
M1 Nil D1 GAPDH 19.990224838256836 19.982322692871094 0.011173972859978676 0.33915683988368872
M1 Nil D1 GAPDH 19.974422454833984 19.982322692871094 0.011173972859978676 0.33915683988368872
M1 Nil D1 tXBP1 25.831789016723633 25.692073822021484 0.19758577644824982 0.52222729036036653
M1 Nil D1 tXBP1 25.552360534667969 25.692073822021484 0.19758577644824982 0.52222729036036653
M1 Nil D1 sXBP1 30.719028472900391 30.761308670043945 0.059793226420879364 0.5662700331258208
M1 Nil D1 sXBP1 30.8035888671875 30.761308670043945 0.059793226420879364 0.5662700331258208
M1 Nil D1 ATF4 26.930835723876953 26.874160766601562 0.080151841044425964 0.58827107235415099
M1 Nil D1 ATF4 26.817483901977539 26.874160766601562 0.080151841044425964 0.58827107235415099
M1 Nil D1 CHOP 26.15199089050293 26.175567626953125 0.033342540264129639 0.23654008298589471
M1 Nil D1 CHOP 26.19914436340332 26.175567626953125 0.033342540264129639 0.23654008298589471
M1 Nil D1 BiP 25.138555526733398 25.358840942382812 0.31153196096420288 0.35245788740176781
M1 Nil D1 BiP 25.579128265380859 25.358840942382812 0.31153196096420288 0.35245788740176781
M1 CDC D1 GAPDH 19.693199157714844 19.692794799804688 0.00057319714687764645 0.33915683988368872
M1 CDC D1 GAPDH 19.692388534545898 19.692794799804688 0.00057319714687764645 0.33915683988368872
M1 CDC D1 tXBP1 25.209577560424805 25.069972991943359 0.19742932915687561 0.52222729036036653
M1 CDC D1 tXBP1 24.930370330810547 25.069972991943359 0.19742932915687561 0.52222729036036653
M1 CDC D1 sXBP1 30.504255294799805 30.463470458984375 0.057679817080497742 0.5662700331258208
M1 CDC D1 sXBP1 30.422683715820312 30.463470458984375 0.057679817080497742 0.5662700331258208
M1 CDC D1 ATF4 27.132936477661133 27.047580718994141 0.12071262300014496 0.58827107235415099
M1 CDC D1 ATF4 26.962223052978516 27.047580718994141 0.12071262300014496 0.58827107235415099
M1 CDC D1 CHOP 25.797645568847656 26.164039611816406 0.51816076040267944 0.23654008298589471
M1 CDC D1 CHOP 26.530435562133789 26.164039611816406 0.51816076040267944 0.23654008298589471
M1 CDC D1 BiP 24.931297302246094 24.782367706298828 0.21061959862709045 0.35245788740176781
M1 CDC D1 BiP 24.63343620300293 24.782367706298828 0.21061959862709045 0.35245788740176781
NA NA NA NA NA NA
Analysis Type Singleplex NA NA NA NA
Endogenous Control GAPDH NA NA NA NA
RQ Min/Max Confidence Level 95.0 NA NA NA NA
Reference Sample CTRL Nil 6h NA NA NA NA


What I had to do was copying ang pasting some cells into another Excel file (used by other lab members), in which for each gene I had the following table:

Sample.Name Cт.Mean Sign Cт.SD
CTRL Nil 6h 19.78077 ± 0.1985434
CTRL CDC 6h 19.96545 ± 0.0316486
M1 Nil 6h 20.25294 ± 0.0956106
M1 CDC 6h 20.09715 ± 0.0114329
CTRL Nil D1 20.08270 ± 0.0565590
CTRL CDC D1 19.91847 ± 0.0327774
M1 Nil D1 19.98232 ± 0.0111740
M1 CDC D1 19.69279 ± 0.0005732


As you can see by comparing the tables, the needed work is time-consuming and error-prone, since the rows are organized by the name of the sample and not by gene and each sample/gene combination is in duplicate. This means that every time I had to filter out by gene, carefully select the right cells, and paste them in the other file; you can clearly imagine how long it used take and how many mistakes had been made.

For all these reasons, one night I decided to write a function that takes the PCR file and returns an Excel file with one gene table per sheet, all in ~20 seconds (and you will now see why it takes so much time).

Function command and required packages

To launch the function, you just need to type the name of the function, with no input required; however, the following packages must be already installed in the machine:
  • readxl
  • svGUI
  • svDialogs
  • dplyr
  • xlsx
  • Main steps

    Let’s see roughly how the function is organized, without going into the details of the code, as this is not the aim of this post.

    Since the function has no input, you have to somehow tell it which file to take: for this reason, the first line of code asks the user to choose the file of interest (via file.choose ()) and save the name/path into a character variable. The table is then imported into R; as you have noticed before, the table contains header rows and final rows that are not needed and that must be removed in such a way as to obtain an ordered table, with values in all the cells and with right column names: this is the first operation that the function implements on the table.

    Subsequently, the user is asked to indicate which columns to keep (Sample, Target, Ct mean and Ct sd), in order to extrapolate them and work on a smaller dataset, including only the required data. In this regard, since I only need Ct mean and sd, one duplicate for each gene/target combination is deleted.

    At this point everything is ready, thanks to a for loop in which one gene is taken into account in each iteration, the final Excel file is created and saved in the same folder as the original file.

    Possible improvements

    Although in the eyes of the less experienced it may seem a well-done function (and I was convinced of it when I created it), now that I look at it with a more “trained” vision, I can guarantee that it can be improved in several points. In particular, there are several limits:
  • total absence of automation: the user must select one file at a time and also indicate the reference columns
  • length of the process: again, for the reasons above, there is a need for interaction with the user and, therefore, more time requested
  • possibility to work only if the analyses are done in duplicates, otherwise some rows of interest are deleted
  • output in Excel format only
  • one table per sheet in the output, it would be better to have the tables one below the other in one single sheet/file

  • This is an example of how working more and more with R, you can notice inaccuracies and rookie “mistakes” - which are normal - and already have an idea of how you can improve your work, but this will be the content of a future post.

    Share on
    Support the author with

    Matteo Miotto
    WRITTEN BY
    Matteo Miotto
    Genomic Data Science master student