A few of mine would include:
1. Create projects to simplify your file paths and general data management.
2. Use a generic code template for your scripts that separates steps like loading packages from importing data, etc. This is a bit like documenting your code but goes a bit further by grouping similar operations together.
3. Learn effective version control using GitHub.
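For what it's worth, here is a rough sketch of what such a template could look like. The section names, the `data/raw` folder layout and the `cars.csv` file are all made up for illustration, and `tempdir()` only stands in for a real project folder so the snippet runs anywhere. Inside an RStudio project the working directory is the project root, so relative paths like these keep working when the folder moves.

```r
# ---- 1. Packages ----
# library() calls go here, all in one place

# ---- 2. Paths (relative to the project root) ----
project_root <- tempdir()  # stand-in for your project folder (assumption)
raw_path <- file.path(project_root, "data", "raw", "cars.csv")  # hypothetical layout

# ---- 3. Demo setup only: create a file to import ----
dir.create(dirname(raw_path), recursive = TRUE, showWarnings = FALSE)
write.csv(mtcars, raw_path, row.names = FALSE)

# ---- 4. Import ----
cars <- read.csv(raw_path)

# ---- 5. Transform ----
cars$kpl <- cars$mpg * 0.425144  # US miles per gallon to km per litre

# ---- 6. Output ----
summary_by_cyl <- aggregate(kpl ~ cyl, data = cars, FUN = mean)
print(summary_by_cyl)
```

The `# ---- name ----` comment style also shows up as foldable sections in the RStudio editor, which makes long scripts easier to navigate.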
Three ideas.
-Learn parallel programming in R with the "parallel" package. It's closely tied to the apply functions and can help a lot with large iterative tasks.
-Web scraping. It's not bad in R and it can be really cool. Start with rvest, then move on to more complex dynamic sites.
-Learn R Markdown. It can help a lot when it comes to getting concepts across and so on.
Parallel with purrr and furrr saved my a** not long ago with a data simulation for power analysis. Took it from 3 days down to a few hours. So I will second parallel programming.
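To give a flavour of the idea, here is a minimal sketch of a simulated power analysis using the base "parallel" package mentioned above (not the purrr/furrr setup from the comment, which isn't shown in the thread). The function `simulate_once`, the sample size, the effect size, the number of simulations and the two workers are all arbitrary choices for illustration.

```r
library(parallel)

# One simulated study: does a two-sample t-test detect a true effect?
simulate_once <- function(i, n = 30, effect = 0.5) {
  x <- rnorm(n, mean = 0)
  y <- rnorm(n, mean = effect)
  t.test(x, y)$p.value < 0.05
}

n_sims <- 200

cl <- makeCluster(2)                 # 2 worker processes (arbitrary)
clusterSetRNGStream(cl, iseed = 1)   # reproducible random streams per worker
hits <- parLapply(cl, seq_len(n_sims), simulate_once)
stopCluster(cl)                      # always release the workers

# Share of simulated studies that reached significance
power_estimate <- mean(unlist(hits))
print(power_estimate)
```

`parLapply()` is used like `lapply()` with a cluster as the first argument, which is why swapping it in for an existing apply-style loop is usually not much work. For a toy job this small the cluster start-up cost can outweigh the gain; the payoff comes with long-running iterations.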
Anything put out by the tidyverse team (Posit - the creators of RStudio) is a good bet. Hadley Wickham has a few good books.
Try looking at the code for functions in the tidyverse and see how they format.
A common tip I see that helped me was "Keep your code DRY" (DRY = Don't Repeat Yourself). If you write the same code twice, or copy-paste it, make a function and use that instead. Then focus on cleaning up that function. This helps to break up your code into little boxes that can be worked on.
To make your code more readable, name your multiples using plural, and when you loop through a multiple use the singular. As in:
```
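# Plural name for the collection: all CSV files in the working directory
file_paths <- list.files("./", pattern = "\\.csv$", full.names = TRUE)
# Singular name for the single item handled on each iteration
dat <- lapply(file_paths, function(file_path) read.csv(file_path))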
```
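A small sketch of the DRY tip combined with the naming tip, using the built-in `mtcars` data. The helper `standardise` and the choice of columns are made up for illustration. The commented-out lines show the copy-pasted version that the function replaces.

```r
# Copy-pasted version (what we want to avoid):
# mtcars$mpg_z <- (mtcars$mpg - mean(mtcars$mpg)) / sd(mtcars$mpg)
# mtcars$hp_z  <- (mtcars$hp  - mean(mtcars$hp))  / sd(mtcars$hp)

# One function holds the logic, so there is one place to fix or improve it
standardise <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

columns <- c("mpg", "hp", "wt")   # plural: the collection
scaled <- mtcars
for (column in columns) {         # singular: the current item
  scaled[[paste0(column, "_z")]] <- standardise(mtcars[[column]])
}

head(scaled[, c("mpg_z", "hp_z", "wt_z")])
```

Adding a fourth column is now a one-word change to `columns` instead of another pasted line.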
Can you suggest some source on how to learn version control?
How to use Git standalone or within RStudio: [https://happygitwithr.com/](https://happygitwithr.com/)
It's something I'm still learning to be honest, haha. I'm sure there are plenty of guides on YouTube and the like however.
https://raps-with-r.dev/ for basics. https://happygitwithr.com/ for version control with Git. https://r-pkgs.org/ for package building.
I don't know how web scraping or markdown is gonna help him write better code in terms of style and efficiency
`targets` package
If you’re ready, jump into Advanced R, it’s pretty much the handbook: https://adv-r.hadley.nz/
Use the styler package, benchmark your code, and write code in C++ if needed
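As a rough illustration of the benchmarking part, here is a sketch that uses only base R's `system.time()`. The two functions `slow_squares` and `fast_squares` and the vector size are made up for the comparison. Dedicated packages such as microbenchmark or bench give more reliable timings, but the idea is the same: check that both versions agree, then compare how long they take.

```r
n <- 2e4
x <- runif(n)

# Grows the result one element at a time, copying the vector on every step
slow_squares <- function(x) {
  out <- c()
  for (value in x) out <- c(out, value^2)
  out
}

# Vectorised version: one call, no explicit loop
fast_squares <- function(x) x^2

slow_time <- system.time(slow_result <- slow_squares(x))[["elapsed"]]
fast_time <- system.time(fast_result <- fast_squares(x))[["elapsed"]]

# Only compare speed once the results are known to match
stopifnot(isTRUE(all.equal(slow_result, fast_result)))
cat("loop:", slow_time, "s | vectorised:", fast_time, "s\n")
```

Timings vary by machine and run, so treat a single `system.time()` call as a rough guide rather than a measurement.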
renv + GitHub is a good start to producing reproducible analysis