1. Use the python3 engine in R Markdown

The reticulate package provides a comprehensive set of tools for interoperability between Python and R, including facilities for translating objects between the two languages. reticulate embeds a Python session within your R session, enabling seamless, high-performance interoperability. If you are an R developer who uses Python for some of your work, or a member of a data science team that uses both languages, reticulate can dramatically streamline your workflow.

When calling into Python, R data types are automatically converted to their equivalent Python types. When values are returned from Python to R they are converted back to R types. Types are converted as follows:

Data type conversion between R and Python
| R | Python | Examples |
|---|---|---|
| Single-element vector | Scalar | 1, 1L, TRUE, "foo" |
| Multi-element vector | List | c(1.0, 2.0, 3.0), c(1L, 2L, 3L) |
| List of multiple types | Tuple | list(1L, TRUE, "foo") |
| Named list | Dict | list(a = 1L, b = 2.0), dict(x = x_data) |
| Matrix/Array | NumPy ndarray | matrix(c(1,2,3,4), nrow = 2, ncol = 2) |
| Data Frame | Pandas DataFrame | data.frame(x = c(1,2,3), y = c("a", "b", "c")) |
| Function | Python function | function(x) x + 1 |
| NULL, TRUE, FALSE | None, True, False | NULL, TRUE, FALSE |

(source: https://rstudio.github.io/reticulate/)
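
As a rough illustration of the Python column in the conversion table, the sketch below pairs some of the listed R expressions with the Python values they would correspond to according to that table. It is plain Python and does not call reticulate; the name `examples` is made up for this illustration, and the NumPy and pandas rows are left out so the block has no dependencies.

```python
# Pairs of (R expression as text, Python value it maps to per the table).
# Hypothetical illustration only: nothing here is converted by reticulate.
examples = [
    ("1", 1.0),                                     # single-element numeric vector -> scalar
    ("1L", 1),                                      # single-element integer vector -> scalar
    ("TRUE", True),
    ('"foo"', "foo"),
    ("c(1.0, 2.0, 3.0)", [1.0, 2.0, 3.0]),          # multi-element vector -> list
    ('list(1L, TRUE, "foo")', (1, True, "foo")),    # list of multiple types -> tuple
    ("list(a = 1L, b = 2.0)", {"a": 1, "b": 2.0}),  # named list -> dict
    ("NULL", None),
]

# Print the Python type that each R expression ends up as.
for r_expr, py_value in examples:
    print(f"{r_expr:24s} -> {type(py_value).__name__}")
```

Checking `type()` like this inside a real Python chunk, on an object obtained through `r.`, is a quick way to see which row of the table applies.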

2. Example

2.1 Apply the python3 engine

# use python3 engine
library(reticulate)
use_python("/usr/local/bin/python3")
library(dplyr)
library(purrr)
iris %>% head %>% knitr::kable(format = "markdown") 
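| Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species |
|---:|---:|---:|---:|:---|
| 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 5.0 | 3.6 | 1.4 | 0.2 | setosa |
| 5.4 | 3.9 | 1.7 | 0.4 | setosa |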
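
To check that later Python chunks really run on a Python 3 interpreter, a Python chunk can report its own executable and version. This is a minimal sketch; `interpreter_info` is a helper name invented here, and the printed path depends on the machine and on how Python is embedded.

```python
import sys

def interpreter_info():
    """Return the path and version of the interpreter running this code."""
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "is_python3": sys.version_info.major == 3,
    }

info = interpreter_info()
print(info["executable"], info["version"])
```

If the reported path is not the one passed to `use_python()`, one possible cause is that Python was already initialized earlier in the R session, before the call was made.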

2.2 Data from R to Python environment

# import the R object iris into Python through reticulate's `r` object
iris = r.iris
# Python code: split the four measurement columns from the Species label
iris2 = iris
x = iris2.iloc[:,:4]
y = iris2.iloc[:,4]
# this chunk is Python code
import pandas as pd # data structure
import seaborn as sns # visualization
from sklearn.ensemble import RandomForestClassifier # algorithm
from sklearn.model_selection import cross_val_score # target function for Bayesian optimization
from bayes_opt import BayesianOptimization # Bayesian optimization

# Bayesian optimization
def rf_cv(n_estimators, min_samples_split, max_features, max_depth):
    val = cross_val_score(RandomForestClassifier(n_estimators=int(n_estimators),
                                                 min_samples_split=int(min_samples_split),
                                                 max_features=min(max_features, 0.999),
                                                 max_depth=int(max_depth),
                                                 random_state=2),
                          x, y, cv=5).mean()
    return val
rf_bo = BayesianOptimization(rf_cv,
                            {'n_estimators': (10, 250),
                             'min_samples_split': (2, 25),
                             'max_features': (0.1, 0.999),
                             'max_depth': (5, 15) } ) # object for Bayesian optimization
    
rf_bo.maximize() # optimizing
## |   iter    |  target   | max_depth | max_fe... | min_sa... | n_esti... |
## -------------------------------------------------------------------------
## |  1        |  0.96     |  11.78    |  0.9302   |  23.51    |  75.09    |
## |  2        |  0.96     |  6.404    |  0.5346   |  3.992    |  242.2    |
## |  3        |  0.9533   |  8.627    |  0.3307   |  20.8     |  163.3    |
## |  4        |  0.9533   |  12.63    |  0.3901   |  14.77    |  149.8    |
## |  5        |  0.9533   |  12.3     |  0.8749   |  16.16    |  158.7    |
## |  6        |  0.9667   |  12.21    |  0.7069   |  22.37    |  74.29    |
## |  7        |  0.9667   |  11.62    |  0.6678   |  21.36    |  74.47    |
## |  8        |  0.96     |  11.89    |  0.1083   |  21.52    |  72.43    |
## |  9        |  0.9667   |  12.99    |  0.5134   |  20.55    |  74.97    |
## |  10       |  0.96     |  13.3     |  0.9152   |  19.52    |  76.82    |
## |  11       |  0.96     |  14.53    |  0.2492   |  20.64    |  72.7     |
## |  12       |  0.9667   |  14.43    |  0.6484   |  23.27    |  74.33    |
## |  13       |  0.96     |  13.17    |  0.1849   |  21.68    |  75.26    |
## |  14       |  0.9667   |  13.76    |  0.6379   |  22.45    |  73.29    |
## |  15       |  0.9533   |  14.91    |  0.1762   |  24.03    |  72.7     |
## |  16       |  0.96     |  13.48    |  0.9034   |  22.97    |  73.78    |
## |  17       |  0.96     |  10.1     |  0.1503   |  17.13    |  67.45    |
## |  18       |  0.96     |  11.22    |  0.9914   |  22.36    |  74.87    |
## |  19       |  0.9667   |  6.3      |  0.6236   |  7.066    |  49.52    |
## |  20       |  0.9533   |  12.86    |  0.4823   |  7.341    |  103.9    |
## |  21       |  0.96     |  8.98     |  0.8033   |  6.907    |  203.3    |
## |  22       |  0.9467   |  13.34    |  0.3529   |  23.99    |  128.1    |
## |  23       |  0.96     |  12.16    |  0.1344   |  20.59    |  74.49    |
## |  24       |  0.9667   |  12.67    |  0.5545   |  21.62    |  73.52    |
## |  25       |  0.9667   |  12.59    |  0.7377   |  21.6     |  74.47    |
## |  26       |  0.96     |  13.77    |  0.8722   |  21.26    |  73.84    |
## |  27       |  0.96     |  11.96    |  0.2607   |  21.84    |  73.94    |
## |  28       |  0.96     |  14.71    |  0.9825   |  23.55    |  74.87    |
## |  29       |  0.9667   |  12.92    |  0.6757   |  20.42    |  80.11    |
## |  30       |  0.9533   |  9.965    |  0.1326   |  9.472    |  159.8    |
## =========================================================================
res = pd.DataFrame(rf_bo.res)
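
The optimization results can also be inspected without pandas. The sketch below assumes a bayes_opt release in which `rf_bo.res` is a list of dictionaries, each holding a `target` value and a nested `params` dictionary; that shape is an assumption about the installed version, so check yours. The two sample records are copied from iterations 1 and 6 of the log above, and `sample_res`, `flatten`, and `best_params` are names invented for this example.

```python
# Two records in the assumed shape of rf_bo.res (iterations 1 and 6 of the log).
sample_res = [
    {"target": 0.96,
     "params": {"max_depth": 11.78, "max_features": 0.9302,
                "min_samples_split": 23.51, "n_estimators": 75.09}},
    {"target": 0.9667,
     "params": {"max_depth": 12.21, "max_features": 0.7069,
                "min_samples_split": 22.37, "n_estimators": 74.29}},
]

def flatten(record):
    """Merge the nested params into one flat row next to the target."""
    row = {"target": record["target"]}
    row.update(record["params"])
    return row

rows = [flatten(record) for record in sample_res]

# Pick the row with the highest cross-validated accuracy.
best = max(rows, key=lambda row: row["target"])

# rf_cv truncates three of the parameters with int(), so mirror that here.
best_params = {
    "n_estimators": int(best["n_estimators"]),
    "min_samples_split": int(best["min_samples_split"]),
    "max_features": min(best["max_features"], 0.999),
    "max_depth": int(best["max_depth"]),
}
print(best["target"], best_params)
```

With pandas available, `pd.DataFrame(rows)` would give one column per parameter, which may be easier to work with from R than a single column of dictionaries.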

2.3 Data from Python to R environment

# import the Python object res into R through reticulate's `py` object
rf_bo.res <- py$res
# R code: plot the cross-validated score of each iteration
plot(rf_bo.res$target)